gabotorresruiz commented on PR #40346:
URL: https://github.com/apache/superset/pull/40346#issuecomment-4522024786

   ## Review: list/get tools for saved queries and query history
   
   Thorough pass over the 4 new MCP tools. Summary: clean, well-tested, 
correctly secured, and faithful to the existing `list_datasets` / 
`get_dataset_info` pattern. **Approve with minor comments** — nothing blocking.
   
   ### Verified (all good ✅)
   
   - **RBAC / row-level security is correctly enforced.** `QueryDAO.base_filter 
= QueryFilter` restricts results to `Query.user_id == current_user` unless the 
user has `can_access_all_queries`; `SavedQueryDAO.base_filter = 
SavedQueryFilter` restricts to `created_by == current_user`. `BaseDAO.list` and 
`find_by_id` (default `skip_base_filter=False`) both apply these, so the new 
tools cannot leak another user's queries. The docstrings ("...or all queries 
for admins") match the actual filter behavior.
   - `class_permission_name="Query"` / `"SavedQuery"` match the REST API view 
names exactly, and `method_permission_name` defaults to `read` → gates on 
`can_read`. Consistent with other read tools.
   - Schema fields map to real columns: `Query` has `start_time`/`end_time` 
(`Numeric`, epoch seconds), `changed_on`, `schema`, `status`, `database_id`. 
`Query` has no `uuid`, so `get_query_info` correctly accepts numeric ID only, 
while `get_saved_query_info` correctly supports ID *or* UUID.
   - Tests cover the meaningful paths: basic list, filters, search, empty 
results, pagination, not-found, default page size (25), default ordering 
(`start_time desc`), and the search+filters mutual-exclusion validator.
   
   ### Suggested changes (none blocking)
   
   **1. `sql` in `DEFAULT_QUERY_COLUMNS` + `page_size=25` produces heavy 
default payloads.** `DEFAULT_QUERY_COLUMNS` includes `sql` and 
`DEFAULT_QUERY_PAGE_SIZE = 25`, so a bare `list_queries()` returns 25 full SQL 
bodies. This contrasts with `DEFAULT_SAVED_QUERY_COLUMNS` (no `sql`) and 
`DEFAULT_DATASET_COLUMNS`, whose comment explicitly says *"Minimal defaults for 
reduced token usage."* For an LLM-facing tool, consider dropping `sql` from the 
query default columns — callers can request it via `select_columns` or use 
`get_query_info`.
   
   **2. `QueryError.create()` / `SavedQueryError.create()` are dead code.** 
Both classmethods are defined but never called: the get-tools build 
`QueryError(...)` directly, `ModelGetInfoCore` builds `error_schema(...)` 
directly, and the list-tools re-raise. Either drop them or use them in the 
`except` handlers for consistency.
   
   **3. `changed_on` is selectable but not sortable for queries.** It is in 
`ALL_QUERY_COLUMNS` but not `SORTABLE_QUERY_COLUMNS`, despite `Query` having an 
indexed `changed_on` column (`ti_user_id_changed_on`). Minor asymmetry — 
consider adding it to sortable, or leave as-is since `start_time` is the 
natural recency sort.
   
   **4. Minor test-coverage gaps (optional).** No test exercises 
`select_columns` projection (the context-based field filtering is a core 
feature), rejection of a non-sortable `order_column`, or the `InternalError` 
exception path in the get-tools.
   
   **5. `app.py` instruction-text inconsistency (trivial).** The 
`list_saved_queries` line says "(1-based pagination)" while `list_queries` says 
"(most recent first)". Both are 1-based; cosmetic only.
   
   ### Notes (consistent with existing code, not PR issues)
   
   - List tools annotate `-> QueryList | QueryError` but actually return a 
`dict` from `model_dump()` and re-raise on error rather than returning 
`QueryError`. This mirrors `list_datasets`, so it's an established pattern.
   - `error_type="InternalError"` (PascalCase) vs `"not_found"` (snake_case) 
inconsistency is pre-existing (`get_dataset_info` does the same).
   - `populate_by_name=True` in the schemas is unnecessary (no aliases) but 
harmless and matches the dataset schema.
   
   Item #1 is the only one I'd suggest addressing before merge; the rest are 
nits.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to