mistercrunch commented on issue #34384:
URL: https://github.com/apache/superset/issues/34384#issuecomment-3134939137

   I ran an analysis in Claude Code; here's what it had to say after parsing through the code:
   
   ----
   
   ## Analysis
   
   ### Current Implementation
   
   The `ESTIMATE_QUERY_COST` feature already exists in Superset, but it is only active when two conditions are met:
   1. Feature flag: `ESTIMATE_QUERY_COST` must be enabled (it is defined in `superset/config.py` and is normally overridden through `FEATURE_FLAGS` in `superset_config.py`)
   2. Per-database setting: `cost_estimate_enabled: true` must be added to the database's extra attributes
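
As a minimal sketch of the two conditions, assuming the flag is overridden through `FEATURE_FLAGS` in a local `superset_config.py`. The name `DATABASE_EXTRA` is hypothetical and only stands for the JSON stored in the database's extra attributes:

```python
import json

# superset_config.py (sketch): turn on the feature flag named above.
FEATURE_FLAGS = {
    "ESTIMATE_QUERY_COST": True,
}

# Hypothetical variable: the JSON you would store in the database's
# "extra" attributes to opt that database in to cost estimation.
DATABASE_EXTRA = json.dumps({"cost_estimate_enabled": True})

print(DATABASE_EXTRA)
```

Both settings are needed: the flag turns the feature on globally, and the extra attribute opts in a single database.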
   
   ### Currently Supported Databases
   - **BigQuery** - Full support with dry run API (returns data processed in 
B/KB/MB/GB)
   - **PostgreSQL** - Uses `EXPLAIN` command (returns startup and total cost)
   - **Presto/Trino** - Uses `EXPLAIN (TYPE IO, FORMAT JSON)` (returns detailed 
metrics)
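
To illustrate the PostgreSQL case, here is a rough sketch of pulling the startup and total cost out of the first line of `EXPLAIN` output. The function name and the returned keys are assumptions for illustration, not Superset's actual code:

```python
import re

# Matches the "cost=<startup>..<total>" fragment of a PostgreSQL plan line.
COST_PATTERN = re.compile(r"cost=(\d+\.\d+)\.\.(\d+\.\d+)")


def parse_explain_cost(first_plan_line: str) -> dict[str, float]:
    """Extract startup and total cost from the top node of an EXPLAIN plan."""
    match = COST_PATTERN.search(first_plan_line)
    if match is None:
        raise ValueError(f"no cost found in: {first_plan_line!r}")
    return {
        "startup_cost": float(match.group(1)),
        "total_cost": float(match.group(2)),
    }


plan_line = "Seq Scan on tenk1  (cost=0.00..458.00 rows=10000 width=244)"
print(parse_explain_cost(plan_line))
```

The costs are in PostgreSQL's arbitrary planner units, so they are useful for comparing queries rather than for predicting run time.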
   
   ### How It Works
   1. User clicks "Estimate cost" button in SQL Lab
   2. Frontend calls `/api/v1/sqllab/estimate/` endpoint
   3. Database engine spec's `estimate_query_cost()` method executes the 
appropriate estimation command
   4. Results are formatted and displayed in a modal
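
Step 2 could be sketched from a script as below. The payload fields `database_id` and `sql` are assumptions about the request schema, the host is a placeholder, and authentication is left out entirely. The request is only built, not sent:

```python
import json
from urllib.request import Request


def build_estimate_request(base_url: str, database_id: int, sql: str) -> Request:
    """Build (but do not send) a POST request to the estimate endpoint."""
    # Assumed payload fields; check the API schema of your Superset version.
    payload = {"database_id": database_id, "sql": sql}
    return Request(
        url=base_url.rstrip("/") + "/api/v1/sqllab/estimate/",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_estimate_request("http://localhost:8088", 1, "SELECT 1")
print(request.get_method(), request.full_url)
```

A real call would also need whatever credentials the Superset instance requires.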
   
   ### Key Limitation
   **The feature currently only displays costs; it doesn't implement the threshold warning system suggested in this issue.** Adding one would be a valuable enhancement.
   
   ### Implementation Path for Threshold Warnings
   To implement the proposed warning system:
   1. Add a new config parameter like `QUERY_COST_WARNING_THRESHOLD` with 
sub-settings per metric type
   2. Modify `QueryEstimationCommand` to check thresholds after estimation
   3. Update the frontend to display warnings before query execution
   4. Consider making thresholds configurable per database or per user role
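
A possible shape for steps 1 and 2, as a sketch only. `QUERY_COST_WARNING_THRESHOLD` does not exist in Superset today, and the metric names, limits, and helper function are all hypothetical:

```python
# Hypothetical config: one limit per metric type. Neither this setting nor
# these metric names exist in Superset; they only illustrate the proposal.
QUERY_COST_WARNING_THRESHOLD = {
    "total_cost": 100_000.0,          # planner cost units (e.g. PostgreSQL)
    "bytes_processed": 10 * 1024**3,  # 10 GiB (e.g. BigQuery dry run)
}


def find_threshold_warnings(
    estimate: dict[str, float], thresholds: dict[str, float]
) -> list[str]:
    """Return one warning per estimated metric that exceeds its threshold."""
    warnings = []
    for metric, limit in thresholds.items():
        value = estimate.get(metric)
        # Metrics missing from the estimate are skipped, not treated as errors.
        if value is not None and value > limit:
            warnings.append(f"{metric}: estimated {value} exceeds limit {limit}")
    return warnings


for warning in find_threshold_warnings(
    {"total_cost": 250_000.0}, QUERY_COST_WARNING_THRESHOLD
):
    print(warning)
```

Returning a list of messages rather than raising keeps the decision to block or merely warn with the caller, which fits the idea of showing a warning before execution.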
   
   ### Extending to Other Databases
   The architecture is well designed for extension. To add support for databases like Snowflake or Redshift, the database's engine spec needs to:
   1. Override `get_allow_cost_estimate()` to return `True`
   2. Implement `estimate_statement_cost()` to execute the database's cost estimation command
   3. Parse and format the results appropriately
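
The three steps might look roughly like this. To keep the sketch runnable on its own, `BaseEngineSpec` here is a stand-in class, not Superset's real one, and the method signatures are assumptions (the real ones vary between Superset versions). `ExampleEngineSpec` and `RecordingCursor` are hypothetical names:

```python
from typing import Any


class BaseEngineSpec:
    """Stand-in for Superset's base engine spec (assumed shape)."""

    @classmethod
    def get_allow_cost_estimate(cls, extra: dict[str, Any]) -> bool:
        return False


class ExampleEngineSpec(BaseEngineSpec):
    """Hypothetical engine spec for a database that supports EXPLAIN."""

    @classmethod
    def get_allow_cost_estimate(cls, extra: dict[str, Any]) -> bool:
        # Step 1: declare that this engine can estimate costs.
        return True

    @classmethod
    def estimate_statement_cost(cls, statement: str, cursor: Any) -> dict[str, Any]:
        # Step 2: run the database's estimation command on a DB-API cursor.
        cursor.execute(f"EXPLAIN {statement}")
        # Step 3: reduce the raw rows to a structure the UI can format.
        return {"plan": [row[0] for row in cursor.fetchall()]}


class RecordingCursor:
    """Tiny fake DB-API cursor so the sketch can run without a database."""

    def __init__(self, rows: list[tuple]) -> None:
        self.rows = rows
        self.executed: list[str] = []

    def execute(self, sql: str) -> None:
        self.executed.append(sql)

    def fetchall(self) -> list[tuple]:
        return self.rows


cursor = RecordingCursor([("Seq Scan on t",)])
print(ExampleEngineSpec.estimate_statement_cost("SELECT 1", cursor))
```

A real engine spec would subclass Superset's own base class and return whatever structure its cost formatter expects.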
   
   Many major databases could be supported, since they offer `EXPLAIN` or a comparable query-plan facility: Snowflake, Redshift, MySQL, Oracle, SQL Server, Databricks, and ClickHouse.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@superset.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

