Ujjwaljain16 opened a new pull request, #38227:
URL: https://github.com/apache/superset/pull/38227

   ### SUMMARY
   
   When `GLOBAL_ASYNC_QUERIES` is enabled, the web server and Celery workers 
independently generate cache keys for the same query. Formatting differences in 
ad-hoc SQL expressions (e.g., `\r\n` vs `\n`, leading/trailing whitespace) can 
result in different `QueryObject.cache_key()` values, leading to HTTP 422 
errors due to async cache mismatches.
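
To illustrate the mismatch, here is a minimal sketch that hashes two query payloads differing only in line endings. `naive_cache_key` is a hypothetical stand-in, not Superset's actual `QueryObject.cache_key()`, and the use of JSON plus SHA-256 is an assumption made for the example.

```python
import hashlib
import json


def naive_cache_key(query: dict) -> str:
    """Hash a query dict as-is (hypothetical stand-in for the real key function)."""
    payload = json.dumps(query, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same ad-hoc SQL metric, once with CRLF and once with LF line endings.
web = {"metrics": [{"expressionType": "SQL", "sqlExpression": "SUM(a)\r\n / COUNT(*)"}]}
worker = {"metrics": [{"expressionType": "SQL", "sqlExpression": "SUM(a)\n / COUNT(*)"}]}

# The two keys differ, so a result stored under one key is not found under the other.
keys_match = naive_cache_key(web) == naive_cache_key(worker)
print(keys_match)
```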
   
   While `where` and `having` clauses are sanitized during `validate()`, ad-hoc 
SQL expressions in the following fields were not normalized prior to hashing:
   
   * `metrics` (adhoc SQL)
   * `columns` (adhoc SQL)
   * `orderby` (adhoc SQL)
   
   This patch introduces a minimal normalization step inside 
`QueryObject.cache_key()` to ensure deterministic hash generation:
   
   * Uses `copy.deepcopy(self.to_dict())` to prevent mutation.
   * Normalizes CRLF (`\r\n`) to LF (`\n`).
   * Strips leading/trailing whitespace.
   * Applies normalization only when `expressionType == "SQL"` (defensive 
guard).
   * Does **not** render Jinja or modify the execution lifecycle.
   
   This change is strictly limited to formatting normalization in the hashing 
layer and does not alter query semantics or template processing.
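
The normalization steps listed above could look roughly like the following sketch. It operates on a plain dict rather than a real `QueryObject`. The helper names (`_normalize_sql`, `normalized_cache_key`), the `sqlExpression` key, and the JSON plus SHA-256 hashing are assumptions for illustration, not the code in this patch.

```python
import copy
import hashlib
import json
from typing import Any


def _normalize_sql(item: Any) -> None:
    """Normalize an ad-hoc SQL expression in place; ignore anything else."""
    if isinstance(item, dict) and item.get("expressionType") == "SQL":
        sql = item.get("sqlExpression")
        if isinstance(sql, str):
            # CRLF -> LF, then strip leading/trailing whitespace.
            item["sqlExpression"] = sql.replace("\r\n", "\n").strip()


def normalized_cache_key(query: dict[str, Any]) -> str:
    """Hash a deep copy of the query after normalizing ad-hoc SQL fields."""
    data = copy.deepcopy(query)  # never mutate the caller's object
    for field in ("metrics", "columns"):
        for item in data.get(field) or []:
            _normalize_sql(item)
    for entry in data.get("orderby") or []:
        # Assumed shape: (expression, ascending); skip anything unexpected.
        if isinstance(entry, (list, tuple)) and entry:
            _normalize_sql(entry[0])
    payload = json.dumps(data, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


crlf = {
    "metrics": ["count", {"expressionType": "SQL", "sqlExpression": "  SUM(a)\r\n / COUNT(*) "}],
    "orderby": [({"expressionType": "SQL", "sqlExpression": "MAX(b)\r\n"}, False)],
}
lf = {
    "metrics": ["count", {"expressionType": "SQL", "sqlExpression": "SUM(a)\n / COUNT(*)"}],
    "orderby": [({"expressionType": "SQL", "sqlExpression": "MAX(b)"}, False)],
}
same_key = normalized_cache_key(crlf) == normalized_cache_key(lf)
```

Because the helper works on a deep copy, the `crlf` dict still contains its original `\r\n` sequences after the call; only the hashed representation is normalized.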
   
   Fixes #37114
   
   ---
   
   ### BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
   
   Not applicable (backend-only change).
   
   ---
   
   ### TESTING INSTRUCTIONS
   
   #### Automated Tests
   
   Run:
   
   ```
   pytest tests/unit_tests/queries/test_query_object_cache_key_normalization.py
   ```
   
   Test coverage includes:
   
   * CRLF vs LF parity
   * Mixed newline normalization
   * Leading/trailing whitespace normalization
   * Mixed metric types (string + adhoc SQL)
   * Defensive `orderby` structure handling
   * No mutation of original `QueryObject`
   
   #### Manual Verification
   
   1. Enable `GLOBAL_ASYNC_QUERIES = True` in `superset_config.py`.
   2. Create or edit a chart with:
   
      * An ad-hoc SQL metric containing multiple lines (including CRLF if 
possible).
      * Or a custom SQL expression in `Sort by`.
   3. Load the chart/dashboard.
   4. Confirm:
   
      * No HTTP 422 error occurs.
      * The async query executes successfully.
      * The cache key remains consistent between web and worker processes.
   
   ---
   
   ### ADDITIONAL INFORMATION
   
   * [x] Has associated issue: #37114
   * [ ] Required feature flags
   * [ ] Changes UI
   * [ ] Includes DB Migration
   * [ ] Introduces new feature or API
   * [ ] Removes existing feature or API
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

