aminghadersohi opened a new pull request, #38079: URL: https://github.com/apache/superset/pull/38079
### SUMMARY

The dashboard thumbnail digest computation is non-deterministic, causing excessive cache misses and unnecessary Selenium screenshot regeneration. This was observed in production with a **4.3% cache hit rate** (24 hits out of 555 triggers over 14 days) for a single workspace, with one dashboard being re-screenshotted **132 times in a single day**.

#### Root cause

1. **`dashboard.datasources` returns a Python `set`**, and `_adjust_string_with_rls()` iterates over it to build the hash input string. Python sets have non-deterministic iteration order across different processes (different `PYTHONHASHSEED`). Different Gunicorn workers produce different digests for the same dashboard+user → cache miss → Selenium screenshot → all chart queries fire against the data warehouse.
2. **`dashboard.charts`** depends on `self.slices`, a SQLAlchemy relationship with no `order_by` clause, adding another source of ordering instability.

#### Fix

- Sort datasources by ID before iterating in `_adjust_string_with_rls()`
- Sort chart names in `get_dashboard_digest()` before including in the hash input

These are minimal, targeted changes that ensure digest stability without changing any other behavior.

### BEFORE/AFTER SCREENSHOTS OR COVERAGE URL

N/A - backend-only change, no UI impact.

### TESTING INSTRUCTIONS

Added `test_dashboard_digest_deterministic_datasource_order`, which verifies that three different orderings of the same datasources produce identical digests.

### ADDITIONAL INFORMATION

- **Related PRs**: #37895, #37899, #37941 (reduce per-computation DB cost but don't fix the digest instability)
- **Impact**: For a dashboard with N datasources, there are N! possible iteration orders from the set, each potentially producing a different digest. Sorting reduces this to exactly 1 deterministic result.
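For illustration only, here is a minimal, self-contained sketch of the ordering problem and the sort-before-hash approach described under "Fix". It is not the actual code of `_adjust_string_with_rls()` or `get_dashboard_digest()`: `FakeDatasource`, `digest_input`, and `digest` are hypothetical names, the payload format is invented, and SHA-256 is an assumption rather than the hash Superset uses.

```python
import hashlib
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class FakeDatasource:
    """Hypothetical stand-in for a datasource; only the fields used here."""
    id: int
    name: str


def digest_input(datasources, chart_names, stable: bool) -> str:
    """Build the string that gets hashed (invented format, for illustration)."""
    if stable:
        # The idea from the PR: impose a fixed order before building the string.
        datasources = sorted(datasources, key=lambda ds: ds.id)
        chart_names = sorted(chart_names)
    ds_part = ";".join(f"{ds.id}:{ds.name}" for ds in datasources)
    chart_part = ";".join(chart_names)
    return f"{ds_part}|{chart_part}"


def digest(datasources, chart_names, stable: bool) -> str:
    payload = digest_input(datasources, chart_names, stable)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


datasources = [
    FakeDatasource(3, "orders"),
    FakeDatasource(1, "users"),
    FakeDatasource(2, "events"),
]
chart_names = ["Revenue", "Active users"]

# Simulate every iteration order a set of 3 datasources could yield.
unstable_digests = {
    digest(order, chart_names, stable=False) for order in permutations(datasources)
}
stable_digests = {
    digest(order, chart_names, stable=True) for order in permutations(datasources)
}

print(len(unstable_digests), len(stable_digests))
```

With three datasources the unsorted variant yields one digest per permutation (3! = 6), while the sorted variant collapses to a single digest, which is the property the new test is described as checking.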
