villebro commented on issue #9190: URL: https://github.com/apache/superset/issues/9190#issuecomment-2205419056
With this SIP still being behind an experimental feature flag, and not actively maintained, I've been thinking about ways we could simplify this architecture and finally make it generally available in a forthcoming Superset release. The reason I think stabilizing this feature is important is that Superset's current synchronous query execution model causes lots of issues:

- If many people open the same chart/dashboard at the same time, each will execute a query against the underlying database, due to no locking of queries.
- If a user refreshes a dashboard multiple times, they can quickly congest the downstream database with heavy queries, eating up both webserver threads and database resources.
- The web worker threads/processes get blocked waiting for long-running queries to complete executing, making it impossible to effectively scale web worker replica sets based on CPU consumption.

Fixing this should make it possible to get by with much slimmer web worker replica sets. Furthermore, async workers could be scaled up/down based on queue depth.

To simplify the architecture and reuse existing functionality, I propose the following:

- The websocket architecture is removed; in the future only polling would be supported. The concept of a "query context cache key" is also removed in favor of a single cache key, i.e. the one we already use for chart data.
- When requesting chart data, if the data exists in the cache, the data is returned normally.
- When chart data isn't available in the cache, only the `cache_key` is returned, along with additional details: when the most recent request was submitted, status (pending, executing), etc.

The async execution flow is changed to be similar to SQL Lab async execution, with the following changes:

- When the async worker starts executing the query, the cache key is locked using the `KeyValueDistributedLock` class. This means that only a single worker executes any one cache key's query at a time.
- To support automatic cancelling of queries, we add a new optional field `poll_ttl` to the query context, which makes it possible to automatically cancel queries that are not being actively polled. Every time the cache key is polled, the latest poll time is updated on the metadata object. The worker periodically checks the metadata object, and if `poll_ttl` is defined and the last poll time is older than the TTL, the query is automatically cancelled. This ensures that if a person closes a dashboard with lots of long-running queries, those queries are automatically cancelled if nobody is actively waiting for the results. By default, frontend requests have `poll_ttl` set to whichever value is set in the config (`DEFAULT_CHART_DATA_POLL_TTL`). Cache warmup requests would likely not have a `poll_ttl` set, so as to avoid unnecessary polling.
- To limit hammering of the polling endpoint, we introduce a customizable backoff function in `superset_config.py`, which makes it possible to define how polling backoff should be implemented. The default behavior would be some sort of exponential backoff, where freshly started queries are polled more actively, and queries that have been pending/running for a long time are polled less frequently. When the frontend requests chart data, the backend provides the recommended wait time in the response, based on the backoff function.

I assume we need a new SIP for this, but I wanted to drop this comment here to get initial feedback.
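To make the `poll_ttl` auto-cancellation idea above concrete, here is a minimal sketch of the per-cache-key metadata check the worker could run periodically. All names here (`QueryMetadata`, `record_poll`, `should_cancel`) are hypothetical illustrations, not existing Superset APIs:

```python
import time


class QueryMetadata:
    """Illustrative per-cache-key metadata record (hypothetical names)."""

    def __init__(self, poll_ttl=None):
        self.poll_ttl = poll_ttl  # seconds; None disables auto-cancel
        self.last_poll_time = time.time()

    def record_poll(self):
        # Called by the polling endpoint every time the cache key is polled.
        self.last_poll_time = time.time()

    def should_cancel(self, now=None):
        # Called periodically by the async worker: cancel if nobody has
        # polled within poll_ttl seconds. A None poll_ttl (e.g. a cache
        # warmup request) never triggers auto-cancellation.
        if self.poll_ttl is None:
            return False
        now = time.time() if now is None else now
        return now - self.last_poll_time > self.poll_ttl


meta = QueryMetadata(poll_ttl=30)
meta.should_cancel(now=meta.last_poll_time + 10)  # False: polled recently
meta.should_cancel(now=meta.last_poll_time + 60)  # True: poll_ttl exceeded
```

A dashboard being closed simply means `record_poll` stops being called, so every pending query's `should_cancel` eventually flips to true.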
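As a rough illustration of the customizable backoff function described above, the default exponential backoff could look something like the sketch below. The function name and parameters are assumptions for illustration, not an actual `superset_config.py` setting:

```python
def poll_backoff(poll_count: int, base: float = 1.0, factor: float = 2.0, cap: float = 30.0) -> float:
    """Recommended wait in seconds before the next poll.

    Freshly started queries (low poll_count) are polled frequently;
    queries that have been pending/running for a long time are polled
    less often, capped at `cap` seconds so clients still poll occasionally.
    """
    return min(base * factor ** poll_count, cap)


poll_backoff(0)   # 1.0  -> poll again after 1 second
poll_backoff(3)   # 8.0  -> backing off exponentially
poll_backoff(10)  # 30.0 -> capped at 30 seconds
```

The backend would evaluate this on each poll and return the recommended wait time alongside the `cache_key` and status, so the frontend never has to hardcode its own polling schedule.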
