benjreinhart opened a new issue #14289:
URL: https://github.com/apache/superset/issues/14289


   This bug affects both the polling and WebSocket implementations of the async 
queries experience introduced in SIP-39.
   
   ### Expected results
   
   When clients are notified of query completion (either by polling or WS 
message), they should be able to retrieve the results of that query using the 
`result_url` value the server gave them. The server should respond with a 200 
and the cached query results. This should be the case for all charts in 
Superset.
   
   ### Actual results
   
   When a client is notified the query has completed, the client initiates the 
call to the backend with the `result_url`. For some charts, the server is 
responding with a 400 bad request because it could not find the cached results.
   
   #### Screenshots
   
   <img width="1208" alt="Screen Shot 2021-04-21 at 4 45 26 PM" 
src="https://user-images.githubusercontent.com/606233/115635890-8c3b0c80-a2c1-11eb-8e38-4f20859be2ee.png">
   
   #### How to reproduce the bug
   
   There are a few charts/dashboards where this is happening. One example is:
   
   1. Locally, visit `/superset/dashboard/deck/`
   2. Notice that after some time, all charts render an "Unexpected error" 
message.
   
   ### Environment
   
   Latest master
   
   ### Additional context
   
   I have not completed my investigation, but have found at least two reasons 
why this is happening.
   
   (For background context, the cache keys are derived from the `form_data` the 
client sends to the server.)
   
   1. The `explore_json` API viz objects modify the `form_data` values that are 
sent from the client. The background worker running the query caches the 
`form_data` in order to later use it to regenerate the cache key for the query 
results. However, it caches the `form_data` _after_ it has been modified, 
which means that during subsequent lookup and key generation it uses a value 
different from the one the user originally provided, leading to a different 
cache key and therefore a cache miss.
   2. Even if 1 is fixed such that the subsequent requests use the same initial 
input data, some of the code modifying the `form_data` object does so 
non-deterministically, for example by generating UUIDs and adding them to the 
object.
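
   The two failure modes above can be reproduced in isolation. The following 
is a minimal sketch, not Superset code: `cache_key`, `run_viz`, and the 
`slice_uuid` field are hypothetical stand-ins, and the key is assumed to be a 
hash of the JSON-serialized `form_data`.

```python
import copy
import hashlib
import json
import uuid


def cache_key(form_data: dict) -> str:
    """Hypothetical key function: hash of the canonical JSON of form_data."""
    payload = json.dumps(form_data, sort_keys=True)
    return hashlib.md5(payload.encode("utf-8")).hexdigest()


def run_viz(form_data: dict) -> dict:
    """Stand-in for viz code that mutates form_data in place."""
    form_data["slice_uuid"] = str(uuid.uuid4())  # non-deterministic addition
    form_data.setdefault("row_limit", 1000)  # deterministic, but still a mutation
    return form_data


# Illustrative input only; the field values are assumptions.
original = {"viz_type": "deck_scatter", "datasource": "1__table"}

# Key computed from the form_data exactly as the client sent it.
key_from_client_input = cache_key(original)

# The worker runs the viz code first, then stores the mutated form_data.
stored_by_worker = run_viz(copy.deepcopy(original))
key_from_stored = cache_key(stored_by_worker)

# Reason 1: the mutated form_data hashes to a different key.
keys_differ_after_mutation = key_from_client_input != key_from_stored

# Reason 2: two runs over identical input still disagree because of the UUID.
key_run_a = cache_key(run_viz(copy.deepcopy(original)))
key_run_b = cache_key(run_viz(copy.deepcopy(original)))
keys_differ_between_runs = key_run_a != key_run_b
```

   Both flags end up true, which matches the cache miss described in the two 
points above: the key used for lookup never equals the key used for storage.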
   
   While we should fix some of these issues quickly to unblock the experience, 
I think a better approach longer term is to decouple the cache key generation 
(and caching of values) from the code that is constructing and executing SQL. 
The cache key generation should be able to be invoked independently of 
executing a query and it should return the same key given the same input. 
Isolating this code allows us to easily test it, reuse it, and evolve it 
independently of other modules.
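
   As a rough illustration of that decoupling, here is a sketch under stated 
assumptions. `generate_cache_key` and `run_async_query` are hypothetical 
names, `execute` stands for whatever builds and runs the SQL, and `cache` is a 
plain dict standing in for the real cache backend.

```python
import copy
import hashlib
import json
from typing import Any, Callable, Dict


def generate_cache_key(form_data: Dict[str, Any]) -> str:
    """Pure function: the same form_data always yields the same key.

    Hypothetical helper; it does not build or execute any query.
    """
    payload = json.dumps(
        form_data, sort_keys=True, separators=(",", ":"), default=str
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_async_query(
    form_data: Dict[str, Any],
    execute: Callable[[Dict[str, Any]], Any],
    cache: Dict[str, Any],
) -> str:
    """Derive the key from a snapshot taken before execution, then cache."""
    snapshot = copy.deepcopy(form_data)  # the client's original input
    key = generate_cache_key(snapshot)
    cache[key] = execute(form_data)  # execute may mutate form_data freely
    return key
```

   Because the key depends only on the snapshot of the client's input, a later 
lookup that starts from the same `form_data` recomputes the same key, 
regardless of what the query-building code did to its own copy.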

