[GitHub] [incubator-superset] betodealmeida opened pull request #5828: Fix cache for multiple time comparisons

GitHub Wed, 05 Sep 2018 16:33:55 -0700

Currently we have these two conflicting behaviors:

1. When computing the cache key from the query object _"we remove datetime 
bounds that are hard values, and replace them with the user-provided inputs to 
bounds, which may be time-relative (as in "5 days ago" or "now")."_ This 
removes `from_dttm` and `to_dttm` from the query object when generating the 
cache dict.
2. When running a query for a time comparison, say "1 week", we move the values 
stored in the keys `from_dttm` and `to_dttm` in the query object to 
`inner_from_dttm` and `inner_to_dttm`, and update the original keys with the 
shifted values.


When the second query runs, the cache key will be different from the first one 
even though `from_dttm` and `to_dttm` are stripped from both query objects, 
because the second one has the `inner_from_dttm` and `inner_to_dttm` keys set.
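
As a rough illustration of the two behaviors, here is a minimal sketch. The 
helpers `cache_key` and `shift_query`, the hashing scheme, and the sample query 
object are all hypothetical stand-ins, not the actual Superset code; only the 
key names `from_dttm`, `to_dttm`, `inner_from_dttm` and `inner_to_dttm` come 
from the description above.

```python
import hashlib
import json
from datetime import datetime, timedelta


def cache_key(query_obj):
    """Hypothetical helper: hash the query object without its hard bounds."""
    cache_dict = {
        k: v for k, v in query_obj.items()
        if k not in ("from_dttm", "to_dttm")  # behavior 1: strip hard bounds
    }
    payload = json.dumps(cache_dict, default=str, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def shift_query(query_obj, delta):
    """Hypothetical helper: build the query object for a time comparison."""
    shifted = dict(query_obj)
    # behavior 2: keep the original bounds under the inner_* keys ...
    shifted["inner_from_dttm"] = query_obj["from_dttm"]
    shifted["inner_to_dttm"] = query_obj["to_dttm"]
    # ... and overwrite the original keys with the shifted values
    shifted["from_dttm"] = query_obj["from_dttm"] - delta
    shifted["to_dttm"] = query_obj["to_dttm"] - delta
    return shifted


# Made-up query object, for illustration only
base = {
    "metrics": ["count"],
    "since": "5 days ago",
    "until": "now",
    "from_dttm": datetime(2018, 9, 1),
    "to_dttm": datetime(2018, 9, 6),
}
one_week = shift_query(base, timedelta(weeks=1))

main_key = cache_key(base)
shifted_key = cache_key(one_week)

# The inner_* keys survive the stripping, so the two keys differ
assert main_key != shifted_key
```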

The problem is that **multiple time shifts** will have the same cache key, 
since the only things that differ between them are `from_dttm` and `to_dttm`, 
which are stripped before the key is computed. This results in a false 
positive cache hit: a later shift is served the dataframe cached for an 
earlier one. I fixed it by allowing extra cache keys to be passed when 
fetching a dataframe. This way each time comparison adds, e.g., 
`time_compare: "1 week"` to the cache dict.
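
The collision and the fix can be sketched as follows. This is a simplified 
model under stated assumptions, not the code in this PR: `cache_key`, its 
`extra` keyword arguments, `shift_query`, and the sample query object are 
hypothetical names chosen for illustration.

```python
import hashlib
import json
from datetime import datetime, timedelta


def cache_key(query_obj, **extra):
    """Hypothetical helper: hash the query object plus any extra cache keys."""
    cache_dict = {
        k: v for k, v in query_obj.items()
        if k not in ("from_dttm", "to_dttm")  # hard bounds are stripped
    }
    cache_dict.update(extra)  # e.g. time_compare="1 week"
    payload = json.dumps(cache_dict, default=str, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def shift_query(query_obj, delta):
    """Hypothetical helper: build the query object for a time comparison."""
    shifted = dict(query_obj)
    shifted["inner_from_dttm"] = query_obj["from_dttm"]
    shifted["inner_to_dttm"] = query_obj["to_dttm"]
    shifted["from_dttm"] = query_obj["from_dttm"] - delta
    shifted["to_dttm"] = query_obj["to_dttm"] - delta
    return shifted


# Made-up query object, for illustration only
base = {
    "metrics": ["count"],
    "since": "5 days ago",
    "until": "now",
    "from_dttm": datetime(2018, 9, 1),
    "to_dttm": datetime(2018, 9, 6),
}
one_week = shift_query(base, timedelta(weeks=1))
two_weeks = shift_query(base, timedelta(weeks=2))

# Without extra keys, both shifts hash to the same value: the shifted bounds
# are stripped and the inner_* bounds are identical.
collision = cache_key(one_week) == cache_key(two_weeks)

# With an extra key per comparison, the cache dicts (and keys) differ.
key_1w = cache_key(one_week, time_compare="1 week")
key_2w = cache_key(two_weeks, time_compare="2 weeks")

assert collision
assert key_1w != key_2w
```

Passing the shift label itself as the extra key keeps the time-relative 
behavior of the cache intact, since no hard datetime value is reintroduced 
into the cache dict.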

[ Full content available at: 
https://github.com/apache/incubator-superset/pull/5828 ]
