villebro commented on a change in pull request #15279:
URL: https://github.com/apache/superset/pull/15279#discussion_r656928933
##########
File path: superset/common/query_context.py
##########
@@ -97,6 +101,62 @@ def __init__( # pylint: disable=too-many-arguments
"result_format": self.result_format,
}
+    def processing_time_offset(
+        self, df: pd.DataFrame, query_object: QueryObject,
+    ) -> Tuple[pd.DataFrame, List[str]]:
+        # ensure query_object is immutable
+        query_object_clone = copy.copy(query_object)
+        rv_sql = []
+
+        time_offset = query_object.time_offset
+        outer_from_dttm = query_object.from_dttm
+        outer_to_dttm = query_object.to_dttm
+        for offset in time_offset:
+            try:
+                query_object_clone.from_dttm = get_past_or_future(
+                    offset, outer_from_dttm,
+                )
+                query_object_clone.to_dttm = get_past_or_future(offset, outer_to_dttm,)
+            except ValueError as ex:
+                raise QueryObjectValidationError(str(ex))
+            # make sure subquery use main query where clause
+            query_object_clone.inner_from_dttm = outer_from_dttm
+            query_object_clone.inner_to_dttm = outer_to_dttm
+            query_object_clone.time_offset = []
Review comment:
I wonder if we should add `time_offset` to the `QueryObject` schema and
rename the current one to `time_offsets`, adding both to the cache key. Example:
We want to make a query with two offsets: one year ago and two years ago.
The "actual" main query that gets executed and cached (no additional columns
added yet):
```python
time_offsets: None
time_offset: None
```
First extra query (gets concatenated to the previous dataframe):
```python
time_offsets: None
time_offset: 1
```
Second extra query (also concatenated to the main dataframe):
```python
time_offsets: None
time_offset: 2
```
Finally, when the full query object is constructed, the combined result
would be cached with the following keys:
```python
time_offsets: [1, 2]
time_offset: None
```
This way the main query result would be persisted along with the results of
the extra queries, without the need to rebuild the full dataframe on each
request, and the extra queries could then also be cached individually.
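
To make the idea concrete, here is a minimal sketch of how both fields could feed into the cache key. This is not Superset's actual cache key logic: the `cache_key` helper, the `base` dict, and the offset strings are all hypothetical and only illustrate that the four cases above end up with distinct keys.

```python
import hashlib
import json
from typing import Any, Dict, List, Optional


def cache_key(
    base_query: Dict[str, Any],
    time_offset: Optional[str] = None,
    time_offsets: Optional[List[str]] = None,
) -> str:
    """Hypothetical helper: hash the base query together with both offset fields."""
    payload = {
        **base_query,
        "time_offset": time_offset,
        "time_offsets": time_offsets,
    }
    # sort_keys keeps the serialization stable regardless of dict ordering
    serialized = json.dumps(payload, sort_keys=True, default=str)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


# Stand-in for the rest of the query object (assumed fields, for illustration only)
base = {"datasource": "example_table", "metrics": ["count"]}
offsets = ["1 year ago", "2 years ago"]

# Main query: neither field set
main_key = cache_key(base)
# One extra query per offset: only `time_offset` set
extra_keys = {offset: cache_key(base, time_offset=offset) for offset in offsets}
# Full combined result: only `time_offsets` set
full_key = cache_key(base, time_offsets=offsets)

all_keys = [main_key, *extra_keys.values(), full_key]
```

With this scheme a later request for only the "1 year ago" offset could reuse `main_key` and the matching entry in `extra_keys` instead of re-running either query.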