zhaoyongjie commented on a change in pull request #15279:
URL: https://github.com/apache/superset/pull/15279#discussion_r667763659
##########
File path: superset/common/query_context.py
##########
@@ -101,21 +104,143 @@ def __init__( # pylint: disable=too-many-arguments
             "result_format": self.result_format,
         }
-    def get_query_result(self, query_object: QueryObject) -> Dict[str, Any]:
-        """Returns a pandas dataframe based on the query object"""
+    @staticmethod
+    def left_join_on_dttm(
+        left_df: pd.DataFrame, right_df: pd.DataFrame
+    ) -> pd.DataFrame:
+        df = left_df.set_index(DTTM_ALIAS).join(right_df.set_index(DTTM_ALIAS))
+        df.reset_index(level=0, inplace=True)
+        return df
+
+    def processing_time_offsets(
+        self, df: pd.DataFrame, query_object: QueryObject,
+    ) -> Tuple[pd.DataFrame, List[str], List[Optional[str]]]:
+        # ensure query_object is immutable
+        query_object_clone = copy.copy(query_object)
+        rv_sql = []
+        cache_keys = []
+
+        time_offsets = query_object.time_offsets
+        outer_from_dttm = query_object.from_dttm
+        outer_to_dttm = query_object.to_dttm
+        for offset in time_offsets:
Review comment:
For a where clause whose time ranges are combined with the `or` operator, I
estimate that the load on the system is approximately the same as running
multiple queries, because the `or` operator does not reduce the number of rows
the database engine has to scan. We would also lose the opportunity to cache
each time offset separately. Let me explain.
### Use `or` operator in the where clause
- unable to cache each time-offset slice
- unable to easily generate the final dataframe: when a slice contains null
values, it is difficult to join it with the main query
<img width="824" alt="image"
src="https://user-images.githubusercontent.com/2016594/125272999-0e762280-e33f-11eb-80d4-bf5015ddc446.png">
### Use extra query
<img width="814" alt="image"
src="https://user-images.githubusercontent.com/2016594/125273052-1c2ba800-e33f-11eb-8feb-e03392092705.png">
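
To make the extra-query approach concrete, here is a minimal sketch rather than the code in this PR. It assumes each offset is fetched by its own query, cached under its own key, shifted onto the main time axis, and then left-joined on the time column. Only `left_join_on_dttm` comes from the diff above; `join_time_offsets`, `run_query`, `cache`, the cache key format, the column suffix, and the value given to `DTTM_ALIAS` are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict

import pandas as pd

DTTM_ALIAS = "__timestamp"  # assumption: name of the time column


def left_join_on_dttm(left_df: pd.DataFrame, right_df: pd.DataFrame) -> pd.DataFrame:
    # Same logic as the helper in the diff: left join on the time column.
    df = left_df.set_index(DTTM_ALIAS).join(right_df.set_index(DTTM_ALIAS))
    df.reset_index(level=0, inplace=True)
    return df


def join_time_offsets(
    main_df: pd.DataFrame,
    offsets: Dict[str, pd.Timedelta],
    run_query: Callable[[pd.Timedelta], pd.DataFrame],  # hypothetical query runner
    cache: Dict[str, pd.DataFrame],  # hypothetical stand-in for a real cache
) -> pd.DataFrame:
    result = main_df
    for label, delta in offsets.items():
        key = f"offset:{label}"  # hypothetical cache key, one per offset slice
        if key not in cache:
            # One extra query per offset; its result is cached independently.
            cache[key] = run_query(delta)
        offset_df = cache[key].copy()
        # Shift the offset rows forward so they line up with the main period.
        offset_df[DTTM_ALIAS] = offset_df[DTTM_ALIAS] + delta
        # Suffix the metric columns so they do not collide with the main ones.
        offset_df = offset_df.rename(
            columns={c: f"{c}__{label}" for c in offset_df.columns if c != DTTM_ALIAS}
        )
        # Timestamps missing from the offset slice simply become NaN here.
        result = left_join_on_dttm(result, offset_df)
    return result
```

With this shape, a second request for the same offset can be served from the cache without touching the database, and gaps in an offset slice show up as nulls in the joined dataframe instead of complicating the join.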

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]