somu-imply commented on code in PR #15020:
URL: https://github.com/apache/druid/pull/15020#discussion_r1339336869
##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidQuery.java:
##########
@@ -786,8 +789,44 @@ static Pair<DataSource, Filtration> getFiltration(
JoinableFactoryWrapper joinableFactoryWrapper
)
{
- if (!canUseIntervalFiltering(dataSource)) {
+ if (dataSource instanceof UnnestDataSource) {
+ // UnnestDataSource can have another unnest data source
+ // join datasource, filtered data source, etc as base
+ Pair<DataSource, Filtration> pair = getFiltration(
+ ((UnnestDataSource) dataSource).getBase(),
+ filter,
+ virtualColumnRegistry,
+ joinableFactoryWrapper
+ );
+ return Pair.of(dataSource, pair.rhs);
+ } else if (!canUseIntervalFiltering(dataSource)) {
return Pair.of(dataSource, toFiltration(filter,
virtualColumnRegistry.getFullRowSignature(), false));
+ } else if (dataSource instanceof FilteredDataSource) {
+ // A filteredDS is created only inside the rel for Unnest, ensuring it
only grabs the outermost filter
+ // and, if possible, pushes it down inside the data source.
+ // So a chain of Filter->Unnest->Filter is typically impossible when the
query is done through SQL.
+ // Also, Calcite has filter reduction rules that push filters deep into
base data sources for better query planning.
+ // Hence, the case that can be seen is a bunch of unnests followed by a
terminal filteredDS like Unnest->Unnest->FilteredDS.
+ // A base table with a chain of filters is synonymous with a filteredDS.
+ // In case there are filters present in the getFiltration call we still
update the interval by:
+ // 1) creating a filtration from the filteredDS's filter and
+ // 2) Updating the interval of the outer filter with the intervals in
step 1, and you'll see these 2 calls in the code
+ final FilteredDataSource filteredDataSource = (FilteredDataSource)
dataSource;
+ // Defensive check as in the base of a filter cannot be another filter
+ final DataSource baseOfFilterDataSource = filteredDataSource.getBase();
+ if (baseOfFilterDataSource instanceof FilteredDataSource) {
+ throw DruidException.defensive("Cannot create a filteredDataSource
using another filteredDataSource as a base");
+ }
Review Comment:
The QueryDS would be handled separately where it would find its own UnnestDS
and FilteredDS and set the interval for the inner subquery. The outer query
should still be eternity. This is similar to how Druid handles scan over
queryDS. I'll add the examples. I agree with the other comment that we need to
recursively for the case of Unnest->Filter->Unnest->Filter. I'll update the
comment adding that a chain cannot start with a filterDS
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]