somu-imply commented on code in PR #15020:
URL: https://github.com/apache/druid/pull/15020#discussion_r1339336869


##########
sql/src/main/java/org/apache/druid/sql/calcite/rel/DruidQuery.java:
##########
@@ -786,8 +789,44 @@ static Pair<DataSource, Filtration> getFiltration(
       JoinableFactoryWrapper joinableFactoryWrapper
   )
   {
-    if (!canUseIntervalFiltering(dataSource)) {
+    if (dataSource instanceof UnnestDataSource) {
+      // UnnestDataSource can have another unnest data source
+      // join datasource, filtered data source, etc as base
+      Pair<DataSource, Filtration> pair = getFiltration(
+          ((UnnestDataSource) dataSource).getBase(),
+          filter,
+          virtualColumnRegistry,
+          joinableFactoryWrapper
+      );
+      return Pair.of(dataSource, pair.rhs);
+    } else if (!canUseIntervalFiltering(dataSource)) {
       return Pair.of(dataSource, toFiltration(filter, 
virtualColumnRegistry.getFullRowSignature(), false));
+    } else if (dataSource instanceof FilteredDataSource) {
+      // A filteredDS is created only inside the rel for Unnest, ensuring it 
only grabs the outermost filter
+      // and, if possible, pushes it down inside the data source.
+      // So a chain of Filter->Unnest->Filter is typically impossible when the 
query is done through SQL.
+      // Also, Calcite has filter reduction rules that push filters deep into 
base data sources for better query planning.
+      // Hence, the case that can be seen is a bunch of unnests followed by a 
terminal filteredDS like Unnest->Unnest->FilteredDS.
+      // A base table with a chain of filters is synonymous with a filteredDS.
+      // In case there are filters present in the getFiltration call we still 
update the interval by:
+      // 1) creating a filtration from the filteredDS's filter and
+      // 2) Updating the interval of the outer filter with the intervals in 
step 1, and you'll see these 2 calls in the code
+      final FilteredDataSource filteredDataSource = (FilteredDataSource) 
dataSource;
+      // Defensive check as in the base of a filter cannot be another filter
+      final DataSource baseOfFilterDataSource = filteredDataSource.getBase();
+      if (baseOfFilterDataSource instanceof FilteredDataSource) {
+        throw DruidException.defensive("Cannot create a filteredDataSource 
using another filteredDataSource as a base");
+      }

Review Comment:
   The QueryDS would be handled separately where it would find its own UnnestDS 
and FilteredDS and set the interval for the inner subquery. The outer query 
should still be eternity. This is similar to how Druid handles scan over 
queryDS. I'll add the examples. I agree with the other comment that we need to 
recursively for the case of Unnest->Filter->Unnest->Filter. I'll update the 
comment adding that a chain cannot start with a filterDS as a filteredDS is 
only created as the base of an UnnestDS



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to