imply-cheddar commented on PR #13922:
URL: https://github.com/apache/druid/pull/13922#issuecomment-1467236981

   There's something I don't understand about this structuring of the code.  When we look at the actions taken in planning and running these queries, we get:
   
   1) SQL is parsed into a parse tree and converted to a logical DAG
   2) The logical DAG is optimized such that filters are applied to each side of the UNNEST correlate (i.e. Calcite figures out which filters apply to the unnested column (the RHS of the LogicalCorrelate with the Uncollect) and which filters apply to the base query (the LHS of the LogicalCorrelate with the Uncollect))
   3) We have rules that push all of the filters that Calcite already figured out for us back up, so that they sit above the `LogicalCorrelate`.
   4) We build a native query with the filters all meshed together
   5) The native query then has code that figures out, once again, whether some of the filters can be rewritten to run against the underlying columns
   
   It seems weird to me that we would explicitly undo the thing that Calcite 
figured out for us so that we can attempt to re-do it in the native query.
   
   I'd propose that we take a `Filter` object on the `UnnestDatasource`.  The UnnestCursor can pretty easily attempt the re-write and pushdown of that RHS filter and also attach it as a `ValueMatcher` on the read.  This also seems like a much more natural way to plan the query, no?
   
   Is there some reason that we have to throw away the work that Calcite 
already did for us only to redo it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

