[GitHub] [druid] jon-wei commented on a change in pull request #10056: Ensure that join filter pre-analysis operates on optimized filters

GitBox Mon, 29 Jun 2020 19:59:50 -0700


jon-wei commented on a change in pull request #10056:
URL: https://github.com/apache/druid/pull/10056#discussion_r447373185




##########
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##########
@@ -107,8 +107,17 @@ public static boolean isPrefixedBy(final String 
columnName, final String prefix)
             );
 
             for (Query joinQuery : joinQueryLevels) {
+              // The pre-analysis needs to apply to the optimized form of 
filters, as this is what will be
+              // passed to HashJoinSegmentAdapter.makeCursors().
+              // The optimize() call here means that the filter optimization 
will happen twice,
+              // since the query toolchests will call optimize() later.
+              // We do this for simplicity as we cannot override what query 
will get run later from this context.
+              // A more complicated approach involving wrapping the query 
runner after the pre-merge decoration
+              // and moving the pre-analysis might be viable.
+              // The additional overhead of the simple approach should be low, 
as the additional optimize() call
+              // on each filter will only occur once per query.
               preAnalysisGroup.computeJoinFilterPreAnalysisIfAbsent(
-                  joinQuery.getFilter() == null ? null : 
joinQuery.getFilter().toFilter(),
+                  joinQuery.getFilter() == null ? null : 
joinQuery.getFilter().optimize().toFilter(),

Review comment:
       I tried to move this test into a new test class in processing, but I 
found that there's quite a lot of test machinery that has to be created (around 
making a SegmentWalker). The existing test infrastructure for that all exists 
outside of the processing module.
   
   I don't think that's worth doing at this point for a single test with pretty 
special requirements.

##########
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##########
@@ -107,8 +107,17 @@ public static boolean isPrefixedBy(final String 
columnName, final String prefix)
             );
 
             for (Query joinQuery : joinQueryLevels) {
+              // The pre-analysis needs to apply to the optimized form of 
filters, as this is what will be
+              // passed to HashJoinSegmentAdapter.makeCursors().
+              // The optimize() call here means that the filter optimization 
will happen twice,
+              // since the query toolchests will call optimize() later.
+              // We do this for simplicity as we cannot override what query 
will get run later from this context.
+              // A more complicated approach involving wrapping the query 
runner after the pre-merge decoration
+              // and moving the pre-analysis might be viable.
+              // The additional overhead of the simple approach should be low, 
as the additional optimize() call
+              // on each filter will only occur once per query.
               preAnalysisGroup.computeJoinFilterPreAnalysisIfAbsent(
-                  joinQuery.getFilter() == null ? null : 
joinQuery.getFilter().toFilter(),
+                  joinQuery.getFilter() == null ? null : 
joinQuery.getFilter().optimize().toFilter(),

Review comment:
       I tried to move this test into a new test class in processing, but I 
found that there's quite a lot of test machinery that has to be created (around 
making a SegmentWalker). The existing test infrastructure for that all exists 
outside of the processing module, so it cannot be reused.
   
   I don't think that's worth doing at this point for a single test with pretty 
special requirements.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [druid] jon-wei commented on a change in pull request #10056: Ensure that join filter pre-analysis operates on optimized filters

Reply via email to