jon-wei commented on a change in pull request #10056:
URL: https://github.com/apache/druid/pull/10056#discussion_r447373185
##########
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##########
@@ -107,8 +107,17 @@ public static boolean isPrefixedBy(final String
columnName, final String prefix)
);
for (Query joinQuery : joinQueryLevels) {
+ // The pre-analysis needs to apply to the optimized form of
filters, as this is what will be
+ // passed to HashJoinSegmentAdapter.makeCursors().
+ // The optimize() call here means that the filter optimization
will happen twice,
+ // since the query toolchests will call optimize() later.
+ // We do this for simplicity as we cannot override what query
will get run later from this context.
+ // A more complicated approach involving wrapping the query
runner after the pre-merge decoration
+ // and moving the pre-analysis might be viable.
+ // The additional overhead of the simple approach should be low,
as the additional optimize() call
+ // on each filter will only occur once per query.
preAnalysisGroup.computeJoinFilterPreAnalysisIfAbsent(
- joinQuery.getFilter() == null ? null :
joinQuery.getFilter().toFilter(),
+ joinQuery.getFilter() == null ? null :
joinQuery.getFilter().optimize().toFilter(),
Review comment:
I tried to move this test into a new test class in processing, but I
found that there's quite a lot of test machinery that has to be created (around
making a SegmentWalker). The existing test infrastructure for that all exists
outside of the processing module.
I don't think that's worth doing at this point for a single test with pretty
special requirements.
##########
File path: processing/src/main/java/org/apache/druid/segment/join/Joinables.java
##########
@@ -107,8 +107,17 @@ public static boolean isPrefixedBy(final String
columnName, final String prefix)
);
for (Query joinQuery : joinQueryLevels) {
+ // The pre-analysis needs to apply to the optimized form of
filters, as this is what will be
+ // passed to HashJoinSegmentAdapter.makeCursors().
+ // The optimize() call here means that the filter optimization
will happen twice,
+ // since the query toolchests will call optimize() later.
+ // We do this for simplicity as we cannot override what query
will get run later from this context.
+ // A more complicated approach involving wrapping the query
runner after the pre-merge decoration
+ // and moving the pre-analysis might be viable.
+ // The additional overhead of the simple approach should be low,
as the additional optimize() call
+ // on each filter will only occur once per query.
preAnalysisGroup.computeJoinFilterPreAnalysisIfAbsent(
- joinQuery.getFilter() == null ? null :
joinQuery.getFilter().toFilter(),
+ joinQuery.getFilter() == null ? null :
joinQuery.getFilter().optimize().toFilter(),
Review comment:
I tried to move this test into a new test class in processing, but I
found that there's quite a lot of test machinery that has to be created (around
making a SegmentWalker). The existing test infrastructure for that all exists
outside of the processing module, so it cannot be reused.
I don't think that's worth doing at this point for a single test with pretty
special requirements.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]