suneet-s commented on pull request #10316:
URL: https://github.com/apache/druid/pull/10316#issuecomment-682000989


   > ```
   > diff --git 
a/processing/src/main/java/org/apache/druid/segment/join/filter/JoinFilterAnalyzer.java
 
b/processing/src/main/java/org/apache/druid/segment/join/filter/JoinFilterAnalyzer.java
   > index 67b1ee0b76..7da31f3811 100644
   > --- 
a/processing/src/main/java/org/apache/druid/segment/join/filter/JoinFilterAnalyzer.java
   > +++ 
b/processing/src/main/java/org/apache/druid/segment/join/filter/JoinFilterAnalyzer.java
   > @@ -435,6 +435,11 @@ public class JoinFilterAnalyzer
   >            );
   >          }
   >  
   > +        // Wrap filter values in immutable set so that classes consuming 
this set do not compute the hash code
   > +        // again and again. We are creating multiple InDimFilter filters 
down for the same set of filter values.
   > +        // Calculating hashCode will be less expensive for these 
InDimFilter as hashCode for filter values is pre-computed
   > +        newFilterValues = ImmutableSet.copyOf(newFilterValues);
   > +
   >          for (String correlatedBaseColumn : 
correlationAnalysis.getBaseColumns()) {
   >            Filter rewrittenFilter = new InDimFilter(
   >                correlatedBaseColumn,
   > ```
   > 
   > This is what I had in mind.
   
   I initially didn't want to use an ImmutableSet because 
`ImmutableSet.copyOf(...)` would have to traverse through the entire set, and 
there may be cases where an InDimFilter doesn't need to traverse the entire 
set. However, this comment made me think that maybe the best way to do this is 
to have the correlatedValuesMap use an `ImmutableSet.Builder` instead. This way 
the hashCode is memoized as the set is being constructed! I might leave the 
other code paths as is so that this change only really impacts join clauses. 
What do you think @abhishekagarwal87 ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to