[GitHub] [druid] gianm commented on a change in pull request #10313: Make NUMERIC_HASHING_THRESHOLD configurable

GitBox Tue, 25 Aug 2020 10:17:24 -0700


gianm commented on a change in pull request #10313:
URL: https://github.com/apache/druid/pull/10313#discussion_r476609549




##########
File path: processing/src/main/java/org/apache/druid/query/filter/InDimFilter.java
##########
@@ -80,7 +80,8 @@
 {
   // determined through benchmark that binary search on long[] is faster than HashSet until ~16 elements
   // Hashing threshold is not applied to String for now, String still uses ImmutableSortedSet
-  public static final int NUMERIC_HASHING_THRESHOLD = 16;
+  public static final int NUMERIC_HASHING_THRESHOLD =
+      Integer.parseInt(System.getProperty("druid.query.filter.inDimFilter.numericHashingThreshold", "1"));

Review comment:
       > One of those must be wrong 😅
   
   SelectorDimFilter calls `new InDimFilter(dimension, 
Collections.singleton(value), extractionFn, filterTuning).optimize()`, so you 
get another SelectorDimFilter back 🙂
   
   There's no comment about why, but I assume it's because InDimFilter has the 
`optimizeLookup()` code.
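   
   To make the round trip concrete, here is a minimal sketch of the delegation pattern described above. The `Sketch*` types are hypothetical stand-ins, not Druid's real classes, and the lookup rewrite is only marked with a comment because its details are not shown in this thread.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical stand-ins for illustration only; not Druid's actual filter classes.
interface SketchFilter {
  SketchFilter optimize();
}

final class SketchSelectorFilter implements SketchFilter {
  final String dimension;
  final String value;

  SketchSelectorFilter(String dimension, String value) {
    this.dimension = dimension;
    this.value = value;
  }

  @Override
  public SketchFilter optimize() {
    // Delegate to the IN filter so that any rewrite logic lives in one place.
    return new SketchInFilter(dimension, Collections.singleton(value)).optimize();
  }
}

final class SketchInFilter implements SketchFilter {
  final String dimension;
  final Set<String> values;

  SketchInFilter(String dimension, Set<String> values) {
    this.dimension = dimension;
    this.values = values;
  }

  @Override
  public SketchFilter optimize() {
    // Assumption: a lookup-based rewrite (the role attributed to optimizeLookup()) would run here.
    if (values.size() == 1) {
      // A single-value IN filter collapses back to a selector.
      return new SketchSelectorFilter(dimension, values.iterator().next());
    }
    return this;
  }
}
```

   In this sketch, optimizing a selector passes through the IN filter and comes back as a new selector, which matches the behavior described above.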
   
   > I put the default in mainly because of my paranoia, just in case this 
causes a perf degradation for a specific shape of query that isn't covered by 
my benchmarks.
   
   People generally aren't going to have the patience to research each of these 
settings, so it's usually better if we do some diligence to find something that 
should work (relatively) universally. If that involves removing code paths then 
it also helps us reduce the amount of code that needs to be tested.
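   
   For reference, a rough sketch of the trade-off under discussion: binary search on a sorted `long[]` for small value sets, hashing for larger ones. The class and method names are hypothetical, the `<=` comparison is an assumption, and the default of 16 is taken from the original constant in the diff rather than from the proposed change.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongPredicate;

// Hypothetical sketch; not the actual InDimFilter implementation.
final class NumericMatcherSketch {
  // Property name is the one in the diff; the fallback of 16 is the original constant.
  static final int THRESHOLD =
      Integer.getInteger("druid.query.filter.inDimFilter.numericHashingThreshold", 16);

  static LongPredicate matcherFor(long[] values, int threshold) {
    if (values.length <= threshold) {
      // Small sets: sort a copy once, then binary-search on each lookup.
      long[] sorted = values.clone();
      Arrays.sort(sorted);
      return v -> Arrays.binarySearch(sorted, v) >= 0;
    }
    // Larger sets: pay the boxing and hashing cost for constant-time lookups.
    Set<Long> hashed = new HashSet<>();
    for (long v : values) {
      hashed.add(v);
    }
    return v -> hashed.contains(v);
  }
}
```

   Both branches return the same answers, so a fixed threshold changes only performance, which is why a single well-chosen value can replace a per-deployment setting.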






