gianm commented on a change in pull request #10313:
URL: https://github.com/apache/druid/pull/10313#discussion_r476609549
##########
File path:
processing/src/main/java/org/apache/druid/query/filter/InDimFilter.java
##########
@@ -80,7 +80,8 @@
{
   // determined through benchmark that binary search on long[] is faster than HashSet until ~16 elements
   // Hashing threshold is not applied to String for now, String still uses ImmutableSortedSet
- public static final int NUMERIC_HASHING_THRESHOLD = 16;
+  public static final int NUMERIC_HASHING_THRESHOLD =
+      Integer.parseInt(System.getProperty("druid.query.filter.inDimFilter.numericHashingThreshold", "1"));
Review comment:
> One of those must be wrong 😅

SelectorDimFilter calls `new InDimFilter(dimension, Collections.singleton(value), extractionFn, filterTuning).optimize()`, so you get another SelectorDimFilter back 🙂

There's no comment about why, but I assume it's because InDimFilter has the `optimizeLookup()` code.
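
To make the round trip concrete, here is a minimal sketch of the delegation pattern described above. The classes `SelectorSketch` and `InSketch` are hypothetical stand-ins, not Druid's real `SelectorDimFilter` and `InDimFilter`; the extraction function, filter tuning, and any lookup rewriting are left out, and the single-value collapse is an assumption about how the IN filter's `optimize()` behaves.

```java
import java.util.Collections;
import java.util.Set;

// Simplified stand-ins for illustration only; these are NOT Druid's real classes.
public class OptimizeRoundTripSketch
{
  interface Filter
  {
    Filter optimize();
  }

  static final class SelectorSketch implements Filter
  {
    final String dimension;
    final String value;

    SelectorSketch(String dimension, String value)
    {
      this.dimension = dimension;
      this.value = value;
    }

    @Override
    public Filter optimize()
    {
      // Delegate to the IN filter so that shared rewrite logic lives in one place.
      return new InSketch(dimension, Collections.singleton(value)).optimize();
    }
  }

  static final class InSketch implements Filter
  {
    final String dimension;
    final Set<String> values;

    InSketch(String dimension, Set<String> values)
    {
      this.dimension = dimension;
      this.values = values;
    }

    @Override
    public Filter optimize()
    {
      // Assumption: a single-value IN is equivalent to a selector, so collapse it.
      if (values.size() == 1) {
        return new SelectorSketch(dimension, values.iterator().next());
      }
      return this;
    }
  }

  public static void main(String[] args)
  {
    Filter optimized = new SelectorSketch("country", "US").optimize();
    System.out.println(optimized.getClass().getSimpleName()); // prints "SelectorSketch"
  }
}
```

In this sketch the selector ends up as a fresh selector after passing through the IN filter, which mirrors the "you get another SelectorDimFilter back" observation.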

> I put the default in mainly because of my paranoia, just in case this causes a perf degradation for a specific shape of query that isn't covered by my benchmarks.

People generally aren't going to have the patience to research each of these settings, so it's usually better if we do some diligence to find something that should work (relatively) universally. If that involves removing code paths then it also helps us reduce the amount of code that needs to be tested.
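
For reference, the two code paths that the threshold in the diff selects between can be pictured roughly as follows. This is a sketch under assumptions: only the property name and its default of `"1"` come from the diff; the class `NumericThresholdSketch`, the method `matcherFor`, and the direction of the comparison (binary search at or below the threshold, hashing above it) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.LongPredicate;

// Hypothetical illustration; not the actual InDimFilter implementation.
public class NumericThresholdSketch
{
  // Property name and default are taken from the diff; the rest is assumed.
  static final int NUMERIC_HASHING_THRESHOLD = Integer.parseInt(
      System.getProperty("druid.query.filter.inDimFilter.numericHashingThreshold", "1")
  );

  static LongPredicate matcherFor(long[] values, int threshold)
  {
    if (values.length <= threshold) {
      // Small sets: binary search over a sorted primitive array.
      final long[] sorted = values.clone();
      Arrays.sort(sorted);
      return v -> Arrays.binarySearch(sorted, v) >= 0;
    }
    // Larger sets: hash lookup over boxed values.
    final Set<Long> hashed = new HashSet<>();
    for (long v : values) {
      hashed.add(v);
    }
    return v -> hashed.contains(v);
  }

  public static void main(String[] args)
  {
    LongPredicate matcher = matcherFor(new long[]{5L, 1L, 3L}, NUMERIC_HASHING_THRESHOLD);
    System.out.println(matcher.test(3L)); // prints "true"
  }
}
```

Both branches give the same answers; only the lookup cost differs, which is why a single well-chosen constant could replace the configurable property and leave one path to test.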