gianm opened a new pull request, #15832:
URL: https://github.com/apache/druid/pull/15832

   If many keys map to the same value, reversing a LOOKUP call can become 
unacceptably slow. To protect against this, this patch introduces a 
parameter, `sqlReverseLookupThreshold`, that sets the maximum size of an IN 
filter created as part of lookup reversal.
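
As a rough illustration of the idea (not the code in this patch), the sketch below reverses a lookup held in a plain `Map` and gives up once the number of matching keys would exceed the threshold. The class name `ReverseLookupSketch`, the `reverse` helper, and the sample data are hypothetical and are not Druid APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical sketch; names and data are illustrative, not taken from Druid.
class ReverseLookupSketch {
    /**
     * Collects the keys that map to {@code value}. Returns Optional.empty()
     * as soon as more than {@code threshold} keys match, signalling that the
     * caller should not build an IN filter from the reversed lookup.
     */
    static Optional<List<String>> reverse(Map<String, String> lookup, String value, int threshold) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> entry : lookup.entrySet()) {
            if (value.equals(entry.getValue())) {
                if (keys.size() == threshold) {
                    // One more match would push the IN filter past the limit.
                    return Optional.empty();
                }
                keys.add(entry.getKey());
            }
        }
        return Optional.of(keys);
    }

    public static void main(String[] args) {
        // TreeMap keeps keys sorted so the printed order is deterministic.
        Map<String, String> lookup = new TreeMap<>();
        lookup.put("ca", "North America");
        lookup.put("mx", "North America");
        lookup.put("us", "North America");
        lookup.put("fr", "Europe");

        // Three keys match; a threshold of 10 allows the reversal.
        System.out.println(reverse(lookup, "North America", 10)); // Optional[[ca, mx, us]]
        // A threshold of 2 is exceeded, so the reversal is skipped.
        System.out.println(reverse(lookup, "North America", 2));  // Optional.empty
    }
}
```

Stopping at the first key past the limit means the scan never collects more than `threshold` keys, which is the point of having a cap.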
   
   If `inSubQueryThreshold` is set to a smaller value than 
`sqlReverseLookupThreshold`, then `inSubQueryThreshold` is used instead. This 
lets users control IN filter sizes with that single parameter if they wish.
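
The rule above amounts to taking the smaller of the two settings. A minimal sketch, assuming both values are plain integers; the class and method names are hypothetical, and the `inSubQueryThreshold` values used here are illustrative only:

```java
// Hypothetical sketch of the rule described above; not Druid's actual code.
class ReverseLookupThresholds {
    /** The limit on reversed-lookup IN filters is the smaller of the two settings. */
    static int effectiveThreshold(int sqlReverseLookupThreshold, int inSubQueryThreshold) {
        return Math.min(sqlReverseLookupThreshold, inSubQueryThreshold);
    }

    public static void main(String[] args) {
        // inSubQueryThreshold is smaller, so it wins.
        System.out.println(effectiveThreshold(10000, 500));   // 500
        // inSubQueryThreshold is larger, so sqlReverseLookupThreshold applies.
        System.out.println(effectiveThreshold(10000, 50000)); // 10000
    }
}
```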
   
   Benchmarks follow. I chose `10000` as the default for 
`sqlReverseLookupThreshold` since it keeps planning time under 1 second in 
these benchmarks (about 0.56 to 0.78 s at 10,000 keys per value, versus 8.4 
to 9.0 s at 100,000). Future work to speed up IN filters could allow us to 
raise the default threshold.
   
   ```
   Benchmark                                (keysPerValue)  (lookupType)  (numKeys)  Mode  Cnt     Score     Error  Units
   SqlReverseLookupBenchmark.planEquals               1000       hashmap    5000000  avgt    5   163.002 ±   4.228  ms/op
   SqlReverseLookupBenchmark.planEquals               1000     immutable    5000000  avgt    5    43.095 ±   2.864  ms/op
   SqlReverseLookupBenchmark.planEquals              10000       hashmap    5000000  avgt    5   734.592 ±  34.374  ms/op
   SqlReverseLookupBenchmark.planEquals              10000     immutable    5000000  avgt    5   555.980 ±  49.903  ms/op
   SqlReverseLookupBenchmark.planEquals             100000       hashmap    5000000  avgt    5  8545.459 ± 108.931  ms/op
   SqlReverseLookupBenchmark.planEquals             100000     immutable    5000000  avgt    5  8415.105 ± 116.926  ms/op
   SqlReverseLookupBenchmark.planNotEquals            1000       hashmap    5000000  avgt    5   257.995 ±   5.576  ms/op
   SqlReverseLookupBenchmark.planNotEquals            1000     immutable    5000000  avgt    5    41.088 ±   1.582  ms/op
   SqlReverseLookupBenchmark.planNotEquals           10000       hashmap    5000000  avgt    5   776.826 ±   8.265  ms/op
   SqlReverseLookupBenchmark.planNotEquals           10000     immutable    5000000  avgt    5   583.022 ±  19.766  ms/op
   SqlReverseLookupBenchmark.planNotEquals          100000       hashmap    5000000  avgt    5  9019.350 ± 144.835  ms/op
   SqlReverseLookupBenchmark.planNotEquals          100000     immutable    5000000  avgt    5  8754.859 ± 429.341  ms/op
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

