gargvishesh commented on PR #15688: URL: https://github.com/apache/druid/pull/15688#issuecomment-1903316530
More than one additional clause, such as multiple EQUALS followed by IN, or a JOIN and an EQUALS followed by IN, doesn't seem to add much overhead beyond the first addition. I think this is primarily because of the multiple cost analyses in [CombineAndSimplifyBounds](https://github.com/apache/druid/blob/9d4e8053a4a84e6d106afdf8b27e62d2f396ab02/sql/src/main/java/org/apache/druid/sql/calcite/filtration/CombineAndSimplifyBounds.java#L70). As a result, the following combinations end up getting range simplified:

* IN with OR
* !IN with AND
* EQUALS OR (IN WITH OR)
* !EQUALS AND (!IN WITH AND)

Also, for larger IN lists there is roughly an order of magnitude difference in planning time between a query planned with `inSubQueryThreshold` set to 0, which enforces a JOIN, and one planned with it set to the highest value, which retains the IN filter. At 10 expressions the IN filter is slightly faster.

```
Benchmark                                  (inClauseExprCount)  (inSubQueryThreshold)  (rowsPerSegment)  (schema)  Mode  Cnt       Score      Error  Units
InPlanningBenchmark.queryJoinEqualOrInSql                   10                      0            500000      auto  avgt    5       1.368 ±    0.183  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                   10             2147483647            500000      auto  avgt    5       1.006 ±    0.020  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                  100                      0            500000      auto  avgt    5       1.929 ±    0.018  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                  100             2147483647            500000      auto  avgt    5       5.254 ±    0.249  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                 1000                      0            500000      auto  avgt    5       7.852 ±    0.414  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                 1000             2147483647            500000      auto  avgt    5      57.873 ±    2.162  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                10000                      0            500000      auto  avgt    5      59.861 ±    2.709  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql                10000             2147483647            500000      auto  avgt    5     763.333 ±    4.939  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql               100000                      0            500000      auto  avgt    5     805.880 ±   65.558  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql               100000             2147483647            500000      auto  avgt    5    9573.531 ±  971.719  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql              1000000                      0            500000      auto  avgt    5    7489.099 ± 1517.078  ms/op
InPlanningBenchmark.queryJoinEqualOrInSql              1000000             2147483647            500000      auto  avgt    5  155358.810 ± 5682.592  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                  10                      0            500000      auto  avgt    5       1.288 ±    0.017  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                  10             2147483647            500000      auto  avgt    5       0.863 ±    0.011  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                 100                      0            500000      auto  avgt    5       1.827 ±    0.029  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                 100             2147483647            500000      auto  avgt    5       5.418 ±    0.050  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                1000                      0            500000      auto  avgt    5       7.147 ±    0.235  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql                1000             2147483647            500000      auto  avgt    5      58.862 ±    0.472  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql               10000                      0            500000      auto  avgt    5      57.851 ±    1.897  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql               10000             2147483647            500000      auto  avgt    5     756.667 ±   20.065  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql              100000                      0            500000      auto  avgt    5     781.156 ±   47.021  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql              100000             2147483647            500000      auto  avgt    5   10453.902 ±  367.649  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql             1000000                      0            500000      auto  avgt    5    7259.413 ±  584.843  ms/op
InPlanningBenchmark.queryMultiEqualOrInSql             1000000             2147483647            500000      auto  avgt    5  153890.808 ± 5516.659  ms/op
```
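
To make the JOIN vs IN-filter gap concrete, here is a small sketch that computes the ratio of the two planning times for each `inClauseExprCount`, using the `queryJoinEqualOrInSql` scores reported above. The dictionary name and helper function are made up for illustration and are not part of the benchmark or of Druid.

```python
# Planning times in ms/op, copied from the queryJoinEqualOrInSql rows above.
# Key: inClauseExprCount.
# Value: (score with inSubQueryThreshold=0, i.e. JOIN,
#         score with inSubQueryThreshold=2147483647, i.e. IN filter).
# The names below are hypothetical; they exist only for this sketch.
JOIN_EQUAL_OR_IN = {
    10: (1.368, 1.006),
    100: (1.929, 5.254),
    1_000: (7.852, 57.873),
    10_000: (59.861, 763.333),
    100_000: (805.880, 9573.531),
    1_000_000: (7489.099, 155358.810),
}


def in_over_join_ratio(scores):
    """Return {expr_count: IN-filter time / JOIN time} for each row pair."""
    return {count: in_ms / join_ms for count, (join_ms, in_ms) in scores.items()}


if __name__ == "__main__":
    for count, ratio in in_over_join_ratio(JOIN_EQUAL_OR_IN).items():
        print(f"{count:>9} expressions: IN filter takes {ratio:.1f}x the JOIN planning time")
```

The ratio is below 1 at 10 expressions and grows with the size of the IN list, which is where the order-of-magnitude difference shows up. This ignores the reported error margins, so the smallest rows in particular should be read loosely.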
