[GitHub] [druid] somu-imply commented on a diff in pull request #13173: Fix row schema of nested queries to get proper row type during planning validation

GitBox Wed, 12 Oct 2022 09:52:39 -0700


somu-imply commented on code in PR #13173:
URL: https://github.com/apache/druid/pull/13173#discussion_r993698890



##########
benchmarks/src/test/java/org/apache/druid/benchmark/query/SqlBenchmark.java:
##########
@@ -392,7 +392,23 @@
       // 20: GroupBy, doubles sketches
       "SELECT dimZipf, APPROX_QUANTILE_DS(sumFloatNormal, 0.5), DS_QUANTILES_SKETCH(maxLongUniform) "
       + "FROM foo "
-      + "GROUP BY 1"
+      + "GROUP BY 1",
+
+      //21: Order by with alias with large in filter
+      "SELECT __time as t, dimSequential from foo "
+      + " where (dimSequential in (select DISTINCT dimSequential from foo)) "
+      + " order by 1 limit 1",
+
+      //22: Order by without alias with large in filter
+      "SELECT __time, dimSequential from foo "
+      + " where (dimSequential in (select DISTINCT dimSequential from foo)) "
+      + " order by 1 limit 1",
+
+      //23: Group by and Order by with alias with large in filter nested query
+      "SELECT __time as t, dimSequential from foo "
+      + " where dimSequential in (select dimSequential from foo where "
+      + " dimSequential in (select dimSequential from foo)) "

Review Comment:
   Thanks @abhishekagarwal87. I have updated the benchmark tests by adding 2 queries whose IN filters contain ~1000 values.
   
   Before (using 0.22.0 as baseline)
   ```
   Benchmark             (query)  (rowsPerSegment)  (vectorize)  Mode  Cnt    Score    Error  Units
   SqlBenchmark.planSql       24           5000000        false  avgt    5  189.154 ± 36.029  ms/op
   SqlBenchmark.planSql       25           5000000        false  avgt    5  135.821 ± 22.867  ms/op
   ```
   
   After (this PR build)
   ```
   Benchmark             (query)  (rowsPerSegment)  (storageType)  (vectorize)  Mode  Cnt    Score    Error  Units
   SqlBenchmark.planSql       24           5000000           mmap        false  avgt    5  229.036 ± 49.337  ms/op
   SqlBenchmark.planSql       25           5000000           mmap        false  avgt    5  145.159 ± 28.671  ms/op
   ```
   Planning time increased by ~40 ms for the query with ~1000 IN filter values (query 24), whereas for the other, nested query (query 25) the increase is ~10 ms. Please let me know whether a change of this size is acceptable.
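
   For reference, the deltas can be recomputed from the two tables. This is only a minimal sketch: the class and method names are hypothetical (not part of Druid or JMH), and it assumes the Error column can be treated as a half-width around the Score when comparing the two runs.

   ```java
   // Hypothetical helper, not part of the PR: recomputes the planSql deltas
   // from the Score/Error values reported in the tables.
   public class PlanSqlDelta {

       // Difference in ms/op between the "after" and "before" scores.
       public static double delta(double before, double after) {
           return after - before;
       }

       // Relative change in percent, with "before" as the baseline.
       public static double percentChange(double before, double after) {
           return 100.0 * (after - before) / before;
       }

       // Assumption: each result is read as the interval [score - error, score + error].
       // Returns true when the two intervals overlap.
       public static boolean intervalsOverlap(double s1, double e1, double s2, double e2) {
           return (s1 - e1) <= (s2 + e2) && (s2 - e2) <= (s1 + e1);
       }

       public static void main(String[] args) {
           // Values copied from the Before/After tables (ms/op).
           double[][] rows = {
               // query, beforeScore, beforeError, afterScore, afterError
               {24, 189.154, 36.029, 229.036, 49.337},
               {25, 135.821, 22.867, 145.159, 28.671},
           };
           for (double[] r : rows) {
               System.out.printf(
                   "query %d: delta=%.3f ms/op (%.1f%%), intervals overlap=%b%n",
                   (int) r[0],
                   delta(r[1], r[3]),
                   percentChange(r[1], r[3]),
                   intervalsOverlap(r[1], r[2], r[3], r[4]));
           }
       }
   }
   ```

   With only 5 iterations per run, an overlap check like this may help judge whether the observed difference is larger than the run-to-run noise.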



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


