sascha-coenen commented on issue #9321: Performance degradation in topN queries 
when SQL-compatible null handling is enabled
URL: https://github.com/apache/druid/issues/9321#issuecomment-591636541
 
 
   > Since it's this low level, I'd also question the reality of the scenario. 
It might be less visible if the performance tests were running a mixture of 
queries in both runs
   > My guess at this point is this an edge case, and I don't think we'd see a 
huge difference if this were a mixed load scenario
   
   Let me explain again; perhaps I can be clearer:
   Once a Druid historical has executed a single groupByV2 query for the first 
time, every later topN query runs much slower than before and stays at that 
degraded level forever. Before any groupByV2 query has been executed, the topN 
queries are consistently fast and remain so forever. Both are steady states. 
Within a mixed workload, the topN queries show the same degradation. It is not 
a dynamic, temporary effect.
   
   If you look at the segment scan time graph, you see that scan times for topN 
queries are very consistent both before and after the single execution of a 
groupBy query; they are just much higher afterwards. 
   The graph would continue in this static state forever. 
   Also, the topN performance prior to executing the first groupBy query is the 
same as in a Druid cluster configured without SQL-compatible null handling.
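   
   To make the difference between a steady-state shift and a temporary effect 
concrete, here is a minimal sketch in Python. It is not part of Druid: the 
function name `classify_shift`, its `window` and `tolerance` parameters, and 
the sample numbers are all hypothetical and only mimic the shape of the scan 
time graph, they are not measurements.

```python
from statistics import mean


def classify_shift(scan_ms, event_index, window=5, tolerance=0.10):
    """Classify how scan times behave around a one-off event.

    scan_ms      -- per-query segment scan times in milliseconds, in order
    event_index  -- index of the first sample taken after the event
                    (here: after the first groupBy query was executed)
    window       -- number of samples averaged at each end (assumed value)
    tolerance    -- relative slack above the baseline that still counts
                    as "unchanged" (assumed value)
    """
    before = scan_ms[:event_index]
    after = scan_ms[event_index:]
    if len(before) < window or len(after) < 2 * window:
        raise ValueError("not enough samples around the event")

    baseline = mean(before[-window:])   # level just before the event
    early = mean(after[:window])        # level right after the event
    late = mean(after[-window:])        # level long after the event
    limit = baseline * (1 + tolerance)

    if early <= limit:
        return "no change"
    if late <= limit:
        return "transient"              # slowed down, then recovered
    return "steady-state shift"         # slowed down and stayed slow


# Hypothetical topN scan times (ms); the groupBy query runs after sample 6.
fast = [20, 21, 19, 20, 20, 21]
stays_slow = fast + [48, 50, 49, 51, 50, 49, 50, 48, 51, 50]
recovers = fast + [48, 40, 30, 25, 21, 20, 20, 21, 19, 20]

print(classify_shift(stays_slow, event_index=6))
print(classify_shift(recovers, event_index=6))
```

   What I am describing corresponds to the first series: the level after the 
event never returns to the baseline, no matter how long you keep measuring.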
   
   So this is not an edge case by any means: if I were to switch on 
SQL-compatible null handling in our production cluster, which runs mixed 
workloads, then the moment a groupBy query was executed for the first time, 
topN query performance would drop considerably and not recover.
