somandal opened a new pull request, #8758:
URL: https://github.com/apache/pinot/pull/8758

   Today, aggregate group-by queries without order-by can be inaccurate and 
non-deterministic. The results are truncated at multiple stages (segment level 
and server level), and depending on the order in which the rows are processed, 
such a query can return very different results when the limit on the number of 
results to be returned is smaller than the total number of rows matching the 
query.
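
   As a rough illustration of the order dependence described above, here is a 
toy model in Python. It is not Pinot code: `group_by_no_order` and its "keep 
the first `limit` groups seen" policy are assumptions made purely for 
illustration of what truncation without an ordering criterion can do.

```python
def group_by_no_order(rows, limit):
    """Toy model: keep only the first `limit` distinct groups encountered."""
    groups = {}
    for key, value in rows:
        if key in groups:
            groups[key] += value      # group already tracked: aggregate
        elif len(groups) < limit:
            groups[key] = value       # room left: start tracking this group
        # otherwise the group is silently dropped (truncation)
    return groups


rows = [("a", 1), ("b", 2), ("c", 3), ("a", 4)]

# Same data, two processing orders, two different answers.
result_forward = group_by_no_order(rows, limit=2)
result_reversed = group_by_no_order(list(reversed(rows)), limit=2)

print(result_forward)   # {'a': 5, 'b': 2}
print(result_reversed)  # {'a': 5, 'c': 3}
```

   In this model the same effect would compound if the truncation were 
repeated at a later merge stage.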
   
   Aggregate group-by with order-by, on the other hand, has to keep track of 
the top K results according to the ordering criteria, so its results are more 
accurate and deterministic.
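
   A minimal sketch of why an ordering criterion helps, again as a toy model 
rather than Pinot's implementation. The names `segment_top_k` and 
`merge_top_k` are hypothetical, and the sketch assumes the ordering is on the 
group-by key itself: each "segment" trims to its top K keys, and the merged 
result is trimmed to the top K again.

```python
def segment_top_k(rows, k):
    """Aggregate one segment, then keep the k smallest group keys."""
    totals = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0) + value
    return sorted(totals.items())[:k]


def merge_top_k(segment_results, k):
    """Merge per-segment results, then keep the k smallest group keys."""
    merged = {}
    for result in segment_results:
        for key, value in result:
            merged[key] = merged.get(key, 0) + value
    return sorted(merged.items())[:k]


segment_1 = [("c", 3), ("a", 1), ("b", 2)]
segment_2 = [("d", 7), ("a", 4), ("c", 1)]

# Process segments and rows in two different orders.
forward = merge_top_k(
    [segment_top_k(segment_1, 2), segment_top_k(segment_2, 2)], 2)
shuffled = merge_top_k(
    [segment_top_k(segment_2[::-1], 2), segment_top_k(segment_1[::-1], 2)], 2)

print(forward)   # [('a', 5), ('b', 2)]
print(shuffled)  # [('a', 5), ('b', 2)]
```

   In this model a key that belongs to the global top K is also within the top 
K of every segment it appears in, so none of its partial aggregates are 
trimmed away and the processing order does not change the answer.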
   
   This PR adds a new query rewriter that rewrites aggregate group-by-only 
queries to include an order-by on the group-by expressions. The rewriter is 
not part of the default list of query rewriters, but it can be enabled via the 
broker-side config.
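
   The rewrite itself can be pictured with a small Python sketch. Everything 
here is hypothetical: the `Query` dataclass and `add_order_by_for_group_by` are 
stand-ins for illustration and do not mirror the classes or method names in 
this PR.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class Query:
    """Hypothetical, simplified stand-in for a parsed query."""
    select: List[str]
    group_by: List[str] = field(default_factory=list)
    order_by: List[str] = field(default_factory=list)
    has_aggregation: bool = False


def add_order_by_for_group_by(query: Query) -> Query:
    """Add ORDER BY on the group-by expressions when none is present."""
    if query.has_aggregation and query.group_by and not query.order_by:
        return replace(query, order_by=list(query.group_by))
    # Non-aggregation queries and queries with an explicit ORDER BY
    # are returned unchanged.
    return query


group_by_only = Query(select=["city", "SUM(sales)"],
                      group_by=["city"], has_aggregation=True)
already_ordered = Query(select=["city", "SUM(sales)"], group_by=["city"],
                        order_by=["SUM(sales) DESC"], has_aggregation=True)

rewritten = add_order_by_for_group_by(group_by_only)
untouched = add_order_by_for_group_by(already_ordered)

print(rewritten.order_by)  # ['city']
print(untouched.order_by)  # ['SUM(sales) DESC']
```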
   
   Rewriting aggregate group-by-only queries to include order-by can cause a 
performance hit compared to the plain group-by-only path, because the results 
need to be sorted and, for queries that include order-by, some processing is 
done under a lock to trim the data structure.
   
   cc @siddharthteotia 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

