geoffreyclaude opened a new issue, #15559:
URL: https://github.com/apache/datafusion/issues/15559

   ### Is your feature request related to a problem or challenge?
   
   Currently, the benchmarks folder in DataFusion does not include dedicated 
benchmarks for TopK queries (i.e., queries of the form `SELECT ... ORDER BY a 
LIMIT n`).
   
   With ongoing work to optimize these types of queries, having dedicated 
benchmarks would be valuable for measuring progress.
   
   
   ### Describe the solution you'd like
   
   There are already sorting benchmarks based on the TPC-H dataset. Since a TopK 
query is essentially a sort operation with an additional limit, we can extend 
the existing `sort_tpch` benchmarks by introducing an optional `LIMIT n` 
clause. With the limit set, each sort query becomes a TopK query, so the same 
suite would serve as a TopK benchmark.
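
A minimal sketch of the idea, not the actual `sort_tpch` code: the helper name `with_optional_limit`, its signature, and the sample query are all assumptions made for illustration. It shows how an optional limit could turn an existing sort query into a TopK query while leaving the query unchanged when no limit is given.

```rust
/// Hypothetical helper (not part of DataFusion): append an optional
/// `LIMIT n` clause to an existing sort query.
fn with_optional_limit(base_query: &str, limit: Option<usize>) -> String {
    // Drop trailing whitespace and a trailing semicolon so the clause
    // is appended to the statement itself.
    let base = base_query.trim_end().trim_end_matches(';').trim_end();
    match limit {
        // With a limit, the sort query becomes a TopK query.
        Some(n) => format!("{base} LIMIT {n}"),
        // Without a limit, the query stays a plain sort.
        None => base.to_string(),
    }
}

fn main() {
    // Illustrative query over the TPC-H `lineitem` table; it is not copied
    // from the existing benchmark.
    let sort_query =
        "SELECT l_linenumber, l_orderkey FROM lineitem ORDER BY l_linenumber";

    // Plain sort benchmark query.
    println!("{}", with_optional_limit(sort_query, None));
    // TopK variant of the same query.
    println!("{}", with_optional_limit(sort_query, Some(10)));
}
```

Keeping the limit optional means the same set of queries could serve both the sort and the TopK benchmarks, which is the point of the proposal above.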
   
   
   ### Describe alternatives you've considered
   
   _No response_
   
   ### Additional context
   
   Relevant recent issues:
   - https://github.com/apache/datafusion/issues/15037
   - https://github.com/apache/datafusion/issues/15529
   - https://github.com/apache/datafusion/issues/15538


