alamb commented on issue #12075:
URL: https://github.com/apache/datafusion/issues/12075#issuecomment-2308848278

   > As I'm thinking about it, I'm not sure you can get around doing the sort 
since you have an arbitrary number of ordering clauses. I think what you've 
proposed is the best option. One _could_ speed up the operation when there's 
only a single order clause but that may not be worth the effort involved.
   
   FWIW this is what the `first_selector` / `last_selector` functions do in 
InfluxDB -- they keep only the value for the smallest (or largest) ORDER BY 
column value seen so far, rather than actually sorting the entire intermediate 
result.
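
To illustrate the idea, here is a minimal sketch of such a single-pass selector. It is not the InfluxDB implementation: the `LastSelector` type, its method names, and the `i64` key / `f64` value types are all assumptions made for the example.

```rust
/// Hypothetical accumulator: remembers only the value paired with the
/// largest ordering key seen so far, instead of buffering and sorting rows.
#[derive(Debug, Default)]
pub struct LastSelector {
    /// (order key, value) of the current winner, if any row has been seen.
    best: Option<(i64, f64)>,
}

impl LastSelector {
    /// Feed one row. The stored pair is replaced only when the new key is
    /// strictly larger, so on ties the first row seen wins.
    pub fn update(&mut self, order_key: i64, value: f64) {
        match self.best {
            Some((k, _)) if k >= order_key => {}
            _ => self.best = Some((order_key, value)),
        }
    }

    /// Combine with a partial result computed over another batch of rows.
    pub fn merge(&mut self, other: &LastSelector) {
        if let Some((k, v)) = other.best {
            self.update(k, v);
        }
    }

    /// The value associated with the largest key, or `None` if no rows were seen.
    pub fn value(&self) -> Option<f64> {
        self.best.map(|(_, v)| v)
    }
}
```

The state is a single (key, value) pair per group, so memory stays constant regardless of input size. This only works for a single ORDER BY key as written; multiple ordering clauses would need a tuple comparison.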
   
   > 
   > I'm okay closing this issue unless anyone thinks there's value in pursuing 
it further.
   
   It would be nice to make it clearer how to get the equivalent of `min_by` 
and `max_by` -- maybe we could just document the equivalent `first_value(... 
ORDER BY ...)` queries 🤔 or we could automatically rewrite `min_by` / 
`max_by` into `first_value()` / `last_value()` 🤔 
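
As a rough sketch of what such a rewrite would produce, the helpers below build the `first_value(... ORDER BY ...)` text for a given value and key expression. The function names are hypothetical, this is plain string formatting rather than anything from the DataFusion API, and NULL keys and ties may not behave the same as `min_by` / `max_by` in other engines, so that would need checking.

```rust
/// Hypothetical helper: SQL text intended to play the role of `min_by(value, key)`,
/// i.e. the `value` from the row with the smallest `key`.
pub fn min_by_as_first_value(value: &str, key: &str) -> String {
    format!("first_value({} ORDER BY {} ASC)", value, key)
}

/// Hypothetical helper: SQL text intended to play the role of `max_by(value, key)`,
/// i.e. the `value` from the row with the largest `key`.
pub fn max_by_as_first_value(value: &str, key: &str) -> String {
    format!("first_value({} ORDER BY {} DESC)", value, key)
}
```

For example, `max_by_as_first_value("name", "score")` yields `first_value(name ORDER BY score DESC)`; the same result could presumably also be written as `last_value(name ORDER BY score ASC)`.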
   
   It is probably also worth filing a ticket for the potential optimization.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

