alamb commented on issue #12075: URL: https://github.com/apache/datafusion/issues/12075#issuecomment-2308848278
> As I'm thinking about it, I'm not sure you can get around doing the sort since you have an arbitrary number of ordering clauses. I think what you've proposed is the best option. One _could_ speed up the operation when there's only a single order clause but that may not be worth the effort involved.

FWIW this is what the `first_selector` / `last_selector` functions do in InfluxDB: they basically maintain only the value for the largest ORDER BY column, rather than actually sorting the entire intermediate result.

> I'm okay closing this issue unless anyone thinks there's value in pursuing it further.

It would be nice to make it clearer how to get the equivalent of `min_by` and `max_by`. Maybe we could just document the equivalent `first_value(... ORDER BY ...)` queries 🤔 or we could automatically rewrite `min_by` / `max_by` into `first_value()` / `last_value()` 🤔

It is probably also worth filing a ticket for the potential optimization.
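
To make the potential optimization concrete, here is a minimal sketch of the single-ordering-column case described above: the accumulator keeps only the value paired with the largest ordering key seen so far, so nothing is buffered or sorted. The names `MaxBy`, `update`, `evaluate`, and `max_by` are hypothetical and chosen for illustration; this is not the DataFusion `Accumulator` API or the InfluxDB selector implementation, and NULL handling and multi-column ordering are not modeled.

```rust
/// Hypothetical single-pass `max_by` state (illustrative only, not a
/// DataFusion or InfluxDB type). It holds at most one (key, value) pair.
#[derive(Debug)]
struct MaxBy<V, K> {
    best: Option<(K, V)>,
}

impl<V, K: PartialOrd> MaxBy<V, K> {
    fn new() -> Self {
        Self { best: None }
    }

    /// Fold one (value, ordering key) row into the running state.
    fn update(&mut self, value: V, key: K) {
        let replace = match &self.best {
            // First row seen: always keep it.
            None => true,
            // Strictly greater, so on ties the earliest row wins.
            Some((best_key, _)) => key > *best_key,
        };
        if replace {
            self.best = Some((key, value));
        }
    }

    /// Return the value that was paired with the largest key, if any row was seen.
    fn evaluate(self) -> Option<V> {
        self.best.map(|(_, value)| value)
    }
}

/// Convenience wrapper: one pass over the rows, O(1) extra memory.
fn max_by<V, K: PartialOrd>(rows: impl IntoIterator<Item = (V, K)>) -> Option<V> {
    let mut acc = MaxBy::new();
    for (value, key) in rows {
        acc.update(value, key);
    }
    acc.evaluate()
}
```

For example, `max_by(vec![("a", 1), ("b", 3), ("c", 2)])` returns `Some("b")`, and an empty input returns `None`. A `min_by` variant would only flip the comparison to `key < *best_key`.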