alamb commented on PR #6036: URL: https://github.com/apache/arrow-datafusion/pull/6036#issuecomment-1518606303
I will also try to review this change over the coming few days.

> Even if there is no theoretical reason to break pipeline.

It seems to me from reading the description on this PR that the tradeoff is:

1. Save a sort (e.g. don't sort by `unsorted_col`), so less CPU work
2. The `BoundedWindowExec` operator potentially has to buffer a large number of partitions (e.g. if `unsorted_column` has a large number of distinct values) and thus requires more memory in some cases

Is this a fair assessment?
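To make the memory side of that tradeoff concrete, here is a toy sketch in Python. It is not DataFusion code and does not model how `BoundedWindowExec` actually buffers or prunes rows; the function name `peak_partition_states` and the "one state per open partition" model are assumptions made purely for illustration. It only counts how many partitions a streaming operator would need to keep open at once, depending on whether the input arrives sorted by the partition column.

```python
from typing import Hashable, Iterable


def peak_partition_states(keys: Iterable[Hashable], input_sorted_by_key: bool) -> int:
    """Peak number of partitions held open at once (toy model, hypothetical)."""
    if input_sorted_by_key:
        # Sorted input: once the key changes, the previous partition can never
        # receive another row, so its state can be dropped immediately.
        # At most one partition is open at any time.
        return 1 if any(True for _ in keys) else 0

    # Unsorted input: any partition seen so far may still receive rows,
    # so state is kept for every distinct key until the input ends.
    open_partitions = set()
    peak = 0
    for key in keys:
        open_partitions.add(key)
        peak = max(peak, len(open_partitions))
    return peak


# 10,000 rows spread round-robin over 1,000 distinct partition keys.
keys = [i % 1000 for i in range(10_000)]

print(peak_partition_states(sorted(keys), input_sorted_by_key=True))  # 1
print(peak_partition_states(keys, input_sorted_by_key=False))         # 1000
```

In this simplified model the unsorted case grows with the number of distinct values in the partition column, which is the situation point 2 above is asking about.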
