caseykneale commented on issue #7373: URL: https://github.com/apache/arrow-datafusion/issues/7373#issuecomment-1688474113
I figured it out. Joins often emit rows in a different order than the original data, which is generally a good thing, but a GROUP BY that runs afterward may then receive out-of-order data. The GROUP BY has no context for that reordering and returns the wrong values. If you run into this, wrap the join in a subquery and apply the ORDER BY after the inner join, so that the order within each group is preserved for the GROUP BY step. I would close this out, but I really wish this were in the documentation.
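
To make the failure mode concrete, here is a minimal pure-Python sketch of the idea. It is not DataFusion code: the tables, the `toy_join` function, and the "last value per group" aggregate are all hypothetical, and it assumes the aggregate is order-sensitive (the comment above does not say which aggregate was involved). It only shows how a join that reorders rows can change the result of such an aggregate, and how sorting after the join restores it.

```python
from collections import defaultdict

# Hypothetical input, already in timestamp order: (sensor_id, ts, value)
readings = [("a", 1, 10.0), ("b", 2, 20.0), ("a", 3, 11.0), ("b", 4, 21.0)]
# Hypothetical lookup table: (sensor_id, site)
sensors = [("b", "north"), ("a", "north")]


def toy_join(sensors, readings):
    """Toy hash join: output follows the order of `sensors`, not `readings`."""
    by_sensor = defaultdict(list)
    for sensor_id, ts, value in readings:
        by_sensor[sensor_id].append((ts, value))
    joined = []
    for sensor_id, site in sensors:
        for ts, value in by_sensor[sensor_id]:
            joined.append((site, ts, value))
    return joined


def last_value_per_group(rows):
    """Order-sensitive aggregate: keeps whichever value arrives last per site."""
    result = {}
    for site, _ts, value in rows:
        result[site] = value
    return result


joined = toy_join(sensors, readings)

# Aggregating straight after the join sees timestamps 2, 4, 1, 3.
unsorted_result = last_value_per_group(joined)

# Sorting after the join (the "ORDER BY in a subquery" step) restores 1, 2, 3, 4.
sorted_result = last_value_per_group(sorted(joined, key=lambda r: (r[0], r[1])))

print(unsorted_result)  # {'north': 11.0}
print(sorted_result)    # {'north': 21.0}
```

In this sketch the sort key is `(site, ts)`, which plays the role of the ORDER BY applied after the inner join and before the grouping step.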
