HeartSaVioR commented on PR #37551: URL: https://github.com/apache/spark/pull/37551#issuecomment-1386664690
It doesn't sound ideal to me if we expect users to read the whole method doc and notice the caveat by themselves. This gives a very different user experience from `sortWithinPartitions`, while users would naturally guess that `sortWithinGroups` is the paired API for it. If we really want to do this "generally", I'd say let's be consistent with `sortWithinPartitions`: add a Sort logical node explicitly, which performs the sort with the primary and secondary key. The ordering requirement in `flatMapGroups` then shouldn't trigger an additional sort, since the ordering of the DataFrame is a superset of that requirement. But if we just want to address this for only two methods, I'd say let's touch those methods instead of trying out an odd generalization.
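To illustrate the "superset" argument above, here is a minimal sketch in plain Python, not Spark code. It assumes a single partition held as a list of tuples; the names `sort_within_partition`, `flat_map_groups`, and `collect_payloads` are hypothetical stand-ins, not Spark APIs. Once rows are sorted by (primary key, secondary key), walking the groups needs no further sort, and each group's rows already arrive in secondary-key order.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows of one partition: (primary_key, secondary_key, payload).
partition = [
    ("b", 2, "b2"),
    ("a", 3, "a3"),
    ("b", 1, "b1"),
    ("a", 1, "a1"),
    ("a", 2, "a2"),
]


def sort_within_partition(rows):
    """Sort one partition by the primary key, then by the secondary key."""
    return sorted(rows, key=itemgetter(0, 1))


def flat_map_groups(sorted_rows, func):
    """Apply func to each run of equal primary keys, without sorting again.

    This relies only on rows with the same primary key being adjacent,
    which the (primary, secondary) ordering already guarantees.
    """
    out = []
    for key, group in groupby(sorted_rows, key=itemgetter(0)):
        out.extend(func(key, group))
    return out


def collect_payloads(key, rows):
    """Example group function: gather payloads in the order they arrive."""
    return [(key, [payload for _, _, payload in rows])]


result = flat_map_groups(sort_within_partition(partition), collect_payloads)
print(result)
```

In this toy version the explicit sort is the only sort that runs; the grouping step just consumes the existing order, which is the behavior the comment asks for from an explicit Sort node.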
