WangGuangxin opened a new pull request #31967: URL: https://github.com/apache/spark/pull/31967
### What changes were proposed in this pull request?

Currently, `MapType` does not support ordering semantics, although Hive and Presto do. This makes it hard to migrate from Hive to Spark SQL when users group by or order by map-typed columns in their SQL.

### Why are the changes needed?

In general, we compare two maps with the following steps:

1. If the two maps differ in size, compare them by size.
2. Otherwise, sort the entries of each map by key, then compare the entries pairwise, first by key, then by value.

Grouping, join, and window need special handling because Spark SQL turns grouping/join/window partition keys into binary `UnsafeRow`s and compares the binary data directly instead of using `MapType`'s ordering. In these cases, we have to insert a `SortMapKey` expression that sorts the map entries by key. This is very similar to `NormalizeFloatingNumbers`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added more UTs.
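
For illustration only, the two comparison steps and the key-sorting normalization described under "Why are the changes needed?" can be sketched in plain Python. This is not the code in this PR: `compare_values`, `compare_maps`, and `sort_map_keys` are hypothetical names, Python dicts stand in for Spark map values, and null handling is left out. The sketch assumes keys and values are mutually comparable.

```python
from functools import cmp_to_key


def compare_values(a, b):
    """Three-way comparison: negative, zero, or positive."""
    return (a > b) - (a < b)


def compare_maps(left, right):
    """Compare two dicts following the two steps in the description."""
    # Step 1: maps of different sizes are ordered by size.
    if len(left) != len(right):
        return compare_values(len(left), len(right))
    # Step 2: sort the entries of each map by key, then compare them
    # pairwise, first by key, then by value.
    left_entries = sorted(left.items(), key=lambda kv: kv[0])
    right_entries = sorted(right.items(), key=lambda kv: kv[0])
    for (lk, lv), (rk, rv) in zip(left_entries, right_entries):
        c = compare_values(lk, rk)
        if c != 0:
            return c
        c = compare_values(lv, rv)
        if c != 0:
            return c
    return 0


def sort_map_keys(m):
    """Rebuild a dict with its entries in key order.

    Stand-in for the idea behind a key-sorting expression: two maps with
    the same entries end up with the same entry layout, so a comparison
    that looks only at the stored layout treats them as equal.
    """
    return dict(sorted(m.items(), key=lambda kv: kv[0]))


# Same entries, different insertion order: the stored layouts differ
# until the keys are sorted.
a = {"x": 1, "y": 2}
b = {"y": 2, "x": 1}
layouts_differ_before = list(a.items()) != list(b.items())
layouts_match_after = (
    list(sort_map_keys(a).items()) == list(sort_map_keys(b).items())
)

# The comparator can also drive an ORDER BY-style sort over maps.
ordered = sorted(
    [{"b": 1, "a": 1}, {"a": 1}, {"a": 0, "b": 5}],
    key=cmp_to_key(compare_maps),
)
```

The comparator alone is enough when an ordering is evaluated explicitly, as in a sort. The normalization step matters when equality is decided from the stored representation, which is the situation the description raises for grouping, join, and window keys.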
