GitHub user hvanhovell opened a pull request:
https://github.com/apache/spark/pull/15970
[SPARK-18134][SQL] Comparable MapTypes [POC]
## What changes were proposed in this pull request?
This is a small POC to see if we can make `MapType` orderable, and thus
usable in aggregates and joins. The key idea in this PR is that there is a
difference between an unordered and an ordered map (an ordered map can be
compared), and that `ordered` is a property of `MapType`.
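
To illustrate why ordering matters, here is a minimal sketch in plain Python. It is not Spark code and is not taken from this PR. It assumes a map is stored as a list of (key, value) pairs, and the helper names `sort_entries` and `compare_ordered` are hypothetical. Two maps with the same entries in different physical order only compare as equal once both are sorted by key.

```python
from typing import Any, List, Tuple

# Assumption: a map is modelled as a list of (key, value) pairs.
Entries = List[Tuple[Any, Any]]


def sort_entries(entries: Entries) -> Entries:
    """Return the entries sorted by key (the canonical, 'ordered' form)."""
    return sorted(entries, key=lambda kv: kv[0])


def compare_ordered(a: Entries, b: Entries) -> int:
    """Lexicographically compare two entry lists that are already sorted.

    Returns -1, 0, or 1.
    """
    for (ka, va), (kb, vb) in zip(a, b):
        if ka != kb:
            return -1 if ka < kb else 1
        if va != vb:
            return -1 if va < vb else 1
    # All shared positions are equal, so the shorter map sorts first.
    return (len(a) > len(b)) - (len(a) < len(b))


m1 = [("b", 2), ("a", 1)]
m2 = [("a", 1), ("b", 2)]
m3 = [("c", 0), ("a", 1)]
```

Here `m1` and `m2` hold the same entries but differ as raw lists. After `sort_entries` they compare as equal, and `m3` gets a well-defined position relative to them.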
A map can be converted from an unordered map to an ordered map by injecting
a `SortMap` expression. The analyzer will inject `SortMap` expressions whenever
a map is used in a binary comparison or in an aggregate. Note that the
`SortMap` expression is far from optimized; it should, however, perform
reasonably.
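
The injection step can be pictured with a toy expression tree. This is a hedged sketch in plain Python, not the analyzer rule from this PR. `MapColumn`, `EqualTo`, `ensure_ordered`, and `inject_sort_map` are hypothetical names, and the `SortMap` class below only stands in for the real expression. The rule wraps each unordered map operand of a comparison in `SortMap` and leaves operands that are already ordered untouched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapColumn:
    """Hypothetical reference to a map-typed column."""
    name: str
    ordered: bool = False  # mirrors the idea of `ordered` on the map type


@dataclass(frozen=True)
class SortMap:
    """Stand-in for an expression that yields an ordered map."""
    child: "MapColumn"


@dataclass(frozen=True)
class EqualTo:
    """Hypothetical binary comparison node."""
    left: object
    right: object


def ensure_ordered(expr):
    """Wrap an unordered map in SortMap; return anything else unchanged."""
    if isinstance(expr, MapColumn) and not expr.ordered:
        return SortMap(expr)
    return expr


def inject_sort_map(cmp: EqualTo) -> EqualTo:
    """Rewrite a comparison so that both operands are ordered maps."""
    return EqualTo(ensure_ordered(cmp.left), ensure_ordered(cmp.right))


plan = EqualTo(MapColumn("m1"), MapColumn("m2", ordered=True))
rewritten = inject_sort_map(plan)
```

In this sketch only `m1` is wrapped, because `m2` is already marked as ordered. Applying the rule a second time changes nothing, which is the property an analyzer rule run to a fixed point would need.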
## How was this patch tested?
No tests yet. This will probably fail tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/hvanhovell/spark orderable_map
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/15970.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #15970