GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/7482
[SPARK-9143] [SQL] Add planner rule for automatically inserting Unsafe <->
Safe row format converters
Now that we have two internal row formats, UnsafeRow and the old
Java-object-based row format, rows sometimes need to be converted between
them. These conversions should not be performed by the operators
themselves; instead, the planner should be responsible for inserting
appropriate format conversions when they are needed.
This patch makes the following changes:
- Add two new physical operators for performing row format conversions,
`ConvertToUnsafe` and `ConvertFromUnsafe`.
- Add new methods to `SparkPlan` to allow operators to express whether they
output UnsafeRows and whether they can handle safe or unsafe rows as inputs.
- Implement an `EnsureRowFormats` rule to automatically insert converter
operators where necessary.
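As a rough illustration of the three changes above, the following
self-contained Scala sketch models the idea on a toy plan tree. It does not
use Spark's classes: `Plan`, `Scan`, `SafeOnlyOp`, `UnsafeOnlyOp`, and the
method names `outputsUnsafeRows`, `canProcessUnsafeRows`, and
`canProcessSafeRows` are assumptions made for this example. Only
`ConvertToUnsafe`, `ConvertFromUnsafe`, and `EnsureRowFormats` are names taken
from this patch, and the patch's actual implementation may differ.

```scala
object RowFormatSketch {
  // Toy stand-in for a physical plan node (hypothetical; not Spark's SparkPlan).
  sealed trait Plan {
    def children: Seq[Plan]
    def outputsUnsafeRows: Boolean     // does this node emit UnsafeRows?
    def canProcessUnsafeRows: Boolean  // can it consume UnsafeRows?
    def canProcessSafeRows: Boolean    // can it consume object-based rows?
    def withChildren(newChildren: Seq[Plan]): Plan
  }

  // Leaf node; the flag says which row format it produces.
  case class Scan(name: String, unsafe: Boolean) extends Plan {
    def children: Seq[Plan] = Nil
    def outputsUnsafeRows: Boolean = unsafe
    def canProcessUnsafeRows: Boolean = true
    def canProcessSafeRows: Boolean = true
    def withChildren(newChildren: Seq[Plan]): Plan = this
  }

  // Hypothetical operator that only understands object-based rows.
  case class SafeOnlyOp(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def outputsUnsafeRows: Boolean = false
    def canProcessUnsafeRows: Boolean = false
    def canProcessSafeRows: Boolean = true
    def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
  }

  // Hypothetical operator that only understands UnsafeRows.
  case class UnsafeOnlyOp(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def outputsUnsafeRows: Boolean = true
    def canProcessUnsafeRows: Boolean = true
    def canProcessSafeRows: Boolean = false
    def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
  }

  // Converter: consumes object-based rows, emits UnsafeRows.
  case class ConvertToUnsafe(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def outputsUnsafeRows: Boolean = true
    def canProcessUnsafeRows: Boolean = false
    def canProcessSafeRows: Boolean = true
    def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
  }

  // Converter: consumes UnsafeRows, emits object-based rows.
  case class ConvertFromUnsafe(child: Plan) extends Plan {
    def children: Seq[Plan] = Seq(child)
    def outputsUnsafeRows: Boolean = false
    def canProcessUnsafeRows: Boolean = true
    def canProcessSafeRows: Boolean = false
    def withChildren(newChildren: Seq[Plan]): Plan = copy(child = newChildren.head)
  }

  // Simplified take on the rule: rewrite the tree bottom-up and wrap any child
  // whose output format its parent cannot consume in the matching converter.
  def ensureRowFormats(plan: Plan): Plan = {
    val fixedChildren = plan.children.map(ensureRowFormats).map { child =>
      if (child.outputsUnsafeRows && !plan.canProcessUnsafeRows) ConvertFromUnsafe(child)
      else if (!child.outputsUnsafeRows && !plan.canProcessSafeRows) ConvertToUnsafe(child)
      else child
    }
    plan.withChildren(fixedChildren)
  }

  def main(args: Array[String]): Unit = {
    // An unsafe scan feeding a safe-only operator feeding an unsafe-only operator.
    val plan = UnsafeOnlyOp(SafeOnlyOp(Scan("t", unsafe = true)))
    println(ensureRowFormats(plan))
  }
}
```

The sketch handles only a parent that cannot consume its child's format. It
ignores cases a real rule would likely need to consider, such as an operator
that accepts both formats but receives a mix of them from several children.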
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark unsafe-converter-planning
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7482.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7482
----
commit 9ba30383ff15b9a7faf908179bd7c89c80f344bf
Author: Josh Rosen <[email protected]>
Date: 2015-07-17T15:56:31Z
WIP
commit b5df19b6ab9dad7e384ebef2fedad701f6330323
Author: Josh Rosen <[email protected]>
Date: 2015-07-17T20:38:47Z
Merge remote-tracking branch 'origin/master' into unsafe-converter-planning
commit d5f9005eff6d36f40b676af8405c35d35c5a25ac
Author: Josh Rosen <[email protected]>
Date: 2015-07-17T21:04:50Z
Finish writing EnsureRowFormats planner rule
commit 0fef0f867f4a692de5297ac83c3d8e1b076e7dd0
Author: Josh Rosen <[email protected]>
Date: 2015-07-17T21:06:57Z
Rename file.
commit ae2195a7d44dc7e673d77298213c7be6f92b6419
Author: Josh Rosen <[email protected]>
Date: 2015-07-17T21:36:43Z
Fixes
----