GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/12926
[SPARK-15094][SPARK-14803][SQL] Add ObjectProject for EliminateSerialization
## What changes were proposed in this pull request?
The `Optimizer` eliminates an adjacent pair of `DeserializeToObject` and
`SerializeFromObject` and adds an extra `Project` in its place. However, when
`DeserializeToObject`'s `outputObjectType` is an `ObjectType` whose `cls` can't
be processed by an unsafe projection, the rewritten plan fails.
To fix it, we add a plan node, `ObjectProject`, that projects the object and
preserves `DeserializeToObject`'s output expr id, as the extra `Project` did.
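A rough sketch of the idea, using a toy plan model rather than the real Catalyst classes. The case classes below only mimic the shape of the plans named above. `ObjectProducer`, `supportedByUnsafeProject`, and `EliminateSerializationSketch` are hypothetical names introduced for illustration, and the rule this patch actually adds may differ in detail.

```scala
// Toy model only: these are NOT Spark's Catalyst classes.
sealed trait DataType
case object IntType extends DataType                       // assumed to fit in an unsafe row
final case class ObjectType(cls: String) extends DataType  // arbitrary JVM object

final case class Attribute(name: String, dataType: DataType, exprId: Long)

sealed trait Plan { def output: Attribute }
// Hypothetical leaf standing in for any operator that outputs an object.
final case class ObjectProducer(output: Attribute) extends Plan
final case class SerializeFromObject(output: Attribute, child: Plan) extends Plan
final case class DeserializeToObject(output: Attribute, child: Plan) extends Plan
final case class Project(output: Attribute, child: Plan) extends Plan
final case class ObjectProject(output: Attribute, child: Plan) extends Plan

object EliminateSerializationSketch {
  // Assumption: in this toy model, any ObjectType is unsupported by an unsafe projection.
  def supportedByUnsafeProject(dt: DataType): Boolean = dt match {
    case ObjectType(_) => false
    case _             => true
  }

  def apply(plan: Plan): Plan = plan match {
    // Deserialize directly on top of Serialize with matching object types:
    // drop the pair, but keep the deserializer's output attribute (and expr id).
    case DeserializeToObject(out, SerializeFromObject(_, child))
        if out.dataType == child.output.dataType =>
      val rewritten = apply(child)
      if (supportedByUnsafeProject(out.dataType)) Project(out, rewritten)
      else ObjectProject(out, rewritten)
    case DeserializeToObject(out, child) => DeserializeToObject(out, apply(child))
    case SerializeFromObject(out, child) => SerializeFromObject(out, apply(child))
    case Project(out, child)             => Project(out, apply(child))
    case ObjectProject(out, child)       => ObjectProject(out, apply(child))
    case leaf: ObjectProducer            => leaf
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val producer = ObjectProducer(Attribute("obj", ObjectType("com.example.Custom"), 1L))
    val serialized = SerializeFromObject(Attribute("value", IntType, 2L), producer)
    val plan = DeserializeToObject(Attribute("obj", ObjectType("com.example.Custom"), 3L), serialized)
    println(EliminateSerializationSketch(plan))
  }
}
```

In this sketch the downstream operators still see expr id 3 after the rewrite, which is the property the extra `Project` was providing; the only difference is which node carries it.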
## How was this patch tested?
`DatasetSuite`, `EliminateSerializationSuite`.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/viirya/spark-1 fix-eliminate-serialization-projection
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12926.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12926
----
commit cf53a434b893293041f73414f50d7f0918a01d49
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-05-04T09:49:27Z
Avoid extra Project when DeserializeToObject outputs an unsupported class
for Project.
commit 48e6b6d3bc4d41d808db43b888e6b17a17a77d1f
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-05-05T07:38:06Z
Add ObjectProject.
----