[
https://issues.apache.org/jira/browse/PIG-3855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13968472#comment-13968472
]
Rohini Palaniswamy commented on PIG-3855:
-----------------------------------------
Changes done:
- Created a new input in TEZ-1003 and used it so that we can turn on
UnionOptimizer by default. Without it, we see a lot of performance degradation
in production scripts.
- Added a lot of e2e tests for UnionOptimizer and fixed the code based on the
issues found.
- Fixed a couple of other minor issues:
  - Default parallelism was not honored.
  - Serializing the full store was causing problems with some UDFs on
    deserialization for checkOutputSpecs.
This patch depends on TEZ-1003, so I will check it in once that is available as
part of the Tez snapshot in Maven.
> Turn on UnionOptimizer by default and add new e2e tests for union
> -----------------------------------------------------------------
>
> Key: PIG-3855
> URL: https://issues.apache.org/jira/browse/PIG-3855
> Project: Pig
> Issue Type: Sub-task
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Fix For: tez-branch
>
> Attachments: PIG-3855-1.patch
>
>
> We don't have e2e tests for cases like union followed by group by, join
> (replicated, skewed, hash), order by, limit, etc. PIG-3835 adds optimizations
> for those cases, and we should have e2e tests for them.
--
This message was sent by Atlassian JIRA
(v6.2#6252)