[
https://issues.apache.org/jira/browse/PIG-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Dai updated PIG-3775:
----------------------------
Fix Version/s: 0.14.0
               (was: tez-branch)
> Use unsorted shuffle in Orderby, Skewed Join to improve performance in Tez
> --------------------------------------------------------------------------
>
> Key: PIG-3775
> URL: https://issues.apache.org/jira/browse/PIG-3775
> Project: Pig
> Issue Type: Sub-task
> Components: tez
> Reporter: Rohini Palaniswamy
> Assignee: Rohini Palaniswamy
> Labels: GSOC2014
> Fix For: 0.14.0
>
>
> When implementing Pig union, we need to gather data from two or more upstream
> vertexes without sorting. Each vertex might itself consist of several tasks.
> The same can be done for the partitioner vertex in orderby and skewed join,
> instead of a 1-1 edge, for some cases of parallelism.
> TEZ-661 has been created to add custom output and input for that in Tez. It
> is currently not among the Tez team's priorities, but it is important for us
> because it will give good performance gains. We can write the custom
> input/output, contribute it to Tez, and make the corresponding changes in Pig.
> This is a candidate project for Google Summer of Code 2014. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2014
--
This message was sent by Atlassian JIRA
(v6.2#6252)