[ https://issues.apache.org/jira/browse/PIG-3775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-3775:
------------------------------------
    Fix Version/s:     (was: 0.14.0)

> Use unsorted shuffle in Orderby, Skewed Join to improve performance in Tez
> --------------------------------------------------------------------------
>
>                 Key: PIG-3775
>                 URL: https://issues.apache.org/jira/browse/PIG-3775
>             Project: Pig
>          Issue Type: Sub-task
>          Components: tez
>            Reporter: Rohini Palaniswamy
>            Assignee: Rohini Palaniswamy
>              Labels: GSOC2014
>
> When implementing Pig union, we need to gather data from two or more upstream
> vertices without sorting. The vertex itself might consist of several tasks.
> The same can be done for the partitioner vertex in orderby and skewed join,
> instead of a 1-1 edge, for some cases of parallelism.
> TEZ-661 has been created to add custom output and input for that in Tez. It
> is currently not among the Tez team's priorities, but it is important for us
> because it will give good performance gains. We can write the custom
> input/output, contribute it to Tez, and make the corresponding changes in Pig.
> This is a candidate project for Google Summer of Code 2014. More information
> about the program can be found at
> https://cwiki.apache.org/confluence/display/PIG/GSoc2014



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)