> On Feb. 17, 2014, 7:54 a.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java,
> >  line 1464
> > <https://reviews.apache.org/r/18181/diff/2/?file=490429#file490429line1464>
> >
> >     Remove this comment since it's no longer applicable?

Left that on purpose. We want to try unsorted shuffle to reduce the number of 
stages when there is less data. For example, if there are 7K input splits and 
parallel is set to 100, with a 1-1 edge there will be 7K tasks in the load 
vertex, 7K tasks in the partition vertex and 100 in the join vertex. We want 
to see whether 7K in the load vertex, 3.5K in the partition vertex and 100 in 
the join vertex performs better. In theory it might, as each final join task 
only needs to merge 3.5K map outputs instead of 7K. If that does not work 
out, we will stick with 1-1.
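
The arithmetic behind the two layouts can be sketched roughly as below. This 
is only an illustration: SkewedJoinPlanSketch and its methods are hypothetical 
names, not Pig or Tez code, and it assumes each join task fetches one map 
output from every partition task.

```java
// Hypothetical helper, not part of Pig or Tez: compares the two vertex
// layouts from the example using plain task-count arithmetic.
public class SkewedJoinPlanSketch {

    /** Total tasks launched across the load, partition and join vertices. */
    public static int totalTasks(int loadTasks, int partitionTasks, int joinTasks) {
        return loadTasks + partitionTasks + joinTasks;
    }

    /**
     * Map outputs each join task has to merge. Assumption: one output per
     * partition task reaches every join task.
     */
    public static int mergeFanIn(int partitionTasks) {
        return partitionTasks;
    }

    public static void main(String[] args) {
        int inputSplits = 7000; // "7K input splits" from the example
        int parallel = 100;     // parallel set to 100

        // 1-1 edge: the partition vertex gets one task per load task.
        int oneToOnePartition = inputSplits;
        // Unsorted shuffle: the partition vertex runs with half as many tasks
        // (3.5K in the example; the factor of two is only the example's choice).
        int shufflePartition = inputSplits / 2;

        System.out.println("1-1 edge:         total tasks = "
                + totalTasks(inputSplits, oneToOnePartition, parallel)
                + ", outputs merged per join task = "
                + mergeFanIn(oneToOnePartition));
        System.out.println("unsorted shuffle: total tasks = "
                + totalTasks(inputSplits, shufflePartition, parallel)
                + ", outputs merged per join task = "
                + mergeFanIn(shufflePartition));
    }
}
```

The counts alone do not say which layout is faster; the unsorted shuffle adds 
data movement between the load and partition vertices that the 1-1 edge 
avoids, which is why it needs to be measured.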


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18181/#review34628
-----------------------------------------------------------


On Feb. 17, 2014, 7:34 a.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18181/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 7:34 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Daniel Dai.
> 
> 
> Bugs: PIG-3766
>     https://issues.apache.org/jira/browse/PIG-3766
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Changes done:
>   1) Removed the POLocalRearrange in SampleVertex and replaced it with a 
> POValueOutTez for both orderby and skewedjoin. POValueOutTez takes multiple 
> outputs, so got rid of the POSplit in the skewed join sample vertex as well.
>   2) Replaced the POPackage+POLocalRearrange in the partition vertex of the 
> left table (vertex 3) with a POIdentityInOutTez, moving the project in 
> POLocalRearrange into the POLocalRearrange in vertex 1. Also made the edge 
> between vertex 1 and vertex 3 1-1.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC16.gld 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC17.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC7.gld 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java 1568862 
> 
> Diff: https://reviews.apache.org/r/18181/diff/
> 
> 
> Testing
> -------
> 
> TestSkewedJoin and -t SkewedJoin in nightly.conf (except SkewedJoin_6 
> PIG-3727) pass
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>
