> On Feb. 17, 2014, 7:54 a.m., Cheolsoo Park wrote:
> > http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java,
> >  line 1464
> > <https://reviews.apache.org/r/18181/diff/2/?file=490429#file490429line1464>
> >
> >     Remove this comment since it's no longer applicable?
> 
> Rohini Palaniswamy wrote:
>     Left that on purpose. We want to try unsorted shuffle to reduce the 
> number of stages when the data is small. For example, if there are 7K input 
> splits and parallel is set to 100, with a 1-1 edge there will be 7K tasks in 
> the load vertex, 7K tasks in the partition vertex and 100 in the join vertex. 
> We want to see whether 7K in the load vertex, 3.5K in the partition vertex 
> and 100 in the join vertex performs better. In theory it might, as the final 
> join task only needs to merge 3.5K map outputs instead of 7K. But if that 
> does not work out, we will stick with 1-1.

Oh I see. Makes sense. Thanks for the clarification!
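
To make the arithmetic in the quoted comment concrete, here is a rough sketch that tallies tasks per vertex for the two layouts. The class and method names are hypothetical and are not part of Pig or Tez. The 3.5K partition-task figure is taken directly from the comment rather than derived, and the sketch assumes each join task merges one map output per partition task.

```java
// Hypothetical helper for illustration only; not Pig or Tez code.
class ShuffleTaskCount {

    /** Total tasks across the load, partition, and join vertices. */
    static int totalTasks(int loadTasks, int partitionTasks, int joinTasks) {
        return loadTasks + partitionTasks + joinTasks;
    }

    /**
     * Map outputs each join task has to merge, assuming one output
     * per upstream partition task (an assumption of this sketch).
     */
    static int mapOutputsPerJoinTask(int partitionTasks) {
        return partitionTasks;
    }

    public static void main(String[] args) {
        int loadTasks = 7000;  // 7K input splits, from the thread
        int joinTasks = 100;   // parallel set to 100, from the thread

        // 1-1 edge: the partition vertex runs one task per load task.
        int oneToOnePartitionTasks = loadTasks;
        // Unsorted shuffle: 3.5K partition tasks, the figure quoted in the thread.
        int unsortedPartitionTasks = 3500;

        System.out.println("1-1 edge: total tasks = "
                + totalTasks(loadTasks, oneToOnePartitionTasks, joinTasks)
                + ", outputs merged per join task = "
                + mapOutputsPerJoinTask(oneToOnePartitionTasks));
        System.out.println("unsorted shuffle: total tasks = "
                + totalTasks(loadTasks, unsortedPartitionTasks, joinTasks)
                + ", outputs merged per join task = "
                + mapOutputsPerJoinTask(unsortedPartitionTasks));
    }
}
```

This only counts tasks. Whether the smaller partition vertex actually performs better is the open question in the comment, since the unsorted shuffle adds data movement that the 1-1 edge avoids.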


- Cheolsoo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/18181/#review34628
-----------------------------------------------------------


On Feb. 17, 2014, 7:34 a.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/18181/
> -----------------------------------------------------------
> 
> (Updated Feb. 17, 2014, 7:34 a.m.)
> 
> 
> Review request for pig, Cheolsoo Park and Daniel Dai.
> 
> 
> Bugs: PIG-3766
>     https://issues.apache.org/jira/browse/PIG-3766
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Changes done:
>   1) Removed the POLocalRearrange in SampleVertex and replaced it with a 
> POValueOutTez for both orderby and skewedjoin. POValueOutTez takes multiple 
> outputs, so the POSplit in the skewed join sample vertex was removed as well.
>   2) Replaced the POPackage+POLocalRearrange in the partition vertex of the 
> left table (vertex 3) with a POIdentityInOutTez, moving the project in 
> POLocalRearrange into the POLocalRearrange in vertex 1. Also made the edge 
> between vertex 1 and vertex 3 1-1.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/PigProcessor.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC16.gld 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC17.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/TEZC7.gld 1568862 
>   http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java 1568862 
> 
> Diff: https://reviews.apache.org/r/18181/diff/
> 
> 
> Testing
> -------
> 
> TestSkewedJoin and -t SkewedJoin in nightly.conf (except SkewedJoin_6 
> PIG-3727) pass
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>
