> On June 24, 2015, 6:38 p.m., Daniel Dai wrote:
> > http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POLocalRearrangeTez.java,
> >  line 150
> > <https://reviews.apache.org/r/35491/diff/1/?file=985529#file985529line150>
> >
> >     Can you add a comment why we need to wrap key into 
> > NullablePartitionWritable for skewed join?

Sure. POPartitionRearrange on the right table emits a 
NullablePartitionWritable as the key. Since the left side uses LocalRearrange, 
we have to wrap its key explicitly so that it matches the key type of the 
right side.
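
A rough sketch of the idea, using stand-in classes rather than the actual Pig 
code. PartitionedKey, wrapLeftKey and the NO_PARTITION placeholder value are 
hypothetical names made up for illustration; only the shape of the wrapping is 
meant to carry over.

```java
import java.util.Objects;

// Hypothetical stand-in for NullablePartitionWritable: a key plus a partition index.
final class PartitionedKey {
    final Object key;
    final int partition;

    PartitionedKey(Object key, int partition) {
        this.key = key;
        this.partition = partition;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PartitionedKey)) {
            return false;
        }
        PartitionedKey other = (PartitionedKey) o;
        return partition == other.partition && Objects.equals(key, other.key);
    }

    @Override
    public int hashCode() {
        return Objects.hash(key, partition);
    }
}

final class SkewedJoinKeySketch {
    // Assumed placeholder meaning "this side did not pick a partition".
    static final int NO_PARTITION = -1;

    // Left side: the plain rearrange key is wrapped so that both join inputs
    // hand the same key class to the shuffle.
    static PartitionedKey wrapLeftKey(Object rawKey) {
        return new PartitionedKey(rawKey, NO_PARTITION);
    }

    public static void main(String[] args) {
        // Right side: in this sketch the key arrives already wrapped with a partition.
        PartitionedKey right = new PartitionedKey("user42", 3);
        PartitionedKey left = wrapLeftKey("user42");

        System.out.println("same key class: " + (left.getClass() == right.getClass()));
        System.out.println("same inner key: " + left.key.equals(right.key));
    }
}
```

Without the wrapping, the two inputs of the join would present different key 
classes to the same downstream consumer.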


- Rohini


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35491/#review89225
-----------------------------------------------------------


On June 16, 2015, 7:19 a.m., Rohini Palaniswamy wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35491/
> -----------------------------------------------------------
> 
> (Updated June 16, 2015, 7:19 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4574
>     https://issues.apache.org/jira/browse/PIG-4574
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Read orderby/skewed join data from HDFS in the Partitioner vertex, instead of 
> getting it from the sampler vertex.
> 
> This jira does not optimize the case of:
> 
> A = LOAD 'x' ...;
> B = LOAD 'y' ...;
> C = UNION A, B;
> D = ORDER C BY ..;
> 
> That case depends on UnionOptimizer being turned on and will need more 
> changes, so I will leave it for another jira.
> 
> 
> Diffs
> -----
> 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POIdentityInOutTez.java 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POLocalRearrangeTez.java 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Limit-2.gld 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-1.gld 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-2.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-1.gld 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-2.gld PRE-CREATION 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16-OPTOFF.gld 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16.gld 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1685498 
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1685498 
> 
> Diff: https://reviews.apache.org/r/35491/diff/
> 
> 
> Testing
> -------
> 
> Ran a subset of e2e tests: SkewedJoin, Union, Order, MultiQuery_Self, 
> MultiQuery_Union.
> 
> Ran L9.pig. Before the patch:
> 
> File System Counters
>         FILE_BYTES_READ=2028282366911
>         FILE_BYTES_WRITTEN=4049785379197
>         HDFS_BYTES_READ=1011533488395
>         HDFS_BYTES_WRITTEN=1010554380555
> 
> After the patch:
> 
> File System Counters
>         FILE_BYTES_READ=1007449863330
>         FILE_BYTES_WRITTEN=2016036957653
>         HDFS_BYTES_READ=2023066976790
>         HDFS_BYTES_WRITTEN=1010554380555
> 
> 
> Thanks,
> 
> Rohini Palaniswamy
> 
>
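
For reference, the L9.pig counters quoted above can be compared with a small 
calculation. This is only a sketch: the class and method names are made up, 
and the numbers are copied verbatim from the Testing section.

```java
final class CounterDeltaSketch {
    // Ratio of the "after the patch" value to the "before the patch" value.
    static double ratio(long before, long after) {
        return (double) after / before;
    }

    public static void main(String[] args) {
        String[] names = {
            "FILE_BYTES_READ", "FILE_BYTES_WRITTEN", "HDFS_BYTES_READ", "HDFS_BYTES_WRITTEN"
        };
        // Counter values from the L9.pig run, before and after the patch.
        long[] before = {2028282366911L, 4049785379197L, 1011533488395L, 1010554380555L};
        long[] after = {1007449863330L, 2016036957653L, 2023066976790L, 1010554380555L};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-20s after/before = %.3f%n", names[i], ratio(before[i], after[i]));
        }
    }
}
```

The local FILE_BYTES_READ and FILE_BYTES_WRITTEN counters drop to roughly half, 
HDFS_BYTES_READ doubles, and HDFS_BYTES_WRITTEN is unchanged. That pattern 
appears consistent with the input now being read from HDFS a second time in 
the Partitioner vertex instead of being passed along from the sampler vertex.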
