-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/35491/#review89225
-----------------------------------------------------------
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POLocalRearrangeTez.java (line 150)
<https://reviews.apache.org/r/35491/#comment141821>

    Can you add a comment explaining why we need to wrap the key in NullablePartitionWritable for skewed join?


- Daniel Dai


On June 16, 2015, 7:19 a.m., Rohini Palaniswamy wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/35491/
> -----------------------------------------------------------
>
> (Updated June 16, 2015, 7:19 a.m.)
>
>
> Review request for pig.
>
>
> Bugs: PIG-4574
>     https://issues.apache.org/jira/browse/PIG-4574
>
>
> Repository: pig
>
>
> Description
> -------
>
> Read orderby/skewed join data from HDFS in the Partitioner vertex, instead of
> getting it from the sampler vertex.
>
> This jira does not optimize the case of
>
> A = LOAD 'x' ...;
> B = LOAD 'y' ...;
> C = UNION A, B;
> D = ORDER C BY ..;
>
> That case depends on UnionOptimizer being turned on and will need more changes,
> so it is left for another jira.
>
>
> Diffs
> -----
>
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POIdentityInOutTez.java 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/operator/POLocalRearrangeTez.java 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Limit-2.gld 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-1.gld 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Order-2.gld PRE-CREATION
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-1.gld 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-SkewJoin-2.gld PRE-CREATION
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16-OPTOFF.gld 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-16.gld 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezAutoParallelism.java 1685498
>   http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/tez/TestTezCompiler.java 1685498
>
> Diff: https://reviews.apache.org/r/35491/diff/
>
>
> Testing
> -------
>
> Ran subset of e2e tests -
> SkewedJoin,Union,Order,MultiQuery_Self,MultiQuery_Union
>
> Ran L9.pig.
> Before the patch
>
> File System Counters
>     FILE_BYTES_READ=2028282366911
>     FILE_BYTES_WRITTEN=4049785379197
>     HDFS_BYTES_READ=1011533488395
>     HDFS_BYTES_WRITTEN=1010554380555
>
> After the patch
>
> File System Counters
>     FILE_BYTES_READ=1007449863330
>     FILE_BYTES_WRITTEN=2016036957653
>     HDFS_BYTES_READ=2023066976790
>     HDFS_BYTES_WRITTEN=1010554380555
>
>
> Thanks,
>
> Rohini Palaniswamy
>
>
