----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/20320/ -----------------------------------------------------------
(Updated April 15, 2014, 1:37 p.m.)
Review request for pig, Cheolsoo Park and Daniel Dai.
Changes
-------
Updated patch
- fixes a case in UnionOptimizer where it was failing when POSplit had a
shared successor and was just writing to POValueOutputTez
- Set parallelism to Math.min(sum of predecessors, 20) till we have ARP
patch from Daniel.
- Changed isFRJoin to connectedToPackage to generalize it. Included that in
the copy constructor and clone else UnionOptimizer will have issue.
Bugs: PIG:3855
https://issues.apache.org/jira/browse/PIG:3855
Repository: pig
Description
-------
Changes done:
Created a new input in TEZ-1003 and used that so that we can turn on
UnionOptimizer by default. Without that seeing lot of performance degradation
in production scripts.
Added lot of e2e tests for UnionOptimizer and fixed code based on the issues
found.
Fixed couple of other minor issues like
default parallelism not honored
Serializing full store was causing problems with some UDFs on deserialize for
checkOutputSpecs.
This patch depends on TEZ-1003. So will check in once that is available as part
of tez snapshot in maven.
Diffs (updated)
-----
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POLocalRearrange.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/MultiQueryOptimizerTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POFRJoinTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POIdentityInOutTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POLocalRearrangeTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POShuffleTezLoad.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueInputTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/POValueOutputTez.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/UnionOptimizer.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/src/org/apache/pig/tools/pigstats/tez/TezTaskStats.java
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/e2e/pig/tests/nightly.conf
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2-OPTOFF.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-MQ-2.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10-OPTOFF.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-10.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-2.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6-OPTOFF.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-6.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7-OPTOFF.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-7.gld
1587343
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9-OPTOFF.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/test/data/GoldenFiles/tez/TEZC-Union-9.gld
PRE-CREATION
http://svn.apache.org/repos/asf/pig/branches/tez/test/org/apache/pig/tez/TestTezCompiler.java
1587343
Diff: https://reviews.apache.org/r/20320/diff/
Testing
-------
Unit tests pass fine and new e2e tests pass fine.
Thanks,
Rohini Palaniswamy
