-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/25912/
-----------------------------------------------------------
Review request for pig, Cheolsoo Park and Daniel Dai.
Bugs: PIG-4162
https://issues.apache.org/jira/browse/PIG-4162
Repository: pig
Description
-------
The following changes are made:
- Always estimate intermediate reducer parallelism, even if the user has
specified PARALLEL.
- intermediate reducer parallelism = Min(2 * userparallelism,
Math.max(userparallelism, Math.max(estimatedparallelism,
Math.max(2999, PigReducerEstimator.MAX_REDUCER_COUNT_PARAM)))), i.e. the
estimated parallelism is limited to not more than 2x userparallelism or 2999.
2999 is hardcoded for now; it differs from the final reducer max parallelism
default of 999 and applies only to intermediate reducers. It will be made
configurable later if needed.
- ShuffleVertexManager.TEZ_SHUFFLE_VERTEX_MANAGER_DESIRED_TASK_INPUT_SIZE
is set to the block size for intermediate tasks (same as mapper behaviour)
instead of InputSizeReducerEstimator.DEFAULT_BYTES_PER_REDUCER, which defaults
to 1 GB.
The patch also has a few other minor unrelated fixes.
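As a rough illustration of the intermediate reducer parallelism rule, here is a minimal sketch that follows the prose summary (cap the estimate, never go below the user's value, never exceed 2x the user's value) rather than the literal nesting of Math.max calls in the formula. It is not the code in the patch. The class name, the method name, and the configuredMaxReducers parameter are hypothetical; the parameter stands in for whatever value is configured under PigReducerEstimator.MAX_REDUCER_COUNT_PARAM, which is an assumption about how that constant is used.

```java
// Hypothetical sketch of the rule described above; not taken from the patch.
final class IntermediateParallelismSketch {

    // Hard-coded ceiling for intermediate reducers, as stated in the description.
    static final int INTERMEDIATE_MAX = 2999;

    /**
     * @param userParallelism       value requested via PARALLEL (assumed > 0)
     * @param estimatedParallelism  estimate produced for the intermediate vertex
     * @param configuredMaxReducers assumed: the configured max reducer count
     */
    static int intermediateParallelism(int userParallelism,
                                       int estimatedParallelism,
                                       int configuredMaxReducers) {
        // The ceiling is the larger of 2999 and the configured maximum.
        int cap = Math.max(INTERMEDIATE_MAX, configuredMaxReducers);
        // Assumption: the estimate is clamped to that ceiling.
        int cappedEstimate = Math.min(estimatedParallelism, cap);
        // Never below the user's value, never above twice the user's value.
        return Math.min(2 * userParallelism,
                        Math.max(userParallelism, cappedEstimate));
    }

    public static void main(String[] args) {
        System.out.println(intermediateParallelism(100, 50, 999));    // 100
        System.out.println(intermediateParallelism(100, 150, 999));   // 150
        System.out.println(intermediateParallelism(100, 5000, 999));  // 200
        System.out.println(intermediateParallelism(2000, 5000, 999)); // 2999
    }
}
```

Under this reading, PARALLEL 100 with an estimate of 5000 yields 200 intermediate reducers, while PARALLEL 2000 with the same estimate yields 2999.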
Diffs
-----
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/Main.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezResourceManager.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezCompiler.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/TezOperator.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/ParallelismSetter.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezOperDependencyParallelismEstimator.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/plan/optimizer/TezParallelismEstimator.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/util/TezCompilerUtil.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/backend/hadoop/executionengine/util/ParallelConstantVisitor.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/PigImplConstants.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/impl/io/FileLocalizer.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/src/org/apache/pig/tools/pigstats/tez/TezStats.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestAlgebraicEval.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/test/org/apache/pig/test/TestForEachNestedPlan.java
1626640
http://svn.apache.org/repos/asf/pig/trunk/test/tez-tests
1626640
Diff: https://reviews.apache.org/r/25912/diff/
Testing
-------
Running the full suite of unit tests (good so far) and e2e tests. Will update once done.
Thanks,
Rohini Palaniswamy
