-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23787/#review48747
-----------------------------------------------------------

Just a few minor comments.


trunk/src/org/apache/pig/PigConfiguration.java
<https://reviews.apache.org/r/23787/#comment85494>

    It is an internal setting, not a user-facing one. We should probably create a new class called PigInternalConfiguration for those. We can also remove "hint" from the name, as the value is used as is.


trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
<https://reviews.apache.org/r/23787/#comment85496>

    Is pig.inpTargets not required?


trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java
<https://reviews.apache.org/r/23787/#comment85497>

    We will have to deal with getting Credentials to the Job if this is moved out of here. I will deal with it in a separate JIRA.


trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java
<https://reviews.apache.org/r/23787/#comment85500>

    Can we create a subclass instead and put these variables in it, as they are all related? That will help keep TezOperator more readable.


- Rohini Palaniswamy


On July 25, 2014, 1:36 a.m., Daniel Dai wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23787/
> -----------------------------------------------------------
>
> (Updated July 25, 2014, 1:36 a.m.)
>
>
> Review request for pig.
>
>
> Bugs: PIG-4057
>     https://issues.apache.org/jira/browse/PIG-4057
>
>
> Repository: pig
>
>
> Description
> -------
>
> Summary of changes:
> 1. Move Tez parallelism estimation out of TezDagBuilder into
>    ParallelismSetter, so we can get the estimated parallelism of the cross
>    before we create the vertex for GFCross
> 2. Move InputSplit generation out of TezDagBuilder into LoaderProcessor,
>    since we need to know the parallelism of the maps before ParallelismSetter
>    runs
> 3. Set pig.cross.parallelism.hint.(operator_key) in conf
>    * In Tez, this is done when we encounter the cross vertex
>    * In MR, this is done when we encounter the first GFCross
> 4. GFCross will use pig.cross.parallelism.hint.(operator_key) to determine
>    the number of partitions
>
>
> Diffs
> -----
>
>   trunk/src/org/apache/pig/PigConfiguration.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POGlobalRearrange.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1613328
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java PRE-CREATION
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java PRE-CREATION
>   trunk/src/org/apache/pig/impl/builtin/GFCross.java 1613328
>   trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1613328
>   trunk/test/e2e/pig/tests/nightly.conf 1613328
>   trunk/test/org/apache/pig/test/TestGFCross.java 1613328
>
> Diff: https://reviews.apache.org/r/23787/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> Daniel Dai
>
>
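
For reference, the per-operator setting described in items 3 and 4 of the summary could look roughly like the sketch below. This is only an illustration, not the code in the patch. The class name CrossParallelismHint, its methods, the operator keys "scope-12" and "scope-99", and the fallback value are all hypothetical, and java.util.Properties stands in for the real job configuration object. Only the key prefix pig.cross.parallelism.hint. comes from the description.

```java
import java.util.Properties;

// Hypothetical helper: illustrates writing and reading a per-operator
// cross parallelism value under a shared key prefix.
public class CrossParallelismHint {
    // Prefix from the review description; the operator key is appended to it.
    static final String PREFIX = "pig.cross.parallelism.hint.";

    // Builds the full configuration key for one operator.
    static String keyFor(String operatorKey) {
        return PREFIX + operatorKey;
    }

    // Planner side: record the estimated parallelism for an operator.
    static void setHint(Properties conf, String operatorKey, int parallelism) {
        conf.setProperty(keyFor(operatorKey), Integer.toString(parallelism));
    }

    // Consumer side: read the value back, using the fallback if it was never set.
    static int getHint(Properties conf, String operatorKey, int fallback) {
        String value = conf.getProperty(keyFor(operatorKey));
        return value == null ? fallback : Integer.parseInt(value);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        setHint(conf, "scope-12", 20);
        System.out.println(getHint(conf, "scope-12", 1)); // value that was set
        System.out.println(getHint(conf, "scope-99", 1)); // fallback, nothing set
    }
}
```

Keying the value by operator lets several crosses in one script carry different estimates, which a single global setting would not allow.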
