-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23787/#review48679
-----------------------------------------------------------


Hi Daniel, thank you for the patch! I have a few minor comments below.


trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
<https://reviews.apache.org/r/23787/#comment85424>

    Change "pigContext.instantiateFuncFromSpec(new FuncSpec(udf))" to func?



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
<https://reviews.apache.org/r/23787/#comment85440>

    Do we need to pass pc, given that conf is already built out of pc?
    
    Configuration conf = ConfigurationUtil.toConfiguration(pc.getProperties());



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java
<https://reviews.apache.org/r/23787/#comment85441>

    Same as above.



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java
<https://reviews.apache.org/r/23787/#comment85443>

    Do we need conf, given that processLoads() takes conf as an argument?
    Don't we only need one or the other? Perhaps you can remove the conf
    parameter from processLoads()?



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java
<https://reviews.apache.org/r/23787/#comment85432>

    Can we use Job.getInstance() since this code doesn't need to be backward 
compatible with Hadoop 1?



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java
<https://reviews.apache.org/r/23787/#comment85447>

    Aren't these duplicates, since they're set in TezDagBuilder? Am I wrong?



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java
<https://reviews.apache.org/r/23787/#comment85445>

    Do we need to set these properties again here? I see the same properties
    are set in the payload of vertices in TezDagBuilder.



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java
<https://reviews.apache.org/r/23787/#comment85450>

    Same as in LoaderProcessor. PigContext seems redundant to me given conf is 
built out of pigContext.



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java
<https://reviews.apache.org/r/23787/#comment85451>

    new Job() => Job.getInstance().



trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java
<https://reviews.apache.org/r/23787/#comment85452>

    Can't you change this to a private non-static method and use the conf
    field instead of passing the job parameter again?
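
    A minimal sketch of the kind of refactoring suggested here, not the
    actual ParallelismSetter code. The class name, the "example.parallelism"
    key, and the use of java.util.Properties as a stand-in for Hadoop's
    Configuration are all assumptions made for illustration.

```java
import java.util.Properties;

// Hypothetical stand-in for the class under review; all names are illustrative.
public class ParallelismSetterSketch {
    // Held once as a field (Properties stands in for Hadoop's Configuration).
    private final Properties conf;

    public ParallelismSetterSketch(Properties conf) {
        this.conf = conf;
    }

    // Before: a static helper that needs the configuration passed on every call.
    static int estimateStatic(Properties conf, int defaultParallelism) {
        return Integer.parseInt(
            conf.getProperty("example.parallelism", String.valueOf(defaultParallelism)));
    }

    // After: a private instance method that reads the conf field directly.
    private int estimate(int defaultParallelism) {
        return Integer.parseInt(
            conf.getProperty("example.parallelism", String.valueOf(defaultParallelism)));
    }

    // Public entry point; callers no longer pass the configuration around.
    public int parallelism() {
        return estimate(1);
    }
}
```

    Both variants return the same value; the instance version just removes
    the repeated parameter from the call sites.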


- Cheolsoo Park


On July 24, 2014, 12:45 a.m., Daniel Dai wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23787/
> -----------------------------------------------------------
> 
> (Updated July 24, 2014, 12:45 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4057
>     https://issues.apache.org/jira/browse/PIG-4057
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> Summary of changes:
> 1. Move Tez parallelism estimation out of TezDagBuilder into 
> ParallelismSetter, so we can get the estimated parallelism of the cross 
> before creating the GFCross vertex
> 2. Move InputSplit generation out of TezDagBuilder into LoaderProcessor, 
> since we need to know the parallelism of maps before ParallelismSetter
> 3. Set pig.cross.parallelism.hint.(operator_key) in conf
>     * In Tez, this is done when we encounter the cross vertex
>     * In MR, this is done when we encounter the first GFCross
> 4. GFCross will use pig.cross.parallelism.hint.(operator_key) to determine 
> the number of partitions
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/PigConfiguration.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POGlobalRearrange.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezCompiler.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezDagBuilder.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezLauncher.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/TezOperator.java 1612189 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/LoaderProcessor.java PRE-CREATION 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/tez/optimizers/ParallelismSetter.java PRE-CREATION 
>   trunk/src/org/apache/pig/impl/builtin/GFCross.java 1612189 
>   trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1612189 
>   trunk/test/e2e/pig/tests/nightly.conf 1612189 
>   trunk/test/org/apache/pig/test/TestGFCross.java 1612189 
> 
> Diff: https://reviews.apache.org/r/23787/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Daniel Dai
> 
>
