> On July 24, 2014, 7:46 p.m., Cheolsoo Park wrote:
> > Quang-Nhat HOANG-XUAN, thank you very much for the patch.
> > 
> > I briefly went it over and found some high-level issues to be address-
> > 1) Please rebase your patch to trunk. New features are only committed into 
> > trunk.
> > 2) Please avoid using hard-coded strings. Change them to static variables, 
> > etc.
> > 3) Please use primitives instead of objects where possible. JobConf has 
> > set/get[Type] methods, so no need to convert primitives to objects and call 
> > toString().
> > 4) Please avoid adding unnecessary object fields and instead use local 
> > variables.
> > 5) Please make PORollupH2IRGForEach a subclass of POForEach. This seems to 
> > save quite a few code changes.
> > 6) Please use camel case for variable/field/method names and avoid 
> > underscores.
> > 7) Please remove "this" when unnecessary. e.g. this.rollup_position => 
> > rollup_position.
> > 8) Please do not use tabs for indentation but use 4 whitespaces. (You can 
> > set up IDE for this.)
> > 9) Please delete all the trailing whitespaces. (You can set up IDE for 
> > this.)
> > 
> > I stopped reviewing it for now. Once you clean up your patch a bit, I will 
> > resume my review and try to run tests.
> > 
> > Thanks!

Thank you so much!
I will fix them.


- Quang-Nhat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23804/#review48543
-----------------------------------------------------------


On July 22, 2014, 9:20 a.m., Quang-Nhat HOANG-XUAN wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23804/
> -----------------------------------------------------------
> 
> (Updated July 22, 2014, 9:20 a.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4066
>     https://issues.apache.org/jira/browse/PIG-4066
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> This patch aims at addressing the current limitation of the ROLLUP operator 
> in PIG: most of the work is done in the Map phase of the underlying MapReduce 
> job to generate all possible intermediate keys that the reducer use to 
> aggregate and produce the ROLLUP output. Based on our previous work: 
> “Duy-Hung Phan, Matteo Dell’Amico, Pietro Michiardi: On the design space of 
> MapReduce ROLLUP aggregates” 
> (http://www.eurecom.fr/en/publication/4212/download/rs-publi-4212_2.pdf), we 
> show that the design space for a ROLLUP implementation allows for a different 
> approach (in-reducer grouping, IRG), in which less work is done in the Map 
> phase and the grouping is done in the Reduce phase. This patch presents the 
> most efficient implementation we designed (Hybrid IRG), which allows defining 
> a parameter to balance between parallelism (in the reducers) and 
> communication cost.
> This patch contains the following features:
> 1. The new ROLLUP approach: IRG, Hybrid IRG.
> 2. The PIVOT clause in CUBE operators.
> 3. Test cases.
> The new syntax to use our ROLLUP approach:
> alias = CUBE rel BY
> { CUBE col_ref | ROLLUP col_ref [PIVOT pivot_value]} [, { CUBE col_ref | 
> ROLLUP col_ref [PIVOT pivot_value]}
> ...]
> In case there is multiple ROLLUP operator in one CUBE clause, the last ROLLUP 
> operator will be executed with our approach (IRG, Hybrid IRG) while the 
> remaining ROLLUP ahead will be executed with the default approach.
> We have already made some experiments for comparison between our ROLLUP 
> implementation and the current ROLLUP. More information can be found at here: 
> http://hxquangnhat.github.io/PIG-ROLLUP-H2IRG/
> 
> 
> Diffs
> -----
> 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/HExecutionEngine.java 
> 1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/CombinerOptimizer.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MultiQueryOptimizer.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PhyPlanSetter.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/RollupH2IRGPartitioner.java
>  PRE-CREATION 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/DotPOPrinter.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PlanPrinter.java
>  1579421 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PORollupH2IRGForEach.java
>  PRE-CREATION 
>   
> trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java
>  1579421 
>   trunk/src/org/apache/pig/builtin/RollupDimensions.java 1579421 
>   trunk/src/org/apache/pig/newplan/logical/DotLOPrinter.java 1579421 
>   
> trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java
>  1579421 
>   trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 
> 1579421 
>   
> trunk/src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 
> 1579421 
>   trunk/src/org/apache/pig/newplan/logical/optimizer/LogicalPlanPrinter.java 
> 1579421 
>   trunk/src/org/apache/pig/newplan/logical/optimizer/SchemaResetter.java 
> 1579421 
>   trunk/src/org/apache/pig/newplan/logical/optimizer/UidResetter.java 1579421 
>   trunk/src/org/apache/pig/newplan/logical/relational/LOCogroup.java 1579421 
>   trunk/src/org/apache/pig/newplan/logical/relational/LOCube.java 1579421 
>   
> trunk/src/org/apache/pig/newplan/logical/relational/LORollupH2IRGForEach.java 
> PRE-CREATION 
>   
> trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java
>  1579421 
>   
> trunk/src/org/apache/pig/newplan/logical/relational/LogicalRelationalNodesVisitor.java
>  1579421 
>   trunk/src/org/apache/pig/newplan/logical/rules/OptimizerUtils.java 1579421 
>   trunk/src/org/apache/pig/newplan/logical/rules/RollupH2IRGOptimizer.java 
> PRE-CREATION 
>   trunk/src/org/apache/pig/parser/AliasMasker.g 1579421 
>   trunk/src/org/apache/pig/parser/AstPrinter.g 1579421 
>   trunk/src/org/apache/pig/parser/AstValidator.g 1579421 
>   trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1579421 
>   trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g 1579421 
>   trunk/src/org/apache/pig/parser/QueryLexer.g 1579421 
>   trunk/src/org/apache/pig/parser/QueryParser.g 1579421 
>   trunk/test/org/apache/pig/test/TestCubeOperator.java 1579421 
>   trunk/test/org/apache/pig/test/TestNewPlanLogToPhyTranslationVisitor.java 
> 1579421 
>   trunk/test/org/apache/pig/test/utils/GenPhyOp.java 1579421 
> 
> Diff: https://reviews.apache.org/r/23804/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Quang-Nhat HOANG-XUAN
> 
>

Reply via email to