-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23804/#review64932
-----------------------------------------------------------

Ship it!


Ran unit tests and e2e tests.

- Cheolsoo Park


On Dec. 7, 2014, 12:32 p.m., Quang-Nhat HOANG-XUAN wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/23804/
> -----------------------------------------------------------
> 
> (Updated Dec. 7, 2014, 12:32 p.m.)
> 
> 
> Review request for pig.
> 
> 
> Bugs: PIG-4066
>     https://issues.apache.org/jira/browse/PIG-4066
> 
> 
> Repository: pig
> 
> 
> Description
> -------
> 
> This patch aims to address a limitation of the current ROLLUP operator in 
> Pig: most of the work is done in the Map phase of the underlying MapReduce 
> job, which generates all possible intermediate keys that the reducers use to 
> aggregate and produce the ROLLUP output. Based on our previous work: 
> “Duy-Hung Phan, Matteo Dell’Amico, Pietro Michiardi: On the design space of 
> MapReduce ROLLUP aggregates” 
> (http://www.eurecom.fr/en/publication/4212/download/rs-publi-4212_2.pdf), we 
> show that the design space for a ROLLUP implementation allows for a different 
> approach (in-reducer grouping, IRG), in which less work is done in the Map 
> phase and the grouping is done in the Reduce phase. This patch presents the 
> most efficient implementation we designed (Hybrid IRG), which allows defining 
> a parameter to balance between parallelism (in the reducers) and 
> communication cost.
> This patch contains the following features:
> 1. The new ROLLUP approach: IRG, Hybrid IRG.
> 2. The PIVOT clause in CUBE operators.
> 3. Test cases.
> The new syntax to use our ROLLUP approach:
> alias = CUBE rel BY { CUBE col_ref | ROLLUP col_ref [PIVOT pivot_value] }
>     [, { CUBE col_ref | ROLLUP col_ref [PIVOT pivot_value] } ...]
> If there are multiple ROLLUP operators in one CUBE clause, the last ROLLUP 
> operator is executed with our approach (IRG, Hybrid IRG), while the 
> preceding ROLLUP operators are executed with the default approach.
> We have already run experiments comparing our ROLLUP implementation with the 
> current ROLLUP. More information can be found here: 
> http://hxquangnhat.github.io/PIG-ROLLUP-H2IRG/
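
The contrast drawn in the description, between expanding keys in the Map phase and grouping in the Reduce phase, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in the patch: the function names and sample rows are hypothetical, the aggregate is assumed to be SUM, a single reducer is simulated with an in-memory dictionary, and the PIVOT parameter of Hybrid IRG is not modeled.

```python
from collections import defaultdict


def rollup_keys(dims):
    """Yield every ROLLUP level of a dimension tuple, finest first.

    (2014, 12, 7) -> (2014, 12, 7), (2014, 12, None), (2014, None, None),
    (None, None, None)
    """
    n = len(dims)
    for level in range(n, -1, -1):
        yield dims[:level] + (None,) * (n - level)


def rollup_map_side(rows):
    """Default approach (assumed): the mapper emits one record per level."""
    emitted = [(key, value) for dims, value in rows for key in rollup_keys(dims)]
    totals = defaultdict(int)
    for key, value in emitted:  # the reducers only sum what they receive
        totals[key] += value
    return dict(totals), len(emitted)


def rollup_in_reducer(rows):
    """IRG idea (assumed): the mapper emits each row once, and a single
    simulated reducer derives every coarser level itself."""
    emitted = [(dims, value) for dims, value in rows]
    totals = defaultdict(int)
    for dims, value in emitted:
        for key in rollup_keys(dims):
            totals[key] += value
    return dict(totals), len(emitted)


# Hypothetical (year, month, day) rows with a value to sum.
rows = [((2014, 12, 7), 3), ((2014, 12, 8), 4), ((2014, 11, 1), 5), ((2013, 1, 1), 1)]
map_side_totals, map_side_records = rollup_map_side(rows)
irg_totals, irg_records = rollup_in_reducer(rows)
print(map_side_records, irg_records)
```

Both functions return the same totals for these rows, but the map-side variant shuffles 16 records while the in-reducer variant shuffles 4. The cost of the second variant in this sketch is that all grouping happens in one reducer, which is the parallelism versus communication trade-off the description says the pivot parameter is meant to balance.
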
> 
> 
> Diffs
> -----
> 
>   trunk/src/org/apache/pig/Main.java 1642549 
>   trunk/src/org/apache/pig/PigConstants.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/JobControlCompiler.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/MRCompiler.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/PigGenericMapReduce.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/mapReduceLayer/partitioners/RollupHIIPartitioner.java PRE-CREATION 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/expressionOperators/POUserFunc.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/plans/PhyPlanVisitor.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/POPackage.java 1642549 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/relationalOperators/PORollupHIIForEach.java PRE-CREATION 
>   trunk/src/org/apache/pig/backend/hadoop/executionengine/physicalLayer/util/PlanHelper.java 1642549 
>   trunk/src/org/apache/pig/builtin/RollupDimensions.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/expression/ExpToPhyTranslationVisitor.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/expression/UserFuncExpression.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/optimizer/LogicalPlanOptimizer.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/relational/LOCogroup.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/relational/LOCube.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/relational/LORollupHIIForEach.java PRE-CREATION 
>   trunk/src/org/apache/pig/newplan/logical/relational/LogToPhyTranslationVisitor.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/relational/LogicalPlan.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/relational/LogicalRelationalNodesVisitor.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/rules/OptimizerUtils.java 1642549 
>   trunk/src/org/apache/pig/newplan/logical/rules/RollupHIIOptimizer.java PRE-CREATION 
>   trunk/src/org/apache/pig/parser/AliasMasker.g 1642549 
>   trunk/src/org/apache/pig/parser/AstPrinter.g 1642549 
>   trunk/src/org/apache/pig/parser/AstValidator.g 1642549 
>   trunk/src/org/apache/pig/parser/LogicalPlanBuilder.java 1642549 
>   trunk/src/org/apache/pig/parser/LogicalPlanGenerator.g 1642549 
>   trunk/src/org/apache/pig/parser/QueryLexer.g 1642549 
>   trunk/src/org/apache/pig/parser/QueryParser.g 1642549 
>   trunk/test/org/apache/pig/test/TestCubeOperator.java 1642549 
> 
> Diff: https://reviews.apache.org/r/23804/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Quang-Nhat HOANG-XUAN
> 
>
