[ 
https://issues.apache.org/jira/browse/PIG-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12884105#action_12884105
 ] 

Xuefu Zhang commented on PIG-1321:
----------------------------------

Here is the scope of this type of optimization:

Pre-condition: 
1. There are two consecutive foreach statements.
2. The second foreach statement has a simple inner plan in which the only 
statement is a GENERATE statement. In other words, the second foreach statement 
must be something like "FOREACH A GENERATE ...."
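
To illustrate the pre-condition, here is a rough sketch of one pair of 
statements that would qualify and one that would not. This is only one reading 
of the rules above; the relation and field names (B1, C1, B2, C2, G, k, recs, 
S) are made up for the example.

```
A = load 'file.txt' as (a, b, c);

-- Qualifies: the second foreach (C1) consists of a single GENERATE.
B1 = foreach A generate a+b as u, c-b as v;
C1 = foreach B1 generate u*2, v;

-- Does not qualify: the second foreach (C2) has a nested statement
-- (the ORDER) ahead of its GENERATE, so its inner plan is not simple.
G = group A by a;
B2 = foreach G generate group as k, A as recs;
C2 = foreach B2 {
    S = order recs by b;
    generate k, COUNT(S);
};
```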

Optimization result:
The two foreach statements will be merged into one. The new foreach statement 
keeps the first foreach statement's inner plan, with new expressions for its 
GENERATE statement. These new expressions are derived from those in the second 
foreach's GENERATE statement by substituting in the corresponding expressions 
from the first foreach's GENERATE statement. For instance, suppose we have the 
following Pig script:

A = load 'file.txt' as (a, b, c);
B = foreach A generate a+b as u, c-b as v;
C = foreach B generate $0+5, v;
dump C;

The optimized plan after the merge-foreach optimization will be equivalent to 
the following Pig script:

A = load 'file.txt' as (a, b, c);
C = foreach A generate a+b+5, c-b;
dump C;

Of course, the first foreach can have an arbitrarily complex inner plan, which 
remains unchanged in the new foreach statement.
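
As a sketch of that case, assuming the merge behaves as described above, 
consider a first foreach with a nested FILTER. The script and its names (G, S, 
k, n) are hypothetical and only meant to show the expected shape of the result:

```
A = load 'file.txt' as (a, b, c);
G = group A by a;
-- First foreach: nested inner plan (FILTER) followed by GENERATE.
B = foreach G {
    S = filter A by b > 0;
    generate group as k, COUNT(S) as n;
};
-- Second foreach: a single GENERATE, so the pair is a merge candidate.
C = foreach B generate k, n + 1;
dump C;
```

The merged plan would then be expected to be equivalent to:

```
A = load 'file.txt' as (a, b, c);
G = group A by a;
-- The nested FILTER is kept as is; only the GENERATE expressions change.
C = foreach G {
    S = filter A by b > 0;
    generate group, COUNT(S) + 1;
};
dump C;
```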

A patch for this optimization is coming soon...

> Logical Optimizer: Merge cascading foreach
> ------------------------------------------
>
>                 Key: PIG-1321
>                 URL: https://issues.apache.org/jira/browse/PIG-1321
>             Project: Pig
>          Issue Type: Sub-task
>          Components: impl
>    Affects Versions: 0.7.0
>            Reporter: Daniel Dai
>            Assignee: Xuefu Zhang
>
> We can merge consecutive foreach statements.
> E.g.:
> b = foreach a generate a0#'key1' as b0, a0#'key2' as b1, a1;
> c = foreach b generate b0#'kk1', b0#'kk2', b1, a1;
> => c = foreach a generate a0#'key1'#'kk1', a0#'key1'#'kk2', a0#'key2', a1;

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
