[ 
https://issues.apache.org/jira/browse/PIG-161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605551#action_12605551
 ] 

shravanmn edited comment on PIG-161 at 6/17/08 4:15 AM:
-----------------------------------------------------------------------------

I have a different suggestion.

Top level plan:
load -> group -> foreach

The foreach will have a nested plan:
plan1: project(1) -> distinct -> accumulate
{{{
Add
|
|---- project(0)
|
|---- accumulate
        |        |
        |        |---SUM()
        |              |
        |              |--- project(1)
        |                    |
        |                    |--- project(*)
        |---- distinct
                |
                |---- project(1)
}}}
But I think we still have some issues with this. Consider this:
{{{
A = load 'myfile';
B = group A by $0;
C = foreach B {
    C1 = distinct $1;
    C2 = filter $1 by $0>10;
    generate group + SUM(C1.$1),
             (myUDF1(C1,C2)*myUDF2(C1,C2))+(COUNT(C1)*group);
};
}}}
Here Accumulate definitely needs to handle multiple inputs, because myUDF1 and myUDF2 each take both C1 and C2.
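
To make the multiple-input point concrete, here is a rough Python sketch of what the script above would compute for each group. It is only an illustration of the semantics, not the physical operators themselves. The bodies of my_udf1 and my_udf2 are made-up stand-ins (the real myUDF1 and myUDF2 are not defined in this issue), and the sample rows are invented.

```python
from collections import defaultdict


def my_udf1(c1, c2):
    # Hypothetical stand-in for myUDF1: combined size of both bags.
    return len(c1) + len(c2)


def my_udf2(c1, c2):
    # Hypothetical stand-in for myUDF2: size of the larger bag.
    return max(len(c1), len(c2))


def group_by_first_field(rows):
    # Mimics "B = group A by $0": maps each key to its bag of tuples.
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return dict(groups)


def distinct(bag):
    # Mimics "distinct $1": drop duplicate tuples, keep first-seen order.
    return list(dict.fromkeys(bag))


def evaluate_group(key, bag):
    c1 = distinct(bag)                  # C1 = distinct $1
    c2 = [t for t in bag if t[0] > 10]  # C2 = filter $1 by $0>10
    # generate group + SUM(C1.$1): needs only C1
    first = key + sum(t[1] for t in c1)
    # The second expression needs C1 and C2 at the same time.
    second = my_udf1(c1, c2) * my_udf2(c1, c2) + len(c1) * key
    return (first, second)


# Invented sample data standing in for 'myfile'.
rows = [(5, 1), (5, 1), (5, 2), (12, 3), (12, 4)]
results = {key: evaluate_group(key, bag)
           for key, bag in group_by_first_field(rows).items()}
print(results)
```

The first generated expression can be fed from a single nested plan (distinct, then SUM), but the second one cannot be evaluated until both C1 and C2 are available for the same group, which is why a single-input Accumulate does not seem to be enough.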


> Rework physical plan
> --------------------
>
>                 Key: PIG-161
>                 URL: https://issues.apache.org/jira/browse/PIG-161
>             Project: Pig
>          Issue Type: Sub-task
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: arithmeticOperators.patch, BinCondAndNegative.patch, 
> CastAndMapLookUp.patch, incr2.patch, incr3.patch, incr4.patch, incr5.patch, 
> logToPhyTranslator.patch, missingOps.patch, 
> MRCompilerTests_PlansAndOutputs.txt, Phy_AbsClass.patch, physicalOps.patch, 
> physicalOps.patch, physicalOps.patch, physicalOps.patch, 
> physicalOps_latest.patch, POCast.patch, POCast.patch, podistinct.patch, 
> pogenerate.patch, pogenerate.patch, pogenerate.patch, posort.patch, 
> POUserFuncCorrection.patch, 
> TEST-org.apache.pig.test.TestLocalJobSubmission.txt, 
> TEST-org.apache.pig.test.TestLogToPhyCompiler.txt, 
> TEST-org.apache.pig.test.TestLogToPhyCompiler.txt, 
> TEST-org.apache.pig.test.TestMapReduce.txt, 
> TEST-org.apache.pig.test.TestTypeCheckingValidator.txt, 
> TEST-org.apache.pig.test.TestUnion.txt, translator.patch, translator.patch, 
> translator.patch, translator.patch
>
>
> This bug tracks work to rework all of the physical operators as described in 
> http://wiki.apache.org/pig/PigTypesFunctionalSpec
