[ 
https://issues.apache.org/jira/browse/PIG-158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591287#action_12591287
 ] 

Pi Song commented on PIG-158:
-----------------------------


2) Based on how you intend to use it, I would say LOProject is a function 
from Tuple x Tuple to Tuple (in our data model):
{noformat}
LOProject:(Tuple x Tuple) -> Tuple
{noformat}
or as a method:-
{noformat}
OutputTuple LOProject(InputTuple, IndexListTuple)
{noformat}
So your example $1.($0, $1, $2) can be written like this:
{noformat}
LOProject( LOProject(A_Tuple_From_Input_Bag, {1}), {0,1,2} )        
{noformat}  
which is the opposite of your solution. This is a bit tricky, isn't it?
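
A minimal Java sketch of this functional view, not the actual LOProject class. The class name ProjectSketch, the method project, and the use of List<Object> as a tuple are all assumptions made for illustration. It also assumes that a single index pointing at a nested tuple yields that nested tuple, so the outer call can index into it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ProjectSketch {
    // Hypothetical stand-in for LOProject: (Tuple x Tuple) -> Tuple.
    // A tuple is modelled as List<Object>; the index list is an int array.
    static List<Object> project(List<Object> input, int... indexes) {
        // Assumption: one index that points at a nested tuple returns that tuple.
        if (indexes.length == 1 && input.get(indexes[0]) instanceof List) {
            @SuppressWarnings("unchecked")
            List<Object> nested = (List<Object>) input.get(indexes[0]);
            return nested;
        }
        // Otherwise build a new tuple from the selected fields, in order.
        List<Object> out = new ArrayList<>();
        for (int i : indexes) {
            out.add(input.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object> inner = Arrays.asList("a", "b", "c", "d");
        List<Object> tuple = Arrays.asList(10, inner, 30);
        // $1.($0, $1, $2): the inner call runs first, the outer call second.
        System.out.println(project(project(tuple, 1), 0, 1, 2)); // prints [a, b, c]
    }
}
```

The inner call is evaluated first, which is why $1 ends up on the inside of the expression.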

From what I can see here, LOGenerate fits the nested plan model very well, 
and it should hold an ArrayList<LogicalPlan>. You can look at the inner 
LOProject as the upstream operator and the outer one as the downstream 
operator. You need a list of plans because you may have to handle 
$1.($0, $1, $2) , $2 , $3.($1,$2).
Another suggestion is that we should also mark tuple operators by giving them 
a common parent class.
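
The list-of-plans idea could be sketched in Java roughly as follows. This is an illustration only: Plan, then, applyStep, and generate are hypothetical names, not the real LogicalPlan or LOGenerate API, and each plan is simplified to a linear chain of projections with the upstream step first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GenerateSketch {
    // Hypothetical stand-in for a nested plan: a chain of projections, upstream first.
    static class Plan {
        private final List<int[]> steps = new ArrayList<>();

        Plan then(int... indexes) {
            steps.add(indexes);
            return this;
        }

        Object eval(Object value) {
            for (int[] indexes : steps) {
                value = applyStep(value, indexes);
            }
            return value;
        }
    }

    // Assumption: one index selects the field itself; several indexes build a new tuple.
    static Object applyStep(Object value, int[] indexes) {
        List<?> tuple = (List<?>) value;
        if (indexes.length == 1) {
            return tuple.get(indexes[0]);
        }
        List<Object> out = new ArrayList<>();
        for (int i : indexes) {
            out.add(tuple.get(i));
        }
        return out;
    }

    // Hypothetical generate step: one plan per output column.
    static List<Object> generate(List<Plan> plans, List<Object> input) {
        List<Object> out = new ArrayList<>();
        for (Plan plan : plans) {
            out.add(plan.eval(input));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object> input = Arrays.asList(
                "x",
                Arrays.asList("a", "b", "c", "d"),
                7,
                Arrays.asList("p", "q", "r"));
        List<Plan> plans = Arrays.asList(
                new Plan().then(1).then(0, 1, 2),   // $1.($0, $1, $2)
                new Plan().then(2),                 // $2
                new Plan().then(3).then(1, 2));     // $3.($1, $2)
        System.out.println(generate(plans, input)); // prints [[a, b, c], 7, [q, r]]
    }
}
```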

3) I think having nested operators (some of which might not be in any plan) 
will cause more headaches, because we will have to handle special cases for 
operators that are just floating around (depending on the implementation of 
the operator those floating operators are attached to). 
I always emphasize nested plans because, with a consistent nested model, we 
can define our operations on plans using recursive definitions, which I find 
simpler than having the same logic work differently in different places. I've 
started doing this by implementing type checking and schema merging using 
recursive definitions, to prove that this concept really does make things 
simpler.
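
To illustrate what a recursive definition over a consistent nested model might look like, here is a small Java sketch. Op, Plan, collect, and example are hypothetical names and the operator labels are plain strings; this is not the type-checking or schema-merging code mentioned above, only the shape of the recursion.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedPlanSketch {
    // Hypothetical operator: may own any number of inner plans.
    static class Op {
        final String name;
        final List<Plan> innerPlans = new ArrayList<>();
        Op(String name) { this.name = name; }
    }

    // Hypothetical plan: an ordered list of operators.
    static class Plan {
        final List<Op> ops = new ArrayList<>();
        Plan add(Op op) { ops.add(op); return this; }
    }

    // Recursive definition: handle each operator, then recurse into its inner plans.
    // The same rule applies at every nesting level, so there are no special cases.
    static void collect(Plan plan, int depth, List<String> out) {
        for (Op op : plan.ops) {
            out.add(depth + ":" + op.name);
            for (Plan inner : op.innerPlans) {
                collect(inner, depth + 1, out);
            }
        }
    }

    static List<String> example() {
        Op generate = new Op("generate");
        generate.innerPlans.add(new Plan()
                .add(new Op("project $1"))
                .add(new Op("project ($0,$1,$2)")));
        generate.innerPlans.add(new Plan().add(new Op("project $2")));

        Plan outer = new Plan().add(new Op("load")).add(generate);
        List<String> out = new ArrayList<>();
        collect(outer, 0, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

Any operation that can be phrased this way (a rule for one operator plus a recursive call on its inner plans) works unchanged at any depth.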

7) I have been trying to find special cases where the logger cannot be static. 
If you know of any such cases, please shed some light.

8) Don't you think "flatten" should be associated with each column in 
LOGenerate? So LOGenerate may have "List<Boolean> isFlatten". Basically, if the 
mapped column is not a bag, flattening it is meaningless.
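
A per-column flag could look roughly like this in Java. FlattenSketch, ColumnType, and meaninglessFlattens are hypothetical names for illustration. Note that Java generics do not accept primitive type arguments, so the list has to be List<Boolean>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FlattenSketch {
    // Hypothetical column kinds, just enough to express the check.
    enum ColumnType { BAG, TUPLE, SCALAR }

    // Returns the indexes of columns whose flatten flag is set although the
    // column is not a bag, following the rule suggested in the comment above.
    static List<Integer> meaninglessFlattens(List<ColumnType> types, List<Boolean> isFlatten) {
        if (types.size() != isFlatten.size()) {
            throw new IllegalArgumentException("one flatten flag is needed per column");
        }
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < types.size(); i++) {
            if (isFlatten.get(i) && types.get(i) != ColumnType.BAG) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<ColumnType> types = Arrays.asList(ColumnType.BAG, ColumnType.SCALAR, ColumnType.BAG);
        List<Boolean> isFlatten = Arrays.asList(true, true, false);
        System.out.println(meaninglessFlattens(types, isFlatten)); // prints [1]
    }
}
```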

One more question:-
- ForEach and Generate are always in the same statement, so they are always 
used together. I think what you've done somehow separates their 
responsibilities. Could you please explain how they are used?

PS. You will see that I really emphasize model consistency because I believe 
that is how to simplify things. If you don't have too many exceptional cases, 
then the logical model can be much simpler.

> Rework logical plan
> -------------------
>
>                 Key: PIG-158
>                 URL: https://issues.apache.org/jira/browse/PIG-158
>             Project: Pig
>          Issue Type: Sub-task
>          Components: impl
>            Reporter: Alan Gates
>            Assignee: Alan Gates
>         Attachments: logical_operators.patch, logical_operators_rev_1.patch, 
> logical_operators_rev_2.patch, logical_operators_rev_3.patch, 
> parser_changes.patch, ParserErrors.txt, visitorWalker.patch
>
>
> Rework the logical plan in line with 
> http://wiki.apache.org/pig/PigExecutionModel
