Dear Wiki user, You have subscribed to a wiki page or wiki category on "Pig Wiki" for change notification.
The following page has been changed by PiSong:
http://wiki.apache.org/pig/NestedLogicalPlan

------------------------------------------------------------------------------
  [pi] We can think about this in two ways: first, only one of them does all the work. Second, we split responsibilities. I'm confused about which it is. We should come up with a clear-cut division of responsibilities. Though, if you say "foreach just takes each input and uses", then it is not a dummy.

+ [pi] This is one possible way to describe the internal operations of FOREACH GENERATE:
+
+ Operator FOREACH:
+ {{{
+ FOREACH: Bag x (f: Tuple -> Tuple) x (list of flatten indexes) -> Bag
+ }}}
+  1. Iterate through the bag from the input port.
+  1. For each tuple in the bag, apply f: Tuple -> Tuple (which is the inner plan).
+  1. Flatten and put all the output tuples into the output bag. Repeat the previous step for the next tuple.
+  1. Output the bag to the output port.
+
+ This way we don't need GENERATE and only use a normal inner plan in FOREACH. The list of flatten flags belongs to FOREACH.
+
+
  ==== LOProject ====
  This operator is only for mapping an input tuple to an output tuple (e.g. {A,B,C,D,E} ==> {A,C,D}). Given that we allow users to specify fields in COGROUP, FILTER, and FOREACH as expressions, LOProject becomes just a special case in which users merely specify a direct mapping. Since we have agreed upon the concept of inner plans, I think LOProject is not needed.

  [shrav] Project is a consistent way of implementing the fields that the user mentions, without making the user bother about all the conversions he might need to do if we just passed the raw tuple to him. Also, you can only project out one field and not multiple fields.

+ [pi] What you mentioned here is different from the current implementation.
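
The FOREACH steps that [pi] describes above could be sketched roughly as follows. This is only an illustration in Python, not Pig's actual implementation: the names `foreach` and `flatten_tuple` are hypothetical, a bag is modelled as a plain list of tuples, and the flatten semantics (a flattened nested bag is expanded as a cross product with the other fields) is an assumption made for the sketch.

```python
from itertools import product
from typing import Callable, Iterable, List

Bag = List[tuple]  # assumption: a bag is modelled as a list of tuples


def flatten_tuple(t: tuple, flatten_indexes: Iterable[int]) -> Bag:
    """Expand the fields of t at flatten_indexes (hypothetical helper).

    A flattened nested bag contributes one output row per inner tuple;
    a flattened nested tuple has its fields spliced in; any other field
    is kept unchanged.
    """
    flat = set(flatten_indexes)
    choices = []
    for i, field in enumerate(t):
        if i in flat and isinstance(field, list):
            choices.append([tuple(inner) for inner in field])
        elif i in flat and isinstance(field, tuple):
            choices.append([field])
        else:
            choices.append([(field,)])
    # Cross product of the per-field choices, concatenated into flat tuples.
    return [sum(combo, ()) for combo in product(*choices)]


def foreach(bag: Bag,
            f: Callable[[tuple], tuple],
            flatten_indexes: Iterable[int] = ()) -> Bag:
    """FOREACH: Bag x (f: Tuple -> Tuple) x (flatten indexes) -> Bag."""
    flatten_indexes = list(flatten_indexes)
    output: Bag = []
    for t in bag:                      # step 1: iterate through the input bag
        result = f(t)                  # step 2: apply the inner plan
        output.extend(flatten_tuple(result, flatten_indexes))  # step 3
    return output                      # step 4: emit the output bag


if __name__ == "__main__":
    data = [(1, [(10,), (20,)]), (2, [(30,)])]
    # Inner plan that passes both fields through; field 1 is flattened.
    print(foreach(data, lambda t: (t[0], t[1]), [1]))
    # prints [(1, 10), (1, 20), (2, 30)]
```

In this sketch the flatten list is an argument of `foreach` itself and the inner plan is an ordinary function, which mirrors the point that no separate GENERATE operator would be required.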