Though the grammar does not support nested foreach, Pig already has some limited support for it in the logical plan and at runtime. For example, the following script produces a nested foreach in the plan:

a = load '1.txt' as (a0, a1, a2);
b = group a by a0;
c = foreach b {
    c0 = a.a0;
    generate c0;
};
explain c;

So I believe the basic pieces needed to make nested foreach work are already there. We further need to:
1. Allow the parser to handle a real nested foreach statement, and define the limitations of the nested foreach we support
2. Make sure Pig handles the extended scope of nested foreach
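
As a rough sketch of item 1, a real nested foreach statement might look like the script below. The syntax is an assumption for illustration only (the grammar discussed here does not accept it), and the choice of projected fields is hypothetical:

```pig
-- Hypothetical syntax: an inner foreach over the grouped bag,
-- which the parser would need to accept.
a = load '1.txt' as (a0, a1, a2);
b = group a by a0;
c = foreach b {
    -- inner foreach runs once per group, over the bag 'a'
    c0 = foreach a generate a1, a2;
    generate group, c0;
};
explain c;
```

Item 2 then concerns which aliases (here a, c0) are visible inside and outside the inner block.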

Daniel

On 05/22/2011 11:52 AM, Aniket Mokashi wrote:
Hi,

Thank you everyone for all your support. It has been a very enjoyable
experience to work with the Pig community.

I plan to get involved through the GSoC platform to contribute to the Pig project. I
will be working on adding support for nested foreach. I will also try
to work on JIRAs related to this support (please assign related JIRAs to
me). My GSoC proposal can be found at:
http://www.google-melange.com/gsoc/proposal/review/google/gsoc2011/aniket486/1

I worked on a couple of interesting projects at Yahoo last summer to learn
about the internals of the Pig parser, logical plan building, and construction of physical
and MR plans from the logical plan. While working on support for scalars, I
learned about the various passes Pig uses to reconstruct plans to optimize
execution, and their limitations. In Pig 0.9, a few things have changed with
the parsers and optimizers. Hence, it would be helpful if you could share
any comments and remarks on my approach.

Here are my current thoughts on support for nested foreach
(https://issues.apache.org/jira/browse/PIG-1631):
Pig currently supports nested_proj, which internally streams the bag. This
support can be extended by assigning an innerplan to this streaming with
nested_foreach. The first step is to add parser support for this. However, this
would need further changes to restrict generic support in the innerplan
according to Pig's limitations. Currently, I am exploring various
possibilities for adding buildNestedForeachOp to logicalplanbuilder, with or
without using the existing "generate_clause". I will upload a patch to JIRA once
I get projection support working through nested foreach.
Please let me know your comments on the same.

Thanks,
Aniket

On Thu, May 19, 2011 at 1:19 PM, Ashutosh Chauhan <[email protected]> wrote:

Congratulations, Aniket!
Hoping to see many more contributions in Pig from you.

Ashutosh

On Thu, May 19, 2011 at 10:08, Alan Gates <[email protected]> wrote:
Please join me in welcoming Aniket Mokashi as a new committer on Pig.
Aniket has been contributing to Pig since last summer. He wrote or helped
shepherd several major features in 0.8, including the Python UDF work, the
new mapreduce functionality, and the custom partitioner. We look forward to
more great work from him in the future.

Alan.



