[ 
https://issues.apache.org/jira/browse/PIG-430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12631924#action_12631924
 ] 

shravanmn edited comment on PIG-430 at 9/17/08 1:26 PM:
-----------------------------------------------------------------------------

I have fixed the part of the problem that concerns the project issue. The 
issue with distinct still remains. The problem is that projects are being 
introduced into the input of distinct, which creates a unique case where 
projection chaining does not work. It is similar to the case where you assign 
a nested project to a variable inside a nested block, which was solved by 
replacing the nested project with a foreach statement. The solution to the 
distinct problem should be similar: the input to distinct should also be 
allowed to be a nested project. I made a local change replacing BaseEvalSpec 
with NestedProject in my code, and it works. However, I don't want to break 
anything, since I am not fully aware of the side effects of changing this in 
the parser. It's better if someone more comfortable with the parser takes a 
look at this one.

Also, I think there are some issues with the parsing of nested blocks. I tried 
the following, and the parser never terminates the nested block; it just keeps 
waiting for more input:

A = load 'file';
B = group A by $0;
C = foreach B { C1=distinct "const"; generate C1;};

I have no idea why this happens, but I tried it because I thought that the 
input to a nested distinct shouldn't be BaseEvalSpec, which can include 
FuncEvalSpecs and Constants. I think we need to change things a bit here.

> Projections in nested filter and inside foreach do not work
> -----------------------------------------------------------
>
>                 Key: PIG-430
>                 URL: https://issues.apache.org/jira/browse/PIG-430
>             Project: Pig
>          Issue Type: Bug
>    Affects Versions: types_branch
>            Reporter: Santhosh Srinivasan
>            Assignee: Shravan Matthur Narayanamurthy
>             Fix For: types_branch
>
>         Attachments: 430-1.patch
>
>
> The following queries do not work:
> Nested filter:
> a = load 'studenttab10k' as (name, age, gpa);
> b = filter a by age < 20;
> c = group b by age;
> d = foreach c { cf = filter b by gpa < 3.0; cp = cf.gpa; cd = distinct cp;
> co = order cd by $0; generate group, flatten(co); }
> store d into 'output';
> Nested Distinct:
> a = load '/user/pig/tests/data/singlefile/studenttab10k' as (name, age, gpa);
> b = group a by name;
> c = foreach b { aa = distinct a.age; generate group, COUNT(aa); }
> store c into 'output';

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
