How can I tell that 'A.score' is a bag? If I issue a 'describe B' command,
I get B: {group:int, A: {name:chararray, no:int, score:int}}. Nothing in
that output tells me that 'A.score' is a bag; I can only see that score is
a field of the tuples inside bag A.
And why does it work if I delete the qualifier 'A.'?

I changed my Pig code to:

A = LOAD '/home/huyong/test/student.txt' AS (name:chararray, no:int, score:int);
B = GROUP A BY no;
C = FOREACH B {
    D = FILTER A BY score > 80;
    GENERATE D.name, D.score;
}
DUMP C;

I got an empty bag!

The input is:
henrietta       1       25
sally   1       82
fred    3       120
elsie   4       45

The output is:
({(sally)},{(82)})
({(fred)},{(120)})
({},{})

As you can see, the last row contains only empty bags. Why?
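
A likely explanation, assuming the nested FILTER runs once per group: group 4
holds only (elsie, 4, 45), no tuple there passes score > 80, so D is an empty
bag and both projections come out as {}. The sketch below keeps the group key
and drops such rows with the builtin IsEmpty. The aliases names, scores and E
are made up for illustration, and the relative path 'student.txt' stands in
for the real input file.

```pig
-- Sketch only: same pipeline as above, plus a filter for empty groups.
-- 'student.txt' is a placeholder path; the default loader expects tab-separated fields.
A = LOAD 'student.txt' AS (name:chararray, no:int, score:int);
B = GROUP A BY no;
C = FOREACH B {
    -- D holds only the tuples of this group with score > 80 (possibly none)
    D = FILTER A BY score > 80;
    GENERATE group, D.name AS names, D.score AS scores;
};
-- Drop the groups in which no tuple survived the nested filter
E = FILTER C BY NOT IsEmpty(names);
DUMP E;
```

With the input above, this should leave only the rows for groups 1 and 3.
Flattening D instead also removes the empty groups, since FLATTEN on an empty
bag produces no rows.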

Yong

2011/7/19 Jacob Perkins

> I think it's because 'A.score' is a bag but Pig needs a reference to a
> field in the tuples. This worked for me:
>
> A = LOAD 'foo.tsv' AS (name:chararray, no:int, score: int);
> B = GROUP A BY no;
> C = FOREACH B {
>     D = FILTER A BY score > 80;
>     GENERATE FLATTEN(D.(name, score));
> };
> DUMP C;
>
> on the following data:
>
> $: cat foo.tsv
> henrietta       1       25
> sally   1       82
> fred    3       120
> elsie   4       45
>
> yields:
>
>
> Does that work for you?
>
> --jacob
> @thedatachef
>
> On Tue, 2011-07-19 at 15:00 +0200, 勇胡 wrote:
> > A = LOAD '/home/test/student.txt' AS (name:chararray, no:int, score:int);
> > B = GROUP A BY no;
> > C =  FOREACH B {
> >     D = FILTER A BY A.score > 80;
> >     GENERATE D.name, D.score;}
> > DUMP C;
>
>
