You can slice a bag, but not a bag of bags. If you do want to project x,
do it early, before the second group:
A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
B = group A by x;
B1 = foreach B generate group, A.x as Ax;
C = group B1 by group;
E = foreach C generate B1.(group, Ax);
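To see where x ends up at each step, a describe after each relation can help. This is a minimal sketch, assuming the same foo.txt and the same relation names as above. The schema notes in the comments are expectations, not captured output, and the exact describe formatting depends on the Pig version:

```pig
A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
B = group A by x;
-- Project x while the bag A is still only one level deep.
B1 = foreach B generate group, A.x as Ax;
describe B1;  -- expected: group, plus a bag Ax that holds only x
C = group B1 by group;
describe C;   -- expected: group, plus a bag B1 of (group, Ax) tuples
-- Ax is now a plain field of B1, so no bag-of-bags projection is needed.
E = foreach C generate B1.(group, Ax);
describe E;
```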
Daniel
Kris Coward wrote:
Dare I ask why such a query would be used? AFAICT the second group
operation would just stick each record in a bag and create an extra
copy of group on the outside of the bag (but use up a lot more
computational power than a UDF that would just do the same thing
explicitly).
Cheers,
Kris
On Thu, Dec 09, 2010 at 03:34:58PM -0800, Lin Guo wrote:
A = load 'foo.txt' using PigStorage as (x : chararray, y : int);
B = group A by x;
C = group B by group;
describe C;
-- we got
-- C: {group: chararray,B: {group: chararray,A: {x: chararray,y: int}}}
D = foreach C generate B.(group, A); -- this works
describe D;
E = foreach C generate B.(group, A.(x));
describe E;
-- Pig returns a syntax error here. Should this work, or is there a patch for it?
thanks,
lin