On 3/5/12 7:19 PM, guoyun wrote:
Dear All,
        This is the wiki's description of distinct:

        grunt> A = load 'mydata' using PigStorage() as (a, b, c);
        grunt> B = group A by a;
        grunt> C = foreach B {
                D = distinct A.b;
                generate flatten(group), COUNT(D);
        }
        
        But what if field b has sub-fields, for example:
        A = load 'mydata' using PigStorage() as (a, b(b1,b2,b3), c);
        
        If I want the distinct values of the sub-field b1, how can I do that?
        Pig does not allow D = distinct A.b.b1;
        
        Thank you!





You need to use another nested foreach statement:

C = foreach B {
        B1BAG = foreach A generate b.b1;
        D = distinct B1BAG;
        generate flatten(group), COUNT(D);
}
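
A foreach inside a nested block is, as far as is known here, only accepted by newer Pig releases (Pig 0.10 and later). On an older version, a possible workaround is to project b1 up to a top-level column before grouping, so that the nested distinct only needs a single-level projection. The sketch below is a different technique from the nested foreach above, not a restatement of it. The b:tuple(...) schema form, the chararray types, and the alias A1 are assumptions for illustration and are not taken from the original question.

        -- assumed schema: b declared as a tuple; the field types are illustrative
        A  = load 'mydata' using PigStorage() as (a:chararray, b:tuple(b1:chararray, b2:chararray, b3:chararray), c:chararray);
        -- pull the sub-field b1 up to a top-level column before grouping
        A1 = foreach A generate a, b.b1 as b1;
        B  = group A1 by a;
        C  = foreach B {
                -- distinct now sees a bag with the single column b1
                D = distinct A1.b1;
                generate flatten(group), COUNT(D);
        }

Both forms should produce one row per value of a with the number of distinct b1 values. The pre-projection variant also drops b2, b3 and c before the group, which reduces the data carried through the shuffle.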

-Thejas
