The only approach I can think of is to pass all of your potential keys, along with the flags, to a UDF that builds a tuple containing only the fields you actually want to group on. That tuple becomes the real key, and you GROUP (and do whatever else you need) on it.
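
To make the idea concrete, here is a rough sketch of the key-building logic in plain Python. The function name make_group_key and its argument order are made up for illustration; this is not an existing Pig function, and the loop at the bottom only simulates the grouping locally rather than running inside Pig.

```python
# Hypothetical helper: the name and signature are illustrative, not part of Pig.
def make_group_key(a, b, c, d, group_on_a, group_on_b, group_on_c, group_on_d):
    """Return a tuple holding only the fields whose flag is 'T'."""
    fields = (a, b, c, d)
    flags = (group_on_a, group_on_b, group_on_c, group_on_d)
    return tuple(value for value, flag in zip(fields, flags) if flag == 'T')


# Local simulation of "GROUP R BY <computed key>" on a few sample rows
# shaped like R = (a, b, c, d, e). The sample data is invented.
rows = [
    (1, 'x', 10, 'p', 0.5),
    (1, 'x', 20, 'q', 0.7),
    (1, 'y', 10, 'p', 0.9),
]

groups = {}
for a, b, c, d, e in rows:
    # Flags correspond to group_on_a = T, group_on_b = T, the rest F.
    key = make_group_key(a, b, c, d, 'T', 'T', 'F', 'F')
    groups.setdefault(key, []).append((a, b, c, d, e))
```

In an actual Pig script the flags would presumably come in through parameter substitution, and the UDF would need to declare an output schema for the tuple it returns; neither of those pieces is shown here.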

2011/1/28 Kunal Nawale <[email protected]>

> Hi,
> I have a relation R as (a, b, c, d, e)
>
> I need to group data, but the grouping criterion is variable, depending on
> what input params my pig script receives.
> My input params are group_on_a, group_on_b, group_on_c, group_on_d which
> contain a value either 'T' or 'F'
>
> so the group statement could be:
> A = GROUP R BY (a,b) if group_on_a and group_on_b are T and everything else
> is F
> A = GROUP R BY (a,c) if group_on_a and group_on_c are T and everything else
> is F
> A = GROUP R BY (a,b,c) if group_on_a, group_on_b and group_on_c are T and
> everything else is F
> A = GROUP R BY (a,b,c,d) and so on.
>
> Is there a way I could do this in pig ?
> Regards,
> -kunal
