The only approach I can think of is to pass all of your potential keys, along
with the group_on_* params, to a UDF that builds a tuple holding only the
selected fields. That tuple is the new, actual key, and you group on that.
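
To make the idea concrete, here is a minimal sketch of the key-building logic
in plain Python. It is not a registered Pig UDF: the names FIELDS,
make_group_key and group_rows are made up for illustration, and rows are
modelled as dicts rather than Pig tuples. It only shows how the 'T'/'F' flags
could pick which fields end up in the composite key.

```python
from collections import defaultdict

# Hypothetical: the columns of R that have a matching group_on_* param.
FIELDS = ("a", "b", "c", "d")


def make_group_key(row, flags):
    """Return a tuple of the values of every field whose flag is 'T'.

    row   -- dict mapping field name to value, e.g. {"a": 1, "b": 2, ...}
    flags -- dict mapping "group_on_<field>" to 'T' or 'F'
    A missing flag is treated the same as 'F'.
    """
    return tuple(row[f] for f in FIELDS if flags.get("group_on_" + f) == "T")


def group_rows(rows, flags):
    """Group rows by the composite key built from the selected fields."""
    groups = defaultdict(list)
    for row in rows:
        groups[make_group_key(row, flags)].append(row)
    return dict(groups)


if __name__ == "__main__":
    rows = [
        {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5},
        {"a": 1, "b": 2, "c": 9, "d": 4, "e": 6},
        {"a": 1, "b": 7, "c": 3, "d": 4, "e": 7},
    ]
    flags = {"group_on_a": "T", "group_on_b": "T",
             "group_on_c": "F", "group_on_d": "F"}
    for key, members in group_rows(rows, flags).items():
        print(key, len(members))
```

In a real script the same logic would live in the UDF, and the GROUP
statement would then use the tuple it returns as the key.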

2011/1/28 Kunal Nawale <[email protected]>

> Hi,
>  I have a relation R as (a, b, c, d, e)
>
> I need to group data, but the grouping criterion is variable, depending on
> what input params my pig script receives.
> My input params are group_on_a, group_on_b, group_on_c, group_on_d, each of
> which contains either 'T' or 'F'
>
>
> so the group statement could be:
> A = GROUP R BY (a,b) if group_on_a and group_on_b are T and everything else
> is F
> A = GROUP R BY (a,c) if group_on_a and group_on_c are T and everything else
> is F
> A = GROUP R BY (a,b,c) if group_on_a, group_on_b and group_on_c are T and
> everything else is F
> A = GROUP R BY (a,b,c,d)  and so on.
>
> Is there a way I could do this in pig ?
> Regards,
> -kunal
>
