Thanks, Prashant.
I am looking into embedding Pig in Java and into UDFs.
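
For the embedding route, here is a rough sketch of the part that matters: building the FOREACH statement from whatever distinct values turn up, instead of hard-coding A, B, C. The class name OneHotPigStatement and the method buildForeach are made up for illustration and are not part of Pig. As a simplifying assumption, the sketch only accepts values that are also valid Pig aliases (a letter followed by letters, digits, or underscores), since each value doubles as a column name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not a Pig API): builds a one-hot FOREACH statement as a string.
class OneHotPigStatement {

    // Returns: <out> = FOREACH <in> GENERATE (<column> == 'v' ? 1 : 0) AS v, ...;
    static String buildForeach(String in, String out, String column, List<String> values) {
        if (values.isEmpty()) {
            throw new IllegalArgumentException("at least one distinct value is required");
        }
        List<String> fields = new ArrayList<String>();
        for (String v : values) {
            // Assumption: each value is reused as a column alias, so it must look like one.
            if (!v.matches("[A-Za-z][A-Za-z0-9_]*")) {
                throw new IllegalArgumentException("value is not usable as an alias: " + v);
            }
            // The bincond is wrapped in parentheses before the AS clause.
            fields.add("(" + column + " == '" + v + "' ? 1 : 0) AS " + v);
        }
        return out + " = FOREACH " + in + " GENERATE " + String.join(", ", fields) + ";";
    }

    public static void main(String[] args) {
        // In a real program the list would come from a DISTINCT over the column.
        System.out.println(buildForeach("M", "N", "city", Arrays.asList("A", "B", "C")));
    }
}
```

The resulting string could then presumably be handed to PigServer.registerQuery, after first collecting the distinct values with a DISTINCT relation read back through PigServer.openIterator. That wiring is not shown here and is untested.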

On Mon, Feb 20, 2012 at 5:26 PM, Prashant Kommireddi <[email protected]> wrote:

> This should work if the values are only A,B,C.
>
> M = load 'input' as (city:chararray);
>
> N = foreach M generate (city == 'A' ? 1 : 0) as A,
>     (city == 'B' ? 1 : 0) as B,
>     (city == 'C' ? 1 : 0) as C;
>
> However, if the set of city values is not known in advance, embedding
> Pig in Java might be a good option.
> http://pig.apache.org/docs/r0.9.1/cont.html#embed-java
>
> Thanks,
> Prashant
>
> On Mon, Feb 20, 2012 at 3:16 AM, Austin Chungath <[email protected]>
> wrote:
>
> > Consider this scenario:
> >
> > I have a column named City and it takes 3 possible values: A,B,C
> >
> > City
> > A
> > B
> > C
> > A
> > C
> > C
> >
> > I want to convert it into
> >
> > A             B            C
> > 1              0            0
> > 0              1            0
> > 0              0            1
> > 1              0            0
> > 0              0            1
> > 0              0            1
> >
> > I am trying to write a Pig script that takes two parameters: the data
> > and the column name, in this case 'City'. The script should then
> > identify the distinct values the column takes, create that many
> > columns, and populate each with 1 or 0 depending on which value is
> > present in the row.
> > Please let me know if you have got any ideas on how to approach this
> > problem.
> >
> > Thanks,
> > Austin
> >
>
