Thanks, Prashant. I am looking into embedding Pig in Java and into UDFs.
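
For the case where the city values are not known ahead of time, the part that has to be generated is the foreach statement itself. Below is a rough sketch of that one step, written in Python for brevity; the same string building carries over to Java. It assumes the distinct values have already been collected (for example by an earlier Pig job using DISTINCT) and that the resulting statement is then handed to Pig (for example through PigServer.registerQuery when embedding in Java). Neither of those steps is shown. The function names one_hot_statement and column_alias are made up for illustration, and the bincond is parenthesized here, which is the form the Pig documentation shows.

```python
import re


def column_alias(value):
    """Turn a data value into a usable Pig alias (hypothetical helper).

    Assumption: aliases start with a letter and contain only letters,
    digits and underscores. Values that collide after sanitizing, or that
    match Pig keywords, are not handled in this sketch.
    """
    alias = re.sub(r"\W", "_", value, flags=re.ASCII)
    if not re.match(r"[A-Za-z]", alias):
        alias = "v_" + alias
    return alias


def one_hot_statement(relation, column, values, target="N"):
    """Build one foreach statement with a 0/1 column per distinct value."""
    parts = []
    for value in values:
        # Simplification: refuse values that would need escaping inside a
        # single-quoted Pig string literal instead of guessing the rules.
        if "'" in value or "\\" in value:
            raise ValueError("unsupported character in value: %r" % value)
        parts.append(
            "(%s == '%s' ? 1 : 0) as %s" % (column, value, column_alias(value))
        )
    return "%s = foreach %s generate %s;" % (target, relation, ", ".join(parts))


if __name__ == "__main__":
    # Values taken from the example in this thread.
    print(one_hot_statement("M", "city", ["A", "B", "C"]))
```

With the values A, B, C from the example this prints a single statement of the same shape as the hand-written one quoted below, so the hand-written script becomes the special case where the list of values is fixed.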

On Mon, Feb 20, 2012 at 5:26 PM, Prashant Kommireddi <[email protected]> wrote:

> This should work if the values are only A,B,C.
>
> M = load 'input' as (city:chararray);
>
> N = foreach M generate city == 'A' ? 1 : 0 as A, city == 'B' ? 1 : 0 as B,
> city == 'C' ? 1 : 0 as C;
>
> However, if city values vary it might be a good option to do it by
> embedding Pig in Java.
> http://pig.apache.org/docs/r0.9.1/cont.html#embed-java
>
> Thanks,
> Prashant
>
> On Mon, Feb 20, 2012 at 3:16 AM, Austin Chungath <[email protected]>
> wrote:
>
> > Consider this scenario:
> >
> > I have a column named City and it takes 3 possible values: A,B,C
> >
> > City
> > A
> > B
> > C
> > A
> > C
> > C
> >
> > I want to convert it into
> >
> > A B C
> > 1 0 0
> > 0 1 0
> > 0 0 1
> > 1 0 0
> > 0 0 1
> > 0 0 1
> >
> > I am trying to write a pig script that will take two parameters, one
> > parameter is the data and then the column name, in this case 'City'. The
> > script should then identify distinct values that it will take and then
> > create that many columns and populate it with 1 or 0 depending on which
> > one is true.
> > Please let me know if you have got any ideas on how to approach this
> > problem.
> >
> > Thanks,
> > Austin
