Consider this scenario:

I have a column named City that takes three possible values: A, B, and C.

City
A
B
C
A
C
C

I want to convert it into one 0/1 indicator column per distinct value:

A    B    C
1    0    0
0    1    0
0    0    1
1    0    0
0    0    1
0    0    1

I am trying to write a Pig script that takes two parameters: the data and
a column name, in this case 'City'. The script should identify the distinct
values that the column takes, create one new column per distinct value, and
populate each new column with 1 if the row has that value and 0 otherwise.
Please let me know if you have any ideas on how to approach this problem.
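
To make the intended transformation concrete, here is a minimal sketch of the logic in plain Python rather than Pig. It only illustrates the two passes described above (collect the distinct values, then emit 1 or 0 per value for each row) and is not a Pig solution. The function name one_hot and the list-of-dicts input shape are assumptions made for the illustration.

```python
def one_hot(rows, column):
    """Return (header, encoded_rows) for the given column.

    rows   -- list of dicts, one dict per input record (assumed shape)
    column -- name of the column to expand, e.g. "City"
    """
    # Pass 1: find the distinct values; sort them so column order is stable.
    values = sorted({row[column] for row in rows})
    # Pass 2: for each row, emit 1 in the matching column and 0 elsewhere.
    encoded = [[1 if row[column] == v else 0 for v in values] for row in rows]
    return values, encoded


# The sample data from this message.
rows = [{"City": c} for c in ["A", "B", "C", "A", "C", "C"]]
header, encoded = one_hot(rows, "City")

print(" ".join(header))
for line in encoded:
    print(" ".join(str(x) for x in line))
```

The part I am unsure how to express in Pig is the first pass, since the number of output columns depends on the data.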

Thanks,
Austin
