Hi,

I'm new to Pig. I have a file that contains the contents of documents. The 
problem is that each document's text is split across several lines of the 
file, one segment per line. The file is actually an export of a database 
table. Below is an example of the table:

id seg_no text
-- ------ ----
1  0      This is
1  1      a
1  2      test for
1  3      Hello
1  4      World!
2  0      Test
2  1      number
2  2      two.


How do I get an output like this:

id  text
--  ----
1   This is a test for Hello World!
2   Test number two.
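
To make the intended result concrete, here is a rough sketch of the logic 
in plain Python (not Pig). The function name concat_segments and the 
(id, seg_no, text) tuple layout are only my own illustration: group the 
rows by id, order each group by seg_no, and join the text values with a 
space.

```python
from collections import defaultdict


def concat_segments(rows):
    """Join the text segments of each id, ordered by seg_no.

    rows: iterable of (id, seg_no, text) tuples (assumed layout).
    Returns a dict mapping id -> concatenated text.
    """
    groups = defaultdict(list)
    for doc_id, seg_no, text in rows:
        groups[doc_id].append((seg_no, text))

    result = {}
    for doc_id, segments in groups.items():
        # Sort by seg_no only, so the input order does not matter.
        ordered = sorted(segments, key=lambda pair: pair[0])
        result[doc_id] = " ".join(text for _, text in ordered)
    return result


# The sample table from above, deliberately shuffled.
rows = [
    (2, 1, "number"),
    (1, 0, "This is"),
    (1, 2, "test for"),
    (1, 1, "a"),
    (2, 0, "Test"),
    (1, 4, "World!"),
    (1, 3, "Hello"),
    (2, 2, "two."),
]

result = concat_segments(rows)
for doc_id in sorted(result):
    print(doc_id, result[doc_id])
```

What I'm looking for is the Pig equivalent of this group, sort, and join.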


I can do this in SQL, but I want to try it using Hadoop and Pig. I'm not sure 
how to concatenate the values of a column within a group, in seg_no order. 
I'm wondering whether Pig's built-in functions can handle this or whether I 
have to create a UDF. I suspect I need a UDF, but I'm not sure how to go 
about writing one. Any help or advice would be appreciated.

Thanks.
