You will need a UDF to concatenate the items in a bag.
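A minimal sketch of the core logic such a UDF might use, kept to plain Java so it runs without Pig on the classpath. The names BagConcat and joinDistinct are hypothetical, and dropping duplicates is an assumption taken from the desired output below, where {(2L),(2L)} becomes 2L. In a real Pig UDF this logic would presumably sit inside the exec(Tuple) method of a class extending org.apache.pig.EvalFunc<String>, iterating over the DataBag passed as the first field.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; not part of Pig.
class BagConcat {

    // Joins the distinct, non-null values with the given delimiter,
    // preserving the order in which they were first seen.
    static String joinDistinct(Iterable<?> values, String delimiter) {
        Set<String> seen = new LinkedHashSet<>();
        for (Object v : values) {
            if (v != null) {
                seen.add(v.toString());
            }
        }
        return String.join(delimiter, seen);
    }

    public static void main(String[] args) {
        // Stand-ins for the two bags produced by FOREACH...GENERATE.
        List<String> ids = Arrays.asList("1966", "3845");
        List<String> counts = Arrays.asList("2L", "2L");

        // Prints (1966|3845,2L)
        System.out.println("(" + joinDistinct(ids, "|") + ","
                + joinDistinct(counts, "|") + ")");
    }
}
```

Once wrapped as an EvalFunc and registered with REGISTER, a function like this could be called on each bag inside the FOREACH...GENERATE that follows the GROUP.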
Daniel
Matt Tanquary wrote:
This set results from a JOIN:
(04f4c2fd-8be2-41c3-b045-283de80909ba,1966,2L)
(04f4c2fd-8be2-41c3-b045-283de80909ba,3845,2L)
Using Pig, I group this and get:
(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,{(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,1634,2L),(669a4b47-d3c3-4950-9ec0-f1e24064d9d9,1966,2L)})
After FOREACH...GENERATE:
({(1966),(3845)},{(2L),(2L)})
What I want to do is derive:
(1966|3845,2L)
The trouble is that everything is bagged up from the group and I'm not sure
how to unbag for the output so I can do things like apply CONCAT, UNIQUE on
the fields, etc. I have tried nested FOREACH statements, but I can't seem to
drill down far enough to de-reference the values the way I'd like.
Is this a job for a UDF, or is there anything in Pig Latin that I can do to
accomplish this task?
Thanks!
-M@