Steve,

Thanks. Will try that now.

> From: [email protected]
> To: [email protected]
> Subject: RE: Ideas for data processing
> Date: Tue, 4 Feb 2014 17:57:44 +0000
>
> Sameer, did you check out the TOMAP function in the documentation? The
> example is close to yours. I think with a nested FOREACH in combination
> with TOMAP you'd get there, though I haven't tried it myself.
> SB
>
> ______________________
> Steve Bernstein
> VP/Analytics
>
> 408.499.0961 MOBILE
> deem.com
>
> -----Original Message-----
> From: Sameer Tilak [mailto:[email protected]]
> Sent: Monday, February 03, 2014 2:00 PM
> To: [email protected]
> Subject: Ideas for data processing
>
> Hi everyone,
> We have a data set in the following format:
>
> user1 item1 value
> user2 item1 value
> user3 item1 value
> ...
> user1 item2 value
> user20 item2 value
> user35 item2 value
> ...
> user2 item3 value
> user25 item3 value
> ...
>
> We have around 20 items and millions of users, and not all users have
> entries for all the items. We would like to transform this into:
>
> user1 item1 value, item2 value, item3 value ...
> user2 item4 value, item18 value, item19 value ...
>
> I can think of a couple of ways of doing this in Pig Latin. For example,
> one way would be to create a map (where the key is the item name and the
> value is the associated value) and then fill out that map as you read the
> data. Then write it out to a file. I am not sure how efficient that would
> be. I would love to get suggestions for doing this in Pig Latin.
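
For reference, here is a minimal sketch of the map-per-user idea described in the original message. It is plain Python rather than Pig Latin, and is only meant to illustrate the shape of the transformation. The function names (pivot_by_user, format_rows) and the sample values are made up for illustration, and the input is assumed to be whitespace-separated "user item value" records. It keeps everything in memory, so it is not a substitute for a distributed job over millions of users; in Pig the equivalent step would presumably start from grouping the records by user.

```python
from collections import defaultdict


def pivot_by_user(lines):
    """Collect 'user item value' records into one item -> value map per user."""
    per_user = defaultdict(dict)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed records
        user, item, value = parts
        # If a (user, item) pair repeats, the later value overwrites the earlier one.
        per_user[user][item] = value
    return per_user


def format_rows(per_user):
    """Render each user as 'user item1 value, item2 value, ...'."""
    rows = []
    for user, items in per_user.items():
        pairs = ", ".join(f"{item} {value}" for item, value in items.items())
        rows.append(f"{user} {pairs}")
    return rows


if __name__ == "__main__":
    # Hypothetical sample records; real values are not given in the thread.
    sample = [
        "user1 item1 5",
        "user2 item1 3",
        "user1 item2 7",
    ]
    for row in format_rows(pivot_by_user(sample)):
        print(row)
```

Users who have no entry for an item simply do not get that key in their map, which matches the requirement that not every user has a value for every item.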
