Any clues anyone? I still didn't get anything myself, thinking...
Sincerely,
Marek M.

-----Original Message-----
From: yonghu [mailto:[email protected]]
Sent: Monday, September 12, 2011 10:21 PM
To: [email protected]
Subject: Re: JOINing two inputs

Sorry, I didn't understand you right. I didn't think just use Pig operator can finish this problem. You can first use cogroup operator to group the two inputs together. Then apply a UDF to each tuple.

On Mon, Sep 12, 2011 at 5:35 PM, Marek Miglinski <[email protected]> wrote:

> Thanks for fast reply ;)
>
> Ok, I've done this:
> recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id, user_id);
>
> Now I have:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::67, record2.param1::pop
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::42, record2.param1::slow
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::23, record2.param1::fast
> (Other data)
> record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
> ...
>
> Now what?
> I can do this:
> recordFiltered = FILTER recordJoined BY record1::epoch >= record2::epoch;
>
> It will give me:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::42, record2.param1::slow
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::23, record2.param1::fast
> (Other data)
> record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
> ...
>
> Still not what I want, I need:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
> (Other data)
> record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
> ...
>
> Sincerely,
> Marek M.
>
> ________________________________________
> From: yonghu [[email protected]]
> Sent: Monday, September 12, 2011 5:49 PM
> To: [email protected]
> Subject: Re: JOINing two inputs
>
> I think you can first use join and then for each tuple using filter.
>
> On Mon, Sep 12, 2011 at 4:19 PM, Marek Miglinski <[email protected]> wrote:
>
> > Hi,
> >
> > I have a serious task to finish, hope somebody will help me... I have two inputs with data:
> >
> > record1:
> > epoch,
> > game_id,
> > user_id,
> > other data
> >
> > record2:
> > epoch,
> > game_id,
> > user_id,
> > other data
> >
> > Now I need to JOIN record1 with record2 BY game_id, oper_id, user_id, epoch. BUT! epoch in record2 must be FIRST found data and it should be < than epoch in record1.
> >
> > recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id, user_id); + add something like... CLOSEST(WHERE record1::epoch < record2::epoch);
> >
> > So for example:
> >
> > record1:
> > epoch::50
> > game_id::434
> > user_id::990
> >
> > record2:
> > epoch::67
> > game_id::434
> > user_id::990
> > param1::pop
> >
> > record2:
> > epoch::43
> > game_id::434
> > user_id::990
> > param1::wow
> >
> > record2:
> > epoch::42
> > game_id::434
> > user_id::990
> > param1::slow
> >
> > record2:
> > epoch::23
> > game_id::434
> > user_id::990
> > param1::fast
> >
> > The result should be - record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow and ...
> >
> > Is it possible to accomplish through PIG? Using JOIN or using FOREACH?
> >
> > Sincerely,
> > Marek M.
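
For reference, the "closest earlier epoch" match asked about in this thread can probably be expressed with built-in operators alone, by grouping the filtered join and keeping one row per group in a nested FOREACH. The following is only a minimal sketch, not something posted in the thread: the file names, the LOAD schemas, and the aliases recordGrouped and recordClosest are assumptions made for illustration. It uses the strict `<` comparison from the original question rather than the `>=` in the later FILTER.

```pig
-- Hypothetical inputs: file names and schemas are assumptions for illustration.
record1 = LOAD 'record1.tsv' AS (epoch:long, game_id:long, user_id:long);
record2 = LOAD 'record2.tsv' AS (epoch:long, game_id:long, user_id:long, param1:chararray);

-- Pair every record1 row with every record2 row for the same game and user.
recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id, user_id);

-- Keep only pairs where the record2 epoch is strictly earlier than the record1 epoch.
recordFiltered = FILTER recordJoined BY record2::epoch < record1::epoch;

-- Collect the remaining candidates for each record1 row.
recordGrouped = GROUP recordFiltered BY (record1::game_id, record1::user_id, record1::epoch);

-- Within each group, keep the candidate with the largest record2 epoch.
recordClosest = FOREACH recordGrouped {
    sorted = ORDER recordFiltered BY record2::epoch DESC;
    latest = LIMIT sorted 1;
    GENERATE FLATTEN(latest);
};

DUMP recordClosest;
```

With the sample rows from the question, the filter leaves the record2 epochs 43, 42 and 23 for record1 epoch 50, and the nested ORDER and LIMIT keep the one with epoch 43 (param1 wow). Two caveats: record1 rows that share the same game_id, user_id and epoch fall into one group and produce a single output row, and record1 rows with no earlier record2 row are dropped. The COGROUP plus UDF route suggested above may suit cases where these matter.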
