Thanks for the fast reply ;)

OK, I've done this:

    recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id, user_id);

Now I have:

    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::67, record2.param1::pop
    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::42, record2.param1::slow
    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::23, record2.param1::fast
    (Other data)
    record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
    ...

Now what? I can do this:

    recordFiltered = FILTER recordJoined BY record1::epoch >= record2::epoch;

It will give me:

    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::42, record2.param1::slow
    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::23, record2.param1::fast
    (Other data)
    record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
    ...

Still not what I want, I need:

    record1.epoch::50, record1.game_id::434, record1.user_id::990, record2.epoch::43, record2.param1::wow
    (Other data)
    record1.epoch::67, record1.game_id::564, record1.user_id::889, record2.epoch::44, record2.param1::pop
    ...

Sincerely,
Marek M.

________________________________________
From: yonghu [[email protected]]
Sent: Monday, September 12, 2011 5:49 PM
To: [email protected]
Subject: Re: JOINing two inputs

I think you can first use join and then for each tuple using filter.

On Mon, Sep 12, 2011 at 4:19 PM, Marek Miglinski <[email protected]> wrote:

> Hi,
>
> I have a serious task to finish, hope somebody will help me... I have two
> inputs with data:
>
> record1:
> epoch,
> game_id,
> user_id,
> other data
>
> record2:
> epoch,
> game_id,
> user_id,
> other data
>
> Now I need to JOIN record1 with record2 BY game_id, oper_id, user_id,
> epoch. BUT! epoch in record2 must be FIRST found data and it should be <
> than epoch in record1.
>
> recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id,
> user_id); + add something like... CLOSEST(WHERE record1::epoch <
> record2::epoch);
>
> So for example:
>
> record1:
> epoch::50
> game_id::434
> user_id::990
>
> record2:
> epoch::67
> game_id::434
> user_id::990
> param1::pop
>
> record2:
> epoch::43
> game_id::434
> user_id::990
> param1::wow
>
> record2:
> epoch::42
> game_id::434
> user_id::990
> param1::slow
>
> record2:
> epoch::23
> game_id::434
> user_id::990
> param1::fast
>
>
> The result should be - record1.epoch::50, record1.game_id::434,
> record1.user_id::990, record2.epoch::43, record2.param1::wow and ...
>
> Is it possible to accomplish through PIG? Using JOIN or using FOREACH?
>
>
>
> Sincerely,
> Marek M.
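
One possible way to go from recordFiltered to the wanted result, as a rough sketch only: group the filtered rows by the record1 key and, inside each group, keep just the row with the largest record2 epoch. The aliases recordGrouped, recordClosest, byEpoch and latest are made up for illustration, and the sketch assumes that (game_id, user_id, epoch) identifies one record1 row. It has not been run against the real data.

```pig
-- Assumes recordFiltered is the FILTER result from the message above.
-- Group by the record1 key so that each group holds every record2
-- candidate that survived the filter for one record1 row.
recordGrouped = GROUP recordFiltered BY (record1::game_id, record1::user_id, record1::epoch);

-- Inside each group, sort the candidates by record2 epoch, latest first,
-- and keep only the first one (the record2 row closest to record1.epoch).
recordClosest = FOREACH recordGrouped {
    byEpoch = ORDER recordFiltered BY record2::epoch DESC;
    latest  = LIMIT byEpoch 1;
    GENERATE FLATTEN(latest);
};
```

On the sample data this should leave the record2.epoch::43 / param1::wow row for record1.epoch::50, since 43 is the largest of the remaining candidates 43, 42 and 23.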
