Any clues anyone? I still didn't get anything myself, thinking... 


Sincerely,
Marek M.

-----Original Message-----
From: yonghu [mailto:[email protected]] 
Sent: Monday, September 12, 2011 10:21 PM
To: [email protected]
Subject: Re: JOINing two inputs

Sorry, I didn't understand you right. I didn't think just use Pig operator can 
finish this problem. You can first use cogroup operator to group the two inputs 
together. Then apply a UDF to each tuple.

On Mon, Sep 12, 2011 at 5:35 PM, Marek Miglinski <[email protected]>wrote:

> Thanks for fast reply ;)
>
> Ok, I've done this:
> recordJoined = JOIN record1 BY (game_id, user_id), record2 BY 
> (game_id, user_id);
>
> Now I have:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, 
> record2.epoch::67, record2.param1::pop record1.epoch::50, 
> record1.game_id::434, record1.user_id::990, record2.epoch::43, 
> record2.param1::wow record1.epoch::50, record1.game_id::434, 
> record1.user_id::990, record2.epoch::42, record2.param1::slow 
> record1.epoch::50, record1.game_id::434, record1.user_id::990, 
> record2.epoch::23, record2.param1::fast (Other data) 
> record1.epoch::67, record1.game_id::564, record1.user_id::889, 
> record2.epoch::44, record2.param1::pop ...
>
> Now what?
> I can do this:
> recordFiltered = FILTER recordJoined BY record1::epoch >= 
> record2::epoch;
>
> It will give me:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, 
> record2.epoch::43, record2.param1::wow record1.epoch::50, 
> record1.game_id::434, record1.user_id::990, record2.epoch::42, 
> record2.param1::slow record1.epoch::50, record1.game_id::434, 
> record1.user_id::990, record2.epoch::23, record2.param1::fast (Other 
> data) record1.epoch::67, record1.game_id::564, record1.user_id::889, 
> record2.epoch::44, record2.param1::pop ...
>
> Still not what I want, I need:
> record1.epoch::50, record1.game_id::434, record1.user_id::990, 
> record2.epoch::43, record2.param1::wow (Other data) record1.epoch::67, 
> record1.game_id::564, record1.user_id::889, record2.epoch::44, 
> record2.param1::pop ...
>
>
>
> Sincerely,
> Marek M.
>
> ________________________________________
> From: yonghu [[email protected]]
> Sent: Monday, September 12, 2011 5:49 PM
> To: [email protected]
> Subject: Re: JOINing two inputs
>
> I think you can first use join and then for each tuple using filter.
>
> On Mon, Sep 12, 2011 at 4:19 PM, Marek Miglinski <[email protected]
> >wrote:
>
> > Hi,
> >
> > I have a serious task to finish, hope somebody will help me... I 
> > have two inputs with data:
> >
> > record1:
> > epoch,
> > game_id,
> > user_id,
> > other data
> >
> > record2:
> > epoch,
> > game_id,
> > user_id,
> > other data
> >
> > Now I need to JOIN record1 with record2 BY game_id, oper_id, 
> > user_id, epoch. BUT! epoch in record2 must be FIRST found data and 
> > it should be < than epoch in record1.
> >
> > recordJoined = JOIN record1 BY (game_id, user_id), record2 BY 
> > (game_id, user_id); + add something like... CLOSEST(WHERE 
> > record1::epoch < record2::epoch);
> >
> > So for example:
> >
> > record1:
> > epoch::50
> > game_id::434
> > user_id::990
> >
> > record2:
> > epoch::67
> > game_id::434
> > user_id::990
> > param1::pop
> >
> > record2:
> > epoch::43
> > game_id::434
> > user_id::990
> > param1::wow
> >
> > record2:
> > epoch::42
> > game_id::434
> > user_id::990
> > param1::slow
> >
> > record2:
> > epoch::23
> > game_id::434
> > user_id::990
> > param1::fast
> >
> >
> > The result should be - record1.epoch::50, record1.game_id::434, 
> > record1.user_id::990, record2.epoch::43, record2.param1::wow and ...
> >
> > Is it possible to accomplish through PIG? Using JOIN or using FOREACH?
> >
> >
> >
> > Sincerely,
> > Marek M.
> >
> >
> >
>

Reply via email to