Thanks for the fast reply ;)

Ok, I've done this:
recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id, user_id);

Now I have:
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::67, record2.param1::pop
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::43, record2.param1::wow
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::42, record2.param1::slow
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::23, record2.param1::fast
(Other data) record1.epoch::67, record1.game_id::564, record1.user_id::889, 
record2.epoch::44, record2.param1::pop
...

Now what?
I can do this:
recordFiltered = FILTER recordJoined BY record1::epoch >= record2::epoch;

It will give me:
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::43, record2.param1::wow
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::42, record2.param1::slow
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::23, record2.param1::fast
(Other data) record1.epoch::67, record1.game_id::564, record1.user_id::889, 
record2.epoch::44, record2.param1::pop
...

That is still not what I want. For each record1 I need only the row with the closest earlier record2 epoch:
record1.epoch::50, record1.game_id::434, record1.user_id::990, 
record2.epoch::43, record2.param1::wow
(Other data) record1.epoch::67, record1.game_id::564, record1.user_id::889, 
record2.epoch::44, record2.param1::pop
...
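
One possible way to get from the filtered relation to the result above is to group the filtered rows by the record1 fields and keep only the row with the largest record2 epoch in each group. This is an untested sketch, not a confirmed solution: the relation names recordGrouped, recordClosest, sorted and closest are made up for illustration, and it assumes the joined fields are addressable as record1::... and record2::... exactly as in the FILTER statement.

```pig
-- Keep only record2 rows whose epoch is not after the record1 epoch
-- (same FILTER as above).
recordFiltered = FILTER recordJoined BY record1::epoch >= record2::epoch;

-- Group the surviving rows by the record1 side of the join
-- (recordGrouped is a hypothetical name).
recordGrouped = GROUP recordFiltered BY
    (record1::game_id, record1::user_id, record1::epoch);

-- Inside each group, sort by record2::epoch descending and keep the first
-- row, i.e. the record2 closest to the record1 epoch.
recordClosest = FOREACH recordGrouped {
    sorted  = ORDER recordFiltered BY record2::epoch DESC;
    closest = LIMIT sorted 1;
    GENERATE FLATTEN(closest);
};
```

For the sample data this should leave the epoch 43 / wow row for game 434 and the epoch 44 / pop row for game 564. If two record2 rows share the same epoch, LIMIT 1 keeps an arbitrary one of them.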



Sincerely,
Marek M.

________________________________________
From: yonghu [[email protected]]
Sent: Monday, September 12, 2011 5:49 PM
To: [email protected]
Subject: Re: JOINing two inputs

I think you can first use JOIN and then apply FILTER to each joined tuple.

On Mon, Sep 12, 2011 at 4:19 PM, Marek Miglinski <[email protected]> wrote:

> Hi,
>
> I have a serious task to finish, hope somebody will help me... I have two
> inputs with data:
>
> record1:
> epoch,
> game_id,
> user_id,
> other data
>
> record2:
> epoch,
> game_id,
> user_id,
> other data
>
> Now I need to JOIN record1 with record2 BY game_id, oper_id, user_id,
> epoch. BUT the epoch in record2 must be the FIRST one found that is less
> than the epoch in record1.
>
> recordJoined = JOIN record1 BY (game_id, user_id), record2 BY (game_id,
> user_id); + add something like... CLOSEST(WHERE record2::epoch <
> record1::epoch);
>
> So for example:
>
> record1:
> epoch::50
> game_id::434
> user_id::990
>
> record2:
> epoch::67
> game_id::434
> user_id::990
> param1::pop
>
> record2:
> epoch::43
> game_id::434
> user_id::990
> param1::wow
>
> record2:
> epoch::42
> game_id::434
> user_id::990
> param1::slow
>
> record2:
> epoch::23
> game_id::434
> user_id::990
> param1::fast
>
>
> The result should be - record1.epoch::50, record1.game_id::434,
> record1.user_id::990, record2.epoch::43, record2.param1::wow and ...
>
> Is it possible to accomplish this in Pig? Using JOIN or using FOREACH?
>
>
>
> Sincerely,
> Marek M.
>
>
>
