Hello TianYi Zhu,

Thanks! I will try this and get back to you.

--> by the way, you can sort these 2 files by trade-key then merge them using a small script, that's much faster than using pig.

... I'm trying out a POC on updates in Hadoop.

Thanks,
Srinivas

On Tue, Aug 28, 2012 at 12:55 AM, TianYi Zhu <[email protected]> wrote:

> Hi Srinivas,
>
> You can write a user-defined function for this:
>
> -- field names use underscores here because '-' is not valid in a Pig identifier
> feed = union feed1, feed2;
> feed_grouped = group feed by trade_key;
> -- 'output' is a reserved word in Pig, so the result alias is named 'latest'
> latest = foreach feed_grouped generate
>     flatten(your_user_defined_function(feed)) as (trade_key, trade_add_date, trade_price);
>
> your_user_defined_function takes the one or more records with the same
> trade-key as input, and it should output only the latest tuple of
> (trade-key, trade-add-date, trade-price).
>
> By the way, you can sort these 2 files by trade-key and then merge them using a
> small script; that's much faster than using Pig.
>
> On Tue, Aug 28, 2012 at 2:36 PM, Srinivas Surasani <[email protected]> wrote:
>
> > Hi,
> >
> > I'm trying to do updates of records in Hadoop using Pig (I know this is
> > not ideal, but I'm trying out a POC).
> > The data looks like this:
> >
> > *feed1:*
> > --> here the trade key is unique for each order/record
> > --> this is the history file
> >
> > trade-key   trade-add-date   trade-price
> > *k1         05/21/2012       2000*
> > k2          04/21/2012       3000
> > k3          03/21/2012       4000
> > k4          05/21/2012       5000
> >
> > *feed2:* --> this is the latest/daily feed
> >
> > trade-key   trade-add-date   trade-price
> > k5          06/22/2012       1000
> > k6          06/22/2012       2000
> > *k1         06/21/2012       3000*
> >
> > --> we can see here that the trade with key "k1" appears again, which
> > means the order with trade key "k1" has an update
> >
> > Now I'm looking for the output below (merging both files, finding the
> > keys common to both feeds, and keeping the latest record for each such
> > key in the output file):
> >
> > *k1         06/21/2012       3000*
> > k2          04/21/2012       3000
> > k3          06/21/2012       4000
> > k4          07/21/2012       5000
> > *k5         06/22/2012       1000*
> > *k6         06/22/2012       2000*
> >
> > Any help is greatly appreciated!
> >
> > Regards,
> > Srinivas

--
Regards,
Srinivas
[email protected]
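
For reference, the "sort both files by trade-key, then merge" suggestion from the thread could look roughly like the Python sketch below. It is only an illustration, not the script anyone in the thread actually used. The function names `parse` and `merge_latest` are made up for this example, and it assumes each record is a whitespace-separated line of the form `trade-key MM/DD/YYYY trade-price` with no header row and no blank lines.

```python
from datetime import datetime
from heapq import merge
from itertools import groupby
from operator import itemgetter


def parse(line):
    """Split 'key MM/DD/YYYY price' into (key, date, price)."""
    key, date_text, price = line.split()
    return key, datetime.strptime(date_text, "%m/%d/%Y"), price


def merge_latest(feed1_lines, feed2_lines):
    """Return one line per trade key, keeping the record with the latest date."""
    by_key = itemgetter(0)
    # Sort each feed by trade key, then merge the two sorted streams.
    history = sorted(map(parse, feed1_lines), key=by_key)
    daily = sorted(map(parse, feed2_lines), key=by_key)
    merged = merge(history, daily, key=by_key)

    result = []
    for key, records in groupby(merged, key=by_key):
        # Records sharing a key are adjacent after the merge; keep the newest.
        _, date, price = max(records, key=itemgetter(1))
        result.append(f"{key} {date:%m/%d/%Y} {price}")
    return result
```

Calling `merge_latest` with the data lines of feed1 and feed2 (header rows removed) yields one line per key, sorted by key. A key that appears in only one feed is carried over unchanged, and for a key present in both, such as k1, only the record with the later trade-add-date is kept.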
