Re: MapReduce with related data from disparate files

2008-03-25 Thread Colin Freas
[...] Sent: Monday, March 24, 2008 1:36 PM. To: core-user@hadoop.apache.org. Subject: MapReduce with related data from disparate files. I have a cluster of 5 machines up and accepting jobs, and I'm trying to work out how to design my first MapReduce task for the data I have. So, I wonder if anyone [...]

Re: MapReduce with related data from disparate files

2008-03-24 Thread Ted Dunning
Map-reduce excels at gluing together files like this. The map phase selects the key and tags each record so that you can tell which source it came from. The reduce phase takes all of the records with the same key and glues them together. It can do your processing, but it is also [...]
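
The join described above can be sketched in a few lines of plain Python. This is only an in-memory illustration of the idea, not Hadoop code and not necessarily what the poster had in mind: the function names (map_records, shuffle, reduce_join, join), the source names and the "id" key field are all made up for the example.

```python
from collections import defaultdict


def map_records(source, records, key_field):
    """Map phase: emit (key, (source, record)) so each record carries its origin."""
    for record in records:
        yield record[key_field], (source, record)


def shuffle(pairs):
    """Group mapped values by key, standing in for the framework's shuffle step."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(key, tagged_records):
    """Reduce phase: glue together all records that share a key, split by source."""
    by_source = defaultdict(list)
    for source, record in tagged_records:
        by_source[source].append(record)
    return key, dict(by_source)


def join(inputs, key_field):
    """Run map, shuffle and reduce over a dict of {source_name: list_of_records}."""
    pairs = []
    for source, records in inputs.items():
        pairs.extend(map_records(source, records, key_field))
    return dict(reduce_join(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample data: two "files" that share an "id" field.
users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [{"id": 1, "item": "book"}, {"id": 1, "item": "pen"}]
joined = join({"users": users, "orders": orders}, "id")
```

Here joined[1] holds the user record and both order records for id 1, while joined[2] holds only a user record. In a real job the reducer would do its processing at that point, or simply write the combined record out for a later step.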

RE: MapReduce with related data from disparate files

2008-03-24 Thread Nathan Wang
[...]@hadoop.apache.org. Subject: MapReduce with related data from disparate files. I have a cluster of 5 machines up and accepting jobs, and I'm trying to work out how to design my first MapReduce task for the data I have. So, I wonder if anyone has any experience with the sort of problem I'm trying [...]