Lance -

Fun to see you on a mailing list.
How are things?

;;peter


On 08/18/10 22:11, Lance Norskog wrote:
> Hadoop has a toolkit called 'map-side joins' which requires sorted
> input tables.  org.apache.hadoop.examples.Join.java shows how. Good
> luck decoding it!
> 
> Could you use chained mapper tasks to sort each input set before using
> the join framework?
> 
> On Wed, Aug 18, 2010 at 10:10 AM, y l <[email protected]> wrote:
>> Hi,
>>
>> My first email on the list, and overall pretty new to Hadoop, so I'm hoping 
>> to find some help with a new task I have to do for work.
>> I need to do a join between 2 sets of files. One is a bunch of csv files and 
>> the other set is sequence files.
>>
>> I was told MultiFilterRecordReader could help me do the join, but I 
>> haven't been able to find a good example of where and how to use 
>> that class to do the join.
>> I have found a good example using CompositeInputFormat here: 
>> http://www.congiu.com/node/5
>> But it assumes that the input is sorted, and I can't guarantee that, 
>> at least not for the csv files.
>>
>> Does anyone know what I need to do with MultiFilterRecordReader? Inherit 
>> from it in the mapper? I'm a little confused... Please let me know if you 
>> have any pointers on that one.
>>
>> Thanks.
>>
> 
> 
> 
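For anyone following the thread, here is a minimal sketch of why the join framework wants sorted input. It is plain Python, not Hadoop code: merge_join and the sample rows are made up for illustration, and it only mimics the idea of walking two key-sorted inputs in step.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right):
    """Inner-join two iterables of (key, value) pairs.

    Both inputs are assumed to be sorted by key; the function never
    looks backwards, which is what makes a merge join cheap.
    """
    # Collapse each input into (key, [values]) groups, one key at a time.
    lgroups = ((k, [v for _, v in g]) for k, g in groupby(left, key=itemgetter(0)))
    rgroups = ((k, [v for _, v in g]) for k, g in groupby(right, key=itemgetter(0)))
    l = next(lgroups, None)
    r = next(rgroups, None)
    while l is not None and r is not None:
        if l[0] < r[0]:
            l = next(lgroups, None)   # left key is behind, advance left
        elif l[0] > r[0]:
            r = next(rgroups, None)   # right key is behind, advance right
        else:
            # Keys match: emit every pairing of values for this key.
            for lv in l[1]:
                for rv in r[1]:
                    yield (l[0], lv, rv)
            l = next(lgroups, None)
            r = next(rgroups, None)


# Hypothetical sample data: the "csv" side arrives unsorted.
csv_rows = [("b", "csv-b"), ("a", "csv-a"), ("c", "csv-c")]
seq_rows = [("a", "seq-a"), ("c", "seq-c"), ("d", "seq-d")]

# Sorting the unsorted side first gives the complete join.
joined = list(merge_join(sorted(csv_rows, key=itemgetter(0)), seq_rows))

# Feeding the unsorted side directly silently drops matches.
broken = list(merge_join(csv_rows, seq_rows))
```

With the csv side sorted first, both matching keys are joined; with the unsorted input, the match for key "a" is lost without any error. That is the reason the csv files would need a sort pass before a map-side join.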
