Re: Is it possible to input two different files under same mapper

Muhammad Ali Amer Fri, 11 Jul 2008 14:09:58 -0700

Thanks Mori,

So far I cannot touch the large file; it's just a very, very long string, and I have to "approximately" match smaller strings against it. I will give it a try with the FileSplit and see if I am not merging the two together.


On Jul 11, 2008, at 1:41 PM, Mori Bellamy wrote:

Hey Amer,
It sounds to me like you're going to have to write your own input format (or at least modify an existing one). Take a look here:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/mapred/FileSplit.html

I'm not sure how you'd go about doing this, but I hope this helps you.
(Also, have you considered preprocessing your input so that any arbitrary mapper can know whether or not it's looking at a line from the "large file"?)
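
A rough sketch of that preprocessing idea, in plain Java with no Hadoop dependency. It assumes the input is line-oriented and can be rewritten before the job runs, which may not hold for every data set. The class name TagLines, the tags "L" and "S", and the tab separator are all made up for illustration; nothing here is a Hadoop API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TagLines {
    // Hypothetical source tags: "L" for the large file, "S" for the small ones.
    static final String LARGE = "L";
    static final String SMALL = "S";

    // Preprocessing step: prefix every line with its source tag and a tab.
    static List<String> tag(String sourceTag, List<String> lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            out.add(sourceTag + "\t" + line);
        }
        return out;
    }

    // What a mapper could do per record: check which input the line came from.
    static boolean isFromLargeFile(String record) {
        return record.startsWith(LARGE + "\t");
    }

    // Strip the tag again to recover the original line.
    static String payload(String record) {
        return record.substring(record.indexOf('\t') + 1);
    }

    public static void main(String[] args) {
        List<String> records = new ArrayList<>();
        records.addAll(tag(LARGE, Arrays.asList("ACGTACGTTTGA")));
        records.addAll(tag(SMALL, Arrays.asList("ACGT", "TTGA")));
        for (String r : records) {
            System.out.println((isFromLargeFile(r) ? "large: " : "small: ") + payload(r));
        }
    }
}
```

With the tag in the record itself, the same mapper class can be pointed at both inputs and branch on isFromLargeFile, at the cost of one extra pass over the data to write the tagged copies.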
On Jul 11, 2008, at 12:31 PM, Muhammad Ali Amer wrote:
Hi,
My requirement is to compare the contents of one very large file (GB to TB size) with a bunch of smaller files (100s of MB to GB sizes). Is there a way I can give the mapper the 1st file independently of the remaining bunch?
Amer


Muhammad Ali Amer
Center For Grid Technologies
Information Sciences Institute
USC Viterbi School Of Engg
Tel : (310) 448-8349
