Hi Hitarth,

If one of the files (say file1) is small enough to hold in memory, you can ship it to every node with the Distributed Cache and do a map-side join, as described in [1]. Alternatively, you can use MultipleInputs to join the two files on the reduce side, as described in [2]. Minimal sketches of both approaches are below.

[1] http://unmeshasreeveni.blogspot.in/2014/10/how-to-load-file-in-distributedcache-in.html
[2] http://unmeshasreeveni.blogspot.in/2014/12/joining-two-files-using-multipleinput.html
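A minimal sketch of the Distributed Cache approach from [1] (the HDFS paths and class names here are hypothetical placeholders, not from your setup):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JoinWithCache {

  public static class JoinMapper
      extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context)
        throws IOException, InterruptedException {
      // Cached files are localized on each node; on Hadoop 2 they are
      // symlinked into the task working directory under their base name,
      // so plain Java IO can open them. Load file1 into memory here and
      // join against it record by record in map().
      for (URI cacheFile : context.getCacheFiles()) {
        String localName = new Path(cacheFile.getPath()).getName();
        BufferedReader reader = new BufferedReader(new FileReader(localName));
        // ... read file1 lines into a HashMap keyed on the join field ...
        reader.close();
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "map-side join");
    job.setJarByClass(JoinWithCache.class);
    job.setMapperClass(JoinMapper.class);
    job.setNumReduceTasks(0); // map-side join, no reduce phase needed
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // file1 (the small file) is shipped to every node; file2 is the
    // regular map input.
    job.addCacheFile(new URI("/user/hitarth/file1"));
    FileInputFormat.addInputPath(job, new Path("/user/hitarth/file2"));
    FileOutputFormat.setOutputPath(job, new Path("/user/hitarth/out"));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}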
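And a minimal driver sketch of the MultipleInputs approach from [2] (again with hypothetical mapper/reducer names and paths): each file gets its own mapper, both mappers emit (joinKey, taggedValue), and the reducer sees the records from both files for a key together and performs the join.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

Job job = Job.getInstance(new Configuration(), "reduce-side join");
job.setJarByClass(JoinDriver.class);
// One mapper per input file; both emit the join key plus a tag that
// tells the reducer which file each record came from.
MultipleInputs.addInputPath(job, new Path("/user/hitarth/file1"),
    TextInputFormat.class, File1Mapper.class);
MultipleInputs.addInputPath(job, new Path("/user/hitarth/file2"),
    TextInputFormat.class, File2Mapper.class);
job.setReducerClass(JoinReducer.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(Text.class);
FileOutputFormat.setOutputPath(job, new Path("/user/hitarth/out"));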
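Corey's suggestion quoted below would look roughly like this (paths again hypothetical); if the mapper needs to know which file a record came from, it can check the split's path:

import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

// Both files feed one mapper class; their blocks are unioned into a
// single set of input splits.
FileInputFormat.addInputPaths(job, "/user/hitarth/file1,/user/hitarth/file2");

// Inside the mapper, recover the source file if the join logic needs it:
String source = ((FileSplit) context.getInputSplit()).getPath().getName();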
On Tue, Jan 6, 2015 at 8:53 AM, Ted Yu <[email protected]> wrote:

> Hitarth:
> You can also consider MultiFileInputFormat (and its concrete
> implementations).
>
> Cheers
>
> On Mon, Jan 5, 2015 at 6:14 PM, Corey Nolet <[email protected]> wrote:
>
>> Hitarth,
>>
>> I don't know how much direction you are looking for with regards to the
>> formats of the files, but you can certainly read both files into the
>> third MapReduce job using FileInputFormat by comma-separating the paths
>> to the files. The blocks of both files will essentially be unioned
>> together and the mappers scheduled across your cluster.
>>
>> On Mon, Jan 5, 2015 at 3:55 PM, hitarth trivedi <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I have a 6-node cluster, and the scenario is as follows:
>>>
>>> I have one MapReduce job which writes file1 to HDFS.
>>> I have another MapReduce job which writes file2 to HDFS.
>>> In a third MapReduce job I need to use file1 and file2 to do some
>>> computation and output the result.
>>>
>>> What is the best way to store file1 and file2 in HDFS so that they
>>> could be used in the third MapReduce job?
>>>
>>> Thanks,
>>> Hitarth

--
Thanks & Regards
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Centre for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/
