Hitarth: You can also consider MultiFileInputFormat (and its concrete implementations).
Cheers

On Mon, Jan 5, 2015 at 6:14 PM, Corey Nolet <[email protected]> wrote:

> Hitarth,
>
> I don't know how much direction you are looking for with regards to the
> formats of the files, but you can certainly read both files into the third
> MapReduce job using FileInputFormat by comma-separating the paths to the
> files. The splits for both files will essentially be unioned together and
> the mappers scheduled across your cluster.
>
> On Mon, Jan 5, 2015 at 3:55 PM, hitarth trivedi <[email protected]> wrote:
>
>> Hi,
>>
>> I have a 6-node cluster, and the scenario is as follows:
>>
>> I have one MapReduce job which will write file1 to HDFS.
>> I have another MapReduce job which will write file2 to HDFS.
>> In the third MapReduce job I need to use file1 and file2 to do some
>> computation and output the value.
>>
>> What is the best way to store file1 and file2 in HDFS so that they can
>> be used in the third MapReduce job?
>>
>> Thanks,
>> Hitarth
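For what it's worth, a minimal driver sketch for the third job, following Corey's suggestion of passing both paths to FileInputFormat. The paths (`/data/file1`, `/data/file2`, the output directory) and the class name are placeholders, not from the thread; it also leaves the default identity Mapper/Reducer in place, which you would replace with your own computation:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ThirdJobDriver {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "combine file1 and file2");
        job.setJarByClass(ThirdJobDriver.class);

        // Comma-separated input paths: the splits of both files are unioned,
        // and mappers are scheduled across the cluster over all of them.
        // Equivalent to calling FileInputFormat.addInputPath() twice.
        FileInputFormat.setInputPaths(job, "/data/file1,/data/file2");

        // Set your own Mapper/Reducer classes here; without them Hadoop
        // falls back to the identity Mapper/Reducer.
        FileOutputFormat.setOutputPath(job, new Path("/data/third-job-out"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

If the two files have different record formats and need different mappers, `org.apache.hadoop.mapreduce.lib.input.MultipleInputs.addInputPath(job, path, inputFormat, mapperClass)` lets you bind a separate Mapper to each path instead.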
