Hi Agarwal,

I once had similar questions and ran some experiments. Here is my experience:

1. For some applications built on MR, like HBase and Hive, which do not need to submit additional files to HDFS, file:/// works well without any problem (according to my tests). A minimal sketch of this setup follows.
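For illustration only, here is a minimal sketch of what "using file:///" as the shared filesystem means in code. The class name is made up, the property name depends on the release (fs.default.name on Hadoop 1.x, fs.defaultFS on 2.x), and it assumes every node sees the same shared mount at the same absolute path:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class LocalFsDefault {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Point the "shared" filesystem at the local one.
            conf.set("fs.default.name", "file:///");

            FileSystem fs = FileSystem.get(conf);
            System.out.println(fs.getUri());                  // file:///
            System.out.println(fs.exists(new Path("/tmp")));  // sanity check
        }
    }

The same property can of course be set once in core-site.xml instead of in code.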
2. For simple MR applications, like TeraSort, there are problems with simply using file:///. MR maintains some of its control files on the shared filesystem and some on the local filesystem, tracked in a single list, and it looks files up in that list. With plain file:///, the shared FS looks the same as the local filesystem, while in fact they are two different kinds of filesystem with different path-conversion rules.

For the second issue, you can create a new shared-filesystem class by deriving from the existing org.apache.hadoop.fs.FileSystem. I have created a repository with an example filesystem implementation (https://github.com/Lingcc/hadoop-lingccfs), hoping it is helpful to you. A skeletal sketch of the idea follows.
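To make the derivation idea concrete, here is a skeletal sketch. It is not the code from the repository: the class and scheme names (LingccFileSystem, lingccfs) are placeholders, and it extends RawLocalFileSystem (itself a FileSystem subclass) so that all the local-filesystem code is reused while MR sees a distinct scheme:

    import java.io.IOException;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.RawLocalFileSystem;

    public class LingccFileSystem extends RawLocalFileSystem {
        // Report a scheme other than "file" so MR no longer confuses
        // the shared filesystem with each node's local filesystem.
        @Override
        public URI getUri() {
            return URI.create("lingccfs:///");
        }

        @Override
        public void initialize(URI uri, Configuration conf) throws IOException {
            // Any scheme-specific setup would go here; the local-FS
            // behaviour is inherited unchanged.
            super.initialize(uri, conf);
        }
    }

You would then register the class in core-site.xml (fs.lingccfs.impl set to the class name) and point the default filesystem at lingccfs:///. This still presumes that every node mounts the same shared storage (e.g. NFS) at an identical path.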
yours,
Ling Kun

On Fri, May 31, 2013 at 2:37 PM, Agarwal, Nikhil <[email protected]> wrote:

> Hi,
>
> Is it possible to run MapReduce on *multiple nodes* using the local file
> system (file:///)?
>
> I am able to run it in a single-node setup, but in a multi-node setup the
> “slave” nodes are not able to access the “jobtoken” file, which is present
> in the hadoop.tmp.dir on the “master” node.
>
> Please let me know if it is possible to do this.
>
> Thanks & Regards,
> Nikhil

--
http://www.lingcc.com