See this thread: http://search-hadoop.com/m/LgpTk2dvrTk1
Cheers

On Dec 20, 2014, at 3:46 PM, Behrooz Shafiee <[email protected]> wrote:

> Thanks everyone,
> I finally managed to run MapReduce over my DFS. As you mentioned, there was
> no need to run a datanode or namenode. The only required config was to set
> yarn.app.mapreduce.am.staging-dir to point to my DFS so all the nodes could
> access it, as in HDFS.
> Something I noticed when I ran TestDFSIO is that the block size my
> filesystem gets for writes/reads is super small: 4k. I changed
> file.blocksize in core-site.xml but it did not make any change. I guess that
> only affects HDFS; is there any parameter, or somewhere in the code, where I
> can change the block size?
>
> Thanks,
>
> On Thu, Dec 18, 2014 at 1:00 PM, Allen Wittenauer <[email protected]> wrote:
>
>> I think you missed the point that Harsh was making:
>>
>> The namenode and datanode are used to build the hdfs:// filesystem. There
>> is no namenode or datanode in a file:/// setup. That's why running the
>> namenode blew up. If you want to use something besides hdfs://, then you
>> only run the YARN daemons.
>>
>> On Dec 18, 2014, at 8:56 AM, Behrooz Shafiee <[email protected]> wrote:
>>
>>> Because my FS is an in-memory distributed file system; therefore, I
>>> believe it can significantly improve IO-intensive tasks on Hadoop.
>>>
>>> On Thu, Dec 18, 2014 at 2:27 AM, Harsh J <[email protected]> wrote:
>>>>
>>>> NameNodes and DataNodes are services that are part of HDFS. Why are
>>>> you attempting to start them on top of your own DFS?
>>>>
>>>> On Thu, Dec 18, 2014 at 6:35 AM, Behrooz Shafiee <[email protected]>
>>>> wrote:
>>>>> Hello folks,
>>>>>
>>>>> I have developed my own distributed file system and I want to try it
>>>>> with Hadoop MapReduce. It is a POSIX-compatible file system and can be
>>>>> mounted under a directory, e.g. "/myfs". I was wondering how I can
>>>>> configure Hadoop to use my own FS instead of HDFS. What are the
>>>>> configurations that need to be changed?
>>>>> Or what source files should I modify? Using Google I came across a
>>>>> sample of using Lustre with Hadoop and tried to apply it, but it
>>>>> failed.
>>>>>
>>>>> I set up a cluster, mounted my own filesystem under /myfs on all of my
>>>>> nodes, and changed core-site.xml and mapred-site.xml as follows:
>>>>>
>>>>> core-site.xml:
>>>>>
>>>>> fs.default.name -> file:///
>>>>> fs.defaultFS -> file:///
>>>>> hadoop.tmp.dir -> /myfs
>>>>>
>>>>> in mapred-site.xml:
>>>>>
>>>>> mapreduce.jobtracker.staging.root.dir -> /myfs/user
>>>>> mapred.system.dir -> /myfs/system
>>>>> mapred.local.dir -> /myfs/mapred_${host.name}
>>>>>
>>>>> and finally, in hadoop-env.sh:
>>>>>
>>>>> added "-Dhost.name=`hostname -s`" to HADOOP_OPTS
>>>>>
>>>>> However, when I try to start my namenode, I get this error:
>>>>>
>>>>> 2014-12-17 19:44:35,902 FATAL
>>>>> org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
>>>>> java.lang.IllegalArgumentException: Invalid URI for NameNode address (check
>>>>> fs.defaultFS): file:///home/kos/msthesis/BFS/mountdir has no authority.
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:423)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:413)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.getRpcServerAddress(NameNode.java:464)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.loginAsNameNodeUser(NameNode.java:564)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:584)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:762)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:746)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1438)
>>>>> at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1504)
>>>>> 2014-12-17 19:44:35,914 INFO org.apache.hadoop.util.ExitUtil: Exiting with
>>>>> status 1
>>>>>
>>>>> For starting datanodes, I get this error:
>>>>>
>>>>> 2014-12-17 20:02:34,028 FATAL
>>>>> org.apache.hadoop.hdfs.server.datanode.DataNode: Exception in secureMain
>>>>> java.io.IOException: Incorrect configuration: namenode address
>>>>> dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not
>>>>> configured.
>>>>> at org.apache.hadoop.hdfs.DFSUtil.getNNServiceRpcAddressesForCluster(DFSUtil.java:866)
>>>>> at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.refreshNamenodes(BlockPoolManager.java:155)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:1074)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:415)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2268)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
>>>>> at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2402)
>>>>> 2014-12-17 20:02:34,036 INFO org.apache.hadoop.util.ExitUtil: Exiting with
>>>>> status 1
>>>>>
>>>>> I would really appreciate any help with these problems.
>>>>> Thanks in advance,
>>>>>
>>>>> --
>>>>> Behrooz
>>>>
>>>>
>>>> --
>>>> Harsh J
>>>
>>>
>>> --
>>> Behrooz
>
>
> --
> Behrooz
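
[Editor's note] Pulling the thread's resolution together, a minimal config sketch for running MapReduce over a POSIX-mounted filesystem: no HDFS daemons at all, only the YARN daemons. The property names are standard Hadoop 2.x keys mentioned in the thread; the /myfs paths are just the poster's example mount point, not a requirement.

```xml
<!-- core-site.xml: point the default filesystem at the POSIX mount.
     LocalFileSystem has no namenode or datanode, so only the YARN
     daemons (ResourceManager, NodeManagers) are started. -->
<property>
  <name>fs.defaultFS</name>
  <value>file:///</value>
</property>

<!-- mapred-site.xml: put the MapReduce ApplicationMaster staging
     directory on the shared mount so every node sees the same job
     files -- the one setting the poster said made his setup work. -->
<property>
  <name>yarn.app.mapreduce.am.staging-dir</name>
  <value>/myfs/staging</value>
</property>
```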
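[Editor's note] On the unanswered 4k block-size question, a speculative lead: core-default.xml defines two separate knobs for file:// paths, `file.blocksize` (default 64 MB, the block size the local filesystem reports) and `file.stream-buffer-size` (default 4096 bytes, the read/write buffer). The 4 KB the poster measured matches the latter's default, so that buffer size, not the block size, may be what TestDFSIO is seeing. A core-site.xml sketch under that assumption (values illustrative):

```xml
<!-- Speculative fix: both keys exist in core-default.xml and apply to
     file:// paths, not HDFS (HDFS uses the separate dfs.blocksize key). -->
<property>
  <name>file.blocksize</name>
  <value>134217728</value> <!-- block size reported for file:// paths -->
</property>
<property>
  <name>file.stream-buffer-size</name>
  <value>1048576</value> <!-- I/O buffer; the 4096-byte default matches
                              the 4k figure observed in TestDFSIO -->
</property>
```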
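[Editor's note] The `-Dhost.name=...` line in the original post is a small trick worth spelling out: it exports the short hostname as a JVM system property so that a per-node setting like `mapred.local.dir=/myfs/mapred_${host.name}` expands to a different directory on each node. A sketch of the hadoop-env.sh fragment:

```shell
# hadoop-env.sh fragment (from the original post): expose the short
# hostname as a JVM system property so ${host.name} in *-site.xml
# expands per node.
export HADOOP_OPTS="$HADOOP_OPTS -Dhost.name=$(hostname -s)"
echo "$HADOOP_OPTS"
```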
