Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
Yes. I have setup ceph and hadoop in each node. ceph health is OK and the hadoop works fine when I use HDFS (I have ran the same command with HDFS and it works). One node is the admin(job tracker running), other 4 are slaves(tasktracker running). The problem occurs when I change the hadoop/conf/core-site.xml file to incorporate cephFS. Although the error does not show anything related to ceph, I am really confused why this error is happening.

I have another question, for running hadoop with cephFS should the hadoop input data be inside any directory or it has to be the directory where the cephFS has been mounted?

Regards,
Ridwan Rashid Noel
Doctoral Student, Dept. of Computer Science, University of Texas at San Antonio
Contact# 210-773-9966

On Mar 26, 2015 10:12 AM, Gregory Farnum g...@gregs42.com wrote:
> On Wed, Mar 25, 2015 at 8:10 PM, Ridwan Rashid Noel ridwan...@gmail.com wrote:
>> Hi Greg,
>>
>> Thank you for your response. I have understood that I should be starting only the mapred daemons when using cephFS instead of HDFS. I have fixed that and trying to run hadoop wordcount job using this instruction:
>>
>> bin/hadoop jar hadoop*examples*.jar wordcount /tmp/wc-input /tmp/wc-output
>>
>> but I am getting this error
>>
>> 15/03/26 02:54:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
>> 15/03/26 02:54:35 INFO input.FileInputFormat: Total input paths to process : 1
>> 15/03/26 02:54:35 WARN snappy.LoadSnappy: Snappy native library not loaded
>> 15/03/26 02:54:35 INFO mapred.JobClient: Running job: job_201503260253_0001
>> 15/03/26 02:54:36 INFO mapred.JobClient: map 0% reduce 0%
>> 15/03/26 02:54:36 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_m_21_0, Status : FAILED
>> Error initializing attempt_201503260253_0001_m_21_0:
>> java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
>>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>>     at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
>>     at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
>>     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
>>     at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
>>     at java.lang.Thread.run(Thread.java:745)
>
> I'm not an expert at setting up Hadoop, but these errors are coming out of the RawLocalFileSystem, which I think means that worker node is trying to use a local FS instead of Ceph. Did you set up each node to access Ceph? Have you set up and used Hadoop previously?
> -Greg
>
>> I have used the core-site.xml configurations as mentioned in http://ceph.com/docs/master/cephfs/hadoop/ Please tell me how can this problem be solved?
>>
>> Regards,
>> Ridwan Rashid Noel
>> Doctoral Student, Department of Computer Science, University of Texas at San Antonio
>> Contact# 210-773-9966
>>
>> On Fri, Mar 20, 2015 at 4:04 PM, Gregory Farnum g...@gregs42.com wrote:
>>> On Fri, Mar 20, 2015 at 1:05 PM, Ridwan Rashid ridwan...@gmail.com wrote:
>>>> Gregory Farnum greg@... writes:
>>>>> On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan064@... wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.
Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
Hi Greg,

Thank you for your response. I have understood that I should be starting only the mapred daemons when using cephFS instead of HDFS. I have fixed that and trying to run hadoop wordcount job using this instruction:

bin/hadoop jar hadoop*examples*.jar wordcount /tmp/wc-input /tmp/wc-output

but I am getting this error

15/03/26 02:54:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
15/03/26 02:54:35 INFO input.FileInputFormat: Total input paths to process : 1
15/03/26 02:54:35 WARN snappy.LoadSnappy: Snappy native library not loaded
15/03/26 02:54:35 INFO mapred.JobClient: Running job: job_201503260253_0001
15/03/26 02:54:36 INFO mapred.JobClient: map 0% reduce 0%
15/03/26 02:54:36 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_m_21_0, Status : FAILED
Error initializing attempt_201503260253_0001_m_21_0:
java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
    at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
    at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
    at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
    at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
    at java.lang.Thread.run(Thread.java:745)

I have used the core-site.xml configurations as mentioned in http://ceph.com/docs/master/cephfs/hadoop/ Please tell me how can this problem be solved?

Regards,
Ridwan Rashid Noel
Doctoral Student, Department of Computer Science http://www.cs.utsa.edu/, University of Texas at San Antonio http://utsa.edu/
Contact# 210-773-9966

On Fri, Mar 20, 2015 at 4:04 PM, Gregory Farnum g...@gregs42.com wrote:
> On Fri, Mar 20, 2015 at 1:05 PM, Ridwan Rashid ridwan...@gmail.com wrote:
>> Gregory Farnum greg@... writes:
>>> On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan064@... wrote:
>>>> Hi,
>>>>
>>>> I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.
>>>>
>>>> The default hadoop works fine but when I change the core-site.xml file as above I get the following bindException as can be seen from the namenode log:
>>>>
>>>> 2015-03-19 01:37:31,436 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to node1/10.242.144.225:6789 : Cannot assign requested address
>>>>
>>>> I have one monitor for the ceph cluster (node1/10.242.144.225) and I included in the core-site.xml file ceph://10.242.144.225:6789 as the value of fs.default.name. The 6789 port is the default port being used by the monitor node of ceph, so that may be the reason for the bindException but the ceph documentation mentions that it should be included like this in the core-site.xml file.
>>>>
>>>> It would be really helpful to get some pointers to where I am doing wrong in the setup.
>>>
>>> I'm a bit confused. The NameNode is only used by HDFS, and so shouldn't be running at all if you're using CephFS. Nor do I have any idea why you've changed anything in a way that tells the NameNode to bind to the monitor's IP address; none of the instructions that I see can do that, and they certainly shouldn't be.
>>> -Greg
>>
>> Hi Greg,
>>
>> I want to run a hadoop job (e.g. terasort) and want to use cephFS instead of HDFS. In Using Hadoop with cephFS documentation in http://ceph.com/docs/master/cephfs/hadoop/ if you look into the Hadoop configuration section, the first property fs.default.name has to be set as the ceph URI and in the notes it's mentioned as ceph://[monaddr:port]/.
>>
>> My core-site.xml of hadoop conf looks like this
>>
>> <configuration>
>>   <property>
>>     <name>fs.default.name</name>
>>     <value>ceph://10.242.144.225:6789</value>
>>   </property>
>
> Yeah, that all makes sense. But I don't understand why or how you're starting up a NameNode at all, nor what config values it's drawing from to try and bind to that port. The NameNode is the problem because it shouldn't even be invoked.
> -Greg

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
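The `file:` prefix on the jobToken path in the trace above, together with the RawLocalFileSystem frames, suggests that the TaskTracker resolved that path on the local filesystem rather than on CephFS. As a rough illustration only, the following Python sketch pulls the URI scheme out of such an error line so the same check can be run against each node's log. The helper name `filesystem_scheme_from_error` is hypothetical and not part of Hadoop or Ceph, and the regular expression assumes the message wording shown in this thread.

```python
import re
from urllib.parse import urlsplit


def filesystem_scheme_from_error(message):
    """Return the URI scheme of the path named in a 'File ... does not exist' error.

    Returns None when the message does not match or the path has no scheme.
    Hypothetical helper; the pattern is based only on the log line in this thread.
    """
    match = re.search(r"File (\S+) does not exist", message)
    if match is None:
        return None
    # urlsplit separates the scheme ("file", "ceph", ...) from the rest of the path.
    return urlsplit(match.group(1)).scheme or None


if __name__ == "__main__":
    line = (
        "java.io.FileNotFoundException: File "
        "file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken "
        "does not exist."
    )
    scheme = filesystem_scheme_from_error(line)
    if scheme == "file":
        print("path was resolved on the local filesystem, not on CephFS")
```

A result of `file` on a worker node would be consistent with that node not picking up the CephFS settings in its own copy of core-site.xml, though the thread does not confirm the cause.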
Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
On Wed, Mar 25, 2015 at 8:10 PM, Ridwan Rashid Noel ridwan...@gmail.com wrote:
> Hi Greg,
>
> Thank you for your response. I have understood that I should be starting only the mapred daemons when using cephFS instead of HDFS. I have fixed that and trying to run hadoop wordcount job using this instruction:
>
> bin/hadoop jar hadoop*examples*.jar wordcount /tmp/wc-input /tmp/wc-output
>
> but I am getting this error
>
> 15/03/26 02:54:35 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 15/03/26 02:54:35 INFO input.FileInputFormat: Total input paths to process : 1
> 15/03/26 02:54:35 WARN snappy.LoadSnappy: Snappy native library not loaded
> 15/03/26 02:54:35 INFO mapred.JobClient: Running job: job_201503260253_0001
> 15/03/26 02:54:36 INFO mapred.JobClient: map 0% reduce 0%
> 15/03/26 02:54:36 INFO mapred.JobClient: Task Id : attempt_201503260253_0001_m_21_0, Status : FAILED
> Error initializing attempt_201503260253_0001_m_21_0:
> java.io.FileNotFoundException: File file:/tmp/hadoop-ceph/mapred/system/job_201503260253_0001/jobToken does not exist.
>     at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
>     at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
>     at org.apache.hadoop.mapred.TaskTracker.localizeJobTokenFile(TaskTracker.java:4445)
>     at org.apache.hadoop.mapred.TaskTracker.initializeJob(TaskTracker.java:1272)
>     at org.apache.hadoop.mapred.TaskTracker.localizeJob(TaskTracker.java:1213)
>     at org.apache.hadoop.mapred.TaskTracker$5.run(TaskTracker.java:2568)
>     at java.lang.Thread.run(Thread.java:745)

I'm not an expert at setting up Hadoop, but these errors are coming out of the RawLocalFileSystem, which I think means that worker node is trying to use a local FS instead of Ceph. Did you set up each node to access Ceph? Have you set up and used Hadoop previously?
-Greg

> I have used the core-site.xml configurations as mentioned in http://ceph.com/docs/master/cephfs/hadoop/ Please tell me how can this problem be solved?
Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
On Fri, Mar 20, 2015 at 1:05 PM, Ridwan Rashid ridwan...@gmail.com wrote:
> Gregory Farnum greg@... writes:
>> On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan064@... wrote:
>>> Hi,
>>>
>>> I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.
>>>
>>> The default hadoop works fine but when I change the core-site.xml file as above I get the following bindException as can be seen from the namenode log:
>>>
>>> 2015-03-19 01:37:31,436 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to node1/10.242.144.225:6789 : Cannot assign requested address
>>>
>>> I have one monitor for the ceph cluster (node1/10.242.144.225) and I included in the core-site.xml file ceph://10.242.144.225:6789 as the value of fs.default.name. The 6789 port is the default port being used by the monitor node of ceph, so that may be the reason for the bindException but the ceph documentation mentions that it should be included like this in the core-site.xml file.
>>>
>>> It would be really helpful to get some pointers to where I am doing wrong in the setup.
>>
>> I'm a bit confused. The NameNode is only used by HDFS, and so shouldn't be running at all if you're using CephFS. Nor do I have any idea why you've changed anything in a way that tells the NameNode to bind to the monitor's IP address; none of the instructions that I see can do that, and they certainly shouldn't be.
>> -Greg
>
> Hi Greg,
>
> I want to run a hadoop job (e.g. terasort) and want to use cephFS instead of HDFS. In Using Hadoop with cephFS documentation in http://ceph.com/docs/master/cephfs/hadoop/ if you look into the Hadoop configuration section, the first property fs.default.name has to be set as the ceph URI and in the notes it's mentioned as ceph://[monaddr:port]/.
>
> My core-site.xml of hadoop conf looks like this
>
> <configuration>
>   <property>
>     <name>fs.default.name</name>
>     <value>ceph://10.242.144.225:6789</value>
>   </property>

Yeah, that all makes sense. But I don't understand why or how you're starting up a NameNode at all, nor what config values it's drawing from to try and bind to that port. The NameNode is the problem because it shouldn't even be invoked.
-Greg
Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan...@gmail.com wrote:
> Hi,
>
> I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.
>
> The default hadoop works fine but when I change the core-site.xml file as above I get the following bindException as can be seen from the namenode log:
>
> 2015-03-19 01:37:31,436 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to node1/10.242.144.225:6789 : Cannot assign requested address
>
> I have one monitor for the ceph cluster (node1/10.242.144.225) and I included in the core-site.xml file ceph://10.242.144.225:6789 as the value of fs.default.name. The 6789 port is the default port being used by the monitor node of ceph, so that may be the reason for the bindException but the ceph documentation mentions that it should be included like this in the core-site.xml file.
>
> It would be really helpful to get some pointers to where I am doing wrong in the setup.

I'm a bit confused. The NameNode is only used by HDFS, and so shouldn't be running at all if you're using CephFS. Nor do I have any idea why you've changed anything in a way that tells the NameNode to bind to the monitor's IP address; none of the instructions that I see can do that, and they certainly shouldn't be.
-Greg
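One plausible reading of the BindException, offered as an assumption rather than something confirmed in this thread: a NameNode that is started anyway (for example by a start-everything script) may take its listen address from the host:port part of fs.default.name, which here is the Ceph monitor's 10.242.144.225:6789. The Python sketch below only shows how that URI splits into scheme and authority, and flags whether the HDFS daemons are relevant at all. The function name `describe_default_fs` is hypothetical, and the rule "only an hdfs scheme needs the NameNode and DataNodes" is a simplification.

```python
from urllib.parse import urlsplit


def describe_default_fs(fs_default_name):
    """Split an fs.default.name URI and say whether HDFS daemons are relevant.

    Hypothetical helper for illustration; it does not call into Hadoop.
    """
    parts = urlsplit(fs_default_name)
    return {
        "scheme": parts.scheme,
        # host:port portion of the URI, e.g. the Ceph monitor address
        "authority": parts.netloc,
        # Assumption: NameNode/DataNode only matter when the default FS is HDFS.
        "hdfs_daemons_needed": parts.scheme == "hdfs",
    }


if __name__ == "__main__":
    info = describe_default_fs("ceph://10.242.144.225:6789")
    print(info)
    if not info["hdfs_daemons_needed"]:
        print("start only the MapReduce daemons (JobTracker, TaskTrackers)")
```

With a `ceph://` default filesystem the sketch reports that the HDFS daemons are not needed, which matches the advice above that the NameNode should not be running at all.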
Re: [ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
Gregory Farnum greg@... writes:
> On Thu, Mar 19, 2015 at 5:57 PM, Ridwan Rashid ridwan064@... wrote:
>> Hi,
>>
>> I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.
>>
>> The default hadoop works fine but when I change the core-site.xml file as above I get the following bindException as can be seen from the namenode log:
>>
>> 2015-03-19 01:37:31,436 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to node1/10.242.144.225:6789 : Cannot assign requested address
>>
>> I have one monitor for the ceph cluster (node1/10.242.144.225) and I included in the core-site.xml file ceph://10.242.144.225:6789 as the value of fs.default.name. The 6789 port is the default port being used by the monitor node of ceph, so that may be the reason for the bindException but the ceph documentation mentions that it should be included like this in the core-site.xml file.
>>
>> It would be really helpful to get some pointers to where I am doing wrong in the setup.
>
> I'm a bit confused. The NameNode is only used by HDFS, and so shouldn't be running at all if you're using CephFS. Nor do I have any idea why you've changed anything in a way that tells the NameNode to bind to the monitor's IP address; none of the instructions that I see can do that, and they certainly shouldn't be.
> -Greg

Hi Greg,

I want to run a hadoop job (e.g. terasort) and want to use cephFS instead of HDFS. In Using Hadoop with cephFS documentation in http://ceph.com/docs/master/cephfs/hadoop/ if you look into the Hadoop configuration section, the first property fs.default.name has to be set as the ceph URI and in the notes it's mentioned as ceph://[monaddr:port]/.

My core-site.xml of hadoop conf looks like this

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://10.242.144.225:6789</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
    <description></description>
  </property>
  <property>
    <name>ceph.conf.file</name>
    <value>/etc/ceph/ceph.conf</value>
  </property>
  <property>
    <name>ceph.root.dir</name>
    <value>/</value>
  </property>
  <property>
    <name>ceph.mon.address</name>
    <value>10.242.144.225:6789</value>
    <description>This is the primary monitor node IP address in our installation.</description>
  </property>
  <property>
    <name>ceph.auth.id</name>
    <value>admin</value>
  </property>
  <property>
    <name>ceph.auth.keyring</name>
    <value>/etc/ceph/ceph.client.admin.keyring</value>
  </property>
  <property>
    <name>ceph.object.size</name>
    <value>67108864</value>
  </property>
  <property>
    <name>ceph.data.pools</name>
    <value>data</value>
  </property>
  <property>
    <name>ceph.localize.reads</name>
    <value>true</value>
  </property>
</configuration>
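For anyone comparing their own file against the configuration above, here is a small Python sketch that reads a core-site.xml document and reports two basic problems. It is an illustration under stated assumptions, not a tool from Ceph or Hadoop: the helper names `load_properties` and `check_ceph_settings` are hypothetical, the sample XML is a shortened copy of the properties quoted in this message, and the choice of which properties to check (fs.default.name and fs.ceph.impl) is an assumption.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Shortened sample based on the properties quoted in this thread.
SAMPLE_CORE_SITE = """\
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>ceph://10.242.144.225:6789</value>
  </property>
  <property>
    <name>fs.ceph.impl</name>
    <value>org.apache.hadoop.fs.ceph.CephFileSystem</value>
  </property>
  <property>
    <name>ceph.conf.file</name>
    <value>/etc/ceph/ceph.conf</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Parse Hadoop-style <property> entries into a name -> value dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_ceph_settings(props):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    scheme = urlsplit(props.get("fs.default.name", "")).scheme
    if scheme != "ceph":
        problems.append(f"fs.default.name uses scheme {scheme!r}, expected 'ceph'")
    if not props.get("fs.ceph.impl"):
        problems.append("fs.ceph.impl is not set")
    return problems


if __name__ == "__main__":
    for problem in check_ceph_settings(load_properties(SAMPLE_CORE_SITE)):
        print(problem)
```

To check a real node, the text passed to `load_properties` could be read from that node's conf/core-site.xml instead of the embedded sample. Running the same check on every node is one way to spot a worker that still has the default configuration.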
[ceph-users] hadoop namenode not starting due to bindException while deploying hadoop with cephFS
Hi,

I have a 5 node ceph(v0.87) cluster and am trying to deploy hadoop with cephFS. I have installed hadoop-1.1.1 in the nodes and changed the conf/core-site.xml file according to the ceph documentation http://ceph.com/docs/master/cephfs/hadoop/ but after changing the file the namenode is not starting (namenode can be formatted) but the other services(datanode, jobtracker, tasktracker) are running in hadoop.

The default hadoop works fine but when I change the core-site.xml file as above I get the following bindException as can be seen from the namenode log:

2015-03-19 01:37:31,436 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: java.net.BindException: Problem binding to node1/10.242.144.225:6789 : Cannot assign requested address

I have one monitor for the ceph cluster (node1/10.242.144.225) and I included in the core-site.xml file ceph://10.242.144.225:6789 as the value of fs.default.name. The 6789 port is the default port being used by the monitor node of ceph, so that may be the reason for the bindException but the ceph documentation mentions that it should be included like this in the core-site.xml file.

It would be really helpful to get some pointers to where I am doing wrong in the setup.

Thank you.