Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler
Traiano, I am also not using it anymore, so I am just sharing what I know.

2017-07-31 2:27 GMT+08:00 Traiano Welcome:
> Hi Tommy
>
> On Sun, Jul 30, 2017 at 9:37 PM, tommy xiao wrote:
>> why not use Myriad?
>>
>> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home
>
> I'm in doubt about the future of this project. I'm told it's likely to be
> discontinued soon due to a lack of contributors. In any case, have you
> perhaps seen a successful deployment of this?

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler
Hi Tommy

On Sun, Jul 30, 2017 at 9:37 PM, tommy xiao wrote:
> why not use Myriad?
>
> https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home

I'm in doubt about the future of this project. I'm told it's likely to be discontinued soon due to a lack of contributors. In any case, have you perhaps seen a successful deployment of this?
Re: Hadoop on Mesos: FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler
why not use Myriad?

https://cwiki.apache.org/confluence/display/MYRIAD/Myriad+Home

2017-07-23 17:27 GMT+08:00 Traiano Welcome:
>
> Hi List!
>
> I'm working on configuring Hadoop to use the Mesos scheduler, using the
> procedure outlined in "Apache Mesos Essentials", here:
>
> https://pastebin.com/y1ERJZqq
>
> Currently I have a 3-node Mesos cluster, with an HDFS namenode
> communicating successfully with two HDFS datanodes. However, when I try to
> start up the JobTracker, it fails with the following error:
>
> 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler
> java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>
> Some more context around the error:
>
> 17/07/22 18:44:38 INFO mapred.CompletedJobStatusStore: Completed job store is inactive
> 17/07/22 18:44:38 FATAL mapred.MesosScheduler: Failed to initialize the TaskScheduler
> java.lang.ClassNotFoundException: org.apache.hadoop.mapred.JobQueueTaskScheduler
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:359)
>         at java.net.URLClassLoader$1.run(URLClassLoader.java:348)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at java.net.URLClassLoader.findClass(URLClassLoader.java:347)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>         at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:312)
>         at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>         at java.lang.Class.forName0(Native Method)
>         at java.lang.Class.forName(Class.java:195)
>         at org.apache.hadoop.mapred.MesosScheduler.start(MesosScheduler.java:160)
>         at org.apache.hadoop.mapred.JobTracker.offerService(JobTracker.java:2186)
>         at org.apache.hadoop.mapred.JobTracker.main(JobTracker.java:4548)
> 17/07/22 18:44:38 INFO mapred.JobTracker: SHUTDOWN_MSG:
>
> Is there some way I could debug this further to trace the root cause of
> this error?
>
> Here is a full paste of the debug output when starting up the JobTracker:
>
> https://pastebin.com/a61wN4vQ

--
Deshi Xiao
Twitter: xds2000
E-mail: xiaods(AT)gmail.com
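A `ClassNotFoundException` for `org.apache.hadoop.mapred.JobQueueTaskScheduler` thrown from `MesosScheduler.start()` usually means the stock Hadoop scheduler class is not visible to the JobTracker JVM. As a hedged sketch (not a verified fix for this particular cluster), the relevant mapred-site.xml wiring in a mesos/hadoop setup looks like this:

```xml
<!-- The JobTracker delegates scheduling to the Mesos framework scheduler... -->
<property>
  <name>mapred.jobtracker.taskScheduler</name>
  <value>org.apache.hadoop.mapred.MesosScheduler</value>
</property>
<!-- ...which in turn wraps a standard Hadoop scheduler. The class named here
     ships in the hadoop-core jar, so that jar must be on the JobTracker's
     classpath (e.g. via HADOOP_CLASSPATH in hadoop-env.sh). -->
<property>
  <name>mapred.mesos.taskScheduler</name>
  <value>org.apache.hadoop.mapred.JobQueueTaskScheduler</value>
</property>
```

One quick check (the jar path is illustrative): `jar tf /usr/lib/hadoop/hadoop-core-*.jar | grep JobQueueTaskScheduler` should list the class; if it does not, the JobTracker is starting against the wrong jars.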
Re: Hadoop on Mesos. HDFS question.
Blatant product plug: the easiest way to run hdfs-mesos on Mesos (using Marathon) is to launch a DCOS (EE) cluster: https://mesosphere.com/product/

You might also want to look at the custom DCOS config in https://github.com/mesosphere/hdfs/tree/master/example-conf/mesosphere-dcos

For basic instructions on running an app in Marathon, see https://mesosphere.github.io/marathon/docs/application-basics.html

See the marathon.json in the DCOS Universe: https://github.com/mesosphere/universe/blob/version-1.x/repo/packages/H/hdfs/0/marathon.json

Just replace any of the {{moustache}} variables with values like the defaults in https://github.com/mesosphere/universe/blob/version-1.x/repo/packages/H/hdfs/0/config.json

You can also simplify the command to:

cd hdfs-mesos*
./bin/hdfs-mesos

On Mon, Jul 6, 2015 at 2:19 PM, Kk Bk kkbr...@gmail.com wrote:

Adam, I would like to choose option 2. Can you provide pointers on how to run hdfs-mesos using Marathon?

-Bhargav
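For the Marathon route Adam describes, the app definition boils down to something along these lines. This is a hedged sketch, not the actual Universe template: the artifact URL and resource numbers are placeholders, and only the `cmd` comes from the thread above.

```json
{
  "id": "hdfs-mesos-scheduler",
  "cmd": "cd hdfs-mesos* && ./bin/hdfs-mesos",
  "cpus": 0.5,
  "mem": 1024,
  "instances": 1,
  "uris": ["http://example.com/artifacts/hdfs-mesos-0.1.1.tgz"]
}
```

Submitting it is a matter of POSTing this JSON to Marathon's /v2/apps endpoint; because Marathon restarts the task if it or its node dies, this is what gives the framework scheduler the HA property discussed above.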
Re: Hadoop on Mesos. HDFS question.
Kk,

There are two options for running the HDFS framework on Mesos:

- If you already have the hadoop/hdfs binaries on all your nodes, you can follow the instructions in https://github.com/mesosphere/hdfs#if-you-have-hadoop-pre-installed-in-your-cluster to tell the scheduler to use the preinstalled NN/DN binaries.
- Otherwise, you can run the hdfs framework scheduler `bin/hdfs-mesos` on any node that can reach the Mesos master and slaves, and it can serve out the binaries itself. Note that this node may not necessarily be the same node on which either of the namenodes ends up running. Some choose to run the hdfs-mesos scheduler on a Mesos master node, but you can achieve framework-scheduler HA if you run it via another framework like Marathon, which can restart the scheduler (elsewhere) if it or its node dies.

See the example (templatized) Marathon JSON in https://github.com/mesosphere/universe/tree/version-1.x/repo/packages/H/hdfs/0

On Fri, Jul 3, 2015 at 11:31 AM, Kk Bk kkbr...@gmail.com wrote:

Thanks guys for the response.

1) I use trusty. It seems CDH4 does not have support for Trusty.
2) I followed the instructions at https://github.com/mesosphere/hdfs and was able to build hdfs-mesos-*.tgz.

Should I copy this file to all nodes (I have a multi-node Mesos cluster), or just the Mesos master node where I plan to keep the namenode for Hadoop?
Re: Hadoop on Mesos. HDFS question.
You just need to install HDFS via:

sudo apt-get install hadoop-hdfs-namenode hadoop-hdfs-secondarynamenode hadoop-hdfs-datanode hadoop-client

and then continue with the steps in the mesosphere link; the mesosphere link doesn't contain instructions for installing HDFS.

On Fri, Jul 3, 2015 at 10:51 PM, Kk Bk kkbr...@gmail.com wrote:

I am trying to install Hadoop on Mesos on Ubuntu servers, so I followed the instructions at https://open.mesosphere.com/tutorials/run-hadoop-on-mesos/#step-2. Step 2 of that link says to install HDFS as per http://www.cloudera.com/content/cloudera/en/documentation/cdh4/latest/CDH4-Installation-Guide/cdh4ig_topic_4_4.html

Question: is it sufficient to run the following commands?

1) On the namenode: sudo apt-get install hadoop-hdfs-namenode
2) On the datanodes: sudo apt-get install hadoop-0.20-mapreduce-tasktracker hadoop-hdfs-datanode

Or should I just follow the instructions in the mesosphere link that installs HDFS?

--
Best Regards,
Haosdent Huang
Re: Hadoop on Mesos. HDFS question.
It might be worth taking a look at the install documentation for the Hadoop on Mesos project here: https://github.com/mesos/hadoop

For our installations I don't think we really do much more than installing the apt packages you mentioned and then installing the hadoop-mesos jars, plus adding the appropriate configuration.
Re: Hadoop on Mesos. HDFS question.
Thanks guys for the response.

1) I use trusty. It seems CDH4 does not have support for Trusty.
2) I followed the instructions at https://github.com/mesosphere/hdfs and was able to build hdfs-mesos-*.tgz.

Should I copy this file to all nodes (I have a multi-node Mesos cluster), or just the Mesos master node where I plan to keep the namenode for Hadoop?
Re: hadoop on mesos odd issues with heartbeat and ghost task trackers.
Hi John,

Not sure if you ended up getting to the bottom of the issue, but often when the scheduler gives up and hits this timeout, it's because something funky happened in Mesos and the scheduler wasn't updated correctly. Could you describe the state of Mesos (with some logs too, if possible) while this happens?

Tom.

On 25 February 2015 at 17:01, John Omernik j...@omernik.com wrote:

I am running hadoop-on-mesos 0.0.8 on Mesos 0.21.0. I am running into a weird issue where it appears two of my nodes, when a TaskTracker is run on them, never really complete the check-in process: the JobTracker is waiting for their heartbeat, they think they are running successfully, and then tasks that would be assigned to them stay in a hung/pending state waiting for the heartbeat.

Basically, in the JobTracker log I see the below (the pending reduce task count is one and the inactive slot count is 2, i.e. launched but no heartbeat yet), so the JobTracker just sits there waiting, and the node thinks it's running fine.

Is there a way to have the JobTracker give up on a TaskTracker sooner? This waiting-for-timeout period seems odd. Thanks! (If there is any other information I can provide, please let me know.)

JobTracker log:

Pending Map Tasks: 0
Pending Reduce Tasks: 1
Running Map Tasks: 0
Running Reduce Tasks: 0
Idle Map Slots: 2
Idle Reduce Slots: 0
Inactive Map Slots: 2 (launched but no hearbeat yet)
Inactive Reduce Slots: 2 (launched but no hearbeat yet)
Needed Map Slots: 0
Needed Reduce Slots: 0
Unhealthy Trackers: 0
2015-02-25 10:57:01,930 INFO mapred.ResourcePolicy [Thread-1290]: Satisfied map and reduce slots needed.
2015-02-25 10:57:02,083 INFO mapred.MesosScheduler [IPC Server handler 7 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.
2015-02-25 10:57:02,097 INFO mapred.MesosScheduler [IPC Server handler 0 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.
2015-02-25 10:57:02,148 INFO mapred.MesosScheduler [IPC Server handler 4 on 7676]: Unknown/exited TaskTracker: http://moonman:31182.
2015-02-25 10:57:02,392 INFO mapred.MesosScheduler [IPC Server handler 1 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.
2015-02-25 10:57:02,403 INFO mapred.MesosScheduler [IPC Server handler 3 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.
2015-02-25 10:57:02,459 INFO mapred.MesosScheduler [IPC Server handler 6 on 7676]: Unknown/exited TaskTracker: http://moonman:31182.
2015-02-25 10:57:02,702 INFO mapred.MesosScheduler [IPC Server handler 4 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:31264.
2015-02-25 10:57:02,714 INFO mapred.MesosScheduler [IPC Server handler 5 on 7676]: Unknown/exited TaskTracker: http://hadoopmapr3:50060.
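On the "give up sooner" question: in MR1 the JobTracker's patience for a silent TaskTracker is governed by `mapred.tasktracker.expiry.interval` (milliseconds; the default is 600000, i.e. 10 minutes). Lowering it in mapred-site.xml is one way to make the JT declare a tracker lost earlier; treat the value below as illustrative, not a recommendation for this cluster:

```xml
<property>
  <name>mapred.tasktracker.expiry.interval</name>
  <!-- Milliseconds without a heartbeat before a TaskTracker is declared
       lost. Default 600000 (10 min); 120000 (2 min) is only an example. -->
  <value>120000</value>
</property>
```

Note this does not fix the underlying check-in failure; it only bounds how long pending tasks wait on a tracker that will never heartbeat.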
Re: Hadoop on Mesos
Hi Tom,

Thanks a lot for your reply, it's very helpful.

On 01/29/2015 05:54 PM, Tom Arnfeld wrote:
> Having said that, we've been working on a solution to this problem which
> enables Hadoop to launch different types of slots over the lifetime of a
> single job [...]
> - https://github.com/mesos/hadoop/pull/33

Do you have any idea of when your pull request will be merged? It looks pretty interesting, even if we're just playing around at this point. Is your hadoop-mesos-0.0.9.jar available for download somewhere, or do I have to build it myself?

In the meantime, I'm adding more slaves to see if this makes the problem go away, at least for demos.

> # HA Hadoop JTs
> The framework currently does not support a full HA setup, however that's
> not a huge issue. The JT will automatically restart jobs where they left
> off on its own when a failover occurs, but for the time being all the task
> trackers will be killed and new ones spawned.

I'm not sure I understand. I know task trackers will get restarted; that's not what I'm worried about. The issue I see is with the JT: it's started on one master only. If that master goes down, the framework goes down. I was kind of hoping to be able to do something like this:

<property>
  <name>mapred.job.tracker</name>
  <value>zk://mesos01:2181,mesos02:2181,mesos03:2181/hadoop530</value>
</property>

Perhaps this doesn't actually work as I would expect. It doesn't look like there's been any progress on issue #28, unfortunately...

> # Multiple versions of hadoop on the cluster
> This is totally fine, each JT configuration can be given its own hadoop
> tar.gz file with the right version in it, and they will all happily share
> the Mesos cluster.

I guess you have to have multiple startup scripts for this, and also multiple versions of Hadoop on the masters. Any pointers on how you've set this up would be much appreciated.

Cheers,
Alex
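For the small-cluster slot-allocation problem discussed in this thread, the mesos/hadoop framework exposes min/max slot capacity and per-slot resource properties in mapred-site.xml. A hedged sketch follows; the property names are from the mesos/hadoop project's configuration, but the values are purely illustrative and would need tuning:

```xml
<!-- Floor for slots the framework keeps allocated (illustrative values). -->
<property>
  <name>mapred.mesos.total.map.slots.minimum</name>
  <value>5</value>
</property>
<property>
  <name>mapred.mesos.total.reduce.slots.minimum</name>
  <value>2</value>
</property>
<!-- Resources each slot consumes from a Mesos offer; smaller slots mean
     more of them fit on a small cluster. -->
<property>
  <name>mapred.mesos.slot.cpus</name>
  <value>1</value>
</property>
<property>
  <name>mapred.mesos.slot.mem</name>
  <value>1024</value>
</property>
```

Forcing a nonzero reduce-slot minimum is one way to avoid the "maps finish, then the job stalls waiting for a reduce slot" behaviour described above.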
Re: Hadoop on Mesos
Hi Alex,

Great to hear you're hoping to use Hadoop on Mesos. We've been running it for a good 6 months and it's been awesome.

I'll answer the simpler question first: running multiple JobTrackers should be just fine, even multiple JTs with HA enabled (we do this). The Mesos scheduler for Hadoop will ship all configuration options needed for each TaskTracker within Mesos, so there's nothing you need to have specifically configured on each slave.

# Slow slot allocations

If you only have a few slaves, not many resources, and a large amount of resources per slot, you might end up with a pretty small slot allocation (e.g. 5 mappers and 1 reducer). Because of the nature of Hadoop, slots are static for each TaskTracker, and the framework does a best effort to figure out what balance of map/reduce slots to launch on the cluster. Because of this, the current stable version of the framework has a few issues when running on small clusters, especially when you don't configure min/max slot capacity for each JobTracker. A few links below:

- https://github.com/mesos/hadoop/issues/32
- https://github.com/mesos/hadoop/issues/31
- https://github.com/mesos/hadoop/issues/28
- https://github.com/mesos/hadoop/issues/26

Having said that, we've been working on a solution to this problem which enables Hadoop to launch different types of slots over the lifetime of a single job, meaning you could start with 5 maps and 1 reduce, and end with 0 maps and 6 reduces. It's not perfect, but it's a decent optimisation if you still need to use Hadoop.

- https://github.com/mesos/hadoop/pull/33

You may also want to look into how large your executor URI is (the one containing the hadoop source that gets downloaded for each TaskTracker) and how long that takes to download; it might be that the task trackers are taking a while to bootstrap.

# HA Hadoop JTs

The framework currently does not support a full HA setup, but that's not a huge issue. The JT will automatically restart jobs where they left off on its own when a failover occurs, but for the time being all the task trackers will be killed and new ones spawned. Depending on your setup, this could be a fairly negligible time.

# Multiple versions of hadoop on the cluster

This is totally fine: each JT configuration can be given its own hadoop tar.gz file with the right version in it, and they will all happily share the Mesos cluster.

I hope this makes sense! Ping me on IRC (tarnfeld) if you run into anything funky on that branch for flexi trackers.

Tom.

--
Tom Arnfeld
Developer // DueDil

On Thu, Jan 29, 2015 at 4:09 PM, Alex alex.m.lis...@gmail.com wrote:

Hi guys,

I'm a Hadoop and Mesos n00b, so please be gentle. I'm trying to set up a Mesos cluster, and my ultimate goal is to introduce Mesos in my organization by showing off its ability to run multiple Hadoop clusters, plus other stuff, on the same resources. I'd like to be able to do this with an HA configuration as close as possible to something we would run in production.

I've successfully set up a Mesos cluster with 3 masters and 4 slaves, but I'm having trouble getting Hadoop jobs to run on top of it. I'm using Mesos 0.21.1 and Hadoop CDH 5.3.0. Initially I tried to follow the Mesosphere tutorial[1], but it looks like it is very outdated and I didn't get very far. Then I tried following the instructions in the github repo[2], but they're also less than ideal.

I've managed to get a Hadoop JobTracker running on one of the masters; I can submit jobs to it and they eventually finish. The strange thing is that they take a really long time to start the reduce task, so much so that the first few times I thought it wasn't working at all.
Here's part of the output for a simple wordcount example:

15/01/29 16:37:58 INFO mapred.JobClient: map 0% reduce 0%
15/01/29 16:39:23 INFO mapred.JobClient: map 25% reduce 0%
15/01/29 16:39:31 INFO mapred.JobClient: map 50% reduce 0%
15/01/29 16:39:34 INFO mapred.JobClient: map 75% reduce 0%
15/01/29 16:39:37 INFO mapred.JobClient: map 100% reduce 0%
15/01/29 16:56:25 INFO mapred.JobClient: map 100% reduce 100%
15/01/29 16:56:29 INFO mapred.JobClient: Job complete: job_201501291533_0004

Mesos started 3 task trackers which ran the map tasks pretty fast, but then it looks like it was stuck for quite a while before launching a fourth task tracker to run the reduce task. Is this normal, or is there something wrong here?

More questions: my configuration file looks a lot like the example in the github repo, but that's listed as being representative of a pseudo-distributed configuration. What should it look like for a real distributed setup? How can I go about running multiple Hadoop clusters? Currently, all three masters have the same configuration file, so they all create a different framework. How should things be set up for a high-availability Hadoop framework that can
Re: Hadoop on Mesos use local cdh4 installation instead of tar.gz
Hello,

Using the Hadoop distribution is possible (here cdh4.1.2):

An archive is mandatory for the hadoop-mesos framework, so I created and deployed a small dummy file that does not cost much to fetch and untar. In mapred-site.xml, I override mapred.mesos.executor.directory and mapred.mesos.executor.command so that the job uses the Mesos task directory and the deployed Cloudera TaskTracker for execution:

+ <property>
+   <name>mapred.mesos.executor.uri</name>
+   <value>hdfs://hdfscluster/tmp/dummy.tar.gz</value>
+ </property>
+ <property>
+   <name>mapred.mesos.executor.directory</name>
+   <value>./</value>
+ </property>
+ <property>
+   <name>mapred.mesos.executor.command</name>
+   <value>. /etc/default/hadoop-0.20; env ; $HADOOP_HOME/bin/hadoop org.apache.hadoop.mapred.MesosExecutor</value>
+ </property>

I also add some environment variables in /etc/default/hadoop-0.20 so the Hadoop services can find the hadoop-mesos jar and libmesos:

+ export HADOOP_CLASSPATH=/usr/lib/hadoop-mesos/hadoop-mesos.jar:$HADOOP_HOME/contrib/fairscheduler/hadoop-fairscheduler-2.0.0-mr1-cdh4.1.2.jar:$HADOOP_CLASSPATH
+ export MESOS_NATIVE_LIBRARY=/usr/lib/libmesos.so

I created a hadoop-mesos deb to be deployed with the Hadoop distribution. My goal is to limit the -copyToLocal of TaskTracker code for each Mesos task, with no need for special manipulation of the Hadoop distribution code (config only).

Regards,

On 31/12/2013 16:45, Damien Hardy wrote:
> I'm now able to use snappy compression by adding
>
> export JAVA_LIBRARY_PATH=/usr/lib/hadoop/lib/native/
>
> in my /etc/default/mesos-slave (an environment variable for the mesos-slave
> process, used by my init.d script). This envvar is propagated to the
> executor JVM, so the TaskTracker can find libsnappy.so and use it.
>
> Starting to use a local deployment of cdh4... Reading the source, it seems
> something could be done using mapred.mesos.executor.directory and
> mapred.mesos.executor.command to use the local hadoop.

On 31/12/2013 15:08, Damien Hardy wrote:
> Hello,
>
> Happy new year 2014, @mesos users.
>
> I am trying to get MapReduce cdh4.1.2 running on Mesos. It seems to be
> working mostly, but a few things are still problematic:
>
> * MR1 code is already deployed locally with HDFS; is there a way to use it
> instead of a tar.gz stored on HDFS that gets copied locally and untarred?
> * If not: the tar.gz distribution of cdh4 seems not to support Snappy
> compression. Is there a way to correct this?
>
> Best regards,

--
Damien HARDY
IT Infrastructure Architect
Viadeo - 30 rue de la Victoire - 75009 Paris - France
PGP : 45D7F89A
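Since the framework insists on an executor URI, the dummy-archive trick described above can be reproduced with something like the following sketch (the HDFS destination path mirrors the mapred.mesos.executor.uri value; the upload line is commented out because it requires a live cluster):

```shell
# Create an empty gzipped tarball to satisfy the framework's mandatory
# executor URI without shipping a full Hadoop distribution.
tar -czf dummy.tar.gz -T /dev/null

# Sanity check: listing the archive should print nothing (it is empty).
tar -tzf dummy.tar.gz

# Upload to HDFS so mapred.mesos.executor.uri can point at it.
# (Requires a running cluster; the path is taken from the config above.)
# hadoop fs -copyFromLocal dummy.tar.gz hdfs://hdfscluster/tmp/dummy.tar.gz
```

Each Mesos task then fetches and untars a few-byte file instead of a multi-hundred-megabyte distribution, which is exactly the -copyToLocal cost being avoided.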
Re: Hadoop on Mesos use local cdh4 installation instead of tar.gz
I'm now able to use snappy compression by adding

export JAVA_LIBRARY_PATH=/usr/lib/hadoop/lib/native/

in my /etc/default/mesos-slave (an environment variable for the mesos-slave process, used by my init.d script). This envvar is propagated to the executor JVM, so the TaskTracker can find libsnappy.so and use it.

Starting to use a local deployment of cdh4... Reading the source, it seems something could be done using mapred.mesos.executor.directory and mapred.mesos.executor.command to use the local hadoop.

On 31/12/2013 15:08, Damien Hardy wrote:
> Hello,
>
> Happy new year 2014, @mesos users.
>
> I am trying to get MapReduce cdh4.1.2 running on Mesos. It seems to be
> working mostly, but a few things are still problematic:
>
> * MR1 code is already deployed locally with HDFS; is there a way to use it
> instead of a tar.gz stored on HDFS that gets copied locally and untarred?
> * If not: the tar.gz distribution of cdh4 seems not to support Snappy
> compression. Is there a way to correct this?
>
> Best regards,

--
Damien HARDY