Hi Tinco,

What OS/environment are you running mesos-slave on? You might need to enable cgroups if they're not enabled/mounted by default.
Tim

On Tue, Dec 16, 2014 at 11:26 AM, Ian Downes <[email protected]> wrote:
>
> Can you also please post the output of these commands for a working and a
> non-working host?
>
> $ cat /proc/cgroups
>
> $ cat /proc/mounts
>
> Are you running inside a Docker or systemd container?
>
> On Tue, Dec 16, 2014 at 11:22 AM, Benjamin Mahler <
> [email protected]> wrote:
>>
>> +Tim Chen (please chime in if I'm missing something)
>>
>> Sorry for the delay, from a quick glance it looks like the
>> DockerContainerizer is a bit less liberal in the setting up of cgroups if
>> they are not mounted on the machine. I'm curious, if you remove "docker"
>> from the containerizers flag, does it work?
>>
>> Otherwise, you can try mounting the cgroups manually, as suggested by the
>> error message.
>>
>> Feel free to file a ticket to capture this!
>>
>> Hope this helps,
>> Ben
>>
>> On Fri, Dec 12, 2014 at 1:55 AM, Tinco Andringa <[email protected]> wrote:
>>>
>>> Hi, I'm provisioning a mesos cluster and on two of my machines I get the
>>> following error when starting mesos-slave:
>>>
>>> root@web1:~# /usr/local/sbin/mesos-slave
>>> --master=zk://localhost:2181/mesos --log_dir=/var/log/mesos
>>> --isolation=cgroups/cpu,cgroups/mem --containerizers=docker,mesos
>>> --executor_registration_timeout=5mins --work_dir=/var/run/work
>>> I1212 10:46:30.782308 32590 logging.cpp:172] INFO level logging started!
>>> I1212 10:46:30.782580 32590 main.cpp:142] Build: 2014-11-22 05:29:13 by root
>>> I1212 10:46:30.782615 32590 main.cpp:144] Version: 0.21.0
>>> I1212 10:46:30.782640 32590 main.cpp:147] Git tag: 0.21.0
>>> I1212 10:46:30.782665 32590 main.cpp:151] Git SHA: ab8fa655d34e8e15a4290422df38a18db1c09b5b
>>> Failed to create a containerizer: Could not create DockerContainerizer:
>>> Failed to find a mounted cgroups hierarchy for the 'cpu' subsystem; you
>>> probably need to mount cgroups manually!
>>>
>>> I have four machines in total. On two machines, db1 and db2, everything
>>> runs fine and the slaves get added to the cluster. On web1 and web2 it
>>> fails and they don't appear in the cluster. I run the exact same command on
>>> each of the machines. Mesos-master runs fine on web1, web2 and db1.
>>>
>>> Obviously there's some difference between the web and db machines, but
>>> I'm really unclear on what that difference is specifically. Most of my chef
>>> scripts are run on both types of machine; there's just some extra webserver
>>> stuff on the web machines, and some extra db stuff on the db machines. Db2
>>> is the only node that doesn't run zookeeper or mesos-master.
>>>
>>> Any hints or tips to get closer to the root of the problem would be much
>>> appreciated. I'm not afraid to dive into the source a little if necessary.
>>>
>>> Kind regards,
>>> Tinco
>>>
>>
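
For comparing a working and a non-working host, the two files Ian asks for can also be checked mechanically. The following is only a rough sketch, not anything taken from Mesos itself: it assumes cgroup v1 style entries in /proc/mounts (filesystem type `cgroup`, subsystem names in the mount options) and assumes that `--isolation=cgroups/cpu,cgroups/mem` needs the `cpu` and `memory` subsystems. The function names are made up for illustration.

```python
def parse_subsystems(cgroups_text):
    """Return the subsystem names listed in /proc/cgroups-style text."""
    names = []
    for line in cgroups_text.splitlines():
        line = line.strip()
        # Skip blank lines and the '#subsys_name ...' header line.
        if not line or line.startswith("#"):
            continue
        names.append(line.split()[0])
    return names


def mounted_hierarchies(mounts_text, subsystems):
    """Map each subsystem to the mount point of its cgroup hierarchy.

    Assumes cgroup v1 entries: the third field is 'cgroup' and the
    subsystem names appear among the comma-separated mount options.
    """
    mounted = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "cgroup":
            continue
        mount_point = fields[1]
        options = fields[3].split(",")
        for name in subsystems:
            if name in options:
                # Keep the first mount point seen for each subsystem.
                mounted.setdefault(name, mount_point)
    return mounted


def missing_subsystems(cgroups_text, mounts_text, required=("cpu", "memory")):
    """Return the required subsystems that have no mounted hierarchy.

    The default 'required' tuple is an assumption based on the
    --isolation flag quoted in this thread.
    """
    subsystems = parse_subsystems(cgroups_text)
    mounted = mounted_hierarchies(mounts_text, subsystems)
    return [name for name in required if name not in mounted]


def check_host():
    """Read the real files on this host and report missing hierarchies."""
    with open("/proc/cgroups") as f:
        cgroups_text = f.read()
    with open("/proc/mounts") as f:
        mounts_text = f.read()
    return missing_subsystems(cgroups_text, mounts_text)
```

Calling `check_host()` on db1 and on web1 should give an empty list on the host where the slave starts, and presumably a list containing `cpu` on the host that fails with the error quoted above.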

