Hi Stephen,

You can see all the launch flags here: http://mesos.apache.org/documentation/latest/configuration/ (or just run .../mesos-slave.sh --help).
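As a rough illustration of how those flags relate to environment variables: the Mesos configuration page says that each --option_name=value flag can also be supplied as MESOS_OPTION_NAME=value. The sketch below only rewrites strings and never starts an agent. flag_to_env is a hypothetical helper name invented for this example (it is not part of Mesos), host1 is a placeholder, and /var/lib/mesos is an assumed path.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mesos): convert a --flag=value launch flag
# into the equivalent MESOS_FLAG=value environment assignment.
# Assumes the argument has the form --name=value.
flag_to_env() {
  flag="${1#--}"        # strip the leading "--"
  name="${flag%%=*}"    # text before the first "="
  value="${flag#*=}"    # text after the first "="
  upper=$(printf '%s' "$name" | tr '[:lower:]' '[:upper:]')
  printf 'MESOS_%s=%s\n' "$upper" "$value"
}

# Placeholder host and an assumed work directory, for illustration only.
flag_to_env --master=zk://host1:2181/mesos
flag_to_env --work_dir=/var/lib/mesos
```

Under systemd, assignments like these would usually be supplied through an Environment= line or an EnvironmentFile= in the unit or a drop-in. Where the packaged mesos-init-wrapper expects its settings is a separate question that this sketch does not answer.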
If you launch it via systemd (which is how we run it ourselves in DCOS), you will have to configure your nodes (masters/agents) via the MESOS_* environment variables.

In production you want to use ZooKeeper as the discovery/coordination method (as you correctly did here). You can use whatever you like as the znode path, but it must be the same for all masters/agents.

If you run a test/dev configuration with multiple masters/agents on the same node, make sure to (a) configure each master on its own port (--port) and (b) point each node at a different work_dir (or you'll get confusing errors around log-replicas).

(@haosdent: I'm *almost* sure the packaging is correct, but it needs the env vars to be configured properly.)

*Marco Massenzio*
*Distributed Systems Engineer*
http://codetrips.com

On Thu, Aug 6, 2015 at 4:12 AM, Stephen Knight <[email protected]> wrote:

> Ok, that's working if I run it like this: /usr/sbin/mesos-slave
> --master=zk://172.31.x.x:2181/mesos & > /dev/null 2>&1
>
> Thanks for your help, really appreciate it.
>
> On Thu, Aug 6, 2015 at 3:03 PM, haosdent <[email protected]> wrote:
>
>> Hm, you need to pass your master location, for example:
>>
>> /usr/sbin/mesos-slave --master=x.x.x.x:5050
>>
>> If you use ZooKeeper, you need to use a format like:
>>
>> /usr/sbin/mesos-slave --master=zk://host1:port1,host2:port2,.../path
>>
>> On Thu, Aug 6, 2015 at 6:55 PM, Stephen Knight <[email protected]> wrote:
>>
>>> My system doesn't support cat with systemctl for some reason, but here
>>> are the contents of /usr/lib/systemd/system/mesos-slave.service:
>>>
>>> [Unit]
>>> Description=Mesos Slave
>>> After=network.target
>>> Wants=network.target
>>>
>>> [Service]
>>> ExecStart=/usr/bin/mesos-init-wrapper slave
>>> KillMode=process
>>> Restart=always
>>> RestartSec=20
>>> LimitNOFILE=16384
>>> CPUAccounting=true
>>> MemoryAccounting=true
>>>
>>> [Install]
>>> WantedBy=multi-user.target
>>>
>>> What are the required flags to start it manually?
>>>
>>> On Thu, Aug 6, 2015 at 2:51 PM, haosdent <[email protected]> wrote:
>>>
>>>> Or you could try "systemctl cat mesos-slave.service" and show us the
>>>> file content.
>>>>
>>>> On Thu, Aug 6, 2015 at 6:49 PM, haosdent <[email protected]> wrote:
>>>>
>>>>> From this message, I think "systemctl status mesos-slave.service -l"
>>>>> shows that mesos-slave was run with incorrect flags, and the status
>>>>> output is the slave's help message. Could you try to start
>>>>> mesos-slave manually, not through systemctl?
>>>>>
>>>>> On Thu, Aug 6, 2015 at 6:41 PM, Stephen Knight <[email protected]> wrote:
>>>>>
>>>>>> systemctl gives me the following output on CentOS. The command I ran
>>>>>> to start it was "systemctl start mesos-slave.service".
>>>>>>
>>>>>> [root@ip-172-31-35-167 mesos]# systemctl status mesos-slave.service -l
>>>>>> mesos-slave.service - Mesos Slave
>>>>>>    Loaded: loaded (/usr/lib/systemd/system/mesos-slave.service; enabled)
>>>>>>   Drop-In: /etc/systemd/system/mesos-slave.service.d
>>>>>>            └─mesos-slave-containerizers.conf
>>>>>>    Active: activating (auto-restart) (Result: exit-code) since Thu 2015-08-06 10:38:08 UTC; 2s ago
>>>>>>   Process: 1472 ExecStart=/usr/bin/mesos-init-wrapper slave *(code=exited, status=1/FAILURE)*
>>>>>>  Main PID: 1472 (code=exited, status=1/FAILURE)
>>>>>>
>>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: *If strict=false, any expected errors (e.g., slave cannot recover*
>>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: *information about an executor, because the slave died right before*
>>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: *the executor registered.)
during recovery are ignored and as much* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: *state >>>>>> as possible is recovered.* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: >>>>>> *(default: >>>>>> true)* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: >>>>>> *--[no-]switch_user >>>>>> Whether to run tasks as the user who* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: >>>>>> *submitted >>>>>> them rather than the user running* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: *the >>>>>> slave (requires setuid permission) (default: true)* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: >>>>>> *--[no-]version >>>>>> Show version and exit. (default: >>>>>> false)* >>>>>> >>>>>> Aug 06 10:38:08 ip-172-31-35-167.ec2.internal mesos-slave[1483]: >>>>>> *--work_dir=VALUE >>>>>> Directory path to place framework work >>>>>> directories* >>>>>> >>>>>> >>>>>> >>>>>> I've also run strace against it, nothing sticks out: >>>>>> >>>>>> >>>>>> strace systemctl start mesos-slave.service >>>>>> >>>>>> execve("/bin/systemctl", ["systemctl", "start", >>>>>> "mesos-slave.service"], [/* 18 vars */]) = 0 >>>>>> >>>>>> brk(0) = 0x7f5c2af9f000 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5c6000 >>>>>> >>>>>> access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or >>>>>> directory) >>>>>> >>>>>> open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0644, st_size=20940, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 20940, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5c2a5c0000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libsystemd-daemon.so.0", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240\r\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, 
{st_mode=S_IFREG|0755, st_size=15216, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2109448, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c2a1a2000 >>>>>> >>>>>> mprotect(0x7f5c2a1a4000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c2a3a4000, 4096, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f5c2a3a4000 >>>>>> >>>>>> mmap(0x7f5c2a3a5000, 8, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5c2a3a5000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libdbus-1.so.3", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0@x\0\0\0\0\0\0"..., >>>>>> 832) = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=304536, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2390496, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29f5a000 >>>>>> >>>>>> mprotect(0x7f5c29fa0000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c2a1a0000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x46000) = 0x7f5c2a1a0000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/librt.so.1", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\"\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=44088, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5bf000 >>>>>> >>>>>> mmap(NULL, 2128952, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29d52000 >>>>>> >>>>>> mprotect(0x7f5c29d59000, 2093056, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29f58000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x6000) = 0x7f5c29f58000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libselinux.so.1", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240d\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> 
= 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=147120, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2246784, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29b2d000 >>>>>> >>>>>> mprotect(0x7f5c29b4e000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29d4e000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x21000) = 0x7f5c29d4e000 >>>>>> >>>>>> mmap(0x7f5c29d50000, 6272, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5c29d50000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/liblzma.so.5", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000/\0\0\0\0\0\0"..., >>>>>> 832) = >>>>>> 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=153184, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2245240, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29908000 >>>>>> >>>>>> mprotect(0x7f5c2992c000, 2093056, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29b2b000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x23000) = 0x7f5c29b2b000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libgcrypt.so.11", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0u\0\0\0\0\0\0"..., 832) >>>>>> = >>>>>> 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=534488, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5be000 >>>>>> >>>>>> mmap(NULL, 2621456, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29687000 >>>>>> >>>>>> mprotect(0x7f5c29703000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29903000, 16384, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7c000) = 0x7f5c29903000 >>>>>> >>>>>> mmap(0x7f5c29907000, 16, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5c29907000 >>>>>> >>>>>> close(3) = 0 >>>>>> 
>>>>>> open("/lib64/libgcc_s.so.1", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360*\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=88720, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2184192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c29471000 >>>>>> >>>>>> mprotect(0x7f5c29486000, 2093056, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29685000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x14000) = 0x7f5c29685000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\3\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\0\34\2\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=2107760, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 3932736, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c290b0000 >>>>>> >>>>>> mprotect(0x7f5c29266000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c29466000, 24576, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x1b6000) = 0x7f5c29466000 >>>>>> >>>>>> mmap(0x7f5c2946c000, 16960, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5c2946c000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libdl.so.2", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\320\16\0\0\0\0\0\0"..., >>>>>> 832) = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=19512, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5bd000 >>>>>> >>>>>> mmap(NULL, 2109744, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c28eac000 >>>>>> >>>>>> mprotect(0x7f5c28eaf000, 2093056, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c290ae000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x2000) = 0x7f5c290ae000 
>>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libpthread.so.0", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240l\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=141616, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2208864, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c28c90000 >>>>>> >>>>>> mprotect(0x7f5c28ca6000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c28ea6000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x16000) = 0x7f5c28ea6000 >>>>>> >>>>>> mmap(0x7f5c28ea8000, 13408, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f5c28ea8000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libpcre.so.1", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\360\25\0\0\0\0\0\0"..., >>>>>> 832) = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=398272, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 2490888, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c28a2f000 >>>>>> >>>>>> mprotect(0x7f5c28a8e000, 2097152, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c28c8e000, 8192, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x5f000) = 0x7f5c28c8e000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> open("/lib64/libgpg-error.so.0", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> read(3, >>>>>> "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0000\n\0\0\0\0\0\0"..., >>>>>> 832) >>>>>> = 832 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0755, st_size=19384, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5bc000 >>>>>> >>>>>> mmap(NULL, 2113656, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, >>>>>> 3, 0) = 0x7f5c2882a000 >>>>>> >>>>>> mprotect(0x7f5c2882e000, 2093056, PROT_NONE) = 0 >>>>>> >>>>>> mmap(0x7f5c28a2d000, 8192, PROT_READ|PROT_WRITE, >>>>>> 
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x3000) = 0x7f5c28a2d000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5bb000 >>>>>> >>>>>> mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, >>>>>> 0) = 0x7f5c2a5b9000 >>>>>> >>>>>> arch_prctl(ARCH_SET_FS, 0x7f5c2a5b9880) = 0 >>>>>> >>>>>> mprotect(0x7f5c29466000, 16384, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c28a2d000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c28ea6000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c28c8e000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c290ae000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c29685000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c29903000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c29b2b000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c29d4e000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c29f58000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c2a1a0000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c2a3a4000, 4096, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c2a81d000, 8192, PROT_READ) = 0 >>>>>> >>>>>> mprotect(0x7f5c2a5c7000, 4096, PROT_READ) = 0 >>>>>> >>>>>> munmap(0x7f5c2a5c0000, 20940) = 0 >>>>>> >>>>>> set_tid_address(0x7f5c2a5b9b50) = 1545 >>>>>> >>>>>> set_robust_list(0x7f5c2a5b9b60, 24) = 0 >>>>>> >>>>>> rt_sigaction(SIGRTMIN, {0x7f5c28c96780, [], SA_RESTORER|SA_SIGINFO, >>>>>> 0x7f5c28c9f130}, NULL, 8) = 0 >>>>>> >>>>>> rt_sigaction(SIGRT_1, {0x7f5c28c96810, [], >>>>>> SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f5c28c9f130}, NULL, 8) = 0 >>>>>> >>>>>> rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0 >>>>>> >>>>>> getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, >>>>>> rlim_max=RLIM64_INFINITY}) = 0 >>>>>> >>>>>> brk(0) = 0x7f5c2af9f000 >>>>>> >>>>>> brk(0x7f5c2afc0000) = 0x7f5c2afc0000 >>>>>> >>>>>> access("/etc/system-fips", F_OK) = -1 ENOENT (No such file or >>>>>> directory) >>>>>> >>>>>> 
statfs("/sys/fs/selinux", {f_type=0xf97cff8c, f_bsize=4096, >>>>>> f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, >>>>>> f_namelen=255, f_frsize=4096}) = 0 >>>>>> >>>>>> statfs("/sys/fs/selinux", {f_type=0xf97cff8c, f_bsize=4096, >>>>>> f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, >>>>>> f_namelen=255, f_frsize=4096}) = 0 >>>>>> >>>>>> stat("/sys/fs/selinux", {st_mode=S_IFDIR|0755, st_size=0, ...}) = 0 >>>>>> >>>>>> open("/usr/lib/locale/locale-archive", O_RDONLY|O_CLOEXEC) = 3 >>>>>> >>>>>> fstat(3, {st_mode=S_IFREG|0644, st_size=106065056, ...}) = 0 >>>>>> >>>>>> mmap(NULL, 106065056, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f5c22303000 >>>>>> >>>>>> close(3) = 0 >>>>>> >>>>>> ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or >>>>>> TCGETS, {B9600 opost isig icanon echo ...}) = 0 >>>>>> >>>>>> stat("/proc/1/root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> stat("/proc/1/root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> lstat("/run/systemd/system/", {st_mode=S_IFDIR|0755, st_size=80, >>>>>> ...}) = 0 >>>>>> >>>>>> geteuid() = 0 >>>>>> >>>>>> socket(PF_LOCAL, SOCK_STREAM|SOCK_CLOEXEC, 0) = 3 >>>>>> >>>>>> connect(3, {sa_family=AF_LOCAL, sun_path="/run/systemd/private"}, 22) >>>>>> = 0 >>>>>> >>>>>> fcntl(3, F_GETFL) = 0x2 (flags O_RDWR) >>>>>> >>>>>> fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0 >>>>>> >>>>>> geteuid() = 0 >>>>>> >>>>>> getsockname(3, {sa_family=AF_LOCAL, NULL}, [2]) = 0 >>>>>> >>>>>> getsockopt(3, SOL_SOCKET, SO_PEERCRED, {pid=1, uid=0, gid=0}, [12]) = >>>>>> 0 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716107629}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLOUT}], 1, 90000) = 1 ([{fd=3, >>>>>> revents=POLLOUT}]) >>>>>> >>>>>> sendto(3, "\0", 1, MSG_NOSIGNAL, NULL, 0) = 1 >>>>>> >>>>>> sendto(3, "AUTH EXTERNAL 
30\r\n", 18, MSG_NOSIGNAL, NULL, 0) = 18 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716225119}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 90000) = 1 ([{fd=3, revents=POLLIN}]) >>>>>> >>>>>> read(3, "OK c2d5f933b89e938a29b77c0355c33"..., 2048) = 37 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716481724}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLOUT}], 1, 90000) = 1 ([{fd=3, >>>>>> revents=POLLOUT}]) >>>>>> >>>>>> sendto(3, "NEGOTIATE_UNIX_FD\r\n", 19, MSG_NOSIGNAL, NULL, 0) = 19 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716552147}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 90000) = 1 ([{fd=3, revents=POLLIN}]) >>>>>> >>>>>> read(3, "AGREE_UNIX_FD\r\n", 2048) = 15 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716708588}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLOUT}], 1, 90000) = 1 ([{fd=3, >>>>>> revents=POLLOUT}]) >>>>>> >>>>>> sendto(3, "BEGIN\r\n", 7, MSG_NOSIGNAL, NULL, 0) = 7 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 716773800}) = 0 >>>>>> >>>>>> stat("/proc/1/root", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> stat("/", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 >>>>>> >>>>>> ioctl(0, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or >>>>>> TCGETS, {B9600 opost isig icanon echo ...}) = 0 >>>>>> >>>>>> rt_sigprocmask(SIG_SETMASK, ~[RTMIN RT_1], [], 8) = 0 >>>>>> >>>>>> clone(child_stack=0, >>>>>> flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, >>>>>> child_tidptr=0x7f5c2a5b9b50) = 1546 >>>>>> >>>>>> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 >>>>>> >>>>>> gettid() = 1545 >>>>>> >>>>>> sendmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(2)=[{"l\1\0\1$\0\0\0\1\0\0\0\240\0\0\0\1\1o\0\31\0\0\0/org/fre"..., >>>>>> 176}, {"\23\0\0\0mesos-slave.service\0\7\0\0\0repl"..., 36}], >>>>>> msg_controllen=0, msg_flags=0}, MSG_NOSIGNAL) = 212 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 717164627}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 25000) = 1 ([{fd=3, revents=POLLIN}]) 
>>>>>> >>>>>> recvmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(1)=[{"l\2\1\1'\0\0\0\1\0\0\0\17\0\0\0\5\1u\0\1\0\0\0\10\1g\0\1o\0\0"..., >>>>>> 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) >>>>>> = >>>>>> 1143 >>>>>> >>>>>> recvmsg(3, 0x7fff05789c50, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource >>>>>> temporarily unavailable) >>>>>> >>>>>> sendmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(2)=[{"l\1\0\1\30\0\0\0\2\0\0\0\227\0\0\0\1\1o\0\31\0\0\0/org/fre"..., >>>>>> 168}, {"\23\0\0\0mesos-slave.service\0", 24}], msg_controllen=0, >>>>>> msg_flags=0}, MSG_NOSIGNAL) = 192 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 719385633}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 25000) = 1 ([{fd=3, revents=POLLIN}]) >>>>>> >>>>>> recvmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(1)=[{"l\2\1\1;\0\0\0\2\0\0\0\17\0\0\0\5\1u\0\2\0\0\0\10\1g\0\1o\0\0"..., >>>>>> 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) >>>>>> = >>>>>> 91 >>>>>> >>>>>> recvmsg(3, 0x7fff05789c50, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource >>>>>> temporarily unavailable) >>>>>> >>>>>> sendmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(2)=[{"l\1\0\0019\0\0\0\3\0\0\0\250\0\0\0\1\1o\0006\0\0\0/org/fre"..., >>>>>> 184}, {"\35\0\0\0org.freedesktop.systemd1.Uni"..., 57}], >>>>>> msg_controllen=0, >>>>>> msg_flags=0}, MSG_NOSIGNAL) = 241 >>>>>> >>>>>> clock_gettime(CLOCK_MONOTONIC, {774, 719741008}) = 0 >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 25000) = 1 ([{fd=3, revents=POLLIN}]) >>>>>> >>>>>> recvmsg(3, {msg_name(0)=NULL, >>>>>> msg_iov(1)=[{"l\2\1\1\10\0\0\0\3\0\0\0\17\0\0\0\5\1u\0\3\0\0\0\10\1g\0\1v\0\0"..., >>>>>> 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) >>>>>> = >>>>>> 40 >>>>>> >>>>>> recvmsg(3, 0x7fff05789c50, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource >>>>>> temporarily unavailable) >>>>>> >>>>>> poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, >>>>>> revents=POLLIN}]) >>>>>> >>>>>> recvmsg(3, {msg_name(0)=NULL, >>>>>> 
msg_iov(1)=[{"l\4\1\1M\0\0\0\206\1\0\0z\0\0\0\1\1o\0\31\0\0\0/org/fre"..., 2048}], msg_controllen=0, msg_flags=MSG_CMSG_CLOEXEC}, MSG_CMSG_CLOEXEC) = 1089
>>>>>> recvmsg(3, 0x7fff0578a030, MSG_CMSG_CLOEXEC) = -1 EAGAIN (Resource temporarily unavailable)
>>>>>> close(3) = 0
>>>>>> kill(1546, SIGTERM) = 0
>>>>>> kill(1546, SIGCONT) = 0
>>>>>> waitid(P_PID, 1546, {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1546, si_status=0, si_utime=0, si_stime=0}, WEXITED, NULL) = 0
>>>>>> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1546, si_status=0, si_utime=0, si_stime=0} ---
>>>>>> exit_group(0) = ?
>>>>>> +++ exited with 0 +++
>>>>>>
>>>>>> On Thu, Aug 6, 2015 at 2:36 PM, haosdent <[email protected]> wrote:
>>>>>>
>>>>>>> Starting the slave with "service mesos-slave" failed, so it could
>>>>>>> not connect to the master, right? Is there any log about this?
>>>>>>>
>>>>>>> On Thu, Aug 6, 2015 at 6:30 PM, Stephen Knight <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Yes, I have tried on both environments (one CentOS 7 and one
>>>>>>>> Ubuntu) and they behave the exact same way. I probably should
>>>>>>>> clarify, though, that after a restart the agent fails to start on
>>>>>>>> the slave, which is why it never reaches the master.
>>>>>>>>
>>>>>>>> I can, however, run a smoke test and get it to connect from a slave
>>>>>>>> to the master by passing a job, but the slave agent won't start.
>>>>>>>> The guide I followed is at:
>>>>>>>> http://open.mesosphere.com/getting-started/datacenter/install/
>>>>>>>>
>>>>>>>> The environment is by no means live; I can share details with you
>>>>>>>> if you can help me to that degree.
>>>>>>>>
>>>>>>>> On Thu, Aug 6, 2015 at 2:23 PM, haosdent <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi @Stephen, from your slave log I could not see any log entries
>>>>>>>>> about the slave restarting. Are you sure you restarted the slave
>>>>>>>>> after the reboot?
>>>>>>>>>
>>>>>>>>> On Thu, Aug 6, 2015 at 5:54 PM, Stephen Knight <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Klaus, I have attached all from a master and a slave.
>>>>>>>>>>
>>>>>>>>>> I've replicated the problem over and over again, not sure what to
>>>>>>>>>> make of it. First registration is fine, but then if I reboot the
>>>>>>>>>> service for mesos-slave (process restart or full server restart)
>>>>>>>>>> it never connects again.
>>>>>>>>>>
>>>>>>>>>> The VMs are in the same VPC on AWS with an open security group
>>>>>>>>>> between them.
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 6, 2015 at 12:41 PM, Klaus Ma <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Stephen,
>>>>>>>>>>>
>>>>>>>>>>> Would you share the log of master & slave?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Klaus
>>>>>>>>>>>
>>>>>>>>>>> On 2015-08-06 16:07, Stephen Knight wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I was wondering if anyone can help me. I have a test setup, 1
>>>>>>>>>>> master/zookeeper and 2 slaves on Ubuntu 14.04.
>>>>>>>>>>>
>>>>>>>>>>> When I initialize the slaves the first time it all works and
>>>>>>>>>>> they register with the master (I can see it on x.x.x.x:5050),
>>>>>>>>>>> but when I reboot those slaves for any reason, they never
>>>>>>>>>>> re-register. Am I missing something?
>>>>>>>>>>>
>>>>>>>>>>> Thx
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> ---
>>>>>>>>>>> Stephen Knight
>>>>>>>>>>> Infrastructure Consultant
>>>>>>>>>>>
>>>>>>>>>>> Pivotal Services @ EMC
>>>>>>>>>>> +971 (0)56 538 2071
>>>>>>>>>>>
>>>>>>>>>>> [email protected]
>>>>>>>>>>> [email protected]
>>>>>>>>>>>
>>>>>>>>>>> Pivotal.io
>>>>>>>>>>>
>>>>>>>>>>> Notice of Confidentiality - This email message is for the sole
>>>>>>>>>>> use of the intended recipient and may contain confidential and
>>>>>>>>>>> privileged information. Any unauthorized review, use, disclosure
>>>>>>>>>>> or distribution is prohibited. If you are not the intended
>>>>>>>>>>> recipient, please contact the sender by reply email and destroy
>>>>>>>>>>> all copies of the original message.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Best Regards,
>>>>>>>>> Haosdent Huang
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Best Regards,
>>>>>>> Haosdent Huang
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best Regards,
>>>>> Haosdent Huang
>>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>>>
>>>
>>
>> --
>> Best Regards,
>> Haosdent Huang
>>
>

