Hmm. That does seem strange. If the slave was able to connect to the master, the framework should've been able to connect too.
I see that you specified --ip for slave. What happens if you don't specify it?

On Tue, Apr 16, 2013 at 11:54 PM, Xiaoying Zheng <[email protected]> wrote:
> hi, all,
>
> We failed to run a job from a slave node on a Mesos cluster but we think
> Mesos should support it. We set up a small Mesos cluster consisting of only
> two nodes, one master and one slave. When we ran the C++ test framework on
> the slave node, the framework kept getting connected and disconnected. BTW,
> when we ran the C++ test framework on the master node, everything went
> okay. We attached the logs. Any help is appreciated.
>
> Regards,
> Xiaoying
>
> 1. On the master node (192.168.1.130), we ran "bin/mesos-master.sh
> --ip=192.168.1.130".
>
> Log file created at: 2013/04/17 14:25:53
> Running on machine: master
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> I0417 14:25:53.069144 3253 logging.cpp:72] Logging to logs
> I0417 14:25:53.070135 3253 main.cpp:105] Build: 2013-01-22 20:32:32 by hadoop
> I0417 14:25:53.070175 3253 main.cpp:106] Starting Mesos master
> I0417 14:25:53.070313 3268 master.cpp:268] Master started on 192.168.1.130:5050
> I0417 14:25:53.070438 3268 master.cpp:283] Master ID: 201304171425-2113820480-5050-3253
> I0417 14:25:53.071903 3268 master.cpp:483] Elected as master!
> I0417 14:25:53.083060 3272 webui_utils.cpp:45] Loading webui script at '/home/hadoop/mesos-0.9.0/src/webui/master/webui.py'
> I0417 14:26:28.812635 3268 master.cpp:844] Attempting to register slave 201304171425-2113820480-5050-3253-0 at [email protected]:53221
> I0417 14:26:28.812717 3268 master.cpp:1097] Master now considering a slave at hdfs2:53221 as active
> I0417 14:26:28.812746 3268 master.cpp:1633] Adding slave 201304171425-2113820480-5050-3253-0 at hdfs2 with cpus=4; mem=968
> I0417 14:26:28.813254 3268 simple_allocator.cpp:69] Added slave 201304171425-2113820480-5050-3253-0 with cpus=4; mem=968
> I0417 14:26:49.963219 3270 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0000 at [email protected]:50381
> I0417 14:26:49.963736 3270 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0000
> I0417 14:26:49.963889 3270 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0000
> I0417 14:26:49.964043 3270 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0000 disconnected
> I0417 14:26:50.968178 3269 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0000
> I0417 14:26:50.968356 3269 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0000
> I0417 14:26:50.973130 3268 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0001 at [email protected]:50381
> I0417 14:26:50.973316 3268 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0001
> I0417 14:26:50.973417 3268 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0001
> I0417 14:26:50.973528 3268 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0001 disconnected
> I0417 14:26:51.978101 3267 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0001
> I0417 14:26:51.978263 3267 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0001
> I0417 14:26:51.983166 3269 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0002 at [email protected]:50381
> I0417 14:26:51.983443 3269 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0002
> I0417 14:26:51.983575 3269 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0002
> I0417 14:26:51.983705 3269 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0002 disconnected
> I0417 14:26:52.988116 3268 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0002
> I0417 14:26:52.988281 3268 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0002
> I0417 14:26:52.993340 3267 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0003 at [email protected]:50381
> I0417 14:26:52.993530 3267 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0003
> I0417 14:26:52.993625 3267 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0003
> I0417 14:26:52.993737 3267 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0003 disconnected
> I0417 14:26:53.998100 3270 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0003
> I0417 14:26:53.998241 3270 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0003
> I0417 14:26:54.006017 3269 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0004 at [email protected]:50381
> I0417 14:26:54.006289 3269 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0004
> I0417 14:26:54.006477 3269 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0004
> I0417 14:26:54.006607 3269 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0004 disconnected
> I0417 14:26:55.010612 3268 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0004
> I0417 14:26:55.010838 3268 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0004
> I0417 14:26:55.016173 3267 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0005 at [email protected]:50381
> I0417 14:26:55.016374 3267 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0005
> I0417 14:26:55.016474 3267 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0005
> I0417 14:26:55.016592 3267 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0005 disconnected
> I0417 14:26:56.018100 3269 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0005
> I0417 14:26:56.018290 3269 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0005
> I0417 14:26:56.026458 3268 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0006 at [email protected]:50381
> I0417 14:26:56.026661 3268 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0006
> I0417 14:26:56.026759 3268 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0006
> I0417 14:26:56.026876 3268 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0006 disconnected
> I0417 14:26:57.028127 3267 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0006
> I0417 14:26:57.028327 3267 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0006
> I0417 14:26:57.036485 3269 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0007 at [email protected]:50381
> I0417 14:26:57.036756 3269 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0007
> I0417 14:26:57.036861 3269 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0007
> I0417 14:26:57.036993 3269 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0007 disconnected
> I0417 14:26:58.040633 3270 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0007
> I0417 14:26:58.040828 3270 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0007
> I0417 14:26:58.046758 3267 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0008 at [email protected]:50381
> I0417 14:26:58.047071 3267 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0008
> I0417 14:26:58.047216 3267 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0008
> I0417 14:26:58.047349 3267 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0008 disconnected
> I0417 14:26:59.048115 3270 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0008
> I0417 14:26:59.048280 3270 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0008
> I0417 14:26:59.056917 3269 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0009 at [email protected]:50381
> I0417 14:26:59.057204 3269 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0009
> I0417 14:26:59.057329 3269 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0009
> I0417 14:26:59.057471 3269 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0009 disconnected
> I0417 14:27:00.061285 3268 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0009
> I0417 14:27:00.061535 3268 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0009
> I0417 14:27:00.067119 3270 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0010 at [email protected]:50381
> I0417 14:27:00.067361 3270 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0010
> I0417 14:27:00.067471 3270 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0010
> I0417 14:27:00.067586 3270 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0010 disconnected
> I0417 14:27:01.070636 3269 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0010
> I0417 14:27:01.070832 3269 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0010
> I0417 14:27:01.077214 3268 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0011 at [email protected]:50381
> I0417 14:27:01.077497 3268 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0011
> I0417 14:27:01.077616 3268 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0011
> I0417 14:27:01.077760 3268 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0011 disconnected
> I0417 14:27:02.080665 3267 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0011
> I0417 14:27:02.080870 3267 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0011
> I0417 14:27:02.087455 3270 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0012 at [email protected]:50381
> I0417 14:27:02.087746 3270 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0012
> I0417 14:27:02.087867 3270 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0012
> I0417 14:27:02.088044 3270 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0012 disconnected
> I0417 14:27:03.097745 3268 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0012
> I0417 14:27:03.097934 3268 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0012
> I0417 14:27:03.098002 3268 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0013 at [email protected]:50381
> I0417 14:27:03.098170 3268 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0013
> I0417 14:27:03.098273 3268 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0013
> I0417 14:27:03.098388 3268 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0013 disconnected
> I0417 14:27:04.107770 3270 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0013
> I0417 14:27:04.107971 3270 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0013
> I0417 14:27:04.108072 3270 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0014 at [email protected]:50381
> I0417 14:27:04.108361 3270 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0014
> I0417 14:27:04.108464 3270 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0014
> I0417 14:27:04.108542 3270 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0014 disconnected
> I0417 14:27:05.118046 3267 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0014
> I0417 14:27:05.118283 3267 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0014
> I0417 14:27:05.118834 3267 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0015 at [email protected]:50381
> I0417 14:27:05.119031 3267 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0015
> I0417 14:27:05.119133 3267 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0015
> I0417 14:27:05.119261 3267 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0015 disconnected
> I0417 14:27:06.128176 3270 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0015
> I0417 14:27:06.128389 3270 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0015
> I0417 14:27:06.128525 3270 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0016 at [email protected]:50381
> I0417 14:27:06.128854 3270 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0016
> I0417 14:27:06.129014 3270 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0016
> I0417 14:27:06.129128 3270 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0016 disconnected
> I0417 14:27:07.138417 3268 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0016
> I0417 14:27:07.138607 3268 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0016
> I0417 14:27:07.138677 3268 master.cpp:526] Registering framework 201304171425-2113820480-5050-3253-0017 at [email protected]:50381
> I0417 14:27:07.138838 3268 simple_allocator.cpp:46] Added framework 201304171425-2113820480-5050-3253-0017
> I0417 14:27:07.138934 3268 master.cpp:1188] Sending 1 offers to framework 201304171425-2113820480-5050-3253-0017
> I0417 14:27:07.139050 3268 master.cpp:430] Framework 201304171425-2113820480-5050-3253-0017 disconnected
> I0417 14:27:08.148147 3267 master.cpp:1147] Framework failover timeout, removing framework 201304171425-2113820480-5050-3253-0017
> I0417 14:27:08.148331 3267 simple_allocator.cpp:59] Removed framework 201304171425-2113820480-5050-3253-0017
>
> 2. On the slave node (192.168.1.132), we ran "bin/mesos-master.sh
> --ip=192.168.1.132 --master=192.168.1.130:5050".
>
> Log file created at: 2013/04/17 14:25:01
> Running on machine: hdfs2
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> I0417 14:25:01.151159 2430 logging.cpp:72] Logging to logs
> I0417 14:25:01.152088 2430 main.cpp:111] Creating "process" isolation module
> I0417 14:25:01.152246 2430 main.cpp:119] Build: 2013-01-22 20:32:32 by hadoop
> I0417 14:25:01.152283 2430 main.cpp:120] Starting Mesos slave
> I0417 14:25:01.152614 2444 slave.cpp:191] Slave started on 192.168.1.132:53221
> I0417 14:25:01.152660 2444 slave.cpp:192] Slave resources: cpus=4; mem=968
> I0417 14:25:01.154165 2444 slave.cpp:357] New master detected at [email protected]:5050
> I0417 14:25:01.156548 2445 slave.cpp:377] Registered with master; given slave ID 201304171425-2113820480-5050-3253-0
> I0417 14:25:01.165132 2449 webui_utils.cpp:45] Loading webui script at '/home/hadoop/mesos-0.9.0/src/webui/slave/webui.py'
> I0417 14:25:23.307472 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0000
> I0417 14:25:24.317137 2447 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0001
> I0417 14:25:25.326979 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0002
> I0417 14:25:26.336750 2444 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0003
> I0417 14:25:27.349195 2445 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0004
> I0417 14:25:28.356397 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0005
> I0417 14:25:29.366291 2447 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0006
> I0417 14:25:30.378608 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0007
> I0417 14:25:31.385884 2445 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0008
> I0417 14:25:32.398939 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0009
> I0417 14:25:33.408038 2447 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0010
> I0417 14:25:34.417917 2445 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0011
> I0417 14:25:35.434813 2444 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0012
> I0417 14:25:36.444607 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0013
> I0417 14:25:37.454766 2445 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0014
> I0417 14:25:38.464651 2446 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0015
> I0417 14:25:39.474720 2445 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0016
> I0417 14:25:40.484283 2444 slave.cpp:604] Asked to shut down framework 201304171425-2113820480-5050-3253-0017
> I0417 14:25:46.626567 2446 slave.cpp:1104] Process exited: @0.0.0.0:0
> W0417 14:25:46.626637 2446 slave.cpp:1107] WARNING! Master disconnected! Waiting for a new master to be elected.
>
> 3. On the slave node (192.168.1.132), we started the C++ test frame work
> by "src/test-framework 192.168.1.130:5050"
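
In case it helps when comparing runs with and without --ip, here is a rough Python sketch for summarizing the register/disconnect cycle in a master log. It is not part of Mesos: the names `parse_line` and `connection_lifetimes` are made up for illustration, and it assumes only the line layout stated in the log header ("[IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg") plus the two message texts seen in the quoted log ("Registering framework ..." and "Framework ... disconnected"). The two sample entries are taken from the master log above.

```python
import re

# Layout stated in the log header: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
LOG_LINE = re.compile(
    r"^(?:>\s*)?"  # tolerate a leading mail quote marker
    r"(?P<level>[IWEF])(?P<mmdd>\d{4})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\.(?P<us>\d{6})\s+"
    r"(?P<thread>\d+)\s+(?P<source>\S+:\d+)\]\s+(?P<msg>.*)$"
)
# Message texts as they appear in the quoted master log (assumed stable).
REGISTERED = re.compile(r"Registering framework (\S+)")
DISCONNECTED = re.compile(r"Framework (\S+) disconnected")


def parse_line(line):
    """Split one log line into its fields; return None for non-log lines."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    f = match.groupdict()
    # Microseconds since midnight, kept as an integer to avoid float rounding.
    seconds = int(f["h"]) * 3600 + int(f["m"]) * 60 + int(f["s"])
    return {
        "level": f["level"],
        "micros": seconds * 1_000_000 + int(f["us"]),
        "source": f["source"],
        "msg": f["msg"],
    }


def connection_lifetimes(lines):
    """Map framework ID to microseconds between registration and disconnect."""
    registered = {}
    lifetimes = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        reg = REGISTERED.match(entry["msg"])
        if reg:
            registered[reg.group(1)] = entry["micros"]
            continue
        gone = DISCONNECTED.match(entry["msg"])
        if gone and gone.group(1) in registered:
            framework_id = gone.group(1)
            lifetimes[framework_id] = entry["micros"] - registered[framework_id]
    return lifetimes


SAMPLE = [
    "I0417 14:26:49.963219 3270 master.cpp:526] Registering framework "
    "201304171425-2113820480-5050-3253-0000 at [email protected]:50381",
    "I0417 14:26:49.964043 3270 master.cpp:430] Framework "
    "201304171425-2113820480-5050-3253-0000 disconnected",
]

if __name__ == "__main__":
    for framework_id, micros in connection_lifetimes(SAMPLE).items():
        print(framework_id, micros, "us")
```

For the sample pair this reports 824 microseconds between registration and disconnect, so the master drops the framework almost immediately after sending it the first offer. Running the same thing over a full log from a run without --ip would show whether that pattern changes.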
