Hi all,
We failed to run a job from a slave node on a Mesos cluster, although we
think Mesos should support this. We set up a small Mesos cluster with
only two nodes: one master and one slave. When we ran the C++ test
framework on the slave node, the framework kept registering with the
master and then disconnecting right away. When we ran the same C++ test
framework on the master node, everything worked. The logs are included
below. Any help is appreciated.
Regards,
Xiaoying
1. On the master node (192.168.1.130), we ran "bin/mesos-master.sh
--ip=192.168.1.130".
Log file created at: 2013/04/17 14:25:53
Running on machine: master
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0417 14:25:53.069144 3253 logging.cpp:72] Logging to logs
I0417 14:25:53.070135 3253 main.cpp:105] Build: 2013-01-22 20:32:32 by
hadoop
I0417 14:25:53.070175 3253 main.cpp:106] Starting Mesos master
I0417 14:25:53.070313 3268 master.cpp:268] Master started on
192.168.1.130:5050
I0417 14:25:53.070438 3268 master.cpp:283] Master ID:
201304171425-2113820480-5050-3253
I0417 14:25:53.071903 3268 master.cpp:483] Elected as master!
I0417 14:25:53.083060 3272 webui_utils.cpp:45] Loading webui script at
'/home/hadoop/mesos-0.9.0/src/webui/master/webui.py'
I0417 14:26:28.812635 3268 master.cpp:844] Attempting to register slave
201304171425-2113820480-5050-3253-0 at [email protected]:53221
I0417 14:26:28.812717 3268 master.cpp:1097] Master now considering a
slave at hdfs2:53221 as active
I0417 14:26:28.812746 3268 master.cpp:1633] Adding slave
201304171425-2113820480-5050-3253-0 at hdfs2 with cpus=4; mem=968
I0417 14:26:28.813254 3268 simple_allocator.cpp:69] Added slave
201304171425-2113820480-5050-3253-0 with cpus=4; mem=968
I0417 14:26:49.963219 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0000 at [email protected]:50381
I0417 14:26:49.963736 3270 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0000
I0417 14:26:49.963889 3270 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0000
I0417 14:26:49.964043 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0000 disconnected
I0417 14:26:50.968178 3269 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0000
I0417 14:26:50.968356 3269 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0000
I0417 14:26:50.973130 3268 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0001 at [email protected]:50381
I0417 14:26:50.973316 3268 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0001
I0417 14:26:50.973417 3268 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0001
I0417 14:26:50.973528 3268 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0001 disconnected
I0417 14:26:51.978101 3267 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0001
I0417 14:26:51.978263 3267 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0001
I0417 14:26:51.983166 3269 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0002 at [email protected]:50381
I0417 14:26:51.983443 3269 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0002
I0417 14:26:51.983575 3269 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0002
I0417 14:26:51.983705 3269 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0002 disconnected
I0417 14:26:52.988116 3268 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0002
I0417 14:26:52.988281 3268 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0002
I0417 14:26:52.993340 3267 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0003 at [email protected]:50381
I0417 14:26:52.993530 3267 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0003
I0417 14:26:52.993625 3267 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0003
I0417 14:26:52.993737 3267 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0003 disconnected
I0417 14:26:53.998100 3270 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0003
I0417 14:26:53.998241 3270 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0003
I0417 14:26:54.006017 3269 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0004 at [email protected]:50381
I0417 14:26:54.006289 3269 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0004
I0417 14:26:54.006477 3269 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0004
I0417 14:26:54.006607 3269 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0004 disconnected
I0417 14:26:55.010612 3268 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0004
I0417 14:26:55.010838 3268 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0004
I0417 14:26:55.016173 3267 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0005 at [email protected]:50381
I0417 14:26:55.016374 3267 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0005
I0417 14:26:55.016474 3267 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0005
I0417 14:26:55.016592 3267 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0005 disconnected
I0417 14:26:56.018100 3269 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0005
I0417 14:26:56.018290 3269 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0005
I0417 14:26:56.026458 3268 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0006 at [email protected]:50381
I0417 14:26:56.026661 3268 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0006
I0417 14:26:56.026759 3268 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0006
I0417 14:26:56.026876 3268 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0006 disconnected
I0417 14:26:57.028127 3267 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0006
I0417 14:26:57.028327 3267 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0006
I0417 14:26:57.036485 3269 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0007 at [email protected]:50381
I0417 14:26:57.036756 3269 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0007
I0417 14:26:57.036861 3269 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0007
I0417 14:26:57.036993 3269 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0007 disconnected
I0417 14:26:58.040633 3270 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0007
I0417 14:26:58.040828 3270 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0007
I0417 14:26:58.046758 3267 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0008 at [email protected]:50381
I0417 14:26:58.047071 3267 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0008
I0417 14:26:58.047216 3267 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0008
I0417 14:26:58.047349 3267 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0008 disconnected
I0417 14:26:59.048115 3270 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0008
I0417 14:26:59.048280 3270 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0008
I0417 14:26:59.056917 3269 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0009 at [email protected]:50381
I0417 14:26:59.057204 3269 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0009
I0417 14:26:59.057329 3269 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0009
I0417 14:26:59.057471 3269 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0009 disconnected
I0417 14:27:00.061285 3268 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0009
I0417 14:27:00.061535 3268 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0009
I0417 14:27:00.067119 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0010 at [email protected]:50381
I0417 14:27:00.067361 3270 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0010
I0417 14:27:00.067471 3270 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0010
I0417 14:27:00.067586 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0010 disconnected
I0417 14:27:01.070636 3269 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0010
I0417 14:27:01.070832 3269 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0010
I0417 14:27:01.077214 3268 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0011 at [email protected]:50381
I0417 14:27:01.077497 3268 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0011
I0417 14:27:01.077616 3268 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0011
I0417 14:27:01.077760 3268 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0011 disconnected
I0417 14:27:02.080665 3267 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0011
I0417 14:27:02.080870 3267 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0011
I0417 14:27:02.087455 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0012 at [email protected]:50381
I0417 14:27:02.087746 3270 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0012
I0417 14:27:02.087867 3270 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0012
I0417 14:27:02.088044 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0012 disconnected
I0417 14:27:03.097745 3268 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0012
I0417 14:27:03.097934 3268 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0012
I0417 14:27:03.098002 3268 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0013 at [email protected]:50381
I0417 14:27:03.098170 3268 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0013
I0417 14:27:03.098273 3268 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0013
I0417 14:27:03.098388 3268 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0013 disconnected
I0417 14:27:04.107770 3270 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0013
I0417 14:27:04.107971 3270 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0013
I0417 14:27:04.108072 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0014 at [email protected]:50381
I0417 14:27:04.108361 3270 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0014
I0417 14:27:04.108464 3270 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0014
I0417 14:27:04.108542 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0014 disconnected
I0417 14:27:05.118046 3267 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0014
I0417 14:27:05.118283 3267 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0014
I0417 14:27:05.118834 3267 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0015 at [email protected]:50381
I0417 14:27:05.119031 3267 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0015
I0417 14:27:05.119133 3267 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0015
I0417 14:27:05.119261 3267 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0015 disconnected
I0417 14:27:06.128176 3270 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0015
I0417 14:27:06.128389 3270 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0015
I0417 14:27:06.128525 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0016 at [email protected]:50381
I0417 14:27:06.128854 3270 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0016
I0417 14:27:06.129014 3270 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0016
I0417 14:27:06.129128 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0016 disconnected
I0417 14:27:07.138417 3268 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0016
I0417 14:27:07.138607 3268 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0016
I0417 14:27:07.138677 3268 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0017 at [email protected]:50381
I0417 14:27:07.138838 3268 simple_allocator.cpp:46] Added framework
201304171425-2113820480-5050-3253-0017
I0417 14:27:07.138934 3268 master.cpp:1188] Sending 1 offers to
framework 201304171425-2113820480-5050-3253-0017
I0417 14:27:07.139050 3268 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0017 disconnected
I0417 14:27:08.148147 3267 master.cpp:1147] Framework failover timeout,
removing framework 201304171425-2113820480-5050-3253-0017
I0417 14:27:08.148331 3267 simple_allocator.cpp:59] Removed framework
201304171425-2113820480-5050-3253-0017
2. On the slave node (192.168.1.132), we ran "bin/mesos-slave.sh
--ip=192.168.1.132 --master=192.168.1.130:5050".
Log file created at: 2013/04/17 14:25:01
Running on machine: hdfs2
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0417 14:25:01.151159 2430 logging.cpp:72] Logging to logs
I0417 14:25:01.152088 2430 main.cpp:111] Creating "process" isolation
module
I0417 14:25:01.152246 2430 main.cpp:119] Build: 2013-01-22 20:32:32 by
hadoop
I0417 14:25:01.152283 2430 main.cpp:120] Starting Mesos slave
I0417 14:25:01.152614 2444 slave.cpp:191] Slave started on
192.168.1.132:53221
I0417 14:25:01.152660 2444 slave.cpp:192] Slave resources: cpus=4; mem=968
I0417 14:25:01.154165 2444 slave.cpp:357] New master detected at
[email protected]:5050
I0417 14:25:01.156548 2445 slave.cpp:377] Registered with master; given
slave ID 201304171425-2113820480-5050-3253-0
I0417 14:25:01.165132 2449 webui_utils.cpp:45] Loading webui script at
'/home/hadoop/mesos-0.9.0/src/webui/slave/webui.py'
I0417 14:25:23.307472 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0000
I0417 14:25:24.317137 2447 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0001
I0417 14:25:25.326979 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0002
I0417 14:25:26.336750 2444 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0003
I0417 14:25:27.349195 2445 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0004
I0417 14:25:28.356397 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0005
I0417 14:25:29.366291 2447 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0006
I0417 14:25:30.378608 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0007
I0417 14:25:31.385884 2445 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0008
I0417 14:25:32.398939 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0009
I0417 14:25:33.408038 2447 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0010
I0417 14:25:34.417917 2445 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0011
I0417 14:25:35.434813 2444 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0012
I0417 14:25:36.444607 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0013
I0417 14:25:37.454766 2445 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0014
I0417 14:25:38.464651 2446 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0015
I0417 14:25:39.474720 2445 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0016
I0417 14:25:40.484283 2444 slave.cpp:604] Asked to shut down framework
201304171425-2113820480-5050-3253-0017
I0417 14:25:46.626567 2446 slave.cpp:1104] Process exited: @0.0.0.0:0
W0417 14:25:46.626637 2446 slave.cpp:1107] WARNING! Master
disconnected! Waiting for a new master to be elected.
3. On the slave node (192.168.1.132), we started the C++ test framework
with "src/test-framework 192.168.1.130:5050".
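
To summarize the master log from step 1, here is a rough sketch of a script that measures how long each framework registration lasted before the master logged it as disconnected. It is only an illustration: the function names (join_wrapped, registration_lifetimes) are made up for this post, the regular expressions are guesses based on the message text shown above, and it assumes the log lines are wrapped the same way as in this mail.

```python
import re
from datetime import datetime

# glog prefix as described in the log header:
# [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
PREFIX = re.compile(r"^[IWEF]\d{4} ")
ENTRY = re.compile(r"^[IWEF]\d{4} (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ \S+\] (.*)$")


def join_wrapped(lines):
    """Re-join log entries that were wrapped onto a second line."""
    entries = []
    for line in lines:
        line = line.rstrip("\n")
        if PREFIX.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line.strip()
    return entries


def registration_lifetimes(lines):
    """Map framework ID -> seconds between 'Registering' and 'disconnected'."""
    registered = {}
    lifetimes = {}
    for entry in join_wrapped(lines):
        m = ENTRY.match(entry)
        if not m:
            continue  # header lines such as "Log file created at: ..."
        ts = datetime.strptime(m.group(1), "%H:%M:%S.%f")
        msg = m.group(2)
        reg = re.match(r"Registering framework (\S+)", msg)
        dis = re.match(r"Framework (\S+) disconnected", msg)
        if reg:
            registered[reg.group(1)] = ts
        elif dis and dis.group(1) in registered:
            delta = ts - registered[dis.group(1)]
            lifetimes[dis.group(1)] = delta.total_seconds()
    return lifetimes


# First register/disconnect cycle, copied from the master log above.
SAMPLE = """\
I0417 14:26:49.963219 3270 master.cpp:526] Registering framework
201304171425-2113820480-5050-3253-0000 at [email protected]:50381
I0417 14:26:49.964043 3270 master.cpp:430] Framework
201304171425-2113820480-5050-3253-0000 disconnected
""".splitlines()

if __name__ == "__main__":
    for framework_id, seconds in registration_lifetimes(SAMPLE).items():
        print(framework_id, seconds)
```

For the first cycle in the master log, the two timestamps are 14:26:49.963219 and 14:26:49.964043, so the framework was marked disconnected less than a millisecond after it registered.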