Also, can you try starting the slave with its IP address specified manually via the "--ip" flag? That seems to have done the trick for other people on this list in a similar situation.
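For context on why "--ip" might help: in the master output quoted in this thread, the slave registers as slave(1)@127.0.1.1:57599, which is a loopback address rather than the 33.33.13.39 address John mentions. Below is a rough Python sketch for spotting that pattern in a log line and for checking what the local hostname resolves to. The helper names (advertised_address, is_loopback, hostname_resolves_to_loopback) are made up for illustration and are not part of Mesos, and whether the loopback address is the actual cause here is an assumption, not something confirmed in the thread.

```python
import ipaddress
import re
import socket

# Matches the "@ip:port" tail of a PID such as slave(1)@127.0.1.1:57599.
PID_RE = re.compile(r"@(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def advertised_address(log_line):
    """Return (ip, port) for the first "@ip:port" found in log_line, or None."""
    match = PID_RE.search(log_line)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def is_loopback(ip):
    """True if ip falls in a loopback range (127.0.0.0/8 for IPv4)."""
    return ipaddress.ip_address(ip).is_loopback


def hostname_resolves_to_loopback(hostname=None):
    """True if the given (or local) hostname resolves to a loopback address.

    Assumption: a process that picks its address by resolving the hostname
    would end up advertising this address to its peers.
    """
    hostname = hostname or socket.gethostname()
    return is_loopback(socket.gethostbyname(hostname))


if __name__ == "__main__":
    # Log line taken from the master output quoted in this thread.
    line = ("I0416 10:09:02.099568 2038 master.cpp:906] Attempting to register "
            "slave on vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599")
    ip, port = advertised_address(line)
    print(ip, port, "loopback" if is_loopback(ip) else "routable")
```

If hostname_resolves_to_loopback() returns True on the slave VM, that would be consistent with the 127.0.1.1 address in the logs, and passing the VM's routable address via "--ip" is the thing to try.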

On Wed, Apr 17, 2013 at 12:26 PM, Benjamin Mahler <[email protected]> wrote:

> In terms of the connectivity issue, can you re-run with GLOG_v=2 and report
> back?
>
>
> On Tue, Apr 16, 2013 at 6:41 PM, Vinod Kone <[email protected]> wrote:
>
> > On Tue, Apr 16, 2013 at 6:41 PM, Vinod Kone <[email protected]> wrote:
> >
> > > Hi John,
> > >
> > > You seem to have hit a couple of known issues:
> > > https://issues.apache.org/jira/browse/MESOS-300
> > > https://issues.apache.org/jira/browse/MESOS-435
> > >
> > > Unfortunately, we haven't been able to reproduce these bugs consistently
> > > on our end, so we were never able to find the root cause and fix :/ Please
> > > add your data to the above tickets, so that we can diagnose/fix these.
> > >
> > > @vinodkone
> > >
> > >
> > > On Tue, Apr 16, 2013 at 6:21 AM, John B. Wyatt IV <[email protected]> wrote:
> > >
> > >> Greetings,
> > >>
> > >> I've been spending some time trying to get the Mesos up and running on
> > >> Vagrant (a nice frontend for headless Virtualbox). I have the master setup
> > >> locally on 33.33.13.38:5050 and one slave setup on 33.33.13.39:5050. There
> > >> able to communicate with each other and the web display on the master
> > >> works. The problem is that the master keeps adding and removing the slave
> > >> or just segfaults sometimes. The web interface doesn't register the slave
> > >> (maybe removed too quickly?). I'm not too sure what to do at this point and
> > >> I was hoping for some help. I'm using Mesos 0.10.
> > >>
> > >> Here is the output from the master:
> > >>
> > >> I0416 10:09:01.794397 2040 dominant_share_allocator.cpp:417] Performed allocation for 0 slaves in 0.018916 milliseconds
> > >> I0416 10:09:02.099568 2038 master.cpp:906] Attempting to register slave on vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599
> > >> I0416 10:09:02.100764 2038 master.cpp:1142] Master now considering a slave at vagrant-ubuntu.vagrantup.com:57599 as active
> > >> I0416 10:09:02.101080 2038 master.cpp:1721] Adding slave 201304161008-16842879-5050-2023-56 at vagrant-ubuntu.vagrantup.com with cpus=2; mem=979; ports=[31000-32000]
> > >> I0416 10:09:02.104706 2038 master.cpp:513] Slave 201304161008-16842879-5050-2023-56(vagrant-ubuntu.vagrantup.com) disconnected
> > >> I0416 10:09:02.105237 2037 dominant_share_allocator.cpp:244] Added slave 201304161008-16842879-5050-2023-56 (vagrant-ubuntu.vagrantup.com) with cpus=2; mem=979; ports=[31000-32000] (and cpus=2; mem=979; ports=[31000-32000] available)
> > >> I0416 10:09:02.105865 2037 dominant_share_allocator.cpp:435] Performed allocation for slave 201304161008-16842879-5050-2023-56 in 0.011817 milliseconds
> > >> I0416 10:09:02.106258 2037 dominant_share_allocator.cpp:269] Removed slave 201304161008-16842879-5050-2023-56
> > >> I0416 10:09:02.797294 2038 dominant_share_allocator.cpp:417] Performed allocation for 0 slaves in 0.017615 milliseconds
> > >> I0416 10:09:03.101245 2040 master.cpp:906] Attempting to register slave on vagrant-ubuntu.vagrantup.com at slave(1)@127.0.1.1:57599
> > >> I0416 10:09:03.102088 2040 master.cpp:1142] Master now considering a slave at vagrant-ubuntu.vagrantup.com:57599 as active
> > >> I0416 10:09:03.103230 2040 master.cpp:1721] Adding slave 201304161008-16842879-5050-2023-57 at vagrant-ubuntu.vagrantup.com with cpus=2; mem=979; ports=[31000-32000]
> > >> I0416 10:09:03.106045 2040 master.cpp:513] Slave 201304161008-16842879-5050-2023-57(vagrant-ubuntu.vagrantup.com) disconnected
> > >> I0416 10:09:03.106202 2039 dominant_share_allocator.cpp:244] Added slave 201304161008-16842879-5050-2023-57 (vagrant-ubuntu.vagrantup.com) with cpus=2; mem=979; ports=[31000-32000] (and cpus=2; mem=979; ports=[31000-32000] available)
> > >> I0416 10:09:03.107240 2039 dominant_share_allocator.cpp:435] Performed allocation for slave 201304161008-16842879-5050-2023-57 in 0.011276 milliseconds
> > >> I0416 10:09:03.107650 2039 dominant_share_allocator.cpp:269] Removed slave 201304161008-16842879-5050-2023-57
> > >> I0416 10:09:03.799612 2040 dominant_share_allocator.cpp:417] Performed allocation for 0 slaves in 0.024916 milliseconds
> > >>
> > >> Here is the output from the slave:
> > >> I0416 10:19:46.207093 1867 main.cpp:123] Creating "process" isolation module
> > >> I0416 10:19:46.209199 1867 main.cpp:131] Build: 2013-04-16 07:41:31 by vagrant
> > >> I0416 10:19:46.209410 1867 main.cpp:132] Starting Mesos slave
> > >> I0416 10:19:46.210247 1883 slave.cpp:175] Slave started on 1)@127.0.1.1:56701
> > >> I0416 10:19:46.210842 1883 slave.cpp:176] Slave resources: cpus=2; mem=979; ports=[31000-32000]
> > >> I0416 10:19:46.213693 1883 slave.cpp:352] New master detected at [email protected]:5050
> > >> Loading webui script at '/home/vagrant/mesos-0.10.0/src/webui/slave/webui.py'
> > >> Bottle server starting up (using WSGIRefServer())...
> > >> Listening on http://0.0.0.0:8081/
> > >> Use Ctrl-C to quit.
> > >>
> > >> Sometimes the master just quits
> > >>
> > >> master:
> > >> I0416 10:19:58.244128 2545 master.cpp:513] Slave 201304161019-16842879-5050-2531-12(vagrant-ubuntu.vagrantup.com) disconnected
> > >> I0416 10:19:58.245954 2545 dominant_share_allocator.cpp:269] Removed slave 201304161019-16842879-5050-2531-12
> > >> F0416 10:19:58.719403 2549 process.cpp:1828] Check failed: outgoing.count(s) > 0
> > >> *** Check failure stack trace: ***
> > >>     @ 0x7f554933c0ad google::LogMessage::Fail()
> > >>     @ 0x7f554933e83f google::LogMessage::SendToLog()
> > >>     @ 0x7f554933bcab google::LogMessage::Flush()
> > >>     @ 0x7f554933f0cd google::LogMessageFatal::~LogMessageFatal()
> > >>     @ 0x7f5549227484 process::SocketManager::next()
> > >>     @ 0x7f55492216bf process::send_data()
> > >>     @ 0x7f554937b9df ev_invoke_pending
> > >>     @ 0x7f554937fd14 ev_loop
> > >>     @ 0x7f554922292c process::serve()
> > >>     @ 0x7f5548a9ae9a start_thread
> > >>     @ 0x7f5547fb5cbd (unknown)
> > >>
> > >>
> > >> Additional from slave:
> > >> I0416 10:19:58.808632 1884 slave.cpp:1141] Process exited: @0.0.0.0:0
> > >> W0416 10:19:58.808785 1884 slave.cpp:1144] WARNING! Master disconnected! Waiting for a new master to be elected.
> > >>
> > >>
> > >> --
> > >> John
