OK, I think I figured out why the client dropped out. When I attached the slave to the master, I used the fully qualified domain name (e.g. testmesos.test.com). Once it had attached, the instance detail on the server showed the private IP rather than the public IP. Is the communication between master and slave going over the private IP? Is there a way to make it use the public IP?

Thanks,
Scott

On Mon, Oct 1, 2012 at 8:54 PM, Vinod Kone <[email protected]> wrote:

> Hmm, nothing jumps out from the logs as suspicious. But the 75s from slave
> registration to its eventual removal corresponds with the health check
> timeout (15s * 5 retries). I would suggest running 'top' on the master and
> slave machines to see if either of them are loaded.
>
> @vinodkone
>
>
> On Mon, Oct 1, 2012 at 8:30 PM, Scott Wang <
> [email protected]> wrote:
>
>> Vinod,
>>
>> For sure they can connect with each other. I can see the slave on the
>> webUI for about a minute or so then it just dropped out of cluster.
>> Enclosed are the slave output and master output in that one minute.
>> Maybe you can spot something that I am not doing right.
>>
>> Thanks,
>> Scott
>>
>> On Mon, Oct 1, 2012 at 7:49 PM, Vinod Kone <[email protected]> wrote:
>> > Looks like the slave is not responding to health checks from the master.
>> > Is the network connection from master-->slave alright? is the machine
>> > hosting slave is cpu starved? Those are some of the things, I would
>> > check for.
>> >
>> > @vinodkone
>> >
>> >
>> > On Mon, Oct 1, 2012 at 6:04 PM, Scott Wang <
>> > [email protected]> wrote:
>> >
>> >> I am trying to setup a small cluster a master and a slave but I am
>> >> getting the following output and the slave just terminated.
>> >>
>> >> ------------------------- Slave output -------------------------
>> >> I1002 01:00:02.868795 27122 slave.cpp:1160] Current disk usage 2.09%. Max allowed age: 6.85days
>> >> I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down
>> >> I1002 01:00:17.974792 27123 slave.cpp:313] Slave terminating
>> >>
>> >> ------------------------- Master output ------------------------
>> >> W1002 01:00:17.967651 11433 master.cpp:1173] Removing slave 201210020057-1994437898-5050-11419-1 at itvm638:34013 because it has been deactivated
>> >> I1002 01:00:17.968174 11433 master.cpp:1182] Master now considering a slave at itvm638:34013 as inactive
>> >> I1002 01:00:17.968328 11435 hierarchical_allocator_process.hpp:371] Removed slave 201210020057-1994437898-5050-11419-1
>> >>
>> >> Does anyone have any idea what I should do to prevent the slave going
>> >> down by itself.
>> >>
>> >> Thanks,
>> >> Scott
>> >>
>>
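
For reference, the timing Vinod describes above (15s * 5 retries = 75s) can be checked against the master log with a small sketch. This is only an illustration of the arithmetic: the constant and function names are made up for the example and are not Mesos identifiers, and the year 2012 is taken from the message dates because the log prefix carries only month and day.

```python
from datetime import datetime, timedelta

# Values quoted in the thread; the names are hypothetical, not Mesos settings.
PING_INTERVAL_S = 15
MAX_MISSED_PINGS = 5


def health_check_timeout(interval_s=PING_INTERVAL_S, retries=MAX_MISSED_PINGS):
    """Total time the master waits before removing an unresponsive slave."""
    return timedelta(seconds=interval_s * retries)


# Timestamp of the "Removing slave ..." line in the master output above.
# The log prefix is MMDD, so the year from the email dates is prepended.
removed_at = datetime.strptime("2012" + "1002 01:00:17.967651",
                               "%Y%m%d %H:%M:%S.%f")

# If the slave never answered a single ping, it would have registered
# roughly one full timeout before it was removed.
estimated_registration = removed_at - health_check_timeout()

print("timeout:", health_check_timeout())
print("estimated registration:", estimated_registration.time())
```

If the slave shows up in the webUI at about that estimated time, it would be consistent with the master's pings never reaching the slave at all, which fits the private/public IP question at the top of this message.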
