Currently both machines are newly provisioned in our private cloud, and nothing is actually running. Is there anything I can check/do to help diagnose the problem?
Thanks,
Scott

On Mon, Oct 1, 2012 at 8:54 PM, Vinod Kone <[email protected]> wrote:
> Hmm, nothing jumps out from the logs as suspicious. But the 75s from slave
> registration to its eventual removal corresponds with the health check
> timeout (15s * 5 retries). I would suggest running 'top' on the master and
> slave machines to see if either of them are loaded.
>
> @vinodkone
>
>
> On Mon, Oct 1, 2012 at 8:30 PM, Scott Wang <[email protected]> wrote:
>
>> Vinod,
>>
>> For sure they can connect with each other. I can see the slave on the
>> webUI for about a minute or so then it just dropped out of cluster.
>> Enclosed are the slave output and master output in that one minute.
>> Maybe you can spot something that I am not doing right.
>>
>> Thanks,
>> Scott
>>
>> On Mon, Oct 1, 2012 at 7:49 PM, Vinod Kone <[email protected]> wrote:
>> > Looks like the slave is not responding to health checks from the master.
>> > Is the network connection from master-->slave alright? is the machine
>> > hosting slave is cpu starved? Those are some of the things, I would
>> > check for.
>> >
>> > @vinodkone
>> >
>> >
>> > On Mon, Oct 1, 2012 at 6:04 PM, Scott Wang <[email protected]> wrote:
>> >
>> >> I am trying to setup a small cluster a master and a slave but I am
>> >> getting the following output and the slave just terminated.
>> >>
>> >> ---------------- Slave output ----------------
>> >> I1002 01:00:02.868795 27122 slave.cpp:1160] Current disk usage 2.09%. Max allowed age: 6.85days
>> >> I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down
>> >> I1002 01:00:17.974792 27123 slave.cpp:313] Slave terminating
>> >>
>> >> ---------------- Master output ----------------
>> >> W1002 01:00:17.967651 11433 master.cpp:1173] Removing slave 201210020057-1994437898-5050-11419-1 at itvm638:34013 because it has been deactivated
>> >> I1002 01:00:17.968174 11433 master.cpp:1182] Master now considering a slave at itvm638:34013 as inactive
>> >> I1002 01:00:17.968328 11435 hierarchical_allocator_process.hpp:371] Removed slave 201210020057-1994437898-5050-11419-1
>> >>
>> >> Does anyone have any idea what I should do to prevent the slave going
>> >> down by itself.
>> >>
>> >> Thanks,
>> >> Scott
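
For reference, the timeout arithmetic mentioned in the thread (15s * 5 retries) and the gap between two of the quoted slave log lines can be checked with a short Python sketch. The function and constant names below are hypothetical and only illustrate the calculation; they are not Mesos code, and the timestamp parser assumes glog-style lines from the same day.

```python
from datetime import datetime

# Values quoted in the thread: 15s per health check, 5 retries.
PING_INTERVAL_SECS = 15   # hypothetical name, value taken from the thread
MAX_RETRIES = 5           # hypothetical name, value taken from the thread


def health_check_timeout(interval_secs, retries):
    """Total time before a silent slave would be given up on."""
    return interval_secs * retries


def glog_time(line):
    """Parse the HH:MM:SS.ffffff field (second token) of a glog-style line."""
    return datetime.strptime(line.split()[1], "%H:%M:%S.%f")


def seconds_between(first, second):
    """Elapsed seconds between two log lines (assumes no midnight rollover)."""
    return (glog_time(second) - glog_time(first)).total_seconds()


timeout = health_check_timeout(PING_INTERVAL_SECS, MAX_RETRIES)

# Two lines copied from the slave output quoted in the thread.
disk_usage_line = "I1002 01:00:02.868795 27122 slave.cpp:1160] Current disk usage 2.09%."
shutdown_line = "I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down"
gap = seconds_between(disk_usage_line, shutdown_line)

print("health check timeout (s):", timeout)
print("gap between the two slave log lines (s):", gap)
```

Feeding the registration and removal lines from a fuller master log into `seconds_between` would show whether the slave's lifetime really lines up with the computed timeout.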
