Hmm, nothing in the logs jumps out as suspicious. But the 75s from slave
registration to its eventual removal matches the health check timeout
(15s * 5 retries). I would suggest running 'top' on the master and slave
machines to see if either of them is heavily loaded.
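
To make the timeout arithmetic concrete, here is a rough Python sketch. It
assumes only the numbers quoted above (15s per health check, 5 retries); the
function and constant names are made up for illustration and are not part of
Mesos. It also shows one way to measure the gap between two glog-style log
lines, assuming both were written on the same day.

```python
from datetime import datetime

# Values quoted in this thread; the names are hypothetical, not Mesos flags.
PING_INTERVAL_SECS = 15
MAX_RETRIES = 5


def health_check_timeout(interval_secs=PING_INTERVAL_SECS, retries=MAX_RETRIES):
    """Total time a slave may stay silent before the master removes it."""
    return interval_secs * retries


def glog_time(line):
    """Parse the time of day from a glog line such as
    'I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down'."""
    fields = line.split()
    return datetime.strptime(fields[1], "%H:%M:%S.%f")


def elapsed_secs(first_line, second_line):
    """Seconds between two log lines (assumes no midnight rollover)."""
    return (glog_time(second_line) - glog_time(first_line)).total_seconds()


if __name__ == "__main__":
    print("health check timeout:", health_check_timeout(), "seconds")
```

Feeding the slave's registration line and the master's "Removing slave" line
to elapsed_secs() and comparing the result with health_check_timeout() is a
quick way to confirm that the removal is caused by missed health checks.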

@vinodkone


On Mon, Oct 1, 2012 at 8:30 PM, Scott Wang <
[email protected]> wrote:

> Vinod,
>
> They can definitely connect to each other.  I can see the slave on the
> web UI for about a minute or so, then it just drops out of the cluster.
> Enclosed are the slave output and master output from that one minute.
> Maybe you can spot something that I am not doing right.
>
> Thanks,
> Scott
>
> On Mon, Oct 1, 2012 at 7:49 PM, Vinod Kone <[email protected]> wrote:
> > Looks like the slave is not responding to health checks from the master.
> > Is the network connection from master-->slave alright? Is the machine
> > hosting the slave CPU starved? Those are some of the things I would
> > check.
> >
> > @vinodkone
> >
> >
> > On Mon, Oct 1, 2012 at 6:04 PM, Scott Wang <
> > [email protected]> wrote:
> >
> >> I am trying to set up a small cluster with one master and one slave,
> >> but I am getting the following output and the slave just terminates.
> >>
> >> ------------------------ Slave output ------------------------
> >> I1002 01:00:02.868795 27122 slave.cpp:1160] Current disk usage 2.09%.
> >> Max allowed age: 6.85days
> >> I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down
> >> I1002 01:00:17.974792 27123 slave.cpp:313] Slave terminating
> >>
> >>
> >> ------------------------ Master output ------------------------
> >> W1002 01:00:17.967651 11433 master.cpp:1173] Removing slave
> >> 201210020057-1994437898-5050-11419-1 at itvm638:34013 because it has
> >> been deactivated
> >> I1002 01:00:17.968174 11433 master.cpp:1182] Master now considering a
> >> slave at itvm638:34013 as inactive
> >> I1002 01:00:17.968328 11435 hierarchical_allocator_process.hpp:371]
> >> Removed slave 201210020057-1994437898-5050-11419-1
> >>
> >> Does anyone have any idea what I should do to prevent the slave from
> >> going down by itself?
> >>
> >> Thanks,
> >> Scott
> >>
>
