Currently both machines are newly provisioned in our private cloud,
and nothing is actually running. Is there anything I can check or do to
help diagnose the problem?

Thanks,
Scott
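
A minimal sketch of the two checks suggested in the quoted replies below
(master-to-slave reachability and machine load). Python is an assumption
here, since the thread itself only mentions 'top'. The helper names
can_connect and load_per_cpu are hypothetical. The host and port in the
usage comment are taken from the master log in the original message; the
port may differ on each slave run, so read it from the current master log.

```python
import os
import socket


def can_connect(host, port, timeout_s=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout_s."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def load_per_cpu():
    """Return the 1-minute load average divided by the number of CPUs (Unix only)."""
    one_min, _, _ = os.getloadavg()
    return one_min / (os.cpu_count() or 1)


# Example usage, run on the master while the slave is still registered:
#   can_connect("itvm638", 34013)   # host:port as printed in the master log
#   load_per_cpu()                  # values well above 1.0 suggest a loaded machine
```

This only shows whether the slave's port accepts connections from the
master host and how busy the machine is. It does not exercise the
master's own health-check messages.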

On Mon, Oct 1, 2012 at 8:54 PM, Vinod Kone <[email protected]> wrote:
> Hmm, nothing jumps out from the logs as suspicious. But the 75s from slave
> registration to its eventual removal corresponds with the health check
> timeout (15s * 5 retries). I would suggest running 'top' on the master and
> slave machines to see if either of them is loaded.
>
> @vinodkone
>
>
> On Mon, Oct 1, 2012 at 8:30 PM, Scott Wang <
> [email protected]> wrote:
>
>> Vinod,
>>
>> They can definitely connect to each other. I can see the slave on the
>> web UI for about a minute or so, then it just drops out of the cluster.
>> Enclosed are the slave output and master output from that one minute.
>> Maybe you can spot something that I am not doing right.
>>
>> Thanks,
>> Scott
>>
>> On Mon, Oct 1, 2012 at 7:49 PM, Vinod Kone <[email protected]> wrote:
>> > Looks like the slave is not responding to health checks from the master.
>> > Is the network connection from master-->slave alright? Is the machine
>> > hosting the slave CPU starved? Those are some of the things I would check.
>> >
>> > @vinodkone
>> >
>> >
>> > On Mon, Oct 1, 2012 at 6:04 PM, Scott Wang <
>> > [email protected]> wrote:
>> >
>> >> I am trying to set up a small cluster (a master and a slave), but I am
>> >> getting the following output and the slave just terminates.
>> >>
>> >> ------------------------ Slave output ------------------------
>> >> I1002 01:00:02.868795 27122 slave.cpp:1160] Current disk usage 2.09%. Max allowed age: 6.85days
>> >> I1002 01:00:17.974647 27123 slave.cpp:335] Slave asked to shut down
>> >> I1002 01:00:17.974792 27123 slave.cpp:313] Slave terminating
>> >>
>> >>
>> >> ------------------------ Master output ------------------------
>> >> W1002 01:00:17.967651 11433 master.cpp:1173] Removing slave 201210020057-1994437898-5050-11419-1 at itvm638:34013 because it has been deactivated
>> >> I1002 01:00:17.968174 11433 master.cpp:1182] Master now considering a slave at itvm638:34013 as inactive
>> >> I1002 01:00:17.968328 11435 hierarchical_allocator_process.hpp:371] Removed slave 201210020057-1994437898-5050-11419-1
>> >>
>> >> Does anyone have any idea what I should do to prevent the slave from
>> >> going down by itself?
>> >>
>> >> Thanks,
>> >> Scott
>> >>
>>
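
The 75s figure in the thread is the product of the quoted health-check
parameters (15s * 5 retries). The rough sketch below reproduces that
arithmetic and measures the gap between two log lines, so the timing can
be compared against it. It assumes the lines use the usual glog prefix
(severity letter, mmdd, hh:mm:ss.uuuuuu, thread id, file:line]). The
names are hypothetical, and the year is passed in because the log prefix
does not carry one.

```python
import re
from datetime import datetime

# Values quoted in the thread: 15s per health check, 5 retries.
HEALTH_CHECK_INTERVAL_S = 15
HEALTH_CHECK_RETRIES = 5
HEALTH_CHECK_TIMEOUT_S = HEALTH_CHECK_INTERVAL_S * HEALTH_CHECK_RETRIES  # 75

# Assumed glog-style prefix: "I1002 01:00:17.974647 27123 slave.cpp:335] msg"
GLOG_RE = re.compile(
    r"^[IWEF](\d{2})(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{6}) +\d+ [^:]+:\d+\] "
)


def parse_glog_time(line, year=2012):
    """Return the timestamp of a glog-style line, or None if it does not match."""
    m = GLOG_RE.match(line)
    if m is None:
        return None
    month, day, hh, mm, ss, us = (int(g) for g in m.groups())
    return datetime(year, month, day, hh, mm, ss, us)


def seconds_between(first_line, second_line, year=2012):
    """Elapsed seconds between two log lines (positive if the second is later)."""
    t1 = parse_glog_time(first_line, year)
    t2 = parse_glog_time(second_line, year)
    if t1 is None or t2 is None:
        raise ValueError("line does not look like a glog entry")
    return (t2 - t1).total_seconds()
```

Feeding it the master's registration line for the slave and the
"Removing slave" line would show whether the interval is close to
HEALTH_CHECK_TIMEOUT_S.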
