> Within 75s, restarting the slave occasionally succeeds, and sometimes fails
> with the message: Slave asked to shut down by master because 'health check
> timed out'.

Could you provide the master and slave logs for this? I checked the current
code, and it seems that only slave_ping_timeout and max_slave_ping_timeouts
affect 'health check timed out'.
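
As a rough sketch of how those two settings combine into the 75s window
mentioned in the quoted message: the values below (15 seconds and 5 pings) are
assumptions chosen to match that 75s figure, so please check them against your
own master configuration. The function name health_check_window is made up for
illustration and is not part of Mesos.

```python
from datetime import timedelta

# Assumed values: 15s * 5 = 75s, matching the window quoted in this thread.
SLAVE_PING_TIMEOUT = timedelta(seconds=15)
MAX_SLAVE_PING_TIMEOUTS = 5


def health_check_window(ping_timeout, max_timeouts):
    """Total time the master waits for pongs before giving up on a slave."""
    return ping_timeout * max_timeouts


window = health_check_window(SLAVE_PING_TIMEOUT, MAX_SLAVE_PING_TIMEOUTS)
print(window.total_seconds())
```

Raising either value widens the window in which a restarted slave can still
re-register, at the cost of slower detection of slaves that are really gone.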

On Fri, Aug 7, 2015 at 5:32 PM, sujz <[email protected]> wrote:

> Hi all,
> I ran into this problem with these steps:
> 1: Start master and slave successfully.
> 2: Stop slave by pressing Ctrl+C.
> 3: After stopping the slave, restart it within 75s; it reports this error:
> Slave asked to shut down by master because 'health check timed out'.
>
> After reading the code and searching the internet, I learned that once the
> slave is observed to be disconnected, the master continues to send
> PingSlaveMessage up to MAX_SLAVE_PING_TIMEOUTS times, each time waiting
> SLAVE_PING_TIMEOUT for a pong message from the slave, so the total
> waiting time is
> MAX_SLAVE_PING_TIMEOUTS * SLAVE_PING_TIMEOUT = 75s. If the slave restarts
> within 75s, the master treats it as a failover and accepts the re-register
> request; otherwise, it removes the slave.
>
> In my case, restarting the slave within 75s occasionally succeeds, and
> sometimes fails with the message: Slave asked to shut down by master because
> 'health check timed out'. I found the bug report MESOS-2679, which also
> discusses this problem, but I don't think it explains what the master checks
> or which factors can cause the health check to time out. Could anybody kindly
> explain this in more detail, so that we can avoid 'health check timed out'?
>
> Thanks very much and best regards!




-- 
Best Regards,
Haosdent Huang
