[ 
https://issues.apache.org/jira/browse/MESOS-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986490#comment-14986490
 ] 

Adam B commented on MESOS-1826:
-------------------------------

I'm not sure this is helpful enough in debugging why the slave was 
disconnected/deactivated, based on [~thomasr]'s request for a message like 
"Unable to connect to slave X at x.x.x.x:5051. Please make sure that host is 
reachable from your master."
All I see in your log excerpt is "Failed to shutdown socket with fd 11: 
Transport endpoint is not connected", which doesn't tell a user/admin that the 
master cannot reach the slave, or even whether the socket failure is related 
to the slave referenced below.
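
To make the requested behavior concrete, here is a minimal sketch of a check that names the slave and its address when a TCP connection fails. It is written in Python for illustration only and is not Mesos code: the function name `unreachable_slave_message`, its parameters, and the choice to append the underlying socket error are all assumptions, and the master may detect unreachable slaves by entirely different means.

```python
import socket
from typing import Optional


def unreachable_slave_message(slave_id: str, host: str, port: int,
                              timeout: float = 2.0) -> Optional[str]:
    """Hypothetical helper: return an admin-facing message if host:port
    does not accept a TCP connection, or None if the connection succeeds."""
    try:
        # create_connection resolves the host and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as error:
        # Covers refused connections, timeouts, and name-resolution failures.
        return (
            f"Unable to connect to slave {slave_id} at {host}:{port} "
            f"({error}). Please make sure that host is reachable from "
            f"your master."
        )
```

Including the slave ID and address in the same line as the socket error is the point: it ties the failure to a specific slave, which the "Failed to shutdown socket" line alone does not.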

> Improve logging for when master cannot connect to slaves
> --------------------------------------------------------
>
>                 Key: MESOS-1826
>                 URL: https://issues.apache.org/jira/browse/MESOS-1826
>             Project: Mesos
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Thomas Rampelberg
>            Assignee: Guangya Liu
>            Priority: Minor
>              Labels: newbie
>
> When first setting up a Mesos cluster, it is possible to get into a state 
> where your slaves are constantly re-registering. This happens because the 
> slave pid is not reachable from the master.
> Currently, the master logs make it hard to tell that this is the problem. 
> It would be fantastic if there were a better explanation in the logs, 
> something like:
>     Unable to connect to slave X at x.x.x.x:5051. Please make sure that
>     host is reachable from your master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
