[ https://issues.apache.org/jira/browse/MESOS-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986490#comment-14986490 ]
Adam B commented on MESOS-1826:
-------------------------------
I'm not sure this helps enough in debugging why the slave was
disconnected/deactivated, given [~thomasr]'s request for a message like
"Unable to connect to slave X at x.x.x.x:5051. Please make sure that host is
reachable from your master."

All I see in your log excerpt is "Failed to shutdown socket with fd 11:
Transport endpoint is not connected", which doesn't tell a user/admin that the
master cannot reach the slave, or even whether the socket failure is related
to the slave referenced below.
> Improve logging for when master cannot connect to slaves
> --------------------------------------------------------
>
> Key: MESOS-1826
> URL: https://issues.apache.org/jira/browse/MESOS-1826
> Project: Mesos
> Issue Type: Improvement
> Affects Versions: 0.20.0
> Reporter: Thomas Rampelberg
> Assignee: Guangya Liu
> Priority: Minor
> Labels: newbie
>
> When first setting up a Mesos cluster, it is possible to get into a state
> where your slaves are constantly re-registering. This happens because the
> slave pid is not reachable from the master.
> Currently, the master logs make it pretty tough to figure out that this is
> the problem. It would be fantastic if there were a better explanation in
> the logs, something like:
> Unable to connect to slave X at x.x.x.x:5051. Please make sure that host
> is reachable from your master.
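
Until the master logs something along these lines, the reachability problem described above can be checked by hand from the master host. The following is only a rough sketch, not Mesos code: the function name `check_slave_reachable` and its parameters are made up for illustration, the default port 5051 is taken from the example message above, and a plain TCP connect is assumed to be a reasonable stand-in for what the master does when it contacts the slave pid.

```python
import socket


def check_slave_reachable(host, port=5051, timeout=3.0):
    """Try a plain TCP connect to the slave address (hypothetical helper).

    Returns None when the connection succeeds, otherwise a message shaped
    like the one requested in this issue, with the OS error appended.
    """
    try:
        # create_connection resolves the host and opens a TCP connection;
        # the socket is closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return None
    except OSError as err:
        return (
            f"Unable to connect to slave at {host}:{port} ({err}). "
            "Please make sure that host is reachable from your master."
        )


if __name__ == "__main__":
    # Placeholder address; substitute the ip:port the slave registered with.
    message = check_slave_reachable("127.0.0.1", 5051)
    print(message or "Slave address is reachable.")
```

Run it on the master host against the address the slave advertises. A returned message suggests the re-registration loop is a network or advertised-address problem rather than something inside Mesos itself.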