Repository: mesos
Updated Branches:
  refs/heads/master bfa89f22e -> d43b9df4a
Slave exits gracefully on DNS lookup failure.

Jira: MESOS-2962

Slave fails with Abort stacktrace when DNS cannot resolve hostname.

If the DNS cannot resolve the hostname for a slave node, we correctly
return an Error object, but we then abort with a stack trace.

This code adds a more user-friendly message and exits normally
(with an `EXIT_FAILURE` code).

For example, forcing `net::getIP()` to always return an Error now causes
the slave to exit like this:

```
$ ./bin/mesos-slave.sh --master=10.10.1.121:5405
Failed to obtain the IP address for 'stratos.local'; the DNS service may not be able to resolve it: Name or service not known

$ echo $?
1
```

Review: https://reviews.apache.org/r/36061

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/d43b9df4
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/d43b9df4
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/d43b9df4

Branch: refs/heads/master
Commit: d43b9df4a1cb3a315c33f3244b631faf5004a301
Parents: bfa89f2
Author: Marco Massenzio <[email protected]>
Authored: Wed Jul 1 17:12:53 2015 -0700
Committer: Benjamin Hindman <[email protected]>
Committed: Wed Jul 1 17:12:54 2015 -0700

----------------------------------------------------------------------
 3rdparty/libprocess/src/process.cpp | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/d43b9df4/3rdparty/libprocess/src/process.cpp
----------------------------------------------------------------------
diff --git a/3rdparty/libprocess/src/process.cpp b/3rdparty/libprocess/src/process.cpp
index b754fb3..2d29d96 100644
--- a/3rdparty/libprocess/src/process.cpp
+++ b/3rdparty/libprocess/src/process.cpp
@@ -896,7 +896,9 @@ void initialize(const string& delegate)
     Try<net::IP> ip = net::getIP(hostname, __address__.ip.family());
 
     if (ip.isError()) {
-      LOG(FATAL) << ip.error();
+      EXIT(EXIT_FAILURE) << "Failed to obtain the IP address for '" << hostname
+                         << "'; the DNS service may not be able to resolve it: "
+                         << ip.error();
     }
 
     __address__.ip = ip.get();
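
The change above replaces an abort-on-error path with a clean exit that carries a readable message. The following is a minimal standalone sketch of that pattern, not the libprocess code: `LookupResult`, `lookupFailureMessage` and `ipOrExit` are hypothetical names introduced for illustration, and `LookupResult` is only a simplified stand-in for `Try<net::IP>`. It assumes that printing to stderr and calling `std::exit(EXIT_FAILURE)` is an acceptable approximation of what `EXIT(EXIT_FAILURE)` does.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>
#include <string>

// Hypothetical stand-in for Try<net::IP>: either a value or an error string.
struct LookupResult {
  bool failed;
  std::string value;  // IP address text, meaningful only when !failed
  std::string error;  // resolver error text, meaningful only when failed

  bool isError() const { return failed; }

  const std::string& get() const {
    assert(!failed);  // asking for the value of a failed lookup is a bug
    return value;
  }
};

// Builds the user-facing message, mirroring the wording in the diff.
std::string lookupFailureMessage(const std::string& hostname,
                                 const std::string& error) {
  return "Failed to obtain the IP address for '" + hostname +
         "'; the DNS service may not be able to resolve it: " + error;
}

// Returns the resolved address, or prints the message and exits with
// EXIT_FAILURE (no abort, no stack trace) when the lookup failed.
std::string ipOrExit(const std::string& hostname, const LookupResult& result) {
  if (result.isError()) {
    std::cerr << lookupFailureMessage(hostname, result.error) << std::endl;
    std::exit(EXIT_FAILURE);
  }
  return result.get();
}
```

A caller would pass the hostname and the lookup result to `ipOrExit` during initialization. On failure the process ends with exit status 1 on typical platforms, which matches the `echo $?` output in the commit message.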
