-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/36061/#review90014
-----------------------------------------------------------

I'm sorry, but I don't understand how LOG(FATAL) was segfaulting before. Please explain.


3rdparty/libprocess/src/process.cpp (lines 890 - 891)
<https://reviews.apache.org/r/36061/#comment142947>

    Why doesn't this need to be an `EXIT_(EXIT_FAILURE)` as well?


3rdparty/libprocess/src/process.cpp (line 899)
<https://reviews.apache.org/r/36061/#comment142946>

    I'm confused about how this was segfaulting before.
    "Logging a FATAL message terminates the program (after the message is logged)", so unless `ip` is invalid or `ip.error()` doesn't exist (yet `ip.isError()==true`), the process should exit cleanly after logging the message.
    The original ticket (MESOS-2636) discusses segfaults within the actual `getIP()` and `hostname()` methods, so the process would have segfaulted before reaching the LOG(FATAL). Now that MESOS-2636 has been fixed, we should safely get back an `ip.error()` that we can actually log.


- Adam B


On June 30, 2015, 5:32 p.m., Marco Massenzio wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/36061/
> -----------------------------------------------------------
>
> (Updated June 30, 2015, 5:32 p.m.)
>
>
> Review request for mesos, Adam B and Joris Van Remoortere.
>
>
> Bugs: MESOS-2962
>     https://issues.apache.org/jira/browse/MESOS-2962
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Jira: MESOS-2962
>
> Slave fails with Abort stacktrace when DNS cannot resolve hostname
>
> If the DNS cannot resolve the hostname for a slave node, we correctly return
> an Error object, but we then fail with a segfault.
>
> This code adds a more user-friendly message and exits normally (with an
> `EXIT_FAILURE` code).
>
> For example, forcing `net::getIp()` to always return an Error, now causes the
> slave to exit like this:
> ```
> $ ./bin/mesos-slave.sh --master=10.10.1.121:5405
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> E0630 11:31:45.777465 1944417024 process.cpp:899] Could not obtain the IP
> address for stratos.local; the DNS service may not be able to resolve it: >>>
> Marco was here!!!
>
> $ echo $?
> 1
> ```
>
>
> Diffs
> -----
>
>   3rdparty/libprocess/src/process.cpp d99947c1598c43c47c88ef3e8038081855f0d1dc
>
> Diff: https://reviews.apache.org/r/36061/diff/
>
>
> Testing
> -------
>
> make check
> and manual failing the DNS
>
>
> Thanks,
>
> Marco Massenzio
>
>
