-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67587/#review204749
-----------------------------------------------------------

PASS: Mesos patch 67587 was successfully built and tested.

Reviews applied: `['67587']`

All the build artifacts are available at: http://dcos-win.westus.cloudapp.azure.com/mesos-build/review/67587

- Mesos Reviewbot Windows


On June 13, 2018, 10:30 p.m., Andrew Schwartzmeyer wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67587/
> -----------------------------------------------------------
>
> (Updated June 13, 2018, 10:30 p.m.)
>
>
> Review request for mesos, Joseph Wu and Neil Conway.
>
>
> Bugs: MESOS-3790
>     https://issues.apache.org/jira/browse/MESOS-3790
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Per MESOS-3790, the call to `zookeeper_init` maps `EAI_NONAME` and
> `EAI_NODATA` to an `errno` value of `ENOENT`, and all others except
> `EAI_MEMORY` to `EINVAL`. Mesos's ZooKeeper logic is written to retry
> this initialization for ten minutes if the error is `EINVAL`, and
> should be updated to also retry if the error is `ENOENT`.
>
> This is necessary because if the initialization is not retried, the
> process crashes due to the `PLOG(FATAL)` call, and if it crashes, it
> will interrupt other Mesos threads and potentially leave the
> environment in an unknown state. For instance, we have seen
> intermittent failures where the systemd unit file
> `mesos_executors.slice` is created but empty because Mesos crashed
> between creating the file and flushing the write to the file. This
> then leads to errors when the agent is restarted (and succeeds in
> connecting to ZooKeeper), because the agent explicitly does not attempt
> to write to the unit file if it already exists.
>
>
> Diffs
> -----
>
>   src/zookeeper/zookeeper.cpp 52c4af192ccd1361afc4f7a0041889238c01e674
>
>
> Diff: https://reviews.apache.org/r/67587/diff/1/
>
>
> Testing
> -------
>
> Testing against our repro right now, but it's flaky, so it'll take a while.
>
>
> Thanks,
>
> Andrew Schwartzmeyer
>
>
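
For readers following the description above, the retry behavior it asks for can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in `src/zookeeper/zookeeper.cpp`: `isRetryableInitError` and `initWithRetry` are hypothetical names, and the `init` callback stands in for a `zookeeper_init`-style call that reports failure through `errno`. The timeout and interval are parameters here; the ten-minute window mentioned in the description would be passed in by the caller.

```cpp
#include <cassert>
#include <cerrno>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (not a Mesos or ZooKeeper API): decides whether the
// errno left by a failed initialization is worth retrying. Per the
// description, EINVAL was already retried and ENOENT (from EAI_NONAME /
// EAI_NODATA) should be retried as well.
inline bool isRetryableInitError(int error)
{
  return error == EINVAL || error == ENOENT;
}

// Hypothetical retry loop. `init` returns true on success and sets errno on
// failure. Returns 0 on success, otherwise the errno of the last attempt.
inline int initWithRetry(
    const std::function<bool()>& init,
    std::chrono::milliseconds timeout,
    std::chrono::milliseconds interval)
{
  assert(interval.count() >= 0);

  const auto deadline = std::chrono::steady_clock::now() + timeout;

  while (true) {
    errno = 0;
    if (init()) {
      return 0;
    }

    // Capture errno immediately, before any other call can overwrite it.
    const int error = errno;

    // Give up on errors that are not expected to be transient, or once
    // the retry window has elapsed.
    if (!isRetryableInitError(error) ||
        std::chrono::steady_clock::now() >= deadline) {
      return error;
    }

    std::this_thread::sleep_for(interval);
  }
}
```

A caller would treat a non-zero return as the point at which a fatal log is still appropriate, so a transient name-resolution failure no longer aborts the process on the first attempt.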
