> On March 1, 2017, 8:23 p.m., Vinod Kone wrote:
> > src/checks/checker.cpp
> > Line 424 (original), 440 (patched)
> > <https://reviews.apache.org/r/56208/diff/5-7/?file=1628427#file1628427line452>
> >
> >     why are we sending WEXITSTATUS and not exit code?
>
> Alexander Rukletsov wrote:
>     Because what we get from `Subprocess` is actually `pid_t`. It is unclear
>     what the scheduler will do with the pid.
>
> Vinod Kone wrote:
>     AFAIK the status code returned by `waitpid()` has strictly more
>     information (whether a process has exited, signalled or stopped). So I can
>     imagine scheduler can make better decisions with extra information? It's a
>     bit weird that we are not sending check status if a command exits because of
>     signaling or stopping?
>
>     Also, AFAIK we send status code in the executor terminated event as well?
>     So it would be nice if we can be consistent unless there is a strong reason
>     to differ.
>
> Alexander Rukletsov wrote:
>     Correcting my answer above. What we get from `Subprocess` is
>     `status_value` (see man on `waitpid`), which embeds exit code and has extra
>     information. I'm not sure whether a scheduler is interested in fine grained
>     termination information for a check.
>
>     I would argue it may be more surprising for 3rdparty tools to see Posix's
>     internal `status_value` instead of a plain exit code. Moreover, IIUC to
>     extract exit code from `status_value`, a scheduler should know whether the
>     message is coming from Posix or Windows agent and hence either apply extra
>     interpretation or use the value as is.
>
>     However, I do confirm that executor termination event contains
>     `status_value`.

See https://reviews.apache.org/r/57597


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56208/#review167557
-----------------------------------------------------------


On March 14, 2017, 2:05 p.m., Alexander Rukletsov wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/56208/
> -----------------------------------------------------------
>
> (Updated March 14, 2017, 2:05 p.m.)
>
>
> Review request for mesos, Gastón Kleiman and Vinod Kone.
>
>
> Bugs: MESOS-6906
>     https://issues.apache.org/jira/browse/MESOS-6906
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Add support for general checks, i.e. those defined by CheckInfo, in
> the checking library. A general check can be either a command or
> an HTTP request. The library performs the requested check at
> the specified interval and sends the result to the framework
> via a task status update. If the current result is the same as
> the previous result, no status update is sent.
>
>
> Diffs
> -----
>
>   src/checks/checker.hpp dc293f3d3613dec716510d269829f8a6f406c277
>   src/checks/checker.cpp 8716e4cc684e6c4b6b76d8ca53221be06d10b2a6
>   src/checks/health_checker.hpp f1f2834b3429fb00cc49c179fa9a3de328f597b5
>   src/checks/health_checker.cpp 6c97369fd9a567ba16dd92085bf142d43f71eaf1
>
>
> Diff: https://reviews.apache.org/r/56208/diff/9/
>
>
> Testing
> -------
>
> https://reviews.apache.org/r/56213/
>
>
> Thanks,
>
> Alexander Rukletsov
>
>
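
To illustrate the WEXITSTATUS question discussed at the top of this thread, here is a minimal sketch, assuming a POSIX system, of the extra interpretation a consumer has to apply to a raw `waitpid()` status to recover a plain exit code or a terminating signal. The names `Termination`, `decode` and `runChild` are hypothetical and are not part of Mesos or libprocess; this is not the code under review.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <csignal>

// Hypothetical helper type (not a Mesos type): the decoded form of a
// raw wait status.
struct Termination {
  bool exited = false;    // True if the process exited normally.
  int exitCode = 0;       // Meaningful only if `exited` is true.
  bool signaled = false;  // True if the process was killed by a signal.
  int termSignal = 0;     // Meaningful only if `signaled` is true.
};

// Decodes the raw status reported by `waitpid()` using the POSIX macros.
// A status that is neither a normal exit nor a signal termination
// (e.g. a stopped process) leaves both flags false.
Termination decode(int status) {
  Termination result;
  if (WIFEXITED(status)) {
    result.exited = true;
    result.exitCode = WEXITSTATUS(status);
  } else if (WIFSIGNALED(status)) {
    result.signaled = true;
    result.termSignal = WTERMSIG(status);
  }
  return result;
}

// Hypothetical driver: forks a child that exits with `code`, or kills
// itself with `sig` if `sig` is non-zero, and returns the raw status.
// Returns -1 if `fork()` or `waitpid()` fails.
int runChild(int code, int sig) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    if (sig != 0) {
      raise(sig);
    }
    _exit(code);
  }
  int status = 0;
  // Retry if the wait is interrupted by a signal in the parent.
  while (waitpid(pid, &status, 0) == -1) {
    if (errno != EINTR) {
      return -1;
    }
  }
  return status;
}
```

With this sketch, `decode(runChild(3, 0))` reports a normal exit with code 3, while `decode(runChild(0, SIGKILL))` reports termination by `SIGKILL` and no exit code. That second case is the information a bare exit code cannot carry.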
