> On March 27, 2015, 9:17 a.m., Adam B wrote:
> > docs/slave-recovery.md, line 71
> > <https://reviews.apache.org/r/32543/diff/2/?file=907123#file907123line71>
> >
> >     (If the slave does not come back, each executorDriver shuts itself down
> >     after $MESOS_RECOVERY_TIMEOUT.)
> >
> >     Important question: If an executor is killed, does this systemd mode
> >     affect whether its tasks would get killed?
>
> Alexander Rukletsov wrote:
>     Adam, could you please explain what use case you have in mind and how
>     it is related to slave recovery?
>
> Adam B wrote:
>     It's not necessarily related to slave recovery, but to how this KillMode
>     impacts other processes like a custom executor. Some frameworks (like HDFS)
>     have a custom executor that launches task(s) as a separate
>     process/subprocess. If the executor is killed (kill -9, or shut down by the
>     framework/admin), will this change in KillMode affect whether the executor's
>     task subprocesses also get killed?
>     I'm mostly worried about this KillMode change suddenly leaving stranded
>     task processes if/when executors are killed.
>
> Alexander Rukletsov wrote:
>     I thought that's exactly why we have containerizers: to clean up all
>     stranded processes.
>
> Adam B wrote:
>     Fair enough, when the slave is running. But what if the executor is
>     killed while the slave (and thus also the containerizer) is shut down or
>     recovering?
>     I'm not claiming there's anything necessarily wrong with using this
>     KillMode. I'm just asking the question to make sure we don't recommend a
>     setting that may fix one issue but cause others.

I see your point. I would be surprised if this setting caused the issue, but let's check: better safe than sorry.


- Alexander


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/32543/#review78025
-----------------------------------------------------------


On March 27, 2015, 2:09 p.m., Joerg Schad wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/32543/
> -----------------------------------------------------------
>
> (Updated March 27, 2015, 2:09 p.m.)
>
>
> Review request for mesos, Alexander Rukletsov and Brenden Matthews.
>
>
> Bugs: Mesos-2555
>     https://issues.apache.org/jira/browse/Mesos-2555
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Documented the problem and solution encountered in MESOS-2419.
>
>
> Diffs
> -----
>
>   docs/slave-recovery.md 4bb4a71c6945bd70121743a1e9209a26906773c1
>   docs/upgrades.md 2a15694607c079ad95ef6cf7f1490872ab9a5976
>
> Diff: https://reviews.apache.org/r/32543/diff/
>
>
> Testing
> -------
>
> markdown check
>
>
> Thanks,
>
> Joerg Schad
>
>