> On July 18, 2018, 2:21 p.m., Stephan Erb wrote:
> > src/main/python/apache/aurora/executor/aurora_executor.py
> > Lines 159 (patched)
> > <https://reviews.apache.org/r/67967/diff/1/?file=2061542#file2061542line159>
> >
> >     Should we use TASK_LOST here instead? Most users interpret TASK_FAILED
> >     as their responsibility, whereas TASK_LOST is more of a mishap of
> >     Aurora/Mesos/Thermos. I would think an unknown exception in the runner
> >     is part of the latter category.

Hmm. By that argument, shouldn't a failure to create the sandbox or to fork the process also be treated as TASK_LOST? At Twitter this is really not going to help us, since we have platform wrappers that cause TASK_FAILED, and it is already hard to differentiate user configuration failures from platform dependency failures. I wanted to keep this consistent with the rest. If TASK_LOST makes more sense to you, I can update it.


- Santhosh Kumar


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/67967/#review206209
-----------------------------------------------------------


On July 18, 2018, 1:27 p.m., Santhosh Kumar Shanmugham wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/67967/
> -----------------------------------------------------------
>
> (Updated July 18, 2018, 1:27 p.m.)
>
>
> Review request for Aurora, David McLaughlin, Jordan Ly, Reza Motamedi, and
> Stephan Erb.
>
>
> Repository: aurora
>
>
> Description
> -------
>
> If the ThermosTaskRunner encounters an Exception when trying to
> fork the process, it bubbles this up to the Executor, which does
> not handle exceptions other than TaskError. This leads to the
> executor leaving the task in STARTING state, and we end up with
> tasks that get stranded in this state.
>
> Fix it so that any unknown exception that is thrown when starting
> a runner leads to task failure and the task gets marked as FAILED.
>
>
> Diffs
> -----
>
>   src/main/python/apache/aurora/executor/aurora_executor.py
>     8a9958fffc2312686dccc7daf6d216631d4c956e
>   src/test/python/apache/aurora/executor/test_thermos_executor.py
>     f6ae1be5d56bfd845bd09db67ef0000a92091136
>
>
> Diff: https://reviews.apache.org/r/67967/diff/1/
>
>
> Testing
> -------
>
> ./gradlew test
> ./pants test src/test/python/apache::
>
>
> Thanks,
>
> Santhosh Kumar Shanmugham
>
>
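
For readers following the thread, the behavior described in the review request can be sketched roughly as below. This is a minimal, self-contained illustration and not the actual patch: the `TaskError` class here is a local stand-in for the one named in the description, and `start_runner`, `runner_start`, `send_update`, and `unknown_status` are hypothetical names. It also assumes that the existing TaskError path reports TASK_FAILED. The point is only that a catch-all branch turns an unexpected startup exception into a terminal status instead of leaving the task in STARTING.

```python
class TaskError(Exception):
    """Local stand-in for the runner's known, expected error type."""


def start_runner(runner_start, send_update, unknown_status='TASK_FAILED'):
    """Start a runner and report a terminal status if startup fails.

    runner_start:   zero-argument callable that starts the runner (hypothetical).
    send_update:    callable(status, message) that reports task status (hypothetical).
    unknown_status: status reported for unexpected exceptions; the thread
                    debates TASK_FAILED versus TASK_LOST for this value.
    Returns True if the runner started, False otherwise.
    """
    try:
        runner_start()
    except TaskError as e:
        # Known failure mode (assumed here to map to TASK_FAILED).
        send_update('TASK_FAILED', 'Task initialization failed: %s' % e)
        return False
    except Exception as e:
        # Unknown failure, e.g. the fork failed. Without this branch the
        # exception would propagate and no terminal status would be sent.
        send_update(unknown_status, 'Unknown exception starting runner: %s' % e)
        return False
    return True


# Usage: simulate a fork failure and record the status updates.
updates = []


def failing_fork():
    raise OSError('Cannot allocate memory')


started = start_runner(failing_fork, lambda status, msg: updates.append((status, msg)))
```

In this sketch, switching to the TASK_LOST behavior suggested in the review would amount to passing `unknown_status='TASK_LOST'`, while the TaskError branch stays unchanged.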
