----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59654 -----------------------------------------------------------
src/slave/slave.cpp <https://reviews.apache.org/r/26382/#comment100949> So, after thinking a bit more and looking at MESOS-343, what we really want here is to set the reason for failed as oom or memory limit (REASON_MEMORY_LIMIT). Slave generates TASK_FAILED here not because the executor terminated but becaused it oomed (and in the future due to disk limit etc). Note that the slave also generates TASK_FAILED when command executor unexpectedly terminates. This is because, under normal conditions, command executor is not expected to terminate until the task finishes. For this case I think REASON_INVALID_COMMAND or REASON_BAD_COMMAND or REASON_COMMAND_FAILED is a better reason than REASON_EXECUTOR_TERMINATED because users have no idea about executors when using command tasks. The right way to do this is to plumb the reason via the Termination protobuf. That way any isolator can include the right reason if its corresponding limit is reached (memory, disk etc). If you want to punt on the plumbing, please add a TODO and tracking ticket. - Vinod Kone On Oct. 31, 2014, 10:09 p.m., Dominic Hamon wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/26382/ > ----------------------------------------------------------- > > (Updated Oct. 31, 2014, 10:09 p.m.) > > > Review request for mesos, Vinod Kone and Bill Farner. > > > Bugs: MESOS-1830 and MESOS-343 > https://issues.apache.org/jira/browse/MESOS-1830 > https://issues.apache.org/jira/browse/MESOS-343 > > > Repository: mesos-git > > > Description > ------- > > Added source and reason, enabled TASK_ERROR, and made the changes necessary > throughout the codebase. > > > Diffs > ----- > > include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 > src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 > src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 > src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 > src/examples/java/TestFramework.java > bc593d0abfacb00690b1492b2b82c970f4e4de6d > src/examples/low_level_scheduler_libprocess.cpp > 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 > src/examples/low_level_scheduler_pthread.cpp > 6e233a10117a1c7aa669806b5b430e746e227ee5 > src/examples/no_executor_framework.cpp > f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 > src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea > src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd > src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 > src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 > src/tests/fault_tolerance_tests.cpp > a18a41a3e34ff112e04e693447d757403e5013bd > src/tests/master_authorization_tests.cpp > 652e80d0d4567b225c6ffb326725ddfde06f7fd3 > src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 > src/tests/resource_offers_tests.cpp > fe66432b1bf75ee25feb73c4bb353e4d4e5b503f > > Diff: https://reviews.apache.org/r/26382/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Dominic Hamon > >