> On Oct. 29, 2014, 5:09 p.m., Vinod Kone wrote: > > looking good. some minor comments. > > > > are you going to follow up with a review for metrics based on source and > > reason? that would make this effort super useful.
yes! i started to fold it into this one but it became too complicated. I need to think carefully about whether to still include aggregates. Ie: master/tasks_lost master/tasks_lost/source_master master/tasks_lost/source_master/reason_task_invalid master/tasks_lost/source_slave master/tasks_lost/source_slave/reason_something master/tasks_lost/source_slave/reason_something_else It could get quite unwieldy but would be backwards compatible and complete. > On Oct. 29, 2014, 5:09 p.m., Vinod Kone wrote: > > src/master/master.cpp, line 2725 > > <https://reviews.apache.org/r/26382/diff/9/?file=741857#file741857line2725> > > > > why no reason? > > > > use REASON_TASK_PENDING I went back and forth on this. That's not the reason. The reason is that killTask is called, which is adequately captured by the status TASK_KILLED. The fact that it's pending isn't a reason and I don't think it adds anything useful. The other places we set TASK_KILLED are: executor (source is set), framework is removed (reason is set), from the slave (source is set), executor unregistered (source slave, and reason set). This is the only place that we get TASK_KILLED with no reason, and it's because we don't have one. We could add a reason to the KillTaskMessage but i was hesitant to make further changes to the API. Thoughts? > On Oct. 29, 2014, 5:09 p.m., Vinod Kone wrote: > > src/master/master.cpp, line 3646 > > <https://reviews.apache.org/r/26382/diff/9/?file=741857#file741857line3646> > > > > Not sure if REASON_RECONCILIATON_REQUEST is more appropriate here than > > REASON_TASK_UNKNOWN. i can go either way. left it as explicit for now. > On Oct. 29, 2014, 5:09 p.m., Vinod Kone wrote: > > src/slave/slave.cpp, line 1430 > > <https://reviews.apache.org/r/26382/diff/9/?file=741860#file741860line1430> > > > > why no reason? > > > > use REASON_TASK_PENDING see argument above for master. > On Oct. 29, 2014, 5:09 p.m., Vinod Kone wrote: > > src/tests/master_authorization_tests.cpp, line 210 > > <https://reviews.apache.org/r/26382/diff/9/?file=741863#file741863line210> > > > > kill this. oops. debugging. - Dominic ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/26382/#review59070 ----------------------------------------------------------- On Oct. 29, 2014, 8:47 p.m., Dominic Hamon wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/26382/ > ----------------------------------------------------------- > > (Updated Oct. 29, 2014, 8:47 p.m.) > > > Review request for mesos, Vinod Kone and Bill Farner. > > > Bugs: MESOS-1830 and MESOS-343 > https://issues.apache.org/jira/browse/MESOS-1830 > https://issues.apache.org/jira/browse/MESOS-343 > > > Repository: mesos-git > > > Description > ------- > > Added source and reason, enabled TASK_ERROR, and made the changes necessary > throughout the codebase. > > > Diffs > ----- > > include/mesos/mesos.proto 168a7a8c35ed1bf3f5bd6d7431b1e511bae7b789 > src/common/protobuf_utils.hpp 212d5124b9a4cc58e61719fa7f07a61cd166e834 > src/common/protobuf_utils.cpp a9b65e328c4c62bff7fbf5633dda25d742d79019 > src/examples/balloon_framework.cpp b05d5679fe2915142907af0b2dc00c6cd76eb9c1 > src/examples/java/TestFramework.java > bc593d0abfacb00690b1492b2b82c970f4e4de6d > src/examples/low_level_scheduler_libprocess.cpp > 7ef5ea78ade4ed856b97009fdfe31281f0a55c17 > src/examples/low_level_scheduler_pthread.cpp > 6e233a10117a1c7aa669806b5b430e746e227ee5 > src/examples/no_executor_framework.cpp > f98a0735b9f287e7f1bf98af6c2e9a47ca6a77b2 > src/examples/test_framework.cpp 187a611ebfe35cb13ee48aa5eca934cf55f34dea > src/master/master.cpp 762d2ff6c168ac212f70b43275692a77496a7fcd > src/sched/sched.cpp 0fb8c7bda75545389f8024489b3c76ae115111f4 > src/slave/slave.cpp 96fb5f7385b0762d46d8129f7e43207bd6311644 > src/tests/fault_tolerance_tests.cpp > a18a41a3e34ff112e04e693447d757403e5013bd > src/tests/master_authorization_tests.cpp > 652e80d0d4567b225c6ffb326725ddfde06f7fd3 > src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6 > src/tests/resource_offers_tests.cpp > fe66432b1bf75ee25feb73c4bb353e4d4e5b503f > > Diff: https://reviews.apache.org/r/26382/diff/ > > > Testing > ------- > > make check > > > Thanks, > > Dominic Hamon > >