> On Nov. 4, 2014, 10:33 a.m., Vinod Kone wrote:
> > src/master/master.hpp, line 640
> > <https://reviews.apache.org/r/27531/diff/3/?file=748373#file748373line640>
> >
> > s/reason/TASK_LOST reason/
> >
> > also, add a TODO for reasons for TASK_KILLED and TASK_FAILED?
> >
> > more importantly, have you thought about how would you extend this for
> > TASK_KILLED and TASK_FAILED with the current design? if you use vectors we
> > are back to the same problem of having most entries being zero.
> >
> > how about using a hashmap intead?
> >
> > hashmap<TaskState, process::metrics::Counter > task_lost_reasons;
> >
> > you could explicitly initialize it with task states that you would
> > expect to exist for this reason. that way we don't have task states that
> > are not expected for this reason.
I started down the hashmap route. I think it's the right model, but it would be
hashmap<TaskStatus::Reason, process::metrics::Counter> task_lost_reasons;
or maybe a hashmap of hashmaps... :P
> On Nov. 4, 2014, 10:33 a.m., Vinod Kone wrote:
> > src/master/master.cpp, line 4735
> > <https://reviews.apache.org/r/27531/diff/3/?file=748374#file748374line4735>
> >
> >
> > s/++metrics.task_lost_reasons[status.reason()]/metrics.task_lost_reasons[status.reason()]++/
> >
> >
> > to be consistent.
this is consistent with the code around it.
- Dominic
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27531/#review59784
-----------------------------------------------------------
On Nov. 4, 2014, 8:58 a.m., Dominic Hamon wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/27531/
> -----------------------------------------------------------
>
> (Updated Nov. 4, 2014, 8:58 a.m.)
>
>
> Review request for mesos, Tobias Weingartner and Vinod Kone.
>
>
> Bugs: MESOS-1830
> https://issues.apache.org/jira/browse/MESOS-1830
>
>
> Repository: mesos-git
>
>
> Description
> -------
>
> Update metrics in Master to match the source and reason split for task
> statuses.
>
>
> Diffs
> -----
>
> src/master/master.hpp 687f1789712dcd867b411badd85f4a12ae8f16d7
> src/master/master.cpp d914786a2517677ee7dd4a3130b4b191ef416c26
> src/slave/slave.hpp 6c183f827ec2521052f8b088d4583f13e661db2c
> src/slave/slave.cpp b8935173bae9c124b8d08db590893334d7a45a23
> src/tests/master_tests.cpp 2e525749247626c05efb2f54a707599facb114b6
> src/tests/slave_tests.cpp d2cbaf82391411bdc5d85327478d7ca8072048af
>
> Diff: https://reviews.apache.org/r/27531/diff/
>
>
> Testing
> -------
>
> make check
> run master and check endpoint:
>
> {
> ...
> master/reconciliation_states/task_error: 0,
> master/reconciliation_states/task_failed: 0,
> master/reconciliation_states/task_finished: 0,
> master/reconciliation_states/task_killed: 0,
> master/reconciliation_states/task_lost: 0,
> master/reconciliation_states/task_running: 0,
> master/reconciliation_states/task_staging: 0,
> master/reconciliation_states/task_starting: 0,
> master/recovery_slave_removals: 0,
> master/slave_registrations: 1,
> master/slave_removals: 0,
> master/slave_reregistrations: 0,
> master/slaves_active: 1,
> master/slaves_connected: 1,
> master/slaves_disconnected: 0,
> master/slaves_inactive: 0,
> master/tasks_failed: 0,
> master/tasks_finished: 0,
> master/tasks_killed: 0,
> master/tasks_lost: 0,
> master/tasks_lost/reason_executor_terminated: 0,
> master/tasks_lost/reason_executor_unregistered: 0,
> master/tasks_lost/reason_framework_removed: 0,
> master/tasks_lost/reason_gc_error: 0,
> master/tasks_lost/reason_invalid_command: 0,
> master/tasks_lost/reason_invalid_frameworkid: 0,
> master/tasks_lost/reason_invalid_offers: 0,
> master/tasks_lost/reason_master_disconnected: 0,
> master/tasks_lost/reason_memory_limit: 0,
> master/tasks_lost/reason_reconciliation: 0,
> master/tasks_lost/reason_slave_disconnected: 0,
> master/tasks_lost/reason_slave_removed: 0,
> master/tasks_lost/reason_slave_restarted: 0,
> master/tasks_lost/reason_slave_unknown: 0,
> master/tasks_lost/reason_task_invalid: 0,
> master/tasks_lost/reason_task_unauthorized: 0,
> master/tasks_lost/reason_task_unknown: 0,
> ...
> }
>
> and slave:
>
> {
> ...
> slave/tasks_lost: 0,
> slave/tasks_lost/reason_executor_terminated: 0,
> slave/tasks_lost/reason_executor_unregistered: 0,
> slave/tasks_lost/reason_framework_removed: 0,
> slave/tasks_lost/reason_gc_error: 0,
> slave/tasks_lost/reason_invalid_command: 0,
> slave/tasks_lost/reason_invalid_frameworkid: 0,
> slave/tasks_lost/reason_invalid_offers: 0,
> slave/tasks_lost/reason_master_disconnected: 0,
> slave/tasks_lost/reason_memory_limit: 0,
> slave/tasks_lost/reason_reconciliation: 0,
> slave/tasks_lost/reason_slave_disconnected: 0,
> slave/tasks_lost/reason_slave_removed: 0,
> slave/tasks_lost/reason_slave_restarted: 0,
> slave/tasks_lost/reason_slave_unknown: 0,
> slave/tasks_lost/reason_task_invalid: 0,
> slave/tasks_lost/reason_task_unauthorized: 0,
> slave/tasks_lost/reason_task_unknown: 0,
> slave/tasks_running: 0,
> slave/tasks_staging: 0,
> slave/tasks_starting: 0,
> ...
> }
>
>
> Thanks,
>
> Dominic Hamon
>
>