[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227977#comment-15227977 ]

Fan Du commented on MESOS-4981:

[~bmahler] You are correct about this; I completely missed it. Please review the new RR: https://reviews.apache.org/r/45808/

In the Linux kernel there is a "Suggested-by:" tag to indicate that an idea came from someone else. I didn't find an equivalent in Mesos, so I added a note in the commit message instead.

Thanks for reviewing.

> Framework (re-)register metric counters broken for calls made via scheduler driver
> --
>
> Key: MESOS-4981
> URL: https://issues.apache.org/jira/browse/MESOS-4981
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Anand Mazumdar
> Assignee: Fan Du
> Labels: mesosphere
>
> The counters {{master/messages_register_framework}} and
> {{master/messages_reregister_framework}} are no longer being incremented
> after the scheduler driver started sending {{Call}} messages to the master in
> Mesos 0.23. We should correctly be incrementing these counters for PID based
> frameworks as was the case previously.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15227046#comment-15227046 ]

Benjamin Mahler commented on MESOS-4981:

[~fan.du] Hm.. I'm not sure I follow the difficulty here. Can't these metrics be distinguished by introspecting {{subscribe.framework_info.id}}? If id is present, it is a re-registration. If id is absent, it is a registration.
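The id-based check suggested in this comment could look roughly like the sketch below. It is only an illustration: {{FrameworkInfo}}, {{Subscribe}} and {{Metrics}} are hypothetical stand-ins for the real protobuf messages and master metrics, and {{countSubscribe}} is an invented helper name, not a function in the Mesos code base.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the FrameworkInfo protobuf message.
struct FrameworkInfo {
  bool has_id() const { return id_set_; }
  void set_id(const std::string& id) { id_ = id; id_set_ = true; }

private:
  std::string id_;
  bool id_set_ = false;
};

// Hypothetical stand-in for the Subscribe call payload.
struct Subscribe {
  FrameworkInfo framework_info;
};

// Hypothetical stand-in for the two master counters named in this ticket.
struct Metrics {
  std::uint64_t messages_register_framework = 0;
  std::uint64_t messages_reregister_framework = 0;
};

// Bumps exactly one counter per SUBSCRIBE call, before any validation:
// an id that is present means re-registration, an absent id means registration.
void countSubscribe(const Subscribe& subscribe, Metrics& metrics) {
  if (subscribe.framework_info.has_id()) {
    ++metrics.messages_reregister_framework;
  } else {
    ++metrics.messages_register_framework;
  }
}

// Usage: one first-time subscription followed by one re-subscription.
void example() {
  Metrics metrics;

  Subscribe first;  // No id yet.
  countSubscribe(first, metrics);

  Subscribe again;
  again.framework_info.set_id("framework-1");  // Made-up id for illustration.
  countSubscribe(again, metrics);

  assert(metrics.messages_register_framework == 1);
  assert(metrics.messages_reregister_framework == 1);
}
```

Because the decision depends only on the incoming message, neither counter ever has to be decremented.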
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15221062#comment-15221062 ]

Fan Du commented on MESOS-4981:

[~bmahler] & [~vinodkone] How about not distinguishing {{messages_register_framework}} from {{messages_reregister_framework}} so strictly? The flow of {{subscribe}} would become:
{code}
1. Bump messages_register_framework
2. Various sanity checks
3. Newborn framework?
  3a. Add new framework
  3b. Return
4. Bump messages_reregister_framework
5. Otherwise the framework is re-registering
  5a. Update the framework
  5b. Return
{code}
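One way to read the flow proposed in this comment is sketched below. The {{Metrics}} struct and the simplified {{subscribe}} function are hypothetical; the {{newborn}} flag stands in for whatever check the master uses to decide whether the framework is new, and the sanity checks are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two master counters named in this ticket.
struct Metrics {
  std::uint64_t messages_register_framework = 0;
  std::uint64_t messages_reregister_framework = 0;
};

// Simplified, hypothetical version of the proposed flow.
void subscribe(bool newborn, Metrics& metrics) {
  ++metrics.messages_register_framework;    // Step 1: always bumped.

  // Step 2: sanity checks would run here (omitted in this sketch).

  if (newborn) {
    // Steps 3a/3b: add the new framework, then return.
    return;
  }

  ++metrics.messages_reregister_framework;  // Step 4.

  // Steps 5a/5b: update the existing framework, then return.
}

// Usage: one new framework followed by two re-registrations.
void example() {
  Metrics metrics;
  subscribe(true, metrics);
  subscribe(false, metrics);
  subscribe(false, metrics);

  assert(metrics.messages_register_framework == 3);
  assert(metrics.messages_reregister_framework == 2);
}
```

Under this reading, {{messages_register_framework}} counts every SUBSCRIBE call and {{messages_reregister_framework}} counts the subset that reach step 4, so the first counter no longer means "new registrations only".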
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209672#comment-15209672 ]

Fan Du commented on MESOS-4981:

Hmm, here is the scenario; let me explain :) When a framework calls SUBSCRIBE, it may be registering a newborn framework, or it may be updating (re-registering) an existing framework. For {{subscribe}} the flow is:
{code}
1. Bump messages_register_framework
2. Various sanity checks
3. Newborn framework?
  3a. Add new framework
  3b. Return
4. Roll back messages_register_framework, and bump messages_reregister_framework
5. Otherwise the framework is re-registering
  5a. Update the framework
  5b. Return
{code}
That's why I asked the two questions above:

q1. Do the metrics have to count failure cases such as failed sanity checks? If not, we can simply bump the metrics once we are sure the operation is good/clean, in 3a and 5a. But judging from how other metrics are counted, the metrics do include failure cases such as sanity checks.

q2. Is it OK to bump messages_register_framework even when it is already known that the operation should bump messages_reregister_framework? In other words, can we skip rolling back messages_register_framework?
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209630#comment-15209630 ]

Benjamin Mahler commented on MESOS-4981:

I don't follow why you're subtracting in [r/45097|https://reviews.apache.org/r/45097/], it seems like a hack? Seeing the counter for registrations go up and then back down is going to cause confusion.
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209584#comment-15209584 ]

Fan Du commented on MESOS-4981:

[~bmahler] May I have your comments here? Then I can move forward on this ticket.
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15207493#comment-15207493 ]

Cong Wang commented on MESOS-4981:

W.r.t. r44473, if you need it too, please comment on r44473 to let BenM know I am not the only one who needs it. We already discussed the difference between Counter and Gauge on that RB.
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205960#comment-15205960 ]

Fan Du commented on MESOS-4981:

[~bbannier] Thanks for the quick review! :)

[~bmahler] Actually I have two questions here first:

1. Do we need to bump the metrics for failed operations, e.g. failed parameter sanity checks or authentication/authorization failures?

2. For the case of this ticket, we handle {{registerFramework}} and {{reregisterFramework}} together in {{[subscribe|https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=src/master/master.cpp;h=e6290ea686ccf17813d6faeaf2f2012f79cf3b7f;hb=HEAD#l2256]}}. Do we need to differentiate the metrics of {{registerFramework}} and {{reregisterFramework}} strictly?

If the answer to both questions is "yes", then IMO we DO need Counter to support decrementing for this case, to accommodate the implementation :)

I didn't know [~wangcong] had submitted [r44473 | https://reviews.apache.org/r/44473/]; I think it could be beneficial, at least for my case here.

Here is my understanding of Counter and Gauge, though we don't differentiate them in the Linux kernel: going by their names and meanings, use a Counter for events or messages, and use a Gauge to get a snapshot of resources. The semantics are lost if they are swapped.
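The Counter/Gauge distinction described in this comment can be illustrated with the minimal sketch below. The two classes are simplified, hypothetical stand-ins written for this example; they are not the libprocess metrics classes and do not reproduce their API.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical counter: a cumulative count of events that only increases.
class Counter {
public:
  Counter& operator++() {
    ++value_;
    return *this;
  }

  std::uint64_t value() const { return value_; }

private:
  std::uint64_t value_ = 0;
};

// Hypothetical gauge: samples the current state each time it is read,
// so its value can go up or down between reads.
class Gauge {
public:
  explicit Gauge(std::function<double()> sample)
    : sample_(std::move(sample)) {}

  double value() const { return sample_(); }

private:
  std::function<double()> sample_;
};

// Usage: count two messages, and take snapshots of a changing resource.
void example() {
  Counter messages;
  ++messages;
  ++messages;
  assert(messages.value() == 2);

  int activeFrameworks = 3;  // Made-up state for illustration.
  Gauge active([&activeFrameworks]() {
    return static_cast<double>(activeFrameworks);
  });
  assert(active.value() == 3.0);

  activeFrameworks = 1;
  assert(active.value() == 1.0);
}
```

In this sketch, rolling back a registration count would require a decrement operation that the counter deliberately lacks, which is the tension raised in the two questions above.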
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205905#comment-15205905 ]

Fan Du commented on MESOS-4981:

[~anandmazumdar] Thanks, I have added [~vinodkone] as reviewer.
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15205283#comment-15205283 ]

Anand Mazumdar commented on MESOS-4981:

Thanks for working on this. [~vinodkone] agreed to shepherd this. I can do a first pass review.

[~fan.du] Can you add Vinod as a reviewer to the reviews too?
[jira] [Commented] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver
[ https://issues.apache.org/jira/browse/MESOS-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15203887#comment-15203887 ]

Fan Du commented on MESOS-4981:

[~anandmazumdar] I happened to take a deep look at this; here is a fix that works in my environment. Please review: https://reviews.apache.org/r/45094/

> Framework (re-)register metric counters broken for calls made via scheduler driver
> --
>
> Key: MESOS-4981
> URL: https://issues.apache.org/jira/browse/MESOS-4981
> Project: Mesos
> Issue Type: Bug
> Components: master
> Reporter: Anand Mazumdar
> Assignee: Fan Du
> Labels: mesosphere
>
> The counters {{master/messages_register_framework}} and
> {{master/messages_reregister_framework}} are no longer being incremented
> after the scheduler driver started sending {{Call}} messages to the master in
> Mesos 0.23. Either, we should think about adding new counter(s) for
> {{Subscribe}} calls to the master for both PID/HTTP frameworks or modify the
> existing code to correctly increment the counters.