[jira] [Assigned] (MESOS-895) Unbundle libev.

2018-09-11 Thread James Peach (JIRA)


 [ 
https://issues.apache.org/jira/browse/MESOS-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach reassigned MESOS-895:
-

Assignee: James Peach  (was: Timothy St. Clair)

CentOS 6 ships {{libev}} 4.03 and Ubuntu 14.04 ships 4.15, so once 
MESOS-9212 lands, I think we can unbundle {{libev}}.

/cc [~tillt] [~bmahler] [~vinodkone]

> Unbundle libev.
> ---
>
> Key: MESOS-895
> URL: https://issues.apache.org/jira/browse/MESOS-895
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.17.0
> Reporter: Timothy St. Clair
> Assignee: James Peach
> Priority: Major
> Labels: tech-debt
>
> The libev patch can easily be removed by updating the configuration flags and 
> possibly the accompanying code prior to the include.
> For configure pass in: 
> CFLAGS=-DEV_CHILD_ENABLE=0
> For inclusion: 
> #define EV_CHILD_ENABLE 0
> include 
> excerpt from maintainer: 
>  that patch is unnecessary
>  schmorp, so if they wanted to just set EV_CHILD_ENABLE=0 they 
> could just pass CFLAGS=-DEV_CHILD_ENABLE=0  through.
>  tstclair: yes, or use a wrapper
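
A minimal sketch of the override described above, assuming the library header gives the macro a default only when it is not already defined. The second `#ifndef` block is a hypothetical stand-in for that header, not code copied from libev, so the snippet builds without libev installed. The helper `childWatchersEnabled` is also an illustrative name.

```cpp
// Approach from the description: define the macro before the include.
// Passing CFLAGS=-DEV_CHILD_ENABLE=0 at configure time has the same effect
// without touching the source.
#ifndef EV_CHILD_ENABLE
#define EV_CHILD_ENABLE 0
#endif

// Stand-in for the library header (hypothetical, for illustration only):
// a default that applies only when the macro is still undefined.
#ifndef EV_CHILD_ENABLE
#define EV_CHILD_ENABLE 1
#endif

// Fails the build if the override did not take effect.
static_assert(EV_CHILD_ENABLE == 0, "child watchers should be compiled out");

// Reports whether child watcher support is compiled in.
constexpr bool childWatchersEnabled() { return EV_CHILD_ENABLE != 0; }
```

Because the earlier definition wins, the bundled patch is not needed just to switch the feature off; the same result comes from the command-line flag or from a define placed ahead of the include.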



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (MESOS-9178) Add a metric for master failover time.

2018-09-11 Thread James Peach (JIRA)


[ 
https://issues.apache.org/jira/browse/MESOS-9178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610795#comment-16610795
 ] 

James Peach commented on MESOS-9178:


Another way to measure this is to publish it in the event stream.

> Add a metric for master failover time.
> --
>
> Key: MESOS-9178
> URL: https://issues.apache.org/jira/browse/MESOS-9178
> Project: Mesos
> Issue Type: Improvement
> Components: master
> Reporter: Xudong Ni
> Assignee: Xudong Ni
> Priority: Minor
>
> Quote from Yan Xu: Previously the argument against it was that you don't know 
> if all agents are going to come back after a master failover, so there's not a 
> certain point that marks the end of "full reregistration of all agents". 
> However, empirically the number of agents usually doesn't change during the 
> failover and there's an upper bound on such a wait (after a 10min timeout the 
> agents that haven't reregistered are going to be marked unreachable), so we can 
> just use that to stop the timer.
> So we can define failover time as "the time it takes for all agents recovered 
> from the registry to be accounted for" i.e., either reregistered or marked as 
> unreachable.
> This is of course looking at failover from an agent reregistration 
> perspective.
> Later after we add framework info persistence, we can similarly define the 
> framework perspective using reregistration time or reconciliation time.
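
The definition above can be sketched as a small timer that stops once every agent recovered from the registry has either reregistered or been marked unreachable. This is only an illustration under stated assumptions: `FailoverTimer`, `accountFor`, and the use of string agent IDs are hypothetical names, not Mesos code, and the caller is assumed to report both outcomes through the same method.

```cpp
#include <chrono>
#include <string>
#include <unordered_set>
#include <utility>

// Hypothetical helper, not Mesos code: tracks when every agent recovered
// from the registry has been accounted for after a master failover.
class FailoverTimer {
public:
  using Clock = std::chrono::steady_clock;

  FailoverTimer(std::unordered_set<std::string> recovered,
                Clock::time_point start)
    : pending_(std::move(recovered)),
      start_(start),
      end_(start),
      done_(pending_.empty()) {}

  // Call when an agent reregisters or is marked unreachable; either
  // outcome counts as "accounted for".
  void accountFor(const std::string& agentId, Clock::time_point now) {
    if (done_ || pending_.erase(agentId) == 0) {
      return;  // Already finished, or not one of the recovered agents.
    }
    if (pending_.empty()) {
      end_ = now;
      done_ = true;
    }
  }

  // True once no recovered agent is still pending.
  bool done() const { return done_; }

  // Failover time; meaningful only once done() is true.
  std::chrono::milliseconds failoverTime() const {
    return std::chrono::duration_cast<std::chrono::milliseconds>(end_ - start_);
  }

private:
  std::unordered_set<std::string> pending_;
  Clock::time_point start_;
  Clock::time_point end_;
  bool done_;
};
```

Since agents that never return are marked unreachable after the timeout, the pending set always drains, so the measured time is bounded by that timeout.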


