[
https://issues.apache.org/jira/browse/MESOS-3423?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Mahler updated MESOS-3423:
-----------------------------------
Shepherd: Jie Yu
Sprint: Twitter Mesos Q3 Sprint 5
Affects Version/s: 0.24.0
Target Version/s: 0.25.0
Labels: twitter (was: )
Description:
Currently the perf event isolator times out a sample once a fixed 2 seconds
has elapsed beyond the sample duration:
{code}
Duration timeout = flags.perf_duration + Seconds(2);
{code}
This should be based on the reap interval maximum.
Also, the code stops sampling altogether when a single timeout occurs. We've
observed timeouts during normal operation, so it would be better for the
isolator to continue performing perf sampling in the case of timeouts. It may
also make sense to continue sampling in the case of errors, since these may be
transient.
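To make the proposal concrete, here is a minimal standalone sketch of the intended behavior. It is not the isolator's actual code: `kMaxReapInterval`, `sampleTimeout`, `runSampling`, and the `sample` callback are hypothetical names, the 1 second value is only a placeholder for the reap interval maximum, and `std::chrono` stands in for the real duration types.

```cpp
#include <chrono>
#include <functional>

using Duration = std::chrono::milliseconds;

// Hypothetical stand-in for the reap interval maximum (placeholder value).
constexpr Duration kMaxReapInterval = std::chrono::seconds(1);

enum class SampleResult { Ok, Timeout, Error };

// Derive the timeout from the sample duration plus the reap interval
// maximum, instead of adding a hard-coded 2 seconds.
inline Duration sampleTimeout(Duration perfDuration) {
  return perfDuration + kMaxReapInterval;
}

struct Stats {
  int ok = 0;
  int timeouts = 0;
  int errors = 0;
};

// Run `rounds` sampling attempts. `sample` is a hypothetical callback that
// performs one perf sample under the given timeout and reports the outcome.
inline Stats runSampling(
    int rounds,
    Duration perfDuration,
    const std::function<SampleResult(Duration)>& sample) {
  Stats stats;
  const Duration timeout = sampleTimeout(perfDuration);

  for (int i = 0; i < rounds; ++i) {
    switch (sample(timeout)) {
      case SampleResult::Ok:
        ++stats.ok;
        break;
      case SampleResult::Timeout:
        // Record the timeout but keep sampling on the next round.
        ++stats.timeouts;
        break;
      case SampleResult::Error:
        // Errors may be transient, so keep sampling here as well.
        ++stats.errors;
        break;
    }
  }

  return stats;
}
```

The point of the sketch is that a timeout or error only affects the round in which it occurs; the loop never exits early because of one.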
was:[~jieyu] can you fill in the details here?
Component/s: slave, isolation
Summary: Perf event isolator stops performing sampling if a
single timeout occurs. (was: perf sampling stops after a timeout occurs)
> Perf event isolator stops performing sampling if a single timeout occurs.
> -------------------------------------------------------------------------
>
> Key: MESOS-3423
> URL: https://issues.apache.org/jira/browse/MESOS-3423
> Project: Mesos
> Issue Type: Bug
> Components: isolation, slave
> Affects Versions: 0.24.0
> Reporter: Vinod Kone
> Assignee: Cong Wang
> Labels: twitter