[ 
https://issues.apache.org/jira/browse/MESOS-4740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Robinson updated MESOS-4740:
----------------------------------
    Description: 
[~drobinson] noticed that retrieving metrics/snapshot statistics can be very 
inefficient.

{noformat}
[user@server ~]$ time curl -s localhost:5050/metrics/snapshot

real    0m35.654s
user    0m0.019s
sys     0m0.011s
{noformat}

MESOS-1287 introduced a timeout parameter for this query, but metric 
collectors like ours are not aware of such a URL-specific parameter, so we 
need to:

1) Always apply a timeout, with a sensible default value.

2) Investigate why metrics/snapshot can take so long to complete under load, 
since we don't use history for these statistics and each value should be just 
an atomic read.
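
Until a default timeout exists on the server side, a collector can bound the 
wait itself. The following is only a minimal sketch, not part of Mesos: the 
helper names snapshot_url and fetch_snapshot are hypothetical, and it assumes 
the timeout query parameter from MESOS-1287 takes a duration string such as 
"5secs". It builds the URL, applies a client-side timeout, and reports how 
long the request took.

```python
import http.client
import json
import time
import urllib.error
import urllib.request
from urllib.parse import urlencode


def snapshot_url(host="localhost", port=5050, server_timeout=None):
    """Build the metrics/snapshot URL, optionally with a timeout parameter."""
    url = f"http://{host}:{port}/metrics/snapshot"
    if server_timeout is not None:
        # Assumption: the endpoint accepts a duration string such as "5secs".
        url += "?" + urlencode({"timeout": server_timeout})
    return url


def fetch_snapshot(url, client_timeout=10.0):
    """Return (metrics, elapsed_seconds); metrics is None on failure or timeout."""
    start = time.monotonic()
    try:
        # The client-side timeout bounds the wait even if the server ignores
        # (or is never given) the timeout query parameter.
        with urllib.request.urlopen(url, timeout=client_timeout) as resp:
            metrics = json.loads(resp.read().decode("utf-8"))
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        metrics = None
    return metrics, time.monotonic() - start
```

For example, fetch_snapshot(snapshot_url(server_timeout="5secs"), 10.0) gives 
up after roughly ten seconds per blocking network operation instead of 
hanging for the 35 seconds measured above, and the returned elapsed time can 
be logged to track how slow the endpoint is under load.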


  was:
David Robinson noticed that retrieving metrics/snapshot statistics could be 
very inefficient and cause the Mesos master to get stuck.

{noformat}
[root@atla-bny-34-sr1 ~]# time curl -s localhost:5051/metrics/snapshot

real    2m7.302s
user    0m0.001s
sys     0m0.004s
{noformat}

MESOS-1287 introduced a timeout parameter for this query, but observers like 
ours are not aware of such a URL-specific parameter, so we need to:

1) Always apply a timeout, with a sensible default value.

2) Investigate why metrics/snapshot can take so long to complete under load, 
since we don't use history for these statistics and each value should be just 
an atomic read.



> Improve metrics/snapshot performance
> ------------------------------------
>
>                 Key: MESOS-4740
>                 URL: https://issues.apache.org/jira/browse/MESOS-4740
>             Project: Mesos
>          Issue Type: Task
>            Reporter: Cong Wang
>            Assignee: Cong Wang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
