[
https://issues.apache.org/jira/browse/STORM-3099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Zhengdai Hu updated STORM-3099:
-------------------------------
Summary: Extend metrics on supervisor, workers, and DRPC (was: Extend
metrics on supervisor and workers)
> Extend metrics on supervisor, workers, and DRPC
> -----------------------------------------------
>
> Key: STORM-3099
> URL: https://issues.apache.org/jira/browse/STORM-3099
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-server
> Affects Versions: 2.0.0
> Reporter: Zhengdai Hu
> Assignee: Zhengdai Hu
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> This patch extends the metrics reported by the supervisor and workers. The
> following metrics are currently being implemented; the list is not exhaustive:
> Worker:
> # Kill Count by Category - Assignment Change/Heartbeat (HB) too old/Heap Space
> # Time spent in each state
> # Time to actually kill a worker (from the supervisor identifying the need
> to the actual change in the state of the worker) - per worker?
> # Time to start a worker for a topology, measured from the first time the
> assignment is read.
> # Worker cleanup Time/Worker cleanup Retries
> # Worker Suicide Count - category: internal error or Assignment Change
> Supervisor:
> # Supervisor restart Count
> # Blobstore (Request to download time)
> - # Download time of an individual blob (inside localizer): from the
> localizer getting the request to the actual HDFS download request finishing
> - # Download rate of an individual blob (inside localizer)
> - # Supervisor localizer thread blob download duration (outside
> localizer)
> # Blobstore update counts due to version change
> # Blobstore Storage by users
> There might be more metrics added later.
> This patch will also refactor code in relevant files. Bugs found during the
> process will be reported in other issues and handled separately.
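
The two worker metrics at the top of the list (kill count by category, time spent in each state) could be recorded roughly as follows. This is only a minimal, language-neutral sketch in Python, not Storm's Java code: the WorkerMetrics class, its method names, the state names, and the category string are all hypothetical and chosen for illustration. The clock is injected so the timing logic can be checked deterministically.

```python
from collections import Counter, defaultdict


class WorkerMetrics:
    """Hypothetical in-memory recorder; not part of Storm's API."""

    def __init__(self, clock):
        self._clock = clock                       # callable returning seconds
        self.kill_counts = Counter()              # category -> number of kills
        self.time_in_state = defaultdict(float)   # state -> total seconds spent
        self._state = None
        self._entered_at = None

    def record_kill(self, category):
        # One counter per kill category, e.g. "assignment_change".
        self.kill_counts[category] += 1

    def transition(self, new_state):
        # Charge the elapsed time to the state being left, then switch.
        now = self._clock()
        if self._state is not None:
            self.time_in_state[self._state] += now - self._entered_at
        self._state = new_state
        self._entered_at = now


# Usage with a fake clock that returns fixed timestamps (assumed values).
ticks = iter([0.0, 2.5, 4.0])
metrics = WorkerMetrics(clock=lambda: next(ticks))
metrics.transition("starting")   # t = 0.0
metrics.transition("running")    # t = 2.5
metrics.record_kill("assignment_change")
metrics.transition("killed")     # t = 4.0
```

In a real implementation a metrics library would normally supply the counters and timers; the sketch only shows which events need to be captured for each metric.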
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)