[
https://issues.apache.org/jira/browse/HDDS-5401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-5401:
---------------------------------
Labels: pull-request-available (was: )
> Add more metrics to ReplicationManager to help monitor replication progress
> ---------------------------------------------------------------------------
>
> Key: HDDS-5401
> URL: https://issues.apache.org/jira/browse/HDDS-5401
> Project: Apache Ozone
> Issue Type: Improvement
> Reporter: Mark Gui
> Assignee: Mark Gui
> Priority: Major
> Labels: pull-request-available
>
> Currently the SCM ReplicationManager exposes only two metrics:
> inflightReplication and inflightDeletion.
> We could add more metrics to better monitor replication progress (e.g. via
> Prometheus).
> With them we could also estimate the time needed to complete the whole
> replication.
> Some proposed metrics:
> * number of replicate/delete cmds sent
> * number of replicate/delete cmds completed
> * number of replicate/delete cmds timed out
> These metrics will be refreshed in each replication round (300s by default),
> so we could calculate how many replicate/delete commands completed between two
> successive rounds and how many are still in flight, and from that estimate
> how much more time is needed.
> Two more metrics would make the estimate more accurate, since closed
> containers can differ in size:
> * total number of bytes to replicate
> * number of bytes replicated so far
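
The estimation described above could be sketched roughly as follows. This is
only an illustration, not code from Ozone: the RoundSnapshot type, its field
names, and both functions are hypothetical, and it assumes the proposed
metrics are cumulative counters sampled once per replication round.

```python
from dataclasses import dataclass
from typing import Optional

# Default replication round interval mentioned in the issue (seconds).
ROUND_INTERVAL_SECONDS = 300


@dataclass
class RoundSnapshot:
    """Hypothetical sample of the proposed metrics taken at the end of a round.

    Field names are illustrative only; they are not actual Ozone metric names.
    All values are assumed to be cumulative counters.
    """
    cmds_sent: int
    cmds_completed: int
    cmds_timeout: int
    bytes_total: int
    bytes_completed: int


def eta_by_commands(prev: RoundSnapshot, curr: RoundSnapshot,
                    interval: float = ROUND_INTERVAL_SECONDS) -> Optional[float]:
    """Estimate remaining seconds from the command completion rate.

    Returns None when nothing completed between the two rounds, because
    no rate can be derived in that case.
    """
    completed_in_round = curr.cmds_completed - prev.cmds_completed
    if completed_in_round <= 0:
        return None
    # Commands that were sent but have neither completed nor timed out.
    in_flight = max(curr.cmds_sent - curr.cmds_completed - curr.cmds_timeout, 0)
    return in_flight * interval / completed_in_round


def eta_by_bytes(prev: RoundSnapshot, curr: RoundSnapshot,
                 interval: float = ROUND_INTERVAL_SECONDS) -> Optional[float]:
    """Estimate remaining seconds from the byte completion rate.

    This weights containers by size instead of treating every command
    as equal work.
    """
    bytes_in_round = curr.bytes_completed - prev.bytes_completed
    if bytes_in_round <= 0:
        return None
    remaining = max(curr.bytes_total - curr.bytes_completed, 0)
    return remaining * interval / bytes_in_round
```

Given snapshots from two successive rounds, eta_by_commands(prev, curr) treats
every command as equal work, while eta_by_bytes(prev, curr) accounts for
containers of different sizes. Both are linear extrapolations from a single
round, so the result is a rough estimate at best.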
--
This message was sent by Atlassian Jira
(v8.3.4#803005)