[ 
https://issues.apache.org/jira/browse/HDDS-6744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-6744:
---------------------------------
    Labels: pull-request-available  (was: )

> EC: ReplicationManager - create ContainerReplicaPendingOps class and 
> integrate with ContainerManager
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-6744
>                 URL: https://issues.apache.org/jira/browse/HDDS-6744
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>
> The legacy replication manager internally keeps a list of all pending 
> replications and deletes. Each time a container is checked, it checks this 
> list and removes any replications that have completed or expired. Then 
> it gets the list of remaining pending operations to help decide whether the 
> container is healthy.
> Rather than the ReplicationManager removing the completed and expired 
> replications, we could have a standalone PendingContainerOps monitor that 
> works as follows:
> 1. Replication Manager adds pending replications and deletes to it.
> 2. Replication Manager queries it for anything pending for the current 
> container and gets a list of PendingActions back.
> 3. The PendingReplicationMonitor has its own internal thread that checks for 
> expired replications and removes them.
> 4. Completed replications and deletes are removed in ContainerManagerImpl, 
> which has add and removeContainer triggered via the container reports (ICR 
> and FCR) from the datanodes as they are replicated.
> This way, the ReplicationManager does not need to worry about expiring 
> replications or removing completed entries. We also get the ability to have a 
> more up-to-date view of the system, as the ICR / FCRs will keep the pending 
> table up-to-date in real time, rather than having to wait for the container 
> to be re-checked inside the replication manager.
> We can have a fairly simple "ContainerReplicaPendingOps" class that is 
> basically standalone and inject it into ReplicationManager and 
> ContainerManagerImpl. This would allow for removing some complexity from RM 
> and let the expiry and completion be tested in an isolated way.
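
The four steps in the description could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the pull request: the class name PendingOpsSketch, its method names, and the use of plain strings for container and datanode IDs are all hypothetical. The current time is passed in as a parameter so that expiry can be tested without a real clock or background thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a standalone pending-ops table; not the Ozone API. */
class PendingOpsSketch {
  enum OpType { ADD, DELETE }

  /** One pending replication (ADD) or delete for a replica on a datanode. */
  static final class PendingOp {
    final OpType type;
    final String datanode;
    final long scheduledAtMillis;

    PendingOp(OpType type, String datanode, long scheduledAtMillis) {
      this.type = type;
      this.datanode = datanode;
      this.scheduledAtMillis = scheduledAtMillis;
    }
  }

  // Container ID -> pending ops. Lists are only touched inside compute*
  // callbacks, which ConcurrentHashMap runs atomically per key.
  private final Map<String, List<PendingOp>> pending = new ConcurrentHashMap<>();

  /** Step 1: the replication manager records a pending add or delete. */
  void schedule(String containerId, OpType type, String datanode, long nowMillis) {
    pending.compute(containerId, (id, ops) -> {
      List<PendingOp> updated = (ops == null) ? new ArrayList<>() : ops;
      updated.add(new PendingOp(type, datanode, nowMillis));
      return updated;
    });
  }

  /** Step 2: the replication manager asks what is pending for a container. */
  List<PendingOp> getPending(String containerId) {
    List<PendingOp> snapshot = new ArrayList<>();
    pending.computeIfPresent(containerId, (id, ops) -> {
      snapshot.addAll(ops); // copy so callers never see later mutations
      return ops;
    });
    return snapshot;
  }

  /** Step 4: called when a container report confirms the op has happened. */
  boolean complete(String containerId, OpType type, String datanode) {
    boolean[] removed = {false};
    pending.computeIfPresent(containerId, (id, ops) -> {
      removed[0] = ops.removeIf(
          op -> op.type == type && op.datanode.equals(datanode));
      return ops.isEmpty() ? null : ops; // returning null drops the map entry
    });
    return removed[0];
  }

  /** Step 3: drop every op older than the timeout. */
  void removeExpired(long nowMillis, long timeoutMillis) {
    for (String id : pending.keySet()) {
      pending.computeIfPresent(id, (key, ops) -> {
        ops.removeIf(op -> nowMillis - op.scheduledAtMillis > timeoutMillis);
        return ops.isEmpty() ? null : ops;
      });
    }
  }
}
```

In this sketch the internal expiry thread from step 3 is left out; one option would be a scheduled executor that periodically calls removeExpired with the current time. Keeping the clock outside the class is what allows expiry and completion to be exercised in isolation, as the description suggests.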



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

