[ 
https://issues.apache.org/jira/browse/HDDS-13437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HDDS-13437:
----------------------------------
    Labels: pull-request-available  (was: )

> Avoid scheduling replications on full datanodes by tracking pending op size 
> in SCM
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-13437
>                 URL: https://issues.apache.org/jira/browse/HDDS-13437
>             Project: Apache Ozone
>          Issue Type: Sub-task
>          Components: SCM
>            Reporter: Siddhant Sangwan
>            Assignee: Siddhant Sangwan
>            Priority: Major
>              Labels: pull-request-available
>
> SCM schedules replication commands to fix under-replication and
> mis-replication, and for container moves, decommissioning, etc., for both
> Ratis and EC containers. Before selecting a Datanode as the target (the node
> that the container will be replicated to), it checks whether that Datanode
> has 2 * containerSize of free space. However, it does not track the size of
> the data it has already scheduled that is still in flight. This means it can
> over-schedule replications to a target Datanode that does not have enough
> space.
> For example, if the remaining space on a Datanode is 30 GB and the minimum
> free space is 20 GB, it can accept only 1 more container (2 * 5 GB, see
> https://issues.apache.org/jira/browse/HDDS-12426). Currently, the replication
> manager will keep selecting it as a target for multiple containers.
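
The check described above can be sketched roughly as follows. This is only an illustration of the idea, not the code in the pull request: the class PendingOpSizeTracker and its methods trySchedule, complete and pending are hypothetical names, and it assumes the pending size charged per scheduled replication is one container size.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch only; this is not Ozone's ReplicationManager API. */
class PendingOpSizeTracker {
  static final long GB = 1024L * 1024 * 1024;

  private final long minFreeSpace;
  // Bytes scheduled to each datanode that have not completed yet.
  private final Map<String, Long> pendingBytes = new HashMap<>();

  PendingOpSizeTracker(long minFreeSpace) {
    this.minFreeSpace = minFreeSpace;
  }

  /**
   * Returns true and records the pending size if the datanode still has
   * 2 * containerSize available after subtracting the minimum free space
   * and the data already in flight to it. Returns false otherwise.
   */
  synchronized boolean trySchedule(String datanode, long remainingSpace,
      long containerSize) {
    long pending = pendingBytes.getOrDefault(datanode, 0L);
    long usable = remainingSpace - minFreeSpace - pending;
    if (usable < 2 * containerSize) {
      return false;
    }
    pendingBytes.merge(datanode, containerSize, Long::sum);
    return true;
  }

  /** Releases the pending size once a replication completes or expires. */
  synchronized void complete(String datanode, long containerSize) {
    long left = pendingBytes.getOrDefault(datanode, 0L) - containerSize;
    if (left > 0) {
      pendingBytes.put(datanode, left);
    } else {
      pendingBytes.remove(datanode);
    }
  }

  /** Bytes currently recorded as in flight to the datanode. */
  synchronized long pending(String datanode) {
    return pendingBytes.getOrDefault(datanode, 0L);
  }
}
```

With the numbers from the example (30 GB remaining, 20 GB minimum free space, 5 GB containers), the first trySchedule call for a datanode succeeds and the second is rejected, because only 5 GB of usable space is left once the first replication is counted as pending.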



