[ 
https://issues.apache.org/jira/browse/HDDS-12468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Siddhant Sangwan reassigned HDDS-12468:
---------------------------------------

    Assignee: Siddhant Sangwan  (was: Chu Cheng Li)

> Check for space availability for all dns while container creation in pipeline
> -----------------------------------------------------------------------------
>
>                 Key: HDDS-12468
>                 URL: https://issues.apache.org/jira/browse/HDDS-12468
>             Project: Apache Ozone
>          Issue Type: Sub-task
>            Reporter: Sumit Agrawal
>            Assignee: Siddhant Sangwan
>            Priority: Major
>
> At SCM for Ratis during allocateBlock,
>  # Pipeline is chosen randomly
>  # Container is chosen round robin with the required size
>  # If a matching container is not found
>  ## Create a new container and return it
>  # Block is assigned to the container and returned in the response
>  
> Later, container creation can fail at the DN, with the negative impact 
> described below.
> The issue here is:
>  * If the Leader node in the pipeline does not have capacity to create a new 
> container, it returns a container creation failure
>  * If a Follower node does not have capacity to create a new container, it 
> will fail and keep retrying (if another follower succeeds)
>  * This can lead to the disk getting full from parallel block writes via the 
> state machine, which slows down writes and causes failure responses
>  
> It has been observed that writes on a follower node get stuck due to disk 
> full / volume failure.
>  
> As a solution,
>  * In this situation, SCM should trigger pipeline closure (including 
> container closure) with a cool down time
>  * SCM should choose another pipeline for block allocation
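
The proposal above could be sketched roughly as follows. This is only an
illustration under stated assumptions, not Ozone code: PipelineSpaceCheck,
Datanode, Pipeline, allMembersHaveSpace and choosePipeline are hypothetical
names, free space is modelled as a single number per datanode, and candidates
are scanned in order rather than chosen randomly. The actual pipeline and
container closure is only marked by a comment.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; none of these types are real Ozone classes. */
class PipelineSpaceCheck {

  /** Stand-in for the free space a datanode reports to SCM. */
  static final class Datanode {
    final String id;
    final long freeBytes;

    Datanode(String id, long freeBytes) {
      this.id = id;
      this.freeBytes = freeBytes;
    }
  }

  /** Stand-in for a Ratis pipeline: leader and followers together. */
  static final class Pipeline {
    final String id;
    final List<Datanode> members;

    Pipeline(String id, List<Datanode> members) {
      this.id = id;
      this.members = members;
    }
  }

  private final long containerSizeBytes;
  private final Duration coolDown;
  // Pipeline id -> instant until which the pipeline is skipped for allocation.
  private final Map<String, Instant> excludedUntil = new HashMap<>();

  PipelineSpaceCheck(long containerSizeBytes, Duration coolDown) {
    this.containerSizeBytes = containerSizeBytes;
    this.coolDown = coolDown;
  }

  /** True only if every member, leader and followers, can hold a new container. */
  boolean allMembersHaveSpace(Pipeline pipeline) {
    return pipeline.members.stream()
        .allMatch(dn -> dn.freeBytes >= containerSizeBytes);
  }

  /** Returns the first usable pipeline; ones lacking space enter cool down. */
  Optional<Pipeline> choosePipeline(List<Pipeline> candidates, Instant now) {
    for (Pipeline pipeline : candidates) {
      Instant until = excludedUntil.get(pipeline.id);
      if (until != null && now.isBefore(until)) {
        continue; // still cooling down, try another pipeline
      }
      if (allMembersHaveSpace(pipeline)) {
        excludedUntil.remove(pipeline.id);
        return Optional.of(pipeline);
      }
      // Assumption: this is where SCM would trigger pipeline and container
      // closure. Here we only record the cool down window.
      excludedUntil.put(pipeline.id, now.plus(coolDown));
    }
    return Optional.empty();
  }
}
```

Checking all members before creating the container means a full follower is
detected at allocation time instead of surfacing later as a stuck write, and
the cool down keeps the same pipeline from being retried immediately.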



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
