Siddhant Sangwan created HDDS-12658:
---------------------------------------

             Summary: Verify datanode space availability when selecting 
pipeline to write to.
                 Key: HDDS-12658
                 URL: https://issues.apache.org/jira/browse/HDDS-12658
             Project: Apache Ozone
          Issue Type: Sub-task
          Components: SCM
            Reporter: Siddhant Sangwan


When writing in Ozone, SCM selects a container to write to. We have a 
PipelineChoosePolicy interface that's used to select which pipeline the write 
will go to. The default implementation picks a pipeline randomly. However, it 
doesn't check whether the Datanodes in this pipeline have enough disk space 
available. If a full pipeline is handed over to the client, writes can fail 
and write performance can drop.

A check later in that code path does verify that the chosen container has 
sufficient space:
{code}
containerInfo.getUsedBytes() + size <= this.containerSize
{code}
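But this is not sufficient: it only compares the container's used bytes against 
the container size, and says nothing about the free disk space on the Datanodes.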

I propose adding some common logic to only pick a pipeline if all the Datanodes 
in that pipeline have sufficient available space. All implementations of 
PipelineChoosePolicy should use it. I still need to check how this area handles 
parallel writes and other synchronization/race details.
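
A minimal sketch of what such a shared check might look like, in Java. All names here (Node, SimplePipeline, PipelineSpaceFilter, hasEnoughSpace, filter) are hypothetical stand-ins and not the real Ozone classes, and the required space per Datanode is assumed to be supplied by the caller. The sketch ignores the parallel-write and race questions mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Datanode's reported storage; not an Ozone class.
final class Node {
  final String id;
  final long availableBytes;

  Node(String id, long availableBytes) {
    this.id = id;
    this.availableBytes = availableBytes;
  }
}

// Hypothetical stand-in for a pipeline: just the list of its member Datanodes.
final class SimplePipeline {
  final String id;
  final List<Node> nodes;

  SimplePipeline(String id, List<Node> nodes) {
    this.id = id;
    this.nodes = nodes;
  }
}

final class PipelineSpaceFilter {
  private PipelineSpaceFilter() { }

  // True only if every Datanode in the pipeline can hold requiredBytes.
  static boolean hasEnoughSpace(SimplePipeline pipeline, long requiredBytes) {
    for (Node node : pipeline.nodes) {
      if (node.availableBytes < requiredBytes) {
        return false;
      }
    }
    return true;
  }

  // Keeps only the pipelines that pass the check, preserving their order.
  // A choose policy would then pick from the returned list.
  static List<SimplePipeline> filter(List<SimplePipeline> candidates,
                                     long requiredBytes) {
    List<SimplePipeline> eligible = new ArrayList<>();
    for (SimplePipeline p : candidates) {
      if (hasEnoughSpace(p, requiredBytes)) {
        eligible.add(p);
      }
    }
    return eligible;
  }
}
```

A single Datanode below the threshold is enough to exclude the whole pipeline, which matches the "all the Datanodes" wording of the proposal. What value to use for requiredBytes is left open here.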

Note that we also have another implementation of PipelineChoosePolicy, 
CapacityPipelineChoosePolicy, which is described as:
{code}
Pipeline choose policy that randomly choose pipeline with relatively lower 
utilization.
{code}

I think all policies, including this one, will benefit from having the proposed 
check.
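
One way every policy could share the check is to wrap it, so the underlying policy only ever sees pipelines that passed. The Java sketch below assumes a simplified, hypothetical ChoosePolicy interface and a caller-supplied predicate; the real PipelineChoosePolicy signature may differ, and SpaceCheckedPolicy is an illustrative name only.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical, simplified policy interface; not the actual Ozone interface.
interface ChoosePolicy<P> {
  // Returns the chosen candidate, or empty if there is nothing to choose from.
  Optional<P> choose(List<P> candidates);
}

// Wraps any policy so that it only chooses among candidates
// that pass the supplied space check.
final class SpaceCheckedPolicy<P> implements ChoosePolicy<P> {
  private final ChoosePolicy<P> delegate;
  private final Predicate<P> hasEnoughSpace;

  SpaceCheckedPolicy(ChoosePolicy<P> delegate, Predicate<P> hasEnoughSpace) {
    this.delegate = delegate;
    this.hasEnoughSpace = hasEnoughSpace;
  }

  @Override
  public Optional<P> choose(List<P> candidates) {
    // Drop candidates without enough space, then let the wrapped policy decide.
    List<P> eligible = candidates.stream()
        .filter(hasEnoughSpace)
        .collect(Collectors.toList());
    return delegate.choose(eligible);
  }
}
```

With this shape, a random policy and a capacity-based policy would both get the check without duplicating it. The wrapped policy has to cope with an empty candidate list when no pipeline qualifies.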



