[
https://issues.apache.org/jira/browse/HDDS-9151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen O'Donnell updated HDDS-9151:
------------------------------------
Description:
In testing we have found an issue in the ECWritableContainerProvider.
For EC, a pipeline is used for only one container; when the container gets
closed, the pipeline also gets closed. At the moment, the only place in the
code which closes the EC pipelines which no longer have an open container is
inside the ECWritableContainerProvider. It first gets the list of open pipelines
and enforces the pipeline limit, then for all open pipelines, it tries to find
one the client can use.
If the client has had problems writing to the pipelines (e.g. it was given a
container/pipeline and then the write failed as the container was closed on the
DN), the pipelines get added to the exclude list. Then we can get into a
situation where many pipelines need to be closed on the write path, slowing
down block allocation.
Ideally, when a container transitions to CLOSING in SCM, if the container is an
EC container, we should also close the associated pipeline to avoid it counting
toward the limit and to avoid needing to close it during the write (block
allocation) path.
This could be achieved relatively simply inside the
PipelineManagerImpl.removeContainersFromPipeline() method which is called as
soon as the container transitions to CLOSING via
ContainerStateManagerImpl.updateContainerState() when it executes the
containerStateChangeActions. Wrapping the container close and pipeline close in
a lock inside PipelineManagerImpl ensures we have a consistent "EC container
close" flow, and it should avoid the ECWritableContainerProvider needing to
close the pipelines internally. However, we can leave that code in place in
ECWritableContainerProvider in case some pipelines slip through somehow.
was:
In testing we have found an issue in the ECWritableContainerProvider.
For EC, a pipeline is used for only one container; when the container gets
closed, the pipeline also gets closed. However, closing the container can take
some time. First it is marked as CLOSING in SCM, then SCM sends commands to the
DNs to close it, and finally the container gets CLOSED.
As we limit the number of pipelines in SCM, containers in a CLOSING state mean
there are containers/pipelines which are effectively closed, but the pipelines
are still counted toward the limit.
Ideally, when a container transitions to CLOSING in SCM, if the container is an
EC container, we should also close the associated pipeline to avoid it counting
toward the limit.
This could be achieved relatively simply inside the
PipelineManagerImpl.removeContainersFromPipeline() method which is called as
soon as the container transitions to CLOSING via
ContainerStateManagerImpl.updateContainerState() when it executes the
containerStateChangeActions.
> Close EC Pipeline when container transitions to closing
> -------------------------------------------------------
>
> Key: HDDS-9151
> URL: https://issues.apache.org/jira/browse/HDDS-9151
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Stephen O'Donnell
> Assignee: Stephen O'Donnell
> Priority: Major
>
> In testing we have found an issue in the ECWritableContainerProvider.
> For EC, a pipeline is used for only one container; when the container gets
> closed, the pipeline also gets closed. At the moment, the only place in the
> code which closes the EC pipelines which no longer have an open container is
> inside the ECWritableContainerProvider. It first gets the list of open
> pipelines and enforces the pipeline limit, then for all open pipelines, it
> tries to find one the client can use.
> If the client has had problems writing to the pipelines (e.g. it was given a
> container/pipeline and then the write failed as the container was closed on
> the DN), the pipelines get added to the exclude list. Then we can get into a
> situation where many pipelines need to be closed on the write path, slowing
> down block allocation.
> Ideally, when a container transitions to CLOSING in SCM, if the container is
> an EC container, we should also close the associated pipeline to avoid it
> counting toward the limit and to avoid needing to close it during the write
> (block allocation) path.
> This could be achieved relatively simply inside the
> PipelineManagerImpl.removeContainersFromPipeline() method which is called as
> soon as the container transitions to CLOSING via
> ContainerStateManagerImpl.updateContainerState() when it executes the
> containerStateChangeActions. Wrapping the container close and pipeline close
> in a lock inside PipelineManagerImpl ensures we have a consistent "EC
> container close" flow, and it should avoid the ECWritableContainerProvider
> needing to close the pipelines internally. However, we can leave that code in
> place in ECWritableContainerProvider in case some pipelines slip through
> somehow.
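
The proposed flow can be illustrated with a minimal, self-contained sketch.
This is not the actual Ozone code: EcPipelineCloseSketch, SketchPipelineManager,
Pipeline and onContainerClosing() are hypothetical stand-ins for
PipelineManagerImpl and removeContainersFromPipeline(), and the state model is
reduced to OPEN/CLOSED. It only shows the idea of removing the container from
its pipeline and closing an EC pipeline under the same lock, so the pipeline
stops counting toward the limit as soon as the container goes to CLOSING.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified model; none of these types are real Ozone classes. */
class EcPipelineCloseSketch {
  enum ReplicationType { RATIS, EC }
  enum PipelineState { OPEN, CLOSED }

  static final class Pipeline {
    final String id;
    final ReplicationType type;
    PipelineState state = PipelineState.OPEN;

    Pipeline(String id, ReplicationType type) {
      this.id = id;
      this.type = type;
    }
  }

  /** Stand-in for the pipeline manager; assumed names and structure. */
  static final class SketchPipelineManager {
    private final ReentrantLock lock = new ReentrantLock();
    private final Map<String, Pipeline> pipelines = new HashMap<>();
    private final Map<Long, String> containerToPipeline = new HashMap<>();

    void addContainer(long containerId, Pipeline pipeline) {
      lock.lock();
      try {
        pipelines.put(pipeline.id, pipeline);
        containerToPipeline.put(containerId, pipeline.id);
      } finally {
        lock.unlock();
      }
    }

    /** Assumed to be invoked when a container transitions to CLOSING. */
    void onContainerClosing(long containerId) {
      lock.lock();
      try {
        // Detach the container from its pipeline.
        String pipelineId = containerToPipeline.remove(containerId);
        if (pipelineId == null) {
          return;
        }
        Pipeline pipeline = pipelines.get(pipelineId);
        // An EC pipeline serves exactly one container, so close it in the
        // same critical section instead of leaving it for the write path.
        if (pipeline != null && pipeline.type == ReplicationType.EC) {
          pipeline.state = PipelineState.CLOSED;
        }
      } finally {
        lock.unlock();
      }
    }

    /** Number of pipelines that would still count toward the limit. */
    long openPipelineCount() {
      lock.lock();
      try {
        return pipelines.values().stream()
            .filter(p -> p.state == PipelineState.OPEN)
            .count();
      } finally {
        lock.unlock();
      }
    }
  }

  public static void main(String[] args) {
    SketchPipelineManager manager = new SketchPipelineManager();
    manager.addContainer(1L, new Pipeline("ec-1", ReplicationType.EC));
    manager.addContainer(2L, new Pipeline("ratis-1", ReplicationType.RATIS));

    manager.onContainerClosing(1L); // closes the EC pipeline as well
    manager.onContainerClosing(2L); // the Ratis pipeline stays open

    System.out.println("open pipelines: " + manager.openPipelineCount());
  }
}
```

In this sketch the write path never sees an open EC pipeline whose only
container is already closing, which is the situation the description aims to
avoid. A fallback cleanup in the provider, as suggested above, would still be
compatible with it.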