[
https://issues.apache.org/jira/browse/HDDS-6598?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Gui updated HDDS-6598:
---------------------------
Description:
After stressing a cluster for several days, we found that there are a lot of
CLOSED EC pipelines.
{code:java}
[ozoneadmin@TENCENT64 ~/ozone-1.3.0-SNAPSHOT]$ ./bin/ozone admin pipeline list --state=CLOSED | wc -l
997
{code}
This makes commands such as `ozone admin datanode list` and `ozone admin
pipeline list` return slowly, and it potentially adds unnecessary load to SCM
HA, so these CLOSED EC pipelines should be cleaned up properly.
Several ways to consider:
# We close pipelines in `WritableECContainerProvider` by calling
`pipelineManager.closePipeline(pipeline, true);`, where `true` means the
pipeline record is not removed until a timeout expires. But the removal
actually only happens for Ratis pipelines, in `BackgroundPipelineCreator` when
it calls `pipelineManager.scrubPipeline(replicationConfig);`. We could change
the argument to `false`, so that the records of selected pipelines are removed
as soon as they are closed, but the CLOSED records of unselected pipelines
would still be left behind.
# We could try to close the pipeline after the container close event from the
DN is received. But container close follows a lifecycle like OPEN -> CLOSING ->
QUASI_CLOSED -> CLOSED, so I think it would be tricky to hook a pipeline close
action after an EC container is closed.
# We could have a dedicated background thread that runs periodically to
clean up the CLOSED pipelines in a batch. This also benefits SCM HA compared to
solution 1, since we tend to do batch cleanups instead of removing records one
by one.
I think we could choose solution 3 to solve this problem.
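A minimal sketch of what solution 3 could look like, only to illustrate the batch idea. Everything in it (`ClosedPipelineScrubber`, `PipelineRecord`, `scrubOnce`, the retention period) is a hypothetical stand-in and not an existing Ozone class or API; the in-memory map stands in for SCM's pipeline records.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for SCM pipeline bookkeeping; not an Ozone API.
class ClosedPipelineScrubber {
  enum State { OPEN, CLOSED }

  static final class PipelineRecord {
    final String id;
    final State state;
    final Instant stateEnteredAt;

    PipelineRecord(String id, State state, Instant stateEnteredAt) {
      this.id = id;
      this.state = state;
      this.stateEnteredAt = stateEnteredAt;
    }
  }

  private final Map<String, PipelineRecord> records = new ConcurrentHashMap<>();
  private final Duration retention;

  ClosedPipelineScrubber(Duration retention) {
    this.retention = retention;
  }

  void put(PipelineRecord record) {
    records.put(record.id, record);
  }

  int size() {
    return records.size();
  }

  // One pass: collect every CLOSED record that has been closed for at least
  // the retention period, then remove the whole batch together.
  List<String> scrubOnce(Instant now) {
    List<String> expired = new ArrayList<>();
    for (PipelineRecord r : records.values()) {
      boolean oldEnough =
          Duration.between(r.stateEnteredAt, now).compareTo(retention) >= 0;
      if (r.state == State.CLOSED && oldEnough) {
        expired.add(r.id);
      }
    }
    // Assumption: in a real SCM this would be a single replicated batch
    // operation rather than one HA round trip per pipeline.
    for (String id : expired) {
      records.remove(id);
    }
    return expired;
  }

  // Runs scrubOnce periodically on a dedicated daemon thread.
  ScheduledExecutorService start(Duration interval) {
    ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(task -> {
          Thread t = new Thread(task, "closed-pipeline-scrubber");
          t.setDaemon(true);
          return t;
        });
    executor.scheduleWithFixedDelay(() -> scrubOnce(Instant.now()),
        interval.toMillis(), interval.toMillis(), TimeUnit.MILLISECONDS);
    return executor;
  }
}
```

With this shape, OPEN pipelines and recently closed ones are left alone, and the interval and retention period would presumably come from configuration.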
was:
After stressing a cluster for several days, we found that there are a lot of
CLOSED EC pipelines.
{code:java}
[ozoneadmin@TENCENT64 ~/ozone-1.3.0-SNAPSHOT]$ ./bin/ozone admin pipeline list --state=CLOSED | wc -l
997
{code}
It makes commands return slowly (e.g. ozone admin datanode list, ozone admin
pipeline list), and potentially it will add unnecessary burden to SCM HA, so
these CLOSED EC pipelines should be cleaned up properly.
Several ways to consider:
- We close pipelines in `WritableECContainerProvider` by calling
`pipelineManager.closePipeline(pipeline, true);`
> EC: EC pipeline records are not removed after close.
> ----------------------------------------------------
>
> Key: HDDS-6598
> URL: https://issues.apache.org/jira/browse/HDDS-6598
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Mark Gui
> Assignee: Mark Gui
> Priority: Major
>
> After stressing a cluster for several days, we found that there are a lot of
> CLOSED EC pipelines.
> {code:java}
> [ozoneadmin@TENCENT64 ~/ozone-1.3.0-SNAPSHOT]$ ./bin/ozone admin pipeline list --state=CLOSED | wc -l
> 997
> {code}
> This makes commands such as `ozone admin datanode list` and `ozone admin
> pipeline list` return slowly, and it potentially adds unnecessary load to SCM
> HA, so these CLOSED EC pipelines should be cleaned up properly.
> Several ways to consider:
> # We close pipelines in `WritableECContainerProvider` by calling
> `pipelineManager.closePipeline(pipeline, true);`, where `true` means the
> pipeline record is not removed until a timeout expires. But the removal
> actually only happens for Ratis pipelines, in `BackgroundPipelineCreator`
> when it calls `pipelineManager.scrubPipeline(replicationConfig);`. We could
> change the argument to `false`, so that the records of selected pipelines are
> removed as soon as they are closed, but the CLOSED records of unselected
> pipelines would still be left behind.
> # We could try to close the pipeline after the container close event from the
> DN is received. But container close follows a lifecycle like OPEN -> CLOSING
> -> QUASI_CLOSED -> CLOSED, so I think it would be tricky to hook a pipeline
> close action after an EC container is closed.
> # We could have a dedicated background thread that runs periodically to
> clean up the CLOSED pipelines in a batch. This also benefits SCM HA compared
> to solution 1, since we tend to do batch cleanups instead of removing records
> one by one.
> I think we could choose solution 3 to solve this problem.
>
--
This message was sent by Atlassian Jira
(v8.20.1#820001)