[ https://issues.apache.org/jira/browse/HDDS-12080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated HDDS-12080:
----------------------------------
    Labels: pull-request-available  (was: )

> Clear irrelevant RATIS THREE pipelines on datanodes
> ---------------------------------------------------
>
> Key: HDDS-12080
> URL: https://issues.apache.org/jira/browse/HDDS-12080
> Project: Apache Ozone
> Issue Type: Improvement
> Components: Ozone Datanode, SCM
> Affects Versions: 2.0.0
> Reporter: Vyacheslav Tutrinov
> Assignee: Vyacheslav Tutrinov
> Priority: Major
> Labels: pull-request-available
>
> h3. Lifecycle of RATIS/THREE Pipelines
> The lifecycle of RATIS/THREE pipelines is fairly simple: they are created
> automatically by the SCM (periodic invocation of
> *RatisPipelineProvider.create(...)*) and closed automatically (unless
> manually closed via CLI) for several possible reasons:
> - *Slow Followers*: If the followers in the pipeline's RAFT group are
> slow, pipeline operations fail, and the datanode triggers pipeline closure
> (this operation is sent to the SCM along with a heartbeat, and the SCM then
> deletes the pipeline on its side).
> - *Stuck in ALLOCATED State*: If a pipeline created by the SCM remains
> in the ALLOCATED state for too long (e.g., datanodes do not retrieve pipeline
> creation tasks from the SCM during heartbeats), the SCM triggers a task to
> close the pipeline. This is handled by the *PipelineManagerImpl* class
> in the *scrubPipelines* method.
> - *Missing Heartbeats*: If a datanode in a pipeline's group stops
> sending heartbeats for a prolonged period, the SCM marks it as STALE and
> begins its finalization, which involves closing the pipelines in which the
> node participates.
> The last case is the most intriguing of the three. Let's examine it closely.
> ----
> h3. HEALTHY->STALE->DEAD Datanode and Its Pipelines
> Here's an interesting scenario:
> 1. Assume there are *N* RATIS/THREE pipelines involving the datanode. Let
> *N=32*.
> 2. The datanode is registered with the SCM, and the SCM periodically checks
> in the *StaleNodeHandler* whether the datanode has become STALE (stopped
> sending heartbeats).
> 3. If the datanode stops sending heartbeats, the SCM marks it as STALE and
> initiates its finalization:
> {code:bash}
> 2024-12-19 17:04:09,383 INFO node.StaleNodeHandler
> (StaleNodeHandler.java:onMessage(57)) - Datanode
> af19f8cd-65fb-465d-bccf-310da1d8acc4(test1.ozone.test/127.0.0.1) moved to
> stale state. Finalizing its pipelines
> [PipelineID=29ffde4a-f0de-48e2-8ab5-80ef832dbc3e,
> PipelineID=6358cd69-531e-49ea-9cd6-6c505f1b9bd8,
> PipelineID=e6b6854c-d4d9-4c93-9b46-82ec5e0fed95,
> PipelineID=7aeb0b55-6ed1-4b68-b316-10bb1f976cff,
> PipelineID=cbca39b9-3fbb-4167-8b90-33960460c700,
> PipelineID=eb950f17-b001-4679-9fba-39a590c0f72f,
> PipelineID=512bacdb-ef2e-4cb6-aa98-05cf6be77049,
> PipelineID=fc5d1a92-6075-486a-a56e-c5521256afe8,
> PipelineID=06e1d2bc-b84e-455f-b8bd-f115c9cdde7d,
> PipelineID=a30c7d15-25dd-4599-a4d6-ad52a65d0010,
> PipelineID=7fef5cd4-dca3-4268-b835-fc7437d40372,
> PipelineID=c8982830-2d8a-40c5-92d3-f07cd1509e80,
> PipelineID=f13100dc-1cb2-42e2-9850-6139f47d7c71,
> PipelineID=a7cfcb55-bbd1-4744-a274-3643ad61b205,
> PipelineID=224ff311-b86b-457d-8542-0c513ba5a4a1,
> PipelineID=2824ca06-52dd-4c2c-a8ba-da8c0aedb3e9,
> PipelineID=751ba40f-62cb-4c76-b68f-ad21a68f3aad,
> PipelineID=18f6122b-2b54-4a9d-8ee8-f1618f2cce58,
> PipelineID=7a50de7e-ba95-4d3c-9f30-4bc00de9aa32,
> PipelineID=eeb3e427-1d47-4677-8ff5-7e9671b932d2,
> PipelineID=31ac2d9f-ece6-40fc-9e5f-79e7ef28b794,
> PipelineID=871f6a99-3c5a-4283-be36-6df575d41fa6,
> PipelineID=b148ce0d-3f14-4f70-a583-ba896a613024,
> PipelineID=d71fe274-56bb-46ae-b017-a2e9386c8843,
> PipelineID=59439d88-ba41-4637-bf88-2e46e7cad889,
> PipelineID=53b3bcbe-ab75-4bc0-aeff-900db9029a79,
> PipelineID=7a940c61-eece-4894-bb7b-9b891c9e1eb0,
> PipelineID=fa452b0e-f43f-4190-a23c-5f2a088247c3,
> PipelineID=a24a8f8b-f7f4-4013-8a28-bed4437b6359,
> PipelineID=33b42053-b4e4-461b-90ab-f941ae263aef,
> PipelineID=b3eb6560-818c-48ac-a334-a552cfe57a56,
> PipelineID=ac385fed-0c9c-416c-8808-7eff784d6220]
> {code}
> 4. Finalization involves creating a batch of commands for the datanode to
> close and delete its 32 pipelines:
> {code:bash}
> 2024-12-19 17:08:09,432 INFO pipeline.PipelineManagerImpl
> (PipelineManagerImpl.java:scrubPipelines(610)) - Scrubbing pipeline: id:
> PipelineID=29ffde4a-f0de-48e2-8ab5-80ef832dbc3e since it stays at CLOSED
> stage.
> 2024-12-19 17:08:09,436 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=29ffde4a-f0de-48e2-8ab5-80ef832dbc3e close command to
> datanode b602aa44-832e-45f6-8bc7-18b2c0ca747b
> 2024-12-19 17:08:09,442 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=29ffde4a-f0de-48e2-8ab5-80ef832dbc3e close command to
> datanode af19f8cd-65fb-465d-bccf-310da1d8acc4
> 2024-12-19 17:08:09,443 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=29ffde4a-f0de-48e2-8ab5-80ef832dbc3e close command to
> datanode 4a0d6707-e72e-4454-b0e3-b4ede5f8fcee
> 2024-12-19 17:08:09,450 INFO pipeline.PipelineManagerImpl
> (PipelineManagerImpl.java:removePipeline(459)) - Pipeline Pipeline[ Id:
> 29ffde4a-f0de-48e2-8ab5-80ef832dbc3e, Nodes:
> b602aa44-832e-45f6-8bc7-18b2c0ca747b(test1.ozone.test/127.0.0.1)
> ReplicaIndex:
> 0af19f8cd-65fb-465d-bccf-310da1d8acc4(test1.ozone.test/127.0.0.1)
> ReplicaIndex:
> 04a0d6707-e72e-4454-b0e3-b4ede5f8fcee(test1.ozone.test/127.0.0.1)
> ReplicaIndex: 0, ReplicationConfig: RATIS/THREE, State:CLOSED,
> leaderId:4a0d6707-e72e-4454-b0e3-b4ede5f8fcee,
> CreationTimestamp2024-12-19T16:48:09.374+03:00[Europe/Moscow]] removed.
> 2024-12-19 17:08:09,450 INFO pipeline.PipelineManagerImpl
> (PipelineManagerImpl.java:scrubPipelines(610)) - Scrubbing pipeline: id:
> PipelineID=6358cd69-531e-49ea-9cd6-6c505f1b9bd8 since it stays at CLOSED
> stage.
> 2024-12-19 17:08:09,451 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=6358cd69-531e-49ea-9cd6-6c505f1b9bd8 close command to
> datanode b602aa44-832e-45f6-8bc7-18b2c0ca747b
> 2024-12-19 17:08:09,451 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=6358cd69-531e-49ea-9cd6-6c505f1b9bd8 close command to
> datanode af19f8cd-65fb-465d-bccf-310da1d8acc4
> 2024-12-19 17:08:09,451 INFO pipeline.RatisPipelineProvider
> (RatisPipelineProvider.java:lambda$close$4(270)) - Send
> pipeline:PipelineID=6358cd69-531e-49ea-9cd6-6c505f1b9bd8 close command to
> datanode 4a0d6707-e72e-4454-b0e3-b4ede5f8fcee
> {code}
> 5. If the datanode remains unresponsive, the SCM marks it as DEAD and clears
> the command queue for closing its pipelines:
> {code:bash}
> 2024-12-19 17:08:51,403 INFO node.DeadNodeHandler
> (DeadNodeHandler.java:onMessage(108)) - Clearing command queue of size 32 for
> DN af19f8cd-65fb-465d-bccf-310da1d8acc4(test1.ozone.test/127.0.0.1)
> {code}
> 6. If the datanode then resumes sending heartbeats, it will:
> - Retain knowledge of the 32 pipelines that were already closed by the SCM.
> - Receive new commands to create an additional 32 pipelines (since the SCM
> no longer associates the old pipelines with the datanode).
> 7. This can result in the datanode managing *64 pipelines*. If such
> interruptions occur repeatedly, the number of pipelines can skyrocket,
> potentially overwhelming the datanode (e.g., insufficient memory during a
> restart due to the sheer number of RAFT groups).
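The arithmetic in steps 1-7 can be made concrete with a small toy model. This is only a sketch under simplifying assumptions: the class `PipelineLeakModel` and its methods are hypothetical and are not Ozone code, the SCM is assumed to always top the datanode back up to N pipelines, and the close commands are assumed to be lost entirely when the command queue is cleared.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

/** Toy model of the scenario above. Hypothetical names; this is not Ozone code. */
final class PipelineLeakModel {
    // RAFT groups the datanode still holds locally.
    private final Set<UUID> datanodeGroups = new HashSet<>();
    // Pipelines the SCM currently associates with the datanode.
    private final Set<UUID> scmPipelines = new HashSet<>();
    private final int pipelinesPerNode;

    PipelineLeakModel(int pipelinesPerNode) {
        this.pipelinesPerNode = pipelinesPerNode;
        createMissingPipelines();
    }

    /** Assumption: the SCM creates pipelines until the datanode has N again. */
    private void createMissingPipelines() {
        while (scmPipelines.size() < pipelinesPerNode) {
            UUID id = UUID.randomUUID();
            scmPipelines.add(id);
            datanodeGroups.add(id); // the datanode joins the new RAFT group
        }
    }

    /** One HEALTHY -> STALE -> DEAD -> HEALTHY cycle. */
    void outageAndReturn() {
        // STALE: the SCM closes and removes the pipelines on its side.
        // DEAD: the queued close commands are dropped, so datanodeGroups
        // is deliberately left untouched here.
        scmPipelines.clear();
        // The datanode returns and receives creation commands again.
        createMissingPipelines();
    }

    int groupsOnDatanode() {
        return datanodeGroups.size();
    }

    /** Groups the datanode holds that the SCM no longer knows about. */
    int irrelevantGroups() {
        Set<UUID> stale = new HashSet<>(datanodeGroups);
        stale.removeAll(scmPipelines);
        return stale.size();
    }

    public static void main(String[] args) {
        PipelineLeakModel model = new PipelineLeakModel(32);
        for (int cycle = 1; cycle <= 3; cycle++) {
            model.outageAndReturn();
            System.out.printf("cycle %d: %d groups on datanode, %d irrelevant%n",
                cycle, model.groupsOnDatanode(), model.irrelevantGroups());
        }
    }
}
```

With N=32 the model reports 64 groups after the first cycle, matching step 7, and the count grows by another 32 with every further cycle.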
> This scenario illustrates how pipeline management challenges could escalate
> and potentially destabilize the system under specific edge cases.
> So, when a datanode is in the DEAD state on the SCM side and is then
> restarted, its stale RAFT logs (RAFT groups, pipelines) can be deleted
> without trying to initialize them, since they are irrelevant at that point.
> This saves a lot of time and avoids unnecessary memory consumption.
> *See also:*
> [https://github.com/apache/ozone/discussions/7186]
> https://issues.apache.org/jira/browse/HDDS-11856
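The proposed cleanup boils down to a set difference: RAFT groups present on the datanode minus pipelines the SCM still associates with it. The sketch below illustrates only that selection step. `IrrelevantGroupFilter` and `groupsToDelete` are hypothetical names, and how the datanode would learn the SCM's current pipeline set is not described in this issue, so it is passed in as a plain argument.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;

/** Hypothetical helper; a sketch of the selection step, not the actual patch. */
final class IrrelevantGroupFilter {
    private IrrelevantGroupFilter() { }

    /**
     * Returns the local RAFT group IDs that the SCM no longer knows about.
     * Both arguments are left unmodified.
     *
     * @param localGroups       group IDs found on the datanode at startup
     * @param scmKnownPipelines pipeline IDs the SCM still maps to this datanode
     *                          (assumed to be available to the caller)
     */
    static Set<UUID> groupsToDelete(Set<UUID> localGroups,
                                    Set<UUID> scmKnownPipelines) {
        Set<UUID> toDelete = new TreeSet<>(localGroups);
        toDelete.removeAll(scmKnownPipelines);
        return toDelete;
    }
}
```

A caller would delete the storage for every returned ID instead of initializing a RAFT group for it, and start only the groups that remain.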