[
https://issues.apache.org/jira/browse/HDDS-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xu Shao Hong updated HDDS-5916:
-------------------------------
Description:
During the chaos test, 10% of the DNs were killed to mimic a possible accident.
Env:
Kubernetes + PV + Prometheus
Phenomenon:
The key write rate dropped sharply and flattened into a near-horizontal line.
Even after the chaos injection ended, the rate did not recover.
In addition, the scm_pipeline_metrics_num_pipeline_allocated metric showed new
pipelines being created periodically without end. Datanodes kept holding leader
elections continuously and could not become stable even after a leader was
elected.
Reason:
The DN pods were killed, and each revived pod might not get the same IP address
as before. The SCM still receives heartbeats from them and treats them as
normal, because the DN UUID persisted on the PV is unchanged. However, the SCM
currently does not update the IP in DatanodeDetails, so it hands stale address
info to the datanodes in newly allocated pipelines.
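A minimal sketch of the proposed behavior, keyed by the persistent DN UUID. The class and method names below are stand-ins for illustration, not the actual Ozone/SCM APIs:

```java
// Hypothetical sketch: SCM-side tracking that refreshes a datanode's IP on
// heartbeat instead of keeping the address it registered with. Names are
// assumptions, not real Ozone classes.
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class NodeRegistry {
    // UUID -> last known IP address, as SCM might track in DatanodeDetails.
    private final Map<UUID, String> ipByUuid = new HashMap<>();

    /** Records the IP from a heartbeat; returns true if the IP changed. */
    public boolean onHeartbeat(UUID dnUuid, String reportedIp) {
        String previous = ipByUuid.put(dnUuid, reportedIp);
        return previous != null && !previous.equals(reportedIp);
    }

    /** Current address SCM would hand out for this datanode. */
    public String ipOf(UUID dnUuid) {
        return ipByUuid.get(dnUuid);
    }
}
```

With this kind of check, a pod revived under a new IP would be detected on its first heartbeat rather than silently kept at its old address.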
Consider a Raft group with three peers A, B, and C, where A was revived with a
new IP address. A can contact B and C, but B and C cannot contact A. A
therefore never receives heartbeats from the elected leader and gets stuck
oscillating between follower and candidate. Each time A becomes a candidate,
it increments the term, starts a leader election, and successfully delivers
RequestVote to B and C. On receiving a RequestVote with a higher term, the
leader steps down and a re-election follows. This explains why the Raft group
in the pipeline never stabilizes. Meanwhile, each short-lived leader can still
send the ready message to the SCM, so the SCM wrongly assumes the pipeline is
ready for chunk writes, causing blocking issues.
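The asymmetric-connectivity loop above can be modeled in a few lines. This is a toy simulation, not Ratis code, just to show why the term grows without bound:

```java
// Toy model of the election loop: peer A can reach B and C, but heartbeats
// from each elected leader never reach A (stale IP), so A's election timeout
// keeps firing and the term keeps rising.
public class ElectionLoop {

    /** Simulates `rounds` election timeouts at A; returns the final term. */
    public static int simulate(int rounds) {
        int term = 1;
        for (int i = 0; i < rounds; i++) {
            // A reachable peer (B or C) wins the election at `term`, but its
            // AppendEntries heartbeats to A are dropped, so A's election
            // timeout fires and A campaigns with an incremented term:
            term++;
            // The leader sees A's higher-term RequestVote and steps down,
            // so the group starts over and never stabilizes.
        }
        return term;
    }
}
```

Each round dethrones the current leader, matching the endless pipeline re-creation seen in the metrics.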
Possible solution:
Check the DatanodeDetails, either on the DN side or in the SCM, and update the
IP if necessary.
> DNs in pipeline raft group get stuck in infinite leader election in Kubernetes
> env
> ---------------------------------------------------------------------------------
>
> Key: HDDS-5916
> URL: https://issues.apache.org/jira/browse/HDDS-5916
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Xu Shao Hong
> Priority: Critical
> Attachments: wecom-temp-096bc77af479d5e6c280bbcaa35b7fe5.png,
> wecom-temp-56d8d0bcd030797a228dbb32e0dfa0f1.png,
> wecom-temp-5c5afba22bfcf188415ad622f82f66af.png
>
>
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]