[
https://issues.apache.org/jira/browse/HDFS-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840302#comment-17840302
]
farmmamba commented on HDFS-17477:
----------------------------------
I also ran into this problem. It failed with OOM.
张浩博
[email protected]
---- Replied Message ----
[
https://issues.apache.org/jira/browse/HDFS-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17840294#comment-17840294
]
Ayush Saxena commented on HDFS-17477:
-------------------------------------
Hi [~dannytbecker]
It seems that since this was committed,
TestLargeBlockReport#testBlockReportSucceedsWithLargerLengthLimit has been failing.
ref:
[https://ci-hadoop.apache.org/view/Hadoop/job/hadoop-qbt-trunk-java8-linux-x86_64/1564/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestLargeBlockReport/testBlockReportSucceedsWithLargerLengthLimit/]
It also failed once in the Jenkins result of this PR:
[https://github.com/apache/hadoop/pull/6748#issuecomment-2063042088]
I am not sure whether the test ran in the subsequent build.
I tried it locally: with this change applied the test failed with OOM, and
after I reverted the change it passed.
Can you take a look?
IncrementalBlockReport race condition additional edge cases
-----------------------------------------------------------
Key: HDFS-17477
URL: https://issues.apache.org/jira/browse/HDFS-17477
Project: Hadoop HDFS
Issue Type: Bug
Components: auto-failover, ha, namenode
Affects Versions: 3.3.5, 3.3.4, 3.3.6
Reporter: Danny Becker
Assignee: Danny Becker
Priority: Major
Labels: pull-request-available
HDFS-17453 fixes a race condition between IncrementalBlockReports (IBR) and the
Edit Log Tailer which can cause the Standby NameNode (SNN) to incorrectly mark
blocks as corrupt when it transitions to Active. There are a few edge cases
that HDFS-17453 does not cover.
For example:
1. SNN1 loads the edits for b1gs1 and b1gs2.
2. DN1 reports b1gs1 to SNN1, so it gets queued for later processing.
3. DN1 reports b1gs2 to SNN1, so it gets added to the blocks map.
4. SNN1 transitions to Active (ANN1).
5. ANN1 processes the pending DN message queue and marks DN1->b1gs1 as corrupt
because it was still in the queue.
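
The sequence above can be replayed with a small toy model. This is only a sketch for illustration, not HDFS code: the class IbrRaceSketch, its methods, and the string keys are all hypothetical, and whether a report is queued or applied is chosen by the caller to mirror steps 2 and 3 rather than derived from real NameNode logic. The assumption is that, on transition, a queued report whose generation stamp is older than the latest one seen in the edits is marked corrupt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model of the steps above. All names are hypothetical; this is not the
// NameNode implementation.
class IbrRaceSketch {
  /** One replica report from a DataNode: block name plus generation stamp. */
  static final class Report {
    final String dn;
    final String block;
    final long genStamp;

    Report(String dn, String block, long genStamp) {
      this.dn = dn;
      this.block = block;
      this.genStamp = genStamp;
    }
  }

  // block -> latest generation stamp seen in the loaded edits
  private final Map<String, Long> editLogGenStamp = new HashMap<>();
  // "dn->block" -> generation stamp of the replica recorded in the blocks map
  final Map<String, Long> blocksMap = new HashMap<>();
  // reports held back for later processing
  private final Queue<Report> pending = new ArrayDeque<>();

  void loadEdit(String block, long genStamp) {
    editLogGenStamp.merge(block, genStamp, Math::max);
  }

  void queueReport(Report r) {
    pending.add(r);
  }

  void applyReport(Report r) {
    blocksMap.put(r.dn + "->" + r.block, r.genStamp);
  }

  /** Drains the pending queue and returns the replicas marked corrupt. */
  List<String> transitionToActive() {
    List<String> corrupt = new ArrayList<>();
    while (!pending.isEmpty()) {
      Report r = pending.poll();
      long latest = editLogGenStamp.getOrDefault(r.block, r.genStamp);
      if (r.genStamp < latest) {
        // Assumed rule: a queued report older than the edits is treated as corrupt.
        corrupt.add(r.dn + "->" + r.block + "gs" + r.genStamp);
      } else {
        applyReport(r);
      }
    }
    return corrupt;
  }

  /** Replays steps 1-5 of the example. */
  static List<String> runExample() {
    IbrRaceSketch snn1 = new IbrRaceSketch();
    snn1.loadEdit("b1", 1);                          // step 1
    snn1.loadEdit("b1", 2);
    snn1.queueReport(new Report("DN1", "b1", 1));    // step 2: queued
    snn1.applyReport(new Report("DN1", "b1", 2));    // step 3: in blocks map
    return snn1.transitionToActive();                // steps 4 and 5
  }

  public static void main(String[] args) {
    System.out.println(runExample()); // prints [DN1->b1gs1]
  }
}
```

In this model DN1 has already reported b1gs2 by the time the queue is drained, so the corrupt mark comes only from the stale queued entry.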
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]