[
https://issues.apache.org/jira/browse/HDFS-16557?focusedWorklogId=779778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779778
]
ASF GitHub Bot logged work on HDFS-16557:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 09/Jun/22 04:53
Start Date: 09/Jun/22 04:53
Worklog Time Spent: 10m
Work Description: ZanderXu commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1150668804
Thanks @tomscut , after tracing the code, I think we cannot add
`elis.isInProgress()`.
And I will explain my ideas through questions and answers.
**Question one: Why was INVALID_TXID considered in the original code?**
- The CheckForGaps method is used to check whether the streams contain
continuous txids from fromTxId to toAtLeastTxid.
- LastTxId equal to INVALID_TXID means the stream is in progress.
- toAtLeastTxid may be an abnormal value, like Long.MAX_VALUE, so the
CheckForGaps method only needs to cover up to the latest in-progress segment.
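The behavior described above can be illustrated with a minimal sketch. This is not the actual Hadoop code: the class `GapCheckSketch`, the `Segment` type, and the method `coversRange` are hypothetical names, and the `INVALID_TXID` value is only a stand-in for `HdfsServerConstants.INVALID_TXID`.

```java
import java.util.List;

// Simplified model of a gap check; not the actual Hadoop implementation.
final class GapCheckSketch {
    // Stand-in for HdfsServerConstants.INVALID_TXID (value assumed here).
    static final long INVALID_TXID = -12345L;

    // Hypothetical minimal view of an edit log stream: only its txid range.
    static final class Segment {
        final long firstTxId;
        final long lastTxId;

        Segment(long firstTxId, long lastTxId) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    // Returns true if the segments (sorted by firstTxId) contain continuous
    // txids from fromTxId up to at least toAtLeastTxId.
    static boolean coversRange(List<Segment> segments, long fromTxId,
                               long toAtLeastTxId) {
        long txId = fromTxId;
        for (Segment s : segments) {
            if (txId > toAtLeastTxId) {
                return true; // requested range is already covered
            }
            if (s.firstTxId > txId) {
                return false; // gap before this segment
            }
            if (s.lastTxId == INVALID_TXID) {
                // Open-ended in-progress segment: its end is unknown, so it
                // may reach toAtLeastTxId even if that is Long.MAX_VALUE.
                return true;
            }
            txId = s.lastTxId + 1;
        }
        return txId > toAtLeastTxId;
    }
}
```

In this sketch, a trailing segment whose lastTxId is INVALID_TXID ends the check successfully, which is why an abnormal toAtLeastTxid such as Long.MAX_VALUE does not cause a failure.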
**Question two: What is the difference between INVALID_TXID and
isInProgress()?**
- Before [SBN READ] was introduced, LastTxId equal to INVALID_TXID meant the
stream was in progress, and a stream in progress meant its lastTxId was
INVALID_TXID.
- After [SBN READ] was introduced, LastTxId equal to INVALID_TXID still means
the stream is in progress, but a stream in progress no longer means its
lastTxId is INVALID_TXID, because getJournaledEdits was introduced.
- So if we add `elis.isInProgress()` in CheckForGaps, it cannot cover the
last segment being written, which actually contains the latest edits.
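To make the distinction concrete, here is a small sketch of the two kinds of in-progress streams described above. It is an illustration under assumptions, not Hadoop code: `InProgressVsInvalidTxId`, `StreamInfo`, and the two factory methods are hypothetical names, and the `INVALID_TXID` value is a stand-in for `HdfsServerConstants.INVALID_TXID`.

```java
// Simplified model showing that "in progress" and "lastTxId == INVALID_TXID"
// are no longer the same condition; not the actual Hadoop classes.
final class InProgressVsInvalidTxId {
    // Stand-in for HdfsServerConstants.INVALID_TXID (value assumed here).
    static final long INVALID_TXID = -12345L;

    // Hypothetical summary of the stream properties discussed above.
    static final class StreamInfo {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        StreamInfo(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }

        // The condition based on the lastTxId value alone.
        boolean hasInvalidLastTxId() {
            return lastTxId == INVALID_TXID;
        }
    }

    // In-progress stream whose end is unknown.
    static StreamInfo openEndedInProgress(long firstTxId) {
        return new StreamInfo(firstTxId, INVALID_TXID, true);
    }

    // In-progress stream with a known lastTxId, as can happen when edits
    // are fetched through getJournaledEdits.
    static StreamInfo journaledEditsInProgress(long firstTxId, long lastTxId) {
        return new StreamInfo(firstTxId, lastTxId, true);
    }
}
```

For the second kind of stream, `inProgress` is true while `hasInvalidLastTxId()` is false, so the two conditions are not interchangeable.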
Please correct me if anything is wrong.
Issue Time Tracking
-------------------
Worklog Id: (was: 779778)
Time Spent: 3h (was: 2h 50m)
> BootstrapStandby failed because of checking gap for inprogress
> EditLogInputStream
> ---------------------------------------------------------------------------------
>
> Key: HDFS-16557
> URL: https://issues.apache.org/jira/browse/HDFS-16557
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2022-04-22-17-17-14-577.png,
> image-2022-04-22-17-17-14-618.png, image-2022-04-22-17-17-23-113.png,
> image-2022-04-22-17-17-32-487.png
>
> Time Spent: 3h
> Remaining Estimate: 0h
>
> The lastTxId of an in-progress EditLogInputStream isn't necessarily
> HdfsServerConstants.INVALID_TXID. We can determine its status directly by
> EditLogInputStream#isInProgress.
> We introduced [SBN READ] and set
> {{dfs.ha.tail-edits.in-progress=true}}. Then, when running
> bootstrapStandby, the in-progress EditLogInputStream is misjudged,
> resulting in a gap check failure, which causes bootstrapStandby to fail.
> hdfs namenode -bootstrapStandby
> !image-2022-04-22-17-17-32-487.png|width=766,height=161!
> !image-2022-04-22-17-17-14-577.png|width=598,height=187!
--
This message was sent by Atlassian Jira
(v8.20.7#820007)