[
https://issues.apache.org/jira/browse/HDFS-16557?focusedWorklogId=780179&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-780179
]
ASF GitHub Bot logged work on HDFS-16557:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 10/Jun/22 02:29
Start Date: 10/Jun/22 02:29
Worklog Time Spent: 10m
Work Description: tomscut commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1151852402
> Thanks @tomscut, after tracing the code, I think we cannot add
> `elis.isInProgress()`.
>
> I will explain my reasoning through questions and answers.
>
> **Question one: Why was INVALID_TXID considered in the original code?**
>
> * The checkForGaps method checks whether the streams contain continuous
>   txids from fromTxId to toAtLeastTxId.
> * A lastTxId equal to INVALID_TXID means the stream is in progress.
> * toAtLeastTxId may be an abnormal value, such as Long.MAX_VALUE, so
>   checkForGaps only needs to cover the latest in-progress segment.
>
> **Question two: What is the difference between INVALID_TXID and
> isInProgress()?**
>
> * Before [SBN READ] was introduced, a lastTxId equal to INVALID_TXID meant
>   the stream was in progress, and an in-progress stream always had
>   INVALID_TXID as its lastTxId.
> * After [SBN READ] was introduced, a lastTxId equal to INVALID_TXID still
>   means the stream is in progress, but an in-progress stream no longer
>   necessarily has INVALID_TXID as its lastTxId, because getJournaledEdits
>   was introduced.
> * So if we add `elis.isInProgress()` in checkForGaps, it cannot cover the
>   last segment being written, which actually contains the latest edits.
>
> Please correct me if anything is wrong.
Thanks @ZanderXu for your comment. Please refer to the stack.

When we set `dfs.ha.tail-edits.in-progress=true`, the txID can be read by
getJournaledEdits (there is actually no gap), but a gap exception is still
thrown.
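
To make the disagreement concrete, here is a minimal, self-contained sketch of
the two ways of detecting an in-progress stream during a gap check. It is a
simplified model, not the Hadoop source: the `Stream` class, the `coversRange`
method, and the `trustInProgressFlag` switch are hypothetical names introduced
only for illustration, and the `INVALID_TXID` value is a stand-in for
`HdfsServerConstants.INVALID_TXID`.

```java
import java.util.Arrays;
import java.util.List;

final class GapCheckSketch {
    // Stand-in for HdfsServerConstants.INVALID_TXID (value assumed for this sketch).
    static final long INVALID_TXID = -12345L;

    // Hypothetical, simplified model of an edit log input stream.
    static final class Stream {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        Stream(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    // Returns true if the streams cover fromTxId..toAtLeastTxId without a hole.
    // With trustInProgressFlag == false, only lastTxId == INVALID_TXID marks a
    // stream as open-ended; with true, the inProgress flag does so as well.
    static boolean coversRange(List<Stream> streams, long fromTxId,
                               long toAtLeastTxId, boolean trustInProgressFlag) {
        long txId = fromTxId;
        for (Stream s : streams) {
            if (txId > toAtLeastTxId) {
                return true;                 // requested range already covered
            }
            if (s.firstTxId > txId) {
                return false;                // real hole before this stream
            }
            boolean openEnded = s.lastTxId == INVALID_TXID
                    || (trustInProgressFlag && s.inProgress);
            if (openEnded) {
                return true;                 // end unknown, may reach toAtLeastTxId
            }
            txId = s.lastTxId + 1;
        }
        return txId > toAtLeastTxId;
    }

    public static void main(String[] args) {
        // One finalized segment, then an in-progress segment that reports a
        // concrete lastTxId (the situation described for getJournaledEdits).
        List<Stream> streams = Arrays.asList(
                new Stream(1, 100, false),
                new Stream(101, 150, true));

        System.out.println(coversRange(streams, 1, 200, false)); // false: gap reported
        System.out.println(coversRange(streams, 1, 200, true));  // true: no gap reported
    }
}
```

In this model, an in-progress stream that carries a real lastTxId is treated
as finalized by the INVALID_TXID-only check, so the check reports a gap
whenever toAtLeastTxId is larger than that lastTxId. Whether the extra
`isInProgress()` condition is the right fix for the real checkForGaps is the
open question in this thread.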
Issue Time Tracking
-------------------
Worklog Id: (was: 780179)
Time Spent: 3h 10m (was: 3h)
> BootstrapStandby failed because of checking gap for inprogress
> EditLogInputStream
> ---------------------------------------------------------------------------------
>
> Key: HDFS-16557
> URL: https://issues.apache.org/jira/browse/HDFS-16557
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Tao Li
> Assignee: Tao Li
> Priority: Major
> Labels: pull-request-available
> Attachments: image-2022-04-22-17-17-14-577.png,
> image-2022-04-22-17-17-14-618.png, image-2022-04-22-17-17-23-113.png,
> image-2022-04-22-17-17-32-487.png
>
> Time Spent: 3h 10m
> Remaining Estimate: 0h
>
> The lastTxId of an in-progress EditLogInputStream isn't necessarily
> HdfsServerConstants.INVALID_TXID. We can determine its status directly by
> EditLogInputStream#isInProgress.
> We introduced [SBN READ] and set {{dfs.ha.tail-edits.in-progress=true}}.
> When bootstrapStandby is then run, the in-progress EditLogInputStream is
> misjudged, so the gap check fails, which causes bootstrapStandby to fail:
> hdfs namenode -bootstrapStandby
> !image-2022-04-22-17-17-32-487.png|width=766,height=161!
> !image-2022-04-22-17-17-14-577.png|width=598,height=187!
--
This message was sent by Atlassian Jira
(v8.20.7#820007)