[ 
https://issues.apache.org/jira/browse/HDFS-16557?focusedWorklogId=779778&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-779778
 ]

ASF GitHub Bot logged work on HDFS-16557:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Jun/22 04:53
            Start Date: 09/Jun/22 04:53
    Worklog Time Spent: 10m 
      Work Description: ZanderXu commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1150668804

   Thanks @tomscut. After tracing the code, I think we cannot add 
`elis.isInProgress()`.
   
   I will explain my reasoning through questions and answers. 
   **Question one: Why was INVALID_TXID considered in the original code?**
   - The `checkForGaps` method checks whether the streams contain contiguous 
txids from fromTxId to toAtLeastTxId.
   - A lastTxId equal to INVALID_TXID means the stream is in progress.
   - toAtLeastTxId may be an abnormal value, such as Long.MAX_VALUE, so the 
`checkForGaps` method only needs to cover the latest in-progress segment.
   
   **Question two: What is the difference between INVALID_TXID and 
isInProgress()?**
   - Before [SBN READ] was introduced, a lastTxId equal to INVALID_TXID meant 
the stream was in progress, and an in-progress stream always had a lastTxId of 
INVALID_TXID.
   - After [SBN READ] was introduced, a lastTxId equal to INVALID_TXID still 
means the stream is in progress, but an in-progress stream no longer 
necessarily has a lastTxId of INVALID_TXID, because of getJournaledEdits.
   - So if we add `elis.isInProgress()` to `checkForGaps`, it cannot cover the 
last segment being written, which actually contains the latest edits.
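
To make the two answers above concrete, here is a minimal, self-contained sketch of a gap check in the spirit of `checkForGaps`. It is not the Hadoop implementation: `GapCheckSketch`, `Stream`, `coversRange`, and the `useIsInProgress` flag are hypothetical names for illustration, and the local `INVALID_TXID` value is an assumed stand-in for `HdfsServerConstants.INVALID_TXID`. The flag only switches which predicate decides that a stream's end is unknown, so the two behaviours can be compared on the same input.

```java
import java.util.List;

final class GapCheckSketch {
    // Assumed stand-in for HdfsServerConstants.INVALID_TXID.
    public static final long INVALID_TXID = -12345L;

    /** Hypothetical, simplified stand-in for EditLogInputStream. */
    public static final class Stream {
        final long firstTxId;
        final long lastTxId;
        final boolean inProgress;

        public Stream(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    /**
     * Returns true if the streams are treated as covering
     * [fromTxId, toAtLeastTxId] without a gap.
     *
     * useIsInProgress == false: a stream is open-ended only if its
     *   lastTxId is INVALID_TXID.
     * useIsInProgress == true: a stream is open-ended whenever it is
     *   marked in progress, even if it reports a concrete lastTxId.
     */
    public static boolean coversRange(List<Stream> streams, long fromTxId,
                                      long toAtLeastTxId, boolean useIsInProgress) {
        long txId = fromTxId; // next txid we still need to find
        for (Stream s : streams) {
            if (txId > toAtLeastTxId) {
                return true; // requested range already covered
            }
            if (s.firstTxId > txId) {
                return false; // gap before this stream
            }
            boolean openEnded = useIsInProgress
                ? s.inProgress
                : s.lastTxId == INVALID_TXID;
            if (openEnded) {
                // End unknown: assume it may reach toAtLeastTxId and stop checking.
                return true;
            }
            txId = s.lastTxId + 1;
        }
        return txId > toAtLeastTxId;
    }
}
```

Under these assumptions, a finalized stream 1..100 followed by an in-progress stream that reports a concrete range 101..150 (as a stream built from getJournaledEdits might) is handled differently by the two predicates when toAtLeastTxId is 200: the INVALID_TXID test keeps checking and notices the streams stop at 150, while the isInProgress test returns early at the in-progress stream.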
   
   Please correct me if anything is wrong.
   
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 779778)
    Time Spent: 3h  (was: 2h 50m)

> BootstrapStandby failed because of checking gap for inprogress 
> EditLogInputStream
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-16557
>                 URL: https://issues.apache.org/jira/browse/HDFS-16557
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Tao Li
>            Assignee: Tao Li
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-04-22-17-17-14-577.png, 
> image-2022-04-22-17-17-14-618.png, image-2022-04-22-17-17-23-113.png, 
> image-2022-04-22-17-17-32-487.png
>
>          Time Spent: 3h
>  Remaining Estimate: 0h
>
> The lastTxId of an in-progress EditLogInputStream isn't necessarily 
> HdfsServerConstants.INVALID_TXID. We can determine its status directly via 
> EditLogInputStream#isInProgress.
> We introduced [SBN READ] and set {{dfs.ha.tail-edits.in-progress=true}}. 
> When we then ran bootstrapStandby, the in-progress EditLogInputStream was 
> misjudged, resulting in a gap check failure that caused bootstrapStandby to 
> fail.
> hdfs namenode -bootstrapStandby
> !image-2022-04-22-17-17-32-487.png|width=766,height=161!
> !image-2022-04-22-17-17-14-577.png|width=598,height=187!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

