tomscut commented on PR #4219:
URL: https://github.com/apache/hadoop/pull/4219#issuecomment-1113892494

   > This seems right to me, but I don't fully understand what went wrong to 
cause the error. Can you explain more fully? Why did we previously make the 
assumption that `INVALID_TXID` meant in-progress, and what has changed to make 
that not true / what happened in your specific scenario to cause that not to be 
true?
   
   Thank you very much for your review, @xkrogen.
   
   After introducing [SBN READ], we updated the configuration: 
`dfs.ha.tail-edits.in-progress=true`.
   
   Then, when we run `bootstrapStandby`, we encounter something like this:
   1. We need to start an Observer NameNode, so we run `bootstrapStandby` 
before starting it. This automatically pulls the latest FSImage from the 
Active NameNode and checks whether the edits in the journals have a gap, based 
on the `lastTxid` of the FSImage.
   
   2. Assume that the txid of the latest FSImage is x and that the edit logs 
from x in the journals are in the `InProgress` state. `FSEditLog#checkForGaps` 
will be skipped, because the `lastTxid` of the in-progress 
`EditLogInputStream` is not `HdfsServerConstants.INVALID_TXID` but a specific 
number.
   
   3. However, between x and the txid currently being written there are 
finalized edit logs, and `bootstrapStandby` can execute normally.
   
   In short, the `lastTxId` of an in-progress `EditLogInputStream` isn't 
always `HdfsServerConstants.INVALID_TXID`; it can also be a specific number.
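
   To illustrate that last point, here is a minimal, self-contained sketch. 
It is not Hadoop code: the `Stream` class, both helper methods, and the 
sentinel value are hypothetical stand-ins for `EditLogInputStream` and 
`HdfsServerConstants.INVALID_TXID`. It only shows that a check relying on the 
sentinel alone does not recognize an in-progress stream whose last txid is a 
concrete number.

```java
public class InProgressTxidSketch {
    // Stand-in for HdfsServerConstants.INVALID_TXID (value assumed for this sketch).
    public static final long INVALID_TXID = -12345L;

    // Hypothetical, simplified model of an edit log input stream.
    public static final class Stream {
        public final long firstTxId;
        public final long lastTxId;
        public final boolean inProgress;

        public Stream(long firstTxId, long lastTxId, boolean inProgress) {
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
            this.inProgress = inProgress;
        }
    }

    // Infers "in progress" only from the sentinel last txid.
    public static boolean looksInProgressBySentinel(Stream s) {
        return s.lastTxId == INVALID_TXID;
    }

    // Reads the explicit in-progress flag instead.
    public static boolean isInProgressByFlag(Stream s) {
        return s.inProgress;
    }

    public static void main(String[] args) {
        // In-progress stream whose end is unknown: both checks agree.
        Stream unknownEnd = new Stream(101L, INVALID_TXID, true);
        // In-progress stream that reports a concrete last txid: the checks disagree.
        Stream knownEnd = new Stream(101L, 150L, true);

        System.out.println(looksInProgressBySentinel(unknownEnd)); // true
        System.out.println(isInProgressByFlag(unknownEnd));        // true
        System.out.println(looksInProgressBySentinel(knownEnd));   // false
        System.out.println(isInProgressByFlag(knownEnd));          // true
    }
}
```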


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
