[ 
https://issues.apache.org/jira/browse/HDFS-5074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13731216#comment-13731216
 ] 

Suresh Srinivas commented on HDFS-5074:
---------------------------------------

bq. SBN is running and somehow encounters an error in the middle of replaying 
an edit log in the tailer (eg the JN it's reading from crashes)
We need to understand why this issue happens, if it is not expected. Does SBN 
still complete checkpointing if the JN crashes?
                
> Allow starting up from an fsimage checkpoint in the middle of a segment
> -----------------------------------------------------------------------
>
>                 Key: HDFS-5074
>                 URL: https://issues.apache.org/jira/browse/HDFS-5074
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha, namenode
>    Affects Versions: 3.0.0, 2.1.0-beta
>            Reporter: Todd Lipcon
>
> We've seen the following behavior a couple times:
> - SBN is running and somehow encounters an error in the middle of replaying 
> an edit log in the tailer (eg the JN it's reading from crashes)
> - SBN has successfully processed half of the edits in the segment it was 
> reading.
> - SBN saves a checkpoint, which now falls in the middle of a segment, and 
> then restarts
> Upon restart, the SBN will load this checkpoint which falls in the middle of 
> a segment. {{selectInputStreams}} then fails when the SBN requests a 
> mid-segment txid.
> We should handle this case by downloading the right segment and 
> fast-forwarding to the correct txid.
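
To make the proposed handling concrete, here is a minimal sketch of "find the 
right segment and fast-forward to the correct txid". It is an illustration 
only, not the HDFS implementation: Segment, MidSegmentStartup, 
findSegmentContaining and transactionsToSkip are hypothetical names, and a 
segment is assumed to be a finalized, inclusive range of txids.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of a finalized edit log segment (NOT an HDFS class):
// an inclusive range of transaction ids [firstTxId, lastTxId].
final class Segment {
    final long firstTxId;
    final long lastTxId;

    Segment(long firstTxId, long lastTxId) {
        if (firstTxId > lastTxId) {
            throw new IllegalArgumentException("firstTxId must be <= lastTxId");
        }
        this.firstTxId = firstTxId;
        this.lastTxId = lastTxId;
    }

    boolean contains(long txId) {
        return firstTxId <= txId && txId <= lastTxId;
    }
}

// Hypothetical helper showing the two steps from the description.
final class MidSegmentStartup {
    // Step 1: pick the segment that covers the first txid still needed,
    // even if that txid is not the first txid of the segment.
    static Optional<Segment> findSegmentContaining(List<Segment> segments, long txId) {
        for (Segment s : segments) {
            if (s.contains(txId)) {
                return Optional.of(s);
            }
        }
        return Optional.empty();
    }

    // Step 2: number of transactions to read and discard from the start of
    // the segment so that the next one applied is startTxId.
    static long transactionsToSkip(Segment segment, long startTxId) {
        if (!segment.contains(startTxId)) {
            throw new IllegalArgumentException("txid " + startTxId + " is outside the segment");
        }
        return startTxId - segment.firstTxId;
    }
}
```

For example, with segments [1, 100] and [101, 200] and a checkpoint whose last 
applied txid is 150, the next txid needed is 151. The sketch selects the 
segment starting at 101 and skips 50 transactions before replaying.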

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
