[ 
https://issues.apache.org/jira/browse/HDFS-14674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16895919#comment-16895919
 ] 

wangzhaohui commented on HDFS-14674:
------------------------------------

[~csun] thanks for your attention
{code:java}
//代码占位符
for (EditLogInputStream editIn : editStreams) {
 ...
  try {
    loader.loadFSEdits(editIn, lastAppliedTxId + 1, maxTxnsToRead,
        startOpt, recovery);
  } finally {
    // Update lastAppliedTxId even in case of error, since some ops may
    // have been successfully applied before the error.
    lastAppliedTxId = loader.getLastAppliedTxId();
  }
...
}{code}
when i add the conf dfs.ha.tail-eidt.max-txns-per-lock,traversal the 
editStreams,if the dfs.ha.tail-eidt.max-txns-per-lock value is 5000,the first 
editlog is bigger than 5000,for the first time,editLogInputStream read 5000,but 
the cycle doesn't end, editstream is going on traversal,maybe the inputstream 
read others eitlog, therefor got an unexpected txid when tail editlog.

 

> Got an unexpected txid when tail editlog
> ----------------------------------------
>
>                 Key: HDFS-14674
>                 URL: https://issues.apache.org/jira/browse/HDFS-14674
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: wangzhaohui
>            Assignee: wangzhaohui
>            Priority: Major
>         Attachments: HDFS-14674-001.patch, image-2019-07-26-11-34-23-405.png
>
>
> Add the following configuration
> !image-2019-07-26-11-34-23-405.png!
> error:
> {code:java}
> //代码占位符
> [2019-07-17T11:50:21.048+08:00] [INFO] [Edit log tailer] : replaying edit 
> log: 1/20512836 transactions completed. (0%) [2019-07-17T11:50:21.059+08:00] 
> [INFO] [Edit log tailer] : Edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  of size 3126782311 edits # 500 loaded in 3 seconds 
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc 
> expecting start txid #232056752162 [2019-07-17T11:50:21.059+08:00] [INFO] 
> [Edit log tailer] : Start loading edits file 
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  maxTxnipsToRead = 500 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log 
> tailer] : Fast-forwarding stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?ipjid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662 [2019-07-17T11:50:21.059+08:00] [INFO] [Edit 
> log tailer] ip: Fast-forwarding stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662 [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit 
> log tailer] : Unknown error encountered while tailing edits. Shutting down 
> standby NN. java.io.IOException: There appears to be a gap in the edit log. 
> We expected txid 232056752162, but got txid 232077264498. at 
> org.apache.hadoop.hdfs.server.namenode.MetaRecoveryContext.editLogLoaderPrompt(MetaRecoveryContext.java:94)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:239)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:161)
>  at 
> org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:895) at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:321)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:460)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:410)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:427)
>  at 
> org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:414)
>  at 
> org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:423)
>  [2019-07-17T11:50:21.064+08:00] [INFO] [Edit log tailer] : Exiting with 
> status 1 [2019-07-17T11:50:21.066+08:00] [INFO] [Thread-1] : SHUTDOWN_MSG: 
> /************************************************************ SHUTDOWN_MSG: 
> Shutting down NameNode at ip 
> ************************************************************/
> {code}
>  
> if dfs.ha.tail-edits.max-txns-per-lock value is 500,when the namenode load 
> the editlog util 500,the current namenode will load the next editlog,but 
> editlog more than 500.So,namenode got an unexpected txid when tail editlog.
>  
>  
> {code:java}
> //代码占位符[2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232056426162&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  of size 3126782311 edits # 500 loaded in 3 seconds
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Reading 
> org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@51ceb7bc 
> expecting start txid #232056752162
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Start loading 
> edits file 
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH
>  maxTxnsToRead = 500
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding 
> stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH,
>  
> http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662
> [2019-07-17T11:50:21.059+08:00] [INFO] [Edit log tailer] : Fast-forwarding 
> stream 
> 'http://ip/getJournal?jid=ns1003&segmentTxId=232077264498&storageInfo=-63%3A1902204348%3A0%3ACID-hope-20180214-20161018-SQYH'
>  to transaction ID 232056751662
> [2019-07-17T11:50:21.061+08:00] [ERROR] [Edit log tailer] : Unknown error 
> encountered while tailing edits. Shutting down standby NN.
> java.io.IOException: There appears to be a gap in the edit log.  We expected 
> txid 232056752162, but got txid 232077264498.
> {code}
> Read data from JN twice in the same second,changed segmentTxId,finally quit 
> because of txid mismatch.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to