Yuriy Malygin created HDFS-13236:
------------------------------------

             Summary: Standby NN down with error encountered while tailing edits
                 Key: HDFS-13236
                 URL: https://issues.apache.org/jira/browse/HDFS-13236
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: journal-node, namenode
    Affects Versions: 3.0.0
            Reporter: Yuriy Malygin


After update Hadoop from 2.7.3 to 3.0.0 standby NN down with error encountered 
while tailing edits from JN:
{code:java}
Feb 28 01:58:31 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:31,594 INFO 
[FSImageSaver for /one/hadoop-data/dfs of type IMAGE_AND_EDITS] 
FSImageFormatProtobuf - Image file 
/one/hadoop-data/dfs/current/fsimage.ckpt_00000000012748979
98 of size 4595971949 bytes saved in 93 seconds.
Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 INFO 
[Standby State Checkpointer] NNStorageRetentionManager - Going to retain 2 
images with txid >= 1274897935
Feb 28 01:58:33 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:33,445 INFO 
[Standby State Checkpointer] NNStorageRetentionManager - Purging old image 
FSImageFile(file=/one/hadoop-data/dfs/current/fsimage_0000000001274897875, 
cpktTxId
=0000000001274897875)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 INFO 
[Edit log tailer] FSImage - Reading 
org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@6a168e6f 
expecting start txid #1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,660 INFO 
[Edit log tailer] FSImage - Start loading edits file 
http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup&segmentTxId=1274897999&storageInfo=-64%3A10
56233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848&inProgressOk=true, 
http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup&segmentTxId=1274897999&storageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848&;
inProgressOk=true
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 INFO 
[Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup&segmentTxId=1274897999&s
torageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848&inProgressOk=true,
 
http://srve2916.local:8480/getJournal?jid=datalab-hadoop-backup&segmentTxId=1274897999&storageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217
-aef5-6ed206893848&inProgressOk=true' to transaction ID 1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,661 INFO 
[Edit log tailer] RedundantEditLogInputStream - Fast-forwarding stream 
'http://srvd87.local:8480/getJournal?jid=datalab-hadoop-backup&segmentTxId=1274897999&storageInfo=-64%3A1056233980%3A0%3ACID-1fba08aa-c8bd-4217-aef5-6ed206893848&inProgressOk=true'
 to transaction ID 1274897999
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,680 ERROR 
[Edit log tailer] FSEditLogLoader - Encountered exception on operation AddOp 
[length=0, inodeId=145550319, 
path=/kafka/parquet/infrastructureGrace/date=2018-02-28/_temporary/1/_temporary/attempt_1516181147167_20856_r_000098_0/part-r-00098.gz.parquet,
 replication=3, mtime=1519772206615, atime=1519772206615, blockSize=134217728, 
blocks=[], permissions=root:supergroup:rw-r--r--, aclEntries=null, 
clientName=DFSClient_attempt_1516181147167_20856_r_000098_0_1523538799_1, 
clientMachine=10.137.2.142, overwrite=false, RpcClientId=, RpcCallId=271996603, 
storagePolicyId=0, erasureCodingPolicyId=0, opCode=OP_ADD, txid=1274898002]
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 
java.lang.IllegalArgumentException: Invalid clientId - length is 0 expected 
length 16
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
com.google.common.base.Preconditions.checkArgument(Preconditions.java:92)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:74)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntry.<init>(RetryCache.java:86)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache$CacheEntryWithPayload.<init>(RetryCache.java:163)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.ipc.RetryCache.addCacheEntryWithPayload(RetryCache.java:322)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addCacheEntryWithPayload(FSNamesystem.java:946)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.applyEditLogOp(FSEditLogLoader.java:397)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:249)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,688 FATAL 
[Edit log tailer] EditLogTailer - Unknown error encountered while tailing 
edits. Shutting down standby NN.
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: java.io.IOException: 
java.lang.IllegalStateException: Cannot skip to less than the current value 
(=145550319), where newValue=145550318
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1941)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:298)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:158)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:882)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:863)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:293)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:427)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$400(EditLogTailer.java:380)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:397)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:481)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:393)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: Caused by: 
java.lang.IllegalStateException: Cannot skip to less than the current value 
(=145550319), where newValue=145550318
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.util.SequentialNumber.skipTo(SequentialNumber.java:58)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: at 
org.apache.hadoop.hdfs.server.namenode.FSDirectory.resetLastInodeId(FSDirectory.java:1939)
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: ... 10 more
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,708  INFO 
[Edit log tailer] ExitUtil - Exiting with status 1: java.io.IOException: 
java.lang.IllegalStateException: Cannot skip to less than the current value 
(=145550319), where newValue=145550318
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 2018-02-28 01:58:34,722 INFO 
[pool-1-thread-1] NameNode - SHUTDOWN_MSG:
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: 
/************************************************************
Feb 28 01:58:34 srvd2135 datalab-namenode[15566]: SHUTDOWN_MSG: Shutting down 
NameNode at srvd2135.local/10.137.2.39
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Reply via email to