[ https://issues.apache.org/jira/browse/HDFS-7934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362832#comment-14362832 ]

J.Andreina commented on HDFS-7934:
----------------------------------

Steps to Reproduce:
=================

Step 1: Start NN1 as active and NN2 as standby.
Step 2: Run "hdfs dfsadmin -rollingUpgrade prepare".
Step 3: Start NN2 as active and NN1 as standby with the rolling upgrade "started" option.
Step 4: Restart the DN in upgrade mode as well.
{noformat}
NN2 active:
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:36 edits_inprogress_0000000000000000031
-rw-r--r-- 1 Rex users     350 Mar 13 17:33 fsimage_0000000000000000000
-rw-r--r-- 1 Rex users      62 Mar 13 17:33 fsimage_0000000000000000000.md5
-rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
-rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
-rw-r--r-- 1 Rex users       2 Mar 13 17:33 seen_txid
-rw-r--r-- 1 Rex users     206 Mar 13 17:36 VERSION
{noformat}
Step 5: Shut down NN2 (active).
Step 6: Write files.
{noformat}
NN1 active:
-rw-r--r-- 1 Rex users    1817 Mar 13 17:35 edits_0000000000000000001-0000000000000000026
-rw-r--r-- 1 Rex users      67 Mar 13 17:35 edits_0000000000000000027-0000000000000000029
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:35 edits_0000000000000000030-0000000000000000030
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:39 edits_inprogress_0000000000000000032
-rw-r--r-- 1 Rex users     350 Mar 13 17:32 fsimage_0000000000000000000
-rw-r--r-- 1 Rex users      62 Mar 13 17:32 fsimage_0000000000000000000.md5
-rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
-rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
-rw-r--r-- 1 Rex users       3 Mar 13 17:35 seen_txid
-rw-r--r-- 1 Rex users     206 Mar 13 17:32 VERSION
{noformat}

Step 7: Bring down both NNs.
Step 8: Start NN2 and NN1 with the rolling upgrade "rollback" option.

Issue:
======
NN2 (active) started successfully, but NN1 (standby) failed to start with the
following exception:

{noformat}
15/03/13 17:41:30 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: Gap in transactions. Expected to be able to read up until at least txid 31 but unable to find any edit logs containing txid 31
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.checkForGaps(FSEditLog.java:1617)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1575)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:647)
{noformat}

{noformat}
NN2 active:

-rw-r--r-- 1 Rex users 1048576 Mar 13 17:36 edits_0000000000000000031-0000000000000000031.trash
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:40 edits_inprogress_0000000000000000030
-rw-r--r-- 1 Rex users     350 Mar 13 17:33 fsimage_0000000000000000000
-rw-r--r-- 1 Rex users      62 Mar 13 17:33 fsimage_0000000000000000000.md5
-rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_0000000000000000029
-rw-r--r-- 1 Rex users      62 Mar 13 17:40 fsimage_0000000000000000029.md5
-rw-r--r-- 1 Rex users       2 Mar 13 17:33 seen_txid
-rw-r--r-- 1 Rex users     206 Mar 13 17:40 VERSION
{noformat}

{noformat}
NN1 standby:

-rw-r--r-- 1 Rex users    1817 Mar 13 17:35 edits_0000000000000000001-0000000000000000026
-rw-r--r-- 1 Rex users      67 Mar 13 17:35 edits_0000000000000000027-0000000000000000029
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:35 edits_0000000000000000030-0000000000000000030
-rw-r--r-- 1 Rex users 1048576 Mar 13 17:39 edits_0000000000000000032-0000000000000000062
-rw-r--r-- 1 Rex users     350 Mar 13 17:32 fsimage_0000000000000000000
-rw-r--r-- 1 Rex users      62 Mar 13 17:32 fsimage_0000000000000000000.md5
-rw-r--r-- 1 Rex users     622 Mar 13 17:36 fsimage_rollback_0000000000000000029
-rw-r--r-- 1 Rex users      71 Mar 13 17:36 fsimage_rollback_0000000000000000029.md5
-rw-r--r-- 1 Rex users       3 Mar 13 17:35 seen_txid
-rw-r--r-- 1 Rex users     206 Mar 13 17:32 VERSION
{noformat}
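
To make the gap visible, here is a small Python sketch that scans the finalized edit segment names from the NN1 standby listing and reports the first txid that no segment covers. It is only an illustration of the idea behind the "Gap in transactions" check, not the actual FSEditLog.checkForGaps code. The function names are hypothetical, and the range 30..31 is an assumption (30 is the txid after fsimage_rollback_...029, 31 is the txid named in the exception).

```python
import re

# Finalized edit segments are named edits_<firstTxId>-<lastTxId>.
# In-progress and ".trash" files deliberately do not match.
SEGMENT_RE = re.compile(r"^edits_(\d+)-(\d+)$")


def parse_segments(names):
    """Return sorted (first, last) txid ranges for finalized edit segments."""
    segments = []
    for name in names:
        match = SEGMENT_RE.match(name)
        if match:
            segments.append((int(match.group(1)), int(match.group(2))))
    return sorted(segments)


def first_missing_txid(segments, from_txid, to_at_least_txid):
    """Return the first txid in [from_txid, to_at_least_txid] that no
    segment covers, or None if the whole range is covered."""
    next_txid = from_txid
    for first, last in segments:
        if first > next_txid:
            break  # hole: nothing covers next_txid
        if last >= next_txid:
            next_txid = last + 1
        if next_txid > to_at_least_txid:
            break
    return next_txid if next_txid <= to_at_least_txid else None


# File names copied from the NN1 standby listing above.
NN1_FILES = [
    "edits_0000000000000000001-0000000000000000026",
    "edits_0000000000000000027-0000000000000000029",
    "edits_0000000000000000030-0000000000000000030",
    "edits_0000000000000000032-0000000000000000062",
    "fsimage_rollback_0000000000000000029",
]

# Assumed range: start after the rollback image (txid 29), and require
# at least txid 31 as stated in the exception message.
missing = first_missing_txid(parse_segments(NN1_FILES), 30, 31)
print("first missing txid:", missing)
```

With the NN1 file names above, the segment covering txid 31 is absent locally (NN2 only has it as a .trash file), which matches the txid reported in the exception.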


> During Rolling upgrade rollback, standby namenode startup fails.
> ----------------------------------------------------------------
>
>                 Key: HDFS-7934
>                 URL: https://issues.apache.org/jira/browse/HDFS-7934
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: J.Andreina
>            Assignee: J.Andreina
>            Priority: Critical
>
> During rolling upgrade rollback, standby namenode startup fails while
> loading edits when there is no local copy of the edits created after the
> upgrade (which have already been removed by the Active Namenode from the
> journal manager and from the Active's local storage).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
