[ 
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14162282#comment-14162282
 ] 

Colin Patrick McCabe commented on HDFS-7185:
--------------------------------------------

bq. Hi Colin, one question: do we hit this exception only when the SBN has 
been upgraded to the new version of the software while the ANN is still 
running the old bits?

Both NameNodes have been upgraded.

> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade
> -------------------------------------------------------------------------------------------
>
>                 Key: HDFS-7185
>                 URL: https://issues.apache.org/jira/browse/HDFS-7185
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Colin Patrick McCabe
>
> The active NameNode will not accept an fsimage sent from the standby during 
> rolling upgrade.  The active fails with the exception:
> {code}
> 18:25:07,620  WARN ImageServlet:198 - Received an invalid request file 
> transfer request from a secondary with storage info 
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620  WARN log:76 - Committed before 410 PutImage failed. 
> java.io.IOException: This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
>         at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
>         at 
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
>         at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload: 
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
>  This namenode has storage info 
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary 
> expected
>  -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
>         at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
>         at 
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to happen because the VERSION file is still at layout version 
> -55 (the old version) even after the rolling upgrade has started.  When the 
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}}, 
> both VERSION files get set to the new version, and the problem goes away.
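
To make the mismatch in the logs above concrete, here is a minimal sketch that parses the two storage info strings and reports which fields differ. It is not the actual ImageServlet code: the class name StorageInfoCheck and its methods are hypothetical, and the field order (layoutVersion:namespaceID:cTime:clusterID) is an assumption read off the log output.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop.
final class StorageInfoCheck {
    final int layoutVersion;
    final long namespaceId;
    final long cTime;
    final String clusterId;

    StorageInfoCheck(int layoutVersion, long namespaceId, long cTime, String clusterId) {
        this.layoutVersion = layoutVersion;
        this.namespaceId = namespaceId;
        this.cTime = cTime;
        this.clusterId = clusterId;
    }

    // Parses a string assumed to have the form
    // layoutVersion:namespaceID:cTime:clusterID, as printed in the logs.
    static StorageInfoCheck parse(String s) {
        String[] parts = s.split(":", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Malformed storage info: " + s);
        }
        return new StorageInfoCheck(
                Integer.parseInt(parts[0]),
                Long.parseLong(parts[1]),
                Long.parseLong(parts[2]),
                parts[3]);
    }

    // Returns the names of the fields on which the two storage infos differ.
    static List<String> mismatches(StorageInfoCheck a, StorageInfoCheck b) {
        List<String> diff = new ArrayList<>();
        if (a.layoutVersion != b.layoutVersion) diff.add("layoutVersion");
        if (a.namespaceId != b.namespaceId) diff.add("namespaceID");
        if (a.cTime != b.cTime) diff.add("cTime");
        if (!a.clusterId.equals(b.clusterId)) diff.add("clusterID");
        return diff;
    }

    public static void main(String[] args) {
        // Values copied from the exception messages in this issue.
        StorageInfoCheck active = parse(
                "-55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6");
        StorageInfoCheck standby = parse(
                "-59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6");
        System.out.println("Mismatched fields: " + mismatches(active, standby));
    }
}
```

With the values from the logs, only the layout version differs, which is consistent with the VERSION file explanation above: the namespace ID, cTime, and cluster ID already agree between the two NameNodes.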



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
