[
https://issues.apache.org/jira/browse/HDFS-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14157564#comment-14157564
]
Colin Patrick McCabe commented on HDFS-7185:
--------------------------------------------
Note that this bug isn't critical because edits continue to get written, and
everything functions normally except transferring the fsimage from the standby
to the primary prior to finalization.
I think we should consider updating the VERSION file of the NameNode as soon as
it is started with the new software and "{{-rollingUpgrade started}}". This
would avoid the weird situation we're in currently where we're writing edit
logs and fsimages with version -59 (or whatever the upgraded, newer version
is), but our VERSION is stuck at -55 (the previous, old version). It is weird
to have new edit logs when the VERSION is the old version, right? Perhaps we
can move the old VERSION file to a VERSION.prev which we can restore in the
case of rollback.
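A rough sketch of the VERSION.prev idea, to make the proposal concrete. This is
not Hadoop code: the class {{VersionFileSketch}} and its method names are
hypothetical, and it assumes only that VERSION is a Java properties file with a
{{layoutVersion}} key. A real change would also need to write the new file
atomically and cover every storage directory.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Properties;

// Hypothetical helper, not part of Hadoop: illustrates the VERSION.prev idea only.
final class VersionFileSketch {
    private VersionFileSketch() {}

    /** Copies VERSION to VERSION.prev, then rewrites layoutVersion in VERSION. */
    static void startRollingUpgrade(Path currentDir, int newLayoutVersion) {
        Path version = currentDir.resolve("VERSION");
        Path prev = currentDir.resolve("VERSION.prev");
        try {
            // Keep the old file so that a rollback can restore it unchanged.
            Files.copy(version, prev, StandardCopyOption.REPLACE_EXISTING);
            Properties props = new Properties();
            try (InputStream in = Files.newInputStream(version)) {
                props.load(in);
            }
            props.setProperty("layoutVersion", Integer.toString(newLayoutVersion));
            // Not atomic: a real implementation would write a temp file and rename it.
            try (OutputStream out = Files.newOutputStream(version)) {
                props.store(out, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Restores the pre-upgrade VERSION file by moving VERSION.prev over VERSION. */
    static void rollback(Path currentDir) {
        try {
            Files.move(currentDir.resolve("VERSION.prev"),
                    currentDir.resolve("VERSION"),
                    StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the layoutVersion property from VERSION. */
    static int readLayoutVersion(Path currentDir) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(currentDir.resolve("VERSION"))) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Integer.parseInt(props.getProperty("layoutVersion").trim());
    }
}
```

With something like this, {{startRollingUpgrade(dir, -59)}} would leave VERSION
at -59 and VERSION.prev at -55, and {{rollback(dir)}} would put the -55 file
back.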
I also noticed that the VERSION file on the standby namenode somehow changed to
-59 (the newer version), even before I invoked {{-rollingUpgrade finalize}}.
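For reference, the two storage info strings in the exception quoted below
differ only in their first field. The sketch below splits such a string and
checks exactly that. It assumes the field order is
layoutVersion:namespaceID:cTime:clusterID; the class {{StorageInfoSketch}} is
hypothetical and is not how ImageServlet itself performs the check.

```java
// Hypothetical sketch, not Hadoop code: parses a colon-separated storage info
// string, assuming the order layoutVersion:namespaceID:cTime:clusterID.
final class StorageInfoSketch {
    final int layoutVersion;
    final long namespaceId;
    final long cTime;
    final String clusterId;

    private StorageInfoSketch(int layoutVersion, long namespaceId,
                              long cTime, String clusterId) {
        this.layoutVersion = layoutVersion;
        this.namespaceId = namespaceId;
        this.cTime = cTime;
        this.clusterId = clusterId;
    }

    /** Splits into at most four fields; the cluster ID holds hyphens, not colons. */
    static StorageInfoSketch parse(String s) {
        String[] parts = s.split(":", 4);
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected 4 fields: " + s);
        }
        return new StorageInfoSketch(
                Integer.parseInt(parts[0]),
                Long.parseLong(parts[1]),
                Long.parseLong(parts[2]),
                parts[3]);
    }

    /** True when only the layout version differs between the two infos. */
    boolean differsOnlyInLayoutVersion(StorageInfoSketch other) {
        return layoutVersion != other.layoutVersion
                && namespaceId == other.namespaceId
                && cTime == other.cTime
                && clusterId.equals(other.clusterId);
    }
}
```

Parsing the -55 and -59 strings from the log and calling
{{differsOnlyInLayoutVersion}} on them returns true, which matches the theory
that the stale VERSION file is the only thing the active is objecting to.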
> The active NameNode will not accept an fsimage sent from the standby during
> rolling upgrade
> -------------------------------------------------------------------------------------------
>
> Key: HDFS-7185
> URL: https://issues.apache.org/jira/browse/HDFS-7185
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.6.0
> Reporter: Colin Patrick McCabe
>
> The active NameNode will not accept an fsimage sent from the standby during
> rolling upgrade. The active fails with the exception:
> {code}
> 18:25:07,620 WARN ImageServlet:198 - Received an invalid request file
> transfer request from a secondary with storage info
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> 18:25:07,620 WARN log:76 - Committed before 410 PutImage failed.
> java.io.IOException: This namenode has storage info
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary
> expected -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.validateRequest(ImageServlet.java:200)
> at
> org.apache.hadoop.hdfs.server.namenode.ImageServlet.doPut(ImageServlet.java:443)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:730)
> {code}
> On the standby, the exception is:
> {code}
> java.io.IOException: Exception during image upload:
> org.apache.hadoop.hdfs.server.namenode.TransferFsImage$HttpPutFailedException:
> This namenode has storage info
> -55:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6 but the secondary
> expected
> -59:65195028:0:CID-385de4d7-64e4-4dde-9f5d-0a6e431987f6
> at
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.doCheckpoint(StandbyCheckpointer.java:218)
> at
> org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer.access$1400(StandbyCheckpointer.java:62)
> {code}
> This seems to be a consequence of the fact that the VERSION file still is at
> -55 (the old version) even after the rolling upgrade has started. When the
> rolling upgrade is finalized with {{hdfs dfsadmin -rollingUpgrade finalize}},
> both VERSION files get set to the new version, and the problem goes away.