[ 
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-5138:
---------------------------------

    Attachment: HDFS-5138.patch

Here's a preliminary patch for folks to take a look at to get an idea of what 
I'm thinking here.

The patch makes a few changes to the way upgrade works in order to support HA 
upgrades. In particular:
* Normal, non-HA upgrade is left unchanged.
* I've removed the "-finalize" startup option. Starting the NN with this 
option now just directs users to the `hdfs dfsadmin -finalizeUpgrade' command. 
Supporting both styles of finalization seems unnecessary, and makes HA 
finalization more difficult.
* Starting the NN with the '-rollback' flag will perform the rollback just as 
before, but it will not then proceed to start the NN daemon. Continuing on to 
start the daemon makes HA rollback more difficult, and doesn't seem to be 
necessary or helpful: a rollback doesn't need to load the fsimage/edit log, so 
the rollback itself should be quick. Operators can then start the NN as normal 
after rolling back the FS.

To perform upgrade in the HA case, this patch does the following:
* Both NNs must be started with the '-upgrade' flag.
* On start, each one of the NNs will first try to create a special lock file, 
either in the shared edits dir in the NFS case or on each of the JNs in the QJM 
case. This lock file will contain the CTime that that NN would like to upgrade 
the FS to.
* One of the NNs will win the creation of this file, and the other will lose. 
Whichever one loses will use the CTime stored in this file as the CTime to 
upgrade its own local dirs.
* After either creating or reading the shared lock file, each NN will upgrade 
all of its local storage dirs, just as in a normal upgrade.
* At the time when either NN is transitioned to the active state, that NN will 
perform an upgrade of the shared log, either on NFS or on the JNs.
* To finalize an HA upgrade, an operator will just use `hdfs dfsadmin 
-finalizeUpgrade' as described before. The active NN at the time this happens 
will perform the finalization of the shared log. Finalization will also remove 
the shared log lock file previously described.
* To perform a rollback in the HA case, both NNs should first be shut down. The 
operator should run the rollback command on one of the NN boxes, which will 
perform the rollback on the local dirs there, as well as on the shared log, 
either on NFS or on the JNs. Afterward, this NN should be started and the 
operator should run `-bootstrapStandby' on the other NN.
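
To make the lock file step above concrete, here is a minimal sketch of the NFS 
flavor of the idea, assuming a plain file on a shared directory. The class and 
method names (UpgradeLockSketch, agreeOnCTime, releaseLock) are made up for 
illustration and are not taken from the patch. It also ignores the window in 
which the losing NN could read the file before the winner has finished writing 
it, which a real implementation would have to handle.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch; class and method names are not taken from the patch.
class UpgradeLockSketch {

    // Returns the CTime this NN should use when upgrading its local dirs.
    static long agreeOnCTime(Path lockFile, long proposedCTime) throws IOException {
        byte[] payload = Long.toString(proposedCTime).getBytes(StandardCharsets.UTF_8);
        try {
            // CREATE_NEW fails if the file already exists, so only one caller wins.
            Files.write(lockFile, payload,
                    StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE);
            return proposedCTime;
        } catch (FileAlreadyExistsException e) {
            // Loser: adopt the CTime recorded by the winner.
            String stored = new String(Files.readAllBytes(lockFile), StandardCharsets.UTF_8);
            return Long.parseLong(stored.trim());
        }
    }

    // Finalization removes the lock file again.
    static void releaseLock(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }
}
```

Whichever NN calls agreeOnCTime first keeps its own CTime; the second caller 
gets the first caller's value back, so both upgrade their local dirs to the 
same CTime.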

As far as how the code is structured, much of the actual local file system 
upgrade code is refactored out of FSImage and into a new class, NNUpgradeUtil. 
This is so that this code can be called both from FSImage, and from 
FileJournalManager. Then, a few new upgrade-related methods are added to the 
JournalManager interface and the QJournalProtocol, so as to support generic 
upgrade of the shared log, however that shared log is implemented.
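
As a rough illustration of that structure, the sketch below shows how a single 
set of upgrade methods on a journal interface lets the caller upgrade the 
shared log without knowing whether it lives on NFS or on the JNs. All names 
here (UpgradableJournal, FileJournalSketch, QuorumJournalSketch, 
SharedLogUpgrader) are hypothetical and do not reflect the actual method 
signatures in the patch. The quorum variant simply fans out to every node and 
does no real quorum handling.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface; not the real JournalManager API.
interface UpgradableJournal {
    void doUpgrade(long cTime);
    void doFinalize();
    void doRollback();
}

// Stand-in for a file-based journal; it only records what it was asked to do.
class FileJournalSketch implements UpgradableJournal {
    final List<String> actions = new ArrayList<>();

    public void doUpgrade(long cTime) { actions.add("upgrade:" + cTime); }
    public void doFinalize() { actions.add("finalize"); }
    public void doRollback() { actions.add("rollback"); }
}

// Stand-in for a quorum-based journal that forwards each call to its nodes.
class QuorumJournalSketch implements UpgradableJournal {
    private final List<UpgradableJournal> nodes;

    QuorumJournalSketch(List<UpgradableJournal> nodes) { this.nodes = nodes; }

    public void doUpgrade(long cTime) {
        for (UpgradableJournal n : nodes) { n.doUpgrade(cTime); }
    }
    public void doFinalize() {
        for (UpgradableJournal n : nodes) { n.doFinalize(); }
    }
    public void doRollback() {
        for (UpgradableJournal n : nodes) { n.doRollback(); }
    }
}

// The caller works against the interface only.
class SharedLogUpgrader {
    static void onTransitionToActive(UpgradableJournal sharedLog, long cTime) {
        sharedLog.doUpgrade(cTime);
    }
}
```

With this shape, the code path that runs when an NN becomes active is the same 
for both kinds of shared log; only the object passed in differs.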

Please have a look. This patch definitely needs some cleanup and more tests, 
but I think the majority of it should roughly work and folks can hopefully get 
the gist.

> Support HDFS upgrade in HA
> --------------------------
>
>                 Key: HDFS-5138
>                 URL: https://issues.apache.org/jira/browse/HDFS-5138
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.1.1-beta
>            Reporter: Kihwal Lee
>            Priority: Blocker
>         Attachments: HDFS-5138.patch
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout 
> version change between 2.0.x and 2.1.x, starting NN in upgrade mode was 
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
> to get around this was to disable HA and upgrade. 
> The NN and the cluster cannot be flipped back to HA until the upgrade is 
> finalized. If HA is disabled only on NN for layout upgrade and HA is turned 
> back on without involving DNs, things will work, but finalizeUpgrade won't 
> work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade 
> snapshots won't get removed.
> We will need a different way of doing layout upgrade and upgrade snapshot.  
> I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
> there is a reasonable workaround that does not increase maintenance window 
> greatly, we can lower its priority from blocker to critical.



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
