[
https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Aaron T. Myers updated HDFS-5138:
---------------------------------
Attachment: HDFS-5138.patch
{quote}
+ // This is expected to happen for a stanby NN.
Typo (standby)
{quote}
Thanks, fixed.
{quote}
+ // Either they all return the same thing or this call fails, so we can
+ // just return the first result.
Would be good to assert that - eg in case one of the JNs crashed in the middle
of a previously attempted upgrade sequence.
{quote}
Sure, done.
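For illustration only, a minimal sketch of the kind of check discussed above, assuming the per-JN results have been collected into a map keyed by a JN identifier. The class and method names (QuorumResultCheck, requireUnanimous) and the exception type are hypothetical and are not taken from the patch.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.Objects;

// Hypothetical helper, NOT part of HDFS-5138.patch.
class QuorumResultCheck {
  /**
   * Returns the common value if every responder reported the same result.
   * Throws if there are no responses or if any two responders disagree,
   * e.g. because one JN crashed partway through an earlier upgrade attempt.
   */
  static <T> T requireUnanimous(Map<String, T> responses) {
    if (responses.isEmpty()) {
      throw new IllegalStateException("No responses received");
    }
    Iterator<Map.Entry<String, T>> it = responses.entrySet().iterator();
    Map.Entry<String, T> first = it.next();
    while (it.hasNext()) {
      Map.Entry<String, T> other = it.next();
      if (!Objects.equals(first.getValue(), other.getValue())) {
        throw new IllegalStateException("Results differ: "
            + first.getKey() + "=" + first.getValue() + ", "
            + other.getKey() + "=" + other.getValue());
      }
    }
    // All values matched, so returning the first one is safe.
    return first.getValue();
  }
}
```

With a check like this, "just return the first result" is only reached once agreement has actually been verified.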
{quote}
* @param useLock true - enables locking on the storage directory and false
* disables locking
+ * @param isShared whether or not this dir is shared between two NNs. true
+ * enables locking on the storage directory, false disables locking
I think this doc is now wrong because you inverted the sense of these booleans
- we don't lock the shared dir.
{quote}
Good catch. Fixed.
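To make the inverted sense concrete, here is a small sketch of the relationship between the two flags as described in the review comment (the shared dir is not locked). The class StorageDirSketch and the method shouldLock are hypothetical names used only for this illustration; they are not the patch's actual code.

```java
// Hypothetical illustration, NOT the Storage.StorageDirectory from the patch.
class StorageDirSketch {
  private final boolean isShared;

  StorageDirSketch(boolean isShared) {
    this.isShared = isShared;
  }

  /** Whether this directory is shared between two NNs. */
  boolean isShared() {
    return isShared;
  }

  /** Locking is the inverse of sharing: a shared dir is never locked. */
  boolean shouldLock() {
    return !isShared;
  }
}
```

Under this assumption the javadoc for isShared would read "true disables locking on the storage directory, false enables locking".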
{quote}
+ public synchronized void doFinalizeOfSharedLog() throws IOException {
+ public synchronized boolean canRollBackSharedLog(Storage prevStorage,
Style nit: extra space in the above two methods
{quote}
Fixed.
{quote}
+ if (!sd.isShared()) {
+ // This will be done on transition to active.
Worth a LOG.info or even warn here
{quote}
Added the following:
{code}
LOG.info("Not doing recovery on " + sd + " now. Will be done on "
+ "transition to active.");
{code}
bq. Currently it seems like whichever SBN starts up first has to be the one who
does the transition to active. Maybe a follow-up JIRA could be to relax that
constraint? Seems like it should be fine for either one of the NNs to actually
do the upgrade - the lock file is just to make sure they agree on the target
ctime.
Agreed, this seems like a good idea, and it can reasonably be done in a
follow-up JIRA. If that works for you, I'll file it when we commit this one.
{quote}
+ dfsadmin -finalizeUpgrade'>>> command while the NNs are running and one of them
+ is active. The active NN at the time this happens will perform the upgrade of
+ the shared log, and both of the NNs will finalize the upgrade in their local
I think here you mean the "finalization of the shared log"
{quote}
Sure did. Fixed.
> Support HDFS upgrade in HA
> --------------------------
>
> Key: HDFS-5138
> URL: https://issues.apache.org/jira/browse/HDFS-5138
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.1.1-beta
> Reporter: Kihwal Lee
> Assignee: Aaron T. Myers
> Priority: Blocker
> Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch,
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout
> version change between 2.0.x and 2.1.x, starting NN in upgrade mode was
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way
> to get around this was to disable HA and upgrade.
> The NN and the cluster cannot be flipped back to HA until the upgrade is
> finalized. If HA is disabled only on NN for layout upgrade and HA is turned
> back on without involving DNs, things will work, but finalizeUpgrade won't
> work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade
> snapshots won't get removed.
> We will need a different way of doing layout upgrade and upgrade snapshot.
> I am marking this as a 2.1.1-beta blocker based on feedback from others. If
> there is a reasonable workaround that does not increase maintenance window
> greatly, we can lower its priority from blocker to critical.