[
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13910982#comment-13910982
]
Andrew Wang commented on HDFS-5535:
-----------------------------------
Hi all,
It looks like this feature is getting close, nice work! Can we get a less
high-level rev of the design doc as we approach merge? It seems like details
surrounding e.g. the user API and implementation have been ironed out, so they
should be included. There are also some mentions of lite-decom that I believe
are now deprecated. It'd also be nice if someone could unify the section title
formatting, since there are a number of different parts (checkpoint/rollback,
NN failover, DN restart) and they each use their own formatting scheme. In
particular, it'd be very helpful to number the section titles consistently
(most word processing apps can do this for you).
I also had a few questions after reading the doc, sorry in advance if these
were already answered in the comments:
* Can you expand on NN/DN consistency with the rollback marker and heartbeat
notifications? I'm not familiar with append or lease recovery, so it'd be nice
to get more explanation on those in particular.
* Could you comment on your experience with the interval between an upgrade
and finalize? My impression is that right now, cluster operators might wait a
long time before finalizing to be safe (e.g. a week or two). Since
checkpointing would be paused while the rollback marker is in place, a lot of
edits would accumulate and NN startup time would suffer; see the rough sketch
after this list.
* Big +1 to not changing the layout version any further in the 2.x line after
this. With PB'd metadata and feature flags (whenever they arrive), this makes
NN upgrades a lot more pleasant. We should also call this out on the Hadoop
compatibility wiki page when this JIRA is merged.
* Can you comment on how riding out DN restarts interacts with the HBase MTTR
work? I know they've done a lot of work to reduce timeouts throughout the
stack, and riding out restarts sounds like it requires keeping the timeouts
up. It might help to specify your target restart time, for example for a DN
with 500k blocks.
* Are longer restarts (e.g. OS or hardware upgrade) part of the scope?
Obviously, 1-repl blocks would become an issue, and a super long timeout is not
a good solution. Maybe this is just the normal decom process needing love, but
it'd be nice to address these longer maintenance restarts too.
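To make the edit-accumulation concern in the second bullet concrete, here's a
rough back-of-envelope sketch. The transaction rate, finalize window, and
replay speed below are numbers I made up for illustration, not measurements
from anyone's cluster or figures from the design doc:
{code:java}
// Rough estimate of how many edits pile up while checkpointing is paused
// between "upgrade" and "finalize", and what that might add to NN startup.
// All inputs are assumed/illustrative values, not measured ones.
public class EditAccumulationEstimate {
  public static void main(String[] args) {
    long txPerSecond = 1_000;        // assumed average namespace ops/sec
    long finalizeWindowDays = 14;    // assumed "wait to be safe" window before finalize
    long replayTxPerSecond = 50_000; // assumed edit replay rate during NN startup

    long accumulatedTx = txPerSecond * finalizeWindowDays * 24L * 60 * 60;
    double replayMinutes = accumulatedTx / (double) replayTxPerSecond / 60.0;

    System.out.printf("Accumulated edits over %d days: ~%,d transactions%n",
        finalizeWindowDays, accumulatedTx);
    System.out.printf("Extra edit replay at startup: ~%.0f minutes%n", replayMinutes);
  }
}
{code}
Even with a generous replay rate, a couple of weeks of paused checkpointing
could translate into a noticeably longer NN restart, which is why I'm curious
how long operators would actually wait before finalizing in practice.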
> Umbrella jira for improved HDFS rolling upgrades
> ------------------------------------------------
>
> Key: HDFS-5535
> URL: https://issues.apache.org/jira/browse/HDFS-5535
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, ha, hdfs-client, namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Nathan Roberts
> Attachments: HDFSRollingUpgradesHighLevelDesign.pdf,
> h5535_20140219.patch, h5535_20140220-1554.patch, h5535_20140220b.patch,
> h5535_20140221-2031.patch
>
>
> In order to roll a new HDFS release through a large cluster quickly and
> safely, a few enhancements are needed in HDFS. An initial High level design
> document will be attached to this jira, and sub-jiras will itemize the
> individual tasks.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)