[
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872269#comment-13872269
]
Kihwal Lee commented on HDFS-5535:
----------------------------------
bq. The total time required to upgrade a cluster MUST not exceed
#Nodes_in_cluster * 10 seconds.
This is about how fast the upgrade process can go while minimally impacting
service and data availability. Please note that this is a requirement for the
upgrade feature. It does not dictate what users should do. This requirement
exists mainly to help users estimate how soon a cluster can be upgraded and
also to force us to guarantee that those estimates stay valid in the future.
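To make the bound concrete, here is a rough sketch of the arithmetic in Python.
Only the 10 seconds-per-node figure comes from the requirement; the function
name and the cluster sizes are made up for illustration.

```python
# Per-node time budget taken from the quoted requirement.
SECONDS_PER_NODE = 10


def max_upgrade_seconds(num_nodes: int) -> int:
    """Upper bound on total upgrade time implied by the requirement."""
    if num_nodes < 0:
        raise ValueError("num_nodes must be non-negative")
    return num_nodes * SECONDS_PER_NODE


# Hypothetical cluster sizes, chosen only to show the scale of the bound.
for n in (100, 1000, 4000):
    bound = max_upgrade_seconds(n)
    print(f"{n} nodes: at most {bound} s (~{bound / 3600:.1f} h)")
```

For a 4000-node cluster this gives an upper bound of 40000 seconds, or about
11 hours. It is a ceiling on what the feature must support, not a schedule
that users have to follow.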
bq. Probably meant to say that old software should be able to support whatever
state of the file system left after the upgrade experiment was terminated?
I know you didn't intend it to be, but this sounds like the requirement is
reduced to maintaining file system integrity. It could simply be "Data
durability must not be compromised by upgrades or downgrades".
bq. May be it needs to roll edits in some special way to indicate the start of
the rolling upgrade?
I believe this came up during discussions, but I do not remember the conclusion.
We will clarify this.
bq. What is MTTR?
Mean time to recovery.
bq. Looks like Lite-Decom and “Optimizing DN Restart time” are competing
proposals
Yes, indeed. We will do the latter, which will be more in line with existing
tool-driven approaches. Lite-Decom may be considered in later development
phases for other use cases (e.g. the case Ming Ma mentioned above), but regular
DN rolling upgrade won't depend on it.
> Umbrella jira for improved HDFS rolling upgrades
> ------------------------------------------------
>
> Key: HDFS-5535
> URL: https://issues.apache.org/jira/browse/HDFS-5535
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, ha, hdfs-client, namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Nathan Roberts
> Attachments: HDFSRollingUpgradesHighLevelDesign.pdf
>
>
> In order to roll a new HDFS release through a large cluster quickly and
> safely, a few enhancements are needed in HDFS. An initial High level design
> document will be attached to this jira, and sub-jiras will itemize the
> individual tasks.