[
https://issues.apache.org/jira/browse/HDFS-5535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911339#comment-13911339
]
Suresh Srinivas commented on HDFS-5535:
---------------------------------------
Thanks for the comments [~andrew.wang] and [~stack]. The design needs to be
updated. We will do that in a day or two. The comments related to unifying the
sections will be taken care of then.
Responses:
{quote}
Could you comment on your experiences regarding the interval between an upgrade
and finalize? My impression was that right now, cluster operators might wait a
long time before finalizing to be safe (e.g. a week or two). Since
checkpointing would be paused with the rollback marker, a lot of edits would
accumulate, and NN startup time would suffer.
{quote}
With the latest changes from HDFS-6000, checkpointing continues during a
rolling upgrade; we only retain a special fsimage for rolling back. BTW, not
finalizing for a week or two comes at a significant storage cost, since no
blocks are deleted until the upgrade is finalized. I generally recommend ~3
days to finalize, depending on the storage pressure in the cluster.
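For reference, here is a minimal sketch of the client-side lifecycle being
discussed, assuming the {{DistributedFileSystem#rollingUpgrade(RollingUpgradeAction)}}
client API from the attached patches; the {{hdfs dfsadmin -rollingUpgrade}}
command drives the same prepare/query/finalize flow:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.RollingUpgradeAction;
import org.apache.hadoop.hdfs.protocol.RollingUpgradeInfo;

public class RollingUpgradeFlow {
  public static void main(String[] args) throws Exception {
    // Assumes fs.defaultFS points at the cluster being upgraded.
    Configuration conf = new Configuration();
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // PREPARE: the NN creates the special rollback fsimage; with
    // HDFS-6000, normal checkpointing continues after this point.
    RollingUpgradeInfo info = dfs.rollingUpgrade(RollingUpgradeAction.PREPARE);
    System.out.println("prepare: " + info);

    // ... upgrade namenodes and datanodes to the new software ...

    // QUERY: check the in-progress upgrade state.
    info = dfs.rollingUpgrade(RollingUpgradeAction.QUERY);
    System.out.println("query: " + info);

    // FINALIZE: until this runs, datanodes keep deleted block files
    // around so a rollback can restore them, which is the storage
    // cost of waiting a week or two before finalizing.
    info = dfs.rollingUpgrade(RollingUpgradeAction.FINALIZE);
    System.out.println("finalize: " + info);
  }
}
{code}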
bq. We should also call this out on the Hadoop compatibility wiki page when
this JIRA goes in.
I will add some information about this. I do not think we can leave the layout
version as-is when new features are added. While the old version of the
namenode can perhaps handle the new editlog, getting rid of the data saved by a
newly added feature may not be straightforward in all cases.
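To make that concrete, here is an illustrative sketch of the pattern
{{LayoutVersion.Feature}} follows (not the actual HDFS enum; the first two
values mirror real entries, while {{SOME_NEW_FEATURE}} is hypothetical). Each
on-disk feature pins the layout version that introduced it, so a reader can
tell whether an fsimage or editlog may contain data it does not understand:
{code:java}
// Illustrative only; mirrors the LayoutVersion.Feature pattern rather
// than reproducing the actual HDFS enum. Layout versions are negative
// integers that decrease as on-disk features are added.
enum DiskFeature {
  FEDERATION(-35),
  SNAPSHOT(-40),
  SOME_NEW_FEATURE(-52);  // hypothetical feature added by a new release

  final int introducedIn;
  DiskFeature(int lv) { this.introducedIn = lv; }

  /** True if data written at layoutVersion may contain this feature. */
  boolean presentIn(int layoutVersion) {
    return layoutVersion <= introducedIn;
  }
}
{code}
An old namenode that only understands layout -40 cannot simply ignore records
written by a feature introduced at -52, which is why the layout version has to
move when on-disk features are added.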
bq. It might help to specify your target restart time, for example with a DN
with 500k blocks.
bq. What are you folks thinking here? I saw 60 seconds earlier up in the doc.
Some HBase deploys have this ratcheted down to a few seconds or so...
[~kihwal] or [~brandonli] can comment on the final timeout chosen here.
bq. Are longer restarts (e.g. OS or hardware upgrade) part of the scope?
What specifically are you referring to? Can you add more details?
bq. Downgrade sounds like it will be a load of work
[~stack], this is fairly straightforward. When the namenode layout has not
changed, the older release of the namenode can handle both the fsimage and the
edits written by the new namenode. In the current upgrade mechanism, we lose
lots of newly written data on rollback; with downgrade, the newly created data
can be retained. This would mainly work for dot releases and most likely not
for minor-release upgrades, where rollback may have to be used.
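As a hypothetical sketch of the gate this implies (names are illustrative, not
the actual NameNode startup code): downgrade is allowed only when the on-disk
layout written by the new release is one the old software still understands;
otherwise rollback to the pre-upgrade fsimage is the only option.
{code:java}
// Hypothetical illustration of the downgrade gate described above;
// not the actual NameNode code.
final class DowngradeGate {
  static void checkDowngradable(int storedLayoutVersion,
                                int softwareLayoutVersion) {
    // The stored version was written by the newer release. Since layout
    // versions grow more negative as features are added, a stored
    // version below what this software understands means the on-disk
    // data may contain structures the old release cannot parse.
    if (storedLayoutVersion < softwareLayoutVersion) {
      throw new IllegalStateException(
          "Cannot downgrade: on-disk layout " + storedLayoutVersion
          + " is newer than software layout " + softwareLayoutVersion
          + "; rollback to the pre-upgrade fsimage is required.");
    }
  }
}
{code}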
> Umbrella jira for improved HDFS rolling upgrades
> ------------------------------------------------
>
> Key: HDFS-5535
> URL: https://issues.apache.org/jira/browse/HDFS-5535
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: datanode, ha, hdfs-client, namenode
> Affects Versions: 3.0.0, 2.2.0
> Reporter: Nathan Roberts
> Attachments: HDFSRollingUpgradesHighLevelDesign.pdf,
> h5535_20140219.patch, h5535_20140220-1554.patch, h5535_20140220b.patch,
> h5535_20140221-2031.patch, h5535_20140224-1931.patch
>
>
> In order to roll a new HDFS release through a large cluster quickly and
> safely, a few enhancements are needed in HDFS. An initial High level design
> document will be attached to this jira, and sub-jiras will itemize the
> individual tasks.