[
https://issues.apache.org/jira/browse/HDFS-5223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529336#comment-14529336
]
Aaron T. Myers commented on HDFS-5223:
--------------------------------------
bq. Complexity in HDFS often arises from combinations of its features rather
than individual features in isolation. If individual features can be toggled,
then no two HDFS instances running the same software version are really
guaranteed to be alike. This becomes another layer of troubleshooting required
for a technical support team. Testing the possible combinations of features on
and off becomes a combinatorial explosion that's difficult for a QA team to
manage.
This is an issue, to be sure, but is this really different with or without
feature flags present? Even today, users can always choose to use or not use
all the various features of HDFS in any number of combinations. The fact that
presently all features are always enabled means that we should consider
ourselves obligated to make sure that all features work well with all other
features.
bq. Aside from managing metadata upgrades, we've also found rolling upgrade to
be valuable because of the OOB ack propagated through write pipelines
(HDFS-5583) to tell clients to pause rather than aborting the connection. Even
if it wasn't required from a metadata standpoint, some users might continue to
use rolling upgrade to get this benefit, even within a minor release line where
the layout version hasn't changed. Considering that use case, I see value in
improving our ability to downgrade within the current rolling upgrade scheme.
Fair point, but this suggests to me that the OOB ack feature should perhaps be
separated from the rolling upgrade feature, since those seem somewhat
orthogonal. One might want to use the OOB ack feature just when doing a rolling
restart (no upgrade) to effect a configuration change, without the additional
complexity of metadata changes, etc.
> Allow edit log/fsimage format changes without changing layout version
> ---------------------------------------------------------------------
>
> Key: HDFS-5223
> URL: https://issues.apache.org/jira/browse/HDFS-5223
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: namenode
> Affects Versions: 2.1.1-beta
> Reporter: Aaron T. Myers
> Assignee: Colin Patrick McCabe
> Attachments: HDFS-5223-HDFS-Downgrade-Extended-Support.pdf,
> HDFS-5223.004.patch, HDFS-5223.005.patch
>
>
> Currently all HDFS on-disk formats are versioned by a single layout version.
> This means that even for changes which might be backward compatible, like the
> addition of a new edit log op code, we must go through the full `namenode
> -upgrade' process, which requires coordination with DNs, etc. HDFS should
> support a lighter-weight alternative.
> Copied description from HDFS-8075, which is a duplicate and now closed. (by
> sanjay on April 7 2015)
> Background
> * HDFS image layout was changed to use Protobufs to allow easier forward and
> backward compatibility.
> * HDFS has a layout version which is changed on each change (even if only an
> optional protobuf field was added).
> * Hadoop supports two ways of going back during an upgrade:
> ** downgrade: go back to old binary version but use existing image/edits so
> that newly created files are not lost
> ** rollback: go back to "checkpoint" created before upgrade was started -
> hence newly created files are lost.
> Layout needs to be revisited if we want to support downgrade in some
> circumstances, which we don't today. Here are the use cases:
> * Some changes can support downgrade even though there was a change in layout,
> since there is no real data loss, only loss of new functionality. E.g.
> when we added ACLs one could have downgraded - there is no data loss but you
> will lose the newly created ACLs. That is acceptable for a user since one
> does not expect to retain the newly added ACLs in an old version.
> * Some changes may lead to data loss if the functionality was used. For
> example, the recent truncate feature will cause data loss if the functionality
> was actually used. Now one can tell admins NOT to use such new features
> until the upgrade is finalized, in which case one could potentially support
> downgrade.
> * A fairly fundamental change to layout where a downgrade is not possible but
> a rollback is. Say we change the layout completely from protobuf to something
> else. Another example is when HDFS moves to support partial namespace in
> memory - this is likely to be a fairly fundamental change in layout.
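
The three use cases above could be modelled roughly as a per-feature downgrade policy. The sketch below is only an illustration under stated assumptions, not the design in the attached patches: the `Policy` and `Feature` enums and the `canDowngrade` method are hypothetical names and do not correspond to any existing HDFS class.

```java
import java.util.Set;

public class DowngradeSketch {
    /** Hypothetical categories mirroring the three use cases in the description. */
    public enum Policy {
        DOWNGRADE_OK,        // only new functionality is lost (e.g. ACLs)
        DOWNGRADE_IF_UNUSED, // data loss only if the feature was used (e.g. truncate)
        ROLLBACK_ONLY        // fundamental layout change; only rollback is possible
    }

    /** Hypothetical feature list for illustration; not HDFS's real feature set. */
    public enum Feature {
        ACLS(Policy.DOWNGRADE_OK),
        TRUNCATE(Policy.DOWNGRADE_IF_UNUSED),
        NEW_IMAGE_FORMAT(Policy.ROLLBACK_ONLY);

        public final Policy policy;

        Feature(Policy policy) {
            this.policy = policy;
        }
    }

    /**
     * Returns true if a downgrade looks safe, given the features introduced by
     * the new version and the subset of them that was actually used.
     */
    public static boolean canDowngrade(Set<Feature> introduced, Set<Feature> used) {
        for (Feature f : introduced) {
            if (f.policy == Policy.ROLLBACK_ONLY) {
                return false; // no downgrade path at all
            }
            if (f.policy == Policy.DOWNGRADE_IF_UNUSED && used.contains(f)) {
                return false; // downgrading would lose data
            }
        }
        return true;
    }
}
```

Under this sketch, an upgrade that introduced ACLs and truncate would remain downgradable as long as truncate was never invoked before finalization.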