[ https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251843#comment-13251843 ]
Aaron T. Myers commented on HADOOP-8209:
----------------------------------------
Patch looks pretty good to me, Eli. Just a few small comments:
# Not obvious to me why we have these static version methods in the Storage
class, which themselves just delegate to static methods of the VersionInfo
class.
# Recommend adding more detail to the AssertionErrors, including the
revisions and versions that didn't match.
# Recommend adding an explanation to the DN log message about why the
communication is being allowed, e.g.: "... because versions match exactly ('" +
version + "') and hadoop.relaxed.worker.version.check is enabled." Ditto for TT.
# Similarly, the log message explaining why communication isn't being allowed
might mention whether the check failed because of strict revision checking or
relaxed version checking.
# Why call the new method "getInfoVersion" in JobTracker? getVersion, as was
done in Storage, seems to make more sense to me.
# In TestTaskTrackerVersionCheck#testDefaultVersionCheck, I don't think you
actually test that different revisions are still disallowed by default, since
you change both the revision and version simultaneously in the test.
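
To make points 3, 4 and 6 concrete, here is a minimal sketch of a DN-style
check that returns its reason along with the decision. The class, method, and
field names are hypothetical and are not taken from hadoop-8209.txt; only the
hadoop.relaxed.worker.version.check key comes from the discussion above.

```java
// Hypothetical sketch: names are illustrative, not the patch's actual code.
public class VersionCheckSketch {

    /** Decision plus a reason string that can be appended to a log message. */
    public static final class Result {
        public final boolean allowed;
        public final String reason;

        Result(boolean allowed, String reason) {
            this.allowed = allowed;
            this.reason = reason;
        }
    }

    /**
     * DN-style check (assumed behavior): by default the build revisions must
     * match; with the relaxed option enabled only the version strings must.
     */
    public static Result check(String localVersion, String localRevision,
                               String remoteVersion, String remoteRevision,
                               boolean relaxed) {
        if (relaxed) {
            if (localVersion.equals(remoteVersion)) {
                return new Result(true, "versions match exactly ('" + localVersion
                    + "') and hadoop.relaxed.worker.version.check is enabled");
            }
            return new Result(false, "relaxed version check failed: '"
                + localVersion + "' vs '" + remoteVersion + "'");
        }
        if (localRevision.equals(remoteRevision)) {
            return new Result(true, "revisions match exactly ('"
                + localRevision + "')");
        }
        return new Result(false, "strict revision check failed: '"
            + localRevision + "' vs '" + remoteRevision + "'");
    }

    public static void main(String[] args) {
        // Vary only the revision, keeping the version fixed, so the default
        // (strict) path is exercised on its own.
        Result strict = check("1.0.2", "r100", "1.0.2", "r200", false);
        System.out.println("default: allowed=" + strict.allowed
            + " because " + strict.reason);

        // Same inputs with the relaxed option turned on.
        Result relaxed = check("1.0.2", "r100", "1.0.2", "r200", true);
        System.out.println("relaxed: allowed=" + relaxed.allowed
            + " because " + relaxed.reason);
    }
}
```

Changing one input at a time, as main() does, is what lets a test show that
differing revisions alone are still rejected by default.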
> Add option to enable DN and TT rolling upgrades in branch-1
> -----------------------------------------------------------
>
> Key: HADOOP-8209
> URL: https://issues.apache.org/jira/browse/HADOOP-8209
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 1.0.0
> Reporter: Eli Collins
> Assignee: Eli Collins
> Attachments: hadoop-8209.txt
>
>
> In 1.x, DNs currently refuse to connect to NNs if their build *revision* (i.e.
> the svn revision) does not match. TTs refuse to connect to JTs if their build
> *version* (version, revision, user, and source checksum) does not match.
> This prevents rolling upgrades, which is intentional; see the discussion in
> HADOOP-5203. The primary motivations in that jira were that (1) it's difficult
> to guarantee that every build on a large cluster was deployed correctly, that
> builds don't get rolled back to old versions by accident, etc., and (2) mixed
> versions can lead to execution problems that are hard to debug.
> However, there are also cases when users know the two builds are compatible,
> e.g. when deploying a new build which contains the same contents as the
> previous one, plus a critical security patch that does not affect
> compatibility. Currently, deploying a one-line patch requires taking down the
> entire cluster (or trying to work around the issue by lying about the build
> revision or checksum, yuck). These users would like to be able to perform a
> rolling upgrade.
> In order to support this, let's add an option that is off by default but,
> when enabled, makes the DN and TT version check just check for an exact
> version match (e.g. "1.0.2") but ignore the build revision (DN) and the source
> checksum (TT). Two builds still need to match the major, minor, and point
> numbers, but nothing else.