[ 
https://issues.apache.org/jira/browse/HADOOP-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251843#comment-13251843
 ] 

Aaron T. Myers commented on HADOOP-8209:
----------------------------------------

Patch looks pretty good to me, Eli. Just a few small comments:

# Not obvious to me why we have these static version methods in the Storage 
class, which themselves just delegate to static methods of the VersionInfo 
class.
# Recommend adding more detail to the AssertionErrors, including the 
revisions and versions that didn't match.
# Recommend adding an explanation to the DN log message about why the 
communication is being allowed, e.g.: "... because versions match exactly ('" + 
version + "') and hadoop.relaxed.worker.version.check is enabled." Ditto for TT.
# Similarly, the log message explaining why communication isn't being allowed 
might mention whether the check failed because of strict revision checking or 
relaxed version checking.
# Why call the new method "getInfoVersion" in JobTracker? getVersion, as was 
done in Storage, seems to make more sense to me.
# In TestTaskTrackerVersionCheck#testDefaultVersionCheck, I don't think you 
actually test that different revisions are still disallowed by default, since 
you change both the revision and version simultaneously in the test.
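
To make comments 3, 4 and 6 concrete, here is a rough sketch of the kind of check and log reasons I have in mind. It is not taken from the patch: the class name, method name, and message wording are all hypothetical, and it models only the DN-style comparison (strict mode compares revisions, relaxed mode compares versions). The only name taken from this issue is the hadoop.relaxed.worker.version.check option.

```java
// Hypothetical sketch; none of these class or method names come from the patch.
final class VersionCheckSketch {

    /** Outcome of a check plus a human-readable reason suitable for a log message. */
    static final class Result {
        final boolean allowed;
        final String reason;

        Result(boolean allowed, String reason) {
            this.allowed = allowed;
            this.reason = reason;
        }
    }

    /**
     * Assumed semantics: with the relaxed option off, build revisions must match;
     * with it on, only the version string (e.g. "1.0.2") must match.
     */
    static Result check(String localVersion, String localRevision,
                        String remoteVersion, String remoteRevision,
                        boolean relaxedCheck) {
        if (relaxedCheck) {
            if (localVersion.equals(remoteVersion)) {
                return new Result(true, "versions match exactly ('" + localVersion
                        + "') and hadoop.relaxed.worker.version.check is enabled");
            }
            return new Result(false, "relaxed version check failed: local version '"
                    + localVersion + "' != remote version '" + remoteVersion + "'");
        }
        if (localRevision.equals(remoteRevision)) {
            return new Result(true, "revisions match exactly ('" + localRevision + "')");
        }
        return new Result(false, "strict revision check failed: local revision '"
                + localRevision + "' != remote revision '" + remoteRevision + "'");
    }

    public static void main(String[] args) {
        // Same version, different revision: rejected by default, allowed when relaxed.
        System.out.println(check("1.0.2", "r1", "1.0.2", "r2", false).reason);
        System.out.println(check("1.0.2", "r1", "1.0.2", "r2", true).reason);
    }
}
```

The first call in main is the case comment 6 asks the default test to cover: the version is held fixed and only the revision changes, so a rejection can only come from the strict revision check.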
                
> Add option to enable DN and TT rolling upgrades in branch-1
> -----------------------------------------------------------
>
>                 Key: HADOOP-8209
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8209
>             Project: Hadoop Common
>          Issue Type: Improvement
>    Affects Versions: 1.0.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>         Attachments: hadoop-8209.txt
>
>
> In 1.x, DNs currently refuse to connect to NNs if their build *revision* (i.e. 
> svn revision) does not match. TTs refuse to connect to JTs if their build 
> *version* (version, revision, user, and source checksum) does not match.
> This prevents rolling upgrades, which is intentional; see the discussion in 
> HADOOP-5203. The primary motivations in that jira were that (1) it's difficult 
> to guarantee that every build on a large cluster got deployed correctly, that 
> builds don't get rolled back to old versions by accident, etc., and (2) mixed 
> versions can lead to execution problems that are hard to debug.
> However, there are also cases when users know the two builds are compatible, 
> e.g. when deploying a new build which contains the same contents as the 
> previous one, plus a critical security patch that does not affect 
> compatibility. Currently, deploying a 1-line patch requires taking down the 
> entire cluster (or trying to work around the issue by lying about the build 
> revision or checksum, yuck). These users would like to be able to perform a 
> rolling upgrade.
> In order to support this, let's add an option that is off by default but, 
> when enabled, makes the DN and TT version check just check for an exact 
> version match (e.g. "1.0.2") but ignore the build revision (DN) and the source 
> checksum (TT). Two builds still need to match the major, minor, and point 
> numbers, but nothing else.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
