[ 
https://issues.apache.org/jira/browse/HADOOP-7359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046708#comment-13046708
 ] 

Aaron T. Myers commented on HADOOP-7359:
----------------------------------------

bq. Would anyone object to allowing the HostsReader to trigger refreshNodes? 
That would let Hadoop scan for or be notified of cluster membership changes and 
automagically do the Right Thing.

In the abstract I think this is a fine change to make.

bq. Introduce a "Refreshable" interface that both FSNamesystem and JobTracker 
implement, that only defines a refreshNodes method. HostsReader would have an 
initialize method that takes a Refreshable and users could choose to call 
refreshNodes.

I don't think "Refreshable" is the best name; it seems a little too generic to 
me. How about something like "NodeListRefreshable"?

Also, the NN and the JT already implement the interfaces 
{{o.a.h.hdfs.protocol.ClientProtocol}} and 
{{o.a.h.mapred.AdminOperationsProtocol}}, respectively. Both interfaces require 
a {{refreshNodes()}} method, and the two methods happen to have the same 
signature. You could just make these interfaces extend your new interface; 
that would give you the genericity you need without having to touch the NN or 
JT classes at all.
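
To make the suggestion concrete, here is a rough sketch. The two protocol interfaces below are simplified stand-ins with the same names as the real ones, not the actual Hadoop sources. {{FakeNameNode}}, {{FakeJobTracker}} and {{RefreshDemo}} are hypothetical names used only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical new interface, named as suggested above.
interface NodeListRefreshable {
  void refreshNodes() throws IOException;
}

// Stand-in for o.a.h.hdfs.protocol.ClientProtocol (all other methods omitted).
// It no longer declares refreshNodes() itself; it inherits it.
interface ClientProtocol extends NodeListRefreshable {
}

// Stand-in for o.a.h.mapred.AdminOperationsProtocol (other methods omitted).
interface AdminOperationsProtocol extends NodeListRefreshable {
}

// Stand-in for the NN: its class body needs no change, because it already
// implements ClientProtocol and already has a refreshNodes() method.
class FakeNameNode implements ClientProtocol {
  int refreshCount = 0;

  @Override
  public void refreshNodes() {
    refreshCount++;
  }
}

// Stand-in for the JT, same idea.
class FakeJobTracker implements AdminOperationsProtocol {
  int refreshCount = 0;

  @Override
  public void refreshNodes() {
    refreshCount++;
  }
}

class RefreshDemo {
  // A caller such as a hosts reader only needs to know about
  // NodeListRefreshable, not about the NN or the JT.
  static void refreshAll(NodeListRefreshable... targets) {
    for (NodeListRefreshable target : targets) {
      try {
        target.refreshNodes();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
    }
  }
}
```

With this shape, anything that holds a {{NodeListRefreshable}} can trigger a refresh on either daemon through the same call.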

bq. The current file-based cluster membership would continue to work exactly as 
it does today.

That seems wise to me. The proposed change would also make it easy to have 
the {{HostsFileReader}} periodically check the mtime of the hosts files and, 
if they've changed, re-read them automatically and call {{refreshNodes()}} on 
the relevant {{NodeListRefreshable}}.
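
A minimal sketch of that mtime check, under some assumptions: {{HostsFileMtimeWatcher}} is a hypothetical class name, {{NodeListRefreshable}} is the interface proposed above, and the re-reading of the files is assumed to happen inside {{refreshNodes()}}. The periodic scheduling is left out; something would have to call {{checkOnce()}} on a timer.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;

// Hypothetical interface, as proposed in this comment.
interface NodeListRefreshable {
  void refreshNodes() throws IOException;
}

// Hypothetical helper: remembers the last seen mtime of the include and
// exclude files and triggers a refresh when either one changes.
class HostsFileMtimeWatcher {
  private final Path includesFile;
  private final Path excludesFile;
  private final NodeListRefreshable target;
  private FileTime lastIncludesMtime;
  private FileTime lastExcludesMtime;

  HostsFileMtimeWatcher(Path includesFile, Path excludesFile,
                        NodeListRefreshable target) throws IOException {
    this.includesFile = includesFile;
    this.excludesFile = excludesFile;
    this.target = target;
    // Record the current mtimes so the first check does not fire spuriously.
    this.lastIncludesMtime = Files.getLastModifiedTime(includesFile);
    this.lastExcludesMtime = Files.getLastModifiedTime(excludesFile);
  }

  // Returns true if a change was detected and refreshNodes() was called.
  boolean checkOnce() throws IOException {
    FileTime includesMtime = Files.getLastModifiedTime(includesFile);
    FileTime excludesMtime = Files.getLastModifiedTime(excludesFile);
    if (includesMtime.equals(lastIncludesMtime)
        && excludesMtime.equals(lastExcludesMtime)) {
      return false;
    }
    lastIncludesMtime = includesMtime;
    lastExcludesMtime = excludesMtime;
    target.refreshNodes();
    return true;
  }
}
```

Note that mtime granularity depends on the filesystem, so two edits within the same timestamp tick could be missed by a check like this.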

> Pluggable interface for cluster membership
> ------------------------------------------
>
>                 Key: HADOOP-7359
>                 URL: https://issues.apache.org/jira/browse/HADOOP-7359
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Travis Crawford
>         Attachments: HADOOP-7359.diff
>
>
> Currently Hadoop uses local files to determine cluster membership. With HDFS 
> for example, dfs.hosts and dfs.hosts.exclude are used.
> To enable tighter integrations, cluster membership should be an interface, 
> with the current file-based functionality provided as the default 
> implementation. The common case would see no functional change; however, 
> sites could plug in an alternative implementation, such as pulling the 
> machine lists from a machine database.
> DETAILS:
> Two machine lists, includes and excludes, are used to define cluster 
> membership and state. HostsFileReader currently handles reading these lists 
> from files, whose names are passed in by FSNamesystem for HDFS and JobTracker 
> for MR.
> The proposed change is adding a HostsReader interface to common, and changing 
> HostsFileReader to an abstract class that functions the same as today.
> Two new classes, DFSHostsFileReader and MRHostsFileReader, extend 
> HostsFileReader and simply pass the appropriate file names in. These new 
> classes are needed because config key names live outside common.
> Two new conf keys, defaulting to the file-based readers, would be added to 
> choose a different hosts reader: dfs.namenode.hosts.reader.class and 
> mapreduce.jobtracker.hosts.reader.class
> Comments/suggestions? I have most of this written already but would love some 
> feedback on the general idea before posting the diff.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
