A patch with topology documentation was submitted, but it doesn't appear to 
have made it into any release.  This svn link may help, starting at line 1294.
 
http://svn.apache.org/viewvc?view=revision&revision=1411359  

Assuming you are using Hadoop 1.x and not YARN, the topology script only needs 
to be on the NameNode and JobTracker.  As you have noticed, it doesn't hurt 
anything to copy the script everywhere, since the TaskTracker and DataNode 
processes will ignore it.  Try looking at pdsh for controlling compute nodes 
and pushing files, but be careful: if you type a bad command, it will be run 
on every node. http://code.google.com/p/pdsh/
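
For reference, the script named by topology.script.file.name is just an 
executable that receives one or more IP addresses or hostnames as arguments 
and prints one rack path per argument on stdout.  A minimal sketch in Python 
follows; the subnet prefixes and rack names are made up for illustration, and 
a real script would more likely read them from your conf file:

```python
#!/usr/bin/env python
import sys

# Hypothetical subnet-prefix -> rack mapping (assumption, not from any
# real cluster); replace with your own layout or load it from a file.
RACKS = {
    "10.0.1.": "/dc1/rack1",
    "10.0.2.": "/dc1/rack2",
}

# Rack returned for anything the mapping does not cover.
DEFAULT_RACK = "/default-rack"


def resolve(host):
    """Return the rack path for a single IP address or hostname."""
    for prefix, rack in RACKS.items():
        if host.startswith(prefix):
            return rack
    return DEFAULT_RACK


def main(argv):
    # Several hosts may be passed in one call; print one rack path per
    # argument, in the same order, separated by whitespace.
    print(" ".join(resolve(h) for h in argv))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Running it by hand with a few addresses from the NameNode host is a quick way 
to check the mapping before restarting anything.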

-- Adam


On Jan 11, 2013, at 9:01 AM, Bryan Beaudreault <[email protected]> wrote:

> The documentation on topology conf (topology.script.file.name) is a little 
> sparse, and while we have it working in our cluster I am trying to make it a 
> little easier to configure.
> 
> Currently we upload a python file and conf file to every node in our cluster. 
>  However I have a feeling that it is only needed on the NameNode(s) and 
> perhaps JobTracker.  I checked the code for DataNode and see no reference to 
> this configuration parameter, but I wanted to check with you all before I 
> stop updating the conf on every one of my nodes.
> 
> Can anyone confirm whether these configuration files only need to be present 
> on the NameNode/JobTracker, or do they need to be on every node in a cluster?
> 
> Thanks
