Owen O'Malley wrote:
allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker
and it will use ssh in parallel to connect to every node between
node1000 and node3000. Ours is a mess, but it would be great if someone
contributed a script like that. *smile*
It would be a one-line change to bin/slaves.sh to have it filter hosts
by a regex.
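
A minimal sketch of that kind of filter, assuming a slaves file with one host per line. The function name filter_hosts and its arguments are hypothetical and are not part of the existing bin/slaves.sh:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bin/slaves.sh.
# Usage: filter_hosts HOSTLIST_FILE [EXTENDED_REGEX]
# Prints the hosts in HOSTLIST_FILE that match the regex.
# With no regex given, every host is printed.
filter_hosts() {
  local hostlist=$1
  local regex=${2:-.}
  # Strip comments and blank lines, then keep only matching hosts.
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$hostlist" | grep -E -- "$regex"
}
```

The existing loop over the slaves file could then iterate over the output of filter_hosts instead of the raw file.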
Note that bin/slaves.sh can have problems with larger clusters (>~100
nodes) since a single shell has trouble handling the i/o from 100
sub-processes, and ssh connections will start timing out. That's the
point of the HADOOP_SLAVE_SLEEP parameter, to meter the rate at which
sub-processes are spawned. A better solution might be to sleep if the
number of sub-processes exceeds some limit, e.g.:
while [[ $(jobs | wc -l) -gt 10 ]]; do sleep 1 ; done
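
Put together, the throttling idea might look like the sketch below. It assumes bash; run_throttled is a hypothetical name, the hosts arrive on stdin one per line, and the command is called with the host as its last argument. It counts only running jobs (jobs -r) so that finished ones do not hold up the loop:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bin/slaves.sh.
# Usage: run_throttled LIMIT COMMAND [ARGS...] < hostlist
# Runs "COMMAND ARGS... HOST" in the background for each host on stdin,
# pausing whenever LIMIT background jobs are already running.
run_throttled() {
  local limit=$1 host
  shift
  while read -r host; do
    # Wait until fewer than $limit jobs are still running.
    while [[ $(jobs -r | wc -l) -ge $limit ]]; do
      sleep 1
    done
    "$@" "$host" &
  done
  wait  # let the remaining jobs finish before returning
}
```

To use it with ssh, one option is to wrap the ssh call in a small function that takes the host as its final argument and pass that function as COMMAND.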
Doug