Owen O'Malley wrote:
allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker

and it will use ssh in parallel to connect to every node between node1000 and node3000. Ours is a mess, but it would be great if someone contributed a script like that. *smile*

It would be a one-line change to bin/slaves.sh to have it filter hosts by a regex.
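
As a rough sketch of what such a filter could look like (this is not the actual code in bin/slaves.sh; the function name filter_hosts and its two arguments are made up for illustration, and it assumes the slaves file lists one host per line with optional # comments):

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bin/slaves.sh: print the hosts in a
# slaves file whose names match an extended regular expression.
filter_hosts() {
  local slaves_file=$1   # assumed format: one host per line, '#' starts a comment
  local pattern=$2       # extended regex, e.g. '^node[12][0-9]+$'
  # Strip comments, drop blank lines, then keep only the matching hosts.
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$slaves_file" | grep -E -- "$pattern"
}
```

The host loop in the script could then iterate over the output of something like filter_hosts "$HOSTLIST" "$REGEX" instead of the whole file, where HOSTLIST and REGEX are again placeholder names.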

Note that bin/slaves.sh can have problems with larger clusters (more than roughly 100 nodes), since a single shell has trouble handling the I/O from 100 sub-processes, and ssh connections will start timing out. That's the point of the HADOOP_SLAVE_SLEEP parameter: to meter the rate at which sub-processes are spawned. A better solution might be to sleep whenever the number of sub-processes exceeds some limit, e.g.:

  while [[ $(jobs -r | wc -l) -gt 10 ]]; do sleep 1 ; done
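
To show where that check would sit, here is a minimal bash sketch of a throttled spawn loop. It is only an illustration under assumptions: run_throttled, max_jobs, and the command argument are hypothetical names, and the real script would run ssh to each host rather than an arbitrary command.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: run a command once per host in the background,
# never keeping more than max_jobs running at the same time.
run_throttled() {
  local max_jobs=$1; shift   # illustrative limit on concurrent sub-processes
  local cmd=$1; shift        # command or function to run for each host
  local host
  for host in "$@"; do
    # Sleep while the number of running background jobs is at the limit.
    while [[ $(jobs -r | wc -l) -ge $max_jobs ]]; do
      sleep 1
    done
    "$cmd" "$host" &
  done
  wait   # block until every remaining background job has finished
}
```

Unlike a fixed sleep between spawns, this only pauses when the limit is actually reached, so small clusters are not slowed down at all.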

Doug
