Re: Zeroconf for hadoop

Doug Cutting Mon, 26 Jan 2009 10:36:29 -0800

Owen O'Malley wrote:

allssh -h node1000-3000 bin/hadoop-daemon.sh start tasktracker
and it will use ssh in parallel to connect to every node between node1000 and node3000. Ours is a mess, but it would be great if someone contributed a script like that. *smile*

It would be a one-line change to bin/slaves.sh to have it filter hosts by a regex.
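As a rough illustration of that kind of filter, here is a minimal sketch. It is not the actual contents of bin/slaves.sh: the function name filter_hosts and the idea of passing the regex as a second argument are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: filter_hosts is a hypothetical helper, not part of Hadoop.

# Print the hosts listed in file $1 that match extended regex $2.
# With no regex given, every host is printed.
filter_hosts() {
  local hostlist=$1 regex=${2:-.}
  # Strip comments and blank lines, then keep only the matching hosts.
  sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$hostlist" | grep -E -- "$regex"
}
```

A loop over slaves could then iterate over the output of filter_hosts with a pattern such as '^node(1[0-9]{3}|2[0-9]{3}|3000)$' instead of over the whole host list.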

Note that bin/slaves.sh can have problems with larger clusters (>~100 nodes) since a single shell has trouble handling the i/o from 100 sub-processes, and ssh connections will start timing out. That's the point of the HADOOP_SLAVE_SLEEP parameter, to meter the rate at which sub-processes are spawned. A better solution might be to sleep if the number of sub-processes exceeds some limit, e.g.:


  while [[ $(jobs | wc -l) -gt 10 ]]; do sleep 1 ; done
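Put into a loop over hosts, the idea might look like the sketch below. The names MAX_JOBS, run_on_host and throttled_run are made up for the example, and run_on_host is only a stand-in for the real ssh command.

```shell
#!/usr/bin/env bash
# Sketch only: MAX_JOBS, run_on_host and throttled_run are hypothetical names.

MAX_JOBS=${MAX_JOBS:-10}

# Stand-in for the per-host ssh invocation; replace with the real command.
run_on_host() {
  echo "$1: ok"
}

# Start one background job per host, never more than MAX_JOBS at a time.
throttled_run() {
  local host
  for host in "$@"; do
    # jobs -r lists only running jobs; wait while we are at the limit.
    while (( $(jobs -r | wc -l) >= MAX_JOBS )); do
      sleep 1
    done
    run_on_host "$host" &
  done
  wait  # let the remaining background jobs finish
}
```

Unlike a fixed sleep between spawns, this only pauses when the limit is actually reached, so small clusters are not slowed down.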

Doug
