I'm running Hadoop 1.1.2 on a cluster of about 10 machines. I would like to
nicely add and remove nodes, both for HDFS and MapReduce.

I've noticed the *datanode* process dies on its own once decommissioning is
done, so I only stop the tasktracker by hand. This is what I do to remove a
node:

   - Add node to *mapred.exclude*
   - Add node to *hdfs.exclude*
   - $ hadoop mradmin -refreshNodes
   - $ hadoop dfsadmin -refreshNodes
   - $ hadoop-daemon.sh stop tasktracker
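
In case it helps to see the removal steps in one place, here is a rough
sketch of them as a bash function. It is only an illustration of the list
above, not a tested procedure. The EXCLUDE_DIR location, the one-hostname-
per-line file layout, and the run/exclude_host/remove_node helper names are
all my own assumptions. By default it only prints the hadoop commands
(DRY_RUN=1). Note that hadoop-daemon.sh acts on the local machine, so that
last step would have to be run on the node being removed.

```shell
#!/usr/bin/env bash
# Sketch of the removal steps; prints commands unless DRY_RUN=0.
# Assumption: both exclude files live in EXCLUDE_DIR (hypothetical path),
# one hostname per line.
EXCLUDE_DIR="${EXCLUDE_DIR:-/tmp/hadoop-excludes}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Append a host to an exclude file unless it is already listed.
exclude_host() {
  local host="$1" file="$2"
  mkdir -p "$(dirname "$file")"
  touch "$file"
  grep -qxF "$host" "$file" || echo "$host" >> "$file"
}

remove_node() {
  local host="$1"
  exclude_host "$host" "$EXCLUDE_DIR/mapred.exclude"
  exclude_host "$host" "$EXCLUDE_DIR/hdfs.exclude"
  run hadoop mradmin -refreshNodes
  run hadoop dfsadmin -refreshNodes
  # Has to be executed on the node that is being removed.
  run hadoop-daemon.sh stop tasktracker
}
```

Something like "remove_node node07" (hostname made up) would then update
both exclude files and print the three commands to run.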

To add the node back in (assuming it was removed like above):

   - Remove from *mapred.exclude*
   - Remove from *hdfs.exclude*
   - $ hadoop mradmin -refreshNodes
   - $ hadoop dfsadmin -refreshNodes
   - $ hadoop-daemon.sh start tasktracker
   - $ hadoop-daemon.sh start datanode
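
The re-add steps could be sketched the same way. Again this is only an
illustration under my own assumptions: EXCLUDE_DIR is a hypothetical
location for the two exclude files (one hostname per line), and the
run/include_host/add_node names are made up. It prints the hadoop commands
by default (DRY_RUN=1), and the two hadoop-daemon.sh steps would need to be
run on the node that is coming back.

```shell
#!/usr/bin/env bash
# Sketch of the re-add steps; prints commands unless DRY_RUN=0.
# Assumption: both exclude files live in EXCLUDE_DIR (hypothetical path),
# one hostname per line.
EXCLUDE_DIR="${EXCLUDE_DIR:-/tmp/hadoop-excludes}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Drop every line that exactly matches the host from an exclude file.
include_host() {
  local host="$1" file="$2" tmp
  [ -f "$file" ] || return 0
  tmp="$(mktemp)"
  # grep exits 1 when no lines remain, which is fine here.
  grep -vxF "$host" "$file" > "$tmp" || true
  # Write back in place so the file keeps its owner and permissions.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}

add_node() {
  local host="$1"
  include_host "$host" "$EXCLUDE_DIR/mapred.exclude"
  include_host "$host" "$EXCLUDE_DIR/hdfs.exclude"
  run hadoop mradmin -refreshNodes
  run hadoop dfsadmin -refreshNodes
  # These two have to be executed on the node that is being added back.
  run hadoop-daemon.sh start tasktracker
  run hadoop-daemon.sh start datanode
}
```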

Is this the correct way to scale up and down "nicely"?

By "nicely", I mean without data loss, and without stopping tasks running
on the nodes that I'm removing. (I.e. I'm assuming that *$ hadoop-daemon.sh
stop tasktracker* lets the tasktracker finish any currently running tasks
before dying).

Thanks,
Philippe
