Just for reference, these links:
http://wiki.apache.org/hadoop/FAQ#17
http://hadoop.apache.org/core/docs/r0.19.0/hdfs_user_guide.html#DFSAdmin+Command
Decommissioning does not happen all at once.
-refreshNodes only starts the process; it does not complete it.
There could be a lot of blocks on the nodes you want to decommission,
and replication takes time.
The progress can be monitored on the name-node web UI.
Right after -refreshNodes, the web UI will show the nodes you chose for
decommissioning in the state "Decommission In Progress". Wait until the state
changes to "Decommissioned", and only then turn the nodes off.
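If you prefer the command line to the web UI, a rough way to poll the same information is via dfsadmin -report (this is a sketch: it assumes `hadoop` is on your PATH with a running cluster, and the exact "Decommission" wording in the report output can vary between Hadoop versions):

```shell
# Print per-datanode status and filter for decommission-related lines.
# Re-run (or wrap in `watch`) until no node reports "Decommission In Progress".
hadoop dfsadmin -report | grep -i decommission
```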
--Konstantin
David Hall wrote:
I'm starting to think I'm doing things wrong.
I have dfs.hosts.exclude set (by absolute path) to a file listing the nodes I
want decommissioned, and dfs.hosts pointing at the nodes I want to remain
commissioned (it points to the slaves file).
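For comparison, the relevant hadoop-site.xml entries for that setup would look roughly like this (the paths below are placeholders, not the actual files from this cluster):

```xml
<!-- Sketch only: substitute the real absolute paths on your namenode. -->
<property>
  <name>dfs.hosts</name>
  <value>/path/to/conf/slaves</value>
</property>
<property>
  <name>dfs.hosts.exclude</name>
  <value>/path/to/conf/excluding</value>
</property>
```

Note that the namenode resolves these paths locally, so both files must exist on the namenode host itself.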
Nothing seems to do anything...
What am I missing?
-- David
On Thu, Dec 4, 2008 at 12:48 AM, David Hall <[EMAIL PROTECTED]> wrote:
Hi,
I'm trying to decommission some nodes. The process I tried to follow is:
1) add them to conf/excluding (hadoop-site points there)
2) invoke hadoop dfsadmin -refreshNodes
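Those two steps, as shell commands (a sketch: the hostnames are placeholders, and conf/excluding stands in for whatever file dfs.hosts.exclude points to; -refreshNodes needs a running namenode):

```shell
# 1) Add the nodes to decommission to the exclude file
#    referenced by dfs.hosts.exclude (hostnames are examples).
echo "datanode1.example.com" >> conf/excluding
echo "datanode2.example.com" >> conf/excluding

# 2) Tell the namenode to re-read the hosts/exclude files
#    and begin decommissioning the excluded nodes.
hadoop dfsadmin -refreshNodes
```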
The command returns immediately, so I thought it was done; I killed off
the cluster and rebooted without those nodes, but then fsck was very
unhappy...
Is there some way to watch the progress of decommissioning?
Thanks,
-- David