The best way to do this is by decommissioning the nodes, as Rich said.
Another way would be to bump up the replication factor with the setrep command (without the -w option), wait a few hours, and then reset the replication factor back to its original value. Then start the decommissioning process; it will be much quicker.
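Something along these lines, just as a sketch (the replication factors and the path are placeholders; I am assuming your default factor is 3):

    # raise the replication factor; without -w the command returns
    # immediately and the NameNode creates the extra replicas in the background
    hdfs dfs -setrep 5 /

    # ...wait a few hours, then drop back to the original factor
    hdfs dfs -setrep 3 /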
On 01/07/15 8:39 pm, Rich Haase wrote:
Hi Bing,
I would recommend that you add your 50 new nodes to the cluster and then decommission the 50 nodes you want to get rid of. You can do the decommission in one operation (albeit a lengthy one) by adding the nodes you want to decommission to your HDFS exclude file and running `hdfs dfsadmin -refreshNodes`. The decommission process will ensure that the data from your old nodes is redistributed across your cluster.
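Roughly like this, as a sketch (I'm assuming dfs.hosts.exclude in hdfs-site.xml already points at /etc/hadoop/conf/dfs.exclude; the path and hostname are placeholders for your environment):

    # add the hostnames of the nodes to retire to the exclude file, one per line
    echo "old-dn-01.example.com" >> /etc/hadoop/conf/dfs.exclude

    # tell the NameNode to re-read its include/exclude files and start decommissioning
    hdfs dfsadmin -refreshNodes

    # check progress; the nodes report "Decommission in progress" until their blocks are copied off
    hdfs dfsadmin -report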
Cheers,
Rich
On Jul 1, 2015, at 3:30 AM, Bing Jiang <[email protected]> wrote:
hi, guys.
I want to move all the blocks from 50 datanodes to another 50 new datanodes. One straightforward idea is to add the 50 new nodes to the Hadoop cluster first and then decommission the other 50 nodes one by one. But I believe that is not an efficient way to reach the goal.
So I plan to borrow the idea of the HDFS balancer, restricting the block movements to those 100 nodes. But I would need to disable write operations on the nodes that are to be decommissioned.
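Roughly what I have in mind is something like the following (just a sketch, and I am not sure these options do exactly what I want; the hosts file is a placeholder listing the 100 nodes involved):

    # limit balancing to the 50 old and 50 new datanodes listed in the file
    hdfs balancer -include -f /tmp/migration_nodes.txt -threshold 5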
Is there a way to put a DataNode into a safe (read-only) mode?
Of course, any thoughts on block movement between nodes will be appreciated.
--
Bing Jiang
*Rich Haase* | Sr. Software Engineer | Pandora
m (303) 887-1146 | [email protected]
--
Thanks and Regards
Gurmukh Singh