Way #3:
1) bring up all 8 datanodes (dn) and the namenode (nn)
2) retire one of your 4 nodes:
   - kill the datanode process
   - run hadoop dfsadmin -refreshNodes on the nn
3) repeat step 2) three more times (rough commands below)
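Roughly, for each of the 4 nodes being retired it comes down to something
like the following (hostnames are placeholders, not from the original
setup):

    # on the slave being retired (0.20.x-style daemon script)
    hadoop-daemon.sh stop datanode

    # on the nn: have it re-read its host lists and notice the dead node
    hadoop dfsadmin -refreshNodes

    # check block health before retiring the next node
    hadoop fsck / | grep -i "under-replicated"

Waiting until fsck reports no under-replicated blocks between each
retirement keeps any block from dropping below its 3 live replicas.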
On Fri, Aug 6, 2010 at 1:21 AM, Allen Wittenauer
<[email protected]> wrote:
>
> On Aug 5, 2010, at 10:42 PM, Steve Kuo wrote:
>
> > As part of our experimentation, the plan is to pull 4 slave nodes out of
> > an 8-slave/1-master cluster. With replication factor set to 3, I thought
> > losing half of the cluster might be too much for hdfs to recover. Thus I
> > copied out all relevant data from hdfs to local disk and reconfigured the
> > cluster.
>
> It depends. If you had configured Hadoop with a topology such that the
> 8 nodes were in 2 logical racks, then it would have worked just fine. If
> you didn't have any topology configured, then each node is considered its
> own rack. So pulling half of the grid down means you are likely losing a
> good chunk of all your blocks.
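>
> For reference, rack awareness in that era's Hadoop is just a script that
> maps hostnames/IPs to rack names, pointed at by the
> topology.script.file.name property in core-site.xml. A minimal sketch
> (the path, hostnames and 4-and-4 split are only illustrative):
>
>     #!/bin/sh
>     # /etc/hadoop/rack-map.sh -- print one rack name per host argument
>     for host in "$@"; do
>       case "$host" in
>         slave1|slave2|slave3|slave4) echo /rack1 ;;
>         *)                           echo /rack2 ;;
>       esac
>     done
>
> With two racks, the default block placement policy spreads replicas
> across both, so losing one whole rack still leaves a copy of every block.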
>
> >
> > The 4 slave nodes started okay but hdfs never left safe mode. The nn.log
> > has the following line. What is the best way to deal with this? Shall I
> > restart the cluster with 8 nodes and then delete
> > /data/hadoop-hadoop/mapred/system? Or shall I reformat hdfs?
>
> Two ways to go:
>
> Way #1:
>
> 1) configure dfs.hosts
> 2) bring up all 8 nodes
> 3) configure dfs.hosts.exclude to include the 4 you don't want
> 4) dfsadmin -refreshNodes to start decommissioning the 4 you don't want
>    (rough commands below)
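>
> Roughly, assuming dfs.hosts and dfs.hosts.exclude point at files like the
> ones below (paths and hostnames are only placeholders):
>
>     # hdfs-site.xml: dfs.hosts -> /etc/hadoop/dfs.include (all 8 slaves),
>     #                dfs.hosts.exclude -> /etc/hadoop/dfs.exclude
>     printf "slave5\nslave6\nslave7\nslave8\n" > /etc/hadoop/dfs.exclude
>
>     # tell the nn to re-read both files and start decommissioning
>     hadoop dfsadmin -refreshNodes
>
>     # the 4 excluded nodes show "Decommission in progress" until their
>     # blocks have been copied off, then "Decommissioned"
>     hadoop dfsadmin -report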
>
> Way #2:
>
> 1) configure a topology
> 2) bring up all 8 nodes
> 3) setrep all files +1
> 4) wait for nn to finish replication
> 5) pull 4 nodes
> 6) bring down nn
> 7) remove topology
> 8) bring nn up
> 9) setrep -1 (command sketch below)
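>
> The setrep steps in command form, assuming the replication factor of 3
> mentioned earlier (so +1 means 3 -> 4):
>
>     # step 3: bump every file to 4 replicas so the extra copy can land
>     # on the other rack
>     hadoop fs -setrep -R 4 /
>
>     # step 4: wait until the nn reports no under-replicated blocks
>     hadoop dfsadmin -report
>
>     # step 9 (after the 4 nodes are gone and the nn is back up): drop to 3
>     hadoop fs -setrep -R 3 /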
>
--
Best Wishes!
With best business regards!
--
Chen He
(402)613-9298
PhD. student of CSE Dept.
Research Assistant of Holland Computing Center
University of Nebraska-Lincoln
Lincoln NE 68588