UPDATE: The edit-log WARN turned out to have nothing to do with the current problem.
However, the replica placement warnings do look suspicious. Please have a look at the following logs:

2013-01-22 09:12:10,885 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place enough replicas, still in need of 1
2013-01-22 00:02:17,541 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Block: blk_4844131893883391179_3440513, Expected Replicas: 10, live replicas: 9, corrupt replicas: 0, decommissioned replicas: 1, excess replicas: 0, Is Open File: false, Datanodes having this block: 203.235.211.155:50010 203.235.211.156:50010 203.235.211.145:50010 203.235.211.144:50010 203.235.211.146:50010 203.235.211.158:50010 203.235.211.159:50010 203.235.211.157:50010 203.235.211.160:50010 203.235.211.143:50010 , Current Datanode: 203.235.211.155:50010, Is current datanode decommissioning: true

I have set my replication factor to 3, so I don't understand why Hadoop is trying to replicate this block to 10 nodes. I have decommissioned one node, which leaves 9 nodes in operation, so the block can never be replicated to 10 nodes.

I also see that all of the repeated warning messages like the one above are for blk_4844131893883391179_3440513. How would I delete the block? It's not showing up as a corrupt block in fsck. :(

BEN

On Tue, Jan 22, 2013 at 9:28 AM, Ben Kim <benkimkim...@gmail.com> wrote:

> Hi Varun, thank you for the response.
>
> No, there don't seem to be any corrupt blocks in my cluster.
> I ran "hadoop fsck -blocks /" and it didn't report any corrupt blocks.
>
> However, these two WARNings have been repeating constantly in the
> namenode log since the decommission:
>
> - 2013-01-22 09:16:30,908 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Cannot roll edit log,
> edits.new files already exists in all healthy directories:
> - 2013-01-22 09:12:10,885 WARN
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Not able to place
> enough replicas, still in need of 1
>
> There isn't any WARN or ERROR in the decommissioning datanode's log.
>
> Ben
>
>
> On Mon, Jan 21, 2013 at 3:05 PM, varun kumar <varun....@gmail.com> wrote:
>
>> Hi Ben,
>>
>> Are there any corrupted blocks in your hadoop cluster?
>>
>> Regards,
>> Varun Kumar
>>
>>
>> On Mon, Jan 21, 2013 at 8:22 AM, Ben Kim <benkimkim...@gmail.com> wrote:
>>
>>> Hi!
>>>
>>> I followed the decommissioning guide on the Hadoop HDFS wiki.
>>>
>>> The HDFS web UI shows that the decommissioning process has successfully
>>> begun.
>>>
>>> It started redeploying 80,000 blocks through the cluster, but for some
>>> reason it stopped at 9,059 blocks. I've waited 30 hours and there is
>>> still no progress.
>>>
>>> Anyone have any idea?
>>> --
>>>
>>> *Benjamin Kim*
>>> *benkimkimben at gmail*
>>
>>
>> --
>> Regards,
>> Varun Kumar.P
>
>
> --
>
> *Benjamin Kim*
> *benkimkimben at gmail*


--

*Benjamin Kim*
*benkimkimben at gmail*
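[A note on the questions in the thread above, as a minimal sketch assuming the stock Hadoop 1.x shell commands. "Expected Replicas: 10" most likely comes from the file that owns the block rather than from the cluster-wide dfs.replication setting: per-file replication overrides the default, and one common source is MapReduce job submission files (job.jar, job.split), which are written with mapred.submit.replication = 10 by default. With only 9 live datanodes remaining, a 10-replica target can never be satisfied, which would also explain a decommission that hangs just short of completion. The block ID below is taken from the log above; /path/to/owning/file is a hypothetical placeholder for whatever file fsck reports.]

    # Find which file owns the block: fsck prints each file followed by its
    # block list, so the line above the match names the file (widen -B for
    # multi-block files). Note that fsck over / walks the whole namespace.
    hadoop fsck / -files -blocks -locations | grep -B 1 blk_4844131893883391179

    # If the file is still needed, lower its replication target to something
    # the cluster can satisfy; -w waits until the new factor is met.
    hadoop fs -setrep -w 3 /path/to/owning/file

    # If it is a stale job artifact, removing the file removes the block.
    hadoop fs -rm /path/to/owning/file

[Once no block demands more replicas than there are live datanodes, the decommission should be able to finish; "hadoop dfsadmin -report" shows whether the node's Decommission Status has moved from "Decommission in progress" to "Decommissioned".]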