I can't really do a full restart unless it's the only option. I did find some old temporary mapred job files that were considered under-replicated, so I deleted them, and the node that was taking forever to decommission finally finished decommissioning (not sure if there was really a causal connection).
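(Aside, in case it helps anyone reading the archive later: a minimal way to trace flagged blocks back to file paths is fsck's verbose flags; the temp-dir path below is just a placeholder, not our actual layout.)

  # list every file with its blocks and replica locations,
  # so under-replicated or corrupt replicas can be matched to paths
  hadoop fsck / -files -blocks -locations

  # same check narrowed to a suspected directory (placeholder path)
  hadoop fsck /tmp/mapred/staging -files -blocks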
But the corrupted replicas were still there. Would there be any negative consequences of running fsck -move just to try it?

On Jun 10, 2011, at 3:33 PM, Joey Echeverria wrote:

> Good question. I didn't pick up on the fact that fsck disagrees with dfsadmin. Have you tried a full restart? Maybe somebody's information is out of date?
>
> -Joey
>
> On Fri, Jun 10, 2011 at 6:22 PM, Robert J Berger <rber...@runa.com> wrote:
>> I think the files may have been corrupted when I initially shut down the node that was still in decommissioning mode.
>>
>> Unfortunately I hadn't run dfsadmin -report any time soon before the incident, so I can't be sure the corrupt replicas haven't been there for a while. I always assumed that the fsck command would tell me if there were issues.
>>
>> So will running hadoop fsck -move just move the corrupted replicas and leave the good ones? Will this work even though fsck does not report any corruption?
>>
>> On Jun 9, 2011, at 3:20 PM, Joey Echeverria wrote:
>>
>>> hadoop fsck -move will move the corrupt files to /lost+found, which will "fix" the report.
>>>
>>> Do you know what created the corrupt files?
>>>
>>> -Joey
>>>
>>> On Thu, Jun 9, 2011 at 3:04 PM, Robert J Berger <rber...@runa.com> wrote:
>>>> I'm still having this problem and am kind of paralyzed until I figure out how to eliminate these blocks with corrupt replicas.
>>>>
>>>> Here is the output of dfsadmin -report and fsck:
>>>>
>>>> dfsadmin -report
>>>> Configured Capacity: 13723995700736 (12.48 TB)
>>>> Present Capacity: 13731775356416 (12.49 TB)
>>>> DFS Remaining: 4079794918277 (3.71 TB)
>>>> DFS Used: 9651980438139 (8.78 TB)
>>>> DFS Used%: 70.29%
>>>> Under replicated blocks: 18
>>>> Blocks with corrupt replicas: 34
>>>> Missing blocks: 0
>>>>
>>>> -------------------------------------------------
>>>> Datanodes available: 9 (9 total, 0 dead)
>>>> (Not showing the nodes other than the one with Decommission in progress)
>>>> ...
>>>> Name: 10.195.10.175:50010
>>>> Decommission Status : Decommission in progress
>>>> Configured Capacity: 1731946381312 (1.58 TB)
>>>> DFS Used: 1083853885440 (1009.42 GB)
>>>> Non DFS Used: 0 (0 KB)
>>>> DFS Remaining: 651169222656 (606.45 GB)
>>>> DFS Used%: 62.58%
>>>> DFS Remaining%: 37.6%
>>>> Last contact: Wed Jun 08 18:56:54 UTC 2011
>>>> ...
>>>>
>>>> And the good bits from fsck:
>>>>
>>>> Status: HEALTHY
>>>> Total size: 2832555958232 B (Total open files size: 134217728 B)
>>>> Total dirs: 72151
>>>> Total files: 65449 (Files currently being written: 9)
>>>> Total blocks (validated): 95076 (avg. block size 29792544 B) (Total open file blocks (not validated): 10)
>>>> Minimally replicated blocks: 95076 (100.0 %)
>>>> Over-replicated blocks: 35667 (37.5142 %)
>>>> Under-replicated blocks: 18 (0.018932223 %)
>>>> Mis-replicated blocks: 0 (0.0 %)
>>>> Default replication factor: 3
>>>> Average block replication: 3.376278
>>>> Corrupt blocks: 0
>>>> Missing replicas: 18 (0.0056074243 %)
>>>> Number of data-nodes: 9
>>>> Number of racks: 1
>>>>
>>>> The filesystem under path '/' is HEALTHY
>>>>
>>>> On Jun 8, 2011, at 10:38 AM, Robert J Berger wrote:
>>>>
>>>>> Synopsis:
>>>>> * After shutting down a datanode in the cluster, fsck declares the filesystem CORRUPT with missing blocks.
>>>>> * I restore/restart the datanode and fsck soon declares things healthy.
>>>>> * But dfsadmin -report says a small number of blocks have corrupt replicas, and an even smaller number are under-replicated.
>>>>> * After a couple of days, the number of corrupt replicas and under-replicated blocks stays the same.
>>>>>
>>>>> Full Story:
>>>>> My goal is to rebalance blocks across the 3 drives within each of 2 datanodes in a 9-datanode (replication=3) cluster running Hadoop 0.20.1. (EBS volumes were added to the datanodes over time, so one disk had 95% usage and the others had significantly less.)
>>>>>
>>>>> The plan was to decommission the nodes, wipe the disks, and then add them back into the cluster.
>>>>>
>>>>> Before I started I ran fsck and all was healthy. (Unfortunately I did not really look at dfsadmin -report at that time, so I can't be sure there were no blocks with corrupt replicas at that point.)
>>>>>
>>>>> I put two nodes into the decommission process, and after waiting about 36 hours neither had finished decommissioning. So I decided to throw caution to the wind and shut one of them down. (I had taken the node I was shutting down out of the dfs.exclude.file, also removed the 2nd node from the dfs.exclude.file, ran dfsadmin -refreshNodes, but kept the 2nd node live.)
>>>>>
>>>>> After shutting down one node, running fsck showed about 400 blocks as missing.
>>>>>
>>>>> So I brought the shut-down node back up (it took a while as I had to restore it from an EBS snapshot) and fsck quickly went back to healthy, but with a significant number of over-replicated blocks.
>>>>>
>>>>> I put that node back into the decommissioning state (put just that one node back in the dfs.exclude.file and ran dfsadmin -refreshNodes).
>>>>>
>>>>> After another day or so, it's still in decommissioning mode. Fsck says the cluster is healthy, but still with 37% over-replicated blocks.
>>>>>
>>>>> But the thing that concerns me is that dfsadmin -report says:
>>>>>
>>>>> Under replicated blocks: 18
>>>>> Blocks with corrupt replicas: 34
>>>>>
>>>>> So really two questions:
>>>>>
>>>>> * Is there a way to force these corrupt replicas and under-replicated blocks to get fixed?
>>>>> * Is there a way to speed up the decommissioning process (without restarting the cluster)?
>>>>>
>>>>> I presume that it's not safe for me to take down this node until the decommissioning completes and/or the corrupt replicas are fixed.
>>>>>
>>>>> And finally, is there a better way to accomplish the original task of rebalancing disks on a datanode?
>>>>>
>>>>> Thanks!
>>>>> Rob
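(For reference, the decommission steps described in the quoted thread boil down to roughly the following sketch; the exclude-file path and hostname are placeholders, and dfs.hosts.exclude is the property I believe points the NameNode at that file in 0.20.x.)

  # hdfs-site.xml must point the NameNode at the exclude file, e.g.:
  #   <property>
  #     <name>dfs.hosts.exclude</name>
  #     <value>/etc/hadoop/conf/dfs.exclude</value>   (placeholder path)
  #   </property>

  # add the datanode to the exclude file, then have the NameNode re-read it
  echo "datanode-to-retire.example.com" >> /etc/hadoop/conf/dfs.exclude
  hadoop dfsadmin -refreshNodes

  # the node shows "Decommission Status : Decommission in progress" until
  # all of its blocks have been re-replicated elsewhere
  hadoop dfsadmin -report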
__________________
Robert J Berger - CTO
Runa Inc.
+1 408-838-8896
http://blog.ibd.com