Re: Persistent small number of Blocks with corrupt replicas / Under replicated blocks

Joey Echeverria Fri, 10 Jun 2011 15:34:00 -0700

Good question. I didn't pick up on the fact that fsck disagrees with
dfsadmin. Have you tried a full restart? Maybe somebody's information
is out of date?
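
One way to keep an eye on the mismatch is to diff the two summaries
mechanically. Below is a minimal sketch, not a definitive tool: it pulls
counters out of saved dfsadmin -report and fsck text and prints them side by
side. The helper names (count, compare) are made up for illustration, the
sample text is copied from the output quoted further down with the quote
markers stripped, and it assumes the two counters are comparable, which may
not be true (dfsadmin counts blocks with corrupt replicas, fsck counts
corrupt blocks).

```python
import re


def count(report, label):
    """Return the integer after 'label:' on the first matching line, else None."""
    pattern = r"^\s*" + re.escape(label) + r"\s*:\s*(\d+)"
    m = re.search(pattern, report, re.MULTILINE | re.IGNORECASE)
    return int(m.group(1)) if m else None


# Sample text copied from the thread (quote prefixes removed).
DFSADMIN = """\
Under replicated blocks: 18
Blocks with corrupt replicas: 34
Missing blocks: 0
"""

FSCK = """\
 Under-replicated blocks:        18 (0.018932223 %)
 Corrupt blocks:         0
 Missing replicas:               18 (0.0056074243 %)
"""


def compare(dfsadmin_text, fsck_text):
    """Pair up the counters that look like they should correspond (an assumption)."""
    return {
        "under_replicated": (
            count(dfsadmin_text, "Under replicated blocks"),
            count(fsck_text, "Under-replicated blocks"),
        ),
        "corrupt": (
            count(dfsadmin_text, "Blocks with corrupt replicas"),
            count(fsck_text, "Corrupt blocks"),
        ),
    }


result = compare(DFSADMIN, FSCK)
for name, (from_dfsadmin, from_fsck) in result.items():
    flag = "agree" if from_dfsadmin == from_fsck else "DISAGREE"
    print(f"{name}: dfsadmin={from_dfsadmin} fsck={from_fsck} -> {flag}")
```

Running it on fresh output before and after a restart would show whether the
corrupt-replica count actually changes.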


-Joey

On Fri, Jun 10, 2011 at 6:22 PM, Robert J Berger <[email protected]> wrote:
> I think the files may have been corrupted when I initially shut down the
> node that was still in decommissioning mode.
>
> Unfortunately I hadn't run dfsadmin -report shortly before the incident,
> so I can't be sure the corrupt replicas haven't been there for a while. I
> always assumed that the fsck command would tell me if there were issues.
>
> So will running hadoop fsck -move just move the corrupted replicas and leave 
> the good ones? Will this work even though fsck does not report any corruption?
>
> On Jun 9, 2011, at 3:20 PM, Joey Echeverria wrote:
>
>> hadoop fsck -move will move the corrupt files to /lost+found, which
>> will "fix" the report.
>>
>> Do you know what created the corrupt files?
>>
>> -Joey
>>
>> On Thu, Jun 9, 2011 at 3:04 PM, Robert J Berger <[email protected]> wrote:
>>> I'm still having this problem and am kind of paralyzed until I figure out 
>>> how to eliminate these Blocks with corrupt replicas.
>>>
>>> Here is the output of dfsadmin -report and fsck:
>>>
>>> dfsadmin -report
>>> Configured Capacity: 13723995700736 (12.48 TB)
>>> Present Capacity: 13731775356416 (12.49 TB)
>>> DFS Remaining: 4079794918277 (3.71 TB)
>>> DFS Used: 9651980438139 (8.78 TB)
>>> DFS Used%: 70.29%
>>> Under replicated blocks: 18
>>> Blocks with corrupt replicas: 34
>>> Missing blocks: 0
>>>
>>> -------------------------------------------------
>>> Datanodes available: 9 (9 total, 0 dead)
>>> (Not showing the nodes other than the one with Decommission in progress)
>>> ...
>>> Name: 10.195.10.175:50010
>>> Decommission Status : Decommission in progress
>>> Configured Capacity: 1731946381312 (1.58 TB)
>>> DFS Used: 1083853885440 (1009.42 GB)
>>> Non DFS Used: 0 (0 KB)
>>> DFS Remaining: 651169222656(606.45 GB)
>>> DFS Used%: 62.58%
>>> DFS Remaining%: 37.6%
>>> Last contact: Wed Jun 08 18:56:54 UTC 2011
>>> ...
>>>
>>> And the good bits from fsck:
>>>
>>> Status: HEALTHY
>>> Total size:     2832555958232 B (Total open files size: 134217728 B)
>>> Total dirs:     72151
>>> Total files:    65449 (Files currently being written: 9)
>>> Total blocks (validated):       95076 (avg. block size 29792544 B) (Total open file blocks (not validated): 10)
>>> Minimally replicated blocks:    95076 (100.0 %)
>>> Over-replicated blocks: 35667 (37.5142 %)
>>> Under-replicated blocks:        18 (0.018932223 %)
>>> Mis-replicated blocks:          0 (0.0 %)
>>> Default replication factor:     3
>>> Average block replication:      3.376278
>>> Corrupt blocks:         0
>>> Missing replicas:               18 (0.0056074243 %)
>>> Number of data-nodes:           9
>>> Number of racks:                1
>>>
>>>
>>> The filesystem under path '/' is HEALTHY
>>>
>>>
>>>
>>> On Jun 8, 2011, at 10:38 AM, Robert J Berger wrote:
>>>
>>>> Synopsis:
>>>> * After shutting down a datanode in a cluster, fsck declares CORRUPT with
>>>> missing blocks
>>>> * I restore/restart the datanode and fsck soon declares things healthy
>>>> * But dfsadmin -report says a small number of blocks have corrupt replicas
>>>> and an even smaller number of blocks are under replicated
>>>> * After a couple of days the number of corrupt replicas and under
>>>> replicated blocks stays the same
>>>>
>>>> Full Story:
>>>> My goal is to rebalance blocks across the 3 drives within each of 2
>>>> datanodes in a 9-datanode (Replication=3) cluster running Hadoop 0.20.1.
>>>> (EBS volumes were added to the datanodes over time, so one disk had 95%
>>>> usage and the others had significantly less.)
>>>>
>>>> The plan was to decommission the nodes and then wipe the disks and then 
>>>> add them back in to the cluster.
>>>>
>>>> Before I started I ran fsck and all was healthy. (Unfortunately I did not 
>>>> really look at the dfsadmin -report at that time, so I can't be sure if 
>>>> there were no blocks with corrupt replicas at this point)
>>>>
>>>> I put two nodes into the decommission process, and after about 36 hours
>>>> neither had finished decommissioning. So I decided to throw caution to
>>>> the wind and shut down one of them. (I had taken the node I was shutting
>>>> down out of the dfs.exclude.file file, also removed the 2nd node from
>>>> the dfs.exclude.file, and ran dfsadmin -refreshNodes, but kept the 2nd
>>>> node live.)
>>>>
>>>> After shutting down one node, running fsck showed about 400 blocks as 
>>>> missing.
>>>>
>>>> So I brought the shutdown node back up (it took a while as I had to
>>>> restore it from an EBS snapshot) and fsck quickly went back to healthy,
>>>> but with a significant number of over-replicated blocks.
>>>>
>>>> I put that node back into the decommissioning state (put just that one
>>>> node back in the dfs.exclude.file and ran dfsadmin -refreshNodes).
>>>>
>>>> After another day or so, it's still in decommissioning mode. Fsck says
>>>> the cluster is healthy but still shows 37% over-replicated blocks.
>>>>
>>>> But the thing that concerns me is that dfsadmin -report says:
>>>>
>>>> Under replicated blocks: 18
>>>> Blocks with corrupt replicas: 34
>>>>
>>>> So really two questions:
>>>>
>>>> * Is there a way to force these corrupt replicas and under replicated 
>>>> blocks to get fixed?
>>>> * Is there a way to speed up the decommissioning process (without
>>>> restarting the cluster)?
>>>>
>>>> I presume that it's not safe for me to take down this node until the
>>>> decommissioning completes and/or the corrupt replicas are fixed.
>>>>
>>>> And finally, is there a better way to accomplish the original task of 
>>>> rebalancing disks on a datanode?
>>>>
>>>> Thanks!
>>>> Rob
>>>> __________________
>>>> Robert J Berger - CTO
>>>> Runa Inc.
>>>> +1 408-838-8896
>>>> http://blog.ibd.com
>>>>
>>>>
>>>>
>>>
>>> __________________
>>> Robert J Berger - CTO
>>> Runa Inc.
>>> +1 408-838-8896
>>> http://blog.ibd.com
>>>
>>>
>>>
>>>
>>
>>
>>
>> --
>> Joseph Echeverria
>> Cloudera, Inc.
>> 443.305.9434
>
> __________________
> Robert J Berger - CTO
> Runa Inc.
> +1 408-838-8896
> http://blog.ibd.com
>
>
>
>



-- 
Joseph Echeverria
Cloudera, Inc.
443.305.9434
