It should be safe to run fsck -move. Worst case, corrupt files end up in /lost+found. The job files are probably related to the under-replicated blocks: the default replication factor for job files is 10, and I noticed you have 9 datanodes.
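If you want a record of what gets moved, something like this should work (off the top of my head, so treat it as a sketch; "/" is just the path to check):

  # see which files and blocks are affected before moving anything
  hadoop fsck / -files -blocks -locations > fsck-before.txt

  # move files with corrupt blocks into /lost+found
  hadoop fsck / -move

  # check that the corrupt replica count went down
  hadoop dfsadmin -report

And if any leftover job files are still asking for 10 replicas, you could drop them to something your 9 datanodes can actually satisfy, e.g.:

  hadoop fs -setrep -w 9 /path/to/leftover/job/files

(/path/to/leftover/job/files is just a placeholder for wherever those files live.)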
The under-replication would probably also have prevented the node from decommissioning. If you run fsck -move, I'd be interested to know whether that fixes the corrupt replicas.

-Joey

On Jun 10, 2011, at 21:09, Robert J Berger <rber...@runa.com> wrote:

> I can't really do a full restart unless it's the only option.
>
> I did find some old temporary mapred job files that were considered
> under-replicated, so I deleted them, and the node that was taking forever to
> decommission finished decommissioning (not sure if there was really a causal
> connection).
>
> But the corrupted replicas were still there.
>
> Would there be any negative consequences to running the fsck -move just to
> try it?
>
> On Jun 10, 2011, at 3:33 PM, Joey Echeverria wrote:
>
>> Good question. I didn't pick up on the fact that fsck disagrees with
>> dfsadmin. Have you tried a full restart? Maybe somebody's information
>> is out of date?
>>
>> -Joey
>>
>> On Fri, Jun 10, 2011 at 6:22 PM, Robert J Berger <rber...@runa.com> wrote:
>>> I think the files may have been corrupted when I initially shut down
>>> the node that was still in decommissioning mode.
>>>
>>> Unfortunately I hadn't run dfsadmin -report any time soon before the
>>> incident, so I can't be sure that they haven't been there for a while. I
>>> always assumed that the fsck command would tell me if there were issues.
>>>
>>> So will running hadoop fsck -move just move the corrupted replicas and
>>> leave the good ones? Will this work even though fsck does not report any
>>> corruption?
>>>
>>> On Jun 9, 2011, at 3:20 PM, Joey Echeverria wrote:
>>>
>>>> hadoop fsck -move will move the corrupt files to /lost+found, which
>>>> will "fix" the report.
>>>>
>>>> Do you know what created the corrupt files?
>>>>
>>>> -Joey
>>>>
>>>> On Thu, Jun 9, 2011 at 3:04 PM, Robert J Berger <rber...@runa.com> wrote:
>>>>> I'm still having this problem and am kind of paralyzed until I figure out
>>>>> how to eliminate these blocks with corrupt replicas.
>>>>>
>>>>> Here is the output of dfsadmin -report and fsck:
>>>>>
>>>>> dfsadmin -report
>>>>> Configured Capacity: 13723995700736 (12.48 TB)
>>>>> Present Capacity: 13731775356416 (12.49 TB)
>>>>> DFS Remaining: 4079794918277 (3.71 TB)
>>>>> DFS Used: 9651980438139 (8.78 TB)
>>>>> DFS Used%: 70.29%
>>>>> Under replicated blocks: 18
>>>>> Blocks with corrupt replicas: 34
>>>>> Missing blocks: 0
>>>>>
>>>>> -------------------------------------------------
>>>>> Datanodes available: 9 (9 total, 0 dead)
>>>>> (Not showing the nodes other than the one with Decommission in progress)
>>>>> ...
>>>>> Name: 10.195.10.175:50010
>>>>> Decommission Status : Decommission in progress
>>>>> Configured Capacity: 1731946381312 (1.58 TB)
>>>>> DFS Used: 1083853885440 (1009.42 GB)
>>>>> Non DFS Used: 0 (0 KB)
>>>>> DFS Remaining: 651169222656 (606.45 GB)
>>>>> DFS Used%: 62.58%
>>>>> DFS Remaining%: 37.6%
>>>>> Last contact: Wed Jun 08 18:56:54 UTC 2011
>>>>> ...
>>>>>
>>>>> And the good bits from fsck:
>>>>>
>>>>> Status: HEALTHY
>>>>>  Total size: 2832555958232 B (Total open files size: 134217728 B)
>>>>>  Total dirs: 72151
>>>>>  Total files: 65449 (Files currently being written: 9)
>>>>>  Total blocks (validated): 95076 (avg. block size 29792544 B) (Total open file blocks (not validated): 10)
>>>>>  Minimally replicated blocks: 95076 (100.0 %)
>>>>>  Over-replicated blocks: 35667 (37.5142 %)
>>>>>  Under-replicated blocks: 18 (0.018932223 %)
>>>>>  Mis-replicated blocks: 0 (0.0 %)
>>>>>  Default replication factor: 3
>>>>>  Average block replication: 3.376278
>>>>>  Corrupt blocks: 0
>>>>>  Missing replicas: 18 (0.0056074243 %)
>>>>>  Number of data-nodes: 9
>>>>>  Number of racks: 1
>>>>>
>>>>> The filesystem under path '/' is HEALTHY
>>>>>
>>>>> On Jun 8, 2011, at 10:38 AM, Robert J Berger wrote:
>>>>>
>>>>>> Synopsis:
>>>>>> * After shutting down a datanode in the cluster, fsck declares CORRUPT
>>>>>> with missing blocks.
>>>>>> * I restore/restart the datanode and fsck soon declares things healthy.
>>>>>> * But dfsadmin -report says a small number of blocks have corrupt
>>>>>> replicas, plus an even smaller number of under-replicated blocks.
>>>>>> * After a couple of days, the number of corrupt replicas and
>>>>>> under-replicated blocks stays the same.
>>>>>>
>>>>>> Full story:
>>>>>> My goal is to rebalance blocks across the 3 drives within each of 2
>>>>>> datanodes in a 9-datanode (replication = 3) cluster running Hadoop 0.20.1.
>>>>>> (EBS volumes were added to the datanodes over time, so one disk had 95%
>>>>>> usage and the others had significantly less.)
>>>>>>
>>>>>> The plan was to decommission the nodes, wipe the disks, and then add
>>>>>> them back into the cluster.
>>>>>>
>>>>>> Before I started I ran fsck and all was healthy. (Unfortunately I did
>>>>>> not really look at the dfsadmin -report at that time, so I can't be sure
>>>>>> there were no blocks with corrupt replicas at that point.)
>>>>>>
>>>>>> I put two nodes into the decommission process, and after waiting about 36
>>>>>> hours neither had finished decommissioning. So I decided to throw
>>>>>> caution to the wind and shut down one of them. (I had taken the node I
>>>>>> was shutting down out of dfs.exclude.file, also removed the 2nd node
>>>>>> from dfs.exclude.file, and ran dfsadmin -refreshNodes, but kept the 2nd
>>>>>> node live.)
>>>>>>
>>>>>> After shutting down one node, running fsck showed about 400 blocks as
>>>>>> missing.
>>>>>>
>>>>>> So I brought the shutdown node back up (it took a while, as I had to
>>>>>> restore it from an EBS snapshot) and fsck quickly went back to healthy,
>>>>>> but with a significant number of over-replicated blocks.
>>>>>>
>>>>>> I put that node back into the decommissioning state (put just that one
>>>>>> node back in dfs.exclude.file and ran dfsadmin -refreshNodes).
>>>>>>
>>>>>> After another day or so, it's still in decommissioning mode. Fsck says
>>>>>> the cluster is healthy, but still shows 37% over-replicated blocks.
>>>>>>
>>>>>> But the thing that concerns me is that dfsadmin -report says:
>>>>>>
>>>>>> Under replicated blocks: 18
>>>>>> Blocks with corrupt replicas: 34
>>>>>>
>>>>>> So really two questions:
>>>>>>
>>>>>> * Is there a way to force these corrupt replicas and under-replicated
>>>>>> blocks to get fixed?
>>>>>> * Is there a way to speed up the decommissioning process (without
>>>>>> restarting the cluster)?
>>>>>>
>>>>>> I presume that it's not safe for me to take down this node until the
>>>>>> decommissioning completes and/or the corrupt replicas are fixed.
>>>>>>
>>>>>> And finally, is there a better way to accomplish the original task of
>>>>>> rebalancing disks on a datanode?
>>>>>>
>>>>>> Thanks!
>>>>>> Rob
>>>>>> __________________
>>>>>> Robert J Berger - CTO
>>>>>> Runa Inc.
>>>>>> +1 408-838-8896
>>>>>> http://blog.ibd.com