Tried that already; I even went up to 1.0f. I also tried different values for 
dfs.block.invalidate.limit, without impact.
I am hoping for something similar to an “expunge” command that would clear HDFS of all 
orphaned blocks.

From: Harsh J [mailto:[email protected]]
Sent: Friday, March 24, 2017 11:16 AM
To: Pol, Daniel (BigData) <[email protected]>; [email protected]
Subject: Re: HDFS - How to delete orphaned blocks

The rate of deletion of DN blocks is throttled via 
dfs.namenode.invalidate.work.pct.per.iteration (documented at 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml#dfs.namenode.invalidate.work.pct.per.iteration).
 If your problem is the rate, and your usage is such that you generate and 
delete a lot of data quickly, you can consider increasing the percentage 
represented by this value and restarting your NameNode.

P.S. Going too high may require raising heap sizes, so keep an eye on JVM 
heap usage across the NN and DNs after raising it.
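
For reference, a minimal sketch of what that change could look like in 
hdfs-site.xml on the NameNode. The value 0.50f is only an illustrative 
assumption, not a recommendation; the shipped default is 0.32f, and the value 
uses float notation where 1.0f means 100%.

```xml
<!-- hdfs-site.xml (NameNode). Illustrative value only; default is 0.32f. -->
<property>
  <name>dfs.namenode.invalidate.work.pct.per.iteration</name>
  <value>0.50f</value>
</property>
```

The NameNode has to be restarted for the new value to take effect.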

On Fri, 24 Mar 2017 at 21:42 Pol, Daniel (BigData) 
<[email protected]> wrote:
Hi !

Is there a way to delete “orphaned” blocks? I see this happening quite often 
if I change the HDFS storage policy and recreate data, or if a datanode fails 
and the data on it is “old” but not old enough. After a few days it goes away by 
itself, but I need a way to trigger it manually or make it faster. Right now I 
have to write scripts to detect the orphaned blocks and delete them manually 
outside Hadoop, or reformat my HDFS.

I get into a situation where ‘dfs -du’ shows little space in use:
sudo -u hdfs bin/hdfs dfs -du -h /
8.1 G    24.2 G   /app-logs
867      2.5 K    /benchmarks
2.0 G    6.0 G    /mr-history
762      2.2 K    /system
100.4 M  251.2 M  /user

I have nothing in Trash and no snapshots, but my dfsadmin report shows TBs of 
data in DFS used:
Name: 172.30.253.6:50010 (m07dn06)
Hostname: m07dn06
Decommission Status : Normal
Configured Capacity: 108579574620160 (98.75 TB)
DFS Used: 1756550197248 (1.60 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 106822554660864 (97.15 TB)
DFS Used%: 1.62%
DFS Remaining%: 98.38%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Fri Mar 24 12:57:07 CDT 2017
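
To put a number on the mismatch, here is a rough sketch that sums the second 
column of the ‘dfs -du -h’ output (space consumed including replication) and 
compares it with the "DFS Used" line of one datanode. The helper names 
consumed_bytes and dfs_used_bytes are made up for this example, the input is 
the text pasted above rather than live command output, and it assumes the 
two-size-column layout shown here.

```python
import re

# Sample text copied from this message; in practice it would be captured
# from `hdfs dfs -du -h /` and `hdfs dfsadmin -report`.
DU_OUTPUT = """\
8.1 G    24.2 G   /app-logs
867      2.5 K    /benchmarks
2.0 G    6.0 G    /mr-history
762      2.2 K    /system
100.4 M  251.2 M  /user
"""
REPORT_LINE = "DFS Used: 1756550197248 (1.60 TB)"

# Binary multipliers; a size without a unit letter is plain bytes.
UNITS = {None: 1, "K": 1024, "M": 1024**2, "G": 1024**3,
         "T": 1024**4, "P": 1024**5}

# <size> [unit]   <size with replication> [unit]   <path>
DU_ROW = re.compile(
    r"^(\d+(?:\.\d+)?)(?: ([KMGTP]))?\s+"
    r"(\d+(?:\.\d+)?)(?: ([KMGTP]))?\s+(/\S*)$"
)


def consumed_bytes(du_text):
    """Sum the second size column (space consumed across all replicas)."""
    total = 0.0
    for line in du_text.splitlines():
        match = DU_ROW.match(line.strip())
        if match:
            total += float(match.group(3)) * UNITS[match.group(4)]
    return total


def dfs_used_bytes(report_line):
    """Extract the raw byte count from a 'DFS Used: N (...)' line."""
    match = re.search(r"DFS Used:\s*(\d+)", report_line)
    if match is None:
        raise ValueError("no 'DFS Used' value found")
    return int(match.group(1))


if __name__ == "__main__":
    expected = consumed_bytes(DU_OUTPUT)
    reported = dfs_used_bytes(REPORT_LINE)
    print(f"du total (with replication): {expected / 1024**3:.1f} GiB")
    print(f"DFS Used on one datanode:    {reported / 1024**3:.1f} GiB")
    print(f"ratio: {reported / expected:.1f}x")
```

Note that the du total is cluster-wide while the DFS Used figure is for a 
single datanode, so the real gap is larger than the ratio printed here.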

Namenode logs show many block reports with invalidatedBlocks:
2017-03-24 12:49:37,625 INFO  BlockStateChange 
(BlockManager.java:processReport(2354)) - BLOCK* processReport 
0x19c92e070e3c2301: from storage DS-41ba227f-2a3e-45ac-b28c-1504e51d7cc2 node 
DatanodeRegistration(172.30.253.5:50010, 
datanodeUuid=5be84f90-ba9c-4c85-94fd-e4d20369c4e4, infoPort=50075, 
infoSecurePort=0, ipcPort=8010, 
storageInfo=lv=-57;cid=CID-ca8849f2-d722-45de-9848-ad50eeeabcf7;nsid=1923307298;c=1487788944154),
 blocks: 498, hasStaleStorage: false, processing time: 0 msecs, 
invalidatedBlocks: 65



Have a nice day,
Dani
