Hi! Is there a way to delete "orphaned" blocks? I see this happen quite often if I change the HDFS storage policy and recreate data, or if a datanode fails and the data on it is "old", but not old enough. After a few days the space is reclaimed by itself, but I need a way to trigger this manually or make it faster. Right now I have to write scripts to detect the orphaned blocks and delete them by hand outside Hadoop, or reformat my HDFS.
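For reference, the script I use today looks roughly like this. It is only a sketch: /hadoop/hdfs/data stands in for whatever dfs.datanode.data.dir is on your datanodes, and the /tmp file names are arbitrary.

# 1) Block IDs the NameNode still references (run from a client node; heavy on a big namespace)
sudo -u hdfs hdfs fsck / -files -blocks 2>/dev/null | grep -o 'blk_[0-9]*' | sort -u > /tmp/nn_blocks.txt

# 2) Block files actually sitting on this datanode's disks (skipping the .meta files)
find /hadoop/hdfs/data -name 'blk_[0-9]*' ! -name '*.meta' -printf '%f\n' | sort -u > /tmp/dn_blocks.txt

# 3) Blocks present on disk but no longer referenced by the NameNode -- the "orphans"
comm -13 /tmp/nn_blocks.txt /tmp/dn_blocks.txt > /tmp/orphan_blocks.txt
wc -l < /tmp/orphan_blocks.txt

Deleting those block and .meta files by hand on every datanode works, but it is exactly the kind of manual step I would like to avoid.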
I get into this situation where hdfs dfs -du shows very little space in use:

sudo -u hdfs bin/hdfs dfs -du -h /
8.1 G    24.2 G   /app-logs
867      2.5 K    /benchmarks
2.0 G    6.0 G    /mr-history
762      2.2 K    /system
100.4 M  251.2 M  /user

I have nothing in the Trash and no snapshots, yet my dfsadmin report shows TBs of data in DFS Used:

Name: 172.30.253.6:50010 (m07dn06)
Hostname: m07dn06
Decommission Status : Normal
Configured Capacity: 108579574620160 (98.75 TB)
DFS Used: 1756550197248 (1.60 TB)
Non DFS Used: 0 (0 B)
DFS Remaining: 106822554660864 (97.15 TB)
DFS Used%: 1.62%
DFS Remaining%: 98.38%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 2
Last contact: Fri Mar 24 12:57:07 CDT 2017

The NameNode logs show many block reports with invalidatedBlocks:

2017-03-24 12:49:37,625 INFO BlockStateChange (BlockManager.java:processReport(2354)) - BLOCK* processReport 0x19c92e070e3c2301: from storage DS-41ba227f-2a3e-45ac-b28c-1504e51d7cc2 node DatanodeRegistration(172.30.253.5:50010, datanodeUuid=5be84f90-ba9c-4c85-94fd-e4d20369c4e4, infoPort=50075, infoSecurePort=0, ipcPort=8010, storageInfo=lv=-57;cid=CID-ca8849f2-d722-45de-9848-ad50eeeabcf7;nsid=1923307298;c=1487788944154), blocks: 498, hasStaleStorage: false, processing time: 0 msecs, invalidatedBlocks: 65

Have a nice day,
Dani
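P.S. To watch how many blocks are still queued for invalidation I have been checking the NameNode like this (the hostname is just my NameNode, and I am assuming the FSNamesystem JMX bean exposes PendingDeletionBlocks on this version):

curl -s 'http://m07nn01:50070/jmx?qry=Hadoop:service=NameNode,name=FSNamesystem' | grep PendingDeletionBlocks
sudo -u hdfs hdfs dfsadmin -metasave pending.meta   # file ends up under the NameNode log directory and lists blocks waiting for deletion

So the NameNode does know about the blocks and has them scheduled; they just take days to drain.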
