[ https://issues.apache.org/jira/browse/HDFS-12618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wellington Chevreuil updated HDFS-12618:
----------------------------------------
    Attachment: HDFS-12618.004.patch

Here goes another patch attempt. I believe I have found a solution for all the cases. Some explanations below:

1) For each file in the *.snapshot* folder, it first checks if the path resolves to an instance of *INodeFile*. This is the case for non-renamed files.

1.1) In this case, we need to check if the given file only exists on snapshots, which is possible by calling *inodeFile.isWithSnapshot()*.

1.2) If the file only exists on snapshots, we should then check whether it has been deleted from the original folder, appended, or truncated.

1.3) Files that were appended or truncated will still have a valid inode outside the snapshot folder, as long as the original file has not been deleted yet. To check this condition, we can call *dir.getINodesInPath(inodeFile.getName(), FSDirectory.DirOp.READ).validate();*. For the appended/truncated cases we then need to compare the blocks of the file in the snapshot folder with those of the original file, counting only blocks from the snapshot file that are not in the original file (outside the snapshot).

1.4) If the file has been deleted from the original folder, it exists only within snapshots. The call to *dir.getINodesInPath(inodeFile.getName(), FSDirectory.DirOp.READ).validate();* will throw an AssertionError in such cases, so in the catch block we can then verify two additional conditions:

1.4.1) If we are checking the last snapshot, we can simply count all the blocks of the file.

1.4.2) If this is not the last snapshot, we need to compare the blocks of this file with the ones on the last snapshot, and count only those blocks that are not on the last snapshot.

2) Renamed files will resolve to either *INodeReference.DstReference* or *INodeReference.WithName*.

2.1) *INodeReference.DstReference* will be the case where the file has been renamed in the original folder, then got renamed and snapshotted again. In this case, we only have to count the blocks if the original file gets deleted.
In such a scenario, *referenceIip.getLastINode()* returns null, so we can count the blocks.

2.2) Files in a snapshot that then got renamed in the original folder will be *INodeReference.WithName*. If the original file gets deleted outside the snapshot, it then needs to be counted. This can be identified by the following condition: *referenceIip.getLastINode() == null && inode.asFile().getParent() == null*.

The current patch implements the conditions described above, along with 12 additional unit tests for different variations of the possible scenarios.

> fsck -includeSnapshots reports wrong amount of total blocks
> -----------------------------------------------------------
>
>                 Key: HDFS-12618
>                 URL: https://issues.apache.org/jira/browse/HDFS-12618
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-alpha3
>            Reporter: Wellington Chevreuil
>            Assignee: Wellington Chevreuil
>            Priority: Minor
>         Attachments: HDFS-121618.initial, HDFS-12618.001.patch, HDFS-12618.002.patch, HDFS-12618.003.patch, HDFS-12618.004.patch
>
>
> When snapshot is enabled, if a file is deleted but is contained by a snapshot, *fsck* will not report blocks for such a file, showing a different number of *total blocks* than what is exposed in the Web UI.
> This should be fine, as *fsck* provides the *-includeSnapshots* option. The problem is that the *-includeSnapshots* option causes *fsck* to count blocks for every occurrence of a file on snapshots, which is wrong because these blocks should be counted only once (for instance, if a 100MB file is present on 3 snapshots, it would still map to only one block in HDFS). This causes fsck to report many more blocks than actually exist in HDFS and are reported in the Web UI.
> Here's an example:
> 1) HDFS has two files of 2 blocks each:
> {noformat}
> $ hdfs dfs -ls -R /
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/file2
> drwxr-xr-x   - root supergroup          0 2017-05-13 13:03 /test
> {noformat}
> 2) There are two snapshots, with the two files present on each of the snapshots:
> {noformat}
> $ hdfs dfs -ls -R /snap-test/.snapshot
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap1/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap1/file2
> drwxr-xr-x   - root supergroup          0 2017-10-07 21:21 /snap-test/.snapshot/snap2
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:16 /snap-test/.snapshot/snap2/file1
> -rw-r--r--   1 root supergroup  209715200 2017-10-07 20:17 /snap-test/.snapshot/snap2/file2
> {noformat}
> 3) *fsck -includeSnapshots* reports 12 blocks in total (4 blocks for the normal file path, plus 4 blocks for each snapshot path):
> {noformat}
> $ hdfs fsck / -includeSnapshots
> FSCK started by root (auth:SIMPLE) from /127.0.0.1 for path / at Mon Oct 09 15:15:36 BST 2017
> Status: HEALTHY
>  Number of data-nodes:         1
>  Number of racks:              1
>  Total dirs:                   6
>  Total symlinks:               0
> Replicated Blocks:
>  Total size:                   1258291200 B
>  Total files:                  6
>  Total blocks (validated):     12 (avg. block size 104857600 B)
>  Minimally replicated blocks:  12 (100.0 %)
>  Over-replicated blocks:       0 (0.0 %)
>  Under-replicated blocks:      0 (0.0 %)
>  Mis-replicated blocks:        0 (0.0 %)
>  Default replication factor:   1
>  Average block replication:    1.0
>  Missing blocks:               0
>  Corrupt blocks:               0
>  Missing replicas:             0 (0.0 %)
> {noformat}
> 4) The Web UI shows the correct number (4 blocks only):
> {noformat}
> Security is off.
> Safemode is off.
> 5 files and directories, 4 blocks = 9 total filesystem object(s).
> {noformat}
> I would like to work on this; I will propose an initial solution shortly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
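[Editor's illustration] The accounting rule at the heart of this issue (a block must be counted once, no matter how many live or snapshot paths reference it) can be shown with a small, self-contained toy model. This is not the actual patch and does not use the real INode classes; the class and method names below are invented, and files are modeled simply as lists of block IDs. It reproduces the numbers from the example above: the naive per-path count gives 12, while the distinct-block count gives the Web UI's 4.

```java
import java.util.*;

// Toy model of the fsck -includeSnapshots accounting bug in HDFS-12618.
// Block IDs stand in for HDFS blocks; all names here are hypothetical.
public class SnapshotBlockCount {

    // Naive accounting: every path occurrence of a block is counted,
    // mirroring the buggy -includeSnapshots behaviour.
    static int naiveCount(List<Map<String, List<Long>>> namespaces) {
        int total = 0;
        for (Map<String, List<Long>> ns : namespaces)
            for (List<Long> blocks : ns.values())
                total += blocks.size();
        return total;
    }

    // Correct accounting: each block is counted once regardless of how
    // many live or snapshot paths reference it.
    static int distinctCount(List<Map<String, List<Long>>> namespaces) {
        Set<Long> seen = new HashSet<>();
        for (Map<String, List<Long>> ns : namespaces)
            for (List<Long> blocks : ns.values())
                seen.addAll(blocks);
        return seen.size();
    }

    // Comparison in the spirit of steps 1.3 / 1.4.2 of the comment:
    // blocks of a snapshot copy that are absent from a reference copy
    // (the live file, or the last snapshot).
    static Set<Long> blocksNotIn(List<Long> snapshotBlocks, List<Long> referenceBlocks) {
        Set<Long> extra = new LinkedHashSet<>(snapshotBlocks);
        extra.removeAll(referenceBlocks);
        return extra;
    }

    public static void main(String[] args) {
        // Live namespace: two files of two blocks each (blocks 1-4).
        Map<String, List<Long>> live = new HashMap<>();
        live.put("/snap-test/file1", List.of(1L, 2L));
        live.put("/snap-test/file2", List.of(3L, 4L));

        // Two snapshots referencing the same four blocks.
        List<Map<String, List<Long>>> all =
                List.of(live, new HashMap<>(live), new HashMap<>(live));

        System.out.println("naive total: " + naiveCount(all));       // 12, the bug
        System.out.println("distinct total: " + distinctCount(all)); // 4, the Web UI number
    }
}
```

The same distinct-vs-per-path distinction is what the patch enforces through the INodeFile / INodeReference checks described in the comment; the toy `blocksNotIn` helper shows only the set-difference idea behind the append/truncate comparison.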