Alex Ivanov created HDFS-9841:
---------------------------------

             Summary: FileDiff's skipped by hdfs snapshotDiff
                 Key: HDFS-9841
                 URL: https://issues.apache.org/jira/browse/HDFS-9841
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: snapshots
    Affects Versions: 2.3.0, 2.4.0
            Reporter: Alex Ivanov

Summary
When a file in HDFS is read, its corresponding inode's accessTime field is updated. If the file is present in the last snapshot, the accessTime change causes a FileDiff to be added to the SnapshotDiff of the last snapshot.

This behavior has the following problems:
- Since FileDiffs reside in memory on the namenodes, snapshots become progressively more memory-heavy as the volume of data in HDFS grows. On a system with frequent updates, e.g. hourly, this becomes a big problem: with, say, 2000 snapshots, one can have 2000 FileDiffs per file pointing to the same inode.
- The FSImage grows tremendously in size, and the upload operation from standby to active namenode takes much longer.
- -The generated FileDiff does not contain any useful information that I can see. Since all FileDiffs for that file are pointing to the same inode, the accessTime they see is the same.-
- I was wrong about the last point. Each FileDiff includes a SnapshotCopy attribute, which contains the updated accessTime. This may be a feature, but I'd question the value of having it enabled by default.

Configuration:
CDH 5.0.5 (Hadoop 2.3 / 2.4)
We are NOT overriding the default parameter:
DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
Note that it determines the allowed frequency of accessTime field updates: at most once every hour by default.

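To make the growth concrete, here is a toy model of the behavior described above. It is only a sketch, not NameNode code: the `ToyFile` class, the `simulate` helper, and the exact comparison used for the precision check are my own assumptions for illustration. It also assumes that a precision of 0 turns accessTime updates off entirely.

```python
# Toy model of accessTime-triggered FileDiffs (illustrative only, not HDFS code).

# Default from the report: accessTime is updated at most once per hour.
ACCESS_TIME_PRECISION_MS = 3_600_000


class ToyFile:
    """Hypothetical stand-in for an inode that is tracked by snapshots."""

    def __init__(self, created_ms):
        self.access_time = created_ms
        # snapshot id -> accessTime preserved for that snapshot (one diff each)
        self.diffs = {}

    def read(self, now_ms, latest_snapshot, precision_ms=ACCESS_TIME_PRECISION_MS):
        """Simulate a read; return True if accessTime was updated."""
        # Assumption: precision <= 0 means accessTime tracking is disabled.
        if precision_ms <= 0:
            return False
        # Assumption: update only when more than `precision_ms` has elapsed.
        if now_ms <= self.access_time + precision_ms:
            return False
        # The file is in the latest snapshot, so the old accessTime is
        # preserved in a diff before the inode is modified.
        if latest_snapshot is not None and latest_snapshot not in self.diffs:
            self.diffs[latest_snapshot] = self.access_time
        self.access_time = now_ms
        return True


def simulate(num_snapshots, interval_ms=ACCESS_TIME_PRECISION_MS + 1,
             precision_ms=ACCESS_TIME_PRECISION_MS):
    """Take a snapshot, then read the file once, `num_snapshots` times.

    Returns the number of diffs accumulated for the single file.
    """
    f = ToyFile(created_ms=0)
    now = 0
    for snapshot_id in range(1, num_snapshots + 1):
        now += interval_ms
        f.read(now, latest_snapshot=snapshot_id, precision_ms=precision_ms)
    return len(f.diffs)


if __name__ == "__main__":
    print("diffs after 2000 snapshots:", simulate(2000))
    print("diffs with precision 0:", simulate(2000, precision_ms=0))
```

Under these assumptions, a file that is merely read once between consecutive snapshots (with reads slightly more than an hour apart) accumulates one diff per snapshot, while reads closer together than the precision add none.
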
How to reproduce:
{code}
[root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata
Found 3 items
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/folder1
-rw-r--r--   3 hdfs hadoop         38 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/testfile1
-rw-r--r--   3 hdfs hadoop         21 2015-10-04 10:45 /data/tenants/testenv.testtenant/wddata/testfile2

[root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata/.snapshot
Found 8 items
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn1
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn2
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/.snapshot/sn3
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:53 /data/tenants/testenv.testtenant/wddata/.snapshot/sn4
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:57 /data/tenants/testenv.testtenant/wddata/.snapshot/sn5
drwxr-xr-x   - hdfs hadoop          0 2015-10-04 10:58 /data/tenants/testenv.testtenant/wddata/.snapshot/sn6
drwxr-xr-x   - hdfs hadoop          0 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/.snapshot/sn7
drwxr-xr-x   - hdfs hadoop          0 2015-10-05 04:20 /data/tenants/testenv.testtenant/wddata/.snapshot/sn8

[root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn9
Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn9

[root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn8 sn9
Difference between snapshot sn8 and snapshot sn9 under directory /data/tenants/testenv.testtenant/wddata:

################
## IMPORTANT: testfile1 was put into HDFS more than 1 hour ago, which triggers the accessTime update.
################
[root@node1076]# hdfs dfs -cat /data/tenants/testenv.testtenant/wddata/testfile1
This is test file 1, but now it's 11.

[root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn10
Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn10

[root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn9 sn10
Difference between snapshot sn9 and snapshot sn10 under directory /data/tenants/testenv.testtenant/wddata:
M       ./testfile1
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)