[
https://issues.apache.org/jira/browse/HDFS-9841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Ivanov updated HDFS-9841:
------------------------------
Description:
Summary
As described in [HDFS-9197|https://issues.apache.org/jira/browse/HDFS-9197],
FileDiffs are created when a file's _modification_ or _access_ time changes.
In Hadoop 2.5, the method
[metadataEquals|http://grepcode.com/file/repo1.maven.org/maven2/org.apache.hadoop/hadoop-hdfs/2.5.0/org/apache/hadoop/hdfs/server/namenode/INodeFileAttributes.java#INodeFileAttributes.SnapshotCopy.metadataEquals%28org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes%29]
was added to _org.apache.hadoop.hdfs.server.namenode.INodeFileAttributes_. The
*hdfs snapshotDiff* command seems to use this method when listing snapshot
metadata diffs, but because the method does not compare the times, you CANNOT
see the time-based FileDiffs in the comparison. This used to work before
Hadoop 2.5. I see the new behavior as problematic because users no longer
have an accurate picture of all the Namenode metadata created for a snapshot
unless they inspect the fsimage.
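To illustrate the effect described in the summary, here is a minimal sketch. It is not the Hadoop source: the class {{FileAttrs}}, both comparison methods, and the chosen set of fields are hypothetical and assumed only for illustration. It shows how a comparison that skips the two time fields treats two attribute copies as equal even though their access times differ, so a time-only change would not show up in a report built on that comparison.

```java
import java.util.Objects;

public class MetadataEqualsSketch {

    /** Hypothetical stand-in for the attributes stored in a snapshot copy. */
    public static final class FileAttrs {
        final String owner;
        final short permission;
        final short replication;
        final long modificationTime;
        final long accessTime;

        public FileAttrs(String owner, short permission, short replication,
                         long modificationTime, long accessTime) {
            this.owner = owner;
            this.permission = permission;
            this.replication = replication;
            this.modificationTime = modificationTime;
            this.accessTime = accessTime;
        }
    }

    /** Compares owner, permission and replication only; both times are skipped. */
    public static boolean equalsIgnoringTimes(FileAttrs a, FileAttrs b) {
        return Objects.equals(a.owner, b.owner)
            && a.permission == b.permission
            && a.replication == b.replication;
    }

    /** Same comparison, extended with the modification and access times. */
    public static boolean equalsWithTimes(FileAttrs a, FileAttrs b) {
        return equalsIgnoringTimes(a, b)
            && a.modificationTime == b.modificationTime
            && a.accessTime == b.accessTime;
    }

    public static void main(String[] args) {
        // Two copies of the same file; only the access time differs (file was read).
        FileAttrs before = new FileAttrs("hdfs", (short) 0644, (short) 3, 1_000L, 1_000L);
        FileAttrs afterRead = new FileAttrs("hdfs", (short) 0644, (short) 3, 1_000L, 5_000L);

        System.out.println("ignoring times: " + equalsIgnoringTimes(before, afterRead));
        System.out.println("with times:     " + equalsWithTimes(before, afterRead));
    }
}
```

With these inputs the time-agnostic comparison reports the copies as equal, while the time-aware one reports a difference.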
Configuration:
CDH 5.5.1 (Hadoop 2.6)
was:
Summary
When a file in HDFS is read, its corresponding inode's accessTime field is
updated. If the file is present in the last snapshot, the accessTime change
causes a FileDiff to be added to the SnapshotDiff of the last snapshot.
This behavior has the following problems:
- Since FileDiffs reside in memory on the namenodes, snapshots become
progressively more memory-heavy as the volume of data in HDFS grows. On a
system with frequent snapshots, e.g. hourly, this becomes a big problem: with,
say, 2000 snapshots, one can have 2000 FileDiffs per file pointing to the
same inode.
- The FSImage grows tremendously in size, and the upload operation from the
standby to the active namenode takes much longer.
-The generated FileDiff does not contain any useful information that I can see.
Since all FileDiff's for that file are pointing to the same inode, the
accessTime they see is the same.-
- I was wrong about the last point. Each FileDiff includes a SnapshotCopy
attribute, which contains the updated accessTime. This may be a feature, but
I'd question the value of having it enabled by default.
Configuration:
CDH 5.0.5 (Hadoop 2.3 / 2.4)
We are NOT overriding the default parameter:
DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
Note that it determines the allowed frequency of accessTime field updates -
every hour by default.
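As a rough sketch of how such a precision setting can gate updates: the class and method names below are hypothetical, and the rule itself (update only when more than one precision interval has passed since the stored access time, with 0 treated as "disabled") is a simplified assumption rather than the namenode's actual code.

```java
public class AccessTimePrecisionSketch {

    // Default value quoted above: 3600000 ms, i.e. one hour.
    public static final long ACCESS_TIME_PRECISION_MS = 3_600_000L;

    /**
     * Returns true when a read at nowMs should refresh the stored access time.
     * A precision of 0 is treated here as "access times disabled" (assumption).
     */
    public static boolean shouldUpdateAccessTime(long nowMs, long lastAccessMs,
                                                 long precisionMs) {
        return precisionMs > 0 && nowMs > lastAccessMs + precisionMs;
    }

    public static void main(String[] args) {
        long lastAccess = 0L;

        // Read 30 minutes after the stored access time: inside the window.
        System.out.println(shouldUpdateAccessTime(
            1_800_000L, lastAccess, ACCESS_TIME_PRECISION_MS));

        // Read just over one hour later: outside the window.
        System.out.println(shouldUpdateAccessTime(
            3_600_001L, lastAccess, ACCESS_TIME_PRECISION_MS));
    }
}
```

Under this rule, a file read more than an hour after its last recorded access gets a new accessTime, which is the situation set up in the reproduction steps.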
How to reproduce:
{code}
[root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata
Found 3 items
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/folder1
-rw-r--r-- 3 hdfs hadoop 38 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/testfile1
-rw-r--r-- 3 hdfs hadoop 21 2015-10-04 10:45 /data/tenants/testenv.testtenant/wddata/testfile2
[root@node1076]# hdfs dfs -ls /data/tenants/testenv.testtenant/wddata/.snapshot
Found 8 items
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn1
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:47 /data/tenants/testenv.testtenant/wddata/.snapshot/sn2
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:52 /data/tenants/testenv.testtenant/wddata/.snapshot/sn3
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:53 /data/tenants/testenv.testtenant/wddata/.snapshot/sn4
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:57 /data/tenants/testenv.testtenant/wddata/.snapshot/sn5
drwxr-xr-x - hdfs hadoop 0 2015-10-04 10:58 /data/tenants/testenv.testtenant/wddata/.snapshot/sn6
drwxr-xr-x - hdfs hadoop 0 2015-10-05 03:13 /data/tenants/testenv.testtenant/wddata/.snapshot/sn7
drwxr-xr-x - hdfs hadoop 0 2015-10-05 04:20 /data/tenants/testenv.testtenant/wddata/.snapshot/sn8
[root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn9
Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn9
[root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn8 sn9
Difference between snapshot sn8 and snapshot sn9 under directory /data/tenants/testenv.testtenant/wddata:
################
## IMPORTANT: testfile1 was put into HDFS more than 1 hour ago, which triggers the accessTime update.
################
[root@node1076]# hdfs dfs -cat /data/tenants/testenv.testtenant/wddata/testfile1
This is test file 1, but now it's 11.
[root@node1076]# hdfs dfs -createSnapshot /data/tenants/testenv.testtenant/wddata sn10
Created snapshot /data/tenants/testenv.testtenant/wddata/.snapshot/sn10
[root@node1076]# hdfs snapshotDiff /data/tenants/testenv.testtenant/wddata sn9 sn10
Difference between snapshot sn9 and snapshot sn10 under directory /data/tenants/testenv.testtenant/wddata:
M ./testfile1
{code}
> FileDiff's skipped by hdfs snapshotDiff
> ---------------------------------------
>
> Key: HDFS-9841
> URL: https://issues.apache.org/jira/browse/HDFS-9841
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.5.0
> Reporter: Alex Ivanov