Yongjun Zhang created HDFS-12191:
------------------------------------
Summary: Provide option to not capture the accessTime change of a
file to snapshot if no other modification has been done
Key: HDFS-12191
URL: https://issues.apache.org/jira/browse/HDFS-12191
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs, namenode
Affects Versions: 3.0.0-beta1
Reporter: Yongjun Zhang
Assignee: Yongjun Zhang
Currently, if the accessTime of a file changes before a snapshot is taken, the new
accessTime is captured in the snapshot, even if no other modification has been
made to the file.
Because of this, calculating snapshotDiff requires more work for this file,
e.g., the metadataEquals method is called even though no modification was made
(and thus nothing is recorded in the snapshotDiff). This can slow snapshotDiff
down considerably when there are many files to examine.
This jira is to provide an option to skip capturing accessTime-only changes in
the snapshot, so that snapshotDiff can run faster.
If the accessTime of a file changes and there are other modifications to the
file, the access time will still be captured in the snapshot.
Sometimes we want the accessTime to be captured in the snapshot, so that when
restoring from the snapshot we know the accessTime as of that snapshot. So this
new feature is optional and is controlled by a config property.
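The intended behavior can be summarized as a small decision function. This is only a sketch of the rule described above, not the NameNode implementation; the class name, method name, and the skipAccessTimeOnlyChange flag (standing in for the proposed config property) are hypothetical.

```java
// Sketch only: models the rule described in this jira, not actual HDFS code.
// All names here are hypothetical.
final class SnapshotAccessTimeSketch {

    /**
     * Decides whether a change to a file should be recorded in the snapshot.
     *
     * @param skipAccessTimeOnlyChange hypothetical flag for the proposed option
     * @param accessTimeChanged        the accessTime changed since the last snapshot
     * @param otherModification        something other than accessTime changed
     */
    static boolean shouldCapture(boolean skipAccessTimeOnlyChange,
                                 boolean accessTimeChanged,
                                 boolean otherModification) {
        if (otherModification) {
            // Any real modification is captured, and the accessTime with it.
            return true;
        }
        if (accessTimeChanged) {
            // accessTime-only change: captured unless the option says to skip it.
            return !skipAccessTimeOnlyChange;
        }
        // Nothing changed, so there is nothing to capture.
        return false;
    }

    public static void main(String[] args) {
        System.out.println(shouldCapture(false, true, false)); // current behavior
        System.out.println(shouldCapture(true, true, false));  // with the option on
        System.out.println(shouldCapture(true, true, true));   // other change present
    }
}
```

With the option off, the function reproduces the current behavior; with it on, only the accessTime-only case changes.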
It is worth mentioning that how accurately the accessTime is captured depends on
the following config, which has a default value of 1 hour. This means a new access
within an hour of the previous access will not be captured.
{code}
public static final String DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
    HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
public static final long DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
{code}
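To illustrate the effect of the precision setting, here is a minimal sketch of the check as described above. It assumes that an access is recorded only when it falls more than the precision interval after the last recorded accessTime, and that a precision of 0 disables accessTime updates. The class and method names are hypothetical; this is not the NameNode code itself.

```java
// Sketch only: models the precision rule described above, not actual HDFS code.
final class AccessTimePrecisionSketch {

    // Mirrors DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT: 1 hour in milliseconds.
    static final long PRECISION_DEFAULT_MS = 3600000L;

    /**
     * Returns true if an access at nowMs would update the stored accessTime.
     * Assumption: a precision of 0 (or less) disables accessTime updates.
     */
    static boolean wouldUpdateAccessTime(long lastAccessMs, long nowMs,
                                         long precisionMs) {
        if (precisionMs <= 0) {
            return false;
        }
        // Only an access strictly later than the precision window is recorded.
        return nowMs > lastAccessMs + precisionMs;
    }

    public static void main(String[] args) {
        long last = 0L;
        // 30 minutes after the last recorded access: inside the window.
        System.out.println(wouldUpdateAccessTime(last, 1800000L, PRECISION_DEFAULT_MS));
        // Just over 1 hour after the last recorded access: outside the window.
        System.out.println(wouldUpdateAccessTime(last, 3600001L, PRECISION_DEFAULT_MS));
    }
}
```

Under the default, repeated reads of the same file within an hour leave the stored accessTime unchanged, so they produce no accessTime-only change for a snapshot to capture.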