[
https://issues.apache.org/jira/browse/HDFS-12191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140691#comment-16140691
]
Manoj Govindassamy edited comment on HDFS-12191 at 8/24/17 10:06 PM:
---------------------------------------------------------------------
Thanks for working on the patch revision [~yzhangal]. A few comments below:
1. {{INode}}
{noformat}
public abstract class INode {
..
private static boolean dontCaptureAccessTimeOnlyChangeInSnapshot = false;
public static void setDontCaptureAccessTimeOnlyChangeInSnapshot(boolean s) {
LOG.info("Setting dontCaptureAccessTimeOnlyChangeInSnapshot to " + s);
dontCaptureAccessTimeOnlyChangeInSnapshot = s;
}
public static boolean getDontCaptureAccessTimeOnlyChangeInSnapshot() {
return dontCaptureAccessTimeOnlyChangeInSnapshot;
}
{noformat}
* The abstract class INode doesn't look like the right place for the snapshot
logic. The callers of {{INode#setAccessTime()}} can pass in the details needed
to skip recording the modification.
* FSNamesystem#setAccessTimes() has all the details needed to decide
whether to record access time changes in the snapshots. So, shall we
pass in the details from here?
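To illustrate the suggestion, here is a minimal sketch in which the caller decides whether an access time change is recorded for snapshots, so the inode class carries no static flag. {{SketchINode}}, {{SketchNamesystem}}, the two-argument {{setAccessTime}} and the record counter are hypothetical stand-ins, not the real HDFS classes or signatures.

```java
// Hypothetical, simplified stand-ins for illustration; not the real HDFS classes.
final class SketchINode {
    private long accessTime;
    private int snapshotRecords; // counts pre-modification copies saved for snapshots

    // The caller decides whether this change is recorded; no global state here.
    void setAccessTime(long newAccessTime, boolean recordInSnapshot) {
        if (recordInSnapshot) {
            snapshotRecords++; // stands in for saving a pre-modification copy
        }
        accessTime = newAccessTime;
    }

    long getAccessTime() { return accessTime; }
    int getSnapshotRecords() { return snapshotRecords; }
}

final class SketchNamesystem {
    // Assumed to be read once from configuration at startup.
    private final boolean skipAccessTimeOnlyChange;

    SketchNamesystem(boolean skipAccessTimeOnlyChange) {
        this.skipAccessTimeOnlyChange = skipAccessTimeOnlyChange;
    }

    // The namesystem has the configuration, so it passes the decision down.
    void setAccessTime(SketchINode inode, long newAccessTime) {
        inode.setAccessTime(newAccessTime, !skipAccessTimeOnlyChange);
    }
}
```

With this shape, the config value lives with the component that reads it, and tests can exercise both settings without mutating static state.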
2. Config
* typos in {{DFS_NAMENODE_SNAPTHOT_SKIP_ACCESSTIME_ONLY_CHANG_DEFAULT}}
({{SNAPTHOT}} should be {{SNAPSHOT}}, {{CHANG}} should be {{CHANGE}})
3.
{noformat}
<description>
<name>dfs.namenode.snapshot.skip.accesstime-only-change</name>
<value>false</value>
If accessTime of a file changed but there is no other modification
made to the file, the changed accesstime will not be captured in next
snapshot. However, if there is other modification made to the file,
the latest access time will be captured together with the modification
in next snapshot.
</description>
{noformat}
* How about something along these lines?
"When enabled, HDFS snapshots skip capturing the pre-modification copy of
file/directory metadata for operations that change only the access time.
For all other metadata modification operations, the latest access time of
the file/directory is captured along with the other metadata."
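The behavior described by that wording can be sketched with a toy model. This is only an illustration under assumptions: {{ToyFile}} and all of its members are hypothetical, and the real snapshot bookkeeping in HDFS is more involved.

```java
// Toy model only; names are hypothetical and this is not HDFS code.
final class ToyFile {
    private final boolean skipAccessTimeOnlyChange;
    private long accessTime;
    private short permission;

    // Pre-modification copy kept for the latest snapshot, saved at most once.
    private boolean hasSnapshotCopy;
    private long snapshotAccessTime;
    private short snapshotPermission;

    ToyFile(boolean skipAccessTimeOnlyChange, long accessTime, short permission) {
        this.skipAccessTimeOnlyChange = skipAccessTimeOnlyChange;
        this.accessTime = accessTime;
        this.permission = permission;
    }

    private void saveCopyIfAbsent() {
        if (!hasSnapshotCopy) {
            hasSnapshotCopy = true;
            snapshotAccessTime = accessTime;
            snapshotPermission = permission;
        }
    }

    // Access-time-only change: the copy is skipped when the option is enabled.
    void setAccessTime(long newAccessTime) {
        if (!skipAccessTimeOnlyChange) {
            saveCopyIfAbsent();
        }
        accessTime = newAccessTime;
    }

    // Any other metadata change always saves a copy, which carries
    // whatever access time is current at that moment.
    void setPermission(short newPermission) {
        saveCopyIfAbsent();
        permission = newPermission;
    }

    boolean hasSnapshotCopy() { return hasSnapshotCopy; }
    long getSnapshotAccessTime() { return snapshotAccessTime; }
}
```

In this model, with the option enabled, an access time update alone leaves nothing for a snapshot diff to examine, while a later permission change records a copy that includes the most recent access time.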
> Provide option to not capture the accessTime change of a file to snapshot if
> no other modification has been done
> ----------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-12191
> URL: https://issues.apache.org/jira/browse/HDFS-12191
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs, namenode
> Affects Versions: 3.0.0-beta1
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Attachments: HDFS-12191.001.patch, HDFS-12191.002.patch
>
>
> Currently, if the accessTime of a file changes before a snapshot is taken,
> this accessTime will be captured in the snapshot, even if no other
> modification has been made to this file.
> Because of this, when we calculate snapshotDiff, more work needs to be done
> for this file, e.g., the metadataEquals method will be called, even if
> no modification has been made (and thus nothing is recorded in snapshotDiff).
> This can slow snapshotDiff down considerably when there are a lot of files
> to be examined.
> This jira is to provide an option to skip capturing accessTime-only changes
> in snapshots, so that snapshotDiff can be done faster.
> When the accessTime of a file changes and there is another modification to
> the file, the access time will still be captured in the snapshot.
> Sometimes we want the accessTime to be captured in the snapshot, so that when
> restoring from the snapshot, we know the accessTime of this snapshot. So this
> new feature is optional, and is controlled by a config property.
> It is worth mentioning that how accurately the accessTime is captured depends
> on the following config, which has a default value of 1 hour, meaning that a
> new access within an hour of the previous access will not be captured.
> {code}
> public static final String DFS_NAMENODE_ACCESSTIME_PRECISION_KEY =
>     HdfsClientConfigKeys.DeprecatedKeys.DFS_NAMENODE_ACCESSTIME_PRECISION_KEY;
> public static final long DFS_NAMENODE_ACCESSTIME_PRECISION_DEFAULT = 3600000;
> {code}
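As a rough illustration of the precision rule quoted above, the check can be sketched as a single comparison. {{AccessTimePrecisionSketch}} and {{shouldUpdate}} are hypothetical names, and the exact comparison used by the NameNode is an assumption here; only the 3600000 ms default comes from the quoted config.

```java
// Hypothetical helper for illustration; not taken from the HDFS source.
final class AccessTimePrecisionSketch {
    // Default from the quoted config: 3600000 ms, i.e. 1 hour.
    static final long PRECISION_DEFAULT_MS = 3_600_000L;

    // Assumed rule: a new access is recorded only if it falls more than
    // precisionMs after the last recorded access time.
    static boolean shouldUpdate(long lastRecordedMs, long nowMs, long precisionMs) {
        return nowMs > lastRecordedMs + precisionMs;
    }
}
```

Under this assumption, an access 30 minutes after the last recorded one leaves the stored access time unchanged, so it cannot produce an access-time-only change in a snapshot either.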