[
https://issues.apache.org/jira/browse/HBASE-17330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15771727#comment-15771727
]
Jianwei Cui commented on HBASE-17330:
-------------------------------------
Thanks for pointing out the mod time problem, [~stack]. I tried the patch
locally as follows:
1. start a client that takes snapshots periodically;
2. make {{SnapshotFileCache#refreshCache}} log the names of the hfiles it
loads each time it is scheduled.
The log shows that {{SnapshotFileCache}} loads the hfiles referenced by
snapshots taken before {{refreshCache}} starts. However, as you mentioned,
relying on the mod time is risky: its accuracy depends on the implementation
of the underlying file system, and it can also be updated explicitly (for
example by {{FSNamesystem#setTimes}}). To be safer, could we make
{{SnapshotFileCache#getUnreferencedFiles}} load hfile names from the on-disk
snapshots whenever the passed file is not in the in-memory cache? For example:
{code}
public synchronized Iterable<FileStatus> getUnreferencedFiles(
    Iterable<FileStatus> files, final SnapshotManager snapshotManager)
    throws IOException {
  ...
  for (FileStatus file : files) {
    String fileName = file.getPath().getName();
    if (!refreshed && !cache.contains(fileName)) {
      // ==> Always load hfile names through on-disk snapshots
      // (not considering the mod time).
      refreshCache();
      refreshed = true;
    }
    if (cache.contains(fileName)) {
      continue;
    }
{code}
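The refresh-on-miss idea above can be tried in isolation with a minimal sketch. Everything here is hypothetical and is not HBase code: {{LazyRefreshCache}} stands in for {{SnapshotFileCache}}, plain file names stand in for {{FileStatus}}, and a {{Supplier}} stands in for reading the on-disk snapshots. The part of the method elided by "..." and everything after the snippet are left out.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical stand-in for SnapshotFileCache; illustrates refresh-on-miss only. */
class LazyRefreshCache {
  // Assumption: stands in for loading hfile names from the on-disk snapshots.
  private final Supplier<Set<String>> onDiskLoader;
  private Set<String> cache = new HashSet<>();
  // Counts reloads so the behaviour can be observed.
  int refreshCount = 0;

  LazyRefreshCache(Supplier<Set<String>> onDiskLoader) {
    this.onDiskLoader = onDiskLoader;
  }

  // Unconditional reload: the mod time is not consulted at all.
  private void refreshCache() {
    cache = new HashSet<>(onDiskLoader.get());
    refreshCount++;
  }

  synchronized List<String> getUnreferencedFiles(Iterable<String> fileNames) {
    List<String> unreferenced = new ArrayList<>();
    boolean refreshed = false;
    for (String fileName : fileNames) {
      // Reload at most once per call, and only on the first cache miss.
      if (!refreshed && !cache.contains(fileName)) {
        refreshCache();
        refreshed = true;
      }
      if (cache.contains(fileName)) {
        continue; // still referenced by some snapshot
      }
      unreferenced.add(fileName);
    }
    return unreferenced;
  }
}
```

With this shape, a call that only asks about files already in the cache costs no reload, and a call with any number of unknown files costs exactly one.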
> SnapshotFileCache will always refresh the file cache
> ----------------------------------------------------
>
> Key: HBASE-17330
> URL: https://issues.apache.org/jira/browse/HBASE-17330
> Project: HBase
> Issue Type: Bug
> Components: snapshots
> Affects Versions: 2.0.0, 1.3.1, 0.98.23
> Reporter: Jianwei Cui
> Assignee: Jianwei Cui
> Priority: Minor
> Fix For: 2.0.0, 1.4.0
>
> Attachments: HBASE-17330-v1.patch, HBASE-17330-v2.patch
>
>
> In {{SnapshotFileCache#refreshCache}}, {{hasChanges}} is computed as:
> {code}
> try {
>   FileStatus dirStatus = fs.getFileStatus(snapshotDir);
>   lastTimestamp = dirStatus.getModificationTime();
>   // >= will make hasChanges always be true
>   hasChanges |= (lastTimestamp >= lastModifiedTime);
> {code}
> The {{(lastTimestamp >= lastModifiedTime)}} comparison makes {{hasChanges}}
> always true, because {{lastModifiedTime}} is then set to the value just read:
> {code}
> this.lastModifiedTime = lastTimestamp;
> {code}
> So, SnapshotFileCache will always refresh the file cache.
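The effect of the comparison operator in the description above can be reproduced with a small model. This is a hypothetical sketch, not HBase code: {{ModTimeCheck}} and its {{inclusive}} flag are invented for illustration, and the strict {{>}} variant is shown only as a contrast, not as what the attached patches necessarily do.

```java
/** Hypothetical model of the timestamp check in refreshCache; not HBase code. */
class ModTimeCheck {
  private long lastModifiedTime = 0L;
  // true models the reported ">=" comparison, false models a strict ">".
  private final boolean inclusive;

  ModTimeCheck(boolean inclusive) {
    this.inclusive = inclusive;
  }

  /** Returns whether the given directory mod time would trigger a refresh. */
  boolean hasChanges(long dirModTime) {
    boolean changed = inclusive
        ? dirModTime >= lastModifiedTime
        : dirModTime > lastModifiedTime;
    // Mirrors: this.lastModifiedTime = lastTimestamp;
    lastModifiedTime = dirModTime;
    return changed;
  }
}
```

With {{inclusive}} set, repeated calls with the same mod time keep reporting a change, which matches the reported behaviour. With the strict comparison, an unchanged mod time reports no change, but a modification that leaves the mod time unchanged would be missed, which is one way the mod time can mislead.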