[
https://issues.apache.org/jira/browse/HDFS-8118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505791#comment-14505791
]
Harsh J commented on HDFS-8118:
-------------------------------
Thanks for explaining that, Casey. It makes sense to constant-ise the checkpoint
date for uniformity, and the fix for this looks alright to me.
It may also make sense that people want to set the checkpoint interval equal to
the trash interval. I think we can remove the change in the patch that caps it
at 1/2 the value of the trash interval, and just add a small doc note in
hdfs-default.xml to the trash checkpoint period property on what the behaviour
could end up being if it's set equal to the trash clearing interval.
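To make concrete the behaviour such a doc note would describe, here is a rough
model of the timing when both intervals are equal. It is only a sketch, not the
actual TrashPolicyDefault code: the class and method names are hypothetical, and
the strict "older than the interval" expiry rule is my assumption about how the
deletion check behaves.

```java
import java.util.concurrent.TimeUnit;

// Simplified model of the timing discussed in this issue.
// NOT the real TrashPolicyDefault; all names here are hypothetical.
final class TrashTimingSketch {

  // Round a timestamp down to a multiple of the interval.
  static long floor(long timeMs, long intervalMs) {
    return (timeMs / intervalMs) * intervalMs;
  }

  // Round a timestamp up to the next multiple of the interval.
  static long ceiling(long timeMs, long intervalMs) {
    return floor(timeMs, intervalMs) + intervalMs;
  }

  // Assumed expiry rule: a checkpoint is removed only once it is
  // strictly older than the trash interval.
  static boolean isExpired(long nowMs, long checkpointMs, long trashIntervalMs) {
    return nowMs - trashIntervalMs > checkpointMs;
  }

  public static void main(String[] args) {
    // Checkpoint interval and trash interval both set to 60 minutes.
    long interval = TimeUnit.MINUTES.toMillis(60);

    long firstRun = ceiling(1L, interval);
    long userA = firstRun;          // checkpoint stamped in the first second
    long userB = firstRun + 1000L;  // the second switched before user B's turn

    // The next pass wakes a few milliseconds after the interval boundary.
    long secondRun = ceiling(firstRun, interval) + 5L;
    long thirdRun = ceiling(secondRun, interval) + 5L;

    System.out.println("A expired at second run: " + isExpired(secondRun, userA, interval));
    System.out.println("B expired at second run: " + isExpired(secondRun, userB, interval));
    System.out.println("B expired at third run:  " + isExpired(thirdRun, userB, interval));
  }
}
```

In this model user B's checkpoint misses the second pass by under a second and is
only removed on the third, which is the "two intervals" case from the report.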
Would it also be possible to come up with a test-case for this? For example,
load some files into trash such that multiple dirs need to be checkpointed, and
issue a checkpoint (or await its lowered interval) and ensure only one date is
observed before clearing occurs? It would help avoid regressions in future,
just in case.
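As a starting point for the assertion such a test might make, here is a pure-Java
stand-in with no Hadoop dependencies. The class and method names are hypothetical,
and the "yyMMddHHmmss" checkpoint name pattern is an assumption on my part; a real
test would drive the actual trash code against a test filesystem and list the
resulting checkpoint directories instead.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.HashSet;
import java.util.Set;
import java.util.TimeZone;

// Stand-in for the check "only one date is observed" across several trash dirs.
// NOT Hadoop code; all names here are hypothetical.
final class CheckpointNameSketch {

  // Assumed second-resolution pattern for checkpoint directory names.
  static final String PATTERN = "yyMMddHHmmss";

  // Format a timestamp as a checkpoint name (UTC, so results are repeatable).
  static String name(long timeMs) {
    SimpleDateFormat format = new SimpleDateFormat(PATTERN);
    format.setTimeZone(TimeZone.getTimeZone("UTC"));
    return format.format(new Date(timeMs));
  }

  // Collect the distinct checkpoint names produced for a series of dirs.
  // clockReadingsMs[i] is the clock value when dir i is checkpointed.
  // With useConstantDate, the first reading is reused for every dir.
  static Set<String> checkpointNames(long[] clockReadingsMs, boolean useConstantDate) {
    Set<String> names = new HashSet<>();
    for (long reading : clockReadingsMs) {
      long stamp = useConstantDate ? clockReadingsMs[0] : reading;
      names.add(name(stamp));
    }
    return names;
  }

  public static void main(String[] args) {
    // Three dirs checkpointed in one pass; the second switches before the third.
    long[] readings = {0L, 400L, 1200L};
    System.out.println("per-dir dates:  " + checkpointNames(readings, false).size());
    System.out.println("constant date:  " + checkpointNames(readings, true).size());
  }
}
```

The regression check would then be that a single pass over multiple dirs yields
exactly one distinct checkpoint date.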
> Delay in checkpointing Trash can leave trash for 2 intervals before deleting
> ----------------------------------------------------------------------------
>
> Key: HDFS-8118
> URL: https://issues.apache.org/jira/browse/HDFS-8118
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Casey Brotherton
> Assignee: Casey Brotherton
> Priority: Trivial
> Attachments: HDFS-8118.patch
>
>
> When the fs.trash.checkpoint.interval and the fs.trash.interval are set
> non-zero and the same, it is possible for trash to be left for two intervals.
> The TrashPolicyDefault will use a floor and ceiling function to ensure that
> the Trash will be checkpointed every "interval" of minutes.
> Each user's trash is checkpointed individually. The time resolution of the
> checkpoint timestamp is to the second.
> If the seconds switch while one user is checkpointing, then the next user's
> timestamp will be later.
> This will cause the next user's checkpoint to not be deleted at the next
> interval.
> I have recreated this in a lab cluster.
> I also have a suggestion for a patch that I can upload later tonight after
> testing it further.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)