[
https://issues.apache.org/jira/browse/CASSANDRA-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393515#comment-17393515
]
Yifan Cai commented on CASSANDRA-16815:
---------------------------------------
[~brandon.williams], I have updated the
[PR|https://github.com/apache/cassandra/pull/1120] to 1) add a toggle (off by
default) that enables or disables the auto cleanup, and 2) compute the largest
gc_grace_seconds (gcgs) among the tables to better estimate the hints file TTL.
Please take another look.
Earlier, I proposed controlling it with a duration property. While writing the
code, I realized that we already have a property, {{cassandra.maxHintTTL}}.
Adding a new one would be confusing, and it does not make sense to change the
default of maxHintTTL to 0. So I am falling back to the original idea of using
a boolean to control it.
Regarding computing the largest gcgs, the idea is that if a hints file has
lived longer than that value, it is certain that all mutations in the file
have fully expired. The current gcgs may differ from the gcgs originally
recorded with the hint, e.g. because the table was altered. If the current gcgs
is larger, Cassandra detects that the hints have expired when deserializing
them and discards them. If the current gcgs is lower, the hints receiver
(HintVerbHandler) will not apply the mutations, so there is no point in even
sending them. In both cases, it is safe to simply delete the hints file once
its age exceeds the current largest gcgs.
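To make the rule above concrete, here is a minimal sketch of the deletion check. It is not the code in the PR: the class {{HintsCleanupSketch}}, its method names, and the parameters are all hypothetical, and it assumes the caller supplies the per-table gcgs values, the set of host IDs still in the cluster, and a timestamp from which the file's age is measured.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch; none of these names come from the Cassandra code base.
final class HintsCleanupSketch
{
    /** Largest gc_grace_seconds across all tables; 0 when there are no tables. */
    static int maxGcGraceSeconds(Collection<Integer> gcGraceSecondsPerTable)
    {
        int max = 0;
        for (int gcgs : gcGraceSecondsPerTable)
            max = Math.max(max, gcgs);
        return max;
    }

    /**
     * A hints file is deleted only when the toggle is on, its target host ID is
     * no longer part of the cluster (orphaned), and its age exceeds the largest
     * current gcgs. Which timestamp defines the file's age is an assumption here.
     */
    static boolean shouldDelete(boolean autoCleanupEnabled,
                                UUID hintHostId,
                                Set<UUID> liveHostIds,
                                Instant fileTimestamp,
                                Instant now,
                                int maxGcgsSeconds)
    {
        if (!autoCleanupEnabled)
            return false; // the toggle is off by default
        if (liveHostIds.contains(hintHostId))
            return false; // not orphaned, the hints can still be dispatched
        Duration age = Duration.between(fileTimestamp, now);
        return age.compareTo(Duration.ofSeconds(maxGcgsSeconds)) > 0;
    }

    public static void main(String[] args)
    {
        int maxGcgs = maxGcGraceSeconds(List.of(864000, 3600));
        UUID replacedHost = UUID.randomUUID();
        Instant now = Instant.now();
        boolean delete = shouldDelete(true, replacedHost, Set.of(UUID.randomUUID()),
                                      now.minusSeconds(maxGcgs + 1L), now, maxGcgs);
        System.out.println("delete orphaned hints file: " + delete);
    }
}
```

Using the largest gcgs as the threshold is conservative: a file is only removed once every table's grace period has elapsed, regardless of which tables its mutations belong to.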
> Background schedule to clean up orphaned hints files
> ----------------------------------------------------
>
> Key: CASSANDRA-16815
> URL: https://issues.apache.org/jira/browse/CASSANDRA-16815
> Project: Cassandra
> Issue Type: Improvement
> Components: Local/Other
> Reporter: Yifan Cai
> Assignee: Yifan Cai
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Host replacement can produce orphaned hints files, i.e. files whose
> associated host ID no longer exists in the cluster because the host was
> replaced. Those orphaned hints files will not be dispatched and only
> consume disk space.
> We can have a background schedule that infrequently checks and deletes the
> files if they are orphaned and have exceeded the TTL.