[ 
https://issues.apache.org/jira/browse/CASSANDRA-16815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393515#comment-17393515
 ] 

Yifan Cai commented on CASSANDRA-16815:
---------------------------------------

[~brandon.williams], I have updated the 
[PR|https://github.com/apache/cassandra/pull/1120] to 1) add a toggle (default 
off) to enable/disable the auto cleanup and 2) compute the largest gcgs among 
the tables to better estimate the hints file ttl. Please take another look. 

Earlier I proposed controlling this with a duration property. While writing the 
code, I realized that we already have such a property, {{cassandra.maxHintTTL}}, 
so adding a new one could be confusing. It also does not make sense to change 
the default of maxHintTTL to 0. So I am falling back to the original idea of a 
boolean toggle. 

Regarding computing the largest gcgs, the idea is that once a hints file has 
lived longer than the largest gcgs, all mutations in it are certain to have 
expired. The current gcgs may differ from the gcgs originally recorded in the 
hint, e.g. because the table was altered. If the current gcgs is larger, 
Cassandra detects that the hints have expired when deserializing them and 
discards them. If the current gcgs is lower, the hints receiver 
(HintVerbHandler) will not apply the mutations, so it does not make sense to 
even send them. In both cases, it is safe to just delete the hints file once 
its age exceeds the current largest gcgs. 
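
To illustrate the deletion rule, here is a minimal sketch in plain Java. It is 
not the code in the PR: the class name, method names, and parameters are all 
hypothetical, and it assumes the age of a hints file is measured from the time 
of its newest hint. A file is deleted only when the toggle is on, the target 
host ID is no longer in the cluster, and the file is older than the current 
largest gcgs. 

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Collection;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch only; names do not come from the Cassandra code base.
final class HintsCleanupSketch {

    /** Largest gc_grace_seconds across all tables, or 0 if there are none. */
    static int largestGcgs(Collection<Integer> gcgsPerTable) {
        int max = 0;
        for (int gcgs : gcgsPerTable) {
            max = Math.max(max, gcgs);
        }
        return max;
    }

    /**
     * Returns true when a hints file may be deleted: the toggle is enabled,
     * the target host is no longer a cluster member (the file is orphaned),
     * and the file is older than the largest current gcgs.
     */
    static boolean isSafeToDelete(boolean autoCleanupEnabled,
                                  UUID targetHostId,
                                  Set<UUID> clusterHostIds,
                                  Instant newestHintTime, // assumed age reference
                                  Instant now,
                                  Collection<Integer> gcgsPerTable) {
        if (!autoCleanupEnabled) {
            return false; // default off, per the toggle described above
        }
        if (clusterHostIds.contains(targetHostId)) {
            return false; // not orphaned; hints may still be dispatched
        }
        Duration age = Duration.between(newestHintTime, now);
        Duration ttl = Duration.ofSeconds(largestGcgs(gcgsPerTable));
        return age.compareTo(ttl) > 0;
    }
}
```

Using the largest gcgs rather than a per-table value keeps the check 
conservative: the file survives until every table's grace period has passed. 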

> Background schedule to clean up orphaned hints files
> ----------------------------------------------------
>
>                 Key: CASSANDRA-16815
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16815
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Local/Other
>            Reporter: Yifan Cai
>            Assignee: Yifan Cai
>            Priority: Normal
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Host replacement can produce orphaned hints files, i.e. files whose 
> associated host ID no longer exists in the cluster because the host was 
> replaced. Those orphaned hints files will never be dispatched and only 
> consume disk space. 
> We can have a background schedule that infrequently checks for such files 
> and deletes them if they are orphaned and have exceeded the TTL. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
