[
https://issues.apache.org/jira/browse/CASSANDRA-17180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17453037#comment-17453037
]
Paulo Motta commented on CASSANDRA-17180:
-----------------------------------------
My point is that users should not need to know the term "heartbeat", since
it is an implementation detail of the "check_gc_grace_seconds_on_startup"
feature, which requires a heartbeat file to track the last time the node was
up. The heartbeat file would not be needed at all if it were not for this
feature.
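To make the implementation detail concrete, here is a minimal sketch of such a
heartbeat file. The class name HeartbeatFile and the file format (epoch
milliseconds as text) are assumptions for illustration only, not what the
patch does:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.time.Instant;
import java.util.Optional;

// Hypothetical helper; the name and the on-disk format are assumptions.
final class HeartbeatFile {
    private final Path file;

    HeartbeatFile(Path file) {
        this.file = file;
    }

    /**
     * Stores the timestamp as epoch milliseconds. The value is written to a
     * temporary sibling file and then moved into place, so a crash in the
     * middle of a write does not leave a truncated heartbeat file behind.
     * ATOMIC_MOVE throws if the filesystem cannot do an atomic rename.
     */
    void write(Instant now) throws IOException {
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        byte[] bytes = Long.toString(now.toEpochMilli()).getBytes(StandardCharsets.UTF_8);
        Files.write(tmp, bytes);
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING, StandardCopyOption.ATOMIC_MOVE);
    }

    /** Returns the last recorded timestamp, or empty if the node never wrote one. */
    Optional<Instant> read() throws IOException {
        if (!Files.exists(file)) {
            return Optional.empty();
        }
        String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
        return Optional.of(Instant.ofEpochMilli(Long.parseLong(text)));
    }
}
```

A running node would call write(Instant.now()) once per heartbeat period, for
example from a ScheduledExecutorService, and the startup check would call
read() before anything else touches the file.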
So my suggestion is not to expose this feature to users as "heartbeat", to
avoid leaking implementation details. What the user is interested in is just
that startup fails if the node has been down for longer than
gc_grace_seconds on any table, so that is the feature we should expose.
So I would suggest something along these lines in the current configuration
format:
{noformat}
check_gc_grace_seconds_on_startup:
  enabled: true
  ignored_tables:
    - ks1.tb1
    - ks2.tb2
    - ks3  # would ignore whole keyspace
  # advanced property, maybe can be a system property?
  heartbeat_file: .cassandra-heartbeat
  # advanced property, maybe can be a system property?
  heartbeat_period: 60 secs
{noformat}
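As a rough illustration of the user-facing behaviour, the check could look
something like the sketch below. The class and method names are hypothetical,
and the map of gc_grace_seconds per "keyspace.table" stands in for whatever
the real schema lookup would be. An entry in the ignore list without a dot is
treated as a whole keyspace, as in the example above:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; names and inputs are assumptions, not the actual patch.
final class GcGraceStartupCheck {

    /** True if the table ("ks.tb") or its keyspace ("ks") is in the ignore list. */
    static boolean isIgnored(String qualifiedTable, Set<String> ignored) {
        if (ignored.contains(qualifiedTable)) {
            return true;
        }
        int dot = qualifiedTable.indexOf('.');
        return dot > 0 && ignored.contains(qualifiedTable.substring(0, dot));
    }

    /**
     * Returns, in sorted order, the non-ignored tables whose gc_grace_seconds
     * is shorter than the time elapsed since the last heartbeat. Startup
     * would fail if this list is not empty.
     */
    static List<String> violations(Instant lastHeartbeat,
                                   Instant now,
                                   Map<String, Integer> gcGraceSecondsByTable,
                                   Set<String> ignored) {
        long downSeconds = Duration.between(lastHeartbeat, now).getSeconds();
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : gcGraceSecondsByTable.entrySet()) {
            if (!isIgnored(entry.getKey(), ignored) && downSeconds > entry.getValue()) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

With this shape the user only ever sees "node was down longer than
gc_grace_seconds for these tables"; where the last-up timestamp comes from
stays internal.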
Regarding the refactoring of the startup checks: this was more a suggestion
for a future improvement, and we shouldn't block this ticket on it. We should
just keep it in mind so that we can easily carry the property over to the new
format later.
> Implement heartbeat service to know last time Cassandra node was up
> -------------------------------------------------------------------
>
> Key: CASSANDRA-17180
> URL: https://issues.apache.org/jira/browse/CASSANDRA-17180
> Project: Cassandra
> Issue Type: New Feature
> Components: Legacy/Observability
> Reporter: Stefan Miklosovic
> Assignee: Stefan Miklosovic
> Priority: Normal
> Time Spent: 10m
> Remaining Estimate: 0h
>
> As already discussed on the ML, it would be nice to have a service which
> would periodically write a timestamp to a file, signalling that the node is
> up and running. Then, on startup, we would read this file and determine
> whether any table has a gc grace period shorter than the time elapsed since
> that timestamp. If so, we would fail the start, to prevent zombie data from
> being spread around the cluster.
> https://lists.apache.org/thread/w4w5t2hlcrvqhgdwww61hgg58qz13glw