[
https://issues.apache.org/jira/browse/CASSANDRA-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860654#comment-17860654
]
Caleb Rackliffe edited comment on CASSANDRA-19321 at 7/8/24 4:05 PM:
---------------------------------------------------------------------
Implementation Notes (draft)
- {{ClusterMetadata}} contains a set of CMS node IDs for replicas that should
be considered stale
-- This could initially sit alongside {{AccordFastPath}}
- nodetool command group for Accord, similar to {{CMSAdmin}}
-- {{nodetool accord mark_stale <host>}} feeds hostname to the server, where
it's mapped to a CMS node ID
-- {{ClusterMetadata}} is updated, i.e. a transformation is committed that
resembles {{ReconfigureAccordFastPath}}, but adds the replica to the stale set
and produces the new metadata.
-- Care probably needs to be taken to ensure we don't mark too many nodes stale
and therefore stop guaranteeing quorums.
-
{{CoordinateDurabilityScheduling#coordinateShardDurableAfterExclusiveSyncPoint()}},
via {{CoordinateShardDurable#coordinate()}}, uses the updated {{Topology}} to
exclude stale replicas from the set we need to wait for. (However, we always
need to wait for at least a quorum. Could we even fall below a quorum if we
don't allow marking too many replicas stale?)
-- {{AccordTopology}} and related classes may need additional logic to
synthesize Accord {{Topology}} instances from {{ClusterMetadata}}.
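
A rough sketch of how the stale set, the quorum guard, and the durability wait
set might fit together. Everything here is hypothetical: the class and method
names ({{StaleReplicaSketch}}, {{markStale}}, {{durabilityWaitSet}}) do not
exist in Cassandra or Accord, node IDs are plain integers, and "quorum" is
assumed to be a simple majority of one shard's replicas. It only illustrates
the idea in the notes above, not the eventual implementation.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: none of these names exist in Cassandra or Accord.
final class StaleReplicaSketch {

    // Assumption: a quorum is a simple majority of the shard's replicas.
    static int quorumSize(int replicaCount) {
        return replicaCount / 2 + 1;
    }

    // Replicas that durability coordination would still wait for:
    // all replicas of the shard minus those marked stale.
    static Set<Integer> durabilityWaitSet(Set<Integer> shardReplicas, Set<Integer> stale) {
        Set<Integer> waitFor = new TreeSet<>(shardReplicas);
        waitFor.removeAll(stale);
        return waitFor;
    }

    // Returns a new stale set that includes nodeId. The input set is left
    // untouched, loosely mirroring a transformation that produces new metadata.
    // Rejects the change if fewer than a quorum of replicas would remain non-stale.
    static Set<Integer> markStale(Set<Integer> shardReplicas, Set<Integer> stale, int nodeId) {
        if (!shardReplicas.contains(nodeId))
            throw new IllegalArgumentException("Node " + nodeId + " is not a replica of this shard");

        Set<Integer> updated = new TreeSet<>(stale);
        updated.add(nodeId);

        int remaining = durabilityWaitSet(shardReplicas, updated).size();
        int quorum = quorumSize(shardReplicas.size());
        if (remaining < quorum)
            throw new IllegalStateException("Marking node " + nodeId + " stale would leave "
                                            + remaining + " non-stale replicas; quorum is " + quorum);

        return Collections.unmodifiableSet(updated);
    }

    public static void main(String[] args) {
        Set<Integer> replicas = Set.of(1, 2, 3);

        // Marking one of three replicas stale still leaves a quorum of two.
        Set<Integer> stale = markStale(replicas, Set.of(), 3);
        System.out.println(durabilityWaitSet(replicas, stale)); // prints [1, 2]

        // Marking a second replica stale would drop below quorum and is rejected.
        try {
            markStale(replicas, stale, 2);
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a guard like this at marking time, the wait set handed to durability
coordination could never be smaller than a quorum, which may make a second
check on the coordination side redundant.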
was (Author: maedhroz):
Implementation Notes (draft)
- {{ClusterMetadata}} contains a set of CMS node IDs for replicas that should
be considered stale
-- This could initially sit alongside {{AccordFastPath}}
- nodetool command group for Accord, similar to {{CMSAdmin}}
-- {{nodetool accord mark_stale <host>}} feeds hostname to the server, where
it's mapped to a CMS node ID
-- {{ClusterMetadata}} is updated, i.e. a transformation is committed that
resembles {{ReconfigureAccordFastPath}}, but adds the replica to the stale set
and produces the new metadata.
-
{{CoordinateDurabilityScheduling#coordinateShardDurableAfterExclusiveSyncPoint()}},
via {{CoordinateShardDurable#coordinate()}}, uses updated {{Topology}} to
exclude stale replicas from the set we need to wait for.
-- {{AccordTopology}} and related classes may need additional logic to
synthesize Accord {{Topology}} instances from {{ClusterMetadata}}.
> Accord: Command to mark replicas as "stale" for decommission
> ------------------------------------------------------------
>
> Key: CASSANDRA-19321
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19321
> Project: Cassandra
> Issue Type: Improvement
> Components: Accord
> Reporter: Benedict Elliott Smith
> Assignee: Caleb Rackliffe
> Priority: Normal
>
> So that other replicas may continue to clean up their state, we must have an
> operator command for marking replicas as stale so that the remaining replicas
> do not wait for them to coordinate their durability status.