[jira] [Comment Edited] (CASSANDRA-19321) Accord: Command to mark replicas as "stale" for decommission

Caleb Rackliffe (Jira) Mon, 08 Jul 2024 09:06:34 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-19321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17860654#comment-17860654
 ]


Caleb Rackliffe edited comment on CASSANDRA-19321 at 7/8/24 4:05 PM:
---------------------------------------------------------------------

Implementation Notes (draft)

- {{ClusterMetadata}} contains a set of CMS node IDs for replicas that should 
be considered stale
-- This could initially sit alongside {{AccordFastPath}}

- nodetool command group for Accord, similar to {{CMSAdmin}}
-- {{nodetool accord mark_stale <host>}} passes the hostname to the server, 
where it's mapped to a CMS node ID
-- {{ClusterMetadata}} is updated, i.e. a transformation is committed that 
resembles {{ReconfigureAccordFastPath}}, but adds the replica to the stale set 
and produces the new metadata.
-- Care probably needs to be taken to ensure we don't mark so many nodes stale 
that we can no longer guarantee quorums.

- {{CoordinateDurabilityScheduling#coordinateShardDurableAfterExclusiveSyncPoint()}}, 
via {{CoordinateShardDurable#coordinate()}}, uses the updated {{Topology}} to 
exclude stale replicas from the set we need to wait for. (However, we always 
need to wait for at least a quorum. Should falling below a quorum even be 
possible if we don't allow marking too many replicas stale?)
-- {{AccordTopology}} and related classes may need additional logic to 
synthesize Accord {{Topology}} instances from {{ClusterMetadata}}.
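
To make the quorum concern above concrete, here is a minimal, self-contained 
sketch of the two checks the notes describe: refusing to mark a replica stale 
when that would leave fewer than a quorum of non-stale replicas, and computing 
the set of replicas durability coordination would wait for. Everything in it is 
hypothetical: the {{StaleReplicas}} class, its method names, the use of plain 
integers for node IDs, and the simple-majority quorum are assumptions for 
illustration, not the actual {{ClusterMetadata}}, {{Topology}}, or 
transformation APIs.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified model of a single shard. None of these names exist
// in Cassandra or Accord; node IDs are plain ints purely for illustration.
final class StaleReplicas {
    private StaleReplicas() {}

    // Assumption: a quorum is a simple majority of the shard's replicas.
    public static int quorumSize(int replicaCount) {
        return replicaCount / 2 + 1;
    }

    // Returns a new stale set that includes nodeId. Throws if the remaining
    // non-stale replicas could no longer form a quorum.
    public static Set<Integer> markStale(Set<Integer> shardReplicas, Set<Integer> stale, int nodeId) {
        if (!shardReplicas.contains(nodeId))
            throw new IllegalArgumentException("Node " + nodeId + " is not a replica of this shard");

        Set<Integer> updated = new HashSet<>(stale);
        updated.add(nodeId);

        int nonStale = 0;
        for (int id : shardReplicas)
            if (!updated.contains(id))
                nonStale++;

        if (nonStale < quorumSize(shardReplicas.size()))
            throw new IllegalStateException("Marking node " + nodeId + " stale would leave fewer than a quorum");

        return Collections.unmodifiableSet(updated);
    }

    // Replicas that durability coordination should wait for: every replica
    // that is not stale. Throws if that set is smaller than a quorum.
    public static Set<Integer> waitSet(Set<Integer> shardReplicas, Set<Integer> stale) {
        Set<Integer> result = new HashSet<>(shardReplicas);
        result.removeAll(stale);
        if (result.size() < quorumSize(shardReplicas.size()))
            throw new IllegalStateException("Fewer than a quorum of non-stale replicas remain");
        return result;
    }
}
```

In this sketch, if {{markStale}} is the only way to grow the stale set, the 
check in {{waitSet}} is purely defensive, which is one possible answer to the 
question raised above.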


was (Author: maedhroz):
Implementation Notes (draft)

- {{ClusterMetadata}} contains a set of CMS node IDs for replicas that should 
be considered stale
-- This could initially sit alongside {{AccordFastPath}}

- nodetool command group for Accord, similar to {{CMSAdmin}}
-- {{nodetool accord mark_stale <host>}} feeds hostname to the server, where 
it's mapped to a CMS node ID
-- {{ClusterMetadata}} is updated, i.e. a transformation committed that 
resembles {{ReconfigureAccordFastPath}}, but adds replica to the stale set and 
produces the new metadata.

- {{CoordinateDurabilityScheduling#coordinateShardDurableAfterExclusiveSyncPoint()}}, 
via {{CoordinateShardDurable#coordinate()}}, uses updated {{Topology}} to 
exclude stale replicas from the set we need to wait for.
-- {{AccordTopology}} and related classes may need additional logic to 
synthesize Accord {{Topology}} instances from {{ClusterMetadata}}.

> Accord: Command to mark replicas as "stale" for decommission
> ------------------------------------------------------------
>
>                 Key: CASSANDRA-19321
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19321
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Accord
>            Reporter: Benedict Elliott Smith
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>
> So that other replicas may continue to clean up their state, we must have an 
> operator command for marking replicas as stale so that the remaining replicas 
> do not wait for them to coordinate their durability status.



