[jira] [Commented] (CASSANDRA-21024) Add configuration to disk usage guardrails to stop writes across all replicas of a keyspace when any node replicating that keyspace exceeds the disk usage failure threshold.

Isaac Reath (Jira) Mon, 17 Nov 2025 16:49:08 -0800


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-21024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039032#comment-18039032
 ]


Isaac Reath commented on CASSANDRA-21024:
-----------------------------------------

A WIP branch is available for trunk:
https://github.com/isaacreath/cassandra/commit/b2b738335f67692bed725cb74f50a2eafdec2b25
The general idea is to track, in the `DiskUsageBroadcaster`, whether a given 
datacenter contains a full node. On write, we'll check every node that 
replicates data for the given partition key, and if any of those nodes is in a 
datacenter that contains a full node, we'll reject the write.
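
To make the idea concrete, here is a minimal, self-contained sketch of the check described above. It is not the code in the WIP branch: `FullDatacenterTracker` and all of its methods are hypothetical names, nodes and datacenters are plain strings, and the class is not thread-safe. It only illustrates tracking "does this datacenter contain a full node" and consulting that on the write path.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the state the comment proposes keeping in
// DiskUsageBroadcaster. Names and types are assumptions for illustration.
final class FullDatacenterTracker {
    // datacenter -> nodes in that datacenter currently reporting a full disk
    private final Map<String, Set<String>> fullNodesByDc = new HashMap<>();
    // node -> datacenter it belongs to
    private final Map<String, String> dcByNode = new HashMap<>();

    void registerNode(String node, String datacenter) {
        dcByNode.put(node, datacenter);
    }

    // Called whenever a node's disk usage state is (re)broadcast.
    void onDiskUsageChanged(String node, boolean full) {
        String dc = dcByNode.get(node);
        if (dc == null) {
            return; // unknown node, nothing to track
        }
        if (full) {
            fullNodesByDc.computeIfAbsent(dc, k -> new HashSet<>()).add(node);
        } else {
            Set<String> nodes = fullNodesByDc.get(dc);
            if (nodes != null) {
                nodes.remove(node);
                if (nodes.isEmpty()) {
                    fullNodesByDc.remove(dc); // last full node recovered
                }
            }
        }
    }

    boolean datacenterHasFullNode(String datacenter) {
        return fullNodesByDc.containsKey(datacenter);
    }

    // Write-path check: reject if any replica for the partition key lives
    // in a datacenter that contains at least one full node.
    boolean shouldRejectWrite(Collection<String> replicas) {
        for (String replica : replicas) {
            String dc = dcByNode.get(replica);
            if (dc != null && datacenterHasFullNode(dc)) {
                return true;
            }
        }
        return false;
    }
}
```

Note that in this sketch a write is rejected even when none of its own replicas is full, as long as one of them shares a datacenter with a full node. That is the difference from the existing per-replica check.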

> Add configuration to disk usage guardrails to stop writes across all replicas 
> of a keyspace when any node replicating that keyspace exceeds the disk usage 
> failure threshold.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-21024
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-21024
>             Project: Apache Cassandra
>          Issue Type: Improvement
>          Components: Feature/Guardrails
>            Reporter: Isaac Reath
>            Assignee: Isaac Reath
>            Priority: Normal
>             Fix For: 6.x
>
>
> [CASSANDRA-17150|https://issues.apache.org/jira/browse/CASSANDRA-17150] 
> introduced disk usage guardrails that stop writes for specific tokens when 
> any replica responsible for those tokens exceeds the configured failure 
> threshold. This mechanism protects individual nodes from running out of disk 
> space but can result in inconsistent write availability when only a subset of 
> replicas or token ranges are affected. This in turn pushes the responsibility 
> onto the application owner to decide how to handle the partial write 
> unavailability. 
> We propose adding a new configuration option, 
> data_disk_usage_stop_writes_for_keyspace_on_fail, that extends this behavior 
> to the keyspace level. When enabled, if any node that participates in 
> replication for a keyspace exceeds the disk usage failure threshold, writes 
> to that keyspace will be stopped across all nodes that replicate that keyspace.
> This change provides operators with finer control over guardrail enforcement, 
> allowing them to choose between the current per-token behavior or a stricter, 
> keyspace-wide policy that prioritizes simplicity and operational 
> predictability over partial write availability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

