[
https://issues.apache.org/jira/browse/CASSANDRA-21024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18039032#comment-18039032
]
Isaac Reath commented on CASSANDRA-21024:
-----------------------------------------
WIP branch available for trunk:
https://github.com/isaacreath/cassandra/commit/b2b738335f67692bed725cb74f50a2eafdec2b25.
The general idea is to track, in the `DiskUsageBroadcaster`, whether a given
datacenter contains a full node. On write, we'll check every node that
replicates data for the given partition key; if any of those nodes is in a
datacenter that contains a full node, we'll reject the write.
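
As a rough illustration of the check described above, here is a minimal,
self-contained Java sketch. It is not the code in the WIP branch:
`FullDatacenterTracker`, its method names, and the use of plain strings for
nodes and datacenters are all hypothetical and chosen only for the example.
Thread safety and gossip handling are deliberately left out.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of "reject the write if any replica lives in a
// datacenter that contains a full node". Not the actual Cassandra classes.
final class FullDatacenterTracker
{
    // datacenter name -> nodes in that datacenter currently reported as full
    private final Map<String, Set<String>> fullNodesByDc = new HashMap<>();
    // node -> datacenter it belongs to
    private final Map<String, String> dcByNode = new HashMap<>();

    void registerNode(String node, String datacenter)
    {
        dcByNode.put(node, datacenter);
    }

    // Called whenever a node's disk usage state changes (assumed callback).
    void onDiskUsageUpdate(String node, boolean isFull)
    {
        String dc = dcByNode.get(node);
        if (dc == null)
            return; // unknown node, nothing to track

        if (isFull)
        {
            fullNodesByDc.computeIfAbsent(dc, k -> new HashSet<>()).add(node);
            return;
        }

        Set<String> full = fullNodesByDc.get(dc);
        if (full != null)
        {
            full.remove(node);
            if (full.isEmpty())
                fullNodesByDc.remove(dc); // datacenter no longer has a full node
        }
    }

    boolean datacenterHasFullNode(String datacenter)
    {
        return fullNodesByDc.containsKey(datacenter);
    }

    // replicas: the nodes that replicate data for the partition key being written
    boolean shouldRejectWrite(List<String> replicas)
    {
        for (String replica : replicas)
        {
            String dc = dcByNode.get(replica);
            if (dc != null && datacenterHasFullNode(dc))
                return true;
        }
        return false;
    }
}
```

In this sketch a write is rejected even when the full node is not itself a
replica for the key, as long as it shares a datacenter with one of the
replicas. Once the full node reports a healthy state again, writes to that
datacenter are accepted again.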
> Add configuration to disk usage guardrails to stop writes across all replicas
> of a keyspace when any node replicating that keyspace exceeds the disk usage
> failure threshold.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-21024
> URL: https://issues.apache.org/jira/browse/CASSANDRA-21024
> Project: Apache Cassandra
> Issue Type: Improvement
> Components: Feature/Guardrails
> Reporter: Isaac Reath
> Assignee: Isaac Reath
> Priority: Normal
> Fix For: 6.x
>
>
> [CASSANDRA-17150|https://issues.apache.org/jira/browse/CASSANDRA-17150]
> introduced disk usage guardrails that stop writes for specific tokens when
> any replica responsible for those tokens exceeds the configured failure
> threshold. This mechanism protects individual nodes from running out of disk
> space but can result in inconsistent write availability when only a subset of
> replicas or token ranges is affected. This in turn leaves the application
> owner responsible for deciding how to handle the partial write
> unavailability.
> We propose adding a new configuration option,
> data_disk_usage_stop_writes_for_keyspace_on_fail, that extends this behavior
> to the keyspace level. When enabled, if any node that participates in
> replication for a keyspace exceeds the disk usage failure threshold, writes
> to that keyspace will be stopped across all nodes that replicate that keyspace.
> This change provides operators with finer control over guardrail enforcement,
> allowing them to choose between the current per-token behavior or a stricter,
> keyspace-wide policy that prioritizes simplicity and operational
> predictability over partial write availability.