[
https://issues.apache.org/jira/browse/CASSANDRA-13704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17880836#comment-17880836
]
Caleb Rackliffe edited comment on CASSANDRA-13704 at 9/17/24 6:45 PM:
----------------------------------------------------------------------
|4.0|[patch|https://github.com/apache/cassandra/pull/3526]|[^ci_summary.html]|n/a|
|4.1|[patch|https://github.com/apache/cassandra/pull/3539]|[^ci_summary-1.html]|n/a|
|5.0|[patch|https://github.com/apache/cassandra/pull/3544]|[^ci_summary-2.html]|[ASF CI|https://ci-cassandra.apache.org/view/patches/job/Cassandra-devbranch-5/67/]|
I've now posted PRs for 4.0, 4.1, and 5.0. CI is looking pretty clean on 4.0
and 4.1, and 5.0 runs are in progress...
The patch introduces two new YAML options, {{log_out_of_token_range_requests}}
and {{reject_out_of_token_range_requests}}, both of which default to {{true}}.
They determine how streaming, repair, hints, mutations, read repair, and
point/range reads behave when they are executed on nodes that do not own the
range(s) for the data involved.
When enabled, {{log_out_of_token_range_requests}} logs at WARN level,
indicating the kind of request, its source, the invalid ranges requested, and
the ranges the node actually owns. When {{reject_out_of_token_range_requests}} is enabled,
out-of-range operations are outright rejected, rather than being accepted by a
node that may never own the relevant range(s) and cannot, for example, safely
participate in a write quorum. (Writes are not considered out-of-range if the
range is pending, but in the event the node itself isn't yet aware of the
pending range, they will be rejected. This may cause a short window of degraded
availability, but it is safer and more visible than silently and erroneously
accepting them.)
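
To make the log/reject decision for writes concrete, here is a minimal Java sketch. It is not taken from the patches: the class {{OutOfRangeGuard}}, its {{Range}} type, and the {{checkWrite}} method are hypothetical names invented for illustration, tokens are simplified to {{long}} values, and the two boolean parameters merely stand in for the two YAML options. Ranges are assumed to be left-exclusive, right-inclusive, and allowed to wrap around the ring.

```java
import java.util.List;

// Illustrative sketch only: these class and method names are hypothetical
// and do not come from the Cassandra codebase or the linked patches.
final class OutOfRangeGuard {

    /** A ring range (left, right]; it wraps around the ring when left >= right. */
    static final class Range {
        final long left;
        final long right;

        Range(long left, long right) {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token) {
            if (left < right)
                return token > left && token <= right;
            // Wrapping range: covers everything after left plus everything up to right.
            return token > left || token <= right;
        }

        @Override
        public String toString() {
            return "(" + left + "," + right + "]";
        }
    }

    enum Outcome { ACCEPTED, REJECTED }

    static boolean covers(List<Range> ranges, long token) {
        for (Range r : ranges)
            if (r.contains(token))
                return true;
        return false;
    }

    /**
     * Decides whether a write for the given token should be applied locally.
     * The two flags stand in for log_out_of_token_range_requests and
     * reject_out_of_token_range_requests.
     */
    static Outcome checkWrite(String source, long token,
                              List<Range> owned, List<Range> pending,
                              boolean logOutOfRange, boolean rejectOutOfRange) {
        // A pending range the node already knows about counts as in-range for writes.
        if (covers(owned, token) || covers(pending, token))
            return Outcome.ACCEPTED;

        if (logOutOfRange)
            System.err.printf("WARN out-of-range write from %s: token %d; owned ranges %s%n",
                              source, token, owned);

        return rejectOutOfRange ? Outcome.REJECTED : Outcome.ACCEPTED;
    }
}
```

In this sketch, a write whose token falls in a pending range is accepted only if that range appears in the {{pending}} list passed in. If the node has not yet learned of the pending range, the list is empty and the write is rejected, which corresponds to the short window of degraded availability described above.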
Once review settles, I'll likely add entries to {{NEWS.txt}} along w/ the
CHANGES content, but given this is something we should probably never disable,
I'm not too keen on adding it to the example {{cassandra.yaml}}.
Finally, {{nodetool info}} has a new option, {{--out-of-range-ops}}, that will
display per-keyspace counts of operations for invalid tokens.
> Safer handling of out of range tokens
> -------------------------------------
>
> Key: CASSANDRA-13704
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13704
> Project: Cassandra
> Issue Type: Bug
> Components: Legacy/Coordination, Legacy/Observability
> Reporter: Sam Tunnicliffe
> Assignee: Caleb Rackliffe
> Priority: Urgent
> Fix For: 4.0.x, 4.1.x, 5.0.x
>
> Attachments: CASSANDRA-13704_5-0_23_ci_summary.html,
> CASSANDRA-13704_5-0_23_results_details.tar.xz,
> CASSANDRA-13704_5-0_24_ci_summary.html,
> CASSANDRA-13704_5-0_24_results_details.tar.xz, ci_summary-1.html,
> ci_summary-2.html, ci_summary.html, result_details.tar-1.gz,
> result_details.tar-2.gz, result_details.tar.gz
>
> Time Spent: 7h 10m
> Remaining Estimate: 0h
>
> It is possible for nodes to have a divergent view of the ring, which can
> result in some operations being sent to the wrong nodes. This is an umbrella
> ticket to mitigate such issues by adding logging when a node is asked to
> perform an operation for tokens it does not own. This will be useful for
> detecting when the nodes' views of the ring diverge, which is not highly
> visible at the moment, and also for post-hoc analysis.
> It may also be beneficial to straight up reject certain operations, though
> this will need to balance the risk of performing those ops against the
> consequences rejecting them has on availability.