[ https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812447#comment-17812447 ]
Caleb Rackliffe edited comment on CASSANDRA-19018 at 1/31/24 5:03 AM:
----------------------------------------------------------------------
bq. I'm not sure if we can rely on timestamps to be different for detecting
partial updates. Partial updates may have timestamp collisions
After reproducing this locally myself and thinking it over a bit, I think
[~ifesdjeen] is right. Because we break ties per-cell, there may not be enough
information to correctly do what I've got in the patch in
{{FilterTree#getLocalOperator()}}. Removing it doesn't affect correctness, but
it was a nice optimization in terms of how many results we'd have to send to
the coordinator. I'm trying to think of another way around this, and I'm open
to suggestions, although I'm not immediately sure it's possible without the
row read itself keeping track of ties...
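For context on why per-cell tie-breaking loses that information, here's a
deliberately simplified sketch of how two versions of the same cell get
reconciled (the real logic lives in {{Cells#reconcile}} and also handles
deletions, expiring cells, counters, etc.):
{noformat}
// Simplified sketch only; not the actual Cells#reconcile implementation.
record CellVersion(long timestamp, boolean tombstone, int value) {}

static CellVersion reconcile(CellVersion left, CellVersion right)
{
    // The higher timestamp wins outright...
    if (left.timestamp() != right.timestamp())
        return left.timestamp() > right.timestamp() ? left : right;

    // ...on a tie, a tombstone beats a live cell...
    if (left.tombstone() != right.tombstone())
        return left.tombstone() ? left : right;

    // ...and between two live cells, the greater value wins. Nothing here
    // records which mutation the winning cell came from, so a reconciled row
    // can freely mix cells from different writes.
    return left.value() >= right.value() ? left : right;
}
{noformat}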
Actually, even if we were able to keep track of timestamp ties locally, it
might not matter. We might have two complete-row mutations w/ the same
timestamp hitting different replicas where strict filtering on neither one
would produce a match, but a match would exist post-reconciliation. I think I'm
just going to have to change {{getLocalOperator()}} to downgrade to unions when
the coordinator tells us strict filtering isn't safe.
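Roughly what I have in mind for the downgrade, purely as a sketch (the flag
name and plumbing here are illustrative, not necessarily what the patch will
end up looking like):
{noformat}
// Sketch only; the actual signature/plumbing in the patch may differ.
// When the coordinator can't guarantee strict filtering is safe, evaluate the
// filter tree as a union so partial matches still reach the coordinator and
// can be completed by reconciliation + replica filtering protection.
BooleanOperator getLocalOperator(boolean strictFilteringSafe)
{
    return strictFilteringSafe ? BooleanOperator.AND : BooleanOperator.OR;
}
{noformat}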
I think this also renders moot all of our previous discussion on things like
requiring non-partial updates, given it's possible to break things even with
those (in the presence of a timestamp collision).
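For the record, something like this (an untested sketch, modeled on the repro
in the description) should show the complete-row collision case: with equal
timestamps, the per-cell tie-break picks the greater value for each column, so
the reconciled row is (0, 1, 2) even though neither replica's local row
matches the query.
{noformat}
// Untested sketch; same in-jvm dtest harness as the repro in the description.
try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
{
    cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
    cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
    cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));

    // complete-row writes w/ the same timestamp, one to each replica
    cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a, b) VALUES (0, 1, 0) USING TIMESTAMP 1"));
    cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, a, b) VALUES (0, 0, 2) USING TIMESTAMP 1"));

    // Neither local row matches, but per-cell tie-breaking (greater value wins)
    // reconciles to (0, 1, 2), which does.
    String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
    Object[][] rows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
    assertRows(rows, row(0, 1, 2)); // should fail today, same as the repro in the description
}
{noformat}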
CC [~adelapena] [~pkolaczk]
> An SAI-specific mechanism to ensure consistency isn't violated for
> multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-19018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Feature/SAI
> Reporter: Caleb Rackliffe
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around
> filtering/index queries that use intersection/AND over partially updated
> non-key columns. (ex. Restricting one clustering column and one normal column
> does not cause a consistency problem, as primary keys cannot be partially
> updated.) This issue exists to attempt to fix this specifically for SAI in
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>     // Uncomment this line and the test succeeds w/ partial writes completed...
>     // cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes hide partial matches that
> would otherwise combine at the coordinator to form full matches.
> Simple non-index filtering queries also suffer from this problem, but they
> hide the partial matches in a different way. I'll outline a possible solution
> for this in the comments that takes advantage of replica filtering protection
> and the repaired/unrepaired datasets...and attempts to minimize the amount of
> extra row data sent to the coordinator.