[
https://issues.apache.org/jira/browse/CASSANDRA-19018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17812280#comment-17812280
]
Alex Petrov commented on CASSANDRA-19018:
-----------------------------------------
I am not fully done with my review just yet, but wanted to post some early
findings, since they may change the course a bit.
I'm not sure we can rely on timestamps being different to detect partial
updates, because partial updates may have colliding timestamps:
{code:java}
try (Cluster cluster = init(Cluster.build(2).withConfig(cfg -> cfg.with(NETWORK, GOSSIP)).start()))
{
    cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
    cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
    cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
    SAIUtil.waitForIndexQueryable(cluster, KEYSPACE);

    // insert a split row
    cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a, b) VALUES (0, 1, 2) USING TIMESTAMP 1"));
    cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, a, b) VALUES (0, 2, 1) USING TIMESTAMP 1"));

    // Uncomment this line and test succeeds w/ partial writes completed...
    // cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();

    String select = withKeyspace("SELECT * FROM %s.t WHERE a = 2 AND b = 2");
    Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
    AssertUtils.assertRows(initialRows, AssertUtils.row(0, 2, 2));
}
{code}
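When the two writes are reconciled, value 2 will take precedence in both
columns because of lexicographical conflict resolution: the timestamps are
equal, so the tie is broken by comparing the values. Neither replica holds a
row with a = 2 AND b = 2 locally, though, so this test will fail with: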
{code}
java.lang.AssertionError:
Expected: [[0, 2, 2]]
Actual: []
{code}
Once repaired, both columns will have the value 2 on each replica, and the
match on the index will get triggered.
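
To make the collision easier to reason about outside of a cluster, here is a
minimal standalone sketch of the scenario above. It is not Cassandra code: the
TimestampTieSketch class, its Cell type and its helper methods are
hypothetical, and the tie-break rule (greater value wins when timestamps are
equal) is a simplifying assumption taken from the behaviour described in this
comment.

```java
import java.util.HashMap;
import java.util.Map;

final class TimestampTieSketch
{
    /** Hypothetical, simplified cell: a column value plus its write timestamp. */
    static final class Cell
    {
        final int value;
        final long timestamp;

        Cell(int value, long timestamp)
        {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Higher timestamp wins; on a tie the greater value wins (assumed tie-break). */
    static Cell reconcile(Cell left, Cell right)
    {
        if (left.timestamp != right.timestamp)
            return left.timestamp > right.timestamp ? left : right;
        return left.value >= right.value ? left : right;
    }

    /** Merges two replica versions of the same row, column by column. */
    static Map<String, Cell> merge(Map<String, Cell> r1, Map<String, Cell> r2)
    {
        Map<String, Cell> merged = new HashMap<>(r1);
        r2.forEach((column, cell) -> merged.merge(column, cell, TimestampTieSketch::reconcile));
        return merged;
    }

    /** Evaluates "a = ? AND b = ?" against one version of the row. */
    static boolean matches(Map<String, Cell> row, int a, int b)
    {
        Cell ca = row.get("a");
        Cell cb = row.get("b");
        return ca != null && cb != null && ca.value == a && cb.value == b;
    }

    /** Builds a row whose two columns were written with the same timestamp. */
    static Map<String, Cell> row(int a, int b, long timestamp)
    {
        Map<String, Cell> row = new HashMap<>();
        row.put("a", new Cell(a, timestamp));
        row.put("b", new Cell(b, timestamp));
        return row;
    }

    public static void main(String[] args)
    {
        Map<String, Cell> node1 = row(1, 2, 1L); // INSERT (0, 1, 2) USING TIMESTAMP 1
        Map<String, Cell> node2 = row(2, 1, 1L); // INSERT (0, 2, 1) USING TIMESTAMP 1

        // Each replica filters locally and finds nothing to return.
        System.out.println(matches(node1, 2, 2)); // false
        System.out.println(matches(node2, 2, 2)); // false

        // The reconciled row does match, but the timestamps never differed.
        System.out.println(matches(merge(node1, node2), 2, 2)); // true
    }
}
```

Both replicas hold a complete row written at the same timestamp, so a check
that looks for differing timestamps has nothing to flag here, even though the
reconciled row is one that neither replica matches on its own.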
> An SAI-specific mechanism to ensure consistency isn't violated for
> multi-column (i.e. AND) queries at CL > ONE
> --------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-19018
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19018
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination, Feature/SAI
> Reporter: Caleb Rackliffe
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 5.0-rc, 5.x
>
> Attachments: ci_summary.html, result_details.tar.gz
>
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> CASSANDRA-19007 is going to be where we add a guardrail around
> filtering/index queries that use intersection/AND over partially updated
> non-key columns. (ex. Restricting one clustering column and one normal column
> does not cause a consistency problem, as primary keys cannot be partially
> updated.) This issue exists to attempt to fix this specifically for SAI in
> 5.0.x, as Accord will (last I checked) not be available until the 5.1 release.
> The SAI-specific version of the originally reported issue is this:
> {noformat}
> try (Cluster cluster = init(Cluster.build(2).withConfig(config -> config.with(GOSSIP).with(NETWORK)).start()))
> {
>     cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(a) USING 'sai'"));
>     cluster.schemaChange(withKeyspace("CREATE INDEX ON %s.t(b) USING 'sai'"));
>     // insert a split row
>     cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>     cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>     // Uncomment this line and test succeeds w/ partial writes completed...
>     //cluster.get(1).nodetoolResult("repair", KEYSPACE).asserts().success();
>     String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2");
>     Object[][] initialRows = cluster.coordinator(1).execute(select, ConsistencyLevel.ALL);
>     assertRows(initialRows, row(0, 1, 2)); // not found!!
> }
> {noformat}
> To make a long story short, the local SAI indexes are hiding local partial
> matches from the coordinator that would combine there to form full matches.
> Simple non-index filtering queries also suffer from this problem, but they
> hide the partial matches in a different way. I'll outline a possible solution
> for this in the comments that takes advantage of replica filtering protection
> and the repaired/unrepaired datasets...and attempts to minimize the amount of
> extra row data sent to the coordinator.