[
https://issues.apache.org/jira/browse/CASSANDRA-19007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784388#comment-17784388
]
Andres de la Peña commented on CASSANDRA-19007:
-----------------------------------------------
The proposal sounds good to me at first glance. I guess it can be extended to
also cover {{ALLOW FILTERING}} by just accepting unset column values in the
unrepaired set.
I don't see why the two queries for the repaired/unrepaired sstables of a given
replica have to be issued and merged by the coordinator node. Can't it issue a
single query per replica, and let the replica split it in two and merge the
results locally before sending them back to the coordinator?
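
To make the "accepting unset column values" idea concrete, here is a minimal Java sketch. It is only an illustration under assumptions, not code from any patch or from Cassandra itself: the class {{LenientFilterSketch}}, its methods and the map-based row model are all hypothetical, and the filter is hard-coded to {{a = 1 AND b = 2}} from the dtest in the description. A strict match rejects a row version with an unset filtered column, while a lenient match keeps it as a potential match that another replica might still complete.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class LenientFilterSketch
{
    /** Builds a row version as a column-name-to-value map; a null argument means "column unset". */
    static Map<String, Integer> row(int k, Integer a, Integer b)
    {
        Map<String, Integer> row = new HashMap<>();
        row.put("k", k);
        if (a != null) row.put("a", a);
        if (b != null) row.put("b", b);
        return row;
    }

    /** Strict match for "a = 1 AND b = 2": an unset column never matches. */
    static boolean strictMatch(Map<String, Integer> row)
    {
        return Integer.valueOf(1).equals(row.get("a"))
            && Integer.valueOf(2).equals(row.get("b"));
    }

    /** Lenient match: an unset column is accepted, a set column must still hold the expected value. */
    static boolean lenientMatch(Map<String, Integer> row)
    {
        Integer a = row.get("a");
        Integer b = row.get("b");
        return (a == null || a == 1) && (b == null || b == 2);
    }
}
```

With this model, the partial versions {{(k=0, a=1)}} and {{(k=0, b=2)}} both fail the strict match but pass the lenient one, so whoever merges them would still have to apply the strict filter to the merged row. A version with a set but wrong value, such as {{(k=0, a=3)}}, is rejected either way.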
> Queries with multi-column replica-side filtering can miss rows
> --------------------------------------------------------------
>
> Key: CASSANDRA-19007
> URL: https://issues.apache.org/jira/browse/CASSANDRA-19007
> Project: Cassandra
> Issue Type: Bug
> Components: Consistency/Coordination
> Reporter: Andres de la Peña
> Assignee: Caleb Rackliffe
> Priority: Normal
> Fix For: 5.0.x, 5.x
>
>
> {{SELECT}} queries with multi-column replica-side filtering can miss rows if
> the filtered columns are spread across out-of-sync replicas. This dtest
> reproduces the issue:
> {code:java}
> @Test
> public void testMultiColumnReplicaSideFiltering() throws IOException
> {
>     try (Cluster cluster = init(Cluster.build().withNodes(2).start()))
>     {
>         cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>
>         // insert a split row
>         cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>         cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>         String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2 ALLOW FILTERING");
>         Object[][] initialRows = cluster.coordinator(1).execute(select, ALL);
>         assertRows(initialRows, row(0, 1, 2)); // not found!!
>     }
> }
> {code}
> This edge case affects queries using {{ALLOW FILTERING}} or any index
> implementation.
> It affects all branches since multi-column replica-side filtering queries
> were introduced, long before 3.0.
> The protection mechanism added by CASSANDRA-8272/8273 won't deal with this
> case, since it only solves single-column conflicts where stale rows could
> be resurrected. This bug, however, doesn't resurrect data; it can only miss
> rows while the replicas are out of sync.
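
For readers who want to see the mechanics without a cluster, the miss described above can be reproduced with plain maps. This is a minimal sketch and not Cassandra code: the class {{SplitRowDemo}} and its methods are hypothetical, rows are modelled as column-name-to-value maps, and merging ignores timestamps because the two versions never write the same column with different values.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class SplitRowDemo
{
    /** The filter from the dtest, "a = 1 AND b = 2"; an unset column never matches. */
    static boolean matches(Map<String, Integer> row)
    {
        return Integer.valueOf(1).equals(row.get("a"))
            && Integer.valueOf(2).equals(row.get("b"));
    }

    /** Each replica filters its own version first; the coordinator merges whatever survives. */
    static Optional<Map<String, Integer>> filterThenMerge(Map<String, Integer> node1,
                                                          Map<String, Integer> node2)
    {
        Map<String, Integer> merged = new HashMap<>();
        if (matches(node1)) merged.putAll(node1);
        if (matches(node2)) merged.putAll(node2);
        return merged.isEmpty() ? Optional.empty() : Optional.of(merged);
    }

    /** The versions are merged first and the filter is applied to the reconciled row. */
    static Optional<Map<String, Integer>> mergeThenFilter(Map<String, Integer> node1,
                                                          Map<String, Integer> node2)
    {
        Map<String, Integer> merged = new HashMap<>(node1);
        merged.putAll(node2);
        return matches(merged) ? Optional.of(merged) : Optional.empty();
    }
}
```

With node 1 holding {{(k=0, a=1)}} and node 2 holding {{(k=0, b=2)}}, {{filterThenMerge}} returns an empty result because neither version satisfies both restrictions on its own, whereas {{mergeThenFilter}} returns the full row {{(k=0, a=1, b=2)}}. The first path is the shape of the bug in the dtest: the row exists once the replicas are reconciled, but replica-side filtering discards both halves before they can be merged.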