[ https://issues.apache.org/jira/browse/CASSANDRA-19007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17829211#comment-17829211 ]
Caleb Rackliffe commented on CASSANDRA-19007:
---------------------------------------------

[~Bereng] I wouldn't say that an actual solution (which I had [started to explore|https://github.com/apache/cassandra/pull/3155] along w/ CASSANDRA-19018) should block 5.0, as this has been broken since the beginning of time for normal filtering queries.

What I think we should do at this point is just spin off a separate Jira to emit a client warning/put a guardrail in place if a user attempts to do a read that involves filtering (without SAI, which is fixed, or at least without an index) on multiple mutable/regular columns at a consistency level that requires coordinator resolution.

We can keep this Jira to track an actual fix, which honestly might not be that bad, given how much of the dirty work CASSANDRA-19018 has already done, in terms of testing infrastructure and all the fixes to RFP.

If that all makes sense, feel free to throw up that Jira, or let me know if you'd like me to, and I can.

> Queries with multi-column replica-side filtering can miss rows
> --------------------------------------------------------------
>
>                 Key: CASSANDRA-19007
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-19007
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Consistency/Coordination
>            Reporter: Andres de la Peña
>            Assignee: Caleb Rackliffe
>            Priority: Normal
>             Fix For: 5.0.x, 5.x
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> {{SELECT}} queries with multi-column replica-side filtering can miss rows if
> the filtered columns are spread across out-of-sync replicas.
> This dtest reproduces the issue:
> {code:java}
> @Test
> public void testMultiColumnReplicaSideFiltering() throws IOException
> {
>     try (Cluster cluster = init(Cluster.build().withNodes(2).start()))
>     {
>         cluster.schemaChange(withKeyspace("CREATE TABLE %s.t (k int PRIMARY KEY, a int, b int)"));
>
>         // insert a split row
>         cluster.get(1).executeInternal(withKeyspace("INSERT INTO %s.t(k, a) VALUES (0, 1)"));
>         cluster.get(2).executeInternal(withKeyspace("INSERT INTO %s.t(k, b) VALUES (0, 2)"));
>
>         String select = withKeyspace("SELECT * FROM %s.t WHERE a = 1 AND b = 2 ALLOW FILTERING");
>         Object[][] initialRows = cluster.coordinator(1).execute(select, ALL);
>         assertRows(initialRows, row(0, 1, 2)); // not found!!
>     }
> }
> {code}
> This edge case affects queries using {{ALLOW FILTERING}} or any index
> implementation.
> It affects all branches since multi-column replica-side filtering queries
> were introduced, long before 3.0.
> The protection mechanism added by CASSANDRA-8272/8273 won't deal with this
> case, since it only solves single-column conflicts where stale rows could
> resurrect. This bug however doesn't resurrect data, it can only miss rows
> while the replicas are out-of-sync.
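
For readers who want to see the failure mode without a cluster, here is a minimal sketch in plain Java of why the split row in the dtest is missed. It is a toy model, not Cassandra code: the class and method names (SplitRowFilteringSketch, matches, merge, filterThenMerge, mergeThenFilter) are hypothetical, rows are modeled as simple maps, and the merge is a plain union of columns that ignores timestamps and tombstones. It assumes each replica applies the full predicate to its own partial row before the coordinator reconciles anything.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Illustrative only: a toy model of the split-row case, not Cassandra internals.
final class SplitRowFilteringSketch
{
    // The predicate from the dtest: WHERE a = 1 AND b = 2.
    // A column that was never written on a replica is simply absent (null).
    static boolean matches(Map<String, Integer> row)
    {
        return Objects.equals(row.get("a"), 1) && Objects.equals(row.get("b"), 2);
    }

    // Simplified reconciliation: a union of the columns each replica holds.
    // Assumption: no conflicting writes, so timestamps and tombstones are omitted.
    static Map<String, Integer> merge(List<Map<String, Integer>> replicaRows)
    {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> row : replicaRows)
            merged.putAll(row);
        return merged;
    }

    // Replica-side filtering: each replica filters its own partial row,
    // and only the survivors reach the coordinator to be merged.
    static List<Map<String, Integer>> filterThenMerge(List<Map<String, Integer>> replicaRows)
    {
        List<Map<String, Integer>> survivors = new ArrayList<>();
        for (Map<String, Integer> row : replicaRows)
            if (matches(row))
                survivors.add(row);

        List<Map<String, Integer>> result = new ArrayList<>();
        if (!survivors.isEmpty())
            result.add(merge(survivors));
        return result;
    }

    // Coordinator-side filtering: merge the partial rows first, then filter.
    static List<Map<String, Integer>> mergeThenFilter(List<Map<String, Integer>> replicaRows)
    {
        List<Map<String, Integer>> result = new ArrayList<>();
        if (replicaRows.isEmpty())
            return result;

        Map<String, Integer> merged = merge(replicaRows);
        if (matches(merged))
            result.add(merged);
        return result;
    }

    public static void main(String[] args)
    {
        // Same split row as the dtest: node 1 only has a, node 2 only has b.
        List<Map<String, Integer>> replicaRows = List.of(Map.of("k", 0, "a", 1),
                                                         Map.of("k", 0, "b", 2));

        System.out.println("filter on replicas, then merge: " + filterThenMerge(replicaRows).size() + " row(s)");
        System.out.println("merge, then filter: " + mergeThenFilter(replicaRows).size() + " row(s)");
    }
}
```

In this model neither partial row satisfies both conditions on its own, so filtering before the merge returns nothing, while filtering the merged row finds (0, 1, 2). That is the same asymmetry the dtest exposes at consistency level ALL.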