[
https://issues.apache.org/jira/browse/CASSANDRA-8273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007964#comment-16007964
]
Sylvain Lebresne commented on CASSANDRA-8273:
---------------------------------------------
bq. Obviously, moving the filtering to the coordinator would remove that
problem, but doing so would, on top of not being trivial to implement, have a
serious performance impact since we can't know in advance how much data will be
filtered and we may have to redo queries to replicas multiple times.
That comment (from the description) is pretty old and isn't entirely accurate
anymore so I want to amend it and expand on it.
While it's obviously still true that moving filtering coordinator-side has
performance impacts, it's now kind of trivial to do post-CASSANDRA-8099.
Basically, I believe we just need to move the {{RowFilter#filter}} call that is
currently in {{ReadCommand#executeLocally()}} to
post-coordinator-reconciliation, typically to the
{{postReconciliationProcessing()}} method that {{PartitionRangeReadCommand}}
has and that we would simply generalize to all {{ReadCommand}} (that is, add it
to {{SinglePartitionReadCommand}} as well).
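For concreteness, here is a rough sketch of the shape of that change. It is
illustrative only: the {{RowFilter#filter}} overload used here and the exact
place {{postReconciliationProcessing()}} would be invoked from are assumptions
for the sketch, not the actual code.
{noformat}
// Sketch only, not the actual ReadCommand code. It assumes a RowFilter#filter
// overload taking a PartitionIterator, the table metadata and nowInSec, and
// that the coordinator (e.g. DataResolver) calls postReconciliationProcessing()
// on the merged result.
public abstract class ReadCommand
{
    protected abstract CFMetaData metadata();
    protected abstract RowFilter rowFilter();
    protected abstract int nowInSec();

    // Generalized from PartitionRangeReadCommand to all ReadCommand (i.e.
    // SinglePartitionReadCommand gets it too). The coordinator applies the
    // user's filtering expressions to the reconciled result, instead of each
    // replica filtering its own, possibly stale, local view in executeLocally().
    public PartitionIterator postReconciliationProcessing(PartitionIterator merged)
    {
        return rowFilter().filter(merged, metadata(), nowInSec());
    }
}
{noformat}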
In particular, while it's still true that we'll have to redo queries when
filtering makes us fall short on a first try, the "short read protection" from
{{DataResolver}} actually handles this for us reasonably nicely.
Of course, there are the performance concerns, which concretely come in 2
flavors:
# we'll transfer everything that gets filtered out from the replicas to the
coordinator, which we don't do today.
# as a consequence, and as mentioned above, we'll usually have to do multiple
coordinator<->replica round trips to get a given count of final rows, where a
single one is enough today.
I do want to note the following though:
* For CL.ONE, and as noted by Robert above, this is not really a big deal.
There is actually no impact if you use a token-aware client (see the driver
snippet after this list). If you don't, then we could theoretically push the
filtering to the replica in that specific case, but honestly, if you care about
performance, you should be using token-awareness, so I'm not convinced it's
even worth adding any complexity for this (at the very least not for a v1: we
don't currently ship the CL with queries to replicas, and while I'm sure we'll
want to change that for other reasons at some point, I don't think we should
bother here).
* For higher CLs, it's definitely a bigger impact, but here's the thing: if you
use a higher CL, that implies you actually care about and _rely on_ the CL
guarantees, so I think no amount of performance matters if we don't fulfill
those guarantees, and not fixing a known correctness issue because it impacts
performance is imo backwards.
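As a side note on the token-awareness point above, that is purely a client
configuration. A minimal example with the DataStax Java driver 3.x API (the
keyspace name is made up for the example):
{noformat}
// Hedged example using the DataStax Java driver 3.x. With token-aware routing
// the query goes straight to a replica, so at CL.ONE the coordinator is a
// replica and coordinator-side filtering costs no extra network hop.
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class TokenAwareClient
{
    public static void main(String[] args)
    {
        Cluster cluster = Cluster.builder()
                                 .addContactPoint("127.0.0.1")
                                 .withLoadBalancingPolicy(
                                     new TokenAwarePolicy(DCAwareRoundRobinPolicy.builder().build()))
                                 .build();
        try (Session session = cluster.connect())
        {
            // "ks" is a placeholder keyspace; the table matches the example in the description.
            session.execute("SELECT * FROM ks.test WHERE v1 = 'foo' AND v2 = 1 ALLOW FILTERING");
        }
        cluster.close();
    }
}
{noformat}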
I'll also note that while the 2nd flavor will certainly have an impact, the
short-read protection from {{DataResolver}} is actually not too stupid about
this and will "regulate" its 2nd query based on how much was filtered on the
1st one, to limit the impact somewhat. Not awesome, but better than nothing.
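To illustrate the kind of "regulation" meant here (this is just the idea, not
the actual {{DataResolver}} short-read-protection code; names and arithmetic
are made up): size the follow-up fetch from the survival rate observed on the
previous round.
{noformat}
// Illustration of the idea only; this is not the actual short-read-protection code.
static int nextFetchLimit(int userLimit, int alreadyCounted, int fetchedLastRound, int survivedLastRound)
{
    int stillNeeded = userLimit - alreadyCounted;
    if (stillNeeded <= 0 || fetchedLastRound == 0)
        return 0; // nothing left to ask for

    // If only 1 row in 4 survived filtering last round, ask for roughly 4x
    // what is still needed (plus a small margin) rather than re-fetching the
    // bare remainder and likely falling short again.
    double survivalRate = Math.max(1, survivedLastRound) / (double) fetchedLastRound;
    return (int) Math.ceil(stillNeeded / survivalRate) + 1;
}
{noformat}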
Anyway, I'm personally in favor of fixing this by moving filtering
coordinator-side: while that has a performance impact, we shouldn't be fast at
the expense of correctness. Besides, I have no clue how to fix this
replica-side, and no one has offered a proper option for that in ~3 years.
Let's make things correct now, and _then_ think about how to optimize.
I also want to point out, for context, that {{ALLOW FILTERING}} is something we
strongly advertise as not-a-great-idea for anything performance-sensitive in
the first place, so that's imo all the more reason not to agonize over
performance too much and to favor correctness first and foremost.
> Allow filtering queries can return stale data
> ---------------------------------------------
>
> Key: CASSANDRA-8273
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8273
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
>
> Data filtering is done replica side. That means that a single replica with
> stale data may make the whole query return that stale data.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v1 text, v2 int);
> CREATE INDEX ON test(v1);
> INSERT INTO test(k, v1, v2) VALUES (0, 'foo', 1);
> {noformat}
> with every replica up to date. Now, suppose that the following queries are
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v2 = 2 WHERE k = 0;
> SELECT * FROM test WHERE v1 = 'foo' AND v2 = 1;
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before
> having applied the update, the now-stale result will be returned. Let's
> note that this is a problem related to filtering, not to 2ndary indexes.
> This issue shares similarities with CASSANDRA-8272 but contrary to that
> issue, I'm not sure how to fix it. Obviously, moving the filtering to the
> coordinator would remove that problem, but doing so would, on top of not
> being trivial to implement, have a serious performance impact since we can't
> know in advance how much data will be filtered and we may have to redo
> queries to replicas multiple times.