[
https://issues.apache.org/jira/browse/CASSANDRA-8272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16007801#comment-16007801
]
Sylvain Lebresne commented on CASSANDRA-8272:
---------------------------------------------
bq. I disagree here: if filtering is applied on top of index results, you'll
still get wrong results
That's possible, but not at all guaranteed, since the index and the filtering
apply to different columns. But that's almost beside the point: my point is
that fixing only the indexing will still avoid bugs for some people (at the
very least those who don't use filtering on top of indexing at all), so if
we can't agree on how to fix the filtering, I don't think we should hold up
the indexing fix.
Mostly, though, I just want us to have the _discussion_ around filtering in
CASSANDRA-8273 to avoid mixing things up. If we can quickly agree there on
moving filtering coordinator-side, then I'm totally fine doing that and the
indexing fix in a single patch if we prefer.
bq. what about fixing filtering (that is, moving to coordinator-side filtering)
only when indexes are present?
Well, that already gets into the territory of whether we're OK with moving
filtering coordinator-side. In fact, I don't think that whether filtering is
applied on top of indexing changes that discussion in any way. Again though,
I'm not at all against fixing both issues; I just prefer discussing the two
different (though related) problems separately.
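To make the replica-side versus coordinator-side distinction concrete, here is a
minimal Python sketch of the scenario from the ticket description. It is a toy
simulation, not Cassandra code: the names {{Row}}, {{reconcile}},
{{replica_side_read}} and {{coordinator_side_read}} are hypothetical, and it
assumes, purely for illustration, that under coordinator-side filtering each
contacted replica hands back its current version of the candidate rows
unfiltered (a real implementation would need something far more selective).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Row:
    k: int
    v: str
    ts: int  # write timestamp; the highest timestamp wins on reconciliation


# State after the UPDATE was acknowledged by A and B but not yet applied on C.
REPLICAS: Dict[str, Dict[int, Row]] = {
    "A": {0: Row(0, "bar", ts=2)},
    "B": {0: Row(0, "bar", ts=2)},
    "C": {0: Row(0, "foo", ts=1)},  # stale replica
}


def reconcile(rows: List[Row]) -> List[Row]:
    """Keep, for each key, the version with the highest timestamp."""
    latest: Dict[int, Row] = {}
    for row in rows:
        if row.k not in latest or row.ts > latest[row.k].ts:
            latest[row.k] = row
    return sorted(latest.values(), key=lambda r: r.k)


def replica_side_read(contacted: List[str], value: str) -> List[Row]:
    """Each replica filters locally; the coordinator only merges what it receives."""
    rows = [r for name in contacted
            for r in REPLICAS[name].values() if r.v == value]
    return reconcile(rows)


def coordinator_side_read(contacted: List[str], value: str) -> List[Row]:
    """Replicas return unfiltered versions (assumption); the coordinator
    reconciles first and applies the predicate afterwards."""
    rows = [r for name in contacted for r in REPLICAS[name].values()]
    return [r for r in reconcile(rows) if r.v == value]
```

For a QUORUM read that contacts A and C, {{replica_side_read(["A", "C"], "foo")}}
returns the stale row from C, because A filters its newer version out and so
has nothing to contribute. {{coordinator_side_read(["A", "C"], "foo")}} returns
nothing, because the newer version from A wins reconciliation before the
predicate is applied.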
bq. But we can still provide some API (i.e. the {{isSatisfiedBy()}} you
mentioned) they can leverage.
If you're making a general point, then sure. Otherwise, I'm not sure what else
you have in mind (and as I said I don't see what more we can do) so feel free
to share.
bq. Mmmhhhh ... clunky. And error prone as the 3.X code would be probably
untestable. Couldn't the replica detect the coordinator version and return
results accordingly?
We can do anything, but everything version-related is currently wired to the
messaging protocol version, which can't currently change in minor versions. So
we'd have to rely on the version exchanged through gossip in a way we never
have before, with the associated risks (typically, potential races between when
we actually get that version and when we use it). Plus, it would mean quite a
bit of (fairly ugly) changes to pass the version to where we need it. All that
in a minor release. I doubt it's a good idea in practice in this context.
On the flip side, we do have quite a bit of prior experience adding things to
minor releases to fix a future major upgrade. I don't disagree it's clunky, mind
you, but better the devil you know...
I don't see why it would be untestable though: we can test that the added
filtering doesn't break anything in 3.x, and we can totally test upgrades.
bq. for index using custom indexes: we'd need to have them implement the
{{CustomExpression#isSatisfiedBy}} method
I was a bit too quick here; it's actually not that simple, because
{{CustomExpression}} instances are created directly from the parser and don't
depend on whichever index uses them, so we can't have indexes override/implement
the method. That said, we do know which index an expression is used with when
we create it, so we could change things a bit so that indexes provide us with
their own concrete implementation of {{CustomExpression}}. It's just a tiny bit
more involved than I made it sound.
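As a rough illustration of that last idea, the sketch below shows one possible
shape for "the index supplies the concrete expression". It is written in Python
for brevity and is not the Cassandra API: {{Index.custom_expression_for}},
{{PrefixIndex}}, {{PrefixExpression}} and {{build_custom_expression}} are all
hypothetical names, and the prefix-matching semantics are invented only so the
example has something to evaluate.

```python
from abc import ABC, abstractmethod
from typing import Dict


class CustomExpression(ABC):
    """Expression whose matching logic is owned by the index it targets."""

    def __init__(self, value: str) -> None:
        self.value = value

    @abstractmethod
    def is_satisfied_by(self, row: Dict[str, str]) -> bool:
        """Return True if the row matches this expression."""


class Index(ABC):
    @abstractmethod
    def custom_expression_for(self, value: str) -> CustomExpression:
        """Hypothetical hook: build this index's own expression implementation."""


class PrefixExpression(CustomExpression):
    """Toy semantics: the indexed column must start with the given value."""

    def __init__(self, column: str, value: str) -> None:
        super().__init__(value)
        self.column = column

    def is_satisfied_by(self, row: Dict[str, str]) -> bool:
        return row.get(self.column, "").startswith(self.value)


class PrefixIndex(Index):
    def __init__(self, column: str) -> None:
        self.column = column

    def custom_expression_for(self, value: str) -> CustomExpression:
        return PrefixExpression(self.column, value)


def build_custom_expression(indexes: Dict[str, Index],
                            index_name: str,
                            value: str) -> CustomExpression:
    """The target index is known at creation time, so ask it for the concrete
    expression instead of instantiating a generic one."""
    return indexes[index_name].custom_expression_for(value)
```

With this shape, the code that creates the expression stays generic, while the
filtering code can call {{is_satisfied_by}} without knowing which index is
involved.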
> 2ndary indexes can return stale data
> ------------------------------------
>
> Key: CASSANDRA-8272
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8272
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
> Assignee: Andrés de la Peña
> Fix For: 3.0.x
>
>
> When replicas return 2ndary index results, it's possible for a single replica
> to return a stale result and that result will be sent back to the user,
> potentially failing the CL contract.
> For instance, consider 3 replicas A, B and C, and the following situation:
> {noformat}
> CREATE TABLE test (k int PRIMARY KEY, v text);
> CREATE INDEX ON test(v);
> INSERT INTO test(k, v) VALUES (0, 'foo');
> {noformat}
> with every replica up to date. Now, suppose that the following queries are
> done at {{QUORUM}}:
> {noformat}
> UPDATE test SET v = 'bar' WHERE k = 0;
> SELECT * FROM test WHERE v = 'foo';
> {noformat}
> then, if A and B acknowledge the update but C responds to the read before
> having applied the update, then the now stale result will be returned (since
> C will return it and A or B will return nothing).
> A potential solution would be that when we read a tombstone in the index (and
> provided we make the index inherit the gcGrace of its parent CF), instead of
> skipping that tombstone, we'd insert in the result a corresponding range
> tombstone.