[ https://issues.apache.org/jira/browse/CASSANDRA-8038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14179628#comment-14179628 ]
Takenori Sato commented on CASSANDRA-8038:
------------------------------------------
{quote}
Several places.
One of them is RowDataResolver#resolve() -> resolveSuperset() ->
filter.collateColumns() -> ...
{quote}
Thanks for pointing that out.
I have looked through RowDataResolver carefully.
It resolves a single version out of the multiple versions collected from replica
nodes. For filtering it uses an instance of IdentityQueryFilter, which is a
subclass of SliceQueryFilter, so it follows the same path as a slice query that
resolves a single version out of multiple versions from sstables, which is the
path fixed by this patch.
The problem I noticed is that obsolete data could be returned to a client in
Case 1-2 below. There the replicas holding the tombstone return nothing, so the
old value from Y is the only version the coordinator sees.
{code:title=scenario|borderStyle=solid}
RF=3
Node X
Node Y
Node Z
Coordinator Node C
Case 1: ignoreTombstonesForReadRepair=true for X, Y, Z
Case 1-1
X[col1:tombstone] ==> [] ==> C
Y[col1:tombstone] ==> [] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [] ===> Client
Case 1-2
X[col1:tombstone] ==> [] ==> C
Y[col1:old] ==> [col1:old] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [col1:old] ===> Client
C ==> [col1:old] ===> X, Z (stored locally, but tombstone wins)
Case 1-3
X[col1:new] ==> [new] ==> C
Y[col1:tombstone] ==> [] ==> C
Z[col1:new] ==> [new] ==> C
C ==> [new] ===> Client
C ==> [new] ===> Y
Case 2: ignoreTombstonesForReadRepair=true for X, Y, and false for Z
Case 2-1
X[col1:tombstone] ==> [] ==> C
Y[col1:tombstone] ==> [] ==> C
Z[col1:tombstone] ==> [tombstone] ==> C
C ==> [] ===> Client
Case 2-2
X[col1:tombstone] ==> [] ==> C
Y[col1:old] ==> [col1:old] ==> C
Z[col1:tombstone] ==> [tombstone] ==> C
C ==> [] ===> Client
C ==> [tombstone] ===> X, Y
Case 2-3
X[col1:new] ==> [new] ==> C
Y[col1:tombstone] ==> [] ==> C
Z[col1:new] ==> [new] ==> C
C ==> [new] ===> Client
C ==> [new] ===> Y
Case 3: ignoreTombstonesForReadRepair=true for X, Z, and false for Y
Case 3-1
X[col1:tombstone] ==> [] ==> C
Y[col1:tombstone] ==> [tombstone] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [] ===> Client
Case 3-2
X[col1:tombstone] ==> [] ==> C
Y[col1:old] ==> [col1:old] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [] ===> Client
C ==> [tombstone] ===> X, Y
Case 3-3
X[col1:new] ==> [new] ==> C
Y[col1:tombstone] ==> [tombstone] ==> C
Z[col1:new] ==> [new] ==> C
C ==> [new] ===> Client
C ==> [new] ===> Y
Case 4: ignoreTombstonesForReadRepair=true for Y, Z, and false for X
Case 4-1
X[col1:tombstone] ==> [tombstone] ==> C
Y[col1:tombstone] ==> [] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [] ===> Client
Case 4-2
X[col1:tombstone] ==> [tombstone] ==> C
Y[col1:old] ==> [col1:old] ==> C
Z[col1:tombstone] ==> [] ==> C
C ==> [] ===> Client
C ==> [tombstone] ===> Y, Z
Case 4-3
X[col1:new] ==> [new] ==> C
Y[col1:tombstone] ==> [tombstone] ==> C
Z[col1:new] ==> [new] ==> C
C ==> [new] ===> Client
C ==> [new] ===> Y
{code}
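To make Case 1-2 and Case 2-2 easier to check, here is a minimal sketch of the resolution step as a toy model. It is not the RowDataResolver code: the names Cell, replica_response, resolve and read are hypothetical, timestamps are made up, and the rule that a tombstone wins a timestamp tie is a simplifying assumption of the model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One version of col1 on a replica (hypothetical model, not Cassandra's class)."""
    value: Optional[str]  # None marks a tombstone
    timestamp: int

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def replica_response(cell: Optional[Cell], ignore_tombstones: bool) -> Optional[Cell]:
    """What a replica sends to the coordinator for col1."""
    if cell is None:
        return None
    if cell.is_tombstone and ignore_tombstones:
        # The tombstone is filtered out, so the replica looks as if it had no data.
        return None
    return cell


def resolve(responses: List[Optional[Cell]]) -> Optional[Cell]:
    """Coordinator side: the newest version it received wins.

    Assumption of this sketch: on a timestamp tie the tombstone wins.
    """
    present = [c for c in responses if c is not None]
    if not present:
        return None
    return max(present, key=lambda c: (c.timestamp, c.is_tombstone))


def read(cells: List[Optional[Cell]], ignore_flags: List[bool]) -> Optional[str]:
    """Value the client sees after the coordinator resolves all replica responses."""
    responses = [replica_response(c, f) for c, f in zip(cells, ignore_flags)]
    resolved = resolve(responses)
    if resolved is None or resolved.is_tombstone:
        return None
    return resolved.value


old = Cell("old", timestamp=1)
tombstone = Cell(None, timestamp=2)

# Case 1-2: X, Y, Z all ignore tombstones, so only Y's old value reaches C.
print(read([tombstone, old, tombstone], [True, True, True]))   # old

# Case 2-2: Z still reports its tombstone, so the tombstone wins at C.
print(read([tombstone, old, tombstone], [True, True, False]))  # None
```

In this model a single replica that still reports its tombstone is enough to suppress the obsolete value, which is the difference between Case 1-2 and Case 2-2.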
> A new config option to ignore column tombstones for RR or not
> -------------------------------------------------------------
>
> Key: CASSANDRA-8038
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8038
> Project: Cassandra
> Issue Type: New Feature
> Components: Core
> Reporter: Takenori Sato
> Fix For: 2.1.2
>
> Attachments: CASSANDRA-8038-v2.txt, CASSANDRA-8038-v3.txt,
> CASSANDRA-8038.txt
>
>
> CASSANDRA-6117 addressed the death of Cassandra by column tombstones; its
> fix was to raise an error when a read scans more tombstones than a
> threshold. I think that is an emergency measure rather than a fix.
> We have had this issue for a long time. So I wondered, in the first place,
> whether it is really necessary to collect non-gc-able tombstones, which can
> cause concurrent mode failures and eventually OOM.
> Actually, I was surprised that Cassandra takes them into
> consideration. I would rather raise the threshold and tell Cassandra to
> ignore tombstones in the digest calculation for RR, because repair runs
> regularly.
> I guess some people feel as I do, but not all. So what about adding a
> new configuration option that controls whether Cassandra ignores column
> tombstones for RR?