[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-26 Thread Michael Semb Wever (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17116570#comment-17116570
 ] 

Michael Semb Wever commented on CASSANDRA-15805:


[~slebresne], i've re-run #131 for you as 
https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/134/pipeline
 (taking away the stress and cdc stages, as they don't exist in 3.0) and #132 
as 
https://ci-cassandra.apache.org/blue/organizations/jenkins/Cassandra-devbranch/detail/Cassandra-devbranch/135/pipeline
 (just a retry).

(The CI failure in #132 is addressed in CASSANDRA-15826.)

> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduce even with the fix to {{LegacyLayout}} of this ticket. That said, 
> the additional code added to CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). That said, I "think" we'd still rather fix the source of 
> the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-19 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17111019#comment-17111019
 ] 

Sylvain Lebresne commented on CASSANDRA-15805:
--

Thanks for the review. I addressed the comments, squash-cleaned, 'merged' into 
3.11 and started CI (first try at https://ci-cassandra.apache.org, not sure how 
that will go).

||branch||CI||
| [3.0|https://github.com/pcmanus/cassandra/commits/C-15805-3.0] | 
[ci-cassandra 
#122|https://ci-cassandra.apache.org/job/Cassandra-devbranch/122/] |
| [3.11|https://github.com/pcmanus/cassandra/commits/C-15805-3.11] | 
[ci-cassandra 
#123|https://ci-cassandra.apache.org/job/Cassandra-devbranch/123/] |


> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduce even with the fix to {{LegacyLayout}} of this ticket. That said, 
> the additional code added to CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). That said, I "think" we'd still rather fix the source of 
> the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-18 Thread Aleksey Yeschenko (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17110514#comment-17110514
 ] 

Aleksey Yeschenko commented on CASSANDRA-15805:
---

Nit, feel free to ignore: in {{UnfilteredDeserializer}} on line 701, redundant 
{{this}}. Otherwise pretty straightforward and LGTM.

> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduce even with the fix to {{LegacyLayout}} of this ticket. That said, 
> the additional code added to CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). That said, I "think" we'd still rather fix the source of 
> the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-15 Thread Marcus Eriksson (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17108165#comment-17108165
 ] 

Marcus Eriksson commented on CASSANDRA-15805:
-

This LGTM, two minor ignoreable comments;

* lazily initialise the {{outOfOrderAtom}} queue in {{AtomIterator}} as it will 
be infrequently used (and maybe name it {{outOfOrderAtoms}} )
* consume the peeked atom ({{atoms.next()}}) before calling 
{{atoms.pushOutOfOrder}} in {{UnfilteredIterator#readRow}} to make it obvious 
we can't consume the atom we just pushed

> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduce even with the fix to {{LegacyLayout}} of this ticket. That said, 
> the additional code added to CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). That said, I "think" we'd still rather fix the source of 
> the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-13 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17106271#comment-17106271
 ] 

Sylvain Lebresne commented on CASSANDRA-15805:
--

There is probably a few variations for how to fix this, but what feels the more 
intuitive to me is to:
# slightly modify the ctor of {{LegacyRangeTombstone}} so the {{atom5}} of my 
previous comment use an inclusive start. Mostly because that make the rest of 
the logic a bit simpler imo (we can still assume that when we get a atom whose 
cluster strictly sort after the currently grouper row, we're done with that 
row).
# modify {{UnfilteredDeserializer.OldFormatDeserializer}} so that when, while 
grouping a row, it encounters a RT that covers it, is "splits" that into a row 
tombstone covering the row, and push back the handling of the rest of the 
tombstone to when the row is truly finished.

I've pushed a patch doing so for 3.0 below (thanks to [~markus] for triggering 
CI on this):
||branch||unit tests||dtests||jvm dtests||jvm upgrade dtest||
| https://github.com/pcmanus/cassandra/commits/C-15805-3.0 | 
[utests|https://circleci.com/gh/krummas/cassandra/3289] | 
[vnodes|https://circleci.com/gh/krummas/cassandra/3292] 
[no-vnodes|https://circleci.com/gh/krummas/cassandra/3293] | [jvm 
dtests|https://circleci.com/gh/krummas/cassandra/3290] | [upgrade 
dtests|https://circleci.com/gh/krummas/cassandra/3294] |

I'll note that the branch contains another small fix, that is a lot less 
important. Namely, [the return at the beginning of 
{{CellGrouper#addCollectionTombstone}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L1516]
 should be {{true}}, not {{false}} as it currently is. The test in question 
checks if a collection tombstone happens to not be selected by the query we're 
decoding data for. If it isn't included, we can ignore the tombstone, so we 
can/should return, but not with {{false}} as that imply the row is finished, 
which it probably isn't.

Now the reason I say that last problem is less important is that in practice, 
only thrift queries should run into this (since CQL queries queries all column 
effectively) so even if we duplicate the row here, it won't matter when the 
result is converted back to thrift (besides, having a collection tombstone 
implies that this is a thrift query on a CQL table, which is dodgy in the first 
place). Anyway, the code is still obviously wrong and the fix trivial, so 
included it it nonetheless (in a separate commit).


> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Assignee: Sylvain Lebresne
>Priority: Normal
> Fix For: 3.0.x, 3.11.x
>
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is _not_ a duplicate of CASSANDRA-15789 and this 
> reproduce even with the fix to {{LegacyLayout}} of this ticket. That said, 
> the additional code added to CASSANDRA-15789 to force merging duplicated rows 
> if they are produced _will_ end up fixing this as a consequence (assuming 
> there is no variation of this problem that leads to other visible issues than 
> duplicated rows). That said, I "think" we'd still rather fix the source of 
> the issue.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: 

[jira] [Commented] (CASSANDRA-15805) Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones interacts with collection tombstones

2020-05-12 Thread Sylvain Lebresne (Jira)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-15805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17105536#comment-17105536
 ] 

Sylvain Lebresne commented on CASSANDRA-15805:
--

To understand why this happens, let me write down the atoms that the example of 
the description generates on 2.X (using a simplified representation that I hope 
is clear enough):
{noformat}
atom1: RT([A:_, A:X:b:_])@1, // beginning of all 'A' rows to beginning of 
A:X's b column
atom2: Cell(A:X:)@2, // row marker for A:X
atom3: Cell(A:X:a=foo)@2,// value of a in A:X
atom4: RT([A:X:b:_, A:X:b:!])@3, // collection tombstone for b in A:X's
atom5: RT([A:X:b:!, A:!])@1, // remainder of covering RT, from end of b in 
A:X to end of all 'A' rows
atom6: Cell(A:X:c=bar)@2 // value of c in A:X
{noformat}
Those atoms are deserialized into {{LegacyCell}} and {{LegacyRangeTombstone}} 
on 3.X as:
{noformat}
atom1: RT(Bound(INCL_START_BOUND(A), 
collection=null)-Bound(EXCL_END_BOUND(A:B), collection=null), deletedAt=1, 
localDeletion=1589204864)
atom2: LegacyCell(REGULAR, name=Cellname(clustering=A:X, column=null, 
collElt=null), v=, ts=2, ldt=2147483647, ttl=0)
atom3: LegacyCell(REGULAR, name=Cellname(clustering=A:X, column=a, 
collElt=null), v=foo, ts=2, ldt=2147483647, ttl=0)
atom4: RT(Bound(INCL_START_BOUND(A:X), collection=b)-Bound(INCL_END_BOUND(A:X), 
collection=b), deletedAt=3, localDeletion=1589204864)
atom5: RT(Bound(EXCL_START_BOUND(A:X), 
collection=null)-Bound(INCL_END_BOUND(A), collection=null), deletedAt=1, 
localDeletion=1589204864)
atom6: LegacyCell(REGULAR, name=Cellname(clustering=A:X, column=c, 
collElt=null), v=bar, ts=2, ldt=2147483647, ttl=0)
{noformat}

I'll point out that those are a direct translation of the 2.X atoms except for 
{{atom1}} and {{atom5}} that are slightly different:
* instead of {{atom1}} stopping at the beginning of the row {{b}} column, it 
extends to the end of the row.
* and instead of {{atom5}} staring after that {{b}} column, it starts after the 
row. Do note however that the order of atoms is still the one above, so that 
atom is effectively out-of-order.

The reason for those differences is the logic [at the beginning of 
{{LegacyLayout.RangeTombstone}}|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L1883],
 whose comment is trying to explain, but is basically due to the legacy layer 
having to map all 2.X RTs into either a 3.X range tombstone (so one over 
multiple rows), a row tombstone or a collection one.

Anyway, as mentioned above, the problem is that {{atom5}} is out of order.  
What currently happens is that when {{atom5}} is encountered by 
{{UnfilteredDeserialized.OldFormatDeserializer}}, it will be passed to the 
{{CellGrouper}} currently grouping the row, and will end up in the 
[{{CellGrouper#addGenericTombstone}} 
method|https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/db/LegacyLayout.java#L1544].
  But, because that atom starts strictly after the row being grouped, the 
method returns {{false}} and the row is generated a first time. Later, we get 
{{atom6}} which restarts the row with the value of column {{c}}, after which it 
is generated a second time.


> Potential duplicate rows on 2.X->3.X upgrade when multi-rows range tombstones 
> interacts with collection tombstones
> --
>
> Key: CASSANDRA-15805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15805
> Project: Cassandra
>  Issue Type: Bug
>  Components: Consistency/Coordination, Local/SSTable
>Reporter: Sylvain Lebresne
>Priority: Normal
>
> The legacy reading code ({{LegacyLayout}} and 
> {{UnfilteredDeserializer.OldFormatDeserializer}}) does not handle correctly 
> the case where a range tombstone covering multiple rows interacts with a 
> collection tombstone.
> A simple example of this problem is if one runs on 2.X:
> {noformat}
> CREATE TABLE t (
>   k int,
>   c1 text,
>   c2 text,
>   a text,
>   b set,
>   c text,
>   PRIMARY KEY((k), c1, c2)
> );
> // Delete all rows where c1 is 'A'
> DELETE FROM t USING TIMESTAMP 1 WHERE k = 0 AND c1 = 'A';
> // Inserts a row covered by that previous range tombstone
> INSERT INTO t(k, c1, c2, a, b, c) VALUES (0, 'A', 'X', 'foo', {'whatever'}, 
> 'bar') USING TIMESTAMP 2;
> // Delete the collection of that previously inserted row
> DELETE b FROM t USING TIMESTAMP 3 WHERE k = 0 AND c1 = 'A' and c2 = 'X';
> {noformat}
> If the following is ran on 2.X (with everything either flushed in the same 
> table or compacted together), then this will result in the inserted row being 
> duplicated (one part containing the {{a}} column, the other the {{c}} one).
> I will note that this is