Spiros Ioannou created CASSANDRA-14672:
------------------------------------------
Summary: After deleting data in 3.11.3, reads fail: "open marker
and close marker have different deletion times"
Key: CASSANDRA-14672
URL: https://issues.apache.org/jira/browse/CASSANDRA-14672
Project: Cassandra
Issue Type: Bug
Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, leveled
compaction, time-series data
Reporter: Spiros Ioannou
We routinely perform deletions such as the one described below.
We were on 3.11.0 and then upgraded to 3.11.3. After upgrading, we ran the following
deletion query:
{code:java}
DELETE FROM measurement_events_dbl
WHERE measurement_source_id IN (
    9df798a2-6337-11e8-b52b-42010afa015a, 9df7717e-6337-11e8-b52b-42010afa015a,
    a08b8042-6337-11e8-b52b-42010afa015a, a08e52cc-6337-11e8-b52b-42010afa015a,
    a08e6654-6337-11e8-b52b-42010afa015a, a08e6104-6337-11e8-b52b-42010afa015a,
    a08e6c76-6337-11e8-b52b-42010afa015a, a08e5a9c-6337-11e8-b52b-42010afa015a,
    a08bcc50-6337-11e8-b52b-42010afa015a)
  AND year IN (2018)
  AND measurement_time >= '2018-07-19 04:00:00'
{code}
Immediately after that, trying to read the last value produces an error:
{code:java}
select * FROM measurement_events_dbl
WHERE measurement_source_id = a08b8042-6337-11e8-b52b-42010afa015a
  AND year IN (2018)
order by measurement_time desc limit 1;

ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read]
message="Operation failed - received 0 responses and 2 failures"
info={'failures': 2, 'received_responses': 0, 'required_responses': 1,
'consistency': 'ONE'}
{code}
The server log shows the following exception:
{noformat}
WARN [ReadStage-4] 2018-08-29 06:59:53,505
AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread
Thread[ReadStage-4,5,main]: {}
java.lang.RuntimeException: java.lang.IllegalStateException:
UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal
RT bounds sequence: open marker and close marker have different deletion times
at
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601)
~[apache-cassandra-3.11.3.jar:3.11.3]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
~[na:1.8.0_181]
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
[apache-cassandra-3.11.3.jar:3.11.3]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for
pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open
marker and close marker have different deletion times
at
org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.applyToMarker(RTBoundValidator.java:81)
~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:148)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:176)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76)
~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:352)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1889)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2597)
~[apache-cassandra-3.11.3.jar:3.11.3]
... 5 common frames omitted
Suppressed: java.lang.IllegalStateException: UnfilteredRowIterator for
pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence:
expected all RTs to be closed, but the last one is open
at
org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.onPartitionClose(RTBoundValidator.java:96)
~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.transform.BaseRows.runOnClose(BaseRows.java:91)
~[apache-cassandra-3.11.3.jar:3.11.3]
at org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:86)
~[apache-cassandra-3.11.3.jar:3.11.3]
at
org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:309)
~[apache-cassandra-3.11.3.jar:3.11.3]
... 12 common frames omitted
{noformat}
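To make the two messages in the trace easier to interpret, here is a minimal sketch of the invariant they describe: within a partition, every range tombstone (RT) open marker must be followed by a close marker carrying the same deletion time, and no RT may be left open at the end. This is a simplified model for illustration only, not Cassandra's RTBoundValidator; the Marker type and the validate_rt_bounds function are hypothetical names, and boundary markers (a close and an open in one) are left out. Only the two error message strings are taken from the trace above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Marker:
    """Hypothetical stand-in for a range tombstone bound marker."""
    kind: str           # "open" or "close"
    deletion_time: int  # deletion timestamp carried by the marker


def validate_rt_bounds(markers: Iterable[Marker]) -> None:
    """Raise ValueError if the marker sequence is not well formed.

    Simplified model: markers must alternate open/close, each close must
    carry the same deletion time as the open it matches, and the sequence
    must not end with an RT still open.
    """
    open_time: Optional[int] = None  # deletion time of the currently open RT
    for marker in markers:
        if marker.kind == "open":
            if open_time is not None:
                raise ValueError("unexpected open marker: previous RT is still open")
            open_time = marker.deletion_time
        elif marker.kind == "close":
            if open_time is None:
                raise ValueError("unexpected close marker: no RT is open")
            if marker.deletion_time != open_time:
                # First message in the trace above
                raise ValueError(
                    "open marker and close marker have different deletion times")
            open_time = None
        else:
            raise ValueError(f"unknown marker kind: {marker.kind!r}")
    if open_time is not None:
        # Message from the suppressed exception in the trace above
        raise ValueError("expected all RTs to be closed, but the last one is open")
```

In this model, a sequence such as open(t=10), close(t=11) fails with the first message, and a lone open(t=10) fails with the second, which matches the pair of exceptions we see for a single read.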
We have 9 nodes with ~2TB of data each, using leveled compaction; repairs run daily in sequence.
The table definition is:
{noformat}
CREATE TABLE pvpms_mevents.measurement_events_dbl (
measurement_source_id uuid,
year int,
measurement_time timestamp,
event_reception_time timestamp,
quality double,
value double,
PRIMARY KEY ((measurement_source_id, year), measurement_time)
) WITH CLUSTERING ORDER BY (measurement_time ASC)
AND bloom_filter_fp_chance = 0.1
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';{noformat}
We host these nodes on GCE. We recreated all of them from disk snapshots and
re-ran the DELETE; the error was reproduced immediately.
So far we have tried re-running repairs on all nodes and running nodetool
garbagecollect, with no success.
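For anyone checking whether other tables or nodes are affected, a small sketch like the following could be used to count occurrences of the validator message per table in a node's log. It is an illustration only: count_rt_errors and the regular expression are hypothetical helpers, and the pattern is based solely on the message text in the trace above. Note that a single failed read can log the message more than once (the trace above contains it three times), so the counts are message occurrences, not failed reads.

```python
import re
from collections import Counter
from typing import Iterable

# Pattern derived from the message text in the stack trace; \s+ tolerates
# the places where the message may be wrapped across lines.
RT_ERROR = re.compile(
    r"UnfilteredRowIterator for\s+(?P<table>[\w.]+)\s+has an illegal\s+"
    r"RT bounds sequence"
)


def count_rt_errors(lines: Iterable[str]) -> Counter:
    """Count 'illegal RT bounds sequence' messages per keyspace.table.

    Lines are joined with single spaces first so that a message split
    across several lines is still matched.
    """
    text = " ".join(line.strip() for line in lines)
    return Counter(match.group("table") for match in RT_ERROR.finditer(text))
```

It could be run on each node with something like `count_rt_errors(open(path))`, where `path` is that node's Cassandra system log; the log location depends on the installation and is not stated in this report.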
We are currently testing a downgrade to 3.11.2 in the hope that it will work; please
let us know if downgrading would cause issues.
This may be related to CASSANDRA-14515.