[ https://issues.apache.org/jira/browse/CASSANDRA-14672?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626030#comment-16626030 ]
Aleksey Yeschenko edited comment on CASSANDRA-14672 at 9/24/18 9:57 PM:
------------------------------------------------------------------------
3.0: [code|https://github.com/iamaleksey/cassandra/commits/14672-3.0],
[utests|https://circleci.com/gh/iamaleksey/cassandra/771],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/789]
3.11: [code|https://github.com/iamaleksey/cassandra/commits/14672-3.11],
[utests|https://circleci.com/gh/iamaleksey/cassandra/772],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/784]
4.0: [code|https://github.com/iamaleksey/cassandra/commits/14672-4.0],
[utests|https://circleci.com/gh/iamaleksey/cassandra/775],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/786]
was (Author: iamaleksey):
3.0: [code|https://github.com/iamaleksey/cassandra/commits/14672-3.0],
[utests|https://circleci.com/gh/iamaleksey/cassandra/771],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/770]
3.11: [code|https://github.com/iamaleksey/cassandra/commits/14672-3.11],
[utests|https://circleci.com/gh/iamaleksey/cassandra/772],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/773]
4.0: [code|https://github.com/iamaleksey/cassandra/commits/14672-4.0],
[utests|https://circleci.com/gh/iamaleksey/cassandra/775],
[dtests|https://circleci.com/gh/iamaleksey/cassandra/777]
> After deleting data in 3.11.3, reads fail with "open marker and close marker
> have different deletion times"
> -----------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-14672
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14672
> Project: Cassandra
> Issue Type: Bug
> Components: Local Write-Read Paths
> Environment: CentOS 7, GCE, 9 nodes, 4TB disk/~2TB full each, leveled
> compaction, time-series data
> Reporter: Spiros Ioannou
> Assignee: Aleksey Yeschenko
> Priority: Blocker
> Fix For: 3.0.x, 3.11.x, 4.0.x
>
>
> We were running 3.11.0 and upgraded to 3.11.3 last week. We routinely perform
> deletions such as the one described below. After upgrading, we ran the
> following deletion query:
>
> {code:sql}
> DELETE FROM measurement_events_dbl WHERE measurement_source_id IN (
> 9df798a2-6337-11e8-b52b-42010afa015a, 9df7717e-6337-11e8-b52b-42010afa015a,
> a08b8042-6337-11e8-b52b-42010afa015a, a08e52cc-6337-11e8-b52b-42010afa015a,
> a08e6654-6337-11e8-b52b-42010afa015a, a08e6104-6337-11e8-b52b-42010afa015a,
> a08e6c76-6337-11e8-b52b-42010afa015a, a08e5a9c-6337-11e8-b52b-42010afa015a,
> a08bcc50-6337-11e8-b52b-42010afa015a) AND year IN (2018) AND measurement_time
> >= '2018-07-19 04:00:00'{code}
>
> Immediately after that, trying to read the last value produces an error:
> {code:java}
> select * FROM measurement_events_dbl WHERE measurement_source_id =
> a08b8042-6337-11e8-b52b-42010afa015a AND year IN (2018) order by
> measurement_time desc limit 1;
> ReadFailure: Error from server: code=1300 [Replica(s) failed to execute read]
> message="Operation failed - received 0 responses and 2 failures"
> info={'failures': 2, 'received_responses': 0, 'required_responses': 1,
> 'consistency': 'ONE'}{code}
>
> And the following exception:
> {noformat}
> WARN [ReadStage-4] 2018-08-29 06:59:53,505
> AbstractLocalAwareExecutorService.java:167 - Uncaught exception on thread
> Thread[ReadStage-4,5,main]: {}
> java.lang.RuntimeException: java.lang.IllegalStateException:
> UnfilteredRowIterator for pvpms_mevents.measurement_events_dbl has an illegal
> RT bounds sequence: open marker and close marker have different deletion times
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2601)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> ~[na:1.8.0_181]
> at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:134)
> [apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
> [apache-cassandra-3.11.3.jar:3.11.3]
> at java.lang.Thread.run(Thread.java:748) [na:1.8.0_181]
> Caused by: java.lang.IllegalStateException: UnfilteredRowIterator for
> pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence: open
> marker and close marker have different deletion times
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.applyToMarker(RTBoundValidator.java:81)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:148)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:136)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:92)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:79)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:308)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:176)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:352)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1889)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:2597)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> ... 5 common frames omitted
> Suppressed: java.lang.IllegalStateException: UnfilteredRowIterator for
> pvpms_mevents.measurement_events_dbl has an illegal RT bounds sequence:
> expected all RTs to be closed, but the last one is open
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.ise(RTBoundValidator.java:103)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.transform.RTBoundValidator$RowsTransformation.onPartitionClose(RTBoundValidator.java:96)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at org.apache.cassandra.db.transform.BaseRows.runOnClose(BaseRows.java:91)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.transform.BaseIterator.close(BaseIterator.java:86)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> at
> org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:309)
> ~[apache-cassandra-3.11.3.jar:3.11.3]
> ... 12 common frames omitted
>
> {noformat}
>
> We have 9 nodes with ~2TB each, using leveled compaction; repairs run daily in sequence.
> Table definition is:
> {noformat}
> CREATE TABLE pvpms_mevents.measurement_events_dbl (
> measurement_source_id uuid,
> year int,
> measurement_time timestamp,
> event_reception_time timestamp,
> quality double,
> value double,
> PRIMARY KEY ((measurement_source_id, year), measurement_time)
> ) WITH CLUSTERING ORDER BY (measurement_time ASC)
> AND bloom_filter_fp_chance = 0.1
> AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
> AND comment = ''
> AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
> AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
> AND crc_check_chance = 1.0
> AND dclocal_read_repair_chance = 0.1
> AND default_time_to_live = 0
> AND gc_grace_seconds = 864000
> AND max_index_interval = 2048
> AND memtable_flush_period_in_ms = 0
> AND min_index_interval = 128
> AND read_repair_chance = 0.0
> AND speculative_retry = '99PERCENTILE';{noformat}
>
> We host the nodes on GCE. We recreated all of them from disk snapshots and
> re-ran the DELETE with all nodes up and no other queries running; the error
> was reproduced immediately.
>
> So far we have tried re-running repairs on all nodes and running nodetool
> garbagecollect, with no success.
> We have downgraded to 3.11.2 for now, with no issues so far.
> This may be related to CASSANDRA-14515
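
For readers unfamiliar with the check that raised the exception in the quoted trace: a range deletion such as the DELETE above (measurement_time >= ...) shows up in a row iterator as range tombstone (RT) markers, and the validator named in the trace rejects a sequence in which a close marker does not match its open marker. The following is a minimal Python model of that idea, not Cassandra's actual RTBoundValidator code. The Marker class, the validate function, and the first two error strings are hypothetical; only the last two messages are taken from the trace. Boundary markers (a close and an open combined) are left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Marker:
    """Hypothetical stand-in for a range tombstone marker."""
    kind: str           # "open" or "close"
    deletion_time: int  # timestamp attached to the deletion


def validate(markers: Iterable[Marker]) -> Optional[str]:
    """Return None if the marker sequence is well formed, else an error message."""
    open_time: Optional[int] = None  # deletion time of the currently open range, if any
    for m in markers:
        if m.kind == "open":
            if open_time is not None:
                # Hypothetical message: a new range opened before the previous one closed.
                return "unexpected open marker: previous range is still open"
            open_time = m.deletion_time
        elif m.kind == "close":
            if open_time is None:
                # Hypothetical message: a close marker with nothing to close.
                return "unexpected close marker: no range is open"
            if m.deletion_time != open_time:
                # Message as quoted in the trace.
                return "open marker and close marker have different deletion times"
            open_time = None
        else:
            raise ValueError(f"unknown marker kind: {m.kind!r}")
    if open_time is not None:
        # Message as quoted in the suppressed exception in the trace.
        return "expected all RTs to be closed, but the last one is open"
    return None


if __name__ == "__main__":
    print(validate([Marker("open", 10), Marker("close", 10)]))
    print(validate([Marker("open", 10), Marker("close", 20)]))
    print(validate([Marker("open", 10)]))
```

In this model, the two messages quoted in the trace correspond to the mismatched-deletion-time case and the unclosed-range case respectively. The sketch says nothing about why 3.11.3 produced such a sequence; that is what the linked branches address.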