[
https://issues.apache.org/jira/browse/CASSANDRA-15778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108618#comment-17108618
]
David Capwell commented on CASSANDRA-15778:
-------------------------------------------
Alex and I spoke offline, dumping here for context.
There are a few cases we feel are important to worry about, and some we don't:
Cases we feel are important to worry about
1) upgrade from 2.1 to 3.0 *or* 3.11
2) CQL Table with values written to the previous blob type (hidden type only
exposed via Thrift)
3) Thrift tables with arbitrary types
Cases we don't feel are important to worry about
1) upgrade from 3.0 to 3.0 or 3.11
Given this, we came up with a proposal that should address the above concerns.
1) while upgrading from 2.1 to 3.0 or 3.11, legacy schema migration should
default to BytesType, but allow an override to switch to EmptyType; BytesType
will cause the column to no longer be hidden [1][2].
2) if the override is set to EmptyType and data is rejected during validation,
then DROP COMPACT STORAGE will be needed and should add an option to expose
this column. The hidden column will then be exposed, allowing ALTER statements
to change its type or drop it completely.
3) Attempt to add a more meaningful exception which tells users about this fix.
Right now the exception message is "EmptyType only accept empty values", which
is not actionable and can cause confusion.
4) For clusters already upgraded to 3.0 or 3.11, the only impact would be the
option added to DROP COMPACT STORAGE controlling whether the column is
exposed [3]; we have not worked out the default behavior of this.
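As a rough illustration of the difference between the two defaults in (1) and the validation failure in (3), here is a minimal Java sketch. These are hypothetical stand-in methods, not Cassandra's actual AbstractType implementations; only the quoted error message is taken from the real exception.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: EmptyType rejects any non-empty value, while BytesType
// accepts arbitrary bytes. Stand-ins only, not Cassandra's real classes.
public class TypeValidationSketch {
    static void validateEmptyType(ByteBuffer value) {
        if (value.remaining() > 0)
            // the unhelpful message quoted in (3)
            throw new IllegalArgumentException("EmptyType only accept empty values");
    }

    static void validateBytesType(ByteBuffer value) {
        // BytesType: every byte sequence is valid, so there is nothing to reject
    }

    public static void main(String[] args) {
        ByteBuffer thriftWritten = ByteBuffer.wrap(new byte[]{0x61, 0x62}); // value written via Thrift
        validateBytesType(thriftWritten); // accepted under the BytesType default
        try {
            validateEmptyType(thriftWritten); // rejected under the EmptyType override
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under the BytesType default the same bytes simply become visible as a blob-like column instead of failing validation, which is why it is the safer migration default.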
[1] - This is actually an open disagreement. Alex's point is that this issue
affects a small user base, while defaulting to BytesType would expose the field
to all users upgrading. My point is summarized in [2].
[2] - The reason to default to BytesType is the rolling-upgrade case. Since
data may be present, a default of EmptyType is unsafe, as it would make the
impacted token range unavailable to users. Schema migrations are not allowed in
mixed mode, so the only way to resolve this would be to complete the upgrade,
which may cause a larger outage for users. If the user is aware of this
behavior and knows it is safe, they may choose to switch to the EmptyType case.
[3] - There is an assumption that all clusters are on 3.0.20+; the reason for
this assumption is that validation checks were added in 3.0.20 (see
CASSANDRA-15373) and protect against corrupting the SSTable in the upgrade or
compaction case. If the cluster is on 3.0.19 or earlier, corruption would
already have happened, as EmptyType was writing the bytes but not reading them
back (see CASSANDRA-15790). At the moment we are not aware of a generic way to
repair this data: the bytes are written without a length, so we don't know what
they were and cannot account for them generically.
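The footnote's point about length-less bytes can be illustrated with a small, self-contained sketch using plain java.io streams (not Cassandra's actual serializers). Once a value is written with no length prefix, a reader that believes the column is empty has no way to know how many bytes to skip, so its cursor lands inside the value and every subsequent field is misparsed:

```java
import java.io.*;

// Hypothetical illustration: a value serialized WITHOUT a length prefix cannot
// be re-delimited later. A reader that expects the value to be empty reads the
// value's own bytes as whatever field comes next.
public class LengthlessWriteSketch {
    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(out);
        dos.write(new byte[]{42, 43}); // value bytes, no length recorded (the broken write path)
        dos.writeShort(121);           // the next legitimate field

        DataInputStream in = new DataInputStream(
                new ByteArrayInputStream(out.toByteArray()));
        // Reader assumes the value is empty (reads 0 bytes), so readShort()
        // consumes the value's bytes instead: 0x2A2B = 10795, not 121.
        // This kind of misparse is loosely analogous to the
        // ArrayIndexOutOfBoundsException seen in the report below.
        int misread = in.readShort();
        System.out.println(misread); // prints 10795
    }
}
```

Because nothing in the file records how long the stray value was, there is no generic way to resynchronize the reader, which is why the data cannot be repaired after the fact.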
> CorruptSSTableException after a 2.1 SSTable is upgraded to 3.0, failing reads
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-15778
> URL: https://issues.apache.org/jira/browse/CASSANDRA-15778
> Project: Cassandra
> Issue Type: Bug
> Components: Local/Compaction, Local/SSTable
> Reporter: Sumanth Pasupuleti
> Assignee: Alex Petrov
> Priority: Normal
> Fix For: 3.0.x
>
>
> Below is the exception with stack trace. This issue is consistently
> reproducible.
> {code:java}
> ERROR [SharedPool-Worker-1] 2020-05-01 14:57:57,661 AbstractLocalAwareExecutorService.java:169 - Uncaught exception on thread Thread[SharedPool-Worker-1,5,main]
> org.apache.cassandra.io.sstable.CorruptSSTableException: Corrupted: /mnt/data/cassandra/data/<ks>/<cf-fda511301fb311e7bd79fd24f2fcfb0d/md-10151-big-Data.db
>     at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:349) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.columniterator.AbstractSSTableIterator.hasNext(AbstractSSTableIterator.java:220) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.columniterator.SSTableIterator.hasNext(SSTableIterator.java:33) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:95) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:129) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:131) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:87) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.rows.UnfilteredRowIteratorSerializer.serialize(UnfilteredRowIteratorSerializer.java:77) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.partitions.UnfilteredPartitionIterators$Serializer.serialize(UnfilteredPartitionIterators.java:294) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadResponse$LocalDataResponse.build(ReadResponse.java:187) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:180) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadResponse$LocalDataResponse.<init>(ReadResponse.java:176) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadResponse.createDataResponse(ReadResponse.java:76) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadCommand.createResponse(ReadCommand.java:341) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.ReadCommandVerbHandler.doVerb(ReadCommandVerbHandler.java:47) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:67) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_231]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:165) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$LocalSessionFutureTask.run(AbstractLocalAwareExecutorService.java:137) [nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109) [nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_231]
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 121
>     at org.apache.cassandra.db.ClusteringPrefix$Deserializer.prepare(ClusteringPrefix.java:425) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.prepareNext(UnfilteredDeserializer.java:170) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.UnfilteredDeserializer$CurrentDeserializer.hasNext(UnfilteredDeserializer.java:151) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.computeNext(SSTableIterator.java:140) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.columniterator.SSTableIterator$ForwardReader.hasNextInternal(SSTableIterator.java:172) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     at org.apache.cassandra.db.columniterator.AbstractSSTableIterator$Reader.hasNext(AbstractSSTableIterator.java:336) ~[nf-cassandra-3.0.19.8.jar:3.0.19.8]
>     ... 27 common frames omitted
> {code}
> Column family definition
> {code:java}
> CREATE TABLE <keyspace>."<cf>" (
>     key text,
>     value text,
>     PRIMARY KEY (key, value)
> ) WITH COMPACT STORAGE
>     AND CLUSTERING ORDER BY (value ASC)
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'enabled': 'false'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.1
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)