[ https://issues.apache.org/jira/browse/CASSANDRA-11416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15245353#comment-15245353 ]
Sylvain Lebresne commented on CASSANDRA-11416:
----------------------------------------------
bq. Aka assume they are there because someone dropped them in a previous life.
We could, though it's always slightly scary to throw data away based on an
assumption we can't be 100% sure of. That said, if we log a clear warning,
that's probably good enough in practice: if the data does not come from a
previously dropped column, it means you screwed up re-creating your schema, and
the warning should be enough to have you fix that and re-load the backup.
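To make the "warn and ignore" idea concrete, here is a minimal sketch in Python. It is not Cassandra code: the function name `drop_unknown_columns`, the dict-based row, and the logger name are all assumptions made for illustration only.

```python
import logging

# Hypothetical logger name; not something Cassandra or sstableloader defines.
logger = logging.getLogger("sstable_loader_sketch")


def drop_unknown_columns(row, known_columns, table_name):
    """Return a copy of `row` restricted to the columns in `known_columns`.

    Columns missing from the schema are assumed to have been dropped in the
    past. Their data is discarded, and a warning names them so that a schema
    mistake can be noticed and the backup re-loaded.
    """
    unknown = sorted(set(row) - set(known_columns))
    if unknown:
        logger.warning(
            "Table %s: ignoring data for unknown column(s) %s, assuming they "
            "were dropped. If they were not, fix the schema and re-load the "
            "backup.",
            table_name,
            ", ".join(unknown),
        )
    # Keep only the cells whose column is part of the current schema.
    return {name: value for name, value in row.items() if name in known_columns}
```

With the schema from the issue description, a row that still carries `c4` would come back without it, and the warning would name `c4` so the operator can check whether it really was dropped.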
Overall, I feel that the real fix for this is that backups should come with
their schema, and by this I mean in the "internal" format that includes column
drop information, and restoring a backup should make sure the nodes are up to
date on that information.
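As a rough illustration of a backup that carries its schema together with the column drop information, consider the sketch below. The types `DroppedColumn` and `TableBackupSchema`, their fields, and the helper `missing_drop_records` are hypothetical and do not describe any existing Cassandra format or API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DroppedColumn:
    """Record of one dropped column (hypothetical layout)."""
    name: str
    cql_type: str
    dropped_time_micros: int  # when the column was dropped


@dataclass
class TableBackupSchema:
    """Schema saved alongside a snapshot, including drop records."""
    keyspace: str
    table: str
    columns: dict  # live column name -> CQL type
    dropped: list = field(default_factory=list)  # list of DroppedColumn


def missing_drop_records(backup, target_dropped_names):
    """Return the drop records in the backup that the target cluster lacks.

    A restore procedure would need to apply these before loading the data,
    so that the nodes know which columns in the sstables were dropped.
    """
    return [d for d in backup.dropped if d.name not in target_dropped_names]
```

In the scenario from the issue, the backup of `test_drop` would carry a drop record for `c4`, and a restore into a fresh cluster would see that record as missing and apply it before streaming the data.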
I'll also note for the record that the previous behavior (pre-3.0) wasn't
perfect either: in that case we were basically keeping the previously-dropped
column data, but we no longer had the drop information, so that data would
sit there forever. In that sense, I would argue that warning about the data but
otherwise ignoring it is better behavior overall.
> No longer able to load backups into new cluster if there was a dropped column
> -----------------------------------------------------------------------------
>
> Key: CASSANDRA-11416
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11416
> Project: Cassandra
> Issue Type: Bug
> Reporter: Jeremiah Jordan
> Assignee: Aleksey Yeschenko
> Fix For: 3.0.x, 3.x
>
>
> The following change to the sstableloader test works in 2.1/2.2 but fails in
> 3.0+
> https://github.com/JeremiahDJordan/cassandra-dtest/commit/7dc66efb8d24239f0a488ec5a613240531aeb7db
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text, c4 text)
> ...insert data...
> ALTER TABLE test_drop DROP c4
> ...insert more data...
> {code}
> Make a snapshot and save the DESCRIBE output to back up table test_drop.
> Decide to restore the snapshot to a new cluster. First restore the schema
> from the DESCRIBE output (column c4 isn't there).
> {code}
> CREATE TABLE test_drop (key text PRIMARY KEY, c1 text, c2 text, c3 text)
> {code}
> sstableload the snapshot data.
> Works in 2.1/2.2. Fails in 3.0+ with:
> {code}
> java.lang.RuntimeException: Unknown column c4 during deserialization
> java.lang.RuntimeException: Failed to list files in /var/folders/t4/rlc2b6450qbg92762l9l4mt80000gn/T/dtest-3eKv_g/test/node1/data1_copy/ks/drop_one-bcef5280f11b11e5825a43f0253f18b5
> at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:53)
> at org.apache.cassandra.db.lifecycle.LifecycleTransaction.getFiles(LifecycleTransaction.java:544)
> at org.apache.cassandra.io.sstable.SSTableLoader.openSSTables(SSTableLoader.java:76)
> at org.apache.cassandra.io.sstable.SSTableLoader.stream(SSTableLoader.java:165)
> at org.apache.cassandra.tools.BulkLoader.main(BulkLoader.java:104)
> Caused by: java.lang.RuntimeException: Unknown column c4 during deserialization
> at org.apache.cassandra.db.SerializationHeader$Component.toHeader(SerializationHeader.java:331)
> at org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:430)
> at org.apache.cassandra.io.sstable.SSTableLoader.lambda$openSSTables$193(SSTableLoader.java:121)
> at org.apache.cassandra.db.lifecycle.LogAwareFileLister.lambda$innerList$184(LogAwareFileLister.java:75)
> at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:174)
> at java.util.TreeMap$EntrySpliterator.forEachRemaining(TreeMap.java:2965)
> at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
> at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
> at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
> at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
> at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
> at org.apache.cassandra.db.lifecycle.LogAwareFileLister.innerList(LogAwareFileLister.java:77)
> at org.apache.cassandra.db.lifecycle.LogAwareFileLister.list(LogAwareFileLister.java:49)
> ... 4 more
> {code}