Charlemange Lasse created CASSANDRA-15298:
---------------------------------------------

             Summary: Cassandra node cannot be restored using documented backup method
                 Key: CASSANDRA-15298
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15298
             Project: Cassandra
          Issue Type: Bug
            Reporter: Charlemange Lasse


I have a single Cassandra 3.11.4 node. It contains various tables and UDFs. The 
[documentation describes a method to back up this 
node|https://docs.datastax.com/en/archived/cassandra/3.0/cassandra/operations/opsBackupTakesSnapshot.html]:
 * use "DESCRIBE SCHEMA" in cqlsh to get the schema
 * create a snapshot using nodetool
 * copy the snapshot + schema to a new (completely disconnected) node
 * load schema into new node
 * load sstables again using nodetool

But this method is completely bogus. It results in errors like:

{noformat}
java.lang.RuntimeException: Unknown column deleted_column during deserialization
{noformat}
All data in this column is then lost.

The problem is that the "DESCRIBE SCHEMA" output omits columns that were 
dropped from a table but whose data still exists in the sstables. For example, 
it looks like:
{noformat}
CREATE TABLE mykeyspace.testcf (
    primary_uuid uuid,
    secondary_uuid uuid,
    name text,
    PRIMARY KEY (primary_uuid, secondary_uuid)
) WITH CLUSTERING ORDER BY (secondary_uuid ASC)
    AND bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 
'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
{noformat}
But it must actually look like:
{noformat}
CREATE TABLE IF NOT EXISTS mykeyspace.testcf (
        primary_uuid uuid,
        secondary_uuid uuid,
        name text,
        deleted_column boolean,
        PRIMARY KEY (primary_uuid, secondary_uuid))
        WITH ID = a1afdd4d-b61e-4f2a-b806-57c296be3948
        AND CLUSTERING ORDER BY (secondary_uuid ASC)
        AND bloom_filter_fp_chance = 0.01
        AND dclocal_read_repair_chance = 0.1
        AND crc_check_chance = 1.0
        AND default_time_to_live = 0
        AND gc_grace_seconds = 864000
        AND min_index_interval = 128
        AND max_index_interval = 2048
        AND memtable_flush_period_in_ms = 0
        AND read_repair_chance = 0.0
        AND speculative_retry = '99PERCENTILE'
        AND comment = ''
        AND caching = { 'keys': 'ALL', 'rows_per_partition': 'NONE' }
        AND compaction = { 'max_threshold': '32', 'min_threshold': '4', 
'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy' }
        AND compression = { 'chunk_length_in_kb': '64', 'class': 
'org.apache.cassandra.io.compress.LZ4Compressor' }
        AND cdc = false
        AND extensions = {  };
ALTER TABLE mykeyspace.testcf DROP deleted_column USING TIMESTAMP 
1563978151561000;
{noformat}
This was taken from the snapshot's (column-family-specific) schema.cql, which 
of course is not compatible with the main schema: it only creates the tables 
if they don't already exist (and they do, because the main "DESCRIBE SCHEMA" 
file has already created them), and it is missing everything else, such as 
UDFs.
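
One possible workaround, sketched here in Python and not a built-in Cassandra 
mechanism, is to merge the two sources before loading: keep everything from the 
"DESCRIBE SCHEMA" output, but swap each CREATE TABLE statement for the contents 
of that table's snapshot schema.cql. The function name merge_schema is 
hypothetical. The sketch assumes each schema.cql holds only the CREATE TABLE 
and its ALTER TABLE statements, and that no ";" appears inside a CREATE TABLE 
statement (for example in a table comment).

```python
import re

# Matches "CREATE TABLE [IF NOT EXISTS] ks.table (" up to the first ";".
# Assumption: no ";" occurs inside the statement itself.
CREATE_TABLE = re.compile(
    r"CREATE TABLE\s+(?:IF NOT EXISTS\s+)?([\w\"]+\.[\w\"]+)\s*\(.*?;",
    re.DOTALL | re.IGNORECASE,
)


def merge_schema(describe_schema, snapshot_schemas):
    """Replace each CREATE TABLE in the DESCRIBE SCHEMA text with the full
    text of the matching snapshot schema.cql, when one is supplied."""
    by_table = {}
    for text in snapshot_schemas:
        match = CREATE_TABLE.search(text)
        if match:
            # Key on the qualified table name exactly as written.
            by_table[match.group(1)] = text.strip()

    def substitute(match):
        # Tables without a snapshot schema.cql are left untouched.
        return by_table.get(match.group(1), match.group(0))

    return CREATE_TABLE.sub(substitute, describe_schema)
```

The merged text keeps the UDFs and other objects from the main schema, while 
each table is created with its original ID and its dropped columns are 
recorded by the ALTER TABLE ... DROP ... USING TIMESTAMP statements.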

It is currently not possible (using the built-in mechanisms of Cassandra 
3.11.4) to migrate a keyspace from one server to another, completely separate 
server.

This behavior also breaks various backup systems that try to store Cassandra 
cluster information in offline storage.
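
A backup tool could at least detect the affected tables. The following is a 
minimal sketch, assuming the snapshot's schema.cql records dropped columns in 
the ALTER TABLE ... DROP ... USING TIMESTAMP form shown above. The function 
name dropped_columns is hypothetical.

```python
import re

# Matches statements of the form found in the snapshot schema.cql:
#   ALTER TABLE ks.table DROP column USING TIMESTAMP 1563978151561000;
DROPPED = re.compile(
    r"ALTER TABLE\s+([\w\"]+\.[\w\"]+)\s+DROP\s+([\w\"]+)"
    r"\s+USING\s+TIMESTAMP\s+(\d+)\s*;",
    re.IGNORECASE,
)


def dropped_columns(schema_cql):
    """Return (table, column, timestamp) for every dropped column found."""
    return [(t, c, int(ts)) for t, c, ts in DROPPED.findall(schema_cql)]
```

A non-empty result for any table means a schema restored from "DESCRIBE 
SCHEMA" alone would be missing those columns.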



--
This message was sent by Atlassian Jira
(v8.3.2#803003)