[jira] [Created] (CASSANDRA-15758) ERROR when a disconnected Cassandra node comes back and receives a drop/add column request

YCozy (Jira) Sat, 25 Apr 2020 09:00:09 -0700

YCozy created CASSANDRA-15758:
---------------------------------

             Summary: ERROR when a disconnected Cassandra node comes back and receives a drop/add column request
                 Key: CASSANDRA-15758
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15758
             Project: Cassandra
          Issue Type: Bug
            Reporter: YCozy



We got the following error when we were dropping a column in the table:
{code:java}
ERROR [MigrationStage:1] 2020-04-24 00:07:54,995 SchemaKeyspace.java:1021 - No partition columns found for table ks_name.tbl_name in system_schema.columns.  This may be due to corruption or concurrent dropping and altering of a table. If this table is supposed to be dropped, restart cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the following query to cleanup: "DELETE FROM system_schema.tables WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name'; DELETE FROM system_schema.columns WHERE keyspace_name = 'ks_name' AND table_name = 'tbl_name';" If the table is not supposed to be dropped, restore system_schema.columns sstables from backups.
ERROR [MigrationStage:1] 2020-04-25 15:21:55,716 CassandraDaemon.java:228 - Exception in thread Thread[MigrationStage:1,5,main]
org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: Columns not found in schema table for ks_name.tbl_name
        at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1100) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesOnly(SchemaKeyspace.java:951) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1401) ~[main/:na]
        at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[main/:na]
        at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51) ~[main/:na]
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_242]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [main/:na]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
{code}
We analyzed the logs and came up with the following theory of what happened:
 # We have a cluster of three nodes (C1, C2, C3).
 # Right after we start all the nodes, C3 is partitioned away from the others. 
As a result, neither C1 nor C2 knows that C3 exists.
 # User contacts C1 to create a keyspace "ks_name" and a table "tbl_name". C1 
and C2 serve the requests. Since they don't know about C3, they think the 
schema is consistent across the cluster. Both the keyspace and the table are 
created successfully without warning.
 # User tries to drop a column in the table. Now C3 reconnects and receives the 
drop column request from C1 (the coordinator node). However, it knows about 
neither "ks_name" nor "tbl_name", so it throws the above error.
 # If the user tries to add a column instead of dropping one, the same error 
will occur.
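
The sequence above can be illustrated with a small toy model. This is only a 
sketch of our theory, not Cassandra's actual schema code: the Node class, the 
announce() helper, and the (op, table, columns) delta format are all 
hypothetical, and we assume that a schema change is shipped as a delta and that 
the receiver re-reads the table after merging it.

```python
class MissingColumns(Exception):
    """Toy stand-in for the SchemaKeyspace$MissingColumns error in the log."""


class Node:
    """Hypothetical node holding a local copy of the schema."""

    def __init__(self, name):
        self.name = name
        # (keyspace, table) -> {column_name: kind}
        self.schema = {}

    def merge(self, delta):
        """Apply a schema delta, then re-read the table it touched."""
        op, table, columns = delta
        stored = self.schema.setdefault(table, {})
        if op == "drop":
            for col in columns:
                stored.pop(col, None)
        else:  # "create" or "add"
            stored.update(columns)
        # Assumption: a table is only readable if it has a partition key column.
        if "partition_key" not in stored.values():
            raise MissingColumns(
                "Columns not found in schema table for %s.%s" % table
            )


def announce(delta, reachable):
    """Send a delta to every reachable node and collect per-node errors."""
    errors = {}
    for node in reachable:
        try:
            node.merge(delta)
        except MissingColumns as exc:
            errors[node.name] = str(exc)
    return errors


TABLE = ("ks_name", "tbl_name")
c1, c2, c3 = Node("C1"), Node("C2"), Node("C3")

# Step 3: C3 is partitioned away, so only C1 and C2 see the CREATE TABLE.
create = ("create", TABLE, {"id": "partition_key", "v": "regular"})
errors_create = announce(create, [c1, c2])

# Step 4: C3 is back and receives only the drop-column delta.
drop = ("drop", TABLE, {"v": "regular"})
errors_drop = announce(drop, [c1, c2, c3])

# Step 5: a fresh node that missed the CREATE fails on an add-column delta too.
late = Node("C3")
errors_add = announce(("add", TABLE, {"w": "regular"}), [late])
```

In this model C1 and C2 apply every change cleanly, while the node that missed 
the CREATE TABLE fails on both the drop and the add delta, because the delta 
alone never carries the partition key column.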

Since network partitions are inevitable in deployed clusters, we think Cassandra 
should handle this scenario more gracefully.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

