[jira] [Commented] (CASSANDRA-15758) ERROR when a disconnected Cassandra node comes back and receives a drop/add column request

Brandon Williams (Jira) Fri, 25 Sep 2020 06:41:53 -0700


    [ 
https://issues.apache.org/jira/browse/CASSANDRA-15758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17202162#comment-17202162
 ]


Brandon Williams commented on CASSANDRA-15758:
----------------------------------------------

There is no problem.  It tried to drop a column that doesn't exist.

> ERROR when a disconnected Cassandra node comes back and receives a drop/add 
> column request
> ------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15758
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15758
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: YCozy
>            Priority: Normal
>
> We got the following error when we were dropping a column in the table:
> {code:java}
> ERROR [MigrationStage:1] 2020-04-24 00:07:54,995 SchemaKeyspace.java:1021 - 
> No partition columns found for table ks_name.tbl_name in 
> system_schema.columns.  This may be due to corruption or concurrent dropping 
> and altering of a table. If this table is supposed to be dropped, restart 
> cassandra with -Dcassandra.ignore_corrupted_schema_tables=true and run the 
> following query to cleanup: "DELETE FROM system_schema.tables WHERE 
> keyspace_name = 'ks_name' AND table_name = 'tbl_name'; DELETE FROM 
> system_schema.columns WHERE keyspace_name = 'ks_name' AND table_name = 
> 'tbl_name';" If the table is not supposed to be dropped, restore 
> system_schema.columns sstables from backups.
> ERROR [MigrationStage:1] 2020-04-25 15:21:55,716 CassandraDaemon.java:228 - Exception in thread Thread[MigrationStage:1,5,main]
> org.apache.cassandra.schema.SchemaKeyspace$MissingColumns: Columns not found in schema table for ks_name.tbl_name
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchColumns(SchemaKeyspace.java:1100) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:1046) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:1000) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:959) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesOnly(SchemaKeyspace.java:951) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.mergeSchema(SchemaKeyspace.java:1401) ~[main/:na]
>         at org.apache.cassandra.schema.SchemaKeyspace.mergeSchemaAndAnnounceVersion(SchemaKeyspace.java:1380) ~[main/:na]
>         at org.apache.cassandra.db.DefinitionsUpdateVerbHandler$1.runMayThrow(DefinitionsUpdateVerbHandler.java:51) ~[main/:na]
>         at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) ~[main/:na]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_242]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_242]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_242]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_242]
>         at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [main/:na]
>         at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_242]
> {code}
> We analyzed the logs and came up with the following theory of what happened:
>  # We have a cluster of three nodes (C1, C2, C3).
>  # Right after we start all the nodes, C3 is partitioned away from the 
> others. As a result, neither C1 nor C2 knows that C3 exists.
>  # User contacts C1 to create a keyspace "ks_name" and a table "tbl_name". C1 
> and C2 serve the requests. Since they don't know about C3, they think the 
> schema is consistent across the cluster. Both the keyspace and the table are 
> created successfully without warning.
>  # User tries to drop a column in the table. Now C3 reconnects and receives 
> the drop column request from C1 (the coordinator node). However, it knows 
> about neither "ks_name" nor "tbl_name", so it throws the above error.
>  # If the user tries to add a column instead of dropping one, the same error 
> will occur.
> Since network partitions are inevitable in deployed clusters, we think 
> Cassandra should handle such a scenario better.
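
The sequence described in the report can be sketched as a toy model. This is not Cassandra code: the Node class, the broadcast helper, and the column names "id", "val" and "extra" are all hypothetical, and the model assumes that an alter carries only the changed columns, so a node that never saw the create ends up with a table that has no partition key column.

```python
class MissingColumns(Exception):
    """Toy stand-in for the SchemaKeyspace$MissingColumns error in the trace."""


class Node:
    """Hypothetical node holding a tiny schema store."""

    def __init__(self, name):
        self.name = name
        # (keyspace, table) -> {column name: kind}
        self.columns = {}

    def apply(self, op, table, changed):
        """Apply one schema change, then re-read the table definition."""
        cols = self.columns.setdefault(table, {})
        if op == "drop":
            for column in changed:
                cols.pop(column, None)
        else:  # "create" or "add"
            cols.update(changed)
        # Assumption: a table without a partition key column is unreadable.
        if "partition_key" not in cols.values():
            raise MissingColumns(
                f"{self.name}: no partition columns for {table[0]}.{table[1]}"
            )


def broadcast(nodes, op, table, changed):
    """Send one schema change to every reachable node and record the outcome."""
    results = {}
    for node in nodes:
        try:
            node.apply(op, table, changed)
            results[node.name] = "ok"
        except MissingColumns:
            results[node.name] = "error"
    return results


table = ("ks_name", "tbl_name")
c1, c2, c3 = Node("C1"), Node("C2"), Node("C3")

# C3 is partitioned away, so only C1 and C2 see the create.
created = broadcast(
    [c1, c2], "create", table, {"id": "partition_key", "val": "regular"}
)

# C3 is back and receives the drop for a table it never saw.
dropped = broadcast([c1, c2, c3], "drop", table, {"val": "regular"})

# Adding a column instead fails on C3 in the same way.
added = broadcast([c1, c2, c3], "add", table, {"extra": "regular"})

print(created, dropped, added)
```

In this model C1 and C2 accept every change, while C3 fails on both the drop and the add because it never received the partition key column from the create.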



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

