[
https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16739427#comment-16739427
]
Avraham Kalvo edited comment on CASSANDRA-14957 at 1/10/19 2:09 PM:
--------------------------------------------------------------------
To be clear, here's the timeline of the incident:
```
12:05:02 first node state jumps to shutdown for restart
12:06:37 INFO Initializing tasks_scheduler_external.tasks (first node)
12:06:39 WARN UnknownColumnFamilyException reading from socket; closing (first node)
...
12:09:15 the only trace of the service migration running is it issuing the following:
`CREATE KEYSPACE IF NOT EXISTS tasks_scheduler_external WITH replication =
{'class': 'SimpleStrategy', 'replication_factor': '3'};`
...
12:09:31 last node started after restart
```
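To see which on-disk directory matches the schema version the cluster finally agreed on, one can compare the table id reported by the schema with the 32-hex suffix embedded in each table data directory name (e.g. `tasks-bd7200a0156711e88974855d74ee356f` for the cfId from the warning above). A minimal sketch with hypothetical helper names, not a supported tool:

```python
import re

def table_dir_id(dirname):
    """Extract the 32-hex cfId suffix from a Cassandra table data
    directory name such as 'tasks-bd7200a0156711e88974855d74ee356f'."""
    m = re.match(r".+-([0-9a-f]{32})$", dirname)
    if m is None:
        raise ValueError("not a table data directory: %r" % dirname)
    return m.group(1)

def stale_dirs(dirnames, schema_id):
    """Return directories whose embedded id differs from the table id the
    schema currently agrees on, i.e. candidates holding orphaned SSTables."""
    wanted = schema_id.replace("-", "").lower()
    return [d for d in dirnames if table_dir_id(d) != wanted]
```

Here `schema_id` would be the value of `SELECT id FROM system_schema.tables WHERE keyspace_name = 'tasks_scheduler_external' AND table_name = 'tasks';`, checked per node to spot disagreement.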
Notice that *no tables* were created, or attempted, throughout the restart, and
the keyspace was not recreated either, since it already existed.
Hence - the new version of the table, as visible in the file system, *has
nothing to do* with any explicit DDL run before, during, or after the rolling
restart.
The schema (DDL) never changed - only the data was split into a new version on
the filesystem, which eventually became the version the cluster agreed on once
the rolling restart completed.
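For reference, the recovery applied (copying SSTables from the previous schema version's directory into the new one, then `nodetool refresh`, as described in the ticket) could be scripted roughly as below. This is a sketch only: directory arguments are hypothetical and a real run would need the safety checks and snapshots this omits.

```python
import shutil
import subprocess
from pathlib import Path

def copy_sstables(old_dir, new_dir):
    """Copy SSTable component files from the orphaned (old cfId) table
    directory into the directory of the agreed-upon schema version.
    Returns the number of files copied; existing files are left untouched."""
    dst = Path(new_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = 0
    for f in Path(old_dir).iterdir():
        if f.is_file() and not (dst / f.name).exists():
            shutil.copy2(f, dst / f.name)
            copied += 1
    return copied

def refresh_table(keyspace, table):
    """Ask Cassandra to load the newly placed SSTables."""
    subprocess.run(["nodetool", "refresh", keyspace, table], check=True)
```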
Thank you.
Avi.
> Rolling Restart Of Nodes Causes Dataloss Due To Schema Collision
> ----------------------------------------------------------------
>
> Key: CASSANDRA-14957
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14957
> Project: Cassandra
> Issue Type: Bug
> Components: Cluster/Schema
> Reporter: Avraham Kalvo
> Priority: Major
>
> We were issuing a rolling restart of a mission-critical five-node C* cluster.
> The first node which was restarted got the following messages in its
> system.log:
> ```
> January 2nd 2019, 12:06:37.310 - INFO 12:06:35 Initializing tasks_scheduler_external.tasks
> ```
> ```
> WARN 12:06:39 UnknownColumnFamilyException reading from socket; closing
> org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId bd7200a0-1567-11e8-8974-855d74ee356f. If a table was just created, this is likely due to the schema not being fully propagated. Please wait for schema agreement on table creation.
> at org.apache.cassandra.config.CFMetaData$Serializer.deserialize(CFMetaData.java:1336) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize30(PartitionUpdate.java:660) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.partitions.PartitionUpdate$PartitionUpdateSerializer.deserialize(PartitionUpdate.java:635) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:330) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:349) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.db.Mutation$MutationSerializer.deserialize(Mutation.java:286) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.MessageIn.read(MessageIn.java:98) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessage(IncomingTcpConnection.java:201) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.receiveMessages(IncomingTcpConnection.java:178) ~[apache-cassandra-3.0.10.jar:3.0.10]
> at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:92) ~[apache-cassandra-3.0.10.jar:3.0.10]
> ```
> The latter was then repeated several times across the cluster.
> It then turned out that the table in question,
> `tasks_scheduler_external.tasks`, had been created with a new schema version
> at some point during the consecutive restarts across the cluster. Once schema
> agreement settled, the new version became available and started taking
> requests, leaving the previous version of the schema unavailable for any
> request and thus causing data loss in our online system.
> The data loss was recovered by manually copying SSTables from the previous
> schema version's directory to the new one, followed by a `nodetool refresh`
> on the relevant table.
> The above has repeated itself for several tables across various keyspaces.
> One other thing worth mentioning: a repair was in progress on the first node
> to be restarted, and it was obviously stopped when the daemon shut down, but
> at first glance this doesn't seem related to the above.
> Seems somewhat related to:
> https://issues.apache.org/jira/browse/CASSANDRA-13559
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)