[jira] [Updated] (CASSANDRA-11143) Schema changes don't propagate correctly if nodes are down

Anubhav Kale (JIRA) Wed, 10 Feb 2016 14:13:02 -0800

     [ https://issues.apache.org/jira/browse/CASSANDRA-11143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Anubhav Kale updated CASSANDRA-11143:
-------------------------------------
    Description: 
We have hit a problem similar to the one described below in our PROD 
environment a few times. The steps below reproduce it consistently. The 
priority can be changed to Minor, since there is a workaround.

Using the steps from 
http://stackoverflow.com/questions/22513979/setting-up-cassandra-multi-node-cluster-on-a-single-ubuntu-server/25348301#25348301,
 set up a two-node cluster locally. 

. Bring up both nodes
. Create a table, and ensure cqlsh is correctly showing it on both nodes.
. Bring down one node
. Drop and re-create the same table, or change the table's schema.
. Bring up the down node.

You will see exceptions like the one below (caused by the schema mismatch), and 
the new schema never propagates to the node that was down (meaning a select * 
via cqlsh continues to show the old schema for the table). I let the cluster 
run for an hour to see whether gossip would eventually catch up; it did not. 

However, the interesting part is that if you restart the node that was down 
when the schema changes were made, the exception below goes away and the node 
picks up the new schema correctly. 

What is it caching that makes a second restart necessary for it to behave 
correctly?

ERROR 00:23:33 Configuration exception merging remote schema
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; expected e2839010-cf7e-11e5-a13b-fb6871b443fb)
        at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:783) ~[main/:na]
        at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:743) ~[main/:na]
        at org.apache.cassandra.config.Schema.updateTable(Schema.java:626) ~[main/:na]
        at org.apach
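
To illustrate the error above, here is a minimal sketch (not Cassandra's actual 
code) of how a drop and re-create can lead to an ID mismatch. It assumes that 
"expected" is the table ID the restarted node already holds and that "found" is 
the ID carried by the incoming schema. The class and method names are 
hypothetical. Both IDs in the log are version 1 (time-based) UUIDs, so their 
embedded timestamps can be compared to see which table definition is newer.

```java
import java.util.UUID;

// Hypothetical illustration only; the names here are not Cassandra APIs.
final class TableIdMismatchSketch {

    // Stand-in for the compatibility check named in the stack trace:
    // a schema update is rejected when the table IDs differ.
    static void validateCompatibility(UUID localId, UUID remoteId) {
        if (!localId.equals(remoteId)) {
            throw new IllegalStateException(String.format(
                "Column family ID mismatch (found %s; expected %s)",
                remoteId, localId));
        }
    }

    // Compares the timestamps embedded in two version 1 UUIDs.
    static boolean isNewer(UUID a, UUID b) {
        return a.timestamp() > b.timestamp();
    }

    public static void main(String[] args) {
        // The two IDs copied from the exception message.
        UUID expected = UUID.fromString("e2839010-cf7e-11e5-a13b-fb6871b443fb");
        UUID found = UUID.fromString("7208d260-cf8c-11e5-a13b-fb6871b443fb");

        System.out.println("UUID versions: " + expected.version()
            + " and " + found.version());
        System.out.println("found is newer than expected: "
            + isNewer(found, expected));

        try {
            validateCompatibility(expected, found);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under these assumptions the "found" ID was generated later than the "expected" 
one, which fits a table that was dropped and re-created while the node was 
down. The sketch does not explain why a second restart clears the error.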


  was:
We have hit a problem similar to the one described below in our PROD 
environment a few times. The steps below reproduce it consistently. The 
priority can be changed to Minor, since there is a workaround.

Using the steps from 
http://stackoverflow.com/questions/22513979/setting-up-cassandra-multi-node-cluster-on-a-single-ubuntu-server/25348301#25348301,
 set up a two-node cluster locally. 

. Bring up both nodes
. Create a table, and ensure cqlsh is correctly showing it on both nodes.
. Bring down one node
. Drop and re-create the same table, or change the table's schema.
. Bring up the down node.

You will see exceptions like the one below (caused by the schema mismatch), and 
the new schema never propagates to the node that was down (meaning cqlsh 
continues to show the old schema for the table). I let the cluster run for an 
hour to see whether gossip would eventually catch up; it did not. 

However, the interesting part is that if you restart the node that was down 
when the schema changes were made, the exception below goes away and the node 
picks up the new schema correctly. 

What is it caching that makes a second restart necessary for it to behave 
correctly?

ERROR 00:23:33 Configuration exception merging remote schema
org.apache.cassandra.exceptions.ConfigurationException: Column family ID mismatch (found 7208d260-cf8c-11e5-a13b-fb6871b443fb; expected e2839010-cf7e-11e5-a13b-fb6871b443fb)
        at org.apache.cassandra.config.CFMetaData.validateCompatibility(CFMetaData.java:783) ~[main/:na]
        at org.apache.cassandra.config.CFMetaData.apply(CFMetaData.java:743) ~[main/:na]
        at org.apache.cassandra.config.Schema.updateTable(Schema.java:626) ~[main/:na]
        at org.apach



> Schema changes don't propagate correctly if nodes are down
> ----------------------------------------------------------
>
>                 Key: CASSANDRA-11143
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11143
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: PROD
>            Reporter: Anubhav Kale



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
