[ 
https://issues.apache.org/jira/browse/CASSANDRA-15844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maxwellguo updated CASSANDRA-15844:
-----------------------------------
     Bug Category: Parent values: Correctness(12982), Level 1 values: Recoverable Corruption / Loss(12986)
       Complexity: Normal
    Discovered By: User Report
         Severity: Normal
           Status: Open  (was: Triage Needed)

> Create table Asynchronously or creating table contact the same node from many 
> client threads at same time may causing data lose
> -------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-15844
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-15844
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Cluster/Schema
>            Reporter: maxwellguo
>            Assignee: maxwellguo
>            Priority: Normal
>         Attachments: createkeyspace.jpg, keyspace inner.jpg, schemaversion.jpg
>
>
> When a table is created through one coordinator node from several client 
> threads at the same time, or with the session.executeAsync() method, the 
> schema information may end up incorrect. In serious cases this causes data 
> loss.
> In my test, I used executeAsync() to create a table repeatedly, one statement 
> after another, with the same table name (I do know that tables should be 
> created synchronously, but some of our customers may create tables with 
> executeAsync()). My expectation was that the last CQL statement 
> {code:java}
> CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, height text)
> {code}
> should take effect.
>  !createkeyspace.jpg! 
> But after running the code, I found that the result was not what I expected. 
> The resulting schema is:
> {code:java}
> CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, sex int, 
> height int)
> {code}
>  !keyspace inner.jpg! 
> The schema version in memory and the one on disk are also not the same. 
>  !schemaversion.jpg! 
> When a new column family is added (a new table is created), requests to 
> create the same table with different schema definitions arrive at the same 
> time, either from different clients or through the executeAsync() method. 
> {code:java}
>     private static void announceNewColumnFamily(CFMetaData cfm, boolean announceLocally, boolean throwOnDuplicate, long timestamp) throws ConfigurationException
>     {
>         cfm.validate();
>         KeyspaceMetadata ksm = Schema.instance.getKSMetaData(cfm.ksName);
>         if (ksm == null)
>             throw new ConfigurationException(String.format("Cannot add table '%s' to non existing keyspace '%s'.", cfm.cfName, cfm.ksName));
>         // If we have a table or a view which has the same name, we can't add a new one
>         else if (throwOnDuplicate && ksm.getTableOrViewNullable(cfm.cfName) != null)
>             throw new AlreadyExistsException(cfm.ksName, cfm.cfName);
>         logger.info("Create new table: {}", cfm);
>         announce(SchemaKeyspace.makeCreateTableMutation(ksm, cfm, timestamp), announceLocally);
>     }
> {code}
> The table-existence check may be passed by every concurrent request, so 
> several requests for the same table may all go on to call announce(). The 
> resulting mutations are then merged by:
> {code:java}
>     public static synchronized void mergeSchema(Collection<Mutation> mutations, boolean forDynamoTTL)
>     {
>         // only compare the keyspaces affected by this set of schema mutations
>         Set<String> affectedKeyspaces =
>         mutations.stream()
>                  .map(m -> UTF8Type.instance.compose(m.key().getKey()))
>                  .collect(Collectors.toSet());
>         // fetch the current state of schema for the affected keyspaces only
>         Keyspaces before = Schema.instance.getKeyspaces(affectedKeyspaces);
>         // apply the schema mutations and flush
>         mutations.forEach(Mutation::apply);
>         if (FLUSH_SCHEMA_TABLES)
>             flush();
>         // fetch the new state of schema from schema tables (not applied to Schema.instance yet)
>         Keyspaces after = fetchKeyspacesOnly(affectedKeyspaces);
>         mergeSchema(before, after);
>         scheduleDynamoTTLClean(forDynamoTTL, mutations);
>     }
> {code}
> Because each of these requests may write its table definition to disk, we 
> ended up with 
> {code:java}
> CREATE TABLE ks.tb (name text PRIMARY KEY , age int, adds text, sex int, 
> height int)
> {code}
> in our case.
> We also saw different schema versions in memory and on disk. 
> Writes use the schema held in memory, but after a node restart the schema 
> definition on disk is used instead, which may then cause data loss. 
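
The check-then-act gap described in the report can be illustrated outside Cassandra. The following is only a minimal sketch, not Cassandra code: SchemaRaceSketch, concurrentCreates and the map standing in for the in-memory schema are hypothetical names, and a latch deliberately forces the unlucky interleaving in which every request performs its existence check before any request writes. The putIfAbsent variant is shown only as one way to make the check and the write a single atomic step; it is not claimed to be the fix chosen for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class SchemaRaceSketch {
    // Hypothetical table name, mirroring ks.tb from the report.
    static final String TABLE = "ks.tb";

    /**
     * Runs one "create table" request per definition, all concurrently, and
     * returns how many requests believed they were the first and went on to
     * "announce" their definition.
     */
    static int concurrentCreates(List<String> definitions, boolean atomic) throws Exception {
        // Hypothetical stand-in for the in-memory schema: table name -> definition.
        ConcurrentMap<String, String> tables = new ConcurrentHashMap<>();
        // Forces the bad interleaving: nobody writes until everybody has checked.
        CountDownLatch allChecked = new CountDownLatch(definitions.size());
        AtomicInteger announced = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(definitions.size());
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String definition : definitions) {
                pending.add(pool.submit(() -> {
                    // Separate existence check, as in a check-then-act sequence.
                    boolean missing = !tables.containsKey(TABLE);
                    allChecked.countDown();
                    allChecked.await();
                    if (atomic) {
                        // Check and write happen as one atomic operation.
                        if (tables.putIfAbsent(TABLE, definition) == null)
                            announced.incrementAndGet();
                    } else if (missing) {
                        // Acts on a check result that is already stale.
                        tables.put(TABLE, definition);
                        announced.incrementAndGet();
                    }
                    return null;
                }));
            }
            for (Future<?> f : pending)
                f.get(); // propagate any failure from the worker threads
        } finally {
            pool.shutdown();
        }
        return announced.get();
    }

    public static void main(String[] args) throws Exception {
        List<String> definitions = Arrays.asList(
            "name text PRIMARY KEY, age int",
            "name text PRIMARY KEY, age int, adds text, sex int, height int",
            "name text PRIMARY KEY, age int, adds text, height text");
        System.out.println("check-then-act announced: " + concurrentCreates(definitions, false));
        System.out.println("atomic announced: " + concurrentCreates(definitions, true));
    }
}
```

With the forced interleaving, all three requests pass the separate existence check and announce their definition, while the atomic variant lets exactly one request through.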


