[ https://issues.apache.org/jira/browse/CASSANDRA-10250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726721#comment-14726721 ]
Andrew Hust commented on CASSANDRA-10250:
-----------------------------------------

Example script run in which the c4 column is missing:
{code}
❯ python concurrent_schema_changes.py
creating base tables to be added/altered
executing creation of tables, add/drop column and index creation
sleeping 20 to make sure things are settled
verifing schema status
Errors found:
alter_me_8 expected c1 -> c7, id, got: [u'c1', u'c2', u'c3', u'c5', u'c6', u'c7', u'id']
{code}

> Executing lots of schema alters concurrently can lead to dropped alters
> -----------------------------------------------------------------------
>
>                 Key: CASSANDRA-10250
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10250
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Andrew Hust
>         Attachments: concurrent_schema_changes.py, node1.log, node2.log, node3.log
>
>
> A recently added
> [dtest|http://cassci.datastax.com/view/cassandra-3.0/job/cassandra-3.0_dtest/132/testReport/junit/concurrent_schema_changes_test/TestConcurrentSchemaChanges/create_lots_of_schema_churn_test/]
> has been flapping on cassci and has exposed an issue with running lots of
> schema alterations concurrently. The failures occur on healthy clusters but
> seem to occur at higher rates when 1 node is down during the alters.
> The test executes the following (440 commands in total):
> - Create 20 new tables
> - Drop 7 columns one at a time across 20 tables
> - Add 7 columns one at a time across 20 tables
> - Add one column index on each of the 7 columns on 20 tables
> The outcome is random. The majority of the failures are dropped columns that
> are still present, but new columns and indexes have also been observed to be
> incorrect. The logs don’t contain exceptions, and the columns/indexes that
> are incorrect don’t seem to follow a pattern. Running
> {{nodetool describecluster}} on each node shows the same schema id on all
> nodes.
> Attached is a python script extracted from the dtest.
> Running it against a local 3-node cluster will reproduce the issue given
> enough runs (it fails ~20% of the time on my machine).
> Also attached are the node logs from a run in which a dropped column
> (alter_me_7 table, column s1) was still present. Checking the system_schema
> tables for this case shows the s1 column in both the columns and
> dropped_columns tables.
> This has been flapping on cassci on versions 2+ and doesn’t seem to be
> related to changes in 3.0. More testing needs to be done though.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
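
For reference, the kind of schema check that yields an "Errors found" report like the one in the comment can be approximated as below. This is only a minimal sketch and not the code from the attached concurrent_schema_changes.py: the function name find_schema_mismatches and its two dictionary arguments are assumptions made for illustration, and reading the actual column names from the cluster is deliberately left out.

```python
def find_schema_mismatches(expected, actual):
    """Return one message per table whose columns differ from the expected set.

    Both arguments are hypothetical inputs for this sketch: dicts that map a
    table name to an iterable of column names.
    """
    errors = []
    for table in sorted(expected):
        want = set(expected[table])
        got = set(actual.get(table, ()))
        if want == got:
            continue
        missing = sorted(want - got)      # e.g. an added column that never appeared
        unexpected = sorted(got - want)   # e.g. a dropped column that is still present
        errors.append("%s: missing %s, unexpected %s" % (table, missing, unexpected))
    return errors


if __name__ == "__main__":
    # Column names taken from the failing run quoted in the comment (c4 is absent).
    expected = {"alter_me_8": ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "id"]}
    actual = {"alter_me_8": ["c1", "c2", "c3", "c5", "c6", "c7", "id"]}
    for line in find_schema_mismatches(expected, actual):
        print(line)
```

Comparing sets rather than lists keeps the check independent of column order, and reporting missing and unexpected columns separately distinguishes a lost add from a lost drop (such as the s1 column on alter_me_7).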