[ https://issues.apache.org/jira/browse/CASSANDRA-16444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290027#comment-17290027 ]
Alexandre Dutra edited comment on CASSANDRA-16444 at 2/24/21, 3:48 PM:
-----------------------------------------------------------------------

What I've got so far:
 # The usage of {{System.currentTimeMillis()}} to generate timestamps is error-prone. Some tests create and apply many mutations in sequence, but sometimes 2 successive mutations get the same timestamp, and the test fails. The following tests are potentially impacted by this:
 ## testCrossSSTableQueries
 ## testMultiExpressionQueriesWhereRowSplitBetweenSSTables
 ## testPagination
 ## testColumnNamesWithSlashes
 ## testInvalidate
 ## testIndexRedistribution
 ## testTruncate
 ## testSameKeyInMemtableAndSSTables
 ## testUnicodeSupport
 ## testUnicodeSuffixModeNoSplits
 ## testChinesePrefixSearch
 ## testLowerCaseAnalyzer
 ## testPrefixSSTableLookup
 ## testIndexMemtableSwitching
 # {{testIndexRedistribution}}: race condition. The test fails when a {{CompactionTask}} is running while {{getIndexed()}} is called.
 ## Wrapping {{getIndexed()}} call within {{store.runWithCompactionsDisabled()}} solves the problem.
 ## _I am not expert enough to tell if this is hiding a broader problem with index redistribution in general._
 # {{testIndexMemtableSwitching}}: side-effect issue. The test fails only if {{testInvalidIndexOptions}} is executed before:
 ## {{testInvalidIndexOptions}} leaves the store in an inconsistent state due to an invalid mutation.
 ## The next memtable flush task fails because of an invalid cell type created by the test.
 ## afaict, the memtable stays forever among the pending memtables list and will never be flushed or removed.
 ## Similarly, {{ColumnIndex}} never gets a notification that the parent memtable was flushed, and so {{ColumnIndex.pendingFlush}} is never cleared.
 ## {{testIndexMemtableSwitching}} verifies that {{ColumnIndex.pendingFlush}} is empty, and fails.
 ## _I am not expert enough to tell if this is hiding a broader problem with failed memtable flushes._

I am going to propose 3 distinct patches:
 # Replace all occurrences of {{System.currentTimeMillis()}} in {{SASIIndexTest}} by fixed timestamps.
 # Fix {{testIndexRedistribution}} by reading the index contents inside {{store.runWithCompactionsDisabled()}}.
 # {{testIndexMemtableSwitching}}: manually clear the store memtables using {{store.clearUnsafe()}} and manually clean {{ColumnIndex.pendingFlush}} after the test, so as to leave the store in a consistent state.


was (Author: adutra):
What I've got so far:
 # The usage of {{System.currentTimeMillis()}} to generate timestamps is error-prone. Some tests create and apply many mutations in sequence, but sometimes 2 successive mutations get the same timestamp, and the test fails. The following tests are potentially impacted by this:
 ## testCrossSSTableQueries
 ## testMultiExpressionQueriesWhereRowSplitBetweenSSTables
 ## testPagination
 ## testColumnNamesWithSlashes
 ## testInvalidate
 ## testIndexRedistribution
 ## testTruncate
 ## testSameKeyInMemtableAndSSTables
 ## testUnicodeSupport
 ## testUnicodeSuffixModeNoSplits
 ## testChinesePrefixSearch
 ## testLowerCaseAnalyzer
 ## testPrefixSSTableLookup
 ## testIndexMemtableSwitching
 # {{testIndexRedistribution}}: race condition. The test fails when a {{CompactionTask}} is running while {{getIndexed()}} is called.
 ## Wrapping {{getIndexed()}} call within {{store.runWithCompactionsDisabled()}} solves the problem.
 ## _I am not expert enough to tell if this is hiding a broader problem with index redistribution in general._
 # {{testIndexMemtableSwitching}}: side-effect issue. The test fails only if {{testInvalidIndexOptions}} is executed before:
 ## {{testInvalidIndexOptions}} leaves the store in an inconsistent state due to an invalid mutation.
 ## The next memtable flush task fails because of an invalid cell type created by the test.
 ## afaict, the memtable stays forever among the pending memtables list and will never be flushed or removed.
 ## Similarly, {{ColumnIndex}} never gets a notification that the parent memtable was flushed, and so {{ColumnIndex.pendingFlush}} is never cleared.
 ## {{testIndexMemtableSwitching }}verifies that {{ColumnIndex.pendingFlush}} is empty, and fails.
 ## _I am not expert enough to tell if this is hiding a broader problem with failed memtable flushes._

I am going to propose 3 distinct patches:
 # Replace all occurrences of {{System.currentTimeMillis()}} in {{SASIIndexTest}} by fixed timestamps.
 # Fix {{testIndexRedistribution}} by reading the index contents inside{{ store.runWithCompactionsDisabled().}}
 # {{testIndexMemtableSwitching}}: manually clear the store memtables using {{store.clearUnsafe()}} and manually clean {{ColumnIndex.pendingFlush}} after the test, so as to leave the store in a consistent state.{{}}


> Fix flaky test testMultiExpressionQueriesWhereRowSplitBetweenSSTables - org.apache.cassandra.index.sasi.SASIIndexTest
> ---------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-16444
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-16444
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: David Capwell
>            Assignee: Alexandre Dutra
>            Priority: Normal
>             Fix For: 4.0-beta
>
>
> https://app.circleci.com/pipelines/github/dcapwell/cassandra/862/workflows/d2b10373-5bd1-4895-a738-1c28587cae62/jobs/5136
> {code}
> junit.framework.AssertionFailedError: []
>       at org.apache.cassandra.index.sasi.SASIIndexTest.testMultiExpressionQueriesWhereRowSplitBetweenSSTables(SASIIndexTest.java:589)
>       at org.apache.cassandra.index.sasi.SASIIndexTest.testMultiExpressionQueriesWhereRowSplitBetweenSSTables(SASIIndexTest.java:468)
> {code}
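
To illustrate the first finding in the comment (two successive mutations receiving the same {{System.currentTimeMillis()}} value), here is a minimal, standalone Java sketch. It is not the actual patch: {{TimestampSketch}}, {{fixedTimestamps}} and {{countDuplicates}} are hypothetical names made up for this illustration, and the sketch does not touch any Cassandra API. It only shows why a millisecond wall clock can hand out repeated values in a tight loop, while a fixed, strictly increasing sequence cannot.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical illustration only; none of these names exist in SASIIndexTest.
final class TimestampSketch
{
    // Returns a supplier of strictly increasing timestamps starting at 'start'.
    static LongSupplier fixedTimestamps(long start)
    {
        AtomicLong next = new AtomicLong(start);
        return next::getAndIncrement;
    }

    // Draws n timestamps from 'clock' and counts how many repeat an earlier value.
    static int countDuplicates(LongSupplier clock, int n)
    {
        Set<Long> seen = new HashSet<>();
        int duplicates = 0;
        for (int i = 0; i < n; i++)
        {
            if (!seen.add(clock.getAsLong()))
                duplicates++;
        }
        return duplicates;
    }

    public static void main(String[] args)
    {
        int n = 1000;
        // The wall clock has millisecond resolution, so a tight loop may see repeats
        // (the exact count varies from run to run).
        System.out.println("wall clock duplicates: " + countDuplicates(System::currentTimeMillis, n));
        // The fixed sequence never repeats a value.
        System.out.println("fixed sequence duplicates: " + countDuplicates(fixedTimestamps(1000L), n));
    }
}
```

In a test, the same idea would presumably amount to passing explicit, increasing timestamps to each mutation instead of reading the clock, so that the order of writes is deterministic regardless of how fast the test runs.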