[ 
https://issues.apache.org/jira/browse/CASSANDRA-17416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17531049#comment-17531049
 ] 

Yifan Cai commented on CASSANDRA-17416:
---------------------------------------

I was able to reproduce both failures. Some logging changes were made to the 
test, and they helped to indicate the cause. Please see the test failures in [this 
repeated 
run|https://app.circleci.com/pipelines/github/yifan-c/cassandra/337/workflows/5a42b3bc-c540-4c36-aac4-7a22709cf75b/jobs/2808].

For the unequal expected and actual offset values, [~jmckenzie]'s analysis is 
correct. In the failed tests, the completed-task count from the commit log 
metrics changes between the explicit commit log sync call and the read of the 
index file. This means another mutation was added to the commit log in between. 

For reading the null value, I believe the cause is a race between overwriting 
and reading the CDC index file. Overwriting a file (such as one opened with 
PathUtils#newWriteOverwriteChannel) takes 2 steps:
1. Truncate the file to a size of 0.
2. Write the offset (long). 
It is possible for the test to read the file right after it has been truncated 
and before the offset is written, hence getting the null value. I also have an 
independent unit test that reproduces the behavior. It inserts a pause between 
creating the file writer (which truncates) and writing the content. The other 
thread, which reads the file during the pause, throws an NPE. 
Since the overwrite operation is non-atomic, an NPE can occur when CDC 
consumers read the index file. I think we should either document the behavior 
or make the update atomic. I lean toward the former for simplicity, since 
re-reading the file gives the correct content. WDYT?
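
For illustration, here is a minimal standalone sketch of the two options. It is not the Cassandra code or the unit test mentioned above. The class and method names are hypothetical, the offset is assumed to be stored as text on a single line, and plain java.nio calls stand in for PathUtils#newWriteOverwriteChannel. The "concurrent" read is simulated on the same thread, at the point where the pause would be, so the outcome is deterministic.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical sketch; names and file layout are assumptions, not Cassandra's.
class CdcIndexOverwriteSketch {

    // Reads the first line of the index file; returns null when the file is empty.
    static String readFirstLine(Path index) throws IOException {
        List<String> lines = Files.readAllLines(index, StandardCharsets.UTF_8);
        return lines.isEmpty() ? null : lines.get(0);
    }

    // Non-atomic overwrite: TRUNCATE_EXISTING empties the file when the channel
    // is opened, before any content is written. Returns what a reader would see
    // inside that window.
    static String overwriteAndPeek(Path index, long offset) throws IOException {
        String seenMidWrite;
        try (FileChannel channel = FileChannel.open(index,
                                                    StandardOpenOption.CREATE,
                                                    StandardOpenOption.WRITE,
                                                    StandardOpenOption.TRUNCATE_EXISTING)) {
            // Stands in for a CDC consumer reading between step 1 and step 2.
            seenMidWrite = readFirstLine(index);
            ByteBuffer content = ByteBuffer.wrap(Long.toString(offset).getBytes(StandardCharsets.UTF_8));
            while (content.hasRemaining()) {
                channel.write(content);
            }
        }
        return seenMidWrite;
    }

    // Atomic alternative: write a sibling temp file, then rename it over the index.
    // Whether ATOMIC_MOVE replaces an existing target is implementation specific
    // per the Files.move documentation; a rename on POSIX file systems does.
    static void overwriteAtomically(Path index, long offset) throws IOException {
        Path tmp = index.resolveSibling(index.getFileName() + ".tmp");
        Files.write(tmp, Long.toString(offset).getBytes(StandardCharsets.UTF_8));
        Files.move(tmp, index, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cdc-idx-sketch");
        Path index = dir.resolve("segment_cdc.idx");
        Files.write(index, "1748956".getBytes(StandardCharsets.UTF_8));

        System.out.println("seen mid-overwrite: " + overwriteAndPeek(index, 1749196L));
        System.out.println("seen after overwrite: " + readFirstLine(index));

        overwriteAtomically(index, 1749500L);
        System.out.println("seen after atomic update: " + readFirstLine(index));
    }
}
```

With the first option, a consumer that treats an empty read as "try again" gets the correct offset on the next read. With the second, a reader sees either the old or the new content, at the cost of an extra temp file and a rename per sync.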

> Test Failure: 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerCDCTest.testCDCIndexFileWriteOnSync
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-17416
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-17416
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Test/unit
>            Reporter: Marcus Eriksson
>            Assignee: Josh McKenzie
>            Priority: Normal
>             Fix For: 4.x
>
>
> [https://ci-cassandra.apache.org/job/Cassandra-trunk/985/testReport/org.apache.cassandra.db.commitlog/CommitLogSegmentManagerCDCTest/testCDCIndexFileWriteOnSync_cdc_3/]
> h3. Error Message
> expected:<1748956> but was:<1749196>
> h3. Stacktrace
> junit.framework.AssertionFailedError: expected:<1748956> but was:<1749196> at 
> org.apache.cassandra.db.commitlog.CommitLogSegmentManagerCDCTest.testCDCIndexFileWriteOnSync(CommitLogSegmentManagerCDCTest.java:160)
>  at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> h3. Standard Output
> INFO [main] 2022-03-02 15:04:59,516 YamlConfigurationLoader.java:103 - 
> Configuration location: 
> file:////home/cassandra/cassandra/build/test/cassandra.cdc.yaml DEBUG [main] 
> 2022-03-02 15:04:59,520 YamlConfigurationLoader.java:124 - Loading settings 
> from file:////home/cassandra/cassandra/build/test/cassandra.cdc.yaml INFO 
> [main] 2022-03-02 15:04:59,674 Config.java:907 - Node 
> configuration:[allocate_tokens_for_keyspace=null; 
> allocate_tokens_for_local_replication_factor=null; allow_extra_insecure_ 
> ...[truncated 4125855 chars]... -02 15:06:57,491 PathUtils.java:73 - Deleting 
> file during startup: 
> /home/cassandra/cassandra/build/test/cassandra/data/system_schema/views-9786ac1cdd583201a7cdad556410c985/nb-11-big-Summary.db
>  DEBUG [MemtableFlushWriter:2] 2022-03-02 15:06:57,496 
> ColumnFamilyStore.java:1207 - Flushed to 
> [BigTableReader(path='/home/cassandra/cassandra/build/test/cassandra/data/system_schema/keyspaces-abac5682dea631c5b535b3d6cffd0fb6/nb-55-big-Data.db')]
>  (1 sstables, 4.895KiB), biggest 4.895KiB, smallest 4.895KiB



