[
https://issues.apache.org/jira/browse/CASSANDRA-12261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15396899#comment-15396899
]
Stefania commented on CASSANDRA-12261:
--------------------------------------
This test seems to have failed only once with this specific failure and it
cannot be reproduced locally. [~mambocab], are you aware of any other tests
with the same problem or a reasonably reliable way to reproduce this?
The sstable transaction tidier deletes the sstable data file, but at present it
does not check whether the data file exists. I've added this check and, if the
sstable was not new, a new error message that should tell us whether the sstable
tidier ran without a data file.
After all sstable tidiers have run, the transaction tidier does another pass to
ensure no content files are left before deleting the actual transaction log
file. Here we were missing a folder sync, so I think it's possible, if
unlikely, that {{record.getExistingFiles()}} returned a wrong result that
included files deleted just before by the sstable tidiers, although this
wouldn't explain why only data files were reported as non-existent. I've added
the folder sync call. Unless the same file is involved in two different
transactions, which shouldn't happen, there shouldn't be any race between the
call to {{record.getExistingFiles()}} and {{LogTransaction::delete}} in
{{LogFile.deleteRecordFiles()}}.
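To make the two changes above concrete, here is a minimal, self-contained
sketch in plain Java NIO. It is not the code from the patch: the class
{{TidierSketch}}, its method names, and the error message text are all
hypothetical, and the directory sync is a best-effort approximation that may
be a no-op on platforms that do not allow opening a directory for reading.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical illustration only; none of these names exist in Cassandra.
final class TidierSketch {

    // Deletes the data file only if it exists. If it is missing and the
    // sstable was not new, records an error instead of attempting the delete.
    // Returns true when a file was actually deleted.
    static boolean deleteDataFile(Path dataFile, boolean wasNew, List<String> errors)
            throws IOException {
        if (!Files.exists(dataFile)) {
            if (!wasNew) {
                errors.add("sstable tidier ran without data file: " + dataFile);
            }
            return false;
        }
        Files.delete(dataFile);
        return true;
    }

    // Best-effort directory sync: open the directory and force it to disk.
    // Some platforms reject this with an IOException, which is ignored here.
    static void trySyncDirectory(Path dir) {
        try (FileChannel channel = FileChannel.open(dir, StandardOpenOption.READ)) {
            channel.force(true);
        } catch (IOException e) {
            // Assumption: a failed sync is tolerable in this sketch.
        }
    }

    // Syncs the directory first, then lists the files whose names start with
    // the given prefix, mirroring the "second pass" over leftover files.
    static List<Path> existingFiles(Path dir, String prefix) throws IOException {
        trySyncDirectory(dir);
        List<Path> result = new ArrayList<>();
        try (Stream<Path> entries = Files.list(dir)) {
            entries.filter(p -> p.getFileName().toString().startsWith(prefix))
                   .forEach(result::add);
        }
        return result;
    }
}
```

In this sketch, calling {{deleteDataFile}} twice on the same path deletes the
file the first time and only records an error the second time (when
{{wasNew}} is false), which is roughly the diagnostic the new error message is
meant to provide.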
I've also added the exception call stack in LogTransaction.delete(), but only
to the debug log file.
I've prepared the patches for 3.9 and trunk:
||3.9||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/12261-3.9]|[patch|https://github.com/stef1927/cassandra/commits/12261]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12261-3.9-testall/]|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12261-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12261-3.9-dtest/]|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-12261-dtest/]|
This problem could technically happen in 3.0 as well, but I would rather not
disrupt 3.0 unless we are sure there is an issue in the first place.
> dtest failure in write_failures_test.TestWriteFailures.test_thrift
> ------------------------------------------------------------------
>
> Key: CASSANDRA-12261
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12261
> Project: Cassandra
> Issue Type: Bug
> Reporter: Philip Thompson
> Assignee: Stefania
> Labels: dtest
> Fix For: 3.x
>
> Attachments: node1.log, node1_debug.log, node1_gc.log, node2.log,
> node2_debug.log, node2_gc.log, node3.log, node3_debug.log, node3_gc.log
>
>
> example failure:
> http://cassci.datastax.com/job/cassandra-3.9_novnode_dtest/14/testReport/write_failures_test/TestWriteFailures/test_thrift
> Failure is
> {code}
> Unexpected error in node3 log, error:
> ERROR [NonPeriodicTasks:1] 2016-07-20 07:09:52,127 LogTransaction.java:205 -
> Unable to delete
> /tmp/dtest-CSPEFG/test/node3/data2/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-2-big-Data.db
> as it does not exist
> Unexpected error in node3 log, error:
> ERROR [NonPeriodicTasks:1] 2016-07-20 07:09:52,334 LogTransaction.java:205 -
> Unable to delete
> /tmp/dtest-CSPEFG/test/node3/data2/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-15-big-Data.db
> as it does not exist
> Unexpected error in node3 log, error:
> ERROR [NonPeriodicTasks:1] 2016-07-20 07:09:52,337 LogTransaction.java:205 -
> Unable to delete
> /tmp/dtest-CSPEFG/test/node3/data2/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-31-big-Data.db
> as it does not exist
> Unexpected error in node3 log, error:
> ERROR [NonPeriodicTasks:1] 2016-07-20 07:09:52,339 LogTransaction.java:205 -
> Unable to delete
> /tmp/dtest-CSPEFG/test/node3/data2/system_schema/tables-afddfb9dbc1e30688056eed6c302ba09/mb-18-big-Data.db
> as it does not exist
> {code}