Kishan Karunaratne created CASSANDRA-8902:
---------------------------------------------
Summary: Missing data files, database corruption
Key: CASSANDRA-8902
URL: https://issues.apache.org/jira/browse/CASSANDRA-8902
Project: Cassandra
Issue Type: Bug
Environment: ruby-driver 2.1.0 | C* 2.0.12
Reporter: Kishan Karunaratne
During a recent duration test run of the ruby-driver (and during a previous
run), many exceptions like the following were logged in system.log:
{noformat}
ERROR [CompactionExecutor:81] 2015-02-20 22:32:33,064 CassandraDaemon.java (line 199) Exception in thread Thread[CompactionExecutor:81,1,main]
java.lang.RuntimeException: java.io.FileNotFoundException: /srv/performance/cass/data/duration_test1/ints/duration_test1-ints-jb-39-Data.db (No such file or directory)
    at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:52)
    at org.apache.cassandra.io.sstable.SSTableReader.openDataReader(SSTableReader.java:1399)
    at org.apache.cassandra.io.sstable.SSTableScanner.<init>(SSTableScanner.java:67)
    at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1205)
    at org.apache.cassandra.io.sstable.SSTableReader.getScanner(SSTableReader.java:1217)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:272)
    at org.apache.cassandra.db.compaction.AbstractCompactionStrategy.getScanners(AbstractCompactionStrategy.java:278)
    at org.apache.cassandra.db.compaction.CompactionTask.runMayThrow(CompactionTask.java:131)
    at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
    at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
    at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
    at org.apache.cassandra.db.compaction.CompactionManager$BackgroundCompactionTask.run(CompactionManager.java:198)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Caused by: java.io.FileNotFoundException: /srv/performance/cass/data/duration_test1/ints/duration_test1-ints-jb-39-Data.db (No such file or directory)
    at java.io.RandomAccessFile.open(Native Method)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
    at org.apache.cassandra.io.util.RandomAccessReader.<init>(RandomAccessReader.java:58)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.<init>(CompressedRandomAccessReader.java:76)
    at org.apache.cassandra.io.compress.CompressedThrottledReader.<init>(CompressedThrottledReader.java:34)
    at org.apache.cassandra.io.compress.CompressedThrottledReader.open(CompressedThrottledReader.java:48)
    ... 17 more
{noformat}
I checked the data directory, and this specific Data.db file is indeed missing,
which suggests database corruption.
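To make the check above repeatable across many log entries, something like the following could be used. It is only a rough sketch: the function names `missing_files_in_log` and `still_missing` are made up for illustration, and the regular expression assumes the exception message has the same shape as in the trace above.

```python
import os
import re
from collections import Counter

# Assumes the message format seen in the trace above. \s+ also tolerates
# line wrapping between the exception name, the path, and the OS error text.
MISSING_RE = re.compile(
    r"FileNotFoundException:\s+(\S+)\s+\(No such file or directory\)"
)


def missing_files_in_log(log_text):
    """Return a Counter mapping each reported path to its number of mentions."""
    return Counter(MISSING_RE.findall(log_text))


def still_missing(paths):
    """Return the sorted subset of paths that do not exist on disk right now."""
    return sorted(p for p in paths if not os.path.exists(p))
```

Feeding the contents of system.log to `missing_files_in_log` and passing the resulting paths to `still_missing` on the affected node would list which reported files are actually absent, as opposed to files that were removed and later replaced.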
The duration test uses a 3-node cluster. The nodes also appear to have gone
out of sync. For example, running nodetool status against one node gives:
{noformat}
$ cassandra/bin/nodetool -h 10.240.61.210 status
--  Address         Load       Tokens  Owns   Host ID                               Rack
DN  10.240.210.69   533.32 MB  256     32.3%  2947fe5e-f149-4ff6-b26c-570ae72b7606  RAC1
DN  10.240.185.204  570.86 MB  256     36.7%  3a6e2152-c7dc-457a-a4c5-4c6f01986dd0  RAC1
UN  10.240.61.210   877.43 MB  256     31.0%  c3b1beff-9587-4851-85a9-05a9ba6deaff  RAC1
{noformat}
Running the same command against another node gives:
{noformat}
$ cassandra/bin/nodetool -h 10.240.210.69 status (or 10.240.185.204)
--  Address         Load       Tokens  Owns   Host ID                               Rack
UN  10.240.210.69   4.83 GB    256     32.3%  2947fe5e-f149-4ff6-b26c-570ae72b7606  RAC1
UN  10.240.185.204  4.88 GB    256     36.7%  3a6e2152-c7dc-457a-a4c5-4c6f01986dd0  RAC1
DN  10.240.61.210   877.43 MB  256     31.0%  c3b1beff-9587-4851-85a9-05a9ba6deaff  RAC1
{noformat}
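The disagreement between the two views can also be checked mechanically. The sketch below is an illustration only: `parse_status` and `disagreements` are hypothetical helper names, and the parser assumes each node row begins with the two-letter status/state code (U or D, then N, L, J or M) followed by the address, as in the output above.

```python
import re

# A node row starts with a two-letter code: U/D (up/down) followed by
# N/L/J/M (normal/leaving/joining/moving), then the node address.
ROW_RE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_status(output):
    """Map node address -> two-letter state from `nodetool status` text."""
    states = {}
    for line in output.splitlines():
        match = ROW_RE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def disagreements(view_a, view_b):
    """Return {address: (state_in_a, state_in_b)} where the two views differ."""
    return {
        addr: (view_a.get(addr), view_b.get(addr))
        for addr in sorted(set(view_a) | set(view_b))
        if view_a.get(addr) != view_b.get(addr)
    }
```

Applied to the two outputs above, every node would be reported as a disagreement: 10.240.61.210 sees itself as UN and the other two as DN, while the other two see each other as UN and 10.240.61.210 as DN.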