PJ created CASSANDRA-7145:
-----------------------------
Summary: FileNotFoundException during compaction
Key: CASSANDRA-7145
URL: https://issues.apache.org/jira/browse/CASSANDRA-7145
Project: Cassandra
Issue Type: Bug
Environment: CentOS 6.3, Datastax Enterprise 4.0.1 (Cassandra 2.0.5),
Java 1.7.0_55
Reporter: PJ
Priority: Blocker
Attachments: compaction - FileNotFoundException.txt, repair -
RuntimeException.txt, startup - AssertionError.txt
I can't finish any compaction because my nodes always throw a
"FileNotFoundException". I've already tried the following but nothing helped:
1. nodetool flush
2. nodetool repair (ends with RuntimeException; see attachment)
3. node restart (via dse cassandra-stop)
Near the end of the startup process, another type of exception is logged
(see attachment), but the nodes still finish starting up and eventually come
online.
My questions now are:
1. Have I already lost data? I'm in the middle of migrating 4.8 billion rows
from MySQL and I'd like to know whether I should already abort and start over
2. What caused the sstable files to go missing?
3. How can I proceed with compaction and repair? Obviously, not being able to
do so would eventually lead to serious performance and data issues
Related StackOverflow question (mine):
http://stackoverflow.com/questions/23435847/filenotfoundexception-during-compaction
Notes:
1. I didn't drop and recreate the keyspace (so probably not related to
CASSANDRA-4857)
2. I use sstableloader for the migration. However, since it is designed to wait
for the secondary index build to complete before exiting, the overall
throughput becomes unacceptable. Because of this, I devised a mechanism that
kills the sstableloader process and cancels the secondary index build when the
bulk-loading total progress reaches 100%. I've done this more than 100 times
so far
3. There were times when I had to restart the nodes because the OS load reached
high levels. It's possible that compactions were in progress when I
restarted the nodes
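
For anyone trying to reproduce note 2, the kill-at-100% mechanism might look
roughly like the Python sketch below. This is not the actual script used: the
progress-line format matched by the regex, the helper names total_progress and
load_then_kill, and the use of "nodetool stop INDEX_BUILD" to cancel the index
build are all assumptions made for illustration.

```python
import re
import subprocess

# Assumed progress format, e.g. "... [total: 100% - 0MB/s (avg: 11MB/s)]".
TOTAL_RE = re.compile(r"total:\s*(\d+)\s*%?")


def total_progress(text):
    """Return the last 'total' percentage found in text, or None if absent."""
    matches = TOTAL_RE.findall(text)
    return int(matches[-1]) if matches else None


def load_then_kill(loader_cmd, cancel_cmd=("nodetool", "stop", "INDEX_BUILD")):
    """Run the loader, kill it once total progress reads 100%, then run cancel_cmd.

    loader_cmd is the full sstableloader command line as a list (hypothetical).
    cancel_cmd is an assumed way to stop the index build on the local node.
    """
    # bufsize=0 gives an unbuffered pipe, so read() returns whatever is available.
    proc = subprocess.Popen(loader_cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, bufsize=0)
    tail = ""
    while True:
        chunk = proc.stdout.read(4096)
        if not chunk:  # EOF: the loader exited on its own
            break
        # Keep a short tail so a progress line split across reads still matches.
        tail = (tail + chunk.decode(errors="replace"))[-512:]
        if total_progress(tail) == 100:
            proc.kill()
            break
    proc.wait()
    subprocess.call(list(cancel_cmd))
```

Output is read in raw chunks rather than line by line because the progress
display is assumed to be redrawn in place instead of ending each update with a
newline.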
--
This message was sent by Atlassian JIRA
(v6.2#6252)