Hi all,

We recently put a new C* 2.0.0 cluster into production, with 3 nodes and RF 3.

After 7 days, we saw this error in the log on one node:

ERROR [FlushWriter:216] 2013-10-07 07:11:46,538 CassandraDaemon.java (line 186) Exception in thread Thread[FlushWriter:216,5,main]
java.lang.AssertionError
        at org.apache.cassandra.io.sstable.SSTableWriter.rawAppend(SSTableWriter.java:198)
        at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:186)
        at org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents(Memtable.java:417)
        at org.apache.cassandra.db.Memtable$FlushRunnable.runWith(Memtable.java:376)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)


$ nodetool tpstats
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                         0         0       20102712         0                 0
RequestResponseStage              0         0       12186090         0                 0
MutationStage                     0         0       17999146         0                 0
ReadRepairStage                   0         0        1013914         0                 0
ReplicateOnWriteStage             0         0          77415         0                 0
GossipStage                       0         0         234894         0                 0
AntiEntropyStage                  0         0              0         0                 0
MigrationStage                    0         0              4         0                 0
MemoryMeter                       0         0            475         0                 0
MemtablePostFlusher               1        59            210         0                 0
FlushWriter                       0         0            248         0               155
MiscStage                         0         0              0         0                 0
commitlog_archiver                0         0              0         0                 0
InternalResponseStage             0         0              3         0                 0
HintedHandoff                     0         0              5         0                 0
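
The rows that stand out in the tpstats output above are MemtablePostFlusher (1 active, 59 pending) and FlushWriter (155 all time blocked). As a rough way to spot such rows automatically, here is a minimal Python sketch. It assumes each data row has the pool name followed by five integer columns, as in the output above. The function name flag_pools and the SAMPLE string are illustrative only and are not part of Cassandra or nodetool.

```python
def flag_pools(tpstats_text):
    """Return (pool, pending, all_time_blocked) for pools that have
    pending tasks or a non-zero "All time blocked" count."""
    flagged = []
    for line in tpstats_text.splitlines():
        parts = line.split()
        # Data rows are assumed to be: name + five integer columns.
        # The header line and blank lines do not match and are skipped.
        if len(parts) != 6 or not all(p.isdigit() for p in parts[1:]):
            continue
        name = parts[0]
        active, pending, completed, blocked, all_time_blocked = map(int, parts[1:])
        if pending > 0 or all_time_blocked > 0:
            flagged.append((name, pending, all_time_blocked))
    return flagged


# A few rows copied from the output above, used as sample input.
SAMPLE = """\
Pool Name                    Active   Pending      Completed   Blocked  All time blocked
MutationStage                     0         0       17999146         0                 0
MemtablePostFlusher               1        59            210         0                 0
FlushWriter                       0         0            248         0               155
"""

for name, pending, all_time_blocked in flag_pools(SAMPLE):
    print(f"{name}: pending={pending}, all_time_blocked={all_time_blocked}")
```

In practice the text passed to flag_pools would be the captured output of nodetool tpstats rather than a hard-coded sample.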

$ nodetool compactionstats
pending tasks: 72
Active compaction remaining time :        n/a

A few days later, the other two nodes hit the same error. It seems that
the active thread on MemtablePostFlusher blocks all compactions, and I
cannot take a snapshot or drain the nodes. I tried rebooting the servers
one by one; that works for a while, but after a few days the error shows
up in the log again. I have also seen data loss after the reboots. I no
longer know how to fix the problem. I think the solution would be to
install another cluster and copy the data over, unless one of you has had
this problem and can guide me.

Thanks a lot,
