Hi everyone,

We are running several Cassandra clusters (usually 5 nodes each, with a 
replication factor of 3), and at least once per day we see corruption 
affecting a specific sstable in system/hints. We are using Cassandra version 
1.2.16 on RHEL 6.5.

Here is an example of such an exception:


ERROR [CompactionExecutor:1694] 2014-06-08 21:37:33,267 CassandraDaemon.java (line 191) Exception in thread Thread[CompactionExecutor:1694,1,main]
org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)
        at org.apache.cassandra.db.compaction.CompactionManager$7.runMayThrow(CompactionManager.java:442)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:123)
        ... 23 more

 INFO [HintedHandoff:35] 2014-06-08 21:37:33,267 HintedHandOffManager.java (line 296) Started hinted handoff for host: 502a48cd-171b-4e83-a9ad-67f32437353a with IP: /10.210.239.190
ERROR [HintedHandoff:33] 2014-06-08 21:37:33,267 CassandraDaemon.java (line 191) Exception in thread Thread[HintedHandoff:33,1,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
        at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:441)
        at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:282)
        at org.apache.cassandra.db.HintedHandOffManager.access$300(HintedHandOffManager.java:90)
        at org.apache.cassandra.db.HintedHandOffManager$4.run(HintedHandOffManager.java:508)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
        at java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.util.concurrent.FutureTask.get(FutureTask.java:188)
        at org.apache.cassandra.db.HintedHandOffManager.doDeliverHintsToEndpoint(HintedHandOffManager.java:437)
        ... 6 more
Caused by: org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:167)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:83)
        at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:69)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:180)
        at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:155)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:142)
        at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:38)
        at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:145)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
        at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
        at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
        at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
        at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:145)
        at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
        at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
        at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:58)
        at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:60)


Our current filesystem configuration for Cassandra (nothing fancy):


/dev/sda6            /home/y/var/cassandra/commitlog ext4     defaults,commit=20,noatime,nobarrier,nodiratime   0 0

/dev/sda7            /home/y/var/cassandra/data ext4     defaults,commit=20,data=writeback,noatime,nobarrier,nodiratime   0 0
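
For anyone skimming the mount options, here is a small sketch that pulls out the ones that relax ext4's write ordering or flushing. The shortlist in RELAXED is my own and not exhaustive, and I do not know whether these options are related to the corruption; the function name is just for illustration.

```python
# The two fstab entries above, joined back onto single lines.
FSTAB = """\
/dev/sda6 /home/y/var/cassandra/commitlog ext4 defaults,commit=20,noatime,nobarrier,nodiratime 0 0
/dev/sda7 /home/y/var/cassandra/data ext4 defaults,commit=20,data=writeback,noatime,nobarrier,nodiratime 0 0
"""

# Assumption: a hand-picked shortlist of options that trade crash safety
# for speed on ext4. It is not a complete list.
RELAXED = {"nobarrier", "data=writeback"}


def relaxed_options(fstab_text):
    """Map each mount point to the relaxed-durability options it uses."""
    result = {}
    for line in fstab_text.splitlines():
        fields = line.split()
        # Skip blank lines, comments, and anything too short to be an entry.
        if len(fields) < 4 or fields[0].startswith("#"):
            continue
        mount_point, options = fields[1], fields[3].split(",")
        result[mount_point] = [opt for opt in options if opt in RELAXED]
    return result


for mount_point, opts in relaxed_options(FSTAB).items():
    print(mount_point, opts)
```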



The workaround we have right now is the following:


1- Delete the “guilty” sstable, in this case: 
/home/y/var/cassandra/data/system/hints/system-hints-ic-281*

2- Issue a major compaction for system/hints: nodetool compact system hints

3- Repeat for all the sstables producing this issue.
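
To find the sstables for step 3, something like the following sketch could be used to pull the affected Data.db paths out of the log text. It only assumes the message format shown in the exception above; the function name and the sample string are illustrative, and it deliberately does not delete anything or call nodetool.

```python
import re

# Matches the "dataSize of ... would be larger than file ... length ..."
# message from the exception above. \s+ tolerates line wrapping in
# pasted logs.
CORRUPT_RE = re.compile(
    r"dataSize\s+of\s+\d+\s+starting\s+at\s+\d+\s+"
    r"would\s+be\s+larger\s+than\s+file\s+(\S+-Data\.db)\s+length\s+\d+"
)


def find_corrupt_sstables(log_text):
    """Return the sorted, de-duplicated Data.db paths named in the errors."""
    return sorted({m.group(1) for m in CORRUPT_RE.finditer(log_text)})


# Sample input: the message from this thread, repeated as it is in the log.
sample = (
    "Caused by: java.io.IOException: dataSize of 8224262783474088549 "
    "starting at 502360510 would be larger than file "
    "/home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db "
    "length 504590769\n"
) * 3

for path in find_corrupt_sstables(sample):
    print(path)
```

In practice the input would be the contents of the Cassandra log file rather than the sample string.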



My biggest worry here is the following message:


 org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.IOException: dataSize of 8224262783474088549 starting at 502360510 would be larger than file /home/y/var/cassandra/data/system/hints/system-hints-ic-281-Data.db length 504590769
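
One thing that stands out is the dataSize value itself. Assuming the size field is read as an 8-byte big-endian long (which is what Java's DataInput.readLong() does), the bytes behind that number can be printed, along with how much of the file is actually left after the reported offset. This is only a sanity check on the numbers in the message, not a diagnosis.

```python
# Values copied from the exception message above.
data_size = 8224262783474088549
start_offset = 502360510
file_length = 504590769

# Assumption: the size was read as a big-endian 8-byte long, as Java's
# DataInput.readLong() would do. Show the raw bytes behind the number.
raw = data_size.to_bytes(8, byteorder="big")
print(raw)  # b'r"smtp:e'

# Bytes actually left in the file after the reported start offset.
remaining = file_length - start_offset
print(remaining)  # 2230259
```

The eight bytes are printable ASCII, which suggests the reader may be interpreting part of a row's payload as a length field, although I cannot confirm that from the log alone.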



Any clues as to why this is happening?



Thanks,


FR






