[
https://issues.apache.org/jira/browse/CASSANDRA-10534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14996319#comment-14996319
]
Fredrik Larsson Stigbäck commented on CASSANDRA-10534:
------------------------------------------------------
Will this one be included in 2.1.12?
> CompressionInfo not being fsynced on close
> ------------------------------------------
>
> Key: CASSANDRA-10534
> URL: https://issues.apache.org/jira/browse/CASSANDRA-10534
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sharvanath Pathak
> Assignee: Stefania
> Fix For: 2.1.x
>
>
> I was seeing SSTable corruption due to a CompressionInfo.db file of size 0,
> this happened multiple times in our testing with hard node reboots. After
> some investigation it seems like these file is not being fsynced, and that
> can potentially lead to data corruption. I am working with version 2.1.9.
> I checked for fsync calls using strace, and found them happening for all but
> the following components: CompressionInfo, TOC.txt and digest.sha1. All of
> these but the CompressionInfo seem tolerable. Also a quick look through the
> code did not reveal any fsync calls. Moreover, I suspect the commit
> 4e95953f29d89a441dfe06d3f0393ed7dd8586df
> (https://github.com/apache/cassandra/commit/4e95953f29d89a441dfe06d3f0393ed7dd8586df#diff-b7e48a1398e39a936c11d0397d5d1966R344)
> has caused the regression, which removed the line
> {noformat}
> getChannel().force(true);
> {noformat}
> from CompressionMetadata.Writer.close.
> Following is the trace I saw in system.log:
> {noformat}
> INFO [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 -
> Opening
> /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368
> (79 bytes)
> ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 -
> Exiting forcefully due to file system exception on startup, disk failure
> policy "stop"
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
> at
> org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> [na:1.7.0_80]
> at java.util.concurrent.FutureTask.run(FutureTask.java:262)
> [na:1.7.0_80]
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> [na:1.7.0_80]
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> [na:1.7.0_80]
> at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.io.EOFException: null
> at
> java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
> ~[na:1.7.0_80]
> at java.io.DataInputStream.readUTF(DataInputStream.java:589)
> ~[na:1.7.0_80]
> at java.io.DataInputStream.readUTF(DataInputStream.java:564)
> ~[na:1.7.0_80]
> at
> org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106)
> ~[apache-cassandra-2.1.9.jar:2.1.9]
> ... 14 common frames omitted
> {noformat}
> Following is the result of ls on the data directory of a corrupted SSTable
> after the hard reboot:
> {noformat}
> $ ls -l
> /var/lib/cassandra/data/system/sstable_activity-5a1ff267ace03f128563cfae6103c65e/
> total 60
> -rw-r--r-- 1 cassandra cassandra 0 Oct 15 09:31
> system-sstable_activity-ka-1-CompressionInfo.db
> -rw-r--r-- 1 cassandra cassandra 9740 Oct 15 09:31
> system-sstable_activity-ka-1-Data.db
> -rw-r--r-- 1 cassandra cassandra 0 Oct 15 09:31
> system-sstable_activity-ka-1-Digest.sha1
> -rw-r--r-- 1 cassandra cassandra 880 Oct 15 09:31
> system-sstable_activity-ka-1-Filter.db
> -rw-r--r-- 1 cassandra cassandra 34000 Oct 15 09:31
> system-sstable_activity-ka-1-Index.db
> -rw-r--r-- 1 cassandra cassandra 7338 Oct 15 09:31
> system-sstable_activity-ka-1-Statistics.db
> -rw-r--r-- 1 cassandra cassandra 0 Oct 15 09:31
> system-sstable_activity-ka-1-TOC.txt
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)