[ 
https://issues.apache.org/jira/browse/CASSANDRA-10479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950066#comment-14950066
 ] 

Sharvanath Pathak commented on CASSANDRA-10479:
-----------------------------------------------

Deleting corrupt SSTables is, in general, a bad idea, and I agree that in your
example I wouldn't be happy if the SSTable were deleted automatically. However,
one approach is to maintain a persistent list of the SSTables that are
currently being flushed, and to remove each SSTable from that list before
marking the corresponding commitlogs as non-dirty for that column family. In
that case it is perfectly valid to delete any SSTables still on the list at
bootup, since all of their data is still present in the commitlogs. Running
Cassandra without such a feature will require manual intervention after most
node crashes, which is pretty bad for any system that claims to be tolerant
of node crashes.
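
To make the proposal concrete, here is a minimal sketch of such a persistent
list. It is not existing Cassandra code: the class InFlightFlushJournal, its
method names, the one-name-per-line journal file, and the assumption that all
component files of an SSTable share the prefix "<name>-" are hypothetical and
only illustrate the ordering described above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical on-disk journal of SSTables whose flush has started but not finished. */
final class InFlightFlushJournal {
    private final Path journalFile;
    private final Set<String> inFlight = new LinkedHashSet<>();

    /** Loads whatever the previous process left behind (empty if no journal exists). */
    InFlightFlushJournal(Path journalFile) throws IOException {
        this.journalFile = journalFile;
        if (Files.exists(journalFile)) {
            for (String line : Files.readAllLines(journalFile, StandardCharsets.UTF_8)) {
                if (!line.trim().isEmpty()) {
                    inFlight.add(line.trim());
                }
            }
        }
    }

    /** Call before the first byte of the SSTable is written. */
    synchronized void flushStarted(String sstableName) throws IOException {
        inFlight.add(sstableName);
        persist();
    }

    /** Call once the SSTable is complete, BEFORE the commitlog is marked non-dirty. */
    synchronized void flushCompleted(String sstableName) throws IOException {
        inFlight.remove(sstableName);
        persist();
    }

    /** On bootup: delete the files of every SSTable still listed, then clear the journal. */
    synchronized List<Path> deleteIncomplete(Path dataDir) throws IOException {
        List<Path> deleted = new ArrayList<>();
        for (String name : inFlight) {
            // Assumption: component files are named "<sstableName>-<Component>".
            try (DirectoryStream<Path> files = Files.newDirectoryStream(dataDir, name + "-*")) {
                for (Path file : files) {
                    Files.delete(file);
                    deleted.add(file);
                }
            }
        }
        inFlight.clear();
        persist();
        return deleted;
    }

    /** Rewrites the journal via a temp file and a rename so readers never see a partial list. */
    private void persist() throws IOException {
        Path tmp = journalFile.resolveSibling(journalFile.getFileName() + ".tmp");
        Files.write(tmp, inFlight, StandardCharsets.UTF_8);
        // A real implementation would also have to fsync the file and its directory.
        Files.move(tmp, journalFile,
                   StandardCopyOption.ATOMIC_MOVE, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The ordering is what matters: because a name leaves the list only before the
commitlog is marked non-dirty, any SSTable still listed after a crash is
covered by commitlog replay, so deleting it loses nothing.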

> Handling partially written sstables on node crashes
> ---------------------------------------------------
>
>                 Key: CASSANDRA-10479
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10479
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sharvanath Pathak
>
> Currently, a power loss can require manual intervention to bring Cassandra
> back up. Partially written SSTables are treated as corrupt, and we see the
> following trace quite often on hard reboots:
> {noformat}
> INFO  [SSTableBatchOpen:1] 2015-09-29 19:24:39,170 SSTableReader.java:478 - Opening /var/lib/cassandra/data/system/compactions_in_progress-55080ab05d9c388690a4acb25fe1f77b/system-compactions_in_progress-ka-13368 (79 bytes)
> ERROR [SSTableBatchOpen:1] 2015-09-29 19:24:39,177 FileUtils.java:447 - Exiting forcefully due to file system exception on startup, disk failure policy "stop"
> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:131) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:85) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.CompressedSegmentedFile$Builder.metadata(CompressedSegmentedFile.java:79) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:72) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.util.SegmentedFile$Builder.complete(SegmentedFile.java:168) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:752) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:703) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:491) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:387) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at org.apache.cassandra.io.sstable.SSTableReader$4.run(SSTableReader.java:534) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_80]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:262) [na:1.7.0_80]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_80]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_80]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_80]
> Caused by: java.io.EOFException: null
>         at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340) ~[na:1.7.0_80]
>         at java.io.DataInputStream.readUTF(DataInputStream.java:589) ~[na:1.7.0_80]
>         at java.io.DataInputStream.readUTF(DataInputStream.java:564) ~[na:1.7.0_80]
>         at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:106) ~[apache-cassandra-2.1.9.jar:2.1.9]
>         ... 14 common frames omitted
> {noformat}
> Deleting partially written SSTables might be a perfectly valid thing to do
> (given that the data is present in the commitlogs).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)