As I tried to say earlier, EBS snapshots need a lot of care; without it you
get corruption like what you have encountered.

Does Cassandra quiesce the file system after a snapshot using fsfreeze or
xfs_freeze? Somehow I doubt it...
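
To make the question concrete, here is a minimal sketch of the sequence I
have in mind: let nodetool snapshot return, flush dirty pages with sync,
freeze the filesystem, start the EBS snapshot, then unfreeze no matter what
happened. It is a sketch under assumptions, not a tested procedure: the
function name, mount point, volume ID and tag are all hypothetical, it
assumes fsfreeze (util-linux) and the AWS CLI are installed, and I am not
claiming it is what cures the EOFException quoted below.

```python
import subprocess


def frozen_ebs_snapshot(mountpoint, volume_id, tag, run=subprocess.check_call):
    """Take an EBS snapshot while the filesystem at `mountpoint` is frozen.

    `run` is called once per command with an argv list. It defaults to
    subprocess.check_call, which raises if a command exits non-zero.
    All arguments are illustrative placeholders, not values from this thread.
    """
    # Cassandra-level snapshot (hard links to the current SSTables).
    run(["nodetool", "snapshot", "-t", tag])
    # Ask the kernel to write dirty pages out before freezing.
    run(["sync"])
    # Block new writes so the on-disk state stops changing.
    run(["fsfreeze", "--freeze", mountpoint])
    try:
        # Start the block-level snapshot of the underlying EBS volume.
        run(["aws", "ec2", "create-snapshot",
             "--volume-id", volume_id,
             "--description", tag])
    finally:
        # Always thaw, even if the snapshot call failed, or writes stay blocked.
        run(["fsfreeze", "--unfreeze", mountpoint])
```

Passing run=print instead of the default gives a dry run that only lists
the commands. The freeze blocks every write to that filesystem, Cassandra's
included, so the window between freeze and unfreeze should be kept as short
as possible.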


On Fri, Mar 28, 2014 at 4:17 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

> I have a nagging memory of reading about issues with virtualization and
> not actually having durable versions of your data even after an fsync
> (within the VM).  Googling around led me to this post:
> http://petercai.com/virtualization-is-bad-for-database-integrity/
>
> It's possible you're hitting this issue, either with the virtualization
> layer or with EBS itself.  Just a shot in the dark though; other people
> would likely know much more than I.
>
>
>
> On Fri, Mar 28, 2014 at 12:50 PM, Russ Lavoie <ussray...@yahoo.com> wrote:
>
>> Robert,
>>
>> That is what I thought as well.  But apparently something is happening.
>> The only way I can get this to work is to add a sleep 60 right after
>> nodetool snapshot is executed.  Without that sleep, I can reproduce the
>> problem 100% of the time.
>>
>> This is the error.
>>
>> ERROR [SSTableBatchOpen:1] 2014-03-28 17:08:14,290 CassandraDaemon.java (line 191) Exception in thread Thread[SSTableBatchOpen:1,5,main]
>> org.apache.cassandra.io.sstable.CorruptSSTableException: java.io.EOFException
>>     at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:108)
>>     at org.apache.cassandra.io.compress.CompressionMetadata.create(CompressionMetadata.java:63)
>>     at org.apache.cassandra.io.util.CompressedPoolingSegmentedFile$Builder.complete(CompressedPoolingSegmentedFile.java:42)
>>     at org.apache.cassandra.io.sstable.SSTableReader.load(SSTableReader.java:407)
>>     at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:198)
>>     at org.apache.cassandra.io.sstable.SSTableReader.open(SSTableReader.java:157)
>>     at org.apache.cassandra.io.sstable.SSTableReader$1.run(SSTableReader.java:262)
>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>     at java.lang.Thread.run(Thread.java:744)
>> Caused by: java.io.EOFException
>>     at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:340)
>>     at java.io.DataInputStream.readUTF(DataInputStream.java:589)
>>     at java.io.DataInputStream.readUTF(DataInputStream.java:564)
>>     at org.apache.cassandra.io.compress.CompressionMetadata.<init>(CompressionMetadata.java:83)
>>     ... 11 more
>>
>>
>> On Friday, March 28, 2014 2:38 PM, Robert Coli <rc...@eventbrite.com> wrote:
>>
>> On Fri, Mar 28, 2014 at 12:21 PM, Russ Lavoie <ussray...@yahoo.com> wrote:
>>
>> Thank you for your quick response.
>>
>> Is there a way to tell when a snapshot is completely done?
>>
>>
>> IIRC, the JMX call blocks until the snapshot completes. It should be done
>> when nodetool returns.
>>
>> =Rob
>>
>>
>>
>
>
> --
> Jon Haddad
> http://www.rustyrazorblade.com
> skype: rustyrazorblade
>
