[jira] [Comment Edited] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

Benedict (JIRA) Thu, 06 Mar 2014 14:00:09 -0800

    [ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13923125#comment-13923125 ]


Benedict edited comment on CASSANDRA-6285 at 3/6/14 9:57 PM:
-------------------------------------------------------------

So, I think there may be at least two races in the off-heap deallocation. I
suspect they are not the whole story, though, as these two races probably
won't cause the problem often. Both are predicated on the assumption that
thrift doesn't copy data out of the DirectByteBuffer that the hsha server
provides to it, so I could be wrong, but anyway:

1) Commit log (CL) appends can lag behind the memtable update and, as a
result, behind the acknowledgment of a successful write to the client. If the
CL record still references the ByteBuffer when it is freed, and that address
is then reused in another allocation, incorrect data will be written to the
commit log.
2) I believe thrift calls are two-stage. If this is the case, and the client
disconnects in between sending the first stage and receiving the result in the
second stage, the buffer could be freed whilst still in flight to the
memtable/CL.
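
To make race 1 concrete, here is a minimal single-threaded sketch in Java. It
is not Cassandra, thrift or disruptor code: LaggingLog, appendByReference,
appendByCopy and reuse are hypothetical names invented for illustration, and
overwriting the same direct buffer stands in for "the address is freed and
then reused in another allocation". It only shows why a delayed append that
holds a reference sees the wrong bytes, and why copying first would avoid it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: none of these names exist in Cassandra or thrift.
final class BufferReuseSketch {

    // Stand-in for a commit log whose appends lag behind the acknowledgment.
    static final class LaggingLog {
        private final Queue<ByteBuffer> pending = new ArrayDeque<>();

        // Queues a view of the caller's memory; no bytes are copied.
        void appendByReference(ByteBuffer record) {
            pending.add(record.duplicate());
        }

        // Queues a private heap copy that no longer depends on the source.
        void appendByCopy(ByteBuffer record) {
            ByteBuffer copy = ByteBuffer.allocate(record.remaining());
            copy.put(record.duplicate());
            copy.flip();
            pending.add(copy);
        }

        // Performs the delayed write and returns the bytes that reach the log.
        String flushNext() {
            ByteBuffer record = pending.remove();
            byte[] bytes = new byte[record.remaining()];
            record.get(bytes);
            return new String(bytes, StandardCharsets.US_ASCII);
        }
    }

    // Simulates the freed memory being handed to the next request.
    static void reuse(ByteBuffer requestBuffer, String nextRequest) {
        requestBuffer.clear();
        requestBuffer.put(nextRequest.getBytes(StandardCharsets.US_ASCII));
        requestBuffer.flip();
    }

    static String run(boolean copyBeforeQueueing) {
        ByteBuffer requestBuffer = ByteBuffer.allocateDirect(8);
        requestBuffer.put("key=AAAA".getBytes(StandardCharsets.US_ASCII));
        requestBuffer.flip();

        LaggingLog log = new LaggingLog();
        if (copyBeforeQueueing) {
            log.appendByCopy(requestBuffer);
        } else {
            log.appendByReference(requestBuffer);
        }

        // The write has been acknowledged, so the buffer is recycled
        // before the lagging append has actually happened.
        reuse(requestBuffer, "key=BBBB");

        return log.flushNext();
    }

    public static void main(String[] args) {
        System.out.println("by reference: " + run(false));
        System.out.println("by copy:      " + run(true));
    }
}
```

With the reference variant the delayed append reads the second request's
bytes; with the copy variant it still reads the first request's bytes. Race 2
would have the same shape, with the client disconnect playing the role of
reuse().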

These are just quick ideas for where it might be. I haven't familiarised
myself fully with thrift, the disruptor etc. to be certain whether they are
plausible, but they may turn out to be useful, so I thought I'd share.


was (Author: benedict):
So, I think there may potentially be at least two races in the off heap 
deallocation. I suspect that may not be everything, though, as these two races 
probably won't cause the problem often. These are predicated on the assumption 
that thrift doesn't copy data from the DirectByteBuffer that the hsha server 
provides to it, so could be wrong, but anyway:

1) CL appends can be lagged behind the memtable update and, as a result, the 
acknowledgment to the client of success writing. If the CL record contains the 
ByteBuffer when it is freed, and that data is then reused in another 
allocation, it will write incorrect data to the commit log.
2) I believe thrift calls are two stage. If this is the case, and the client 
disconnects in between sending the first stage and receiving the result in the 
second stage, the buffer could be freed whilst still in flight to the 
memtable/CL

These are just quick ideas for where it might be, I haven't familiarised myself 
fully with thrift, the disruptor etc. to be certain if these are plausible, but 
it may turn out to be useful so thought I'd share.

> 2.0 HSHA server introduces corrupt data
> ---------------------------------------
>
>                 Key: CASSANDRA-6285
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: 4 nodes, recently updated from 1.2.11 to 2.0.2
>            Reporter: David Sauer
>            Assignee: Pavel Yaskevich
>            Priority: Critical
>             Fix For: 2.0.6
>
>         Attachments: 6285_testnotes1.txt, 
> CASSANDRA-6285-disruptor-heap.patch, compaction_test.py
>
>
> After altering everything to LCS the table OpsCenter.rollups60 and one other 
> non-OpsCenter table got stuck with everything hanging around in L0.
> The compaction started and ran until the logs showed this:
> ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java 
> (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
> java.lang.RuntimeException: Last written key 
> DecoratedKey(1326283851463420237, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564)
>  >= current key DecoratedKey(954210699457429663, 
> 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f)
>  writing into 
> /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
>       at 
> org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
>       at 
> org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
>       at 
> org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
>       at 
> org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at 
> org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
>       at 
> org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
>       at 
> org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
>       at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>       at java.lang.Thread.run(Thread.java:724)
> Moving back to STC worked to keep the compactions running.
> I would especially like to move my own table to LCS.
> After a major compaction with STC the move to LCS fails with the same 
> exception.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
