[jira] [Commented] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-03-07 Thread Miles Shang (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924330#comment-13924330 ]

Miles Shang commented on CASSANDRA-6285:


To add to [~rbranson]'s input, we're also seeing the same stacktrace as 
[~mshuler] (TimeUUID MarshalException). I inspected the row mutations that 
caused it. Three ranges were nonsensical: the key, the column name, and the 
value. By nonsensical, I mean that they don't match my expectation of what we 
are inserting in production data. All other ranges seemed fine (timestamps, 
masks, sizes, cfid). The key, column name, and value were read successfully, so 
their length metadata was good. For our data, the column comparator is 
TimeUUID. Our client library is pycassa. Whereas pycassa generates tuuids like 
this: 913d7fea-a631-11e3-8080-808080808080, the nonsensical column names look 
like this: 22050aa4-de11-e380-8080-80808080800b and this: 
10c326eb-86a4-e211-e380-808080808080. Most are of the first form. By shifting 
these nonsensical tuuids to the left or right by 2 octets, you get a reasonable 
tuuid. I don't have a similar insight into the nonsensical keys and values, but 
they could also be left or right shifted.
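The byte-shift hypothesis above can be sketched in Python (pycassa is a Python client). This is an illustration, not pycassa or Cassandra code; the filler bytes 0x22 and 0x0b are arbitrary stand-ins for whatever stray bytes the corruption introduced, chosen only so the output resembles the observed column names:

```python
import uuid

# A pycassa-style TimeUUID like the valid ones observed in production.
good = uuid.UUID("913d7fea-a631-11e3-8080-808080808080")

def shift_left(u, n):
    """Drop the first n serialized octets and pad on the right
    with a hypothetical stray byte (0x0b)."""
    return uuid.UUID(bytes=u.bytes[n:] + b"\x0b" * n)

def shift_right(u, n):
    """Pad on the left with a hypothetical stray byte (0x22)
    and drop the last n serialized octets."""
    return uuid.UUID(bytes=b"\x22" * n + u.bytes[:-n])

# A one-octet shift moves the '11e3' version/timestamp nibbles out of
# place, producing the same shapes as the nonsensical column names:
print(shift_left(good, 1))   # ...-11e3 becomes ...11-e380, tail ends in 0b
print(shift_right(good, 1))  # ...-11e3 slides one octet to the right
```

The left shift reproduces the structure of the first (more common) form, e.g. 22050aa4-de11-e380-8080-80808080800b, and the right shift reproduces the second, e.g. 10c326eb-86a4-e211-e380-808080808080.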

 2.0 HSHA server introduces corrupt data
 ---

 Key: CASSANDRA-6285
 URL: https://issues.apache.org/jira/browse/CASSANDRA-6285
 Project: Cassandra
  Issue Type: Bug
  Components: Core
 Environment: 4 nodes, shortly updated from 1.2.11 to 2.0.2
Reporter: David Sauer
Assignee: Pavel Yaskevich
Priority: Critical
 Fix For: 2.0.6

 Attachments: 6285_testnotes1.txt, 
 CASSANDRA-6285-disruptor-heap.patch, compaction_test.py


 After switching everything to LCS, the table OpsCenter.rollups60 and one other 
 non-OpsCenter table got stuck with everything hanging around in L0.
 The compaction started and ran until the logs showed this:
 ERROR [CompactionExecutor:111] 2013-11-01 19:14:53,865 CassandraDaemon.java (line 187) Exception in thread Thread[CompactionExecutor:111,1,RMI Runtime]
 java.lang.RuntimeException: Last written key DecoratedKey(1326283851463420237, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574426c6f6f6d46696c746572537061636555736564) >= current key DecoratedKey(954210699457429663, 37382e34362e3132382e3139382d6a7576616c69735f6e6f72785f696e6465785f323031335f31305f30382d63616368655f646f63756d656e74736c6f6f6b75702d676574546f74616c4469736b5370616365557365640b0f) writing into /var/lib/cassandra/data/OpsCenter/rollups60/OpsCenter-rollups60-tmp-jb-58656-Data.db
   at org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:141)
   at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:164)
   at org.apache.cassandra.db.compaction.CompactionTask.runWith(CompactionTask.java:160)
   at org.apache.cassandra.io.util.DiskAwareRunnable.runMayThrow(DiskAwareRunnable.java:48)
   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at org.apache.cassandra.db.compaction.CompactionTask.executeInternal(CompactionTask.java:60)
   at org.apache.cassandra.db.compaction.AbstractCompactionTask.execute(AbstractCompactionTask.java:59)
   at org.apache.cassandra.db.compaction.CompactionManager$6.runMayThrow(CompactionManager.java:296)
   at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28)
   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
   at java.lang.Thread.run(Thread.java:724)
 Moving back to STCS kept the compactions running.
 My own table is the one I would especially like to move to LCS.
 After a major compaction with STCS, the move to LCS fails with the same 
 exception.
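 The invariant that the exception above enforces can be sketched as follows. This is a minimal illustration, not Cassandra's actual SSTableWriter code: an SSTable writer requires rows to arrive in strictly ascending (token, key) order, and aborts the compaction otherwise.

```python
class SSTableWriterSketch:
    """Toy model of the beforeAppend ordering check."""

    def __init__(self):
        self.last_key = None  # (token, key) of the previously written row

    def before_append(self, token, key):
        # Keys must be strictly ascending; otherwise the output
        # SSTable would not be sorted, so the writer bails out.
        if self.last_key is not None and (token, key) <= self.last_key:
            raise RuntimeError(
                f"Last written key {self.last_key} >= current key {(token, key)}")
        self.last_key = (token, key)

w = SSTableWriterSketch()
w.before_append(954210699457429663, "k1")   # ascending tokens: fine
w.before_append(1326283851463420237, "k2")
# A smaller token next, as in the log above, would raise RuntimeError.
```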



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Comment Edited] (CASSANDRA-6285) 2.0 HSHA server introduces corrupt data

2014-03-07 Thread Miles Shang (JIRA)

[ https://issues.apache.org/jira/browse/CASSANDRA-6285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13924330#comment-13924330 ]

Miles Shang edited comment on CASSANDRA-6285 at 3/7/14 9:29 PM:


To add to [~rbranson]'s input, we're also seeing the same stacktrace as 
[~mshuler] (TimeUUID MarshalException). I inspected the row mutations that 
caused it. Three ranges were nonsensical: the key, the column name, and the 
value. By nonsensical, I mean that they don't match my expectation of what we 
are inserting in production data. All other ranges seemed fine (timestamps, 
masks, sizes, cfid). The key, column name, and value were read successfully, so 
their length metadata was good. For our data, the column comparator is 
TimeUUID. Our client library is pycassa. Whereas pycassa generates tuuids like 
this: 913d7fea-a631-11e3-8080-808080808080, the nonsensical column names look 
like this: 22050aa4-de11-e380-8080-80808080800b and this: 
10c326eb-86a4-e211-e380-808080808080. Most are of the first form. By shifting 
these nonsensical tuuids to the left or right by an octet, you get a reasonable 
tuuid. I don't have a similar insight into the nonsensical keys and values, but 
they could also be left or right shifted.


was (Author: mshang):
To add to [~rbranson]'s input, we're also seeing the same stacktrace as 
[~mshuler] (TimeUUID MarshalException). I inspected the row mutations that 
caused it. Three ranges were nonsensical: the key, the column name, and the 
value. By nonsensical, I mean that they don't match my expectation of what we 
are inserting in production data. All other ranges seemed fine (timestamps, 
masks, sizes, cfid). The key, column name, and value were read successfully, so 
their length metadata was good. For our data, the column comparator is 
TimeUUID. Our client library is pycassa. Whereas pycassa generates tuuids like 
this: 913d7fea-a631-11e3-8080-808080808080, the nonsensical column names look 
like this: 22050aa4-de11-e380-8080-80808080800b and this: 
10c326eb-86a4-e211-e380-808080808080. Most are of the first form. By shifting 
these nonsensical tuuids to the left or right by 2 octets, you get a reasonable 
tuuid. I don't have a similar insight into the nonsensical keys and values, but 
they could also be left or right shifted.
