[ https://issues.apache.org/jira/browse/CASSANDRA-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14222950#comment-14222950 ]
Nicolas Lalevée edited comment on CASSANDRA-8356 at 11/24/14 12:50 PM:
-----------------------------------------------------------------------
I got the snapshot data from a node onto my local machine, and I tried to load it into a local Cassandra 2.0.11 node. The node "opened" the files correctly, but querying against it is impossible; I hit the following error:
{noformat}
ERROR 11:28:45,693 Exception in thread Thread[ReadStage:2,5,main]
java.lang.RuntimeException: java.lang.IllegalArgumentException
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1981)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException
	at java.nio.Buffer.limit(Buffer.java:267)
	at org.apache.cassandra.utils.ByteBufferUtil.readBytes(ByteBufferUtil.java:587)
	at org.apache.cassandra.utils.ByteBufferUtil.readBytesWithShortLength(ByteBufferUtil.java:596)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:61)
	at org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:1)
	at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:436)
	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:141)
	at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:113)
	at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:202)
	at org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:54)
	at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:155)
	at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:168)
	at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:140)
	at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:144)
	at org.apache.cassandra.utils.MergeIterator$ManyToOne.<init>(MergeIterator.java:87)
	at org.apache.cassandra.utils.MergeIterator.get(MergeIterator.java:46)
	at org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:120)
	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:80)
	at org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:72)
	at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:297)
	at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:56)
	at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1547)
	at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1376)
	at org.apache.cassandra.db.Keyspace.getRow(Keyspace.java:333)
	at org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:65)
	at org.apache.cassandra.service.StorageProxy$LocalReadRunnable.runMayThrow(StorageProxy.java:1413)
	at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1977)
	... 3 more
{noformat}
This reminded me of an error we had on our test cluster when we tested the upgrade to 2.0.x: CASSANDRA-6733. So I ran upgradesstables on our production cluster, and now the slice queries return all the expected data. Everything is back to normal (and I am very pleased by the lower CPU activity with 2.0.x for the same load).
I looked at the production logs again, and I still don't see any such Buffer.limit errors, so I don't know what was going wrong. As for CASSANDRA-6733, I have a snapshot of the data taken before running upgradesstables (unfortunately I don't have a snapshot from before the version upgrade itself, but some sstables are still in the old format). If someone wants the data to analyse it, contact me: nlalevee at scoop.it.
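For readers of the trace above: the failure surfaces in {{ByteBufferUtil.readBytesWithShortLength}}, which decodes one component of a composite cell name by reading an unsigned 16-bit length prefix and then narrowing the buffer's limit to that many bytes. The sketch below is an assumption-laden simplification, not Cassandra's actual code (the class name {{ShortLengthDemo}} is made up): it only illustrates why a length prefix that exceeds the bytes actually remaining, as can happen when cell names are read under the wrong format, makes {{Buffer.limit}} throw the {{IllegalArgumentException}} seen here.

```java
import java.nio.ByteBuffer;

public class ShortLengthDemo {

    // Simplified stand-in for ByteBufferUtil.readBytesWithShortLength:
    // read a 2-byte unsigned length prefix, then slice out that many bytes.
    static ByteBuffer readBytesWithShortLength(ByteBuffer bb) {
        int length = bb.getShort() & 0xFFFF;   // unsigned short length prefix
        ByteBuffer copy = bb.duplicate();
        // Buffer.limit() throws IllegalArgumentException when the new limit
        // exceeds the buffer's capacity, i.e. when the prefix lies about
        // how many bytes follow it.
        copy.limit(copy.position() + length);
        bb.position(bb.position() + length);
        return copy.slice();
    }

    public static void main(String[] args) {
        // Well-formed component: prefix says 3 bytes, and 3 bytes follow.
        ByteBuffer ok = ByteBuffer.wrap(new byte[]{0, 3, 'a', 'b', 'c'});
        System.out.println(readBytesWithShortLength(ok).remaining()); // 3

        // Corrupt component: prefix claims 100 bytes, but only 2 remain.
        ByteBuffer bad = ByteBuffer.wrap(new byte[]{0, 100, 'x', 'y'});
        try {
            readBytesWithShortLength(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("IllegalArgumentException"); // as in the trace
        }
    }
}
```

Running upgradesstables rewrites the old-format sstables, which would explain why the error disappeared afterwards.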
> Slice query on a super column family with counters doesn't get all the data
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-8356
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8356
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Nicolas Lalevée
>            Assignee: Aleksey Yeschenko
>             Fix For: 2.0.12
>
>
> We've finally been able to upgrade our cluster to 2.0.11, after CASSANDRA-7188 was fixed.
> But now slice queries on a super column family with counters don't return all the expected data. Because of all the trouble we had, we first thought that we had lost data, but there is a way to actually get the data, so nothing is lost; it's just that Cassandra seems to incorrectly skip it.
> See the following CQL log:
> {noformat}
> cqlsh:Theme> desc table theme_view;
>
> CREATE TABLE theme_view (
>   key bigint,
>   column1 varint,
>   column2 text,
>   value counter,
>   PRIMARY KEY ((key), column1, column2)
> ) WITH COMPACT STORAGE AND
>   bloom_filter_fp_chance=0.010000 AND
>   caching='KEYS_ONLY' AND
>   comment='' AND
>   dclocal_read_repair_chance=0.000000 AND
>   gc_grace_seconds=864000 AND
>   index_interval=128 AND
>   read_repair_chance=1.000000 AND
>   replicate_on_write='true' AND
>   populate_io_cache_on_flush='false' AND
>   default_time_to_live=0 AND
>   speculative_retry='99.0PERCENTILE' AND
>   memtable_flush_period_in_ms=0 AND
>   compaction={'class': 'SizeTieredCompactionStrategy'} AND
>   compression={'sstable_compression': 'SnappyCompressor'};
>
> cqlsh:Theme> select * from theme_view where key = 99421 limit 10;
>
>  key   | column1 | column2    | value
> -------+---------+------------+-------
>  99421 |     -12 | 2011-03-25 |    59
>  99421 |     -12 | 2011-03-26 |     5
>  99421 |     -12 | 2011-03-27 |     2
>  99421 |     -12 | 2011-03-28 |    40
>  99421 |     -12 | 2011-03-29 |    14
>  99421 |     -12 | 2011-03-30 |    17
>  99421 |     -12 | 2011-03-31 |     5
>  99421 |     -12 | 2011-04-01 |    37
>  99421 |     -12 | 2011-04-02 |     7
>  99421 |     -12 | 2011-04-03 |     4
>
> (10 rows)
>
> cqlsh:Theme> select * from theme_view where key = 99421 and column1 = -12 limit 10;
>
>  key   | column1 | column2    | value
> -------+---------+------------+-------
>  99421 |     -12 | 2011-03-25 |    59
>  99421 |     -12 | 2014-05-06 |    15
>  99421 |     -12 | 2014-06-06 |     7
>  99421 |     -12 | 2014-06-10 |    22
>  99421 |     -12 | 2014-06-11 |    34
>  99421 |     -12 | 2014-06-12 |    35
>  99421 |     -12 | 2014-06-13 |    26
>  99421 |     -12 | 2014-06-14 |    16
>  99421 |     -12 | 2014-06-15 |    24
>  99421 |     -12 | 2014-06-16 |    25
>
> (10 rows)
> {noformat}
> As you can see, the second query should return the data from 2012, but it does not. Via Thrift, we have the exact same bug.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)