Try 0.8.3. They fixed https://issues.apache.org/jira/browse/CASSANDRA-2968, a bug that produced erroneous records for counters. I am not sure it is exactly your problem, but it looks similar.
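
On the 1001 you mention in the message quoted below: that is what I would expect if one increment was applied but its acknowledgement was lost, and the client then retried it, since counter increments are not idempotent. Here is a minimal sketch of that reasoning in plain Python. It is only a simulation: FakeCounterStore, AckLost and add_with_retry are hypothetical names, not Hector or Cassandra APIs, and it does not attempt to explain the 481387 value.

```python
class AckLost(Exception):
    """Hypothetical stand-in for a client-side timeout after the write landed."""


class FakeCounterStore:
    """In-memory stand-in for a counter column (not a real Cassandra client)."""

    def __init__(self, lose_ack_on_call):
        self.value = 0
        self.calls = 0
        # Assumption for the demo: exactly one call loses its acknowledgement.
        self.lose_ack_on_call = lose_ack_on_call

    def add(self, delta):
        self.calls += 1
        self.value += delta  # the increment is applied on the "server"
        if self.calls == self.lose_ack_on_call:
            # The client never sees the success, so it assumes a failure.
            raise AckLost("write applied, acknowledgement lost")


def add_with_retry(store, delta, retries=1):
    """Retry a failed increment, as a client with one retry might do."""
    for attempt in range(retries + 1):
        try:
            store.add(delta)
            return
        except AckLost:
            if attempt == retries:
                raise


def run_demo(increments=1000, lose_ack_on_call=500):
    store = FakeCounterStore(lose_ack_on_call)
    for _ in range(increments):
        add_with_retry(store, 1)
    return store.value


if __name__ == "__main__":
    print(run_demo())
```

With one lost acknowledgement the counter ends one higher than the number of logical increments, so a small overcount like this is consistent with a single retry. An overcount of several hundred thousand is not.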

On Tue, Aug 9, 2011 at 5:28 AM, Boris Yen <yulin...@gmail.com> wrote:
> Hi,
>
> I am not sure if this is a bug or if we are using counters the wrong way, but I
> keep getting an enormous counter value in our deployment. After a few tries,
> I was finally able to reproduce it. The following are the settings of my
> development environment:
> -----------------------------------------------------
> I have a two-node cluster with the following keyspace and column family
> settings.
>
> Cluster Information:
>    Snitch: org.apache.cassandra.locator.SimpleSnitch
>    Partitioner: org.apache.cassandra.dht.RandomPartitioner
>    Schema versions:
>       63fda700-c243-11e0-0000-2d03dcafebdf: [172.17.19.151, 172.17.19.152]
>
> Keyspace: test:
>   Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
>   Durable Writes: true
>     Options: [datacenter1:2]
>   Column Families:
>     ColumnFamily: testCounter (Super)
>     "APP status information."
>       Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>       Default column value validator: org.apache.cassandra.db.marshal.CounterColumnType
>       Columns sorted by: org.apache.cassandra.db.marshal.BytesType/org.apache.cassandra.db.marshal.BytesType
>       Row cache size / save period in seconds: 0.0/0
>       Key cache size / save period in seconds: 200000.0/14400
>       Memtable thresholds: 1.1578125/1440/247 (millions of ops/MB/minutes)
>       GC grace seconds: 864000
>       Compaction min/max thresholds: 4/32
>       Read repair chance: 1.0
>       Replicate on write: true
>       Built indexes: []
>
> Then, I use a test program based on Hector to increment a counter column
> (testCounter[sc][column]) 1000 times. In the middle of the adding process, I
> intentionally shut down the node 172.17.19.152. In addition, the test
> program is smart enough to switch the consistency level from Quorum to One,
> so that the subsequent increments do not fail.
>
> After all the increments are done, I start Cassandra on 172.17.19.152 and
> use cassandra-cli to check whether the counter is correct on both nodes. I
> get 1001, which seems reasonable because Hector will retry once. However,
> when I shut down 172.17.19.151 and, once 172.17.19.152 is aware that
> 172.17.19.151 is down, start Cassandra on 172.17.19.151 again, I check the
> counter again and this time I get 481387, which is clearly wrong.
>
> I was wondering if anyone could explain why this happens. Is this a bug, or
> am I using the counter the wrong way?
>
> Regards
> Boris
>

--
Regards,
Andriy Denysenko