Re: NetworkTopologyStrategy ring distribution across 2 DC
Thanks. The error is gone if I specify the keyspace name. However, the replica count in the ring output is not correct. Shouldn't it say 3, because I have DC1:3, DC2:3 in my schema?

thanks
Ramesh

Datacenter: DC1
==
Replicas: 2

Address        Rack  Status  State   Load     Owns    Token
                                                      -9223372036854775808
192.168.1.107  RAC1  Up      Normal  4.72 MB  42.86%  6588122883467697004
192.168.1.106  RAC1  Up      Normal  4.73 MB  42.86%  3952873730080618202
192.168.1.105  RAC1  Up      Normal  4.8 MB   42.86%  1317624576693539400
192.168.1.104  RAC1  Up      Normal  4.77 MB  42.86%  -1317624576693539402
192.168.1.103  RAC1  Up      Normal  4.83 MB  42.86%  -3952873730080618204
192.168.1.102  RAC1  Up      Normal  4.69 MB  42.86%  -6588122883467697006
192.168.1.101  RAC1  Up      Normal  4.8 MB   42.86%  -9223372036854775808

Datacenter: DC2
==
Replicas: 2

Address        Rack  Status  State   Load     Owns    Token
                                                      3952873730080618203
192.168.1.111  RAC1  Up      Normal  4.73 MB  42.86%  -1317624576693539401
192.168.1.110  RAC1  Up      Normal  4.79 MB  42.86%  -3952873730080618203
192.168.1.109  RAC1  Up      Normal  3.16 MB  42.86%  -6588122883467697005
192.168.1.108  RAC1  Up      Normal  3.22 MB  42.86%  -9223372036854775807
192.168.1.114  RAC1  Up      Normal  4.69 MB  42.86%  6588122883467697005
192.168.1.112  RAC1  Up      Normal  4.76 MB  42.86%  1317624576693539401
192.168.1.113  RAC1  Up      Normal  3.19 MB  42.86%  3952873730080618203

On Tue, Mar 11, 2014 at 7:24 PM, Tyler Hobbs <ty...@datastax.com> wrote:

  On Tue, Mar 11, 2014 at 1:37 PM, Ramesh Natarajan <rames...@gmail.com> wrote:

    Note: Ownership information does not include topology; for complete information, specify a keyspace

    Also the owns column is 0% for the second DC. Is this normal?

  Yes. Without a keyspace specified, the Owns column is showing the equivalent of SimpleStrategy with replication_factor=1. If you specify a keyspace, it will take the replication strategy and options into account.

  --
  Tyler Hobbs
  DataStax <http://datastax.com/>
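For reference, the 42.86% Owns figure is exactly what even token spacing predicts: with a keyspace specified, each of the 7 nodes in a DC is a replica for RF/N of that DC's range, so the column sums to RF * 100% per DC. A quick sanity check (a sketch of the arithmetic, not a Cassandra API):

```python
def owns_percent(rf, nodes_in_dc):
    """Per-node 'Owns' under NetworkTopologyStrategy with evenly spaced
    tokens: each node is a replica for rf/nodes_in_dc of the DC's range."""
    return round(100.0 * rf / nodes_in_dc, 2)

print(owns_percent(3, 7))  # 42.86, matching the ring output above
print(owns_percent(1, 7))  # 14.29, matching the no-keyspace output below
```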
Fwd: NetworkTopologyStrategy ring distribution across 2 DC
Hi,

I have 14 Cassandra nodes, running as 2 data centers using PropertyFileSnitch as follows:

192.168.1.101=DC1:RAC1
192.168.1.102=DC1:RAC1
192.168.1.103=DC1:RAC1
192.168.1.104=DC1:RAC1
192.168.1.105=DC1:RAC1
192.168.1.106=DC1:RAC1
192.168.1.107=DC1:RAC1
192.168.1.108=DC2:RAC1
192.168.1.109=DC2:RAC1
192.168.1.110=DC2:RAC1
192.168.1.111=DC2:RAC1
192.168.1.112=DC2:RAC1
192.168.1.113=DC2:RAC1
192.168.1.114=DC2:RAC1

My schema uses a replication factor of 3 for each data center, as follows:

create keyspace test
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {DC2 : 3, DC1 : 3}
  and durable_writes = true;

When I set up the second DC, its initial tokens were offset by 1, per the documentation. When I run the ring command I get this note:

Note: Ownership information does not include topology; for complete information, specify a keyspace

Also the owns column is 0% for the second DC. Is this normal?

thanks
Ramesh

# /opt/mp/storage/persistent/cassandra-dse/bin/nodetool -h cassandra101 ring
Note: Ownership information does not include topology; for complete information, specify a keyspace

Datacenter: DC1
==
Address        Rack  Status  State   Load     Owns    Token
                                                      6588122883467697004
192.168.1.101  RAC1  Up      Normal  4.8 MB   14.29%  -9223372036854775808
192.168.1.102  RAC1  Up      Normal  4.69 MB  14.29%  -6588122883467697006
192.168.1.103  RAC1  Up      Normal  4.83 MB  14.29%  -3952873730080618204
192.168.1.104  RAC1  Up      Normal  4.77 MB  14.29%  -1317624576693539402
192.168.1.105  RAC1  Up      Normal  4.8 MB   14.29%  1317624576693539400
192.168.1.106  RAC1  Up      Normal  4.73 MB  14.29%  3952873730080618202
192.168.1.107  RAC1  Up      Normal  4.72 MB  14.29%  6588122883467697004

Datacenter: DC2
==
Address        Rack  Status  State   Load     Owns    Token
                                                      6588122883467697005
192.168.1.108  RAC1  Up      Normal  3.22 MB  0.00%   -9223372036854775807
192.168.1.109  RAC1  Up      Normal  3.16 MB  0.00%   -6588122883467697005
192.168.1.110  RAC1  Up      Normal  4.79 MB  0.00%   -3952873730080618203
192.168.1.111  RAC1  Up      Normal  4.73 MB  0.00%   -1317624576693539401
192.168.1.112  RAC1  Up      Normal  4.76 MB  0.00%   1317624576693539401
192.168.1.113  RAC1  Up      Normal  3.19 MB  0.00%   3952873730080618203
192.168.1.114  RAC1  Up      Normal  4.69 MB  0.00%   6588122883467697005
#
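The token layout above can be reproduced arithmetically. This is a sketch assuming the Murmur3Partitioner, which the signed 64-bit tokens suggest: tokens are evenly spaced over the signed 64-bit range, and the second DC's tokens are the first DC's plus 1 to avoid collisions:

```python
def initial_tokens(nodes_in_dc, offset=0):
    """Evenly spaced Murmur3 tokens over [-2**63, 2**63); a second DC
    is conventionally offset by 1 so no two nodes share a token."""
    step = 2**64 // nodes_in_dc
    return [-2**63 + i * step + offset for i in range(nodes_in_dc)]

dc1 = initial_tokens(7)     # -9223372036854775808, -6588122883467697006, ...
dc2 = initial_tokens(7, 1)  # -9223372036854775807, -6588122883467697005, ...
```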
datastax opscenter authentication
I am trying to integrate OpsCenter into our environment, and I was wondering: can we use PAM authentication instead of a password file for OpsCenter authentication?

thanks
Ramesh
Re: specifying initial cassandra schema
Thanks, and I appreciate the responses. Will look into this.

thanks
Ramesh

On Wed, Jan 18, 2012 at 2:27 AM, aaron morton <aa...@thelastpickle.com> wrote:

  Check the command line help for cassandra-cli; you can pass it a file name, e.g.

  cassandra-cli --host localhost --file schema.txt

  Cheers

  -----------------
  Aaron Morton
  Freelance Developer
  @aaronmorton
  http://www.thelastpickle.com

  On 18/01/2012, at 9:35 AM, Carlos Pérez Miguel wrote:

    Hi Ramesh,

    You can use the schematool command. I am using it for the same purposes in Cassandra 0.7.9. I use the following line in my cassandra startup script:

    $CASSANDRA_HOME/bin/schematool HOSTNAME 8080 import

    where HOSTNAME is the hostname of your test machine. It will import the schema from your cassandra.yaml file. If you execute it and there is already a schema in the cassandra cluster, you'll get an exception from schematool but no impact to the cluster.

    Bye

    Carlos Pérez Miguel

    2012/1/17 Ramesh Natarajan <rames...@gmail.com>:

      I usually start cassandra and then use cassandra-cli to import a schema. Is there any automated way to load a fixed schema when cassandra starts automatically? I have a test setup where I run cassandra on a single node. I have an OS image packaged with cassandra, and it automatically starts cassandra as part of the OS boot-up. I saw some old references to specifying a schema in cassandra.yaml. Is this still supported in Cassandra 1.x? Are there any examples?

      thanks
      Ramesh
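For loading a schema at boot, as Aaron suggests, the usual pattern is to wait until Cassandra is accepting connections and then replay a schema file through cassandra-cli. A minimal sketch; the host, port (9160, the default Thrift port), and file name are placeholders:

```python
import socket
import subprocess
import time

def wait_for_port(host, port, timeout=120.0):
    """Poll until a TCP connect to (host, port) succeeds; True once up."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            socket.create_connection((host, port), timeout=2).close()
            return True
        except OSError:
            time.sleep(1)
    return False

def load_schema(host="localhost", schema_file="schema.txt"):
    # Block until Cassandra's Thrift port accepts connections, then
    # replay the schema file through cassandra-cli.
    if not wait_for_port(host, 9160):
        raise RuntimeError("Cassandra did not come up in time")
    subprocess.check_call(["cassandra-cli", "--host", host, "--file", schema_file])
```

Running this from an init script after starting the Cassandra service avoids racing the daemon's startup; reapplying an existing schema will fail in the CLI, so a guard around the call (or idempotent schema statements) is advisable.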
specifying initial cassandra schema
I usually start cassandra and then use cassandra-cli to import a schema. Is there any automated way to load a fixed schema when cassandra starts automatically?

I have a test setup where I run cassandra on a single node. I have an OS image packaged with cassandra, and it automatically starts cassandra as part of the OS boot-up. I saw some old references to specifying a schema in cassandra.yaml. Is this still supported in Cassandra 1.x? Are there any examples?

thanks
Ramesh
gracefully recover from data file corruptions
We are running a 30-node 1.0.5 Cassandra cluster on RHEL 5.6 x86_64, virtualized on ESXi 5.0. We are seeing a DecoratedKey assertion error during compactions, and at this point we suspect anything from the OS/ESXi/HBA/iSCSI RAID. Please correct me if I am wrong: once a node gets into this state, I don't see any way to recover unless I remove the corrupted data file and restart Cassandra. I am running tests with replication factor 3, and all reads and writes are done with QUORUM, so I believe there will not be data loss if I do this.

If this is a correct way to recover, I would like to know how to do it gracefully in a production environment:

- Disable thrift
- Disable gossip
- Drain the node
- Kill the cassandra java process (send a SIGTERM and/or SIGKILL)
- Do a filesystem sync
- Remove the corrupted file from the /var/lib/cassandra/data directory
- Start cassandra
- Enable gossip so all pending hinted handoff occurs
- Enable thrift

Thanks
Ramesh
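The nodetool parts of a sequence like the one above can be scripted. A sketch, using the standard nodetool subcommands (disablethrift, disablegossip, drain, enablegossip, enablethrift); the runner is injectable so the ordering can be exercised without a live cluster:

```python
import subprocess

# Steps before stopping the process, and after restarting it.
PRE_STOP = [
    ["nodetool", "disablethrift"],  # stop accepting client traffic
    ["nodetool", "disablegossip"],  # stop participating in the ring
    ["nodetool", "drain"],          # flush memtables and stop writes
]
POST_START = [
    ["nodetool", "enablegossip"],   # rejoin so pending hints can replay
    ["nodetool", "enablethrift"],   # re-admit clients
]

def run_steps(steps, runner=subprocess.check_call):
    """Execute each command in order; 'runner' is injectable for testing."""
    for cmd in steps:
        runner(cmd)
```

Killing the process, syncing the filesystem, and removing the corrupted SSTable happen between the two lists and are deliberately left manual here, since the file to remove differs each time.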
Re: gracefully recover from data file corruptions
Thanks, Ben and Jeremiah. We are actively working with our third-party vendors to determine the root cause of this issue. Hopefully we will figure something out. This repair procedure is more of a last resort, which I really don't want to use, but something to keep in mind if such a necessity arises.

thanks
Ramesh

On Fri, Dec 16, 2011 at 12:48 PM, Ben Coverston <ben.covers...@datastax.com> wrote:

  Hi Ramesh,

  Every time I have seen this in the last year it has been caused by bad hardware or bad memory. Usually we find errors in the syslog. Jeremiah is right about running repair when you get your nodes back up. Fortunately, with the addition of checksums in 1.0, I don't think that the corrupt data can get propagated across nodes.

  Your recovery steps do seem solid, if not a bit verbose. I usually tell people to shut down the node, remove the offending SSTables, bring the node back up, then run repair. I can't stress enough, however, that if you're going to bring it back up on the same hardware, you probably want to find the root cause; otherwise you're going to find yourself in the same situation days/weeks/months in the future.

  Ben

  On Fri, Dec 16, 2011 at 5:16 PM, Jeremiah Jordan <jeremiah.jor...@morningstar.com> wrote:

    You need to run repair on the node once it is back up (to get back the data you just deleted). If this is happening on more than one node you could have data loss...

    -Jeremiah

    On 12/16/2011 07:46 AM, Ramesh Natarajan wrote:

      We are running a 30-node 1.0.5 Cassandra cluster on RHEL 5.6 x86_64, virtualized on ESXi 5.0. We are seeing a DecoratedKey assertion error during compactions, and at this point we suspect anything from the OS/ESXi/HBA/iSCSI RAID. Please correct me if I am wrong: once a node gets into this state, I don't see any way to recover unless I remove the corrupted data file and restart Cassandra. I am running tests with replication factor 3, and all reads and writes are done with QUORUM, so I believe there will not be data loss if I do this.

      If this is a correct way to recover, I would like to know how to do it gracefully in a production environment:

      - Disable thrift
      - Disable gossip
      - Drain the node
      - Kill the cassandra java process (send a SIGTERM and/or SIGKILL)
      - Do a filesystem sync
      - Remove the corrupted file from the /var/lib/cassandra/data directory
      - Start cassandra
      - Enable gossip so all pending hinted handoff occurs
      - Enable thrift

      Thanks
      Ramesh

  --
  Ben Coverston
  DataStax -- The Apache Cassandra Company
tmp files in /var/lib/cassandra/data
We are using leveled compaction, running cassandra 1.0.6. I checked the data directory (/var/lib/cassandra/data) and I see these 0-byte tmp files. What are these files?

thanks
Ramesh

-rw-r--r-- 1 root root 0 Dec 14 17:15 uid-tmp-hc-106-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:15 uid-tmp-hc-106-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:23 uid-tmp-hc-117-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:23 uid-tmp-hc-117-Index.db
-rw-r--r-- 1 root root 0 Dec 14 15:51 uid-tmp-hc-11-Data.db
-rw-r--r-- 1 root root 0 Dec 14 15:51 uid-tmp-hc-11-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:31 uid-tmp-hc-129-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:31 uid-tmp-hc-129-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-142-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-142-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-145-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-145-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-158-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-158-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-162-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-162-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-175-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-175-Index.db
-rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-179-Data.db
-rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-179-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-193-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-193-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-197-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-197-Index.db
-rw-r--r-- 1 root root 0 Dec 14 16:02 uid-tmp-hc-19-Data.db
-rw-r--r-- 1 root root 0 Dec 14 16:02 uid-tmp-hc-19-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-200-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:03 uid-tmp-hc-200-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:11 uid-tmp-hc-213-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:11 uid-tmp-hc-213-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:11 uid-tmp-hc-217-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:11 uid-tmp-hc-217-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:19 uid-tmp-hc-230-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:19 uid-tmp-hc-230-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:19 uid-tmp-hc-235-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:19 uid-tmp-hc-235-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:27 uid-tmp-hc-249-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:27 uid-tmp-hc-249-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:27 uid-tmp-hc-253-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:27 uid-tmp-hc-253-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:28 uid-tmp-hc-257-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:28 uid-tmp-hc-257-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:35 uid-tmp-hc-270-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:35 uid-tmp-hc-270-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:36 uid-tmp-hc-275-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:36 uid-tmp-hc-275-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:44 uid-tmp-hc-288-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:44 uid-tmp-hc-288-Index.db
-rw-r--r-- 1 root root 0 Dec 14 16:10 uid-tmp-hc-28-Data.db
-rw-r--r-- 1 root root 0 Dec 14 16:10 uid-tmp-hc-28-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:44 uid-tmp-hc-293-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:44 uid-tmp-hc-293-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-307-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-307-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-310-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-310-Index.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-315-Data.db
-rw-r--r-- 1 root root 0 Dec 14 18:52 uid-tmp-hc-315-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:00 uid-tmp-hc-328-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:00 uid-tmp-hc-328-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:00 uid-tmp-hc-333-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:00 uid-tmp-hc-333-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:08 uid-tmp-hc-347-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:08 uid-tmp-hc-347-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:08 uid-tmp-hc-353-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:08 uid-tmp-hc-353-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:09 uid-tmp-hc-357-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:09 uid-tmp-hc-357-Index.db
-rw-r--r-- 1 root root 0 Dec 14 19:17 uid-tmp-hc-370-Data.db
-rw-r--r-- 1 root root 0 Dec 14 19:17 uid-tmp-hc-370-Index.db
-rw-r--r-- 1 root root
Re: tmp files in /var/lib/cassandra/data
Yep, so far it looks like a file descriptor leak. Not sure if GC or some other event like compaction would close these files.

[root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l
540
[root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l
542
[root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l
554
[root@CAP-VM-1 ~]# ls -al /proc/31134/fd | grep MSA | wc -l
558

On Wed, Dec 14, 2011 at 8:28 PM, Bryce Godfrey <bryce.godf...@azaleos.com> wrote:

  I'm seeing this also, and my nodes have started crashing with "too many open files" errors. Running lsof I see lots of these open tmp files:

  java 8185 root 911u REG 8,32 38 129108266 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268721-CompressionInfo.db
  java 8185 root 912u REG 8,32 0 155320741 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1092-Data.db
  java 8185 root 913u REG 8,32 0 155320742 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1097-Index.db
  java 8185 root 914u REG 8,32 0 155320743 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1097-Data.db
  java 8185 root 916u REG 8,32 0 155320754 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1113-Data.db
  java 8185 root 918u REG 8,32 0 155320744 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1102-Index.db
  java 8185 root 919u REG 8,32 0 155320745 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1102-Data.db
  java 8185 root 920u REG 8,32 0 155320755 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1118-Index.db
  java 8185 root 921u REG 8,32 0 129108272 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268781-Data.db
  java 8185 root 922u REG 8,32 38 129108273 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268781-CompressionInfo.db
  java 8185 root 923u REG 8,32 0 155320756 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1118-Data.db
  java 8185 root 929u REG 8,32 38 129108262 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268822-CompressionInfo.db
  java 8185 root 947u REG 8,32 0 129108284 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268854-Data.db
  java 8185 root 948u REG 8,32 38 129108285 /opt/cassandra/data/MonitoringData/Properties-tmp-hc-268854-CompressionInfo.db
  java 8185 root 954u REG 8,32 0 155320746 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1107-Index.db
  java 8185 root 955u REG 8,32 0 155320747 /opt/cassandra/data/system/HintsColumnFamily-tmp-hc-1107-Data.db

  Going to try rolling back to 1.0.5 for the time being, even though I was hoping to use one of the fixes in 1.0.6.

  -----Original Message-----
  From: Ramesh Natarajan [mailto:rames...@gmail.com]
  Sent: Wednesday, December 14, 2011 6:03 PM
  To: user@cassandra.apache.org
  Subject: tmp files in /var/lib/cassandra/data

  We are using leveled compaction, running cassandra 1.0.6. I checked the data directory (/var/lib/cassandra/data) and I see these 0-byte tmp files. What are these files?

  thanks
  Ramesh

  -rw-r--r-- 1 root root 0 Dec 14 17:15 uid-tmp-hc-106-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:15 uid-tmp-hc-106-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:23 uid-tmp-hc-117-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:23 uid-tmp-hc-117-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 15:51 uid-tmp-hc-11-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 15:51 uid-tmp-hc-11-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:31 uid-tmp-hc-129-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:31 uid-tmp-hc-129-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-142-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-142-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-145-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:40 uid-tmp-hc-145-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-158-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-158-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-162-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:47 uid-tmp-hc-162-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-175-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-175-Index.db
  -rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-179-Data.db
  -rw-r--r-- 1 root root 0 Dec 14 17:55 uid-tmp-hc-179-Index.db
  -rw
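The `ls /proc/<pid>/fd | grep ... | wc -l` check above can be wrapped in a small helper for monitoring the suspected leak over time. A sketch that assumes Linux /proc semantics (the `-tmp-` substring is just the pattern used in this thread):

```python
import os

def count_fds(pid, substring="-tmp-"):
    """Count open file descriptors of a process whose link targets
    contain `substring` (Linux /proc only)."""
    fd_dir = "/proc/%d/fd" % pid
    count = 0
    for fd in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, fd))
        except OSError:
            continue  # fd was closed while we were scanning
        if substring in target:
            count += 1
    return count
```

Polling this periodically and logging the result gives the same trend data as the repeated shell one-liners, without spawning a pipeline each time.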
Re: cassandra in production environment
- We are seeing a DecoratedKey error during compaction. It looks like the sha1sum of the data file doesn't match the digest file created by Cassandra. I don't have any clue where things are failing; it could be at the OS level, the ESXi HBA level, or the DotHill iSCSI RAID layer.
- We are using Sun JRE 1.6.29.

thanks
Ramesh

On Mon, Dec 12, 2011 at 10:02 AM, Jason Wellonen <jason.wello...@cassidiancommunications.com> wrote:

  RHEL 6.1 and 6.2 with KVM. No file corruptions that I am aware of.

  Jason

  -----Original Message-----
  From: Ramesh Natarajan [mailto:rames...@gmail.com]
  Sent: Sunday, December 11, 2011 5:05 PM
  To: user@cassandra.apache.org
  Subject: cassandra in production environment

  Hi,

  We are currently testing cassandra in a RHEL 6.1 64-bit environment running on ESXi 5.0 and are experiencing issues with data file corruptions. If you are using Linux for your production environment, can you please share which OS/version you are using?

  thanks
  Ramesh
cassandra in production environment
Hi,

We are currently testing cassandra in a RHEL 6.1 64-bit environment running on ESXi 5.0 and are experiencing issues with data file corruptions. If you are using Linux for your production environment, can you please share which OS/version you are using?

thanks
Ramesh
Re: IOException running cassandra 1.0.5
(Unknown Source)
ERROR [ReadStage:50] 2011-12-10 03:27:20,577 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[ReadStage:50,5,main]
java.lang.AssertionError: DecoratedKey(-1, ) != DecoratedKey(53731996390544741435985962281191741460, 37303730323632333931) in /data/MSA/modseq-hb-177-Data.db
        at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:70)
        at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:60)
        at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:78)
        at org.apache.cassandra.db.CollationController.collectTimeOrderedData(CollationController.java:114)
        at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:62)
        at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1275)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1161)
        at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1128)
        at org.apache.cassandra.db.Table.getRow(Table.java:375)
        at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:58)
        at org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:53)
        at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.lang.Thread.run(Unknown Source)

On Fri, Dec 9, 2011 at 6:52 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

  The sha1s don't match, which would indicate that the sstable has been modified after being written. But Cassandra never modifies an sstable once it has been written, so this would suggest an external modification, typically some bit rot. In that case you don't have much choice other than removing the mentioned data file and running a repair.

  --
  Sylvain

  On Fri, Dec 9, 2011 at 1:37 PM, Ramesh Natarajan <rames...@gmail.com> wrote:

    Hi,

    I have a 30-node cassandra cluster running on RHEL6 64-bit. RF=3; reads and writes are performed with QUORUM. After a few hours of a test run, I am seeing this error in the system.log file:

    [root@MSA-VM-18 cassandra]# cat /var/lib/cassandra/data/MSA/modseq-hb-419-Digest.sha1
    71e43a932a29553720149bb4f93727e4d269735d  modseq-hb-419-Data.db
    [root@MSA-VM-18 cassandra]# sha1sum /var/lib/cassandra/data/MSA/modseq-hb-419-Data.db
    033f5aea5590851377d3bb79df27f0e6eedb6b95  /var/lib/cassandra/data/MSA/modseq-hb-419-Data.db
    [root@MSA-VM-18 cassandra]#

    Any pointers to troubleshoot this issue? I am attaching the system.log file for your reference.

    thanks
    Ramesh

    INFO [CompactionExecutor:296] 2011-12-09 04:36:40,430 CompactionTask.java (line 112) Compacting [SSTableReader(path='/var/lib/cassandra/data/MSA/transactions-hb-55-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/transactions-hb-53-Data.db')]
    INFO [CompactionExecutor:296] 2011-12-09 04:36:40,501 CompactionTask.java (line 213) Compacted to [/var/lib/cassandra/data/MSA/transactions-hb-56-Data.db,]. 280,210 to 144,785 (~51% of original) bytes for 3 keys at 2.191710MB/s. Time: 63ms.
    ERROR [CompactionExecutor:295] 2011-12-09 04:36:41,320 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:295,1,main]
    java.io.IOError: java.io.IOException: dataSize of 14293651161088 starting at 5541742 would be larger than file /var/lib/cassandra/data/MSA/modseq-hb-419-Data.db length 10486511
            at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:154)
            at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:86)
            at org.apache.cassandra.io.sstable.SSTableIdentityIterator.<init>(SSTableIdentityIterator.java:70)
            at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:177)
            at org.apache.cassandra.io.sstable.SSTableScanner$KeyScanningIterator.next(SSTableScanner.java:142)
            at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:134)
            at org.apache.cassandra.io.sstable.SSTableScanner.next(SSTableScanner.java:37)
            at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:147)
            at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:124)
            at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:98)
            at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:140)
            at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:135
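The manual sha1sum comparison above, checking an SSTable against its -Digest.sha1 companion (whose format is "<sha1 hex>  <filename>"), can be scripted so every data file on a node can be verified in one pass. A sketch:

```python
import hashlib

def sstable_digest_ok(data_path):
    """Return True if the -Data.db file's SHA-1 matches the value
    recorded in its -Digest.sha1 companion file."""
    digest_path = data_path.replace("-Data.db", "-Digest.sha1")
    with open(digest_path) as f:
        expected = f.read().split()[0]
    h = hashlib.sha1()
    with open(data_path, "rb") as f:
        # Hash in 1 MiB chunks so large SSTables don't need to fit in RAM.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected
```

Run against /var/lib/cassandra/data/*/\*-Data.db, a False result identifies a file that has been modified after being written, as Sylvain describes.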
Re: AssertionError in hintedhandoff - 1.0.5
https://issues.apache.org/jira/browse/CASSANDRA-3579

thanks
Ramesh

On Tue, Dec 6, 2011 at 2:16 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

  Do you mind opening a ticket on https://issues.apache.org/jira/browse/CASSANDRA ?

  Thanks

  --
  Sylvain

  On Tue, Dec 6, 2011 at 12:52 AM, Ramesh Natarajan <rames...@gmail.com> wrote:

    Hi,

    We are running an 8-node cassandra cluster running cassandra 1.0.5. All our CFs use leveled compaction. We ran a test where we did a lot of inserts for 3 days. After that we started to run tests where some of the reads could ask for information that was inserted a while back. In this scenario we are seeing this assertion error in HintedHandoff. Is this a known issue?

    ERROR [HintedHandoff:3] 2011-12-05 15:42:04,324 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:3,1,main]
    java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            at java.lang.Thread.run(Thread.java:662)
    Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:330)
            at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
            at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:353)
            at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
            ... 3 more
    Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
            at java.util.concurrent.FutureTask.get(FutureTask.java:83)
            at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:326)
            ... 6 more
    Caused by: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:124)
            at org.apache.cassandra.io.sstable.SSTableWriter.append(SSTableWriter.java:160)
            at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:158)
            at org.apache.cassandra.db.compaction.CompactionManager$6.call(CompactionManager.java:275)
            at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
            at java.util.concurrent.FutureTask.run(FutureTask.java:138)
            ... 3 more
    ERROR [HintedHandoff:3] 2011-12-05 15:42:04,333 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[HintedHandoff:3,1,main]
    java.lang.RuntimeException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
            at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
            at java.lang.Thread.run(Thread.java:662)
    Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:330)
            at org.apache.cassandra.db.HintedHandOffManager.access$100(HintedHandOffManager.java:81)
            at org.apache.cassandra.db.HintedHandOffManager$2.runMayThrow(HintedHandOffManager.java:353)
            at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
            ... 3 more
    Caused by: java.util.concurrent.ExecutionException: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
            at java.util.concurrent.FutureTask.get(FutureTask.java:83)
            at org.apache.cassandra.db.HintedHandOffManager.deliverHintsToEndpoint(HintedHandOffManager.java:326)
            ... 6 more
    Caused by: java.lang.AssertionError: originally calculated column size of 470937164 but now it is 470294247
            at org.apache.cassandra.db.compaction.LazilyCompactedRow.write(LazilyCompactedRow.java:124
Re: Assertion error during bootstrapping cassandra 1.0.2
https://issues.apache.org/jira/browse/CASSANDRA-3536

thanks
Ramesh

On Mon, Nov 28, 2011 at 5:59 AM, Sylvain Lebresne <sylv...@datastax.com> wrote:

  I don't think you're doing anything wrong. Would you mind opening a ticket on JIRA (https://issues.apache.org/jira/browse/CASSANDRA)?

  --
  Sylvain

  On Tue, Nov 22, 2011 at 5:35 PM, Ramesh Natarajan <rames...@gmail.com> wrote:

    Hi,

    I have a 3-node cassandra cluster. I have RF set to 3 and do reads and writes using QUORUM. Here is my initial ring configuration:

    [root@CAP4-CNode1 ~]# /root/cassandra/bin/nodetool -h localhost ring
    Address       DC           Rack   Status  State   Load     Owns    Token
                                                               113427455640312821154458202477256070484
    10.19.104.11  datacenter1  rack1  Up      Normal  1.66 GB  33.33%  0
    10.19.104.12  datacenter1  rack1  Up      Normal  1.06 GB  33.33%  56713727820156410577229101238628035242
    10.19.104.13  datacenter1  rack1  Up      Normal  1.61 GB  33.33%  113427455640312821154458202477256070484

    I want to add 10.19.104.14 to the cluster. I edited the 10.19.104.14 cassandra.yaml file, set the token to 127605887595351923798765477786913079296, and set auto_bootstrap to true. When I started cassandra I got an AssertionError. Can someone point out if I am doing anything wrong here?

    thanks
    Ramesh

    [root@CAP4-CNode4 cassandra]#
    INFO 10:29:46,093 Logging initialized
    INFO 10:29:46,099 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_25
    INFO 10:29:46,100 Heap size: 8304721920/8304721920
    INFO 10:29:46,100 Classpath: bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.0.2.jar:bin/../lib/apache-cassandra-clientutil-1.0.2.jar:bin/../lib/apache-cassandra-thrift-1.0.2.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.6.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/mx4j-examples.jar:bin/../lib/mx4j-impl.jar:bin/../lib/mx4j.jar:bin/../lib/mx4j-jmx.jar:bin/../lib/mx4j-remote.jar:bin/../lib/mx4j-rimpl.jar:bin/../lib/mx4j-rjmx.jar:bin/../lib/mx4j-tools.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/jamm-0.2.5.jar
    INFO 10:29:48,713 JNA mlockall successful
    INFO 10:29:48,726 Loading settings from file:/root/apache-cassandra-1.0.2/conf/cassandra.yaml
    INFO 10:29:48,883 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
    INFO 10:29:48,898 Global memtable threshold is enabled at 2640MB
    INFO 10:29:49,203 Couldn't detect any schema definitions in local storage.
    INFO 10:29:49,204 Found table data in data directories. Consider using the CLI to define your schema.
    INFO 10:29:49,220 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1321979389220.log
    INFO 10:29:49,227 No commitlog files found; skipping replay
    INFO 10:29:49,230 Cassandra version: 1.0.2
    INFO 10:29:49,230 Thrift API version: 19.18.0
    INFO 10:29:49,230 Loading persisted ring state
    INFO 10:29:49,235 Starting up server gossip
    INFO 10:29:49,259 Enqueuing flush of Memtable-LocationInfo@122130810(192/240 serialized/live bytes, 4 ops)
    INFO 10:29:49,260 Writing Memtable-LocationInfo@122130810(192/240 serialized/live bytes, 4 ops)
    INFO 10:29:49,317 Completed flushing /var/lib/cassandra/data/system/LocationInfo-h-1-Data.db (300 bytes)
    INFO 10:29:49,340 Starting Messaging Service on port 7000
    INFO 10:29:49,349 JOINING: waiting for ring and schema information
    INFO 10:29:50,759 Applying migration 4b0e20f0-1511-11e1--c11bc95834d7 Add keyspace: MSA, rep strategy:SimpleStrategy{}, durable_writes: true
    INFO 10:29:50,761 Enqueuing flush of Memtable-Migrations@1507565381(6744/8430 serialized/live bytes, 1 ops)
    INFO 10:29:50,761 Writing Memtable-Migrations@1507565381(6744/8430 serialized/live bytes, 1 ops)
    INFO 10:29:50,761 Enqueuing flush of Memtable-Schema@1498835564(2889/3611 serialized/live bytes, 3 ops)
    INFO 10:29:50,776 Completed flushing /var/lib/cassandra/data/system/Migrations-h-1-Data.db (6808 bytes)
    INFO 10:29:50,777 Writing Memtable-Schema@1498835564(2889/3611 serialized/live bytes, 3 ops)
    INFO 10:29:50,797 Completed flushing /var/lib/cassandra/data/system/Schema-h-1-Data.db (3039 bytes)
    INFO 10:29:50,814 Applying migration 4b6f2cb0-1511-11e1--c11bc95834d7 Add column family: org.apache.cassandra.config.CFMetaData@1639d811[cfId=1000,ksName=MSA,cfName=modseq,cfType
Assertion error during bootstrapping Cassandra 1.0.2
Hi, I have a 3-node Cassandra cluster. I have RF set to 3 and do reads and writes using QUORUM. Here is my initial ring configuration:

[root@CAP4-CNode1 ~]# /root/cassandra/bin/nodetool -h localhost ring
Address       DC           Rack   Status  State   Load     Owns    Token
                                                                  113427455640312821154458202477256070484
10.19.104.11  datacenter1  rack1  Up      Normal  1.66 GB  33.33%  0
10.19.104.12  datacenter1  rack1  Up      Normal  1.06 GB  33.33%  56713727820156410577229101238628035242
10.19.104.13  datacenter1  rack1  Up      Normal  1.61 GB  33.33%  113427455640312821154458202477256070484

I want to add 10.19.104.14 to the cluster. I edited the 10.19.104.14 cassandra.yaml file, set the token to 127605887595351923798765477786913079296, and set auto_bootstrap to true. When I started Cassandra I got an AssertionError. Can someone point out if I am doing anything wrong here?

thanks
Ramesh

[root@CAP4-CNode4 cassandra]#
INFO 10:29:46,093 Logging initialized
INFO 10:29:46,099 JVM vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.6.0_25
INFO 10:29:46,100 Heap size: 8304721920/8304721920
INFO 10:29:46,100 Classpath:
bin/../conf:bin/../build/classes/main:bin/../build/classes/thrift:bin/../lib/antlr-3.2.jar:bin/../lib/apache-cassandra-1.0.2.jar:bin/../lib/apache-cassandra-clientutil-1.0.2.jar:bin/../lib/apache-cassandra-thrift-1.0.2.jar:bin/../lib/avro-1.4.0-fixes.jar:bin/../lib/avro-1.4.0-sources-fixes.jar:bin/../lib/commons-cli-1.1.jar:bin/../lib/commons-codec-1.2.jar:bin/../lib/commons-lang-2.4.jar:bin/../lib/compress-lzf-0.8.4.jar:bin/../lib/concurrentlinkedhashmap-lru-1.2.jar:bin/../lib/guava-r08.jar:bin/../lib/high-scale-lib-1.1.2.jar:bin/../lib/jackson-core-asl-1.4.0.jar:bin/../lib/jackson-mapper-asl-1.4.0.jar:bin/../lib/jamm-0.2.5.jar:bin/../lib/jline-0.9.94.jar:bin/../lib/jna.jar:bin/../lib/json-simple-1.1.jar:bin/../lib/libthrift-0.6.jar:bin/../lib/log4j-1.2.16.jar:bin/../lib/mx4j-examples.jar:bin/../lib/mx4j-impl.jar:bin/../lib/mx4j.jar:bin/../lib/mx4j-jmx.jar:bin/../lib/mx4j-remote.jar:bin/../lib/mx4j-rimpl.jar:bin/../lib/mx4j-rjmx.jar:bin/../lib/mx4j-tools.jar:bin/../lib/servlet-api-2.5-20081211.jar:bin/../lib/slf4j-api-1.6.1.jar:bin/../lib/slf4j-log4j12-1.6.1.jar:bin/../lib/snakeyaml-1.6.jar:bin/../lib/snappy-java-1.0.4.1.jar:bin/../lib/jamm-0.2.5.jar INFO 10:29:48,713 JNA mlockall successful INFO 10:29:48,726 Loading settings from file:/root/apache-cassandra-1.0.2/conf/cassandra.yaml INFO 10:29:48,883 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap INFO 10:29:48,898 Global memtable threshold is enabled at 2640MB INFO 10:29:49,203 Couldn't detect any schema definitions in local storage. INFO 10:29:49,204 Found table data in data directories. Consider using the CLI to define your schema. 
INFO 10:29:49,220 Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1321979389220.log INFO 10:29:49,227 No commitlog files found; skipping replay INFO 10:29:49,230 Cassandra version: 1.0.2 INFO 10:29:49,230 Thrift API version: 19.18.0 INFO 10:29:49,230 Loading persisted ring state INFO 10:29:49,235 Starting up server gossip INFO 10:29:49,259 Enqueuing flush of Memtable-LocationInfo@122130810(192/240 serialized/live bytes, 4 ops) INFO 10:29:49,260 Writing Memtable-LocationInfo@122130810(192/240 serialized/live bytes, 4 ops) INFO 10:29:49,317 Completed flushing /var/lib/cassandra/data/system/LocationInfo-h-1-Data.db (300 bytes) INFO 10:29:49,340 Starting Messaging Service on port 7000 INFO 10:29:49,349 JOINING: waiting for ring and schema information INFO 10:29:50,759 Applying migration 4b0e20f0-1511-11e1--c11bc95834d7 Add keyspace: MSA, rep strategy:SimpleStrategy{}, durable_writes: true INFO 10:29:50,761 Enqueuing flush of Memtable-Migrations@1507565381(6744/8430 serialized/live bytes, 1 ops) INFO 10:29:50,761 Writing Memtable-Migrations@1507565381(6744/8430 serialized/live bytes, 1 ops) INFO 10:29:50,761 Enqueuing flush of Memtable-Schema@1498835564(2889/3611 serialized/live bytes, 3 ops) INFO 10:29:50,776 Completed flushing /var/lib/cassandra/data/system/Migrations-h-1-Data.db (6808 bytes) INFO 10:29:50,777 Writing Memtable-Schema@1498835564(2889/3611 serialized/live bytes, 3 ops) INFO 10:29:50,797 Completed flushing /var/lib/cassandra/data/system/Schema-h-1-Data.db (3039 bytes) INFO 10:29:50,814 Applying migration 4b6f2cb0-1511-11e1--c11bc95834d7 Add column family:
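For context on the token values in the posts above: with RandomPartitioner, evenly balanced initial tokens follow token_i = i * 2**127 / node_count. A minimal sketch (my own illustration, not part of the original thread) reproducing the values seen here — note that published token tables can differ by one unit from integer floor division, depending on rounding:

```python
# Sketch: evenly spaced initial_token values for RandomPartitioner.
# token_i = i * 2**127 // node_count. Reproduces the 4th node's
# proposed token (127605887595351923798765477786913079296) above.
RING_SIZE = 2 ** 127  # RandomPartitioner token space

def initial_tokens(node_count):
    """Evenly spaced tokens for a balanced ring of node_count nodes."""
    return [i * RING_SIZE // node_count for i in range(node_count)]

print(initial_tokens(3))     # the original 3-node ring layout
print(initial_tokens(4)[3])  # token chosen for the new 4th node
```

Note that when growing an existing ring this way, only the new node gets a fresh token; the other nodes keep theirs, so the ring is uneven until tokens are recomputed and moved.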
Re: Internal error processing get: Null pointer exception
Are we doing anything wrong here, or could this be a bug in Cassandra?

thanks
Ramesh

On Tue, Nov 1, 2011 at 11:02 PM, Jonathan Ellis jbel...@gmail.com wrote:
That doesn't make sense to me. CassandraServer.java:147 is

    columnFamilyKeyMap.put(row.key, row.cf);

where columnFamilyKeyMap is

    Map<DecoratedKey, ColumnFamily> columnFamilyKeyMap = new HashMap<DecoratedKey, ColumnFamily>();

So columnFamilyKeyMap can't be null, and HashMap accommodates both null keys and null values, so I'm not sure what there is to throw an NPE.

On Tue, Nov 1, 2011 at 5:56 PM, Ramesh Natarajan rames...@gmail.com wrote:
We have an 8-node Cassandra cluster running Cassandra 1.0.0. After a while in our load testing we are seeing a NullPointerException on gets. Attached is the stack trace:

ERROR [pool-2-thread-2241] 2011-11-01 15:52:19,335 Cassandra.java (line 2999) Internal error processing get
java.lang.NullPointerException
    at org.apache.cassandra.thrift.CassandraServer.readColumnFamily(CassandraServer.java:147)
    at org.apache.cassandra.thrift.CassandraServer.internal_get(CassandraServer.java:383)
    at org.apache.cassandra.thrift.CassandraServer.get(CassandraServer.java:401)
    at org.apache.cassandra.thrift.Cassandra$Processor$get.process(Cassandra.java:2989)
    at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2889)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Is this a known issue?

thanks
Ramesh

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Second Cassandra users survey
Here is my wish list. I would love Cassandra to:
- provide an efficient method to retrieve the count of columns for a given row, without reading all columns and computing the count for a given row key.
- support auto-increment column names. Column-slice queries don't take advantage of the column Bloom filter, and it is not always easy to enumerate the column names in a deterministic manner.
- provide JNA support for the key cache.
- remove the dependency on running nodetool repair when any column is deleted (so tombstones don't get resurrected).

thanks
Ramesh

On Tue, Nov 1, 2011 at 5:59 PM, Jonathan Ellis jbel...@gmail.com wrote:
Hi all,
Two years ago I asked for Cassandra use cases and feature requests. [1] The results [2] have been extremely useful in setting and prioritizing goals for Cassandra development. But with the release of 1.0 we've accomplished basically everything from our original wish list. [3]
I'd love to hear from modern Cassandra users again, especially if you're usually a quiet lurker. What does Cassandra do well? What are your pain points? What's your feature wish list?
As before, if you're in stealth mode or don't want to say anything in public, feel free to reply to me privately and I will keep it off the record.
[1] http://www.mail-archive.com/cassandra-dev@incubator.apache.org/msg01148.html
[2] http://www.mail-archive.com/cassandra-user@incubator.apache.org/msg01446.html
[3] http://www.mail-archive.com/dev@cassandra.apache.org/msg01524.html
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Key count mismatch in cluster for a column family
Looks like compaction for this column family stopped after some time. The last message for this column family in the system.log is INFO [MigrationStage:1] 2011-10-25 16:57:00,385 Migration.java (line 119) Applying migration 43f106c0-ff54-11e0--68877f281daf Update column family to org.apache.cassandra.config.CFMetaData@86c50e4[cfId=1000,ksName=MSA,cfName=uid,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType,subcolumncomparator=null,comment=,rowCacheSize=0.0,keyCacheSize=1.3E7,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=3600,defaultValidator=org.apache.cassandra.db.marshal.BytesType,keyValidator=org.apache.cassandra.db.marshal.BytesType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@7f32ad0d,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0 lim=3 cap=3],column_metadata={},compactionStrategyClass=class org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=10},compressionOptions={}] INFO [MigrationStage:1] 2011-10-25 16:57:00,389 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-Migrations@1386279942(8806/11007 serialized/live bytes, 1 ops) INFO [FlushWriter:307] 2011-10-25 16:57:00,389 Memtable.java (line 237) Writing Memtable-Migrations@1386279942(8806/11007 serialized/live bytes, 1 ops) INFO [MigrationStage:1] 2011-10-25 16:57:00,389 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-Schema@1156898891(4336/5420 serialized/live bytes, 3 ops) INFO [FlushWriter:307] 2011-10-25 16:57:00,402 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/system/Migrations-h-48-Data.db (8870 bytes) INFO [FlushWriter:307] 2011-10-25 16:57:00,402 Memtable.java (line 237) Writing Memtable-Schema@1156898891(4336/5420 serialized/live bytes, 3 ops) INFO [FlushWriter:307] 2011-10-25 
16:57:00,413 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/system/Schema-h-48-Data.db (4486 bytes) INFO [CompactionExecutor:23] 2011-10-25 16:57:00,929 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9016-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9046-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9042-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9039-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9043-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9045-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9044-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9015-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9040-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9054-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9047-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9041-Data.db')] INFO [MigrationStage:1] 2011-10-25 16:57:06,693 Migration.java (line 119) Applying migration 47b1b840-ff54-11e0--68877f281daf Update column family to org.apache.cassandra.config.CFMetaData@23de1953[cfId=1000,ksName=MSA,cfName=uid,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType,subcolumncomparator=null,comment=,rowCacheSize=0.0,keyCacheSize=1.5E7,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=3600,defaultValidator=org.apache.cassandra.db.marshal.BytesType,keyValidator=org.apache.cassandra.db.marshal.BytesType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@4a50aa8a,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0 lim=3 cap=3],column_metadata={},compactionStrategyClass=class 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=10},compressionOptions={}]
INFO [MigrationStage:1] 2011-10-25 16:57:06,694 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-Migrations@1978429475(8806/11007 serialized/live bytes, 1 ops)
INFO [FlushWriter:307] 2011-10-25 16:57:06,694 Memtable.java (line 237) Writing Memtable-Migrations@1978429475(8806/11007 serialized/live bytes, 1 ops)

The schema for this CF is:

create column family uid
  with column_type = 'Standard'
  and comparator = 'BytesType'
  and default_validation_class = 'BytesType'
  and key_validation_class = 'BytesType'
  and rows_cached = 0.0
  and row_cache_save_period = 0
  and row_cache_keys_to_save = 2147483647
  and keys_cached = 1.5E7
  and key_cache_save_period = 14400
  and read_repair_chance = 1.0
  and gc_grace = 3600
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and row_cache_provider = 'SerializingCacheProvider' and
Re: Key count mismatch in cluster for a column family
I did some more analysis, and I think compaction for this CF stopped after we did an "update column family" to increase the key cache. Other CF compactions were proceeding without any issues. I did another "update column family" on the same CF with the same values as before, and compaction started again.

It is interesting to note that this happened on only one of the nodes in an 8-node cluster. It is also the same node where the command was issued using the CLI.

Is it normal to issue "update column family" to increase the key cache while Cassandra is actively serving traffic in a production environment? Are there any known issues/interactions between compaction and updating a column family in 1.0?

Thanks
Ramesh

Ramesh Natarajan rames...@gmail.com wrote:
Looks like compaction for this column family stopped after some time. The last message for this column family in the system.log is
INFO [MigrationStage:1] 2011-10-25 16:57:00,385 Migration.java (line 119) Applying migration 43f106c0-ff54-11e0--68877f281daf Update column family to org.apache.cassandra.config.CFMetaData@86c50e4[cfId=1000,ksName=MSA,cfName=uid,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType,subcolumncomparator=null,comment=,rowCacheSize=0.0,keyCacheSize=1.3E7,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=3600,defaultValidator=org.apache.cassandra.db.marshal.BytesType,keyValidator=org.apache.cassandra.db.marshal.BytesType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@7f32ad0d,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0 lim=3 cap=3],column_metadata={},compactionStrategyClass=class org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=10},compressionOptions={}]
INFO [MigrationStage:1] 2011-10-25 16:57:00,389 ColumnFamilyStore.java (line 664) Enqueuing
flush of Memtable-Migrations@1386279942(8806/11007 serialized/live bytes, 1 ops) INFO [FlushWriter:307] 2011-10-25 16:57:00,389 Memtable.java (line 237) Writing Memtable-Migrations@1386279942(8806/11007 serialized/live bytes, 1 ops) INFO [MigrationStage:1] 2011-10-25 16:57:00,389 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-Schema@1156898891(4336/5420 serialized/live bytes, 3 ops) INFO [FlushWriter:307] 2011-10-25 16:57:00,402 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/system/Migrations-h-48-Data.db (8870 bytes) INFO [FlushWriter:307] 2011-10-25 16:57:00,402 Memtable.java (line 237) Writing Memtable-Schema@1156898891(4336/5420 serialized/live bytes, 3 ops) INFO [FlushWriter:307] 2011-10-25 16:57:00,413 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/system/Schema-h-48-Data.db (4486 bytes) INFO [CompactionExecutor:23] 2011-10-25 16:57:00,929 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9016-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9046-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9042-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9039-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9043-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9045-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9044-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9015-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9040-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9054-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9047-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-9041-Data.db')] INFO [MigrationStage:1] 2011-10-25 16:57:06,693 Migration.java (line 119) Applying migration 47b1b840-ff54-11e0--68877f281daf Update column family to 
org.apache.cassandra.config.CFMetaData@23de1953[cfId=1000,ksName=MSA,cfName=uid,cfType=Standard,comparator=org.apache.cassandra.db.marshal.BytesType,subcolumncomparator=null,comment=,rowCacheSize=0.0,keyCacheSize=1.5E7,readRepairChance=1.0,replicateOnWrite=true,gcGraceSeconds=3600,defaultValidator=org.apache.cassandra.db.marshal.BytesType,keyValidator=org.apache.cassandra.db.marshal.BytesType,minCompactionThreshold=4,maxCompactionThreshold=32,rowCacheSavePeriodInSeconds=0,keyCacheSavePeriodInSeconds=14400,rowCacheKeysToSave=2147483647,rowCacheProvider=org.apache.cassandra.cache.SerializingCacheProvider@4a50aa8a,mergeShardsChance=0.1,keyAlias=java.nio.HeapByteBuffer[pos=0 lim=3 cap=3],column_metadata={},compactionStrategyClass=class org.apache.cassandra.db.compaction.LeveledCompactionStrategy,compactionStrategyOptions={sstable_size_in_mb=10},compressionOptions={}] INFO [MigrationStage:1] 2011-10-25 16:57:06,694 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-Migrations@1978429475(8806/11007
Re: Cassandra 1.0: Exception in compaction
We are using the final 1.0.0.

Thanks
Ramesh

On Wed, Oct 26, 2011 at 11:16 AM, Sylvain Lebresne sylv...@datastax.com wrote:
Also, to be sure, were you using the 1.0.0 final or some RC when getting this exception?

On Fri, Oct 21, 2011 at 8:24 PM, Sylvain Lebresne sylv...@datastax.com wrote:
Would you have the full log for one of those nodes leading to the exception that you could share? Not sure that'll help, but who knows.
--
Sylvain

On Fri, Oct 21, 2011 at 4:34 PM, Ramesh Natarajan rames...@gmail.com wrote:
I am using size-based compaction (the default one). Also, this is on Linux.
thanks
Ramesh

On Fri, Oct 21, 2011 at 4:25 AM, Sylvain Lebresne sylv...@datastax.com wrote:
I believe this is the same as https://issues.apache.org/jira/browse/CASSANDRA-3306. The initial reporter only got this exception with leveled compaction; is that what you are using too (to help narrow it down)? Also, are you using Windows by any chance?
--
Sylvain

On Thu, Oct 20, 2011 at 9:04 PM, Ramesh Natarajan rames...@gmail.com wrote:
We are running an 8-node Cassandra 1.0 cluster. We are seeing this exception quite often. Any idea how to debug this issue?
java.lang.IllegalArgumentException: Illegal Capacity: -2
    at java.util.ArrayList.<init>(ArrayList.java:110)
    at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:573)
    at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
    at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

A few lines before this error:

INFO [FlushWriter:222] 2011-10-20 10:52:25,885 Memtable.java (line 237) Writing Memtable-participants@1907757288(6777526/153164388 serialized/live bytes, 199339 ops)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319115938691.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319122061956.log)
INFO [FlushWriter:222] 2011-10-20 10:52:26,865 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/participants-h-87-Data.db (14695382 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,866 Memtable.java (line 237) Writing Memtable-modseq@1745889706(13206/311769 serialized/live bytes, 426 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:26,896 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-262-Data.db (38646 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,897 Memtable.java (line 237) Writing Memtable-msgid@2099219781(3571249/77183008 serialized/live bytes, 109823 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:27,497 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/msgid-h-47-Data.db (8125165 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:27,498 Memtable.java (line 237) Writing Memtable-uid@578022704(43734344/317200563 serialized/live bytes, 611301 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:29,802 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-291-Data.db (48225128 bytes)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125356477.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125683351.log)
INFO [MutationStage:88] 2011-10-20 10:52:29,905 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:29,905 Memtable.java (line 237) Writing Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:30,216 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-263-Data.db (450477 bytes
Re: Cassandra 1.0: Exception in compaction
No, unfortunately I don't have the log files; the system was scratch-loaded.

Thanks
Ramesh

Sylvain Lebresne sylv...@datastax.com wrote:
You don't have the full logs of a node leading to the exception by any chance? Especially one that leads to a java.lang.IllegalArgumentException: Illegal Capacity: -2 would be great.
--
Sylvain

On Wed, Oct 26, 2011 at 6:26 PM, Ramesh Natarajan rames...@gmail.com wrote:
We are using the final 1.0.0.
Thanks
Ramesh

On Wed, Oct 26, 2011 at 11:16 AM, Sylvain Lebresne sylv...@datastax.com wrote:
Also, to be sure, were you using the 1.0.0 final or some RC when getting this exception?

On Fri, Oct 21, 2011 at 8:24 PM, Sylvain Lebresne sylv...@datastax.com wrote:
Would you have the full log for one of those nodes leading to the exception that you could share? Not sure that'll help, but who knows.
--
Sylvain

On Fri, Oct 21, 2011 at 4:34 PM, Ramesh Natarajan rames...@gmail.com wrote:
I am using size-based compaction (the default one). Also, this is on Linux.
thanks
Ramesh

On Fri, Oct 21, 2011 at 4:25 AM, Sylvain Lebresne sylv...@datastax.com wrote:
I believe this is the same as https://issues.apache.org/jira/browse/CASSANDRA-3306. The initial reporter only got this exception with leveled compaction; is that what you are using too (to help narrow it down)? Also, are you using Windows by any chance?
--
Sylvain

On Thu, Oct 20, 2011 at 9:04 PM, Ramesh Natarajan rames...@gmail.com wrote:
We are running an 8-node Cassandra 1.0 cluster. We are seeing this exception quite often. Any idea how to debug this issue?
java.lang.IllegalArgumentException: Illegal Capacity: -2
    at java.util.ArrayList.<init>(ArrayList.java:110)
    at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:573)
    at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
    at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

A few lines before this error:

INFO [FlushWriter:222] 2011-10-20 10:52:25,885 Memtable.java (line 237) Writing Memtable-participants@1907757288(6777526/153164388 serialized/live bytes, 199339 ops)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319115938691.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319122061956.log)
INFO [FlushWriter:222] 2011-10-20 10:52:26,865 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/participants-h-87-Data.db (14695382 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,866 Memtable.java (line 237) Writing Memtable-modseq@1745889706(13206/311769 serialized/live bytes, 426 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:26,896 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-262-Data.db (38646 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,897 Memtable.java (line 237) Writing Memtable-msgid@2099219781(3571249/77183008 serialized/live bytes, 109823 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:27,497 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/msgid-h-47-Data.db (8125165 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:27,498 Memtable.java (line 237) Writing Memtable-uid@578022704(43734344/317200563 serialized/live bytes, 611301 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:29,802 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-291-Data.db (48225128 bytes)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125356477.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125683351.log)
INFO [MutationStage:88] 2011-10-20 10:52:29,905
Re: Cassandra 1.0: Exception in compaction
I am using size-based compaction (the default one). Also, this is on Linux.

thanks
Ramesh

On Fri, Oct 21, 2011 at 4:25 AM, Sylvain Lebresne sylv...@datastax.com wrote:
I believe this is the same as https://issues.apache.org/jira/browse/CASSANDRA-3306. The initial reporter only got this exception with leveled compaction; is that what you are using too (to help narrow it down)? Also, are you using Windows by any chance?
--
Sylvain

On Thu, Oct 20, 2011 at 9:04 PM, Ramesh Natarajan rames...@gmail.com wrote:
We are running an 8-node Cassandra 1.0 cluster. We are seeing this exception quite often. Any idea how to debug this issue?

java.lang.IllegalArgumentException: Illegal Capacity: -2
    at java.util.ArrayList.<init>(ArrayList.java:110)
    at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:573)
    at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
    at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

A few lines before this error:

INFO [FlushWriter:222] 2011-10-20 10:52:25,885 Memtable.java (line 237) Writing Memtable-participants@1907757288(6777526/153164388 serialized/live bytes, 199339 ops)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886
CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319115938691.log) INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319122061956.log) INFO [FlushWriter:222] 2011-10-20 10:52:26,865 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/participants-h-87-Data.db (14695382 bytes) INFO [FlushWriter:222] 2011-10-20 10:52:26,866 Memtable.java (line 237) Writing Memtable-modseq@1745889706(13206/311769 serialized/live bytes, 426 ops) INFO [FlushWriter:222] 2011-10-20 10:52:26,896 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-262-Data.db (38646 bytes) INFO [FlushWriter:222] 2011-10-20 10:52:26,897 Memtable.java (line 237) Writing Memtable-msgid@2099219781(3571249/77183008 serialized/live bytes, 109823 ops) INFO [FlushWriter:222] 2011-10-20 10:52:27,497 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/msgid-h-47-Data.db (8125165 bytes) INFO [FlushWriter:222] 2011-10-20 10:52:27,498 Memtable.java (line 237) Writing Memtable-uid@578022704(43734344/317200563 serialized/live bytes, 611301 ops) INFO [FlushWriter:222] 2011-10-20 10:52:29,802 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-291-Data.db (48225128 bytes) INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125356477.log) INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125683351.log) INFO [MutationStage:88] 2011-10-20 10:52:29,905 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops) INFO [FlushWriter:222] 2011-10-20 10:52:29,905 
Memtable.java (line 237) Writing Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops) INFO [FlushWriter:222] 2011-10-20 10:52:30,216 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-263-Data.db (450477 bytes) ERROR [CompactionExecutor:538] 2011-10-20 10:52:36,132 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:538,1,main] Another one INFO [FlushWriter:222] 2011-10-20 10:52:39,623 Memtable.java (line 237) Writing Memtable-uid@2018688194(79740/578345 serialized/live bytes, ops) INFO [FlushWriter:222] 2011-10-20 10:52:39,777 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-295-Data.db (142584 bytes) INFO [CompactionExecutor:544] 2011-10-20 10:52
Re: nodetool ring Load column
I don't use compressed SSTables, and I use the default compaction strategy. I will look at JIRA and see if there are any similarities. thanks Ramesh

On Fri, Oct 21, 2011 at 6:51 AM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote:
Are you using compressed SSTables, or leveled SSTables? Make sure you include how you are configured in any JIRA you file; someone else was seeing a similar issue with compression turned on. -Jeremiah

On Oct 14, 2011, at 1:13 PM, Ramesh Natarajan wrote:
What does the Load column in nodetool ring mean? From the output below it shows 101.62 GB. However, if I do a disk usage it is about 6 GB. thanks Ramesh

[root@CAP2-CNode1 cassandra]# ~root/apache-cassandra-1.0.0-rc2/bin/nodetool -h localhost ring
Address       DC          Rack   Status  State   Load       Owns    Token
                                                                    148873535527910577765226390751398592512
10.19.102.11  datacenter1 rack1  Up      Normal  101.62 GB  12.50%  0
10.19.102.12  datacenter1 rack1  Up      Normal  84.42 GB   12.50%  21267647932558653966460912964485513216
10.19.102.13  datacenter1 rack1  Up      Normal  95.47 GB   12.50%  42535295865117307932921825928971026432
10.19.102.14  datacenter1 rack1  Up      Normal  91.25 GB   12.50%  63802943797675961899382738893456539648
10.19.103.11  datacenter1 rack1  Up      Normal  93.98 GB   12.50%  85070591730234615865843651857942052864
10.19.103.12  datacenter1 rack1  Up      Normal  100.33 GB  12.50%  106338239662793269832304564822427566080
10.19.103.13  datacenter1 rack1  Up      Normal  74.1 GB    12.50%  127605887595351923798765477786913079296
10.19.103.14  datacenter1 rack1  Up      Normal  93.96 GB   12.50%  148873535527910577765226390751398592512
[root@CAP2-CNode1 cassandra]# du -hs /var/lib/cassandra/data/
6.0G    /var/lib/cassandra/data/
Cassandra 1.0: Exception in compaction
We are running an 8-node Cassandra 1.0 cluster. We are seeing this exception quite often. Any idea how to debug this issue?

java.lang.IllegalArgumentException: Illegal Capacity: -2
    at java.util.ArrayList.init(ArrayList.java:110)
    at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:573)
    at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
    at org.apache.cassandra.db.DataTracker.replace(DataTracker.java:268)
    at org.apache.cassandra.db.DataTracker.replaceCompactedSSTables(DataTracker.java:232)
    at org.apache.cassandra.db.ColumnFamilyStore.replaceCompactedSSTables(ColumnFamilyStore.java:960)
    at org.apache.cassandra.db.compaction.CompactionTask.execute(CompactionTask.java:199)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:131)
    at org.apache.cassandra.db.compaction.CompactionManager$1.call(CompactionManager.java:114)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

A few lines before this error:

INFO [FlushWriter:222] 2011-10-20 10:52:25,885 Memtable.java (line 237) Writing Memtable-participants@1907757288(6777526/153164388 serialized/live bytes, 199339 ops)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319115938691.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:25,886 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319122061956.log)
INFO [FlushWriter:222] 2011-10-20 10:52:26,865 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/participants-h-87-Data.db (14695382 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,866 Memtable.java (line 237) Writing Memtable-modseq@1745889706(13206/311769 serialized/live bytes, 426 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:26,896 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-262-Data.db (38646 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:26,897 Memtable.java (line 237) Writing Memtable-msgid@2099219781(3571249/77183008 serialized/live bytes, 109823 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:27,497 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/msgid-h-47-Data.db (8125165 bytes)
INFO [FlushWriter:222] 2011-10-20 10:52:27,498 Memtable.java (line 237) Writing Memtable-uid@578022704(43734344/317200563 serialized/live bytes, 611301 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:29,802 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-291-Data.db (48225128 bytes)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125356477.log)
INFO [COMMIT-LOG-WRITER] 2011-10-20 10:52:29,804 CommitLog.java (line 488) Discarding obsolete commit log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1319125683351.log)
INFO [MutationStage:88] 2011-10-20 10:52:29,905 ColumnFamilyStore.java (line 664) Enqueuing flush of Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:29,905 Memtable.java (line 237) Writing Memtable-modseq@339630706(155217/3664394 serialized/live bytes, 5007 ops)
INFO [FlushWriter:222] 2011-10-20 10:52:30,216 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/modseq-h-263-Data.db (450477 bytes)
ERROR [CompactionExecutor:538] 2011-10-20 10:52:36,132 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:538,1,main]

Another one:

INFO [FlushWriter:222] 2011-10-20 10:52:39,623 Memtable.java (line 237) Writing Memtable-uid@2018688194(79740/578345 serialized/live bytes, ops)
INFO [FlushWriter:222] 2011-10-20 10:52:39,777 Memtable.java (line 273) Completed flushing /var/lib/cassandra/data/MSA/uid-h-295-Data.db (142584 bytes)
INFO [CompactionExecutor:544] 2011-10-20 10:52:39,778 CompactionTask.java (line 119) Compacting [SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-295-Data.db'), SSTableReader(path='/var/lib/cassandra/data/MSA/uid-h-294-Data.db')]
ERROR [CompactionExecutor:544] 2011-10-20 10:52:39,935 AbstractCassandraDaemon.java (line 133) Fatal exception in thread Thread[CompactionExecutor:544,1,main]
java.lang.AssertionError
    at org.apache.cassandra.db.DataTracker$View.newSSTables(DataTracker.java:580)
    at org.apache.cassandra.db.DataTracker$View.replace(DataTracker.java:546)
    at
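The "Illegal Capacity: -2" comes from constructing a Java ArrayList with a negative size. A minimal sketch of how that can happen when replacing compacted SSTables, under the assumption that a racing operation removed some of the "old" tables first (names are hypothetical; this is not the actual DataTracker code, and CASSANDRA-3306 tracks the real bug):

```python
# Illustrative sketch of how swapping compacted SSTables can compute a
# negative list capacity, as in "Illegal Capacity: -2". Hypothetical names;
# not the actual org.apache.cassandra.db.DataTracker code.

def replace_sstables(current, old, new):
    # Size the result list as len(current) - len(old) + len(new), mirroring
    # how a preallocated ArrayList capacity would be computed. If entries of
    # `old` were already dropped from `current` (e.g. by a racing compaction),
    # the capacity can go negative.
    capacity = len(current) - len(old) + len(new)
    if capacity < 0:
        raise ValueError("Illegal Capacity: %d" % capacity)
    return [s for s in current if s not in old] + list(new)

live = ["uid-h-294", "uid-h-295"]
print(replace_sstables(live, ["uid-h-294"], ["uid-h-296"]))  # normal case

try:
    # Three tables claimed as "old", but only one is still live:
    replace_sstables(["uid-h-295"], ["uid-h-293", "uid-h-294", "uid-h-295"], [])
except ValueError as e:
    print(e)  # Illegal Capacity: -2
```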
nodetool ring Load column
What does the Load column in nodetool ring mean? From the output below it shows 101.62 GB. However if I do a disk usage it is about 6 GB. thanks Ramesh

[root@CAP2-CNode1 cassandra]# ~root/apache-cassandra-1.0.0-rc2/bin/nodetool -h localhost ring
Address       DC          Rack   Status  State   Load       Owns    Token
                                                                    148873535527910577765226390751398592512
10.19.102.11  datacenter1 rack1  Up      Normal  101.62 GB  12.50%  0
10.19.102.12  datacenter1 rack1  Up      Normal  84.42 GB   12.50%  21267647932558653966460912964485513216
10.19.102.13  datacenter1 rack1  Up      Normal  95.47 GB   12.50%  42535295865117307932921825928971026432
10.19.102.14  datacenter1 rack1  Up      Normal  91.25 GB   12.50%  63802943797675961899382738893456539648
10.19.103.11  datacenter1 rack1  Up      Normal  93.98 GB   12.50%  85070591730234615865843651857942052864
10.19.103.12  datacenter1 rack1  Up      Normal  100.33 GB  12.50%  106338239662793269832304564822427566080
10.19.103.13  datacenter1 rack1  Up      Normal  74.1 GB    12.50%  127605887595351923798765477786913079296
10.19.103.14  datacenter1 rack1  Up      Normal  93.96 GB   12.50%  148873535527910577765226390751398592512
[root@CAP2-CNode1 cassandra]# du -hs /var/lib/cassandra/data/
6.0G    /var/lib/cassandra/data/
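As context for comparing the two numbers: the Load column reports the SSTable bytes the node believes it is serving, while `du` counts whatever is actually on the filesystem, so the two can legitimately differ when snapshots or not-yet-deleted compacted files are around. A small sketch of the `du`-style side of the comparison (the data path below is an assumption):

```python
import os

def dir_bytes(path):
    # Roughly what `du -s` reports: the sum of file sizes under `path`.
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                pass  # file removed mid-walk, e.g. a compacted SSTable
    return total

# Compare against the Load column for this node (path is an assumption):
# print(dir_bytes("/var/lib/cassandra/data") / 1024**3, "GiB")
```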
read on multiple SS tables
Let's assume I perform frequent inserts and updates on a column family. Over a period of time, multiple SSTables will hold this row/column data. I have two questions about how reads work in Cassandra with respect to multiple SSTables.
- If you perform a query for a specific row key and column name, does Cassandra read the most recent SSTable first and stop if it finds a hit, or does it need to read through all the SSTables (to find the most recent value) regardless of whether it found a hit in the most recent SSTable?
- If I perform a slice query on a column range, does Cassandra iterate over all the SSTables?

We have two options for our data model.
1st option: Key1 | COL1 | COL2 | COL3 ... multiple columns. We would perform a slice query to get COL1-COL3 using Key1.
2nd option: Key1 | COL as one column, with the application placing the values of COL1-COLN in this one column. This key would be updated several times, with the app managing the list of values inside the single column. Our max column value size will be less than 64 MB. When we need to search for a value, we would read the one column and the application would look up the appropriate value in the list of values.

So I am wondering which option would be more efficient from a read point of view. thanks Ramesh
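On the first question, a useful mental model is that a point read must reconcile the cell across every SSTable that may contain the key (bloom filters prune the candidates), because the newest timestamp wins regardless of which file it lives in. A toy sketch of that reconciliation, not Cassandra's actual read path:

```python
# Toy model: each SSTable maps row_key -> {column: (value, timestamp)}.
# A read cannot simply stop at the first SSTable containing the column;
# it must compare timestamps across all candidate tables and keep the newest.

def read_column(sstables, row_key, col):
    best = None  # (value, timestamp)
    for table in sstables:
        cell = table.get(row_key, {}).get(col)
        if cell is not None and (best is None or cell[1] > best[1]):
            best = cell
    return best

sstables = [
    {"key1": {"COL1": ("old", 10)}},  # earlier flush
    {"key1": {"COL1": ("new", 20)}},  # later flush holds the newer timestamp
]
print(read_column(sstables, "key1", "COL1"))  # ('new', 20)
```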
Consistency level and ReadRepair
I have a 12-node Cassandra cluster running with RF=3. I have several clients (all running on a single node) connecting to the cluster (fixed client-to-node mapping) that perform inserts, updates, selects and deletes. Each client has a fixed mapping of row keys and always connects to the same node. The timestamp on the client node is used for all operations. All operations are done at CL QUORUM. When I run tpstats I see the ReadRepair count consistently increasing, and I need to figure out why ReadRepair is happening. One scenario I can think of: it could happen when there is a delay in the nodes reaching eventual consistency. Let's say I have 3 nodes (RF=3) A, B, C. I insert key with timestamp ts1 via A, and the call returns as soon as the record is written to A and B. At some later point this information is sent to C. A while later A, B, C have the same data with the same timestamp: A key,ts1; B key,ts1; C key,ts1. When I update key via A with timestamp ts2, the call again returns as soon as the record is written to A and B. Now the data is: A key,ts2; B key,ts2; C key,ts1. Assume I query for key and A and C respond; since there is no quorum agreement, the coordinator waits for B to respond, and when A and B match, the response is returned to the client and a ReadRepair is sent to C. This could happen only when C is running behind in catching up with the updates to A and B. Are there any stats that would let me know if the system is in a consistent state?
thanks Ramesh

tpstats_2011-10-05_12:50:01:ReadRepairStage 0 0 43569781 0 0
tpstats_2011-10-05_12:55:01:ReadRepairStage 0 0 43646420 0 0
tpstats_2011-10-05_13:00:02:ReadRepairStage 0 0 43725850 0 0
tpstats_2011-10-05_13:05:01:ReadRepairStage 0 0 43790047 0 0
tpstats_2011-10-05_13:10:02:ReadRepairStage 0 0 43869704 0 0
tpstats_2011-10-05_13:15:01:ReadRepairStage 0 0 43945635 0 0
tpstats_2011-10-05_13:20:01:ReadRepairStage 0 0 44020406 0 0
tpstats_2011-10-05_13:25:02:ReadRepairStage 0 0 44093227 0 0
tpstats_2011-10-05_13:30:01:ReadRepairStage 0 0 44167455 0 0
tpstats_2011-10-05_13:35:02:ReadRepairStage 0 0 44247519 0 0
tpstats_2011-10-05_13:40:01:ReadRepairStage 0 0 44312726 0 0
tpstats_2011-10-05_13:45:01:ReadRepairStage 0 0 44387633 0 0
tpstats_2011-10-05_13:50:01:ReadRepairStage 0 0 3683 0 0
tpstats_2011-10-05_13:55:02:ReadRepairStage 0 0 44499487 0 0
tpstats_2011-10-05_14:00:01:ReadRepairStage 0 0 44578656 0 0
tpstats_2011-10-05_14:05:01:ReadRepairStage 0 0 44647555 0 0
tpstats_2011-10-05_14:10:02:ReadRepairStage 0 0 44716730 0 0
tpstats_2011-10-05_14:15:01:ReadRepairStage 0 0 44776644 0 0
tpstats_2011-10-05_14:20:01:ReadRepairStage 0 0 44840237 0 0
tpstats_2011-10-05_14:25:01:ReadRepairStage 0 0 44891444 0 0
tpstats_2011-10-05_14:30:01:ReadRepairStage 0 0 44931105 0 0
tpstats_2011-10-05_14:35:02:ReadRepairStage 0 0 44976801 0 0
tpstats_2011-10-05_14:40:01:ReadRepairStage 0 0 45042220 0 0
tpstats_2011-10-05_14:45:01:ReadRepairStage 0 0 45112141 0 0
tpstats_2011-10-05_14:50:02:ReadRepairStage 0 0 45177816 0 0
tpstats_2011-10-05_14:55:02:ReadRepairStage 0 0 45246675 0 0
tpstats_2011-10-05_15:00:01:ReadRepairStage 0 0 45309533 0 0
tpstats_2011-10-05_15:05:01:ReadRepairStage 0 0 45357575 0 0
tpstats_2011-10-05_15:10:01:ReadRepairStage 0 0 45405943 0 0
tpstats_2011-10-05_15:15:01:ReadRepairStage 0 0 45458435 0 0
tpstats_2011-10-05_15:20:01:ReadRepairStage 0 2 45508253 0 0
tpstats_2011-10-05_15:25:01:ReadRepairStage 0 0 45570375 0
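The scenario described above can be modeled in a few lines. This toy simulation (purely illustrative, with hypothetical helper names) shows a QUORUM read detecting the stale replica C and repairing it:

```python
# Toy model of the scenario above: RF=3, QUORUM = 2 of 3. A write returns
# once 2 replicas ack; a QUORUM read compares the responding replicas and
# repairs any replica found stale. All names here are hypothetical.

def quorum_write(replicas, key, value, ts, acked=("A", "B")):
    for r in acked:  # the remaining replica gets the write later (or via repair)
        replicas[r][key] = (value, ts)

def quorum_read(replicas, key, ask=("A", "C")):
    # Assumes every asked replica holds some version of the key.
    answers = {r: replicas[r][key] for r in ask}
    winner = max(answers.values(), key=lambda cell: cell[1])  # newest timestamp
    stale = [r for r, cell in answers.items() if cell != winner]
    for r in stale:  # read repair: push the winning cell to stale replicas
        replicas[r][key] = winner
    return winner, stale

nodes = {"A": {}, "B": {}, "C": {}}
quorum_write(nodes, "k", "v1", 1, acked=("A", "B", "C"))  # fully propagated
quorum_write(nodes, "k", "v2", 2, acked=("A", "B"))       # C still has ts1
value, repaired = quorum_read(nodes, "k", ask=("A", "C"))
print(value, repaired)  # ('v2', 2) ['C']
```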
Re: nodetool cfstats on 1.0.0-rc1 throws an exception
I don't have access to the test system anymore. We did move to a lower number of CFs and don't see this problem any more. I remember that when I noticed the size in system.log it was a little more than UINT_MAX (4294967295). I was able to recreate it multiple times, so I am wondering if there are any stats counters in the system that use an unsigned int instead of an unsigned long? thanks Ramesh

On Tue, Oct 4, 2011 at 3:20 AM, aaron morton aa...@thelastpickle.com wrote:
That row has a size of 819 petabytes, so something is odd there. The error is a result of that value being so huge. When you ran the same script on 0.8.6, what was the max size of the Migrations CF? As Jonathan says, it's unlikely anyone would have tested creating 5000 CFs. Most people only create a few tens of CFs at most. Either use fewer CFs, or:
* dump the Migrations CF using sstable2json to take a look around
* work out steps to reproduce and report it on Jira
Hope that helps.
- Aaron Morton, Freelance Cassandra Developer, @aaronmorton, http://www.thelastpickle.com

On 4/10/2011, at 11:30 AM, Ramesh Natarajan wrote:
We recreated the schema using the same input file on both clusters and they are running identical load. Isn't the exception thrown in the system CF? This line looks strange: Compacted row maximum size: 9223372036854775807. thanks Ramesh

On Mon, Oct 3, 2011 at 5:26 PM, Jonathan Ellis jbel...@gmail.com wrote:
Looks like you have unexpectedly large rows in your 1.0 cluster but not 0.8. I guess you could use sstable2json to manually check your row sizes.

On Mon, Oct 3, 2011 at 5:20 PM, Ramesh Natarajan rames...@gmail.com wrote:
It happens all the time on 1.0. It doesn't happen on 0.8.6. Is there anything I can do to check? thanks Ramesh

On Mon, Oct 3, 2011 at 5:15 PM, Jonathan Ellis jbel...@gmail.com wrote:
My suspicion would be that it has more to do with the rare case of running with 5000 CFs than with a 1.0 regression.
On Mon, Oct 3, 2011 at 5:00 PM, Ramesh Natarajan rames...@gmail.com wrote:
We have about 5000 column families, and when we run nodetool cfstats it throws this exception. This is running 1.0.0-rc1; it seems to work on 0.8.6. Is this a bug in 1.0.0? thanks Ramesh

Keyspace: system
  Read Count: 28
  Read Latency: 5.8675 ms.
  Write Count: 3
  Write Latency: 0.166 ms.
  Pending Tasks: 0
    Column Family: Schema
    SSTable count: 4
    Space used (live): 4293758276
    Space used (total): 4293758276
    Number of Keys (estimate): 5376
    Memtable Columns Count: 0
    Memtable Data Size: 0
    Memtable Switch Count: 0
    Read Count: 3
    Read Latency: NaN ms.
    Write Count: 0
    Write Latency: NaN ms.
    Pending Tasks: 0
    Key cache capacity: 53
    Key cache size: 2
    Key cache hit rate: NaN
    Row cache: disabled
    Compacted row minimum size: 104
    Compacted row maximum size: 1955666
    Compacted row mean size: 1508515

    Column Family: HintsColumnFamily
    SSTable count: 0
    Space used (live): 0
    Space used (total): 0
    Number of Keys (estimate): 0
    Memtable Columns Count: 0
    Memtable Data Size: 0
    Memtable Switch Count: 0
    Read Count: 5
    Read Latency: NaN ms.
    Write Count: 0
    Write Latency: NaN ms.
    Pending Tasks: 0
    Key cache capacity: 1
    Key cache size: 0
    Key cache hit rate: NaN
    Row cache: disabled
    Compacted row minimum size: 0
    Compacted row maximum size: 0
    Compacted row mean size: 0

    Column Family: LocationInfo
    SSTable count: 1
    Space used (live): 6947
    Space used (total): 6947
    Number of Keys (estimate): 128
    Memtable Columns Count: 0
    Memtable Data Size: 0
    Memtable Switch Count: 2
    Read Count: 20
    Read Latency: NaN ms.
    Write Count: 3
    Write Latency: NaN ms.
    Pending Tasks: 0
    Key cache capacity: 1
    Key cache size: 1
    Key cache hit rate: NaN
    Row cache: disabled
    Compacted row minimum size: 73
    Compacted row maximum size: 258
    Compacted row mean size
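For what it's worth, the two suspicious numbers in the thread line up with fixed-width integer limits, which is consistent with the overflow theory discussed above (the interpretation is my assumption, not confirmed in the thread):

```python
# 9223372036854775807 (the reported "Compacted row maximum size") is exactly
# Java's Long.MAX_VALUE, and the reported Schema space of 4293758276 bytes
# sits just under UINT_MAX -- both smell like counter limits, not real sizes.

LONG_MAX = 2**63 - 1
UINT_MAX = 2**32 - 1

print(LONG_MAX)               # 9223372036854775807
print(UINT_MAX - 4293758276)  # 1209019, i.e. ~1.2 MB short of the 32-bit limit
```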
Memtable Switch Count
What is Memtable Switch Count in the cfstats output? thanks Ramesh
Re: Consistency level and ReadRepair
Let's assume we have 3 nodes, all up and running at all times with no failures or communication problems.
1. If I have RF=3 and write with QUORUM, and the change gets committed on 2 nodes, what delay should we expect before the 3rd replica gets written?
2. In this scenario (no failures etc.), if we do a QUORUM read, what situation can lead to read repair? I didn't expect any ReadRepair because all 3 nodes must have the same value.

On Wed, Oct 5, 2011 at 1:11 PM, Jonathan Ellis jbel...@gmail.com wrote:
Start with http://wiki.apache.org/cassandra/ReadRepair. Read repair count increasing just means you were doing reads at CL.ALL, and had the CF configured to perform RR.

On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan rames...@gmail.com wrote:
I have a 12 node cassandra cluster running with RF=3. I have several clients (all running on a single node) connecting to the cluster (fixed client - node mapping) that do inserts, updates, selects and deletes. Each client has a fixed mapping of the row-keys and always connects to the same node. The timestamp on the client node is used for all operations. All operations are done using CL QUORUM. When I run tpstats I see the ReadRepair count consistently increasing. I need to figure out why ReadRepair is happening. One scenario I can think of is, it could happen when there is a delay in updating the nodes to reach eventual consistency. Let's say I have 3 nodes (RF=3) A, B, C. I insert key with timestamp ts1 to A and the call will return as soon as it inserts the record to A and B. At some later point this information is sent to C. A while later A, B, C have the same data with the same timestamp: A key,ts1; B key,ts1; and C key,ts1. When I update key on A with timestamp ts2, the call will return as soon as it inserts the record to A and B.
Now the data is A key,ts2 B key,ts2 C key,ts1 Assuming I query for key A,C respond and since there is no QUORUM, it waits for B to respond and when A,B match, the response is returned to the client and ReadRepair is sent to C. This could happen only when C is running behind in catching up the updates to A,B. Are there any stats that would let me know if the system is in a consistent state? thanks Ramesh tpstats_2011-10-05_12:50:01:ReadRepairStage 0 0 43569781 0 0 tpstats_2011-10-05_12:55:01:ReadRepairStage 0 0 43646420 0 0 tpstats_2011-10-05_13:00:02:ReadRepairStage 0 0 43725850 0 0 tpstats_2011-10-05_13:05:01:ReadRepairStage 0 0 43790047 0 0 tpstats_2011-10-05_13:10:02:ReadRepairStage 0 0 43869704 0 0 tpstats_2011-10-05_13:15:01:ReadRepairStage 0 0 43945635 0 0 tpstats_2011-10-05_13:20:01:ReadRepairStage 0 0 44020406 0 0 tpstats_2011-10-05_13:25:02:ReadRepairStage 0 0 44093227 0 0 tpstats_2011-10-05_13:30:01:ReadRepairStage 0 0 44167455 0 0 tpstats_2011-10-05_13:35:02:ReadRepairStage 0 0 44247519 0 0 tpstats_2011-10-05_13:40:01:ReadRepairStage 0 0 44312726 0 0 tpstats_2011-10-05_13:45:01:ReadRepairStage 0 0 44387633 0 0 tpstats_2011-10-05_13:50:01:ReadRepairStage 0 0 3683 0 0 tpstats_2011-10-05_13:55:02:ReadRepairStage 0 0 44499487 0 0 tpstats_2011-10-05_14:00:01:ReadRepairStage 0 0 44578656 0 0 tpstats_2011-10-05_14:05:01:ReadRepairStage 0 0 44647555 0 0 tpstats_2011-10-05_14:10:02:ReadRepairStage 0 0 44716730 0 0 tpstats_2011-10-05_14:15:01:ReadRepairStage 0 0 44776644 0 0 tpstats_2011-10-05_14:20:01:ReadRepairStage 0 0 44840237 0 0 tpstats_2011-10-05_14:25:01:ReadRepairStage 0 0 44891444 0 0 tpstats_2011-10-05_14:30:01:ReadRepairStage 0 0 44931105 0 0 tpstats_2011-10-05_14:35:02:ReadRepairStage 0 0 44976801 0 0 tpstats_2011-10-05_14:40:01:ReadRepairStage 0 0 45042220 0 0 tpstats_2011-10-05_14:45:01:ReadRepairStage 0 0 45112141 0 0 tpstats_2011
Re: Consistency level and ReadRepair
Yes, Hinted Handoff is enabled. However, I don't see any counters rising for HintedHandoff in the tpstats output. thanks Ramesh

On Wed, Oct 5, 2011 at 2:10 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
Do you see any errors in the logs? Is your HH enabled?

On Wed, Oct 5, 2011 at 12:00 PM, Ramesh Natarajan rames...@gmail.com wrote:
Let's assume we have 3 nodes, all up and running at all times with no failures or communication problems. 1. If I have RF=3 and write with QUORUM, and the change gets committed on 2 nodes, what delay should we expect before the 3rd replica gets written? 2. In this scenario (no failures etc.), if we do a QUORUM read, what situation can lead to read repair? I didn't expect any ReadRepair because all 3 must have the same value.

On Wed, Oct 5, 2011 at 1:11 PM, Jonathan Ellis jbel...@gmail.com wrote:
Start with http://wiki.apache.org/cassandra/ReadRepair. Read repair count increasing just means you were doing reads at CL.ALL, and had the CF configured to perform RR.

On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan rames...@gmail.com wrote:
I have a 12 node cassandra cluster running with RF=3. I have several clients (all running on a single node) connecting to the cluster (fixed client - node mapping) that do inserts, updates, selects and deletes. Each client has a fixed mapping of the row-keys and always connects to the same node. The timestamp on the client node is used for all operations. All operations are done using CL QUORUM. When I run tpstats I see the ReadRepair count consistently increasing. I need to figure out why ReadRepair is happening. One scenario I can think of is, it could happen when there is a delay in updating the nodes to reach eventual consistency. Let's say I have 3 nodes (RF=3) A, B, C. I insert key with timestamp ts1 to A and the call will return as soon as it inserts the record to A and B. At some later point this information is sent to C...
A while later A,B,C have the same data with the same timestamp. A key,ts1 B key, ts1 and C key, ts1 When I update key on A with timestamp ts2 to A, the call will return as soon as it inserts the record to A and B. Now the data is A key,ts2 B key,ts2 C key,ts1 Assuming I query for key A,C respond and since there is no QUORUM, it waits for B to respond and when A,B match, the response is returned to the client and ReadRepair is sent to C. This could happen only when C is running behind in catching up the updates to A,B. Are there any stats that would let me know if the system is in a consistent state? thanks Ramesh tpstats_2011-10-05_12:50:01:ReadRepairStage 0 0 43569781 0 0 tpstats_2011-10-05_12:55:01:ReadRepairStage 0 0 43646420 0 0 tpstats_2011-10-05_13:00:02:ReadRepairStage 0 0 43725850 0 0 tpstats_2011-10-05_13:05:01:ReadRepairStage 0 0 43790047 0 0 tpstats_2011-10-05_13:10:02:ReadRepairStage 0 0 43869704 0 0 tpstats_2011-10-05_13:15:01:ReadRepairStage 0 0 43945635 0 0 tpstats_2011-10-05_13:20:01:ReadRepairStage 0 0 44020406 0 0 tpstats_2011-10-05_13:25:02:ReadRepairStage 0 0 44093227 0 0 tpstats_2011-10-05_13:30:01:ReadRepairStage 0 0 44167455 0 0 tpstats_2011-10-05_13:35:02:ReadRepairStage 0 0 44247519 0 0 tpstats_2011-10-05_13:40:01:ReadRepairStage 0 0 44312726 0 0 tpstats_2011-10-05_13:45:01:ReadRepairStage 0 0 44387633 0 0 tpstats_2011-10-05_13:50:01:ReadRepairStage 0 0 3683 0 0 tpstats_2011-10-05_13:55:02:ReadRepairStage 0 0 44499487 0 0 tpstats_2011-10-05_14:00:01:ReadRepairStage 0 0 44578656 0 0 tpstats_2011-10-05_14:05:01:ReadRepairStage 0 0 44647555 0 0 tpstats_2011-10-05_14:10:02:ReadRepairStage 0 0 44716730 0 0 tpstats_2011-10-05_14:15:01:ReadRepairStage 0 0 44776644 0 0 tpstats_2011-10-05_14:20:01:ReadRepairStage 0 0 44840237 0 0 tpstats_2011-10-05_14:25:01:ReadRepairStage 0 0 44891444 0 0 tpstats_2011-10-05_14:30:01:ReadRepairStage 0 0 44931105 0 0
Re: Consistency level and ReadRepair
Thanks for the explanation. I think I am at a loss trying to understand the tpstats output. When does the ReadRepair count get incremented?
- when any read is performed with CL ALL and RF=3, or
- when there is a discrepancy?
I have 2 snapshots from tpstats, and the counts indicate there were 1042805 reads and 354774 ReadRepairs. All reads are done with consistency QUORUM. Per the documentation, should read repair happen on all the reads?

ReadStage 1 13533450 0 0
RequestResponseStage 0 07258586 0 0
MutationStage 0 15056119 0 0
ReadRepairStage 0 01210754 0 0

ReadStage 1 14576255 0 0
RequestResponseStage 0 09460969 0 0
MutationStage 0 26638499 0 0
ReadRepairStage 0 01565528 0 0

Read difference: 1042805
ReadRepair difference: 354774

thanks Ramesh

On Wed, Oct 5, 2011 at 2:21 PM, Jonathan Ellis jbel...@gmail.com wrote:
As explained in the link in my earlier reply, Read Repair just means a replica was checked in the background, not that it was out of sync.

On Wed, Oct 5, 2011 at 2:00 PM, Ramesh Natarajan rames...@gmail.com wrote:
Let's assume we have 3 nodes, all up and running at all times with no failures or communication problems. 1. If I have RF=3 and write with QUORUM, and the change gets committed on 2 nodes, what delay should we expect before the 3rd replica gets written? 2. In this scenario (no failures etc.), if we do a QUORUM read, what situation can lead to read repair? I didn't expect any ReadRepair because all 3 must have the same value.

On Wed, Oct 5, 2011 at 1:11 PM, Jonathan Ellis jbel...@gmail.com wrote:
Start with http://wiki.apache.org/cassandra/ReadRepair. Read repair count increasing just means you were doing reads at CL.ALL, and had the CF configured to perform RR.

On Wed, Oct 5, 2011 at 12:37 PM, Ramesh Natarajan rames...@gmail.com wrote:
I have a 12 node cassandra cluster running with RF=3.
I have severl clients ( all running on a single node ) connecting to the cluster ( fixed client - node mapping ) and try to do a insert, update , select and delete. Each client has a fixed mapping of the row-keys and always connect to the same node. The timestamp on the client node is used for all operations. All operations are done using CL QUORUM. When I run a tpstats I see the ReadRepair count consistently increasing. i need to figure out why ReadRepair is happening.. One scenario I can think of is, it could happen when there is a delay in updating the nodes to reach eventual consistency.. Let's say I have 3 nodes (RF=3) A,B,C. I insert key with timestamp ts1 to A and the call will return as soon as it inserts the record to A and B. At some later point this information is sent to C... A while later A,B,C have the same data with the same timestamp. A key,ts1 B key, ts1 and C key, ts1 When I update key on A with timestamp ts2 to A, the call will return as soon as it inserts the record to A and B. Now the data is A key,ts2 B key,ts2 C key,ts1 Assuming I query for key A,C respond and since there is no QUORUM, it waits for B to respond and when A,B match, the response is returned to the client and ReadRepair is sent to C. This could happen only when C is running behind in catching up the updates to A,B. Are there any stats that would let me know if the system is in a consistent state? 
thanks Ramesh tpstats_2011-10-05_12:50:01:ReadRepairStage 0 0 43569781 0 0 tpstats_2011-10-05_12:55:01:ReadRepairStage 0 0 43646420 0 0 tpstats_2011-10-05_13:00:02:ReadRepairStage 0 0 43725850 0 0 tpstats_2011-10-05_13:05:01:ReadRepairStage 0 0 43790047 0 0 tpstats_2011-10-05_13:10:02:ReadRepairStage 0 0 43869704 0 0 tpstats_2011-10-05_13:15:01:ReadRepairStage 0 0 43945635 0 0 tpstats_2011-10-05_13:20:01:ReadRepairStage 0 0 44020406 0 0 tpstats_2011-10-05_13:25:02:ReadRepairStage 0 0 44093227 0 0 tpstats_2011-10-05_13:30:01:ReadRepairStage 0 0 44167455 0 0 tpstats_2011-10-05_13:35:02:ReadRepairStage 0 0 44247519 0 0 tpstats_2011-10-05_13:40:01:ReadRepairStage
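Jonathan's point can be checked against the numbers in the thread: 354774 repairs over 1042805 reads is about 0.34, which looks like a sampling rate rather than an out-of-sync rate. A sketch of ReadRepairStage as a probabilistic counter driven by the CF's read_repair_chance (the 0.34 setting here is a hypothetical value chosen to match the ratio, not a confirmed default):

```python
import random

# Sketch: ReadRepairStage counts reads on which a background digest check was
# scheduled -- governed by read_repair_chance -- not reads that actually found
# replicas out of sync. With chance ~0.34, the observed 354774/1042805 ratio
# is what you'd expect from sampling alone.

def run_reads(n, read_repair_chance, rng):
    # Count how many of n reads would trigger a background repair check.
    return sum(1 for _ in range(n) if rng.random() < read_repair_chance)

repairs = run_reads(100_000, 0.34, random.Random(42))
print(repairs / 100_000)  # close to 0.34
```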
help needed interpreting Read/Write latency in cfstats and cfhistograms output
I am running a Cassandra 0.8.6 cluster. I started a clean test setup and ran my tests for a while. Later, when I ran cfstats and cfhistograms (both at the same time), the values for read/write latency don't match. Per cfstats, the read and write latencies are 5.086 ms and 0.018 ms respectively. However, per the cfhistograms output the latency doesn't look correct. Attached are the outputs. Can someone explain how to correlate the data? Thanks Ramesh

cfstats:
Column Family: uid
  SSTable count: 10
  Space used (live): 7453864915
  Space used (total): 7453864915
  Number of Keys (estimate): 2669184
  Memtable Columns Count: 6864
  Memtable Data Size: 9254197
  Memtable Switch Count: 1037
  Read Count: 353627031
  Read Latency: 5.086 ms.
  Write Count: 325803780
  Write Latency: 0.018 ms.
  Pending Tasks: 0
  Key cache capacity: 200
  Key cache size: 200
  Key cache hit rate: 0.8106968059650433
  Row cache: disabled
  Compacted row minimum size: 104
  Compacted row maximum size: 11864
  Compacted row mean size: 2629

cfhistograms for uid:
MSA/uid histograms
Offset SSTables Write Latency Read Latency Row Size Column Count
14148086 9198896 0 0 28680130 2993805 0 0 3 17138720 2 2487034 0 0 4 29039539 3 4712246 0 0 5 4192539220 7805708 0 0 6 52669945 126 11641747 0 0 7 57474130 457 15812298 0 0 8 53613212 1034 19846340 0 0 10 67641689 5463 48797478 0 0 12 19456795 15703 52875124 0 0 14 1841556 35196 47573455 0 0 17 3095102787 51065577 0 0 20 0145706 27439942 0 0 24 0196614 14573201 0 0 29 0237579 4983641 0 0 35 0489150 2167481 0 0 42 0 1234257 2613908 0 0 50 0 2623421 2887838 0 0 60 0 5991578 1767507 0 0 72 0 13091537 1187687 0 0 86 0 23362939 1001303 0 0 1030 34216661966773 0 0 1240 39232688580505 3790 3790 1490 33531717411380 63631 63631 1790 28741050297513 11083 11083 2150 26604624211311 82446 82446 2580 23621426152905 16650 16650 3100 21389177115147 77831 77831 3720 17854763 89825 77962 77962 4460 12658366 73162 63647 63647 5350 8777140 62366 55015 55015 6420 6714295 58156 89230 89230 7700 5919024 52929
50298 50298 9240 5793271 49989 45053 45053 1109 0 5793698 45865 41470 41470 1331 0 5451214 42312 87700 87700 1597 4720406 32027220189 220189 1916 3745143 27804313070 313070 2299 2864750 27902188374 188374 2759
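One way to correlate the two outputs: cfstats reports a running average in milliseconds, while cfhistograms buckets observations in microseconds at exponentially spaced offsets, so a comparable mean can be estimated as a weighted average over the histogram. A sketch of that calculation, with made-up bucket counts for illustration:

```python
# cfstats latency is a running average in MILLIseconds; cfhistograms buckets
# latencies in MICROseconds at exponentially spaced offsets. To compare them,
# estimate a weighted mean from the histogram. Offsets/counts below are
# hypothetical, not taken from the output above.

def histogram_mean_us(buckets):
    # buckets: list of (offset_us, count); offset is the bucket's upper bound,
    # so this overestimates slightly.
    total = sum(count for _, count in buckets)
    if total == 0:
        return 0.0
    return sum(offset * count for offset, count in buckets) / total

buckets = [(10, 1000), (12, 500), (1109, 20)]
mean_ms = histogram_mean_us(buckets) / 1000.0
print(round(mean_ms, 4))  # 0.0251
```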
cassandra performance degrades after 12 hours
I am running a Cassandra cluster of 6 nodes running RHEL6, virtualized by ESXi 5.0. Each VM is configured with 20 GB of RAM and 12 cores. Our test setup performs about 3000 inserts per second. The Cassandra data partition is on an XFS filesystem mounted with options (noatime,nodiratime,nobarrier,logbufs=8). We have no swap enabled on the VMs, and vm.swappiness is set to 0. To avoid contention issues, our Cassandra VMs are not running any application other than Cassandra. The test runs fine for about 12 hours or so. After that, performance starts to degrade to about 1500 inserts per second, and by 18-20 hours the inserts go down to 300 per second. If I do a truncate, it starts clean and runs for a few more hours (though not as clean as after rebooting). We find a direct correlation between kswapd kicking in after 12 hours or so and the performance degradation. The cached memory is close to 10 GB. I am not getting an OOM error in Cassandra, so it looks like we are not running out of memory. Can someone explain how we can optimize this so that kswapd doesn't kick in?
Our top output shows:

top - 16:23:54 up 2 days, 23:17, 4 users, load average: 2.21, 2.08, 2.02
Tasks: 213 total, 1 running, 212 sleeping, 0 stopped, 0 zombie
Cpu(s): 1.6%us, 0.8%sy, 0.0%ni, 90.9%id, 6.3%wa, 0.0%hi, 0.2%si, 0.0%st
Mem: 20602812k total, 20320424k used, 282388k free, 1020k buffers
Swap: 0k total, 0k used, 0k free, 10145516k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2586 root 20 0 36.3g 17g 8.4g S 32.1 88.9 8496:37 java

java output:
root 2453 1 99 Sep30 pts/0 9-13:51:38 java -ea -javaagent:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms10059M -Xmx10059M -Xmn1200M -XX:+HeapDumpOnOutOfMemoryError -Xss128k -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:SurvivorRatio=8 -XX:MaxTenuringThreshold=1 -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -Djava.net.preferIPv4Stack=true -Dcom.sun.management.jmxremote.port=7199 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Djava.rmi.server.hostname=10.19.104.14 -Djava.net.preferIPv4Stack=true -Dlog4j.configuration=log4j-server.properties -Dlog4j.defaultInitOverride=true -cp
./apache-cassandra-0.8.6/bin/../conf:./apache-cassandra-0.8.6/bin/../build/classes/main:./apache-cassandra-0.8.6/bin/../build/classes/thrift:./apache-cassandra-0.8.6/bin/../lib/antlr-3.2.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/apache-cassandra-thrift-0.8.6.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/avro-1.4.0-sources-fixes.jar:./apache-cassandra-0.8.6/bin/../lib/commons-cli-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-codec-1.2.jar:./apache-cassandra-0.8.6/bin/../lib/commons-collections-3.2.1.jar:./apache-cassandra-0.8.6/bin/../lib/commons-lang-2.4.jar:./apache-cassandra-0.8.6/bin/../lib/concurrentlinkedhashmap-lru-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/guava-r08.jar:./apache-cassandra-0.8.6/bin/../lib/high-scale-lib-1.1.2.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-core-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jackson-mapper-asl-1.4.0.jar:./apache-cassandra-0.8.6/bin/../lib/jamm-0.2.2.jar:./apache-cassandra-0.8.6/bin/../lib/jline-0.9.94.jar:./apache-cassandra-0.8.6/bin/../lib/json-simple-1.1.jar:./apache-cassandra-0.8.6/bin/../lib/libthrift-0.6.jar:./apache-cassandra-0.8.6/bin/../lib/log4j-1.2.16.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-examples.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-impl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-jmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-remote.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rimpl.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-rjmx.jar:./apache-cassandra-0.8.6/bin/../lib/mx4j-tools.jar:./apache-cassandra-0.8.6/bin/../lib/servlet-api-2.5-20081211.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-api-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/slf4j-log4j12-1.6.1.jar:./apache-cassandra-0.8.6/bin/../lib/snakeyaml-1.6.jar org.apache.cassandra.thrift.CassandraDaemon Ring output [root@CAP4-CNode4 apache-cassandra-0.8.6]# 
./bin/nodetool -h 127.0.0.1 ring
Address       DC          Rack   Status  State   Load      Owns    Token
                                                                   141784319550391026443072753096570088105
10.19.104.11  datacenter1 rack1  Up      Normal  19.92 GB  16.67%  0
10.19.104.12  datacenter1 rack1  Up      Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
10.19.104.13  datacenter1 rack1  Up      Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
10.19.104.14  datacenter1 rack1  Up      Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
10.19.105.11  datacenter1 rack1  Up
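One thing worth noting about the `top` output above: on Linux, "used" memory includes the page cache (the 10145516k "cached" figure), which the kernel reclaims on demand via kswapd. A back-of-the-envelope sketch in Python, using the numbers copied from that output (the helper name is mine):

```python
# Values (in kB) copied from the `top` output above.
MEM_TOTAL, MEM_USED, MEM_FREE, CACHED = 20602812, 20320424, 282388, 10145516

def effective_free_kb(free_kb, cached_kb):
    """Memory actually available to applications: 'free' plus the page
    cache, which the kernel can reclaim (that reclaiming is kswapd's job)."""
    return free_kb + cached_kb

print(f"nominally free: {MEM_FREE / 1024:.0f} MB")
print(f"free + reclaimable cache: {effective_free_kb(MEM_FREE, CACHED) / 1024 / 1024:.1f} GB")
```

So roughly half of the "used" 20G is reclaimable cache, which is consistent with mmap'd SSTables filling the page cache rather than a memory leak; the degradation is more plausibly IO-related, as explored later in the thread.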
Re: cassandra performance degrades after 12 hours
We have 5 CF. Attached is the output from the describe command. We don't have row cache enabled. Thanks Ramesh

Keyspace: MSA:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:3]
  Column Families:
    ColumnFamily: admin
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 20.0/14400
      Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
      GC grace seconds: 3600
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []
    ColumnFamily: modseq
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 50.0/14400
      Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
      GC grace seconds: 3600
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []
    ColumnFamily: msgid
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 50.0/14400
      Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
      GC grace seconds: 864000
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []
    ColumnFamily: participants
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 50.0/14400
      Memtable thresholds: 0.5671875/1440/121 (millions of ops/minutes/MB)
      GC grace seconds: 3600
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []
    ColumnFamily: uid
      Key Validation Class: org.apache.cassandra.db.marshal.UTF8Type
      Default column value validator: org.apache.cassandra.db.marshal.UTF8Type
      Columns sorted by: org.apache.cassandra.db.marshal.UTF8Type
      Row cache size / save period in seconds: 0.0/0
      Key cache size / save period in seconds: 200.0/14400
      Memtable thresholds: 0.4/1440/121 (millions of ops/minutes/MB)
      GC grace seconds: 3600
      Compaction min/max thresholds: 4/32
      Read repair chance: 1.0
      Replicate on write: true
      Built indexes: []

On Mon, Oct 3, 2011 at 12:26 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
On Mon, Oct 3, 2011 at 10:12 AM, Ramesh Natarajan rames...@gmail.com wrote:
Re: cassandra performance degrades after 12 hours
I will start another test run to collect these stats. Our test model is in the neighborhood of 4500 inserts, 8000 updates/deletes and 1500 reads every second across 6 servers. Can you elaborate more on reducing the heap space? Do you think 17G RSS is a problem? thanks Ramesh

On Mon, Oct 3, 2011 at 1:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
I am wondering if you are seeing issues because of more frequent compactions kicking in. Is this primarily write ops, or reads too? During the test, gather data like: 1. cfstats 2. tpstats 3. compactionstats 4. netstats 5. iostat. You have RSS memory close to 17gb; maybe someone can give further advice on whether that could be because of mmap. You might want to lower your heap size to 6-8G and see if that helps. Also, check that you have jna.jar deployed and that you see the malloc successful message in the logs.

On Mon, Oct 3, 2011 at 10:36 AM, Ramesh Natarajan rames...@gmail.com wrote:
We have 5 CF. Attached is the output from the describe command. We don't have row cache enabled.
Re: cassandra performance degrades after 12 hours
Thanks for the pointers. I checked the system, and iostat showed that we are saturating the disk at 100%. The disk is a SCSI device exposed by ESXi, running on a dedicated LUN as RAID10 (4 x 600GB 15k drives) connected to the ESX host via iSCSI. When I run compactionstats I see we are compacting a column family which has about 10GB of data, and during this time I also see dropped messages in system.log. Since my IO rates are constant in my tests, I think the compaction is throwing things off. Is there a way to throttle compaction in cassandra? Rather than run multiple compactions at the same time, I would like to throttle it by IO rate. Is that possible? If instead of 5 big column families I create, say, 1000 each (5000 total), do you think it will help in this case (smaller files and so a smaller load per compaction)? Is it normal to have 5000 column families? thanks Ramesh

On Mon, Oct 3, 2011 at 2:50 PM, Chris Goffinet c...@chrisgoffinet.com wrote:
Most likely what is happening is that you are running single-threaded compaction. Look at cassandra.yaml for how to enable multi-threaded compaction. As more data comes into the system, bigger files get created during compaction. You could be in a situation where you are compacting at a higher bucket level N while compactions build up at lower buckets. Run nodetool -host localhost compactionstats to get an idea of what's going on.

On Mon, Oct 3, 2011 at 12:05 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
In order to understand what's going on you might want to first do just the write test, look at the results, then do just the read test, and then both read/write tests. Since you mentioned high updates/deletes I should also ask: what is your CL for writes/reads? With high updates/deletes plus a high CL, I think one should expect reads to slow down when sstables have not been compacted.
You have 20G space and 17G is used by your process, and I also see 36G VIRT, which I don't really understand why it's that high when swap is disabled. Look at sar -r output too to make sure there are no swaps occurring. Also, verify jna.jar is installed.
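On why compaction IO comes in progressively larger waves: size-tiered compaction groups sstables of similar size into buckets and compacts a bucket once it reaches min_threshold files (4 in the schema above), producing one bigger file that later joins a higher tier. A deliberately simplified model of that behavior; this is hypothetical illustration code, not Cassandra's actual implementation:

```python
def bucket_sstables(sizes_mb, ratio=0.5):
    """Group sstable sizes into buckets of similar size: a file joins a
    bucket when it is within `ratio` of that bucket's current average size."""
    buckets = []
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if avg * (1 - ratio) <= size <= avg * (1 + ratio):
                bucket.append(size)
                break
        else:
            buckets.append([size])
    return buckets

def compaction_candidates(sizes_mb, min_threshold=4):
    """Buckets holding at least min_threshold files are due for compaction."""
    return [b for b in bucket_sstables(sizes_mb) if len(b) >= min_threshold]

# Four ~10 MB flushes form a bucket and get merged into one ~40 MB file;
# repeat, and eventually four ~40 MB files merge into ~160 MB, and so on.
print(compaction_candidates([9, 10, 11, 12, 100, 110, 1000]))
```

On the throttling question: later Cassandra releases expose a compaction_throughput_mb_per_sec setting in cassandra.yaml alongside the multi-threaded compaction options; it is worth checking whether your version has it. Thousands of column families is generally discouraged, since each CF carries its own memtable and metadata overhead.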
Re: help needed interpreting Read/Write latency in cfstats and cfhistograms output
Thanks Aaron. The ms in the latency, is it microseconds or milliseconds? I ran the two commands at the same time and was expecting the values to be somewhat similar, but from my output earlier you can see the median read latency in the histogram output is about 10 milliseconds, whereas cfstats showed 5 ms. Is this normal? thanks Ramesh

On Mon, Oct 3, 2011 at 3:40 PM, aaron morton aa...@thelastpickle.com wrote:
Hi Ramesh, Both tools output the recent latency, and while they do this slightly differently, the result is that it's the latency since the last time it was checked. The two tools also use different counters, so reading cfstats will not update cfhistograms. So when you see

Read Latency: 5.086 ms. Write Latency: 0.018 ms.

it means that since you last checked, the average latencies for requests were 5.086 ms and 0.018 ms. When you see

Offset SSTables Write Latency Read Latency Row Size Column Count
1 4148086 9 198896 0 0

it means that 198,896 read requests were completed in 1 *microsecond* and 9 write requests completed in 1 microsecond. Cheers
- Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com

On 4/10/2011, at 4:58 AM, Ramesh Natarajan wrote:
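Given Aaron's reading, where each offset is a latency in microseconds and each cell a count of requests completing at around that latency, percentiles can be read off the histogram cumulatively. A small sketch with made-up (offset, count) pairs, not the numbers from the output above:

```python
def histogram_percentile(buckets, pct):
    """buckets: list of (offset_in_microseconds, request_count), as in
    nodetool cfhistograms. Returns the offset at or below which `pct`
    of requests completed (an upper-bound estimate, since a bucket only
    records 'at most this offset')."""
    total = sum(count for _, count in buckets)
    threshold = pct * total
    running = 0
    for offset, count in sorted(buckets):
        running += count
        if running >= threshold:
            return offset
    return buckets[-1][0]

# Hypothetical data: 50 requests at ~1us, 30 at ~2us, 15 at ~5us, 5 at ~10us.
sample = [(1, 50), (2, 30), (5, 15), (10, 5)]
print(histogram_percentile(sample, 0.5), histogram_percentile(sample, 0.9))
```

This also bears on the mismatch Ramesh saw: cfstats reports a recent arithmetic mean, while eyeballing the histogram gives something closer to a median, and the two can legitimately differ for a skewed latency distribution.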
node selection for replication factor 3
I have 6 nodes in a cluster running RandomPartitioner with SimpleStrategy and replication factor 3. Let's say we insert a column with QUORUM consistency. Based on the md5 hash it decides to go to node 10.19.104.11. How does cassandra pick the other 2 nodes? Is it sequential (.12 and .13) or any random node? thanks Ramesh

[root@CAP4-CNode4 apache-cassandra-0.8.6]# ./bin/nodetool -h 127.0.0.1 ring
Address       DC          Rack   Status  State   Load      Owns    Token
                                                                   141784319550391026443072753096570088105
10.19.104.11  datacenter1 rack1  Up      Normal  19.92 GB  16.67%  0
10.19.104.12  datacenter1 rack1  Up      Normal  19.3 GB   16.67%  28356863910078205288614550619314017621
10.19.104.13  datacenter1 rack1  Up      Normal  18.57 GB  16.67%  56713727820156410577229101238628035242
10.19.104.14  datacenter1 rack1  Up      Normal  19.34 GB  16.67%  85070591730234615865843651857942052863
10.19.105.11  datacenter1 rack1  Up      Normal  19.88 GB  16.67%  113427455640312821154458202477256070484
10.19.105.12  datacenter1 rack1  Up      Normal  20 GB     16.67%  141784319550391026443072753096570088105
[root@CAP4-CNode4 apache-cassandra-0.8.6]#
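With SimpleStrategy the placement is sequential, not random: the first replica goes to the node owning the key's token, and the remaining rf-1 replicas go to the next distinct nodes walking the ring clockwise (rack and DC are ignored; NetworkTopologyStrategy behaves differently). A sketch using the tokens from the ring output above; the function name is mine:

```python
from bisect import bisect_left

# (token, node) pairs copied from the nodetool ring output above, sorted by token.
RING = [
    (0, "10.19.104.11"),
    (28356863910078205288614550619314017621, "10.19.104.12"),
    (56713727820156410577229101238628035242, "10.19.104.13"),
    (85070591730234615865843651857942052863, "10.19.104.14"),
    (113427455640312821154458202477256070484, "10.19.105.11"),
    (141784319550391026443072753096570088105, "10.19.105.12"),
]

def replicas_for(key_token, ring=RING, rf=3):
    """First replica: the first node whose token >= the key's token
    (wrapping past the last token back to the first node). Remaining
    replicas: the next rf-1 nodes clockwise around the ring."""
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)
    return [ring[(start + i) % len(ring)][1] for i in range(rf)]

# A key hashing just past 10.19.105.12's token wraps around to .104.11,
# and the other two replicas are the next nodes clockwise: .104.12, .104.13.
print(replicas_for(141784319550391026443072753096570088105 + 1))
```

So in the question's example, a key owned by 10.19.104.11 is also replicated to 10.19.104.12 and 10.19.104.13.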
nodetool cfstats on 1.0.0-rc1 throws an exception
We have about 5000 column families, and when we run nodetool cfstats it throws the exception below. This is running 1.0.0-rc1; it works on 0.8.6. Is this a bug in 1.0.0? thanks Ramesh

Keyspace: system Read Count: 28 Read Latency: 5.8675 ms. Write Count: 3 Write Latency: 0.166 ms. Pending Tasks: 0

Column Family: Schema SSTable count: 4 Space used (live): 4293758276 Space used (total): 4293758276 Number of Keys (estimate): 5376 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 3 Read Latency: NaN ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 53 Key cache size: 2 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 104 Compacted row maximum size: 1955666 Compacted row mean size: 1508515

Column Family: HintsColumnFamily SSTable count: 0 Space used (live): 0 Space used (total): 0 Number of Keys (estimate): 0 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 5 Read Latency: NaN ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 1 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 0 Compacted row maximum size: 0 Compacted row mean size: 0

Column Family: LocationInfo SSTable count: 1 Space used (live): 6947 Space used (total): 6947 Number of Keys (estimate): 128 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 2 Read Count: 20 Read Latency: NaN ms. Write Count: 3 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 1 Key cache size: 1 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 73 Compacted row maximum size: 258 Compacted row mean size: 185

Column Family: Migrations SSTable count: 4 Space used (live): 4315909643 Space used (total): 4315909643 Number of Keys (estimate): 512 Memtable Columns Count: 0 Memtable Data Size: 0 Memtable Switch Count: 0 Read Count: 0 Read Latency: NaN ms. Write Count: 0 Write Latency: NaN ms. Pending Tasks: 0 Key cache capacity: 5 Key cache size: 0 Key cache hit rate: NaN Row cache: disabled Compacted row minimum size: 5839589 Compacted row maximum size: 9223372036854775807

Exception in thread "main" java.lang.IllegalStateException: Unable to compute ceiling for max when histogram overflowed
    at org.apache.cassandra.utils.EstimatedHistogram.mean(EstimatedHistogram.java:170)
    at org.apache.cassandra.db.DataTracker.getMeanRowSize(DataTracker.java:395)
    at org.apache.cassandra.db.ColumnFamilyStore.getMeanRowSize(ColumnFamilyStore.java:275)
    at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
    at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
    at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
    at com.sun.jmx.mbeanserver.PerInterface.getAttribute(PerInterface.java:65)
    at com.sun.jmx.mbeanserver.MBeanSupport.getAttribute(MBeanSupport.java:216)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:666)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:638)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1404)
    at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
    at
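That "Compacted row maximum size: 9223372036854775807" is Long.MAX_VALUE, the sentinel the row-size histogram reports once a sample has overflowed its largest bucket; at that point mean() refuses to guess and throws the IllegalStateException shown above. A toy Python model of the failure mode (illustrative only, not Cassandra's Java EstimatedHistogram):

```python
class ToyEstimatedHistogram:
    """Bucketed histogram with a final overflow bin. Once a sample lands
    in the overflow bin, the true max/mean are unknowable, so mean()
    raises rather than return a bogus number."""

    LONG_MAX = 9223372036854775807  # the sentinel reported for max on overflow

    def __init__(self, bucket_offsets):
        self.offsets = bucket_offsets
        self.counts = [0] * (len(bucket_offsets) + 1)  # last slot = overflow bin

    def add(self, value):
        for i, offset in enumerate(self.offsets):
            if value <= offset:
                self.counts[i] += 1
                return
        self.counts[-1] += 1  # larger than every bucket: overflow

    def mean(self):
        if self.counts[-1] > 0:
            raise ValueError("Unable to compute ceiling for max when histogram overflowed")
        total = sum(self.counts)
        return sum(o * c for o, c in zip(self.offsets, self.counts)) / total

h = ToyEstimatedHistogram([100, 1000, 10000])
h.add(50)
h.add(500)
print(h.mean())   # fine: both samples fit within the buckets
h.add(50_000)     # bigger than the largest bucket -> overflow
# h.mean() would now raise, mirroring the nodetool cfstats exception above
```

Which supports Jonathan's suggestion below: something on the 1.0 cluster recorded a row size past the largest histogram bucket, and sstable2json is one way to hunt for the offending rows.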
Re: nodetool cfstats on 1.0.0-rc1 throws an exception
It happens all the time on 1.0. It doesn't happen on 0.8.6. Is there anything I can do to check? thanks Ramesh

On Mon, Oct 3, 2011 at 5:15 PM, Jonathan Ellis jbel...@gmail.com wrote:
My suspicion would be that it has more to do with a rare case when running with 5000 CFs than a 1.0 regression.
Re: nodetool cfstats on 1.0.0-rc1 throws an exception
We recreated the schema using the same input file on both clusters and they are running identical load. Isn't the exception thrown in the system CF? This line looks strange: Compacted row maximum size: 9223372036854775807 thanks Ramesh

On Mon, Oct 3, 2011 at 5:26 PM, Jonathan Ellis jbel...@gmail.com wrote:
Looks like you have unexpectedly large rows in your 1.0 cluster but not 0.8. I guess you could use sstable2json to manually check your row sizes.
Re: Cassandra JVM heap size
Thanks. We are not planning to use row cache, because we don't anticipate requests for the same row coming in often and we would rather let the OS do the caching. So does this mean that in my case, instead of running 6 servers with 100 GB each, I can run 75 servers with 8 GB of RAM and set Xms/Xmx to 4GB? thanks Ramesh

On Mon, Oct 3, 2011 at 10:25 PM, Jonathan Ellis jbel...@gmail.com wrote:
That's misleading, because you don't necessarily need to give the memory to the JVM for Cassandra to make use of it. (See, for example, http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-improved-memory-and-disk-space-management .) In fact it's counterproductive to increase heap size past the point where it can handle the bloom filters + memtables for your data set. I suspect that the vast majority of deployments will not benefit from heaps larger than 4GB, and there is a ticket open to make this the default for 1.0: https://issues.apache.org/jira/browse/CASSANDRA-3295 That said, if you have the choice it's generally better to choose more, smaller servers than fewer, larger ones, primarily because it's easier to deal with failures. If you had 12 nodes half as expensive, for instance, losing one would be 1/12 of your capacity instead of 1/6.

On Mon, Oct 3, 2011 at 9:47 PM, Ramesh Natarajan rames...@gmail.com wrote:
I was reading an article @ http://www.acunu.com/products/choosing-cassandra/ and it mentions cassandra cannot benefit from more than 8GB allocated to the JVM heap. Is this true? Are there cassandra installations with larger heap sizes? We are planning to have a cluster of 6 nodes, with each node running with about 100 GB or so of RAM. Will this be a problem? thanks Ramesh

From http://www.acunu.com/products/choosing-cassandra/ : "Memory Ceiling: Cassandra typically cannot benefit from more than 8GB of RAM allocated to the Java heap, imposing a hard limit on data size. Taking advantage of big servers with lots of memory or many disks is no problem for Acunu. There's no memory ceiling for Acunu and as a result, no data ceiling either. Need to use larger servers? Go ahead."

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
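The 4-8GB guidance discussed here later turned into an automatic default in Cassandra's cassandra-env.sh. A sketch of that style of heuristic; the exact formula varies by version, so treat this as illustrative rather than authoritative:

```python
def default_max_heap_mb(system_memory_mb):
    """Heap heuristic in the spirit of cassandra-env.sh's auto-sizing:
    half of RAM capped at 1 GB, or a quarter of RAM capped at 8 GB,
    whichever is larger -- so big boxes still top out near an 8 GB heap,
    leaving the rest of RAM to the OS page cache."""
    return max(min(system_memory_mb // 2, 1024),
               min(system_memory_mb // 4, 8192))

for ram_gb in (8, 16, 100):
    print(f"{ram_gb} GB RAM -> {default_max_heap_mb(ram_gb * 1024)} MB heap")
```

Note how a 100 GB box still gets only an 8 GB heap under this rule, which is consistent with Jonathan's point: the remaining memory is not wasted, since the OS page cache (and mmap'd SSTables) make use of it.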