Database design
Hi all, Just some thoughts and questions I have about Cassandra data modeling. If I understand correctly, Cassandra is better at writing than at reading, so you have to think about your queries when designing a Cassandra schema. We are doing incremental design, already have our system in production, and now have to develop new queries. What do you usually do when you have new queries? Do you write a specific job to update the data in the database to match the new query you are writing? Thanks for your help. Jean-Yves
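For what it's worth, the pattern I had in mind is a one-off backfill job: walk the rows in the existing layout and rewrite them into a new column family shaped for the new query. A minimal sketch, where `scan_rows`, `write_row`, and `by_email` are hypothetical stand-ins for whatever client is in use (raw Thrift, Pelops, ...):

```python
# One-off backfill sketch: re-materialize existing rows into a new column
# family laid out for a new query. scan_rows and write_row are hypothetical
# stand-ins for the actual client calls.

def backfill(scan_rows, write_row, transform):
    """Read every (key, columns) pair from the old layout, reshape it for
    the new query with transform, and write it to the new layout."""
    migrated = 0
    for key, columns in scan_rows():
        new_key, new_columns = transform(key, columns)
        write_row(new_key, new_columns)
        migrated += 1
    return migrated

# Example transform: re-index users by email instead of by user id.
def by_email(user_id, columns):
    return columns["email"], {"user_id": user_id}
```

The transform is the only query-specific part; the surrounding loop can be reused for each new query that needs its own layout.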
Renaming keyspace
Hi all, We are running 0.6.12. Is there any particular precaution to take when renaming a keyspace? Is it enough to shut down Cassandra, update storage-conf.xml, rename the data directory, and start Cassandra again? Thanks for your help. Jean-Yves
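The data-directory step of that procedure can be sketched as follows, assuming Cassandra is already stopped and storage-conf.xml already updated; the data root path is an illustrative assumption, not necessarily this cluster's:

```python
import os

# Hypothetical sketch of the rename step: with Cassandra stopped and
# storage-conf.xml updated, rename the keyspace's directory under the
# data root. DATA_ROOT is an assumed path for illustration.
DATA_ROOT = "/var/lib/cassandra/data"

def rename_keyspace_dir(old_name, new_name, data_root=DATA_ROOT):
    """Rename <data_root>/<old_name> to <data_root>/<new_name>, refusing
    to clobber an existing directory."""
    old_path = os.path.join(data_root, old_name)
    new_path = os.path.join(data_root, new_name)
    if not os.path.isdir(old_path):
        raise RuntimeError("keyspace directory not found: " + old_path)
    if os.path.exists(new_path):
        raise RuntimeError("target already exists: " + new_path)
    os.rename(old_path, new_path)
```

The guards just make the script safe to re-run; the rename itself is a single atomic `os.rename` on the same filesystem.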
Re: Calculate memory used for keycache
One additional question: I don't really understand what is in the key cache. I have a column family with only one key, and the key cache size is 118 ... ? Any idea? Thanks. Jean-Yves
Out of Memory every 2 weeks
Sorry to create a new thread about the Out of Memory problem, but I checked all the other threads and did not find the answer. We have a running cluster of 2 Cassandra nodes, replication factor = 2, on Red Hat 4.8 32-bit with 4 GB of memory, where we periodically run out of memory (every 2 weeks) and both nodes crash (trace at the end of this message). We are running 0.6.12 (we have not had time to upgrade to 0.7.3) and we followed the memory sizing rule from the wiki:

memtable_throughput_in_mb * 3 * number of hot CFs + 1G + internal caches

With memtable_throughput_in_mb = 16 and 11 column families, if my calculation is right I need 16 * 3 * 11 + 1G + internal caches = 528 MB + 1G + internal caches, so more than 1.5 GB of heap memory. We start with 2 GB. Here are the cassandra.in.sh parameters:

JVM_OPTS= \
 -ea \
 -Xms2000M \
 -Xmx2000M \
 -XX:+UseParNewGC \
 -XX:+UseConcMarkSweepGC \
 -XX:+CMSParallelRemarkEnabled \
 -XX:SurvivorRatio=8 \
 -XX:MaxTenuringThreshold=1 \
 -XX:CMSInitiatingOccupancyFraction=75 \
 -XX:+UseCMSInitiatingOccupancyOnly \
 -XX:+HeapDumpOnOutOfMemoryError \

The question is that I don't really understand the configuration problem. If somebody has any clue about what we could do, or what we should monitor to avoid the problem, it would help, as it is very difficult to reproduce: it does not happen very often. Thanks for any help.
Jean-Yves

Below, the stack trace:

INFO [FLUSH-WRITER-POOL:1] 2011-03-04 17:59:45,952 Memtable.java (line 152) Writing Memtable-HintsColumnFamily@15830055(2280 bytes, 240 operations)
INFO [FLUSH-WRITER-POOL:1] 2011-03-04 17:59:45,972 Memtable.java (line 166) Completed flushing /opt/database/data/system/HintsColumnFamily-164-Data.db (255 bytes)
INFO [FLUSH-TIMER] 2011-03-04 18:17:47,802 ColumnFamilyStore.java (line 478) VoiceMail has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/opt/database/commitlog/CommitLog-1297775500410.log', position=10407973)
INFO [FLUSH-TIMER] 2011-03-04 18:17:47,803 ColumnFamilyStore.java (line 748) Enqueuing flush of Memtable-VoiceMail@29468188(3000 bytes, 120 operations)
INFO [FLUSH-WRITER-POOL:1] 2011-03-04 18:17:47,803 Memtable.java (line 152) Writing Memtable-VoiceMail@29468188(3000 bytes, 120 operations)
ERROR [FLUSH-WRITER-POOL:1] 2011-03-04 18:17:47,824 CassandraDaemon.java (line 87) Uncaught exception in thread Thread[FLUSH-WRITER-POOL:1,5,main]
java.lang.OutOfMemoryError
 at java.io.RandomAccessFile.readBytes(Native Method)
 at java.io.RandomAccessFile.read(RandomAccessFile.java:322)
 at org.apache.cassandra.io.util.BufferedRandomAccessFile.fillBuffer(BufferedRandomAccessFile.java:209)
 at org.apache.cassandra.io.util.BufferedRandomAccessFile.seek(BufferedRandomAccessFile.java:246)
 at org.apache.cassandra.io.util.BufferedRandomAccessFile.writeAtMost(BufferedRandomAccessFile.java:389)
 at org.apache.cassandra.io.util.BufferedRandomAccessFile.write(BufferedRandomAccessFile.java:365)
 at java.io.DataOutputStream.writeUTF(DataOutputStream.java:384)
 at java.io.RandomAccessFile.writeUTF(RandomAccessFile.java:1064)
 at org.apache.cassandra.io.SSTableWriter.append(SSTableWriter.java:94)
 at org.apache.cassandra.db.Memtable.writeSortedContents(Memtable.java:162)
 at org.apache.cassandra.db.Memtable.access$000(Memtable.java:46)
 at org.apache.cassandra.db.Memtable$1.runMayThrow(Memtable.java:178)
 at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
 at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
 at java.util.concurrent.FutureTask.run(FutureTask.java:138)
 at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
 at java.lang.Thread.run(Thread.java:619)
INFO [main] 2011-03-07 17:46:09,454 CLibrary.java (line 47) JNA not found. Native methods will be disabled.
INFO [main] 2011-03-07 17:46:09,768 DatabaseDescriptor.java (line 277) DiskAccessMode 'auto' determined to be standard, indexAccessMode is standard
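The sizing arithmetic from the wiki rule quoted above can be double-checked with a quick sketch; the inputs are the values from this cluster, and the result deliberately excludes the "internal caches" term, which the rule leaves unquantified:

```python
# Heap estimate per the 0.6 wiki rule:
#   memtable_throughput_in_mb * 3 * number_of_hot_CFs + 1G + internal caches
memtable_throughput_in_mb = 16
hot_column_families = 11
base_overhead_mb = 1024  # the "+ 1G" term

memtable_mb = memtable_throughput_in_mb * 3 * hot_column_families
minimum_heap_mb = memtable_mb + base_overhead_mb  # still excludes internal caches

print(memtable_mb)       # 528
print(minimum_heap_mb)   # 1552
```

So the 2 GB heap leaves roughly 500 MB of headroom for the internal caches and everything else, which is why the crashes are surprising on paper.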
Re: Out of Memory every 2 weeks
Thank you, I am going to try that.
cassandra 0.6.11 binary package problem
Hi all, Just for info: in apache-cassandra-0.6.11-bin.tar.gz there are both apache-cassandra-0.6.10.jar and apache-cassandra-0.6.11.jar in the lib directory. This causes trouble for my upgrade scripts, which use this file to get the installed version and check whether an upgrade is needed. :( Thanks for the good job. Jean-Yves
Re: cassandra 0.6.11 binary package problem
Don't know, I only checked http://www.apache.org/dyn/closer.cgi?path=/cassandra/0.6.11/apache-cassandra-0.6.11-bin.tar.gz Rgds. JY On Thu, Feb 3, 2011 at 3:36 PM, Jonathan Ellis jbel...@gmail.com wrote: Well, that's odd. :) Do any of the other tar.gz balls contain multiple jars? On Thu, Feb 3, 2011 at 6:06 AM, Jean-Yves LEBLEU jleb...@gmail.com wrote: Hi all, Just for info, in apache-cassandra-0.6.11-bin.tar.gz there are both apache-cassandra-0.6.10.jar and apache-cassandra-0.6.11.jar in the lib directory. Causing troubles to my upgrade scripts which use this file to get installed version and check if upgrade needed . :( Thanks for the good job. Jean-Yves -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Do you have a site in production environment with Cassandra? What client do you use?
Java + Pelops Cassandra 0.6.8
Re: Data management on a ring
Thanks for the answer. It was not exactly my point: I would like to know if, in a 10-node ring, it is possible to restrict replication of some data to only 2 nodes, and other data to all nodes? Regards. Jean-Yves On Wed, Nov 10, 2010 at 11:17 AM, aaron morton aa...@thelastpickle.comwrote: If I understand your correctly, you just want to add 8 nodes to a ring that already has 2 ? You could add the nodes and manually assign them tokens following the guidelines here http://wiki.apache.org/cassandra/Operations I'm not sure how to ensure the minimum amount of data transfer though. Adding all 8 at once is probably a bad idea. How about you make a new cluster of 8 nodes, manually assign tokens and then copy the data from the 2 node ring to the 8 node. Then move the 2 original nodes into the new cluster? Hope that helps. Aaron On 10 Nov 2010, at 20:56, Jean-Yves LEBLEU wrote: Hello all, We have an installation of 10 nodes, and we choose to deploy 5 rings of 2 nodes. We would like to change to a ring of 10 nodes. Some data have to be replicated on the 10 nodes, some should stay on 2 nodes. Do you have any idea or documentation pointer in order to have a ring of 10 nodes with such data repartition ? Thanks for any answer. Jean-Yves
Data management on a ring
Hello all, We have an installation of 10 nodes, and we chose to deploy 5 rings of 2 nodes. We would like to change to a single ring of 10 nodes. Some data has to be replicated on all 10 nodes, and some should stay on 2 nodes. Do you have any idea or a documentation pointer for setting up a ring of 10 nodes with such a data distribution? Thanks for any answer. Jean-Yves
Changing column families and drain
Hi all, When I look at the wiki, the procedure to change a column family is:

1. Empty the commitlog with nodetool drain.
2. Shut down Cassandra and verify that there is no remaining data in the commitlog.
3. Delete the sstable files (-Data.db, -Index.db, and -Filter.db) for any CFs removed, and rename the files for any CFs that were renamed.
4. Make necessary changes to your storage-conf.xml.
5. Start Cassandra back up and your edits should take effect.

How do we check that the commitlog is empty? It seems that there are still files in the commitlog directory after a drain. Is it necessary to shut down all nodes in a ring before changing the storage-conf.xml files? On Linux, is a kill -9 acceptable as a Cassandra shutdown? Thanks for any answer. Jean-Yves
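On the first question, a minimal sketch of a post-drain check. Both the commitlog path and the heuristic (treating any non-empty segment file as "not drained") are assumptions for illustration, not something the wiki specifies:

```python
import os

# Hypothetical post-drain check: list remaining commitlog segments and
# their sizes. COMMITLOG_DIR is an assumed path; adjust it to match the
# CommitLogDirectory setting in storage-conf.xml.
COMMITLOG_DIR = "/var/lib/cassandra/commitlog"

def remaining_segments(directory=COMMITLOG_DIR):
    """Return (filename, size_in_bytes) for each commitlog segment file."""
    return [(name, os.path.getsize(os.path.join(directory, name)))
            for name in sorted(os.listdir(directory))
            if name.startswith("CommitLog-")]

def looks_drained(directory=COMMITLOG_DIR):
    """Heuristic: True when no segment file holds any data at all."""
    return all(size == 0 for _, size in remaining_segments(directory))

if os.path.isdir(COMMITLOG_DIR):
    print(remaining_segments())
```

This only inspects file sizes; it cannot tell a replayable segment from a fully-flushed one, so it is a sanity check rather than a guarantee.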
Server takes a long time to answer
Hi all, We have a Cassandra installation with two nodes in a ring, replication factor = 2. Sometimes Cassandra becomes non-responsive: it takes about three minutes before answering a get. Do you have any idea what we should check when this happens, or what could cause the problem? We are using Cassandra release 0.6.3. Thanks and regards. Jean-Yves