Re: Cassandra goes into an infinite loop and data is lost
Okay, I am not sure if it is an infinite loop. I changed log4j to DEBUG only because Cassandra never came online after I started it; it seemed to just halt. Once I enabled DEBUG, it started showing those messages very fast and they never ended. I have just run nodetool cleanup, and it started reading the commitlog; everything seems normal now. Thanks for the help. I am really a newbie with Cassandra and have no idea how slices work; could you give me more information? Thanks a lot!

On Thu, Jul 14, 2011 at 1:36 PM, Jonathan Ellis jbel...@gmail.com wrote:

That says "I'm collecting data to answer requests." I don't see anything here that indicates an infinite loop. I do see it saying "N of 2147483647", which looks like you're doing slices with a much larger limit than is advisable (a good way to OOM, the way you already did).

On Wed, Jul 13, 2011 at 8:27 PM, Yan Chunlu springri...@gmail.com wrote:

I gave Cassandra an 8GB heap and somehow it ran out of memory and crashed. After I started it again, it just ran into the following endless loop, where the last line:

DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434

goes on forever. I have 3 nodes and RF=2, so I am losing data. Does that mean I am screwed and can't get it back?

DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) collecting 20 of 2147483647: q74k:false:14@1308886095008943
DEBUG [main] 2011-07-13 22:19:00,585 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 10fbu:false:1@1310223075340297
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: apbg:false:13@1305641597957086
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 1 of 2147483647: auje:false:13@1305641597957075
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 2 of 2147483647: ayj8:false:13@1305641597957060
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 3 of 2147483647: b4fz:false:13@1305641597957096
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 1 of 2147483647: 1017f:false:14@1310168680375612
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 2 of 2147483647: 1018e:false:14@1310168759614715
DEBUG [main] 2011-07-13 22:19:00,587 SliceQueryFilter.java (line 123) collecting 3 of 2147483647: 101dd:false:14@1310169260225339

On Thu, Jul 14, 2011 at 11:27 AM, Yan Chunlu springri...@gmail.com wrote:

DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123) collecting 0 of 2147483647: 100zs:false:14@1310168625866434

-- 闫春路

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

-- 闫春路
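[Editor's note] For anyone else hitting this: the 2147483647 in those log lines is Integer.MAX_VALUE used as a slice count. Below is a hedged Thrift sketch of a bounded slice instead; the row key, column family name, and page size are illustrative assumptions, and `client` is an already-connected Cassandra.Client. To scan a big row, page with successive calls rather than raising the count.

import java.nio.ByteBuffer;
import java.util.List;
import org.apache.cassandra.thrift.*;

public class BoundedSlice {
    // Fetch the first page of a row with a bounded count instead of
    // Integer.MAX_VALUE (the 2147483647 in the log above).
    static List<ColumnOrSuperColumn> firstPage(Cassandra.Client client)
            throws Exception {
        SliceRange range = new SliceRange();
        range.setStart(new byte[0]);   // empty start/finish = from the row's beginning
        range.setFinish(new byte[0]);
        range.setReversed(false);
        range.setCount(1000);          // sane page size

        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(range);

        return client.get_slice(
                ByteBuffer.wrap("rowkey".getBytes("UTF-8")),
                new ColumnParent("MyColumnFamily"), predicate,
                ConsistencyLevel.QUORUM);
    }
}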
Re: How to remove/add node
Thanks a lot. I will try it out and will let you know if the problem persists.

On Thu, Jul 14, 2011 at 5:52 AM, Sameer Farooqui cassandral...@gmail.com wrote:

As long as you have no data in this cluster, try clearing out the /var/lib/cassandra directory on all nodes and restarting Cassandra. The only way to change tokens after they've been set is a nodetool move or clearing /var/lib/cassandra.

On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik abdulsk.cassan...@gmail.com wrote:

Hi, I have deleted the data, commitlog and saved caches directories. I have removed one of the nodes from the seeds in cassandra.yaml. When I tried to use nodetool, it is still showing the removed node as up.

Thanks, Abdul
Re: New web client future API
Hi out there, it has been a bit quiet lately because I am changing the framework Apollo is built with, to get away from the Flash-based charting. But it continues ... Another question: what would be a good name for this UI? "Apollo" seems to be used in another Cassandra context already.

Greets, Markus

-----Original Message-----
From: MW | Codefreun.de [mailto:m...@codefreun.de]
Sent: Saturday, 25 June 2011 07:30
To: user@cassandra.apache.org
Subject: Re: New web client future API

I just implemented simple charting to monitor the keyspaces (please have a look at http://www.codefreun.de/apolloUI, though a Flash plugin in your browser is needed). I am now continuing with the monitoring for the column families, and I am not sure where to place the charts:

- A new tab
- A collapsible panel inside the already existing CF tab

What do you think?

-----Original Message-----
From: Jonathan Colby [mailto:jonathan.co...@gmail.com]
Sent: Monday, 20 June 2011 12:20
To: user@cassandra.apache.org
Subject: Re: New web client future API

I just took a look at the demo. This is really great stuff! I will try this on our cluster as soon as possible. I like this because it gives people not too familiar with the Cassandra CLI or Thrift a way to query Cassandra data.

On Jun 20, 2011, at 10:56 AM, Markus Wiesenbacher | Codefreun.de wrote:

Should work now ...

Sent from my iPhone

On 20.06.2011 at 09:28, Andrey V. Panov panov.a...@gmail.com wrote:

How to download it? Your "Download war-file" link opens just a blank page :(

On 14/06/2011, Markus Wiesenbacher | Codefreun.de m...@codefreun.de wrote:

I just released an early version of my web client (http://www.codefreun.de/apollo), which is Thrift-based, and therefore I would like to know what the future is ...
What Cassandra schema documentation is available?
I couldn't find any schema example for a strongly typed super column family. For example:

create column family Super1
  with comparator = UTF8Type
  and column_type = Super
  and key_validation_class = UTF8Type
  and column_metadata = [
    {column_name: username, validation_class: UTF8Type},
    {column_name: email, validation_class: UTF8Type, index_type: KEYS},
    {column_name: address, validation_class: UTF8Type,
     subcolumn_metadata = [
       {column_name: street, validation_class: UTF8Type},
       {column_name: state, validation_class: UTF8Type, index_type: KEYS}
     ]}
  ];

Or does someone know a better method? I'd like to make it as painless as possible for developers, with a strongly typed schema, so as to avoid orphan data.
gossiper problem
All: I have four Cassandra servers in a cluster. I did not restart any of the servers, so why do the following log entries show the four servers going down and coming back up many times? What is the possible reason? The connections between the four servers are good. Swap may be in use, because other applications run alongside the Cassandra server.

10.63.61.71 log:

INFO [Timer-0] 2011-07-13 10:44:55,732 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [GMFD:1] 2011-07-13 10:44:57,748 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [Timer-0] 2011-07-13 15:56:44,630 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [GMFD:1] 2011-07-13 15:56:44,653 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [Timer-0] 2011-07-13 16:03:24,391 Gossiper.java (line 181) InetAddress /10.63.61.72 is now dead.
INFO [GMFD:1] 2011-07-13 16:03:24,405 Gossiper.java (line 579) InetAddress /10.63.61.72 is now UP
INFO [Timer-0] 2011-07-13 22:21:41,246 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [Timer-0] 2011-07-13 22:22:45,602 Gossiper.java (line 181) InetAddress /10.63.61.73 is now dead.
INFO [Timer-0] 2011-07-13 22:22:45,602 Gossiper.java (line 181) InetAddress /10.63.61.72 is now dead.
INFO [GMFD:1] 2011-07-13 22:22:45,993 Gossiper.java (line 579) InetAddress /10.63.61.73 is now UP
INFO [GMFD:1] 2011-07-13 22:22:46,107 Gossiper.java (line 579) InetAddress /10.63.61.72 is now UP
INFO [GMFD:1] 2011-07-13 22:22:46,107 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [Timer-0] 2011-07-13 22:24:08,812 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [GMFD:1] 2011-07-13 22:24:08,920 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP

10.63.61.72 log:

INFO [Timer-0] 2011-07-13 02:06:03,941 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 02:06:05,109 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 03:39:41,918 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 03:39:45,536 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 10:10:17,449 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [Timer-0] 2011-07-13 10:10:17,471 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 10:10:18,451 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [GMFD:1] 2011-07-13 10:10:18,451 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 10:44:36,140 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 10:44:57,417 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 10:45:10,141 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 10:45:14,478 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 15:14:44,044 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 15:14:47,610 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 15:56:36,857 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 15:56:44,417 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 16:02:37,260 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 16:02:52,651 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 16:03:05,289 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 16:03:11,260 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 16:08:47,666 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 16:08:48,668 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 17:38:32,569 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 17:38:34,572 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 22:20:45,706 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 22:22:46,143 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 22:23:32,875 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 22:24:08,948 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
INFO [Timer-0] 2011-07-13 22:32:37,421 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 22:32:38,036 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP

10.63.61.73 log:

INFO [Timer-0] 2011-07-13 03:39:42,066 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
NegativeArraySizeException
All: What does the following error mean?

ERROR [pool-1-thread-62] 2011-07-14 14:31:04,671 CustomTThreadPoolServer.java (line 173) Error occurred during processing of message.
java.lang.NegativeArraySizeException
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296)
at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:290)
at org.apache.cassandra.thrift.KeyRange.read(KeyRange.java:541)
at org.apache.cassandra.thrift.Cassandra$get_range_slices_args.read(Cassandra.java:10426)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:1432)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1128)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
INFO [FLUSH-TIMER] 2011-07-14 15:08:00,537 ColumnFamilyStore.

Best Regards
Donna Li
Thrift Java Client - Get a column family from a Keyspace
Hi, I have been playing around with Cassandra and its Java Thrift client. From my understanding, one can retrieve a keyspace definition (a KsDef object) using the describe_keyspace(String name) method on the Cassandra.Client object, and subsequently get a list of all the ColumnFamily definitions in that keyspace using the getCf_defs() method on the KsDef object. Is there a way to get a single ColumnFamily if I know its name (just a convenience function)? Currently the only way seems to be iterating through the list of column families returned by getCf_defs().

Thanks in advance,
Chandra
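[Editor's note] As the thread says, iterating is currently the only way, but it is easy to wrap once. A hedged sketch; the helper itself is hypothetical, and only describe_keyspace and getCf_defs come from the API discussed above:

import org.apache.cassandra.thrift.*;

public class CfLookup {
    // Convenience wrapper (not part of the Thrift API): scan the keyspace
    // definition for one column family by name; returns null if absent.
    static CfDef findColumnFamily(Cassandra.Client client,
                                  String keyspace, String cfName)
            throws Exception {
        KsDef ksDef = client.describe_keyspace(keyspace);
        for (CfDef cfDef : ksDef.getCf_defs()) {
            if (cfName.equals(cfDef.getName())) {
                return cfDef;
            }
        }
        return null;
    }
}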
Re: Survey: Cassandra/JVM Resident Set Size increase
I finally upgraded from 0.7.4 to 0.8.0 (using the Riptano packages) 2 days ago. Before, my resident memory (for the Java process) would slowly grow without bound and the OS would kill the process. But over the last 2 days I _think_ it's been stable. I'll let you know in a week :-)

My other stats:
- AWS large (64-bit, 7.5GB, 4 compute units; no swap by default and I didn't enable it manually)
- CentOS 5.6
- Sun 1.6.0_24-b07
- 2 column families
- 4-machine cluster with RF=3
- Mostly balanced write/read load (usually more writes)
- Not quite big-data volumes: high 10^6 to low 10^7 ops/day
- No deletes or mutations; I only add or read
- Everything else is stock; I haven't tuned anything, as performance was OK. No JVM options other than what was in the package. No JNA. Not sure about the GC patterns.

will

On Tue, Jul 12, 2011 at 9:28 AM, Chris Burroughs chris.burrou...@gmail.com wrote:

### Preamble

There have been several reports on the mailing list of the JVM running Cassandra using too much memory. That is, the resident set size exceeds (max java heap size + mmapped segments) and continues to grow until the process swaps, the kernel OOM killer comes along, or performance just degrades too far due to the lack of space for the page cache. It has been unclear from these reports whether there is a pattern. My hope here is that by comparing JVM versions, OS versions, JVM configuration, etc., we will find something. Thank you everyone for your time.

Some example reports:
- http://www.mail-archive.com/user@cassandra.apache.org/msg09279.html
- http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Very-high-memory-utilization-not-caused-by-mmap-on-sstables-td5840777.html
- https://issues.apache.org/jira/browse/CASSANDRA-2868
- http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/OOM-or-what-settings-to-use-on-AWS-large-td6504060.html
- http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-memory-problem-td6545642.html

For reference, theories include (in no particular order):
- memory fragmentation
- JVM bug
- OS/glibc bug
- direct memory
- swap-induced fragmentation
- some other bad interaction of cassandra/jdk/jvm/os/nio-insanity

### Survey

1. Do you think you are experiencing this problem?
2. Why? (This is a good time to share a graph like http://www.twitpic.com/5fdabn or http://img24.imageshack.us/img24/1754/cassandrarss.png)
3. Are you using mmap? (If yes, be sure to have read http://wiki.apache.org/cassandra/FAQ#mmap , and explain how you have used pmap [or another tool] to rule out mmap and top deceiving you.)
4. Are you using JNA? Was mlockall successful (it's in the logs on startup)?
5. Is swap enabled? Are you swapping?
6. What version of Apache Cassandra are you using?
7. What is the earliest version of Apache Cassandra you recall seeing this problem with?
8. Have you tried the patch from CASSANDRA-2654?
9. What JVM and version are you using?
10. What OS and version are you using?
11. What are your JVM flags?
12. Have you tried limiting direct memory (-XX:MaxDirectMemorySize)?
13. Can you characterize how much GC your cluster is doing?
14. Approximately how many reads/writes per unit time is your cluster doing (per node or for the whole cluster)?
15. How are your column families configured (key cache size, row cache size, etc.)?
Re: Cassandra Monitoring
On Thu, Jul 14, 2011 at 8:58 AM, Albert Vila a...@imente.com wrote:

Does anyone have Cassandra cacti templates for server 0.7.4+?

On 20 December 2010 17:40, Edward Capriolo edlinuxg...@gmail.com wrote:

On Sun, Dec 19, 2010 at 10:37 PM, Dave Viner davevi...@gmail.com wrote:

Can you share the code for run_column_family_stores.sh?

On Sun, Dec 19, 2010 at 6:14 PM, Edward Capriolo edlinuxg...@gmail.com wrote:

On Sun, Dec 19, 2010 at 2:01 PM, Ran Tavory ran...@gmail.com wrote:

Mx4j is in-process, same JVM; you just need to drop mx4j-tools.jar into lib before you start Cassandra. jmx-to-rest runs in a separate JVM. It also has a nice, useful HTML interface that lets you look into any running host.

On Sunday, December 19, 2010, Dave Viner davevi...@gmail.com wrote:

How does mx4j compare with the earlier jmx-to-rest bridge listed on the operations page ("JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge")?

Thanks, Dave Viner

On Sun, Dec 19, 2010 at 7:01 AM, Ran Tavory ran...@gmail.com wrote:

FYI, I just added an mx4j section to the bottom of this page: http://wiki.apache.org/cassandra/Operations

On Sun, Dec 19, 2010 at 4:30 PM, Jonathan Ellis jbel...@gmail.com wrote:

mx4j? https://issues.apache.org/jira/browse/CASSANDRA-1068

On Sun, Dec 19, 2010 at 8:36 AM, Peter Schuller peter.schul...@infidyne.com wrote:

How / what are you monitoring? Best practices, someone? I recently set up monitoring using the cassandra-munin-plugins (https://github.com/jamesgolick/cassandra-munin-plugins). However, due to various little details, that wasn't too fun to integrate properly with munin-node-configure and automated configuration management. A problem is also the starting of a JVM for each use of jmxquery, which can become a problem with many column families. I like your web server idea: something persistent that can sit there, do the JMX acrobatics, and expose something more easily consumed by stuff like munin/zabbix/etc. It would be pretty nice to have that out of the box with Cassandra, though I expect that would be considered bloat. :)

-- / Peter Schuller

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

-- /Ran

-- /Ran

There is a lot of overhead on your monitoring station when it kicks up so many JMX connections, and there can also be NAT/hostname problems for remote JMX. My solution is to execute JMX checks over the Nagios Remote Plugin Executor (NRPE):

command[run_column_family_stores]=/usr/lib64/nagios/plugins/run_column_family_stores.sh $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$

Maybe not as fancy as a REST-JMX bridge, but it solves most of the RMI issues involved in pulling stats over JMX. That script is just a wrapper. For example, we can have our NMS directly call the JMX fetch code like this:

java -cp /usr/lib64/nagios/plugins/cassandra-cacti-m6.jar com.jointhegrid.m6.cassandra.CFStores service:jmx:rmi:///jndi/rmi://host:port/jmxrmi user pass org.apache.cassandra.db:columnfamily=columnfamily,keyspace=keyspace,type=ColumnFamilyStores

But as mentioned, this puts a lot of pressure on the monitoring node to open all those JMX connections. With NRPE I can farm the requests out; nodes end up executing their checks locally.

# cat /usr/lib64/nagios/plugins/run_column_family_stores.sh
java -cp /usr/lib64/nagios/plugins/cassandra-cacti-m6.jar com.jointhegrid.m6.cassandra.CFStores service:jmx:rmi:///jndi/rmi://${1}:${2}/jmxrmi ${3} ${4} org.apache.cassandra.db:columnfamily=${5},keyspace=${6},type=ColumnFamilyStores

All the code is up here:
http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp
http://www.jointhegrid.com/svn/cassandra-cacti-m6/trunk/src/com/jointhegrid/m6/cassandra/CFStores.java

My main goal was to point out that you do not need REST bridges and embedded web servers to run JMX checks remotely.

-- Albert Vila Puig, a...@imente.com, iMente.com, http://www.imente.com

http://www.jointhegrid.com/cassandra/cassandra-cacti-m6.jsp

There is some preliminary support for 0.7.X but I have not ported over all the graphs yet. Look over the next couple of days.

Edward
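[Editor's note] For anyone wiring this up themselves, the core of such a check is just a few lines of standard JMX. A hedged sketch; the host, port, keyspace/CF names, and the ReadCount attribute are assumptions (browse the bean in jconsole to confirm what your version exposes):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class JmxPeek {
    public static void main(String[] args) throws Exception {
        // Same mechanism the CFStores wrapper uses: open a remote JMX
        // connection and read one attribute off the ColumnFamilyStores bean.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://host:8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName bean = new ObjectName(
                    "org.apache.cassandra.db:columnfamily=MyCF,keyspace=MyKS,type=ColumnFamilyStores");
            System.out.println("ReadCount = " + mbs.getAttribute(bean, "ReadCount"));
        } finally {
            connector.close();
        }
    }
}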
Re: NegativeArraySizeException
That means something sent a nonsense request to Cassandra. Often this happens when a non-Cassandra client connects to the Thrift port.

On Thu, Jul 14, 2011 at 3:01 AM, Donna Li donna...@utstar.com wrote:

All: What does the following error mean?

ERROR [pool-1-thread-62] 2011-07-14 14:31:04,671 CustomTThreadPoolServer.java (line 173) Error occurred during processing of message.
java.lang.NegativeArraySizeException
at org.apache.thrift.protocol.TBinaryProtocol.readStringBody(TBinaryProtocol.java:296)
at org.apache.thrift.protocol.TBinaryProtocol.readString(TBinaryProtocol.java:290)
at org.apache.cassandra.thrift.KeyRange.read(KeyRange.java:541)
at org.apache.cassandra.thrift.Cassandra$get_range_slices_args.read(Cassandra.java:10426)
at org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:1432)
at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1128)
at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
INFO [FLUSH-TIMER] 2011-07-14 15:08:00,537 ColumnFamilyStore.

Best Regards
Donna Li

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: JDBC CQL Driver unable to locate cassandra.yaml
I tried putting cassandra.yaml on the classpath but got the same error. Adding -Dcassandra.config=file:/path/to/cassandra.yaml did work.

- Derek Tracy, trac...@gmail.com

On Wed, Jul 13, 2011 at 6:22 PM, Jonathan Ellis jbel...@gmail.com wrote:

The current version of the driver does require having the server's cassandra.yaml on the classpath. This is a bug.

On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy trac...@gmail.com wrote:

I am trying to integrate the Cassandra JDBC CQL driver with my company's ETL product. We have an interface that performs database queries using the respective JDBC drivers. When I try to use the Cassandra CQL JDBC driver I keep getting a stack trace: "Unable to locate cassandra.yaml". I am using Cassandra 0.8.1. Is there a guide on how to set up and use the JDBC driver?

Derek Tracy, trac...@gmail.com

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
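[Editor's note] For reference, a minimal end-to-end sketch of the workaround described above. The driver class name follows the 0.8-era CQL JDBC driver, but the URL form, keyspace, and table are assumptions to adjust for your build:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class CqlJdbcSmokeTest {
    public static void main(String[] args) throws Exception {
        // Workaround from this thread: point the driver at a local copy of
        // the server's cassandra.yaml before it loads.
        System.setProperty("cassandra.config", "file:/path/to/cassandra.yaml");

        Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
        Connection conn = DriverManager.getConnection(
                "jdbc:cassandra:root/root@localhost:9160/Keyspace1");
        try {
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery("SELECT * FROM Users WHERE KEY = 'k1'");
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } finally {
            conn.close();
        }
    }
}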
JDBC CQL Driver setAutoCommit Unsupported Method
Still trying to integrate Cassandra's CQL JDBC driver with my company's ETL product, but I ran into another roadblock. I will caveat this post with: I do not know a lot about JDBC and how it is implemented. When trying to test the connection (using the ETL tool) I get an "Unsupported Method" exception. I looked at the CassandraDriver.java file (line 561), and it immediately throws an exception whenever the setAutoCommit method is called. Is there any way (via the JDBC connection string, maybe) to disable any attempt to set autocommit or any other parameter? Or is my best bet going to be downloading the source and recompiling after commenting out the exceptions?

- Derek Tracy, trac...@gmail.com
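[Editor's note] Not from the thread, just one hypothetical client-side workaround, and only viable if the ETL tool lets you interpose on the Connection it uses: wrap the driver's Connection in a dynamic proxy that silently ignores setAutoCommit instead of propagating the exception.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.sql.Connection;

public class AutoCommitShim {
    // Hypothetical shim: setAutoCommit becomes a no-op; every other call
    // is delegated to the real driver Connection.
    static Connection ignoringAutoCommit(final Connection real) {
        return (Connection) Proxy.newProxyInstance(
                Connection.class.getClassLoader(),
                new Class<?>[] { Connection.class },
                new InvocationHandler() {
                    public Object invoke(Object proxy, Method method, Object[] args)
                            throws Throwable {
                        if ("setAutoCommit".equals(method.getName())) {
                            return null; // swallow the unsupported call
                        }
                        try {
                            return method.invoke(real, args);
                        } catch (InvocationTargetException e) {
                            throw e.getCause(); // unwrap the driver's exception
                        }
                    }
                });
    }
}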
Re: Question about compaction
These 0-byte files with the -Compacted suffix indicate that the associated sstables can be removed. In the current version, Cassandra deletes compacted sstables at full GC and on startup.

maki

2011/7/14 Sameer Farooqui cassandral...@gmail.com:

Running Cassandra 0.8.1. Ran a major compaction via:

sudo /home/ubuntu/brisk/resources/cassandra/bin/nodetool -h localhost compact

From what I'd read about Cassandra, I thought that after compaction all of the different SSTables on disk for a column family would be merged into one new file. However, there are now a bunch of 0-sized "Compacted" files and a bunch of Data files. Any ideas why there are still so many files left? Also, is a minor compaction the same thing as a read-only compaction in 0.7?

ubuntu@domU-12-31-39-0E-x-x:/raiddrive/data/DemoKS$ ls -l
total 270527136
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-5670-Compacted
-rw-r--r-- 1 root root 89457447799 2011-07-10 00:26 DemoCF-g-5670-Data.db
-rw-r--r-- 1 root root      193456 2011-07-10 00:26 DemoCF-g-5670-Filter.db
-rw-r--r-- 1 root root     2081159 2011-07-10 00:26 DemoCF-g-5670-Index.db
-rw-r--r-- 1 root root        4276 2011-07-10 00:26 DemoCF-g-5670-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-5686-Compacted
-rw-r--r-- 1 root root   920521489 2011-07-09 22:03 DemoCF-g-5686-Data.db
-rw-r--r-- 1 root root       11776 2011-07-09 22:03 DemoCF-g-5686-Filter.db
-rw-r--r-- 1 root root      126725 2011-07-09 22:03 DemoCF-g-5686-Index.db
-rw-r--r-- 1 root root        4276 2011-07-09 22:03 DemoCF-g-5686-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-5781-Compacted
-rw-r--r-- 1 root root   223970446 2011-07-09 22:38 DemoCF-g-5781-Data.db
-rw-r--r-- 1 root root        7216 2011-07-09 22:38 DemoCF-g-5781-Filter.db
-rw-r--r-- 1 root root       32750 2011-07-09 22:38 DemoCF-g-5781-Index.db
-rw-r--r-- 1 root root        4276 2011-07-09 22:38 DemoCF-g-5781-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-5874-Compacted
-rw-r--r-- 1 root root   156284248 2011-07-09 23:20 DemoCF-g-5874-Data.db
-rw-r--r-- 1 root root        5056 2011-07-09 23:20 DemoCF-g-5874-Filter.db
-rw-r--r-- 1 root root       10400 2011-07-09 23:20 DemoCF-g-5874-Index.db
-rw-r--r-- 1 root root        4276 2011-07-09 23:20 DemoCF-g-5874-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-6938-Compacted
-rw-r--r-- 1 root root 22947541446 2011-07-10 11:43 DemoCF-g-6938-Data.db
-rw-r--r-- 1 root root       49936 2011-07-10 11:43 DemoCF-g-6938-Filter.db
-rw-r--r-- 1 root root      563550 2011-07-10 11:43 DemoCF-g-6938-Index.db
-rw-r--r-- 1 root root        4276 2011-07-10 11:43 DemoCF-g-6938-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-6996-Compacted
-rw-r--r-- 1 root root   224253930 2011-07-10 11:28 DemoCF-g-6996-Data.db
-rw-r--r-- 1 root root        7216 2011-07-10 11:27 DemoCF-g-6996-Filter.db
-rw-r--r-- 1 root root       26250 2011-07-10 11:28 DemoCF-g-6996-Index.db
-rw-r--r-- 1 root root        4276 2011-07-10 11:28 DemoCF-g-6996-Statistics.db
-rw-r--r-- 1 root root           0 2011-07-13 03:07 DemoCF-g-8324-Compacted

-- w3m
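[Editor's note] For the curious, a tiny illustrative sketch (not a Cassandra tool) that applies the convention Maki describes: a zero-byte -Compacted marker means the matching sstable files are obsolete and awaiting deletion. The path is taken from the listing above:

import java.io.File;

public class ListCompacted {
    public static void main(String[] args) {
        // Walk a data directory and report sstables whose -Compacted
        // marker says they are awaiting deletion.
        File dir = new File("/raiddrive/data/DemoKS");
        File[] files = dir.listFiles();
        if (files == null) return;
        for (File f : files) {
            String name = f.getName();
            if (name.endsWith("-Compacted") && f.length() == 0) {
                String prefix = name.substring(0, name.length() - "-Compacted".length());
                System.out.println(prefix + "-Data.db is awaiting deletion");
            }
        }
    }
}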
Re: Storing counters in the standard column families along with non-counter columns ?
On 07/13/2011 03:57 PM, Aaron Morton wrote:

You can always use a dedicated CF for the counters, and use the same row key.

Of course one could do this. The problem is that you are now spending ~2x the disk space on row keys, and app-specific client code just became more complicated.
Re: Storing counters in the standard column families along with non-counter columns ?
Thanks Aaron and Chris, I appreciate your help. With a dedicated CF for counters, in addition to the issue pointed out by Chris, the major drawback I see is that I can't read, in a single query, the counters along with the regular-columns row, which my application requires all over the place. My use case is storing and reading the 'view count' of a post along with the other post details (like post content, postedBy, etc.). I wanted to store the view count (a counter column) alongside the details of the post.

On Thu, Jul 14, 2011 at 10:20 PM, Chris Burroughs chris.burrou...@gmail.com wrote:

On 07/13/2011 03:57 PM, Aaron Morton wrote:

You can always use a dedicated CF for the counters, and use the same row key.

Of course one could do this. The problem is that you are now spending ~2x the disk space on row keys, and app-specific client code just became more complicated.
Re: Question about compaction
You were right, Maki. Restarting Cassandra cleaned up the directory, and now there are only two SSTable files.

On Thu, Jul 14, 2011 at 9:08 AM, Maki Watanabe watanabe.m...@gmail.com wrote:

These 0-byte files with the -Compacted suffix indicate that the associated sstables can be removed. In the current version, Cassandra deletes compacted sstables at full GC and on startup.

maki
Re: question on capacity planning
So, in our experience, the storage overhead is much higher. If you plan on storing 120 TB of data, you should expect roughly 250 TB on disk after the data overhead. And since you have to leave 50% of storage space free for compaction, you're looking at needing about 500 TB of total storage space.

On Wed, Jun 29, 2011 at 9:17 AM, Ryan King r...@twitter.com wrote:

On Wed, Jun 29, 2011 at 5:36 AM, Jacob, Arun arun.ja...@disney.com wrote:

If I'm planning to store 20 TB of new data per week and expire all data every 2 weeks, with a replication factor of 3, do I only need approximately 120 TB of disk? I'm going to use TTLs in my column values to automatically expire data. Or would I need more capacity to handle sstable merges? Given this amount of data, would you recommend node storage at 2 TB per node or more? This application will have a heavy-write/moderate-read use profile.

You'll need extra space for both compaction and the overhead of the storage format. As to the amount of storage per node, that depends on your latency and throughput requirements.

-ryan
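[Editor's note] Pulling the numbers in this thread into one place, a back-of-envelope sketch. The ~2.1x storage-format overhead is an assumption drawn from the experience reported above, not a Cassandra constant:

public class CapacityEstimate {
    public static void main(String[] args) {
        // 20 TB/week of new data, 2-week retention, RF=3.
        double rawTb        = 20 * 2;              // 40 TB live data
        double replicatedTb = rawTb * 3;           // 120 TB after replication
        double onDiskTb     = replicatedTb * 2.1;  // ~250 TB after format overhead
        double totalTb      = onDiskTb * 2;        // keep ~50% free for compaction
        System.out.printf("plan for roughly %.0f TB total%n", totalTb); // ~500 TB
    }
}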
Data overhead discussion in Cassandra
We just set up a demo cluster with Cassandra 0.8.1, with 12 nodes, and loaded 1.5 TB of data into it. However, the actual space on disk used by Cassandra's data files is 3 TB. We're using a standard column family with a million rows (key = string) and 35,040 columns per key. The column name is a long and the column value is a double. I was just hoping to understand why the data overhead is so large. We're not using expiring columns. Even considering indexes and bloom filters, it shouldn't have bloated the data to 2x the original size. Or should it have? How can we better anticipate actual disk usage in the future?

- Sameer
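[Editor's note] One plausible explanation, hedged: for very small columns, the fixed per-column framing dominates. A sketch of the arithmetic under an assumed 0.7/0.8-era serialized layout (worth verifying against your version of Cassandra):

public class ColumnOverhead {
    public static void main(String[] args) {
        // Assumed serialized layout per column: 2-byte name length + name
        // bytes + 1-byte flags + 8-byte timestamp + 4-byte value length +
        // value bytes.
        int nameLen = 8, valueLen = 8;                     // long name, double value
        int payload = nameLen + valueLen;                  // 16 bytes of useful data
        int onDisk  = 2 + nameLen + 1 + 8 + 4 + valueLen;  // 31 bytes serialized
        System.out.printf("~%.1fx before row indexes and bloom filters%n",
                (double) onDisk / payload);                // ~1.9x
    }
}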
Re: CQL + Counters = bad request
Yep, there was an issue here. The RPM has been updated to 0.8.1-2 and deployed. Thanks for tracking this down.

On Wed, Jul 13, 2011 at 3:58 PM, Aaron Turner synfina...@gmail.com wrote:

Thanks. Looks like we tracked down the problem: the DataStax 0.8.1 RPM is actually 0.8.0.

rpm -qa | grep cassandra
apache-cassandra08-0.8.1-1

grep ' Cassandra version:' /var/log/cassandra/system.log | tail -1
INFO [main] 2011-07-13 12:04:31,039 StorageService.java (line 368) Cassandra version: 0.8.0

On Wed, Jul 13, 2011 at 11:40 AM, samal sa...@wakya.in wrote:

cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE KEY = '1_20110728_ifoutmulticastpkts';
Bad Request: line 1:51 no viable alternative at character '+'

I am able to insert it:

cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = '1_20110728_ifoutmulticastpkts';
cqlsh> UPDATE counts SET 1310367600 = 1310367600 + 17 WHERE KEY = '1_20110728_ifoutmulticastpkts';

[default@test] list counts;
Using default limit of 100
---
RowKey: 1_20110728_ifoutmulticastpkts
=> (counter=12, value=16)
=> (counter=1310367600, value=34)
---
RowKey: 1
=> (counter=1, value=10)

2 Rows Returned.
[default@test]

-- Aaron Turner
http://synfin.net/  Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin
"carpe diem quam minimum credula postero"
Re: Off-heap Cache
Use "describe keyspace" and check that the settings are right. Check the logs on all the servers and make sure you don't see errors. Check that the JNA jar is on all the servers.

Regards,
/VJ

On Wed, Jul 13, 2011 at 1:29 PM, Raj N raj.cassan...@gmail.com wrote:

How do I ensure it is indeed using the SerializingCacheProvider?

Thanks, -Rajesh

On Tue, Jul 12, 2011 at 1:46 PM, Jonathan Ellis jbel...@gmail.com wrote:

You need to set row_cache_provider=SerializingCacheProvider on the column family definition (via the CLI).

On Tue, Jul 12, 2011 at 9:57 AM, Raj N raj.cassan...@gmail.com wrote:

Do we need to do anything special to turn the off-heap cache on? https://issues.apache.org/jira/browse/CASSANDRA-1969

-Raj

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
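[Editor's note] One programmatic way to do VJ's check, as a hedged sketch. The getRow_cache_provider accessor follows the 0.8 Thrift CfDef field added by CASSANDRA-1969; verify it exists in your build:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;

public class CacheProviderCheck {
    // Programmatic equivalent of "describe keyspace": dump each column
    // family's configured row cache provider.
    static void printProviders(Cassandra.Client client, String keyspace)
            throws Exception {
        KsDef ks = client.describe_keyspace(keyspace);
        for (CfDef cf : ks.getCf_defs()) {
            System.out.println(cf.getName() + " -> " + cf.getRow_cache_provider());
        }
    }
}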
Re: gossiper problem
How about GC logs? What are your pause times? Your JVM settings might help. If you are not sure how to enable GC logs, check cassandra.yaml; look for application pause times. It is highly recommended not to swap; include the JNA jar.

Regards,
/VJ

On Thu, Jul 14, 2011 at 1:42 AM, Donna Li donna...@utstar.com wrote:

All: I have four Cassandra servers in a cluster. I did not restart any of the servers, so why do the following log entries show the four servers going down and coming back up many times? What is the possible reason? The connections between the four servers are good. Swap may be in use, because other applications run alongside the Cassandra server.

10.63.61.71 log:

INFO [Timer-0] 2011-07-13 10:44:55,732 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [GMFD:1] 2011-07-13 10:44:57,748 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
...

10.63.61.72 log:

INFO [Timer-0] 2011-07-13 02:06:03,941 Gossiper.java (line 181) InetAddress /10.63.61.71 is now dead.
INFO [GMFD:1] 2011-07-13 02:06:05,109 Gossiper.java (line 579) InetAddress /10.63.61.71 is now UP
...
Commit log is not emptied after nodetool drain
The deployed version is based on 0.6.13. After nodetool drain is invoked on one of the nodes, the commit log is not emptied. Is this the expected behavior? If so, how can I rename a column family on the 0.6.x branch? Here is the log output:

INFO [COMMIT-LOG-WRITER] 2011-07-15 00:39:49,541 CommitLogSegment.java (line 50) Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1310661589541.log
INFO [RMI TCP Connection(8)-202.120.2.16] 2011-07-15 00:39:49,544 StorageService.java (line 391) Node is drained

I saw an issue here, but it was reported against the 0.8.x branch: https://issues.apache.org/jira/browse/CASSANDRA-2874

best regards,
Zhu Han (韩竹)
Jianguopuzi (https://jianguopuzi.com), the simplest and easiest cloud storage: sync files, share photos, back up documents!
ttl on a record?
Hi, Cassandra currently supports setting a TTL on columns. Is there any way to do the same for a record/row?

Regards,
Boris
Re: Commit log is not emptied after nodetool drain
It's expected to have a new, empty segment after drain completes.

2011/7/14 Zhu Han schumi@gmail.com:

The deployed version is based on 0.6.13. After nodetool drain is invoked on one of the nodes, the commit log is not emptied. Is this the expected behavior? If so, how can I rename a column family on the 0.6.x branch? Here is the log output:

INFO [COMMIT-LOG-WRITER] 2011-07-15 00:39:49,541 CommitLogSegment.java (line 50) Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1310661589541.log
INFO [RMI TCP Connection(8)-202.120.2.16] 2011-07-15 00:39:49,544 StorageService.java (line 391) Node is drained

I saw an issue here, but it was reported against the 0.8.x branch: https://issues.apache.org/jira/browse/CASSANDRA-2874

best regards,
Zhu Han

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: ttl on a record?
No. Just set it on all the columns.

On Thu, Jul 14, 2011 at 7:08 PM, Boris Yen yulin...@gmail.com wrote:

Hi, Cassandra currently supports setting a TTL on columns. Is there any way to do the same for a record/row?

Regards,
Boris

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
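[Editor's note] A minimal Thrift sketch of Jonathan's suggestion; the column family, row key, column name, and TTL value are illustrative assumptions, and `client` is a connected Cassandra.Client. Once every column in the row has expired, the row itself disappears, as confirmed later in this thread:

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class RowTtlByColumns {
    // Emulate a row-level TTL by giving every column the same TTL.
    static void insertWithTtl(Cassandra.Client client) throws Exception {
        Column col = new Column();
        col.setName(ByteBuffer.wrap("viewCount".getBytes("UTF-8")));
        col.setValue(ByteBuffer.wrap("42".getBytes("UTF-8")));
        col.setTimestamp(System.currentTimeMillis() * 1000); // microseconds
        col.setTtl(7 * 24 * 3600);                           // seconds until expiry

        client.insert(ByteBuffer.wrap("post-123".getBytes("UTF-8")),
                      new ColumnParent("Posts"), col, ConsistencyLevel.QUORUM);
    }
}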
Re: ttl on a record?
Hi Jonathan, in this case, will the record with no columns left be removed from Cassandra automatically? If not, do you have any plans to support that?

Regards,
Boris

On Fri, Jul 15, 2011 at 10:43 AM, Jonathan Ellis jbel...@gmail.com wrote:

No. Just set it on all the columns.

On Thu, Jul 14, 2011 at 7:08 PM, Boris Yen yulin...@gmail.com wrote:

Hi, Cassandra currently supports setting a TTL on columns. Is there any way to do the same for a record/row?

Regards,
Boris

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Commit log is not emptied after nodetool drain
If you have non-empty segments post-drain, that is a bug. Is it reproducible?

2011/7/14 Zhu Han schumi@gmail.com:

Jonathan, but all the old non-empty log segments are kept on disk, and Cassandra takes some time to apply the operations from these closed log segments after the process restarts. Is that expected?

best regards,
Zhu Han

2011/7/15 Jonathan Ellis jbel...@gmail.com:

It's expected to have a new, empty segment after drain completes.

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: Commit log is not emptied after nodetool drain
2011/7/15 Jonathan Ellis jbel...@gmail.com:

If you have non-empty segments post-drain, that is a bug. Is it reproducible?

I think it is always reproducible on the 0.6.x branch. Here is a simple experiment:

1) Run bin/nodetool -h localhost drain.

2) While the memtables are flushed, we can observe the name of the old commit log in the log:

INFO [main] 2011-07-15 11:57:46,742 ColumnFamilyStore.java (line 478) Data has reached its threshold; switching in a fresh Memtable at CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1310702265959.log', position=125)

3) Before the node is drained, a new commitlog is created:

INFO [COMMIT-LOG-WRITER] 2011-07-15 11:58:11,383 CommitLogSegment.java (line 50) Creating new commitlog segment /var/lib/cassandra/commitlog/CommitLog-1310702291383.log
INFO [RMI TCP Connection(2)-192.168.1.101] 2011-07-15 11:58:11,413 StorageService.java (line 391) Node is drained

4) After the node is drained and killed, there are still two commit logs in the directory:

$ ls -lh /var/lib/cassandra/commitlog/
total 128K
-rw-r--r-- 1 root root 439 2011-07-15 11:57 CommitLog-1310702265959.log
-rw-r--r-- 1 root root 125 2011-07-15 11:58 CommitLog-1310702291383.log
Re: gossiper problem
Well, I am not a JVM guru, but it seems the server has a memory problem.

INFO [GMFD:1] 2011-07-13 10:44:57,748 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [Timer-0] 2011-07-13 15:56:44,630 Gossiper.java (line 181) InetAddress /10.63.61.74 is now dead.
INFO [GMFD:1] 2011-07-13 15:56:44,653 Gossiper.java (line 579) InetAddress /10.63.61.74 is now UP
INFO [Timer-0] 2011-07-13 16:03:24,391 Gossiper.java (line 181) InetAddress /10.63.61.72 is now dead.

It is swapping because it needs memory. Recommended: disable swap. Better to die with an OOM than to keep swapping.

INFO [GC inspection] 2011-07-13 03:12:06,153 GCInspector.java (line 110) GC for ConcurrentMarkSweep: 1097 ms, 371528920 reclaimed leaving 17677528 used; max is 118784
INFO [GC inspection] 2011-07-13 03:12:07,351 GCInspector.java (line 110) GC for ParNew: 466 ms, 20619976 reclaimed leaving 157240232 used; max is 118784
INFO [GC inspection] 2011-07-13 03:25:54,378 GCInspector.java (line 110) GC for ParNew: 283 ms, 26850072 reclaimed leaving 154180424 used; max is 118784
INFO [GC inspection] 2011-07-13 06:29:58,092 GCInspector.java (line 110) GC for ParNew: 538 ms, 17358792 reclaimed leaving

My Cassandra version is 0.6.3, and the GC-related configuration in storage-conf.xml is:

<GCGraceSeconds>864000</GCGraceSeconds>

The JVM configuration is as follows:

JVM_OPTS= \
  -ea \
  -Xms256M \
  -Xmx1G \
  -XX:+UseParNewGC \

Can I decrease the JVM_OPTS to -Xms128M -Xmx512M to avoid swapping? The data saved in Cassandra is small; I do not need so much memory.

Reducing the max heap size won't solve the problem; I think it will lead to more swapping. Data size alone does not determine the memory requirement: the number of memtables (each CF has its own memtable and size), compaction, caching, and reads all count. You should upgrade to 0.7 or later.

/samal
Re: ttl on a record?
Thanks a lot. ^^ My project can make good use of this feature.

On Fri, Jul 15, 2011 at 10:59 AM, Jonathan Ellis jbel...@gmail.com wrote:

On Thu, Jul 14, 2011 at 7:50 PM, Boris Yen yulin...@gmail.com wrote:

Hi Jonathan, in this case, will the record with no columns left be removed from Cassandra automatically?

Yes.

-- Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Cache layer in front of cassandra... any help / suggestions?
Hi, we're presently trying to use Cassandra as a storage/retrieval system for live data and composite counters (on the data). As we work on telecom data records (voice call/SMS/GPRS xDRs), the data volume is simply huge, and we definitely need a controlled caching mechanism in front of the Cassandra layer. By the term "controlled cache layer", what I am trying to suggest is something like maintaining a list of the highest-usage (and therefore highest-occurrence) phone numbers somewhere; the cache layer would hold all live data and counters for those numbers in memory. All read/write operations relating to that particular set of numbers would then be very fast, since there would be no physical disk access. For all other records in the data feed (which occur less frequently), the cache would pass read/write operations through to the Cassandra store directly. The basic caching mechanism provided by Cassandra seems to be inadequate for this strategy :( Any ideas or suggestions on how we might proceed?

Thanks & Regards,
Suman Ghosh
Kolkata, India.
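[Editor's note] No answer from the list here, but as a starting point, one hedged sketch of the "controlled cache" idea described above: an LRU map holds counters for the hot numbers, and everything else passes straight through. CounterStore is a hypothetical wrapper around whatever Cassandra client is in use, and how numbers get promoted into the hot set is left to a periodic top-N job:

import java.util.LinkedHashMap;
import java.util.Map;

public class HotNumberCache {
    /** Hypothetical backing store around your Cassandra client. */
    public interface CounterStore {
        void writeCounter(String number, long value);
        void incrementCounter(String number, long delta);
    }

    private final Map<String, Long> hot;
    private final CounterStore store;

    public HotNumberCache(final int capacity, final CounterStore store) {
        this.store = store;
        // Access-ordered LinkedHashMap: the least recently used number is
        // evicted and its counter flushed back to Cassandra.
        this.hot = new LinkedHashMap<String, Long>(capacity, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Long> eldest) {
                if (size() > capacity) {
                    store.writeCounter(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    /** Hot numbers are counted in memory; everything else passes through. */
    public synchronized void increment(String number, long delta) {
        Long current = hot.get(number);
        if (current != null) {
            hot.put(number, current + delta);
        } else {
            store.incrementCounter(number, delta);
        }
    }

    /** Promote a number into the hot set (e.g., from a periodic top-N scan). */
    public synchronized void markHot(String number, long currentValue) {
        hot.put(number, currentValue);
    }
}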