Re: column family names
Moving to the user list. The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and 0.7; AFAIK it's to ensure the file names created for the CFs can be correctly parsed. So it's probably not going to change. The names have to match the \w regex class, which includes the underscore character. Aaron

On 30 Aug 2010, at 21:01, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi, Now that we can make column families on the fly, it gets interesting to use column families more as part of the data model (it can reduce disk space quite a bit vs. super columns in some cases). However, the column family name validator is currently pretty strict, allowing only word characters, and in some cases it is pretty darned nice to be able to put something like a '-' in-between-all-the-words. Any reason to be this strict, or could it be loosened up a little bit? Terje
Re: column family names
Ah, sorry, I forgot that underscore was part of \w. That will do the trick for now. I do not see the big issue with file names, though. Why not expand the allowed characters a bit and escape the file names? Maybe some sort of URL-like escaping. Terje

On Mon, Aug 30, 2010 at 6:29 PM, Aaron Morton aa...@thelastpickle.com wrote: Moving to the user list. The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and 0.7; AFAIK it's to ensure the file names created for the CFs can be correctly parsed. So it's probably not going to change. The names have to match the \w regex class, which includes the underscore character. Aaron [...]
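For illustration, the \w rule Aaron describes can be checked client-side before attempting to create a CF. A minimal sketch in Python (the function name is hypothetical; the character class is written out explicitly to match Java's default ASCII-only \w, since Python's \w also matches non-ASCII word characters):

```python
import re

# Approximate client-side check of the rule described above:
# column family names must consist solely of \w characters
# (letters, digits, underscore).
CF_NAME_RE = re.compile(r"^[A-Za-z0-9_]+$")  # ASCII-only, mirroring Java's default \w

def is_valid_cf_name(name):
    """Return True if `name` would pass the \\w-only validator."""
    return bool(CF_NAME_RE.match(name))

print(is_valid_cf_name("Blog_Entries"))   # underscore is allowed -> True
print(is_valid_cf_name("blog-entries"))   # hyphen is rejected -> False
```

The actual validation happens server-side, so this is only a convenience to fail fast before a schema call.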
Re: Thrift + PHP: help!
Juho, do you mind sharing your implementation with the group? We'd love to help as well with rewriting the Thrift interface, specifically TSocket.php, which seems to be where the majority of the problems are lurking. Has anyone tried compiling native Thrift support as described here: https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Thrift-PHP-help-tp5437314p5478057.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
Re: cassandra for a inbox search with high reading qps
Chen, have you considered using Lucandra (http://www.slideshare.net/otisg/lucandra) for inbox search? We have a similar setup and are currently looking into using Lucandra rather than implementing the searching ourselves with pure Cassandra.
Re: TException: Error: TSocket: timed out reading 1024 bytes from 10.1.1.27:9160
Hi guys, there are several patches you need to apply to Thrift to completely resolve all timeout errors. Here's a list of them, along with a link to download a patched thrift library: http://www.softwareprojects.com/resources/programming/t-php-thrift-library-for-cassandra-1982.html
Re: RowMutationVerbHandler.java (line 78) Error in row mutation
Is it possible this was a new node with a manual token and autobootstrap turned off? If not, could you give more details about the node? Gary.

On Fri, Aug 27, 2010 at 17:58, B. Todd Burruss bburr...@real.com wrote: i got the latest code this morning. i'm testing with 0.7

ERROR [ROW-MUTATION-STAGE:388] 2010-08-27 15:54:58,053 RowMutationVerbHandler.java (line 78) Error in row mutation
org.apache.cassandra.db.UnserializableColumnFamilyException: Couldn't find cfId=1002
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:113)
    at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:372)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:382)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:340)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:46)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:50)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
get_slice sometimes returns previous result on php
I've run into a strange bug where get_slice returns the result from a previous query. My application iterates over a set of columns inside a super column, and for some reason, sometimes (quite rarely, but often enough that it shows up) the results get shifted around so that the application gets the previous result. The application is using the same cassandra thrift connection (it doesn't close it in between) and everything is happening inside the same php process. Here's a cleaned-up example from logs where this happens:

14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored blog content for blog id 47528165 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra:
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : B
14:40 suomirock php-fi: [MISC] WARNING /blog.php: Cassandra stored blog content for blog id 47523032 differs from database content.
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from cassandra: B
14:40 suomirock php-fi: [MISC] WARNING /blog.php: from database : CC

The data model is a super column family which stores blog entries. Each user has a single row. Inside this row there are super columns, where each super column contains a single blog entry: the key of the super column is the blog id number, and one of the columns inside it contains the blog content. The data in cassandra is correctly there, and it's the same as what's inside our old storage tier (PostgreSQL), so I'm able to compare the data returned from cassandra with the data returned from the old database. Here's part of the output from cassandra-cli where I queried the row for this user. As you can see, the blog id matches the super_column inside cassandra.
=> (super_column=47540671, (column=content, value=, timestamp=1282940401925456))
=> (super_column=47528165, (column=content, value=B, timestamp=1282940401925456))
=> (super_column=47523032, (column=content, value=CC, timestamp=1282940401925456))

I'm in the middle of writing a bunch of debugging code to get better data on what's really going on, but I'd be very happy if someone had any clue or helpful ideas on how to debug this. - Juho Mäkinen
Re: Calls block when using Thrift API
If you're only interested in accessing data natively, I suggest you try the fat client. It brings up a node that participates in gossip and exposes the StorageProxy API, but does not receive a token and so does not have storage responsibilities: StorageService.instance.initClient(); In 0.7 you will want to loop until the node receives its storage definitions via gossip. Calling SS.instance.initServer() directly bypasses some of the routine startup operations. I don't recommend it unless you really know what's going on (it might work now, but it's not guaranteed to in the future). Gary.

On Sun, Aug 29, 2010 at 10:28, Ruben de Laat ru...@logic-labs.nl wrote: Just for the people looking to run Cassandra embedded and access it directly (not via Thrift/Avro), this works: StorageService.instance.initServer(); And then just use the StorageProxy for data access. I have no idea if this is the right way, but it works. Kind regards, Ruben
Re: cassandra disk usage
column names are stored per cell (moving to user@)

On Mon, Aug 30, 2010 at 6:58 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi, Was just looking at an SSTable file after loading a dataset. The data load has no updates of data, but: columns can in some rare cases be added to existing super columns, and super columns will be added to the same key (but not overwriting existing data). I batch these, but it is quite likely that there will be 2-3 updates to a key. This is a randomly selected SSTable file from a much bigger dataset. The data is stored as date (super column) / type (column) / value. Date is a simple 20100811-type string; value is a small integer, 2 digits on average. If I run a simple strings on the SSTable and look for the data:

value: 692 KB of data
type: 4.01 MB of data
date: 4.6 MB of data
In total: 9.4 MB

The size of the .db file, however, is 36.4 MB. The expansion from the column headers is bad enough, but I can somehow accept that. The almost 4x expansion on top of that is a bit harder to justify... Anyone know already where this expansion comes from? Or do I need to take a careful look at the source (probably useful anyway :))? Terje -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: Thrift + PHP: help!
Interesting! Thanks for sharing. Have you considered, instead of retrying the failing node, iterating through the other nodes in your cluster? If one node is failing (let's assume it's overloaded for a minute), you're probably going to be better off having the client send the insert to the next node in line. Thoughts?

On 8/30/2010 9:17 AM, Juho Mäkinen wrote: Yes, I've already been planning to do so. The class still has some dependencies on our other functions which I need to clear out first. Basically, each API call is wrapped inside a retry loop, as we can assume that each operation can be retried as many times as needed:

$tries = 0;
$this->last_exception = null;
$delay = 1000; // start with 1ms retry delay
do {
    try {
        $this->client->insert($this->keyspace, $key, $column_path, $value, $timestamp, $consistency_level);
        return;
    } catch (cassandra_InvalidRequestException $e) {
        Logger::error("InvalidRequestException: " . $e->why . ", stacktrace: " . $e->getMessage());
        throw $e;
    } catch (Exception $e) {
        $this->last_exception = $e;
        $tries++;
        // sleep for some time and try again
        usleep($delay);
        $delay = $delay * 3;
        $this->connect(); // Drop current server and reopen a connection to another server
    }
} while ($tries < 4);
// Give up and throw the last exception
throw $this->last_exception;

- Juho Mäkinen

On Mon, Aug 30, 2010 at 3:48 PM, Mike Peters cassan...@softwareprojects.com wrote: Juho, do you mind sharing your implementation with the group? [...]
Re: Thrift + PHP: help!
On Mon, Aug 30, 2010 at 4:24 PM, Mike Peters cassan...@softwareprojects.com wrote: Have you considered, instead of retrying the failing node, iterating through other nodes in your cluster?

Yes, the $this->connect() does just that: it removes the previous node from the node list and gives the list back to the Thrift connection function. In case all nodes have been tried (and thus removed), it refills the node list and starts looping through it again. In practice this should never happen, but the code is there just to be sure :) - Juho Mäkinen

If one node is failing (let's assume it's overloaded for a minute), you're probably going to be better off having the client send the insert to the next node in line. Thoughts? [...]
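The rotation behavior Juho describes for $this->connect() can be sketched roughly like this (a Python sketch with hypothetical names; his real implementation is PHP inside the client class):

```python
class NodeRotation:
    """Rotate through a node list: drop a node when it fails,
    and refill the list once every node has been tried."""
    def __init__(self, nodes):
        self.all_nodes = list(nodes)
        self.available = list(nodes)

    def next_node(self):
        if not self.available:                     # every node was dropped:
            self.available = list(self.all_nodes)  # refill and loop again
        return self.available[0]

    def drop_current(self):
        if self.available:
            self.available.pop(0)

r = NodeRotation(["10.0.0.1", "10.0.0.2"])
first = r.next_node()    # -> "10.0.0.1"
r.drop_current()         # it failed; remove it from the list
second = r.next_node()   # -> "10.0.0.2"
r.drop_current()
third = r.next_node()    # list exhausted -> refilled, back to "10.0.0.1"
```

The refill step is the "just to be sure" safety net: the client never runs out of candidates, it just starts over.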
NodeTool won't connect remotely
Hi, I'm trying to manage my cassandra cluster from a remote box and having issues getting nodetool to connect. All the machines I'm using are running on AWS. Here's what happens when I try: /opt/apache-cassandra-0.6.4/bin/nodetool -h xxx.xxx.xxx.143 -p 10036 ring Error connecting to remote JMX agent! java.rmi.ConnectException: Connection refused to host: xxx.xxx.xxx.143; nested exception is: java.net.ConnectException: Connection timed out When I'm local to a box (Ubuntu 10.04) running Cassandra, I can connect fine via both 127.0.0.1 and external ip (xxx.xxx.xxx.143). I can telnet into the jmx port from an external machine fine: telnet xxx.xxx.xxx.143 10036 Trying xxx.xxx.xxx.143... Connected to xxx.xxx.xxx.143. Escape character is '^]'. I already added the -Djava.rmi.server.hostname parameter to the java runtime, but it didn't seem to affect anything. /usr/bin/jsvc -home /usr/lib/jvm/java-6-openjdk/jre -pidfile /var/run/cassandra.pid -errfile 1 -outfile /var/log/cassandra/output.log -cp /usr/share/cassandra/antlr-3.1.3.jar:/usr/share/cassandra/apache-cassandra-0.6.3.jar:/usr/share/cassandra/avro-1.2.0-dev.jar:/usr/share/cassandra/clhm-production.jar:/usr/share/cassandra/commons-cli-1.1.jar:/usr/share/cassandra/commons-codec-1.2.jar:/usr/share/cassandra/commons-collections-3.2.1.jar:/usr/share/cassandra/commons-lang-2.4.jar:/usr/share/cassandra/google-collections-1.0.jar:/usr/share/cassandra/hadoop-core-0.20.1.jar:/usr/share/cassandra/high-scale-lib.jar:/usr/share/cassandra/ivy-2.1.0.jar:/usr/share/cassandra/jackson-core-asl-1.4.0.jar:/usr/share/cassandra/jackson-mapper-asl-1.4.0.jar:/usr/share/cassandra/jline-0.9.94.jar:/usr/share/cassandra/json-simple-1.1.jar:/usr/share/cassandra/libthrift-r917130.jar:/usr/share/cassandra/log4j-1.2.14.jar:/usr/share/cassandra/slf4j-api-1.5.8.jar:/usr/share/cassandra/slf4j-log4j12-1.5.8.jar:/etc/cassandra:/usr/share/java/commons-daemon.jar -Xmx4G -Xms128M -Djava.rmi.server.hostname=xxx.xxx.xxx.143 -Dcassandra 
-Dstorage-config=/etc/cassandra -Dcom.sun.management.jmxremote.port=10036 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false org.apache.cassandra.thrift.CassandraDaemon

netstat shows that I'm still bound to IPv6:

netstat -nap | grep 10036
tcp6 0 0 :::10036 :::* LISTEN 29277/jsvc

And now I'm at an impasse. Any help would be greatly appreciated. Thanks -Allan
Re: NodeTool won't connect remotely
I think that JMX needs additional ports to function correctly. Try disabling all firewalls between the client and the server, so that the client can connect to any port on the server, and try again. - Juho Mäkinen

On Mon, Aug 30, 2010 at 7:07 PM, Allan Carroll alla...@gmail.com wrote: Hi, I'm trying to manage my cassandra cluster from a remote box and having issues getting nodetool to connect. [...]
Re: Cassandra HAProxy
FWIW - we've been using HAProxy in front of a cassandra cluster in production and haven't run into any problems yet. It sounds like our cluster is tiny in comparison to Anthony M's cluster, but I just wanted to mention that others out there are doing the same. One thing in this thread that I thought was interesting is Ben's initial comment that "the presence of the proxy precludes clients properly backing off from nodes returning errors." I think it would be very cool if someone implemented a mechanism for HAProxy to detect the error nodes and then drop those nodes from the rotation. I'd be happy to help with this, as I know how it works with HAProxy and standard web servers or other TCP servers. But I'm not sure how to make it work with Cassandra, since, as Ben points out, it can return valid TCP responses (that say error-condition) on the standard port. Dave Viner

On Sun, Aug 29, 2010 at 4:48 PM, Anthony Molinaro antho...@alumni.caltech.edu wrote: On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote: On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro antho...@alumni.caltech.edu wrote: I don't know, it seems to tax our setup of 39 extra-large EC2 nodes; it's also closer to 24000 reqs/sec at peak, since there are different tables (2 tables for each read and 2 for each write). Could you clarify what you mean here? On the face of it, this performance seems really poor given the number and size of nodes. As you say, I would expect to achieve much better performance given the node size, but if you go back and look through some of the issues we've seen over time, you'll find we've been hit with nodes being too small, having too few nodes to deal with request volume, having OOMs, having bad sstables, having the ring appear different to different nodes, and several other problems. Many of the I/O problems presented themselves as MessageDeserializer pool backups (although we stopped having these since Jonathan was by and suggested a row cache of about 1 GB, thanks Riptano!). We currently have mystery OOMs which are probably caused by GC storms during compactions (although usually the nodes restart and compact fine, so who knows). I also regularly watch nodes go away for 30 seconds or so (logs show a node go dead, then come back to life a few seconds later). I've sort of given up worrying about these, as we are in the process of moving this cluster to our own machines in a colo, so I figure I should wait until they are moved and see how the new machines do before I worry more about performance. -Anthony -- Anthony Molinaro antho...@alumni.caltech.edu
Re: cassandra disk usage
On Mon, Aug 30, 2010 at 10:10 PM, Jonathan Ellis jbel...@gmail.com wrote: column names are stored per cell (moving to user@)

I think that is already accounted for in my numbers. What I listed was measured from the actual SSTable file (using the output from strings sstable.db), so the multiple copies of the super column and column names are already part of the strings output. Typically, you get something like this as output from strings:

20100629
20100629
20100629
string matching the type
java.util.BitSetn bitst [Jxpur [Jx

repeating. I am not entirely sure why I get those repeating super column names there (there are more super column names in this file than column names, which is not logical; it should be the other way around!), but I will have a closer look at that one. These strings make up about half of the total data, the remainder being binary and tons of null bytes. The strings command (which will of course give me some binary noise) returns 14,943,928 bytes (or rather characters) of data. If we ignore the binary noise for a second and also count the number of null bytes in this file, we get:

Text: 14,943,928 bytes (as mentioned in my previous posting, 9.4 MB of this is column headers)
Null bytes: 14,634,412 bytes
Other (binary): 8,580,188 bytes
Total size: 38,158,528 bytes

Yes, yes, yes, doing this is ugly, and lots of null bytes would occur for many reasons (no need to tell me that), but chew on that number for a second and take a look at an SSTable near you: there is a heck of a lot of nothing there. Should be noted that this is 0.7 beta 1. I realize that this code will change dramatically by 0.8, so this is probably not too interesting to spend too much time on, but the expansion of data is pretty excessive in many scenarios, so I just looked briefly at an actual file, trying to understand it a bit better. Terje
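As a quick arithmetic sanity check on the byte breakdown above (pure arithmetic over Terje's reported figures, nothing measured here):

```python
# Byte breakdown of the ~36 MB SSTable as reported above.
text  = 14_943_928   # characters found by `strings` (9.4 MB of it is column names/values)
nulls = 14_634_412   # null bytes counted in the file
other =  8_580_188   # remaining binary data
total = text + nulls + other   # -> 38,158,528, matching the reported file size

null_share = nulls / total     # over a third of the file is null bytes (~0.38)
```

The three categories do sum exactly to the reported total, which supports the observation that null padding alone accounts for a large slice of the on-disk expansion.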
Re: Cassandra HAProxy
On Mon, Aug 30, 2010 at 12:40 PM, Dave Viner davevi...@pobox.com wrote: FWIW - we've been using HAProxy in front of a cassandra cluster in production and haven't run into any problems yet. [...] But I'm not sure how to make it work with Cassandra, since, as Ben points out, it can return valid TCP responses (that say error-condition) on the standard port. [...]

Any proxy with a TCP health check should be able to determine if the Cassandra service is down hard. The problem for tools that are not Cassandra-protocol-aware is detecting slowness or other anomalies, like TimedOut exceptions. If you are seeing GC storms during compactions, you might have rows that are too big; when the compaction hits these, memory spikes. I lowered the compaction priority (and added more nodes), which has helped compaction back off, leaving some I/O for requests.
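The "down hard" TCP health check mentioned above amounts to attempting a connect with a short timeout. A minimal sketch (host and port are placeholders; note the limitation in the comment, which is exactly Ben's point about protocol-unaware proxies):

```python
import socket

def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within `timeout`.
    This only proves the port accepts connections; it cannot detect a node
    that answers TCP fine but returns Thrift-level errors or is merely slow."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A protocol-aware check would instead issue a cheap Thrift call (e.g. a trivial read) and treat errors or slow responses as failure.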
Re: NodeTool won't connect remotely
Thanks! That did it. Looks like the connection happens on 10036 and then the server negotiates a separate port for continued communication. Found this article once I knew what to look for. It also describes how to get more consistency on port numbers to allow for ssh tunneling and firewalls. From http://jared.ottleys.net/alfresco/tunneling-debug-and-jmx-for-alfresco : The -Djava.rmi.server.hostname=dummyhost option is needed to help RMI know where to connect. RMI connects in a two-part process: first by connecting to the RMI server registry, which pushes your request to the JMX service, which is dynamically allocated on the first open port available to it at start-up time.

On Aug 30, 2010, at 10:30 AM, Juho Mäkinen wrote: I think that JMX needs additional ports to function correctly. Try to disable all firewalls between the client and the server so that client can connect to any port in the server and try again. [...]
Dumping
Is there an easy way to retrieve all values from a CF, similar to a dump? How about retrieving all columns for a particular key? In the second use case, a simple iteration would work using a start and finish, but how would this be accomplished across all keys for a particular CF when you don't know the keys in advance? Thanks
Re: Dumping
sstable2json, discussed here: http://wiki.apache.org/cassandra/Operations may be what you are after, or the snapshot feature. Not sure what you want to use the dump for. If you do not know the keys in the CF in advance, take a look at get_range_slices (http://wiki.apache.org/cassandra/API); it allows you to slice through the keys in a similar way to slicing through the columns. Aaron

On 31 Aug 2010, at 05:15, Mark wrote: Is there an easy way to retrieve all values from a CF, similar to a dump? [...]
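The get_range_slices pattern Aaron describes is usually driven as a paging loop: fetch a batch of keys, then start the next batch at the last key seen, skipping the duplicated boundary row. A self-contained Python sketch against an in-memory stand-in (`fetch_range` is a hypothetical helper, not the real Thrift signature):

```python
# Toy key/value store standing in for a column family.
DATA = {"apple": 1, "banana": 2, "cherry": 3, "durian": 4, "elder": 5}

def fetch_range(start, count):
    """Return up to `count` (key, value) pairs with key >= start, in key order.
    Stand-in for a get_range_slices call with a KeyRange of size `count`."""
    keys = sorted(k for k in DATA if k >= start)
    return [(k, DATA[k]) for k in keys[:count]]

def iterate_all(batch=2):
    """Page through every key without knowing them in advance: each new page
    starts at the last key of the previous page, and the duplicated first
    row of every subsequent page is skipped."""
    start, first = "", True
    while True:
        rows = fetch_range(start, batch)
        if not first and rows:
            rows = rows[1:]          # drop the boundary key we already yielded
        if not rows:
            return
        for k, v in rows:
            yield k, v
        start, first = rows[-1][0], False
```

One caveat from the real system: with RandomPartitioner, keys come back in token order rather than sorted key order, so "start at the last key seen" works but the keys will not arrive alphabetically as in this toy version.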
Client developer mailing list
There has been a new mailing list created for those who are working on Cassandra clients above thrift and/or avro. You can subscribe by sending an email to client-dev-subscr...@cassandra.apache.org or using the link at the bottom of http://cassandra.apache.org The list is meant to give client authors a discussion forum as well as a place to interact with core cassandra developers about the roadmap and upcoming features. Thanks to Cliff Moon (@moonpolysoft) for starting a discussion about client quality at the Cassandra Summit.
Re: Job opening cassandra Barcelona, Spain
Thanks for the suggestion. On Aug 30, 2010, at 8:01 PM, Norman Maurer wrote: I think you should try jobs at apache.org too ;) Bye, Norman 2010/8/25 Dimitry Lvovsky dimi...@reviewpro.com: Hi All, Please forgive the job offer spam. We're looking to add a developer with experience using Cassandra to join our team in Barcelona. An ideal candidate will have a strong CS background (academic or otherwise) with high-level Java skills and experience programming in Scala. Knowing your way around CSS/Javascript would be a definite plus. We can only accept potential candidates with permission to work in the EU at this time. Please send inquiries to jobs at reviewpro dot com.
Re: Client developer mailing list
awesome, thanks, I'm subscribed :) On Mon, Aug 30, 2010 at 10:05 PM, Jeremy Hanna jeremy.hanna1...@gmail.comwrote: There has been a new mailing list created for those who are working on Cassandra clients above thrift and/or avro. You can subscribe by sending an email to client-dev-subscr...@cassandra.apache.org or using the link at the bottom of http://cassandra.apache.org The list is meant to give client authors a discussion forum as well as a place to interact with core cassandra developers about the roadmap and upcoming features. Thanks to Cliff Moon (@moonpolysoft) for starting a discussion about client quality at the Cassandra Summit.
Re: Client developer mailing list
I'm in! We really need a better PHP Thrift client.
Re: get_slice sometimes returns previous result on php
On Mon, Aug 30, 2010 at 6:05 AM, Juho Mäkinen juho.maki...@gmail.com wrote: The application is using the same cassandra thrift connection (it doesn't close it in between) and everything is happening inside same php process. This is why you are seeing this problem (and is specific to connection reuse in certain languages, not a general problem with connection reuse). b
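[A toy illustration of Benjamin's point — hypothetical code, not the real Thrift client. On a plain Thrift connection, responses are matched to requests purely by order on the socket, so abandoning a request (e.g. after a client-side timeout) and then reusing the connection makes the next read consume the previous call's reply.]

```python
class FakeTransport:
    """Toy model of a connection where replies are matched purely by
    arrival order, as on a basic Thrift binary-protocol connection."""
    def __init__(self):
        self.pending = []

    def send(self, request):
        # The server eventually answers every request, in order.
        self.pending.append("reply-to-" + request)

    def recv(self):
        # The client reads whatever reply is next on the wire.
        return self.pending.pop(0)

conn = FakeTransport()
conn.send("get_slice#1")
# Caller gives up on #1 without reading its reply, then reuses the socket:
conn.send("get_slice#2")
# The reply it now reads actually belongs to request #1:
stale = conn.recv()
```

This is why the safe pattern in such clients is to discard the connection after any error or timeout rather than reuse it.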
Re: get_slice sometimes returns previous result on php
I'm not using connection pooling, where the same tcp socket is used between different php requests. I open a new thrift connection with a new socket to the node, use it throughout the request, and close it after. The get_slice requests are all happening in the same request, so something odd happens in between. Tomorrow I'm going to implement a history buffer which logs all cassandra operations within the php request and dumps the log in case I detect this anomaly again. Hopefully that sheds some light on the problem. - Juho Mäkinen On Mon, Aug 30, 2010 at 10:50 PM, Benjamin Black b...@b3k.us wrote: On Mon, Aug 30, 2010 at 6:05 AM, Juho Mäkinen juho.maki...@gmail.com wrote: The application is using the same cassandra thrift connection (it doesn't close it in between) and everything is happening inside same php process. This is why you are seeing this problem (and is specific to connection reuse in certain languages, not a general problem with connection reuse). b
Re: Follow-up post on cassandra configuration with some experiments on GC tuning
collection runs for the cases tested. In most cases, I prefer having low pauses due to any garbage collection runs and don't care too much about the shape of the memory usage; I guess that's the reason why the low pause collector is used by default for running Cassandra. For myself, I have mixed feelings regarding the low pause collector, because I found it difficult to find good young generation sizings that suit different load patterns. Therefore I mostly prefer the throughput collector, which adaptively sizes the young generation and does a good job of keeping too much data from going to the tenured generation. Well, if you care about pause times, usually the best bet would be to have the young gen be as large as possible to yield what you consider to be a pause time within an acceptable range. I.e., as large as acceptable but no larger. I would be interested in what the differences are concerning the stop times between the different GC variants when running Cassandra. Is it really much better to use the low pause collector with regard to getting stable response times, even if I use the -XX:+UseParallelOldGC and -XX:MaxGCPauseMillis=nnn flags? Any experiences with this? If you use the default (for the JVM, not for Cassandra) throughput collector, you *will* take full stop-the-world collections, period. You can enable parallel GC, but with that collector there's no way around the fact that full collections will pause the application for the full duration of such full GCs. In general, the larger the heap (relative to the speed of the collection), the more of a problem this will be. If you deem the pause times acceptable for your particular use case, I don't see an obvious reason to prefer the CMS collector. MaxGCPauseMillis won't help; the throughput collector just doesn't have any way to adhere to it. A full GC is a full GC. For CMS, I'm not sure what effect, if any, MaxGCPauseMillis has. In my very limited testing I didn't see any obvious effect on e.g. sizing choice for the young generation (but I have not checked the code to see if CMS uses it). It is definitely used by the G1 collector; typically MaxGCPauseMillis and GCPauseIntervalMillis are the two most important settings to tweak. They are directly used to decide the young generation size, as well as to limit the number of non-young regions that are picked for GC during a partial (not young-only) GC. Has anyone run Cassandra with G1 in production for prolonged periods of time? One thing that concerns me is the reliance on GC to remove obsolete SSTables. That relies on certain GC behavior that is true for CMS and the throughput collector, but not for G1. With CMS, an unreachable sstable will be detected when concurrent mark/sweep triggers; but with G1, there is not necessarily any expectation at all that some particular region that happens to contain the reference in question will be collected - *ever* - since G1 always picks the best regions first (best in terms of bang for the buck - the most memory reclaimed at the lowest cost). -- / Peter Schuller
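[For concreteness, the two setups being compared above look roughly like this as JVM arguments. This is an illustrative sketch only; the pause target and young-gen size are made-up values, not recommendations from the thread.]

```shell
# Throughput (parallel) collector - full GCs still stop the world,
# and MaxGCPauseMillis cannot bound a full collection:
JVM_OPTS="-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:MaxGCPauseMillis=200"

# Low-pause (CMS) collector with a manually sized young generation,
# the kind of sizing the thread describes as hard to get right
# across different load patterns:
JVM_OPTS="-XX:+UseConcMarkSweepGC -XX:+UseParNewGC -Xmn512M"
```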
Re: Follow-up post on cassandra configuration with some experiments on GC tuning
On Mon, Aug 30, 2010 at 5:18 PM, Peter Schuller peter.schul...@infidyne.com wrote: Has anyone run Cassandra with G1 in production for prolonged periods of time? Not AFAIK. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of Riptano, the source for professional Cassandra support http://riptano.com
Re: cassandra for a inbox search with high reading qps
We use Lucandra as well, for searching for users and for geo-encoding. It really works well except for numeric fields. https://issues.apache.org/jira/browse/CASSANDRA-1235 That bug may be a bit of an issue, but after they release 0.6.5 all the Lucene functionality will be available to you. Todd On Mon, 2010-08-30 at 05:49 -0700, Mike Peters wrote: Chen, Have you considered using http://www.slideshare.net/otisg/lucandra Lucandra for Inbox search? We have a similar setup and are currently looking into using Lucandra over implementing the searching ourselves with pure Cassandra.
Re: cassandra for a inbox search with high reading qps
what's the average size of a user? As I know, Lucandra will first pull the data from Cassandra, then do the computation in the client. That's OK for small rows, but we have rows of 1M on average, and some rows scale to 100M; at the same time, we expect a high read qps. Pulling this data to the client machine over the network is unacceptable. I have set up a demo system which does the searching locally in Cassandra; it seems to be working with a read qps of 1000+ per node. 2010/8/31 Todd Nine t...@spidertracks.co.nz We use Lucandra as well for searching for users, as well as geo-encoding. It really works well except for numeric fields. https://issues.apache.org/jira/browse/CASSANDRA-1235 That bug may be a bit of an issue, but after they release 0.6.5 all the Lucene functionality will be available to you. Todd On Mon, 2010-08-30 at 05:49 -0700, Mike Peters wrote: Chen, Have you considered using http://www.slideshare.net/otisg/lucandra Lucandra for Inbox search? We have a similar setup and are currently looking into using Lucandra over implementing the searching ourselves with pure Cassandra. -- Best Regards, Chen Xinli
Re: column family names
URL encoding. On Mon, Aug 30, 2010 at 5:55 PM, Aaron Morton aa...@thelastpickle.com wrote: under scores or URL encoding ? Aaron On 31 Aug, 2010,at 12:27 PM, Benjamin Black b...@b3k.us wrote: Please don't do this. On Mon, Aug 30, 2010 at 5:22 AM, Terje Marthinussen tmarthinus...@gmail.com wrote: Ah, sorry, I forgot that underscore was part of \w. That will do the trick for now. I do not see the big issue with file names though. Why not expand the allowed characters a bit and escape the file names? Maybe some sort of URL like escaping. Terje On Mon, Aug 30, 2010 at 6:29 PM, Aaron Morton aa...@thelastpickle.com wrote: Moving to the user list. The new restrictions were added as part of CASSANDRA-1377 for 0.6.5 and 0.7, AFAIK it's to ensure the file names created for the CFs can be correctly parsed. So it's probably not going to change. The names have to match the \w reg ex class, which includes the underscore character. Aaron On 30 Aug 2010, at 21:01, Terje Marthinussen tmarthinus...@gmail.com wrote: Hi, Now that we can make columns families on the fly, it gets interesting to use column families more as part of the data model (can reduce diskspace quite a bit vs. super columns in some cases). However, currently, the column family name validator is pretty strict allowing only word characters and in some cases it is pretty darned nice to be able to put something like a - inbetweenallthewords. Any reason to be this strict or could it be loosened up a little bit? Terje
Re: column family names
Beyond aesthetics, specific reasons? Terje On Tue, Aug 31, 2010 at 11:54 AM, Benjamin Black b...@b3k.us wrote: URL encoding.
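[For reference, the \w restriction discussed in this thread can be sketched as follows. Illustrative Python only; Cassandra's actual validator is Java code, and Java's \w is ASCII-only by default, hence the re.ASCII flag.]

```python
import re

# CF names must consist entirely of word characters: letters, digits,
# and underscore. Hyphens, spaces, and punctuation are rejected.
CF_NAME_RE = re.compile(r"^\w+$", re.ASCII)

def is_valid_cf_name(name):
    return bool(CF_NAME_RE.match(name))
```

So a name like Inbox_Search passes while inbox-search does not, which is why swapping the hyphen for an underscore "does the trick" as discussed above.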