Re: Could ring cache really improve performance in Cassandra?
Thanks Jonathan. Actually, I'm wondering how CQL is implemented under the hood. Is it a different RPC mechanism? Why is it faster than Thrift? I know I'm wrong, but for now I just regard CQL as a query language. Could you please help explain it to me? I still feel puzzled after reading some docs about CQL. I create tables in CQL and use the cql3 API over Thrift; I don't know what else I can do with CQL. I am using C++ to write the client-side code. Currently I am not using the C++ driver because I want to write some simple functionality by myself. Also, I didn't use the stress test tool shipped with the Cassandra distribution because I also want to make sure I can achieve the expected performance with my own client code. I know others have benchmarked Cassandra and got good results, but if I cannot reproduce those satisfactory results, I cannot use it in my case. I will create a repo and send a link later; I hope to get your kind help. Thanks very much.

2014-12-08 14:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com: I would really not recommend using Thrift for anything at this point, including your load tests. Take a look at CQL; all development is going there, and 2.1 has seen a massive performance boost over 2.0. You may want to try the Cassandra stress tool included in 2.1; it can stress a table you've already built. That way you can rule out any bugs on the client side. If you're going to keep using your own tool, however, it would be helpful if you sent out a link to the repo, since currently we have no way of knowing whether you've got a client-side bug (data model or code) that's limiting your performance.

On Sun Dec 07 2014 at 7:55:16 PM 孔嘉林 kongjiali...@gmail.com wrote: I find that under the src/client folder of the Cassandra 2.1.0 source code there is a *RingCache.java* file. It uses a Thrift client calling the *describe_ring()* API to get the token range of each Cassandra node. It is used on the client side: the client can use it, combined with the partitioner, to find the target node. In this way there is no need to route requests between Cassandra nodes, and the client can connect directly to the target node. So maybe it can save some routing time and improve performance. Thank you very much.

2014-12-08 1:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com: What's a ring cache? FYI, if you're using the DataStax CQL drivers, they will automatically route requests to the correct node.

On Sun Dec 07 2014 at 12:59:36 AM kong kongjiali...@gmail.com wrote: Hi, I'm doing a stress test on Cassandra, and I've learned that using a ring cache can improve performance because client requests can go directly to the target Cassandra server, so the coordinator node is the desired target node. In this way there is no need for the coordinator to route client requests to the target node, and maybe we can get a linear performance increase. However, in my stress test on an Amazon EC2 cluster, the results are weird: there seems to be no performance improvement from the ring cache. Could anyone help me explain these results? (I also think the results without the ring cache are weird, because QPS does not increase linearly as new nodes are added. I need help explaining this too.) The results are as follows:

INSERT (write):
  Node count  Replication factor  QPS (no ring cache)  QPS (ring cache)
  1           1                   18687                20195
  2           1                   20793                26403
  2           2                   22498                21263
  4           1                   28348                30010
  4           3                   28631                24413

SELECT (read):
  Node count  Replication factor  QPS (no ring cache)  QPS (ring cache)
  1           1                   24498                22802
  2           1                   28219                27030
  2           2                   35383                36674
  4           1                   34648                28347
  4           3                   52932                52590

Thank you very much, Joy
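[Editor's note on "I don't know what else I can do with CQL": CQL3 is a full data definition and query language, not just a way to create tables. As a purely illustrative sketch (the table and column names below are invented, not from this thread), a schema plus the statements a client would typically prepare might look like:

```sql
-- Hypothetical example schema; cqlsh or any CQL3 driver can run these.
CREATE TABLE users (
    user_id uuid PRIMARY KEY,
    name    text,
    email   text
);

-- Statements worth preparing once and executing many times with bound values:
INSERT INTO users (user_id, name, email) VALUES (?, ?, ?);
SELECT name, email FROM users WHERE user_id = ?;
```

Prepared statements such as the two above are parsed by the server once; each subsequent execution sends only the statement id plus bound values over the native protocol.]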
Re: Could ring cache really improve performance in Cassandra?
cassandra-stress is a great tool to check whether the sizing of your cluster, in combination with your data model, will fit your production needs, i.e. without the application :) Removing the application removes any possible bugs from the load test. Sure, testing with your application is a necessary step eventually, but I'd recommend starting with the stress test tool first. Thrift is a deprecated API. I strongly recommend using the C++ driver (I'm pretty sure it supports the native protocol). The native protocol achieves roughly twice the throughput of Thrift over far fewer TCP connections. (Thrift is RPC, which means connections usually sit idle wasting system, application, and server resources while waiting for something; the native protocol is multiplexed.) As Jonathan already said, all development effort is going into CQL3 and the native protocol; Thrift is merely kept supported. With CQL you can do everything you can do with Thrift, plus the new features. I also recommend using prepared statements (they automagically work across a distributed cluster with the native protocol), which eliminates the cost of parsing the same CQL statement again and again.

Am 08.12.2014 um 09:26 schrieb 孔嘉林 kongjiali...@gmail.com: Thanks Jonathan, actually I'm wondering how CQL is implemented under the hood, a different RPC mechanism? Why is it faster than Thrift? ...
Re: Can not connect with cqlsh to something different than localhost
This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). What I tried:

1) listen_address: localhost - connection OK (but of course I can't connect to localhost from outside the VM)
2) listen_interface: eth0 - connection refused
3) listen_address: 192.168.111.136 - connection refused

What to do?

Try: $ netstat -lnt and see which interface port 9042 is listening on. You will likely need to update cassandra.yaml to change the interface. By default, Cassandra listens on localhost, which is why your local cqlsh session works.

On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com wrote: I am running Cassandra 2.1.2 in an Ubuntu VM. cqlsh or cqlsh localhost works fine, but I cannot connect from outside the VM (firewall etc. disabled). Even when I run cqlsh 192.168.111.136 inside the VM I get connection refused. This is strange, because my network config shows that 192.168.111.136 is my IP:

root@ubuntu:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
          inet addr:192.168.111.136  Bcast:192.168.111.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)

root@ubuntu:~# cqlsh 192.168.111.136 9042
Connection error: ('Unable to connect to any servers', {'192.168.111.136': error(111, "Tried connecting to [('192.168.111.136', 9042)]. Last error: Connection refused")})

What to do?
Re: Can not connect with cqlsh to something different than localhost
Two things:

1. Try telnet 192.168.111.136 9042 and see if it connects.
2. Check the hostname in /etc/hosts and make sure it is mapped correctly.

-Vivek

On Mon, Dec 8, 2014 at 4:19 PM, Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). ...
Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2
Thanks Chris. I run the client on a separate AWS instance from the Cassandra cluster servers. On the client side, I create 40 or 50 threads for sending requests to each Cassandra node, with one Thrift client per thread. At the beginning, all the Thrift clients connect to their corresponding Cassandra nodes and stay connected for the whole run (I do not close the transports until the end of the test). So I use very simple load balancing: the same number of Thrift clients connect to each node. My source code is here: https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp It's very nice of you to help me improve my code. As I increase the number of threads, the latency gets longer. I'm using C++, so if I want to use the native protocol plus prepared statements, is the only way to use the C++ driver? Thanks very much.

2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com: I think your client could use improvements. How many threads do you have running in your test? With a Thrift call like that, you can only do one request at a time per connection. For example, assuming C* takes 0 ms, a 10 ms network latency/driver overhead means a 20 ms RTT and a max throughput of ~50 QPS per thread (the native protocol doesn't behave like this). Are you running the client on its own system or shared with a node? How are you load balancing your requests? Source code would help, since there's a lot that can become a bottleneck. Generally you will see a bit of a dip in latency between N=RF=1 and N=2, RF=2, etc., since there are optimizations on the coordinator node when it doesn't need to send the request to the replicas. The impact of the network overhead decreases in significance as the cluster grows. Latency-wise, RF=N=1 is typically going to be the fastest possible for smaller loads (i.e. when a client cannot fully saturate a single node). The main thing to expect is that latency will plateau and remain fairly constant as load/nodes increase, while throughput potential will increase linearly (empirically, at least). You should really attempt it with the native protocol plus prepared statements; running CQL over Thrift is far from optimal. I would recommend using the cassandra-stress tool if you want to stress test Cassandra (and not your own code): http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema === Chris Lohfink

On Sun, Dec 7, 2014 at 9:48 PM, 孔嘉林 kongjiali...@gmail.com wrote: Hi Eric, Thank you very much for your reply! Do you mean that I should clear my table after each run? Indeed, I can see compaction happen several times during my test, but could only a few compactions affect performance that much? Also, I can see in OpsCenter that some ParNew GCs happen but no CMS GCs. I run my test on an EC2 cluster, so I think the network within it should be fast. Each Cassandra server has 4 CPU units, 15 GiB of memory, and 80 GB of SSD storage (the m3.xlarge instance type). As for latency, which latency should I care about most, p(99) or p(999)? I want to get the max QPS under a certain latency limit. I know my testing scenario is not the common case in production; I just want to know how much load my cluster can bear under stress. So, how did you test your cluster to get 86k writes/sec? How many requests did you send to your cluster? Was it also 1 million? Did you also use OpsCenter to monitor real-time performance? I also wonder why the write and read QPS that OpsCenter reports are much lower than what I calculate. Could you please describe your test deployment in detail? Thank you very much, Joy

2014-12-07 23:55 GMT+08:00 Eric Stevens migh...@gmail.com: Hi Joy, Are you resetting your data after each test run? I wonder if your tests are actually causing you to fall behind on data grooming tasks such as compaction, so that performance suffers in your later tests. There are *so many* factors which can affect performance that, without reviewing the test methodology in great detail, it's really hard to say whether there are flaws which might trigger an antipattern, cause an atypical number of cache hits or misses, and so forth. You may also be producing GC pressure in the write path. I *can* say that 28k writes per second looks a little low, but it depends a lot on your network, hardware, and write patterns (e.g. data size). For a little performance test suite I wrote, with parallel batched writes on a 3-node RF=3 test cluster, I got about 86k writes per second. Also, focusing exclusively on max latency is going to cause you some trouble, especially with magnetic media as you're using. Between ill-timed GC and the inconsistent performance characteristics of magnetic media, your max numbers will often look significantly worse than your p(99) or p(999) numbers. All this said, one node will often look better than several nodes
Re: Can not connect with cqlsh to something different than localhost
Listen address needs the actual address, not the interface. This is best accomplished by setting up proper hostnames for each machine (through DNS or a hosts file) and leaving listen_address blank, as Cassandra will then pick the external IP. Otherwise, you'll need to set the listen address to the machine's own IP on each machine. I find the former less of a pain to manage.

On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). ...
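[Editor's note: for reference, a sketch of the settings involved, assuming Cassandra 2.1 (substitute your own address). cqlsh connects to the native transport port 9042, which binds to rpc_address; listen_address only covers node-to-node traffic:

```yaml
# cassandra.yaml (illustrative values)
listen_address: 192.168.111.136   # inter-node traffic; or leave blank with proper hostnames
rpc_address: 192.168.111.136      # what cqlsh and drivers connect to on port 9042
# rpc_address: 0.0.0.0            # alternatively bind all interfaces; then also set:
# broadcast_rpc_address: 192.168.111.136
native_transport_port: 9042
```

After editing, restart Cassandra and verify with netstat -lnt that port 9042 is bound to the expected interface rather than 127.0.0.1.]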
Re: Could ring cache really improve performance in Cassandra?
I agree with Robert. If you're trying to test Cassandra, test Cassandra using stress. Set a reasonable benchmark, and then you'll be able to aim for it with your client code. Otherwise you're likely to be asking a lot of the wrong questions and making incorrect assumptions.

On Mon Dec 08 2014 at 12:42:32 AM Robert Stupp sn...@snazy.de wrote: cassandra-stress is a great tool to check whether the sizing of your cluster, in combination with your data model, will fit your production needs, i.e. without the application :) ...
Re: Could ring cache really improve performance in Cassandra?
Thanks Robert. So the native protocol is an asynchronous protocol? And was the native protocol created specifically for Cassandra CQL? I hadn't heard of this protocol before. I have tried the stress test tool, but it seems that it must run on one of the Cassandra nodes (or at least on a node with Cassandra installed)? Once I try to run the tool on a separate client instance, I get exceptions. The ring cache I found is here: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java and I am trying to implement similar functionality in C++. My repo is here: https://github.com/kongjialin/Cassandra My idea is that all requests go through the client-side ring cache and are sent to the target Cassandra node (each node is associated with a client pool), to avoid routing between nodes in the cluster. Thank you very much.

2014-12-08 16:42 GMT+08:00 Robert Stupp sn...@snazy.de: cassandra-stress is a great tool to check whether the sizing of your cluster, in combination with your data model, will fit your production needs, i.e. without the application :) ...
Re: Could ring cache really improve performance in Cassandra?
So the native protocol is an asynchronous protocol? Yes. I have tried using the stress test tool. But it seems that this tool should run on the same node as one of the Cassandra node(or at least on a node having Cassandra installed)? One I try to run this tool on a separate client instance, I got exceptions thrown. You should start with „new“ kind of stress testing (using CQL3, using native protocol, using prepared statements). Forget about thrift ;) Start with the example YAML stress file first to learn about it. It allows you to configure simultaneous writes and reads that match your workload. And you do not need to run it on a C* node - but you should think about the network between the stress test tool and your cluster. The ringcache I found is here:https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java . And I try to implement the similar funcionality in C++. My repo is here: https://github.com/kongjialin/Cassandra https://github.com/kongjialin/Cassandra . My idea is that all the requests go to the client-side ring cache and be sent to the target Cassandra node(each node is associated with a client pool) to avoid routing between nodes in the cluster. You can safe yourself a lot of work to implement it right“ - just use the C++ driver. It knows about the native protocol and routes requests to the correct nodes. Although you can go into the C++ driver code and look how it works, improve it etc. :) I don’t know anything about the C++ driver - but feel free to post to the driver mailing list and/or the #datastax-drivers IRC channel. 2014-12-08 16:42 GMT+08:00 Robert Stupp sn...@snazy.de mailto:sn...@snazy.de: cassandra-stress is a great tool to check whether the sizing of your cluster in combination of your data model will fit your production needs. I.e. 
without the application :) Removing the application removes any possible bugs from the load test. Sure, it's a necessary step to do it with your application - but I'd recommend starting with the stress test tool first. Thrift is a deprecated API. I strongly recommend using the C++ driver (I'm pretty sure it supports the native protocol). The native protocol achieves approximately twice the performance of thrift via much fewer TCP connections. (Thrift is RPC - meaning connections usually waste system, application and server resources while waiting for something. The native protocol is a multiplexed protocol.) As Jonathan already said, all development effort is spent on CQL3 and the native protocol - thrift is just supported. With CQL you can do everything that you can do with thrift, plus more new stuff. I also recommend using prepared statements (it automagically works in a distributed cluster with the native protocol) - it eliminates the effort to parse the CQL statement again and again. On 08.12.2014 at 09:26, 孔嘉林 kongjiali...@gmail.com wrote: Thanks Jonathan, actually I'm wondering how CQL is implemented underneath - a different RPC mechanism? Why is it faster than thrift? I know I'm wrong, but now I just regard CQL as a query language. Could you please help explain it to me? I still feel puzzled after reading some docs about CQL. I create tables in CQL, and use the cql3 API in thrift. I don't know what else I can do with CQL. And I am using C++ to write the client-side code. Currently I am not using the C++ driver and want to write some simple functionality by myself. Also, I didn't use the stress test tool provided in the Cassandra distribution because I also want to make sure whether I can achieve good performance as expected using my client code. I know others have benchmarked Cassandra and got good results. But if I cannot reproduce the satisfactory results, I cannot use it in my case. I will create a repo and send a link later, hope to get your kind help.
Thanks very much. 2014-12-08 14:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com: I would really not recommend using thrift for anything at this point, including your load tests. Take a look at CQL, all development is going there and has in 2.1 seen a massive performance boost over 2.0. You may want to try the Cassandra stress tool included in 2.1, it can stress a table you've already built. That way you can rule out any bugs on the client side. If you're going to keep using your tool, however, it would be helpful if you sent out a link to the repo, since currently we have no way of knowing if you've got a client side bug (data model or code) that's limiting your performance. On Sun Dec 07 2014 at 7:55:16 PM 孔嘉林 kongjiali...@gmail.com wrote: I find under the src/client folder of Cassandra 2.1.0 source code, there is a RingCache.java file. It uses a
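Robert's point above about thrift (blocking RPC, one outstanding request per connection) versus the multiplexed native protocol can be modeled crudely. This is just illustrative arithmetic, not a benchmark; the request counts and latencies are made-up numbers:

```python
import math

def blocking_total_ms(n_requests: int, rtt_ms: float) -> float:
    """Thrift-style RPC: one outstanding request per connection,
    so n requests serialize into n full round trips."""
    return n_requests * rtt_ms

def multiplexed_total_ms(n_requests: int, rtt_ms: float, max_in_flight: int) -> float:
    """Native-protocol-style multiplexing: up to max_in_flight
    requests overlap on one connection (server work ignored)."""
    return math.ceil(n_requests / max_in_flight) * rtt_ms

# 1000 requests over a 20 ms round trip on a single connection:
print(blocking_total_ms(1000, 20))          # serialized: 20000 ms
print(multiplexed_total_ms(1000, 20, 128))  # pipelined:  160 ms
```

The same idea explains why thrift needs many connections to reach the throughput one multiplexed connection can sustain.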
Re: Can not connect with cqlsh to something different than localhost
I left listen_address blank - still I can't connect (connection refused).
cqlsh - OK
cqlsh ubuntu - fail (ubuntu is my hostname)
cqlsh 192.168.111.136 - fail
telnet 192.168.111.136 9042 from outside the VM gives me a connection refused. I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080 from outside the VM - and got the expected result (Connected to 192.168.111.136. Escape character is '^]'.). So what's so special in Cassandra? On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com wrote: Listen address needs the actual address, not the interface. This is best accomplished by setting up proper hostnames for each machine (through DNS or hosts file) and leaving listen_address blank, as it will pick the external IP. Otherwise, you'll need to set the listen address to the IP of the machine you want on each machine. I find the former to be less of a pain to manage. On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). What I tried:
1) listen_address: localhost - connection OK (but of course I can't connect from outside the VM to localhost)
2) Set listen_interface: eth0 - connection refused
3) Set listen_address: 192.168.111.136 - connection refused
What to do? Try: $ netstat -lnt and see which interface port 9042 is listening on. You will likely need to update cassandra.yaml to change the interface. By default, Cassandra is listening on localhost so your local cqlsh session works. On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com wrote: I am running Cassandra 2.1.2 in an Ubuntu VM. cqlsh or cqlsh localhost works fine. But I can not connect from outside the VM (firewall, etc. disabled). Even when I do cqlsh 192.168.111.136 in my VM I get connection refused.
This is strange because when I check my network config I can see that 192.168.111.136 is my IP:

root@ubuntu:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
          inet addr:192.168.111.136  Bcast:192.168.111.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)

root@ubuntu:~# cqlsh 192.168.111.136 9042
Connection error: ('Unable to connect to any servers', {'192.168.111.136': error(111, "Tried connecting to [('192.168.111.136', 9042)]. Last error: Connection refused")})

What to do?
Cassandra 2.1.2 node stuck on joining the cluster
Hi Cassandra users, I'm trying but failing to join a new (well old, but wiped out/decommissioned) node to an existing cluster. Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I start a third node with 2.1.2, it gets to joining state, it bootstraps, i.e. streams some data as shown by nodetool netstats, but after some time it gets stuck. From that point nothing gets streamed, the new node stays in joining state. I restarted the node multiple times; each time it streamed more data, but then got stuck again. Other facts:
- I don't see any errors in the log on any of the nodes.
- The connectivity seems fine, I can ping and netcat to port 7000 all ways.
- I have ~ 200 GB load per running node, replication 2, 16 tokens.
- Load of the new node got to around 300 GB now.
- The bootstrapping process stops in the middle of streaming some table, *always* after sending exactly 10MB of some SSTable, e.g.:

$ nodetool netstats | grep -P -v bytes\(100
Mode: NORMAL
Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
    /192.168.200.16
        Sending 516 files, 12493900 bytes total
        /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
Read Repair Statistics:
Attempted: 2016371
Mismatch (Blocking): 0
Mismatch (Background): 168721
Pool Name     Active   Pending   Completed
Commands      n/a      0         55802918
Responses     n/a      0         425963

I'm trying to join this node for several days and I don't know what to do with it... I'll be grateful for any help! Cheers, Krzysztof Zarzycki
Re: How to model data to achieve specific data locality
The upper bound for the data size of a single column is 2GB, and the upper bound for the number of columns in a row (partition) is 2 billion. So if you wanted to create the largest possible row, you probably can't afford enough disks to hold it. http://wiki.apache.org/cassandra/CassandraLimitations Practically speaking you start running into troubles *way* before you reach those thresholds though. Large columns and large numbers of columns create GC pressure in your cluster, and since all data for a given row reside on the same primary and replicas, this tends to lead to hot spotting. Repair happens for entire rows, so large rows increase the cost of repairs, including GC pressure during the repair. And rows of this size are often arrived at by appending to the same row repeatedly, which will cause the data for that row to be scattered across a large number of SSTables which will hurt read performance. Also depending on your interface, you'll find you start hitting limits that you have to increase, each with their own implications (eg, maximum thrift message sizes and so forth). The right maximum practical size for a row definitely depends on your read and write patterns, as well as your hardware and network. More memory, SSD's, larger SSTables, and faster networks will all raise the ceiling for where large rows start to become painful. @Kai, if you're familiar with the Thrift paradigm, the partition key equates to a Thrift row key, and the clustering key equates to the first part of a composite column name. CQL PRIMARY KEY ((a,b), c, d) equates to Thrift where row key is ['a:b'] and all columns begin with ['c:d:']. 
Recommended reading: http://www.datastax.com/dev/blog/thrift-to-cql3 Whatever your partition key, if you need to sub-partition to maintain reasonable row sizes, then the only way to preserve data locality for related records is probably to switch to the byte-ordered partitioner, and compute a blob or long column as part of your partition key that is meant to cause the PK to map to the same token. Just be aware that the byte-ordered partitioner comes with a number of caveats, and you'll become responsible for maintaining good data load distribution in your cluster. But the benefits from being able to tune locality may be worth it. On Sun Dec 07 2014 at 3:12:11 PM Jonathan Haddad j...@jonhaddad.com wrote: I think he mentioned 100MB as the max size - planning for 1mb might make your data model difficult to work with. On Sun Dec 07 2014 at 12:07:47 PM Kai Wang dep...@gmail.com wrote: Thanks for the help. I wasn't clear on how clustering columns work. Coming from a Thrift background, it took me a while to understand how the clustering column impacts partition storage on disk. Now I believe using seq_type as the first clustering column solves my problem. As for partition size, I will start with some bucket assumption. If the partition size exceeds the threshold I may need to re-bucket using a smaller bucket size. On another thread Eric mentions the optimal partition size should be 100 KB ~ 1 MB. I will use that as the starting point to design my bucket strategy. On Sun, Dec 7, 2014 at 10:32 AM, Jack Krupansky j...@basetechnology.com wrote: It would be helpful to look at some specific examples of sequences, showing how they grow. I suspect that the term "sequence" is being overloaded in some subtly misleading way here. Besides, we've already answered the headline question - data locality is achieved by having a common partition key.
So, we need some clarity as to what question we are really focusing on. And, of course, we should be asking the "Cassandra Data Modeling 101" question of what you want your queries to look like - how exactly do you want to access your data. Only after we have a handle on how you need to read your data can we decide how it should be stored. My immediate question to get things back on track: when you say "The typical read is to load a subset of sequences with the same seq_id", what type of "subset" are you talking about? Again, a few explicit and concise example queries (in some concise, easy-to-read pseudo language or even plain English, but not belabored with full CQL syntax) would be very helpful. I mean, Cassandra has no "subset" concept, nor a "load subset" command, so what are we really talking about? Also, I presume we are talking CQL, but some of the references seem more Thrift/slice oriented. -- Jack Krupansky *From:* Eric Stevens migh...@gmail.com *Sent:* Sunday, December 7, 2014 10:12 AM *To:* user@cassandra.apache.org *Subject:* Re: How to model data to achieve specific data locality Also new seq_types can be added and old seq_types can be deleted. This means I often need to ALTER TABLE to add and drop columns. Kai, unless I'm misunderstanding something, I don't see why you need to alter the table to add a new seq type. From a data model perspective, these are just new values in a row. If you do have columns
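The bucketing strategy Kai describes above (sub-partitioning to cap partition size, re-bucketing with a smaller width if partitions grow too large) can be sketched roughly as follows. The bucket width, key layout, and names here are illustrative assumptions, not the poster's actual schema:

```python
from datetime import datetime, timezone

# Assumed bucket width: one partition per (seq_id, day). Narrow it
# (e.g. per hour) if a day's worth of rows exceeds the size target.
BUCKET_SECONDS = 86_400

def bucket_for(ts: datetime) -> int:
    """Map a timestamp to the coarse bucket id folded into the partition key."""
    return int(ts.timestamp()) // BUCKET_SECONDS

def partition_key(seq_id: str, ts: datetime) -> tuple:
    """Composite partition key ((seq_id, bucket)); clustering columns
    such as seq_type would then order rows inside each partition."""
    return (seq_id, bucket_for(ts))

ts1 = datetime(2014, 12, 7, 10, 0, tzinfo=timezone.utc)
ts2 = datetime(2014, 12, 8, 10, 0, tzinfo=timezone.utc)
# Same seq_id on different days lands in different bounded partitions,
# while all rows for one (seq_id, day) stay together on one replica set.
print(partition_key("seq-42", ts1))
print(partition_key("seq-42", ts2))
```

Reads for "a subset of sequences with the same seq_id" then become one query per bucket touched by the requested time range.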
Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2
So I would -expect- an increase of ~20k qps per node with m3.xlarge, so there may be something up with your client (I am not a C++ person however, but hopefully someone on the list will take notice). Latency does not decrease linearly as you add nodes. What you are likely seeing with latency with so few nodes is a side effect of an optimization. When you read/write from a table, the node you request will act as the coordinator. If the data exists on the coordinator and you are using rf=1 or cl=1, it will not have to send the request to another node, just service it locally:

+--------+      +-------------+
| node0  |      | node1       |
|--------|      |-------------|
| client | ---> | coordinator |
+--------+      +-------------+

In this case the write latency is dominated by the network between coordinator and client. A second case is where the coordinator actually has to send the request to another node:

+--------+      +-------------+      +--------------+
| node0  |      | node1       |      | node2        |
|--------|      |-------------|      |--------------|
| client | ---> | coordinator | ---> | data replica |
+--------+      +-------------+      +--------------+

As you're adding nodes you're increasing the probability of hitting this second scenario, where the coordinator has to make an additional network hop. This is possibly why you're seeing an increase (aside from client issues). To get an idea of how the latency is affected when you increase nodes you really need to go higher than 4 nodes (i.e. graph the same rf for 5, 10, 15, 25 nodes; below 5 isn't really the recommended way to run Cassandra anyway), since the latency will approach that of the 2nd scenario (plus some spike outliers for GCs) and then it should settle down until you overwork the node. May want to give https://github.com/datastax/cpp-driver a go (not a cpp guy, take with grain of salt). I would still highly recommend using cassandra-stress instead of your own stuff if you want to test Cassandra and not your code. === Chris Lohfink On Mon, Dec 8, 2014 at 4:57 AM, 孔嘉林 kongjiali...@gmail.com wrote: Thanks Chris. I run a client on a separate AWS instance from the Cassandra cluster servers.
At the client side, I create 40 or 50 threads for sending requests to each Cassandra node. I create one thrift client for each of the threads. And at the beginning, all the created thrift clients connect to the corresponding Cassandra nodes and keep the connections during the whole process (I did not close the transports until the end of the test process). So I use very simple load balancing, since the same number of thrift clients connect to each node. And my source code is here: https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp It's very nice of you to help me improve my code. As I increase the number of threads, the latency gets longer. I'm using C++, so if I want to use the native binary protocol + prepared statements, is the only way to use the C++ driver? Thanks very much. 2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com: I think your client could use improvements. How many threads do you have running in your test? With a thrift call like that you can only do one request at a time per connection. For example, assuming C* takes 0ms, a 10ms network latency/driver overhead will mean 20ms RTT and a max throughput of ~50 QPS per thread (the native binary protocol doesn't behave like this). Are you running the client on its own system or shared with a node? How are you load balancing your requests? Source code would help since there's a lot that can become a bottleneck. Generally you will see a bit of a dip in latency from N=RF=1 to N=2, RF=2 etc. since there are optimizations on the coordinator node when it doesn't need to send the request to the replicas. The impact of the network overhead decreases in significance as the cluster grows. Typically, latency-wise, RF=N=1 is going to be the fastest possible for smaller loads (i.e. when a client cannot fully saturate a single node). The main thing to expect is that latency will plateau and remain fairly constant as load/nodes increase, while throughput potential will increase linearly (empirically at least).
You should really attempt it with the native binary protocol + prepared statements; running CQL over thrift is far from optimal. I would recommend using the cassandra-stress tool if you want to stress test Cassandra (and not your code): http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema === Chris Lohfink On Sun, Dec 7, 2014 at 9:48 PM, 孔嘉林 kongjiali...@gmail.com wrote: Hi Eric, Thank you very much for your reply! Do you mean that I should clear my table after each run? Indeed, I can see several times of compaction during my test, but could only a few times compaction
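Chris's per-thread throughput bound for a blocking thrift connection can be written out explicitly. This is just the arithmetic from his example (0 ms server time, 10 ms one-way latency/driver overhead), not a measurement:

```python
def max_qps_per_thread(one_way_latency_ms: float, server_time_ms: float = 0.0) -> float:
    """With one outstanding request per connection (blocking RPC),
    each call costs a full round trip plus server time, so a single
    thread can issue at most 1000 / (2*latency + server_time) QPS."""
    rtt_ms = 2 * one_way_latency_ms + server_time_ms
    return 1000.0 / rtt_ms

# Chris's example: 10 ms one-way latency, instant server -> 20 ms RTT -> 50 QPS.
print(max_qps_per_thread(10.0))       # 50.0
# 40 such threads top out around 2000 QPS no matter how fast the cluster is.
print(40 * max_qps_per_thread(10.0))  # 2000.0
```

This is why adding client threads past a point mainly adds latency rather than throughput: the ceiling is set by the round trip, not by Cassandra.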
Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2
Do you mean that I should clear my table after each run? Indeed, I can see several times of compaction during my test, but could only a few times compaction affect the performance that much? It certainly affects performance. Read performance suffers first, then write performance suffers eventually. For this synthetic test, if you want to compare like states then you should certainly wipe between. You may fall behind on compaction for the first run, then the second run pays the penalty for data grooming backlog generated during the first run. As for latency, which latency should I care about most? p(99) or p(999)? p(99) discards the worst 1% of results for reporting, p(999) discards the worst 0.1% of results for reporting. Which you prefer depends on your tolerance for response time jitter. I.E. do you need 99% of responses to be under a threshold, 99.9%? The more 9's, the more likely you are to fail your threshold due to an outlier. So, how did you test your cluster that can get 86k writes/sec? How many requests did you send to your cluster? I wrote the same data to each of 5 tables with similar columns, but different key configurations. I did 100 runs of 5,000 records (different records for each run). The data itself was 5 columns composed of a mix of bigint, text, and timestamp (so per record, fairly small data). I wrote records in asynchronous batches of 100 at a time, completing each of the 5,000 records for one table before moving on to the next table (the last write to table 1 needed to complete before I moved on to the first write of table 2, but within a table the operations were done in parallel). I used the Datastax Java Driver, which speaks the native protocol, and is faster and supports more parallelism than Thrift. Was it also 1 million? In total it was 500,000 records written to each of 5 tables - so 2.5 million records overall. Did you also use OpsCenter to monitor the real time performance? 
I also wonder why the write and read QPS OpsCenter provide are much lower than what I calculate. No, I measured throughput on my client only. I don't have much experience with OpsCenter, so I'm afraid I can't give you much insight into why you'd see inconsistent information compared to data you measured. Maybe you're just seeing information for a single node instead of the whole cluster? Again, the validity of this kind of test is highly suspect even though I happened to have set this up already. In my case I was trying to measure burst performance specifically. Cassandra will definitely accept bursts well, but if you sustain such a load, performance will degrade over time. Under sustained conditions you need to be certain you are staying on top of compaction - outstanding compaction tasks should rarely if ever exceed 2 or 3. Above 10, you need to reduce your write volume or your cluster will gradually fall over, and you'll struggle to bootstrap new nodes to expand. Do not size Cassandra for burst writes, size it for sustained writes. Write your sizing tests with that in mind - how much can you write and not fall behind on compaction over time, and accordingly your tests need to run for hours or days, not seconds or minutes. On Mon Dec 08 2014 at 3:58:35 AM 孔嘉林 kongjiali...@gmail.com wrote: Thanks Chris. I run a *client on a separate* AWS *instance from* the Cassandra cluster servers. At the client side, I create 40 or 50 threads for sending requests to each Cassandra node. I create one thrift client for each of the threads. And at the beginning, all the created thrift clients connect to the corresponding Cassandra nodes and keep connecting during the whole process(I did not close all the transports until the end of the test process). So I use very simple load balancing, since the same number of thrift clients connect to each node. 
And my source code is here: https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp It's very nice of you to help me improve my code. As I increase the number of threads, the latency gets longer. I'm using C++, so if I want to use native binary + prepared statements, the only way is to use C++ driver? Thanks very much. 2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com: I think your client could use improvements. How many threads do you have running in your test? With a thrift call like that you only can do one request at a time per connection. For example, assuming C* takes 0ms, a 10ms network latency/driver overhead will mean 20ms RTT and a max throughput of ~50 QPS per thread (native binary doesn't behave like this). Are you running client on its own system or shared with a node? how are you load balancing your requests? Source code would help since theres a lot that can become a bottleneck. Generally you will see a bit of a dip in latency from N=RF=1 and N=2, RF=2 etc since there are optimizations on the coordinator node when it doesn't need to send the request to the
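Eric's p(99) vs p(999) distinction from earlier in this thread can be made concrete with a small nearest-rank percentile computation; the latency values below are made up purely for illustration:

```python
def percentile(samples, p):
    """Nearest-rank percentile: the value at rank floor(p * n) of the
    sorted samples (p in [0, 1)). p(99) ignores the worst 1% of
    samples, p(999) ignores only the worst 0.1%."""
    ordered = sorted(samples)
    rank = int(p * len(ordered))
    return ordered[min(rank, len(ordered) - 1)]

# 1000 fake request latencies (ms): mostly fast, a few slow outliers.
latencies = [5.0] * 995 + [50.0] * 4 + [500.0]

print(percentile(latencies, 0.99))   # outliers fall inside the discarded 1%
print(percentile(latencies, 0.999))  # the single worst sample now shows up
```

The more nines you report, the more a handful of outliers (GC pauses, compaction) dominates the number, which is exactly the trade-off described above.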
Re: Can not connect with cqlsh to something different than localhost
The difference is what interface your service is listening on. What is the output of $ netstat -ntl | grep 9042 On Mon, 8 Dec 2014 07:21 Richard Snowden richard.t.snow...@gmail.com wrote: I left listen_address blank - still I can't connect (connection refused). cqlsh - OK cqlsh ubuntu - fail (ubuntu is my hostname) cqlsh 192.168.111.136 - fail telnet 192.168.111.136 9042 from outside the VM gives me a connection refused. I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080 from outside the VM - and got the expected result (Connected to 192.168.111.136. Escape character is '^]'. So what's so special in Cassandra? On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com wrote: Listen address needs the actual address, not the interface. This is best accomplished by setting up proper hostnames for each machine (through DNS or hosts file) and leaving listen_address blank, as it will pick the external ip. Otherwise, you'll need to set the listen address to the IP of the machine you want on each machine. I find the former to be less of a pain to manage. On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). What I tried: 1) listen_address: localhost - connection OK (but of course I can't connect from outside the VM to localhost) 2) Set listen_interface: eth0 - connection refused 3) Set listen_address: 192.168.111.136 - connection refused What to do? Try: $ netstat -lnt and see which interface port 9042 is listening on. You will likely need to update cassandra.yaml to change the interface. By default, Cassandra is listening on localhost so your local cqlsh session works. On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com wrote: I am running Cassandra 2.1.2 in an Ubuntu VM. cqlsh or cqlsh localhost works fine. But I can not connect from outside the VM (firewall, etc. 
disabled). Even when I do cqlsh 192.168.111.136 in my VM I get connection refused. This is strange because when I check my network config I can see that 192.168.111.136 is my IP:

root@ubuntu:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
          inet addr:192.168.111.136  Bcast:192.168.111.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)

root@ubuntu:~# cqlsh 192.168.111.136 9042
Connection error: ('Unable to connect to any servers', {'192.168.111.136': error(111, "Tried connecting to [('192.168.111.136', 9042)]. Last error: Connection refused")})

What to do?
Re: Cassandra 2.1.2 node stuck on joining the cluster
Any chance you have something along the path that causes the connectivity issues? What's the network connectivity between this node and the other nodes? Can you try transferring a big file between the two servers? Perhaps you have an MTU issue that causes TCP PMTU discovery to fail. Can you send large pings between the servers? Try pinging them from both sides with large packets (5000, 1). On Mon, Dec 8, 2014 at 3:22 PM, Krzysztof Zarzycki k.zarzy...@gmail.com wrote: Hi Cassandra users, I'm trying but failing to join a new (well old, but wiped out/decommissioned) node to an existing cluster. Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I start a third node with 2.1.2, it gets to joining state, it bootstraps, i.e. streams some data as shown by nodetool netstats, but after some time it gets stuck. From that point nothing gets streamed, the new node stays in joining state. I restarted the node multiple times; each time it streamed more data, but then got stuck again. Other facts:
- I don't see any errors in the log on any of the nodes.
- The connectivity seems fine, I can ping and netcat to port 7000 all ways.
- I have ~ 200 GB load per running node, replication 2, 16 tokens.
- Load of the new node got to around 300 GB now.
- The bootstrapping process stops in the middle of streaming some table, always after sending exactly 10MB of some SSTable, e.g.:

$ nodetool netstats | grep -P -v bytes\(100
Mode: NORMAL
Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
    /192.168.200.16
        Sending 516 files, 12493900 bytes total
        /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
Read Repair Statistics:
Attempted: 2016371
Mismatch (Blocking): 0
Mismatch (Background): 168721
Pool Name     Active   Pending   Completed
Commands      n/a      0         55802918
Responses     n/a      0         425963

I'm trying to join this node for several days and I don't know what to do with it... I'll be grateful for any help!
Cheers, Krzysztof Zarzycki
Re: Can not connect with cqlsh to something different than localhost
rpc_address (or rpc_interface) is used for client connections, listen_address is for inter-node communication. On 8 December 2014 at 19:21, Richard Snowden richard.t.snow...@gmail.com wrote: $ netstat -ntl | grep 9042 tcp6 0 0 127.0.0.1:9042 :::* LISTEN (listen_address not set in cassandra.yaml) Even with listen_address: 192.168.111.136 I get: $ netstat -ntl | grep 9042 tcp6 0 0 127.0.0.1:9042 :::* LISTEN All I want to do is to access Cassandra from outside my VM. Is this really that hard? On Mon, Dec 8, 2014 at 7:30 PM, Michael Dykman mdyk...@gmail.com wrote: The difference is what interface your service is listening on. What is the output of $ netstat -ntl | grep 9042 On Mon, 8 Dec 2014 07:21 Richard Snowden richard.t.snow...@gmail.com wrote: I left listen_address blank - still I can't connect (connection refused). cqlsh - OK cqlsh ubuntu - fail (ubuntu is my hostname) cqlsh 192.168.111.136 - fail telnet 192.168.111.136 9042 from outside the VM gives me a connection refused. I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080 from outside the VM - and got the expected result (Connected to 192.168.111.136. Escape character is '^]'. So what's so special in Cassandra? On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com wrote: Listen address needs the actual address, not the interface. This is best accomplished by setting up proper hostnames for each machine (through DNS or hosts file) and leaving listen_address blank, as it will pick the external ip. Otherwise, you'll need to set the listen address to the IP of the machine you want on each machine. I find the former to be less of a pain to manage. On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). 
What I tried:
1) listen_address: localhost - connection OK (but of course I can't connect from outside the VM to localhost)
2) Set listen_interface: eth0 - connection refused
3) Set listen_address: 192.168.111.136 - connection refused
What to do? Try: $ netstat -lnt and see which interface port 9042 is listening on. You will likely need to update cassandra.yaml to change the interface. By default, Cassandra is listening on localhost so your local cqlsh session works. On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com wrote: I am running Cassandra 2.1.2 in an Ubuntu VM. cqlsh or cqlsh localhost works fine. But I can not connect from outside the VM (firewall, etc. disabled). Even when I do cqlsh 192.168.111.136 in my VM I get connection refused. This is strange because when I check my network config I can see that 192.168.111.136 is my IP:

root@ubuntu:~# ifconfig
eth0      Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
          inet addr:192.168.111.136  Bcast:192.168.111.255  Mask:255.255.255.0
          inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
          TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:550 errors:0 dropped:0 overruns:0 frame:0
          TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)

root@ubuntu:~# cqlsh 192.168.111.136 9042
Connection error: ('Unable to connect to any servers', {'192.168.111.136': error(111, "Tried connecting to [('192.168.111.136', 9042)]. Last error: Connection refused")})

What to do?
Re: Can not connect with cqlsh to something different than localhost
Ah! That did the trick! Thanks Sam! On Mon, Dec 8, 2014 at 8:49 PM, Sam Tunnicliffe s...@beobal.com wrote: rpc_address (or rpc_interface) is used for client connections, listen_address is for inter-node communication. On 8 December 2014 at 19:21, Richard Snowden richard.t.snow...@gmail.com wrote: $ netstat -ntl | grep 9042 tcp6 0 0 127.0.0.1:9042 :::* LISTEN (listen_address not set in cassandra.yaml) Even with listen_address: 192.168.111.136 I get: $ netstat -ntl | grep 9042 tcp6 0 0 127.0.0.1:9042 :::* LISTEN All I want to do is to access Cassandra from outside my VM. Is this really that hard? On Mon, Dec 8, 2014 at 7:30 PM, Michael Dykman mdyk...@gmail.com wrote: The difference is what interface your service is listening on. What is the output of $ netstat -ntl | grep 9042 On Mon, 8 Dec 2014 07:21 Richard Snowden richard.t.snow...@gmail.com wrote: I left listen_address blank - still I can't connect (connection refused). cqlsh - OK cqlsh ubuntu - fail (ubuntu is my hostname) cqlsh 192.168.111.136 - fail telnet 192.168.111.136 9042 from outside the VM gives me a connection refused. I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080 from outside the VM - and got the expected result (Connected to 192.168.111.136. Escape character is '^]'. So what's so special in Cassandra? On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com wrote: Listen address needs the actual address, not the interface. This is best accomplished by setting up proper hostnames for each machine (through DNS or hosts file) and leaving listen_address blank, as it will pick the external ip. Otherwise, you'll need to set the listen address to the IP of the machine you want on each machine. I find the former to be less of a pain to manage. On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden richard.t.snow...@gmail.com wrote: This did not work either. I changed /etc/cassandra.yaml and restarted Cassandra (I even restarted the machine to make 100% sure). 
What I tried:
1) listen_address: localhost - connection OK (but of course I can't connect from outside the VM to localhost)
2) Set listen_interface: eth0 - connection refused
3) Set listen_address: 192.168.111.136 - connection refused
What to do?

Try: $ netstat -lnt and see which interface port 9042 is listening on. You will likely need to update cassandra.yaml to change the interface. By default, Cassandra listens on localhost, which is why your local cqlsh session works.

On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com wrote:
I am running Cassandra 2.1.2 in an Ubuntu VM. cqlsh or cqlsh localhost works fine, but I cannot connect from outside the VM (firewall, etc. disabled). Even when I do cqlsh 192.168.111.136 in my VM I get connection refused. This is strange, because when I check my network config I can see that 192.168.111.136 is my IP:

root@ubuntu:~# ifconfig
eth0  Link encap:Ethernet HWaddr 00:0c:29:02:e0:de
      inet addr:192.168.111.136 Bcast:192.168.111.255 Mask:255.255.255.0
      inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
      UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
      RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
      TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:1000
      RX bytes:21307125 (21.3 MB) TX bytes:709471 (709.4 KB)

lo    Link encap:Local Loopback
      inet addr:127.0.0.1 Mask:255.0.0.0
      inet6 addr: ::1/128 Scope:Host
      UP LOOPBACK RUNNING MTU:65536 Metric:1
      RX packets:550 errors:0 dropped:0 overruns:0 frame:0
      TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
      collisions:0 txqueuelen:0
      RX bytes:148053 (148.0 KB) TX bytes:148053 (148.0 KB)

root@ubuntu:~# cqlsh 192.168.111.136 9042
Connection error: ('Unable to connect to any servers', {'192.168.111.136': error(111, Tried connecting to [('192.168.111.136', 9042)]. Last error: Connection refused)})
What to do?
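For the archives: the native-protocol port 9042 that cqlsh connects to is bound according to rpc_address, while listen_address only affects inter-node traffic - so changing listen_address alone will not expose 9042 externally. A minimal cassandra.yaml sketch for 2.1, reusing the address from this thread (adjust to your own network):

```yaml
# cassandra.yaml (Cassandra 2.1) - client vs. inter-node addresses
listen_address: 192.168.111.136   # inter-node (gossip/storage) traffic
rpc_address: 192.168.111.136      # client connections: cqlsh, drivers (port 9042)

# Alternatively, bind client connections on all interfaces:
# rpc_address: 0.0.0.0
# broadcast_rpc_address: 192.168.111.136  # required when rpc_address is 0.0.0.0
```

After editing, restart Cassandra and re-run netstat -ntl | grep 9042 to confirm the port is now bound to the external address rather than 127.0.0.1.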
Re: Keyspace and table/cf limits
has there been any recent discussion on multi-tenancy namespaces? I think this would effectively solve the scenario - a formalized partition key that's enforced at the storage layer, similar to Oracle's Virtual Private Database. It was on the wiki from ~Aug 2010: http://wiki.apache.org/cassandra/MultiTenant

Namespaces - in a multi-tenant use case, each user might like to have a keyspace XYZ for whatever reason. So it might be nice to have namespaces so that keyspace XYZ could be specific to their user. Ideally this would be an option that would not affect those who don't use namespaces. The distinction from keyspaces is that a namespace would be completely transparent to the user: the existence of namespaces would not be exposed. It might be returned by the authentication backend on login and prefixed to keyspaces transparently.

thanks!!!

On Sat, Dec 6, 2014 at 11:25 PM, Jason Wee peich...@gmail.com wrote:
+1 well said Jack!

On Sun, Dec 7, 2014 at 6:13 AM, Jack Krupansky j...@basetechnology.com wrote:
Generally, limit a Cassandra cluster to the low hundreds of tables, regardless of the number of keyspaces. Beyond low hundreds is certainly an “expert” feature and requires great care. Sure, maybe you can have 500 or 750 or maybe even 1,000 tables in a cluster, but don’t be surprised if you start running into memory and performance issues. There is an undocumented method to reduce the per-table overhead to support more tables, but... if you are not expert enough to find it on your own, then you are definitely not expert enough to be using it. -- Jack Krupansky

*From:* Raj N raj.cassan...@gmail.com
*Sent:* Tuesday, November 25, 2014 12:07 PM
*To:* user@cassandra.apache.org
*Subject:* Keyspace and table/cf limits
What's the latest on the maximum number of keyspaces and/or tables that one can have in Cassandra 2.1.x? -Raj

-- Frank Hsueh | frank.hs...@gmail.com
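Until something like namespaces exists, the usual workaround for the table-count limit Jack describes is a shared-table design with the tenant identifier leading the partition key. A minimal CQL sketch (keyspace, table, and column names are hypothetical, not from this thread):

```sql
-- Hypothetical shared table for multi-tenant data: one table serves all
-- tenants instead of a keyspace or table per tenant, keeping the cluster
-- within the "low hundreds of tables" guidance above.
CREATE TABLE shared.app_data (
    tenant_id text,   -- tenant identifier, supplied by the application on every query
    item_id   text,
    payload   blob,
    -- tenant_id in the partition key keeps each tenant's rows in
    -- separate partitions
    PRIMARY KEY ((tenant_id, item_id))
);
```

Unlike the proposed namespaces, this isolation is enforced by the application layer rather than the storage layer - every query must supply the right tenant_id - which is exactly the gap the wiki proposal was trying to close.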
Cassandra Files Taking up Much More Space than CF
Hi All, I am new to Cassandra, so I apologise in advance if I have missed anything obvious, but this one currently has me stumped. I am currently running a 6-node Cassandra 2.1.1 cluster on EC2 using c3.2xlarge nodes, which overall is working very well for us. However, after letting it run for a while I get into a situation where the amount of disk space used far exceeds the total amount of data on each node, and I haven't been able to get the size to go back down except by stopping and restarting the node.

For example, almost all of my data is in one table. On one of my nodes right now the total space used (as reported by nodetool cfstats) is 57.2 GB and there are no snapshots. However, when I look at the size of the data files (using du), the data file for that table is 107 GB. Because the c3.2xlarge only has 160 GB of SSD, you can see why this quickly becomes a problem.

Running nodetool compact didn't reduce the size, and neither did running nodetool repair -pr on the node. I also tried nodetool flush and nodetool cleanup (even though I have not added or removed any nodes recently), but neither changed anything. In order to keep my cluster up I then stopped and started that node, and the size of the data file dropped to 54 GB while the total column family size (as reported by nodetool) stayed about the same.

Any suggestions as to what I could be doing wrong?
Thanks, Nate
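One common cause of this symptom (du far larger than cfstats, space reclaimed only by a restart) is sstable files that compaction has deleted but the JVM still holds open - the kernel cannot free their space until the process closes them. A rough sketch of the size comparison, using the numbers from this thread (the function and the 10% slack are illustrative, not a Cassandra tool):

```shell
# Flag when on-disk size (du) exceeds what Cassandra reports
# (nodetool cfstats "Space used (total)") by more than ~10%,
# which suggests deleted-but-still-open files. Sizes in bytes;
# the 10% slack allows for compactions in flight.
space_leak_suspected() {
  local du_bytes=$1 cfstats_bytes=$2
  if [ "$du_bytes" -gt $(( cfstats_bytes + cfstats_bytes / 10 )) ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Values from the thread: 107 GB on disk vs. 57.2 GB reported.
space_leak_suspected 107000000000 57200000000   # prints "yes"
```

If it says yes, running lsof -p <cassandra-pid> | grep -i deleted (assuming lsof is installed) will show whether old sstables are still held open; a restart closes them, which matches the space drop described above. On 2.1.1 specifically it is also worth checking the data directory for leftover tmp/tmplink sstable files, since early 2.1 releases had bugs that could leave those behind.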