Re: Could ring cache really improve performance in Cassandra?

2014-12-08 Thread 孔嘉林
Thanks Jonathan. Actually, I'm wondering how CQL is implemented under the
hood. Is it a different RPC mechanism? Why is it faster than Thrift? I may be
wrong, but so far I have regarded CQL as just a query language. Could you
please help explain this to me? I still feel puzzled after reading some docs
about CQL. I create tables in CQL and use the cql3 API over Thrift; I don't
know what else I can do with CQL. I am using C++ to write the client-side
code. Currently I am not using the C++ driver because I want to write some
simple functionality by myself.

Also, I didn't use the stress test tool provided in the Cassandra
distribution because I also want to make sure I can achieve the expected
performance using my own client code. I know others have benchmarked
Cassandra and got good results, but if I cannot reproduce those satisfactory
results, I cannot use it in my case.

I will create a repo and send a link later; I hope to get your kind help.

Thanks very much.

2014-12-08 14:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com:

 I would really not recommend using Thrift for anything at this point,
 including your load tests.  Take a look at CQL: all development is going
 there, and 2.1 has seen a massive performance boost over 2.0.

 You may want to try the Cassandra stress tool included in 2.1, it can
 stress a table you've already built.  That way you can rule out any bugs on
 the client side.  If you're going to keep using your tool, however, it
 would be helpful if you sent out a link to the repo, since currently we
 have no way of knowing if you've got a client side bug (data model or code)
 that's limiting your performance.


 On Sun Dec 07 2014 at 7:55:16 PM 孔嘉林 kongjiali...@gmail.com wrote:

 I find that under the src/client folder of the Cassandra 2.1.0 source code
 there is a *RingCache.java* file. It uses a Thrift client calling the
 *describe_ring()* API to get the token range of each Cassandra node. It is
 used on the client side. The client can combine it with the partitioner to
 find the target node. In this way there is no need to route requests
 between Cassandra nodes, and the client can connect directly to the target
 node. So maybe it can save some routing time and improve performance.
 Thank you very much.

 2014-12-08 1:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com:

 What's a ring cache?

 FYI if you're using the DataStax CQL drivers they will automatically
 route requests to the correct node.

 On Sun Dec 07 2014 at 12:59:36 AM kong kongjiali...@gmail.com wrote:

 Hi,

 I'm doing a stress test on Cassandra. I have learned that using a ring
 cache can improve performance because client requests can go directly to
 the target Cassandra server, so the coordinator node is itself the desired
 target node. In this way there is no need for the coordinator to route
 client requests to the target node, and maybe we can get a linear
 performance increase.



 However, in my stress test on an Amazon EC2 cluster, the results are weird.
 It seems there is no performance improvement after using the ring cache.
 Could anyone help me explain these results? (Also, I think the results of
 the test without the ring cache are weird, because there is no linear
 increase in QPS when new nodes are added. I need help explaining this,
 too.) The results are as follows:



 INSERT (write):

 Node count | Replication factor | QPS(No ring cache) | QPS(ring cache)
 -----------+--------------------+--------------------+----------------
          1 |                  1 |              18687 |           20195
          2 |                  1 |              20793 |           26403
          2 |                  2 |              22498 |           21263
          4 |                  1 |              28348 |           30010
          4 |                  3 |              28631 |           24413



 SELECT (read):

 Node count | Replication factor | QPS(No ring cache) | QPS(ring cache)
 -----------+--------------------+--------------------+----------------
          1 |                  1 |              24498 |           22802
          2 |                  1 |              28219 |           27030
          2 |                  2 |              35383 |           36674
          4 |                  1 |              34648 |           28347
          4 |                  3 |              52932 |           52590





 Thank you very much,

 Joy





Re: Could ring cache really improve performance in Cassandra?

2014-12-08 Thread Robert Stupp
cassandra-stress is a great tool to check whether the sizing of your cluster
in combination with your data model will fit your production needs, i.e.
without the application :) Removing the application removes any possible
bugs from the load test. Sure, it’s a necessary step to do it with your
application as well - but I’d recommend starting with the stress test tool
first.

Thrift is a deprecated API. I strongly recommend using the C++ driver (I’m
pretty sure it supports the native protocol). The native protocol achieves
approximately twice the performance of Thrift over far fewer TCP
connections. (Thrift is RPC, which means connections usually waste system,
application and server resources while waiting for something; the native
protocol is a multiplexed protocol.) As John already said, all development
effort is spent on CQL3 and the native protocol - Thrift is just kept
supported.

With CQL you can do everything that you can do with Thrift, plus more new
stuff.

I also recommend using prepared statements (they automagically work in a
distributed cluster with the native protocol) - they eliminate the effort
of parsing the CQL statement again and again.



Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Richard Snowden
This did not work either. I changed /etc/cassandra.yaml and restarted
Cassandra (I even restarted the machine to make 100% sure).

What I tried:

1) listen_address: localhost
   - connection OK (but of course I can't connect from outside the VM
to localhost)

2) Set listen_interface: eth0
   - connection refused

3) Set listen_address: 192.168.111.136
   - connection refused


What to do?


 Try:
 $ netstat -lnt
 and see which interface port 9042 is listening on. You will likely need to
 update cassandra.yaml to change the interface. By default, Cassandra is
 listening on localhost so your local cqlsh session works.

 On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com
 wrote:

  I am running Cassandra 2.1.2 in an Ubuntu VM.
 
  cqlsh or cqlsh localhost works fine.
 
  But I can not connect from outside the VM (firewall, etc. disabled).
 
  Even when I do cqlsh 192.168.111.136 in my VM I get connection refused.
  This is strange because when I check my network config I can see that
  192.168.111.136 is my IP:
 
  root@ubuntu:~# ifconfig
 
  eth0  Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
inet addr:192.168.111.136  Bcast:192.168.111.255
  Mask:255.255.255.0
inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)
 
  loLink encap:Local Loopback
inet addr:127.0.0.1  Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING  MTU:65536  Metric:1
RX packets:550 errors:0 dropped:0 overruns:0 frame:0
TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)
 
 
  root@ubuntu:~# cqlsh 192.168.111.136 9042
  Connection error: ('Unable to connect to any servers', {'192.168.111.136':
  error(111, Tried connecting to [('192.168.111.136', 9042)]. Last error:
  Connection refused)})
 
 
  What to do?
 


Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Vivek Mishra
Two things:
1. Try telnet 192.168.111.136 9042 and see if it connects?
2. check for hostname in /etc/hosts, if it is mapped correctly.

-Vivek





Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2

2014-12-08 Thread 孔嘉林
Thanks Chris.
I run a client on a separate AWS instance from the Cassandra cluster
servers. At the client side, I create 40 or 50 threads for sending requests
to each Cassandra node, with one Thrift client per thread. At the beginning,
all the created Thrift clients connect to their corresponding Cassandra
nodes and stay connected for the whole process (I do not close the
transports until the end of the test). So I use very simple load balancing,
since the same number of Thrift clients connects to each node. My source
code is here:
https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp
It's very nice of you to help me improve my code.

As I increase the number of threads, the latency gets longer.

I'm using C++, so if I want to use the native protocol with prepared
statements, is the only way to use the C++ driver?
Thanks very much.




2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com:

 I think your client could use improvements.  How many threads do you have
 running in your test?  With a thrift call like that you can only do one
 request at a time per connection.  For example, assuming C* takes 0 ms, a
 10 ms network latency/driver overhead will mean a 20 ms RTT and a max
 throughput of ~50 QPS per thread (the native binary protocol doesn't
 behave like this).  Are you running the client on its own system or shared
 with a node?  How are you load balancing your requests?  Source code would
 help, since there's a lot that can become a bottleneck.

 Generally you will see a bit of a dip in latency from N=RF=1 and N=2, RF=2
 etc since there are optimizations on the coordinator node when it doesn't
 need to send the request to the replicas.  The impact of the network
 overhead decreases in significance as cluster grows.  Typically; latency
 wise, RF=N=1 is going to be fastest possible for smaller loads (ie when a
 client cannot fully saturate a single node).

 Main thing to expect is that latency will plateau and remain fairly
 constant as load/nodes increase while throughput potential will linearly
 (empirically at least) increase.

 You should really attempt it with the native binary + prepared statements,
 running cql over thrift is far from optimal.  I would recommend using the
 cassandra-stress tool if you want to stress test Cassandra (and not your
 code)
 http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema

 ===
 Chris Lohfink

 On Sun, Dec 7, 2014 at 9:48 PM, 孔嘉林 kongjiali...@gmail.com wrote:

 Hi Eric,
 Thank you very much for your reply!
 Do you mean that I should clear my table after each run? Indeed, I can see
 several rounds of compaction during my test, but could just a few
 compactions affect performance that much? Also, I can see from OpsCenter
 that some ParNew GCs happen but no CMS GCs.

 I run my test on an EC2 cluster, so the network within it should be high
 speed. Each Cassandra server is of the m3.xlarge type, with 4 vCPUs, 15
 GiB of memory and 80 GB of SSD storage.

 As for latency, which latency should I care about most, p(99) or p(999)?
 I want to find the max QPS under a certain latency limit.

 I know my testing scenario is not a common production case; I just want to
 know how much load my cluster can bear under stress.

 So, how did you test your cluster to get 86k writes/sec? How many requests
 did you send to your cluster? Was it also 1 million? Did you also use
 OpsCenter to monitor real-time performance? I also wonder why the write
 and read QPS that OpsCenter reports are much lower than what I calculate.
 Could you please describe your test deployment in detail?

 Thank you very much,
 Joy

 2014-12-07 23:55 GMT+08:00 Eric Stevens migh...@gmail.com:

 Hi Joy,

 Are you resetting your data after each test run?  I wonder if your tests
 are actually causing you to fall behind on data grooming tasks such as
 compaction, and so performance suffers for your later tests.

 There are *so many* factors which can affect performance that, without
 reviewing the test methodology in great detail, it's really hard to say
 whether there are flaws which might expose an antipattern, cause an
 atypical number of cache hits or misses, and so forth. You may also be
 producing GC pressure in the write path.

 I *can* say that 28k writes per second looks just a little low, but it
 depends a lot on your network, hardware, and write patterns (eg, data
 size).  For a little performance test suite I wrote, with parallel batched
 writes, on a 3 node rf=3 cluster test cluster, I got about 86k writes per
 second.

 Also, focusing exclusively on max latency is going to cause you some
 trouble, especially with magnetic media as you're using.  Between
 ill-timed GC and the inconsistent performance characteristics of magnetic
 media, your max numbers will often look significantly worse than your
 p(99) or p(999) numbers.

 All this said, one node will often look better than several nodes 

Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Jonathan Haddad
Listen address needs the actual address, not the interface.  This is best
accomplished by setting up proper hostnames for each machine (through DNS
or hosts file) and leaving listen_address blank, as it will pick the
external ip.  Otherwise, you'll need to set the listen address to the IP of
the machine you want on each machine.  I find the former to be less of a
pain to manage.
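[Editor's note] For reference, the relevant cassandra.yaml settings might
look like the following. This is a sketch assuming Cassandra 2.1 defaults;
the addresses are this thread's example values. Note that cqlsh on port 9042
talks to the native transport, which binds to rpc_address, while
listen_address is used for inter-node traffic:

```
# Inter-node (gossip/storage) address: set ONE of listen_address or
# listen_interface, never both. Leaving listen_address unset lets
# Cassandra resolve the machine's hostname, per the advice above.
listen_address: 192.168.111.136
# listen_interface: eth0

# Address the native transport (port 9042, used by cqlsh and the
# drivers) binds to. 0.0.0.0 listens on all interfaces but then
# requires broadcast_rpc_address to be set as well.
rpc_address: 192.168.111.136
native_transport_port: 9042
```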




Re: Could ring cache really improve performance in Cassandra?

2014-12-08 Thread Jonathan Haddad
I agree with Robert.  If you're trying to test Cassandra, test Cassandra
using stress.  Set a reasonable benchmark, and then you'll be able to aim
for that with your client code.  Otherwise you're likely to be asking a lot
of the wrong questions and making incorrect assumptions.


Re: Could ring cache really improve performance in Cassandra?

2014-12-08 Thread 孔嘉林
Thanks Robert. So the native protocol is an asynchronous protocol? And was
the native protocol created specifically for Cassandra CQL? I hadn't heard
about this protocol before.

I have tried using the stress test tool, but it seems that this tool has to
run on the same machine as one of the Cassandra nodes (or at least on a node
with Cassandra installed)? Once I try to run this tool on a separate client
instance, I get exceptions thrown.

The RingCache I found is here:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java
I am trying to implement similar functionality in C++. My repo is here:
https://github.com/kongjialin/Cassandra . My idea is that all requests go
through the client-side ring cache and are sent directly to the target
Cassandra node (each node is associated with a client pool) to avoid routing
between nodes in the cluster.

Thank you very much.


Re: Could ring cache really improve performance in Cassandra?

2014-12-08 Thread Robert Stupp
 
 So the native protocol is an asynchronous protocol? 
Yes.

 I have tried using the stress test tool. But it seems that this tool should
 run on the same node as one of the Cassandra nodes (or at least on a node
 having Cassandra installed)? Once I try to run this tool on a separate client
 instance, I get exceptions thrown.
You should start with the "new" kind of stress testing (using CQL3, the native
protocol, and prepared statements). Forget about thrift ;)
Start with the example YAML stress file first to learn about it. It allows you 
to configure simultaneous writes and reads that match your workload.
And you do not need to run it on a C* node - but you should think about the 
network between the stress test tool and your cluster.
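The profile Robert refers to looks roughly like this - the field names follow the example profile shipped with 2.1 (cqlstress-example.yaml), but treat the whole thing as an unverified sketch with made-up table and column names:

```yaml
# stress-profile.yaml - run from a separate client machine with something like:
#   tools/bin/cassandra-stress user profile=stress-profile.yaml ops(insert=1,read=1) -node <cluster-ip>
keyspace: stresscql
keyspace_definition: |
  CREATE KEYSPACE stresscql WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 2};
table: events
table_definition: |
  CREATE TABLE events (pkey text, seq int, payload blob, PRIMARY KEY (pkey, seq));
columnspec:
  - name: payload
    size: uniform(64..512)   # payload sizes drawn from a distribution
insert:
  partitions: fixed(1)       # rows per partition touched per insert batch
queries:
  read:
    cql: SELECT * FROM events WHERE pkey = ? LIMIT 10
```

The point of the profile is that the reads and writes exercise your own schema, not a synthetic one.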

 The ringcache I found is here:
 https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java
 And I try to implement similar functionality in C++. My repo is here:
 https://github.com/kongjialin/Cassandra
 My idea is that all the requests go to the client-side ring cache and be
 sent to the target Cassandra node (each node is associated with a client
 pool) to avoid routing between nodes in the cluster.
You can save yourself a lot of work implementing it right - just use the C++
driver. It knows about the native protocol and routes requests to the correct
nodes. You can also go into the C++ driver code to look at how it works,
improve it, etc. :)
I don't know anything about the C++ driver - but feel free to post to the
driver mailing list and/or the #datastax-drivers IRC channel.


 
 2014-12-08 16:42 GMT+08:00 Robert Stupp sn...@snazy.de:
 cassandra-stress is a great tool to check whether the sizing of your cluster
 in combination with your data model will fit your production needs, i.e.
 without the application :) Removing the application removes any possible bugs
 from the load test. Sure, it's a necessary step to also test it with your
 application - but I'd recommend starting with the stress test tool first.
 
 Thrift is a deprecated API. I strongly recommend using the C++ driver (I'm
 pretty sure it supports the native protocol). The native protocol achieves
 approx. twice the performance of thrift via far fewer TCP connections.
 (Thrift is RPC - meaning connections usually waste system, application and
 server resources while waiting for something. The native protocol is a
 multiplexed protocol.) As Jonathan already said, all development effort is
 spent on CQL3 and the native protocol - thrift is just supported.
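The throughput difference between RPC and a multiplexed protocol follows directly from having one outstanding request per connection versus many. Illustrative arithmetic only (the 128 in-flight figure is an arbitrary example, not a protocol constant):

```python
def sync_qps_per_connection(rtt_s):
    # Thrift-style RPC: one outstanding request per connection,
    # so throughput is capped at 1 / round-trip time.
    return 1.0 / rtt_s

def multiplexed_qps_per_connection(rtt_s, in_flight):
    # Native protocol: up to `in_flight` requests share one connection.
    return in_flight / rtt_s

rtt = 0.020  # e.g. a 20 ms round trip
print(sync_qps_per_connection(rtt))              # ~50 req/s on one socket
print(multiplexed_qps_per_connection(rtt, 128))  # ~6400 req/s on that same socket
```

This is why a thrift client needs many connections (and threads) to saturate a node, while one native-protocol connection can.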
 
 With CQL you can do everything that you can do with thrift, plus more new stuff.
 
 I also recommend using prepared statements (they automagically work in a
 distributed cluster with the native protocol) - it eliminates the effort of
 parsing the CQL statement again and again.
 
 
 On 08.12.2014 at 09:26, 孔嘉林 kongjiali...@gmail.com wrote:
 
 Thanks Jonathan, actually I'm wondering how CQL is implemented underneath - a
 different RPC mechanism? Why is it faster than thrift? I know I'm wrong, but
 for now I just regard CQL as a query language. Could you please explain it to
 me? I still feel puzzled after reading some docs about CQL. I create tables
 in CQL, and use the cql3 API via thrift. I don't know what else I can do with
 CQL. And I am using C++ to write the client-side code. Currently I am not
 using the C++ driver and want to write some simple functionality by myself.
 
 Also, I didn't use the stress test tool provided in the Cassandra
 distribution because I also want to make sure I can achieve performance as
 good as expected using my client code. I know others have benchmarked
 Cassandra and got good results. But if I cannot reproduce those satisfactory
 results, I cannot use it in my case.
 
 I will create a repo and send a link later, hope to get your kind help.
 
 Thanks very much.
 
 2014-12-08 14:28 GMT+08:00 Jonathan Haddad j...@jonhaddad.com:
 I would really not recommend using thrift for anything at this point, 
 including your load tests.  Take a look at CQL, all development is going 
 there and has in 2.1 seen a massive performance boost over 2.0.
 
 You may want to try the Cassandra stress tool included in 2.1, it can stress 
 a table you've already built.  That way you can rule out any bugs on the 
 client side.  If you're going to keep using your tool, however, it would be 
 helpful if you sent out a link to the repo, since currently we have no way 
 of knowing if you've got a client side bug (data model or code) that's 
 limiting your performance.
 
 
 On Sun Dec 07 2014 at 7:55:16 PM 孔嘉林 kongjiali...@gmail.com wrote:
 I find under the src/client folder of Cassandra 2.1.0 source code, there is 
 a RingCache.java file. It uses a 

Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Richard Snowden
I left listen_address blank - still I can't connect (connection refused).

cqlsh - OK
cqlsh ubuntu - fail (ubuntu is my hostname)
cqlsh 192.168.111.136 - fail

telnet 192.168.111.136 9042 from outside the VM gives me a connection
refused.

I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080
from outside the VM - and got the expected result (Connected to
192.168.111.136. Escape character is '^]'.)

So what's so special in Cassandra?


On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com wrote:

 Listen address needs the actual address, not the interface.  This is best
 accomplished by setting up proper hostnames for each machine (through DNS
 or hosts file) and leaving listen_address blank, as it will pick the
 external ip.  Otherwise, you'll need to set the listen address to the IP of
 the machine you want on each machine.  I find the former to be less of a
 pain to manage.


 On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden 
 richard.t.snow...@gmail.com wrote:

 This did not work either. I changed /etc/cassandra.yaml and restarted 
 Cassandra (I even restarted the machine to make 100% sure).

 What I tried:

 1) listen_address: localhost
- connection OK (but of course I can't connect from outside the VM to 
 localhost)

 2) Set listen_interface: eth0
- connection refused

 3) Set listen_address: 192.168.111.136
- connection refused


 What to do?


  Try:
  $ netstat -lnt
  and see which interface port 9042 is listening on. You will likely need to
  update cassandra.yaml to change the interface. By default, Cassandra is
  listening on localhost so your local cqlsh session works.

  On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com
  wrote:

   I am running Cassandra 2.1.2 in an Ubuntu VM.
  
   cqlsh or cqlsh localhost works fine.
  
   But I can not connect from outside the VM (firewall, etc. disabled).
  
   Even when I do cqlsh 192.168.111.136 in my VM I get connection refused.
   This is strange because when I check my network config I can see that
   192.168.111.136 is my IP:
  
   root@ubuntu:~# ifconfig
  
   eth0  Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
 inet addr:192.168.111.136  Bcast:192.168.111.255
   Mask:255.255.255.0
 inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
 TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)
  
   loLink encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 inet6 addr: ::1/128 Scope:Host
 UP LOOPBACK RUNNING  MTU:65536  Metric:1
 RX packets:550 errors:0 dropped:0 overruns:0 frame:0
 TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)
  
  
   root@ubuntu:~# cqlsh 192.168.111.136 9042
   Connection error: ('Unable to connect to any servers', 
   {'192.168.111.136':
   error(111, Tried connecting to [('192.168.111.136', 9042)]. Last error:
   Connection refused)})
  
  
   What to do?
  




Cassandra 2.1.2 node stuck on joining the cluster

2014-12-08 Thread Krzysztof Zarzycki
Hi Cassandra users,

I'm trying but failing to join a new (well, old, but wiped
out/decommissioned) node to an existing cluster.

Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
start a third node with 2.1.2, it gets to joining state, it bootstraps,
i.e. streams some data as shown by nodetool netstats, but after some time,
it gets stuck. From that point nothing gets streamed, the new node stays in
joining state. I restarted node multiple times, each time it streamed more
data, but then got stuck again.

Other facts:

   - I don't see any errors in the log on any of the nodes.
   - The connectivity seems fine; I can ping and netcat to port 7000 both ways.
   - I have ~ 200 GB load per running node, replication 2, 16 tokens.
   - Load of a new node got to around 300GBs now.
   - The bootstrapping process stops in the middle of streaming some table,
   *always* after sending exactly 10MB of some SSTable, e.g.:

   $ nodetool netstats | grep -P -v 'bytes\(100'
   Mode: NORMAL
   Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
       /192.168.200.16
           Sending 516 files, 12493900 bytes total
               /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db
               10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
   Read Repair Statistics:
   Attempted: 2016371
   Mismatch (Blocking): 0
   Mismatch (Background): 168721
   Pool Name  Active  Pending  Completed
   Commands   n/a     0        55802918
   Responses  n/a     0        425963


I'm trying to join this node for several days and I don't know what to do
with it... I'll be grateful for any help!


Cheers,

Krzysztof Zarzycki


Re: How to model data to achieve specific data locality

2014-12-08 Thread Eric Stevens
The upper bound for the data size of a single column is 2GB, and the upper
bound for the number of columns in a row (partition) is 2 billion.  So if
you wanted to create the largest possible row, you probably can't afford
enough disks to hold it.
http://wiki.apache.org/cassandra/CassandraLimitations

Practically speaking you start running into troubles *way* before you reach
those thresholds though.  Large columns and large numbers of columns create
GC pressure in your cluster, and since all data for a given row reside on
the same primary and replicas, this tends to lead to hot spotting.  Repair
happens for entire rows, so large rows increase the cost of repairs,
including GC pressure during the repair.  And rows of this size are often
arrived at by appending to the same row repeatedly, which will cause the
data for that row to be scattered across a large number of SSTables which
will hurt read performance. Also depending on your interface, you'll find
you start hitting limits that you have to increase, each with their own
implications (eg, maximum thrift message sizes and so forth).  The right
maximum practical size for a row definitely depends on your read and write
patterns, as well as your hardware and network.  More memory, SSD's, larger
SSTables, and faster networks will all raise the ceiling for where large
rows start to become painful.

@Kai, if you're familiar with the Thrift paradigm, the partition key
equates to a Thrift row key, and the clustering key equates to the first
part of a composite column name.  CQL PRIMARY KEY ((a,b), c, d) equates to
Thrift where row key is ['a:b'] and all columns begin with ['c:d:'].
Recommended reading: http://www.datastax.com/dev/blog/thrift-to-cql3
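The mapping can be made concrete with a toy encoding (purely illustrative; real Thrift composites are length-prefixed binary, not ':'-joined strings):

```python
def thrift_view(partition_key, clustering_key, column_name):
    """Toy model of how CQL PRIMARY KEY ((a, b), c, d) lands in Thrift:
    the partition key becomes the row key, and the clustering column values
    become the leading components of every composite column name."""
    row_key = ":".join(partition_key)
    column = ":".join(clustering_key + (column_name,))
    return row_key, column

# CQL table with PRIMARY KEY ((a, b), c, d) and a non-key column 'val':
row_key, column = thrift_view(("a1", "b1"), ("c1", "d1"), "val")
print(row_key)  # a1:b1
print(column)   # c1:d1:val
```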

Whatever your partition key, if you need to sub-partition to maintain
reasonable row sizes, then the only way to preserve data locality for
related records is probably to switch to the byte ordered partitioner, and
compute a blob or long column as part of your partition key that is meant to
cause the PK to map to the same token.  Just be aware that the byte ordered
partitioner comes with a number of caveats, and you'll become responsible
for maintaining good data load distribution in your cluster. But the
benefits of being able to tune locality may be worth it.
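Under a byte-ordered partitioner the token is just the key bytes, so the trick above amounts to putting the shared identifier first in the partition key (hypothetical key layout, not an API of any driver):

```python
def bop_token(seq_id, bucket):
    # With ByteOrderedPartitioner the token IS the raw partition key bytes,
    # so leading with seq_id keeps every bucket of one sequence contiguous
    # on the ring (and therefore on the same node or neighbouring nodes).
    return seq_id.encode() + b":" + bucket.to_bytes(4, "big")

tokens = sorted(bop_token(s, b) for s in ("seq_a", "seq_b") for b in (0, 1, 2))
# All seq_a buckets sort before any seq_b bucket:
print([t.split(b":")[0] for t in tokens])
```

With the default (hashed) Murmur3Partitioner this grouping is lost, which is exactly why the byte-ordered trade-off comes up at all.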


On Sun Dec 07 2014 at 3:12:11 PM Jonathan Haddad j...@jonhaddad.com wrote:

 I think he mentioned 100MB as the max size - planning for 1MB might make
 your data model difficult to work with.

 On Sun Dec 07 2014 at 12:07:47 PM Kai Wang dep...@gmail.com wrote:

 Thanks for the help. I wasn't clear how clustering column works. Coming
 from Thrift experience, it took me a while to understand how clustering
 column impacts partition storage on disk. Now I believe using seq_type as
 the first clustering column solves my problem. As for partition size, I will
 start with some bucketing assumption. If the partition size exceeds the
 threshold I may need to re-bucket using a smaller bucket size.

 On another thread Eric mentions the optimal partition size should be around
 100 KB ~ 1 MB. I will use that as the starting point to design my bucketing
 strategy.


 On Sun, Dec 7, 2014 at 10:32 AM, Jack Krupansky j...@basetechnology.com
 wrote:

   It would be helpful to look at some specific examples of sequences,
 showing how they grow. I suspect that the term “sequence” is being
 overloaded in some subtly misleading way here.

 Besides, we’ve already answered the headline question – data locality is
 achieved by having a common partition key. So, we need some clarity as to
 what question we are really focusing on.

 And, of course, we should be asking the “Cassandra Data Modeling 101”
 question of what do your queries want to look like, how exactly do you want
 to access your data. Only after we have a handle on how you need to read
 your data can we decide how it should be stored.

 My immediate question to get things back on track: When you say “The
 typical read is to load a subset of sequences with the same seq_id”,
 what type of “subset” are you talking about? Again, a few explicit and
 concise example queries (in some concise, easy to read pseudo language or
 even plain English, but not belabored with full CQL syntax.) would be very
 helpful. I mean, Cassandra has no “subset” concept, nor a “load subset”
 command, so what are we really talking about?

 Also, I presume we are talking CQL, but some of the references seem more
 Thrift/slice oriented.

 -- Jack Krupansky

  *From:* Eric Stevens migh...@gmail.com
 *Sent:* Sunday, December 7, 2014 10:12 AM
 *To:* user@cassandra.apache.org
 *Subject:* Re: How to model data to achieve specific data locality

  Also new seq_types can be added and old seq_types can be deleted.
 This means I often need to ALTER TABLE to add and drop columns.

 Kai, unless I'm misunderstanding something, I don't see why you need to
 alter the table to add a new seq type.  From a data model perspective,
 these are just new values in a row.

 If you do have columns 

Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2

2014-12-08 Thread Chris Lohfink
So I would -expect- an increase of ~20k qps per node with m3.xlarge, so
there may be something up with your client (I am not a C++ person, however,
but hopefully someone on the list will take notice).

Latency does not decrease linearly as you add nodes.  What you are likely
seeing with latency, with so few nodes, is the side effect of an optimization.
When you read/write from a table, the node you request will act as the
coordinator.  If the data exists on the coordinator and you are using rf=1 or
cl=1, it will not have to send the request to another node, just service it
locally:

  +-+ +--+
  |  node0  | +--|node1 |
  |-| |--|
  |  client | --+| coordinator  |
  +-+ +--+

In this case the write latency is dominated by the network between
coordinator and client.  A second case is where the coordinator actually
has to send the request to another node:

  +-+ +--+ +---+
  |  node0  | +--|node1 |+-- |node2  |
  |-| |--| |---|
  |  client | --+| coordinator  |---+| data replica  |
  +-+ +--+ +---+

As you're adding nodes, you're increasing the probability of hitting this second
scenario, where the coordinator has to make an additional network hop.  This is
possibly why you're seeing an increase (aside from client issues). To get an
idea of how latency is affected when you increase nodes, you really need
to go higher than 4 nodes (i.e. graph the same rf for 5, 10, 15, 25 nodes; below 5
isn't really the recommended way to run Cassandra anyway), since the
latency will approach that of the 2nd scenario (plus some spike outliers
for GCs) and then it should settle down until you overwork the nodes.
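The two scenarios can be quantified: with token-unaware load balancing, the chance the coordinator already holds the data is roughly RF/N, so the fraction of requests paying the extra hop grows with cluster size. Back-of-envelope only, assuming uniform token distribution:

```python
def extra_hop_fraction(n_nodes, rf):
    # Probability that a randomly chosen coordinator is NOT a replica
    # for the requested key, i.e. must make the second network hop.
    rf = min(rf, n_nodes)
    return 1.0 - rf / n_nodes

for n in (1, 2, 4, 8, 16):
    print(n, extra_hop_fraction(n, rf=1))
# At N=1 every request is served locally; at N=4 with rf=1,
# 75% of requests need the additional hop.
```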

May want to give https://github.com/datastax/cpp-driver a go (I'm not a cpp guy,
take it with a grain of salt).  I would still highly recommend using
cassandra-stress instead of your own tool if you want to test Cassandra and not
your code.

===
Chris Lohfink

On Mon, Dec 8, 2014 at 4:57 AM, 孔嘉林 kongjiali...@gmail.com wrote:

 Thanks Chris.
 I run a client on a separate AWS instance from the Cassandra cluster
 servers. At the client side, I create 40 or 50 threads for sending requests
 to each Cassandra node. I create one thrift client for each of the threads.
 And at the beginning, all the created thrift clients connect to the
 corresponding Cassandra nodes and keep connecting during the whole
 process (I did not close all the transports until the end of the test
 process). So I use very simple load balancing, since the same number of
 thrift clients connect to each node. And my source code is here:
 https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp
 It's very nice of you to help me improve my code.

 As I increase the number of threads, the latency gets longer.

 I'm using C++, so if I want to use native binary + prepared statements,
 the only way is to use C++ driver?
 Thanks very much.




 2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com:

 I think your client could use improvements.  How many threads do you have
 running in your test?  With a thrift call like that you can only do one
 request at a time per connection.  For example, assuming C* takes 0ms, a
 10ms network latency/driver overhead will mean 20ms RTT and a max
 throughput of ~50 QPS per thread (native binary doesn't behave like this).
 Are you running the client on its own system or shared with a node?  How are
 you load balancing your requests?  Source code would help since there's a
 lot that can become a bottleneck.

 Generally you will see a bit of a dip in latency between N=RF=1 and N=2,
 RF=2, etc. since there are optimizations on the coordinator node when it
 doesn't need to send the request to the replicas.  The impact of the
 network overhead decreases in significance as cluster grows.  Typically;
 latency wise, RF=N=1 is going to be fastest possible for smaller loads (ie
 when a client cannot fully saturate a single node).

 Main thing to expect is that latency will plateau and remain fairly
 constant as load/nodes increase while throughput potential will linearly
 (empirically at least) increase.

 You should really attempt it with the native binary + prepared
 statements, running cql over thrift is far from optimal.  I would recommend
 using the cassandra-stress tool if you want to stress test Cassandra (and
 not your code)
 http://www.datastax.com/dev/blog/improved-cassandra-2-1-stress-tool-benchmark-any-schema

 ===
 Chris Lohfink

 On Sun, Dec 7, 2014 at 9:48 PM, 孔嘉林 kongjiali...@gmail.com wrote:

 Hi Eric,
 Thank you very much for your reply!
 Do you mean that I should clear my table after each run? Indeed, I can
 see several times of compaction during my test, but could only a few times
 compaction 

Re: Cassandra Doesn't Get Linear Performance Increment in Stress Test on Amazon EC2

2014-12-08 Thread Eric Stevens
 Do you mean that I should clear my table after each run? Indeed, I can
see several times of compaction during my test, but could only a few times
compaction affect the performance that much?

It certainly affects performance.  Read performance suffers first, then
write performance suffers eventually.  For this synthetic test, if you want
to compare like states then you should certainly wipe between.  You may
fall behind on compaction for the first run, then the second run pays the
penalty for data grooming backlog generated during the first run.

 As for latency, which latency should I care about most? p(99) or p(999)?

p(99) discards the worst 1% of results for reporting, p(999) discards the
worst 0.1% of results for reporting.  Which you prefer depends on your
tolerance for response time jitter.  I.E. do you need 99% of responses to
be under a threshold, 99.9%?  The more 9's, the more likely you are to fail
your threshold due to an outlier.
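One concrete convention for reading those numbers off a latency sample (nearest-rank; monitoring tools may interpolate differently, so treat this as illustrative):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: the smallest value >= fraction p of the samples."""
    ordered = sorted(samples)
    k = max(0, math.ceil(p * len(ordered)) - 1)
    return ordered[k]

# 1000 synthetic latencies (ms): mostly fast, nine slow, one very slow outlier.
latencies = [10.0] * 990 + [50.0] * 9 + [900.0]
print(percentile(latencies, 0.99))   # 10.0 - the worst 1% (all the outliers) is discarded
print(percentile(latencies, 0.999))  # 50.0 - only the single worst sample is discarded
```

Note how adding a nine jumps the reported latency from 10 ms to 50 ms: the more nines, the more a handful of outliers dominates the figure.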

 So, how did you test your cluster that can get 86k writes/sec? How many
requests did you send to your cluster?

I wrote the same data to each of 5 tables with similar columns, but
different key configurations.  I did 100 runs of 5,000 records (different
records for each run).  The data itself was 5 columns composed of a mix of
bigint, text, and timestamp (so per record, fairly small data).  I wrote
records in asynchronous batches of 100 at a time, completing each of the
5,000 records for one table before moving on to the next table (the last
write to table 1 needed to complete before I moved on to the first write of
table 2, but within a table the operations were done in parallel).

I used the Datastax Java Driver, which speaks the native protocol, and is
faster and supports more parallelism than Thrift.

 Was it also 1 million?

In total it was 500,000 records written to each of 5 tables - so 2.5
million records overall.

 Did you also use OpsCenter to monitor the real time performance? I also
wonder why the write and read QPS OpsCenter provide are much lower than
what I calculate.

No, I measured throughput on my client only.  I don't have much experience
with OpsCenter, so I'm afraid I can't give you much insight into why you'd
see inconsistent information compared to data you measured.  Maybe you're
just seeing information for a single node instead of the whole cluster?

Again, the validity of this kind of test is highly suspect even though I
happened to have set this up already.  In my case I was trying to measure
burst performance specifically.  Cassandra will definitely accept bursts
well, but if you sustain such a load, performance will degrade over time.
Under sustained conditions you need to be certain you are staying on top of
compaction - outstanding compaction tasks should rarely if ever exceed 2 or
3.  Above 10, you need to reduce your write volume or your cluster will
gradually fall over, and you'll struggle to bootstrap new nodes to expand.

Do not size Cassandra for burst writes, size it for sustained writes.
Write your sizing tests with that in mind - how much can you write and not
fall behind on compaction over time, and accordingly your tests need to run
for hours or days, not seconds or minutes.

On Mon Dec 08 2014 at 3:58:35 AM 孔嘉林 kongjiali...@gmail.com wrote:

 Thanks Chris.
 I run a client on a separate AWS instance from the Cassandra cluster
 servers. At the client side, I create 40 or 50 threads for sending requests
 to each Cassandra node. I create one thrift client for each of the threads.
 And at the beginning, all the created thrift clients connect to the
 corresponding Cassandra nodes and keep connecting during the whole
 process (I did not close all the transports until the end of the test
 process). So I use very simple load balancing, since the same number of
 thrift clients connect to each node. And my source code is here:
 https://github.com/kongjialin/Cassandra/blob/master/cassandra_client.cpp
 It's very nice of you to help me improve my code.

 As I increase the number of threads, the latency gets longer.

 I'm using C++, so if I want to use native binary + prepared statements,
 the only way is to use C++ driver?
 Thanks very much.




 2014-12-08 12:51 GMT+08:00 Chris Lohfink clohfin...@gmail.com:

 I think your client could use improvements.  How many threads do you have
 running in your test?  With a thrift call like that you can only do one
 request at a time per connection.  For example, assuming C* takes 0ms, a
 10ms network latency/driver overhead will mean 20ms RTT and a max
 throughput of ~50 QPS per thread (native binary doesn't behave like this).
 Are you running the client on its own system or shared with a node?  How are
 you load balancing your requests?  Source code would help since there's a
 lot that can become a bottleneck.

 Generally you will see a bit of a dip in latency between N=RF=1 and N=2,
 RF=2, etc. since there are optimizations on the coordinator node when it
 doesn't need to send the request to the 

Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Michael Dykman
The difference is what interface your service is listening on. What is the
output of

$ netstat -ntl | grep 9042

On Mon, 8 Dec 2014 07:21 Richard Snowden richard.t.snow...@gmail.com
wrote:

 I left listen_address blank - still I can't connect (connection refused).

 cqlsh - OK
 cqlsh ubuntu - fail (ubuntu is my hostname)
 cqlsh 192.168.111.136 - fail

 telnet 192.168.111.136 9042 from outside the VM gives me a connection
 refused.

  I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080
  from outside the VM - and got the expected result (Connected to
  192.168.111.136. Escape character is '^]'.)

 So what's so special in Cassandra?


 On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com
 wrote:

 Listen address needs the actual address, not the interface.  This is best
 accomplished by setting up proper hostnames for each machine (through DNS
 or hosts file) and leaving listen_address blank, as it will pick the
 external ip.  Otherwise, you'll need to set the listen address to the IP of
 the machine you want on each machine.  I find the former to be less of a
 pain to manage.


 On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden 
 richard.t.snow...@gmail.com wrote:

 This did not work either. I changed /etc/cassandra.yaml and restarted 
 Cassandra (I even restarted the machine to make 100% sure).

 What I tried:

 1) listen_address: localhost
- connection OK (but of course I can't connect from outside the VM to 
 localhost)

 2) Set listen_interface: eth0
- connection refused

 3) Set listen_address: 192.168.111.136
- connection refused


 What to do?


  Try:
  $ netstat -lnt
  and see which interface port 9042 is listening on. You will likely need to
  update cassandra.yaml to change the interface. By default, Cassandra is
  listening on localhost so your local cqlsh session works.

  On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com
  wrote:

   I am running Cassandra 2.1.2 in an Ubuntu VM.
  
   cqlsh or cqlsh localhost works fine.
  
   But I can not connect from outside the VM (firewall, etc. disabled).
  
   Even when I do cqlsh 192.168.111.136 in my VM I get connection 
   refused.
   This is strange because when I check my network config I can see that
   192.168.111.136 is my IP:
  
   root@ubuntu:~# ifconfig
  
   eth0  Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
 inet addr:192.168.111.136  Bcast:192.168.111.255
   Mask:255.255.255.0
 inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
 TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)
  
   loLink encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 inet6 addr: ::1/128 Scope:Host
 UP LOOPBACK RUNNING  MTU:65536  Metric:1
 RX packets:550 errors:0 dropped:0 overruns:0 frame:0
 TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)
  
  
   root@ubuntu:~# cqlsh 192.168.111.136 9042
   Connection error: ('Unable to connect to any servers', 
   {'192.168.111.136':
   error(111, Tried connecting to [('192.168.111.136', 9042)]. Last error:
   Connection refused)})
  
  
   What to do?
  





Re: Cassandra 2.1.2 node stuck on joining the cluster

2014-12-08 Thread Omri Bahumi
Any chance you have something along the path that causes the
connectivity issues?
What's the network connectivity between this node and the other node?

Can you try transferring a big file between the two servers? Perhaps
you have an MTU issue that causes TCP PMTU discovery to fail.
Can you send large pings between the servers? Try pinging them from
both sides with large packets (5000, 1).

On Mon, Dec 8, 2014 at 3:22 PM, Krzysztof Zarzycki k.zarzy...@gmail.com wrote:
 Hi Cassandra users,

 I'm trying but failing to join a new (well, old, but wiped out/decommissioned)
 node to an existing cluster.

 Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
 start a third node with 2.1.2, it gets to joining state, it bootstraps, i.e.
 streams some data as shown by nodetool netstats, but after some time, it
 gets stuck. From that point nothing gets streamed, the new node stays in
 joining state. I restarted node multiple times, each time it streamed more
 data, but then got stuck again.

 Other facts:

 I don't see any errors in the log on any of the nodes.
 The connectivity seems fine, I can ping, netcat to port 7000 all ways.
 I have ~ 200 GB load per running node, replication 2, 16 tokens.
 Load of a new node got to around 300GBs now.

 The bootstrapping process stops in the middle of streaming some table,
 always after sending exactly 10MB of some SSTable, e.g.:

 $ nodetool netstats | grep -P -v 'bytes\(100'
 Mode: NORMAL
 Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
     /192.168.200.16
         Sending 516 files, 12493900 bytes total
             /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db
             10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
 Read Repair Statistics:
 Attempted: 2016371
 Mismatch (Blocking): 0
 Mismatch (Background): 168721
 Pool Name  Active  Pending  Completed
 Commands   n/a     0        55802918
 Responses  n/a     0        425963


 I'm trying to join this node for several days and I don't know what to do
 with it... I'll be grateful for any help!


 Cheers,

 Krzysztof Zarzycki




Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Sam Tunnicliffe
rpc_address (or rpc_interface) is used for client connections,
listen_address is for inter-node communication.
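In cassandra.yaml terms, for a VM that should be reachable from outside, that means something like the following (example values matching this thread's setup; adjust the addresses to your own network):

```yaml
# cassandra.yaml
listen_address: 192.168.111.136   # inter-node gossip/streaming traffic (port 7000)
rpc_address: 192.168.111.136      # client connections: cqlsh and drivers (native port 9042)
```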



On 8 December 2014 at 19:21, Richard Snowden richard.t.snow...@gmail.com
wrote:

 $ netstat -ntl | grep 9042
  tcp6   0  0   127.0.0.1:9042  :::*   LISTEN

 (listen_address not set in cassandra.yaml)

 Even with listen_address: 192.168.111.136 I get:
 $ netstat -ntl | grep 9042
  tcp6   0  0   127.0.0.1:9042  :::*   LISTEN


 All I want to do is to access Cassandra from outside my VM. Is this really
 that hard?



 On Mon, Dec 8, 2014 at 7:30 PM, Michael Dykman mdyk...@gmail.com wrote:

 The difference is what interface your service is listening on. What is
 the output of

 $ netstat -ntl | grep 9042


 On Mon, 8 Dec 2014 07:21 Richard Snowden richard.t.snow...@gmail.com
 wrote:

 I left listen_address blank - still I can't connect (connection refused).

 cqlsh - OK
 cqlsh ubuntu - fail (ubuntu is my hostname)
 cqlsh 192.168.111.136 - fail

 telnet 192.168.111.136 9042 from outside the VM gives me a connection
 refused.

  I just started a Tomcat in my VM and did a telnet 192.168.111.136 8080
  from outside the VM - and got the expected result (Connected to
  192.168.111.136. Escape character is '^]'.)

 So what's so special in Cassandra?


 On Mon, Dec 8, 2014 at 12:18 PM, Jonathan Haddad j...@jonhaddad.com
 wrote:

 Listen address needs the actual address, not the interface.  This is
 best accomplished by setting up proper hostnames for each machine (through
 DNS or hosts file) and leaving listen_address blank, as it will pick the
 external ip.  Otherwise, you'll need to set the listen address to the IP of
 the machine you want on each machine.  I find the former to be less of a
 pain to manage.


 On Mon Dec 08 2014 at 2:49:55 AM Richard Snowden 
 richard.t.snow...@gmail.com wrote:

 This did not work either. I changed /etc/cassandra.yaml and restarted 
 Cassandra (I even restarted the machine to make 100% sure).

 What I tried:

 1) listen_address: localhost
- connection OK (but of course I can't connect from outside the VM to 
 localhost)

 2) Set listen_interface: eth0
- connection refused

 3) Set listen_address: 192.168.111.136
- connection refused


 What to do?


  Try:
  $ netstat -lnt
  and see which interface port 9042 is listening on. You will likely need 
  to
  update cassandra.yaml to change the interface. By default, Cassandra is
  listening on localhost so your local cqlsh session works.

  On Sun, 7 Dec 2014 23:44 Richard Snowden richard.t.snow...@gmail.com
  wrote:

   I am running Cassandra 2.1.2 in an Ubuntu VM.
  
   cqlsh or cqlsh localhost works fine.
  
   But I can not connect from outside the VM (firewall, etc. disabled).
  
   Even when I do cqlsh 192.168.111.136 in my VM I get connection 
   refused.
   This is strange because when I check my network config I can see that
   192.168.111.136 is my IP:
  
   root@ubuntu:~# ifconfig
  
   eth0  Link encap:Ethernet  HWaddr 00:0c:29:02:e0:de
 inet addr:192.168.111.136  Bcast:192.168.111.255
   Mask:255.255.255.0
 inet6 addr: fe80::20c:29ff:fe02:e0de/64 Scope:Link
 UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
 RX packets:16042 errors:0 dropped:0 overruns:0 frame:0
 TX packets:8638 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:1000
 RX bytes:21307125 (21.3 MB)  TX bytes:709471 (709.4 KB)
  
   loLink encap:Local Loopback
 inet addr:127.0.0.1  Mask:255.0.0.0
 inet6 addr: ::1/128 Scope:Host
 UP LOOPBACK RUNNING  MTU:65536  Metric:1
 RX packets:550 errors:0 dropped:0 overruns:0 frame:0
 TX packets:550 errors:0 dropped:0 overruns:0 carrier:0
 collisions:0 txqueuelen:0
 RX bytes:148053 (148.0 KB)  TX bytes:148053 (148.0 KB)
  
  
   root@ubuntu:~# cqlsh 192.168.111.136 9042
   Connection error: ('Unable to connect to any servers', 
   {'192.168.111.136':
   error(111, Tried connecting to [('192.168.111.136', 9042)]. Last 
   error:
   Connection refused)})
  
  
   What to do?
  






Re: Can not connect with cqlsh to something different than localhost

2014-12-08 Thread Richard Snowden
Ah! That did the trick!

Thanks Sam!









Re: Keyspace and table/cf limits

2014-12-08 Thread Frank Hsueh
Has there been any recent discussion of multi-tenancy namespaces? I think
this would effectively solve the scenario -- a formalized partition key
that's enforced at the storage layer, similar to Oracle's Virtual Private
Database.

It was on the wiki from around August 2010:

http://wiki.apache.org/cassandra/MultiTenant

Namespaces - in a multi-tenant use case, each user might like to have a
keyspace XYZ for whatever reason. So it might be nice to have namespaces so
that keyspace XYZ could be specific to their user. Ideally this would be an
option that would not affect those that don't use namespaces.

   - The distinction from keyspaces is that a namespace would be completely
   transparent to the user: the existence of namespaces would not be exposed.
   It might be returned by the authentication backend on login, and prefixed
   to keyspaces transparently.
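
The transparent-prefixing idea in the proposal can be illustrated with a toy sketch (the function and naming scheme below are invented for illustration; Cassandra has no such API):

```python
def physical_keyspace(tenant_id: str, keyspace: str) -> str:
    """Map a tenant-visible keyspace name to a physical one.

    Per the wiki proposal, the tenant only ever sees 'keyspace'; the
    server would prepend a per-tenant namespace transparently, so two
    tenants can both create a keyspace named "XYZ" without colliding.
    """
    return f"{tenant_id}__{keyspace}"


# Both tenants ask for "XYZ" and get distinct physical keyspaces:
# physical_keyspace("tenant_a", "XYZ")  and  physical_keyspace("tenant_b", "XYZ")
```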



thanks !!!


On Sat, Dec 6, 2014 at 11:25 PM, Jason Wee peich...@gmail.com wrote:

 +1 well said Jack!

 On Sun, Dec 7, 2014 at 6:13 AM, Jack Krupansky j...@basetechnology.com
 wrote:

  Generally, limit a Cassandra cluster to low hundreds of tables,
 regardless of the number of keyspaces. Going beyond low hundreds is certainly
 an "expert" feature and requires great care. Sure, maybe you can have 500 or
 750 or maybe even 1,000 tables in a cluster, but don't be surprised if you
 start running into memory and performance issues.

 There is an undocumented method to reduce the table overhead to support
 more tables, but... if you are not expert enough to find it on your own,
 then you are definitely not expert enough to be using it.

 -- Jack Krupansky

  *From:* Raj N raj.cassan...@gmail.com
 *Sent:* Tuesday, November 25, 2014 12:07 PM
 *To:* user@cassandra.apache.org
 *Subject:* Keyspace and table/cf limits

  What's the latest on the maximum number of keyspaces and/or tables that
 one can have in Cassandra 2.1.x?

 -Raj





-- 
Frank Hsueh | frank.hs...@gmail.com


Cassandra Files Taking up Much More Space than CF

2014-12-08 Thread Nate Yoder
Hi All,

I am new to Cassandra so I apologise in advance if I have missed anything
obvious but this one currently has me stumped.

I am currently running a 6-node Cassandra 2.1.1 cluster on EC2 using
c3.2xlarge nodes, which overall is working very well for us.  However, after
letting it run for a while I seem to get into a situation where the amount
of disk space used far exceeds the total amount of data on each node, and I
haven't been able to get the size to go back down except by stopping and
restarting the node.

For example, almost all of my data is in one table.  On one of my nodes
right now the total space used (as reported by nodetool cfstats) is 57.2 GB
and there are no snapshots. However, when I look at the size of the data
files (using du), the data file for that table is 107 GB.  Because the
c3.2xlarge only has 160 GB of SSD, you can see why this quickly becomes a
problem.

Running nodetool compact didn't reduce the size and neither does running
nodetool repair -pr on the node.  I also tried nodetool flush and nodetool
cleanup (even though I have not added or removed any nodes recently) but it
didn't change anything either.  In order to keep my cluster up I then
stopped and started that node and the size of the data file dropped to 54GB
while the total column family size (as reported by nodetool) stayed about
the same.
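
One way to track this discrepancy over time is to compare the cfstats number against an on-disk walk. A minimal Python sketch of the du-style side (the path in the comment is hypothetical; use your own data directory):

```python
import os


def dir_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under path, roughly like `du -sb`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            # Skip symlinks so hard-linked snapshot paths aren't double-counted
            # via links; regular files are summed once per path encountered.
            if os.path.isfile(fp) and not os.path.islink(fp):
                total += os.path.getsize(fp)
    return total


# e.g. dir_size_bytes("/var/lib/cassandra/data/my_keyspace/my_table")
```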

Any suggestions as to what I could be doing wrong?

Thanks,
Nate