Re: Availability testing of Cassandra nodes

2015-04-09 Thread Jiri Horky
Hi Jack,

it seems there is some misunderstanding. There are two separate things. One
is that Cassandra works for the application, which may (and should) be true
even if some of the nodes are actually down. The other is that even in this
case you want to be notified that there are faulty Cassandra nodes.

Now I am trying to tackle the latter case; I am not having issues with
how client-side load balancing works.

Jirka H.

On 04/09/2015 07:15 AM, Ajay wrote:
 Adding Java driver forum.

 We would like to know more about this, too.

 -
 Ajay

 On Wed, Apr 8, 2015 at 8:15 PM, Jack Krupansky
 jack.krupan...@gmail.com wrote:

 Just a couple of quick comments:

 1. The driver is supposed to be doing availability and load
 balancing already.
 2. If your cluster is lightly loaded, it isn't necessary to be so
 precise with load balancing.
 3. If your cluster is heavily loaded, it won't help. The solution is
 to expand your cluster so that precise balancing of requests
 (beyond what the driver does) is not required.

 Is there anything special about your use case that you feel is
 worth the extra treatment?

 If you are having problems with the driver balancing requests and
 properly detecting available nodes, or see some room for
 improvement, make sure to report the issues so that they can be fixed.


 -- Jack Krupansky

 On Wed, Apr 8, 2015 at 10:31 AM, Jiri Horky ho...@avast.com wrote:

 Hi all,

 we are thinking about how best to proceed with availability testing of
 Cassandra nodes. It is becoming more and more apparent that it is a rather
 complex task. We thought that we should try to read from and write to each
 Cassandra node, using a monitoring keyspace and a unique value with a low
 TTL. This helps to find an issue, but it also triggers flapping of
 unaffected hosts, as the key of the value being inserted sometimes
 belongs to an affected host and sometimes not. Now, we could calculate
 the right value to insert so we can be sure it will hit the host we are
 connecting to, but then you have the replication factor and consistency
 level, so you cannot really be sure that it actually tests the ability
 of the given host to write values.
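
The flapping described above can be illustrated with a toy token ring. This is only a sketch under loud assumptions: it hashes keys with MD5 truncated to 32 bits (not Cassandra's actual partitioner), gives each node a single token (no vnodes), and places replicas on the next nodes clockwise. The node names, tokens, and the helpers `token_for` and `replicas_for` are hypothetical.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 32  # toy token space, not Cassandra's real token range


def token_for(key: str) -> int:
    """Map a key to a toy token (first 4 bytes of its MD5 digest)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def replicas_for(key, ring, rf):
    """Return the rf nodes holding `key`.

    `ring` is a list of (token, node) pairs sorted by token. The primary
    replica is the first node whose token is >= the key's token (wrapping
    around); the remaining replicas are its clockwise successors.
    """
    tokens = [token for token, _ in ring]
    start = bisect_left(tokens, token_for(key)) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node ring with evenly spaced tokens.
ring = [
    (0, "node-a"),
    (RING_SIZE // 4, "node-b"),
    (RING_SIZE // 2, "node-c"),
    (3 * RING_SIZE // 4, "node-d"),
]

# Unique probe keys land on different replica sets, so a probe sent
# through one coordinator may really exercise other hosts.
for i in range(5):
    key = f"probe-{i}"
    print(key, replicas_for(key, ring, rf=2))
```

Even when a probe key is chosen so that the contacted host is a replica, a consistency level lower than the replication factor lets the write succeed on the other replicas alone, which is the limitation noted above.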

 So we ended up thinking that the best approach is to connect to each
 individual host, read some system keyspace (which might be on a
 different disk drive...), which should be local, and then check several
 JMX values that could indicate an error, plus JVM statistics (full heap,
 GC overhead). Moreover, we will monitor our applications that use
 Cassandra (mostly with the DataStax driver) more closely and try to get
 failed-node information from them.
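
A minimal sketch of the "connect to one host and read something local" part, assuming the DataStax Python driver (`cassandra-driver`) rather than the Java driver discussed in this thread. Pinning the connection with `WhiteListRoundRobinPolicy` and reading `release_version` from `system.local` are illustrative choices, not something the thread prescribes. The function names are hypothetical, and the JMX/JVM checks are not shown.

```python
def _driver_query_local(host, timeout_s):
    """Read release_version from system.local on exactly one host.

    Assumes cassandra-driver is installed; it is imported lazily so the
    rest of this module can be used (and tested) without it.
    """
    from cassandra.cluster import Cluster
    from cassandra.policies import WhiteListRoundRobinPolicy

    cluster = Cluster(
        contact_points=[host],
        # Restrict the driver to this single host so it cannot fail over.
        load_balancing_policy=WhiteListRoundRobinPolicy([host]),
        connect_timeout=timeout_s,
    )
    try:
        session = cluster.connect()
        row = session.execute(
            "SELECT release_version FROM system.local", timeout=timeout_s
        ).one()
        return row.release_version
    finally:
        cluster.shutdown()


def probe_local_read(host, query_local=None, timeout_s=5.0):
    """Return (ok, detail) for a local read against a single host.

    `query_local` is an injectable callable (host, timeout_s) -> str,
    which makes the probe testable without a running cluster.
    """
    query = query_local or _driver_query_local
    try:
        version = query(host, timeout_s)
    except Exception as exc:  # any failure counts as "node unhealthy"
        return False, f"{type(exc).__name__}: {exc}"
    return True, f"release_version={version}"
```

A monitoring loop could call `probe_local_read(host)` for every host and alert on any `(False, detail)` result. Because `system.local` is stored on the node itself, the read does not depend on other replicas.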

 How do others do the testing?

 Jirka H.






RE: Availability testing of Cassandra nodes

2015-04-09 Thread SEAN_R_DURITY
I do two types of node monitoring. On each host, we have a process monitor 
looking for the cassandra process. If it goes down, it will get restarted (if a 
flag is set appropriately).

Secondly, from a remote host, I have an hourly check of all nodes where I 
essentially log in to each node and execute nodetool info. If that returns an 
error, then the node is probably “up,” but hung. (Or the flag above is not set 
properly and the host was bounced/patched, but cassandra did not start.) I 
email details to the support team to investigate.
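
The remote check described above could look roughly like the following sketch. It assumes passwordless `ssh` access to each node and `nodetool` on the remote PATH; the function names, the timeout, and the idea of returning a list for the e-mail step are illustrative assumptions, not the actual script.

```python
import subprocess


def check_node(host, run=subprocess.run, timeout_s=30):
    """Run `nodetool info` on `host` over ssh and return (host, ok, detail).

    `run` is injectable so the check can be exercised without real hosts.
    """
    cmd = ["ssh", host, "nodetool", "info"]
    try:
        result = run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # The process may be "up" but hung.
        return host, False, "nodetool info timed out"
    except OSError as exc:
        return host, False, f"could not run ssh: {exc}"
    if result.returncode != 0:
        detail = result.stderr.strip() or f"exit code {result.returncode}"
        return host, False, detail
    return host, True, "ok"


def failing_nodes(hosts, run=subprocess.run):
    """Return [(host, detail), ...] for every node whose check failed."""
    checks = (check_node(host, run) for host in hosts)
    return [(host, detail) for host, ok, detail in checks if not ok]
```

Scheduled hourly (for example from cron), the output of `failing_nodes([...])` could be mailed to the support team whenever it is non-empty.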


Sean Durity



Re: Availability testing of Cassandra nodes

2015-04-08 Thread Ajay
Adding Java driver forum.

We would like to know more about this, too.

-
Ajay


Re: Availability testing of Cassandra nodes

2015-04-08 Thread Jack Krupansky
Just a couple of quick comments:

1. The driver is supposed to be doing availability and load balancing
already.
2. If your cluster is lightly loaded, it isn't necessary to be so precise
with load balancing.
3. If your cluster is heavily loaded, it won't help. The solution is to expand
your cluster so that precise balancing of requests (beyond what the driver
does) is not required.

Is there anything special about your use case that you feel is worth the
extra treatment?

If you are having problems with the driver balancing requests and properly
detecting available nodes, or see some room for improvement, make sure to
report the issues so that they can be fixed.


-- Jack Krupansky