RE: Issue in internode encryption in cassandra

2016-08-03 Thread Bastien DINE
Hi Ashwini,

On all my nodes, I install the additional JCE policy files:
https://support.datastax.com/hc/en-us/articles/204226129-Receiving-error-Caused-by-java-lang-IllegalArgumentException-Cannot-support-TLS-RSA-WITH-AES-256-CBC-SHA-with-currently-installed-providers-on-DSE-startup-after-setting-up-client-to-node-encryption
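
Whether the policy files took effect on a node can be checked from Java itself. The following is a minimal sketch, not part of the original message: the class name JcePolicyCheck is hypothetical, and it relies only on the standard Cipher.getMaxAllowedKeyLength call, which reports 128 for AES under the restricted policy and Integer.MAX_VALUE under the unlimited one.

```java
import javax.crypto.Cipher;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: reports whether this JVM permits 256-bit AES keys.
class JcePolicyCheck {

    /** Returns true when the given maximum key length allows AES-256. */
    static boolean allowsAes256(int maxKeyLength) {
        return maxKeyLength >= 256;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Ask the JVM for the largest AES key size the installed policy permits.
        int max = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max allowed AES key length: " + max);
        System.out.println(allowsAes256(max)
                ? "AES-256 suites are permitted by the installed policy"
                : "AES-256 suites are blocked; the unlimited policy files are not active");
    }
}
```

It has to be run with the same JVM that starts Cassandra, since the policy files are installed per JRE.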

Then I generate one key / certificate pair on each node, export the public
part, store it in the truststore of the other nodes, and configure cassandra.yaml.
The DataStax documentation is pretty clear:
https://docs.datastax.com/en/cassandra/2.1/cassandra/security/secureSSLCertificates_t.html
https://docs.datastax.com/en/cassandra/2.1/cassandra/security/secureSSLNodeToNode_t.html

Hope this helps,
Regards,

From: Ashwini Mhatre (asmhatre) [mailto:asmha...@cisco.com]
Sent: Wednesday, 3 August 2016 12:25
To: user@cassandra.apache.org
Cc: Keshava H P (kehp); PRABHJOT KAUR (prabhkau)
Subject: Re: Issue in internode encryption in cassandra

Hi,
Does anyone have any hints regarding node-to-node encryption?


Regards,
Ashwini Mhatre

From: asmhatre
Reply-To: "user@cassandra.apache.org"
Date: Monday, 25 July 2016 at 4:15 PM
To: "user@cassandra.apache.org"
Subject: Issue in internode encryption in cassandra

I am using internode encryption in Cassandra. With a self-signed CA it works
fine, but with another product's CA I am getting this error: "Filtering out
TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA as it
isnt supported by the socket"
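
The message suggests that the configured cipher suites are not among those the JVM's SSL socket offers. One way to see what the JVM itself supports, independently of Cassandra, is sketched below. This is an illustration under assumptions: the class name CipherSuiteCheck is hypothetical, and it queries the JVM's default SSLContext, which may differ from the context Cassandra builds from its keystore settings.

```java
import javax.net.ssl.SSLContext;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: compares requested cipher suites with those the JVM supports.
class CipherSuiteCheck {

    /** Returns the requested suites missing from the supported list, in request order. */
    static List<String> unsupported(String[] requested, String[] supported) {
        Set<String> available = new HashSet<>(Arrays.asList(supported));
        List<String> missing = new ArrayList<>();
        for (String suite : requested) {
            if (!available.contains(suite)) {
                missing.add(suite);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // The two suites named in the error message.
        String[] requested = {
            "TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA",
            "TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA"
        };
        // Suites the default SSLContext of this JVM is able to offer.
        String[] supported =
                SSLContext.getDefault().getSupportedSSLParameters().getCipherSuites();
        System.out.println("Not supported by this JVM: " + unsupported(requested, supported));
    }
}
```

An empty list would indicate that the JVM knows both suites, in which case the cause more likely lies in the keystore or certificate setup than in the JVM.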


Re: Cassandra

2016-05-26 Thread bastien dine
Hi Alain,

Thanks for your response :)



> A replication factor of 3 for a 3 node cluster does not balance the load:
> since you ask for 3 copies of the data (rf=3) on 3 nodes cluster,
> each node will have a copy of the data and you are overloading all nodes.
> Maybe you should try with rf = 2 or add nodes to your cluster?
>

I agree that with RF=3 and 3 nodes, all nodes should have almost the same
load, and I'm fine with that; this is what I want.
My problem is that this isn't the case: one node gets a lot more CPU usage
than the others:
node 1 : load average = 17
node 2 : load average = 3
node 3 : load average = 3


Funny fact:
If I stop node 1, the two other nodes keep the same load as before.
So this does not seem to be a case of "load is high on node 1 because all
requests go to this node": requests are now spread only between nodes 2
and 3, and they are not overwhelmed the way node 1 was.
Instead, I suspect a hardware problem on node 1.



> "working" ? How can I list connections on a node ?
>>
>> For 3.x (I think also 2.x) you can trace requests at the query level
> with the enableTracing() method.
> Something like this (uncomment the line with .enableTracing()):
>
> session.execute( boundedInsertEventStatement.bind( aggregateId,
> aggregateType, eventType, payload )
> .setConsistencyLevel(ConsistencyLevel.ONE)
> //.enableTracing()
> );
> See the doc for other classes and tracing or consistency options,
> and have a look at nodetool settraceprobability if you cannot change the
> code.
>
> The queries and query plans appear in the system_traces.sessions and
> system_traces.events tables. It can be very verbose for query plans (events
> table), so maybe you should truncate the sessions and events tables before
> running your load (on 3.x tables are truncated on startup).
>
>

I'll take a look. This might be verbose because of the ops rate (4k per
second), but it should give me some information.

Thanks again



> Regards,
>> Bastien
>>
>>
> HTH,
> --
> best,
> Alain
>


Cassandra

2016-05-25 Thread bastien dine
Hi,

I'm running a 3-node Cassandra 2.1.x cluster. Each node has 8 vCPUs and 30
GB of RAM.
Replication factor = 3 for my keyspace.

Recently, I started using the Java Driver (within Storm) to read / write data,
and I've encountered a problem:

All of my cluster nodes are successfully discovered by the driver.

When putting a fairly heavy load on my cluster (1k reads & 3k writes per
second), one of my nodes gets heavily overwhelmed while the other nodes
are OK:
node 1 : load average = 17
node 2 : load average = 3
node 3 : load average = 3

RAM usage is not a problem at all.

On node 1, system.log contains a lot of StatusLogger output:

INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.range_xfers                0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.compactions_in_progress    0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.peers                      0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.schema_keyspaces           0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.schema_usertypes           0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.local                      0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.sstable_activity           632,27087
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.schema_columns             0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - system.batchlog                   0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - keyspace1.Counter3                0,0
INFO  [Service Thread] 2016-05-25 15:35:04,530 StatusLogger.java:115 - keyspace1.standard1               0,0
INFO  [Service Thread] 2016-05-25 15:35:04,531 StatusLogger.java:115 - keyspace1.counter1                0,0
INFO  [Service Thread] 2016-05-25 15:35:04,531 StatusLogger.java:115 - system_traces.sessions            0,0
INFO  [Service Thread] 2016-05-25 15:35:04,532 StatusLogger.java:115 - system_traces.events              0,0
INFO  [Service Thread] 2016-05-25 15:39:04,438 GCInspector.java:258 - ParNew GC in 432ms.  CMS Old Gen: 2035104888 -> 2040946040; Par Eden Space: 671088640 -> 0; Par Survivor Space: 83884256 -> 83872168
INFO  [Service Thread] 2016-05-25 15:39:04,438 StatusLogger.java:51 - Pool Name               Active   Pending   Completed   Blocked   All Time Blocked
INFO  [Service Thread] 2016-05-25 15:39:04,439 StatusLogger.java:66 - MutationStage                0         0    12598562         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,439 StatusLogger.java:66 - RequestResponseStage         0         0     9124551         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - ReadRepairStage              0         0      286466         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - CounterMutationStage         0         0           0         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - ReadStage                    0         0     3090180         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - MiscStage                    0         0           0         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - HintedHandoff                0         0          14         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - GossipStage                  0         0       99815         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - CacheCleanupExecutor         0         0           0         0                  0
INFO  [Service Thread] 2016-05-25 15:39:04,440 StatusLogger.java:66 - InternalResponseStage        0         0           0         0                  0

There are more GCInspector messages like this one:
INFO  [Service Thread] 2016-05-25 15:35:04,524 GCInspector.java:258 - ParNew GC in 266ms.  CMS Old Gen: 2029659880 -> 2035104888; Par Eden Space: 671088640 -> 0; Par Survivor Space: 83885104 -> 83884256
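
To compare GC pauses between nodes, the pause durations can be pulled out of such GCInspector entries. The following is a minimal sketch, assuming each log entry sits on a single line as it does in system.log itself. The class name GcPauseParser and the regular expression are illustrative and match only the 2.1-style format shown in this message.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts the pause duration from a GCInspector log entry.
class GcPauseParser {

    // Matches e.g. "GCInspector.java:258 - ParNew GC in 266ms" and captures the number.
    private static final Pattern GC_PAUSE =
            Pattern.compile("GCInspector\\.java:\\d+ - \\w+ GC in (\\d+)ms");

    /** Returns the pause in milliseconds, or empty if the line is not a GC pause entry. */
    static OptionalInt pauseMillis(String logLine) {
        Matcher m = GC_PAUSE.matcher(logLine);
        return m.find()
                ? OptionalInt.of(Integer.parseInt(m.group(1)))
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Sample entry copied from the message above, joined onto one line.
        String sample = "INFO  [Service Thread] 2016-05-25 15:35:04,524 GCInspector.java:258 - "
                + "ParNew GC in 266ms.  CMS Old Gen: 2029659880 -> 2035104888; "
                + "Par Eden Space: 671088640 -> 0; Par Survivor Space: 83885104 -> 83884256";
        System.out.println("Pause (ms): " + pauseMillis(sample));
    }
}
```

Running this over system.log on each node and comparing the totals would show whether node 1 spends noticeably more time in GC than nodes 2 and 3.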

All of my nodes are configured exactly the same way.

With the cassandra-stress tool, I was able to reach 40k to 75k operations per
second without trouble.

Can someone help me debug this problem?

Is there a problem with the Java Driver? Is the load balancing not
"working"? How can I list connections on a node?

Regards,
Bastien