Cassandra 3.11 fails to start with JDK8u162

2018-01-17 Thread Steinmaurer, Thomas
Hello,

after switching from JDK8u152 to JDK8u162, Cassandra fails with the following 
stack trace upon startup.

ERROR [main] 2018-01-18 07:33:18,804 CassandraDaemon.java:706 - Exception encountered during startup
java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]
at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.2-SNAPSHOT.jar:3.11.2-SNAPSHOT]

Is this a known issue?


Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313


Re: New token allocation and adding a new DC

2018-01-17 Thread kurt greaves
Didn't know that about auto_bootstrap and the algorithm. We should probably
fix that. Can you create a JIRA for that issue? Workaround for #2 would be
to truncate system.available_ranges after "bootstrap".
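For illustration, a minimal sketch of that workaround (assuming the node already
joined the new DC with auto_bootstrap=true; "existing_dc" is a made-up placeholder
for the source DC name, and exact steps may vary by version):

    # On the node whose rebuild is a no-op: forget which ranges the node
    # believes it has already received, then stream them again.
    cqlsh -e "TRUNCATE system.available_ranges;"
    nodetool rebuild -- existing_dc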

On 17 January 2018 at 17:26, Oleksandr Shulgin  wrote:

> On Wed, Jan 17, 2018 at 4:21 AM, kurt greaves 
> wrote:
>
>> I believe you are able to get away with just altering the keyspace to
>> include both DC's even before the DC exists, and then adding your nodes to
>> that new DC using the algorithm. Note you'll probably want to take the
>> opportunity to reduce the number of vnodes to something reasonable. Based
>> off memory from previous testing you can get a good token balance with 16
>> vnodes if you have at least 6 nodes per rack (with RF=3 and 3 racks).
>>
>
> Alexander, Kurt,
>
> Thank you for the suggestions.
>
> None of them did work in the end, unfortunately:
>
> 1. Using auto_bootstrap=false always results in random token allocation,
> ignoring the allocate_tokens_for_keyspace option.
>
> The token allocation option is only considered if shouldBootstrap()
> returns true:
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L790
>   if (shouldBootstrap()) {
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L842
>   BootStrapper.getBootstrapTokens()  (the only place in code using the
> token allocation option)
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L901
>   else { ...
>
> 2. Using auto_bootstrap=true and allocate_tokens_for_keyspace=data_ks
> gives us balanced range ownership on the new empty DC.  The problem though,
> is that rebuilding of an already bootstrapped node doesn't work: the node
> believes that it already has all the data.
>
> We are going to proceed by manually assigning a small number of tokens to
> the nodes in new DC with auto_bootstrap=false and only use the automatic
> token allocation when we need to scale it out.  This seems to be the only
> supported way to use it anyway.
>
> Regards,
> --
> Alex
>
>


Re: question about nodetool decommission

2018-01-17 Thread Jon Haddad
For what it’s worth, it’s going to be a lot faster to rsync the data to a new 
node and replace the old one than to decommission and bootstrap.

> On Jan 17, 2018, at 3:20 PM, Jerome Basa  wrote:
> 
>> What C* version you are working with?
> 3.0.14
> 
>> What is the reason you're decommissioning the node? Any issues with it?
> upgrading instances.
> 
>> Pending tasks --- you mean output of 'nodetool tpstats'?
> pending tasks when i run `nodetool compactionstats`
> 
> 
> eventually it started streaming data to other nodes and failed because
> of a corrupted sstable which i then moved and run decommission again
> (so right now it’s compacting again). how do i decommission without
> compacting? thanks
> 
> regards,
> -jerome
> 
> 
> 
> 
> 
> On January 17, 2018 at 12:38:51 PM, Kyrylo Lebediev
> (kyrylo_lebed...@epam.com) wrote:
> 
>> Hi Jerome,
>> 
>> I don't know reason for this, but compactions run during 'nodetool 
>> decommission'.
>> 
>> What C* version you are working with?
>> 
>> What is the reason you're decommissioning the node? Any issues with it?
>> 
>> Can you see any errors/warnings in system.log on the node being 
>> decommissioned?
>> 
>> Pending tasks --- you mean output of 'nodetool tpstats'?
>> 
>> Could you please send output of 'nodetool netstats' from the node you're 
>> trying to evict from the cluster.
>> 
>> Regards,
>> 
>> Kyrill
>> 
>> From: Jerome Basa
>> Sent: Wednesday, January 17, 2018 6:56:10 PM
>> To: user@cassandra.apache.org 
>> Subject: question about nodetool decommission
>> 
>> 
>> hi,
>> 
>> am currently decommissioning a node and monitoring it using `nodetool
>> netstats`. i’ve noticed that it hasn’t started streaming any data and
>> it’s doing compaction (like 600+ pending tasks). the node is marked as
>> “UL” when i run `nodetool status`.
>> 
>> has anyone seen like this before? am thinking of stopping C* and run
>> `nodetool removenode`. also, can i add a new node to the cluster while
>> one node is marked as “UL” (decommissioning)? thanks
>> 
>> regards,
>> -jerome
>> 
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
>> 
>> For additional commands, e-mail: user-h...@cassandra.apache.org 
>> 
>> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org 
> 
> For additional commands, e-mail: user-h...@cassandra.apache.org 
> 


Re: question about nodetool decommission

2018-01-17 Thread Jerome Basa
> What C* version you are working with?
3.0.14

> What is the reason you're decommissioning the node? Any issues with it?
upgrading instances.

> Pending tasks --- you mean output of 'nodetool tpstats'?
pending tasks when i run `nodetool compactionstats`


eventually it started streaming data to other nodes and failed because
of a corrupted sstable which i then moved and run decommission again
(so right now it’s compacting again). how do i decommission without
compacting? thanks

regards,
-jerome





On January 17, 2018 at 12:38:51 PM, Kyrylo Lebediev
(kyrylo_lebed...@epam.com) wrote:

>
> Hi Jerome,
>
> I don't know reason for this, but compactions run during 'nodetool 
> decommission'.
>
> What C* version you are working with?
>
> What is the reason you're decommissioning the node? Any issues with it?
>
> Can you see any errors/warnings in system.log on the node being 
> decommissioned?
>
> Pending tasks --- you mean output of 'nodetool tpstats'?
>
> Could you please send output of 'nodetool netstats' from the node you're 
> trying to evict from the cluster.
>
> Regards,
>
> Kyrill
>
> From: Jerome Basa
> Sent: Wednesday, January 17, 2018 6:56:10 PM
> To: user@cassandra.apache.org
> Subject: question about nodetool decommission
>
>
> hi,
>
> am currently decommissioning a node and monitoring it using `nodetool
> netstats`. i’ve noticed that it hasn’t started streaming any data and
> it’s doing compaction (like 600+ pending tasks). the node is marked as
> “UL” when i run `nodetool status`.
>
> has anyone seen like this before? am thinking of stopping C* and run
> `nodetool removenode`. also, can i add a new node to the cluster while
> one node is marked as “UL” (decommissioning)? thanks
>
> regards,
> -jerome
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: vnodes: high availability

2018-01-17 Thread Jon Haddad
I *strongly* recommend disabling dynamic snitch.  I’ve seen it make latency 
jump 10x.  

dynamic_snitch: false is your friend.



> On Jan 17, 2018, at 2:00 PM, Kyrylo Lebediev  wrote:
> 
> Avi, 
> If we prefer to have better balancing [like absence of hotspots during a node 
> down event etc], large number of vnodes is a good solution.
> Personally, I wouldn't prefer any balancing over overall resiliency  (and in 
> case of non-optimal setup, larger number of nodes in a cluster decreases 
> overall resiliency, as far as I understand.) 
> 
> Talking about hotspots, there is a number of features helping to mitigate the 
> issue, for example:
>   - dynamic snitch [if a node overloaded it won't be queried]
>   - throttling of streaming operations
> 
> Thanks, 
> Kyrill
> 
> From: Avi Kivity 
> Sent: Wednesday, January 17, 2018 2:50 PM
> To: user@cassandra.apache.org; kurt greaves
> Subject: Re: vnodes: high availability
>  
> On the flip side, a large number of vnodes is also beneficial. For example, 
> if you add a node to a 20-node cluster with many vnodes, each existing node 
> will contribute 5% of the data towards the new node, and all nodes will 
> participate in streaming (meaning the impact on any single node will be 
> limited, and completion time will be faster).
> 
> With a low number of vnodes, only a few nodes participate in streaming, which 
> means that the cluster is left unbalanced and the impact on each streaming 
> node is greater (or that completion time is slower).
> 
> Similarly, with a high number of vnodes, if a node is down its work is 
> distributed equally among all nodes. With a low number of vnodes the cluster 
> becomes unbalanced.
> 
> Overall I recommend high vnode count, and to limit the impact of failures in 
> other ways (smaller number of large nodes vs. larger number of small nodes).
> 
> btw, rack-aware topology improves the multi-failure problem but at the cost 
> of causing imbalance during maintenance operations. I recommend using 
> rack-aware topology only if you really have racks with 
> single-points-of-failure, not for other reasons.
> 
> On 01/17/2018 05:43 AM, kurt greaves wrote:
>> Even with a low amount of vnodes you're asking for a bad time. Even if you 
>> managed to get down to 2 vnodes per node, you're still likely to include 
>> double the amount of nodes in any streaming/repair operation which will 
>> likely be very problematic for incremental repairs, and you still won't be 
>> able to easily reason about which nodes are responsible for which token 
>> ranges. It's still quite likely that a loss of 2 nodes would mean some 
>> portion of the ring is down (at QUORUM). At the moment I'd say steer clear 
>> of vnodes and use single tokens if you can; a lot of work still needs to be 
>> done to ensure smooth operation of C* while using vnodes, and they are much 
>> more difficult to reason about (which is probably the reason no one has 
>> bothered to do the math). If you're really keen on the math your best bet is 
>> to do it yourself, because it's not a point of interest for many C* devs 
>> plus probably a lot of us wouldn't remember enough math to know how to 
>> approach it.
>> 
>> If you want to get out of this situation you'll need to do a DC migration to 
>> a new DC with a better configuration of snitch/replication 
>> strategy/racks/tokens.
>> 
>> 
>> On 16 January 2018 at 21:54, Kyrylo Lebediev > > wrote:
>> Thank you for this valuable info, Jon.
>> I guess both you and Alex are referring to improved vnodes allocation method 
>>  https://issues.apache.org/jira/browse/CASSANDRA-7032 
>>  which was implemented 
>> in 3.0.
>> Based on your info and comments in the ticket it's really a bad idea to have 
>> small number of vnodes for the versions using old allocation method because 
>> of hot-spots, so it's not an option for my particular case (v.2.1) :( 
>> 
>> [As far as I can see from the source code this new method wasn't backported 
>> to 2.1.]
>> 
>> 
>> Regards, 
>> Kyrill
>> 
>> From: Jon Haddad > > on behalf of Jon Haddad 
>> >
>> Sent: Tuesday, January 16, 2018 8:21:33 PM
>> 
>> To: user@cassandra.apache.org 
>> Subject: Re: vnodes: high availability
>>  
>> We’ve used 32 tokens pre 3.0.  It’s been a mixed result due to the 
>> randomness.  There’s going to be some imbalance, the amount of imbalance 

Re: vnodes: high availability

2018-01-17 Thread Kyrylo Lebediev
Avi,

If we prefer to have better balancing [like absence of hotspots during a node 
down event etc], large number of vnodes is a good solution.

Personally, I wouldn't prefer any balancing over overall resiliency  (and in 
case of non-optimal setup, larger number of nodes in a cluster decreases 
overall resiliency, as far as I understand.)


Talking about hotspots, there is a number of features helping to mitigate the 
issue, for example:

  - dynamic snitch [if a node overloaded it won't be queried]

  - throttling of streaming operations

Thanks,
Kyrill


From: Avi Kivity 
Sent: Wednesday, January 17, 2018 2:50 PM
To: user@cassandra.apache.org; kurt greaves
Subject: Re: vnodes: high availability


On the flip side, a large number of vnodes is also beneficial. For example, if 
you add a node to a 20-node cluster with many vnodes, each existing node will 
contribute 5% of the data towards the new node, and all nodes will participate 
in streaming (meaning the impact on any single node will be limited, and 
completion time will be faster).


With a low number of vnodes, only a few nodes participate in streaming, which 
means that the cluster is left unbalanced and the impact on each streaming node 
is greater (or that completion time is slower).


Similarly, with a high number of vnodes, if a node is down its work is 
distributed equally among all nodes. With a low number of vnodes the cluster 
becomes unbalanced.


Overall I recommend high vnode count, and to limit the impact of failures in 
other ways (smaller number of large nodes vs. larger number of small nodes).


btw, rack-aware topology improves the multi-failure problem but at the cost of 
causing imbalance during maintenance operations. I recommend using rack-aware 
topology only if you really have racks with single-points-of-failure, not for 
other reasons.

On 01/17/2018 05:43 AM, kurt greaves wrote:
Even with a low amount of vnodes you're asking for a bad time. Even if you 
managed to get down to 2 vnodes per node, you're still likely to include double 
the amount of nodes in any streaming/repair operation which will likely be very 
problematic for incremental repairs, and you still won't be able to easily 
reason about which nodes are responsible for which token ranges. It's still 
quite likely that a loss of 2 nodes would mean some portion of the ring is down 
(at QUORUM). At the moment I'd say steer clear of vnodes and use single tokens 
if you can; a lot of work still needs to be done to ensure smooth operation of 
C* while using vnodes, and they are much more difficult to reason about (which 
is probably the reason no one has bothered to do the math). If you're really 
keen on the math your best bet is to do it yourself, because it's not a point 
of interest for many C* devs plus probably a lot of us wouldn't remember enough 
math to know how to approach it.

If you want to get out of this situation you'll need to do a DC migration to a 
new DC with a better configuration of snitch/replication strategy/racks/tokens.


On 16 January 2018 at 21:54, Kyrylo Lebediev 
> wrote:

Thank you for this valuable info, Jon.
I guess both you and Alex are referring to improved vnodes allocation method  
https://issues.apache.org/jira/browse/CASSANDRA-7032 which was implemented in 
3.0.

Based on your info and comments in the ticket it's really a bad idea to have 
small number of vnodes for the versions using old allocation method because of 
hot-spots, so it's not an option for my particular case (v.2.1) :(

[As far as I can see from the source code this new method wasn't backported to 
2.1.]



Regards,

Kyrill




From: Jon Haddad > 
on behalf of Jon Haddad >
Sent: Tuesday, January 16, 2018 8:21:33 PM

To: user@cassandra.apache.org
Subject: Re: vnodes: high availability

We’ve used 32 tokens pre 3.0.  It’s been a mixed result due to the randomness.  
There’s going to be some imbalance, the amount of imbalance depends on luck, 
unfortunately.

I’m interested to hear your results using 4 tokens, would you mind letting the 
ML know your experience when you’ve done it?

Jon

On Jan 16, 2018, at 9:40 AM, Kyrylo Lebediev 
> wrote:

Agree with you, Jon.
Actually, this cluster was configured by my 'predecessor' and [fortunately for 
him] we've never met :)
We're using version 2.1.15 and can't upgrade 

Re: vnodes: high availability

2018-01-17 Thread Kyrylo Lebediev
Kurt, thanks for your recommendations.

Makes sense.


Yes, we're planning to migrate the cluster and change endpoint-snitch to 
"AZ-aware" one.

Unfortunately, I'm 'not good enough' in math, have to think of how to calculate 
probabilities for the case of vnodes (whereas the case "without vnodes" should 
be easy to calculate: just a bit of combinatorics). Not an easy task for me, 
but will try to get at least some estimations.

Still, I believe that with formulas (the results of doing the math) we could come up 
with 'better' best practices than those currently stated in the C* documentation.


--

In particular, as far as I understand, the probability of losing a keyrange [for 
CL=QUORUM] for a cluster with vnodes=256 and SimpleSnitch and a total number of 
physical nodes not much more than 256 [not a lot of such large clusters..] is:

P1 = C(Nnodes, 2) * p^2 = Nnodes*(Nnodes-1)/2 * p^2


where:

p - failure probability of a single server,

C(Nnodes, 2) - the number of ways to choose any 2 of the Nnodes  
[https://en.wikipedia.org/wiki/Combination]


Whereas the probability of losing a keyrange for a non-vnode cluster is:

P2 = 2*Nnodes*p^2


So the 'good old' non-vnodes cluster is more reliable than the 'new-style' vnodes 
cluster.

Correct?
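
For a rough feel for the difference, here is a back-of-the-envelope check of the
two formulas (N=100 nodes and p=0.001 are purely illustrative assumptions, not
measurements):

    N=100; p=0.001
    awk -v n="$N" -v p="$p" 'BEGIN {
      p1 = n * (n - 1) / 2 * p * p   # vnodes=256: C(N,2) * p^2
      p2 = 2 * n * p * p             # single tokens: 2 * N * p^2
      printf "P1 (vnodes)        = %.6f\n", p1
      printf "P2 (single tokens) = %.6f\n", p2
      printf "P1/P2              = %.2f\n", p1 / p2   # equals (N-1)/4
    }'

With these numbers P1 is roughly 0.005 and P2 roughly 0.0002, i.e. under these
assumptions the 256-vnode cluster is about (N-1)/4 ~ 25 times more likely to lose
a quorum for some range.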


Would like to get similar results for more realistic cases.  Will be back here 
once I get them (hoping to get)


Regards,

Kyrill









From: kurt greaves 
Sent: Wednesday, January 17, 2018 5:43:06 AM
To: User
Subject: Re: vnodes: high availability

Even with a low amount of vnodes you're asking for a bad time. Even if you 
managed to get down to 2 vnodes per node, you're still likely to include double 
the amount of nodes in any streaming/repair operation which will likely be very 
problematic for incremental repairs, and you still won't be able to easily 
reason about which nodes are responsible for which token ranges. It's still 
quite likely that a loss of 2 nodes would mean some portion of the ring is down 
(at QUORUM). At the moment I'd say steer clear of vnodes and use single tokens 
if you can; a lot of work still needs to be done to ensure smooth operation of 
C* while using vnodes, and they are much more difficult to reason about (which 
is probably the reason no one has bothered to do the math). If you're really 
keen on the math your best bet is to do it yourself, because it's not a point 
of interest for many C* devs plus probably a lot of us wouldn't remember enough 
math to know how to approach it.

If you want to get out of this situation you'll need to do a DC migration to a 
new DC with a better configuration of snitch/replication strategy/racks/tokens.


On 16 January 2018 at 21:54, Kyrylo Lebediev 
> wrote:

Thank you for this valuable info, Jon.
I guess both you and Alex are referring to improved vnodes allocation method  
https://issues.apache.org/jira/browse/CASSANDRA-7032 which was implemented in 
3.0.

Based on your info and comments in the ticket it's really a bad idea to have 
small number of vnodes for the versions using old allocation method because of 
hot-spots, so it's not an option for my particular case (v.2.1) :(

[As far as I can see from the source code this new method wasn't backported to 
2.1.]



Regards,

Kyrill




From: Jon Haddad > 
on behalf of Jon Haddad >
Sent: Tuesday, January 16, 2018 8:21:33 PM

To: user@cassandra.apache.org
Subject: Re: vnodes: high availability

We’ve used 32 tokens pre 3.0.  It’s been a mixed result due to the randomness.  
There’s going to be some imbalance, the amount of imbalance depends on luck, 
unfortunately.

I’m interested to hear your results using 4 tokens, would you mind letting the 
ML know your experience when you’ve done it?

Jon

On Jan 16, 2018, at 9:40 AM, Kyrylo Lebediev 
> wrote:

Agree with you, Jon.
Actually, this cluster was configured by my 'predecessor' and [fortunately for 
him] we've never met :)
We're using version 2.1.15 and can't upgrade because of legacy Netflix Astyanax 
client used.

Below in the thread Alex mentioned that it's recommended to set vnodes to a 
value lower than 256 only for C* version > 3.0 (token allocation 

Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Nate McCall
>
> We use Oracle jdk1.8.0_152 on all nodes and as I understand oracle use a
> dot in the protocol name (TLSv1.2) and I use the same protocol name and
> cipher names in the 3.0.14 nodes and the one I try to upgrade to 3.11.1.
>

I agree with Stefan's assessment and share his confusion. Would you be
willing to add the following to the startup options with the explicitly
configured "TLSv1.2" and post the results?
-Djavax.net.debug=ssl

That should provide additional detail on the SSL handshake.
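
For example, one minimal way to add that flag (the path is illustrative; on 3.11
it can go in jvm.options, on older setups in JVM_OPTS in cassandra-env.sh):

    echo '-Djavax.net.debug=ssl' | sudo tee -a /etc/cassandra/jvm.options
    sudo service cassandra restart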


Re: question about nodetool decommission

2018-01-17 Thread Kyrylo Lebediev
Hi Jerome,

I don't know reason for this, but compactions run during  'nodetool 
decommission'.


What C* version you are working with?

What is the reason you're decommissioning the node? Any issues with it?

Can you see any errors/warnings in system.log on the node being decommissioned?

Pending tasks --- you mean output of 'nodetool tpstats'?

Could you please send output of 'nodetool netstats' from the node you're trying 
to evict from the cluster.


Regards,

Kyrill



From: Jerome Basa 
Sent: Wednesday, January 17, 2018 6:56:10 PM
To: user@cassandra.apache.org
Subject: question about nodetool decommission

hi,

am currently decommissioning a node and monitoring it using `nodetool
netstats`. i’ve noticed that it hasn’t started streaming any data and
it’s doing compaction (like 600+ pending tasks). the node is marked as
“UL” when i run `nodetool status`.

has anyone seen like this before? am thinking of stopping C* and run
`nodetool removenode`. also, can i add a new node to the cluster while
one node is marked as “UL” (decommissioning)? thanks

regards,
-jerome

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: New token allocation and adding a new DC

2018-01-17 Thread Alexander Dejanovski
Well, that's a shame...

That part of the code has been changed in trunk and now it uses
BootStrapper.getBootstrapTokens() instead of getRandomToken() when auto
bootstrap is disabled:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L938

I was hoping this would already be the case in 3.0.x/3.11.x :(
Maybe that change should be backported to 3.11.x ?

It doesn't seem like a big change actually (I can be wrong though,
Cassandra is a complex beast...) and your use case doesn't seem to be that
exotic.
One would expect that a new DC can be created with balanced ownership,
which is obviously not the case.


On Wed, Jan 17, 2018 at 6:27 PM Oleksandr Shulgin <
oleksandr.shul...@zalando.de> wrote:

> On Wed, Jan 17, 2018 at 4:21 AM, kurt greaves 
> wrote:
>
>> I believe you are able to get away with just altering the keyspace to
>> include both DC's even before the DC exists, and then adding your nodes to
>> that new DC using the algorithm. Note you'll probably want to take the
>> opportunity to reduce the number of vnodes to something reasonable. Based
>> off memory from previous testing you can get a good token balance with 16
>> vnodes if you have at least 6 nodes per rack (with RF=3 and 3 racks).
>>
>
> Alexander, Kurt,
>
> Thank you for the suggestions.
>
> None of them did work in the end, unfortunately:
>
> 1. Using auto_bootstrap=false always results in random token allocation,
> ignoring the allocate_tokens_for_keyspace option.
>
> The token allocation option is only considered if shouldBootstrap()
> returns true:
>
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L790
> if (shouldBootstrap()) {
>
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L842
>   BootStrapper.getBootstrapTokens()  (the only place in code using the
> token allocation option)
>
> https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L901
> else { ...
>
> 2. Using auto_bootstrap=true and allocate_tokens_for_keyspace=data_ks
> gives us balanced range ownership on the new empty DC.  The problem though,
> is that rebuilding of an already bootstrapped node doesn't work: the node
> believes that it already has all the data.
>
> We are going to proceed by manually assigning a small number of tokens to
> the nodes in new DC with auto_bootstrap=false and only use the automatic
> token allocation when we need to scale it out.  This seems to be the only
> supported way to use it anyway.
>
> Regards,
> --
> Alex
>
>

-- 
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com


High read rate on hard-disk

2018-01-17 Thread Octavian Rinciog
Hello!

I am using Cassandra 3.10, on Ubuntu 14.04 and I have a counter
table(RF=1), with the following schema:

CREATE TABLE edges (
src_id text,
src_type text,
source text,
weight counter,
PRIMARY KEY ((src_id, src_type), source)
) WITH
   compaction = {'class':
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
'max_threshold': '32', 'min_threshold': '4'}

SELECT vs UPDATE requests ratio is 0.001. ( Read Count: 3771000, Write
Count: 3401236000, in one month)

We have Counter Cache enabled:

Counter Cache  : entries 1018782, size 256 MiB, capacity 256
MiB, 2799913189 hits, 3469459479 requests, 0.807 recent hit rate, 7200
save period in seconds

The problem is that the read rate on our hard-disk is always
near 30 MB/s, while the write rate is only near 500 KB/s.

One example of output of "iostat -x" is

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sdb               0.06     1.04  263.65    2.04 28832.42   572.53   146.07     0.36    1.35    0.74   81.16   1.27  33.81

Also, with iotop we saw that there are about 8 threads, each reading at around
3 MB/s.

Total DISK READ :  22.73 M/s | Total DISK WRITE : 494.35 K/s
Actual DISK READ:  22.62 M/s | Actual DISK WRITE: 528.57 K/s
  TID  PRIO  USERDISK READ>  DISK WRITE  SWAPIN  IOCOMMAND
14793 be/4 cassandra 3.061 M/s0.0010 B/s  0.00 % 93.27 % java
-Dcassandra.fd_max_interval_ms=400

The output of strace on these threads is :

strace -cp 14793
Process 14793 attached
^CProcess 14793 detached
% time seconds  usecs/call callserrors syscall
-- --- --- - - 
 99.85   32.118518  57567288256251 futex
  0.150.048822   3 15339   write
  0.000.00   0 1   rt_sigreturn
-- --- --- - - 
100.00   32.167340582628256251 total


Despite iotop showing that this thread is reading at 3 MB/s, there
is no read syscall in strace.

I want to ask whether the futex is actually responsible for the read rate,
and how we can debug this problem further?

Btw, there are no compaction tasks in progress and there are no SELECT
queries in progress.

Also, I know that for each update, a lock is obtained[1]

Thank you,

[1]https://apache.googlesource.com/cassandra/+/refs/heads/trunk/src/java/org/apache/cassandra/db/CounterMutation.java#121
-- 
Octavian Rinciog

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: New token allocation and adding a new DC

2018-01-17 Thread Oleksandr Shulgin
On Wed, Jan 17, 2018 at 4:21 AM, kurt greaves  wrote:

> I believe you are able to get away with just altering the keyspace to
> include both DC's even before the DC exists, and then adding your nodes to
> that new DC using the algorithm. Note you'll probably want to take the
> opportunity to reduce the number of vnodes to something reasonable. Based
> off memory from previous testing you can get a good token balance with 16
> vnodes if you have at least 6 nodes per rack (with RF=3 and 3 racks).
>

Alexander, Kurt,

Thank you for the suggestions.

None of them did work in the end, unfortunately:

1. Using auto_bootstrap=false always results in random token allocation,
ignoring the allocate_tokens_for_keyspace option.

The token allocation option is only considered if shouldBootstrap() returns
true:
https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L790
if (shouldBootstrap()) {
https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L842
  BootStrapper.getBootstrapTokens()  (the only place in code using the
token allocation option)
https://github.com/apache/cassandra/blob/cassandra-3.0.15/src/java/org/apache/cassandra/service/StorageService.java#L901
else { ...

2. Using auto_bootstrap=true and allocate_tokens_for_keyspace=data_ks gives
us balanced range ownership on the new empty DC.  The problem though, is
that rebuilding of an already bootstrapped node doesn't work: the node
believes that it already has all the data.

We are going to proceed by manually assigning a small number of tokens to
the nodes in new DC with auto_bootstrap=false and only use the automatic
token allocation when we need to scale it out.  This seems to be the only
supported way to use it anyway.
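
For illustration, here is the kind of per-node configuration that plan implies
(a hypothetical sketch only: the token values are made up, each node in the new
DC needs its own distinct, evenly spread set, and the DC name is a placeholder):

    # in cassandra.yaml on one node of the new DC (hypothetical values):
    #   auto_bootstrap: false
    #   num_tokens: 4
    #   initial_token: -9223372036854775808,-4611686018427387904,0,4611686018427387904
    #
    # once every node of the new DC is up, stream the data into each of them
    # ("existing_dc" stands in for the name of the source DC):
    nodetool rebuild -- existing_dc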

Regards,
--
Alex


question about nodetool decommission

2018-01-17 Thread Jerome Basa
hi,

am currently decommissioning a node and monitoring it using `nodetool
netstats`. i’ve noticed that it hasn’t started streaming any data and
it’s doing compaction (like 600+ pending tasks). the node is marked as
“UL” when i run `nodetool status`.

has anyone seen like this before? am thinking of stopping C* and run
`nodetool removenode`. also, can i add a new node to the cluster while
one node is marked as “UL” (decommissioning)? thanks

regards,
-jerome

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



RE: [EXTERNAL] Re: Even after the drop table, the data actually was not erased.

2018-01-17 Thread Durity, Sean R
We have found it very useful to set up an infrastructure where we can execute a 
nodetool command (or any other arbitrary command) from a single (non-Cassandra) 
host that will get executed on each node across the cluster (or a list of 
nodes).
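
As an aside, a minimal sketch of that idea (it assumes passwordless SSH from the
admin host and a hosts.txt file listing one node per line; both are assumptions,
not part of the setup described above):

    while read -r host; do
      echo "=== ${host} ==="                     # show which node we're on
      ssh "${host}" 'nodetool clearsnapshot'     # or any other command
    done < hosts.txt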


Sean Durity

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Monday, January 15, 2018 1:19 PM
To: user cassandra.apache.org 
Subject: [EXTERNAL] Re: Even after the drop table, the data actually was not 
erased.

As you said, the auto_bootstrap setting was turned on.

Well I was talking about the 'auto_snapshot' ;-). I understand that's what you 
meant to say.

This command seems to apply only to one node. Can it be applied cluster-wide? 
Or should I run this command on each node?

Indeed, 'nodetool clearsnapshot' is only for the node where you run the 
command, like most of the nodetool commands (repair is a bit specific).

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2018-01-15 1:56 GMT+00:00 Eunsu Kim 
>:
Thank you for your response.  As you said, the auto_bootstrap setting was 
turned on.
The actual data was deleted with the 'nodetool clearsnapshot' command.
This command seems to apply only to one node. Can it be applied cluster-wide? 
Or should I run this command on each node?




On 12 Jan 2018, at 8:10 PM, Alain RODRIGUEZ 
> wrote:

Hello,

However, the actual size of the data directory did not decrease at all. Disk 
Load monitored by JMX has been decreased.

This sounds like 'auto_snapshot' is enabled. This option will trigger a 
snapshot before any table drop / truncate to prevent user mistakes mostly. Then 
the data is removed but as it is still referenced by the snapshot (hard link), 
space cannot be freed.

Running 'nodetool clearsnapshot' should help reducing the dataset size in this 
situation.


The client fails to establish a connection and I see the following exceptions 
in the Cassandra logs.
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
cfId…

This does not look like a failed connection to me but rather a try to query 
some inexistent data. If that's the data you just deleted (keyspace / table), 
this is expected. If not there is an other issue, I hope not related to the 
delete in this case...

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com



2018-01-12 7:14 GMT+00:00 Eunsu Kim 
>:
hi everyone

On the development server, I dropped all the tables and even keyspace dropped 
to change the table schema.
Then I created the keyspace and the table.

However, the actual size of the data directory did not decrease at all. Disk 
Load monitored by JMX has been decreased.




After that, Cassandra does not work normally.

The client fails to establish a connection and I see the following exceptions 
in the Cassandra logs.

org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for cfId……
org.apache.cassandra.io.FSReadError: java.io.IOException: Digest mismatch exception……


After the data is forcibly deleted, Cassandra is restarted in a clean state and 
works well.

Can anyone guess why this is happening?

Thank you in advance.






The information in this Internet Email is confidential and may be legally 
privileged. It is intended solely for the addressee. Access to this Email by 
anyone else is unauthorized. If you are not the intended recipient, any 
disclosure, copying, distribution or any action taken or omitted to be taken in 
reliance on it, is prohibited and may be unlawful. When addressed to our 
clients any opinions or advice contained in this Email are subject to the terms 
and conditions expressed in any applicable governing The Home Depot terms of 
business or client engagement letter. The Home Depot disclaims all 
responsibility and liability 

Best compaction strategy for counters tables

2018-01-17 Thread Octavian Rinciog
Hello!
I am using Cassandra 3.10.
I have a counter table, with the following schema and RF=1

CREATE TABLE edges (
src_id text,
src_type text,
source text,
weight counter,
PRIMARY KEY ((src_id, src_type), source)
);

SELECT vs UPDATE requests ratio for this table is 0.1.
READ vs WRITE rate, given by iostat, is 100:1.
Counter cache hit rate is 80%, so the hard-disk is touched for only 20% of
UPDATE requests.

I want to ask which compaction strategy is best for this table
(SizeTieredCompactionStrategy or
LeveledCompactionStrategy).

Thank you,
-- 
Octavian Rinciog

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Tommy Stendahl
We use Oracle jdk1.8.0_152 on all nodes and as I understand oracle use a 
dot in the protocol name (TLSv1.2) and I use the same protocol name and 
cipher names in the 3.0.14 nodes and the one I try to upgrade to 3.11.1.



On 2018-01-17 15:02, Georg Brandemann wrote:
If i remember correctly the protocol names differ between some JRE 
vendors.


With IBM Java for instance the protocol name would be TLSv12 ( without 
. ).


Are you using the same JRE on all nodes and is the protocol name and 
cipher names exactly the same on all nodes?


2018-01-17 14:51 GMT+01:00 Tommy Stendahl >:


Thanks for your response.

I got it working by removing my protocol setting from the
configuration on the 3.11.1 node so it use the default protocol
setting, I'm not sure exactly how that change things so I need to
investigate that. We don't have any custom ssl settings that
should affect this and we use jdk1.8.0_152.

But I think this should have worked, as you say SSLv2Hello should
be enabled on the server side so I don't understand why I can't
specify TLSv1.2

/Tommy


On 2018-01-17 11:03, Stefan Podkowinski wrote:

I think what this error indicates is that a client is trying
to connect
using a SSLv2Hello handshake, while this protocol has been
disabled on
the server side. Starting with the mentioned ticket, we use
the JVM
default list of enabled protocols. What makes this issue a bit
confusing, is that starting with 1.7 SSLv2Hello should be
disabled by
default on the client side, but not on the server side.
Cassandra should
be able to accept SSLv2Hello connections from 3.0 nodes just
fine. What
JRE do you use? Any custom ssl specific settings that might be
effective
here?

On 16.01.2018 15:13, Tommy Stendahl wrote:

Hi,

I have problems upgrading a cluster from 3.0.14 to 3.11.1
but when I
upgrade the first node it fails to gossip.

I have server encryption enabled on all nodes with this
setting:

server_encryption_options:
 internode_encryption: all
 keystore: /usr/share/cassandra/.ssl/server/keystore.jks
 keystore_password: 'x'
 truststore:
/usr/share/cassandra/.ssl/server/truststore.jks
 truststore_password: 'x'
 protocol: TLSv1.2
 cipher_suites:

[TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA]


I get this error in the log:

2018-01-16T14:41:19.671+0100 ERROR [ACCEPT-/10.61.204.16
]
MessagingService.java:1329 SSL handshake error for inbound
connection
from 30f93bf4[SSL_NULL_WITH_NULL_NULL:
Socket[addr=/x.x.x.x,port=40583,localport=7001]]
javax.net.ssl.SSLHandshakeException: SSLv2Hello is disabled
 at

sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:637)
~[na:1.8.0_152]
 at
sun.security.ssl.InputRecord.read(InputRecord.java:527)
~[na:1.8.0_152]
 at
sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
~[na:1.8.0_152]
 at

sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
~[na:1.8.0_152]
 at

sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:938)
~[na:1.8.0_152]
 at
sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
~[na:1.8.0_152]
 at
sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
~[na:1.8.0_152]
 at
java.io.DataInputStream.readInt(DataInputStream.java:387)
~[na:1.8.0_152]
 at
org.apache.cassandra.net

.MessagingService$SocketThread.run(MessagingService.java:1303)
~[apache-cassandra-3.11.1.jar:3.11.1]

I suspect that this has something to do with the change in
CASSANDRA-10508. Any suggestions on how to get around this
would be very
much appreciated.

Thanks, /Tommy




-
To unsubscribe, e-mail:
user-unsubscr...@cassandra.apache.org

For additional commands, e-mail:
user-h...@cassandra.apache.org

Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Georg Brandemann
If i remember correctly the protocol names differ between some JRE vendors.

With IBM Java for instance the protocol name would be TLSv12 ( without . ).

Are you using the same JRE on all nodes and is the protocol name and cipher
names exactly the same on all nodes?

2018-01-17 14:51 GMT+01:00 Tommy Stendahl :

> Thanks for your response.
>
> I got it working by removing my protocol setting from the configuration on
> the 3.11.1 node so it use the default protocol setting, I'm not sure
> exactly how that change things so I need to investigate that. We don't have
> any custom ssl settings that should affect this and we use jdk1.8.0_152.
>
> But I think this should have worked, as you say SSLv2Hello should be
> enabled on the server side so I don't understand why I can't specify TLSv1.2
>
> /Tommy
>
>
> On 2018-01-17 11:03, Stefan Podkowinski wrote:
>
>> I think what this error indicates is that a client is trying to connect
>> using a SSLv2Hello handshake, while this protocol has been disabled on
>> the server side. Starting with the mentioned ticket, we use the JVM
>> default list of enabled protocols. What makes this issue a bit
>> confusing, is that starting with 1.7 SSLv2Hello should be disabled by
>> default on the client side, but not on the server side. Cassandra should
>> be able to accept SSLv2Hello connections from 3.0 nodes just fine. What
>> JRE do you use? Any custom ssl specific settings that might be effective
>> here?
>>
>> On 16.01.2018 15:13, Tommy Stendahl wrote:
>>
>>> Hi,
>>>
>>> I have problems upgrading a cluster from 3.0.14 to 3.11.1 but when I
>>> upgrade the first node it fails to gossip.
>>>
>>> I have server encryption enabled on all nodes with this setting:
>>>
>>> server_encryption_options:
>>>  internode_encryption: all
>>>  keystore: /usr/share/cassandra/.ssl/server/keystore.jks
>>>  keystore_password: 'x'
>>>  truststore: /usr/share/cassandra/.ssl/server/truststore.jks
>>>  truststore_password: 'x'
>>>  protocol: TLSv1.2
>>>  cipher_suites:
>>> [TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_
>>> AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA]
>>>
>>>
>>> I get this error in the log:
>>>
>>> 2018-01-16T14:41:19.671+0100 ERROR [ACCEPT-/10.61.204.16]
>>> MessagingService.java:1329 SSL handshake error for inbound connection
>>> from 30f93bf4[SSL_NULL_WITH_NULL_NULL:
>>> Socket[addr=/x.x.x.x,port=40583,localport=7001]]
>>> javax.net.ssl.SSLHandshakeException: SSLv2Hello is disabled
>>>  at
>>> sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:637)
>>> ~[na:1.8.0_152]
>>>  at sun.security.ssl.InputRecord.read(InputRecord.java:527)
>>> ~[na:1.8.0_152]
>>>  at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.
>>> java:983)
>>> ~[na:1.8.0_152]
>>>  at
>>> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSo
>>> cketImpl.java:1385)
>>> ~[na:1.8.0_152]
>>>  at
>>> sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:938)
>>> ~[na:1.8.0_152]
>>>  at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
>>> ~[na:1.8.0_152]
>>>  at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
>>> ~[na:1.8.0_152]
>>>  at java.io.DataInputStream.readInt(DataInputStream.java:387)
>>> ~[na:1.8.0_152]
>>>  at
>>> org.apache.cassandra.net.MessagingService$SocketThread.run(
>>> MessagingService.java:1303)
>>> ~[apache-cassandra-3.11.1.jar:3.11.1]
>>>
>>> I suspect that this has something to do with the change in
>>> CASSANDRA-10508. Any suggestions on how to get around this would be very
>>> much appreciated.
>>>
>>> Thanks, /Tommy
>>>
>>>
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>>
>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Tommy Stendahl

Thanks for your response.

I got it working by removing my protocol setting from the configuration 
on the 3.11.1 node so it use the default protocol setting, I'm not sure 
exactly how that change things so I need to investigate that. We don't 
have any custom ssl settings that should affect this and we use 
jdk1.8.0_152.


But I think this should have worked, as you say SSLv2Hello should be 
enabled on the server side so I don't understand why I can't specify TLSv1.2


/Tommy

On 2018-01-17 11:03, Stefan Podkowinski wrote:

I think what this error indicates is that a client is trying to connect
using a SSLv2Hello handshake, while this protocol has been disabled on
the server side. Starting with the mentioned ticket, we use the JVM
default list of enabled protocols. What makes this issue a bit
confusing, is that starting with 1.7 SSLv2Hello should be disabled by
default on the client side, but not on the server side. Cassandra should
be able to accept SSLv2Hello connections from 3.0 nodes just fine. What
JRE do you use? Any custom ssl specific settings that might be effective
here?

On 16.01.2018 15:13, Tommy Stendahl wrote:

Hi,

I have problems upgrading a cluster from 3.0.14 to 3.11.1 but when I
upgrade the first node it fails to gossip.

I have server encryption enabled on all nodes with this setting:

server_encryption_options:
     internode_encryption: all
     keystore: /usr/share/cassandra/.ssl/server/keystore.jks
     keystore_password: 'x'
     truststore: /usr/share/cassandra/.ssl/server/truststore.jks
     truststore_password: 'x'
     protocol: TLSv1.2
     cipher_suites:
[TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA]


I get this error in the log:

2018-01-16T14:41:19.671+0100 ERROR [ACCEPT-/10.61.204.16]
MessagingService.java:1329 SSL handshake error for inbound connection
from 30f93bf4[SSL_NULL_WITH_NULL_NULL:
Socket[addr=/x.x.x.x,port=40583,localport=7001]]
javax.net.ssl.SSLHandshakeException: SSLv2Hello is disabled
     at
sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:637)
~[na:1.8.0_152]
     at sun.security.ssl.InputRecord.read(InputRecord.java:527)
~[na:1.8.0_152]
     at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
~[na:1.8.0_152]
     at
sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
~[na:1.8.0_152]
     at
sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:938)
~[na:1.8.0_152]
     at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
~[na:1.8.0_152]
     at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
~[na:1.8.0_152]
     at java.io.DataInputStream.readInt(DataInputStream.java:387)
~[na:1.8.0_152]
     at
org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:1303)
~[apache-cassandra-3.11.1.jar:3.11.1]

I suspect that this has something to do with the change in
CASSANDRA-10508. Any suggestions on how to get around this would be very
much appreciated.

Thanks, /Tommy



-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org






-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Tommy Stendahl

Thanks for your response.

I removed the protocol setting from the server_encryption_options in the 
3.11.1 node so it uses the default value instead, and now it works. I have 
to analyse whether this has any impact on my security requirements, but at 
least it's working now.


/Tommy


On 2018-01-16 17:26, Michael Shuler wrote:

This looks like the post-POODLE commit:
https://issues.apache.org/jira/browse/CASSANDRA-10508

I think you might just set 'TLS' as in the example to use the JVM's
preferred TLS protocol version.




-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: vnodes: high availability

2018-01-17 Thread Avi Kivity
On the flip side, a large number of vnodes is also beneficial. For 
example, if you add a node to a 20-node cluster with many vnodes, each 
existing node will contribute 5% of the data towards the new node, and 
all nodes will participate in streaming (meaning the impact on any 
single node will be limited, and completion time will be faster).



With a low number of vnodes, only a few nodes participate in streaming, 
which means that the cluster is left unbalanced and the impact on each 
streaming node is greater (or that completion time is slower).



Similarly, with a high number of vnodes, if a node is down its work is 
distributed equally among all nodes. With a low number of vnodes the 
cluster becomes unbalanced.



Overall I recommend high vnode count, and to limit the impact of 
failures in other ways (smaller number of large nodes vs. larger number 
of small nodes).



btw, rack-aware topology improves the multi-failure problem but at the 
cost of causing imbalance during maintenance operations. I recommend 
using rack-aware topology only if you really have racks with 
single-points-of-failure, not for other reasons.



On 01/17/2018 05:43 AM, kurt greaves wrote:
Even with a low amount of vnodes you're asking for a bad time. Even if 
you managed to get down to 2 vnodes per node, you're still likely to 
include double the amount of nodes in any streaming/repair operation 
which will likely be very problematic for incremental repairs, and you 
still won't be able to easily reason about which nodes are responsible 
for which token ranges. It's still quite likely that a loss of 2 nodes 
would mean some portion of the ring is down (at QUORUM). At the moment 
I'd say steer clear of vnodes and use single tokens if you can; a lot 
of work still needs to be done to ensure smooth operation of C* while 
using vnodes, and they are much more difficult to reason about (which 
is probably the reason no one has bothered to do the math). If you're 
really keen on the math your best bet is to do it yourself, because 
it's not a point of interest for many C* devs plus probably a lot of 
us wouldn't remember enough math to know how to approach it.


If you want to get out of this situation you'll need to do a DC 
migration to a new DC with a better configuration of 
snitch/replication strategy/racks/tokens.



On 16 January 2018 at 21:54, Kyrylo Lebediev > wrote:


Thank you for this valuable info, Jon.
I guess both you and Alex are referring to improved vnodes
allocation method
https://issues.apache.org/jira/browse/CASSANDRA-7032
 which was
implemented in 3.0.

Based on your info and comments in the ticket it's really a bad
idea to have small number of vnodes for the versions using old
allocation method because of hot-spots, so it's not an option for
my particular case (v.2.1) :(

[As far as I can see from the source code this new method
wasn't backported to 2.1.]



Regards,

Kyrill




*From:* Jon Haddad > on behalf of Jon Haddad
>
*Sent:* Tuesday, January 16, 2018 8:21:33 PM

*To:* user@cassandra.apache.org 
*Subject:* Re: vnodes: high availability
We’ve used 32 tokens pre 3.0.  It’s been a mixed result due to the
randomness.  There’s going to be some imbalance, the amount of
imbalance depends on luck, unfortunately.

I’m interested to hear your results using 4 tokens, would you mind
letting the ML know your experience when you’ve done it?

Jon


On Jan 16, 2018, at 9:40 AM, Kyrylo Lebediev
> wrote:

Agree with you, Jon.
Actually, this cluster was configured by my 'predecessor' and
[fortunately for him] we've never met :)
We're using version 2.1.15 and can't upgrade because of legacy
Netflix Astyanax client used.

Below in the thread Alex mentioned that it's recommended to set
vnodes to a value lower than 256 only for C* version > 3.0 (token
allocation algorithm was improved since C* 3.0) .

Jon,
Do you have positive experience setting up cluster with vnodes <
256 for  C* 2.1?

vnodes=32 also too high, as for me (we need to have much more
than 32 servers per AZ in order to to get 'reliable' cluster)
vnodes=4 seems to be 

Re: Upgrade to 3.11.1 give SSLv2Hello is disabled error

2018-01-17 Thread Stefan Podkowinski
I think what this error indicates is that a client is trying to connect
using a SSLv2Hello handshake, while this protocol has been disabled on
the server side. Starting with the mentioned ticket, we use the JVM
default list of enabled protocols. What makes this issue a bit
confusing, is that starting with 1.7 SSLv2Hello should be disabled by
default on the client side, but not on the server side. Cassandra should
be able to accept SSLv2Hello connections from 3.0 nodes just fine. What
JRE do you use? Any custom ssl specific settings that might be effective
here?

On 16.01.2018 15:13, Tommy Stendahl wrote:
> Hi,
> 
> I have problems upgrading a cluster from 3.0.14 to 3.11.1 but when I
> upgrade the first node it fails to gossip.
> 
> I have server encryption enabled on all nodes with this setting:
> 
> server_encryption_options:
>     internode_encryption: all
>     keystore: /usr/share/cassandra/.ssl/server/keystore.jks
>     keystore_password: 'x'
>     truststore: /usr/share/cassandra/.ssl/server/truststore.jks
>     truststore_password: 'x'
>     protocol: TLSv1.2
>     cipher_suites:
> [TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_128_CBC_SHA]
> 
> 
> I get this error in the log:
> 
> 2018-01-16T14:41:19.671+0100 ERROR [ACCEPT-/10.61.204.16]
> MessagingService.java:1329 SSL handshake error for inbound connection
> from 30f93bf4[SSL_NULL_WITH_NULL_NULL:
> Socket[addr=/x.x.x.x,port=40583,localport=7001]]
> javax.net.ssl.SSLHandshakeException: SSLv2Hello is disabled
>     at
> sun.security.ssl.InputRecord.handleUnknownRecord(InputRecord.java:637)
> ~[na:1.8.0_152]
>     at sun.security.ssl.InputRecord.read(InputRecord.java:527)
> ~[na:1.8.0_152]
>     at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:983)
> ~[na:1.8.0_152]
>     at
> sun.security.ssl.SSLSocketImpl.performInitialHandshake(SSLSocketImpl.java:1385)
> ~[na:1.8.0_152]
>     at
> sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:938)
> ~[na:1.8.0_152]
>     at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
> ~[na:1.8.0_152]
>     at sun.security.ssl.AppInputStream.read(AppInputStream.java:71)
> ~[na:1.8.0_152]
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
> ~[na:1.8.0_152]
>     at
> org.apache.cassandra.net.MessagingService$SocketThread.run(MessagingService.java:1303)
> ~[apache-cassandra-3.11.1.jar:3.11.1]
> 
> I suspect that this has something to do with the change in
> CASSANDRA-10508. Any suggestions on how to get around this would be very
> much appreciated.
> 
> Thanks, /Tommy
> 
> 
> 
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: NVMe SSD benchmarking with Cassandra

2018-01-17 Thread Matija Gobec
Justin,

NVMe drives have their own IO queueing mechanism and there is a huge
performance difference vs the Linux queue.
In addition to a properly configured file system and scheduler, try setting
"scsi_mod.use_blk_mq=1"
in the grub cmdline.
If you are looking for the BFQ scheduler, it's probably a module, so you will
need to load it.
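
One possible way to apply that, sketched for a Debian/Ubuntu-style setup (the
file path and grub tooling are assumptions, adjust for your distro):

    # add the flag to the kernel command line and rebuild the grub config
    sudo sed -i 's/^GRUB_CMDLINE_LINUX="/&scsi_mod.use_blk_mq=1 /' /etc/default/grub
    sudo update-grub
    # the setting takes effect after the next reboot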

Best,
Matija

On Tue, Jan 9, 2018 at 1:17 AM, Nate McCall  wrote:

>
>>
>>
>> In regards to setting read ahead, how is this set for nvme drives? Also,
>> below is our compression settings for the table… It’s the same as our tests
>> that we are doing against SAS SSDs so I don’t think the compression
>> settings would be the issue…
>>
>>
>>
>
> Check blockdev --report between the old and the new servers to see if
> there is a difference. Are there other deltas in the disk layouts between
> the old and new servers (ie. LVM, mdadm, etc.)?
>
> You can control read ahead via 'blockdev --setra' or via poking the
> kernel: /sys/block/[YOUR DRIVE]/queue/read_ahead_kb
>
> In both cases, changes are instantaneous so you can do it on a canary and
> monitor for effect.
>
> Also, i'd be curious to know (since you have this benchmark setup) if you
> got the degradation you are currently seeing if you set concurrent_reads
> and concurrent_writes back to their defaults.
>
>
> --
> -
> Nate McCall
> Wellington, NZ
> @zznate
>
> CTO
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>