JMX metric for dropped hints?

2019-01-21 Thread Steinmaurer, Thomas
Hello,

Is there a JMX metric for monitoring dropped hints as a counter/rate,
equivalent to what we see in the Cassandra log, e.g.:

WARN  [HintedHandoffManager:1] 2018-11-13 13:28:46,991 HintedHandoffMetrics.java:79 - /XXX has 18180 dropped hints, because node is down past configured hint window.
WARN  [HintedHandoffManager:1] 2018-11-13 13:27:29,305 HintedHandoffMetrics.java:79 - /XXX has 1191 dropped hints, because node is down past configured hint window.
WARN  [HintedHandoffManager:1] 2018-11-13 13:23:09,393 HintedHandoffMetrics.java:79 - /XXX has 135531 dropped hints, because node is down past configured hint window.


Thanks,
Thomas

The contents of this e-mail are intended for the named addressee only. It 
contains information that may be confidential. Unless you are the named 
addressee or an authorized designee, you may not copy or use it, or disclose it 
to anyone else. If you received it in error please notify us immediately and 
then destroy it. Dynatrace Austria GmbH (registration number FN 91482h) is a 
company registered in Linz whose registered office is at 4040 Linz, Austria, 
Freistädterstraße 313


Re: question about the gain of increasing the number of vnode

2019-01-21 Thread Alain RODRIGUEZ
Sure, it's called "Cassandra Availability with Virtual Nodes", by Joey
Lynch and Josh Snyder.

I found it in the mailing list archives:
https://github.com/jolynch/python_performance_toolkit/blob/master/notebooks/cassandra_availability/whitepaper/cassandra-availability-virtual.pdf

There is some math in there to explain the impact of the number of vnodes on
availability.

Using the formula "1d", and considering a datacenter of 3 balanced racks
with RF = 3 (so, for 60 nodes, Np = 40 possible neighbors, v = 256 and
R = 3), we have:

Np*(1-(1-1/Np)^(v*2*(R-1))) = 40*(1-(1-1/40)^(256*2*(3-1))) ≈ 40.00

Thus, if my calculation is accurate, with 60 nodes and 256 vnodes, we expect
a node to have just under 40 neighbors. This means that with 60 nodes, *each
node* has 40 *possible* replicas (all the nodes in other racks) and will be
sharing a token range with practically all of them. Thus, with 2 nodes down
in distinct racks, an outage is almost certain (it still takes 2 nodes
down).

Some other arbitrary numbers that show how this value evolves
depending on the number of nodes and vnodes.

- With 60 nodes and 256 vnodes, expect about 40.00 neighbors
- With 60 nodes and 16 vnodes, expect 32.0867407145 neighbors
- With 60 nodes and 4 vnodes, expect 13.323193263 neighbors

- With 300 nodes and 256 vnodes, expect 198.8200470802 neighbors
- With 300 nodes and 16 vnodes, expect 54.8867183963 neighbors
- With 300 nodes and 4 vnodes, expect 15.4137752052 neighbors
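
For anyone who wants to try other cluster sizes, here is a minimal Python
sketch of the formula quoted above. The function name and its parameters are
only illustrative (they are not from the paper), and it assumes balanced
racks, so that the possible neighbors of a node are all the nodes outside its
own rack.

```python
def expected_neighbors(nodes: int, vnodes: int, racks: int = 3, rf: int = 3) -> float:
    """Expected number of distinct neighbors of one node (formula quoted above)."""
    # Assumption: racks are balanced, so the possible neighbors (Np) are
    # all the nodes that sit outside this node's own rack.
    possible = nodes - nodes / racks
    # Exponent v * 2 * (R - 1) from the formula, with R taken as the RF.
    exponent = vnodes * 2 * (rf - 1)
    return possible * (1 - (1 - 1 / possible) ** exponent)


if __name__ == "__main__":
    # Print the same node/vnode combinations as the list above.
    for nodes in (60, 300):
        for vnodes in (256, 16, 4):
            value = expected_neighbors(nodes, vnodes)
            print(f"{nodes} nodes, {vnodes} vnodes: {value:.4f} neighbors")
```

Changing `racks` or `rf` only makes sense if the topology still matches the
balanced-rack assumption; otherwise the number of possible neighbors has to be
worked out by hand.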


Good reading :).

C*heers,
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, 21 Jan 2019 at 13:30, VICTOR IBARRA wrote:

> Hi Alain,
>
> Thank you very much for the explanation and the points on the subject of
> managing vnodes.
>
> You mentioned the Netflix paper and the outage risk. Do you have the link
> to that discussion?
>
> Thank you for your help.
> Best regards

Re: question about the gain of increasing the number of vnode

2019-01-21 Thread VICTOR IBARRA
Hi Alain,

Thank you very much for the explanation and the points on the subject of
managing vnodes.

You mentioned the Netflix paper and the outage risk. Do you have the link
to that discussion?

Thank you for your help.
Best regards

On Mon, 21 Jan 2019 at 13:53, Alain RODRIGUEZ wrote:

> There has been some discussion on this topic on this mailing list,
> including a paper from Netflix on the impact of vnodes. I could not find
> it quickly, but I invite you to check.
>
> To share some ideas:
>
> More vnodes:
> + Better balance between nodes
> + Maximizes the streaming throughput for operations, as all nodes share a
> small bit of the data of all the other nodes (according to the topology).
> - When nodes go down, there is more chance of losing availability: with
> 256 vnodes for example, 2 nodes down in distinct racks would for sure make
> data partially unavailable.
> - Overheads / operational issues (in practice, using 256 vnodes has been
> a nightmare for multiple reasons, see below)
>
>
> Fewer vnodes:
> - Imbalances can be big before C* 3.0. After that, using
> allocate_tokens_for_keyspace -->
> http://cassandra.apache.org/doc/4.0/configuration/cassandra_config_file.html#allocate-tokens-for-keyspace,
> you can mitigate this issue. With this and some techniques*, you can get
> good results in terms of balance.
> * Off the top of my head: this involves bootstrapping the seeds first,
> picking the tokens to use, creating your keyspace, then adding nodes with
> the option above. You can test it quite easily. Then with "nodetool status
> - The streaming throughput is generally limited by the receiving host when
> using vnodes, thus 16 vnodes is probably not worse than 256 in terms of
> streaming.
> + The other way around, the overhead of having 256 vnodes makes operations
> such as repair almost impossible, or at least much longer and more complex.
> Repairing almost-empty tables can take minutes, and repairing a big
> dataset might never end.
> + In the Netflix paper about this topic (very interesting, I recommend
> reading it), it is explained that reducing the number of vnodes reduces the
> chances of an outage.
> + There was a discussion on the dev mailing list. I believe the community
> agreed on the need to reduce the default number of vnodes. Here again,
> you can have a quick look at the archive, Jira, github/trunk.
>
> I think that commonly accepted values would be 16/32. Values as low as 4
> are considered to improve availability and reduce the overheads induced by
> vnodes. I would suggest you test it and see whether, with low values, you
> still manage to keep the nodes balanced.
>
> Also, using "physical" nodes (initial_token, no vnodes) makes it
> possible to reason about token distribution. You can perform advanced
> operations where you bootstrap 1/3 of the cluster at once. This is very
> good, especially for big clusters, I would say. With many vnodes, on the
> other hand, you'll have to add one node at a time, as each node is actually
> the 'neighbor' of all the others (according to the topology again, i.e.
> racks/data centers...).
>
> I would stay away from the default in this case (256 vnodes). I think
> this value is way too high by default.
>
> Also, keep in mind that the number of vnodes cannot be changed
> in a running cluster. I think the best way to change it is to add a new
> data center.
>
> C*heers,
> ---
> Alain Rodriguez - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> On Mon, 21 Jan 2019 at 11:08, VICTOR IBARRA wrote:
>
>>
>> Good morning everyone,
>>
>> I would like to get in touch with the Cassandra community about
>> questions of cluster configuration.
>>
>> Today I have many questions and several projects concerning Cassandra
>> cluster configuration, the general problems of configuration migration,
>> and the use of vnodes.
>>
>> My main question is: what is the gain of using 256 vnodes versus 16
>> vnodes, for example?
>>
>> Best regards
>> --
>>  L'integrité de ce message n'étant pas assurée sur internet, VICTOR
>> IBARRA ne peut être tenue responsable de son contenu en ce compris les
>> pièces jointes. Toute utilisation ou diffusion non autorisée est interdite.
>> Si vous n'êtes pas destinataire de ce message, merci de le  détruire et
>> d'avertir l'expéditeur.
>>
>>  The integrity of this message cannot be guaranteed on the Internet.
>> VICTOR IBARRA can not therefore be considered liable for the  contents
>> including its attachments. Any unauthorized use or dissemination is
>> prohibited. If you are not the intended recipient of  this message, then
>> please delete it and notify the sender.
>>
>


Re: question about the gain of increasing the number of vnode

2019-01-21 Thread VICTOR IBARRA
Hi Jean Carlo,

Thank you for the link! I am going to read the ticket.

Have a nice day.

Best regards


On Mon, 21 Jan 2019 at 13:52, Jean Carlo wrote:

> Hi Victor,
>
> Take a look at this Jira ticket:
>
> https://issues.apache.org/jira/browse/CASSANDRA-13701
>
> It may answer your questions.
>
>
> Jean Carlo
>
> "The best way to predict the future is to invent it" Alan Kay


Re: question about the gain of increasing the number of vnode

2019-01-21 Thread Alain RODRIGUEZ
There has been some discussion on this topic on this mailing list,
including a paper from Netflix on the impact of vnodes. I could not find
it quickly, but I invite you to check.

To share some ideas:

More vnodes:
+ Better balance between nodes
+ Maximizes the streaming throughput for operations, as all nodes share a
small bit of the data of all the other nodes (according to the topology).
- When nodes go down, there is more chance of losing availability: with
256 vnodes for example, 2 nodes down in distinct racks would for sure make
data partially unavailable.
- Overheads / operational issues (in practice, using 256 vnodes has been a
nightmare for multiple reasons, see below)


Fewer vnodes:
- Imbalances can be big before C* 3.0. After that, using
allocate_tokens_for_keyspace -->
http://cassandra.apache.org/doc/4.0/configuration/cassandra_config_file.html#allocate-tokens-for-keyspace,
you can mitigate this issue. With this and some techniques*, you can get
good results in terms of balance.
* Off the top of my head: this involves bootstrapping the seeds first,
picking the tokens to use, creating your keyspace, then adding nodes with
the option above. You can test it quite easily. Then with "nodetool status
http://www.thelastpickle.com




Re: question about the gain of increasing the number of vnode

2019-01-21 Thread Jean Carlo
Hi Victor,

Take a look at this Jira ticket:

https://issues.apache.org/jira/browse/CASSANDRA-13701

It may answer your questions.


Jean Carlo

"The best way to predict the future is to invent it" Alan Kay




question about the gain of increasing the number of vnode

2019-01-21 Thread VICTOR IBARRA
Good morning everyone,

I would like to get in touch with the Cassandra community about questions
of cluster configuration.

Today I have many questions and several projects concerning Cassandra
cluster configuration, the general problems of configuration migration,
and the use of vnodes.

My main question is: what is the gain of using 256 vnodes versus 16
vnodes, for example?

Best regards