Fwd: Re: How to set up a cluster with allocate_tokens_for_keyspace?

2019-05-04 Thread onmstester onmstester
So do you mean setting tokens for only one node (one of the seed nodes) is
fair enough?

I cannot see any problem with this mechanism (only one manual token
assignment at cluster setup), but the article was also trying to set up a
balanced cluster, and the way it insists on manual token assignment for
multiple seed nodes confused me.



Sent using https://www.zoho.com/mail/

---------- Forwarded message ----------
From: Jon Haddad
Date: Sat, 04 May 2019 22:10:39 +0430
Subject: Re: How to set up a cluster with allocate_tokens_for_keyspace?

That line is only relevant for when you're starting your cluster and
you need to define your initial tokens in a non-random way. Random
token distribution doesn't work very well when you only use 4 tokens.

Once you get the cluster set up, you don't need to specify tokens
anymore; you can just use allocate_tokens_for_keyspace.

On Sat, May 4, 2019 at 2:14 AM onmstester onmstester wrote:
>
> I just read this article by TLP:
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>
> Noticed that:
> >> We will need to set the tokens for the seed nodes in each rack manually.
> >> This is to prevent each node from randomly calculating its own token ranges.
>
> But until now, I was using this recommendation to set up a new cluster:
> >> You'll want to set them explicitly using:
> >> python -c 'print([str(((2**64 / 4) * i) - 2**63) for i in range(4)])'
> >>
> >> After you fire up the first seed, create a keyspace using RF=3 (or whatever
> >> you're planning on using) and set allocate_tokens_for_keyspace to that
> >> keyspace in your config, and join the rest of the nodes. That gives even
> >> distribution.
>
> I've defined plenty of racks in my cluster (and only 3 seed nodes). Should I
> have a seed node per rack and use initial_token for all of the seed nodes, or
> would just one seed node with initial_token be OK?
>
> Best Regards

Re: How to set up a cluster with allocate_tokens_for_keyspace?

2019-05-04 Thread Jon Haddad
That line is only relevant for when you're starting your cluster and
you need to define your initial tokens in a non-random way.  Random
token distribution doesn't work very well when you only use 4 tokens.

Once you get the cluster set up, you don't need to specify tokens
anymore; you can just use allocate_tokens_for_keyspace.
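
A minimal Python 3 sketch of the initial-token calculation quoted below,
assuming num_tokens=4 and the default Murmur3 token range (the original
one-liner uses "/", which is integer division on Python 2 but yields floats
on Python 3, so "//" is used here):

# Evenly spaced initial tokens for the first seed node (sketch, Python 3).
num_tokens = 4
tokens = [str(((2**64 // num_tokens) * i) - 2**63) for i in range(num_tokens)]
print(','.join(tokens))
# -> -9223372036854775808,-4611686018427387904,0,4611686018427387904
# The comma-separated list goes into initial_token (with num_tokens: 4) on
# the seed node(s) you bootstrap first; the remaining nodes leave
# initial_token unset and rely on allocate_tokens_for_keyspace.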

On Sat, May 4, 2019 at 2:14 AM onmstester onmstester
 wrote:
>
> I just read this article by TLP:
> https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html
>
> Noticed that:
> >> We will need to set the tokens for the seed nodes in each rack manually.
> >> This is to prevent each node from randomly calculating its own token ranges.
>
> But until now, I was using this recommendation to set up a new cluster:
> >> You'll want to set them explicitly using:
> >> python -c 'print([str(((2**64 / 4) * i) - 2**63) for i in range(4)])'
> >>
> >> After you fire up the first seed, create a keyspace using RF=3 (or whatever
> >> you're planning on using) and set allocate_tokens_for_keyspace to that
> >> keyspace in your config, and join the rest of the nodes. That gives even
> >> distribution.
>
> I've defined plenty of racks in my cluster (and only 3 seed nodes). Should I
> have a seed node per rack and use initial_token for all of the seed nodes, or
> would just one seed node with initial_token be OK?
>
> Best Regards
>
>




nodetool repair failing with "Validation failed in /X.X.X.X"

2019-05-04 Thread Rhys Campbell


> Hello,
> 
> I'm having issues running repair on an Apache Cassandra cluster. I'm getting
> "Failed creating a merkle tree" errors on the replication partner nodes.
> Does anyone have any experience with this? I am running 2.2.13.
> 
> Further details here… 
> https://issues.apache.org/jira/projects/CASSANDRA/issues/CASSANDRA-15109?filter=allopenissues
> 
> Best,
> 
> Rhys





How to set up a cluster with allocate_tokens_for_keyspace?

2019-05-04 Thread onmstester onmstester
I just read this article by TLP:

https://thelastpickle.com/blog/2019/02/21/set-up-a-cluster-with-even-token-distribution.html

Noticed that:

>> We will need to set the tokens for the seed nodes in each rack manually.
>> This is to prevent each node from randomly calculating its own token ranges.

But until now, I was using this recommendation to set up a new cluster:

>> You'll want to set them explicitly using:
>> python -c 'print([str(((2**64 / 4) * i) - 2**63) for i in range(4)])'
>>
>> After you fire up the first seed, create a keyspace using RF=3 (or whatever
>> you're planning on using) and set allocate_tokens_for_keyspace to that
>> keyspace in your config, and join the rest of the nodes. That gives even
>> distribution.

I've defined plenty of racks in my cluster (and only 3 seed nodes). Should I
have a seed node per rack and use initial_token for all of the seed nodes, or
would just one seed node with initial_token be OK?

Best Regards

Re: CL=LQ, RF=3: Can a Write be Lost If Two Nodes ACK'ing it Die

2019-05-04 Thread Ben Slater
In the normal, happy case the replica would be written to the third node at
the time of the write. However, if the third node happened to be down or
very overloaded at the time of the write (your step 3), the write would
still be reported to the client as successful. Even if the third node is up
again before nodes 1 and 2 die, hints may have expired by that time or may
not finish replaying, either due to load (which is sort of the scenario you
outlined) or just not enough time. You're only really guaranteed all three
replicas are there if a repair runs successfully between the initial write
and the two nodes dying (although it's very likely there will be three
replicas from the start if the cluster is in a healthy state at the time of
the write).
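
A minimal sketch of the client side of this (not from the thread), using the
DataStax Python driver and a placeholder test.kv table: a LOCAL_QUORUM write
at RF=3 returns success after two replica acks, whether or not the third
replica ever received the data.

# Sketch only; the contact point, keyspace and table are placeholders.
from cassandra.cluster import Cluster
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

session = Cluster(['127.0.0.1']).connect('test')
stmt = SimpleStatement(
    "INSERT INTO kv (k, v) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM)
session.execute(stmt, ('key1', 'value1'))
# Returning here only guarantees 2 of the 3 replicas acked the write; the
# third copy may arrive later via hints or repair, or not at all if hints
# expire, which is the window described above.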

Cheers
Ben

---


Ben Slater
Chief Product Officer, Instaclustr

On Fri, 3 May 2019 at 23:19, Fred Habash  wrote:

> Thank you all.
>
> So, please, bear with me for a second. I'm trying to figure out how data
> can be totally lost under the above circumstances when nodes die in two
> out of three racks.
>
> You stated "the replica may or may not have made its way to the third
> node". Why "may not"?
>
> This is what I came up with ...
>
> 1. Write goes to a coordinator in rack1.
> 2. The local coordinator submits RF=3 writes to all racks.
> 3. Two nodes, in rack1 and rack2, ack the write. Client is happy.
> 4. A node massacre happens in racks 1 & 2 (infrastructure event).
> 5. Nodes in rack3 see an increase in load as a result of the cluster
> shrinking.
> 6. The coordinator in rack1 stores a hint (HH) for the row destined for
> rack3 (either the coordinator slows down or the rack3 node is overloaded).
> 7. Eventually, the coordinator in rack1 dies and its HHs are lost.
> 8. The row that was once ack'd to the app is now gone.
>
> Plausible?
>
>
> On Thu, May 2, 2019 at 8:23 PM Avinash Mandava 
> wrote:
>
>> Good catch, misread the detail.
>>
>> On Thu, May 2, 2019 at 4:56 PM Ben Slater 
>> wrote:
>>
>>> Reading more carefully, it could actually be either way: quorum requires
>>> that a majority of nodes complete and ack the write, but it still aims to
>>> write to RF nodes (with the last replica either written immediately or
>>> eventually via hints or repairs). So, in the scenario outlined, the replica
>>> may or may not have made its way to the third node by the time the first
>>> two replicas are lost. If there is a replica on the third node, it can be
>>> recovered to the other two nodes by either rebuild (actually replace) or
>>> repair.
>>>
>>> Cheers
>>> Ben
>>>
>>> ---
>>>
>>>
>>> Ben Slater
>>> Chief Product Officer, Instaclustr
>>>
>>>
>>> On Fri, 3 May 2019 at 09:33, Avinash Mandava 
>>> wrote:
>>>
 In scenario 2 it's lost: if both nodes die and get replaced entirely,
 there's no history anywhere that the write ever happened, as it wouldn't be
 in the commitlog, memtable, or an sstable on node 3. Surviving that failure
 scenario of two nodes with the same data failing simultaneously requires
 upping CL or RF, or spreading across 3 racks, if the situation you're trying
 to avoid is rack failure (which I'm guessing it is from the question setup).

 On Thu, May 2, 2019 at 2:25 PM Ben Slater 
 wrote:

> In scenario 2, if the row has been written to node 3 it will be
> replaced on the other nodes via rebuild or repair.
>
> ---
>
>
> Ben Slater
> Chief Product Officer, Instaclustr