Re: Question regarding concurrent bootstrapping

2015-06-16 Thread Alain RODRIGUEZ
Hi Jens,

If you are going to dramatically change the number of nodes, or expect a lot
of token movement, you might want to consider DC switching: adding a new DC,
switching clients over to it, then dropping the old DC. You can also use this
in a few other cases, such as switching to vnodes, changing their number, or
even changing the partitioner. This is quite efficient and safe if you can
afford to temporarily double your servers (cloud?).
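
As a rough illustration of the replication side of such a switch, here is a minimal Python sketch that only builds the two ALTER KEYSPACE statements involved. It assumes the keyspace uses NetworkTopologyStrategy; the keyspace name `my_ks`, the DC names `dc_old` and `dc_new`, and the replica counts are hypothetical. Populating the new DC with data and repointing the clients happen between the two statements and are not shown here.

```python
def replication_cql(keyspace, dc_replicas):
    """Build an ALTER KEYSPACE statement with per-DC replica counts.

    Assumes NetworkTopologyStrategy; dc_replicas maps DC name -> replica count.
    """
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in dc_replicas.items())
    return (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )


# Hypothetical names: my_ks, dc_old, dc_new.
# Step 1: start replicating to the new DC alongside the old one.
add_new_dc = replication_cql("my_ks", {"dc_old": 3, "dc_new": 3})
# Step 2 (after clients have been switched): stop replicating to the old DC.
drop_old_dc = replication_cql("my_ks", {"dc_new": 3})

print(add_new_dc)
print(drop_old_dc)
```

The old DC's nodes would only be decommissioned after the second statement has been applied and verified.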

Alain

Re: Question regarding concurrent bootstrapping

2015-06-14 Thread Jens Rantil
Rob,

Thanks for a great answer. While I'm at it, thanks for all the time you put
into answering people on this mailing list. I'm sure I'm not the only one
who appreciates it.

Cheers,
Jens

--
Sent from Mailbox


Question regarding concurrent bootstrapping

2015-06-12 Thread Jens Rantil
Hi,

Let's say I have an existing cluster and do the following:

   1. I start a new joining node (A). It enters state Up/Joining.
   Streaming to this node starts automatically.
   2. I wait two minutes (best practise for bootstrapping).
   3. I start a second node (B) to join the cluster. It is allocated some of
   the parts of the ring previously assigned to A and enters state
   Up/Joining. Streaming to this node starts automatically.

Will streaming of data that A is no longer responsible for (after B joined)
stop immediately? That is, after (3), will the data streamed to A be only what
it is responsible for?

This is important for planning when one is expanding a cluster with
multiple smaller nodes.

Thanks,
Jens

-- 
Jens Rantil
Backend engineer
Tink AB

Email: jens.ran...@tink.se
Phone: +46 708 84 18 32
Web: www.tink.se



Re: Question regarding concurrent bootstrapping

2015-06-12 Thread Robert Coli
On Fri, Jun 12, 2015 at 5:21 AM, Jens Rantil jens.ran...@tink.se wrote:

 Let's say I have an existing cluster and do the following:

1. I start a new joining node (A). It enters state Up/Joining.
Streaming to this node starts automatically.
2. I wait two minutes (best practise for bootstrapping).
3. I start a second node (B) to join the cluster. It is allocated some of
the parts of the ring previously assigned to A and enters state
Up/Joining. Streaming to this node starts automatically.

 Will streaming of data that A is no longer responsible for (after B joined)
 stop immediately? That is, after (3), will the data streamed to A be only what
 it is responsible for?


It depends on the version of Cassandra. In any version that doesn't contain
the CASSANDRA-2434 patch, A will get data it shouldn't get, and that data
will stay on A unless you run cleanup on A once it is done bootstrapping.
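
To make the over-streaming concrete, here is a toy Python model of a token ring. It is only a sketch, not how Cassandra is implemented: it assumes one token per node, a replication factor of 1, and a tiny token space of 0..99, and all node names and token values are made up. A node owns the range from the previous token (exclusive) up to its own token (inclusive).

```python
from bisect import bisect_left

RING_SIZE = 100  # toy token space 0..99 (assumption; real token ranges are far larger)


def owner(tokens, key_token):
    """Return the node owning key_token: first node token >= key_token, wrapping around."""
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    values = [token for _, token in ordered]
    index = bisect_left(values, key_token) % len(ordered)
    return ordered[index][0]


def owned(tokens, node):
    """Return the set of toy tokens that `node` is responsible for."""
    return {t for t in range(RING_SIZE) if owner(tokens, t) == node}


# Hypothetical cluster: four existing nodes, then A joins, then B joins.
existing = {"n1": 25, "n2": 50, "n3": 75, "n4": 99}
with_a = {**existing, "A": 40}
with_a_and_b = {**with_a, "B": 33}

streamed_to_a = owned(with_a, "A")             # what A started streaming
still_owned_by_a = owned(with_a_and_b, "A")    # what A owns once B has joined
surplus_on_a = streamed_to_a - still_owned_by_a

print(sorted(surplus_on_a))  # prints [26, 27, 28, 29, 30, 31, 32, 33]
```

In this toy model the surplus tokens are exactly the part of A's range that B took over, which is the data a cleanup on A would be expected to remove.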

In a version containing the 2434 patch, the attempt to bootstrap B will fail,
and will keep failing until A is done bootstrapping, unless you set the
property -Dcassandra.consistent.rangemovement=false while starting B.

In general, one DOES NOT WANT TO
SET -Dcassandra.consistent.rangemovement=false! Consistent range movement is
what fixes 2434, and 2434 is bad for consistency.
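
Rather than overriding the check, one option is to wait until no node is in the Up/Joining state before starting the next one. The Python sketch below shows one way that could be checked. It assumes the `nodetool status` layout in which each node line starts with a two-letter status/state code (such as UN or UJ) followed by the node address; the sample output is abbreviated and invented, and the helper names are hypothetical.

```python
def joining_nodes(status_output):
    """Return addresses of nodes whose status/state code is UJ (Up/Joining)."""
    joining = []
    for line in status_output.splitlines():
        fields = line.split()
        # Node lines start with the two-letter code, then the address.
        if len(fields) >= 2 and fields[0] == "UJ":
            joining.append(fields[1])
    return joining


def safe_to_bootstrap(status_output):
    """True when no node is currently joining, i.e. the next bootstrap can start."""
    return not joining_nodes(status_output)


# Invented, abbreviated sample in the assumed `nodetool status` layout.
SAMPLE = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address    Load       Tokens  Owns   Host ID   Rack
UN  10.0.0.1   120.5 GB   256     ?      aaaa      rack1
UJ  10.0.0.9   14.2 GB    256     ?      bbbb      rack1
"""

print(joining_nodes(SAMPLE))      # prints ['10.0.0.9']
print(safe_to_bootstrap(SAMPLE))  # prints False
```

In practice the text could come from something like `subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout`, assuming nodetool is on the PATH of the machine running the check.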

Instead, consider expanding clusters to their initial size while they are
empty, and disabling bootstrapping while doing so.

Lots and lots of background on :
https://issues.apache.org/jira/browse/CASSANDRA-2434

Related ticket : https://issues.apache.org/jira/browse/CASSANDRA-7069

=Rob
PS - BTW, the fact that 2434 existed for so long, in versions where repair
was often broken/unused, is the strongest single item of information in
support of the Coli Conjecture...