Re: Bootstrapping a new node to a running cluster

2012-03-16 Thread Mikael Wikblom

ok, thank you for your time!

Cheers


On 03/16/2012 10:12 AM, aaron morton wrote:

I think your original plan is sound.

1. Up the RF to 4.
2. Add the node with auto_bootstrap true
3. Once bootstrapping has finished the new node has all the data it needs.
4. Check for secondary index creation using describe in the CLI to see 
which are built. You can also see progress using nodetool compactionstats


I'm a bit puzzled though, I just tried to increase R to 3 in a 
cluster with N=2. It serves reads and writes without issues at CL.ONE. 
Is the described restriction something that will be implemented in 
the future?
I had a quick glance at the code. IIRC there was an explicit check if 
RF > N, but I cannot find it any more. I'm guessing we now rely on a 
normal UnavailableException if there are not enough UP nodes.


Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/03/2012, at 8:56 PM, Mikael Wikblom wrote:

ok, thank you both for the clarification. So the correct approach 
would be to bootstrap the new node and run repair on each of the 
nodes in the cluster.


I'm a bit puzzled though, I just tried to increase R to 3 in a 
cluster with N=2. It serves reads and writes without issues at CL.ONE. 
Is the described restriction something that will be implemented in 
the future?


Thank you
Regards




On 03/16/2012 03:07 AM, aaron morton wrote:

The documentation is correct.
I was mistakenly remembering discussions in the past about RF > #nodes.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com 

On 16/03/2012, at 4:34 AM, Doğan Çeçen wrote:

I'm not sure why this is not allowed. As long as I do not use CL.all there
will be enough nodes available to satisfy the read / write (at least when I
look at ReadCallback and the WriteResponseHandler). Or am I missing
something here?


According to 
http://www.datastax.com/docs/1.0/cluster_architecture/replication


"As a general rule, the replication factor should not exceed the
number of nodes in the cluster. However, it is possible to increase
replication factor, and then add the desired number of nodes
afterwards. When replication factor exceeds the number of nodes,
writes will be rejected, but reads will be served as long as the
desired consistency level can be met."

--
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments





--
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikb...@sitevision.se
http://www.sitevision.se








Re: Bootstrapping a new node to a running cluster

2012-03-16 Thread aaron morton
I think your original plan is sound. 

1. Up the RF to 4. 
2. Add the node with auto_bootstrap true
3. Once bootstrapping has finished the new node has all the data it needs. 
4. Check for secondary index creation using describe in the CLI to see which 
are built. You can also see progress using nodetool compactionstats

> I'm a bit puzzled though, I just tried to increase R to 3 in a cluster with 
> N=2. It serves reads and writes without issues at CL.ONE. Is the described 
> restriction something that will be implemented in the future?
I had a quick glance at the code. IIRC there was an explicit check if RF > N, 
but I cannot find it any more. I'm guessing we now rely on a normal 
UnavailableException if there are not enough UP nodes. 
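
A minimal sketch of the kind of availability check guessed at here, to show why RF=3 on a 2-node cluster can still serve CL.ONE. It is not taken from the Cassandra source: the names CL, required_replicas and is_available are hypothetical, and only ONE, QUORUM and ALL are modelled. The one assumption that matters is that a request is rejected when fewer replicas are alive than the consistency level has to wait for.

```python
from enum import Enum


class CL(Enum):
    """Consistency levels modelled in this sketch (hypothetical enum)."""
    ONE = "ONE"
    QUORUM = "QUORUM"
    ALL = "ALL"


def required_replicas(cl: CL, rf: int) -> int:
    """Number of replica responses the consistency level waits for."""
    if cl is CL.ONE:
        return 1
    if cl is CL.QUORUM:
        return rf // 2 + 1  # majority of the replication factor
    return rf  # CL.ALL waits for every replica


def is_available(cl: CL, rf: int, live_replicas: int) -> bool:
    """True if enough replicas are alive; otherwise the request would be
    rejected as unavailable before it is attempted."""
    return live_replicas >= required_replicas(cl, rf)


# RF=3 on a 2-node cluster: at most 2 replicas can ever be alive.
for cl in CL:
    print(cl.value, is_available(cl, rf=3, live_replicas=2))
```

Under these assumptions ONE and QUORUM are satisfiable with two live nodes and only ALL is not, which matches the observation that reads and writes at CL.ONE went through.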

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/03/2012, at 8:56 PM, Mikael Wikblom wrote:

> ok, thank you both for the clarification. So the correct approach would be to 
> bootstrap the new node and run repair on each of the nodes in the cluster.
> 
> I'm a bit puzzled though, I just tried to increase R to 3 in a cluster with 
> N=2. It serves reads and writes without issues at CL.ONE. Is the described 
> restriction something that will be implemented in the future?
> 
> Thank you
> Regards
> 
> 
> 
> 
> On 03/16/2012 03:07 AM, aaron morton wrote:
>> 
>> The documentation is correct. 
>> I was mistakenly remembering discussions in the past about RF > #nodes. 
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 16/03/2012, at 4:34 AM, Doğan Çeçen wrote:
>> 
>>>> I'm not sure why this is not allowed. As long as I do not use CL.all there
>>>> will be enough nodes available to satisfy the read / write (at least when I
>>>> look at ReadCallback and the WriteResponseHandler). Or am I missing
>>>> something here?
>>> 
>>> According to 
>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication
>>> 
>>> "As a general rule, the replication factor should not exceed the
>>> number of nodes in the cluster. However, it is possible to increase
>>> replication factor, and then add the desired number of nodes
>>> afterwards. When replication factor exceeds the number of nodes,
>>> writes will be rejected, but reads will be served as long as the
>>> desired consistency level can be met."
>>> 
>>> -- 
>>> ()  ascii ribbon campaign - against html e-mail
>>> /\  www.asciiribbon.org   - against proprietary attachments
>> 
> 
> 
> -- 
> Mikael Wikblom
> Software Architect
> SiteVision AB
> 019-217058
> mikael.wikb...@sitevision.se
> http://www.sitevision.se



Re: Bootstrapping a new node to a running cluster

2012-03-16 Thread Mikael Wikblom
ok, thank you both for the clarification. So the correct approach would 
be to bootstrap the new node and run repair on each of the nodes in the 
cluster.


I'm a bit puzzled though, I just tried to increase R to 3 in a cluster 
with N=2. It serves reads and writes without issues at CL.ONE. Is the 
described restriction something that will be implemented in the future?


Thank you
Regards




On 03/16/2012 03:07 AM, aaron morton wrote:

The documentation is correct.
I was mistakenly remembering discussions in the past about RF > #nodes.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/03/2012, at 4:34 AM, Doğan Çeçen wrote:

I'm not sure why this is not allowed. As long as I do not use CL.all there
will be enough nodes available to satisfy the read / write (at least when I
look at ReadCallback and the WriteResponseHandler). Or am I missing
something here?


According to 
http://www.datastax.com/docs/1.0/cluster_architecture/replication


"As a general rule, the replication factor should not exceed the
number of nodes in the cluster. However, it is possible to increase
replication factor, and then add the desired number of nodes
afterwards. When replication factor exceeds the number of nodes,
writes will be rejected, but reads will be served as long as the
desired consistency level can be met."

--
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments





--
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikb...@sitevision.se
http://www.sitevision.se



Re: Bootstrapping a new node to a running cluster

2012-03-15 Thread aaron morton
The documentation is correct. 
I was mistakenly remembering discussions in the past about RF > #nodes. 

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 16/03/2012, at 4:34 AM, Doğan Çeçen wrote:

>> I'm not sure why this is not allowed. As long as I do not use CL.all there
>> will be enough nodes available to satisfy the read / write (at least when I
>> look at ReadCallback and the WriteResponseHandler). Or am I missing
>> something here?
> 
> According to http://www.datastax.com/docs/1.0/cluster_architecture/replication
> 
> "As a general rule, the replication factor should not exceed the
> number of nodes in the cluster. However, it is possible to increase
> replication factor, and then add the desired number of nodes
> afterwards. When replication factor exceeds the number of nodes,
> writes will be rejected, but reads will be served as long as the
> desired consistency level can be met."
> 
> -- 
> ()  ascii ribbon campaign - against html e-mail
> /\  www.asciiribbon.org   - against proprietary attachments



Re: Bootstrapping a new node to a running cluster

2012-03-15 Thread Doğan Çeçen
> I'm not sure why this is not allowed. As long as I do not use CL.all there
> will be enough nodes available to satisfy the read / write (at least when I
> look at ReadCallback and the WriteResponseHandler). Or am I missing
> something here?

According to http://www.datastax.com/docs/1.0/cluster_architecture/replication

"As a general rule, the replication factor should not exceed the
number of nodes in the cluster. However, it is possible to increase
replication factor, and then add the desired number of nodes
afterwards. When replication factor exceeds the number of nodes,
writes will be rejected, but reads will be served as long as the
desired consistency level can be met."

-- 
()  ascii ribbon campaign - against html e-mail
/\  www.asciiribbon.org   - against proprietary attachments


Re: Bootstrapping a new node to a running cluster

2012-03-15 Thread Mikael Wikblom

Hi Aaron,

On 03/15/2012 10:52 AM, aaron morton wrote:

> > 1. a running cluster of N=3,  R=3
> > 2. upgrade R to 4
>
> You should not be allowed to set the RF higher than the number of nodes.

I'm not sure why this is not allowed. As long as I do not use CL.all 
there will be enough nodes available to satisfy the read / write (at 
least when I look at ReadCallback and the WriteResponseHandler). Or am I 
missing something here?

> I'm going to assume that clients on the web server only talk to the 
> local cassandra. So that when you add the new cassandra node it will 
> not have any clients until you serve pages off the node.

Yes that's the general idea; wait until all data is available to the 
local node before accepting requests (i.e. starting the application).


Thank you for your reply
Regards


0. Personally I would run a repair before doing this to ensure the 
data is fully distributed.

1. Optionally, increase the CL to QUORUM. See step 3.
2. Add the new node with auto_bootstrap off. It will join the ring, 
write requests will be sent to it (from other cassandra nodes), but it 
should not get any direct client reads. It will not stream data from 
other nodes.
3. It is now possible for a READ to be received at an old node where 
it is no longer a replica for the row. It has to send the request to 
another node. If it is sent to the new node (at CL ONE) the read will 
fail. If you are running at a high CL it will always involve the old 
nodes.

4. Update the RF to 4. Every node is now a replica for every key.
5. Roll back the CL change.
6. Repair the new node.
7. Turn on the clients for the new node.

Hope that helps.

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/03/2012, at 9:50 PM, Mikael Wikblom wrote:


Hi,

I'm using cassandra (1.0.8) embedded in the same jvm as a 
webapplication. All data is available on all nodes (R = N),  read / 
write CL.ONE.


Is it correct to assume in the following scenario that the newly 
added node has all data locally and that secondary indexes are fully 
created  after the bootstrap process finishes?


scenario:
1. a running cluster of N=3,  R=3
2. upgrade R to 4
3. bootstrap the new node

I would like to avoid having to do the required repair on each node 
if I upgrade R to 4 after bootstrapping the new node.


Thanks
Regards

--
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikb...@sitevision.se 
http://www.sitevision.se












Re: Bootstrapping a new node to a running cluster (setting RF > N)

2012-03-15 Thread Mateusz Korniak
On Thursday 15 of March 2012, aaron morton wrote:
> > 1. a running cluster of N=3,  R=3
> > 2. upgrade R to 4
> 
> You should not be allowed to set the RF higher than the number of nodes.

I wonder why that restriction exists.
It would be easier to increase RF first (similar to having the new node down), 
add the node (not responding to client reads), repair the new node, then turn 
on clients.

Would that be simpler than changing the CL?

> I'm going to assume that clients on the web server only talk to the local
> cassandra. So that when you add the new cassandra node it will not have
> any clients until you serve pages off the node.
> 
> 0. Personally I would run a repair before doing this to ensure the data is
> fully distributed.
> 1. Optionally, increase the CL to QUORUM. See step 3.
> 2. Add the new node with auto_bootstrap off. It will join the ring, write
> requests will be sent to it (from other cassandra nodes), but it should
> not get any direct client reads. It will not stream data from other nodes.
> 3. It is now possible for a READ to be received at an old node where it is
> no longer a replica for the row. It has to send the request to another
> node. If it is sent to the new node (at CL ONE) the read will fail. If you
> are running at a high CL it will always involve the old nodes.
> 4. Update the RF to 4. Every node is now a replica for every key.
> 5. Roll back the CL change.
> 6. Repair the new node.
> 7. Turn on the clients for the new node.

Regards,
-- 
Mateusz Korniak


Re: Bootstrapping a new node to a running cluster

2012-03-15 Thread aaron morton
> 1. a running cluster of N=3,  R=3
> 2. upgrade R to 4
You should not be allowed to set the RF higher than the number of nodes. 

I'm going to assume that clients on the web server only talk to the local 
cassandra. So that when you add the new cassandra node it will not have any 
clients until you serve pages off the node.

0. Personally I would run a repair before doing this to ensure the data is 
fully distributed. 
1. Optionally, increase the CL to QUORUM. See step 3. 
2. Add the new node with auto_bootstrap off. It will join the ring, write 
requests will be sent to it (from other cassandra nodes), but it should not get 
any direct client reads. It will not stream data from other nodes.
3. It is now possible for a READ to be received at an old node where it is no 
longer a replica for the row. It has to send the request to another node. If it 
is sent to the new node (at CL ONE) the read will fail. If you are running at a 
high CL it will always involve the old nodes.  
4. Update the RF to 4. Every node is now a replica for every key.
5. Roll back the CL change. 
6. Repair the new node.  
7. Turn on the clients for the new node.  
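
The reasoning behind steps 2 to 4 can be sketched with a toy ring. This is an illustration only, assuming one token per node and a simple "walk clockwise and take the next RF distinct nodes" placement; the node names A to D, the token values and the replicas function are all hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


def replicas(ring: Dict[int, str], key_token: int, rf: int) -> List[str]:
    """Walk the ring clockwise from the key's token and collect the first
    rf distinct nodes (capped at the number of nodes in the ring)."""
    tokens = sorted(ring)
    wanted = min(rf, len(set(ring.values())))
    start = bisect_left(tokens, key_token) % len(tokens)
    result: List[str] = []
    for i in range(len(tokens)):
        node = ring[tokens[(start + i) % len(tokens)]]
        if node not in result:
            result.append(node)
        if len(result) == wanted:
            break
    return result


old_ring = {0: "A", 100: "B", 200: "C"}   # N=3, hypothetical tokens
new_ring = {**old_ring, 50: "D"}          # D joined without streaming data

key = 25
print(replicas(old_ring, key, rf=3))  # every old node holds the row
print(replicas(new_ring, key, rf=3))  # A dropped out, empty node D is in
print(replicas(new_ring, key, rf=4))  # after step 4 all four nodes are replicas
```

With these made-up tokens, node A stops being a replica for the key once D joins at RF=3, so a read arriving at A has to be forwarded, and at CL ONE it may land on D, which has no data yet. Raising the RF to 4 makes every node a replica again, and the repair in step 6 then fills in D.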

Hope that helps. 

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 15/03/2012, at 9:50 PM, Mikael Wikblom wrote:

> Hi,
> 
> I'm using cassandra (1.0.8) embedded in the same jvm as a webapplication. All 
> data is available on all nodes (R = N),  read / write CL.ONE.
> 
> Is it correct to assume in the following scenario that the newly added node 
> has all data locally and that secondary indexes are fully created  after the 
> bootstrap process finishes?
> 
> scenario:
> 1. a running cluster of N=3,  R=3
> 2. upgrade R to 4
> 3. bootstrap the new node
> 
> I would like to avoid having to do the required repair on each node if I 
> upgrade R to 4 after bootstrapping the new node.
> 
> Thanks
> Regards
> 
> -- 
> Mikael Wikblom
> Software Architect
> SiteVision AB
> 019-217058
> mikael.wikb...@sitevision.se
> http://www.sitevision.se
> 
> 
> 
> 



Bootstrapping a new node to a running cluster

2012-03-15 Thread Mikael Wikblom

Hi,

I'm using Cassandra (1.0.8) embedded in the same JVM as a 
web application. All data is available on all nodes (R = N),  read / 
write CL.ONE.


Is it correct to assume in the following scenario that the newly added 
node has all data locally and that secondary indexes are fully created  
after the bootstrap process finishes?


scenario:
1. a running cluster of N=3,  R=3
2. upgrade R to 4
3. bootstrap the new node

I would like to avoid having to do the required repair on each node if I 
upgrade R to 4 after bootstrapping the new node.


Thanks
Regards

--
Mikael Wikblom
Software Architect
SiteVision AB
019-217058
mikael.wikb...@sitevision.se
http://www.sitevision.se