Re: Creating a copy of a C* cluster

2017-08-07 Thread kurt greaves
The most effective way to "divorce" it is to remove connectivity between
the datacentres. I would put firewall rules in place between the DCs to
stop them from communicating, and then do a rolling restart of one of the
DCs. You should be left with 2 datacentres that see each other as down. On
each one you can then remove the opposite DC from replication, and then
remove the other DC's nodes (via nodetool removenode). Before changing RF
or removing nodes, make completely sure that there is *no way* for the DCs
to communicate.
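
As a minimal sketch of the "remove the opposite DC from replication" step,
the helper below builds the ALTER KEYSPACE statement for one side of the
split. The keyspace name my_ks and the datacentre names dc_prod and dc_copy
are hypothetical, and the helper is only an illustration, not part of any
Cassandra tooling.

```python
def drop_dc_from_replication(keyspace, replication, dc_to_remove):
    """Build an ALTER KEYSPACE statement that omits dc_to_remove.

    `replication` is the keyspace's current replication map, e.g. as shown
    by DESCRIBE KEYSPACE. All names used here are hypothetical.
    """
    kept = {k: v for k, v in replication.items() if k != dc_to_remove}
    if set(kept) == {"class"}:
        # Nothing but the strategy class would remain.
        raise ValueError("refusing to remove the last datacentre")
    options = ", ".join(f"'{k}': '{v}'" for k, v in kept.items())
    return f"ALTER KEYSPACE {keyspace} WITH replication = {{{options}}};"


current = {"class": "NetworkTopologyStrategy", "dc_prod": "3", "dc_copy": "3"}
# Statement to run on the production side, which forgets about dc_copy.
print(drop_dc_from_replication("my_ks", current, "dc_copy"))
```

The statement would be run once per keyspace on each side, each side
dropping the other DC, and only after the firewall rules are confirmed to
be in place.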


Re: Cassandra upgrade from 2.2.8 to 3.10

2017-08-07 Thread Michael Shuler
Also, use 3.11.0 instead of 3.10: it has bug fixes on top of 3.10 and gets
long-term release support.

-- 
Michael

On 08/07/2017 05:09 PM, Jeff Jirsa wrote:
> Can't really stream cross-version. You need to add nodes and then upgrade
> them (or upgrade all the nodes, and then add new ones).
> 
> 
> On Mon, Aug 7, 2017 at 2:58 PM, ZAIDI, ASAD A wrote:
> 
> Hi folks, I have a question about an upgrade method I’m planning to
> execute.
> 
> I’m planning to upgrade from Apache Cassandra 2.2.8 to release 3.10.
> 
> My Cassandra cluster is configured as one rack with two datacenters:
> 
> 1.   DC1 has 4 nodes
> 2.   DC2 has 16 nodes
> 
> We’re adding another 12 nodes and will eventually need to remove
> those 4 nodes in DC1.
> 
> I’m thinking of adding a third datacenter, DC3, with 12 nodes that
> have Apache Cassandra 3.10 installed. Then I would start upgrading
> the seed nodes first in DC1 and DC2. Once all 20 nodes (DC1 plus DC2)
> are upgraded, I can safely remove the 4 DC1 nodes.
> 
> Can you please let me know if this approach would work? I’m
> concerned that having mixed versions across Cassandra nodes may
> cause issues, for example when streaming data/SSTables from the
> existing DCs to the newly created third DC with version 3.10
> installed. Will nodes in DC3 join the cluster with data without
> issues?
> 
> Thanks/Asad


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Cassandra upgrade from 2.2.8 to 3.10

2017-08-07 Thread Jeff Jirsa
Can't really stream cross-version. You need to add nodes and then upgrade
them (or upgrade all the nodes, and then add new ones).
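
A rough way to express that rule as a pre-flight check is sketched below.
It assumes, conservatively, that a node should only be added when every
existing node and the new node run exactly the same version; the function
name and the version lists are hypothetical, and in practice the versions
would come from something like nodetool version on each node.

```python
def safe_to_add_node(cluster_versions, new_node_version):
    """True only if all existing nodes and the new node share one version.

    Conservative assumption: any version mismatch blocks adding the node,
    since streaming across versions is what we want to avoid.
    """
    return set(cluster_versions) == {new_node_version}


existing = ["2.2.8"] * 20                    # DC1 (4 nodes) + DC2 (16 nodes)
print(safe_to_add_node(existing, "3.10"))    # False: new nodes would stream cross-version
print(safe_to_add_node(existing, "2.2.8"))   # True: add on 2.2.8, upgrade afterwards
```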


On Mon, Aug 7, 2017 at 2:58 PM, ZAIDI, ASAD A wrote:

> Hi folks, I have a question about an upgrade method I’m planning to
> execute.
>
> I’m planning to upgrade from Apache Cassandra 2.2.8 to release 3.10.
>
> My Cassandra cluster is configured as one rack with two datacenters:
>
> 1.   DC1 has 4 nodes
> 2.   DC2 has 16 nodes
>
> We’re adding another 12 nodes and will eventually need to remove those 4
> nodes in DC1.
>
> I’m thinking of adding a third datacenter, DC3, with 12 nodes that have
> Apache Cassandra 3.10 installed. Then I would start upgrading the seed
> nodes first in DC1 and DC2. Once all 20 nodes (DC1 plus DC2) are upgraded,
> I can safely remove the 4 DC1 nodes.
>
> Can you please let me know if this approach would work? I’m concerned
> that having mixed versions across Cassandra nodes may cause issues, for
> example when streaming data/SSTables from the existing DCs to the newly
> created third DC with version 3.10 installed. Will nodes in DC3 join the
> cluster with data without issues?
>
> Thanks/Asad


Cassandra upgrade from 2.2.8 to 3.10

2017-08-07 Thread ZAIDI, ASAD A
Hi folks, I have a question about an upgrade method I’m planning to execute.

I’m planning to upgrade from Apache Cassandra 2.2.8 to release 3.10.

My Cassandra cluster is configured as one rack with two datacenters:

1.   DC1 has 4 nodes
2.   DC2 has 16 nodes

We’re adding another 12 nodes and will eventually need to remove those 4 nodes
in DC1.

I’m thinking of adding a third datacenter, DC3, with 12 nodes that have
Apache Cassandra 3.10 installed. Then I would start upgrading the seed nodes
first in DC1 and DC2. Once all 20 nodes (DC1 plus DC2) are upgraded, I can
safely remove the 4 DC1 nodes.

Can you please let me know if this approach would work? I’m concerned that
having mixed versions across Cassandra nodes may cause issues, for example
when streaming data/SSTables from the existing DCs to the newly created third
DC with version 3.10 installed. Will nodes in DC3 join the cluster with data
without issues?

Thanks/Asad






Re: Creating a copy of a C* cluster

2017-08-07 Thread Ben Slater
For minimum disruption to your production cluster, restoring from backups
is probably the best option. However, there is no reason that adding a DC,
building it and then splitting it off shouldn’t work if done correctly.

Cheers
Ben

On Tue, 8 Aug 2017 at 07:11 Robert Wille  wrote:

> We need to make a copy of a cluster. We’re going to do some testing
> against the copy and then discard it. What’s the best way of doing that? I
> created another datacenter and then tried to divorce it from the
> original datacenter, but have had trouble doing so.
>
> Suggestions?
>
> Thanks in advance
>
> Robert


Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
The keyspace has:
WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': '3', 'us-east-productiondata': '3'} AND durable_writes = true;
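
For what it's worth, a replication map like this can be sanity-checked
mechanically. The sketch below is only an illustration: the function name is
made up, and it assumes the list of datacenter names is taken from what the
cluster itself reports (for example the "Datacenter:" headings in nodetool
status), since the names in the keyspace definition have to match those
exactly.

```python
def replication_problems(replication, datacenters):
    """Return a list of human-readable problems with a replication map."""
    problems = []
    strategy = replication.get("class")
    if strategy != "NetworkTopologyStrategy":
        problems.append(f"strategy is {strategy}, not NetworkTopologyStrategy")
    for dc in datacenters:
        if dc not in replication:
            problems.append(f"no replicas configured for datacenter {dc}")
    # Collect the distinct RFs of the datacenters that are configured.
    rfs = {replication[dc] for dc in datacenters if dc in replication}
    if len(rfs) > 1:
        problems.append("replication factor differs between datacenters")
    return problems


keyspace = {
    "class": "NetworkTopologyStrategy",
    "DC1": "3",
    "us-east-productiondata": "3",
}
print(replication_problems(keyspace, ["DC1", "us-east-productiondata"]))  # []
```

An empty list here only says the map is self-consistent; a datacenter name
that differs from what the snitch reports would show up as a "no replicas
configured" entry.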

From: Jeff Jirsa 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, August 7, 2017 at 2:51 PM
To: cassandra 
Subject: Re: Different data size between datacenters

And when you say the data size is smaller, you mean per node? Or sum of all 
nodes in the datacenter?

With 185 hosts in AWS vs 135 in your DC, I would expect your AWS hosts to have
roughly 30% less data per host than your DC hosts.

If instead your DC hosts have twice as much as the AWS hosts, it sounds like
it's balancing by # of tokens instead, which may be an indication that you're
somehow using SimpleStrategy, or your NetworkTopologyStrategy is somehow
misconfigured for one or more keyspaces.

Can you paste your keyspace replication strategy lines, anonymized as needed?


On Mon, Aug 7, 2017 at 1:46 PM, Chuck Reynolds wrote:
Yes to the NetworkTopologyStrategy.

From: Jeff Jirsa
Reply-To: "user@cassandra.apache.org"
Date: Monday, August 7, 2017 at 2:39 PM
To: cassandra
Subject: Re: Different data size between datacenters

You're using NetworkTopologyStrategy and not SimpleStrategy, correct?


On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds wrote:
I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks




Creating a copy of a C* cluster

2017-08-07 Thread Robert Wille
We need to make a copy of a cluster. We’re going to do some testing against the
copy and then discard it. What’s the best way of doing that? I created another
datacenter and then tried to divorce it from the original datacenter, but
have had trouble doing so.

Suggestions?

Thanks in advance

Robert




Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
Tombstones should eventually compact away in most cases, but if you've
recently changed topology (added nodes, removed nodes, etc.), you should run
"nodetool cleanup" to remove no-longer-owned data. Start by running it on
one instance at a time: it's a form of compaction and can impact disk space
and latencies.
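
A minimal sketch of the "one instance at a time" part, assuming nodetool is
on the PATH of the machine running the script and that JMX is reachable on
each host. The host addresses and function names are hypothetical.

```python
import subprocess


def cleanup_commands(hosts):
    """One `nodetool cleanup` invocation per host, in the order given."""
    return [["nodetool", "-h", host, "cleanup"] for host in hosts]


def run_sequentially(commands, runner=subprocess.run):
    """Run each command to completion before starting the next one.

    check=True makes subprocess.run raise on a non-zero exit, which stops
    the loop instead of moving on to the next node.
    """
    for cmd in commands:
        runner(cmd, check=True)


# Example (not executed here):
# run_sequentially(cleanup_commands(["10.0.0.1", "10.0.0.2"]))
```

Stopping on the first failure is a deliberate choice: it keeps at most one
node doing cleanup at any time and leaves room to look at disk space and
latencies before continuing.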


On Mon, Aug 7, 2017 at 2:04 PM, Chuck Reynolds 
wrote:

> Yes it’s the total size.
>
>
>
> Could it be that tombstones, or data that nodes no longer own, are not being
> copied/streamed to the datacenter in AWS?
>
>
>
> *From: *Jeff Jirsa 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, August 7, 2017 at 2:51 PM
> *To: *cassandra 
> *Subject: *Re: Different data size between datacenters
>
>
>
> And when you say the data size is smaller, you mean per node? Or sum of
> all nodes in the datacenter?
>
>
>
> With 185 hosts in AWS vs 135 in your DC, I would expect your AWS hosts to
> have roughly 30% less data per host than your DC hosts.
>
>
>
> If instead your DC hosts have twice as much as the AWS hosts, it sounds
> like it's balancing by # of tokens instead, which may be an indication that
> you're somehow using SimpleStrategy, or your NetworkTopologyStrategy is
> somehow misconfigured for one or more keyspaces.
>
>
>
> Can you paste your keyspace replication strategy lines, anonymized as
> needed?
>
>
>
>
>
> On Mon, Aug 7, 2017 at 1:46 PM, Chuck Reynolds 
> wrote:
>
> Yes to the NetworkTopologyStrategy.
>
>
>
> *From: *Jeff Jirsa 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, August 7, 2017 at 2:39 PM
> *To: *cassandra 
> *Subject: *Re: Different data size between datacenters
>
>
>
> You're using NetworkTopologyStrategy and not SimpleStrategy, correct?
>
>
>
>
>
> On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds 
> wrote:
>
> I have a cluster that spans two datacenters running Cassandra 2.1.12.  135
> nodes in my data center and about 185 in AWS.
>
>
>
> The size of the second data center (AWS) is quite a bit smaller.
> Replication is the same in both datacenters.  Is there a logical
> explanation for this?
>
>
>
> thanks


Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
Yes it’s the total size.

Could it be that tombstones, or data that nodes no longer own, are not being
copied/streamed to the datacenter in AWS?

From: Jeff Jirsa 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, August 7, 2017 at 2:51 PM
To: cassandra 
Subject: Re: Different data size between datacenters

And when you say the data size is smaller, you mean per node? Or sum of all 
nodes in the datacenter?

With 185 hosts in AWS vs 135 in your DC, I would expect your AWS hosts to have
roughly 30% less data per host than your DC hosts.

If instead your DC hosts have twice as much as the AWS hosts, it sounds like
it's balancing by # of tokens instead, which may be an indication that you're
somehow using SimpleStrategy, or your NetworkTopologyStrategy is somehow
misconfigured for one or more keyspaces.

Can you paste your keyspace replication strategy lines, anonymized as needed?


On Mon, Aug 7, 2017 at 1:46 PM, Chuck Reynolds wrote:
Yes to the NetworkTopologyStrategy.

From: Jeff Jirsa
Reply-To: "user@cassandra.apache.org"
Date: Monday, August 7, 2017 at 2:39 PM
To: cassandra
Subject: Re: Different data size between datacenters

You're using NetworkTopologyStrategy and not SimpleStrategy, correct?


On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds wrote:
I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks




Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
And when you say the data size is smaller, you mean per node? Or sum of all
nodes in the datacenter?

With 185 hosts in AWS vs 135 in your DC, I would expect your AWS hosts to
have roughly 30% less data per host than your DC hosts.

If instead your DC hosts have twice as much as the AWS hosts, it sounds
like it's balancing by # of tokens instead, which may be an indication that
you're somehow using SimpleStrategy, or your NetworkTopologyStrategy is
somehow misconfigured for one or more keyspaces.

Can you paste your keyspace replication strategy lines, anonymized as
needed?
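
The two cases can be put into numbers with a quick back-of-the-envelope
calculation. This is a sketch under stated assumptions: with
NetworkTopologyStrategy and the same RF in both datacenters, each datacenter
holds the same total, so per-host load scales with 1/hosts; in the
"balancing by tokens" case, per-host load is taken to be proportional to the
number of tokens per host. The 256 and 128 token counts are the values
reported elsewhere in this thread; the function name is made up.

```python
def relative_per_host_load(hosts_by_dc):
    """Per-host load when every DC stores the same total (equal RF, NTS)."""
    return {dc: 1.0 / hosts for dc, hosts in hosts_by_dc.items()}


hosts = {"DC1": 135, "AWS": 185}
load = relative_per_host_load(hosts)
# AWS per-host load relative to DC1 per-host load: 135 / 185.
print(round(load["AWS"] / load["DC1"], 2))   # 0.73, i.e. about 27% less per AWS host

# If ownership simply followed token counts, per-host load would track
# tokens per host instead of host counts.
tokens_per_host = {"DC1": 256, "AWS": 128}
print(tokens_per_host["DC1"] / tokens_per_host["AWS"])   # 2.0
```

A per-host ratio near 0.73 is what a healthy NetworkTopologyStrategy setup
would give; a ratio near 2x in favour of the 256-token hosts points at the
token-weighted case.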


On Mon, Aug 7, 2017 at 1:46 PM, Chuck Reynolds 
wrote:

> Yes to the NetworkTopologyStrategy.
>
>
>
> *From: *Jeff Jirsa 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Monday, August 7, 2017 at 2:39 PM
> *To: *cassandra 
> *Subject: *Re: Different data size between datacenters
>
>
>
> You're using NetworkTopologyStrategy and not SimpleStrategy, correct?
>
>
>
>
>
> On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds 
> wrote:
>
> I have a cluster that spans two datacenters running Cassandra 2.1.12.  135
> nodes in my data center and about 185 in AWS.
>
>
>
> The size of the second data center (AWS) is quite a bit smaller.
> Replication is the same in both datacenters.  Is there a logical
> explanation for this?
>
>
>
> thanks


Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
So we have the default 256 tokens per node in our datacenter and 128 in AWS.

From: "ZAIDI, ASAD A" 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, August 7, 2017 at 1:36 PM
To: "user@cassandra.apache.org" 
Subject: RE: Different data size between datacenters

Are you using the same number of tokens/vnodes in both datacenters?

From: Chuck Reynolds [mailto:creyno...@ancestry.com]
Sent: Monday, August 07, 2017 1:51 PM
To: user@cassandra.apache.org
Subject: Different data size between datacenters

I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks


Re: Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
Yes to the NetworkTopologyStrategy.

From: Jeff Jirsa 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, August 7, 2017 at 2:39 PM
To: cassandra 
Subject: Re: Different data size between datacenters

You're using NetworkTopologyStrategy and not SimpleStrategy, correct?


On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds wrote:
I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks



Re: Different data size between datacenters

2017-08-07 Thread Jeff Jirsa
You're using NetworkTopologyStrategy and not SimpleStrategy, correct?


On Mon, Aug 7, 2017 at 11:50 AM, Chuck Reynolds 
wrote:

> I have a cluster that spans two datacenters running Cassandra 2.1.12.  135
> nodes in my data center and about 185 in AWS.
>
>
>
> The size of the second data center (AWS) is quite a bit smaller.
> Replication is the same in both datacenters.  Is there a logical
> explanation for this?
>
>
>
> thanks
>


RE: Different data size between datacenters

2017-08-07 Thread ZAIDI, ASAD A
Are you using the same number of tokens/vnodes in both datacenters?

From: Chuck Reynolds [mailto:creyno...@ancestry.com]
Sent: Monday, August 07, 2017 1:51 PM
To: user@cassandra.apache.org
Subject: Different data size between datacenters

I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks


Different data size between datacenters

2017-08-07 Thread Chuck Reynolds
I have a cluster that spans two datacenters running Cassandra 2.1.12.  135 
nodes in my data center and about 185 in AWS.

The size of the second data center (AWS) is quite a bit smaller.  Replication 
is the same in both datacenters.  Is there a logical explanation for this?

thanks