Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread daemeon reiydelle
cqlsh queries the cluster as a whole, not just the node it is connected to.

“All men dream, but not equally. Those who dream by night in the dusty
recesses of their minds wake up in the day to find it was vanity, but the
dreamers of the day are dangerous men, for they may act their dreams with
open eyes, to make it possible.” — T.E. Lawrence

sent from my mobile
Daemeon Reiydelle
skype daemeon.c.m.reiydelle
USA 415.501.0198



Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread Arvydas Jonusonis
Run *nodetool cleanup* on the node(s) in the *4.4.4.5* DC. Changing the
keyspace's replication settings does not *remove* existing data; that is a
manual task.

It should, however, prevent new writes from replicating to the undesired DC.

Also make sure the client driver's load balancing policy is set to
DCAwareRoundRobinPolicy, with the *4.4.4.4* DC set as the *local* DC.

Arvydas



Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread daemeon reiydelle
May I ask whether your configuration is actually datacenter aware? Do you
understand the difference between LQ (LOCAL_QUORUM) and replication?







Re: Replication issue with Multi DC setup in cassandra

2017-05-24 Thread Igor Leão
Did you run `nodetool repair` after changing the keyspace? (not sure if it
makes sense though)

2017-05-16 19:52 GMT-03:00 Nitan Kainth:

> Strange. Maybe somebody else can share something more useful.
>


-- 
Igor Leão  Site Reliability Engineer

Mobile: +55 81 99727-1083 
Skype: *igorvpcleao*
Office: +55 81 4042-9757 
Website: inlocomedia.com 



Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
Yes, I see them in the datacenter's data directories. In fact, I see them
even after I bring down the interface between the 2 DCs, which further
confirms that a local copy is maintained in the DC that was not configured
in the replication strategy.
It is quite important that we block the data for this keyspace from
replicating :( Not sure why this does not work.

Thanks
Suraj



Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread Nitan Kainth
Check for data files on the filesystem in both DCs.




Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
So I thought the same.
I see the data via cqlsh in both datacenters. Consistency is set to
LQ (LOCAL_QUORUM).

thanks
-Suraj



-- 
Suraj Pasuparthy

cisco systems
Software Engineer
San Jose CA


Re: Replication issue with Multi DC setup in cassandra

2017-05-16 Thread Nitan Kainth
Do you see data in the other DC, or just the directory structure? The
directory structure would be created because it is DDL, but inserts
ideally shouldn't populate it.




Replication issue with Multi DC setup in cassandra

2017-05-16 Thread suraj pasuparthy
Hello,
I am trying to find a way to PREVENT just one of my keyspaces from syncing
to the other datacenter.

I have 2 datacenters setup this way :

Datacenter: DC:4.4.4.4
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
UN  4.4.4.4  189.42 KiB  32      100.0%            939f5965-f9d5-4673-b1a4-29fa5ecae0f9  rack1

Datacenter: DC:4.4.4.5
======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load        Tokens  Owns (effective)  Host ID                               Rack
UN  4.4.4.5  218.29 KiB  32      100.0%            c0d1d859-7ae9-4ce9-a50f-ea316963dbb1  rack1

All my keyspaces have 1 copy in each DC, and it works like a charm.

However, I have ONE keyspace that I do not want to sync, and I define the
keyspace this way:

CREATE KEYSPACE nosync WITH replication = {'class':
'NetworkTopologyStrategy', 'DC:4.4.4.4': '1', 'DC:4.4.4.5': '0'} AND
durable_writes = true;

I still see that keyspace show up in DC:4.4.4.5. I even tried:

CREATE KEYSPACE nosync WITH replication = {'class':
'NetworkTopologyStrategy', 'DC:4.4.4.4': '1'} AND durable_writes = true;

Same issue: I still see the keyspace show up in DC:4.4.4.5.

Could anyone help me figure this out?

Cheers

-Suraj
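
The two definitions in this message differ only in whether DC:4.4.4.5 is
listed with zero replicas. Below is a minimal Python sketch of building such
a statement from per-datacenter replica counts. The helper names are
hypothetical, and dropping zero-replica datacenters from the map is an
assumption made here to mirror the second form. As a reply in this thread
notes, the keyspace definition itself is DDL and is visible from every
datacenter; the map only describes where rows are meant to be stored.

```python
def datacenters_holding_data(replicas_per_dc):
    """Return the datacenters that are assigned at least one replica."""
    return sorted(dc for dc, rf in replicas_per_dc.items() if rf > 0)


def create_keyspace_cql(name, replicas_per_dc, durable_writes=True):
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy.

    Assumption: datacenters with zero replicas are simply left out of the map.
    """
    placed = datacenters_holding_data(replicas_per_dc)
    if not placed:
        raise ValueError("at least one datacenter needs a replica")
    options = ["'class': 'NetworkTopologyStrategy'"]
    options += ["'{}': '{}'".format(dc, replicas_per_dc[dc]) for dc in placed]
    return "CREATE KEYSPACE {} WITH replication = {{{}}} AND durable_writes = {};".format(
        name, ", ".join(options), str(durable_writes).lower()
    )


# Replica counts taken from the message above.
nosync = {"DC:4.4.4.4": 1, "DC:4.4.4.5": 0}
print(create_keyspace_cql("nosync", nosync))
print(datacenters_holding_data(nosync))
```

Printing the list of datacenters that hold data makes the intent explicit
before the statement is run against a cluster.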


question on multi DC setup and LWT's

2017-01-23 Thread Kant Kodali
Hi guys,

Let's say I have 2 DCs, with a 3-node cluster in each DC and one replica
in each DC. I would like to maintain strong consistency and high
availability, so:

1) First of all, how do I even set up one replica in each DC?
2) What should my read and write consistency levels be when I am using LWT?
3) What is the difference between QUORUM and SERIAL when using LWT for
both reads and writes?

Thanks!


Re: Multi DC setup question

2016-06-30 Thread Jens Rantil
I'm AFK, but you might be able to query the system.peers table to see which
nodes are up.

Cheers,
Jens


Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.
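
A rough Python sketch of the routing rule asked about in this thread. It
assumes the number of live nodes in the primary DC is already known (for
example from monitoring, or derived along the lines of the system.peers
suggestion above); how to obtain it is not shown. The function names and the
labels "primary" and "secondary" are hypothetical.

```python
def quorum(replication_factor):
    """Replicas that must respond for QUORUM or LOCAL_QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def choose_datacenter(nodes_up_in_primary, min_nodes_up=5,
                      primary="primary", fallback="secondary"):
    """Pick the DC the application should use, based on a live-node threshold."""
    if nodes_up_in_primary >= min_nodes_up:
        return primary
    return fallback


print(quorum(3))             # replicas needed per partition with RF 3
print(choose_datacenter(5))  # threshold met: stay on the primary DC
print(choose_datacenter(4))  # threshold missed: send traffic to the other DC
```

With RF 3, LOCAL_QUORUM needs 2 of the 3 replicas of a partition, so one node
down leaves every partition readable, while two nodes down may leave some
partitions short of quorum, depending on token placement.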


Multi DC setup question

2016-06-27 Thread Charulata Sharma (charshar)
Hi All,

We are setting up another data center and have the following question:

6 nodes in each DC of the Cassandra cluster.

All keyspaces have an RF of 3.

Our scenario is:

App nodes connect to the Cassandra cluster using LOCAL_QUORUM consistency.

We want to ensure that if 5 of the 6 nodes are available, the application
uses the primary DC; otherwise the application URL should be directed to
the other DC.

What is the best option to achieve this?

Thanks,
Charu







Re: Multi DC setup for analytics

2016-04-01 Thread Laszlo Jobs
Anishek,

AFAIK you cannot have clusters "overlap" each other.

Just an idea: Try to address it as an sstable restore.
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_snapshot_restore_new_cluster.html

What I would try to do (not tested!):

- create a logical DC in each cluster (CLUSTER_1 and CLUSTER_2), with a
limited number of nodes so you do not need to back up a lot of nodes. Let's
call it DC_AR (for Analytics Replica)

- alter replication factor of keyspaces and tables in CLUSTER_1 and
CLUSTER_2 clusters to use their own DC_AR DC, store at least 1 replica
there too (this is a minimal change to CLUSTER_1 and CLUSTER_2)

- when restoring to CLUSTER_3 create snapshots in both DC_AR on each
cluster CLUSTER_1 and CLUSTER_2

- follow the restore procedure described in the link above and restore
sstables to the analytics cluster CLUSTER_3

- create a logical DC (DA_A) in each Cluster, CLUSTER_1 and CLUSTER_2, with
one or more nodes according to your

- if you need more power on the CLUSTER_3 then you can add more nodes after
the restore and repair (could be time consuming)

You might tune the process above, as this is just a high-level idea.
You need to consider the following things, among others (I am sure the list
below is not complete):
- maintain the schema in the CLUSTER_3 keyspaces whenever they change on
CLUSTER_1 or CLUSTER_2
- you cannot use the same keyspace names on CLUSTER_1 and CLUSTER_2
- replication factor for the DC_AR DCs in both clusters, CLUSTER_1 and
CLUSTER_2
- what consistency level you use in your application: QUORUM might hurt you,
but LOCAL_QUORUM could be OK.
- ensure that clients are not connecting to DC_AR nodes (not a hard
requirement)
If this works, then you do not have to rebuild the clusters you have today
(CLUSTER_1 and CLUSTER_2).

P.S. I am relatively new to Cassandra and using only 3.x versions (using =
playing and learning).

Regards,

Laszlo
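
One of the considerations listed in this message is that CLUSTER_1 and
CLUSTER_2 cannot use the same keyspace names if both are restored into
CLUSTER_3. Below is a small Python sketch of checking for such collisions
before a restore. The keyspace names are made up for illustration, and
skipping names that start with "system" is an assumption, not something the
message states.

```python
def colliding_keyspaces(cluster_1_keyspaces, cluster_2_keyspaces):
    """Return user keyspace names that exist in both clusters, sorted."""

    def user_keyspaces(names):
        # Assumption: keyspaces whose names start with "system" are internal
        # and are not part of the restore.
        return {name for name in names if not name.startswith("system")}

    return sorted(user_keyspaces(cluster_1_keyspaces) & user_keyspaces(cluster_2_keyspaces))


# Hypothetical keyspace listings for the two source clusters.
cluster_1 = ["system", "orders", "users"]
cluster_2 = ["system", "users", "events"]
print(colliding_keyspaces(cluster_1, cluster_2))
```

An empty result means the two sets of keyspaces could share one target
cluster as far as naming is concerned; anything listed would need to be
renamed or handled separately.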




Re: Multi DC setup for analytics

2016-03-31 Thread Anishek Agarwal
Hey Bryan,

Thanks for the info; we inferred as much. The only other thing we were
trying was to start two separate instances in the analytics cluster on the
same set of machines, each talking to its respective DC. We dropped that
within 2 minutes, because we would have to change ports on at least one of
the existing DCs so that they are on the same port when they join the
analytics cluster.

For now we are just getting another set of machines for this.


I had known about the pattern of using a separate analytics cluster for
Cassandra but thought we could join them across two clusters. My bad; now
that I think of it, it would have been better to have just one DC for
realtime prod requests instead of two.

Are there ways of merging existing clusters into one cluster in Cassandra?




Re: Multi DC setup for analytics

2016-03-31 Thread Bryan Cheng
I'm jumping into this thread late, so sorry if this has been covered
before. But am I correct in reading that you have two different Cassandra
rings, not talking to each other at all, and you want to have a shared DC
with a third Cassandra ring?

I'm not sure what you want to do is possible.

If I had the luxury of starting from scratch, the design I would do is:
All three DC's in one cluster, with 3 datacenters. DC3 is the analytics DC.
DC1's keyspaces are replicated to DC1 and DC3 only.
DC2's keyspaces are replicated to DC2 and DC3 only.

Then you have DC3 with all data from both DC1 and DC2 to run analytics on,
and no cross-talk between DC1 and DC2.

If you cannot rebuild your existing clusters, you may want to consider
using something like Spark to ETL your data out of DC1 and DC2 into a new
cluster at DC3. At that point you're running a data warehouse and lose some
of the advantages of seamless cluster membership.
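
The single-cluster layout described in this message can be written down as a
per-keyspace replication plan and checked mechanically. This is a Python
sketch only: the keyspace names and replica counts are hypothetical (the
message does not give any), and check_layout is an illustrative helper, not
part of any Cassandra tooling.

```python
def datacenters_for(replication):
    """Datacenters that hold at least one replica of a keyspace."""
    return {dc for dc, rf in replication.items() if rf > 0}


def check_layout(plan, analytics_dc="DC3"):
    """Return a list of problems; an empty list means the plan matches the design."""
    problems = []
    for keyspace, replication in sorted(plan.items()):
        dcs = datacenters_for(replication)
        if analytics_dc not in dcs:
            problems.append(f"{keyspace}: not replicated to {analytics_dc}")
        if len(dcs - {analytics_dc}) > 1:
            problems.append(f"{keyspace}: spans more than one production DC")
    return problems


# Hypothetical plan: each production DC's keyspaces go to that DC and DC3 only.
plan = {
    "ks_from_dc1": {"DC1": 3, "DC3": 3},
    "ks_from_dc2": {"DC2": 3, "DC3": 3},
}
print(check_layout(plan))
print(check_layout({"bad_ks": {"DC1": 1, "DC2": 1}}))
```

The second call shows the two ways a keyspace can violate the design: it is
missing from the analytics DC, or it creates cross-talk between DC1 and DC2.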



Re: Multi DC setup for analytics

2016-03-30 Thread Anishek Agarwal
Hey Guys,

We made the necessary changes and were trying to get this back on track, but
hit another wall.

We have two clusters in different DCs (DC1 and DC2) with cluster names
CLUSTER_1 and CLUSTER_2.

We want to have a common analytics cluster in DC3 with cluster name
CLUSTER_3. It looks like this can't be done, so do we have to set up two
different analytics clusters? Can't we just get data from CLUSTER_1 and
CLUSTER_2 into the same cluster, CLUSTER_3?

thanks
anishek



Re: Multi DC setup for analytics

2016-03-21 Thread Anishek Agarwal
Hey Clint,

We have two separate rings which don't talk to each other, but both have
the same DC name, "DCX".

@Raja,

We had already gone towards the path you suggested.

thanks all
anishek



Re: Multi DC setup for analytics

2016-03-20 Thread Clint Martin
When you say you have two logical DCs, both with the same name, are you
saying that you have two clusters of servers, both with the same DC name,
neither of which currently talks to the other? I.e., they are two separate
rings?

Or do you mean that you have two keyspaces in one cluster?

Or?

Clint
On Mar 14, 2016 2:11 AM, "Anishek Agarwal"  wrote:

> Hello,
>
> We are using Cassandra 2.0.17 and have two logical DCs with different
> keyspaces, but both have the same logical name, DC1.
>
> We want to set up another Cassandra cluster for analytics which should get
> data from both of the above DCs.
>
> If we set up the new DC with the name DC2 and follow the steps at
> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
> will it work?
>
> I would think we would first have to change the names of the existing
> clusters to different names and then go on to add another DC getting data
> from these?
>
> Also, as soon as we add the node, data starts moving... this will only be
> the real-time changes made to the cluster, right? We still have to do the
> rebuild to get the data for the tokens of the nodes in the new cluster?
>
> Thanks
> Anishek
>


Re: Multi DC setup for analytics

2016-03-19 Thread Reddy Raja
Yes. Here are the steps.
You will have to change the DC names first.
DC1 and DC2 would be independent clusters.

Create a new DC, DC3, and include these two DCs in DC3.

This should work well.
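
A minimal sketch of what adding the new DC might look like once the DC names are distinct, assuming the keyspace already uses NetworkTopologyStrategy. The keyspace name, DC names and replication factors below are hypothetical; the helper only builds the ALTER KEYSPACE statement and the nodetool rebuild command that you would then run on each node of the new DC.

```python
def add_dc_statements(keyspace, replication, new_dc, new_rf, source_dc):
    """Build the CQL and nodetool command for extending a keyspace to a new DC.

    replication maps existing DC names to replication factors (hypothetical
    values; nothing here talks to a real cluster).
    """
    if new_dc in replication:
        raise ValueError(f"{new_dc} is already in the replication map")
    if source_dc not in replication:
        raise ValueError(f"{source_dc} holds no replicas to stream from")

    merged = dict(replication)
    merged[new_dc] = new_rf
    options = ", ".join(f"'{dc}': {rf}" for dc, rf in sorted(merged.items()))

    # Step 1: tell the keyspace to keep replicas in the new DC as well.
    cql = (
        f"ALTER KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', {options}}};"
    )
    # Step 2: on each new node, stream the existing data from a source DC.
    rebuild = f"nodetool rebuild -- {source_dc}"
    return cql, rebuild


# Hypothetical example: keyspace "analytics_ks" living in DC1 and DC2.
cql, rebuild = add_dc_statements(
    "analytics_ks", {"DC1": 3, "DC2": 3}, "DC3", 2, "DC1"
)
print(cql)
print(rebuild)
```

New writes reach the new DC as soon as the replication map includes it; the rebuild is what brings over the data that was written before.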


On Thu, Mar 17, 2016 at 11:03 PM, Clint Martin <
clintlmar...@coolfiretechnologies.com> wrote:

> When you say you have two logical DC both with the same name are you
> saying that you have two clusters of servers both with the same DC name,
> nether of which currently talk to each other? IE they are two separate
> rings?
>
> Or do you mean that you have two keyspaces in one cluster?
>
> Or?
>
> Clint
> On Mar 14, 2016 2:11 AM, "Anishek Agarwal"  wrote:
>
>> Hello,
>>
>> We are using cassandra 2.0.17 and have two logical DC having different
>> Keyspaces but both having same logical name DC1.
>>
>> we want to setup another cassandra cluster for analytics which should get
>> data from both the above DC.
>>
>> if we setup the new DC with name DC2 and follow the steps
>> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
>> will it work ?
>>
>> I would think we would have to first change the names of existing
>> clusters to have to different names and then go with adding another dc
>> getting data from these?
>>
>> Also as soon as we add the node the data starts moving... this will all
>> be only real time changes done to the cluster right ? we still have to do
>> the rebuild to get the data for tokens for node in new cluster ?
>>
>> Thanks
>> Anishek
>>
>


-- 
"In this world, you either have an excuse or a story. I preferred to have a
story"


Multi DC setup for analytics

2016-03-14 Thread Anishek Agarwal
Hello,

We are using Cassandra 2.0.17 and have two logical DCs with different
keyspaces, but both have the same logical name, DC1.

We want to set up another Cassandra cluster for analytics, which should get
data from both of the above DCs.

If we set up the new DC with the name DC2 and follow the steps at
https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_add_dc_to_cluster_t.html
will it work?

I would think we would first have to change the names of the existing clusters
so that they have two different names, and then add another DC that gets data
from them?

Also, as soon as we add the node the data starts moving... this will only be
the real-time changes made to the cluster, right? We would still have to run
the rebuild to get the data for the tokens owned by the nodes in the new
cluster?

Thanks
Anishek


Issues during Multi-DC setup across AWS regions + VPC setup

2014-09-15 Thread Dinesh Narayanan
We are trying to add a new data center in us-east. Servers in each DC are
running inside a VPC. We currently have a cluster in us-west and all servers
are running 2.0.7. The two DCs talk to each other via VPN. listen_address and
broadcast_address are set to private IPs. Our endpoint_snitch is
GossipingPropertyFileSnitch. All the required ports are open across the VPCs
and we can telnet from one DC to the other.

We see this exception when we add a new node in the new DC (us-east):

INFO [HANDSHAKE-/10.0.51.81] 2014-09-15 16:38:26,896 OutboundTcpConnection.java (line 386) Handshaking version with /10.0.51.81
DEBUG [WRITE-/10.0.52.81] 2014-09-15 16:38:29,038 OutboundTcpConnection.java (line 333) Target max version is -2147483648; no version information yet, will retry
TRACE [HANDSHAKE-/10.0.52.81] 2014-09-15 16:38:29,038 OutboundTcpConnection.java (line 393) Cannot handshake version with /10.0.52.81
java.nio.channels.AsynchronousCloseException
    at java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:205)
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:412)
    at sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:203)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
    at java.io.InputStream.read(InputStream.java:101)
    at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:81)
    at java.io.DataInputStream.readInt(DataInputStream.java:387)
    at org.apache.cassandra.net.OutboundTcpConnection$1.run(OutboundTcpConnection.java:387)

What are we missing here?


Thanks
Dinesh


Re: Monitoring replication lag/latency in multi DC setup

2012-09-06 Thread aaron morton
 Is there a specific metric you can recommend?


The not entirely correct but very lightweight approach would be to look at the
size of the HintsColumnFamily in the system keyspace.

If you want an exact number, use the functions on the hinted handoff MBean:
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/HintedHandOffManagerMBean.java

Note that getting the number of hints involves actually counting them, so
that can take a while.

Cheers


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/09/2012, at 5:33 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Is there a specific metric you can recommend?
 
 VR
 
 On Wed, Sep 5, 2012 at 9:19 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
 Cassandra exposes lot of metrics through Jconsole. You might be able to get 
 some information from Jconsole.
 
 
 On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama venkata.s.r...@gmail.com wrote:
 Thanks for the quick reply, Mohit.Can we measure/monitor the size of 
 Hinted Handoffs?  Would it be a good enough indicator of my back log?  
 
 Although we know when a network is flaky, we are interested in knowing how 
 much data is piling up in local DC that needs to be transferred.  
 
 Greatly appreciate your help.
 
 VR
 
 
 On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:
 As far as I know Cassandra doesn't use internal queueing mechanism specific 
 to replication. Cassandra sends the write the remote DC and after that it's 
 upto the tcp/ip stack to deal with buffering. If requests starts to timeout 
 Cassandra would use HH upto certain time. For longer outage you would have to 
 run repair.
  
 Also look at tcp/ip tuning parameters that are helpful with your scenario:
  
 http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html
  
 Run iperf and test the latency.
 
 On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:
 Hi,
 
 We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM for 
 writes and reads.  The network we have seen between the DC is sometimes flaky 
 lasting few minutes to few 10 of minutes.
 
 I wanted to know what is the best way to measure/monitor either the lag or 
 replication latency between the data centers.  Are there any metrics I can 
 monitor to find the backlog of data that needs to be transferred?
 
 Thanks in advance.
 
 VR
 
 
 
 



Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Hi,

We have a multi-DC Cassandra ring with 2 DCs set up. We use LOCAL_QUORUM for
writes and reads. We have seen that the network between the DCs is sometimes
flaky, with outages lasting from a few minutes to a few tens of minutes.

I wanted to know the best way to measure/monitor either the lag or the
replication latency between the data centers. Are there any metrics I can
monitor to find the backlog of data that needs to be transferred?

Thanks in advance.

VR


Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
As far as I know, Cassandra doesn't use an internal queueing mechanism
specific to replication. Cassandra sends the write to the remote DC, and after
that it's up to the TCP/IP stack to deal with buffering. If requests start to
time out, Cassandra will use hinted handoff (HH) for up to a certain time. For
a longer outage you would have to run repair.
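
A rough sketch of that rule of thumb, assuming the hint window is the max_hint_window_in_ms setting from cassandra.yaml. The 3 hour default used below is an assumption and differs between versions, so check your own configuration. Hints are best effort, so treat "hinted handoff" as "probably covered", not as a guarantee.

```python
from datetime import timedelta


def recovery_path(outage, hint_window=timedelta(hours=3)):
    """Classify how a remote DC is expected to catch up after an outage.

    hint_window stands in for max_hint_window_in_ms; the default here is an
    assumed value, not something read from a real cluster.
    """
    if outage <= timedelta(0):
        return "none"
    if outage <= hint_window:
        # Coordinators were still storing hints for the unreachable replicas.
        return "hinted handoff"
    # Hints stopped being written after the window closed, so some writes
    # were never queued anywhere and only a repair can restore them.
    return "repair"


# The flaky-link case from this thread: outages of minutes to tens of minutes.
print(recovery_path(timedelta(minutes=20)))
print(recovery_path(timedelta(hours=5)))
```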

Also look at tcp/ip tuning parameters that are helpful with your scenario:

http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

Run iperf and test the latency.

On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Hi,

 We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
 for writes and reads.  The network we have seen between the DC is sometimes
 flaky lasting few minutes to few 10 of minutes.

 I wanted to know what is the best way to measure/monitor either the lag or
 replication latency between the data centers.  Are there any metrics I can
 monitor to find the backlog of data that needs to be transferred?

 Thanks in advance.

 VR



Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Thanks for the quick reply, Mohit. Can we measure/monitor the size of the
hinted handoffs? Would it be a good enough indicator of my backlog?

Although we know when the network is flaky, we are interested in knowing how
much data is piling up in the local DC that needs to be transferred.

Greatly appreciate your help.

VR


On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 As far as I know Cassandra doesn't use internal queueing mechanism
 specific to replication. Cassandra sends the write the remote DC and after
 that it's upto the tcp/ip stack to deal with buffering. If requests starts
 to timeout Cassandra would use HH upto certain time. For longer outage you
 would have to run repair.

 Also look at tcp/ip tuning parameters that are helpful with your scenario:

 http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

 Run iperf and test the latency.

 On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Hi,

 We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
 for writes and reads.  The network we have seen between the DC is sometimes
 flaky lasting few minutes to few 10 of minutes.

 I wanted to know what is the best way to measure/monitor either the lag
 or replication latency between the data centers.  Are there any metrics I
 can monitor to find the backlog of data that needs to be transferred?

 Thanks in advance.

 VR





Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Mohit Anchlia
Cassandra exposes a lot of metrics through JConsole. You might be able to get
some information from JConsole.

On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Thanks for the quick reply, Mohit.Can we measure/monitor the size of
 Hinted Handoffs?  Would it be a good enough indicator of my back log?

 Although we know when a network is flaky, we are interested in knowing how
 much data is piling up in local DC that needs to be transferred.

 Greatly appreciate your help.

 VR


 On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 As far as I know Cassandra doesn't use internal queueing mechanism
 specific to replication. Cassandra sends the write the remote DC and after
 that it's upto the tcp/ip stack to deal with buffering. If requests starts
 to timeout Cassandra would use HH upto certain time. For longer outage you
 would have to run repair.

 Also look at tcp/ip tuning parameters that are helpful with your scenario:

 http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

 Run iperf and test the latency.

  On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Hi,

 We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
 for writes and reads.  The network we have seen between the DC is sometimes
 flaky lasting few minutes to few 10 of minutes.

 I wanted to know what is the best way to measure/monitor either the lag
 or replication latency between the data centers.  Are there any metrics I
 can monitor to find the backlog of data that needs to be transferred?

 Thanks in advance.

 VR






Re: Monitoring replication lag/latency in multi DC setup

2012-09-05 Thread Venkat Rama
Is there a specific metric you can recommend?

VR

On Wed, Sep 5, 2012 at 9:19 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 Cassandra exposes lot of metrics through Jconsole. You might be able to
 get some information from Jconsole.


 On Wed, Sep 5, 2012 at 8:47 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Thanks for the quick reply, Mohit.Can we measure/monitor the size of
 Hinted Handoffs?  Would it be a good enough indicator of my back log?

 Although we know when a network is flaky, we are interested in knowing
 how much data is piling up in local DC that needs to be transferred.

 Greatly appreciate your help.

 VR


 On Wed, Sep 5, 2012 at 8:33 PM, Mohit Anchlia mohitanch...@gmail.com wrote:

 As far as I know Cassandra doesn't use internal queueing mechanism
 specific to replication. Cassandra sends the write the remote DC and after
 that it's upto the tcp/ip stack to deal with buffering. If requests starts
 to timeout Cassandra would use HH upto certain time. For longer outage you
 would have to run repair.

 Also look at tcp/ip tuning parameters that are helpful with your
 scenario:

 http://kaivanov.blogspot.com/2010/09/linux-tcp-tuning.html

 Run iperf and test the latency.

  On Wed, Sep 5, 2012 at 8:22 PM, Venkat Rama venkata.s.r...@gmail.com wrote:

 Hi,

 We have multi DC Cassandra ring with 2 DCs setup.   We use LOCAL_QUORUM
 for writes and reads.  The network we have seen between the DC is sometimes
 flaky lasting few minutes to few 10 of minutes.

 I wanted to know what is the best way to measure/monitor either the lag
 or replication latency between the data centers.  Are there any metrics I
 can monitor to find the backlog of data that needs to be transferred?

 Thanks in advance.

 VR







Re: Multi DC setup

2011-10-11 Thread Peter Schuller
 We already have two separate rings. Idea of bidirectional sync is, if one
 ring is down, we can still send the traffic to other ring. When original
 cluster comes back, it will pick up the data from available cluster. I'm not
 sure if it makes sense to have separate rings or combine these two rings
 into one.

Cassandra doesn't have support for synchronizing data between two
different rings. The multi-DC support in Cassandra amounts to having a
single ring containing all nodes from all data centers. Cassandra is
told (by configuring the snitch, such as through a property file)
which nodes are in which data center. Using the
NetworkTopologyStrategy, you then make sure to distribute replicas
across DCs as you see fit.

Cassandra will then prefer local nodes for read and write operations,
and you can use e.g. LOCAL_QUORUM consistency level to get quorum like
consistency within a DC.
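
To make the quorum arithmetic concrete, here is a small sketch. The per-DC replication factors are made-up example values. A quorum is a majority of replicas, and LOCAL_QUORUM applies that majority only to the replicas in the coordinator's own DC.

```python
def quorum(replication_factor):
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def local_quorum(rf_per_dc, local_dc):
    """Replicas that must respond in the local DC for LOCAL_QUORUM."""
    return quorum(rf_per_dc[local_dc])


def global_quorum(rf_per_dc):
    """Replicas that must respond cluster-wide for plain QUORUM."""
    return quorum(sum(rf_per_dc.values()))


# Hypothetical keyspace with three replicas in each of two DCs.
rf = {"DC1": 3, "DC2": 3}
print(local_quorum(rf, "DC1"), global_quorum(rf))
```

With three replicas per DC, LOCAL_QUORUM needs two local replicas and never waits on the other DC, while plain QUORUM needs four of the six replicas and so always has to cross the inter-DC link.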

Google/check wiki/read docs about NetworkTopologyStrategy and
PropertyFileSnitch. I don't have a good link to multi-dc off hand
(anyone got a good link to suggest that goes through this?).

-- 
/ Peter Schuller (@scode on twitter)


Re: Multi DC setup

2011-10-11 Thread Brandon Williams
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller
peter.schul...@infidyne.com wrote:
 Google/check wiki/read docs about NetworkTopologyStrategy and
 PropertyFileSnitch. I don't have a good link to multi-dc off hand
 (anyone got a good link to suggest that goes through this?).

http://www.datastax.com/docs/0.8/cluster_architecture/replication is
pretty good imo.

-Brandon


Re: Multi DC setup

2011-10-11 Thread Eric Tamme



We already have two separate rings. Idea of bidirectional sync is, if one
ring is down, we can still send the traffic to other ring. When original
cluster comes back, it will pick up the data from available cluster. I'm not
sure if it makes sense to have separate rings or combine these two rings
into one.
I am not sure you fully understand how Cassandra is supposed to work - 
you do not need two rings to have two complete sets of data that you can 
hot cutover between.



Cassandra doesn't have support for synchronizing data between two
different rings. The multi-dc support in Cassandra amounts to having a
single ring containing all nodes from all data centers. Cassandra is
told (by configuring the snitch, such as through a property files)
which nodes are in which data center. Using the
NetworkTopologyStrategy, you then make sure to distribute replicas in
DC:s as you see fit.
Using NTS you can configure a single ring into multiple logical 
rings.  This is effectively what the property file snitch does in 
conjunction with NTS.


I gave a presentation on the NTS internals, and replicating data across 
geographically distributed data centers. You can find the slides here 
http://files.meetup.com/1794037/NTS_presentation.pdf


Also, Edward Capriolio's book, Cassandra High Performance Cookbook, has some
recipes for using NTS.


I currently have 4 nodes in two data centers and I use NTS with property 
file snitch to write 1 copy of data to each DC (one node per DC) so that 
in the event of a total DC failure, we can still get to the data.  The 
first write is local and the replica is asynchronous if you set write 
consistency to 1 - so you get fast writes with distribution.
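
A simplified sketch of how NTS picks replicas in a setup like this one: walk the ring clockwise from the key's token and take nodes until each DC has its configured number of replicas. It ignores racks, which the real strategy also takes into account, and the tokens and node names are invented for illustration.

```python
from bisect import bisect_left


def nts_replicas(ring, dc_of, rf_per_dc, key_token):
    """Pick replicas for key_token, simplified NTS without rack awareness.

    ring      -- list of (token, node) sorted by token
    dc_of     -- node name -> DC name (what the snitch would report)
    rf_per_dc -- DC name -> number of replicas wanted in that DC
    """
    tokens = [token for token, _ in ring]
    # A key belongs to the first node whose token is >= the key's token,
    # wrapping around to the start of the ring.
    start = bisect_left(tokens, key_token) % len(ring)

    needed = dict(rf_per_dc)
    replicas = []
    for step in range(len(ring)):
        _, node = ring[(start + step) % len(ring)]
        dc = dc_of[node]
        if needed.get(dc, 0) > 0:
            replicas.append(node)
            needed[dc] -= 1
        if not any(needed.values()):
            break
    return replicas


# Four nodes in two DCs with interleaved tokens, one replica per DC.
ring = [(0, "a1"), (25, "b1"), (50, "a2"), (75, "b2")]
dc_of = {"a1": "DC1", "a2": "DC1", "b1": "DC2", "b2": "DC2"}
print(nts_replicas(ring, dc_of, {"DC1": 1, "DC2": 1}, key_token=30))
```

Every key ends up with exactly one replica in each DC, which is why either DC alone still holds a full copy of the data.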


-Eric




Multi DC setup

2011-10-10 Thread Cassa L
I am trying to understand the multi-DC setup for Cassandra. As I understand
it, in this setup replicas exist in the same cluster ring, but the nodes are
physically distributed across DCs. Is this correct?
I have two different cluster rings in two DCs, and want to replicate data
bidirectionally. They both have the same keyspace. They take data traffic from
different sources, but we want to make sure the data exists in both rings.
What would be the way to achieve this?

Thanks,
L.


Re: Multi DC setup

2011-10-10 Thread Milind Parikh
Why have two rings? Cassandra manages the replication for you... one ring
with physical nodes in two DCs might be a better option. Of course, depending
on the inter-DC failure characteristics, you might need to endure split-brain
for a while.

/***
sent from my android...please pardon occasional typos as I respond @ the
speed of thought
/

On Oct 10, 2011 10:09 PM, Cassa L lcas...@gmail.com wrote:

I am trying to understand multi DC setup for cassandra. As I understand, in
this setup,  replicas exists in same cluster ring, but physically nodes are
distributed across DCs. Is this correct?
I have two different cluster rings in two DCs, and want to replicate data
bidirectionally. They both have same keyspace. They take  data traffic from
different sources, but we want to make sure, data exists in both the rings.
What could be the way to achieve this?

Thanks,
L.


Re: Multi DC setup

2011-10-10 Thread Cassa L
We already have two separate rings. The idea of bidirectional sync is that if
one ring is down, we can still send the traffic to the other ring. When the
original cluster comes back, it will pick up the data from the available
cluster. I'm not sure if it makes sense to have separate rings or to combine
these two rings into one.



On Mon, Oct 10, 2011 at 10:17 PM, Milind Parikh milindpar...@gmail.com wrote:

 Why have two rings? Cassandra manages the replication for youone ring
 with physical nodes in two dc might be a better option. Of course, depending
 on the inter-dc failure characteristics, might need to endure split-brain
 for a while.

 /***
 sent from my android...please pardon occasional typos as I respond @ the
 speed of thought
 /

 On Oct 10, 2011 10:09 PM, Cassa L lcas...@gmail.com wrote:

 I am trying to understand multi DC setup for cassandra. As I understand, in
 this setup,  replicas exists in same cluster ring, but physically nodes are
 distributed across DCs. Is this correct?
 I have two different cluster rings in two DCs, and want to replicate data
 bidirectionally. They both have same keyspace. They take  data traffic from
 different sources, but we want to make sure, data exists in both the rings.
 What could be the way to achieve this?

 Thanks,
 L.