Hi Dorian,

> I'm thinking of creating many keyspaces and storing them into many virtual
> datacenters (the servers will be in 1 logical datacenter, but separated by
> keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?


There are 3 main things you can do here:

1 - Use 1 DC, with all 200 keyspaces in that DC.
2 - Use 200 DCs, 1 keyspace per DC.
3 - Use 200 clusters, each with 1 DC and 1 keyspace per client (or many
keyspaces, but all related to 1 client).
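
To make the difference between options 1 and 2 concrete, here is a minimal
sketch that only builds the CREATE KEYSPACE statements. The keyspace names
(client_001), the DC names (dc1, dc_client_001) and the helper names are made
up for illustration; real DC names have to match what your snitch reports.
Option 3 would use the same statement as option 1, just run against a
separate cluster for each client.

```python
from typing import Dict, List


def keyspace_cql(keyspace: str, dc_replication: Dict[str, int]) -> str:
    """Build a CREATE KEYSPACE statement using NetworkTopologyStrategy."""
    if not dc_replication:
        raise ValueError("at least one datacenter is required")
    parts = ["'class': 'NetworkTopologyStrategy'"]
    # One entry per datacenter that should hold replicas of this keyspace.
    parts += [f"'{dc}': {rf}" for dc, rf in sorted(dc_replication.items())]
    options = ", ".join(parts)
    return f"CREATE KEYSPACE IF NOT EXISTS {keyspace} WITH replication = {{{options}}};"


def option_1(clients: List[str], shared_dc: str = "dc1", rf: int = 3) -> List[str]:
    """Option 1: every client keyspace is replicated in the same single DC."""
    return [keyspace_cql(f"client_{c}", {shared_dc: rf}) for c in clients]


def option_2(clients: List[str], rf: int = 3) -> List[str]:
    """Option 2: each client keyspace is replicated only in its own DC."""
    return [keyspace_cql(f"client_{c}", {f"dc_client_{c}": rf}) for c in clients]


if __name__ == "__main__":
    for statement in option_1(["001", "002"]) + option_2(["001", "002"]):
        print(statement)
```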

I am not sure whether you want to go with 1 or 2. My understanding is that
you meant to write "the servers will be in 1 physical datacenter" (rather
than "logical"), and that you are planning to do what is described in 2.

This looks like a good idea to me, but for other reasons (client /
workload isolation, limited risk, independent growth for each client,
visibility on cost per client, ...).

> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?
>

Yet I would not go with distinct DCs, but rather with distinct C* clusters
(different cluster names, seeds, etc).

I see no good reason to use virtual clusters instead of distinct clusters.
Keeping each keyspace in a distinct, isolated datacenter would work. The
datacenters would be quite isolated, since no information or load would be
shared, except for gossip.

Yet there are some issues with big clusters due to gossip, and I have had
gossip issues in the past that affected all the DCs within a cluster. In
this case you would face a major issue that you could have avoided or
limited. Plus, when upgrading Cassandra, you would have to upgrade 600 nodes
quite quickly, whereas distinct clusters can be upgraded independently. I
would then go with either option 1 or 3.

> and because I don't like having multiple cql clients/connections on my
> app-code


In this case, wouldn't it make sense for you to have per-customer app code,
or just a conditional connection creation depending on the client?
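
A conditional connection creation could look roughly like the sketch below:
one session per client, opened only the first time that client is seen and
then reused. The class name, the client_<id> keyspace naming and the injected
connect callable are all hypothetical. With the DataStax Python driver, I
would expect the callable to be something like
lambda hosts, ks: Cluster(contact_points=hosts).connect(ks), but treat that
as an assumption to check against your driver version.

```python
from typing import Callable, Dict, Generic, List, TypeVar

S = TypeVar("S")  # whatever session type the driver returns


class ClientSessions(Generic[S]):
    """Lazily create and cache one session per client cluster."""

    def __init__(
        self,
        contact_points: Dict[str, List[str]],
        connect: Callable[[List[str], str], S],
    ) -> None:
        self._contact_points = contact_points  # client id -> seed/contact IPs
        self._connect = connect                # hypothetical connection factory
        self._sessions: Dict[str, S] = {}

    def session_for(self, client_id: str) -> S:
        """Return the cached session for a client, connecting on first use."""
        if client_id not in self._sessions:
            if client_id not in self._contact_points:
                raise KeyError(f"unknown client: {client_id}")
            hosts = self._contact_points[client_id]
            # Assumed naming convention: one keyspace per client.
            self._sessions[client_id] = self._connect(hosts, f"client_{client_id}")
        return self._sessions[client_id]


if __name__ == "__main__":
    # Fake factory so the sketch runs without a Cassandra cluster.
    def fake_connect(hosts: List[str], keyspace: str) -> str:
        return f"session({','.join(hosts)} / {keyspace})"

    sessions = ClientSessions({"001": ["10.0.1.1", "10.0.1.2"]}, fake_connect)
    print(sessions.session_for("001"))
```

The rest of the app code only ever calls session_for(client_id), so the
number of clusters stays hidden behind one small object.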

I am just trying to give you some ideas.

> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?


I just checked. When using the same Cassandra cluster, all the nodes would
know about every keyspace and table (in the version I tested, C* 3.7, this
is stored under system_schema.tables, which uses the local strategy with no
replication). To avoid that, using distinct clusters is the way to go.
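
If you want to repeat the check yourself, a rough sketch is to run the query
below on any node (for example from cqlsh) and count the tables per keyspace.
The helper name and the sample rows are made up for illustration; in a shared
cluster you would expect every client's keyspaces to show up on every node.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Query against the schema tables available in C* 3.x.
SCHEMA_QUERY = "SELECT keyspace_name, table_name FROM system_schema.tables;"


def tables_per_keyspace(rows: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Count tables per keyspace from (keyspace_name, table_name) rows."""
    return dict(Counter(keyspace for keyspace, _table in rows))


if __name__ == "__main__":
    # Made-up rows standing in for the result of SCHEMA_QUERY on one node.
    sample_rows = [
        ("client_001", "events"),
        ("client_001", "users"),
        ("client_002", "events"),
    ]
    print(SCHEMA_QUERY)
    print(tables_per_keyspace(sample_rows))
```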

https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-20 22:49 GMT+02:00 Dorian Hoxha <dorian.ho...@gmail.com>:

> Hi,
>
> I need to separate clients data into multiple clusters and because I don't
> like having multiple cql clients/connections on my app-code, I'm thinking
> of creating many keyspaces and storing them into many virtual datacenters
> (the servers will be in 1 logical datacenter, but separated by keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?
>
> Does the cql-engine make a new connection (like "use keyspace") when
> specifying "keyspace.table" on the query ?
>
> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>
> Thank you!
>
