Hi Dorian,

> I'm thinking of creating many keyspaces and storing them into many virtual
> datacenters (the servers will be in 1 logical datacenter, but separated by
> keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?

There are 3 main things you can do here:

1 - Use 1 DC, with 200 keyspaces using that DC.
2 - Use 200 DCs, 1 keyspace per DC.
3 - Use 200 clusters, each with 1 DC and 1 keyspace per client (or many keyspaces, but all related to 1 client).

I am not sure whether you want to go with 1 or 2. My understanding is that you meant to write "the servers will be in 1 physical datacenter" (rather than "logical") and that you are willing to do as described in 2.

This looks to be a good idea to me, but for other reasons (client / workload isolation, limited risk, independent growth for each client, visibility on cost per client, ...).

> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?

Yet I would not go with distinct DCs, but rather with distinct C* clusters (different cluster names, seeds, etc.). I see no good reason to use virtual clusters instead of distinct clusters.

Keeping each keyspace in a distinct, isolated datacenter would work. The datacenters would be quite isolated, since no information or load would be shared except through gossip. Yet there are some issues with big clusters due to gossip, and I have had issues in the past due to gossip that affected all the DCs within a cluster. In that case you would face a major issue that you could have avoided or limited. Plus, when upgrading Cassandra, you would have to upgrade 600 nodes quite quickly, whereas distinct clusters can be upgraded independently.

I would then go with either option 1 or 3.

> and because I don't like having multiple cql clients/connections on my
> app-code

In this case, wouldn't it make sense for you to have per-customer app code, or just a conditional connection creation depending on the client? I am just trying to give you some ideas.

> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?

I just checked it.
All the nodes would know about every keyspace and table if using the same Cassandra cluster (in my test version, C* 3.7, this is stored under system_schema.tables - local strategy, no replication). To avoid that, using distinct clusters is the way to go.

https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c

C*heers,
-----------------------
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-20 22:49 GMT+02:00 Dorian Hoxha <dorian.ho...@gmail.com>:

> Hi,
>
> I need to separate clients data into multiple clusters and because I don't
> like having multiple cql clients/connections on my app-code, I'm thinking
> of creating many keyspaces and storing them into many virtual datacenters
> (the servers will be in 1 logical datacenter, but separated by keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?
>
> Does the cql-engine make a new connection (like "use keyspace") when
> specifying "keyspace.table" on the query ?
>
> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>
> Thank you!
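
PS: to make the "conditional connection creation depending on the client" idea from my reply a bit more concrete, here is a rough Python sketch. It is only an illustration under assumptions, not something from a real deployment: the client names, the IP addresses, the CLUSTERS mapping, the SessionRouter class and the "keyspace is named after the client" convention are all hypothetical. The only real driver calls are Cluster(contact_points=...) and Cluster.connect(keyspace) from the DataStax Python driver, which is imported lazily so the routing logic can be tried without it.

```python
from typing import Any, Callable, Dict, List

# Hypothetical mapping: client name -> contact points of that client's
# own, distinct Cassandra cluster (option 3 in the reply above).
CLUSTERS: Dict[str, List[str]] = {
    "client_a": ["10.0.1.1", "10.0.1.2", "10.0.1.3"],
    "client_b": ["10.0.2.1", "10.0.2.2", "10.0.2.3"],
}


def driver_connect(contact_points: List[str], keyspace: str) -> Any:
    """Open a session with the DataStax Python driver (imported lazily)."""
    from cassandra.cluster import Cluster

    return Cluster(contact_points=contact_points).connect(keyspace)


class SessionRouter:
    """Creates at most one session per client, on first use."""

    def __init__(
        self,
        clusters: Dict[str, List[str]],
        connect: Callable[[List[str], str], Any] = driver_connect,
    ) -> None:
        self._clusters = clusters
        self._connect = connect
        self._sessions: Dict[str, Any] = {}

    def session_for(self, client: str) -> Any:
        if client not in self._clusters:
            raise KeyError(f"unknown client: {client}")
        if client not in self._sessions:
            # Assumption: each client's keyspace is named after the client.
            self._sessions[client] = self._connect(
                self._clusters[client], client
            )
        return self._sessions[client]
```

The app code then asks for SessionRouter(CLUSTERS).session_for("client_a") and never has to care which cluster is behind it. Sessions are created lazily, so a process that only serves a few clients only opens a few connections.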