@Alain I wanted to do 2, but it looks like that won't be possible because of too much overhead.
@Eric Yeah, that's what I was afraid of. Though I know that the client connects to every server, I just didn't want to write the extra code.

On Wed, Sep 21, 2016 at 4:56 PM, Eric Stevens <migh...@gmail.com> wrote:

> Using keyspaces to support multi-tenancy is very close to an anti-pattern
> unless there is a finite and reasonable upper bound to how many tenants
> you'll support overall. Large numbers of tables come with cluster overhead
> and operational complexity that you will come to regret eventually.
>
> > and because I don't like having multiple cql clients/connections on my
> > app-code
>
> You should note that although Cassandra drivers present a single logical
> connection per cluster, under the hood they maintain connection pools per
> C* host. You might be able to do a slightly better job of managing those
> pools as a single cluster and logical connection, but I doubt it will be
> very significant. It would depend on what options you have available in
> your driver of choice.
>
> Application logic complexity would not be greatly improved, because you
> still need to switch by tenant; whether it's a keyspace name or a
> connection name doesn't seem like it would make much difference.
>
> As Alain pointed out, upgrades will be painful and maybe even dangerous as
> a monolithic cluster.
>
> On Wed, Sep 21, 2016, 3:50 AM Alain RODRIGUEZ <arodr...@gmail.com> wrote:
>
>> Hi Dorian,
>>
>>> I'm thinking of creating many keyspaces and storing them into many
>>> virtual datacenters (the servers will be in 1 logical datacenter, but
>>> separated by keyspaces).
>>>
>>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>>> case scenario)?
>>
>> There are three main things you can do here:
>>
>> 1 - Use 1 DC, 200 keyspaces using the DC.
>> 2 - Use 200 DCs, 1 keyspace per DC.
>> 3 - Use 200 clusters, 1 DC each, 1 keyspace per client (or many
>> keyspaces, but each related to 1 client).
>>
>> I am not sure whether you want to go with 1 or 2; my understanding is
>> that you meant to write "the servers will be in 1 -*logical- **physical*
>> datacenter" and that you are willing to do as described in 2.
>>
>> This looks like a good idea to me, but for other reasons (client /
>> workload isolation, limited risk, independent growth for each client,
>> visibility on cost per client, ...).
>>
>>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>>> case scenario)?
>>
>> Yet I would not go with distinct DCs, but rather with distinct C*
>> clusters (different cluster names, seeds, etc.).
>>
>> I see no good reason to use virtual clusters instead of distinct
>> clusters. Keeping each keyspace in a distinct, isolated datacenter would
>> work; the datacenters would be quite isolated, since no information or
>> load would be shared except through gossip.
>>
>> Yet there are some issues with big clusters due to gossip, and I have had
>> gossip issues in the past that affected all the DCs within a cluster. In
>> that case you would face a major issue that you could have avoided or
>> limited. Plus, when upgrading Cassandra you would have to upgrade 600
>> nodes quite quickly, whereas distinct clusters can be upgraded
>> independently. I would then go with either option 1 or 3.
>>
>>> and because I don't like having multiple cql clients/connections on my
>>> app-code
>>
>> In this case, wouldn't it make sense for you to have per-customer app
>> code, or just conditional connection creation depending on the client?
>>
>> I am just trying to give you some ideas.
>>
>>> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2? (since
>>> there is overhead with each keyspace + table, which would probably break
>>> this design)
>>> Or is it just a simple map dcx--->ip1,ip2,ip3?
>>
>> I just checked it.
All the nodes know about every keyspace and
>> table when they are part of the same Cassandra cluster (in my testing
>> version, C* 3.7, this is stored under system_schema.tables - local
>> strategy, no replication). To avoid that, using distinct clusters is the
>> way to go.
>>
>> https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c
>>
>> C*heers,
>> -----------------------
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-20 22:49 GMT+02:00 Dorian Hoxha <dorian.ho...@gmail.com>:
>>
>>> Hi,
>>>
>>> I need to separate clients' data into multiple clusters, and because I
>>> don't like having multiple cql clients/connections in my app code, I'm
>>> thinking of creating many keyspaces and storing them in many virtual
>>> datacenters (the servers will be in 1 logical datacenter, but separated
>>> by keyspaces).
>>>
>>> Does that make sense (so growing up to 200 dcs of 3 servers each in the
>>> best-case scenario)?
>>>
>>> Does the cql engine make a new connection (like "use keyspace") when
>>> specifying "keyspace.table" in the query?
>>>
>>> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2? (since
>>> there is overhead with each keyspace + table, which would probably break
>>> this design)
>>> Or is it just a simple map dcx--->ip1,ip2,ip3?
>>>
>>> Thank you!
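[Editor's note] The single-session approach raised in the thread - one logical cluster connection serving many tenant keyspaces by using fully-qualified "keyspace.table" names instead of per-tenant "USE keyspace" statements - can be sketched as below. This is a minimal illustration, not code from the thread: the keyspace-naming convention, the "events" table, and the helper functions are all hypothetical.

```python
# Sketch: one logical connection, many tenant keyspaces.
# Rather than mutating shared session state with "USE <keyspace>" per
# tenant, each query names its table with a fully qualified
# "keyspace.table" identifier, so a single session serves all tenants.

def tenant_keyspace(tenant_id: str) -> str:
    """Map a tenant id to its keyspace name (hypothetical convention).

    Keeps only [a-z0-9_] so the result stays a valid unquoted CQL
    identifier; anything else is replaced with an underscore.
    """
    safe = "".join(c if c.isalnum() else "_" for c in tenant_id.lower())
    return f"tenant_{safe}"


def select_events_cql(tenant_id: str) -> str:
    """Build a SELECT against the tenant's keyspace via a qualified table name."""
    return f"SELECT * FROM {tenant_keyspace(tenant_id)}.events WHERE id = ?"


# With the DataStax Python driver this would be used roughly as
# (not run here, since it needs a live cluster):
#
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect()   # one session for all tenants
#   stmt = session.prepare(select_events_cql("acme"))
#   rows = session.execute(stmt, [some_id])

print(select_events_cql("acme"))
print(select_events_cql("acme-eu"))  # "acme-eu" maps to keyspace tenant_acme_eu
```

Note the trade-off the thread settles on still applies: this only avoids per-tenant connections in application code; the per-keyspace/per-table overhead on the cluster itself remains, which is why distinct clusters were recommended for large tenant counts.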