Re: Using keyspaces for virtual clusters
@Alain I wanted to do 2, but it looks like that won't be possible because of too much overhead. @Eric Yeah, that's what I was afraid of. Though I know the client connects to every server, I just didn't want to write the extra code.
Re: Using keyspaces for virtual clusters
Using keyspaces to support multi-tenancy is very close to an anti-pattern unless there is a finite and reasonable upper bound to how many tenants you'll support overall. Large numbers of tables come with cluster overhead and operational complexity you will come to regret eventually.

> and because I don't like having multiple cql clients/connections on my app-code

You should note that although Cassandra drivers present a single logical connection per cluster, under the hood they maintain connection pools per C* host. You might be able to do a slightly better job of managing those pools as a single cluster and logical connection, but I doubt it will be very significant. It would depend on what options you have available in your driver of choice.

Application logic complexity would not be greatly improved either, because you still need to switch by tenant; whether it's a keyspace name or a connection name doesn't seem like it would make much difference.

As Alain pointed out, upgrades will be painful and maybe even dangerous as a monolithic cluster.
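Eric's "switch by tenant" point can be sketched in a few lines: whether each tenant maps to a keyspace or to a separate cluster connection, the application still routes every query by tenant. This is only an illustration; the `tenant_<id>` naming scheme and the table name are assumptions, not anything from the thread.

```python
# Hedged sketch of "switching by tenant", assuming tenants map 1:1 to
# keyspaces named "tenant_<id>" (the naming scheme is an assumption).

def keyspace_for(tenant_id: str) -> str:
    """Map a tenant id to its (hypothetical) keyspace name."""
    return f"tenant_{tenant_id}"

def qualified_query(tenant_id: str, table: str) -> str:
    # Qualifying the table as keyspace.table means no per-tenant USE
    # statement and no per-tenant session object is needed.
    return f"SELECT * FROM {keyspace_for(tenant_id)}.{table}"

print(qualified_query("acme", "events"))
# SELECT * FROM tenant_acme.events
```

With a driver such as the DataStax Python driver, the resulting string would simply be passed to `session.execute(...)`; either way the code branches on the tenant, which is why switching by keyspace name versus connection name makes little difference.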
Re: Using keyspaces for virtual clusters
Hi Dorian,

> I'm thinking of creating many keyspaces and storing them into many virtual datacenters (the servers will be in 1 logical datacenter, but separated by keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best case scenario)?

There are 3 main things you can do here:

1 - Use 1 DC, 200 keyspaces using the DC.
2 - Use 200 DCs, 1 keyspace per DC.
3 - Use 200 clusters, 1 DC each, 1 keyspace per client (or many keyspaces, but related to 1 client).

I am not sure if you want to go with 1 or 2; my understanding is that you meant to write "the servers will be in 1 -logical- *physical* datacenter" and that you are willing to do as described in 2.

This looks to be a good idea to me, but for other reasons (client / workload isolation, limited risk, independent growth for each client, visibility on cost per client, ...).

> Does that make sense (so growing up to 200 dcs of 3 servers each in best case scenario)?

Yet I would not go with distinct DCs, but rather distinct C* clusters (different cluster names, seeds, etc.).

I see no good reason to use a virtual cluster instead of distinct clusters. Keeping each keyspace in a distinct, isolated datacenter would work: the datacenters would be quite isolated, since no information or load would be shared, except through gossip.

Yet there are some issues with big clusters due to gossip, and I have had gossip issues in the past that affected all the DCs within a cluster. In that case you would face a major issue that you could have avoided or limited. Plus, when upgrading Cassandra, you would have to upgrade 600 nodes quite quickly, whereas distinct clusters can be upgraded independently. I would then go with either option 1 or 3.

> and because I don't like having multiple cql clients/connections on my app-code

In this case, wouldn't it make sense for you to have per-customer app code, or just conditional connection creation depending on the client?

I am just trying to give you some ideas.
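Option 2 above amounts to pinning each keyspace's replicas to a single datacenter via NetworkTopologyStrategy. A minimal sketch of building such a statement (the keyspace and DC names are invented for illustration):

```python
# Builds a CREATE KEYSPACE statement whose replicas live only in one DC,
# which is what "1 keyspace per DC" (option 2 above) amounts to.

def create_keyspace_cql(keyspace: str, dc: str, rf: int = 3) -> str:
    return (
        f"CREATE KEYSPACE {keyspace} WITH replication = "
        f"{{'class': 'NetworkTopologyStrategy', '{dc}': {rf}}};"
    )

print(create_keyspace_cql("client_42", "dc_client_42"))
# CREATE KEYSPACE client_42 WITH replication =
#   {'class': 'NetworkTopologyStrategy', 'dc_client_42': 3};
```

Since the strategy names only one DC, no replicas of that keyspace are placed in the other datacenters, even though, as noted below, the schema itself is still known cluster-wide.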
> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2? (since there is overhead with each keyspace + table, which would probably break this design)
> Or is it just a simple map dcx ---> ip1,ip2,ip3?

I just checked it. All the nodes know about every keyspace and table if they are part of the same Cassandra cluster (in my test version, C* 3.7, this is stored under system_schema.tables - local strategy, no replication). To avoid that, using distinct clusters is the way to go.

https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com
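Alain's check can be reproduced by querying `system_schema.tables` on any node of a C* 3.x cluster. A small sketch of summarising such a schema dump; the live-driver call is left as a comment, and the sample rows are invented for illustration:

```python
# Summarise a system_schema.tables dump: every node carries one schema
# entry for every table in every keyspace of the cluster, which is the
# per-keyspace overhead discussed above.
# With a live cluster the rows would come from something like:
#   rows = session.execute(
#       "SELECT keyspace_name, table_name FROM system_schema.tables")

def tables_per_keyspace(rows):
    counts = {}
    for keyspace_name, table_name in rows:
        counts[keyspace_name] = counts.get(keyspace_name, 0) + 1
    return counts

# Invented sample rows standing in for a real schema dump.
sample = [("tenant_a", "events"), ("tenant_a", "users"), ("tenant_b", "events")]
print(tables_per_keyspace(sample))
# {'tenant_a': 2, 'tenant_b': 1}
```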
Using keyspaces for virtual clusters
Hi,

I need to separate clients' data into multiple clusters, and because I don't like having multiple cql clients/connections on my app-code, I'm thinking of creating many keyspaces and storing them in many virtual datacenters (the servers will be in 1 logical datacenter, but separated by keyspaces).

Does that make sense (so growing up to 200 dcs of 3 servers each in the best-case scenario)?

Does the cql-engine make a new connection (like "use keyspace") when specifying "keyspace.table" in the query?

Are the keyspaces+tables of dc1 stored in a cassandra node of dc2? (since there is overhead with each keyspace + table, which would probably break this design)
Or is it just a simple map dcx ---> ip1,ip2,ip3?

Thank you!