Hi Romain,

Thanks for the reply.
> request_scheduler it is a legacy feature which only works for the Thrift API..

It would be great to have some sort of scheduling per user/role, but scheduling at the request level will only provide limited isolation. If the JVM crashes due to one tenant's invalid request (e.g. inserting a blob into a collection column), it will be awful.

Thank you.

jason zhao yang <zhaoyangsingap...@gmail.com> wrote on Sat, Aug 6, 2016 at 12:33 PM:

> We considered splitting by keyspace or table before, but a Cassandra table
> is a costly structure (more CPU, flushes, memory, ...).
>
> In our use case, we expect more than 50 tenants on the same cluster.
>
> > As it was already mentioned in the ticket itself, filtering is a highly
> > inefficient operation.
>
> I totally agree, but it is still better to have data filtered on the
> server side than on the client side.
>
> How about adding a logical tenant concept in Cassandra? All logical
> tenants would share the same table schemas, but queries/storage would be
> separated.
>
> Oleksandr Petrov <oleksandr.pet...@gmail.com> wrote on Fri, Jul 15, 2016 at 4:28 PM:
>
>> There's a ticket on filtering (#11031), although I would not count on
>> filtering in production.
>>
>> As it was already mentioned in the ticket itself, filtering is a highly
>> inefficient operation. It was intended as an aid for people who are
>> exploring data and/or can structure a query in such a way that it will at
>> least be local (for example, with an IN or EQ query on the partition key,
>> filtering out results from the small partition). However, filtering on
>> the partition key means that _every_ replica has to be queried for the
>> results, as we do not know which partitions hold the data. If every query
>> in your system relies on filtering, then a large amount of data and high
>> load will eventually have a substantial negative impact on performance.
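To make the trade-off being discussed concrete, here is a minimal CQL sketch of the shared-table approach, where "tenant_id" is part of a composite partition key. The keyspace, table, and column names other than tenant_id are hypothetical, not from the thread:

```sql
-- Hypothetical shared table: "tenant_id" is one component of the
-- composite partition key, so all tenants share one table and schema.
CREATE TABLE app.events (
    tenant_id text,
    event_id  timeuuid,
    payload   text,
    PRIMARY KEY ((tenant_id, event_id))
);

-- Efficient: the full partition key is specified, so the query is
-- routed only to the replicas owning that partition.
SELECT * FROM app.events
 WHERE tenant_id = 't42'
   AND event_id = 123e4567-e89b-12d3-a456-426655440000;

-- Inefficient: only part of the partition key is specified, so the
-- coordinator must scan partitions on every node. Stock Cassandra of
-- this era rejects this; it is what the custom SELECT patch and the
-- filtering ticket (#11031) are about.
SELECT * FROM app.events WHERE tenant_id = 't42' ALLOW FILTERING;
```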
>> I'm not sure how many tenants you're working with, but I've seen setups
>> where tenancy was solved by using multiple keyspaces, which helps to
>> completely isolate the data and avoid filtering. Given that you've tried
>> splitting sstables on tenant_id, that might be solved by using multiple
>> keyspaces. This would also help with server resource isolation and most
>> of the other issues you've raised.
>>
>> On Fri, Jul 15, 2016 at 10:10 AM Romain Hardouin
>> <romainh...@yahoo.fr.invalid> wrote:
>>
>> > I don't use C* in such a context, but out of curiosity, did you set
>> > request_scheduler to RoundRobin or did you implement your own
>> > scheduler?
>> > Romain
>> >
>> > On Friday, July 15, 2016 at 8:39 AM, jason zhao yang <
>> > zhaoyangsingap...@gmail.com> wrote:
>> >
>> > Hi,
>> >
>> > May I ask whether there is any plan to extend functionality related to
>> > multi-tenancy?
>> >
>> > Our current approach is to define an extra partition key called
>> > "tenant_id". In my use cases, all tenants have the same table schemas.
>> >
>> > * For security isolation: we customized the GRANT statement to be able
>> > to restrict user queries based on the "tenant_id" partition.
>> >
>> > * For getting all data of a single tenant: we customized the SELECT
>> > statement to support ALLOW FILTERING on the "tenant_id" partition key.
>> >
>> > * For server resource isolation: I have no idea how to do it.
>> >
>> > * For per-tenant backup and restore: I tried a
>> > tenant_base_compaction_strategy to split sstables based on tenant_id.
>> > It turned out to be very inefficient.
>> >
>> > What's the community's opinion about submitting those patches to
>> > Cassandra? It would be great if you could share the ideal multi-tenant
>> > architecture for Cassandra.
>> >
>> > jasonstack
>>
>> --
>> Alex Petrov
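For comparison, the keyspace-per-tenant layout suggested above can be sketched in CQL as follows. All names here are hypothetical; the point is that isolation, access control, and per-tenant backup fall out of standard per-keyspace mechanisms rather than custom patches:

```sql
-- One keyspace per tenant, each holding identical table schemas.
-- Data is fully isolated on disk, so per-tenant backup/restore maps
-- to per-keyspace snapshots, and no filtering is needed.
CREATE KEYSPACE tenant_t42
    WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

CREATE TABLE tenant_t42.events (
    event_id timeuuid PRIMARY KEY,
    payload  text
);

-- Access control uses the stock GRANT statement at keyspace scope,
-- instead of a customized per-partition GRANT:
GRANT SELECT ON KEYSPACE tenant_t42 TO tenant_t42_user;
```

The cost, as noted earlier in the thread, is that each table carries per-table overhead (CPU, memtable flushes, memory), which is why this layout becomes expensive beyond a few dozen tenants.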