Vitaliy Bondarenko created ZOOKEEPER-4562:
---------------------------------------------

             Summary: Zookeeper as a platform (multi-tenant setup): Throughput 
quotas for each tenant
                 Key: ZOOKEEPER-4562
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4562
             Project: ZooKeeper
          Issue Type: New Feature
          Components: server
            Reporter: Vitaliy Bondarenko


Folks, this is something of an RFC for the feature '*Zookeeper as a platform 
(multi-tenant setup): Throughput quotas for each tenant*'. I am sure the 
problem described below has bothered other engineers too, so perhaps it has 
already been solved, at least to some extent? 

I would appreciate your comments on this.

 

*Problem*

In a multi-tenant ZooKeeper, it is impossible to separate throughput between 
tenants. As a result, a noisy tenant can grab most of the throughput and affect 
other tenants by increasing their latency and, in severe cases, impairing their 
ability to read/write data.

 

*Use-cases*

As a ZooKeeper platform engineer, I want to be able to separate throughput and 
other resource usage between tenants in a multi-tenant ZooKeeper environment. 

 

*Example*

Let's consider a multi-tenant platform setup where we have:
 * tenant_1 having chroot /tenant1_data 
 * tenant_2 having chroot /tenant2_data 

Tenants have recursive ACLs configured so that tenant_1 clients do not have 
any access to the chroot of tenant_2 and vice versa. Effectively, each of them 
can see only its own ZooKeeper data. 

So far so good.

Now, as a ZooKeeper platform engineer, I want to be able to limit resource 
usage by tenants 1 and 2. Let's assume that tenant_1 needs 90% of disk 
usage/throughput and tenant_2 only 10%.
 # I can use quotas on the chroot folders to limit the disk usage of every 
tenant. Which is great!
 # I can use throttling, but it applies to all connections at once. So, in my 
understanding, throttling connections for tenant_1 will affect tenant_2 as 
well.
 # I want to limit tenant_1 to at most 90% of the read/write throughput.

There are multiple options for how exactly item 3 should work. I imagine 
something like this: count the number of bytes each tenant wants to write 
(let's limit this to writes first) over a running window. If tenant_1 
accumulates more than 90% of the write traffic during a certain time period, 
we throttle it so that tenant_2 can use its 10%. 
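
To make the running-window idea concrete, here is a minimal sketch in Python. 
It is not ZooKeeper code: the class name ShareThrottle, its methods, the 
one-second window and the tenant names are all assumptions made for 
illustration. Time is passed in explicitly so the behaviour is easy to follow.

```python
from collections import deque


class ShareThrottle:
    """Hypothetical helper: tracks bytes written per tenant over a sliding
    window and reports when a tenant exceeds its configured share."""

    def __init__(self, shares, window_seconds=1.0):
        self.shares = shares              # e.g. {"tenant_1": 0.9, "tenant_2": 0.1}
        self.window = window_seconds      # assumed window length
        self.events = deque()             # (timestamp, tenant, nbytes), oldest first
        self.totals = {tenant: 0 for tenant in shares}

    def _expire(self, now):
        # Drop writes that have fallen out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tenant, nbytes = self.events.popleft()
            self.totals[tenant] -= nbytes

    def record(self, tenant, nbytes, now):
        # Account for a write of nbytes by tenant at time now.
        self._expire(now)
        self.events.append((now, tenant, nbytes))
        self.totals[tenant] += nbytes

    def should_throttle(self, tenant, now):
        # True if the tenant's fraction of recent write bytes exceeds its share.
        self._expire(now)
        total = sum(self.totals.values())
        if total == 0:
            return False
        return self.totals[tenant] / total > self.shares[tenant]
```

Note that, taken literally, this throttles tenant_1 even when tenant_2 is 
idle, because tenant_1 then accounts for 100% of the traffic. A real 
implementation would probably only enforce shares while the server is close 
to saturation.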

Another, possibly simpler option is to configure absolute quotas per tenant: 
the number of bytes per second each tenant is allowed to write or read. The 
downside of this method is a high probability of unused capacity. 
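
One common way to enforce an absolute bytes-per-second quota is a token 
bucket per tenant. The sketch below is only an illustration of that technique 
under my own assumptions: the TokenBucket class, its parameters and the 
rates used are hypothetical and not part of ZooKeeper.

```python
class TokenBucket:
    """Hypothetical per-tenant quota: refills at rate_bytes_per_sec and
    allows bursts of up to burst_bytes."""

    def __init__(self, rate_bytes_per_sec, burst_bytes, now=0.0):
        self.rate = rate_bytes_per_sec
        self.capacity = burst_bytes
        self.tokens = float(burst_bytes)  # start with a full bucket
        self.last = now

    def try_consume(self, nbytes, now):
        # Refill according to elapsed time, capped at the bucket capacity.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = max(self.last, now)
        # Admit the request only if enough tokens are available.
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False


# Assumed quotas: 900 B/s for tenant_1 and 100 B/s for tenant_2.
buckets = {
    "tenant_1": TokenBucket(900, 900),
    "tenant_2": TokenBucket(100, 100),
}
```

A request from a tenant would be admitted only if 
buckets[tenant].try_consume(request_size, now) returns True. The unused 
capacity problem is visible here: tokens that tenant_2 does not spend are 
never made available to tenant_1.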

 

*Open questions*

What is the current best practice for such a multi-tenant setup? Can I achieve 
what is described above, at least to some extent, with the existing 
throttling/quota features? 

 

Let's kick off the discussion about this feature request!



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
