Re: [DISCUSS] KIP-124: Request rate quotas

Rajini Sivaram Mon, 20 Feb 2017 02:45:00 -0800

Dong, Onur & Becket,

Thank you all for the very useful feedback.


The choice of request handling time as opposed to request rate was based on
the observation in KAFKA-4195
<https://issues.apache.org/jira/browse/KAFKA-4195> that request rates may
be less intuitive to configure than percentage utilization. But since the
KIP is measuring time rather than request pool utilization as suggested in
the JIRA, I agree that request rate would probably work better than
percentage. So I am inclined to change the KIP to throttle on request rates
(e.g. 100 requests per second) rather than percentage. Average request rates
are exposed as metrics, so admins can configure quotas based on those. And
the values are more meaningful from the client application point of view. I
am still interested in feedback regarding the second rejected alternative
that throttles based on percentage utilization of resource handler pool.
That was the suggestion from Jun/Ismael in KAFKA-4195, but I couldn't see
how that would help in the case where a small number of connections pushed
a continuous stream of short requests. Suggestions welcome.
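
A minimal sketch of what throttling on request rates could look like, assuming
a fixed measurement window and a delay chosen so that the average rate over the
extended window falls back to the quota. The function name, its parameters and
the delay formula are illustrative assumptions, not taken from the KIP:

```python
def throttle_delay_ms(request_count, window_seconds, quota_per_second):
    """Delay (ms) after which the average request rate is back at the quota.

    Hypothetical helper: assumes the client sends nothing while delayed.
    """
    observed_rate = request_count / window_seconds
    if observed_rate <= quota_per_second:
        return 0.0  # within quota, no throttling needed
    # Find extra time T such that request_count / (window_seconds + T) == quota
    extra_seconds = request_count / quota_per_second - window_seconds
    return extra_seconds * 1000.0
```

With a quota of 100 requests per second, 1500 requests in a 10 second window
would give a delay of 5000 ms, while 900 requests in the same window would not
be delayed at all.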

Responses to other questions above:

- (Dong): The KIP proposes to throttle most requests (and not just
Produce/Fetch) since the goal is to control usage of broker resources. So
LeaderAndIsrRequest and MetadataRequest will also be throttled. The few
requests not being throttled are timing-sensitive.

- (Dong): The KIP does not propose to throttle inter-broker traffic based
on request rates. The most frequent requests in inter-broker traffic are
fetch requests and a well configured broker would use reasonably good
values of min.bytes and max.wait that avoid overloading the broker
unnecessarily with fetch requests. The existing byte-rate based quotas
should be sufficient in this case.

- (Onur): Quota window configuration - this is the existing configuration
quota.window.size.seconds (also used for byte-rate quotas).

- (Becket): The main issue that the KIP is addressing is clients flooding
the broker with small requests (e.g. fetch with max.wait.ms=0), which can
overload the broker and delay requests from other clients/users even though
the byte rate is quite small. CPU quota reflects the resource usage on the
broker that the KIP is attempting to limit. Since this is the time on the
local broker, it shouldn't vary much depending on acks=-1 etc. but I do
agree on the unpredictability of time based quotas. Switching from request
processing time to request rates will address this. Would you still be
concerned that "*Users do not have direct control over the request rate,
i.e. users do not know when a request will be sent by the clients*"?
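
To put rough numbers on the small-request flooding case in the reply to Becket
above, here is a small sketch. The round-trip time and response size are
hypothetical values chosen for illustration, not measurements:

```python
def client_load(round_trip_ms, bytes_per_response):
    """Return (requests/sec, bytes/sec) for one connection that sends the
    next request as soon as the previous response arrives."""
    requests_per_second = 1000.0 / round_trip_ms
    bytes_per_second = requests_per_second * bytes_per_response
    return requests_per_second, bytes_per_second

# Hypothetical client: fetch with no wait, 2 ms per round trip,
# and roughly 100 bytes per (mostly empty) response.
rate, byte_rate = client_load(round_trip_ms=2, bytes_per_response=100)
```

Under these assumptions the connection issues 500 requests per second while
transferring only 50,000 bytes per second, so a byte-rate quota of, say, 1 MB/s
would never throttle it, whereas a quota of 100 requests per second would.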

Jun/Ismael,

I am interested in your views on request rate based quotas and whether we
should still consider utilization of the resource handler pool.


Many thanks,

Rajini


On Sun, Feb 19, 2017 at 11:54 PM, Becket Qin <[email protected]> wrote:

> Thanks for the KIP, Rajini,
>
> If I understand correctly the proposal was essentially trying to quota the
> CPU usage (that is probably why time slice is used instead of request rate)
> while the existing quota we have is for network bandwidth.
>
> Given we are trying to throttle both CPU and Network, that implies the
> following patterns for the clients:
> 1. High CPU usage, high network usage.
> 2. High CPU usage, low network usage.
> 3. Low CPU usage, high network usage.
> 4. Low CPU usage, low network usage
>
> Theoretically the existing quota addresses cases 3 & 4. And this KIP seems
> to be trying to address cases 1 & 2. However, it might be helpful to
> understand what we want to achieve with CPU and network quotas.
>
> People mainly use quota for two different purposes:
> a) protecting the broker from misbehaving clients, and
> b) resource distribution for multi-tenancy.
>
> I agree that generally speaking CPU time is a suitable metric to quota on
> for CPU usage and would work for a). However, as Dong and Onur noticed, it
> is not easy to quantify the impact for the end users at application level
> with a throttled CPU time. If the purpose of the CPU quota is only for
> protection, maybe we don't need a user facing CPU quota.
>
> That said, a user facing CPU quota could be useful for virtualization,
> which may be related to multi-tenancy but is a little different. Imagine
> there are 10 services sharing the same physical Kafka cluster. With CPU
> time quota and network bandwidth quota, each service can provision a
> logical Kafka cluster with some reserved CPU time and network bandwidth.
> And in this case the quota will be per logical cluster. Not sure if this
> is what the KIP intends for the future, though. It would be good if the
> KIP can be more clear on what exact scenarios the CPU quota is trying to
> address.
>
> As for the request rate quota, while it seems easy to enforce and intuitive,
> there are some caveats.
> 1. Users do not have direct control over the request rate, i.e. users do
> not know when a request will be sent by the clients.
> 2. Each request may require a different amount of CPU resources to handle.
> That may depend on many things, e.g. whether acks = 1 or acks = -1,
> whether a request is addressing 1000 partitions or 1 partition, whether a
> fetch request requires message format down conversion or not, etc.
> So the result of using request rate quota could be quite unpredictable.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sat, Feb 18, 2017 at 9:35 PM, Dong Lin <[email protected]> wrote:
>
> > I realized the main concern with this proposal is how users can interpret
> > this CPU-percentage based quota. Since this quota is exposed to users, we
> > need to explain to them how this quota is going to impact their
> > application performance and convince them that the quota is not too low
> > for their application. We are able to do this with a byte-rate based
> > quota. But I am not sure how we can do this with a CPU-percentage based
> > quota. For example, how is a user going to understand whether 1% CPU is
> > OK?
> >
> > On Fri, Feb 17, 2017 at 10:11 AM, Onur Karaman <[email protected]> wrote:
> >
> > > Overall a big fan of the KIP.
> > >
> > > I'd have to agree with Dong. I'm not sure about the decision of using
> > > the percentage over the window as opposed to request rate. It's pretty
> > > hard to reason about. I just spoke to one of our SREs and he agrees.
> > >
> > > Also I may have missed it, but I couldn't find information in the KIP
> > > on where this window would be configured.
> > >
> > > On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin <[email protected]> wrote:
> > >
> > > > To correct the typo above: It seems to me that determination of
> > > > request rate is not any more difficult than determination of *byte*
> > > > rate as both metrics are commonly used to measure performance and
> > > > provide guarantee to user.
> > > >
> > > > On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin <[email protected]> wrote:
> > > >
> > > > > Hey Rajini,
> > > > >
> > > > > Thanks for the KIP. I have some questions:
> > > > >
> > > > > - I am wondering why throttling based on request rate is listed as
> > > > > a rejected alternative. Can you provide a more specific reason why
> > > > > it is difficult for administrators to decide request rates to
> > > > > allocate? It seems to me that determination of request rate is not
> > > > > any more difficult than determination of request rate as both
> > > > > metrics are commonly used to measure performance and provide
> > > > > guarantees to users. On the other hand, the percentage of
> > > > > processing time provides a vague guarantee to users. For example,
> > > > > what performance can a user expect if you provide a 1% processing
> > > > > time quota to this user? How is the administrator going to decide
> > > > > this quota? Should the Kafka administrator continue to reduce this
> > > > > percentage quota as the number of users grows?
> > > > >
> > > > > - The KIP suggests that LeaderAndIsrRequest and MetadataRequest
> > > > > will also be throttled by this quota. What is the motivation for
> > > > > throttling these requests? It is also inconsistent with the
> > > > > rate-based quota, which is only applied to ProduceRequest and
> > > > > FetchRequest. IMO it will be simpler to only throttle
> > > > > ProduceRequest and FetchRequest.
> > > > >
> > > > > - Do you think we should also throttle the inter-broker traffic
> > > > > using this quota, similar to KIP-73?
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <[email protected]> wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > > >> I have just created KIP-124 to introduce request rate quotas to
> > > > > >> Kafka:
> > > > > >>
> > > > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > > >>
> > > > > >> The proposal is for a simple percentage request handling time
> > > > > >> quota that can be allocated to *<client-id>*, *<user>* or
> > > > > >> *<user, client-id>*. There are a few other suggestions also
> > > > > >> under "Rejected alternatives". Feedback and suggestions are
> > > > > >> welcome.
> > > > >>
> > > > >> Thank you...
> > > > >>
> > > > >> Regards,
> > > > >>
> > > > >> Rajini
> > > > >>
> > > > >
> > > > >
> > > >
> > >
> >
>
