I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Thank you all for the feedback.
>
> Ismael #1. It makes sense not to throttle inter-broker requests like
> LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
> requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
> clients from using these requests and that unauthorized requests are
> counted towards quotas.
>
> Ismael #2, Jay #1: I was thinking that these quotas can return a separate
> throttle time, and all utilization based quotas could use the same field
> (we won't add another one for network thread utilization for instance). But
> perhaps it makes sense to keep byte rate quotas separate in produce/fetch
> responses to provide separate metrics? Agree with Ismael that the name of
> the existing field should be changed if we have two. Happy to switch to a
> single combined throttle time if that is sufficient.
>
> Ismael #4, #5, #6: Will update the KIP. Will use a dot-separated name for
> the new property. Replication quotas use dot-separated names, so it will
> be consistent with all properties except the byte rate quotas.
>
> Radai: #1 Request processing time rather than request rate was chosen
> because the time per request can vary significantly between requests, as
> mentioned in the discussion and the KIP.
> #2 Two separate quotas for heartbeats/regular requests feel like more
> configuration and more metrics. Since most users would set quotas higher
> than the expected usage and quotas are more of a safety net, a single quota
> should work in most cases.
>  #3 The number of requests in purgatory is limited by the number of active
> connections since only one request per connection will be throttled at a
> time.
> #4 As with byte rate quotas, to use the full allocated quotas,
> clients/users would need to use partitions that are distributed across the
> cluster. The alternative of using cluster-wide quotas instead of per-broker
> quotas would be far too complex to implement.
>
> Dong : We currently have two ClientQuotaManagers for quota types Fetch and
> Produce. A new one will be added for IOThread, which manages quotas for I/O
> thread utilization. This will not update the Fetch or Produce queue-size,
> but will have a separate metric for the queue-size.  I wasn't planning to
> add any additional metrics apart from the equivalent ones for existing
> quotas as part of this KIP. Ratio of byte-rate to I/O thread utilization
> could be slightly misleading since it depends on the sequence of requests.
> But we can look into more metrics after the KIP is implemented if required.
>
> I think we need to limit the maximum delay since all requests are
> throttled. If a client has a quota of 0.001 units and a single request used
> 50ms, we don't want to delay all requests from the client by 50 seconds,
> throwing the client out of all its consumer groups. The issue is only if a
> user is allocated a quota that is insufficient to process one large
> request. The expectation is that the units allocated per user will be much
> higher than the time taken to process one request and the limit should
> seldom be applied. Agree this needs proper documentation.
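A minimal sketch of the capping rule described above (illustrative Python only, not the broker's actual Scala code; `compute_throttle_ms` and its parameters are invented names for this example):

```python
def compute_throttle_ms(used_units_ms, quota_units, window_ms):
    """Return how long to delay the next response, in milliseconds.

    used_units_ms: request handler time consumed in the current window (ms)
    quota_units:   allowed fraction of one handler thread, e.g. 0.001
    window_ms:     length of the quota window (ms)
    """
    allowed_ms = quota_units * window_ms
    if used_units_ms <= allowed_ms:
        return 0.0
    # Uncapped delay: wait until average usage falls back under the quota.
    uncapped = (used_units_ms - allowed_ms) / quota_units
    # Cap at one window so a single 50 ms request under a 0.001-unit quota
    # does not delay the client by ~50 seconds and throw it out of all its
    # consumer groups.
    return min(uncapped, window_ms)
```

For the example in the mail: 50 ms of handler time against a 0.001-unit quota over a one-second window gives an uncapped delay of about 49 seconds, which the cap reduces to the window size.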
>
> Regards,
>
> Rajini
>
>
> On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenbl...@gmail.com> wrote:
>
>> @jun: I wasn't concerned about tying up a request processing thread, but
>> IIUC the code does still read the entire request out, which might add up
>> to a non-negligible amount of memory.
>>
>> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindon...@gmail.com> wrote:
>>
>> > Hey Rajini,
>> >
>> > The current KIP says that the maximum delay will be reduced to window
>> size
>> > if it is larger than the window size. I have a concern with this:
>> >
>> > 1) This essentially means that the user is allowed to exceed their quota
>> > over a long period of time. Can you provide an upper bound on this
>> > deviation?
>> >
>> > 2) What is the motivation for capping the maximum delay at the window
>> > size? I am wondering if there is a better alternative to address the
>> > problem.
>> >
>> > 3) It means that the existing metric-related config will have a more
>> > direct impact on the mechanism of this io-thread-unit-based quota. This
>> > may be an important change depending on the answer to 1) above. We
>> > probably need to document this more explicitly.
>> >
>> > Dong
>> >
>> >
>> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindon...@gmail.com> wrote:
>> >
>> > > Hey Jun,
>> > >
>> > > Yeah you are right. I thought it wasn't because at LinkedIn it would
>> > > be too much pressure on inGraph to expose those per-clientId metrics,
>> > > so we ended up printing them periodically to a local log. Never mind
>> > > if it is not a general problem.
>> > >
>> > > Hey Rajini,
>> > >
>> > > - I agree with Jay that we probably don't want to add a new field
>> > > for every quota to ProduceResponse or FetchResponse. Is there any
>> > > use-case for having separate throttle-time fields for byte-rate-quota
>> > > and io-thread-unit-quota? You probably need to document this as an
>> > > interface change if you plan to add a new field to any request.
>> > >
>> > > - I don't think IOThread belongs in quotaType. The existing quota
>> > > types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
>> > > identify the type of request that is throttled, not the quota
>> > > mechanism that is applied.
>> > >
>> > > - If a request is throttled due to this io-thread-unit-based quota, is
>> > the
>> > > existing queue-size metric in ClientQuotaManager incremented?
>> > >
>> > > - In the interest of providing a guideline for admins to decide on
>> > > the io-thread-unit-based quota, and for users to understand its
>> > > impact on their traffic, would it be useful to have a metric that
>> > > shows the overall byte-rate per io-thread-unit? Can we also show this
>> > > as a per-clientId metric?
>> > >
>> > > Thanks,
>> > > Dong
>> > >
>> > >
>> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <j...@confluent.io> wrote:
>> > >
>> > >> Hi, Ismael,
>> > >>
>> > >> For #3, typically, an admin won't configure more io threads than CPU
>> > >> cores,
>> > >> but it's possible for an admin to start with fewer io threads than
>> cores
>> > >> and grow that later on.
>> > >>
>> > >> Hi, Dong,
>> > >>
>> > >> I think the throttleTime sensor on the broker tells the admin
>> > >> whether a user/clientId is throttled or not.
>> > >>
>> > >> Hi, Radai,
>> > >>
>> > >> The reasoning for delaying the throttled requests on the broker
>> instead
>> > of
>> > >> returning an error immediately is that the latter has no way to
>> prevent
>> > >> the
>> > >> client from retrying immediately, which will make things worse. The
>> > >> delaying logic is based off a delay queue. A separate expiration
>> thread
>> > >> just waits on the next to be expired request. So, it doesn't tie up a
>> > >> request handler thread.
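The delay-queue mechanism Jun describes can be sketched roughly as follows (a Python model for illustration only; the broker's purgatory is built on Java's DelayQueue, and the class and method names here are invented):

```python
import heapq
import itertools
import threading
import time

class ThrottledResponseQueue:
    """Throttled responses sit in a delay queue; a single expiration thread
    sends each one when its delay elapses, so no request handler thread is
    tied up waiting."""

    def __init__(self):
        self._heap = []                      # (deadline, seq, response)
        self._seq = itertools.count()        # tie-breaker for equal deadlines
        self._cond = threading.Condition()

    def add(self, response, delay_s):
        """Called by a request handler thread; returns immediately."""
        with self._cond:
            heapq.heappush(
                self._heap,
                (time.monotonic() + delay_s, next(self._seq), response))
            self._cond.notify()

    def take_expired(self):
        """Block until the earliest response's delay has elapsed, then
        return it. One background thread loops on this call."""
        with self._cond:
            while True:
                if not self._heap:
                    self._cond.wait()
                    continue
                deadline, _, response = self._heap[0]
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    heapq.heappop(self._heap)
                    return response
                self._cond.wait(timeout=remaining)
```

A single expiration thread looping on `take_expired()` plays the role of the separate thread Jun mentions: handler threads hand off throttled responses via `add()` and move on to the next request.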
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Jun
>> > >>
>> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ism...@juma.me.uk>
>> wrote:
>> > >>
>> > >> > Hi Jay,
>> > >> >
>> > >> > Regarding 1, I definitely like the simplicity of keeping a single
>> > >> throttle
>> > >> > time field in the response. The downside is that the client metrics
>> > >> will be
>> > >> > more coarse grained.
>> > >> >
>> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
>> > >> > `log.cleaner.min.cleanable.ratio`.
>> > >> >
>> > >> > Ismael
>> > >> >
>> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <j...@confluent.io>
>> wrote:
>> > >> >
>> > >> > > A few minor comments:
>> > >> > >
>> > >> > >    1. Isn't it the case that the throttling time response field
>> > should
>> > >> > have
>> > >> > >    the total time your request was throttled irrespective of the
>> > >> quotas
>> > >> > > that
>> > >> > >    caused that. Limiting it to byte rate quota doesn't make
>> > >> > >    sense, but I also don't think we want to end up adding new
>> > >> > >    fields in the response for every single thing we quota, right?
>> > >> > >    2. I don't think we should make this quota specifically about
>> io
>> > >> > >    threads. Once we introduce these quotas people set them and
>> > expect
>> > >> > them
>> > >> > > to
>> > >> > >    be enforced (and if they aren't it may cause an outage). As a
>> > >> result
>> > >> > > they
>> > >> > >    are a bit more sensitive than normal configs, I think. The
>> > current
>> > >> > > thread
>> > >> > >    pools seem like something of an implementation detail and not
>> the
>> > >> > level
>> > >> > > the
>> > >> > >    user-facing quotas should be involved with. I think it might
>> be
>> > >> better
>> > >> > > to
>> > >> > >    make this a general request-time throttle with no mention in
>> the
>> > >> > naming
>> > >> > >    about I/O threads and simply acknowledge the current
>> limitation
>> > >> (which
>> > >> > > we
>> > >> > >    may someday fix) in the docs that this covers only the time
>> after
>> > >> the
>> > >> > >    thread is read off the network.
>> > >> > >    3. As such I think the right interface to the user would be
>> > >> something
>> > >> > >    like percent_request_time and be in {0,...100} or
>> > >> request_time_ratio
>> > >> > > and be
>> > >> > >    in {0.0,...,1.0} (I think "ratio" is the terminology we used
>> if
>> > the
>> > >> > > scale
>> > >> > >    is between 0 and 1 in the other metrics, right?)
>> > >> > >
>> > >> > > -Jay
>> > >> > >
>> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
>> > >> rajinisiva...@gmail.com
>> > >> > >
>> > >> > > wrote:
>> > >> > >
>> > >> > > > Guozhang/Dong,
>> > >> > > >
>> > >> > > > Thank you for the feedback.
>> > >> > > >
>> > >> > > > Guozhang : I have updated the section on co-existence of byte
>> rate
>> > >> and
>> > >> > > > request time quotas.
>> > >> > > >
>> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
>> since
>> > >> they
>> > >> > > are
>> > >> > > > going to be very similar to the existing metrics and sensors.
>> To
>> > >> avoid
>> > >> > > > confusion, I have now added more detail. All metrics are in the
>> > >> group
>> > >> > > > "quotaType" and all sensors have names starting with
>> "quotaType"
>> > >> (where
>> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
>> > >> > > > FollowerReplication/*IOThread*).
>> > >> > > > So there will be no reuse of existing metrics/sensors. The new
>> > ones
>> > >> for
>> > >> > > > request processing time based throttling will be completely
>> > >> independent
>> > >> > > of
>> > >> > > > existing metrics/sensors, but will be consistent in format.
>> > >> > > >
>> > >> > > > The existing throttle_time_ms field in produce/fetch responses
>> > will
>> > >> not
>> > >> > > be
>> > >> > > > impacted by this KIP. That will continue to return byte-rate
>> based
>> > >> > > > throttling times. In addition, a new field
>> > request_throttle_time_ms
>> > >> > will
>> > >> > > be
>> > >> > > > added to return request quota based throttling times. These
>> will
>> > be
>> > >> > > exposed
>> > >> > > > as new metrics on the client-side.
>> > >> > > >
>> > >> > > > Since all metrics and sensors are different for each type of
>> > quota,
>> > >> I
>> > >> > > > believe there is already sufficient metrics to monitor
>> throttling
>> > on
>> > >> > both
>> > >> > > > client and broker side for each type of throttling.
>> > >> > > >
>> > >> > > > Regards,
>> > >> > > >
>> > >> > > > Rajini
>> > >> > > >
>> > >> > > >
>> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindon...@gmail.com
>> >
>> > >> wrote:
>> > >> > > >
>> > >> > > > > Hey Rajini,
>> > >> > > > >
>> > >> > > > > I think it makes a lot of sense to use io_thread_units as
>> metric
>> > >> to
>> > >> > > quota
>> > >> > > > > user's traffic here. LGTM overall. I have some questions
>> > regarding
>> > >> > > > sensors.
>> > >> > > > >
>> > >> > > > > - Can you be more specific in the KIP what sensors will be
>> > added?
>> > >> For
>> > >> > > > > example, it will be useful to specify the name and
>> attributes of
>> > >> > these
>> > >> > > > new
>> > >> > > > > sensors.
>> > >> > > > >
>> > >> > > > > - We currently have throttle-time and queue-size for
>> byte-rate
>> > >> based
>> > >> > > > quota.
>> > >> > > > > Are you going to have separate throttle-time and queue-size
>> for
>> > >> > > requests
>> > >> > > > > throttled by io_thread_unit-based quota, or will they share
>> the
>> > >> same
>> > >> > > > > sensor?
>> > >> > > > >
>> > >> > > > > - Does the throttle-time in the ProduceResponse and
>> > >> > > > > FetchResponse contain time due to the io_thread_unit-based
>> > >> > > > > quota?
>> > >> > > > >
>> > >> > > > > - Currently the Kafka server doesn't provide any log or
>> > >> > > > > metrics that tell whether any given clientId (or user) is
>> > >> > > > > throttled. This is not too bad because we can still check the
>> > >> > > > > client-side byte-rate metric to validate whether a given
>> > >> > > > > client is throttled. But with this io_thread_unit, there will
>> > >> > > > > be no way to validate whether a given client is slow because
>> > >> > > > > it has exceeded its io_thread_unit limit. Users need this
>> > >> > > > > information to figure out whether they have reached their
>> > >> > > > > quota limit. How about we add a log4j log on the server side
>> > >> > > > > to periodically print (client_id, byte-rate-throttle-time,
>> > >> > > > > io-thread-unit-throttle-time) so that the Kafka administrator
>> > >> > > > > can identify users that have reached their limit and act
>> > >> > > > > accordingly?
>> > >> > > > >
>> > >> > > > > Thanks,
>> > >> > > > > Dong
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > >
>> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
>> > >> wangg...@gmail.com>
>> > >> > > > wrote:
>> > >> > > > >
>> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
>> comment
>> > on
>> > >> > the
>> > >> > > > > > throttling implementation:
>> > >> > > > > >
>> > >> > > > > > Stated as "Request processing time throttling will be
>> applied
>> > on
>> > >> > top
>> > >> > > if
>> > >> > > > > > necessary." I thought that it meant the request processing
>> > time
>> > >> > > > > throttling
>> > >> > > > > > is applied first, but on reading further I found it actually
>> > >> meant to
>> > >> > > > apply
>> > >> > > > > > produce / fetch byte rate throttling first.
>> > >> > > > > >
>> > >> > > > > > Also the last sentence "The remaining delay if any is
>> applied
>> > to
>> > >> > the
>> > >> > > > > > response." is a bit confusing to me. Maybe rewording it a
>> bit?
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > Guozhang
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <j...@confluent.io
>> >
>> > >> wrote:
>> > >> > > > > >
>> > >> > > > > > > Hi, Rajini,
>> > >> > > > > > >
>> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks
>> good
>> > to
>> > >> me.
>> > >> > > > > > >
>> > >> > > > > > > Jun
>> > >> > > > > > >
>> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
>> > >> > > > > rajinisiva...@gmail.com
>> > >> > > > > > >
>> > >> > > > > > > wrote:
>> > >> > > > > > >
>> > >> > > > > > > > Jun/Roger,
>> > >> > > > > > > >
>> > >> > > > > > > > Thank you for the feedback.
>> > >> > > > > > > >
>> > >> > > > > > > > 1. I have updated the KIP to use absolute units
>> instead of
>> > >> > > > > percentage.
>> > >> > > > > > > The
>> > >> > > > > > > > property is called *io_thread_units* to align with the
>> > >> thread
>> > >> > > count
>> > >> > > > > > > > property *num.io.threads*. When we implement network
>> > thread
>> > >> > > > > utilization
>> > >> > > > > > > > quotas, we can add another property
>> > *network_thread_units*.
>> > >> > > > > > > >
>> > >> > > > > > > > 2. ControlledShutdown is already listed under the
>> exempt
>> > >> > > requests.
>> > >> > > > > Jun,
>> > >> > > > > > > did
>> > >> > > > > > > > you mean a different request that needs to be added?
>> The
>> > >> four
>> > >> > > > > requests
>> > >> > > > > > > > currently exempt in the KIP are StopReplica,
>> > >> > ControlledShutdown,
>> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled
>> > using
>> > >> > > > > > ClusterAction
>> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
>> > >> > unauthorized.
>> > >> > > I
>> > >> > > > > > wasn't
>> > >> > > > > > > > sure if there are other requests used only for
>> > inter-broker
>> > >> > that
>> > >> > > > > needed
>> > >> > > > > > > to
>> > >> > > > > > > > be excluded.
>> > >> > > > > > > >
>> > >> > > > > > > > 3. I was thinking the smallest change would be to
>> replace
>> > >> all
>> > >> > > > > > references
>> > >> > > > > > > to
>> > >> > > > > > > > *requestChannel.sendResponse()* with a local method
>> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the throttling
>> if
>> > >> any
>> > >> > > plus
>> > >> > > > > send
>> > >> > > > > > > > response. If we throttle first in *KafkaApis.handle()*,
>> > the
>> > >> > time
>> > >> > > > > spent
>> > >> > > > > > > > within the method handling the request will not be
>> > recorded
>> > >> or
>> > >> > > used
>> > >> > > > > in
>> > >> > > > > > > > throttling. We can look into this again when the PR is
>> > ready
>> > >> > for
>> > >> > > > > > review.
>> > >> > > > > > > >
>> > >> > > > > > > > Regards,
>> > >> > > > > > > >
>> > >> > > > > > > > Rajini
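Rajini's point 3 above can be sketched like this (an illustrative Python model of the idea only; the real change would be in the broker's Scala KafkaApis, and names such as `send_response_maybe_throttle` are invented for this example):

```python
import time

class RequestTimeQuota:
    """Tracks request handler time used by one client in the current window."""

    def __init__(self, quota_units, window_ms=1000):
        self.quota_units = quota_units
        self.window_ms = window_ms
        self.used_ms = 0.0

    def record(self, handler_ms):
        self.used_ms += handler_ms

    def throttle_ms(self):
        allowed = self.quota_units * self.window_ms
        if self.used_ms <= allowed:
            return 0.0
        # Cap the delay at one window, as discussed earlier in the thread.
        return min((self.used_ms - allowed) / self.quota_units, self.window_ms)

def send_response_maybe_throttle(quota, start_s, send):
    # Record the full time spent handling the request, so the work done
    # inside the handler method itself counts towards the quota --
    # throttling first in handle() would miss that time.
    quota.record((time.monotonic() - start_s) * 1000.0)
    delay_ms = quota.throttle_ms()
    if delay_ms:
        # A real broker would park the response in a delay queue instead
        # of sleeping; a sleep keeps this sketch simple.
        time.sleep(delay_ms / 1000.0)
    send()
```

Every call site that previously sent the response directly would instead go through this one wrapper, so the throttling decision is made after the handling time is known.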
>> > >> > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
>> > >> > > > > roger.hoo...@gmail.com>
>> > >> > > > > > > > wrote:
>> > >> > > > > > > >
>> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
>> > >> > > > > > > > >
>> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
>> application
>> > is
>> > >> > > > > allocated
>> > >> > > > > > 1
>> > >> > > > > > > > > request handler unit, then it's as if I have a Kafka
>> > >> broker
>> > >> > > with
>> > >> > > > a
>> > >> > > > > > > single
>> > >> > > > > > > > > request handler thread dedicated to me.  That's the
>> > most I
>> > >> > can
>> > >> > > > use,
>> > >> > > > > > at
>> > >> > > > > > > > > least.  That allocation doesn't change even if an
>> admin
>> > >> later
>> > >> > > > > > increases
>> > >> > > > > > > > the
>> > >> > > > > > > > > size of the request thread pool on the broker.  It's
>> > >> similar
>> > >> > to
>> > >> > > > the
>> > >> > > > > > CPU
>> > >> > > > > > > > > abstraction that VMs and containers get from
>> hypervisors
>> > >> or
>> > >> > OS
>> > >> > > > > > > > schedulers.
>> > >> > > > > > > > > While different client access patterns can use wildly
>> > >> > different
>> > >> > > > > > amounts
>> > >> > > > > > > > of
>> > >> > > > > > > > > request thread resources per request, a given
>> > application
>> > >> > will
>> > >> > > > > > > generally
>> > >> > > > > > > > > have a stable access pattern and can figure out
>> > >> empirically
>> > >> > how
>> > >> > > > > many
>> > >> > > > > > > > > "request thread units" it needs to meet it's
>> > >> > throughput/latency
>> > >> > > > > > goals.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Cheers,
>> > >> > > > > > > > >
>> > >> > > > > > > > > Roger
>> > >> > > > > > > > >
>> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
>> > >> j...@confluent.io>
>> > >> > > > wrote:
>> > >> > > > > > > > >
>> > >> > > > > > > > > > Hi, Rajini,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 1. A concern of request_time_percent is that it's
>> not
>> > an
>> > >> > > > absolute
>> > >> > > > > > > > value.
>> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the admin
>> > >> doubles
>> > >> > > the
>> > >> > > > > > > number
>> > >> > > > > > > > of
>> > >> > > > > > > > > > request handler threads, that user now actually has
>> > >> twice
>> > >> > the
>> > >> > > > > > > absolute
>> > >> > > > > > > > > > capacity. This may confuse people a bit. So,
>> perhaps
>> > >> > setting
>> > >> > > > the
>> > >> > > > > > > quota
>> > >> > > > > > > > > > based on an absolute request thread unit is better.
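The arithmetic behind this concern, with illustrative numbers (the function name is invented):

```python
def absolute_capacity(percent_quota, num_request_handler_threads):
    """A percentage quota is relative to the whole pool, so its absolute
    value silently changes whenever the admin resizes the pool."""
    return percent_quota / 100.0 * num_request_handler_threads

# The same 10% quota, before and after num.io.threads is doubled:
before = absolute_capacity(10, 8)    # 0.8 of a request handler thread
after = absolute_capacity(10, 16)    # 1.6 threads: capacity has doubled
```

An absolute request thread unit (say 0.8 thread-units) would stay fixed across pool resizes.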
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
>> inter-broker
>> > >> > request
>> > >> > > > and
>> > >> > > > > > > needs
>> > >> > > > > > > > to
>> > >> > > > > > > > > > be excluded from throttling.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
>> simpler
>> > >> to
>> > >> > > apply
>> > >> > > > > the
>> > >> > > > > > > > > request
>> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
>> > Otherwise,
>> > >> we
>> > >> > > will
>> > >> > > > > > need
>> > >> > > > > > > to
>> > >> > > > > > > > > add
>> > >> > > > > > > > > > the throttling logic in each type of request.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thanks,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Jun
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
>> > >> > > > > > > > rajinisiva...@gmail.com
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > wrote:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > Jun,
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Thank you for the review.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > I have reverted to the original KIP that
>> throttles
>> > >> based
>> > >> > on
>> > >> > > > > > request
>> > >> > > > > > > > > > handler
>> > >> > > > > > > > > > > utilization. At the moment, it uses percentage,
>> but
>> > I
>> > >> am
>> > >> > > > happy
>> > >> > > > > to
>> > >> > > > > > > > > change
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
>> required. I
>> > >> have
>> > >> > > > added
>> > >> > > > > > the
>> > >> > > > > > > > > > examples
>> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
>> > "Future
>> > >> > Work"
>> > >> > > > > > section
>> > >> > > > > > > > to
>> > >> > > > > > > > > > > address network thread utilization. The
>> > configuration
>> > >> is
>> > >> > > > named
>> > >> > > > > > > > > > > "request_time_percent" with the expectation that
>> it
>> > >> can
>> > >> > > also
>> > >> > > > be
>> > >> > > > > > > used
>> > >> > > > > > > > as
>> > >> > > > > > > > > > the
>> > >> > > > > > > > > > > limit for network thread utilization when that is
>> > >> > > > implemented,
>> > >> > > > > so
>> > >> > > > > > > > that
>> > >> > > > > > > > > > > users have to set only one config for the two and
>> > not
>> > >> > have
>> > >> > > to
>> > >> > > > > > worry
>> > >> > > > > > > > > about
>> > >> > > > > > > > > > > the internal distribution of the work between the
>> > two
>> > >> > > thread
>> > >> > > > > > pools
>> > >> > > > > > > in
>> > >> > > > > > > > > > > Kafka.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Regards,
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Rajini
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
>> > >> > > j...@confluent.io>
>> > >> > > > > > > wrote:
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > > Hi, Rajini,
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Thanks for the proposal.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > The benefit of using the request processing
>> time
>> > >> over
>> > >> > the
>> > >> > > > > > request
>> > >> > > > > > > > > rate
>> > >> > > > > > > > > > is
>> > >> > > > > > > > > > > > exactly what people have said. I will just
>> expand
>> > >> that
>> > >> > a
>> > >> > > > bit.
>> > >> > > > > > > > > Consider
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > following case. The producer sends a produce
>> > request
>> > >> > > with a
>> > >> > > > > > 10MB
>> > >> > > > > > > > > > message
>> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
>> > >> decompression of
>> > >> > > the
>> > >> > > > > > > message
>> > >> > > > > > > > > on
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > broker could take 10-15 seconds, during which
>> > time,
>> > >> a
>> > >> > > > request
>> > >> > > > > > > > handler
>> > >> > > > > > > > > > > > thread is completely blocked. In this case,
>> > neither
>> > >> the
>> > >> > > > > byte-in
>> > >> > > > > > > > quota
>> > >> > > > > > > > > > nor
>> > >> > > > > > > > > > > > the request rate quota may be effective in
>> > >> protecting
>> > >> > the
>> > >> > > > > > broker.
>> > >> > > > > > > > > > > Consider
>> > >> > > > > > > > > > > > another case. A consumer group starts with 10
>> > >> instances
>> > >> > > and
>> > >> > > > > > later
>> > >> > > > > > > > on
>> > >> > > > > > > > > > > > switches to 20 instances. The request rate will
>> > >> likely
>> > >> > > > > double,
>> > >> > > > > > > but
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > > actually load on the broker may not double
>> since
>> > >> each
>> > >> > > fetch
>> > >> > > > > > > request
>> > >> > > > > > > > > > only
>> > >> > > > > > > > > > > > contains half of the partitions. Request rate
>> > quota
>> > >> may
>> > >> > > not
>> > >> > > > > be
>> > >> > > > > > > easy
>> > >> > > > > > > > > to
>> > >> > > > > > > > > > > > configure in this case.
>> > >> > > > > > > > > > > >
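The gzip example above can be made concrete with toy numbers (the 5 ms and 12 s handler times are illustrative, not measured):

```python
# Two produce requests that look identical to a byte-in quota but load the
# request handler threads very differently.
requests = [
    {"name": "plain 100KB produce", "bytes_in": 100_000, "handler_ms": 5},
    {"name": "100KB gzip of a 10MB message", "bytes_in": 100_000,
     "handler_ms": 12_000},
]

# A byte-in quota charges both requests the same amount...
byte_cost = [r["bytes_in"] for r in requests]

# ...while a request-time quota sees a 2400x difference in handler usage.
time_cost = [r["handler_ms"] for r in requests]
ratio = time_cost[1] / time_cost[0]
```

This is why neither the byte-in quota nor a request-rate quota protects the broker from the decompression-heavy request.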
>> > >> > > > > > > > > > > > What we really want is to be able to prevent a
>> > >> client
>> > >> > > from
>> > >> > > > > > using
>> > >> > > > > > > > too
>> > >> > > > > > > > > > much
>> > >> > > > > > > > > > > > of the server side resources. In this
>> particular
>> > >> KIP,
>> > >> > > this
>> > >> > > > > > > resource
>> > >> > > > > > > > > is
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > capacity of the request handler threads. I
>> agree
>> > >> that
>> > >> > it
>> > >> > > > may
>> > >> > > > > > not
>> > >> > > > > > > be
>> > >> > > > > > > > > > > > intuitive for the users to determine how to set
>> > the
>> > >> > right
>> > >> > > > > > limit.
>> > >> > > > > > > > > > However,
>> > >> > > > > > > > > > > > this is not completely new and has been done in
>> > the
>> > >> > > > container
>> > >> > > > > > > world
>> > >> > > > > > > > > > > > already. For example, Linux cgroup (
>> > >> > > > > https://access.redhat.com/
>> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
>> > >> terprise_Linux/6/html/
>> > >> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has
>> the
>> > >> > concept
>> > >> > > of
>> > >> > > > > > > > > > > > cpu.cfs_quota_us,
>> > >> > > > > > > > > > > > which specifies the total amount of time in
>> > >> > microseconds
>> > >> > > > for
>> > >> > > > > > > which
>> > >> > > > > > > > > all
>> > >> > > > > > > > > > > > tasks in a cgroup can run during a one second
>> > >> period.
>> > >> > We
>> > >> > > > can
>> > >> > > > > > > > > > potentially
>> > >> > > > > > > > > > > > model the request handler threads in a similar
>> > way.
>> > >> For
>> > >> > > > > > example,
>> > >> > > > > > > > each
>> > >> > > > > > > > > > > > request handler thread can be 1 request handler
>> > unit
>> > >> > and
>> > >> > > > the
>> > >> > > > > > > admin
>> > >> > > > > > > > > can
>> > >> > > > > > > > > > > > configure a limit on how many units (say 0.01)
>> a
>> > >> client
>> > >> > > can
>> > >> > > > > > have.
>> > >> > > > > > > > > > > >
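The cgroup-style accounting Jun outlines might look like this (a sketch under the stated analogy; `handler_budget_ms` is an invented name):

```python
def handler_budget_ms(units, period_ms=1_000):
    """1 unit = one full request handler thread, mirroring cgroup CPU
    accounting where cpu.cfs_quota_us bounds runtime per period.

    A client with 0.01 units may consume units * period_ms = 10 ms of
    request handler time per second, independent of the pool size."""
    return units * period_ms
```

Because the unit is defined against a single thread rather than the whole pool, growing num.io.threads later does not change any client's allocation.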
>> > >> > > > > > > > > > > > Regarding not throttling the internal broker to
>> > >> broker
>> > >> > > > > > requests.
>> > >> > > > > > > We
>> > >> > > > > > > > > > could
>> > >> > > > > > > > > > > > do that. Alternatively, we could just let the
>> > admin
>> > >> > > > > configure a
>> > >> > > > > > > > high
>> > >> > > > > > > > > > > limit
>> > >> > > > > > > > > > > > for the kafka user (it may not be able to do
>> that
>> > >> > easily
>> > >> > > > > based
>> > >> > > > > > on
>> > >> > > > > > > > > > > clientId
>> > >> > > > > > > > > > > > though).
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Ideally we want to be able to protect the
>> > >> utilization
>> > >> > of
>> > >> > > > the
>> > >> > > > > > > > network
>> > >> > > > > > > > > > > thread
>> > >> > > > > > > > > > > > pool too. The difficult is mostly what Rajini
>> > said:
>> > >> (1)
>> > >> > > The
>> > >> > > > > > > > mechanism
>> > >> > > > > > > > > > for
>> > >> > > > > > > > > > > > throttling the requests is through Purgatory
>> and
>> > we
>> > >> > will
>> > >> > > > have
>> > >> > > > > > to
>> > >> > > > > > > > > think
>> > >> > > > > > > > > > > > through how to integrate that into the network
>> > >> layer.
>> > >> > > (2)
>> > >> > > > In
>> > >> > > > > > the
>> > >> > > > > > > > > > network
>> > >> > > > > > > > > > > > layer, currently we know the user, but not the
>> > >> clientId
>> > >> > > of
>> > >> > > > > the
>> > >> > > > > > > > > request.
>> > >> > > > > > > > > > > So,
>> > >> > > > > > > > > > > > it's a bit tricky to throttle based on clientId
>> > >> there.
>> > >> > > > Plus,
>> > >> > > > > > the
>> > >> > > > > > > > > > byteOut
>> > >> > > > > > > > > > > > quota can already protect the network thread
>> > >> > utilization
>> > >> > > > for
>> > >> > > > > > > fetch
>> > >> > > > > > > > > > > > requests. So, if we can't figure out this part
>> > right
>> > >> > now,
>> > >> > > > > just
>> > >> > > > > > > > > focusing
>> > >> > > > > > > > > > > on
>> > >> > > > > > > > > > > > the request handling threads for this KIP is
>> > still a
>> > >> > > useful
>> > >> > > > > > > > feature.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Jun
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini
>> Sivaram <
>> > >> > > > > > > > > > rajinisiva...@gmail.com
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Thank you all for the feedback.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Jay: I have removed exemption for consumer
>> > >> heartbeat
>> > >> > > etc.
>> > >> > > > > > Agree
>> > >> > > > > > > > > that
>> > >> > > > > > > > > > > > > protecting the cluster is more important than
>> > >> > > protecting
>> > >> > > > > > > > individual
>> > >> > > > > > > > > > > apps.
>> > >> > > > > > > > > > > > > Have retained the exemption for
>> > >> > > StopReplica/LeaderAndIsr
>> > >> > > > > > etc,
>> > >> > > > > > > > > these
>> > >> > > > > > > > > > > are
>> > >> > > > > > > > > > > > > throttled only if authorization fails (so
>> can't
>> > be
>> > >> > used
>> > >> > > > for
>> > >> > > > > > DoS
>> > >> > > > > > > > > > attacks
>> > >> > > > > > > > > > > > in
>> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
>> > >> requests to
>> > >> > > > > > complete
>> > >> > > > > > > > > > without
>> > >> > > > > > > > > > > > > delays).
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > I will wait another day to see if there is
>> any
>> > >> > > objection
>> > >> > > > to
>> > >> > > > > > > > quotas
>> > >> > > > > > > > > > > based
>> > >> > > > > > > > > > > > on
>> > >> > > > > > > > > > > > > request processing time (as opposed to
>> request
>> > >> rate)
>> > >> > > and
>> > >> > > > if
>> > >> > > > > > > there
>> > >> > > > > > > > > are
>> > >> > > > > > > > > > > no
>> > >> > > > > > > > > > > > > objections, I will revert to the original
>> > proposal
>> > >> > with
>> > >> > > > > some
>> > >> > > > > > > > > changes.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > The original proposal only included the
>> > time
>> > >> > used
>> > >> > > by
>> > >> > > > > the
>> > >> > > > > > > > > request
>> > >> > > > > > > > > > > > > handler threads (that made calculation
>> easy). I
>> > >> think
>> > >> > > the
>> > >> > > > > > > > > suggestion
>> > >> > > > > > > > > > is
>> > >> > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > include the time spent in the network
>> threads as
>> > >> well
>> > >> > > > since
>> > >> > > > > > > that
>> > >> > > > > > > > > may
>> > >> > > > > > > > > > be
>> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is more
>> > >> > complicated
>> > >> > > > to
>> > >> > > > > > > > > calculate
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > > total available CPU time and convert to a
>> ratio
>> > >> when
>> > >> > > > there are
>> > >> > > > > > *m*
>> > >> > > > > > > > I/O
>> > >> > > > > > > > > > > > threads
>> > >> > > > > > > > > > > > > and *n* network threads.
>> > >> > ThreadMXBean#getThreadCpuTime(
>> > >> > > )
>> > >> > > > > may
>> > >> > > > > > > > give
>> > >> > > > > > > > > us
>> > >> > > > > > > > > > > > what
>> > >> > > > > > > > > > > > > we want, but it can be very expensive on some
>> > >> > > platforms.
>> > >> > > > As
>> > >> > > > > > > > Becket
>> > >> > > > > > > > > > and
>> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have several
>> > time
>> > >> > > > > > measurements
>> > >> > > > > > > > > > already
>> > >> > > > > > > > > > > > for
>> > >> > > > > > > > > > > > > generating metrics that we could use, though
>> we
>> > >> might
>> > >> > > > want
>> > >> > > > > to
>> > >> > > > > > > > > switch
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
>> since
>> > >> some
>> > >> > of
>> > >> > > > the
>> > >> > > > > > > > values
>> > >> > > > > > > > > > for
>> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather than
>> add
>> > >> up
>> > >> > the
>> > >> > > > > time
>> > >> > > > > > > > spent
>> > >> > > > > > > > > in
>> > >> > > > > > > > > > > I/O
>> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
>> better
>> > >> to
>> > >> > > > convert
>> > >> > > > > > the
>> > >> > > > > > > > > time
>> > >> > > > > > > > > > > > spent
>> > >> > > > > > > > > > > > > on each thread into a separate ratio? UserA
>> has
>> > a
>> > >> > > request
>> > >> > > > > > quota
>> > >> > > > > > > > of
>> > >> > > > > > > > > > 5%.
>> > >> > > > > > > > > > > > Can
>> > >> > > > > > > > > > > > > we take that to mean that UserA can use 5% of
>> > the
>> > >> > time
>> > >> > > on
>> > >> > > > > > > network
>> > >> > > > > > > > > > > threads
>> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If either
>> is
>> > >> > > exceeded,
>> > >> > > > > the
>> > >> > > > > > > > > > response
>> > >> > > > > > > > > > > is
>> > >> > > > > > > > > > > > > throttled - it would mean maintaining two
>> sets
>> > of
>> > >> > > metrics
>> > >> > > > > for
>> > >> > > > > > > the
>> > >> > > > > > > > > two
>> > >> > > > > > > > > > > > > durations, but would result in more
>> meaningful
>> > >> > ratios.
>> > >> > > We
>> > >> > > > > > could
>> > >> > > > > > > > > > define
>> > >> > > > > > > > > > > > two
>> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request threads
>> > and
>> > >> 10%
>> > >> > > of
>> > >> > > > > > > network
>> > >> > > > > > > > > > > > threads),
>> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
>> explain
>> > >> to
>> > >> > > > users.
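To make the separate-ratio idea concrete, here is a rough sketch of per-user time accounting across the two thread pools, throttling when either ratio is exceeded. All class and method names here are illustrative assumptions, not actual Kafka code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: track I/O-thread and network-thread time separately per user
// and throttle when either ratio exceeds the user's quota percentage.
public class DualRatioQuota {
    private final double quotaPercent;   // e.g. 5.0 means 5% of each pool
    private final long windowNs;         // wall-clock window per thread
    private final Map<String, Long> ioTimeNs = new HashMap<>();
    private final Map<String, Long> netTimeNs = new HashMap<>();

    public DualRatioQuota(double quotaPercent, long windowNs) {
        this.quotaPercent = quotaPercent;
        this.windowNs = windowNs;
    }

    public void recordIo(String user, long ns)  { ioTimeNs.merge(user, ns, Long::sum); }
    public void recordNet(String user, long ns) { netTimeNs.merge(user, ns, Long::sum); }

    // Throttle if the user exceeded the quota on either thread pool.
    public boolean shouldThrottle(String user, int ioThreads, int netThreads) {
        double ioPct  = 100.0 * ioTimeNs.getOrDefault(user, 0L)  / (windowNs * ioThreads);
        double netPct = 100.0 * netTimeNs.getOrDefault(user, 0L) / (windowNs * netThreads);
        return ioPct > quotaPercent || netPct > quotaPercent;
    }
}
```

The point of the sketch is that a single quota percentage can be applied to both pools without maintaining two separately configured limits; only the two duration metrics are kept per user.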
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
>> > network
>> > >> > > thread
>> > >> > > > > > > > > utilization:
>> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent in
>> the
>> > >> > network
>> > >> > > > > > thread
>> > >> > > > > > > > may
>> > >> > > > > > > > > be
>> > >> > > > > > > > > > > > > significant and I can see the need to include
>> > >> this.
>> > >> > Are
>> > >> > > > > there
>> > >> > > > > > > > other
>> > >> > > > > > > > > > > > > requests where the network thread
>> utilization is
>> > >> > > > > significant?
>> > >> > > > > > > In
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > case
>> > >> > > > > > > > > > > > > of fetch, request handler thread utilization
>> > would
>> > >> > > > throttle
>> > >> > > > > > > > clients
>> > >> > > > > > > > > > > with
>> > >> > > > > > > > > > > > > high request rate, low data volume and fetch
>> > byte
>> > >> > rate
>> > >> > > > > quota
>> > >> > > > > > > will
>> > >> > > > > > > > > > > > throttle
>> > >> > > > > > > > > > > > > clients with high data volume. Network thread
>> > >> > > utilization
>> > >> > > > > is
>> > >> > > > > > > > > perhaps
>> > >> > > > > > > > > > > > > proportional to the data volume. I am
>> wondering
>> > >> if we
>> > >> > > > even
>> > >> > > > > > need
>> > >> > > > > > > > to
>> > >> > > > > > > > > > > > throttle
>> > >> > > > > > > > > > > > > based on network thread utilization or
>> whether
>> > the
>> > >> > data
>> > >> > > > > > volume
>> > >> > > > > > > > > quota
>> > >> > > > > > > > > > > > covers
>> > >> > > > > > > > > > > > > this case.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > b) At the moment, we record and check for
>> quota
>> > >> > > violation
>> > >> > > > > at
>> > >> > > > > > > the
>> > >> > > > > > > > > same
>> > >> > > > > > > > > > > > time.
>> > >> > > > > > > > > > > > > If a quota is violated, the response is
>> delayed.
>> > >> > Using
>> > >> > > > > Jay's
>> > >> > > > > > > > > example
>> > >> > > > > > > > > > of
>> > >> > > > > > > > > > > > > disk reads for fetches happening in the
>> network
>> > >> > thread,
>> > >> > > > we
>> > >> > > > > > > can't
>> > >> > > > > > > > > > record
>> > >> > > > > > > > > > > > and
>> > >> > > > > > > > > > > > > delay a response after the disk reads. We
>> could
>> > >> > record
>> > >> > > > the
>> > >> > > > > > time
>> > >> > > > > > > > > spent
>> > >> > > > > > > > > > > on
>> > >> > > > > > > > > > > > > the network thread when the response is
>> complete
>> > >> and
>> > >> > > > > > introduce
>> > >> > > > > > > a
>> > >> > > > > > > > > > delay
>> > >> > > > > > > > > > > > for
>> > >> > > > > > > > > > > > > handling a subsequent request (separate out
>> > >> recording
>> > >> > > and
>> > >> > > > > > quota
>> > >> > > > > > > > > > > violation
>> > >> > > > > > > > > > > > > handling in the case of network thread
>> > overload).
>> > >> > Does
>> > >> > > > that
>> > >> > > > > > > make
>> > >> > > > > > > > > > sense?
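A minimal sketch of that "record now, throttle later" separation might look as follows (names are illustrative, not from the codebase): the network-thread time is recorded once the response has already been sent, and any excess over the allowance is applied as a delay to the client's next request.

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch: deferred throttling for time spent on network threads.
public class DeferredThrottle {
    private final long allowanceNs;
    private final ConcurrentHashMap<String, Long> usedNs = new ConcurrentHashMap<>();

    public DeferredThrottle(long allowanceNs) { this.allowanceNs = allowanceNs; }

    // Called after the response completes (too late to delay that response).
    public void onResponseComplete(String client, long networkThreadNs) {
        usedNs.merge(client, networkThreadNs, Long::sum);
    }

    // Called before handling the client's next request; returns the delay to apply.
    public long delayForNextRequestNs(String client) {
        long used = usedNs.getOrDefault(client, 0L);
        return Math.max(0L, used - allowanceNs);
    }
}
```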
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Regards,
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Rajini
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
>> > >> > > > > > > > becket....@gmail.com>
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Hey Jay,
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time
>> is a
>> > >> > little
>> > >> > > > > > > tricky. I
>> > >> > > > > > > > > am
>> > >> > > > > > > > > > > > > thinking
>> > >> > > > > > > > > > > > > > that maybe we can use the existing request
>> > >> > > statistics.
>> > >> > > > > They
>> > >> > > > > > > are
>> > >> > > > > > > > > > > already
>> > >> > > > > > > > > > > > > > very detailed so we can probably see the
>> > >> > approximate
>> > >> > > > CPU
>> > >> > > > > > time
>> > >> > > > > > > > > from
>> > >> > > > > > > > > > > it,
>> > >> > > > > > > > > > > > > e.g.
>> > >> > > > > > > > > > > > > > something like (total_time -
>> > >> > > > request/response_queue_time
>> > >> > > > > -
>> > >> > > > > > > > > > > > remote_time).
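As a trivial illustration of that approximation (field names here are illustrative, not the actual request metrics):

```java
// Sketch: approximate local processing time from existing request statistics
// by subtracting queueing and remote (e.g. purgatory/replication wait) time.
public class RequestTimeApprox {
    public static long approxLocalTimeMs(long totalMs, long requestQueueMs,
                                         long responseQueueMs, long remoteMs) {
        return totalMs - requestQueueMs - responseQueueMs - remoteMs;
    }
}
```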
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is
>> > >> throttled
>> > >> > > it
>> > >> > > > is
>> > >> > > > > > > > likely
>> > >> > > > > > > > > > that
>> > >> > > > > > > > > > > > we
>> > >> > > > > > > > > > > > > > need to see if anything has gone wrong
>> first,
>> > >> and
>> > >> > if
>> > >> > > > the
>> > >> > > > > > > users
>> > >> > > > > > > > > are
>> > >> > > > > > > > > > > well
>> > >> > > > > > > > > > > > > > behaving and just need more resources, we
>> will
>> > >> have
>> > >> > > to
>> > >> > > > > bump
>> > >> > > > > > > up
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > > quota
>> > >> > > > > > > > > > > > > > for them. It is true that pre-allocating
>> CPU
>> > >> time
>> > >> > > quota
>> > >> > > > > > > > precisely
>> > >> > > > > > > > > > for
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > users is difficult. So in practice it would
>> > >> > probably
>> > >> > > be
>> > >> > > > > > more
>> > >> > > > > > > > like
>> > >> > > > > > > > > > > first
>> > >> > > > > > > > > > > > > set
>> > >> > > > > > > > > > > > > > a relative high protective CPU time quota
>> for
>> > >> > > everyone
>> > >> > > > > and
>> > >> > > > > > > > > increase
>> > >> > > > > > > > > > > > that
>> > >> > > > > > > > > > > > > > for some individual clients on demand.
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang
>> > Wang <
>> > >> > > > > > > > > wangg...@gmail.com
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see it
>> > >> > happening.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or
>> more
>> > >> > > > > specifically
>> > >> > > > > > > > > > > processing
>> > >> > > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > ratio instead of the request rate
>> throttling
>> > >> as
>> > >> > > well.
>> > >> > > > > > > Becket
>> > >> > > > > > > > > has
>> > >> > > > > > > > > > > very
>> > >> > > > > > > > > > > > > > well
>> > >> > > > > > > > > > > > > > > summed my rationales above, and one
>> thing to
>> > >> add
>> > >> > > here
>> > >> > > > > is
>> > >> > > > > > > that
>> > >> > > > > > > > > the
>> > >> > > > > > > > > > > > > former
>> > >> > > > > > > > > > > > > > > has a good support for both "protecting
>> > >> against
>> > >> > > rogue
>> > >> > > > > > > > clients"
>> > >> > > > > > > > > as
>> > >> > > > > > > > > > > > well
>> > >> > > > > > > > > > > > > as
>> > >> > > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy
>> > usage":
>> > >> > when
>> > >> > > > > > > thinking
>> > >> > > > > > > > > > about
>> > >> > > > > > > > > > > > how
>> > >> > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > explain this to the end users, I find it
>> > >> actually
>> > >> > > > more
>> > >> > > > > > > > natural
>> > >> > > > > > > > > > than
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > request rate since as mentioned above,
>> > >> different
>> > >> > > > > requests
>> > >> > > > > > > > will
>> > >> > > > > > > > > > have
>> > >> > > > > > > > > > > > > quite
>> > >> > > > > > > > > > > > > > > different "cost", and Kafka today already
>> > have
>> > >> > > > various
>> > >> > > > > > > > request
>> > >> > > > > > > > > > > types
>> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc),
>> > >> because
>> > >> > of
>> > >> > > > that
>> > >> > > > > > the
>> > >> > > > > > > > > > request
>> > >> > > > > > > > > > > > > rate
>> > >> > > > > > > > > > > > > > > throttling may not be as effective
>> unless it
>> > >> is
>> > >> > set
>> > >> > > > > very
>> > >> > > > > > > > > > > > > conservatively.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Regarding to user reactions when they are
>> > >> > > throttled,
>> > >> > > > I
>> > >> > > > > > > think
>> > >> > > > > > > > it
>> > >> > > > > > > > > > may
>> > >> > > > > > > > > > > > > > differ
>> > >> > > > > > > > > > > > > > > case-by-case, and need to be discovered /
>> > >> guided
>> > >> > by
>> > >> > > > > > looking
>> > >> > > > > > > > at
>> > >> > > > > > > > > > > > relative
>> > >> > > > > > > > > > > > > > > metrics. So in other words users would
>> not
>> > >> expect
>> > >> > > to
>> > >> > > > > get
>> > >> > > > > > > > > > additional
>> > >> > > > > > > > > > > > > > > information by simply being told "hey,
>> you
>> > are
>> > >> > > > > > throttled",
>> > >> > > > > > > > > which
>> > >> > > > > > > > > > is
>> > >> > > > > > > > > > > > all
>> > >> > > > > > > > > > > > > > > what throttling does; they need to take a
>> > >> > follow-up
>> > >> > > > > step
>> > >> > > > > > > and
>> > >> > > > > > > > > see
>> > >> > > > > > > > > > > > "hmm,
>> > >> > > > > > > > > > > > > > I'm
>> > >> > > > > > > > > > > > > > > throttled probably because of ..", which
>> is
>> > by
>> > >> > > > looking
>> > >> > > > > at
>> > >> > > > > > > > other
>> > >> > > > > > > > > > > > metric
>> > >> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding the
>> > >> brokers
>> > >> > > with
>> > >> > > > > > > metadata
>> > >> > > > > > > > > > > > request,
>> > >> > > > > > > > > > > > > > > which are usually cheap to handle but I'm
>> > >> sending
>> > >> > > > > > thousands
>> > >> > > > > > > > per
>> > >> > > > > > > > > > > > second;
>> > >> > > > > > > > > > > > > > or
>> > >> > > > > > > > > > > > > > > is it because I'm catching up and hence
>> > >> sending
>> > >> > > very
>> > >> > > > > > heavy
>> > >> > > > > > > > > > fetching
>> > >> > > > > > > > > > > > > requests
>> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Regarding to the implementation, as once
>> > >> > discussed
>> > >> > > > with
>> > >> > > > > > > Jun,
>> > >> > > > > > > > > this
>> > >> > > > > > > > > > > > seems
>> > >> > > > > > > > > > > > > > not
>> > >> > > > > > > > > > > > > > > very difficult since today we are already
>> > >> > > collecting
>> > >> > > > > the
>> > >> > > > > > > > > "thread
>> > >> > > > > > > > > > > pool
>> > >> > > > > > > > > > > > > > > utilization" metrics, which is a single
>> > >> > percentage
>> > >> > > > > > > > > > > > "aggregateIdleMeter"
>> > >> > > > > > > > > > > > > > > value; but we are already effectively
>> > >> aggregating
>> > >> > > it
>> > >> > > > > for
>> > >> > > > > > > each
>> > >> > > > > > > > > > > > requests
>> > >> > > > > > > > > > > > > in
>> > >> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just
>> extend
>> > >> it by
>> > >> > > > > > recording
>> > >> > > > > > > > the
>> > >> > > > > > > > > > > > source
>> > >> > > > > > > > > > > > > > > client id when handling them and
>> aggregating
>> > >> by
>> > >> > > > > clientId
>> > >> > > > > > as
>> > >> > > > > > > > > well
>> > >> > > > > > > > > > as
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > total aggregate.
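A rough sketch of that per-clientId extension, assuming illustrative names rather than the actual KafkaRequestHandler code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: aggregate handler busy time in total and per clientId, so a
// client's share of request-handling capacity can be read directly.
public class HandlerTimeAggregator {
    private long totalBusyNs = 0;
    private final Map<String, Long> perClientBusyNs = new HashMap<>();

    public void record(String clientId, long busyNs) {
        totalBusyNs += busyNs;
        perClientBusyNs.merge(clientId, busyNs, Long::sum);
    }

    // Fraction of all recorded busy time attributable to one client.
    public double clientShare(String clientId) {
        if (totalBusyNs == 0) return 0.0;
        return (double) perClientBusyNs.getOrDefault(clientId, 0L) / totalBusyNs;
    }
}
```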
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Guozhang
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
>> Kreps <
>> > >> > > > > > > j...@confluent.io
>> > >> > > > > > > > >
>> > >> > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > When I thought about it more deeply I
>> came
>> > >> > around
>> > >> > > > to
>> > >> > > > > > the
>> > >> > > > > > > > > > "percent
>> > >> > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > > processing time" metric too. It seems a
>> > lot
>> > >> > > closer
>> > >> > > > to
>> > >> > > > > > the
>> > >> > > > > > > > > thing
>> > >> > > > > > > > > > > we
>> > >> > > > > > > > > > > > > > > actually
>> > >> > > > > > > > > > > > > > > > care about and need to protect. I also
>> > think
>> > >> > this
>> > >> > > > > would
>> > >> > > > > > > be
>> > >> > > > > > > > a
>> > >> > > > > > > > > > very
>> > >> > > > > > > > > > > > > > useful
>> > >> > > > > > > > > > > > > > > > metric even in the absence of
>> throttling
>> > >> just
>> > >> > to
>> > >> > > > > debug
>> > >> > > > > > > > whose
>> > >> > > > > > > > > > > using
>> > >> > > > > > > > > > > > > > > > capacity.
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > Two problems to consider:
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it is
>> > >> > > > understandable
>> > >> > > > > > what
>> > >> > > > > > > > > led
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > > their
>> > >> > > > > > > > > > > > > > > >    being throttled, but it is a bit
>> hard
>> > to
>> > >> > > figure
>> > >> > > > > out
>> > >> > > > > > > the
>> > >> > > > > > > > > safe
>> > >> > > > > > > > > > > > range
>> > >> > > > > > > > > > > > > > for
>> > >> > > > > > > > > > > > > > > >    them. i.e. if I have a new app that
>> > will
>> > >> > send
>> > >> > > > 200
>> > >> > > > > > > > > > > messages/sec I
>> > >> > > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > > > >    probably reason that I'll be under
>> the
>> > >> > > > throttling
>> > >> > > > > > > limit
>> > >> > > > > > > > of
>> > >> > > > > > > > > > 300
>> > >> > > > > > > > > > > > > > > req/sec.
>> > >> > > > > > > > > > > > > > > >    However if I need to be under a 10%
>> CPU
>> > >> > > > resources
>> > >> > > > > > > limit
>> > >> > > > > > > > it
>> > >> > > > > > > > > > may
>> > >> > > > > > > > > > > > be
>> > >> > > > > > > > > > > > > a
>> > >> > > > > > > > > > > > > > > bit
>> > >> > > > > > > > > > > > > > > >    harder for me to know a priori if i
>> > will
>> > >> or
>> > >> > > > won't.
>> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU
>> time
>> > is
>> > >> a
>> > >> > bit
>> > >> > > > > > > difficult
>> > >> > > > > > > > > > since
>> > >> > > > > > > > > > > > > there
>> > >> > > > > > > > > > > > > > > are
>> > >> > > > > > > > > > > > > > > >    actually two thread pools--the I/O
>> > >> threads
>> > >> > and
>> > >> > > > the
>> > >> > > > > > > > network
>> > >> > > > > > > > > > > > > threads.
>> > >> > > > > > > > > > > > > > I
>> > >> > > > > > > > > > > > > > > > think
>> > >> > > > > > > > > > > > > > > >    it might be workable to count just
>> the
>> > >> I/O
>> > >> > > > thread
>> > >> > > > > > time
>> > >> > > > > > > > as
>> > >> > > > > > > > > in
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > > proposal,
>> > >> > > > > > > > > > > > > > > >    but the network thread work is
>> actually
>> > >> > > > > non-trivial
>> > >> > > > > > > > (e.g.
>> > >> > > > > > > > > > all
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > disk
>> > >> > > > > > > > > > > > > > > >    reads for fetches happen in that
>> > >> thread). If
>> > >> > > you
>> > >> > > > > > count
>> > >> > > > > > > > > both
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > network
>> > >> > > > > > > > > > > > > > > > and
>> > >> > > > > > > > > > > > > > > >    I/O threads it can skew things a
>> bit.
>> > >> E.g.
>> > >> > say
>> > >> > > > you
>> > >> > > > > > > have
>> > >> > > > > > > > 50
>> > >> > > > > > > > > > > > network
>> > >> > > > > > > > > > > > > > > > threads,
>> > >> > > > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what is
>> > the
>> > >> > > > available
>> > >> > > > > > cpu
>> > >> > > > > > > > > time
>> > >> > > > > > > > > > > > > > available
>> > >> > > > > > > > > > > > > > > > in a
>> > >> > > > > > > > > > > > > > > >    second? I suppose this is a problem
>> > >> whenever
>> > >> > > you
>> > >> > > > > > have
>> > >> > > > > > > a
>> > >> > > > > > > > > > > > bottleneck
>> > >> > > > > > > > > > > > > > > > between
>> > >> > > > > > > > > > > > > > > >    I/O and network threads or if you
>> end
>> > up
>> > >> > > > > > significantly
>> > >> > > > > > > > > > > > > > > over-provisioning
>> > >> > > > > > > > > > > > > > > >    one pool (both of which are hard to
>> > >> avoid).
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling
>> would be
>> > >> to
>> > >> > use
>> > >> > > > > this
>> > >> > > > > > > api:
>> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/
>> > >> > > > > > 1.5.0/docs/api/java/lang/
>> > >> > > > > > > > > > > > > > > > management/ThreadMXBean.html#
>> > >> > > > getThreadCpuTime(long)
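For reference, reading per-thread CPU time via that JMX API looks roughly like this; availability and cost vary by platform, so real code has to check support first (the wrapper class name is just illustrative):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Sketch: query CPU time consumed by the current thread via ThreadMXBean.
public class CpuTimeProbe {
    public static long currentThreadCpuNs() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (!bean.isCurrentThreadCpuTimeSupported()) return -1L;
        // Nanoseconds of CPU time, or -1 if measurement is disabled.
        return bean.getCurrentThreadCpuTime();
    }
}
```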
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > That would let you track actual CPU
>> usage
>> > >> > across
>> > >> > > > the
>> > >> > > > > > > > network,
>> > >> > > > > > > > > > I/O
>> > >> > > > > > > > > > > > > > > threads,
>> > >> > > > > > > > > > > > > > > > and purgatory threads and look at it
>> as a
>> > >> > > > percentage
>> > >> > > > > of
>> > >> > > > > > > > total
>> > >> > > > > > > > > > > > cores.
>> > >> > > > > > > > > > > > > I
>> > >> > > > > > > > > > > > > > > > think this fixes many problems in the
>> > >> > reliability
>> > >> > > > of
>> > >> > > > > > the
>> > >> > > > > > > > > > metric.
>> > >> > > > > > > > > > > > It's
>> > >> > > > > > > > > > > > > > > > meaning is slightly different as it is
>> > just
>> > >> CPU
>> > >> > > > (you
>> > >> > > > > > > don't
>> > >> > > > > > > > > get
>> > >> > > > > > > > > > > > > charged
>> > >> > > > > > > > > > > > > > > for
>> > >> > > > > > > > > > > > > > > > time blocking on I/O) but that may be
>> okay
>> > >> > > because
>> > >> > > > we
>> > >> > > > > > > > already
>> > >> > > > > > > > > > > have
>> > >> > > > > > > > > > > > a
>> > >> > > > > > > > > > > > > > > > throttle on I/O. The downside is I
>> think
>> > it
>> > >> is
>> > >> > > > > possible
>> > >> > > > > > > > this
>> > >> > > > > > > > > > api
>> > >> > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > be
>> > >> > > > > > > > > > > > > > > > disabled or isn't always available and
>> it
>> > >> may
>> > >> > > also
>> > >> > > > be
>> > >> > > > > > > > > expensive
>> > >> > > > > > > > > > > > (also
>> > >> > > > > > > > > > > > > > > I've
>> > >> > > > > > > > > > > > > > > > never used it so not sure if it really
>> > works
>> > >> > the
>> > >> > > > way
>> > >> > > > > i
>> > >> > > > > > > > > think).
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > -Jay
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket
>> > Qin
>> > >> <
>> > >> > > > > > > > > > > becket....@gmail.com>
>> > >> > > > > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to
>> > >> protect
>> > >> > > the
>> > >> > > > > > > cluster
>> > >> > > > > > > > > from
>> > >> > > > > > > > > > > > being
>> > >> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is
>> not
>> > >> > > intended
>> > >> > > > to
>> > >> > > > > > > > address
>> > >> > > > > > > > > > > > > resource
>> > >> > > > > > > > > > > > > > > > > allocation problem among the
>> clients, I
>> > am
>> > >> > > > > wondering
>> > >> > > > > > if
>> > >> > > > > > > > > using
>> > >> > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > handling time quota (CPU time quota)
>> is
>> > a
>> > >> > > better
>> > >> > > > > > > option.
>> > >> > > > > > > > > Here
>> > >> > > > > > > > > > > are
>> > >> > > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > > > reasons:
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > 1. request handling time quota has
>> > better
>> > >> > > > > protection.
>> > >> > > > > > > Say
>> > >> > > > > > > > > we
>> > >> > > > > > > > > > > have
>> > >> > > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota and set that to some value
>> > like
>> > >> > 100
>> > >> > > > > > > > > requests/sec,
>> > >> > > > > > > > > > it
>> > >> > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > possible
>> > >> > > > > > > > > > > > > > > > > that some of the requests are very
>> > >> expensive
>> > >> > > > > and actually
>> > >> > > > > > > > take
>> > >> > > > > > > > > a
>> > >> > > > > > > > > > > lot
>> > >> > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > handle. In that case a few clients
>> may
>> > >> still
>> > >> > > > > occupy a
>> > >> > > > > > > lot
>> > >> > > > > > > > > of
>> > >> > > > > > > > > > > CPU
>> > >> > > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > > even
>> > >> > > > > > > > > > > > > > > > > the request rate is low. Arguably we
>> can
>> > >> > > > carefully
>> > >> > > > > > set
>> > >> > > > > > > > > > request
>> > >> > > > > > > > > > > > rate
>> > >> > > > > > > > > > > > > > > quota
>> > >> > > > > > > > > > > > > > > > > for each request and client id
>> > >> combination,
>> > >> > but
>> > >> > > > it
>> > >> > > > > > > could
>> > >> > > > > > > > > > still
>> > >> > > > > > > > > > > be
>> > >> > > > > > > > > > > > > > > tricky
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > get it right for everyone.
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > If we use the request time handling
>> > >> quota, we
>> > >> > > can
>> > >> > > > > > > simply
>> > >> > > > > > > > > say
>> > >> > > > > > > > > > no
>> > >> > > > > > > > > > > > > > clients
>> > >> > > > > > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > > > > > take up more than 30% of the total
>> > >> request
>> > >> > > > > > handling
>> > >> > > > > > > > > > capacity
>> > >> > > > > > > > > > > > > > > (measured
>> > >> > > > > > > > > > > > > > > > > by time), regardless of the
>> difference
>> > >> among
>> > >> > > > > > different
>> > >> > > > > > > > > > requests
>> > >> > > > > > > > > > > > or
>> > >> > > > > > > > > > > > > > what
>> > >> > > > > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > the client doing. In this case maybe
>> we
>> > >> can
>> > >> > > quota
>> > >> > > > > all
>> > >> > > > > > > the
>> > >> > > > > > > > > > > > requests
>> > >> > > > > > > > > > > > > if
>> > >> > > > > > > > > > > > > > > we
>> > >> > > > > > > > > > > > > > > > > want to.
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > 2. The main benefit of using request
>> > rate
>> > >> > limit
>> > >> > > > is
>> > >> > > > > > that
>> > >> > > > > > > > it
>> > >> > > > > > > > > > > seems
>> > >> > > > > > > > > > > > > more
>> > >> > > > > > > > > > > > > > > > > intuitive. It is true that it is
>> > probably
>> > >> > > easier
>> > >> > > > to
>> > >> > > > > > > > explain
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > user
>> > >> > > > > > > > > > > > > > > > > what does that mean. However, in
>> > practice
>> > >> it
>> > >> > > > looks like
>> > >> > > > > > the
>> > >> > > > > > > > > impact
>> > >> > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota is not more quantifiable
>> than
>> > >> the
>> > >> > > > > request
>> > >> > > > > > > > > handling
>> > >> > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > quota.
>> > >> > > > > > > > > > > > > > > > > Unlike the byte rate quota, it is
>> still
>> > >> > > difficult
>> > >> > > > > to
>> > >> > > > > > > > give a
>> > >> > > > > > > > > > > > number
>> > >> > > > > > > > > > > > > > > about
>> > >> > > > > > > > > > > > > > > > > impact of throughput or latency when
>> a
>> > >> > request
>> > >> > > > rate
>> > >> > > > > > > quota
>> > >> > > > > > > > > is
>> > >> > > > > > > > > > > hit.
>> > >> > > > > > > > > > > > > So
>> > >> > > > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > not better than the request handling
>> > time
>> > >> > > quota.
>> > >> > > > In
>> > >> > > > > > > fact
>> > >> > > > > > > > I
>> > >> > > > > > > > > > feel
>> > >> > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > clearer to tell user that "you are
>> > limited
>> > >> > > > because
>> > >> > > > > > you
>> > >> > > > > > > > have
>> > >> > > > > > > > > > > taken
>> > >> > > > > > > > > > > > > 30%
>> > >> > > > > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > > > the CPU time on the broker" than
>> > otherwise
>> > >> > > > > something
>> > >> > > > > > > like
>> > >> > > > > > > > > > "your
>> > >> > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota on metadata request has been
>> > >> reached".
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > >
On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <j...@confluent.io> wrote:

> I think this proposal makes a lot of sense (especially now that it is
> oriented around request rate) and fills the biggest remaining gap in the
> multi-tenancy story.
>
> I think for intra-cluster communication (StopReplica, etc.) we could avoid
> throttling entirely. You can secure or otherwise lock down the cluster
> communication to prevent any unauthorized external party from trying to
> initiate these requests. As a result, we are as likely to cause problems as
> solve them by throttling these, right?
>
> I'm not so sure that we should exempt the consumer requests such as
> heartbeat. It's true that if we throttle an app's heartbeat requests, it
> may fall out of its consumer group. However, if we don't throttle it, it
> may DDoS the cluster if the heartbeat interval is set incorrectly or if
> some client in some language has a bug. I think the policy with this kind
> of throttling is to protect the cluster above any individual app, right? I
> think in general this should be okay, since for most deployments this
> setting is meant as more of a safety valve: that is, rather than setting
> something very close to what you expect to need (say 2 req/sec or
> whatever), you would set something quite high (like 100 req/sec) meant to
> prevent a client gone crazy. I think when used this way, allowing those to
> be throttled would actually provide meaningful protection.
>
> -Jay
>
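The "safety valve" behavior discussed above can be sketched as follows. This is a minimal illustration, not Kafka's actual implementation: if a client's measured request handling time exceeds its quota over a sampling window, the broker delays the client's next response just long enough to bring the average back down to the quota. The function name and units are assumptions made for the example.

```python
def throttle_time_ms(measured_pct, quota_pct, window_ms):
    """Return how long (in ms) to delay the client's next response.

    measured_pct: observed request-handling time, as a percentage of one
                  request handler thread, over the sampling window.
    quota_pct:    the configured quota, in the same units.
    window_ms:    length of the metrics sampling window in milliseconds.
    """
    if measured_pct <= quota_pct:
        return 0  # within quota: no throttling
    # Delay proportional to the overshoot, so that the average utilization
    # over the extended interval falls back to the quota.
    return int((measured_pct - quota_pct) / quota_pct * window_ms)
```

With a generous quota (the 100 req/sec style "safety valve" Jay describes), a well-behaved client never sees a delay, while a runaway client is slowed in proportion to how far it overshoots.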
> On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <rajinisiva...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I have just created KIP-124 to introduce request rate quotas to Kafka:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> >
> > The proposal is for a simple percentage request handling time quota that
> > can be allocated to *<client-id>*, *<user>* or *<user, client-id>*.
> > There are a few other suggestions also under "Rejected alternatives".
> > Feedback and suggestions are welcome.
> >
> > Thank you...
> >
> > Regards,
> >
> > Rajini
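The three quota granularities mentioned in the KIP announcement (*<client-id>*, *<user>*, *<user, client-id>*) imply a resolution order when several could apply. The sketch below is a simplified illustration of the "most specific entry wins" idea, not Kafka's implementation; the entity names and quota values are hypothetical.

```python
def resolve_quota(quotas, user, client_id, default=None):
    """Pick the most specific quota for a request.

    quotas: dict keyed by (user, client_id), where None in a position
            means that component is unspecified for the entry.
    Lookup order: <user, client-id>, then <user>, then <client-id>.
    """
    for key in ((user, client_id), (user, None), (None, client_id)):
        if key in quotas:
            return quotas[key]
    return default  # fall back to a broker-wide default, if any

# Hypothetical configuration for illustration:
quotas = {
    ("alice", "reporting"): 10.0,  # <user, client-id> quota
    ("alice", None): 25.0,         # <user> quota
    (None, "reporting"): 50.0,     # <client-id> quota
}
```

For example, requests from user "alice" with client-id "reporting" would be limited to 10.0, while any other client of "alice" falls back to her user-level quota of 25.0.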
> --
> -- Guozhang
