Hi, David,

Sounds good then.
Thanks,

Jun

On Tue, Jun 9, 2020 at 10:59 AM David Jacot <dja...@confluent.io> wrote:

> Hi Jun,
>
> Both are already in the KIP; see the "New Broker Configurations" chapter.
> I think that we need them in order to be able to define a different burst
> for the new quota.
>
> Best,
> David
>
> On Tue, Jun 9, 2020 at 7:48 PM Jun Rao <j...@confluent.io> wrote:
>
> > Hi, David,
> >
> > Another thing. Should we add controller.quota.window.size.seconds and
> > controller.quota.window.num, or just reuse the existing
> > quota.window.size.seconds and quota.window.num that are used for other
> > types of quotas?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jun 9, 2020 at 10:30 AM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, David,
> > >
> > > Thanks for the KIP. The name QUOTA_VIOLATED sounds reasonable to me.
> > > +1 on the KIP.
> > >
> > > Jun
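A side note on the window settings discussed above: the burst trade-off is
easier to see in code. Below is a minimal, illustrative sketch of a
sampled-window rate in the general style of Kafka's quota machinery; it is
not the actual implementation, and the class name is made up. The averaging
span (window count times window size) determines how long a burst can run
before the measured rate crosses the quota, which is why a dedicated
controller.quota.window.* pair allows a different burst than the shared
settings.

    import java.util.Arrays;

    // Illustrative only: a rate sampled over N windows of S seconds each,
    // loosely mimicking quota.window.num / quota.window.size.seconds.
    public class SampledRate {
        private final int numWindows;    // e.g. controller.quota.window.num
        private final long windowSizeMs; // e.g. window.size.seconds * 1000
        private final double[] samples;
        private long currentWindowStartMs;
        private int currentIndex;

        public SampledRate(int numWindows, long windowSizeMs, long nowMs) {
            this.numWindows = numWindows;
            this.windowSizeMs = windowSizeMs;
            this.samples = new double[numWindows];
            this.currentWindowStartMs = nowMs;
        }

        public void record(double value, long nowMs) {
            advance(nowMs);
            samples[currentIndex] += value;
        }

        // Rate in events/sec, averaged over the full sampled span (for
        // brevity this assumes the span has elapsed). A longer span
        // absorbs a larger burst before the average crosses the quota.
        public double rate(long nowMs) {
            advance(nowMs);
            double total = Arrays.stream(samples).sum();
            return total / (numWindows * windowSizeMs / 1000.0);
        }

        private void advance(long nowMs) {
            while (nowMs - currentWindowStartMs >= windowSizeMs) {
                currentIndex = (currentIndex + 1) % numWindows;
                samples[currentIndex] = 0.0;  // retire the oldest window
                currentWindowStartMs += windowSizeMs;
            }
        }
    }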
> > > On Tue, Jun 9, 2020 at 5:07 AM David Jacot <dja...@confluent.io> wrote:
> > >
> > >> Hi Colin,
> > >>
> > >> Thank you for your feedback.
> > >>
> > >> Jun has summarized the situation pretty well. Thanks, Jun! I would
> > >> like to complement it with the following points:
> > >>
> > >> 1. Indeed, when the quota is exceeded, the broker will reject the
> > >> topic creations, partition creations, and topic deletions that
> > >> exceed it, with the new QUOTA_VIOLATED error. The ThrottleTimeMs
> > >> field will be populated accordingly to let the client know how long
> > >> it must wait.
> > >>
> > >> 2. I do agree that we actually want a mechanism to apply back
> > >> pressure to the clients. The KIP basically proposes a mechanism to
> > >> control and to limit the rate of operations before they enter the
> > >> controller. I think that it is similar to your thinking but is
> > >> enforced based on a defined quota instead of relying on the number
> > >> of pending items in the controller.
> > >>
> > >> 3. You mentioned an alternative idea in your comments that, if I
> > >> understood correctly, would bound the queue to limit the overload
> > >> and reject work if the queue is full. I have been thinking about
> > >> this as well, but I don't think that it works well in our case.
> > >> - The first reason is the one mentioned by Jun. We actually want to
> > >> be able to limit specific clients (the misbehaving ones) in a
> > >> multi-tenant environment.
> > >> - The second reason is that, at least in our current implementation,
> > >> the length of the queue is not really a good characteristic for
> > >> estimating the load. Coming back to your example of the
> > >> CreateTopicsRequest: it creates a path in ZK for each newly created
> > >> topic, which triggers a ChangeTopic event in the controller. That
> > >> single event could cover a single topic in some cases or a thousand
> > >> topics in others.
> > >> These two reasons aside, bounding the queue also introduces a knob
> > >> which requires some tuning and thus suffers from all the points you
> > >> mentioned as well, doesn't it? The right value for one version may
> > >> be wrong for another.
> > >>
> > >> 4. Regarding the integration with KIP-500: the implementation of
> > >> this KIP will span the ApiLayer and the AdminManager. To be honest,
> > >> we can influence the implementation to work best with what you have
> > >> in mind for the future controller. If you could shed some more light
> > >> on the future internal architecture, I can take this into account
> > >> during the implementation.
> > >>
> > >> 5. Regarding KIP-590: for the create topics request, create
> > >> partitions request, and delete topics request, we are lucky enough
> > >> to have directed all of them to the controller already, so we are
> > >> fine with doing the admission in the broker which hosts the
> > >> controller. If we were to throttle more operations in the future, I
> > >> believe that we can continue to do it centrally while leveraging the
> > >> forwarded principal to account for the right tenant. The reason why
> > >> I would like to keep it central is that we are talking about sparse
> > >> (or bursty) workloads here, so distributing the quota among all the
> > >> brokers makes little sense.
> > >>
> > >> 6. Regarding the naming of the new error code: BUSY sounds too
> > >> generic to me, so I would rather prefer a specific error code, the
> > >> main reason being that we may be able to reuse it in the future for
> > >> other quotas. That being said, we can find another name.
> > >> QUOTA_EXHAUSTED? I don't feel too strongly about the naming. I
> > >> wonder what the others think about this.
> > >>
> > >> Voilà. I hope that I have addressed all your questions/points, and I
> > >> am sorry for the lengthy email.
> > >>
> > >> Regards,
> > >> David
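To make points 1 and 2 above concrete: a sketch of a rate-based admission
check that also produces the ThrottleTimeMs hint. For brevity this uses a
plain token bucket rather than the KIP's sampled-window rate, and the class
and method names are invented for illustration.

    // Illustrative admission check for controller mutations (topics
    // created or deleted, partitions added). Not the KIP's implementation:
    // a plain token bucket standing in for the windowed rate, with the
    // burst capacity playing the role of the window settings above.
    public class ControllerMutationQuota {
        private final double ratePerSec; // permitted mutations per second
        private final double burst;      // extra mutations tolerated as a burst
        private double tokens;
        private long lastUpdateMs;

        public ControllerMutationQuota(double ratePerSec, double burst, long nowMs) {
            this.ratePerSec = ratePerSec;
            this.burst = burst;
            this.tokens = burst;
            this.lastUpdateMs = nowMs;
        }

        /**
         * Tries to admit an operation costing {@code mutations} (e.g. the
         * number of partitions a CreateTopicsRequest would create).
         * Returns 0 if admitted, otherwise the number of milliseconds the
         * client should wait: the broker would answer with QUOTA_VIOLATED
         * and put this value in ThrottleTimeMs.
         */
        public synchronized long tryAdmit(int mutations, long nowMs) {
            // Accrue tokens for the time elapsed since the last check.
            tokens = Math.min(burst,
                tokens + (nowMs - lastUpdateMs) / 1000.0 * ratePerSec);
            lastUpdateMs = nowMs;
            if (tokens >= mutations) {
                tokens -= mutations;
                return 0L;
            }
            // Not enough budget: report how long until the deficit is repaid.
            return (long) Math.ceil((mutations - tokens) / ratePerSec * 1000.0);
        }
    }

A well-behaved client that receives QUOTA_VIOLATED then simply waits for
ThrottleTimeMs before retrying, which closes the back-pressure loop that
point 2 describes.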
> > >> On Tue, Jun 9, 2020 at 2:13 AM Colin McCabe <cmcc...@apache.org> wrote:
> > >>
> > >> > On Mon, Jun 8, 2020, at 14:41, Jun Rao wrote:
> > >> > > Hi, Colin,
> > >> > >
> > >> > > Thanks for the comment. You brought up several points.
> > >> > >
> > >> > > 1. Should we set up a per-user quota? To me, it does seem we need
> > >> > > some sort of a quota. When the controller runs out of resources,
> > >> > > ideally, we only want to penalize the badly behaving
> > >> > > applications, instead of every application. To do that, we will
> > >> > > need to know what each application is entitled to, and the
> > >> > > per-user quota is intended to capture that.
> > >> > >
> > >> > > 2. How easy is it to configure a quota? The following is how an
> > >> > > admin typically sets up a quota in our existing systems. Pick a
> > >> > > generous default per-user quota that works for most applications.
> > >> > > For the few resource-intensive applications, customize a higher
> > >> > > quota for them. Reserve enough resources in anticipation that a
> > >> > > single (or a few) application will exceed the quota at a given
> > >> > > time.
> > >> >
> > >> > Hi Jun,
> > >> >
> > >> > Thanks for the response.
> > >> >
> > >> > Maybe I was too pessimistic about the ability of admins to
> > >> > configure a useful quota here. I do agree that it would be nice to
> > >> > have the ability to set different quotas for different users, as
> > >> > you mentioned.
> > >> >
> > >> > > 3. How should the quota be defined? In the discussion thread, we
> > >> > > debated between a usage-based model vs. a rate-based model. Dave
> > >> > > and Anna argued for the rate-based model mostly because it's
> > >> > > simpler to implement.
> > >> >
> > >> > I'm trying to think more about how this integrates with our plans
> > >> > for KIP-500. When we get rid of ZK, we will have to handle this in
> > >> > the controller itself, rather than in the AdminManager. That
> > >> > implies we'll have to rewrite the code. Maybe this is worth it if
> > >> > we want this feature now, though.
> > >> >
> > >> > Another wrinkle here is that, as we discussed in KIP-590,
> > >> > controller operations will land on a random broker first, and only
> > >> > then be forwarded to the active controller. This implies that
> > >> > either admission control should happen on all brokers (needing
> > >> > some kind of distributed quota scheme), or be done on the
> > >> > controller after we've already done the work of forwarding the
> > >> > message. The second approach might not be that bad, but it would be
> > >> > nice to figure this out.
> > >> >
> > >> > > 4. If a quota is exceeded, how is that enforced? My understanding
> > >> > > of the KIP is that, if a quota is exceeded, the broker
> > >> > > immediately sends a QUOTA_VIOLATED error and a throttle time back
> > >> > > to the client, and the client will wait for the throttle time
> > >> > > before issuing the next request. This seems to be the same as the
> > >> > > BUSY error code you mentioned.
> > >> >
> > >> > Yes, I agree, it sounds like we're thinking along the same lines.
> > >> > However, rather than QUOTA_VIOLATED, how about naming the error
> > >> > code BUSY? Then the error text could indicate the quota that we
> > >> > violated. This would be more generally useful as an error code and
> > >> > also avoid being confusingly similar to POLICY_VIOLATION.
> > >> >
> > >> > best,
> > >> > Colin
> > >> >
> > >> > > I will let David chime in more on that.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Jun
> > >> > >
> > >> > > On Sun, Jun 7, 2020 at 2:30 PM Colin McCabe <cmcc...@apache.org> wrote:
> > >> > >
> > >> > > > Hi David,
> > >> > > >
> > >> > > > Thanks for the KIP.
> > >> > > >
> > >> > > > I thought about this for a while and I actually think this
> > >> > > > approach is not quite right. The problem that I see here is
> > >> > > > that using an explicitly set quota requires careful tuning by
> > >> > > > the cluster operator. Even worse, this tuning might be
> > >> > > > invalidated by changes in overall conditions or even by more
> > >> > > > efficient controller software.
> > >> > > >
> > >> > > > For example, if we empirically find that the controller can do
> > >> > > > 1000 topics in a minute (or whatever), this tuning might
> > >> > > > actually be wrong if the next version of the software can do
> > >> > > > 2000 topics in a minute because of efficiency upgrades. Or, the
> > >> > > > broker that the controller is located on might be experiencing
> > >> > > > heavy load from its non-controller operations, and so it can
> > >> > > > only do 500 topics in a minute during this period.
> > >> > > >
> > >> > > > So the system administrator gets a very obscure tunable (it's
> > >> > > > not clear to a non-Kafka developer what "controller mutations"
> > >> > > > are or why they should care). And even worse, they will have to
> > >> > > > significantly "sandbag" the value that they set it to, so that
> > >> > > > even under the heaviest load and the oldest deployed version of
> > >> > > > the software, the controller can still function. Even worse,
> > >> > > > this new quota adds a lot of complexity to the controller.
> > >> > > >
> > >> > > > What we really want is backpressure when the controller is
> > >> > > > overloaded. I believe this is the alternative you discuss in
> > >> > > > "Rejected Alternatives" under "Throttle the Execution instead
> > >> > > > of the Admission". Your reason for rejecting it is that the
> > >> > > > client error handling does not work well in this case. But
> > >> > > > actually, this is an artifact of our current implementation,
> > >> > > > rather than a fundamental issue with backpressure.
> > >> > > >
> > >> > > > Consider the example of a CreateTopicsRequest. The controller
> > >> > > > could return a special error code if the load was too high, and
> > >> > > > take the create topics event off the controller queue. Let's
> > >> > > > call that error code BUSY. Additionally, the controller could
> > >> > > > immediately refuse new events if the queue had reached its
> > >> > > > maximum length, and simply return BUSY for that case as well.
> > >> > > >
> > >> > > > Basically, the way we handle RPC timeouts in the controller
> > >> > > > right now is not very good. As you know, we time out the RPC,
> > >> > > > so the client gets TimeoutException, but then we keep the event
> > >> > > > on the queue, so that it eventually gets executed! There's no
> > >> > > > reason why we have to do that. We could take the event off the
> > >> > > > queue if there is a timeout. This would reduce load and mostly
> > >> > > > avoid the paradoxical situations you describe (getting
> > >> > > > TopicExistsException for a CreateTopicsRequest retry, etc.).
> > >> > > >
> > >> > > > I say "mostly" because there are still cases where retries
> > >> > > > could go astray (for example, if we execute the topic creation
> > >> > > > but a network problem prevents the response from being sent to
> > >> > > > the client). But this would still be a very big improvement
> > >> > > > over what we have now.
> > >> > > >
> > >> > > > Sorry for commenting so late on this, but I got distracted by
> > >> > > > some other things. I hope we can figure this one out -- I feel
> > >> > > > like there is a chance to significantly simplify this.
> > >> > > >
> > >> > > > best,
> > >> > > > Colin
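For concreteness, a minimal sketch of the backpressure mechanism Colin
outlines above: reject new events with BUSY when the queue is full, and
drop queued events whose RPC deadline has passed instead of executing them.
The types below are hypothetical, not the actual controller code.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical bounded controller event queue: offer() failing maps
    // to answering the RPC with BUSY, and events whose deadline has
    // passed are dropped instead of executed.
    public class BoundedControllerQueue {

        public static final class Event {
            final Runnable action;
            final long deadlineMs;  // when the originating RPC times out

            Event(Runnable action, long deadlineMs) {
                this.action = action;
                this.deadlineMs = deadlineMs;
            }
        }

        private final BlockingQueue<Event> queue;

        public BoundedControllerQueue(int maxQueuedEvents) {
            this.queue = new ArrayBlockingQueue<>(maxQueuedEvents);
        }

        /** Returns false if the queue is full; the caller answers BUSY. */
        public boolean tryEnqueue(Event event) {
            return queue.offer(event);
        }

        /** Event loop: expired events are skipped, not executed. */
        public void runLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                Event event = queue.take();
                if (System.currentTimeMillis() > event.deadlineMs) {
                    continue;  // the RPC already timed out; shed the work
                }
                event.action.run();
            }
        }
    }

Note that the queue length here is itself a tunable, which is the push-back
in David's point 3 further up the thread: one queued event may represent
one topic or a thousand, so a full queue is a coarse proxy for load.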
> > >> > > > On Fri, May 29, 2020, at 07:57, David Jacot wrote:
> > >> > > > > Hi folks,
> > >> > > > >
> > >> > > > > I'd like to start the vote for KIP-599, which proposes a new
> > >> > > > > quota to throttle create topic, create partition, and delete
> > >> > > > > topic operations to protect the Kafka controller:
> > >> > > > >
> > >> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
> > >> > > > >
> > >> > > > > Please let me know what you think.
> > >> > > > >
> > >> > > > > Cheers,
> > >> > > > > David