Hi, David,

Sounds good then.
Thanks,

Jun

On Tue, Jun 9, 2020 at 10:59 AM David Jacot <dja...@confluent.io> wrote:

> Hi Jun,
>
> Both are already in the KIP; see the "New Broker Configurations" chapter.
> I think that we need them in order to be able to define a different burst
> for the new quota.
>
> Best,
> David
>
> On Tue, Jun 9, 2020 at 7:48 PM Jun Rao <j...@confluent.io> wrote:
>
> > Hi, David,
> >
> > Another thing. Should we add controller.quota.window.size.seconds and
> > controller.quota.window.num, or just reuse the existing
> > quota.window.size.seconds and quota.window.num that are used for other
> > types of quotas?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Jun 9, 2020 at 10:30 AM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, David,
> > >
> > > Thanks for the KIP. The name QUOTA_VIOLATED sounds reasonable to me.
> > > +1 on the KIP.
> > >
> > > Jun
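A side note on the window settings discussed above: the burst trade-off is
easier to see in code. Below is a minimal, illustrative sketch of a
sampled-window rate in the general style of Kafka's quota machinery; it is
not the actual implementation, and the class name is made up. The averaging
span (window count times window size) determines how long a burst can run
before the measured rate crosses the quota, which is why a dedicated
controller.quota.window.* pair allows a different burst than the shared
settings.

    import java.util.Arrays;

    // Illustrative only: a rate sampled over N windows of S seconds each,
    // loosely mimicking quota.window.num / quota.window.size.seconds.
    public class SampledRate {
        private final int numWindows;    // e.g. controller.quota.window.num
        private final long windowSizeMs; // e.g. window.size.seconds * 1000
        private final double[] samples;
        private long currentWindowStartMs;
        private int currentIndex;

        public SampledRate(int numWindows, long windowSizeMs, long nowMs) {
            this.numWindows = numWindows;
            this.windowSizeMs = windowSizeMs;
            this.samples = new double[numWindows];
            this.currentWindowStartMs = nowMs;
        }

        public void record(double value, long nowMs) {
            advance(nowMs);
            samples[currentIndex] += value;
        }

        // Rate in events/sec, averaged over the full sampled span (for
        // brevity this assumes the span has elapsed). A longer span
        // absorbs a larger burst before the average crosses the quota.
        public double rate(long nowMs) {
            advance(nowMs);
            double total = Arrays.stream(samples).sum();
            return total / (numWindows * windowSizeMs / 1000.0);
        }

        private void advance(long nowMs) {
            while (nowMs - currentWindowStartMs >= windowSizeMs) {
                currentIndex = (currentIndex + 1) % numWindows;
                samples[currentIndex] = 0.0;  // retire the oldest window
                currentWindowStartMs += windowSizeMs;
            }
        }
    }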
> > > On Tue, Jun 9, 2020 at 5:07 AM David Jacot <dja...@confluent.io> wrote:
> > >
> > >> Hi Colin,
> > >>
> > >> Thank you for your feedback.
> > >>
> > >> Jun has summarized the situation pretty well. Thanks, Jun! I would
> > >> like to complement it with the following points:
> > >>
> > >> 1. Indeed, when the quota is exceeded, the broker will reject the
> > >> topic creations, partition creations, and topic deletions that
> > >> exceed it, with the new QUOTA_VIOLATED error. The ThrottleTimeMs
> > >> field will be populated accordingly to let the client know how long
> > >> it must wait.
> > >>
> > >> 2. I do agree that we actually want a mechanism to apply back
> > >> pressure to the clients. The KIP basically proposes a mechanism to
> > >> control and to limit the rate of operations before they enter the
> > >> controller. I think that it is similar to your thinking but is
> > >> enforced based on a defined quota instead of relying on the number
> > >> of pending items in the controller.
> > >>
> > >> 3. You mentioned an alternative idea in your comments that, if I
> > >> understood correctly, would bound the queue to limit the overload
> > >> and reject work if the queue is full. I have been thinking about
> > >> this as well, but I don't think that it works well in our case.
> > >> - The first reason is the one mentioned by Jun. We actually want to
> > >> be able to limit specific clients (the misbehaving ones) in a
> > >> multi-tenant environment.
> > >> - The second reason is that, at least in our current implementation,
> > >> the length of the queue is not really a good characteristic for
> > >> estimating the load. Coming back to your example of the
> > >> CreateTopicsRequest: it creates a path in ZK for each newly created
> > >> topic, which triggers a ChangeTopic event in the controller. That
> > >> single event could cover a single topic in some cases or a thousand
> > >> topics in others.
> > >> These two reasons aside, bounding the queue also introduces a knob
> > >> which requires some tuning and thus suffers from all the points you
> > >> mentioned as well, doesn't it? The right value for one version may
> > >> be wrong for another.
> > >>
> > >> 4. Regarding the integration with KIP-500: the implementation of
> > >> this KIP will span the ApiLayer and the AdminManager. To be honest,
> > >> we can influence the implementation to work best with what you have
> > >> in mind for the future controller. If you could shed some more light
> > >> on the future internal architecture, I can take this into account
> > >> during the implementation.
> > >>
> > >> 5. Regarding KIP-590: for the create topics request, create
> > >> partitions request, and delete topics request, we are lucky enough
> > >> to have directed all of them to the controller already, so we are
> > >> fine with doing the admission in the broker which hosts the
> > >> controller. If we were to throttle more operations in the future, I
> > >> believe that we can continue to do it centrally while leveraging the
> > >> forwarded principal to account for the right tenant. The reason why
> > >> I would like to keep it central is that we are talking about sparse
> > >> (or bursty) workloads here, so distributing the quota among all the
> > >> brokers makes little sense.
> > >>
> > >> 6. Regarding the naming of the new error code: BUSY sounds too
> > >> generic to me, so I would rather prefer a specific error code, the
> > >> main reason being that we may be able to reuse it in the future for
> > >> other quotas. That being said, we can find another name.
> > >> QUOTA_EXHAUSTED? I don't feel too strongly about the naming. I
> > >> wonder what the others think about this.
> > >>
> > >> Voilà. I hope that I have addressed all your questions/points, and I
> > >> am sorry for the lengthy email.
> > >>
> > >> Regards,
> > >> David
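To make points 1 and 2 above concrete: a sketch of a rate-based admission
check that also produces the ThrottleTimeMs hint. For brevity this uses a
plain token bucket rather than the KIP's sampled-window rate, and the class
and method names are invented for illustration.

    // Illustrative admission check for controller mutations (topics
    // created or deleted, partitions added). Not the KIP's implementation:
    // a plain token bucket standing in for the windowed rate, with the
    // burst capacity playing the role of the window settings above.
    public class ControllerMutationQuota {
        private final double ratePerSec; // permitted mutations per second
        private final double burst;      // extra mutations tolerated as a burst
        private double tokens;
        private long lastUpdateMs;

        public ControllerMutationQuota(double ratePerSec, double burst, long nowMs) {
            this.ratePerSec = ratePerSec;
            this.burst = burst;
            this.tokens = burst;
            this.lastUpdateMs = nowMs;
        }

        /**
         * Tries to admit an operation costing {@code mutations} (e.g. the
         * number of partitions a CreateTopicsRequest would create).
         * Returns 0 if admitted, otherwise the number of milliseconds the
         * client should wait: the broker would answer with QUOTA_VIOLATED
         * and put this value in ThrottleTimeMs.
         */
        public synchronized long tryAdmit(int mutations, long nowMs) {
            // Accrue tokens for the time elapsed since the last check.
            tokens = Math.min(burst,
                tokens + (nowMs - lastUpdateMs) / 1000.0 * ratePerSec);
            lastUpdateMs = nowMs;
            if (tokens >= mutations) {
                tokens -= mutations;
                return 0L;
            }
            // Not enough budget: report how long until the deficit is repaid.
            return (long) Math.ceil((mutations - tokens) / ratePerSec * 1000.0);
        }
    }

A well-behaved client that receives QUOTA_VIOLATED then simply waits for
ThrottleTimeMs before retrying, which closes the back-pressure loop that
point 2 describes.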
> > >> On Tue, Jun 9, 2020 at 2:13 AM Colin McCabe <cmcc...@apache.org> wrote:
> > >>
> > >> > On Mon, Jun 8, 2020, at 14:41, Jun Rao wrote:
> > >> > > Hi, Colin,
> > >> > >
> > >> > > Thanks for the comment. You brought up several points.
> > >> > >
> > >> > > 1. Should we set up a per-user quota? To me, it does seem we need
> > >> > > some sort of a quota. When the controller runs out of resources,
> > >> > > ideally, we only want to penalize the badly behaving
> > >> > > applications, instead of every application. To do that, we will
> > >> > > need to know what each application is entitled to, and the
> > >> > > per-user quota is intended to capture that.
> > >> > >
> > >> > > 2. How easy is it to configure a quota? The following is how an
> > >> > > admin typically sets up a quota in our existing systems. Pick a
> > >> > > generous default per-user quota that works for most applications.
> > >> > > For the few resource-intensive applications, customize a higher
> > >> > > quota for them. Reserve enough resources in anticipation that a
> > >> > > single (or a few) application will exceed the quota at a given
> > >> > > time.
> > >> >
> > >> > Hi Jun,
> > >> >
> > >> > Thanks for the response.
> > >> >
> > >> > Maybe I was too pessimistic about the ability of admins to
> > >> > configure a useful quota here. I do agree that it would be nice to
> > >> > have the ability to set different quotas for different users, as
> > >> > you mentioned.
> > >> >
> > >> > > 3. How should the quota be defined? In the discussion thread, we
> > >> > > debated between a usage-based model vs. a rate-based model. Dave
> > >> > > and Anna argued for the rate-based model mostly because it's
> > >> > > simpler to implement.
> > >> >
> > >> > I'm trying to think more about how this integrates with our plans
> > >> > for KIP-500. When we get rid of ZK, we will have to handle this in
> > >> > the controller itself, rather than in the AdminManager. That
> > >> > implies we'll have to rewrite the code. Maybe this is worth it if
> > >> > we want this feature now, though.
> > >> >
> > >> > Another wrinkle here is that, as we discussed in KIP-590,
> > >> > controller operations will land on a random broker first, and only
> > >> > then be forwarded to the active controller. This implies that
> > >> > either admission control should happen on all brokers (needing
> > >> > some kind of distributed quota scheme), or be done on the
> > >> > controller after we've already done the work of forwarding the
> > >> > message. The second approach might not be that bad, but it would be
> > >> > nice to figure this out.
> > >> >
> > >> > > 4. If a quota is exceeded, how is that enforced? My understanding
> > >> > > of the KIP is that, if a quota is exceeded, the broker
> > >> > > immediately sends a QUOTA_VIOLATED error and a throttle time back
> > >> > > to the client, and the client will wait for the throttle time
> > >> > > before issuing the next request. This seems to be the same as the
> > >> > > BUSY error code you mentioned.
> > >> >
> > >> > Yes, I agree, it sounds like we're thinking along the same lines.
> > >> > However, rather than QUOTA_VIOLATED, how about naming the error
> > >> > code BUSY? Then the error text could indicate the quota that we
> > >> > violated. This would be more generally useful as an error code and
> > >> > also avoid being confusingly similar to POLICY_VIOLATION.
> > >> >
> > >> > best,
> > >> > Colin
> > >> >
> > >> > > I will let David chime in more on that.
> > >> > >
> > >> > > Thanks,
> > >> > >
> > >> > > Jun
> > >> > >
> > >> > > On Sun, Jun 7, 2020 at 2:30 PM Colin McCabe <cmcc...@apache.org> wrote:
> > >> > >
> > >> > > > Hi David,
> > >> > > >
> > >> > > > Thanks for the KIP.
> > >> > > >
> > >> > > > I thought about this for a while and I actually think this
> > >> > > > approach is not quite right. The problem that I see here is
> > >> > > > that using an explicitly set quota requires careful tuning by
> > >> > > > the cluster operator. Even worse, this tuning might be
> > >> > > > invalidated by changes in overall conditions or even by more
> > >> > > > efficient controller software.
> > >> > > >
> > >> > > > For example, if we empirically find that the controller can do
> > >> > > > 1000 topics in a minute (or whatever), this tuning might
> > >> > > > actually be wrong if the next version of the software can do
> > >> > > > 2000 topics in a minute because of efficiency upgrades. Or, the
> > >> > > > broker that the controller is located on might be experiencing
> > >> > > > heavy load from its non-controller operations, and so it can
> > >> > > > only do 500 topics in a minute during this period.
> > >> > > >
> > >> > > > So the system administrator gets a very obscure tunable (it's
> > >> > > > not clear to a non-Kafka developer what "controller mutations"
> > >> > > > are or why they should care). And even worse, they will have to
> > >> > > > significantly "sandbag" the value that they set it to, so that
> > >> > > > even under the heaviest load and the oldest deployed version of
> > >> > > > the software, the controller can still function. Even worse,
> > >> > > > this new quota adds a lot of complexity to the controller.
> > >> > > >
> > >> > > > What we really want is backpressure when the controller is
> > >> > > > overloaded. I believe this is the alternative you discuss in
> > >> > > > "Rejected Alternatives" under "Throttle the Execution instead
> > >> > > > of the Admission". Your reason for rejecting it is that the
> > >> > > > client error handling does not work well in this case. But
> > >> > > > actually, this is an artifact of our current implementation,
> > >> > > > rather than a fundamental issue with backpressure.
> > >> > > >
> > >> > > > Consider the example of a CreateTopicsRequest. The controller
> > >> > > > could return a special error code if the load was too high, and
> > >> > > > take the create topics event off the controller queue. Let's
> > >> > > > call that error code BUSY. Additionally, the controller could
> > >> > > > immediately refuse new events if the queue had reached its
> > >> > > > maximum length, and simply return BUSY for that case as well.
> > >> > > >
> > >> > > > Basically, the way we handle RPC timeouts in the controller
> > >> > > > right now is not very good. As you know, we time out the RPC,
> > >> > > > so the client gets TimeoutException, but then we keep the event
> > >> > > > on the queue, so that it eventually gets executed! There's no
> > >> > > > reason why we have to do that. We could take the event off the
> > >> > > > queue if there is a timeout. This would reduce load and mostly
> > >> > > > avoid the paradoxical situations you describe (getting
> > >> > > > TopicExistsException for a CreateTopicsRequest retry, etc.).
> > >> > > >
> > >> > > > I say "mostly" because there are still cases where retries
> > >> > > > could go astray (for example, if we execute the topic creation
> > >> > > > but a network problem prevents the response from being sent to
> > >> > > > the client). But this would still be a very big improvement
> > >> > > > over what we have now.
> > >> > > >
> > >> > > > Sorry for commenting so late on this, but I got distracted by
> > >> > > > some other things. I hope we can figure this one out -- I feel
> > >> > > > like there is a chance to significantly simplify this.
> > >> > > >
> > >> > > > best,
> > >> > > > Colin
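For concreteness, a minimal sketch of the backpressure mechanism Colin
outlines above: reject new events with BUSY when the queue is full, and
drop queued events whose RPC deadline has passed instead of executing them.
The types below are hypothetical, not the actual controller code.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Hypothetical bounded controller event queue: offer() failing maps
    // to answering the RPC with BUSY, and events whose deadline has
    // passed are dropped instead of executed.
    public class BoundedControllerQueue {

        public static final class Event {
            final Runnable action;
            final long deadlineMs;  // when the originating RPC times out

            Event(Runnable action, long deadlineMs) {
                this.action = action;
                this.deadlineMs = deadlineMs;
            }
        }

        private final BlockingQueue<Event> queue;

        public BoundedControllerQueue(int maxQueuedEvents) {
            this.queue = new ArrayBlockingQueue<>(maxQueuedEvents);
        }

        /** Returns false if the queue is full; the caller answers BUSY. */
        public boolean tryEnqueue(Event event) {
            return queue.offer(event);
        }

        /** Event loop: expired events are skipped, not executed. */
        public void runLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted()) {
                Event event = queue.take();
                if (System.currentTimeMillis() > event.deadlineMs) {
                    continue;  // the RPC already timed out; shed the work
                }
                event.action.run();
            }
        }
    }

Note that the queue length here is itself a tunable, which is the push-back
in David's point 3 further up the thread: one queued event may represent
one topic or a thousand, so a full queue is a coarse proxy for load.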
> > >> > > > On Fri, May 29, 2020, at 07:57, David Jacot wrote:
> > >> > > > > Hi folks,
> > >> > > > >
> > >> > > > > I'd like to start the vote for KIP-599, which proposes a new
> > >> > > > > quota to throttle create topic, create partition, and delete
> > >> > > > > topic operations to protect the Kafka controller:
> > >> > > > >
> > >> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-599%3A+Throttle+Create+Topic%2C+Create+Partition+and+Delete+Topic+Operations
> > >> > > > >
> > >> > > > > Please let me know what you think.
> > >> > > > >
> > >> > > > > Cheers,
> > >> > > > > David