Hi all,

I have submitted a PR to address a few review comments from Ismael. There
are a couple of interface changes in the PR, I have updated the KIP as well.

   1. The `listener()` method in AuthorizableRequestContext and Endpoint
   has been renamed to listenerName()
   2. Endpoint.listenerName() now returns an Optional<String>. Since this
   is in the common package, we may use it in future in clients where we
   don't have listener names

Let me know if there are any concerns.

Regards,

Rajini

On Mon, Sep 9, 2019 at 4:53 PM Rajini Sivaram <rajinisiva...@gmail.com>
wrote:

> Hi Jun,
>
> Thanks for the response. If we use the existing purgatory implementation,
> we should get additional purgatory metrics for ACL updates with the new
> purgatory name as tag, consistent with what we have for other delayed
> operations. I will add these to the KIP. We also have request metrics which
> indicate local vs remote time with the API key as tag, so that should
> indicate the portion of time waiting for async operations.
>
> Regards,
>
> Rajini
>
> On Mon, Sep 9, 2019 at 4:31 PM Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Rajini, Ismael,
>>
>> Yes, I can see the argument for making CreateAcls/DeleteAcls async. I am
>> ok
>> with that if you feel the implementation is not too complicated. Should we
>> consider adding some additional metric to reflect the portion of the time
>> spent in waiting for the async operation to complete?
>>
>> Thanks,
>>
>> Jun
>>
>> On Mon, Sep 9, 2019 at 5:20 AM Ismael Juma <isma...@gmail.com> wrote:
>>
>> > Hi Jun,
>> >
>> > As you say, even though the average number of operations may be low, the
>> > request rate can be high in bursts. This can overwhelm the request queue
>> > and cause an outage for no good reason. Or the backing storage system
>> may
>> > be slow for a period of time, causing a similar issue.
>> >
>> > I've seen several such issues in production affecting Kafka's
>> > create/alter/delete paths (e.g people creating/deleting thousands/tens
>> of
>> > thousands of topics in a hot loop). As you say, it's not just ACLs that
>> are
>> > vulnerable. In my opinion, we should learn from these experiences when
>> > designing new interfaces and we should adjust the existing ones to be
>> more
>> > resilient. It's also worth mentioning that create/delete topics already
>> > relies on the purgatory if the request timeout is greater than 0.
>> >
>> > Can you elaborate on how CPU throttling would help? It seems to me that
>> it
>> > would not trigger fast enough if there was a slow back-end, for example
>> > (the number of request threads is often lower than 10, so it doesn't
>> take
>> > many blocked requests to cause major problems).
>> >
>> > Ismael
>> >
>> > On Sun, Sep 8, 2019, 9:5k2 PM Jun Rao <j...@confluent.io> wrote:
>> >
>> > > Hi, Rajini,
>> > >
>> > > Thanks for the reply. The 4-step approach that you outlined seems to
>> > work.
>> > > Overall, I can see that the async authorize() api could lead to an
>> > overall
>> > > more efficient implementation. The tradeoff is that we have to code
>> every
>> > > request with an extra stage. To me, this optimization seems too early.
>> > With
>> > > 100K topics, each with 1KB worth of users, the required ACL space is
>> > about
>> > > 100MB and we should be able to cache everything. Beyond that, we still
>> > have
>> > > the option of dividing up the topic and ACL metadata such that every
>> > broker
>> > > is only required to cache a subset of all metadata.
>> > >
>> > > Regarding making create/delete api async, I still have mixed feelings
>> on
>> > > this. I am wondering if the benefit is worth the added complexity. To
>> me,
>> > > both operations are rare. If one request thread is blocked
>> occasionally
>> > for
>> > > a few millisecs, it's probably ok since it mostly just affects one
>> > client.
>> > > In the case that a particular user issues many those operations in a
>> > short
>> > > window. Could we just use CPU throttling to prevent too much resource
>> > being
>> > > used? There are a few other similar types of requests such as
>> > create/alter
>> > > configs, create/alter topic, etc. Do we plan to add an extra
>> processing
>> > > stage for each of them too in the future?
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> >
>>
>

Reply via email to