Hi Anna,

Thanks for the response, sounds good.

Regards,

Rajini


On Sun, May 17, 2020 at 1:38 AM Anna Povzner <a...@confluent.io> wrote:

> Hi Rajini,
>
> Thanks for reviewing the KIP!
>
> I agree with your suggestion to make per-IP connection rate quota a dynamic
> quota for entity name IP. This will allow configuring connection rate for a
> particular IP as well. I updated the wiki accordingly.
>
> Your second concern makes sense -- rejecting the connection right away will
> likely cause a new connection from the same client. I am concerned about
> delaying new connections for processing later, because if the connections
> keep coming with the high rate, there may be potentially a large backlog
> and connections may start timing out before the broker gets to processing
> them. For example, if clients come through proxy, there may be
> potentially a large number of incoming connections with the same IP.
>
> What do you think about the following option:
> * Once per-IP connection rate reaches the limit, accept or drop (clean up)
> the connection after a delay depending on whether the quota is still
> violated. We could re-use the mechanism implemented with KIP-306 where the
> broker delays the response for failed client authentication. The delay will
> be set to min(delay calculated based on the rate quota, 1 second), which
> matches the max delay for request quota.
>
> I think this option is somewhat your suggestion with delaying accepting per
> IP connections that reached the rate limit, but with protection in place to
> make sure the number of delayed connections does not blow up. What do you
> think?
>
> Thanks,
> Anna
>
> On Sat, May 16, 2020 at 1:09 AM Alexandre Dupriez <
> alexandre.dupr...@gmail.com> wrote:
>
> > Hi Anna,
> >
> > Thank you for your answers and explanations.
> >
> > A couple of additional comments:
> >
> > 900. KIP-612 does not intend to dedicate a metric to the throttling of
> > incoming connections. I wonder if such a metric would be handy for
> > monitoring and help set-up metric-based alarming if one wishes to
> > capture this type of incident?
> >
> > 901. Following-up on Rajini's point 2 above - from my understanding,
> > this new quota should prevent excess CPU consumption in
> > Processor#run() method when a new connection has been accepted.
> > Through the throttling in place, connections will be delayed as
> > indicated by the KIP's specifications:
> >
> > " If connection creation rate on the broker exceeds the broker-wide
> > limit, the broker will delay accepting a new connection by an amount
> > of time that brings the rate within the limit."
> >
> > You may be referring to the following sentence:
> >
> > "A new broker configuration option will be added to limit the rate at
> > which connections will be accepted for each IP address. New
> > connections for the IP will be dropped once the limit is reached."?
> >
> > 902. It may be interesting to capture the data with and without
> > connection throttling under stress scenarios. You may have these data
> > already. If you need a pair of hands to do some stress tests once you
> > have a POC or a PR, I am happy to contribute :)
> >
> > Thanks,
> > Alexandre
> >
> > Le ven. 15 mai 2020 à 12:22, Rajini Sivaram <rajinisiva...@gmail.com> a
> > écrit :
> > >
> > > Hi Anna,
> > >
> > > Thanks for the KIP, looks good overall. A couple of comments about
> per-IP
> > > connection quotas:
> > >
> > > 1) Should we consider making per-IP quota similar to other quotas?
> > > Configured as a dynamic quota for entity type IP, with per-IP limit as
> > well
> > > as defaults? Perhaps that would fit better rather than configs?
> > >
> > > 2) The current proposal drops connections after accepting connections
> for
> > > per-IP limit. We do this in other cases too, but in this case, should
> we
> > > throttle instead? My point is what is the quota protecting? If we want
> to
> > > limit rate of accepted connections, then accepting a connection and
> then
> > > dropping doesn't really help since that IP is going to reconnect. If we
> > > want to rate limit what happens next, i.e. authentication, then
> > > throttling the accepted connection so its processing is delayed would
> > > perhaps be better?
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > > On Thu, May 14, 2020 at 4:12 PM David Jacot <dja...@confluent.io>
> wrote:
> > >
> > > > Hi Anna,
> > > >
> > > > Thanks for your answers and the updated KIP. Looks good to me!
> > > >
> > > > Best,
> > > > David
> > > >
> > > > On Thu, May 14, 2020 at 12:54 AM Anna Povzner <a...@confluent.io>
> > wrote:
> > > >
> > > > > I updated the KIP to add a new broker configuration to limit
> > connection
> > > > > creation rate per IP: max.connection.creation.rate.per.ip. Once the
> > limit
> > > > > is reached for a particular IP address, the broker will reject the
> > > > > connection from that IP (close the connection it accepted) and
> > continue
> > > > > rejecting them until the rate is back within the rate limit.
> > > > >
> > > > > On Wed, May 13, 2020 at 11:46 AM Anna Povzner <a...@confluent.io>
> > wrote:
> > > > >
> > > > > > Hi David and Alexandre,
> > > > > >
> > > > > > Thanks so much for your feedback! Here are my answers:
> > > > > >
> > > > > > 1. Yes, we have seen several cases of clients that create a new
> > > > > connection
> > > > > > per produce/consume request. One hypothesis is someone who is
> used
> > to
> > > > > > connection pooling may accidentally write a Kafka client that
> > creates a
> > > > > new
> > > > > > connection every time.
> > > > > >
> > > > > > 2 & 4. That's a good point I haven't considered. I think it makes
> > sense
> > > > > to
> > > > > > provide an ability to limit connection creations per IP as well.
> > This
> > > > is
> > > > > > not hard to implement -- the broker already keeps track of the
> > number
> > > > of
> > > > > > connections per IP, and immediately closes a new connection if it
> > comes
> > > > > > from an IP that reached the connection limit. So, we could
> > additionally
> > > > > > track the rate, and close the connection from IP that exceeds the
> > rate.
> > > > > One
> > > > > > slight concern is whether keeping track of per IP rates and
> quotas
> > adds
> > > > > > overhead (CPU and memory). But perhaps it is not a problem if we
> > use
> > > > > > expiring sensors.
> > > > > >
> > > > > > It would still make sense to limit the overall connection
> creation
> > rate
> > > > > > for the Kafka clusters which are shared among many different
> > > > > > applications/clients, since they may spike at the same time
> > bringing
> > > > the
> > > > > > total rate too high.
> > > > > >
> > > > > > 3. Controlling connection queue sizes only controls the share of
> > time
> > > > > > network threads use for creating new connections (and accepting
> on
> > > > > Acceptor
> > > > > > thread) vs. doing other work on each Processor iteration. It does
> > not
> > > > > > directly control how processing connection creations would be
> > related
> > > > to
> > > > > > other processing done by brokers like on request handler threads.
> > So,
> > > > > while
> > > > > > controlling queue size may mitigate the issue for some of the
> > > > workloads,
> > > > > it
> > > > > > does not guarantee that. Plus, if we want to limit how many
> > connections
> > > > > are
> > > > > > created per IP, the queue size approach would not work, unless we
> > go
> > > > > with a
> > > > > > "share" of the queue, which I think even further obscures what
> that
> > > > > setting
> > > > > > means (and what we would achieve as an end result). Does this
> > answer
> > > > the
> > > > > > question?
> > > > > >
> > > > > > If there are no objections, I will update the KIP to add per IP
> > > > > connection
> > > > > > rate limits (config and enforcement).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Anna
> > > > > >
> > > > > >
> > > > > > On Tue, May 12, 2020 at 11:25 AM Alexandre Dupriez <
> > > > > > alexandre.dupr...@gmail.com> wrote:
> > > > > >
> > > > > >> Hello,
> > > > > >>
> > > > > >> Thank you for the KIP.
> > > > > >>
> > > > > >> I experienced in the past genuine broker brownouts due to
> > connection
> > > > > >> storms consuming most of the CPU available on the server and I
> > think
> > > > > >> it is useful to protect against it.
> > > > > >>
> > > > > >> I tend to share the questions asked in points 2 and 4 from
> David.
> > Is
> > > > > >> there still a risk of denial of service if the limit applies at
> > the
> > > > > >> listener-level without differentiating between (an) “offending”
> > > > > >> client(s) and the others?
> > > > > >>
> > > > > >> To rebound on point 3 - conceptually one difference between
> > capping
> > > > > >> the queue size or throttling as presented in the KIP would come
> > from
> > > > > >> the time it takes to accept a connection and how that time
> evolves
> > > > > >> with the connection rate.
> > > > > >> Assuming that that time increases monotonically with resource
> > > > > >> utilization, the admissible rate of connections would decrease
> as
> > the
> > > > > >> server becomes more loaded, if the limit was set on queue size.
> > > > > >>
> > > > > >> Thanks,
> > > > > >> Alexandre
> > > > > >>
> > > > > >> Le mar. 12 mai 2020 à 08:49, David Jacot <dja...@confluent.io>
> a
> > > > écrit
> > > > > :
> > > > > >> >
> > > > > >> > Hi Anna,
> > > > > >> >
> > > > > >> > Thanks for the KIP! I have few questions:
> > > > > >> >
> > > > > >> > 1. You mention that some clients may create a new connections
> > for
> > > > each
> > > > > >> > requests: "Another example is clients that create a new
> > connection
> > > > for
> > > > > >> each
> > > > > >> > produce/consume request". I am curious here but do we know any
> > > > clients
> > > > > >> > behaving like this?
> > > > > >> >
> > > > > >> > 2. I am a bit concerned by the impact of misbehaving clients
> on
> > the
> > > > > >> other
> > > > > >> > ones. Let's say that we define a quota of 10 connections / sec
> > for a
> > > > > >> broker
> > > > > >> > and that we have a misbehaving application constantly trying
> to
> > > > create
> > > > > >> 20
> > > > > >> > connections on that broker. That application will constantly
> > hit the
> > > > > >> quota
> > > > > >> > and
> > > > > >> > always have many pending connections in the queue waiting to
> be
> > > > > >> accepted.
> > > > > >> > Regular clients trying to connect would need to wait until all
> > the
> > > > > >> pending
> > > > > >> > connections upfront in the queue are drained in the best case
> > > > scenario
> > > > > >> or
> > > > > >> > won't be able to connect at all in the worst case scenario if
> > the
> > > > > queue
> > > > > >> is
> > > > > >> > full.
> > > > > >> > Does it sound like a valid concern? How do you see this?
> > > > > >> >
> > > > > >> > 3. As you mention it in the KIP, we use bounded queues which
> > already
> > > > > >> limit
> > > > > >> > the maximum number of connections that can be accepted. I
> > wonder if
> > > > we
> > > > > >> > could reach the same goal by making the size of the queue
> > > > > configurable.
> > > > > >> >
> > > > > >> > 4. Did you consider doing something similar to the connections
> > quota
> > > > > >> which
> > > > > >> > limits the number of connections per IP? Instead of rate
> > limiting
> > > > all
> > > > > >> the
> > > > > >> > creation,
> > > > > >> > we could perhaps rate limit the number of creation per IP as
> > well.
> > > > > That
> > > > > >> > could
> > > > > >> > perhaps reduce the effect on the other clients. That may be
> > harder
> > > > to
> > > > > >> > implement
> > > > > >> > though.
> > > > > >> >
> > > > > >> > Best,
> > > > > >> > David
> > > > > >> >
> > > > > >> > On Mon, May 11, 2020 at 7:58 PM Anna Povzner <
> a...@confluent.io
> > >
> > > > > wrote:
> > > > > >> >
> > > > > >> > > Hi,
> > > > > >> > >
> > > > > >> > > I just created KIP-612 to allow limiting connection creation
> > rate
> > > > on
> > > > > >> > > brokers, and would like to start a discussion.
> > > > > >> > >
> > > > > >> > >
> > > > > >> > >
> > > > > >>
> > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers
> > > > > >> > >
> > > > > >> > > Feedback and suggestions are welcome!
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > > Anna
> > > > > >> > >
> > > > > >>
> > > > > >
> > > > >
> > > >
> >
>

Reply via email to