Hi Gokul,

Leaving aside the question of how Kafka scales, I think the proposed
solution, limiting the number of partitions in a cluster or per-broker,
is a policy which ought to be addressable via the pluggable policy
mechanism (e.g. create.topic.policy.class.name). Unfortunately, although
there's a policy for topic creation, it's currently not possible to
enforce a policy on partition increase. It would be more flexible to be
able to enforce this kind of thing via a pluggable policy, and it would
also avoid the situation where different people each want a config that
addresses some specific use case or problem they're experiencing.
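
For illustration, here's a minimal sketch of what a creation-time policy
capping the partition count can look like today (the class name and the
hard-coded limit are placeholders; a real policy would read its limit
from configure()), the point being that nothing equivalent runs when
partitions are later increased:

    import java.util.Map;
    import org.apache.kafka.common.errors.PolicyViolationException;
    import org.apache.kafka.server.policy.CreateTopicPolicy;

    public class MaxPartitionsPolicy implements CreateTopicPolicy {

        // Placeholder limit; a real policy would make this configurable.
        private static final int MAX_PARTITIONS_PER_TOPIC = 1000;

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public void validate(RequestMetadata requestMetadata) {
            // numPartitions() can be null when an explicit replica
            // assignment is supplied instead of a partition count.
            Integer numPartitions = requestMetadata.numPartitions();
            if (numPartitions != null
                    && numPartitions > MAX_PARTITIONS_PER_TOPIC) {
                throw new PolicyViolationException(
                    "Topic " + requestMetadata.topic() + " requests " +
                    numPartitions + " partitions, but the limit is " +
                    MAX_PARTITIONS_PER_TOPIC);
            }
        }

        @Override
        public void close() { }
    }

Such a class is enabled by setting create.topic.policy.class.name to its
fully-qualified name in the broker configuration.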

Quite a while ago I proposed KIP-201 to address exactly this problem of
policies being easily circumvented, but it didn't really make any
progress. I've looked at it again in some detail more recently, and I
think something might be possible following the work to make all ZK
writes happen on the controller.

Of course, this is just my take on it.

Kind regards,

Tom

On Thu, Apr 16, 2020 at 11:47 AM Gokul Ramanan Subramanian <
gokul24...@gmail.com> wrote:

> Hi.
>
> For the sake of expediting the discussion, I have created a prototype PR:
> https://github.com/apache/kafka/pull/8499. Eventually, if and when the KIP
> is accepted, I'll extend it with the full implementation, tests, etc.
>
> I would appreciate it if a Kafka committer could share their thoughts, so
> that I can more confidently start the voting thread.
>
> Thanks.
>
> On Thu, Apr 16, 2020 at 11:30 AM Gokul Ramanan Subramanian <
> gokul24...@gmail.com> wrote:
>
> > Thanks for your comments, Alex.
> >
> > The KIP proposes using two configurations, max.partitions and
> > max.broker.partitions, but it does not enforce their use. The default
> > values are pretty large (INT MAX), and should therefore be
> > non-intrusive.
> >
> > In multi-tenant environments, and in partition assignment and
> > rebalancing, the admin could (a) use the default values, which would
> > yield behavior similar to today's, (b) set very high values that they
> > know are sufficient, or (c) dynamically re-adjust the values should the
> > business requirements change. Note that the two configurations are
> > cluster-wide, so they can be updated without restarting the brokers.
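> >
> > For example, once implemented, the proposed limit could be adjusted at
> > runtime through the standard AdminClient incremental config API. A
> > minimal sketch, assuming the KIP's proposed max.broker.partitions
> > config exists, with placeholder bootstrap server and limit values:
> >
> >     import java.util.List;
> >     import java.util.Map;
> >     import java.util.Properties;
> >     import org.apache.kafka.clients.admin.Admin;
> >     import org.apache.kafka.clients.admin.AdminClientConfig;
> >     import org.apache.kafka.clients.admin.AlterConfigOp;
> >     import org.apache.kafka.clients.admin.ConfigEntry;
> >     import org.apache.kafka.common.config.ConfigResource;
> >
> >     public class SetPartitionLimit {
> >         public static void main(String[] args) throws Exception {
> >             Properties props = new Properties();
> >             props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
> >                 "localhost:9092");
> >             try (Admin admin = Admin.create(props)) {
> >                 // An empty resource name targets the cluster-wide
> >                 // default that applies to every broker.
> >                 ConfigResource cluster =
> >                     new ConfigResource(ConfigResource.Type.BROKER, "");
> >                 AlterConfigOp op = new AlterConfigOp(
> >                     new ConfigEntry("max.broker.partitions", "4000"),
> >                     AlterConfigOp.OpType.SET);
> >                 admin.incrementalAlterConfigs(
> >                     Map.of(cluster, List.of(op))).all().get();
> >             }
> >         }
> >     }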
> >
> > The quota system in Kafka seems to be geared towards limiting traffic
> > for specific clients or users, or, in the case of replication, for
> > leaders and followers. The quota configuration itself is very similar
> > to the one introduced in this KIP, i.e. just a few configuration
> > options to specify the quota. The main difference is that the quota
> > system is far more heavy-weight, because it needs to be applied to
> > traffic that is flowing in/out constantly, whereas in this KIP we want
> > to limit the number of partition replicas, which by comparison is
> > modified rarely in a typical cluster.
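> >
> > For contrast, a per-user produce quota set via the AdminClient looks
> > roughly like this (a sketch assuming a client version that exposes the
> > quota-alteration API; the user name and byte rate are placeholders):
> >
> >     import java.util.List;
> >     import java.util.Map;
> >     import java.util.Properties;
> >     import org.apache.kafka.clients.admin.Admin;
> >     import org.apache.kafka.clients.admin.AdminClientConfig;
> >     import org.apache.kafka.common.quota.ClientQuotaAlteration;
> >     import org.apache.kafka.common.quota.ClientQuotaEntity;
> >
> >     public class SetProduceQuota {
> >         public static void main(String[] args) throws Exception {
> >             Properties props = new Properties();
> >             props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG,
> >                 "localhost:9092");
> >             try (Admin admin = Admin.create(props)) {
> >                 // Quotas are keyed per entity (user and/or client-id),
> >                 // unlike the cluster-wide limits proposed in the KIP.
> >                 ClientQuotaEntity user = new ClientQuotaEntity(
> >                     Map.of(ClientQuotaEntity.USER, "some-user"));
> >                 ClientQuotaAlteration alteration =
> >                     new ClientQuotaAlteration(user,
> >                         List.of(new ClientQuotaAlteration.Op(
> >                             "producer_byte_rate", 1048576.0)));
> >                 admin.alterClientQuotas(List.of(alteration)).all().get();
> >             }
> >         }
> >     }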
> >
> > Hope this addresses your comments.
> >
> > On Thu, Apr 9, 2020 at 12:53 PM Alexandre Dupriez <
> > alexandre.dupr...@gmail.com> wrote:
> >
> >> Hi Gokul,
> >>
> >> Thanks for the KIP.
> >>
> >> From what I understand, the objective of the new configuration is to
> >> protect a cluster from an overload driven by an excessive number of
> >> partitions, independently of the load handled on the partitions
> >> themselves. As such, the approach decouples the data-path load from
> >> the number of units of distribution of throughput, and intends to
> >> avoid the degradation of performance exhibited in the test results
> >> provided with the KIP by setting an upper bound on that number.
> >>
> >> Couple of comments:
> >>
> >> 900. Multi-tenancy - one concern I would have with a cluster- and
> >> broker-level configuration is that it is possible for one user to
> >> consume a large proportion of the allocatable partitions within the
> >> configured limit, leaving other users with too few partitions to
> >> satisfy their requirements.
> >>
> >> 901. Quotas - an approach in Apache Kafka to set up an upper bound on
> >> resource consumption is via client/user quotas. Could this framework
> >> be leveraged to add this limit?
> >>
> >> 902. Partition assignment - one potential problem with the new
> >> repartitioning scheme is that if a subset of brokers have reached
> >> their limit of assignable partitions, yet their data path is
> >> under-loaded, new topics and/or partitions will be assigned
> >> exclusively to other brokers, which could increase the likelihood of
> >> data-path load imbalance. Fundamentally, isolating the constraint on
> >> the number of partitions from the data-path throughput can produce
> >> conflicting requirements.
> >>
> >> 903. Rebalancing - as a corollary to 902, external tools used to
> >> balance ingress throughput may adopt an incremental approach to
> >> partition re-assignment to redistribute load, and could hit the limit
> >> on the number of partitions on a broker when a (too) conservative
> >> limit is used, thereby over-constraining the objective function and
> >> narrowing the available migration paths.
> >>
> >> Thanks,
> >> Alexandre
> >>
> >> On Thu, Apr 9, 2020 at 00:19, Gokul Ramanan Subramanian
> >> <gokul24...@gmail.com> wrote:
> >> >
> >> > Hi. Requesting you to take a look at this KIP and provide feedback.
> >> >
> >> > Thanks. Regards.
> >> >
> >> > On Wed, Apr 1, 2020 at 4:28 PM Gokul Ramanan Subramanian <
> >> > gokul24...@gmail.com> wrote:
> >> >
> >> > > Hi.
> >> > >
> >> > > I have opened KIP-578, intended to provide a mechanism to limit the
> >> number
> >> > > of partitions in a Kafka cluster. Kindly provide feedback on the KIP
> >> which
> >> > > you can find at
> >> > >
> >> > >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-578%3A+Add+configuration+to+limit+number+of+partitions
> >> > >
> >> > > I want to specially thank Stanislav Kozlovski who helped in
> >> formulating
> >> > > some aspects of the KIP.
> >> > >
> >> > > Many thanks,
> >> > >
> >> > > Gokul.
> >> > >
> >>
> >
>
