Re: Metadata requests for subset of partitions

Michał Łowicki Fri, 28 Feb 2025 04:51:27 -0800

On Fri, Feb 28, 2025 at 10:10 AM Stanislav Kozlovski <
stanislavkozlov...@apache.org> wrote:


> > > It's certainly been a topic that's come up before. In certain
> > > situations the current approach is a bit heavy-handed. The current
> > > approach for fetching metadata has a number of benefits: it keeps the
> > > protocol from being too chatty, which reduces load on the brokers and
> > > makes maintaining a consistent view of the metadata on the client much
> > > easier. There's a fairly substantial overhead with fetching metadata
> > > and batching it in a single request eliminates a lot of edge cases.
>
> My understanding is that the substantial overhead of the metadata request
> comes precisely from the total number of partitions the broker needs to
> iterate over and build objects for. (please correct me if I'm wrong and
> it's something non-obvious)
>
> If that's true, then the fewer partitions it has to do that for, the less
> overhead there would be?
>
> As for the edge cases, I am not aware of them but can certainly imagine
> something like the old consumer protocol, where the client chooses the
> assignment, being prone to edge cases from incomplete metadata. Perhaps
> the subset partition metadata fetching can be employed strategically in
> cases where that risk is lower.
>
> --
>
> Michal, out of curiosity, what led you to this question? Do you see any
> substantial overhead in the metadata path on the clients/brokers because of
> this unnecessary fetching?
>
> --
>
> re: chattiness - do we all define chattiness by the number of requests per
> second?
> Michal, you mention fetching the subset could reduce chattiness but I
> don't see how that could happen. By definition if you send less data per
> response, then the chances are you'll need to send more requests once
> you want more data. Am I missing anything?
>

By chattiness I meant the amount of data transferred.

We have an in-house client, and frequently, for topics with hundreds or
thousands of partitions, consumption is spread across a significant
number of consumers, each of which is interested in only a few partitions.

For example: 1000 partitions, 200 consumers, each getting 5 partitions.

Currently each consumer needs to fetch metadata for all partitions of the
topic on start, so we retrieve metadata for 1000 * 200 = 200,000 partitions
(200 requests) from the brokers, where 200 * 5 = 1000 would be enough.
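
To make the arithmetic concrete, here is a rough sketch in Python. The
function and variable names are made up for illustration, and it assumes
each consumer fetches metadata once on start and that a subset request
would return exactly the partitions that consumer owns.

```python
def entries_fetched_today(partitions: int, consumers: int) -> int:
    """Partition metadata entries returned when every consumer
    fetches metadata for the whole topic (assumed: once per consumer)."""
    return partitions * consumers


def entries_needed(consumers: int, partitions_per_consumer: int) -> int:
    """Entries returned if each consumer could ask only for the
    partitions it owns (hypothetical subset request)."""
    return consumers * partitions_per_consumer


PARTITIONS = 1000
CONSUMERS = 200
PARTITIONS_PER_CONSUMER = 5

fetched = entries_fetched_today(PARTITIONS, CONSUMERS)
needed = entries_needed(CONSUMERS, PARTITIONS_PER_CONSUMER)

print(f"fetched today: {fetched}")         # 200000
print(f"actually needed: {needed}")        # 1000
print(f"overhead factor: {fetched // needed}x")  # 200x
```

So in this example roughly 200 times more partition metadata is
transferred than the consumers actually use.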


>
> On 2025/02/28 07:56:29 Michał Łowicki wrote:
> > On Thu, Feb 27, 2025 at 5:39 PM Kirk True <k...@kirktrue.pro> wrote:
> >
> > > Hi Michał,
> > >
> > > On Thu, Feb 27, 2025, at 3:44 AM, Michał Łowicki wrote:
> > > > Hi there!
> > > >
> > > > Is there any reason why Metadata requests
> > > > <https://kafka.apache.org/protocol.html#The_Messages_Metadata> do not
> > > > support fetching metadata for a subset of the partitions? A client may
> > > > be interested in only e.g. 1 partition while the topic has many, so
> > > > most of the fetched data isn't really used.
> > > >
> > >
> > > It's certainly been a topic that's come up before. In certain
> > > situations the current approach is a bit heavy-handed. The current
> > > approach for fetching metadata has a number of benefits: it keeps the
> > > protocol from being too chatty, which reduces load on the brokers and
> > > makes maintaining a consistent view of the metadata on the client much
> > > easier. There's a fairly substantial overhead with fetching metadata
> > > and batching it in a single request eliminates a lot of edge cases.
> > >
> >
> > Sure, I'm rather thinking about an opt-in option in the protocol where,
> > if specified, the metadata response would contain metadata only for a
> > specified set of partitions (otherwise, as today, metadata for all of
> > them). This would cover the cases where consumers need metadata for only
> > a small portion of the partitions. The broker would then have less work
> > handling such requests and crafting responses, and the protocol would
> > actually be less chatty in those cases.
> >
> >
> > >
> > > As always, further discussion and suggestions for improvements in this
> > > area are welcomed :)
> > >
> > > Thanks,
> > > Kirk
> >
>
