Hello,

I got this from the JavaDocs for KafkaConsumer.

If a consumer is assigned multiple partitions to fetch data from, it will try to consume from all of them at the same time, effectively giving these partitions the same priority for consumption. However in some cases consumers may want to first focus on fetching from some subset of the assigned partitions at full speed, and only start fetching other partitions when these partitions have few or no data to consume.

One of such cases is stream processing, where processor fetches from two topics and performs the join on these two streams. When one of the topics is long lagging behind the other, the processor would like to pause fetching from the ahead topic in order to get the lagging stream to catch up. Another example is bootstraping upon consumer starting up where there are a lot of history data to catch up, the applications usually want to get the latest data on some of the topics before consider fetching other topics.
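
For reference, the flow control described there is exposed through the consumer's pause() and resume() methods. Below is a minimal sketch of that pattern; the broker address, group id, topic name, and the catch-up check are placeholders, not anything from the original question.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PauseResumeSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("group.id", "flow-control-demo");       // hypothetical group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("example-topic")); // placeholder topic

                // Hypothetical: partition 0 is far ahead, so pause it until the others catch up.
                TopicPartition ahead = new TopicPartition("example-topic", 0);
                boolean caughtUp = false; // application-specific catch-up check, omitted here

                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    // ... process records ...

                    if (!caughtUp && consumer.assignment().contains(ahead)) {
                        consumer.pause(Collections.singleton(ahead));  // stop fetching the ahead partition
                    } else if (caughtUp && consumer.paused().contains(ahead)) {
                        consumer.resume(Collections.singleton(ahead)); // start fetching it again
                    }
                }
            }
        }
    }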

I'm testing a consumer now. The topic being read has the following lag per partition:

consumer group partition: 0, offset: 254, lag: 12301
consumer group partition: 1, offset: 302, lag: 12216
consumer group partition: 2, offset: 300, lag: 12257
consumer group partition: 3, offset: 259, lag: 12108

My consumer starts with partition 3 and catches it all the way up, and only then
starts reading the rest of the partitions evenly. I'm not sure why it happens
that way.
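
In case anyone wants to reproduce this, here is a small sketch that logs how many records each partition contributes per poll(); the broker address, group id, and topic name are placeholders.

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class PerPartitionPollLogger {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("group.id", "lag-test-group");          // hypothetical group id
            props.put("auto.offset.reset", "earliest");
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("example-topic")); // placeholder topic

                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    // Log how many records each partition contributed to this poll, to see
                    // whether the fetcher drains one partition before touching the others.
                    for (TopicPartition tp : records.partitions()) {
                        System.out.printf("partition %d -> %d records%n",
                                tp.partition(), records.records(tp).size());
                    }
                }
            }
        }
    }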

Hope this helps.





On Sun, Jan 23, 2022 at 1:58 AM Mazen Ezzeddine <
mazen.ezzedd...@etu.univ-cotedazur.fr> wrote:

> Dear all,
>
> Consider a Kafka topic deployment with 3 partitions P1, P2, P3, with
> events/records lagging in the partitions equal to 100, 50, 75 for P1, P2,
> P3 respectively. And let’s suppose that max.poll.records (the maximum
> number of records returned in a single call to poll()) is equal to 100.
>
> If the consumer sends a request to fetch records from P1, P2, and P3, is
> there any guarantee that the returned records will be fairly/uniformly
> selected from the available partitions, e.g., 34 records from P1, 33
> from P2, and 33 from P3?
>
> Otherwise, how is the decision on the returned records handled (e.g., is
> it based on the first partition leader that replies to the fetch request,
> say P1)? In that case, how is eventual fairness guaranteed across the
> different partitions, for example when records happen to be fetched/read
> from only a single partition?
>
> Thank you.
>
>
