Hello, I got this from the Javadoc for KafkaConsumer:
    If a consumer is assigned multiple partitions to fetch data from, it
    will try to consume from all of them at the same time, effectively
    giving these partitions the same priority for consumption. However, in
    some cases consumers may want to first focus on fetching from some
    subset of the assigned partitions at full speed, and only start
    fetching other partitions when these partitions have little or no data
    to consume. One such case is stream processing, where the processor
    fetches from two topics and performs a join on these two streams. When
    one of the topics is lagging far behind the other, the processor would
    like to pause fetching from the topic that is ahead in order to let
    the lagging stream catch up. Another example is bootstrapping when a
    consumer starts up and there is a lot of historical data to catch up
    on: the application usually wants to get the latest data on some of
    the topics before considering fetching from other topics.

I'm testing a consumer now. The topic being read has the following lag:

consumer group partition: 0, offset: 254, lag: 12301
consumer group partition: 1, offset: 302, lag: 12216
consumer group partition: 2, offset: 300, lag: 12257
consumer group partition: 3, offset: 259, lag: 12108

My consumer starts with partition 3 and catches all the way up, then it
starts reading the rest of the partitions evenly. I'm not sure why it
happens that way. Hope this helps.

On Sun, Jan 23, 2022 at 1:58 AM Mazen Ezzeddine <
mazen.ezzedd...@etu.univ-cotedazur.fr> wrote:

> Dear all,
>
> Consider a Kafka topic deployment with 3 partitions P1, P2, P3, with
> events/records lagging in the partitions equal to 100, 50, and 75 for
> P1, P2, and P3 respectively. And let's suppose that max.poll.records
> (the maximum number of records returned in a single call to poll()) is
> equal to 100.
>
> If the consumer sends a request to fetch records from P1, P2, P3, is
> there any guarantee that the returned records will be fairly/uniformly
> selected from the available partitions, e.g., 34 records from P1, 33
> from P2, and 33 from P3?
>
> Otherwise, how is the decision on the returned records made (e.g., is
> it based on the first partition leader that replies to the fetch
> request, say P1)? In such a case, how is eventual fairness guaranteed
> across the different partitions, for example when records happen to be
> fetched/read from a single partition only?
>
> Thank you.
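For what it's worth, a single poll() does not split the max.poll.records
budget evenly across partitions: it drains records that have already been
fetched into the consumer's buffer, one partition's batch at a time, so one
poll may well return records from only one partition. Below is a toy model
of that draining behavior using the numbers from the question. This is my
own illustrative sketch (the class and method names are made up, and it
deliberately ignores fetch sizing, leader placement, and round-robin of
subsequent polls); it is not Kafka's actual fetcher code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of how one poll() satisfies its max.poll.records budget:
// it drains one partition's buffered batch at a time, in buffer order,
// rather than splitting the budget evenly across partitions.
// Illustrative sketch only -- not Kafka's real implementation.
public class PollDrainModel {

    // buffered: records sitting in the consumer's fetch buffer, per partition
    // returns: how many records this poll() would hand back from each partition
    static Map<String, Integer> poll(LinkedHashMap<String, Integer> buffered,
                                     int maxPollRecords) {
        Map<String, Integer> returned = new LinkedHashMap<>();
        int remaining = maxPollRecords;
        for (Map.Entry<String, Integer> e : buffered.entrySet()) {
            if (remaining == 0) break;
            int take = Math.min(e.getValue(), remaining);
            if (take > 0) {
                returned.put(e.getKey(), take);
                remaining -= take;
            }
        }
        return returned;
    }

    public static void main(String[] args) {
        // The lags from the question: P1=100, P2=50, P3=75 buffered records
        LinkedHashMap<String, Integer> buffered = new LinkedHashMap<>();
        buffered.put("P1", 100);
        buffered.put("P2", 50);
        buffered.put("P3", 75);

        // With max.poll.records = 100, the whole budget can come from P1 alone
        System.out.println(poll(buffered, 100)); // prints {P1=100}
    }
}
```

If per-poll skew is a problem, the Javadoc passage quoted above points at
the supported remedy: call consumer.pause(...) on the partitions that are
ahead and consumer.resume(...) once the lagging ones have caught up, rather
than relying on poll() to balance partitions for you.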