Hi, I do not have trace-level logs right now. I am doing only very basic operations on the messages. The only time-consuming part is sending an e-mail. Our email servers are very slow, so sending one email takes up to 20 seconds. That's why I reduced max.poll.records to just 2 and kept the session timeout at 10 minutes. Rebalances would still happen.
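
As a rough sanity check of the numbers above, here is a minimal sketch of the timing budget. It assumes the behaviour of 0.10.0.x clients, where heartbeats are only sent from inside poll(), so the time spent processing one batch has to stay below session.timeout.ms. The function names are made up for illustration, and applying the 20-second send time to the earlier settings from the first message in this thread is an assumption.

```python
# Timing budget for a Kafka 0.10.0.x consumer (heartbeats only sent in poll()).
# Values are taken from the thread; helper names are hypothetical.

def time_between_polls_ms(max_poll_records, per_record_ms):
    """Worst-case processing time for one full batch returned by poll()."""
    return max_poll_records * per_record_ms


def fits_session_timeout(gap_ms, session_timeout_ms):
    """True if the consumer gets back to poll() before the session expires."""
    return gap_ms < session_timeout_ms


EMAIL_SEND_MS = 20_000  # "up to 20 seconds" per email

# Current settings: max.poll.records = 2, session timeout = 10 minutes.
current_gap = time_between_polls_ms(2, EMAIL_SEND_MS)
print(current_gap, fits_session_timeout(current_gap, 10 * 60 * 1000))
# prints: 40000 True

# Earlier settings as written in the first message:
# max.poll.records = 20, session.timeout.ms = 6000.
# Assumption: emails were just as slow back then.
earlier_gap = time_between_polls_ms(20, EMAIL_SEND_MS)
print(earlier_gap, fits_session_timeout(earlier_gap, 6000))
# prints: 400000 False
```

With the current values a full batch takes about 40 seconds in the worst case, well inside the 10-minute session timeout, whereas the earlier values could not have kept the session alive under this assumption.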
However, there's an update. While trying out config tuning, I had set max.poll.interval.ms to 3 minutes. Later I found that this setting does not apply to Kafka 0.10.0.1, which we are using, so I removed it. Now, more than a week after removing it, I haven't seen any rebalance issue.

But the slow consumer startup issue still persists. Whenever I restart my consumer process, there is no activity for almost 5 minutes. I checked the broker logs for that period and saw the message "preparing to stabilise consumer group", then a gap of 5 minutes, and then the message "stabilized group". What could be happening here?

On Fri, Jun 1, 2018 at 10:40 PM Ken Chen <[email protected]> wrote:

> 1. Any detail logs ?
> 2. How do you process the records after you polled the records?
> 3. How much time does it take for every round of poll ?
>
> Thanks !
>
> --
> Sent from my iPhone
>
> On May 28, 2018, at 10:44 PM, Shantanu Deshmukh <[email protected]> wrote:
>
> Can anyone here help me please? I am at my wit's end. I now have
> max.poll.records set to just 2. Still I am getting Auto offset commit
> failed warning. Log file is getting full because of this warning.
> Session timeout is 5 minutes, max.poll.interval.ms is 10 minutes.
>
> On Wed, May 23, 2018 at 12:42 PM Shantanu Deshmukh <[email protected]> wrote:
>
> > Hello,
> >
> > We have a 3 broker Kafka 0.10.0.1 cluster. There we have 3 topics
> > with 10 partitions each. We have an application which spawns threads
> > as consumers. We spawn 5 consumers for each topic. I am observing
> > that the consumer group randomly keeps rebalancing. Then many times
> > we see logs saying "Revoking partitions for". This happens almost
> > every 10 minutes. Consumption during this time completely stops.
> >
> > I have applied this configuration:
> > max.poll.records 20
> > heartbeat.interval.ms 10000
> > session.timeout.ms 6000
> >
> > Still this did not help. Strange thing is I observed the consumer
> > writing logs saying "auto commit failed because poll() loop spent too
> > much time processing records" even when there was no data in the
> > partition to process. We have a polling interval of 500 ms, specified
> > as the argument to poll(). Initially I had set the same consumer
> > group for all three topics' consumers. Then I specified different CGs
> > for different topics' consumers. Even this is not helping.
> >
> > I am trying to search over the web, checked my code, tried many
> > combinations of configuration but still no luck. Please help me.
> >
> > Thanks & Regards,
> >
> > Shantanu Deshmukh
