Hello everyone, We are running a 3 broker Kafka 0.10.0.1 cluster. We have a java app which spawns many consumer threads consuming from different topics. For every topic we have specified different consumer-group. A lot of times I see that whenever this application is restarted a CG on one or two topics takes more than 5 minutes to receive partition assignment. Till that time consumers for that topic don't consumer anything. If I go to Kafka broker and run consumer-groups.sh and describe that particular CG I see that it is rebalancing. There is time critical data stored in that topic and we cannot tolerate such long delays. What can be the reason for such long rebalances.
Here's our consumer config auto.commit.interval.ms = 3000 auto.offset.reset = latest bootstrap.servers = [x.x.x.x:9092, x.x.x.x:9092, x.x.x.x:9092] check.crcs = true client.id = connections.max.idle.ms = 540000 enable.auto.commit = true exclude.internal.topics = true fetch.max.bytes = 52428800 fetch.max.wait.ms = 500 fetch.min.bytes = 1 group.id = otp-notifications-consumer heartbeat.interval.ms = 3000 interceptor.classes = null key.deserializer = class org.apache.kafka.common.serialization.StringDeserializer max.partition.fetch.bytes = 1048576 max.poll.interval.ms = 300000 max.poll.records = 50 metadata.max.age.ms = 300000 metric.reporters = [] metrics.num.samples = 2 metrics.sample.window.ms = 30000 partition.assignment.strategy = [class org.apache.kafka.clients.consumer.RangeAssignor] receive.buffer.bytes = 65536 reconnect.backoff.ms = 50 request.timeout.ms = 305000 retry.backoff.ms = 100 sasl.kerberos.kinit.cmd = /usr/bin/kinit sasl.kerberos.min.time.before.relogin = 60000 sasl.kerberos.service.name = null sasl.kerberos.ticket.renew.jitter = 0.05 sasl.kerberos.ticket.renew.window.factor = 0.8 sasl.mechanism = GSSAPI security.protocol = SSL send.buffer.bytes = 131072 session.timeout.ms = 300000 ssl.cipher.suites = null ssl.enabled.protocols = [TLSv1.2, TLSv1.1, TLSv1] ssl.endpoint.identification.algorithm = null ssl.key.password = null ssl.keymanager.algorithm = SunX509 ssl.keystore.location = null ssl.keystore.password = null ssl.keystore.type = JKS ssl.protocol = TLS ssl.provider = null ssl.secure.random.implementation = null ssl.trustmanager.algorithm = PKIX ssl.truststore.location = /x/x/client.truststore.jks ssl.truststore.password = [hidden] ssl.truststore.type = JKS value.deserializer = class org.apache.kafka.common.serialization.StringDeserializer Please help. *Thanks & Regards,* *Shantanu Deshmukh*