Evelyn Bayes created KAFKA-9468:
-----------------------------------
Summary: config.storage.topic partition count issue is hard to
debug
Key: KAFKA-9468
URL: https://issues.apache.org/jira/browse/KAFKA-9468
Project: Kafka
Issue Type: Improvement
Components: KafkaConnect
Affects Versions: 2.3.1, 2.4.0, 2.2.2, 2.1.1, 2.0.1, 1.1.1, 1.0.2
Reporter: Evelyn Bayes
When you run Connect in distributed mode with 2 or more workers and
config.storage.topic has more than 1 partition, you can end up with one of the
workers rebalancing endlessly:
[2020-01-13 12:53:23,535] INFO [Worker clientId=connect-1,
groupId=connect-cluster] Current config state offset 37 is behind group
assignment 63, reading to end of config log
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1,
groupId=connect-cluster] Finished reading to end of log and updated config
snapshot, new config log offset: 37
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
[2020-01-13 12:53:23,584] INFO [Worker clientId=connect-1,
groupId=connect-cluster] Current config state offset 37 does not match group
assignment 63. Forcing rebalance.
(org.apache.kafka.connect.runtime.distributed.DistributedHerder)
*Suggested Solution*
Make the Connect worker check the partition count of config.storage.topic when
it starts; if the partition count is greater than 1, Kafka Connect should stop
and log the reason why.
I think this is reasonable, as it would stop users who are just starting out
from creating the topic incorrectly, and the mistake would be easy to fix
early. For those upgrading, the failure would easily be caught in a PRE-PROD
environment. Even if they upgraded directly in PROD, they would only be
impacted if they upgraded all Connect workers at the same time.
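A minimal sketch of the proposed fail-fast check. All names here (the class,
the method, and the error message) are invented for illustration and are not
taken from the Connect codebase; a real implementation would obtain the
partition count from the broker (e.g. via an admin describe-topics call)
before calling something like this:

```java
// Hypothetical startup validation for config.storage.topic.
// Class and method names are illustrative, not from Kafka Connect.
public class ConfigTopicValidator {

    /**
     * Fail fast if the config topic has more than one partition.
     * Connect relies on a single partition so that config records
     * are totally ordered; with multiple partitions, workers can
     * read inconsistent offsets and rebalance endlessly.
     */
    public static void validatePartitionCount(String topic, int partitionCount) {
        if (partitionCount != 1) {
            throw new IllegalStateException(
                "Topic '" + topic + "' is used as config.storage.topic but has "
                + partitionCount + " partitions; it must have exactly one. "
                + "Recreate the topic with a single partition before starting "
                + "Kafka Connect.");
        }
    }
}
```

Throwing during startup (rather than logging a warning and continuing) is the
point of the suggestion: the worker never joins the group, so the endless
rebalance loop shown in the log excerpt above cannot begin.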
--
This message was sent by Atlassian Jira
(v8.3.4#803005)