We are currently using the Kafka S3 connector to ship Avro data to S3. We recently 
changed one of our Avro schemas and noticed the connector's consumer throughput 
drop considerably. Is there anything we can do to avoid this kind of issue when we 
update schemas in the future?

This is what I believe is happening:


- The Avro producer application runs on 12 instances. They are restarted in a 
rolling fashion, switching from producing schema version 1 before the restart to 
schema version 2 afterward.

- While the rolling restart is in progress, records with schema version 1 and 
records with schema version 2 are being written to the topic at the same time.

- Whenever the connector detects a schema change, it has to close the current 
Avro file for a partition and ship it to S3. Because of the rolling deployment and 
the resulting mix of message versions, this happens many times in quick 
succession, and the overall consumer throughput plummets. (A rough sketch of the 
sink configuration I believe is in play follows this list.)

Am I reasoning correctly about what we’re observing here? Is there any way to 
avoid this when we change schemas (short of stopping all instances of the 
service and bringing them up together on the new schema version)?

Thanks,
Dave
