Hi, does anyone have advice on how to deal with this issue? Is it possible that 
changing a schema compatibility setting could correct it?
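
For reference, this is the kind of sink-connector change I have in mind. It is
only a sketch: the topic, bucket, and flush size are placeholders, and
schema.compatibility is the one setting I'm actually asking about.

    # Hypothetical S3 sink connector properties (placeholders except where noted)
    name=s3-sink
    connector.class=io.confluent.connect.s3.S3SinkConnector
    storage.class=io.confluent.connect.s3.storage.S3Storage
    format.class=io.confluent.connect.s3.format.avro.AvroFormat
    # placeholder topic and bucket
    topics=events
    s3.bucket.name=my-bucket
    # placeholder flush size
    flush.size=10000
    # The setting in question: as I understand it, BACKWARD would let the
    # connector project older-schema records onto the latest schema instead of
    # rotating the output file on every detected schema change.
    schema.compatibility=BACKWARD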

Thanks,
Dave


On 5/26/17, 1:44 PM, "Dave Hamilton" <dhamil...@nanigans.com> wrote:

    We are currently using the Kafka S3 connector to ship Avro data to S3. We
made a change to one of our Avro schemas and have noticed consumer throughput
on the Kafka connector drop considerably. Is there anything we can do to avoid
issues like this when we update schemas in the future?
    
    This is what I believe is happening:
    
    
    - The Avro producer application is running on 12 instances. They are
      restarted in a rolling fashion, switching from producing schema version 1
      before the restart to schema version 2 afterward (a hypothetical version 1 /
      version 2 pair is sketched after this list).
    
    - While the rolling restart is occurring, data on schema version 1 and
      schema version 2 is simultaneously being written to the topic.
    
    - The Kafka connector has to close the current Avro file for a partition and
      ship it whenever it detects a schema change. Because of the rolling
      deployment and the mixture of message versions being written during it,
      this happens several times, and the overall consumer throughput plummets.
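
    To make the schema version 1 / version 2 wording concrete, here is a made-up
    example (not our actual schema) in which version 2 adds a field with a
    default, the kind of change a schema-projecting consumer could reconcile:

        Version 1 (hypothetical):
        {"type": "record", "name": "Event", "fields": [
            {"name": "id", "type": "string"},
            {"name": "ts", "type": "long"}
        ]}

        Version 2 (hypothetical), adding a field with a default so that
        version 1 records can still be projected onto it:
        {"type": "record", "name": "Event", "fields": [
            {"name": "id", "type": "string"},
            {"name": "ts", "type": "long"},
            {"name": "source", "type": "string", "default": "unknown"}
        ]}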
    
    Am I reasoning correctly about what we’re observing here? Is there any way 
to avoid this when we change schemas (short of stopping all instances of the 
service and bringing them up together on the new schema version)?
    
    Thanks,
    Dave
    
    
