Hi Ryan,

Would you please share your POC?

Thank you.

On 2020/07/22 22:54:04, Ryan Schachte <[email protected]> wrote: 
> Hello everyone,
> We are facing an interesting use case with respect to Avro and
> deserialization.
> 
> As part of one of our systems, we get triggers to pull raw Avro bytes out
> of our data layer and deserialize them. For many months, we had no issues
> with this. The deserialization was performed with the latest reader schema
> alongside the specific datum reader.
> 
> Recently, the schema of one of the relevant objects was updated. The change
> was deemed backward-transitive from a registry perspective; however,
> deserialization began to fail. Digging deeper, we found that the
> deserialization was explicitly casting fields based on the field-level
> ordering of the object at the root level. To clarify further: once we
> compiled an adjacent object matching the Avro schema, we noticed that the
> fields in some of the case statements rely on this ordering, which breaks
> our deserialization flow.
> 
> To mitigate this issue, we added some hacks that use both the reader and
> writer schemas in tandem to perform deserialization, but running this
> operation on billions of records has severely degraded our performance.
> 
> My question is, how should we handle this situation on our end? I'm happy
> to further elaborate on the problem and provide examples as well.
> 
> Thanks so much,
> Ryan Schachte
> 
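
For anyone following the thread, the failure mode described in the quoted
message (decoding by field position versus resolving fields by name against
the writer schema) can be illustrated with a toy sketch. It does not use the
Avro library: the schemas, field names, and helper functions below are all
hypothetical, and a record is modeled as a plain list of values. Caching the
resolution plan per (writer, reader) pair is one common way to avoid redoing
the resolution for every record; whether that would help with the performance
problem described above is an assumption.

```python
from functools import lru_cache

# Hypothetical schemas, modeled as ordered tuples of field names.
WRITER_V1 = ("id", "name")
READER_V2 = ("name", "id", "email")   # fields reordered, one new field added
READER_DEFAULTS = {"email": None}     # default for the field the writer lacks


def decode_positional(values, reader_fields):
    """Naive decode: assumes values were written in the reader's field order."""
    return dict(zip(reader_fields, values))


@lru_cache(maxsize=None)
def resolution_plan(writer_fields, reader_fields):
    """For each reader field, the writer position to read from (None if absent).

    Computed once per (writer, reader) pair and reused for later records.
    """
    position = {name: i for i, name in enumerate(writer_fields)}
    return tuple(position.get(name) for name in reader_fields)


def decode_resolved(values, writer_fields, reader_fields, defaults):
    """Decode by field name, using the cached plan and reader defaults."""
    plan = resolution_plan(writer_fields, reader_fields)
    record = {}
    for name, idx in zip(reader_fields, plan):
        record[name] = values[idx] if idx is not None else defaults[name]
    return record


raw = [42, "ryan"]  # a record written with WRITER_V1

wrong = decode_positional(raw, READER_V2)
right = decode_resolved(raw, WRITER_V1, READER_V2, READER_DEFAULTS)
```

In this sketch, `wrong` assigns the values to the wrong field names because
the order changed, while `right` maps each value by name and fills the missing
field from the reader's default.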
