I'm debugging a nasty problem that occurs down in the Avro 1.4.1 code. Sometimes, when I read my serialized data back into a generic datum object, I crash deep inside the Avro code. The call stack shows that the parser has been walking down my data structure until it reaches a string node, which it tries to read using BinaryDecoder.readString. That call retrieves an invalid string length (e.g. a negative number) and the process subsequently crashes with an ArrayIndexOutOfBoundsException.
The exact origin of this bug is mysterious to me, but at a high level the problem appears to be that I wrote the data with one schema and mistakenly read it back with a different schema. How exactly that happened is also unclear, but it looks like my mechanism for supporting projection schemas didn't behave as it should have. The two schemas in question are mostly the same; in fact, one is a subset of the other.

1. In general, is it possible for a schema-to-data mismatch to cause a crash down in the Avro code of the sort I described?
2. If the answer to (1) is "yes", would you only expect the crash when the data was written with the superset schema and read back with the subset schema?
3. Writing with a superset schema and reading with a subset schema should always work, because that is just projection, correct?
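For reference, this is roughly how I believe the projection read is supposed to be set up; this is only a sketch with simplified, hypothetical names (full.avsc, projection.avsc, ProjectionRead), not my actual code:

    import org.apache.avro.Schema;
    import org.apache.avro.generic.GenericDatumReader;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.avro.io.BinaryDecoder;
    import org.apache.avro.io.DecoderFactory;

    import java.io.File;
    import java.io.IOException;

    public class ProjectionRead {
        public static GenericRecord readProjected(byte[] bytes) throws IOException {
            // writerSchema: the full (superset) schema the bytes were serialized with.
            // readerSchema: the projection (subset) schema I actually want back.
            Schema writerSchema = Schema.parse(new File("full.avsc"));
            Schema readerSchema = Schema.parse(new File("projection.avsc"));

            // Passing BOTH schemas lets Avro resolve the projection. My suspicion
            // is that somewhere I constructed the reader with only the subset
            // schema, e.g. new GenericDatumReader<GenericRecord>(readerSchema),
            // so the decoder walked the bytes with the wrong layout and
            // readString() pulled a garbage length.
            GenericDatumReader<GenericRecord> reader =
                new GenericDatumReader<GenericRecord>(writerSchema, readerSchema);

            BinaryDecoder decoder =
                DecoderFactory.defaultFactory().createBinaryDecoder(bytes, null);
            return reader.read(null, decoder);
        }
    }

Thanks.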
