Hello,
I need to check that a GenericRecord (read from a Kafka topic) conforms
to a given Avro schema. This reference schema is not necessarily the one
used to deserialize the Kafka message, since that one is obtained from a
Schema Registry.
I looked at GenericData.get().validate(schema, datum), but it does not
behave as I expected: it does not seem to check record field names, only
field positions.
Below is a test case that reproduces the behaviour I am observing. I tried
Avro 1.10.0 and 1.10.1, and both versions behave the same:
@Test
public void testGenericDataValidate() {
    // v1: the schema the producer is still using.
    Schema v1Schema = SchemaBuilder.record("User").fields()
        .requiredString("name")
        .requiredInt("age")
        .endRecord();
    // v2: the schema we expect; "name" was renamed to "fullName".
    Schema v2Schema = SchemaBuilder.record("User").fields()
        .requiredString("fullName")
        .requiredInt("age")
        .endRecord();

    GenericRecord userv1 = new GenericData.Record(v1Schema);
    userv1.put("name", "Laurent");
    userv1.put("age", 42);

    // validate() succeeds because it does not seem to check the field
    // name, only the position, so this assertion (and the test) fails.
    assertFalse(GenericData.get().validate(v2Schema, userv1));
}
This test corresponds to a real-life scenario I want to detect: a Kafka
producer is still sending messages using the v1 schema, but we expect
records following the v2 schema, which introduced a breaking change
(a field rename).
Is this a known or intended limitation of GenericData's validate() method?
Is there another way to achieve the check I want?
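To make the distinction concrete, here is a minimal, Avro-free sketch of the two kinds of check. Everything in it is a hypothetical stand-in rather than Avro API: the `Field` class only models a field as a name plus a type label, `matchesByPosition` imitates the position-only behaviour I seem to be observing, and `matchesByName` is the stricter check I am looking for.

```java
import java.util.Arrays;
import java.util.List;

public class FieldNameCheck {

    // Hypothetical stand-in for a schema field: a name and a type label.
    public static final class Field {
        public final String name;
        public final String type;

        public Field(String name, String type) {
            this.name = name;
            this.type = type;
        }
    }

    // Position-only check: the i-th types must match; names are ignored.
    public static boolean matchesByPosition(List<Field> expected, List<Field> actual) {
        if (expected.size() != actual.size()) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!expected.get(i).type.equals(actual.get(i).type)) {
                return false;
            }
        }
        return true;
    }

    // Name-aware check: the i-th types AND the i-th names must match.
    public static boolean matchesByName(List<Field> expected, List<Field> actual) {
        if (!matchesByPosition(expected, actual)) {
            return false;
        }
        for (int i = 0; i < expected.size(); i++) {
            if (!expected.get(i).name.equals(actual.get(i).name)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Field> v1 = Arrays.asList(new Field("name", "string"), new Field("age", "int"));
        List<Field> v2 = Arrays.asList(new Field("fullName", "string"), new Field("age", "int"));

        // The rename goes unnoticed when only positions and types are compared.
        System.out.println("by position: " + matchesByPosition(v2, v1));
        // Comparing names as well flags the v1 record as not matching v2.
        System.out.println("by name:     " + matchesByName(v2, v1));
    }
}
```

With real Avro objects, I assume the same idea would mean comparing the field names of the record's own schema with those of the reference schema, in addition to calling validate().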
Thanks!