Hello,

I need to check that a GenericRecord (read from a Kafka topic) conforms to
a given Avro schema. This reference schema is not necessarily the one used
to deserialize the Kafka message, since that one is obtained from a
Schema Registry.

I had a look at GenericData.get().validate(schema, datum), but it does not
behave as I expected: it seems to check record fields by position only,
not by name.

Below is a test case that reproduces the behaviour I am observing. I tried
Avro 1.10.0 and 1.10.1, and both versions behave the same:

@Test
public void testGenericDataValidate() {
   Schema v1Schema = SchemaBuilder.record("User").fields()
         .requiredString("name")
         .requiredInt("age")
         .endRecord();
   Schema v2Schema = SchemaBuilder.record("User").fields()
         .requiredString("fullName")
         .requiredInt("age")
         .endRecord();

   GenericRecord userv1 = new GenericData.Record(v1Schema);
   userv1.put("name", "Laurent");
   userv1.put("age", 42);

   // validate() returns true because it checks fields by position only,
   // not by name, so this assertion fails.
   assertFalse(GenericData.get().validate(v2Schema, userv1));
}

This test corresponds to a real-life scenario I want to detect: a Kafka
producer is still sending messages using the v1 schema, but we expect
records following the v2 schema, which introduced a breaking change
(a field rename).

Is this a known or intended limitation of GenericData.validate()? Is there
another way to achieve the check I am after?
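
For what it is worth, the workaround I am considering is a field-name check
run before validate(). The sketch below is only an illustration under my own
assumptions: FieldNameCheck and missingFields are hypothetical names, and it
works on plain lists of names so that it stays dependency-free. With Avro, I
assume the two lists would be built from Schema.getFields() and
Schema.Field.name(), using v2Schema for the expected names and
userv1.getSchema() for the actual ones.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FieldNameCheck {

    // Returns the field names that the expected schema declares but that
    // the record's own schema does not contain. An empty result means every
    // expected name is present (field types are NOT checked here).
    public static List<String> missingFields(List<String> expectedNames,
                                             List<String> actualNames) {
        List<String> missing = new ArrayList<>();
        for (String name : expectedNames) {
            if (!actualNames.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Field names of the v1 and v2 "User" schemas from the test case.
        List<String> v1Names = Arrays.asList("name", "age");
        List<String> v2Names = Arrays.asList("fullName", "age");

        // A v1 record checked against the v2 schema lacks "fullName".
        System.out.println(missingFields(v2Names, v1Names)); // prints [fullName]
    }
}
```

A record would then be rejected when missingFields(...) is non-empty, and
only passed to validate() otherwise. This catches the rename in my test case,
but it is a stopgap rather than a real schema compatibility check.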

Thanks!
