Hello! As you noticed, the validate method deliberately ignores the actual schema of a record datum and validates the field values by position. It's answering a slightly different question: whether the datum (and its contents) could fit in the given schema.
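To illustrate the positional behavior, here is a minimal, self-contained sketch (the class name and sample values are mine): a record built against a v1 schema passes validation against a v2 schema that renamed a field, because the value types still line up slot by slot.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericRecord;

public class ValidateByPosition {
    public static void main(String[] args) {
        Schema v1 = SchemaBuilder.record("User").fields()
                .requiredString("name")
                .requiredInt("age")
                .endRecord();
        Schema v2 = SchemaBuilder.record("User").fields()
                .requiredString("fullName")   // same position/type, different name
                .requiredInt("age")
                .endRecord();

        GenericRecord user = new GenericData.Record(v1);
        user.put("name", "Laurent");
        user.put("age", 42);

        // validate() walks the fields by position: a String in slot 0 and an
        // int in slot 1 satisfy v2 even though the field names differ.
        System.out.println(GenericData.get().validate(v2, user)); // true
    }
}
```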
For your use case, you might want to use the rules for schema compatibility:

    SchemaCompatibility.SchemaPairCompatibility compatibility =
        SchemaCompatibility.checkReaderWriterCompatibility(userv1.getSchema(), v2Schema);
    assertThat(compatibility.getType(),
        is(SchemaCompatibility.SchemaCompatibilityType.INCOMPATIBLE));

In your test, the built-in Avro schema resolution can't be used to convert the userv1 datum to the v2Schema, so it reports INCOMPATIBLE. If the V2 change were non-breaking (like adding a field with a default), the schemas would still be reported COMPATIBLE by that method.

Of course, if you just want to enforce that incoming records strictly match the reference schema and nothing else, you could simply check the two for equality:

    user.getSchema().equals(v2Schema)

Is this what you're looking for? I'm not familiar enough with records produced using the Confluent Schema Registry! I'm surprised this isn't available in the Kafka message metadata; you might want to check into their implementation.

All my best,
Ryan

On Tue, Jan 5, 2021 at 2:44 PM laurent broudoux <[email protected]> wrote:

> Hello,
>
> I need to validate that a GenericRecord (read from a Kafka topic) is valid
> against an Avro schema. This reference schema is not necessarily the one
> used for Kafka message deserialization, as that one was acquired through a
> Schema Registry.
>
> I had a look at GenericData.get().validate(schema, datum), but it does not
> behave as expected because it does not seem to validate record field names,
> only positions.
>
> Here's below a test case that represents the weird behaviour I am
> observing.
> I have used Avro 1.10.0 and 1.10.1, and both versions behave the same:
>
>     @Test
>     public void testGenericDataValidate() {
>       Schema v1Schema = SchemaBuilder.record("User").fields()
>           .requiredString("name")
>           .requiredInt("age")
>           .endRecord();
>       Schema v2Schema = SchemaBuilder.record("User").fields()
>           .requiredString("fullName")
>           .requiredInt("age")
>           .endRecord();
>
>       GenericRecord userv1 = new GenericData.Record(v1Schema);
>       userv1.put("name", "Laurent");
>       userv1.put("age", 42);
>
>       // The validate method succeeds because it does not validate the
>       // field names, just the positions... So the test fails.
>       assertFalse(GenericData.get().validate(v2Schema, userv1));
>     }
>
> This test corresponds to a real-life scenario I want to detect: a Kafka
> producer is still sending messages using the v1 schema, but we expect
> records following the v2 schema that introduced a breaking change
> (field rename).
>
> Is it a known / desired limitation of the validate() method of
> GenericData? Is there another way of achieving what I want to check?
>
> Thanks!
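To round out the non-breaking case Ryan mentions (a change that only adds a field with a default), here is a small sketch; the `email` field and its default value are made up for illustration. Reading v1 data with such a v2 schema is resolvable, so the compatibility check reports COMPATIBLE.

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaCompatibility;

public class NonBreakingChangeDemo {
    public static void main(String[] args) {
        Schema v1 = SchemaBuilder.record("User").fields()
                .requiredString("name")
                .requiredInt("age")
                .endRecord();
        // Hypothetical v2: same fields as v1, plus one new field with a default.
        Schema v2 = SchemaBuilder.record("User").fields()
                .requiredString("name")
                .requiredInt("age")
                .name("email").type().stringType().stringDefault("")
                .endRecord();

        // Reader = v2 (the reference schema), writer = v1 (what produced the data).
        // The added field is filled from its default during resolution.
        SchemaCompatibility.SchemaPairCompatibility result =
                SchemaCompatibility.checkReaderWriterCompatibility(v2, v1);
        System.out.println(result.getType()); // COMPATIBLE
    }
}
```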
