Thanks a lot Ryan! That's exactly what I was looking for.

I had not gone through the SchemaCompatibility APIs before and have just discovered that they are very powerful and can also return the list of incompatibilities. Great!
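For anyone landing on this thread later, here is a minimal sketch of what listing those incompatibilities could look like (assuming Avro 1.10.x as in my test; the getResult().getIncompatibilities() accessors are my reading of the SchemaCompatibility API and should be double-checked against the exact version in use):

import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;
import org.apache.avro.SchemaCompatibility;

public class ListIncompatibilities {
  public static void main(String[] args) {
    Schema v1Schema = SchemaBuilder.record("User").fields()
        .requiredString("name")
        .requiredInt("age")
        .endRecord();
    Schema v2Schema = SchemaBuilder.record("User").fields()
        .requiredString("fullName")
        .requiredInt("age")
        .endRecord();

    // Can a reader expecting v2Schema resolve data written with v1Schema?
    SchemaCompatibility.SchemaPairCompatibility compatibility =
        SchemaCompatibility.checkReaderWriterCompatibility(v2Schema, v1Schema);

    // Overall verdict: INCOMPATIBLE for this breaking field rename.
    System.out.println(compatibility.getType());

    // Each incompatibility carries a type, a message and a location in the
    // schema, e.g. READER_FIELD_MISSING_DEFAULT_VALUE for the renamed field.
    for (SchemaCompatibility.Incompatibility incompatibility
        : compatibility.getResult().getIncompatibilities()) {
      System.out.println(incompatibility.getType()
          + " at " + incompatibility.getLocation()
          + ": " + incompatibility.getMessage());
    }
  }
}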
Kind regards,
Laurent

On Tue, Jan 5, 2021 at 6:17 PM Ryan Skraba <[email protected]> wrote:

> Hello!
>
> As you noticed, the validate method deliberately ignores the actual schema
> of a record datum, and validates the field values by position. It's
> answering a slightly different question: whether the datum (and its
> contents) could fit in the given schema.
>
> For your use case, you might want to use the rules for schema
> compatibility:
>
> SchemaCompatibility.SchemaPairCompatibility compatibility =
>     SchemaCompatibility.checkReaderWriterCompatibility(userv1.getSchema(),
>         v2Schema);
> assertThat(compatibility.getType(),
>     is(SchemaCompatibility.SchemaCompatibilityType.INCOMPATIBLE));
>
> In your test, the built-in Avro schema resolution can't be used to convert
> the userv1 datum to the v2Schema, so it reports INCOMPATIBLE.
>
> If the V2 change were non-breaking (like adding a field with a default),
> then the schemas would still be reported COMPATIBLE with that method.
>
> Of course, if you just want to enforce that incoming records are strictly
> and only the reference schema, you could simply check the two for equality:
>
> user.getSchema().equals(v2Schema)
>
> Is this what you're looking for? I'm not familiar enough with records
> produced using the Confluent Schema Registry! I'm surprised this isn't
> available in Kafka message metadata; you might want to check into their
> implementation.
>
> All my best, Ryan
>
> On Tue, Jan 5, 2021 at 2:44 PM laurent broudoux <[email protected]> wrote:
>
>> Hello,
>>
>> I need to validate that a GenericRecord (read from a Kafka topic) is
>> valid regarding an Avro schema. This reference schema is not necessarily
>> the one used for Kafka message deserialization, as that one was acquired
>> through a Schema Registry.
>>
>> I had a look at GenericData.get().validate(schema, datum) but it does not
>> behave as expected because it does not seem to validate record field
>> names but only positions.
>>
>> Below is a test case that reproduces the weird behaviour I am observing.
>> I have used Avro 1.10.0 and 1.10.1 and both versions behave the same:
>>
>> @Test
>> public void testGenericDataValidate() {
>>   Schema v1Schema = SchemaBuilder.record("User").fields()
>>       .requiredString("name")
>>       .requiredInt("age")
>>       .endRecord();
>>   Schema v2Schema = SchemaBuilder.record("User").fields()
>>       .requiredString("fullName")
>>       .requiredInt("age")
>>       .endRecord();
>>
>>   GenericRecord userv1 = new GenericData.Record(v1Schema);
>>   userv1.put("name", "Laurent");
>>   userv1.put("age", 42);
>>
>>   // The validate method succeeds because it does not validate the field
>>   // names, just the positions... So the test fails.
>>   assertFalse(GenericData.get().validate(v2Schema, userv1));
>> }
>>
>> This test corresponds to a real-life scenario I want to detect: the Kafka
>> producer is still sending messages using the v1 schema, but we expect
>> records following the v2 schema that introduced a breaking change (field
>> rename).
>>
>> Is it a known / desired limitation of the validate() method of
>> GenericData? Is there another way of achieving what I want to check?
>>
>> Thanks!
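PS: to double-check Ryan's remark above about non-breaking changes, here is a quick sketch (same Avro 1.10.x assumption and imports as the snippet earlier in this mail; optionalString is, as far as I can tell, the SchemaBuilder shortcut for a nullable field with a null default) showing that merely adding such a field keeps the pair COMPATIBLE:

// v2 only adds an optional field with a default instead of renaming one.
Schema v1Schema = SchemaBuilder.record("User").fields()
    .requiredString("name")
    .requiredInt("age")
    .endRecord();
Schema v2WithDefault = SchemaBuilder.record("User").fields()
    .requiredString("name")
    .requiredInt("age")
    .optionalString("email")   // union of null/string with a null default
    .endRecord();

SchemaCompatibility.SchemaPairCompatibility compatibility =
    SchemaCompatibility.checkReaderWriterCompatibility(v2WithDefault, v1Schema);

// Reports COMPATIBLE: v1 records can still be resolved against v2WithDefault.
System.out.println(compatibility.getType());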
