[
https://issues.apache.org/jira/browse/AVRO-2002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15870374#comment-15870374
]
Doug Cutting commented on AVRO-2002:
------------------------------------
I believe you have misunderstood the semantics of fingerprints. Identical
fingerprints mean that one schema can read output of the other without schema
resolution, not that both can read a third using schema resolution.
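As a rough illustration of why two schemas that differ only in a `default` attribute end up with the same fingerprint, here is a minimal plain-Java sketch of the [STRIP] step from the spec text quoted below. It has no Avro dependency: the `CanonicalFormSketch` class, its `KEPT` set, and the map-based field representation are assumptions made for this example, not Avro's implementation (the real logic lives in SchemaNormalization).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class CanonicalFormSketch {
    // Attributes the spec's [STRIP] rule keeps; every other attribute is dropped.
    public static final Set<String> KEPT = new HashSet<>(Arrays.asList(
        "type", "name", "fields", "symbols", "items", "values", "size"));

    // Toy canonicalization of one field: keep only parsing-relevant attributes.
    public static Map<String, Object> strip(Map<String, Object> field) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : field.entrySet()) {
            if (KEPT.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> withDefault = new LinkedHashMap<>();
        withDefault.put("name", "newField");
        withDefault.put("type", "int");
        withDefault.put("default", 0);

        Map<String, Object> noDefault = new LinkedHashMap<>();
        noDefault.put("name", "newField");
        noDefault.put("type", "int");

        // Both fields strip to the same map, so a hash computed over the
        // canonical text cannot tell them apart.
        System.out.println(strip(withDefault).equals(strip(noDefault))); // prints: true
    }
}
```

Since the fingerprint is computed from the canonical form, anything removed by this step, including `default`, cannot influence it.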
Schema resolution permits interoperability (in some cases) between a pair of
schemas whose fingerprints do not match. The SchemaCompatibility class can
determine whether a pair of schemas can, through resolution, interoperate.
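As a rough illustration of that pairwise check, the following plain-Java sketch models only the single resolution rule at issue here: a reader field that is absent from the writer must declare a default. `ResolutionSketch`, `Field`, and `canRead` are hypothetical names invented for this example; real code should call SchemaCompatibility.checkReaderWriterCompatibility, as the reporter's program below does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ResolutionSketch {
    // Hypothetical stand-in for a record field: its name and whether it declares a default.
    public static final class Field {
        final String name;
        final boolean hasDefault;

        public Field(String name, boolean hasDefault) {
            this.name = name;
            this.hasDefault = hasDefault;
        }
    }

    // True if every reader field is either written by the writer or has a default.
    public static boolean canRead(List<Field> readerFields, Set<String> writerFieldNames) {
        for (Field f : readerFields) {
            if (!writerFieldNames.contains(f.name) && !f.hasDefault) {
                return false; // missing in writer and nothing to fall back on
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> writer = new HashSet<>(Arrays.asList("field"));
        List<Field> readerWithDefault = Arrays.asList(
            new Field("field", false), new Field("newField", true));
        List<Field> readerNoDefault = Arrays.asList(
            new Field("field", false), new Field("newField", false));

        System.out.println(canRead(readerWithDefault, writer)); // prints: true
        System.out.println(canRead(readerNoDefault, writer));   // prints: false
    }
}
```

The answer depends on the reader/writer pair, which is why it is a separate question from whether two schemas share a fingerprint.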
> Canonical form strip the default value : Schema resolution may provide 2
> different answers with same schema's fingerprint
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: AVRO-2002
> URL: https://issues.apache.org/jira/browse/AVRO-2002
> Project: Avro
> Issue Type: Bug
> Components: java
> Affects Versions: 1.8.1
> Reporter: Deslandes Hugues
>
> I understand that a schema's fingerprint uniquely identifies the Avro
> schema. The following example shows 2 different schemas with the same
> fingerprint but different behaviours: one can read the writer's data, the
> other one can't. I guess it is a bug, but maybe it's only a misinterpretation…
> Here are the details:
> First, the canonical form of an Avro schema is derived using this rule (see
> http://avro.apache.org/docs/1.8.1/spec.html#Transforming+into+Parsing+Canonical+Form):
> {quote}
> [STRIP] Keep only attributes that are relevant to parsing data, which are:
> type, name, fields, symbols, items, values, size. Strip all others (e.g., doc
> and aliases). {quote}
> So any default attribute is removed.
> On the other hand, schema resolution follows this particular rule (see
> http://avro.apache.org/docs/1.8.1/spec.html#Schema+Resolution):
> {quote}if the reader's record schema has a field with no default value, and
> writer's schema does not have a field with the same name, an error is
> signalled.{quote}
> To illustrate the situation with a simple schema (the writer), I have created
> a new version by adding a new field to the schema, in 2 variants: one has a
> default attribute and value, the other one doesn't. The first one can read
> data written with the old writer version, the second one can't.
> In other words, the canonical form does not take into account any default
> attribute on the record fields, but the resolution algorithm uses the default
> attribute to evaluate compatibility. The conclusion is that 2 schemas
> that differ only in a default attribute have the same fingerprint, yet one is
> compatible with the writer schema and the other one is not.
> I understand why the behaviours differ, but not why the fingerprint is the
> same.
> I would suggest that the canonical form not strip the default attribute
> (while still stripping the default value, which should not interfere with
> compatibility).
> The immediate workaround I will use is to systematically use a default value
> for any additional field.
> {code:linenumbers=true|language=java}
> package Main;
>
> import java.util.Collections;
>
> import org.apache.avro.Schema;
> import org.apache.avro.SchemaCompatibility;
> import org.apache.avro.SchemaNormalization;
> import org.apache.avro.SchemaValidationException;
> import org.apache.avro.SchemaValidator;
> import org.apache.avro.SchemaValidatorBuilder;
>
> public class Main {
>     public static void main(String[] args) {
>         // Writer schema: the original version, with a single field.
>         Schema schemaWriter = new Schema.Parser().parse(
>             "{\"type\":\"record\",\"name\":\"ExampleAvro\",\"fields\":[{\"name\":\"field\",\"type\":\"long\"}]}");
>         // Reader schema: adds newField with a default.
>         Schema schemaReader = new Schema.Parser().parse(
>             "{\"type\":\"record\",\"name\":\"ExampleAvro\",\"fields\":[{\"name\":\"field\",\"type\":\"long\"},{\"name\":\"newField\",\"type\":\"int\",\"default\":0}]}");
>         // Reader schema: adds newField without a default.
>         Schema schemaReaderNoDefault = new Schema.Parser().parse(
>             "{\"type\":\"record\",\"name\":\"ExampleAvro\",\"fields\":[{\"name\":\"field\",\"type\":\"long\"},{\"name\":\"newField\",\"type\":\"int\"}]}");
>
>         long fpWriter = SchemaNormalization.parsingFingerprint64(schemaWriter);
>         long fpReader = SchemaNormalization.parsingFingerprint64(schemaReader);
>         long fpReaderNoDefault = SchemaNormalization.parsingFingerprint64(schemaReaderNoDefault);
>
>         System.out.println("Schema writer " + fpWriter + " " + schemaWriter);
>         System.out.println("Schema reader " + fpReader + " " + schemaReader);
>         System.out.println("Schema readerNoDefault " + fpReaderNoDefault + " " + schemaReaderNoDefault);
>
>         // Check compatibility, method 1: SchemaCompatibility
>         String res = SchemaCompatibility
>             .checkReaderWriterCompatibility(schemaReader, schemaWriter).getType().toString();
>         String resNoDefault = SchemaCompatibility
>             .checkReaderWriterCompatibility(schemaReaderNoDefault, schemaWriter).getType().toString();
>
>         System.out.println(fpReader + " is " + res + " with " + fpWriter);
>         System.out.println(fpReaderNoDefault + " is " + resNoDefault + " with " + fpWriter);
>
>         // Check compatibility, method 2: SchemaValidator with the canRead strategy
>         SchemaValidator validator = new SchemaValidatorBuilder().canReadStrategy().validateAll();
>         String isCompatible = "";
>         try {
>             validator.validate(schemaReaderNoDefault, Collections.singletonList(schemaWriter));
>         } catch (SchemaValidationException e) {
>             isCompatible = "not ";
>         }
>         System.out.println(fpReaderNoDefault + " is " + isCompatible + "compatible with " + fpWriter);
>
>         isCompatible = "";
>         try {
>             validator.validate(schemaReader, Collections.singletonList(schemaWriter));
>         } catch (SchemaValidationException e) {
>             isCompatible = "not ";
>         }
>         System.out.println(fpReader + " is " + isCompatible + "compatible with " + fpWriter);
>         System.out.println("------------");
>     }
>
>     // The output is (each "Schema ..." entry is printed on one line; wrapped here):
>     // Schema writer 8957007963871099370
>     //   {"type":"record","name":"ExampleAvro","fields":[{"name":"field","type":"long"}]}
>     // Schema reader 489516346825099350
>     //   {"type":"record","name":"ExampleAvro","fields":[{"name":"field","type":"long"},{"name":"newField","type":"int","default":0}]}
>     // Schema readerNoDefault 489516346825099350
>     //   {"type":"record","name":"ExampleAvro","fields":[{"name":"field","type":"long"},{"name":"newField","type":"int"}]}
>     // 489516346825099350 is COMPATIBLE with 8957007963871099370
>     // 489516346825099350 is INCOMPATIBLE with 8957007963871099370
>     // 489516346825099350 is not compatible with 8957007963871099370
>     // 489516346825099350 is compatible with 8957007963871099370
> }
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)