Ambiguous reference

Chad Parry Tue, 03 Jun 2025 15:49:51 -0700

It is possible to construct an ambiguous schema using the latest Avro specification. Before I file a JIRA issue, I want to check whether this is a known deficiency. I believe this is a bug in the specification, not any particular implementation.

Types can be defined in the null namespace, and then those types can be referenced later. Such a reference would not contain any dots. For example, if we define the type "Target" in the null namespace, we can refer to it with the fullname "Target". However, the specification says that when a reference has no dot, "the namespace is the namespace of the enclosing definition." That means we could define a different type "Target" in the namespace "org.apache.avro". It could be referenced with the fullname "org.apache.avro.Target". If the enclosing namespace is already "org.apache.avro", then it could also be referenced with the simple name "Target". The problem arises when a single schema includes both those types, and "Target" is a valid reference to either one.
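To make the two readings concrete, here is a minimal sketch that does not depend on Avro at all. The class ReferenceCandidates and its candidates method are hypothetical names invented for illustration, and modeling the defined types as a set of fullname strings (with null-namespace types stored under their bare name) is an assumption of the sketch, not how any implementation necessarily stores them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of the Avro API. */
final class ReferenceCandidates {
    /**
     * Lists every defined fullname that a reference could denote when it is
     * read from inside the given enclosing namespace. Types in the null
     * namespace are stored under their bare name.
     */
    static List<String> candidates(Set<String> defined, String enclosingNamespace, String reference) {
        List<String> result = new ArrayList<>();
        // Reading 1: the reference is already a fullname, possibly in the null namespace.
        if (defined.contains(reference)) {
            result.add(reference);
        }
        // Reading 2: the reference is a simple name that inherits the enclosing namespace.
        boolean dotless = !reference.contains(".");
        boolean hasNamespace = enclosingNamespace != null && !enclosingNamespace.isEmpty();
        if (dotless && hasNamespace) {
            String qualified = enclosingNamespace + "." + reference;
            if (defined.contains(qualified)) {
                result.add(qualified);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> defined = Set.of("Target", "org.apache.avro.Target");
        // Read from the null namespace, the reference has a single reading.
        System.out.println(candidates(defined, null, "Target"));
        // Read from inside org.apache.avro, the same reference has two readings.
        System.out.println(candidates(defined, "org.apache.avro", "Target"));
    }
}
```

Whenever the returned list has more than one entry, the reference is ambiguous in the sense described above.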

In short, it is impossible to distinguish a qualified name that happens to be in the null namespace from a simple name. The specification creates this problem by neglecting the null namespace when it defines a fullname as "composed of two parts: a name and a namespace, separated by a dot."

This could be solved by simply resolving all ambiguities in favor of the null namespace reference. For example, the reference "Target" should be interpreted as a fullname if such a type exists and as a simple name otherwise. If the author didn't intend to reference into the null namespace, then they can unambiguously use a fullname reference instead. Any solution will create compatibility concerns, so first I just want to discuss whether this is believed to be a problem.
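As a rough sketch of that rule, and not a patch against any implementation, the lookup could go as follows. NullNamespaceFirstResolver and resolve are hypothetical names, and the set of fullname strings (with null-namespace types stored under their bare name) is again just an assumption made for illustration.

```java
import java.util.Optional;
import java.util.Set;

/** Hypothetical sketch of the proposed rule; not part of the Avro API. */
final class NullNamespaceFirstResolver {
    /**
     * Resolves a reference to a defined fullname, preferring the null
     * namespace whenever a dotless reference matches a type defined there.
     */
    static Optional<String> resolve(Set<String> defined, String enclosingNamespace, String reference) {
        // A dotted reference is always a fullname. A dotless reference that
        // matches a null-namespace type is also treated as a fullname.
        if (reference.contains(".") || defined.contains(reference)) {
            return defined.contains(reference) ? Optional.of(reference) : Optional.empty();
        }
        // Otherwise the reference is a simple name in the enclosing namespace.
        if (enclosingNamespace == null || enclosingNamespace.isEmpty()) {
            return Optional.empty();
        }
        String qualified = enclosingNamespace + "." + reference;
        return defined.contains(qualified) ? Optional.of(qualified) : Optional.empty();
    }
}
```

Under this rule, "Target" read from inside "org.apache.avro" resolves to the null-namespace type whenever one is defined, and an author who wants the other type writes "org.apache.avro.Target" explicitly.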

The following complete test case illustrates how this issue leads to data corruption with the Java API. Note that the Java implementation neither detects the ambiguity nor resolves it the way I am recommending.


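    // Imports needed by the test (Avro and JUnit 5):
    // import org.apache.avro.Schema;
    // import org.apache.avro.SchemaBuilder;
    // import org.junit.jupiter.api.Assertions;
    // import org.junit.jupiter.api.Test;

    @Test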
    void testAmbiguousReference() {
        final Schema target = SchemaBuilder.builder()
                .record("Target")
                .doc("right")
                .fields()
                .endRecord();
        final Schema decoy = SchemaBuilder.builder()
                .record(target.getName())
                .namespace("org.apache.avro")
                .doc("wrong")
                .fields()
                .endRecord();
        final Schema ambiguous = SchemaBuilder.builder()
                .record("Ambiguous")
                .fields()
                    .name("definition")
                        .type(target)
                        .noDefault()
                    .name("working")
                        .type(target)
                        .noDefault()
                    .name("enclosing")
                        .type(SchemaBuilder.builder()
                                .record("Enclosing")
                                .namespace("org.apache.avro")
                                .fields()
                                    .name("decoy")
                                        .type(decoy)
                                        .noDefault()
                                    .name("working")
                                        .type(decoy)
                                        .noDefault()
                                    .name("broken")
                                        .type(target)
                                        .noDefault()
                                .endRecord())
                        .noDefault()
                .endRecord();
        final Schema parsed = new Schema.Parser().parse(
                ambiguous.toString());
        // This assertion succeeds.
        Assertions.assertEquals(
                ambiguous.getField("working").schema(),
                parsed.getField("working").schema());
        // This assertion succeeds but the specification is unclear.
        Assertions.assertEquals(
                ambiguous.getField("enclosing").schema()
                        .getField("working").schema(),
                parsed.getField("enclosing").schema()
                        .getField("working").schema());
        // This assertion FAILS.
        Assertions.assertEquals(
                ambiguous.getField("enclosing").schema()
                        .getField("broken").schema(),
                parsed.getField("enclosing").schema()
                        .getField("broken").schema());
    }

The assertion failure message complains:

expected: <{"type":"record","name":"Target","doc":"right","fields":[]}> but was: <{"type":"record","name":"Target","namespace":"org.apache.avro","doc":"wrong","fields":[]}>
