Hi Chad,

This ambiguity has indeed been brought up before, and the spec and the Java implementation are not in sync on this.
Originally, the spec followed Java packages on this: from inside a package, it's not possible to reference a class in the default/null package. This also means that in Avro, a simple name is first resolved against the current namespace. Resolving a simple name in the null namespace was implemented as a courtesy, but names in the current namespace take precedence. Otherwise, using namespaces would be cumbersome and namespaces would become an unused feature.

What we should do, IMHO, is fix the spec to explicitly mention the null namespace in a full name (and make sure the rest of the text is consistent). Then, the Java SDK should be fixed to write a full name in the null namespace as ".Target" instead of "Target".

Kind regards,
Oscar

On Wed, 4 Jun 2025 at 18:15, Chad Parry <c...@parry.org> wrote:

> On 6/4/25 04:09, Martin Grigorov wrote:
> > Hi Chad,
> >
> > https://avro.apache.org/docs/++version++/specification/#names says: "The
> > empty string may also be used as a namespace to indicate the null
> > namespace."
>
> I interpreted that sentence to be describing the standalone "namespace"
> attribute used in type definitions, not the type name used in references.
>
> > That is, by using an empty string for the namespace you can help the
> > resolver to find the schema definition. I.e. you can use `name: ".Target"`
> > to say "I want to use the top-level Target schema"
>
> The very next paragraph directly contradicts that: "The null namespace
> may not be used in a dot-separated sequence of names." If your
> interpretation is correct, then the word "not" needs to be removed. Or
> if that sentence is only intended to restrict the standalone "namespace"
> attribute, then it should explicitly allow the use of a null namespace
> in a dot-separated fullname. An example type name with a leading dot
> would be welcome.
>
> I tested the Java API, and I'm surprised to see that it does partially
> conform to your interpretation.
> The Schema implementation serializes
> references into the null namespace incorrectly, causing the data
> corruption I illustrated. A bug would have to be filed for that. But the
> Parser implementation does accept ".Target" as a type name in references.
>
> >
> > On Wed, Jun 4, 2025 at 1:49 AM Chad Parry <c...@parry.org> wrote:
> >
> >> It is possible to construct an ambiguous schema using the latest Avro
> >> specification. Before I file a JIRA issue, I want to check whether this
> >> is a known deficiency. I believe this is a bug in the specification, not
> >> any particular implementation.
> >>
> >> Types can be defined in the null namespace, and then those types can be
> >> referenced later. Such a reference would not contain any dots. For
> >> example, if we define the type "Target" in the null namespace, we can
> >> refer to it with the fullname "Target". However, the specification says
> >> that when a reference has no dot, "the namespace is the namespace of the
> >> enclosing definition." That means we could define a different type
> >> "Target" in the namespace "org.apache.avro". It could be referenced with
> >> the fullname "org.apache.avro.Target". If the enclosing namespace is
> >> already "org.apache.avro", then it could also be referenced with the
> >> simple name "Target". The problem arises when a single schema includes
> >> both those types, and "Target" is a valid reference to either one.
> >>
> >> In short, it is impossible to distinguish a qualified name that happens
> >> to be in the null namespace from a simple name. The specification
> >> creates this problem by neglecting the null namespace when it defines a
> >> fullname as "composed of two parts: a name and a namespace, separated by
> >> a dot."
> >>
> >> This could be solved by simply resolving all ambiguities in favor of the
> >> null namespace reference.
> >> For example, the reference "Target" should be
> >> interpreted as a fullname if such a type exists and as a simple name
> >> otherwise. If the author didn't intend to reference into the null
> >> namespace, then they can unambiguously use a fullname reference instead.
> >> Any solution will create compatibility concerns, so first I just want to
> >> discuss whether this is believed to be a problem.
> >>
> >> The following complete test case illustrates how this issue leads to
> >> data corruption with the Java API. Note that the Java implementation
> >> neither detects the ambiguity nor resolves it the way I am recommending.
> >>
> >> @Test
> >> void testAmbiguousReference() {
> >>     final Schema target = SchemaBuilder.builder()
> >>         .record("Target")
> >>         .doc("right")
> >>         .fields()
> >>         .endRecord();
> >>     final Schema decoy = SchemaBuilder.builder()
> >>         .record(target.getName())
> >>         .namespace("org.apache.avro")
> >>         .doc("wrong")
> >>         .fields()
> >>         .endRecord();
> >>     final Schema ambiguous = SchemaBuilder.builder()
> >>         .record("Ambiguous")
> >>         .fields()
> >>         .name("definition")
> >>         .type(target)
> >>         .noDefault()
> >>         .name("working")
> >>         .type(target)
> >>         .noDefault()
> >>         .name("enclosing")
> >>         .type(SchemaBuilder.builder()
> >>             .record("Enclosing")
> >>             .namespace("org.apache.avro")
> >>             .fields()
> >>             .name("decoy")
> >>             .type(decoy)
> >>             .noDefault()
> >>             .name("working")
> >>             .type(decoy)
> >>             .noDefault()
> >>             .name("broken")
> >>             .type(target)
> >>             .noDefault()
> >>             .endRecord())
> >>         .noDefault()
> >>         .endRecord();
> >>     final Schema parsed = new Schema.Parser().parse(
> >>         ambiguous.toString());
> >>     // This assertion succeeds.
> >>     Assertions.assertEquals(
> >>         ambiguous.getField("working").schema(),
> >>         parsed.getField("working").schema());
> >>     // This assertion succeeds but the specification is unclear.
> >>     Assertions.assertEquals(
> >>         ambiguous.getField("enclosing").schema()
> >>             .getField("working").schema(),
> >>         parsed.getField("enclosing").schema()
> >>             .getField("working").schema());
> >>     // This assertion FAILS.
> >>     Assertions.assertEquals(
> >>         ambiguous.getField("enclosing").schema()
> >>             .getField("broken").schema(),
> >>         parsed.getField("enclosing").schema()
> >>             .getField("broken").schema());
> >> }
> >>
> >> The assertion failure message complains:
> >> expected: <{"type":"record","name":"Target","doc":"right","fields":[]}>
> >> but was:
> >> <{"type":"record","name":"Target","namespace":"org.apache.avro","doc":"wrong","fields":[]}>
> >>
> >

--
✉️ Oscar Westra van Holthe - Kind <os...@westravanholthe.nl>
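
For readers following the thread, the resolution order Oscar describes can be sketched in Java. This is only an illustration under stated assumptions, not the Avro implementation: the class NameResolutionSketch, the method resolve, and the convention of storing null-namespace names with a leading dot (".Target") are all hypothetical. The sketch treats a leading dot as an explicit null-namespace reference, tries the enclosing namespace first for a simple name, and falls back to the null namespace only as a courtesy.

```java
import java.util.Set;

/** Hypothetical sketch of the discussed resolution order; not part of the Avro API. */
final class NameResolutionSketch {

    /**
     * Resolves a type reference to a full name.
     * Assumption: names in the null namespace are stored with a leading dot.
     */
    static String resolve(String reference, String enclosingNamespace, Set<String> definedNames) {
        // A leading dot is an explicit reference into the null namespace.
        if (reference.startsWith(".")) {
            return requireDefined(reference, definedNames);
        }
        // Any other dot means the reference is already a full name.
        if (reference.indexOf('.') >= 0) {
            return requireDefined(reference, definedNames);
        }
        // Simple name: the enclosing namespace takes precedence.
        if (enclosingNamespace != null && !enclosingNamespace.isEmpty()) {
            String qualified = enclosingNamespace + "." + reference;
            if (definedNames.contains(qualified)) {
                return qualified;
            }
        }
        // Courtesy fallback: look the simple name up in the null namespace.
        return requireDefined("." + reference, definedNames);
    }

    private static String requireDefined(String fullName, Set<String> definedNames) {
        if (!definedNames.contains(fullName)) {
            throw new IllegalArgumentException("Undefined name: " + fullName);
        }
        return fullName;
    }

    public static void main(String[] args) {
        Set<String> defined = Set.of(".Target", "org.apache.avro.Target", ".Other");
        System.out.println(resolve("Target", "org.apache.avro", defined));  // org.apache.avro.Target
        System.out.println(resolve(".Target", "org.apache.avro", defined)); // .Target
        System.out.println(resolve("Other", "org.apache.avro", defined));   // .Other
        System.out.println(resolve("Target", "", defined));                 // .Target
    }
}
```

With both ".Target" and "org.apache.avro.Target" defined, the simple name "Target" inside "org.apache.avro" resolves to the namespaced type, and only ".Target" reaches the one in the null namespace. That is the distinction the proposed spec fix would make explicit.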