Ryan Skraba created AVRO-3370:
---------------------------------
Summary: [Spec] Inconsistent behaviour on types as invalid names.
Key: AVRO-3370
URL: https://issues.apache.org/jira/browse/AVRO-3370
Project: Apache Avro
Issue Type: Bug
Reporter: Ryan Skraba
We've run across this in some code that interoperates between Java and Python.
The spec [currently
forbids|https://avro.apache.org/docs/current/spec.html#names] using a primitive
type name as the name of a schema: _*Primitive type names have no namespace and their
names may not be defined in any namespace.*_
{code:java}
{"type":"record","name":"long","fields":[{"name":"a1","type":"long"}]} {code}
That fails in Java with {{"org.apache.avro.AvroTypeException: Schemas may not
be named after primitives: long"}}
What do we expect to happen when a named schema uses a complex type name as its name?
{code:java}
{"type":"record","name":"record","fields":[{"name":"a1","type":"long"}]} {code}
This currently *succeeds* in Java and the schema can be used to serialize and
deserialize data.
This currently *fails* in Python with: {{avro.schema.SchemaParseException:
record is a reserved type name}}
Which one is the correct behaviour?
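To make the question concrete, the two observed behaviours can be modelled as two name-checking policies. This is a minimal sketch in plain Python, not the code of either Avro library: {{check_name}} and its {{strict}} flag are hypothetical names, and the set of complex type names is an assumption (the set actually reserved by the Python library may differ).

```python
# Primitive type names listed in the Avro specification.
PRIMITIVE_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Assumption for this sketch: complex type names that can appear as a "type" value.
COMPLEX_TYPES = {"record", "enum", "array", "map", "fixed"}


def check_name(name, strict):
    """Return None if `name` is acceptable for a named schema, else an error message.

    strict=False models the behaviour reported for Java (only primitives rejected).
    strict=True models the behaviour reported for Python (complex type names rejected too).
    """
    if name in PRIMITIVE_TYPES:
        return f"Schemas may not be named after primitives: {name}"
    if strict and name in COMPLEX_TYPES:
        return f"{name} is a reserved type name"
    return None
```

Under this model, {{check_name("record", strict=False)}} accepts the name while {{check_name("record", strict=True)}} rejects it, which is the inconsistency described above.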
This gets a bit more complicated when we consider using the name as a reference.
The following two schemas both work in Java:
{code:java}
{"type":"record","name":"LinkedList",
"fields":[
{"name":"value","type":"int"},
{"name":"next","type":["null","LinkedList"]}]} {code}
{code:java}
{"type":"record","name":"LinkedList",
"fields":[
{"name":"value","type":"int"},
{"name":"next","type":["null",{"type":"LinkedList"}]}]}
{code}
If we rename {{LinkedList}} to {{record}}, the former succeeds in Java and the
latter fails with {{org.apache.avro.SchemaParseException: No name in schema:
\{"type":"record"}}}
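One plausible explanation for that failure is a parser that checks the {{type}} value against the built-in type keywords before trying it as a reference to a previously defined name. The sketch below illustrates that dispatch order in plain Python under that assumption. It is not the Java parser's actual code, {{resolve_type}} and {{known_names}} are hypothetical, and unions, arrays and maps are left out for brevity.

```python
import json

PRIMITIVE_TYPES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}
# Keywords that introduce a new named schema definition.
NAMED_TYPES = {"record", "enum", "fixed"}


def resolve_type(node, known_names):
    """Classify a schema node the way a keyword-first parser might (hypothetical)."""
    if isinstance(node, str):
        # A bare string is either a primitive or a reference to a known name.
        if node in PRIMITIVE_TYPES:
            return "primitive"
        if node in known_names:
            return "reference"
        raise ValueError(f"Undefined name: {node}")
    type_name = node["type"]
    if isinstance(type_name, str) and type_name in NAMED_TYPES:
        # The keyword wins: this is read as a new definition, which needs a name.
        if "name" not in node:
            compact = json.dumps(node, separators=(",", ":"))
            raise ValueError(f"No name in schema: {compact}")
        return "definition"
    # Otherwise {"type": X} is treated like the bare value X.
    return resolve_type(type_name, known_names)
```

With a schema named {{record}} already defined, the bare string {{"record"}} resolves as a reference, while {{\{"type":"record"\}}} is read as an incomplete record definition and raises the error.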
--
This message was sent by Atlassian Jira
(v8.20.1#820001)