Augusto Hack created AVRO-3229:
----------------------------------
Summary: Python Avro doesn't validate the default value of an enum
field
Key: AVRO-3229
URL: https://issues.apache.org/jira/browse/AVRO-3229
Project: Apache Avro
Issue Type: Bug
Components: python
Affects Versions: 1.10.2
Environment: python --version
Python 3.9.5
pip freeze | grep avro
avro==1.10.2
Reporter: Augusto Hack
The Java implementation rejects the following schema (it fails to compile)
because the enum's default value, "UNKNOWN", is not one of its symbols:
{code:json}
{
  "type": "record",
  "name": "test_schema",
  "fields": [
    {
      "name": "test_enum",
      "type": {
        "name": "test_enum_type",
        "type": "enum",
        "symbols": [
          "NONE"
        ],
        "default": "UNKNOWN"
      }
    }
  ]
}
{code}
This matches the behavior documented in the spec:
{quote}default: A default value for this enumeration, used during resolution
when the reader encounters a symbol from the writer that isn't defined in the
reader's schema (optional). The value provided here must be a JSON *string
that's a member of the symbols array*. See documentation on schema resolution
for how this gets used.
{quote}
But the Python library silently accepts the same schema (although the writer
does not allow the invalid value to be written):
{code:python}
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter

# Parsing succeeds even though the enum default is not a listed symbol.
with open("test.avsc", "rb") as handler:
    schema = avro.schema.parse(handler.read())

DATA_FILE = "test.avro"

with open(DATA_FILE, "wb") as handler:
    writer = DataFileWriter(handler, DatumWriter(), schema)
    writer.append({"test_enum": "NONE"})
    # writer.append({"test_enum": "UNKNOWN"})
    # writer.append({})
    writer.close()

with open(DATA_FILE, "rb") as handler:
    for user in DataFileReader(handler, DatumReader()):
        print(user)
{code}
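For illustration, the rule quoted from the spec (an enum's default must be a member of its symbols array) could be checked on the raw JSON before the schema is handed to the Python library. This is a minimal, stdlib-only sketch: the function name `find_invalid_enum_defaults` and the path strings it returns are hypothetical and not part of the avro package, and named-type references (plain strings) are not followed.

```python
import json


def find_invalid_enum_defaults(schema, path="$"):
    """Return (path, default) pairs for enum defaults missing from "symbols".

    Hypothetical helper, not part of the avro package.
    """
    problems = []
    if isinstance(schema, list):
        # A union: check every branch.
        for index, branch in enumerate(schema):
            problems += find_invalid_enum_defaults(branch, f"{path}[{index}]")
    elif isinstance(schema, dict):
        kind = schema.get("type")
        if kind == "enum":
            symbols = schema.get("symbols", [])
            if "default" in schema and schema["default"] not in symbols:
                problems.append((path, schema["default"]))
        elif kind == "record":
            for field in schema.get("fields", []):
                field_path = f"{path}.{field.get('name')}"
                problems += find_invalid_enum_defaults(field.get("type"), field_path)
        elif kind == "array":
            problems += find_invalid_enum_defaults(schema.get("items"), f"{path}.items")
        elif kind == "map":
            problems += find_invalid_enum_defaults(schema.get("values"), f"{path}.values")
        elif isinstance(kind, (dict, list)):
            # A type definition nested directly under "type".
            problems += find_invalid_enum_defaults(kind, path)
    return problems


# The schema from this report, inlined so the sketch is self-contained.
SCHEMA = json.loads("""
{
  "type": "record",
  "name": "test_schema",
  "fields": [
    {
      "name": "test_enum",
      "type": {
        "name": "test_enum_type",
        "type": "enum",
        "symbols": ["NONE"],
        "default": "UNKNOWN"
      }
    }
  ]
}
""")

for location, default in find_invalid_enum_defaults(SCHEMA):
    print(f"{location}: default {default!r} is not a declared symbol")
```

A check of this kind inside the library's enum schema constructor would presumably raise a schema parse error instead of returning a list; the list form is used here only to keep the sketch independent of the library.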