[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13697081#comment-13697081 ]
Doug Cutting commented on AVRO-1343:
------------------------------------
Thanks for elaborating.
The intent is that the record name should be used when resolving unions of
records. The specification states:
bq. Unions may not contain more than one schema with the same type, except for
the named types record, fixed and enum. For example, unions containing two
array types or two map types are not permitted, but two types with different
names are permitted. (Names permit efficient resolution when reading and
writing unions.)
This is a longstanding bug in Avro Python, discussed in AVRO-973 and elsewhere.
It's also a performance killer. I'd rather fix this underlying bug. AVRO-283
is the closest thing we have to a fix.
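
As a rough illustration of name-based resolution (a sketch only, not the Avro Python API and not the AVRO-283 patch): the union is modelled as a plain list of schema dicts, and the helpers index_named_branches and resolve_by_name are hypothetical. It assumes the writer already knows the record name of the datum, which plain Python dicts do not carry, and it ignores namespaces.

```python
# Hypothetical sketch: pick a union branch by record name instead of by
# trial validation. These helpers are illustrative, not part of avro.

def index_named_branches(union_schema):
    """Map each named branch (record, fixed, enum) to its position in the union."""
    index = {}
    for position, branch in enumerate(union_schema):
        if isinstance(branch, dict) and branch.get("type") in ("record", "fixed", "enum"):
            index[branch["name"]] = position
    return index


def resolve_by_name(union_schema, record_name):
    """Return (position, branch) for the named branch; raise KeyError if absent."""
    position = index_named_branches(union_schema)[record_name]
    return position, union_schema[position]


# Two records with identical fields: validation alone cannot tell them apart,
# but their names can.
UNION = [
    "null",
    {"type": "record", "name": "A", "fields": [{"name": "f", "type": "long"}]},
    {"type": "record", "name": "B", "fields": [{"name": "f", "type": "long"}]},
]

position, branch = resolve_by_name(UNION, "A")
```

A real implementation would build the index once per union schema rather than on every call, so that each lookup is a single dictionary access instead of a validation pass over every branch.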
> Python: validate too permissive on records with extra fields
> ------------------------------------------------------------
>
> Key: AVRO-1343
> URL: https://issues.apache.org/jira/browse/AVRO-1343
> Project: Avro
> Issue Type: Bug
> Components: python
> Reporter: Jeremy Kahn
> Assignee: Jeremy Kahn
> Fix For: 1.7.5
>
> Attachments: AVRO-1343-tests.patch, AVRO-1343-validate.patch
>
>
> Python's validator silently accepts (generic) records with extra fields and
> considers them valid.
> For example, {{io.validate}} silently considers that the schema:
> {noformat}{"type": "record",
> "name": "Test",
> "fields": [{"name": "f", "type": "long"}]}
> {noformat}
> should accept records like:
> {noformat}{'f': 5, 'extra_field': "abc"}{noformat}
> but this is problematic.
> This is *especially* problematic for encoding unions, because internally the
> Python serializer uses {{validate}} to find the appropriate schema with which
> to encode a given object.
> In the current implementation, union schema selection picks the *last* schema
> for which {{validate(schema, obj)}} returns {{True}}. If {{validate}} is too
> permissive, the encoder will frequently choose the wrong schema.
> I will attach two patches: one to the tests and one to the {{validate}}
> function.
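
The failure mode described above can be reproduced with a small stand-in, shown here as a sketch rather than the attached patches or the real avro.io code. The functions validate_permissive, validate_strict and select_last_match are hypothetical, handle only records whose fields are all "long", and use the schema names Wide and Narrow purely for illustration.

```python
# Simplified stand-in for record validation and union branch selection.
# Assumption: every field is a "long", so a field is valid if it holds an int.

def validate_permissive(schema, datum):
    """Check only that each declared field is present with an int value."""
    return isinstance(datum, dict) and all(
        isinstance(datum.get(field["name"]), int) for field in schema["fields"]
    )


def validate_strict(schema, datum):
    """As above, but also reject keys the schema does not declare."""
    declared = {field["name"] for field in schema["fields"]}
    return validate_permissive(schema, datum) and set(datum) <= declared


def select_last_match(union_schema, datum, validate):
    """Return the name of the last branch that validates, as described in the issue."""
    chosen = None
    for branch in union_schema:
        if validate(branch, datum):
            chosen = branch["name"]
    return chosen


WIDE = {"type": "record", "name": "Wide",
        "fields": [{"name": "f", "type": "long"}, {"name": "g", "type": "long"}]}
NARROW = {"type": "record", "name": "Narrow",
          "fields": [{"name": "f", "type": "long"}]}
UNION = [WIDE, NARROW]

datum = {"f": 5, "g": 7}
permissive_choice = select_last_match(UNION, datum, validate_permissive)
strict_choice = select_last_match(UNION, datum, validate_strict)
```

With the permissive check both branches accept the datum, so the last one (Narrow) is selected and the g field would be lost on encoding. With the strict check only Wide accepts it.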