[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145098#comment-14145098 ]
Doug Cutting commented on AVRO-1343:
------------------------------------
Please contribute a patch. I will commit patches that look reasonable, that
folks confirm work, that include tests, etc.
If someone contributes more than a few patches that are committed, they'll
be invited to become a committer and can then commit Python patches more
rapidly.
Please also comment on any existing patches that you feel ought to be committed.
> Python: validate too permissive on records with extra fields
> ------------------------------------------------------------
>
> Key: AVRO-1343
> URL: https://issues.apache.org/jira/browse/AVRO-1343
> Project: Avro
> Issue Type: Bug
> Components: python
> Reporter: Jeremy Kahn
> Assignee: Jeremy Kahn
> Fix For: 1.8.0
>
> Attachments: AVRO-1343-tests.patch, AVRO-1343-validate.patch
>
>
> Python's validator silently accepts (generic) records that contain fields not
> declared in the schema.
> For example, {{io.validate}} considers that the schema:
> {noformat}{"type": "record",
> "name": "Test",
> "fields": [{"name": "f", "type": "long"}]}
> {noformat}
> should accept records like:
> {noformat}{'f': 5, 'extra_field': "abc"}{noformat}
> but this is problematic: {{extra_field}} is not declared in the schema.
> This is *especially* problematic for encoding unions, because internally the
> Python serializer uses {{validate}} to find the appropriate schema with which
> to encode a given object.
> In the current implementation, the schema selected for a union is the *last*
> branch for which {{validate(schema, obj)}} returns {{True}}. If {{validate}}
> isn't picky, the encoder will frequently pick the wrong branch.
> I will attach two patches: one to the tests and one to the {{validate}}
> function.
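To make the union problem in the description concrete, here is a minimal, self-contained sketch. It does not use the real {{avro.io}} code: {{toy_validate}} and {{select_union_branch}} are hypothetical stand-ins that only handle {{null}}, {{long}}, {{string}} and records, and the {{Wide}} schema is invented for the illustration. The {{strict}} flag is likewise an assumption, used here only to contrast the permissive behaviour with the stricter one the report asks for.

```python
# Toy stand-ins for illustration only; these are NOT the avro.io functions.

def toy_validate(schema, datum, strict):
    """Return True if datum matches schema. With strict=True, records
    carrying fields not declared in the schema are rejected."""
    if schema == "null":
        return datum is None
    if schema == "long":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(datum, int) and not isinstance(datum, bool)
    if schema == "string":
        return isinstance(datum, str)
    if isinstance(schema, dict) and schema.get("type") == "record":
        if not isinstance(datum, dict):
            return False
        declared = {field["name"] for field in schema["fields"]}
        if strict and set(datum) - declared:
            return False  # extra, undeclared fields present
        return all(
            toy_validate(field["type"], datum.get(field["name"]), strict)
            for field in schema["fields"]
        )
    raise ValueError("schema not supported by this sketch: %r" % (schema,))


def select_union_branch(branches, datum, strict):
    """Pick a union branch the way the description says the serializer does:
    the last branch that validates wins (note there is no early exit)."""
    chosen = -1
    for index, branch in enumerate(branches):
        if toy_validate(branch, datum, strict):
            chosen = index
    if chosen < 0:
        raise ValueError("datum matches no branch of the union")
    return chosen


# Schema from the description, plus a hypothetical wider record.
TEST = {"type": "record", "name": "Test",
        "fields": [{"name": "f", "type": "long"}]}
WIDE = {"type": "record", "name": "Wide",
        "fields": [{"name": "f", "type": "long"},
                   {"name": "extra_field", "type": "string"}]}
UNION = [WIDE, TEST]
DATUM = {"f": 5, "extra_field": "abc"}

permissive_choice = select_union_branch(UNION, DATUM, strict=False)
strict_choice = select_union_branch(UNION, DATUM, strict=True)
print(permissive_choice, strict_choice)
```

With the permissive check, both branches validate and the last one ({{Test}}) is selected, so {{extra_field}} would be dropped on encoding. With the strict check, only {{Wide}} validates and it is the branch selected.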