[ https://issues.apache.org/jira/browse/AVRO-1343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14145309#comment-14145309 ]
Dustin Spicuzza commented on AVRO-1343:
---------------------------------------

Thanks Doug. AVRO-1545 is one that has been sitting in the queue for a while. I'll post the validation patch also.

> Python: validate too permissive on records with extra fields
> ------------------------------------------------------------
>
>                 Key: AVRO-1343
>                 URL: https://issues.apache.org/jira/browse/AVRO-1343
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>            Reporter: Jeremy Kahn
>            Assignee: Jeremy Kahn
>             Fix For: 1.8.0
>
>         Attachments: AVRO-1343-tests.patch, AVRO-1343-validate.patch
>
>
> Python's validator silently accepts (generic) records with extra fields and
> considers them valid.
> For example, {{io.validate}} silently considers that the schema:
> {noformat}{"type": "record",
>  "name": "Test",
>  "fields": [{"name": "f", "type": "long"}]}
> {noformat}
> should accept records like:
> {noformat}{'f': 5, 'extra_field': "abc"}{noformat}
> but this is problematic.
> This is *especially* problematic for encoding unions, because internally the
> Python serializer uses {{validate}} to find the appropriate schema with which
> to encode a given object.
> In the current implementation, union schema selection is the *last* schema
> that {{validate(schema, obj)}} returns {{True}} for. If {{validate}} isn't
> picky, this encoding will frequently guess wrong.
> I will attach two patches: one to the tests and one to the {{validate}}
> function.
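
To make the union problem in the description concrete, here is a minimal, self-contained sketch. It does not use the avro package and is not the code from AVRO-1343-validate.patch: the `validate` function below is a simplified stand-in that handles only "long", "string" and "record", the `strict` flag and the `select_union_branch` helper are hypothetical names introduced for illustration, and the two record schemas are made up. The selection loop follows the "last schema that validates" behaviour the description reports.

```python
# Illustrative sketch only; a simplified stand-in for avro.io.validate,
# not the real implementation and not the attached patch.

def validate(schema, datum, strict):
    """Return True if datum fits schema; strict=True rejects extra record keys."""
    kind = schema["type"] if isinstance(schema, dict) else schema
    if kind == "long":
        return isinstance(datum, int) and not isinstance(datum, bool)
    if kind == "string":
        return isinstance(datum, str)
    if kind == "record":
        if not isinstance(datum, dict):
            return False
        names = {f["name"] for f in schema["fields"]}
        if strict and not set(datum) <= names:
            return False  # datum carries a key the schema does not declare
        return all(
            validate(f["type"], datum.get(f["name"]), strict)
            for f in schema["fields"]
        )
    return False


def select_union_branch(branches, datum, strict):
    """Hypothetical helper: index of the LAST branch that validates, or -1."""
    index = -1
    for i, branch in enumerate(branches):
        if validate(branch, datum, strict):
            index = i
    return index


# Two made-up record schemas; WithExtra declares the second field, Test does not.
TEST = {"type": "record", "name": "Test",
        "fields": [{"name": "f", "type": "long"}]}
WITH_EXTRA = {"type": "record", "name": "WithExtra",
              "fields": [{"name": "f", "type": "long"},
                         {"name": "extra_field", "type": "string"}]}

union = [WITH_EXTRA, TEST]
datum = {"f": 5, "extra_field": "abc"}

permissive_choice = select_union_branch(union, datum, strict=False)
strict_choice = select_union_branch(union, datum, strict=True)
print(permissive_choice, strict_choice)  # prints "1 0"
```

With the permissive check both branches validate, so the last one (Test) is chosen and extra_field would be lost on encoding. With the strict check only WithExtra validates, so the selection matches the datum.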