[ 
https://issues.apache.org/jira/browse/AVRO-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17868810#comment-17868810
 ] 

Joey Pereira commented on AVRO-3272:
------------------------------------

I recently ran into this issue in production. We have some manually managed 
Avro files, and a change I made re-ordered the types. This was troublesome 
because the schemas passed Python validation but caused errors in other, 
stricter Avro implementations.

Not only are the types of defaults not validated, but the official Avro spec 
also requires that the type of the default match the first type in a union: 
https://avro.apache.org/docs/1.11.1/specification/#unions

{quote}(Note that when a default value is specified for a record field whose 
type is a union, the type of the default value must match the first element of 
the union. Thus, for unions containing “null”, the “null” is usually listed 
first, since the default value of such unions is typically null.){quote}

Neither the type of a {{default}} nor the order of a union's types is 
validated; this is left as a TODO here: 
https://github.com/apache/avro/blob/61f2ecdc2faf7a5df759ace1acd24d5a5bac3739/lang/py/avro/schema.py#L446
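
As a rough idea of what such a check could look like, here is a minimal sketch 
that operates on the raw JSON form of a field. It is not the avro package's 
API: {{default_is_valid}} and {{_JSON_TYPES}} are hypothetical names, named type 
references are not resolved, and nested values (array items, record fields) are 
not inspected.

```python
# Hypothetical helper, not part of the avro package.
# JSON value types that the spec allows as the default for each Avro type.
_JSON_TYPES = {
    "null": (type(None),),
    "boolean": (bool,),
    "int": (int,),
    "long": (int,),
    "float": (int, float),
    "double": (int, float),
    "string": (str,),
    "bytes": (str,),  # bytes defaults are written as JSON strings
    "fixed": (str,),
    "enum": (str,),
    "array": (list,),
    "map": (dict,),
    "record": (dict,),
}


def default_is_valid(field):
    """Check a field's default against its type, or its first union branch."""
    if "default" not in field:
        return True
    schema = field["type"]
    if isinstance(schema, list):  # union: only the first branch counts
        if not schema:
            return False
        schema = schema[0]
    while isinstance(schema, dict):  # unwrap {"type": ...} definitions
        schema = schema["type"]
    allowed = _JSON_TYPES.get(schema) if isinstance(schema, str) else None
    if allowed is None:
        return True  # named type reference or nested union: not checked here
    value = field["default"]
    if isinstance(value, bool) and bool not in allowed:
        return False  # bool is a subclass of int in Python
    return isinstance(value, allowed)


print(default_is_valid({"name": "a", "type": ["null", "string"], "default": None}))  # True
print(default_is_valid({"name": "a", "type": ["string", "null"], "default": None}))  # False
```

Returning True for anything the sketch cannot resolve keeps it from rejecting 
valid schemas, at the cost of missing some invalid ones.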

(The other popular Python library, fastavro, has the same issue: 
https://github.com/fastavro/fastavro/issues/785)

> Validate defaults for Record Fields
> -----------------------------------
>
>                 Key: AVRO-3272
>                 URL: https://issues.apache.org/jira/browse/AVRO-3272
>             Project: Apache Avro
>          Issue Type: Improvement
>          Components: python
>    Affects Versions: 1.11.0
>            Reporter: Michael A. Smith
>            Priority: Major
>
> AVRO-3229 points out that the Python implementation does not validate 
> defaults for EnumSchema. The fix for EnumSchema is relatively 
> straightforward, but that issue reminds us that Python also doesn't validate 
> RecordSchema Fields. That fix is complicated by the fact that the Avro 
> validator is type-strict about the difference between bytes and strings, but 
> the default bytes field is a string.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
