[
https://issues.apache.org/jira/browse/AVRO-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Neil Ferguson updated AVRO-1968:
--------------------------------
Description:
The Python DatumWriter seems to evaluate types in a union in reverse order. For
example, with the following schema:
{{{
{
"type": "record",
"name": "MyRecord",
"fields": [
{"name": "my_field", "type": ["boolean", "double"]}
]
}
}}}
If I set my_field to a boolean in my data, it seems to be encoded as a double.
However, if I reverse the order of the types in my union ({{["double",
"boolean"]}}) it seems to be encoded as a boolean.
This seems unintuitive for a couple of reasons:
* I'd expect the types in the union to be evaluated in the order they are
specified
* Encoding a boolean as a double is a bit weird
I'm not sure if this is a bug or expected behaviour though. If this is the
expected behaviour (or it can't be changed without breaking things) then it
would be nice if this was documented somewhere (I searched but couldn't find
anything), as it's pretty unintuitive.
I've attached a full test case. The test case encodes and then decodes the data
with both the original schema and the reversed version. For me it prints:
{{
Type: <type 'float'>
Type from reversed schema: <type 'bool'>
}}
Ideally I'd expect the type to be 'bool' both times, but failing that I'd
expect the type to be 'bool' the first time, and 'float' the second time.
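One possible explanation for the output above, sketched in plain Python without the avro library: if the writer keeps the last union branch that validates instead of stopping at the first, and if the check for "double" accepts any int, then a bool matches both branches, because bool is a subclass of int in Python. The VALIDATORS table and both resolve functions below are hypothetical stand-ins for illustration, not the library's actual code.

```python
# Hypothetical stand-ins for per-type validation; NOT the avro library's API.
VALIDATORS = {
    "boolean": lambda d: isinstance(d, bool),
    # Assumption: a permissive numeric check. bool is a subclass of int in
    # Python, so True and False pass this check as well.
    "double": lambda d: isinstance(d, (int, float)),
}


def resolve_last_match(union, datum):
    """Return the index of the last branch that accepts datum (no early exit)."""
    index = -1
    for i, name in enumerate(union):
        if VALIDATORS[name](datum):
            index = i
    return index


def resolve_first_match(union, datum):
    """Return the index of the first branch that accepts datum."""
    for i, name in enumerate(union):
        if VALIDATORS[name](datum):
            return i
    return -1


for union in (["boolean", "double"], ["double", "boolean"]):
    last = union[resolve_last_match(union, True)]
    first = union[resolve_first_match(union, True)]
    print(union, "last match:", last, "| first match:", first)
```

Under these assumptions, last-match resolution picks "double" for the original schema and "boolean" for the reversed one, which is consistent with the types printed above. First-match resolution would give the opposite pairing. Getting 'bool' both times would additionally require the "double" check to reject bool values.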
> Python DatumWriter seems to evaluate union types in reverse order
> ------------------------------------------------------------------
>
> Key: AVRO-1968
> URL: https://issues.apache.org/jira/browse/AVRO-1968
> Project: Avro
> Issue Type: Bug
> Components: python
> Affects Versions: 1.8.1
> Reporter: Neil Ferguson
> Attachments: avro_test.py