[
https://issues.apache.org/jira/browse/AVRO-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Manvendra Singh updated AVRO-2046:
----------------------------------
Description:
Hey, I come from the [CWL
project](https://github.com/common-workflow-language/cwltool), and as part of
my GSoC project I'm working on adding Python 3 compatibility to the ``cwltool``
codebase. We've been using avro-python2 for a long time now, and it has worked
great for us in our projects: schema_salad and cwltool.
In the process of porting cwltool, I'm facing issues with the avro-python3 library.
This is one of the bugs I've found in the process.
Minimal reproducible example:
{code:none}
from collections import OrderedDict

import avro.schema

AvroSchemaFromJSONData = avro.schema.SchemaFromJSONData

a = {
    "fields": [
        {
            "name": "name",
            "type": "string"
        },
        {
            "name": "favorite_number",
            "type": [
                "int",
                "null"
            ]
        },
        {
            "name": "favorite_color",
            "type": [
                "string",
                "null"
            ]
        }
    ],
    "name": "User",
    "namespace": "example.avro",
    "type": "record"
}
b = OrderedDict(a)

AvroSchemaFromJSONData(a)  # plain dict: parsed without error
AvroSchemaFromJSONData(b)  # OrderedDict: raises SchemaParseException
{code}
Output:
{code}
~/Desktop/test/venv3/lib/python3.5/site-packages/avro/schema.py in
SchemaFromJSONData(json_data, names)
1252 if parser is None:
1253 raise SchemaParseException(
-> 1254 'Invalid JSON descriptor for an Avro schema: %r.' % json_data)
1255 return parser(json_data, names=names)
1256
SchemaParseException: Invalid JSON descriptor for an Avro schema:
OrderedDict([('namespace', 'example.avro'), ('type', 'record'), ('name',
'User'), ('fields', [{'type': 'string', 'name': 'name'}, {'type': ['int',
'null'], 'name': 'favorite_number'}, {'type': ['string', 'null'], 'name':
'favorite_color'}])]).
{code}
h5. The current implementation of this function rejects *any dict-like data
type* other than a plain dict. However, the same input works in avro-python2.
Relevant line of code:
https://github.com/apache/avro/blob/master/lang/py3/avro/schema.py#L1250
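To illustrate the suspected cause, here is a minimal sketch that does not import avro at all. It assumes the parser is selected from a map keyed on the exact type of the input, which would match the `if parser is None` check in the traceback above but is not confirmed against the library source here. The names `parse_string`, `parse_array`, `parse_object`, `lookup_exact`, `lookup_isinstance` and `to_plain` are hypothetical and exist only for this example.

```python
from collections import OrderedDict
from collections.abc import Mapping


# Hypothetical stand-ins for the library's per-type schema parsers.
def parse_string(data):
    return ("string", data)


def parse_array(data):
    return ("array", list(data))


def parse_object(data):
    return ("object", dict(data))


# Assumed shape of the failing lookup: a map keyed on the exact type.
EXACT_TYPE_PARSERS = {str: parse_string, list: parse_array, dict: parse_object}


def lookup_exact(data):
    """Return a parser only if type(data) is exactly one of the map's keys."""
    return EXACT_TYPE_PARSERS.get(type(data))


def lookup_isinstance(data):
    """Return a parser for any str, mapping, or list/tuple, subclasses included."""
    if isinstance(data, str):
        return parse_string
    if isinstance(data, Mapping):
        return parse_object
    if isinstance(data, (list, tuple)):
        return parse_array
    return None


def to_plain(data):
    """Caller-side workaround: recursively convert mappings to dict, sequences to list."""
    if isinstance(data, Mapping):
        return {key: to_plain(value) for key, value in data.items()}
    if isinstance(data, (list, tuple)):
        return [to_plain(item) for item in data]
    return data


schema = OrderedDict([("type", "record"), ("name", "User"), ("fields", [])])

# type(schema) is OrderedDict, not dict, so the exact-type lookup finds nothing.
assert lookup_exact(schema) is None
# An isinstance-based lookup accepts any mapping.
assert lookup_isinstance(schema) is parse_object
# After conversion, the exact-type lookup succeeds as well.
assert lookup_exact(to_plain(schema)) is parse_object
```

If the assumption holds, an `isinstance`-style check in the library would accept `OrderedDict` and other mappings. Until then, converting the data with something like `to_plain` before calling `SchemaFromJSONData` might serve as a workaround, at the cost of losing key order on Python versions where plain dicts are unordered.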
> avro-python3: Very restricted set of data types which are allowed in
> AvroSchemaFromJSONData
> -------------------------------------------------------------------------------------------
>
> Key: AVRO-2046
> URL: https://issues.apache.org/jira/browse/AVRO-2046
> Project: Avro
> Issue Type: Bug
> Components: python
> Affects Versions: 1.8.2
> Environment: avro-python3 (1.8.2)
> Reporter: Manvendra Singh