[ https://issues.apache.org/jira/browse/AVRO-1795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16671833#comment-16671833 ]
Chong Wang commented on AVRO-1795:
----------------------------------

Found an example at http://gisgeek.blogspot.com/2012/12/using-apache-avro-with-python.html that seems to do what is required.

> Python2: Cannot parse nested schemas
> ------------------------------------
>
>                 Key: AVRO-1795
>                 URL: https://issues.apache.org/jira/browse/AVRO-1795
>             Project: Avro
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 1.8.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>            Priority: Major
>
> In the Java client, one can parse nested schemas by loading the nested schema
> before the nesting schema.
> For example, a header can be defined in one file:
> {code:javascript}{ "namespace": "python.avro",
>  "type": "record",
>  "name": "header",
>  "fields": [
>    { "name": "header_field", "type": "string" }
>  ]
> }{code}
> and then included in another schema:
> {code:javascript}{ "namespace": "python.avro",
>  "type": "record",
>  "name": "event",
>  "fields": [
>    { "name": "header", "type": "python.avro.header" },
>    { "name": "event_field", "type": "string" }
>  ]
> }{code}
> As long as one instantiates the Parser and loads the header first, the
> schemas will be reconciled and merged correctly.
> However, the Python client does not support this. The {{parse}} method of
> the {{schema.py}} file always instantiates a new Names object to hold the
> schemas:
> {code}def parse(json_string):
>   """Constructs the Schema from the JSON text."""
>   # TODO(hammer): preserve stack trace from JSON parse
>   # parse the JSON
>   try:
>     json_data = json.loads(json_string)
>   except:
>     raise SchemaParseException('Error parsing JSON: %s' % json_string)
>   # Initialize the names object
>   names = Names()
>   # construct the Avro Schema object
>   return make_avsc_object(json_data, names){code}
> Some possible fixes for this are:
> 1) Create a separate Parser class to mimic the Schema.Parser Java approach,
> while deprecating the current parse method.
> 2) Make Names a global variable used by the parse method, allowing multiple
> parse calls to populate the same namespace. This breaks current behavior
> (and at least one unit test depends on it), so it would not be backwards
> compatible.
> 3) Create a new parse method that returns not only the schema, but also the
> Names instance and accepts that instance. This keeps the code nice and
> functional while exposing the Names class, which previously had not been
> particularly public.
> I like the first approach.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
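
For illustration, the first approach from the issue (a separate Parser object whose named types persist across calls) might look roughly like the sketch below. This is not the Avro library's code: the `Parser` class, its `names` dictionary, and the helper methods are hypothetical, it uses only the standard `json` module, and it handles only primitive types and records. The two schemas are the header and event examples from the issue.

```python
import json

try:
    string_types = basestring  # Python 2
except NameError:
    string_types = str  # Python 3

PRIMITIVES = {"null", "boolean", "int", "long", "float", "double", "bytes", "string"}


class SchemaParseException(Exception):
    """Raised for invalid JSON or a reference to an unknown named type."""


class Parser(object):
    """Hypothetical parser whose named types persist across parse() calls."""

    def __init__(self):
        # Full name -> resolved record definition, shared by every parse() call.
        self.names = {}

    def parse(self, json_string):
        try:
            json_data = json.loads(json_string)
        except ValueError:
            raise SchemaParseException('Error parsing JSON: %s' % json_string)
        return self._resolve(json_data, None)

    def _full_name(self, name, namespace):
        # A dotted name is treated as already fully qualified.
        if '.' in name or not namespace:
            return name
        return namespace + '.' + name

    def _resolve(self, node, namespace):
        if isinstance(node, string_types):
            if node in PRIMITIVES:
                return node
            full_name = self._full_name(node, namespace)
            if full_name not in self.names:
                raise SchemaParseException('Unknown named type: %s' % full_name)
            return self.names[full_name]
        if isinstance(node, dict) and node.get('type') == 'record':
            namespace = node.get('namespace', namespace)
            full_name = self._full_name(node['name'], namespace)
            fields = [{'name': f['name'],
                       'type': self._resolve(f['type'], namespace)}
                      for f in node['fields']]
            record = {'type': 'record', 'name': full_name, 'fields': fields}
            # Remember the record so later schemas can refer to it by name.
            self.names[full_name] = record
            return record
        raise SchemaParseException('Unsupported schema in this sketch: %r' % (node,))


HEADER = '''{ "namespace": "python.avro", "type": "record", "name": "header",
  "fields": [ { "name": "header_field", "type": "string" } ] }'''

EVENT = '''{ "namespace": "python.avro", "type": "record", "name": "event",
  "fields": [ { "name": "header", "type": "python.avro.header" },
              { "name": "event_field", "type": "string" } ] }'''

parser = Parser()
parser.parse(HEADER)         # load the nested schema first
event = parser.parse(EVENT)  # "python.avro.header" now resolves
```

Because the registry lives on the instance rather than in a module-level global, a fresh `Parser()` starts empty, so code that expects each parse to be isolated keeps that behavior.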