Windows 8.1, Python 2.7, Avro 1.7.7

Using this Avro schema <http://codeshare.io/5cage> and data in this format
<http://codeshare.io/tO3nR>, I am able to validate the data against the
schema before writing it to a .avro file with the Python DataFileWriter.
The data writes successfully to the .avro file. When I attempt to read the
data back, I receive either a "list index out of range" error or a
SchemaResolutionException: Can't access branch index XX for union with 12
branches.

Code:

#imports
import avro.schema
from avro.datafile import DataFileReader, DataFileWriter
from avro.io import DatumReader, DatumWriter
from pprint import pprint

# get data
data = [list of dictionaries in 2nd link]
#get schema and writer
schema = avro.schema.parse(open("mygramschema.avsc").read())
writer = DataFileWriter(open("mygram.avro", "w"), DatumWriter(), schema)
#write data
for vals in data:
    writer.append(vals)
writer.close()

#get reader
reader = DataFileReader(open("mygram.avro", "r"), DatumReader())
for data in reader:
    pprint(data)

When I receive the index error, nothing prints; when I receive the
SchemaResolutionException, some of the data prints but not all. I am
generating this data on the fly, so I've checked for a number of different
Unicode encoding issues and had no luck, so I don't think that's the issue.
I'm at a loss for how to troubleshoot this, since the data validates
against the schema when I use avro.io.validate; in addition, the jsontofrag
jar utility has provided no additional information for debugging.
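
One thing worth ruling out on Windows with Python 2.7: the code above opens
both files in text mode ("w" and "r"). In text mode Windows rewrites each
LF byte as CR LF on write, and treats a Ctrl-Z byte (0x1A) as end-of-file
on read, so binary content may not survive the trip. The sketch below only
imitates that translation on an in-memory byte string to show the effect;
simulate_text_write and simulate_text_read are hypothetical helpers, not
part of Avro or the standard library, and whether this is the actual cause
here is an assumption.

```python
# Hypothetical helpers: they only imitate Windows text-mode translation
# on an in-memory byte string; they are not part of Avro or the stdlib.

def simulate_text_write(payload):
    """Imitate a text-mode write on Windows: every LF becomes CR LF."""
    return payload.replace(b"\n", b"\r\n")


def simulate_text_read(stored):
    """Imitate a text-mode read on Windows: stop at Ctrl-Z, CR LF -> LF."""
    eof = stored.find(b"\x1a")
    if eof != -1:
        stored = stored[:eof]
    return stored.replace(b"\r\n", b"\n")


# Four arbitrary binary bytes, two of which happen to be LF and Ctrl-Z.
payload = b"\x02\n\x1a\x04"

on_disk = simulate_text_write(payload)   # one byte longer than payload
read_back = simulate_text_read(on_disk)  # cut short at the Ctrl-Z byte

print(repr(on_disk))
print(repr(read_back))
print(read_back == payload)
```

If this turns out to be the cause, opening the files with "wb" and "rb"
instead of "w" and "r" should avoid the translation.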

Thanks,
Balaji
