Hi,

I'm doing the following:

from avro.datafile import DataFileReader
from avro.datafile import DataFileWriter
from avro.io import DatumReader
from avro.io import DatumWriter

def OpenAvroFileToRead(avro_filename):
   # Avro container files are binary, so open in 'rb' mode and return the reader.
   return DataFileReader(open(avro_filename, 'rb'), DatumReader())


with OpenAvroFileToRead(avro_filename) as reader:
   for r in reader:
       ....

I have an Avro file that is only 500 bytes. I suspect it contains a data
structure that is null or empty.
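
A quick sanity check on such a file, independent of the avro library, might
look like the sketch below. It only confirms the file size and whether the
file starts with the 4-byte magic ("Obj" followed by the byte 1) that the Avro
object container format specifies. The helper name describe_avro_header is
made up for illustration, and passing this check says nothing about whether
the blocks after the header are well formed.

```python
import os

# 4-byte magic at the start of an Avro object container file: 'O', 'b', 'j', 1.
AVRO_MAGIC = b"Obj\x01"


def describe_avro_header(path):
    """Return (file_size_in_bytes, starts_with_magic) for the file at path.

    Hypothetical helper for illustration; it does not parse the schema,
    the metadata map, or any data blocks.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(len(AVRO_MAGIC))
    return size, head == AVRO_MAGIC
```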

I put print statements before and after the "for r in reader" loop. The
process never gets past the "for r in reader" line itself: it consumes about
400 GB of memory there before I have to kill the process.
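
As a rough illustration of that kind of instrumentation, the sketch below
wraps a callable and reports the process's peak resident set size before and
after it runs. It assumes a Unix-like system (the standard-library resource
module is not available on Windows), and the units of ru_maxrss depend on the
platform (kilobytes on Linux, bytes on macOS). The helper name
run_with_peak_rss is made up for illustration.

```python
import resource


def run_with_peak_rss(fn):
    """Call fn() and return (result, peak_rss_before, peak_rss_after).

    Hypothetical helper for illustration. ru_maxrss is the peak resident
    set size of this process so far, so the second reading is never
    smaller than the first.
    """
    before = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    result = fn()
    after = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return result, before, after
```

The callable would be whatever wraps the read loop. Note that this only
reports a number once the call returns, so it does not help if the process
has to be killed first.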

That is 400 GB! I have 1 TB on my server. I have tried this with Avro 1.6.1,
1.7.1, and 1.7.7 and get the same behavior on all three versions.

Any ideas on what is causing this?

Regards,

WU
