Dear Avro Community,

I'm having problems writing large Datums to an Avro file.  Can someone
please advise?

The usual approach is:
- Create DatumWriter
- Create DataFileWriter(DatumWriter)
- Open the file with DataFileWriter.create(Schema, File), which also
  writes the Schema to the file.
- Call DataFileWriter.append(Datum) as many times as needed.
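
For reference, the steps above can be sketched roughly as follows. This
is only a minimal illustration using the generic API; the "Item" schema,
its "payload" field, and the class and method names are made up for the
example and are not my real code.

```java
import java.io.File;
import java.io.IOException;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;

public class ContainerFileSketch {
    // Hypothetical schema, used only for illustration.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Item\",\"fields\":"
        + "[{\"name\":\"payload\",\"type\":\"string\"}]}");

    // Writes n small records to an Avro container file, then reads the
    // file back with DataFileReader and returns how many records it found.
    static int writeAndCount(File file, int n) throws IOException {
        GenericDatumWriter<GenericRecord> datumWriter =
            new GenericDatumWriter<GenericRecord>(SCHEMA);
        try (DataFileWriter<GenericRecord> fileWriter =
                 new DataFileWriter<GenericRecord>(datumWriter)) {
            // create() writes the file header, which includes the Schema.
            fileWriter.create(SCHEMA, file);
            for (int i = 0; i < n; i++) {
                GenericRecord datum = new GenericData.Record(SCHEMA);
                datum.put("payload", "datum-" + i);
                fileWriter.append(datum);
            }
        }

        int count = 0;
        // The reader takes the writer's Schema from the file header.
        try (DataFileReader<GenericRecord> reader =
                 new DataFileReader<GenericRecord>(
                     file, new GenericDatumReader<GenericRecord>(SCHEMA))) {
            while (reader.hasNext()) {
                reader.next();
                count++;
            }
        }
        return count;
    }
}
```

This works fine as long as each Datum is small enough to be held in
memory while it is appended.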

The problem is that DataFileWriter.append() cannot handle a very large
Datum: it fails with an OutOfMemoryError.

Apparently the solution is to use a BlockingBinaryEncoder, which does
avoid the OutOfMemoryError:
- Create DatumWriter
- Create an OutputStream for the File
- Create the encoder with
  EncoderFactory.get().blockingBinaryEncoder(OutputStream, null)
- DatumWriter.write(Datum, BlockingBinaryEncoder)
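
A rough sketch of this second approach, again with the generic API. The
"Big" schema, its "items" array field, and the class and method names
are placeholders for illustration. The read-back method is included only
to show that the reader has to be handed the Schema separately, because
the file contains nothing but the encoded Datum.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class BlockingEncoderSketch {
    // Hypothetical schema with an array field, the kind of field that
    // can make a single Datum very large.
    static final Schema SCHEMA = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Big\",\"fields\":"
        + "[{\"name\":\"items\",\"type\":"
        + "{\"type\":\"array\",\"items\":\"string\"}}]}");

    // Writes one Datum straight to the file through a blocking encoder.
    // No header and no Schema are written, only the encoded Datum.
    static void writeRaw(GenericRecord datum, File file) throws IOException {
        GenericDatumWriter<GenericRecord> datumWriter =
            new GenericDatumWriter<GenericRecord>(SCHEMA);
        try (OutputStream out =
                 new BufferedOutputStream(new FileOutputStream(file))) {
            BinaryEncoder encoder =
                EncoderFactory.get().blockingBinaryEncoder(out, null);
            datumWriter.write(datum, encoder);
            encoder.flush(); // push any buffered bytes to the stream
        }
    }

    // Reads the Datum back. The Schema must be supplied by the caller
    // because the file does not contain it.
    static GenericRecord readRaw(File file) throws IOException {
        try (InputStream in =
                 new BufferedInputStream(new FileInputStream(file))) {
            BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(in, null);
            return new GenericDatumReader<GenericRecord>(SCHEMA)
                .read(null, decoder);
        }
    }
}
```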

But the BlockingBinaryEncoder solution doesn't write the Schema to
the beginning of the file.
- That makes the file unreadable with DataFileReader.
- My Schemas also differ from one another, so the Schema needs to be
  stored in the file.

I tried combining the two techniques above: I wrote the Schema with
DataFileWriter and then used a BlockingBinaryEncoder to write the
Datums.  But upon reading the file I get "Invalid Sync!"

It seems that what I need is a way to pass a BlockingBinaryEncoder to
DataFileWriter, because it automatically uses a BinaryEncoder and the
API offers no way to give it a different one.


Thank you,

Terry
