Dear Avro Community,

I'm having problems writing large Datums to an Avro file. Can someone please advise?

Normally what is done:
 - Create a DatumWriter
 - Create a DataFileWriter(DatumWriter)
 - Open the file with DataFileWriter.create(Schema, File)
 - When the file is opened, the Schema is written to the file.
 - Then you can call DataFileWriter.append(Datum) many times.

The problem is that DataFileWriter.append() doesn't handle a very large Datum. Apparently the solution is to use a BlockingBinaryEncoder, which does solve the OutOfMemoryError:
 - Create a DatumWriter
 - Create an OutputStream -> File
 - Create EncoderFactory.blockingBinaryEncoder(OutputStream)
 - DatumWriter.write(Datum, BlockingBinaryEncoder)

But the BlockingBinaryEncoder solution doesn't write the Schema to the beginning of the file.
 - That makes the file unreadable with DataFileReader.
 - Plus, my Schemas differ, so the Schema needs to be stored in the file.

I tried a combination of the two techniques above: I wrote the Schema with DataFileWriter and then used a BlockingBinaryEncoder to write the Datums. But upon reading the file I get "Invalid Sync!"

It seems that what I need is a way to pass DataFileWriter a BlockingBinaryEncoder for it to use, because it automatically uses a BinaryEncoder and the API has no way to pass it a different one.

Thank you,
Terry
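
P.S. For reference, here is a minimal sketch of the two approaches described above, written against the Avro Java API. The class name LargeDatumSketch, the method names, and the tiny array-of-longs schema are placeholders made up for illustration; my real Schemas and Datums are much larger. Note that the read-back for the second approach only works because the Schema is handed to the reader out of band, which is exactly what I am trying to avoid.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.avro.Schema;
import org.apache.avro.file.DataFileReader;
import org.apache.avro.file.DataFileWriter;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

public class LargeDatumSketch {
    // Placeholder schema (an assumption for this sketch): one array of longs.
    static final Schema SCHEMA =
        new Schema.Parser().parse("{\"type\":\"array\",\"items\":\"long\"}");

    // Approach 1: container file. create() writes the Schema into the header.
    static void writeContainer(File file, List<Long> datum) throws IOException {
        GenericDatumWriter<List<Long>> datumWriter =
            new GenericDatumWriter<List<Long>>(SCHEMA);
        try (DataFileWriter<List<Long>> fileWriter =
                 new DataFileWriter<List<Long>>(datumWriter)) {
            fileWriter.create(SCHEMA, file);
            fileWriter.append(datum); // this is the call that runs out of memory for me
        }
    }

    // Reading approach 1: the Schema comes from the file header.
    static List<Long> readContainer(File file) throws IOException {
        GenericDatumReader<List<Long>> datumReader =
            new GenericDatumReader<List<Long>>();
        try (DataFileReader<List<Long>> fileReader =
                 new DataFileReader<List<Long>>(file, datumReader)) {
            return new ArrayList<Long>(fileReader.next()); // copy into a plain list
        }
    }

    // Approach 2: raw blocking encoder. No header and no Schema are written.
    static void writeBlocking(OutputStream out, List<Long> datum) throws IOException {
        GenericDatumWriter<List<Long>> datumWriter =
            new GenericDatumWriter<List<Long>>(SCHEMA);
        BinaryEncoder encoder = EncoderFactory.get().blockingBinaryEncoder(out, null);
        datumWriter.write(datum, encoder);
        encoder.flush();
    }

    // Reading approach 2: the caller has to supply the Schema itself.
    static List<Long> readBlocking(byte[] bytes) throws IOException {
        GenericDatumReader<List<Long>> datumReader =
            new GenericDatumReader<List<Long>>(SCHEMA);
        BinaryDecoder decoder =
            DecoderFactory.get().binaryDecoder(new ByteArrayInputStream(bytes), null);
        return new ArrayList<Long>(datumReader.read(null, decoder));
    }

    public static void main(String[] args) throws IOException {
        List<Long> datum = Arrays.asList(1L, 2L, 3L);

        File file = File.createTempFile("sketch", ".avro");
        file.deleteOnExit();
        writeContainer(file, datum);
        System.out.println("container read-back: " + readContainer(file));

        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeBlocking(out, datum);
        System.out.println("blocking read-back:  " + readBlocking(out.toByteArray()));
    }
}
```

The first pair of methods gives me a self-describing file but fails on very large Datums; the second pair handles the large Datum but produces a bare byte stream that DataFileReader cannot open.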
