Here are some utility functions we've used for serialization to and from JSON.
Something similar should work for binary.
public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
    String avroEncodedJson = null;
    try {
        // instanceof is false for null, so this also covers object == null.
        if (!(object instanceof SpecificRecord)) {
            return null;
        }
        @SuppressWarnings("unchecked")
        T record = (T) object;
        Schema schema = ((SpecificRecord) record).getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush();
        // The JSON encoder emits UTF-8, so decode with that charset explicitly.
        avroEncodedJson = out.toString("UTF-8");
    } catch (IOException e) {
        e.printStackTrace();
    }
    return avroEncodedJson;
}
public <T> T jsonDecodeToAvro(String inputString, Class<T> className, Schema schema) {
    T returnObject = null;
    try {
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, inputString);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(className);
        returnObject = reader.read(null, jsonDecoder);
    } catch (IOException e) {
        e.printStackTrace();
    }
    return returnObject;
}
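
For binary, a minimal sketch along the same lines might look like the
following. It assumes every type involved is an Avro-generated class
(so it implements SpecificRecord) and that the Avro library is on the
classpath. The class name AvroBinaryUtil and the method names toBytes and
fromBytes are made up for illustration; only the Avro calls themselves
come from the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;
import org.apache.avro.specific.SpecificDatumReader;
import org.apache.avro.specific.SpecificDatumWriter;
import org.apache.avro.specific.SpecificRecord;

// Hypothetical helper class; the name is an assumption, not part of Avro.
public final class AvroBinaryUtil {

    private AvroBinaryUtil() {
    }

    // Serializes any generated record to Avro binary, entirely in memory.
    public static <T extends SpecificRecord> byte[] toBytes(T record) throws IOException {
        // The writer takes its schema from the record itself.
        SpecificDatumWriter<T> writer = new SpecificDatumWriter<T>(record.getSchema());
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        writer.write(record, encoder);
        // binaryEncoder buffers its output, so flush before reading the bytes.
        encoder.flush();
        return out.toByteArray();
    }

    // Deserializes Avro binary back into the generated class given by clazz.
    public static <T extends SpecificRecord> T fromBytes(byte[] bytes, Class<T> clazz)
            throws IOException {
        // SpecificDatumReader looks up the schema from the generated class
        // and instantiates that class rather than a GenericData.Record.
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return reader.read(null, decoder);
    }
}
```

With that in place, the round trip from the original question would be
byte[] avroBytes = AvroBinaryUtil.toBytes(x); followed by
MyObject x2 = AvroBinaryUtil.fromBytes(avroBytes, MyObject.class); with no
cast needed. The ClassCastException in the quoted message comes from
GenericDatumReader, which always builds GenericData.Record instances;
SpecificDatumReader builds the generated class instead.
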
Dave
From: [email protected] [mailto:[email protected]] On Behalf Of
Gary Steelman
Sent: Tuesday, February 18, 2014 4:21 PM
To: [email protected]
Subject: General-Purpose Serialization and Deserialization for Avro-Generated
SpecificRecords
Hi all,
Here's my use case: I've got a bunch of different Java objects generated from
Avro schema files. So the class definition headers look something like this:
public class MyObject extends org.apache.avro.specific.SpecificRecordBase
implements org.apache.avro.specific.SpecificRecord. I've got many other types
than MyObject too. I need to write a method which can serialize (from MyObject
or another class to byte[]) and deserialize (from byte[] to MyObject or another
class) in memory (not writing to disk).
I couldn't figure out how to write one method to handle it for SpecificRecord,
so I tried serializing/deserializing these things as GenericRecord instead:
public static byte[] serializeFromAvro(GenericRecord gr) {
    try {
        DatumWriter<GenericRecord> writer2 =
                new GenericDatumWriter<GenericRecord>(gr.getSchema());
        ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
        BinaryEncoder encoder2 = EncoderFactory.get().directBinaryEncoder(bao2, null);
        writer2.write(gr, encoder2);
        byte[] avroBytes2 = bao2.toByteArray();
        return avroBytes2;
    } catch (IOException e) {
        LOG.debug(e);
        return null;
    }
}
// Here I use a DataType enum and the AvroSchemaFactory to quickly retrieve a
// Schema object for a supported DataType.
public static GenericRecord deserializeFromAvro(byte[] avroBytes, DataType dataType) {
    try {
        Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
        DatumReader<GenericRecord> reader2 =
                new GenericDatumReader<GenericRecord>(schema);
        ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
        BinaryDecoder decoder2 = DecoderFactory.get().directBinaryDecoder(bai2, null);
        GenericRecord gr2 = reader2.read(null, decoder2);
        return gr2;
    } catch (Exception e) {
        LOG.debug(e);
        return null;
    }
}
And I use them like this:
// Remember MyObject is the SpecificRecord implementing class.
MyObject x = new MyObject();
byte[] avroBytes = serializeFromAvro(x);
MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);
Which results in this:
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot
be cast to datatypes.generated.avro.MyObject
Is there an easier way to achieve my use case, or some way I can fix my methods
to allow the sort of behavior I want?
Thanks,
Gary