Hey all, I've adapted Dave's solution to serialize to/from byte[] rather
than JSON. Thanks a lot! The two methods are below:
@SuppressWarnings("unchecked")
public static <T> byte[] avroSerialize(Class<T> clazz, Object object) {
    byte[] ret = null;
    try {
        // Only Avro-generated SpecificRecord instances are supported.
        if (object == null || !(object instanceof SpecificRecord)) {
            return null;
        }
        T record = (T) object;
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush();
        ret = out.toByteArray();
    } catch (IOException e) {
        // Serialization failed; the caller gets null.
        LOG.debug(e);
    }
    return ret;
}
public static <T> T avroDeserialize(byte[] avroBytes, Class<T> clazz,
        Schema schema) {
    // Note: schema is not used here; the reader takes its schema from clazz.
    T ret = null;
    try {
        ByteArrayInputStream in = new ByteArrayInputStream(avroBytes);
        Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
        ret = reader.read(null, d);
    } catch (IOException e) {
        // Deserialization failed; the caller gets null.
        LOG.debug(e);
    }
    return ret;
}
And they're called like so:

    MyObject x = new MyObject();
    byte[] avroBytes = avroSerialize(x.getClass(), x);
    MyObject y = avroDeserialize(avroBytes, MyObject.class, MyObject.SCHEMA$);
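
For anyone who finds this thread later: as far as I can tell, the ClassCastException in my original message (quoted below) comes down to the fact that a Java cast never converts an object. GenericDatumReader builds GenericData$Record instances, so casting its result to MyObject cannot succeed, whereas SpecificDatumReader is handed the class and builds instances of that class. Here is a minimal plain-Java sketch of the same situation. It does not use Avro at all; CastDemo, Rec, GenericRec, MyRec, readGeneric and readSpecific are made-up names for illustration only.

```java
// Plain-Java analogy of the generic-vs-specific reader behaviour.
// None of these types are Avro classes; all names are hypothetical.
final class CastDemo {

    // Stand-in for a common record interface.
    interface Rec {
        String name();
    }

    // Stand-in for the untyped record that a generic reader builds.
    static final class GenericRec implements Rec {
        public String name() { return "generic"; }
    }

    // Stand-in for a generated class such as MyObject.
    static final class MyRec implements Rec {
        public String name() { return "specific"; }
    }

    // Like a generic reader: always builds GenericRec, whatever the caller wants.
    static Rec readGeneric() {
        return new GenericRec();
    }

    // Like a reader that is given a class token: builds the requested class.
    static <T extends Rec> T readSpecific(Class<T> clazz) {
        try {
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + clazz, e);
        }
    }

    // Returns true if casting the generic result to MyRec throws.
    static boolean genericCastFails() {
        try {
            MyRec r = (MyRec) readGeneric(); // compiles, but fails at runtime
            return r == null;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(genericCastFails());               // prints true
        System.out.println(readSpecific(MyRec.class).name()); // prints specific
    }
}
```

The class token plays the same role as the clazz argument in the two methods above: it tells the reader which concrete type to build, so no cast from the generic type is needed afterwards.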
Thanks,
Gary
On Tue, Feb 18, 2014 at 6:49 PM, Gary Steelman <[email protected]> wrote:
> Thank you Dave, I appreciate it. I'll give those a shot and let you know
> how it goes.
>
> -Gary
> On Feb 18, 2014 6:45 PM, "Dave McAlpin" <[email protected]> wrote:
>
>> Here are some utility functions we've used for serialization to and
>> from JSON. Something similar should work for binary.
>>
>> public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
>>     String avroEncodedJson = null;
>>     try {
>>         if (object == null || !(object instanceof SpecificRecord)) {
>>             return null;
>>         }
>>         T record = (T) object;
>>         Schema schema = ((SpecificRecord) record).getSchema();
>>         ByteArrayOutputStream out = new ByteArrayOutputStream();
>>         Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
>>         SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
>>         w.write(record, e);
>>         e.flush();
>>         avroEncodedJson = new String(out.toByteArray());
>>     } catch (IOException e) {
>>         e.printStackTrace();
>>     }
>>
>>     return avroEncodedJson;
>> }
>>
>> public <T> T jsonDecodeToAvro(String inputString, Class<T> className,
>>         Schema schema) {
>>     T returnObject = null;
>>     try {
>>         JsonDecoder jsonDecoder =
>>                 DecoderFactory.get().jsonDecoder(schema, inputString);
>>         SpecificDatumReader<T> reader =
>>                 new SpecificDatumReader<T>(className);
>>         returnObject = reader.read(null, jsonDecoder);
>>     } catch (IOException e) {
>>         e.printStackTrace();
>>     }
>>
>>     return returnObject;
>> }
>>
>> Dave
>>
>>
>>
>> *From:* [email protected] [mailto:[email protected]] *On
>> Behalf Of *Gary Steelman
>> *Sent:* Tuesday, February 18, 2014 4:21 PM
>> *To:* [email protected]
>> *Subject:* General-Purpose Serialization and Deserialization for
>> Avro-Generated SpecificRecords
>>
>>
>>
>> Hi all,
>>
>> Here's my use case: I have a number of Java classes generated from Avro
>> schema files, so each class declaration looks something like this: public
>> class MyObject extends org.apache.avro.specific.SpecificRecordBase
>> implements org.apache.avro.specific.SpecificRecord. There are many
>> generated types besides MyObject. I need a single method that can
>> serialize any of them to byte[] and another that can deserialize byte[]
>> back to the right class, all in memory (no writing to disk).
>>
>> I couldn't figure out how to write one method to handle it for
>> SpecificRecord, so I tried serializing/deserializing these things as
>> GenericRecord instead:
>>
>> public static byte[] serializeFromAvro(GenericRecord gr) {
>>     try {
>>         DatumWriter<GenericRecord> writer2 =
>>                 new GenericDatumWriter<GenericRecord>(gr.getSchema());
>>         ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
>>         BinaryEncoder encoder2 =
>>                 EncoderFactory.get().directBinaryEncoder(bao2, null);
>>         writer2.write(gr, encoder2);
>>         byte[] avroBytes2 = bao2.toByteArray();
>>         return avroBytes2;
>>     } catch (IOException e) {
>>         LOG.debug(e);
>>         return null;
>>     }
>> }
>>
>> // Here I use a DataType enum and the AvroSchemaFactory to quickly
>> // retrieve a Schema object for a supported DataType.
>> public static GenericRecord deserializeFromAvro(byte[] avroBytes,
>>         DataType dataType) {
>>     try {
>>         Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
>>         DatumReader<GenericRecord> reader2 =
>>                 new GenericDatumReader<GenericRecord>(schema);
>>         ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
>>         BinaryDecoder decoder2 =
>>                 DecoderFactory.get().directBinaryDecoder(bai2, null);
>>         GenericRecord gr2 = reader2.read(null, decoder2);
>>         return gr2;
>>     } catch (Exception e) {
>>         LOG.debug(e);
>>         return null;
>>     }
>> }
>>
>> And I use them like so:
>>
>> // Remember MyObject is the SpecificRecord implementing class.
>> MyObject x = new MyObject();
>> byte[] avroBytes = serializeFromAvro(x);
>> MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);
>>
>> Which results in this:
>> java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record
>> cannot be cast to datatypes.generated.avro.MyObject
>>
>> Is there an easier way to achieve my use case, or some way I can fix my
>> methods to allow the sort of behavior I want?
>>
>> Thanks,
>>
>> Gary
>>
>