That's great, Gary. Thanks for the follow-up.

Dave

From: [email protected] [mailto:[email protected]] On Behalf Of 
Gary Steelman
Sent: Tuesday, February 18, 2014 5:15 PM
To: Gary Steelman
Cc: [email protected]
Subject: Re: General-Purpose Serialization and Deserialization for 
Avro-Generated SpecificRecords

Hey all, I've adapted Dave's solution to serialize to/from byte[] rather than 
JSON. Thanks a lot! The two methods are below:

  @SuppressWarnings("unchecked")
  public static <T> byte[] avroSerialize(Class<T> clazz, Object object) {
    byte[] ret = null;
    try {
      if (object == null || !(object instanceof SpecificRecord)) {
        return null;
      }

      T record = (T) object;
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      Encoder e = EncoderFactory.get().directBinaryEncoder(out, null);
      SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
      w.write(record, e);
      e.flush();
      ret = out.toByteArray();
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }

  public static <T> T avroDeserialize(byte[] avroBytes, Class<T> clazz, Schema schema) {
    T ret = null;
    try {
      ByteArrayInputStream in = new ByteArrayInputStream(avroBytes);
      Decoder d = DecoderFactory.get().directBinaryDecoder(in, null);
      SpecificDatumReader<T> reader = new SpecificDatumReader<T>(clazz);
      ret = reader.read(null, d);
    } catch (IOException e) {
      LOG.debug(e);
    }

    return ret;
  }
And they're called like so:
MyObject x = new MyObject();
byte[] avroBytes = avroSerialize(x.getClass(), x);
MyObject y = avroDeserialize(avroBytes, MyObject.class, MyObject.SCHEMA$);
Thanks,
Gary

On Tue, Feb 18, 2014 at 6:49 PM, Gary Steelman 
<[email protected]<mailto:[email protected]>> wrote:

Thank you Dave, I appreciate it. I'll give those a shot and let you know how it 
goes.

-Gary
On Feb 18, 2014 6:45 PM, "Dave McAlpin" 
<[email protected]<mailto:[email protected]>> wrote:
Here are some utility functions we've used for serialization to and from JSON. 
Something similar should work for binary.

public <T> String avroEncodeAsJson(Class<T> clazz, Object object) {
    String avroEncodedJson = null;
    try {
        if (object == null || !(object instanceof SpecificRecord)) {
            return null;
        }
        T record = (T) object;
        Schema schema = ((SpecificRecord) record).getSchema();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        Encoder e = EncoderFactory.get().jsonEncoder(schema, out);
        SpecificDatumWriter<T> w = new SpecificDatumWriter<T>(clazz);
        w.write(record, e);
        e.flush();
        // Decode explicitly as UTF-8 instead of relying on the platform default charset.
        avroEncodedJson = out.toString("UTF-8");
    } catch (IOException e) {
        e.printStackTrace();
    }

    return avroEncodedJson;
}

public <T> T jsonDecodeToAvro(String inputString, Class<T> className, Schema schema) {
    T returnObject = null;
    try {
        JsonDecoder jsonDecoder = DecoderFactory.get().jsonDecoder(schema, inputString);
        SpecificDatumReader<T> reader = new SpecificDatumReader<T>(className);
        returnObject = reader.read(null, jsonDecoder);
    } catch (IOException e) {
        e.printStackTrace();
    }

    return returnObject;
}

Dave

From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]<mailto:[email protected]>] On Behalf Of 
Gary Steelman
Sent: Tuesday, February 18, 2014 4:21 PM
To: [email protected]<mailto:[email protected]>
Subject: General-Purpose Serialization and Deserialization for Avro-Generated 
SpecificRecords

Hi all,
Here's my use case: I've got a bunch of different Java objects generated from 
Avro schema files. So the class definition headers look something like this: 
public class MyObject extends org.apache.avro.specific.SpecificRecordBase 
implements org.apache.avro.specific.SpecificRecord. I've got many other types 
than MyObject too. I need to write a method which can serialize (from MyObject 
or another class to byte[]) and deserialize (from byte[] to MyObject or another 
class) in memory (not writing to disk).
I couldn't figure out how to write one method to handle it for SpecificRecord, 
so I tried serializing/deserializing these things as GenericRecord instead:

  public static byte[] serializeFromAvro(GenericRecord gr) {
    try {
      DatumWriter<GenericRecord> writer2 = new GenericDatumWriter<GenericRecord>(gr.getSchema());
      ByteArrayOutputStream bao2 = new ByteArrayOutputStream();
      BinaryEncoder encoder2 = EncoderFactory.get().directBinaryEncoder(bao2, null);
      writer2.write(gr, encoder2);
      byte[] avroBytes2 = bao2.toByteArray();
      return avroBytes2;
    } catch (IOException e) {
      LOG.debug(e);
      return null;
    }
  }
  // Here I use a DataType enum and the AvroSchemaFactory to quickly retrieve a
  // Schema object for a supported DataType.
  public static GenericRecord deserializeFromAvro(byte[] avroBytes, DataType dataType) {
    try {
      Schema schema = AvroSchemaFactory.getInstance().getSchema(dataType);
      DatumReader<GenericRecord> reader2 = new GenericDatumReader<GenericRecord>(schema);
      ByteArrayInputStream bai2 = new ByteArrayInputStream(avroBytes);
      BinaryDecoder decoder2 = DecoderFactory.get().directBinaryDecoder(bai2, null);
      GenericRecord gr2 = reader2.read(null, decoder2);
      return gr2;
    } catch (Exception e) {
      LOG.debug(e);
      return null;
    }
  }
And I use them like so:
// Remember MyObject is the SpecificRecord implementing class.
MyObject x = new MyObject();
byte[] avroBytes = serializeFromAvro(x);
MyObject x2 = (MyObject) deserializeFromAvro(avroBytes, DataType.MyObject);
Which results in this:
java.lang.ClassCastException: org.apache.avro.generic.GenericData$Record cannot 
be cast to datatypes.generated.avro.MyObject
Is there an easier way to achieve my use case, or some way I can fix my methods 
to allow the sort of behavior I want?
Thanks,
Gary
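
A note on why the cast in this message fails, as a minimal plain-Java sketch. The exception text shows that the generic reader handed back a GenericData$Record, which is a sibling of MyObject rather than a subclass of it, so no cast can turn one into the other. The names Rec, GenericRec, MyRec, and ReaderSketch below are hypothetical stand-ins and not Avro classes. The reflective instantiation only illustrates how a class-aware reader can return the requested type; it is an assumption for illustration, not Avro's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a common record interface.
interface Rec {
    void put(String name, Object value);
    Object get(String name);
}

// Stand-in for a generic, map-backed record.
class GenericRec implements Rec {
    private final Map<String, Object> fields = new HashMap<>();
    public void put(String name, Object value) { fields.put(name, value); }
    public Object get(String name) { return fields.get(name); }
}

// Stand-in for a generated class such as MyObject: a sibling of GenericRec.
class MyRec implements Rec {
    private Object id;
    public MyRec() {}
    public void put(String name, Object value) { if ("id".equals(name)) id = value; }
    public Object get(String name) { return "id".equals(name) ? id : null; }
}

class ReaderSketch {
    // Generic-style read: always builds the generic container.
    static Rec readGeneric(Map<String, Object> data) {
        Rec r = new GenericRec();
        for (Map.Entry<String, Object> e : data.entrySet()) r.put(e.getKey(), e.getValue());
        return r;
    }

    // Specific-style read: instantiates the class the caller asked for.
    static <T extends Rec> T readSpecific(Map<String, Object> data, Class<T> clazz) {
        try {
            T r = clazz.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> e : data.entrySet()) r.put(e.getKey(), e.getValue());
            return r;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> data = new HashMap<>();
        data.put("id", 42);

        // No cast needed: the returned object really is a MyRec.
        MyRec ok = readSpecific(data, MyRec.class);
        System.out.println("specific read, id = " + ok.get("id"));

        try {
            // Compiles, but the runtime type is GenericRec, so the cast throws.
            MyRec bad = (MyRec) readGeneric(data);
            System.out.println("unexpected: " + bad.get("id"));
        } catch (ClassCastException e) {
            System.out.println("generic read cannot be cast to MyRec");
        }
    }
}
```

The same reasoning suggests that a reader constructed with the generated class, rather than a cast applied after a generic read, is the direction to look in.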


