Hi Doug,

On 2012-08-15, at 3:31 PM, Doug Cutting wrote:
> On Wed, Aug 15, 2012 at 11:00 AM, Ryan Slobojan <[email protected]> wrote:
>> I'm trying to figure out how to integrate some existing Thrift objects into
>> an Avro-generated object, and haven't been able to find any pointers in the
>> docs - as background, the project in question has recently adopted Avro as a
>> new serialization standard, however there is still quite a bit of legacy
>> code (and data) which uses Thrift, so the hope is that we can package a
>> fairly small Thrift object in as a field on an AVDL-generated Avro object. I
>> came across the org.apache.avro.thrift package
>> (http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/thrift/package-summary.html)
>> and see that reading and writing of Thrift objects is supported (as well as
>> the test which shows it in action at
>> http://svn.apache.org/viewvc/avro/tags/release-1.7.1/lang/java/thrift/src/test/java/org/apache/avro/thrift/TestThrift.java?view=markup),
>> however it's unclear to me if (or how) I can point at a Thrift object within
>> an AVDL.
>
> That's not currently possible.

K, good to know - glad that I didn't just miss docs somewhere or something like that :)

>> Assuming that this AVDL embedding isn't possible (something tells me that
>> ThriftDatum[Reader|Writer] being based off of GenericDatumReader means it's
>> only meant for runtime use, and not as part of AVDL compilation), what would
>> you recommend as the best approach towards achieving this? It would be
>> possible to duplicate the Thrift object's schema in AVDL and create a second
>> AVDL-based version of that object, but would there be a clean way to convert
>> back and forth between the two representations without needing to add a
>> bunch of extra code? The existing legacy code *really* wants a Thrift
>> object, so I need to somehow get from the Avro object to that, preferably in
>> the cleanest way possible - any pointers would be greatly appreciated.
>
> You might alias the AVDL-based record names to the Thrift-based record
> names, serialize the Thrift object to a buffer using ThriftDatumWriter
> then deserialize it using SpecificDatumReader. Would that work for
> you?

So assuming I understand correctly, my approach would be:

* Create an AVDL with the same fields as the Thrift object, but in a different package (since both compiled classes need to exist within the same JVM)
* Write my Thrift object into an intermediate Encoder (say DirectBinaryEncoder, since it's in-memory) which writes to an OutputStream
* Read in from that OutputStream (wrapped in an InputStream) using my AVDL-generated class's readObject
* Store that Avro object as a field on the parent Avro object (like any other Avro subobject)

And the reverse would be similar, in order to get the Thrift object I need for the legacy code from the stored Avro object. Sound right?

Another question which comes to mind - ThriftData has the ability to get an Avro Schema for a Thrift object (http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/thrift/ThriftData.html#getSchema(java.lang.Class)), can I use that in conjunction with http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/SchemaNormalization.html#toParsingForm(org.apache.avro.Schema) to ensure that the schemas of the source and destination objects are in fact the same? Or do you think there would be issues with that approach?

Thanks!

Ryan Slobojan
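
P.S. To make the buffer round trip in the steps above concrete, here is a minimal sketch of its shape. It is not the Avro/Thrift code itself: plain java.io streams stand in for ThriftDatumWriter and SpecificDatumReader so that it runs without either library on the classpath, and LegacyPoint, AvroPoint and BufferRoundTrip are hypothetical names made up for illustration. The one detail it pins down is the hand-off in the third step: the bytes are taken from a ByteArrayOutputStream and a new ByteArrayInputStream is opened over them.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Stand-in sketch only: java.io streams play the roles that ThriftDatumWriter
// and SpecificDatumReader would play. All class names here are hypothetical.
final class BufferRoundTrip {

    // Hypothetical stand-in for the Thrift-generated class.
    static final class LegacyPoint {
        final int id;
        final String label;
        LegacyPoint(int id, String label) { this.id = id; this.label = label; }
    }

    // Hypothetical stand-in for the AVDL-generated class with the same fields.
    static final class AvroPoint {
        final int id;
        final String label;
        AvroPoint(int id, String label) { this.id = id; this.label = label; }
    }

    // Write the source object, field by field, into an in-memory buffer.
    static byte[] write(LegacyPoint p) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DataOutputStream data = new DataOutputStream(out)) {
            data.writeInt(p.id);
            data.writeUTF(p.label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Read the same bytes back as the other class. The fields must be read
    // in the order they were written, since the bytes carry no field names.
    static AvroPoint read(byte[] bytes) {
        try (DataInputStream data =
                 new DataInputStream(new ByteArrayInputStream(bytes))) {
            int id = data.readInt();
            String label = data.readUTF();
            return new AvroPoint(id, label);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        AvroPoint copy = read(write(new LegacyPoint(7, "origin")));
        System.out.println(copy.id + " " + copy.label); // prints "7 origin"
    }
}
```

The reverse direction would have the same shape with the two classes swapped.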
