Hi Doug,

On 2012-08-15, at 3:31 PM, Doug Cutting wrote:

> On Wed, Aug 15, 2012 at 11:00 AM, Ryan Slobojan <[email protected]> wrote:
>> I'm trying to figure out how to integrate some existing Thrift objects into
>> an Avro-generated object, and haven't been able to find any pointers in the
>> docs - as background, the project in question has recently adopted Avro as a
>> new serialization standard, however there is still quite a bit of legacy
>> code (and data) which uses Thrift, so the hope is that we can package a
>> fairly small Thrift object in as a field on an AVDL-generated Avro object. I
>> came across the org.apache.avro.thrift package
>> (http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/thrift/package-summary.html)
>> and see that reading and writing of Thrift objects is supported (as well as
>> the test which shows it in action at
>> http://svn.apache.org/viewvc/avro/tags/release-1.7.1/lang/java/thrift/src/test/java/org/apache/avro/thrift/TestThrift.java?view=markup),
>> however it's unclear to me if (or how) I can point at a Thrift object within
>> an AVDL.
> 
> That's not currently possible.

OK, good to know. Glad that I didn't just miss it in the docs somewhere :)

>> Assuming that this AVDL embedding isn't possible (something tells me that
>> ThriftDatum[Reader|Writer] being based off of GenericDatumReader means it's
>> only meant for runtime use, and not as part of AVDL compilation), what would
>> you recommend as the best approach towards achieving this? It would be
>> possible to duplicate the Thrift object's schema in AVDL and create a second
>> AVDL-based version of that object, but would there be a clean way to convert
>> back and forth between the two representations without needing to add a
>> bunch of extra code? The existing legacy code *really* wants a Thrift
>> object, so I need to somehow get from the Avro object to that, preferably in
>> the cleanest way possible - any pointers would be greatly appreciated.
> 
> You might alias the AVDL-based record names to the Thrift-based record
> names, serialize the Thrift object to a buffer using ThriftDatumWriter
> then deserialize it using SpecificDatumReader.  Would that work for
> you?

So assuming I understand correctly, my approach would be:

* Create an AVDL with the same fields as the Thrift object, but in a different 
package (since both compiled classes need to exist within the same JVM)
* Write my Thrift object with ThriftDatumWriter through an intermediate Encoder 
(say DirectBinaryEncoder) into an in-memory OutputStream such as a 
ByteArrayOutputStream
* Read those bytes back (wrapped in a ByteArrayInputStream) using 
SpecificDatumReader with my AVDL-generated class, as you suggested
* Store that Avro object as a field on the parent Avro object (like any other 
Avro subobject)

And the reverse would be similar, in order to get the Thrift object I need for 
the legacy code from the stored Avro object.
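
To make the hand-off between the write and read steps concrete, here's a rough 
stdlib-only sketch. It is not Avro code: DataOutputStream and DataInputStream 
are stand-ins for the Avro Encoder and Decoder, and the BufferHandoff class, its 
method names and the two sample fields are made up for illustration. The only 
point is that the written bytes are pulled out of a ByteArrayOutputStream and 
handed to a ByteArrayInputStream, since an OutputStream can't be wrapped 
directly.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration class; none of these names come from Avro or Thrift.
final class BufferHandoff {

    // Write side: stand-in for a datum writer plus an Encoder over an OutputStream.
    static byte[] writeToBuffer(int id, String name) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            DataOutputStream encoder = new DataOutputStream(out);
            encoder.writeInt(id);
            encoder.writeUTF(name);
            encoder.flush();
            // Copy the bytes out of the stream; this is the actual hand-off.
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Read side: stand-in for a datum reader plus a Decoder over an InputStream.
    static String readFromBuffer(byte[] bytes) {
        try {
            DataInputStream decoder =
                new DataInputStream(new ByteArrayInputStream(bytes));
            int id = decoder.readInt();
            String name = decoder.readUTF();
            return id + ":" + name;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = writeToBuffer(42, "legacy");
        System.out.println(readFromBuffer(bytes)); // prints "42:legacy"
    }
}
```

In the real version I'd expect the write side to go through ThriftDatumWriter 
and the read side through SpecificDatumReader, per your suggestion.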

Sound right?

Another question which comes to mind: ThriftData can return an Avro Schema for 
a Thrift class 
(http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/thrift/ThriftData.html#getSchema(java.lang.Class)).
Can I use that in conjunction with 
http://avro.apache.org/docs/1.7.1/api/java/org/apache/avro/SchemaNormalization.html#toParsingForm(org.apache.avro.Schema)
to ensure that the schemas of the source and destination objects are in fact 
the same? Or do you think there would be issues with that approach?

Thanks!

Ryan Slobojan
