Re: schema by reference

Doug Cutting Tue, 06 Dec 2011 11:48:14 -0800

On 12/06/2011 11:14 AM, Neil Davudo wrote:
> Yes, by a URL. Messages would be smaller in size when we store large numbers 
> of them, and we can always get the schema using the reference if necessary. 
> Similar to what we can do with WSDL having a reference to the XSD.


This is a reasonable thing to do.

A schema can easily be constructed from a URL with:

Schema.parse(url.openStream())

although one would probably want a cache in front of this.
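
A minimal sketch of such a cache, assuming schemas are keyed by the exact
URL string. SchemaCache and its loader argument are hypothetical names, not
part of Avro; the loader is kept generic so the class depends only on the
JDK.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.Function;

// Hypothetical helper (not an Avro class): remembers the value the loader
// produced for each URL string so the URL is fetched and parsed only once.
final class SchemaCache<S> {
    private final ConcurrentMap<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> loader;

    SchemaCache(Function<String, S> loader) {
        this.loader = loader;
    }

    // Returns the cached value for this URL, invoking the loader on a miss.
    // If the loader throws, nothing is cached and the exception propagates.
    S get(String url) {
        return cache.computeIfAbsent(url, loader);
    }
}

// With Avro on the classpath, the loader could wrap the call shown above:
//
//   SchemaCache<Schema> schemas = new SchemaCache<>(u -> {
//       try (InputStream in = new URL(u).openStream()) {
//           return Schema.parse(in);
//       } catch (IOException e) {
//           throw new UncheckedIOException(e);
//       }
//   });
```

Caching by URL is only safe if the schema behind a given URL never changes.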

Note that in Avro one must ensure that the version of the schema at
the reference does not change, that it is identical to the version used
to write the datum.  So one should probably not use a logical URL
for a datatype like http://me.com/schemas/FooRecord but rather a unique
ID like http://me.com/schemas/9fd73.
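
One possible way to get an ID that changes whenever the schema changes is to
derive it from the schema text itself. The sketch below assumes a SHA-256
hash of the schema JSON is acceptable as the ID; SchemaIds and idFor are
hypothetical names, and nothing above requires the ID to be a hash.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: names a schema by a hash of its JSON text, so any
// edit to the text (even whitespace) produces a different ID.
final class SchemaIds {
    static String idFor(String schemaJson) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(schemaJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b)); // two hex digits per byte
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }
}
```

The resulting string (or a prefix of it) could then be appended to a base
such as http://me.com/schemas/ to form the reference.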

If you're using a database (e.g., HBase) then you can have a table
of schemas, then, in other tables, store values annotated with the key
of the entry in the schema table.  https://github.com/spullara/havrobase
is one example of such an approach.

Or one might use a URL shortener for this, e.g.:

http://tinyurl.com/8a4rppd

redirects to

avro:///?{"type":"record","name":"foo","fields":[]}

One could then install a URL handler for "avro" URLs that resolves them
to their query string.
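
The resolution step can be sketched as a plain string function, assuming the
query string carries the raw, unencoded schema JSON exactly as in the example
above. AvroUrls and schemaJson are hypothetical names; a real handler would
more likely be a java.net.URLStreamHandler subclass wrapping this logic.

```java
// Hypothetical helper: extracts the schema JSON from an "avro" URL of the
// form avro:///?<schema-json>.
final class AvroUrls {
    private static final String PREFIX = "avro:///?";

    // Returns everything after the '?', i.e. the query string.
    // Assumes the JSON is not percent-encoded, as in the example above.
    static String schemaJson(String url) {
        if (!url.startsWith(PREFIX)) {
            throw new IllegalArgumentException("not an avro URL: " + url);
        }
        return url.substring(PREFIX.length());
    }
}
```

The returned string can then be handed to the schema parser in place of a
stream opened from the URL.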

Doug
