On 12/06/2011 11:14 AM, Neil Davudo wrote:
> Yes, by a URL. Messages would be smaller in size when we store large numbers
> of them, and we can always get the schema using the reference if necessary.
> Similar to what we can do with WSDL having a reference to the XSD.

This is a reasonable thing to do. A schema can easily be constructed from a URL with:

  Schema.parse(url.openStream())

although one would probably want a cache in front of this.

Note that in Avro one must ensure that the schema at the reference never changes: it must be identical to the version used to write the datum. So one should probably not use a logical URL for a datatype, like http://me.com/schemas/FooRecord, but rather a unique ID, like http://me.com/schemas/9fd73.

If you're using a database (e.g., HBase) then you can have a table of schemas and then, in other tables, store values annotated with the key of the entry in the schema table. https://github.com/spullara/havrobase is one example of such an approach.

Or one might use a URL shortener for this, e.g.:

  http://tinyurl.com/8a4rppd

redirects to

  avro:///?{"type":"record","name":"foo","fields":[]}

One could then install a URL handler for "avro" URLs that resolves them to their query string.

Doug
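
For illustration, the cache in front of Schema.parse mentioned above might look roughly like the sketch below. The class name SchemaCache and the loader function are hypothetical and not part of Avro; the type parameter S stands in for org.apache.avro.Schema so that the sketch compiles without the Avro jar. It assumes every URL names exactly one immutable schema version, as discussed above, so entries are never evicted or refreshed.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/**
 * Hypothetical helper: caches parsed schemas by the URL they were loaded from.
 * S stands in for org.apache.avro.Schema so the sketch has no Avro dependency.
 */
final class SchemaCache<S> {
    private final Map<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> loader;

    // With Avro on the classpath, the loader could be something like:
    //   url -> {
    //       try (InputStream in = new URL(url).openStream()) {
    //           return Schema.parse(in);
    //       } catch (IOException e) {
    //           throw new UncheckedIOException(e);
    //       }
    //   }
    SchemaCache(Function<String, S> loader) {
        this.loader = loader;
    }

    /** Returns the schema for this URL, invoking the loader only on first use. */
    S get(String url) {
        // Caching without expiry is only safe because each URL is assumed
        // to identify a single schema version that never changes.
        return cache.computeIfAbsent(url, loader);
    }
}
```

A reader would then call cache.get("http://me.com/schemas/9fd73") once per datum and hit the network only the first time that ID is seen. If a logical URL such as http://me.com/schemas/FooRecord were used instead, this cache could serve a stale schema, which is another reason to prefer unique IDs.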
