Avro in several places requires that schemas be self-contained.  For
example, when reading a data file, the schema that was used to write it
must be available and should not be dynamically reconstructed from
references to schemas in the reader's environment.  So, if such a
registry were implemented, it should perhaps only be used when parsing
schemas, not when printing them, and, even then, only in some contexts.

It's thus perhaps safer and simpler to handle this as a preprocessing
step.  One can, e.g., use a preprocessor like cpp or m4 to generate
schemas from input files.  For example, one could have a file named
md5.avph that contains:

#define MD5 {"type": "record", ... }

And another file named Foo.avpp that contains:

#include "md5.avph"

{"type": "record", "name":"Foo", "fields": [
  {"name":"checksum", "type": MD5 }
 ]
}

Then your build process can run cpp over the .avpp files to generate
the .avsc files that Avro can use.
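
For instance, running something like "cpp -P Foo.avpp > Foo.avsc"
(cpp's -P flag suppresses the line markers it would otherwise emit,
which would not be valid JSON) should expand the MD5 macro in place,
yielding a self-contained schema along the lines of:

{"type": "record", "name": "Foo", "fields": [
  {"name": "checksum", "type": {"type": "record", ... } }
 ]
}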

Also note that Avro IDL supports imports:

http://avro.apache.org/docs/current/idl.html#imports

One can use an IDL file with no messages to define a set of types.
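
For example (the file and protocol names here are only illustrative),
an md5.avdl with no messages might declare a shared type:

@namespace("org.example")
protocol MD5Types {
  fixed MD5(16);
}

and a Foo.avdl might then import it:

@namespace("org.example")
protocol FooTypes {
  import idl "md5.avdl";

  record Foo {
    MD5 checksum;
  }
}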

Doug

On 03/23/2011 11:56 AM, [email protected] wrote:
> Hi,
> 
> Is there a Java implementation of an Avro schema registry? The use case is
> to have separate schema files for a bunch of types and to be able to
> resolve nested types.
> 
> I tried Avro for the first time and could not get a schema parsed from
> one file to reference a nested record defined in a second file.
> 
> I am using a modified version of the AvroUtil class from
> http://www.infoq.com/articles/ApacheAvro . The modified file is attached.
> It uses the SchemaParseException and loads schema files from the classpath.
> 
> Is there a better alternative? If this is a strong use case, I could work
> on creating such a schema registry with pluggable resolvers and loaders.
> 
> Thanks and regards,
>  - Ashish
