> This full schema should either go with the data (data files) or in a registry (e.g. HAvroBase).
Isn't the latter what they want? A registry? Presumably the RPC framework
implements such a registry, since it can look schemas up by their hashcode.

On Thu, Mar 24, 2011 at 2:04 PM, Scott Carey <[email protected]> wrote:
> There is danger in this.
>
> What is the schema used for in this case? There are three common reasons
> for assembling a schema:
> 1. Assembling the schema that represents the format of the data to be
> written.
> 2. Assembling the schema that represents the way a reader wishes to view
> the data. (a.k.a. 'reader' or 'expected' schema).
> 3. Assembling the schema that represents the way that some data was
> persisted.
>
> If you are persisting data, you should persist the _entire_ schema used to
> write that data as well. This full schema should either go with the data
> (data files) or in a registry (e.g. HAvroBase). A schema name reference
> is not sufficient -- you lose the ability to evolve the referenced schema.
>
> What if the version of the nested schema has changed? Now you have a data
> file that refers to a nested schema by name "com.navteq.avro.FacebookUser"
> and finds a schema with that name through some resolution mechanism. If
> that resolution mechanism is not version-aware, you're in trouble.
>
> So for #3, assembling schema fragments by reference is dangerous and
> complicated.
> Making the resolution mechanism version-aware is problematic but doable.
> You can manually version every schema with a number, and use that, but
> then you are manually versioning schemas and storing the version meta-data
> in the schemas.
>
> Avro by nature versions schemas by equivalence. The natural way to encode
> a schema version is to write the schema itself.
>
> In short: Any such registry would have to be version-aware if it is used
> to assemble schemas for use case #3 above, and the schemas that refer to
> these versions would also have to be version-aware. It is much simpler to
> just embed the schemas.
>
> Use cases #1 and #2 above are essentially the assembly of the 'current'
> schema version, and a registry could work. Avro does not have many
> built-in tools for this. Generally, avsc, avpr, or avdl files are used as
> schema source for 'schema first' design, and 'code first' design persists
> the current schema in the code.
> avdl files support includes; avsc and avpr are more primitive.
>
>
> On 3/23/11 10:21 PM, "Ashish Shinde" <[email protected]> wrote:
>
> >Hi,
> >
> >My use case is very similar to the nested schema in
> >the test case AvroUtilsTest on http://www.infoq.com/articles/ApacheAvro
> >
> >The only difference is I would like to automatically load schemas from
> >resources in classpath and also automatically load schemas
> >for nested types.
> >
> >If you look at the test example mentioned above, if I ask the
> >"AvroSchemaRegistry" for a schema named
> >com.navteq.avro.FacebookSpecialUser it should also load the nested
> >com.navteq.avro.FacebookUser schema using some resolving and loading
> >mechanism.
> >
> >Thanks and regards,
> >- Ashish
> >
> >On Thu, 24 Mar 2011 10:38:20 +0800
> >Felix Xu <[email protected]> wrote:
> >
> >> Hi, I don't quite understand the question.
> >> Can you give an example of your schema?
> >>
> >> 2011/3/24 <[email protected]>
> >>
> >> > Hi,
> >> >
> >> > Is there some Java implementation of an Avro schema registry? The use
> >> > case is to have separate schema data files for a bunch of types and
> >> > be able to resolve nested types.
> >> >
> >> > I tried Avro for the first time and could not get a schema parsed
> >> > from one file to include a nested record from a schema described in a
> >> > second file.
> >> >
> >> > I am using a modified version of the AvroUtil class from
> >> > http://www.infoq.com/articles/ApacheAvro . The modified file is
> >> > attached. It uses the SchemaParse exception and loads schema files
> >> > from classpath.
> >> >
> >> > Is there a better alternative? If this is a strong use case I could
> >> > work on creating such a schema registry with pluggable resolvers and
> >> > loaders.
> >> >
> >> > Thanks and regards,
> >> > - Ashish
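
To make the name-versus-hashcode point in this thread concrete, here is a minimal sketch of a registry that stores each schema both by name and by a hash of its text. Everything in it is an assumption made for illustration: the class FingerprintRegistry, its methods, and the example field names are hypothetical, it uses only the Java standard library rather than any Avro API, and it hashes the raw schema text with SHA-256, whereas a real registry would more likely hash a normalized form of the schema.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not part of Avro, and not the AvroSchemaRegistry from the article.
final class FingerprintRegistry {
    // Keyed by a hash of the schema text, so every distinct version keeps its own entry.
    private final Map<String, String> byFingerprint = new HashMap<>();
    // Keyed by full name only, so a newer version silently replaces the older one.
    private final Map<String, String> byName = new HashMap<>();

    /** Stores the schema text under both keys and returns its fingerprint. */
    String register(String fullName, String schemaJson) {
        String fp = fingerprint(schemaJson);
        byFingerprint.put(fp, schemaJson);
        byName.put(fullName, schemaJson);
        return fp;
    }

    String lookupByFingerprint(String fp) { return byFingerprint.get(fp); }

    String lookupByName(String fullName) { return byName.get(fullName); }

    /** SHA-256 over the raw UTF-8 schema text, hex-encoded (an assumption of this sketch). */
    static String fingerprint(String schemaJson) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(schemaJson.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Two versions of the same named record; the field names are made up.
        String v1 = "{\"type\":\"record\",\"name\":\"FacebookUser\","
                + "\"namespace\":\"com.navteq.avro\","
                + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"}]}";
        String v2 = "{\"type\":\"record\",\"name\":\"FacebookUser\","
                + "\"namespace\":\"com.navteq.avro\","
                + "\"fields\":[{\"name\":\"name\",\"type\":\"string\"},"
                + "{\"name\":\"num_likes\",\"type\":\"int\"}]}";

        FingerprintRegistry registry = new FingerprintRegistry();
        String fp1 = registry.register("com.navteq.avro.FacebookUser", v1);
        registry.register("com.navteq.avro.FacebookUser", v2);

        // Data written with v1 can still find its writer schema by fingerprint...
        System.out.println(v1.equals(registry.lookupByFingerprint(fp1)));
        // ...but a lookup by name now resolves to v2 only.
        System.out.println(v2.equals(registry.lookupByName("com.navteq.avro.FacebookUser")));
    }
}
```

In this sketch a data file that stored only the name would resolve to whichever version was registered last, while one that stored the fingerprint (or the full schema) still reaches the schema it was written with.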
