Thanks Scott and Doug, see follow up below.

On Tue, Aug 9, 2011 at 11:42 AM, Scott Carey wrote:

> On 8/9/11 11:15 AM, "Bill Graham" wrote:
>
> Hi,
>
> I'm trying to create a schema that references a type defined in another
> schema and I'm having some troubles. Is there an easy way to do this?
>
> My test schemas look like this:
>
> $ cat position.avsc
> {"type":"enum", "name": "Position", "namespace": "avro.examples.baseball",
>  "symbols": ["P", "C", "B1", "B2", "B3", "SS", "LF", "CF", "RF", "DH"]
> }
>
> $ cat player.avsc
> {"type":"record", "name":"Player", "namespace": "avro.examples.baseball",
>  "fields": [
>    {"name": "number", "type": "int"},
>    {"name": "first_name", "type": "string"},
>    {"name": "last_name", "type": "string"},
>    {"name": "position", "type": {"type": "array", "items":
>      "avro.examples.baseball.Position"} }
>  ]
> }
>
> I've read this thread (
> http://apache-avro.679487.n3.nabble.com/How-to-reference-previously-defined-enum-in-avsc-file-td2663512.html)
> and tried using IDL like so with no luck:
>
> $ cat baseball.avdl
> @namespace("avro.examples.baseball")
> protocol Baseball {
>   import schema "position.avsc";
>   import schema "player.avsc";
> }
>
> $ java -jar avro-tools-1.5.1.jar idl baseball.avdl baseball.avpr
> Exception in thread "main" org.apache.avro.SchemaParseException: Undefined name: "avro.examples.baseball.Position"
>     at org.apache.avro.Schema.parse(Schema.java:979)
>     at org.apache.avro.Schema.parse(Schema.java:1052)
>     at org.apache.avro.Schema.parse(Schema.java:1021)
>     at org.apache.avro.Schema.parse(Schema.java:884)
>     at org.apache.avro.compiler.idl.Idl.ImportSchema(Idl.java:388)
>     at org.apache.avro.compiler.idl.Idl.ProtocolBody(Idl.java:320)
>     at org.apache.avro.compiler.idl.Idl.ProtocolDeclaration(Idl.java:206)
>     at org.apache.avro.compiler.idl.Idl.CompilationUnit(Idl.java:84)
>     ...
>
> I agree that the documentation indicates that this should work. I suspect
> that it may not be able to resolve dependencies among imports. That is if
> Baseball depends on position, and on player, it works.
> But since player depends on position, it does not. The import statement
> pulls in each item individually for use in composite things in the
> AvroIDL, but does not allow for interdependencies in the imports.
> This seems worthy of a JIRA enhancement request. I'm sure the project will
> accept a patch that adds this.

Done: https://issues.apache.org/jira/browse/AVRO-872

> I also saw this blog post (
> http://www.infoq.com/articles/ApacheAvro#_ftnref6_7758) where the author
> had to write some nasty String.replace(..) code to combine schemas, but
> there's got to be a better way than this.
>
> We need to improve the ability to import multiple files when parsing.
> Using the lower level Avro API you can parse the files yourself in an order
> that will work.
> I have simply put all my types in one file. If you made one avsc file with
> both Position and Player in a JSON array it will compile. It would look
> like:
> [
>   < position schema here>,
>   < player schema here>
> ]

Yes, I've used this approach in the past. Initially I was thinking that I
could write something to combine multiple files into a single InputStream
facade that generates a union like you describe, which could then be parsed.
I could then hold a handle to the union schema and provide a method to get a
given schema type (e.g. the Player) by name. This is better than the
String.replace(..) approach, but still a bit hacky.

> Using the lower level Avro API you can parse the files yourself in an order
> that will work.

How exactly would the approach of parsing the files in reverse-dependency
order work? This is something I'd like to explore and maybe contribute a
helper for. I've tried a few combinations of this approach to no avail:

    Schema schema1 = Schema.parse(new File("examples/java/avro/position.avsc"));
    Schema schema2 = schema1.parse(new File("examples/java/avro/player.avsc"));

> Also FYI, it seems enum values can't start with numbers (e.g. '1B').
> Is this a known issue or a feature? I haven't seen it documented anywhere.
> You get an error like this if the value starts with a number:
>
> org.apache.avro.SchemaParseException: Illegal initial character
>
> Enums are a named type. The enum names must start with [A-Za-z_] and
> subsequently contain only [A-Za-z0-9_].
> http://avro.apache.org/docs/1.5.1/spec.html#Names

I hadn't noticed that before, thanks.

> However, the spec does not say that the values must have such restrictions.
> This may be a bug, can you file a JIRA ticket?

Done: https://issues.apache.org/jira/browse/AVRO-871

> Thanks!
>
> -Scott
>
> thanks,
> Bill
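
For reference, the "all types in one JSON array" workaround discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class and method names (SchemaUnionText, combine) are made up and are not part of Avro, the schema text is assumed to be already read into strings, the example schemas are trimmed versions of the ones in this thread, and the caller is assumed to list each dependency before the type that uses it (Position before Player). The result is plain text that would still have to be handed to the Avro parser.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Avro: builds the text of a JSON array
// (an Avro union) from schema definitions that are already in memory.
final class SchemaUnionText {

    // The caller must list dependencies first (Position before Player);
    // this method does no dependency analysis of its own.
    static String combine(List<String> schemasInDependencyOrder) {
        return "[\n" + String.join(",\n", schemasInDependencyOrder) + "\n]";
    }

    public static void main(String[] args) {
        // Trimmed versions of the schemas from this thread (shorter symbol
        // and field lists), inlined so the sketch runs without any files.
        String position = "{\"type\":\"enum\", \"name\":\"Position\", "
            + "\"namespace\":\"avro.examples.baseball\", "
            + "\"symbols\":[\"P\", \"C\", \"SS\"]}";
        String player = "{\"type\":\"record\", \"name\":\"Player\", "
            + "\"namespace\":\"avro.examples.baseball\", \"fields\":["
            + "{\"name\":\"number\", \"type\":\"int\"}, "
            + "{\"name\":\"position\", \"type\":{\"type\":\"array\", "
            + "\"items\":\"avro.examples.baseball.Position\"}}]}";

        // Prints one JSON array with Position defined before Player.
        System.out.println(combine(Arrays.asList(position, player)));
    }
}
```

This keeps the ordering decision explicit in the caller, which is the part the String.replace(..) approach hides. It does not discover dependencies on its own.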
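
The naming rule quoted above (first character in [A-Za-z_], the rest in [A-Za-z0-9_]) can be checked with a small stand-alone sketch. The class and method names here (AvroNameCheck, isValidName) are hypothetical and this is not Avro's own validation code; it only mirrors the rule as quoted from the spec. Whether that rule should apply to enum symbols at all is the open question tracked in AVRO-871.

```java
import java.util.regex.Pattern;

// Hypothetical stand-alone check that mirrors the name rule quoted from the
// spec; it is not the validation code Avro itself runs.
final class AvroNameCheck {

    // First character: letter or underscore. Remaining characters: letters,
    // digits, or underscores.
    private static final Pattern NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    static boolean isValidName(String candidate) {
        return candidate != null && NAME.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // "1B" starts with a digit, so it fails the rule; "B1" and "_1B" pass.
        for (String symbol : new String[] {"B1", "1B", "SS", "_1B"}) {
            System.out.println(symbol + " -> " + isValidName(symbol));
        }
    }
}
```

Under this rule a symbol such as "1B" is rejected because of its leading digit, which matches the "Illegal initial character" error reported above, while "B1" is accepted.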
