[ 
https://issues.apache.org/jira/browse/AVRO-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13651288#comment-13651288
 ] 

Scott Carey commented on AVRO-1325:
-----------------------------------

Below are the limitations that concern me from AVRO-1274, in approximate 
priority of my concern.

# Arbitrary properties are not supported, for example {"type":"string", 
"avro.java.string":"String"} cannot be built.
# SchemaBuilder.INT and other constants are public.  Unfortunately, these are 
mutable, and anyone could call addProp() on these, affecting others.
# Scopes are confusing, it is not always obvious when a 
# Does not chain to nested types.  Although there is limited chaining for 
record fields, nested calls to the builder are required which prevents 
supporting namespace nesting or other passing of context from outer to inner 
scopes.
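
The shared-constant concern in the second item can be illustrated with a minimal, self-contained sketch. It does not use the Avro classes: MiniSchema, SharedConstantSketch, and intType() are hypothetical stand-ins, and only the addProp()/getProp() naming mirrors the schema API mentioned above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a schema object that carries mutable properties.
final class MiniSchema {
    private final String type;
    private final Map<String, String> props = new LinkedHashMap<>();

    MiniSchema(String type) { this.type = type; }

    void addProp(String key, String value) { props.put(key, value); }
    String getProp(String key) { return props.get(key); }
    String getType() { return type; }
}

final class SharedConstantSketch {
    // A shared constant: every caller sees the same mutable instance.
    static final MiniSchema INT = new MiniSchema("int");

    // One possible alternative (an assumption, not the patch): a fresh instance per call.
    static MiniSchema intType() { return new MiniSchema("int"); }

    public static void main(String[] args) {
        // Caller A annotates what it believes is its own int schema...
        INT.addProp("example.prop", "A");
        // ...and caller B, using the same constant, now sees A's property.
        System.out.println("shared: " + INT.getProp("example.prop"));

        // With a factory method, A's property stays local to A's instance.
        MiniSchema mine = intType();
        mine.addProp("example.prop", "A");
        System.out.println("fresh:  " + intType().getProp("example.prop"));
    }
}
```

The factory method is only one way to avoid the leak; making the shared instances immutable would be another.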


I have a prototype patch that builds on the work in AVRO-1274.  The major 
changes are to how scopes are handled for fields and unions.  Adding property 
support on top of AVRO-1274 is not trivial, because it is ambiguous what a 
call to add a property would apply to (the field, or the type of the field?).

The following schema:
{code:json}
  
{"type":"record","name":"HandshakeRequest","namespace":"org.apache.avro.ipc","fields":[
    {"name":"clientHash","type":{"type":"fixed","name":"MD5","size":16}},
    {"name":"clientProtocol","type":[
      "null",
      {"type":"string","avro.java.string":"String"}]},
    {"name":"serverHash","type":"MD5"},
    {"name":"meta","type":[
      "null",
      {"type":"map","values":"bytes","avro.java.string":"String"}]}
  ]}
{code}
looks like this in the builder:
{code}
  Schema result = SchemaBuilder
    .recordType("HandshakeRequest").namespace("org.apache.avro.ipc").fields()
      .name("clientHash").type().fixed("MD5").size(16).noDefault()
      .name("clientProtocol").type().unionOf()
        .nullType().and()
        .stringWith().prop("avro.java.string", "String").endString().endUnion().noDefault()
      .name("serverHash").type("MD5")
      .name("meta").type().unionOf()
        .nullType().and()
        .map().prop("avro.java.string", "String").values().bytesType().endUnion().withDefault(null)
      .record();
{code}

It supports the same feature set that JSON schemas do:
  * nesting of namespaces ("MD5" above automatically picks up the 
"org.apache.avro.ipc" namespace)
  * references to named types by name (.type("MD5") above, for serverHash)
It also enforces other rules:
  * a union default must match the first type in the union
  * properties, doc(), namespace, and aliases work only in the contexts where 
they are supported.
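
The union default rule can be sketched as a standalone check. This is a hypothetical helper, not code from the patch: UnionDefaultSketch, matches(), and checkUnionDefault() are illustrative names, and only a few primitive type names are handled.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Avro: checks a union default against the first branch.
final class UnionDefaultSketch {

    // Very small type check covering a few primitive type names only.
    static boolean matches(String typeName, Object value) {
        switch (typeName) {
            case "null":   return value == null;
            case "string": return value instanceof String;
            case "int":    return value instanceof Integer;
            case "long":   return value instanceof Long;
            default:       return false; // other types are out of scope for this sketch
        }
    }

    // Rejects a default value that does not match the first branch of the union.
    static void checkUnionDefault(List<String> branches, Object defaultValue) {
        if (branches.isEmpty()) {
            throw new IllegalArgumentException("union has no branches");
        }
        String first = branches.get(0);
        if (!matches(first, defaultValue)) {
            throw new IllegalArgumentException(
                "default must match the first union branch: " + first);
        }
    }

    public static void main(String[] args) {
        // ["null", "string"] with a null default is accepted.
        checkUnionDefault(Arrays.asList("null", "string"), null);
        try {
            // The same union with a string default is rejected: "string" is not first.
            checkUnionDefault(Arrays.asList("null", "string"), "x");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```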

Supported features are scoped with many internal nested types.  For example, 
the field assembler returned by the record builder's fields() method has only 
two methods: name(String) and record().  name(String) returns a type builder 
for the field, which has prop(String, String) for the field and a method for 
each available type, such as map().  A call to map() returns a map builder, 
which has prop(String, String) again, but this time for the map.  values() 
ends the use of the map builder, changing scope to the nested value type and 
returning to the fields assembler when that is complete.
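
The scoping idea can be sketched in plain Java without the Avro classes. Everything below is an assumption for illustration: ScopedBuilderSketch and its nested types are hypothetical, the result is a descriptive string rather than a Schema, and only bytes and map types are modeled. The point is that each scope exposes only the calls that make sense there, so prop() is unambiguous.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All names here are hypothetical; this is not the Avro API or the patch.
final class ScopedBuilderSketch {

    /** Entry point: starts a record and returns the field-level scope. */
    static FieldAssembler record(String name) { return new FieldAssembler(name); }

    /** Field scope: can only start a new field or finish the record. */
    static final class FieldAssembler {
        private final String recordName;
        private final List<String> fields = new ArrayList<>();
        FieldAssembler(String recordName) { this.recordName = recordName; }

        FieldTypeBuilder name(String fieldName) { return new FieldTypeBuilder(this, fieldName); }

        String record() { return "record " + recordName + " " + fields; }
    }

    /** Type-selection scope for one field: prop() here applies to the field. */
    static final class FieldTypeBuilder {
        private final FieldAssembler parent;
        private final String fieldName;
        private final Map<String, String> fieldProps = new LinkedHashMap<>();
        FieldTypeBuilder(FieldAssembler parent, String fieldName) {
            this.parent = parent;
            this.fieldName = fieldName;
        }

        FieldTypeBuilder prop(String key, String value) { fieldProps.put(key, value); return this; }

        MapBuilder map() { return new MapBuilder(this); }

        FieldAssembler bytesType() { return complete("bytes"); }

        // Records the finished field and returns to the field scope.
        FieldAssembler complete(String typeText) {
            parent.fields.add(fieldName + fieldProps + ": " + typeText);
            return parent;
        }
    }

    /** Map scope: prop() here applies to the map type, not the field. */
    static final class MapBuilder {
        private final FieldTypeBuilder field;
        private final Map<String, String> mapProps = new LinkedHashMap<>();
        MapBuilder(FieldTypeBuilder field) { this.field = field; }

        MapBuilder prop(String key, String value) { mapProps.put(key, value); return this; }

        ValueTypeBuilder values() { return new ValueTypeBuilder(this); }
    }

    /** Value scope: choosing the value type closes the map and returns to the fields. */
    static final class ValueTypeBuilder {
        private final MapBuilder map;
        ValueTypeBuilder(MapBuilder map) { this.map = map; }

        FieldAssembler bytesType() {
            return map.field.complete("map" + map.mapProps + "<bytes>");
        }
    }

    public static void main(String[] args) {
        String result = record("HandshakeRequest")
            .name("meta").prop("example.field.prop", "x")
                .map().prop("avro.java.string", "String").values().bytesType()
            .name("raw").bytesType()
            .record();
        System.out.println(result);
    }
}
```

Because FieldAssembler has no prop() method and MapBuilder has no name() method, a misplaced call fails at compile time instead of silently applying to the wrong element.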


h4. Remaining Work
* Not all primitive types are supported yet (trivial)
* Shortcut methods need to be added for common use cases such as an optional 
field.
* Naming of some things needs review; it would be easier if enum, int, long, 
default, etc. were not reserved Java keywords :)
* Javadoc is nearly absent.
* There is some room for pushing more common work into parent types.
* Tests
* Attempt to replace the Schema.Parser logic with it, at minimum to test for 
areas of improvement or missing features.
* No protocol support yet (e.g. error, protocol, request, response).  It 
probably makes sense to extend this to cover all Avro things, including fields 
and protocols.

I want to checkpoint the work so far and gather feedback.
                
> Enhanced Schema Builder API
> ---------------------------
>
>                 Key: AVRO-1325
>                 URL: https://issues.apache.org/jira/browse/AVRO-1325
>             Project: Avro
>          Issue Type: Bug
>            Reporter: Scott Carey
>            Assignee: Scott Carey
>             Fix For: 1.7.5
>
>
> The schema builder from AVRO-1274 has a few key limitations.  I have proposed 
> changes to make before it is released and the public API is locked in.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
