[
https://issues.apache.org/jira/browse/AVRO-1325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13654852#comment-13654852
]
Scott Carey commented on AVRO-1325:
-----------------------------------
I'll need to clean up the doc a little w.r.t. complexity in a couple places.
BaseTypeBuilder has three overloads for type(), FieldBuilder has three more due
to defaults.
What do you think would be confusing to the causal user? What could not be
satisfied with javadoc? I expect users to have IDEs set up with Maven so that
javadoc is available at autocomplete/suggestion. Are you concerned that will
not be the case for most users?
We could rename the string name reference variants for FieldBuilder's to
typeReference(). I named all of them type() because when writing the doc, it
was easy to say:
{code}
/**
* Builds a Field in the context of a {@link FieldAssembler}.
*
* Usage is to first configure any of the optional parameters and then to
call one
* of the type methods to complete the field. For example
* <pre>
* .namespace("org.apache.example").orderDecending().type()
* </pre>
* Optional parameters for a field are namespace, doc, order, and aliases.
*/
{code}
We could change the name of the ones that select a name by reference to
"typeRef", or remove the ones that select default values and instead force the
user to call an additional noDefaults() or withDefault() method afterwards to
reduce the number of methods named 'type'.
For a generic type builder, used by map and array, values() and items() returns
a builder that has three variants of type() on it, these could be rolled in to
the map and array instead:
{code}
map().values().intType()
map().values().type("MD5")
map().values().type(someSchema)
{code}
we could have:
{code}
map().values().intType()
map().values("MD5")
map().values(someSchema)
{code}
I did not do this because it shared more code and consistency -- SchemaBuilder
itself has the same API.
{quote}
What is the difference between intWith() and intType()
{quote}
I need feedback/suggestions on naming and API here.
The javadoc for intType would say "select an int type without custom
properties. A shortcut for intWith().endInt()". The javadoc for intWith would
say "return an int type builder for creating an int with properties. If
properties are not required use the #intType() shortcut".
intWith() exists only for the case where you need to add a property to the int,
which is uncommon, so I wanted a shortcut for the common case. I did not want
to have the context for optional properties to bleed into the following context
after type selection, since that adds extra methods to the later context and
applies to doc() and namespace() as well. After selecting a type, the context
either returns to an earlier scope or to a field default selection. In the
former case, properties are ambiguous with the outer scope (a field, array,
map, or record context). In the latter case, having a prop() method in scope
is not applicable to default value selection.
I decided to have the methods available in any scope be unambiguous and
correspond with the JSON declaration and the spec. This reduces how many
methods are available in all contexts significantly from the current version in
trunk and is intended to prevent user error.
Alternatively we could name "intWith() to "intBuilder()", or "intWithProps()",
or change intType() to be the builder, and have intSimple() for the shortcut,
but I wanted the common case to be the most obvious. I think the naming and
documentation here could certainly be improved. Making it clear to a user
which "intXYZ" (and the similar methods for all other primitive types) is
critical. The vast majority of the time users will not need to set properties
on primitive types.
{quote}
I think this is easy to add to the existing SchemaBuilder by adding an
addProp(String key, String value) method to the builders (RecordBuilder,
FieldBuilder, ArrayBuilder). For FieldBuilder we could also have a
addPropToType(String key, String value) method to distinguish between
properties that are added to the underlying type, not the field itself.
{quote}
It is complicated and confusing to scope properties correctly. The context for
setting a property is easily ambiguous in all cases where it is not contained
in its own scope. For example:
{code}
{"type":"record", "name":"Rec", "recProp":"r", "fields": [
{"name":"locations", "fieldProp":"f", "type": {"type":"map", "mapProp":"m",
"values":{"type":"string", "valProp":"v"}}}
]}
{code}
What would this look like with the API in trunk now if extended to support
properties? I had trouble reasoning about making it obvious to the user when
the field's type contexts bled into the record context. If you add the ability
to chain the builder to build a nested type in the map, it gets even more
messy. The scoping and nesting allows for chaining, and thus propagation of
default namespace and references by name to schemas already defined by the
builder. I plan to try this out in Schema.Parser() to simplify that code and
share logic related to Schemas and the spec.
In the proposal here (slightly tweaked locally), the schema above is:
{code}
SchemaBuilder.type().record("Rec").prop("recProp","r").fields()
.name("locations").prop("fieldProp","f").type().map().prop("mapProp",
"m").values().stringWith().prop("valProp","v").endString().noDefault()
.endRecord();
{code}
This is very similar to the JSON. I am hoping that it is very easy for
developers familiar with the JSON spec to use.
It is not possible to call prop(String, String) in a context that is ambiguous,
as it is only available immediately after declaring a type, before completing
the required information for the type.
* For records, after record(String name) but before fields()
* For fields, after name(String name) but before one of the type() variants
* For map, after map() but before values()
* For types in general, after the creation of the type builder (stringWith(),
map(), array(), record(), etc) but before its completion or delegation to
another type builder (endString(), values(), items(), fields()).
The same principle is applied to namespace() doc(), order(), etc to prevent
ambiguity and limit the number of methods available in any context.
> Enhanced Schema Builder API
> ---------------------------
>
> Key: AVRO-1325
> URL: https://issues.apache.org/jira/browse/AVRO-1325
> Project: Avro
> Issue Type: Bug
> Reporter: Scott Carey
> Assignee: Scott Carey
> Fix For: 1.7.5
>
> Attachments: AVRO-1325-preliminary.patch
>
>
> The schema builder from AVRO-1274 has a few key limitations. I have proposed
> changes to make before it is released and the public API is locked in.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira