[
https://issues.apache.org/jira/browse/AVRO-839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089946#comment-13089946
]
James Baldassari commented on AVRO-839:
---------------------------------------
Makes sense, Doug. I've made the changes you suggested. I also had to make
the generic/specific builders similar to the
GenericDatumReader/SpecificDatumReader in that the base class has a GenericData
and the specific subclass passes a SpecificData to the super constructor.
I created some new unit tests for RecordBuilderBase#isValidValue(Field,
Object). These can be found in org.apache.avro.data.RecordBuilderBaseTest in
lang/java/avro. The one that addresses your specific concern about unions is
RecordBuilderBaseTest#testIsValidValueWithUnion(). In order to make it easier
to unit test the isValidValue() method I left it static, but if you'd prefer to
make that method non-static I'll find another way to test it.
I'll have a new patch ready shortly. Oh, and I just noticed AVRO-784 today. I
haven't had a chance to look at the patch yet, but is there any overlap in
functionality between 784 and 839 with the bean-style accessors/mutators? It
sounded very similar from the description. If it turns out to be basically the
same, then I can just wait for that to be committed and then merge in the
changes.
> Implement builder pattern in generated record classes that sets default
> values when omitted
> -------------------------------------------------------------------------------------------
>
> Key: AVRO-839
> URL: https://issues.apache.org/jira/browse/AVRO-839
> Project: Avro
> Issue Type: Improvement
> Components: java
> Reporter: James Baldassari
> Assignee: James Baldassari
> Fix For: 1.6.0
>
> Attachments: AVRO-839-v2.patch, AVRO-839-v3.patch, AVRO-839.patch
>
>
> This is an idea for an improvement to the SpecificCompiler-generated record
> classes. There are two main issues to address:
> # Default values specified in schemas are only used at read time, not when
> writing/serializing records. For example, a NullPointerException is thrown
> when attempting to write a record that has an uninitialized array or string
> type. I'm sure this was done for good reasons, like giving users maximum
> control and preventing unnecessary garbage collection, but I think it's also
> somewhat confusing and unintuitive for new users (myself included).
> # Users have to create their own factory classes/methods for every record
> type, both to ensure that all non-primitive members are initialized and to
> facilitate the construction and initialization of record instances (i.e.
> constructing and setting values in a single statement).
> These issues have been discussed previously here:
> * [http://search-hadoop.com/m/iDVTn1JVeSR1]
> * AVRO-726
> * AVRO-770
> * [http://search-hadoop.com/m/JuY1V16pwxh1]
> I'd like to propose a solution that is used by at least one other messaging
> framework. For each generated record class there will be a public static
> inner class called Builder. The Builder inner class has the same fields as
> the record class, as well as accessors and mutators for each of these fields.
> Whenever a mutator method is called, the Builder sets a boolean flag
> indicating that the field has been set. All mutators return a reference to
> 'this', so it's possible to chain a series of setter invocations, which makes
> it really easy to construct records in a single statement. The Builder also
> has a build() method which constructs a record instance using the values that
> were set in the Builder. When the build() method is invoked, if there are
> any fields that have not been set but have default values as defined in the
> schema, the Builder will set the values of these fields using their defaults.
> One nice thing about implementing the builder pattern in a static inner
> Builder class rather than in the record itself is that this enhancement will
> be completely backwards-compatible with existing code. The record class
> itself would not change, and the public fields would still be there, so
> existing code would still work. Users would have the option to use the
> Builder or continue constructing records manually. Eventually the public
> fields could be phased out, and the record would be made immutable. All
> changes would have to be done through the Builder.
> Here is an example of what this might look like:
> {code}
> // Person.newBuilder() returns a new Person.Builder instance
> // All Person.Builder setters return 'this' allowing us to chain set calls
> together for convenience
> // Person.Builder.build() returns a Person instance after setting any
> uninitialized values that have defaults
> Person me =
> Person.newBuilder().setName("James").setCountry("US").setState("MA").build();
> // We still have direct access to Person's members, so the records are
> backwards-compatible
> me.state = "CA";
> // Person has accessor methods now so that the public fields can be phased
> out later
> System.out.println(me.getState());
> // No NPE here because the array<Person> field that stores this person's
> friends has been automatically
> // initialized by the Builder to a new java.util.ArrayList<Person> due to a
> @java_class annotation in the IDL
> System.out.println(me.getFriends().size());
> {code}
> What do people think about this approach? Any other ideas?
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira