> Creating default fields for objects would have performance
> issues if they are new instances -- new Utf8() and
> new YourClassHere() are not free.
Indeed, for the read scenario (although I guess clients will still need to
check all marshaled fields for null, just in case). For writes, if you're
concerned about performance, an alternative would be either to not require
the writer to send those fields, or to have the DatumWriters populate the
data lazily. I gather the former would be a subtle design change, and the
latter would be a patch to the writers so that the write* methods leverage
the schema.
Bill
Scott Carey wrote:
Creating default fields for objects would have performance issues if they are new
instances -- new Utf8() and new YourClassHere() are not free. So
'foo = new Utf8("foo");' is not right, but assignment from a static default would
be fine. But unless these objects are immutable, a client could change the default.
Ideally, the specific and generic APIs can handle this better. A getter can
return a default value if its field is null, or generated classes can be more
sophisticated, removing default constructors and providing constructors or
static factory methods that require a user to provide all 'default-less' fields
up front. In the long run I'd like to have this sort of more powerful and
flexible generated object, with various user-configured options, but it is not
there yet.
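As a rough illustration of the getter and factory ideas above, here is a
hand-written sketch. It is not what the specific compiler generates; the class
name MessageSketch, the fields foo and id, and the use of String in place of
Utf8 are all assumptions made for the example. String is used because it is
immutable, so one shared default instance is safe to hand out.

```java
// Hand-written sketch; MessageSketch, foo and id are hypothetical names,
// not output of the Avro specific compiler.
final class MessageSketch {
    // One shared default instance. String is immutable, so a client
    // cannot change the default through the returned reference.
    private static final String DEFAULT_FOO = "foo";

    private String foo;     // field with a schema default; may stay null
    private final long id;  // field without a default; must be supplied

    // No public default constructor: the factory below is the only way in.
    private MessageSketch(long id) {
        this.id = id;
    }

    // Static factory that requires every default-less field up front.
    static MessageSketch withId(long id) {
        return new MessageSketch(id);
    }

    // Getter falls back to the shared default when the field is unset.
    String getFoo() {
        return foo != null ? foo : DEFAULT_FOO;
    }

    void setFoo(String foo) {
        this.foo = foo;
    }

    long getId() {
        return id;
    }
}
```

No object is allocated for the default until a caller sets its own value, which
avoids the per-instance cost mentioned at the top of this message.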
Also, consider how Unions complicate things here. Right now they are not so
fun to deal with unless it is only a union of NULL and one other type. Client
code has to know the exact classes/types to inspect to resolve the union.
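To show the kind of inspection client code ends up doing, here is a minimal
sketch for a hypothetical union of null, string and long. The class
UnionSketch and the method branchOf are made up for the example, and plain
Java types stand in for whatever classes the API actually hands back.

```java
// Sketch only: resolves a hypothetical union of null, string and long
// by inspecting the runtime class of the value.
final class UnionSketch {
    static String branchOf(Object value) {
        if (value == null) {
            return "null";
        }
        if (value instanceof CharSequence) {
            return "string";
        }
        if (value instanceof Long) {
            return "long";
        }
        // Anything else is not a member of this particular union.
        throw new IllegalArgumentException(
            "value is not a member of the union: " + value.getClass().getName());
    }
}
```

Every new branch added to the union means another check here, which is why a
union of NULL and one other type is the only comfortable case.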
Until there are enhancements to the API, if you are concerned with users
putting garbage in, I suggest you write a wrapper class that handles this.
Have users use that class rather than the classes generated by the specific
compiler.
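A minimal sketch of such a wrapper follows. GeneratedMessage is only a stand-in
for a class from the specific compiler, CheckedMessage is a hypothetical name,
and treating the empty string as invalid is an assumption about what counts as
garbage in your schema.

```java
// Stand-in for a class produced by the specific compiler (hypothetical).
final class GeneratedMessage {
    public CharSequence foo;
}

// Wrapper that users touch instead of GeneratedMessage.
final class CheckedMessage {
    private final GeneratedMessage inner = new GeneratedMessage();

    // Reject unwanted values when they are set, not at write time.
    CheckedMessage setFoo(CharSequence foo) {
        if (foo == null || foo.length() == 0) {
            throw new IllegalArgumentException("foo must be non-empty");
        }
        inner.foo = foo;
        return this;
    }

    // Hand the record to the writer only once every field is set.
    GeneratedMessage toRecord() {
        if (inner.foo == null) {
            throw new IllegalStateException("foo was never set");
        }
        return inner;
    }
}
```

Usage would look like 'new CheckedMessage().setFoo("bar").toRecord()', with the
result passed to the writer as usual.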
On Jun 7, 2010, at 3:11 PM, Bill de hOra wrote:
Scott Carey wrote:
> No, it should not initialize the field to the default.
> Default values are for readers, not writers. The intended use case is
> schema evolution.
This means writers can't leverage schema defaults, so writers should do
something like this?
    Message message = new Message();
    // no defaults set
    String quux = message.getSchema()
                         .getField("foo")
                         .defaultValue()
                         .getTextValue();
    message.foo = new Utf8(quux);
[ignoring that the writer needs to know the schema type]. I suspect
people will just write in garbage (like empty strings).
> A writer must always correctly provide data for all of the fields in the
> schema it declared it is writing.
Why is it incorrect to not provide defaults when defaults are part of
the schema author's intention? Or put another way, why is reader/writer
asymmetry a goal under a given schema?
I see in the code that SpecificDatumWriter/GenericDatumWriter are passed
the Schema. By all means crash on fields with no defaults, but I'm not
clear on what harm is done by using default field data. The current code
seems fragile in comparison.
Bill
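To make the suggestion concrete, here is a minimal sketch of a writer-side pass
that fills unset fields from the defaults and crashes on fields that have
none. This is not how SpecificDatumWriter or GenericDatumWriter behave; the
class DefaultFillingSketch, the method fill, and the use of a field list plus a
name-to-default map in place of a real Schema are all assumptions made for the
example.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a field list plus a name-to-default map stand in for a schema.
final class DefaultFillingSketch {
    // Returns a copy of record with unset fields filled from defaults;
    // throws when an unset field has no default to fall back on.
    static Map<String, Object> fill(List<String> fields,
                                    Map<String, Object> defaults,
                                    Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : fields) {
            Object value = record.get(name);
            if (value == null) {
                // containsKey, not get: a declared default may itself be null.
                if (!defaults.containsKey(name)) {
                    throw new IllegalStateException(
                        "no value and no default for field: " + name);
                }
                value = defaults.get(name);
            }
            out.put(name, value);
        }
        return out;
    }
}
```

The defaults are looked up only for fields the caller left unset, so nothing is
allocated per record for fields that already carry data.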