> Creating default fields for objects would have performance
> issues if they are new instances -- new Utf8() and
> new YourClassHere() are not free.

Indeed, for the read scenario (although I guess clients will still need to check all marshaled fields for null, just in case). An alternative for writes, if you're concerned about performance, would be to not require the writer to send them, or to have the DatumWriters populate the data lazily. I gather the former would be a subtle design change, and the latter would be a patch to the writers so that the write* methods leverage the schema.

Bill

Scott Carey wrote:
Creating default fields for objects would have performance issues if they are new 
instances -- new Utf8() and new YourClassHere() are not free.  So 'foo = new 
Utf8("foo");' is not right, but assignment from a static default would be fine.  
But unless these objects are immutable, a client could change the default.

Ideally, the specific and generic APIs can handle this better.  A getter can 
return a default value if its field is null, or generated classes can be more 
sophisticated, removing default constructors and providing constructors or 
static factory methods that require a user to provide all 'default-less' fields 
up front.  In the long run I'd like to have this sort of more powerful and 
flexible generated objects with various user-configured options, but at this 
time it is not there.
Also, consider how Unions complicate things here.  Right now they are not so 
fun to deal with unless it is only a union of NULL and one other type.  Client 
code has to know the exact classes/types to inspect to resolve the union.
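
The "getter returns a default" idea can be sketched roughly as follows. This is only an illustration, not what the specific compiler generates: the class MessageWithDefaults, its field foo, and the default value "foo" are all hypothetical, and String stands in for Utf8 so that the shared default is immutable and cannot be changed by a client.

```java
// Hypothetical stand-in for a generated record; NOT Avro's actual generated code.
final class MessageWithDefaults {
    // Shared default, allocated once. String is immutable, so handing the
    // same instance to every caller is safe (assumed default value).
    private static final String DEFAULT_FOO = "foo";

    private String foo; // stays null until a caller sets it

    // Falls back to the shared default instead of exposing null.
    String getFoo() {
        return foo != null ? foo : DEFAULT_FOO;
    }

    void setFoo(String foo) {
        this.foo = foo;
    }

    // Lets callers tell an explicit value apart from the fallback.
    boolean isFooSet() {
        return foo != null;
    }
}
```

No object is allocated per instance for the default, which avoids the cost mentioned at the top of this message.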

Until there are enhancements to the API, if you are concerned with users 
putting garbage in, I suggest you write a wrapper class that handles this.  
Have users use that class rather than the classes generated by the specific 
compiler.
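
A minimal sketch of such a wrapper, under stated assumptions: GeneratedMessage is a hypothetical stand-in for a class produced by the specific compiler, SafeMessage is a made-up wrapper name, and the fields foo (with a default) and bar (without one) are invented for illustration.

```java
import java.util.Objects;

// Hypothetical stand-in for a class produced by the specific compiler.
final class GeneratedMessage {
    public String foo; // assume the schema declares a default for this field
    public String bar; // assume the schema declares no default for this field
}

// Hypothetical wrapper that users work with instead of GeneratedMessage.
final class SafeMessage {
    private static final String DEFAULT_FOO = "foo"; // assumed schema default

    private final GeneratedMessage delegate = new GeneratedMessage();

    private SafeMessage(String bar) {
        // Fields without a default must be supplied up front.
        delegate.bar = Objects.requireNonNull(bar, "bar has no default and must be provided");
        // Fields with a default start out populated, so nothing is ever left null.
        delegate.foo = DEFAULT_FOO;
    }

    // Static factory: every 'default-less' field is a required argument.
    static SafeMessage of(String bar) {
        return new SafeMessage(bar);
    }

    // Optional override for a field that has a default.
    SafeMessage withFoo(String foo) {
        delegate.foo = Objects.requireNonNull(foo, "foo must not be null");
        return this;
    }

    // Hands the fully populated record to whatever does the writing.
    GeneratedMessage unwrap() {
        return delegate;
    }
}
```

A caller would write something like SafeMessage.of("some bar").unwrap() and pass the result to the writer; passing null for bar fails immediately instead of at serialization time.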


On Jun 7, 2010, at 3:11 PM, Bill de hOra wrote:

Scott Carey wrote:
> No, it should not initialize the field to the default.
>
> Default values are for readers, not writers. The intended use case is schema evolution.

This means writers can't leverage schema defaults, so writers should do something like this?

 Message message = new Message();
 // no defaults set
 String quux = message
     .getSchema()
     .getField("foo")
     .defaultValue()
     .getTextValue();
 message.foo = new Utf8(quux);

[ignoring that the writer needs to know the schema type]. I suspect people will just write in garbage (like empty strings).


> A writer must always correctly provide data for all of the fields in the
> schema it declared it is writing.

Why is it incorrect to not provide defaults when defaults are part of the schema author's intention? Or put another way, why is reader/writer asymmetry a goal under a given schema?

I see in the code that SpecificDatumWriter/GenericDatumWriter is passed the Schema. By all means crash on fields with no defaults, but I'm not clear on what harm is done by using default field data. The current code seems fragile in comparison.
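
To make the suggestion concrete, here is a rough sketch of the behaviour I have in mind, deliberately written without the Avro classes. DefaultFiller is a hypothetical helper, the schema is reduced to a map from field name to an optional default, and a record is a plain map; none of this is the actual DatumWriter code, and null-valued defaults are not modeled.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Avro: fills unset fields from declared
// defaults and fails on unset fields that have no default.
final class DefaultFiller {
    // fieldDefaults maps each field name to its default; an empty Optional
    // means the schema declares no default for that field.
    static Map<String, Object> fill(Map<String, Optional<Object>> fieldDefaults,
                                    Map<String, Object> record) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Optional<Object>> field : fieldDefaults.entrySet()) {
            String name = field.getKey();
            Object value = record.get(name);
            if (value == null) {
                // Unset field: use the default if there is one, otherwise crash.
                value = field.getValue().orElseThrow(() ->
                        new IllegalStateException(
                                "Field '" + name + "' is unset and has no default"));
            }
            out.put(name, value);
        }
        return out;
    }
}
```

The writer would then serialize the returned map, so a missing value either comes from the schema author's default or stops the write.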

Bill


