On Feb 12, 2014, at 4:04 PM, Sid Shetye 
<[email protected]> wrote:

2. Unsigned 32/64-bit values have been used extensively as primitive types for 
over 3 decades (i.e. they've held their ground). Heck, even core Java devs hate 
that unsigned doesn't exist, e.g. 
http://stackoverflow.com/questions/430346/why-doesnt-java-support-unsigned-ints
3. All other workarounds simply add more friction to development when in 
reality, working with a primitive data type that's been around "forever" should 
be very transparent and very fluid.

It does add some friction, but aren’t we in a space where the lowest common 
denominator has to be supported?
As you’re pointing out, #2 and #3 are about Java.

We’re not hitting an issue of serialization, afaik; if you’re looking for 
signed 32 bit, we’ve got that.
If you want unsigned, it seems to me that fixed is just fine for storage.
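
To make the "fixed for storage" idea concrete, here is a rough sketch in Python. 
The schema name "uint32", the helper names, and the big-endian byte order are 
all my own assumptions; Avro itself says nothing about how to interpret the 
bytes of a fixed.

```python
import struct

# Hypothetical schema: a 4-byte fixed standing in for an unsigned 32-bit value.
UINT32_SCHEMA = {"type": "fixed", "size": 4, "name": "uint32"}


def encode_uint32(value: int) -> bytes:
    """Pack an unsigned 32-bit integer into 4 bytes (big-endian by assumption)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value out of range for uint32")
    return struct.pack(">I", value)


def decode_uint32(raw: bytes) -> int:
    """Unpack 4 big-endian bytes back into an unsigned 32-bit integer."""
    if len(raw) != UINT32_SCHEMA["size"]:
        raise ValueError("expected exactly 4 bytes")
    return struct.unpack(">I", raw)[0]
```

The serialized form is just 4 opaque bytes, so any Avro implementation can 
carry it; only the reader and writer need to agree on the interpretation.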

Do we agree that serialization is not the issue?
The issue I’m seeing here is with actual schema expressivity, as well as APIs 
in other languages.

That’s an area where I’m simply taking this section to heart:
"Attributes not defined in this document are permitted as metadata, but must 
not affect the format of serialized data."

And that section to me screams out for a registry.
With a registry of attributes we could work around issues like this and still 
keep in sync with each other.

Heck, that “MD5” example in the manual is a great one:

{"type": "fixed", "size": 16, "name": "md5"}

We all know that means md5, but it’s just untyped in Avro. A registry for 
things like “md5”, “uint32”, etc. would be nice to have.
Then our silly selves can go ahead and implement more complex APIs/deserializers.
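
As a rough sketch of what a deserializer built on such a registry might look 
like (Python; the "kind" attribute name, the REGISTRY table, and decode_fixed 
are all made up for illustration and are not part of Avro):

```python
import struct

# Hypothetical registry: agreed-upon attribute value -> (expected size, decoder).
REGISTRY = {
    "uint32": (4, lambda raw: struct.unpack(">I", raw)[0]),
    "md5": (16, lambda raw: raw.hex()),
}


def decode_fixed(schema: dict, raw: bytes):
    """Interpret the bytes of a fixed using an extra, registered schema attribute.

    The attribute is metadata only: the serialized bytes are unchanged, and
    readers that do not know the registry still see a plain fixed.
    """
    kind = schema.get("kind")  # "kind" is an assumed attribute name
    entry = REGISTRY.get(kind)
    if entry is None or entry[0] != schema["size"] or len(raw) != schema["size"]:
        return raw  # unknown or mismatched: fall back to the raw bytes
    return entry[1](raw)
```

With {"type": "fixed", "size": 16, "name": "md5", "kind": "md5"}, a 
registry-aware reader would hand back a hex digest string, while everyone else 
keeps getting 16 bytes.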


-Charles
