Went through that thread. None of the arguments there are convincing from a design standpoint because:

1. Avro is used in non-Java environments. The Avro IDL is language-agnostic while the code-gen is language-specific, so the C# code-gen could spit out unsigned types. Every language has limitations, but I'm not sure why Java's limitations should drive Avro's design, despite the heritage. (It's going to grow into other languages, right?)

2. Unsigned 32/64-bit values have been used extensively as primitive types for over 3 decades (i.e. they've held their ground). Heck, even core Java devs hate that unsigned doesn't exist, e.g. http://stackoverflow.com/questions/430346/why-doesnt-java-support-unsigned-ints

3. All other workarounds simply add more friction to development, when in reality working with a primitive data type that's been around "forever" should be very transparent and very fluid.

Stepping off the soapbox, I also have a workaround for future readers. We cast uint <-> int with arithmetic overflow checking temporarily disabled, and then let Avro handle them as signed varints (aka zigzag varints). As example code:

    int avroInt32;      // this is code-gen'd off the IDL
    uint csharpUint32;  // this is an app domain var

    // to Avro DTO
    avroInt32 = unchecked((int)csharpUint32);

    // from Avro DTO
    csharpUint32 = unchecked((uint)avroInt32);

Pros:
a) Use the encoding compression inherent in varints (e.g. stay within 4 bytes up to 134,217,727)
b) Keep the application domain logic as unsigned (as it needs to be)
c) Minimize the glue logic / impedance when converting from app domain => DTO domain

Cons:
1) Specific glue code needed because Avro inherits Java's limitations
2) We're still wasting half of the addressable range, since we skip every other possible varint encoding (reserved for negative numbers) and only ever see positive numbers. That means instead of hitting my 5th varint byte after 268,435,455, I now need that 5th byte after half that: 134,217,727.
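
For anyone who wants to check those byte counts, here is a minimal sketch. It is not Avro's own code: ZigZagSketch, ZigZag, VarintLength and AvroIntLength are hypothetical helper names I made up for illustration. It only assumes that an Avro int is written as a zigzag-mapped base-128 varint (7 payload bits per byte).

```csharp
using System;

public static class ZigZagSketch
{
    // Zigzag mapping: 0, -1, 1, -2, 2, ... becomes 0, 1, 2, 3, 4, ...
    public static uint ZigZag(int n) => unchecked((uint)((n << 1) ^ (n >> 31)));

    // Bytes needed by a base-128 varint for v (7 payload bits per byte).
    public static int VarintLength(uint v)
    {
        int bytes = 1;
        while (v >= 0x80)
        {
            v >>= 7;
            bytes++;
        }
        return bytes;
    }

    // Bytes on the wire when a uint is cast to int and written as a zigzag varint.
    public static int AvroIntLength(uint value) =>
        VarintLength(ZigZag(unchecked((int)value)));

    public static void Main()
    {
        // The last sample is above int.MaxValue, so it becomes a negative int after the cast.
        uint[] samples = { 134217727, 134217728, 268435455, 268435456, uint.MaxValue };
        foreach (uint u in samples)
        {
            int dto = unchecked((int)u);       // app domain -> DTO
            uint back = unchecked((uint)dto);  // DTO -> app domain
            Console.WriteLine(
                $"{u,10}: round-trips={back == u}, " +
                $"zigzag bytes={AvroIntLength(u)}, plain varint bytes={VarintLength(u)}");
        }
    }
}
```

With these helpers the zigzag path needs its 5th byte from 134,217,728 onward, while a plain unsigned varint would not need it until 268,435,456.
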
It's not *too* bad but seems wasteful to always transport a bit that's never used (bit 0, a zigzag varint's 'sign bit' will always be 0, carrying no informational content).

Cheers
Sid

> From: [email protected]
> Date: Wed, 12 Feb 2014 17:50:02 +0530
> Subject: Re: unsigned 32bit (uint) in Avro - C# ?
> To: [email protected]
>
> See also this past thread on the topic perhaps:
> http://mail-archives.apache.org/mod_mbox/avro-user/201212.mbox/%[email protected]%3e
>
> On Mon, Feb 10, 2014 at 3:46 PM, Mika Ristimaki
> <[email protected]> wrote:
> > Hi,
> >
> > Java doesn't have unsigned primitives, so most likely Avro doesn't support
> > them directly either.
> >
> > -Mika
> >
> > On Feb 10, 2014, at 3:34 AM, Sid Shetye <[email protected]> wrote:
> >
> > How do I serialize an unsigned integer (uint or UInt32 in C#) in Avro?
> >
> > It's very bizarre that unsigned aren't discussed at
> > http://avro.apache.org/docs/1.7.6/spec.html#schema_primitive
> >
>
> --
> Harsh J
