Why does Avro have separate 'int' and 'long' types, and what does it mean for them to represent 32- and 64-bit integers?
Since their binary encoding is variable-length, it appears to me that they are unbounded in practice. Encoder and decoder functions for int and long are completely identical, other than checking that a value doesn't exceed the 32- and 64-bit maximum values. In fact, the Python implementation's read_int is just an alias for read_long [1], and the same goes for write_int / write_long. Just as a guess, I suppose that int and long exist to make life a little easier when you're deserializing. They act as promises that, despite being variable-length, the integer you read out will fit in a certain-sized chunk of memory. But if this is the goal, it seems like fixed-size integers would do a better job, and would let deserializers be *much* more efficient since they don't need to do multiple instructions (including a conditional!) on every single byte of the input. I don't know how open folks are to dramatic changes (is Avro 2 a concept that anyone is talking about?) but I think that Avro would be significantly improved with a rethink of its integer types. Some sort of fixed-length integer type[s], and just one variable-length integer type, would be clarifying, more expressive, and more efficient, I think. [1] https://github.com/apache/avro/blob/5bd7cfe0bf742d0482bf6f54b4541b4d22cc87d9/lang/py/avro/io.py#L251-L255
