I've just submitted a patch to build the C documentation using Doxygen:
https://issues.apache.org/jira/browse/AVRO-37
I'm starting work on building file/socket handles and processing the
Avro compound data types (expect a patch with proper code/unit tests
soon) and I have a few (hopefully quick) questions:
(1) Spec related: Should we have a max-length attribute for the
variable length objects?
Since we're going to be using Avro for RPC, do we need to consider the
possibility of malicious data in the Avro streams? Malicious clients
could exhaust the memory on the server by sending intentionally long
values (e.g. a string that is 2GB in length). Instead of having a
server-wide max, it might make sense to allow a maximum length to be
specified per object.
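
To make (1) concrete, here is a rough sketch of the kind of check I have in mind: read the length prefix first, and refuse the value before allocating anything if it exceeds a per-object limit. The names sketch_read_long and sketch_check_length and the max_len parameter are hypothetical (nothing like them exists in the spec or in my patch); the only thing taken from Avro is that lengths are longs in the zig-zag variable-length encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one zig-zag varint long from buf.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than the 10 bytes a 64-bit value can need. */
static size_t sketch_read_long(const uint8_t *buf, size_t len, int64_t *out)
{
    uint64_t raw = 0;
    size_t i;

    assert(buf != NULL && out != NULL);
    for (i = 0; i < len && i < 10; i++) {
        raw |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            /* Undo zig-zag: the low bit carries the sign. */
            *out = (int64_t)((raw >> 1) ^ ((uint64_t)0 - (raw & 1)));
            return i + 1;
        }
    }
    return 0;
}

/* Hypothetical check: read a string/bytes length prefix and compare it
 * against a per-object maximum before any memory is allocated.
 * Returns 0 if acceptable, -1 if malformed, -2 if over the limit. */
static int sketch_check_length(const uint8_t *buf, size_t len, int64_t max_len,
                               int64_t *value_len, size_t *header_len)
{
    int64_t n;
    size_t used = sketch_read_long(buf, len, &n);

    if (used == 0 || n < 0)
        return -1;          /* truncated prefix or negative length */
    if (n > max_len)
        return -2;          /* reject before allocating */
    *value_len = n;
    *header_len = used;
    return 0;
}
```

The point of the design is that the limit is enforced on the prefix alone, so a client claiming a 2GB string costs the server only the few bytes of the length header. Where max_len would come from (the schema, a server setting, or both) is exactly the open question.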
(2) Maps: Do we need to maintain the key/value pair order for maps?
I will be converting Avro maps to apr_table_t structures in order to
make key search constant time. Do I need to guarantee that the order
I decode the map is the order I encode it later? Just want to know if
I need to store a private key array to maintain the order. It's ok if
I do, but I'd like to use less memory if I can avoid it.
(3) Blocks: Sanity check
If the elements of an array are fixed length (e.g. 8 bytes), then the
block of 100 of them would look like...
[ long = 100 ][ 100 * 8 = 800 bytes of data in the block][ long = 0 ]
... terminated with a zero, or
[ long = 90 ][ 90 * 8 = 720 bytes of data in the block ][ long = 10 ]
[ 10 * 8 = 80 bytes in the block ][ long = 0]
... correct?
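
In code, my understanding of the fixed-length case looks roughly like the sketch below. It assumes the layout described above (a count as a zig-zag varint long, then count * item_size raw bytes, repeated, then a zero count). The function names and the per_block parameter are made up for illustration and are not from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write one long as a zig-zag varint.
 * Writes at most 10 bytes and returns how many were written. */
static size_t sketch_write_long(uint8_t *out, int64_t value)
{
    uint64_t zz = ((uint64_t)value << 1) ^ (value < 0 ? UINT64_MAX : 0);
    size_t n = 0;

    assert(out != NULL);
    while (zz >= 0x80) {
        out[n++] = (uint8_t)(zz | 0x80);   /* low 7 bits + continuation bit */
        zz >>= 7;
    }
    out[n++] = (uint8_t)zz;
    return n;
}

/* Hypothetical encoder: split `count` fixed-size items into blocks of at
 * most `per_block` items and terminate with a zero count.
 * Returns the number of bytes written, or 0 if `out` is too small. */
static size_t sketch_write_fixed_array(uint8_t *out, size_t out_cap,
                                       const uint8_t *items, size_t count,
                                       size_t item_size, size_t per_block)
{
    size_t pos = 0, done = 0;

    if (per_block == 0 || item_size == 0)
        return 0;
    while (done < count) {
        size_t n = (count - done < per_block) ? count - done : per_block;
        size_t bytes = n * item_size;

        if (out_cap - pos < 10 + bytes)    /* 10 = worst-case header size */
            return 0;
        pos += sketch_write_long(out + pos, (int64_t)n);
        memcpy(out + pos, items + done * item_size, bytes);
        pos += bytes;
        done += n;
    }
    if (out_cap - pos < 10)
        return 0;
    pos += sketch_write_long(out + pos, 0); /* end-of-array marker */
    return pos;
}
```

With 100 items of 8 bytes each, per_block = 100 gives one block plus the terminator, and per_block = 90 gives the 90 + 10 split from the second layout above.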
However, if the objects are variable length, there is no way to
calculate the size of the block based on the element sizes so we use
the negative "count" value. For example...
[ long = -1 ][ long = 23948 ][ 23948 bytes of data in the block ]
[ long = 0 ]
... which is terminated with a zero.
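
For the decoding side, this is a sketch of how I would parse a block header under my reading of the spec: a negative count means the absolute value is the number of items and a second long giving the block size in bytes follows, so a reader can skip the block without decoding it. If that reading is wrong, the sketch is wrong too. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one zig-zag varint long.
 * Returns bytes consumed, or 0 on truncated/overlong input. */
static size_t sketch_read_long(const uint8_t *buf, size_t len, int64_t *out)
{
    uint64_t raw = 0;
    size_t i;

    for (i = 0; i < len && i < 10; i++) {
        raw |= (uint64_t)(buf[i] & 0x7f) << (7 * i);
        if ((buf[i] & 0x80) == 0) {
            *out = (int64_t)((raw >> 1) ^ ((uint64_t)0 - (raw & 1)));
            return i + 1;
        }
    }
    return 0;
}

/* Hypothetical parsed block header. */
typedef struct {
    int64_t count;      /* number of items; 0 marks the end of the array */
    int64_t byte_size;  /* block size in bytes, or -1 if not present */
    size_t header_len;  /* bytes consumed by the header itself */
} sketch_block_header;

/* Returns 0 on success, -1 on malformed input. */
static int sketch_read_block_header(const uint8_t *buf, size_t len,
                                    sketch_block_header *hdr)
{
    int64_t count, size = -1;
    size_t used, more;

    assert(buf != NULL && hdr != NULL);
    used = sketch_read_long(buf, len, &count);
    if (used == 0 || count == INT64_MIN)
        return -1;
    if (count < 0) {
        /* Negative count: a byte-size long follows the count. */
        count = -count;
        more = sketch_read_long(buf + used, len - used, &size);
        if (more == 0 || size < 0)
            return -1;
        used += more;
    }
    hdr->count = count;
    hdr->byte_size = size;
    hdr->header_len = used;
    return 0;
}
```

Fed the example above ([ long = -1 ][ long = 23948 ]), this yields a count of 1 and a byte size of 23948, after which the reader can either decode the single item or jump 23948 bytes ahead to the next header.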
(4) RPC related: Should we explicitly specify the entire RPC
communication as an Avro schema?
For example, the entire RPC communication schema can be expressed in
a single XDR .x file. The ZeroC developers who wrote Ice express the
RPC of all their components using their IDL.
Having the Avro RPC schema would make implementing RPC automatic and
flexible.
Hope you all have a great Memorial Day weekend!
-Matt