On May 22, 2009, at 4:32 PM, Doug Cutting wrote:

Matt Massie wrote:
(1) Spec related: Should we have a max-length attribute for the variable-length objects?

My instinct would be to put this in the server implementation--it should guard against requests that use too much memory.

That will work. If this approach doesn't work as well as we'd like, we can update the spec at a later time.
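A server-side guard of the kind described above could look roughly like the sketch below. It assumes lengths are written as zig-zag, variable-length longs, as in Avro's binary encoding. The names read_bytes_guarded and MAX_BYTES_LENGTH and the 1 MiB limit are made up for illustration; none of them come from the spec.

```python
import io

# Hypothetical cap chosen by the server implementation, not by the spec.
MAX_BYTES_LENGTH = 1 << 20


def read_long(stream):
    """Decode one zig-zag, variable-length long from a binary stream."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a long")
        accum |= (byte[0] & 0x7F) << shift
        if not (byte[0] & 0x80):
            break
        shift += 7
    # Undo the zig-zag mapping (0, -1, 1, -2, ... were stored as 0, 1, 2, 3, ...).
    return (accum >> 1) ^ -(accum & 1)


def read_bytes_guarded(stream, max_length=MAX_BYTES_LENGTH):
    """Read a length-prefixed byte string, rejecting bad lengths before allocating."""
    length = read_long(stream)
    if length < 0 or length > max_length:
        raise ValueError(f"declared length {length} is outside 0..{max_length}")
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("stream ended inside a byte string")
    return data
```

The check happens before any buffer is allocated, so a request that declares a huge length is rejected without the server reserving memory for it.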


(2) Maps: Do we need to maintain the key/value pair order for maps?

I think re-ordering is fine.  Does anyone disagree?

My vote would be to not impose any ordering on maps.


(3) Blocks: Sanity check
If the elements of an array are fixed length (e.g. 8 bytes), then a block of 100 of them would look like...
[ long = 100 ][ 100 * 8 = 800 bytes of data in the block ][ long = 0 ]
... terminated with a zero, or
[ long = 90 ][ 90 * 8 = 720 bytes of data in the block ][ long = 10 ][ 10 * 8 = 80 bytes of data in the block ][ long = 0 ]
... correct?

Yes, that's the idea.
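To make the two layouts concrete, here is a minimal sketch that writes the same 100 items as one block or as two blocks and reads both back. It assumes block counts are zig-zag, variable-length longs and, purely for illustration, that each item is an 8-byte big-endian signed integer. write_blocks and read_blocks are hypothetical helper names.

```python
import io
import struct


def write_long(out, n):
    """Write one zig-zag, variable-length long."""
    n = (n << 1) ^ (n >> 63)  # zig-zag: small magnitudes become small unsigned values
    while n & ~0x7F:
        out.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    out.write(bytes([n]))


def read_long(stream):
    """Read one zig-zag, variable-length long."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a long")
        accum |= (byte[0] & 0x7F) << shift
        if not (byte[0] & 0x80):
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def write_blocks(out, items, block_sizes):
    """Write 8-byte items as blocks of the given sizes, then a zero count."""
    pos = 0
    for size in block_sizes:
        write_long(out, size)
        for item in items[pos:pos + size]:
            out.write(struct.pack(">q", item))  # assumed fixed 8-byte element
        pos += size
    write_long(out, 0)  # a zero count terminates the array


def read_blocks(stream):
    """Read blocks until the zero count and return all items in order."""
    items = []
    while True:
        count = read_long(stream)
        if count == 0:
            return items
        for _ in range(count):
            items.append(struct.unpack(">q", stream.read(8))[0])
```

With items = list(range(100)), calling write_blocks with block_sizes [100] or [90, 10] produces different bytes on the wire, but read_blocks returns the same list for both.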

However, if the objects are variable length, there is no way to calculate the size of the block based on the element sizes, so we use the negative "count" value. For example...
[ long = -1 ][ long = 23948 ][ 23948 bytes of data in the block ][ long = 0 ]
... which is terminated with a zero.

Almost.  AVRO-25 proposes to permit, for your first example:

[long = -100] [long = 800] [100 * 8 = 800 bytes of data] [long = 0]

The item count is always required, but when it's negative, it's followed by the byte count. This is the same for variable- and fixed-sized data.

I guess one advantage of requiring the "count" is to allow an extra check once the block is processed.
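One practical use of the byte count is that a reader can skip a block without decoding its items, while still getting the item count from the negated value. The sketch below illustrates that under the layout described above, [-count][byte size][data], with a zero count as terminator. It assumes zig-zag, variable-length longs; write_sized_block and skip_array are hypothetical names, not part of any Avro API.

```python
import io


def write_long(out, n):
    """Write one zig-zag, variable-length long."""
    n = (n << 1) ^ (n >> 63)
    while n & ~0x7F:
        out.write(bytes([(n & 0x7F) | 0x80]))
        n >>= 7
    out.write(bytes([n]))


def read_long(stream):
    """Read one zig-zag, variable-length long."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("stream ended inside a long")
        accum |= (byte[0] & 0x7F) << shift
        if not (byte[0] & 0x80):
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)


def write_sized_block(out, encoded_items):
    """Write one non-empty block as [-count][byte size][data]."""
    data = b"".join(encoded_items)
    write_long(out, -len(encoded_items))  # negative count: a byte size follows
    write_long(out, len(data))
    out.write(data)


def skip_array(stream):
    """Skip an array whose blocks all carry byte sizes; return the total item count."""
    total = 0
    while True:
        count = read_long(stream)
        if count == 0:
            return total
        if count > 0:
            # Without a byte size the items would have to be decoded one by one.
            raise ValueError("block has no byte size")
        size = read_long(stream)
        stream.seek(size, io.SEEK_CUR)  # jump over the block's data
        total += -count
```

A reader that does decode the items could additionally compare the number of items and bytes it consumed against the two header values, which is the extra check mentioned above.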


(4) RPC related: Should we explicitly specify the entire RPC communication as an Avro schema?

You mean the handshake stuff? I've thought about that, but felt that the bootstrapping got complicated.

Yes. Actually, I believe that we should express all messages exchanged between Avro components as Avro schemas so that we don't need to hand-craft the RPC layer. Having the schemas will also make the protocols more transparent and Avro more adaptable to changes in RPC.

Are there any reasons you can see for hand-coding the RPC layer?

-Matt
