> - inefficient because you'll end up serializing your data twice, once
>   from the actual type into the bytes field, then a second time as a
>   bytes field;

I don't think it's as inefficient as you might think: the second serialization just blits the raw bytes content into some destination buffer/pipe/socket/etc. The C binding already does this under the covers to handle blocks when writing into a data file, and it hasn't been a performance bottleneck.

> - unwieldy because as a user, I'll have to encode and decode the bytes
>   field manually every time I want to access this field from the original
>   record, unless I keep track of the decoded extension externally to the
>   Avro record.

Can you handle this in the middleware? I.e., have the middleware decode the bytes field before passing control to the user code. That's better from a decoupling standpoint anyway, since the user code shouldn't care what middleware is wrapping it.

> When you write a middleware that lets users define custom types,
> extensions are pretty much required.

I guess my main point is that we already have two mechanisms for dealing with user extensions (schema resolution and Doug's bytes field proposal), both of which work just fine at runtime without rebuilding or restarting your code. In general, I think it's better if we can solve a problem at the library or application level, without having to update the spec.

–doug
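
To make the two points above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the C binding's code or any Avro library's API. The first half hand-rolls Avro's binary encoding of a `bytes` value (a zigzag varint length followed by the raw bytes) to show that the "second serialization" is a length prefix plus a copy. The second half is a hypothetical middleware wrapper, `with_extension_decoded`, that decodes the bytes field before user code runs. The field name `extension`, the dict-shaped record, and JSON as the extension's encoding are all stand-ins chosen for the example; a real setup would more likely decode the payload with the extension's own Avro schema.

```python
import json


def encode_long(n: int) -> bytes:
    """Avro long: zigzag-encode, then emit as a base-128 varint."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int = 0):
    """Inverse of encode_long; returns (value, next_position)."""
    z, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        z |= (b & 0x7F) << shift
        if not b & 0x80:
            break
        shift += 7
    return (z >> 1) ^ -(z & 1), pos


def encode_bytes_field(payload: bytes) -> bytes:
    """Avro bytes: length prefix plus the payload copied verbatim (the 'blit')."""
    return encode_long(len(payload)) + payload


def decode_bytes_field(buf: bytes, pos: int = 0):
    """Returns (payload, next_position)."""
    n, pos = decode_long(buf, pos)
    return buf[pos:pos + n], pos + n


def with_extension_decoded(handler, field="extension", decode=json.loads):
    """Hypothetical middleware: decode the bytes field, then call user code.

    `field` and `decode` are assumptions for this sketch; JSON stands in
    for whatever encoding the extension really uses.
    """
    def wrapped(record: dict):
        record = dict(record)  # leave the caller's record untouched
        raw = record.get(field)
        if isinstance(raw, (bytes, bytearray)):
            record[field] = decode(raw)
        return handler(record)
    return wrapped


# User code only ever sees the decoded extension.
def user_handler(record):
    return record["extension"]["color"]


handler = with_extension_decoded(user_handler)
payload = json.dumps({"color": "red"}).encode()
result = handler({"id": 1, "extension": payload})
```

With this shape, the user handler never touches the raw bytes, so it does not need to know which middleware (if any) is wrapping it.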
