Hey, I've enjoyed the brief time I've been able to spend on Avro so far; thanks for the great work on this project. A few thoughts:
* Would it be useful to express the metadata block format in the specification as an Avro schema? The handshake is specified with a schema, and it might help folks grasp what's going on a bit sooner. * There could be much better in-code documentation for language implementations--e.g. we could put parts of the specification into the docstrings for Python. Are folks trying to keep the code parsimonious by putting the documentation on the wiki or site? I'd rather see this documentation in the code as well, but I'd be happy to respect the desires of the community. * The header format for file object containers is specified as "Four bytes, ASCII 'O', 'b', 'j', followed by zero.", but both Python and Java implementations use a "VERSION" constant for the last byte of the magic constant. Should we make this explicit in the specification? * It would be nice to have some guidance about how you expect the specification to be implemented: e.g. how are you using the packages "specific", "generic", and "reflect" in the Java implementation? * The Python implementation uses, e.g., "getmeta" rather than "get_meta" or "getMeta". Following PEP-8 or Google's Python style guidelines ( http://code.google.com/p/soc/wiki/PythonStyleGuide) or "Code like a Pythonista" ( http://python.net/~goodger/projects/pycon/2007/idiomatic/handout.html<http://python.net/%7Egoodger/projects/pycon/2007/idiomatic/handout.html>) would be a good idea, but the patch to refactor the code would be fairly large. Is it worth taking on this code hygiene issue sooner rather than later? Thanks, Jeff
