On Apr 3, 2009, at 11:37 AM, Doug Cutting wrote:


Field ids are not present in Avro data except in the schema. A record's fields are serialized in the order that the fields occur in the record's schema, with no per-field annotations whatsoever. For example, a record that contains a string and an int is serialized simply as a string followed by an int: nothing before, nothing between, and nothing after. So, yes, it is a different data format.
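
To make the layout described above concrete, here is a minimal sketch of a record holding a string and an int, written back to back in schema order. The quoted message only states the field ordering; the byte-level details used here (a zigzag varint for integers and a length prefix on the string) and the function names are assumptions for illustration, not something the message specifies.

```python
def encode_long(n: int) -> bytes:
    """Zigzag-encode a signed 64-bit integer, then emit it as a base-128 varint.

    Assumption: this integer encoding is illustrative; the message above
    does not say how an int is laid out on the wire.
    """
    z = (n << 1) ^ (n >> 63)  # zigzag: small magnitudes become small unsigned values
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_string(s: str) -> bytes:
    """Length-prefixed UTF-8 bytes (the prefix is also an assumption)."""
    data = s.encode("utf-8")
    return encode_long(len(data)) + data


def encode_record(name: str, count: int) -> bytes:
    # Fields are written in schema order with no field ids, tags, or
    # delimiters: just the string, then the int.
    return encode_string(name) + encode_long(count)
```

Nothing in the resulting bytes identifies which field is which, so a reader can only decode them if it knows the schema the writer used.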

While this representation would certainly be as compact as possible, wouldn't it prevent evolving the data structure over time? One of the nice features of Google Protocol Buffers and Thrift is that you can evolve the set of fields over time, and older/newer clients can talk to older/newer services. If the proposed Avro is evolvable, then perhaps I'm misunderstanding your statement about the lack of IDs in the serialized data.
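
As a rough illustration of the evolution property I mean, here is a sketch of a tagged encoding in which an older reader skips fields it does not recognize. The format is a deliberately simplified, hypothetical one (one byte of field id, one byte of length, then the payload); it is not the actual Protocol Buffers or Thrift wire format, and the function names are made up for this example.

```python
def encode_tagged(fields: dict[int, bytes]) -> bytes:
    """Write each field as: id byte, length byte, payload (hypothetical format)."""
    out = bytearray()
    for field_id, payload in fields.items():
        out += bytes([field_id, len(payload)]) + payload
    return bytes(out)


def decode_tagged(data: bytes, known_ids: set[int]) -> dict[int, bytes]:
    """Read tagged fields, keeping only the ids this reader knows about."""
    result = {}
    pos = 0
    while pos < len(data):
        field_id, length = data[pos], data[pos + 1]
        payload = data[pos + 2 : pos + 2 + length]
        pos += 2 + length  # the length prefix lets us step over unknown fields
        if field_id in known_ids:
            result[field_id] = payload
    return result


# A newer writer adds field 3; an older reader that knows only 1 and 2 still works.
message = encode_tagged({1: b"ab", 2: b"\x03", 3: b"new"})
old_view = decode_tagged(message, known_ids={1, 2})
```

The per-field id and length are exactly the annotations the untagged format leaves out, which is why I am asking how it would cope with added or removed fields.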

I also agree with Bryan that it would be unfortunate to have two different Apache projects with overlapping goals. Regardless of features, both Protocol Buffers and Thrift have the advantage of having been debugged in mission-critical production environments.

-George
