> -----Original Message-----
> On Apr 2, 2009, at 3:05 PM, Doug Cutting wrote:
>
> I propose we add a new Hadoop subproject for Avro, a serialization
> system. My ambition is for Avro to replace both Hadoop's RPC and to
> be used for most Hadoop data files, e.g., by Pig, Hive, etc.
>
> Initial committers would be Sharad Agarwal and me, both existing
> Hadoop committers. We are the sole authors of this software to date.
>
> The code is currently at:
>
>     http://people.apache.org/~cutting/avro.git/
>
> To learn more:
>
>     git clone http://people.apache.org/~cutting/avro.git/ avro
>     cat avro/README.txt
>
> Comments? Questions?
>
> Doug
After reading all the messages about Avro, I'm still not sure I understand why we should invent "yet another wheel". A number of people in the community have significant investments in Thrift, and I have yet to see a compelling argument for Avro over Thrift.

My understanding is that Thrift already supports multi-language bindings, something the HBase community has been asking for for some time. It is also my understanding (based on the email thread) that Avro currently supports only Java and Python. That is a step backwards from Thrift. It also appears that Avro uses introspection heavily, which is expensive in applications that require a high message rate.

So my question is: why Avro? I may be thick, but it seems to me as if it is just another wheel of a different color. A point-by-point comparison between Avro and Thrift might convince me that Avro is the way to go, but so far I have not seen a compelling reason to re-invent the wheel.

+-0

---
Jim Kellerman, Powerset (Live Search, Microsoft Corporation)
