Hi Jeff,

We previously ran some comparisons of Avro against binary JSON (LinkedIn's serialization system, which uses a JSON data model but a more compact byte format; details at https://github.com/voldemort/voldemort/wiki/Binary-JSON-Serialization):

1. Avro's in-memory serialization performance is 71% of binary JSON's.
2. Avro's in-memory deserialization performance is 76% of binary JSON's.
3. On-disk serialization performance depends heavily on the compression algorithm.
4. Uncompressed, Avro is more space-efficient than binary JSON (I didn't run many experiments for this case; I got a ratio of 62.5% on a couple of data sets).

Best,
Lin

On Tue, Nov 30, 2010 at 9:42 PM, Jeff Zhang <[email protected]> wrote:
> Lin,
>
> Great work. So you've already use it in Linkedin ? And how about the
> performance of AvroStorage compared to other Storage implementation ?
>
> On Wed, Dec 1, 2010 at 1:05 PM, Lin Guo <[email protected]> wrote:
>> Hi,
>>
>> We'd like to patch our pig AvroStorage function and
>> would highly appreciate any kinds of comments.
>>
>> doc:
>> http://snaprojects.jira.com/wiki/display/HTOOLS/AvroStorage+-+Pig+support+for+Avro+data
>>
>> jira:
>> https://issues.apache.org/jira/browse/PIG-1748
>>
>> Many thanks,
>> Lin
>
> --
> Best Regards
>
> Jeff Zhang
