> Lastly, it says "Note also that Avro binary-encoded data can be efficiently ordered without deserializing it to objects." What does this mean exactly?
This is hinting at an implementation detail, though one mostly of historical interest. There is Java code that compares two Avro objects based on their byte[] representations. It doesn't create any objects; it works on the bytes directly, which is what makes it "efficient".

The historical interest comes from Avro's intended use in Hadoop MapReduce. Hadoop's sorting normally instantiates the key objects in order to compare them, but there's a way to specify a "binary comparator", which tells Hadoop to instantiate a class that has a compare(byte[], byte[]) method instead of a compare(Object, Object) method. So the spec is saying that this is possible, and that there's an implementation of it.

You are right that plain ol' byte comparison does not sort Avro objects correctly. (This is kind of a bummer, in my opinion. It means Avro objects aren't useful as HBase keys.)

-- Philip
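To make the distinction concrete, here is a minimal sketch for a single Avro long, which the spec encodes as a zigzag, variable-length integer. It is not the Avro library's code: the class and method names (AvroOrderSketch, encodeLong, decodeLong, compareRawBytes, compareEncodedLongs) are made up for illustration. As far as I know, the real comparator in the Java library is org.apache.avro.io.BinaryData.compare, which takes a Schema and handles every type. The sketch shows that comparing raw bytes puts -2 after 1, while a comparator that understands the encoding orders them correctly and still allocates no objects.

```java
import java.io.ByteArrayOutputStream;

// Illustrative only; names here are hypothetical, not Avro's API.
class AvroOrderSketch {

    // Encode a long the way the Avro spec describes: zigzag first,
    // then base-128 groups, least significant group first.
    public static byte[] encodeLong(long n) {
        long z = (n << 1) ^ (n >> 63);          // zigzag: -1 -> 1, 1 -> 2, -2 -> 3
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((z & ~0x7FL) != 0) {
            out.write((int) ((z & 0x7F) | 0x80)); // continuation bit set
            z >>>= 7;
        }
        out.write((int) z);
        return out.toByteArray();
    }

    // Read the value straight out of the bytes; only primitives are used.
    public static long decodeLong(byte[] b) {
        long z = 0;
        int shift = 0;
        for (byte x : b) {
            z |= (long) (x & 0x7F) << shift;
            shift += 7;
            if ((x & 0x80) == 0) {
                break;                           // last group of this value
            }
        }
        return (z >>> 1) ^ -(z & 1);             // undo zigzag
    }

    // "Plain ol' byte comparison": unsigned, lexicographic.
    public static int compareRawBytes(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int c = Integer.compare(a[i] & 0xFF, b[i] & 0xFF);
            if (c != 0) {
                return c;
            }
        }
        return Integer.compare(a.length, b.length);
    }

    // Encoding-aware comparison on the serialized form.
    public static int compareEncodedLongs(byte[] a, byte[] b) {
        return Long.compare(decodeLong(a), decodeLong(b));
    }

    public static void main(String[] args) {
        byte[] minusTwo = encodeLong(-2);        // single byte 0x03
        byte[] one = encodeLong(1);              // single byte 0x02
        // Raw bytes say -2 sorts after 1 (positive result): wrong.
        System.out.println("raw bytes:      " + compareRawBytes(minusTwo, one));
        // Encoding-aware says -2 sorts before 1 (negative result): right.
        System.out.println("encoding-aware: " + compareEncodedLongs(minusTwo, one));
    }
}
```

The same mismatch is why the encoded bytes can't be dropped in as keys for a store that orders by raw bytes: negative and positive values interleave under zigzag.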
