Hey folks, Stack published a writeup I did on the HBase blog about the effects of rowkey size, column-name size, CF compression, data block encoding, and KV storage approach on HFile size. For example, we compared large row keys vs. small row keys, Snappy vs. LZO vs. other compression codecs, prefix vs. fast-diff encoding, and a KV per column vs. a single KV per row. We tried 'em all... and wrote it up.
http://blogs.apache.org/hbase/

Doug Meil
Chief Software Architect, Explorys
