Rong-en Fan wrote:
I'm reading
http://jimbojw.com/wiki/index.php?title=Understanding_HBase_column-family_performance_options
but am confused about BLOCK and RECORD compression. As I understand
it, these two options govern the underlying MapFile's
data file, which is a SequenceFile. In HBase, each key in the SequenceFile
is actually row/column/ts. So, specifying RECORD means the
value for each *single* row/column/ts key is compressed on its own.
With BLOCK, one compressed block may cover several entries of the same
row (since one row may have more than one row/column/ts key in the
underlying MapFile). If this is correct,
then I don't get the point mentioned in the wiki above.
Any ideas?

Your exposition above is correct, Rong-en. The article is a little
inexact about what is actually being compressed. I would suggest you add
the above as a comment to Jimbo's article.
St.Ack
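
To make the distinction in the thread concrete, here is a toy sketch in Python. It is not HBase or Hadoop code and does not reproduce the SequenceFile on-disk format: zlib stands in for the real codec, and the cell data and the function names (record_compress, block_compress, and so on) are made up for illustration. It assumes, as described above, that each key is row/column/ts, that RECORD compresses each value on its own, and that BLOCK compresses a batch of keys and a batch of values as units.

```python
import zlib

# Hypothetical cells, ordered the way a MapFile would hold them:
# key = (row, column, timestamp), value = raw bytes.
cells = [
    (("row1", "info:c%02d" % i, 1),
     b"status=active;region=us-east;tier=gold;n=%d" % i)
    for i in range(20)
]


def record_compress(cells):
    """RECORD style: every value is compressed by itself; keys stay raw."""
    return [(key, zlib.compress(value)) for key, value in cells]


def record_decompress(records):
    return [(key, zlib.decompress(blob)) for key, blob in records]


def block_compress(cells):
    """BLOCK style: a batch of keys and a batch of values are each
    compressed as one unit, so redundancy across cells can be exploited."""
    keys = "\n".join("/".join(map(str, key)) for key, _ in cells).encode()
    values = b"\n".join(value for _, value in cells)
    return zlib.compress(keys), zlib.compress(values)


def block_decompress(block):
    key_blob, value_blob = block
    keys = []
    for line in zlib.decompress(key_blob).decode().split("\n"):
        row, column, ts = line.split("/")
        keys.append((row, column, int(ts)))
    values = zlib.decompress(value_blob).split(b"\n")
    return list(zip(keys, values))


# Compare the compressed size of the values under each scheme.
record_value_bytes = sum(len(blob) for _, blob in record_compress(cells))
block_value_bytes = len(block_compress(cells)[1])
print("RECORD value bytes:", record_value_bytes)
print("BLOCK value bytes: ", block_value_bytes)
```

With many small, similar values in one row, the per-value streams in the RECORD case each carry their own overhead and cannot share repeated content, while the BLOCK case compresses the repeated content across cells once. Actual ratios in HBase depend on the codec, block size, and data.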