Record compression means that each record, i.e. exactly one row/family:member/ts entry, is compressed on its own.
Block compression means that blocks in HDFS are compressed. A block may contain multiple records if they are shorter than one HDFS block, or may contain only part of a record if the record is longer than an HDFS block.

---
Jim Kellerman, Senior Engineer; Powerset

> -----Original Message-----
> From: Rong-en Fan [mailto:[EMAIL PROTECTED]
> Sent: Thursday, July 10, 2008 7:52 AM
> To: [email protected]
> Subject: compression in HBase
>
> I'm reading
>
> http://jimbojw.com/wiki/index.php?title=Understanding_HBase_column-family_performance_options
>
> but I'm confused about BLOCK and RECORD compression. In my
> understanding, these two options govern the underlying
> MapFile's data file, which is a SequenceFile. In HBase, each
> key in the SequenceFile is actually row/column/ts. So,
> specifying RECORD means each value in *one* row/column/ts is
> compressed. With BLOCK, it may cover the same row (since one
> row may have more than one row/column/ts key in the
> underlying MapFile). If this is correct, then I don't get the
> point mentioned in the wiki above.
>
> Any ideas?
>
> Thanks,
> Rong-En Fan
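
To make the RECORD versus BLOCK distinction in this thread a bit more concrete, here is a rough sketch in Python. It is not HBase or Hadoop code: zlib stands in for whatever codec is configured, and the helper names and the length-prefixed framing are made up for illustration only. It just shows that compressing each small value separately tends to cost more than compressing many values together.

```python
import zlib

def compress_per_record(values):
    """RECORD-style (illustrative): every value is compressed on its own."""
    return [zlib.compress(v) for v in values]

def compress_as_block(values):
    """BLOCK-style (illustrative): many values are buffered and compressed together."""
    # Hypothetical framing: a 4-byte length prefix per value so the block
    # can be split back into records after decompression.
    buf = b"".join(len(v).to_bytes(4, "big") + v for v in values)
    return zlib.compress(buf)

def decompress_block(blob):
    """Reverse compress_as_block and return the original list of values."""
    data = zlib.decompress(blob)
    out, i = [], 0
    while i < len(data):
        n = int.from_bytes(data[i:i + 4], "big")
        out.append(data[i + 4:i + 4 + n])
        i += 4 + n
    return out

# Small, similar values, like many cells in one column family might be.
values = [f"user{i:04d}@example.com".encode() for i in range(1000)]

raw_total = sum(len(v) for v in values)
record_total = sum(len(c) for c in compress_per_record(values))
block_total = len(compress_as_block(values))

print("raw bytes:          ", raw_total)
print("per-record bytes:   ", record_total)
print("single-block bytes: ", block_total)
```

With short values like these, the per-record total carries codec overhead for every value, while the block version can exploit repetition across values. The trade-off is that reading one value from a block means decompressing the whole block first.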
