Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/NewFileFormat

------------------------------------------------------------------------------
- This page is for discussion related to 
[https://issues.apache.org/jira/browse/HBASE-61 HBASE-61, Create an 
HBase-specific MapFile implementation].  That issue, and its linked issues, has 
a bunch of suggestions for how we might do a better persistence.  Most have 
been replicated in the ''New Format'' section below.  Other related issues 
include, [https://issues.apache.org/jira/browse/HADOOP-3315 TFile], and 
[https://issues.apache.org/jira/browse/HBASE-647 HBASE-647, Remove the 
HStoreFile 'info' file (and index and bloomfilter if possible)].
+ This page is for discussion related to 
[https://issues.apache.org/jira/browse/HBASE-61 HBASE-61, Create an 
HBase-specific MapFile implementation].  That issue, and its linked issues, have 
a bunch of suggestions for how we might do persistence better.  Most have 
been replicated in the ''New Format'' section below.  Other related issues 
include [https://issues.apache.org/jira/browse/HADOOP-3315 TFile] and 
[https://issues.apache.org/jira/browse/HBASE-647 HBASE-647, Remove the 
HStoreFile 'info' file (and index and bloomfilter if possible)], as well as 
''SSTable'' from the Bigtable paper.
  
  == Current Implementation ==
  
@@ -41, +41 @@

   * Always-on general bloomfilter. We know how many entries a file will have 
when we go to flush it, so we can optimally size a bloomfilter.  The small 
amount of memory a bloomfilter occupies will pay for itself many-fold in the 
seeks saved when trying to figure out if a file contains an asked-for key.
   * Optimal random-access
   * Iterate over keys only, rather than always over key+values as MapFile 
currently does.  This would be useful when trying to find the closest key. 
TFile and SequenceFile can do this (it's not exposed in MapFile).
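
The bloomfilter sizing point above can be sketched with the standard Bloom filter formulas. This is only an illustration, not HBase code: the function name `bloom_parameters` and the target false-positive rate are assumptions, since the page does not name a rate.

```python
import math


def bloom_parameters(num_entries: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) for a standard Bloom filter.

    Uses the textbook formulas m = -n * ln(p) / (ln 2)^2 and k = (m / n) * ln 2.
    num_entries is the key count known at flush time; false_positive_rate is
    an assumed tuning knob, not something the wiki page specifies.
    """
    if num_entries <= 0:
        raise ValueError("num_entries must be positive")
    if not 0.0 < false_positive_rate < 1.0:
        raise ValueError("false_positive_rate must be between 0 and 1")
    bits = math.ceil(-num_entries * math.log(false_positive_rate) / (math.log(2) ** 2))
    hash_count = max(1, round(bits / num_entries * math.log(2)))
    return bits, hash_count


if __name__ == "__main__":
    bits, hashes = bloom_parameters(1_000_000, 0.01)
    print(f"{bits} bits (~{bits / 8 / 1024 / 1024:.2f} MiB), {hashes} hash functions")
```

With these formulas, one million entries at a 1% false-positive target need roughly 9.6 million bits (a little over 1 MiB) and 7 hash functions, which is the "small amount of memory" trade-off the bullet describes.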
-  
+ 
+ === Index ===
+ TODO, but the TFile block-based index, rather than the MapFile interval-based 
one, would seem better for us: indices are then of predictable size, and a seek 
to an index position will land at an amenable spot (a block start) when blocks 
are compressed. 
  
  === Nice-to-haves ===
   * Don't write out the family portion of column when writing keys.
  
  == Other File Formats ==
+ 
  Cassandra uses a Sequence File.  It adds key/values in blocks of 128 by 
default.  On the 128th entry, an index for the block keys is inlined and then a 
new block begins.  Block offsets are kept out in an index file as in MapFile.  
Bloomfilters are on by default.
  
+ From the Bigtable paper, an SSTable "... contains a sequence of blocks 
(typically each block is 64KB in size, but this is configurable).  A block 
index (stored at the end of the SSTable) is used to locate blocks; the index is 
loaded into memory when the SSTable is opened.  A lookup can be performed with 
a single disk seek: we first find the appropriate block by performing a binary 
search in the in-memory index, and then reading the appropriate block from 
disk.  Optionally, an SSTable can be completely mapped into memory, which 
allows us to perform lookups and scans without touching the disk."
+ 
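
The lookup path in the SSTable quote (binary search of an in-memory block index, then one block read) can be sketched as follows. This is a toy model under stated assumptions: the class `BlockIndexedFile` is hypothetical, blocks hold a fixed number of entries rather than a byte size, the index stores each block's first key (the paper excerpt does not say which key is indexed), and the "disk" is a Python list with a read counter.

```python
import bisect


class BlockIndexedFile:
    """Toy block-indexed sorted file; not the TFile or SSTable format."""

    def __init__(self, sorted_items, entries_per_block=4):
        # Split the sorted (key, value) pairs into fixed-size blocks.
        self.blocks = [
            sorted_items[i:i + entries_per_block]
            for i in range(0, len(sorted_items), entries_per_block)
        ]
        # In-memory index: first key of each block (an assumption of this sketch).
        self.first_keys = [block[0][0] for block in self.blocks]
        self.block_reads = 0  # stands in for disk seeks

    def _read_block(self, block_number):
        self.block_reads += 1
        return self.blocks[block_number]

    def get(self, key):
        # Find the last block whose first key is <= key.
        block_number = bisect.bisect_right(self.first_keys, key) - 1
        if block_number < 0:
            return None  # key sorts before the first block; no read needed
        block = self._read_block(block_number)
        keys = [k for k, _ in block]
        i = bisect.bisect_left(keys, key)
        if i < len(block) and block[i][0] == key:
            return block[i][1]
        return None


if __name__ == "__main__":
    items = [(f"row{i:03d}", i) for i in range(10)]
    f = BlockIndexedFile(items)
    print(f.get("row005"), f.block_reads)
```

In this model the index grows with the number of blocks rather than the number of keys, and each lookup touches at most one block, which is the property the Index section above is after.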
