I found this useful article that explains the internal storage of HFile:
http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
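
To make the layout asked about in the quoted messages below concrete, here is a rough Python sketch. It is not HBase code: `Cell`, `sort_key`, and `read_row` are hypothetical names made up for illustration. It assumes each cell version is stored as its own key/value pair, and that keys sort by row, family, and qualifier ascending, then by timestamp with the newest version first.

```python
from typing import List, NamedTuple, Tuple


class Cell(NamedTuple):
    """One stored key/value pair: the first four fields form the key."""
    row: bytes
    family: bytes
    qualifier: bytes
    timestamp: int
    value: bytes


def sort_key(cell: Cell) -> Tuple[bytes, bytes, bytes, int]:
    # Row, family, qualifier ascending; timestamp negated so that the
    # newest version of a cell sorts first (assumed ordering).
    return (cell.row, cell.family, cell.qualifier, -cell.timestamp)


def read_row(sorted_cells: List[Cell], row: bytes) -> List[Cell]:
    # A row is not one record: reading it means collecting every
    # key/value pair whose key starts with that row key.
    return [c for c in sorted_cells if c.row == row]


# The three pairs from the example in the quoted message.
cells = sorted(
    [
        Cell(b"rowkey1", b"family1", b"qualifier1", 1, b"value1"),
        Cell(b"rowkey1", b"family1", b"qualifier1", 2, b"value2"),
        Cell(b"rowkey2", b"family1", b"qualifier1", 1, b"value3"),
    ],
    key=sort_key,
)

row1 = read_row(cells, b"rowkey1")
for cell in row1:
    print(cell)
```

In this sketch, reading rowkey1 returns two pairs, one per version, so under these assumptions reading a row does involve multiple key/value pairs. Because the pairs are sorted, they sit next to each other and can be read in one sequential pass.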

On Tue, Mar 22, 2011 at 11:31 AM, Weishung Chung <[email protected]> wrote:

> I also found this informative article:
>
> http://cloudepr.blogspot.com/2009/09/hfile-block-indexed-file-format-to.html
>
> Would the key/value pairs be as follows, e.g. for column family1 with
> one qualifier (qualifier1) and 2 versions?
>
> key1:   rowkey1+column family1:qualifier1+timestamp1
> value1: corresponding cell value1
> key2:   rowkey1+column family1:qualifier1+timestamp2
> value2: corresponding cell value2
> key3:   rowkey2+column family1:qualifier1+timestamp1
> value3: corresponding cell value3
>
> On Tue, Mar 22, 2011 at 10:58 AM, Vivek Krishna <[email protected]> wrote:
>
>> http://nosql.mypopescu.com/post/3220921756/hbase-internals-hfile-explained
>> might help.
>>
>> Viv
>>
>> On Tue, Mar 22, 2011 at 11:43 AM, Weishung Chung <[email protected]> wrote:
>>
>>> My fellow superb hbase experts,
>>>
>>> I am looking at the HFile specs and have some questions.
>>> How is a particular table cell in an HBase table represented in the
>>> HFile? Does the key of the key/value pair represent the rowkey+column
>>> family:qualifier+timestamp and the value represent the corresponding
>>> cell value? If so, to read a row, do multiple key/value pair reads
>>> have to be done?
>>>
>>> Thank you :)
>>>
>>> On Tue, Mar 22, 2011 at 9:09 AM, Weishung Chung <[email protected]> wrote:
>>>
>>>> Thank you, I will definitely take a look. Also, the TFile spec below
>>>> helps me to understand more. What exciting work!
>>>>
>>>> https://issues.apache.org/jira/secure/attachment/12396286/TFile+Specification+20081217.pdf
>>>>
>>>> On Mon, Mar 21, 2011 at 11:41 AM, Doug Cutting <[email protected]> wrote:
>>>>
>>>>> On 03/19/2011 09:01 AM, Weishung Chung wrote:
>>>>>> I am browsing through the hadoop.io package and was wondering what
>>>>>> other file formats are available in hadoop other than SequenceFile
>>>>>> and TFile? Is all data written through hadoop, including that from
>>>>>> hbase, saved in the above formats? It seems like SequenceFile is in
>>>>>> key value pair format.
>>>>>
>>>>> Avro includes a file format that works with Hadoop.
>>>>>
>>>>> http://avro.apache.org/docs/current/api/java/org/apache/avro/mapred/package-summary.html
>>>>>
>>>>> Doug
