Hi there - there is a FAQ entry in the HBase book on this exact question:
http://hbase.apache.org/book.html#faq.hdfs.hbase

On 7/7/11 2:53 PM, "Mohit Anchlia" <[email protected]> wrote:

>I have looked at Bigtable and its SSTables, etc., but my question is
>directly related to how it is used with HDFS. HDFS recommends large
>files, bigger blocks, and write-once, read-many sequential access. But
>accessing small rows and writing small rows is more random, and
>different from the inherent design of HDFS. How do these two go
>together, and can the combination still provide good performance?
>
>On Thu, Jul 7, 2011 at 11:22 AM, Andrew Purtell <[email protected]> wrote:
>>Hi Mohit,
>>
>>Start here: http://labs.google.com/papers/bigtable.html
>>
>>Best regards,
>>
>>  - Andy
>>
>>Problems worthy of attack prove their worth by hitting back. - Piet Hein
>>(via Tom White)
>>
>>>________________________________
>>>From: Mohit Anchlia <[email protected]>
>>>To: [email protected]
>>>Sent: Thursday, July 7, 2011 11:12 AM
>>>Subject: Hbase performance with HDFS
>>>
>>>I've been trying to understand how HBase can provide good performance
>>>on top of HDFS, when the purpose of HDFS is sequential access to large
>>>blocks, which is inherently different from HBase, where access is more
>>>random and row sizes might be very small.
>>>
>>>I am reading the article below, but it doesn't answer my question. It
>>>does say that the HFile block size is different, but how that really
>>>works with HDFS is what I am trying to understand.
>>>
>>>http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html
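The short version of the answer the FAQ gives: HBase does not do small random writes against HDFS at all (writes go to a log and an in-memory store, then are flushed as large sequential HFiles), and random *reads* work because each big write-once HFile is internally divided into small data blocks (64 KB by default, versus the 64-128 MB HDFS block) with a block index, so a row lookup is one seek plus one short positioned read. The sketch below simulates that layout in plain Python - it is illustrative only, not HBase code, and names like `write_hfile_like` are made up:

```python
import io
import bisect

BLOCK_SIZE = 64 * 1024  # HFile's default data-block size (64 KB), far smaller
                        # than a typical HDFS block (64-128 MB)

def write_hfile_like(out, rows):
    """Write sorted (key, value) rows sequentially into ~BLOCK_SIZE chunks.
    Returns a block index of (first_key, file_offset) pairs, analogous to
    the block index HBase stores at the end of each HFile."""
    index = []
    buf, buf_bytes, first_key = [], 0, None
    for key, value in sorted(rows.items()):
        if first_key is None:
            first_key = key
        rec = f"{key}\t{value}\n".encode()
        buf.append(rec)
        buf_bytes += len(rec)
        if buf_bytes >= BLOCK_SIZE:
            index.append((first_key, out.tell()))
            out.write(b"".join(buf))            # one big sequential write
            buf, buf_bytes, first_key = [], 0, None
    if buf:
        index.append((first_key, out.tell()))
        out.write(b"".join(buf))
    index.append((None, out.tell()))            # sentinel: end of last block
    return index

def get(inp, index, key):
    """Random read of one small row: binary-search the in-memory block
    index, then do a single positioned read of one small block."""
    keys = [k for k, _ in index[:-1]]
    i = bisect.bisect_right(keys, key) - 1      # block whose first_key <= key
    if i < 0:
        return None
    start, end = index[i][1], index[i + 1][1]
    inp.seek(start)                             # positioned read inside the
    block = inp.read(end - start)               # big write-once file
    for line in block.splitlines():
        k, _, v = line.partition(b"\t")
        if k == key.encode():
            return v.decode()
    return None

# Demo: 10,000 small rows packed into one large "file"
rows = {f"row{i:05d}": f"val{i}" for i in range(10000)}
f = io.BytesIO()
idx = write_hfile_like(f, rows)
print(get(f, idx, "row00042"))  # -> val42
```

So the file itself stays large, immutable, and sequentially written (what HDFS is good at), while the index turns a point lookup into one small positioned read rather than a scan of the whole file.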
