Re: How and where exactly LSM trees are used in HBase?

Mikael Sitruk Mon, 09 Dec 2013 05:09:02 -0800

LSM trees are the basis for reducing random I/O, which is a huge performance
factor in big data systems. A good overview can be found in the book "HBase:
The Definitive Guide" by Lars George.
The basic idea is that you have an in-memory structure for the latest
changes and a structure stored in files. The content of each file is always
ordered by key, and each entry in a file is just the row key, column family
identifier, column name, timestamp and the value (+ a marker).
When the memory is full, the in-memory structure is flushed to disk. When
a certain number of files have accumulated on the filesystem, they are merged
into bigger ones. Since the files are ordered, the merge is very fast (like
the merge step of the mergesort algorithm).
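
To make the flush and merge steps concrete, here is a minimal Python sketch.
It is not HBase code: the names Cell, sort_key and TinyLSM are made up for
illustration, the "files" are plain in-memory lists, and the thresholds are
arbitrary. It assumes newer timestamps sort first within a column, and it
only merges; it does not drop old versions or deleted cells.

```python
import heapq
from typing import NamedTuple


class Cell(NamedTuple):
    # Hypothetical entry layout, mirroring the fields listed above.
    row_key: str
    family: str
    qualifier: str
    timestamp: int
    value: str
    deleted: bool = False  # the "marker" (e.g. a delete tombstone)


def sort_key(cell):
    # Order by row key, family, column name; newest timestamp first.
    return (cell.row_key, cell.family, cell.qualifier, -cell.timestamp)


class TinyLSM:
    def __init__(self, flush_threshold=3, compact_threshold=3):
        self.memstore = []  # in-memory structure holding the latest changes
        self.files = []     # each "file" is an immutable, sorted list of cells
        self.flush_threshold = flush_threshold
        self.compact_threshold = compact_threshold

    def put(self, cell):
        self.memstore.append(cell)
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memory contents out as one new sorted file.
        if not self.memstore:
            return
        self.files.append(sorted(self.memstore, key=sort_key))
        self.memstore = []
        if len(self.files) >= self.compact_threshold:
            self.compact()

    def compact(self):
        # k-way merge of already sorted files: each file is read once,
        # front to back, like the merge step of mergesort.
        merged = list(heapq.merge(*self.files, key=sort_key))
        self.files = [merged]


store = TinyLSM(flush_threshold=2, compact_threshold=2)
store.put(Cell("row2", "cf", "a", 1, "x"))
store.put(Cell("row1", "cf", "a", 1, "y"))  # memory full: flushed as file 1
store.put(Cell("row1", "cf", "a", 2, "z"))
store.put(Cell("row3", "cf", "a", 1, "w"))  # flushed as file 2, then merged
print([(c.row_key, c.timestamp, c.value) for c in store.files[0]])
```

Because every file is already sorted, the merge never has to seek around: it
just reads each input sequentially and writes one sorted output, which is
where the reduction in random I/O comes from.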



On Sun, Dec 8, 2013 at 8:42 AM, Ted Yu <[email protected]> wrote:

> Searching for 'lsm tree hbase' would give you several articles.
>
> I am in China - the search results are mostly in Chinese.
>
> You should be able to read this:
>
> http://stackoverflow.com/questions/13762992/log-structured-merge-tree-in-hbase
>
> Cheers
>
>
> On Wed, Dec 4, 2013 at 6:49 PM, AnilKumar B <[email protected]> wrote:
>
> > Hi,
> >
> > We are trying to understand how and where exactly LSM trees are used in
> > HBase. Currently, as per our understanding, they are used while flushing
> > the memstore to store files and during HFile compaction, and they sit on
> > top of HFiles at the memstore level.
> >
> > Is this understanding correct? Can you please give more insight on this?
> > How exactly is the merging done?
> >
> > Thanks & Regards,
> > B Anil Kumar.
> >
>
