Re: [ANN] HBaseHUT - HBase High Update Throughput

Alex Baranau Tue, 30 Nov 2010 22:22:18 -0800

Hi Alvin,

There are no limitations as such, but in certain cases you may not want to
use HBaseHUT.

0) In general, if doing a Get & Put operation for each write of new data is
OK for you (e.g. you have a low write rate), then I believe it's easier to go
that way.

1) In certain cases read performance can degrade noticeably when using
HBaseHUT (in exchange for much higher write speed, of course). By "certain
cases" I primarily mean the situation when both of the following are true:
* updates affect only a small number of records (i.e. they are concentrated
rather than "well-spread") *and*
* the rate of updates (not "new inserts") is very heavy

Thus, if you have 100K unprocessed update operations (i.e. update processing
hasn't fired for them yet) for some data item, then reading this data item is
likely to be slow (depending on your update logic, close to the HBase scan
speed on your cluster).
Note that with the "storing processed updates on client reads" feature, the
next time a user requests the same data item it will be returned very fast
(as a single Get operation).
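
To make the trade-off concrete, here is a rough in-memory sketch in Python. It
is not the HBaseHUT API and it does not talk to HBase: the class
DeferredUpdateStore, its methods, and the "records touched" counter are all
made up for illustration. It only assumes that writes are appended without a
prior read, and that a read merges the pending updates and may store the
merged result back.

```python
from collections import defaultdict


class DeferredUpdateStore:
    """Toy stand-in for a table. Hypothetical; not the HBaseHUT API."""

    def __init__(self, merge):
        self.merge = merge                 # folds one update into a value
        self.processed = {}                # key -> already merged value
        self.pending = defaultdict(list)   # key -> unprocessed updates

    def update(self, key, delta):
        # Write path: append only, no Get before the Put.
        self.pending[key].append(delta)

    def read(self, key, store_result=True):
        # Read path: every pending update for the key has to be visited.
        updates = self.pending.get(key, [])
        touched = len(updates) + (1 if key in self.processed else 0)
        value = self.processed.get(key)
        for u in updates:
            value = u if value is None else self.merge(value, u)
        if store_result and updates:
            # Store the merged result so the next read is a single lookup.
            self.processed[key] = value
            self.pending.pop(key)
        return value, touched


store = DeferredUpdateStore(merge=lambda a, b: a + b)
for _ in range(100_000):
    store.update("item", 1)    # heavy updates concentrated on one item

first = store.read("item")     # has to walk all pending updates
second = store.read("item")    # served from the stored merged value
print(first, second)
```

The first read touches every pending update for the item, while the second
touches only the stored merged value, which is the effect described above.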

Thanks for the question, I'll gather questions like this for the FAQ ;).

Alex Baranau
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase

On Wed, Dec 1, 2010 at 4:54 AM, Alvin C.L Huang <[email protected]> wrote:

> Is there any limitation?
>
> Many thanks
>
> Alvin
>
>
> On 1 December 2010 01:39, Stack <[email protected]> wrote:
>
> > Thanks for sending the announcement to the list, Alex.  Congrats on the
> > new project.  You might consider adding your project to
> > http://wiki.apache.org/hadoop/SupportingProjects?
> >
> > St.Ack
> >
> > On Tue, Nov 30, 2010 at 5:40 AM, Alex Baranau <[email protected]>
> > wrote:
> > > Hello,
> > >
> > > Let me introduce a new effort around HBase: HBaseHUT.
> > > It proposes a solution to a problem mentioned many times on this mailing
> > > list, "do Get on every Put operation to update record" (which causes bad
> > > write performance), and is suitable for many use-cases.
> > >
> > > Sources available here: http://github.com/sematext/HBaseHUT
> > > Wiki with some idea/usage details:
> > > https://github.com/sematext/HBaseHUT/wiki
> > >
> > > It would be great to receive feedback on the overall idea.
> > >
> > > Thank you!
> > >
> > > Alex Baranau
> > > ----
> > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - Hadoop - HBase
> > >
> >
>
