Hi,

I think that (idle) background threads would not make much of a difference
to the raw speed of iterating over cells of a single region served from the
block cache. I started testing this way after noticing a slowdown on a real
installation. I can imagine that various improvements in other areas of
HBase 2 partly compensate for what I see in this narrow test, but I still
found these results remarkable enough to report.
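
As a quick sanity check on the figures quoted further down in this thread
(100K sequential gets taking 9 seconds on HBase 2.2.4 with the
BlockingRpcClient versus 4.75 seconds on HBase 1.6, and a scan over 15M rows
with RandomRowFilter(0.0001f)), here is a minimal Java sketch of the
arithmetic. The class and method names are made up for illustration and are
not part of HBase or of my actual test code; only the input numbers come
from the thread.

```java
// Hypothetical helper for checking the reported numbers; not HBase API.
class ReportedNumbers {

    // How much faster the fast run is than the slow run, in percent,
    // for the same amount of work (i.e. the throughput gain).
    static double percentFaster(double slowSeconds, double fastSeconds) {
        return (slowSeconds / fastSeconds - 1.0) * 100.0;
    }

    // Average time per operation in microseconds.
    static double microsPerOp(double totalSeconds, long operations) {
        return totalSeconds * 1_000_000.0 / operations;
    }

    // Expected number of rows kept by a filter that passes each row
    // independently with the given probability.
    static double expectedRows(long totalRows, double chance) {
        return totalRows * chance;
    }

    public static void main(String[] args) {
        long gets = 100_000;          // gets of the same row, in a loop
        double hbase2Seconds = 9.0;   // HBase 2.2.4 with BlockingRpcClient
        double hbase1Seconds = 4.75;  // HBase 1.6

        System.out.printf("HBase 2.2.4: %.1f us per get%n",
                microsPerOp(hbase2Seconds, gets));
        System.out.printf("HBase 1.6:   %.1f us per get%n",
                microsPerOp(hbase1Seconds, gets));
        System.out.printf("HBase 1.6 is %.1f%% faster%n",
                percentFaster(hbase2Seconds, hbase1Seconds));
        System.out.printf("Expected rows from the filtered scan: %.0f%n",
                expectedRows(15_000_000L, 0.0001));
    }
}
```

With these inputs the per-get latency works out to 90 versus 47.5
microseconds, so HBase 1.6 comes out roughly 89% faster, in line with the
"90%" quoted below, and the filter is expected to pass about 1500 rows.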

On Wed, May 20, 2020 at 4:33 PM 张铎(Duo Zhang) <[email protected]> wrote:

> Just saw that your tests were on local mode...
>
> Local mode is not for production so I do not see any related issues for
> improving the performance for hbase in local mode. Maybe we just have more
> threads in HBase 2 by default which makes it slow on a single machine, not
> sure...
>
> Could you please test it on a distributed cluster? If it is still a
> problem, you can open an issue, and I believe committers will offer to
> help verify the problem.
>
> Thanks.
>
> Bruno Dumon <[email protected]> wrote on Wed, May 20, 2020 at 4:45 PM:
>
> > For the scan test, there is only minimal rpc involved, I verified through
> > ScanMetrics that there are only 2 rpc calls for the scan. It is essentially
> > testing how fast the region server is able to iterate over the cells. There
> > are no delete cells, and the table is fully compacted (1 storage file), and
> > all data fits into the block cache.
> >
> > For the sequential gets (i.e. one get after the other, non-multi-threaded),
> > I tried the BlockingRpcClient. It is about 13% faster than the netty rpc
> > client. But the same code on 1.6 is still 90% faster. Concretely, my test
> > code does 100K gets of the same row in a loop. On HBase 2.2.4 with the
> > BlockingRpcClient this takes on average 9 seconds, with HBase 1.6 it takes
> > 4.75 seconds.
> >
> > On Wed, May 20, 2020 at 9:27 AM Debraj Manna <[email protected]>
> > wrote:
> >
> > > I cross-posted this in the Slack channel, as I was observing something
> > > quite similar. This is the suggestion I received; reposting it here for
> > > completeness.
> > >
> > > zhangduo  12:15 PM
> > > Does get also have the same performance drop, or only scan?
> > > zhangduo  12:18 PM
> > > For the rpc layer, hbase2 defaults to netty while hbase1 is pure java
> > > socket. You can set the rpc client to BlockingRpcClient to see if the
> > > performance is back.
> > >
> > > On Mon, May 18, 2020 at 7:58 PM Bruno Dumon <[email protected]> wrote:
> > > >
> > > > Hi,
> > > >
> > > > We are looking into migrating from HBase 1.2.x to HBase 2.1.x (on Cloudera
> > > > CDH).
> > > >
> > > > It seems like HBase 2 is slower than HBase 1 for both reading and writing.
> > > >
> > > > I did a simple test, using HBase 1.6.0 and HBase 2.2.4 (the standard OSS
> > > > versions), running in local mode (no HDFS) on my computer:
> > > >
> > > >  * ingested 15M single-KV rows
> > > >  * full table scan over them
> > > >  * to remove rpc latency as much as possible, the scan had a filter
> > > >    'new RandomRowFilter(0.0001f)', caching set to 10K (more than the
> > > >    number of rows returned) and hbase.cells.scanned.per.heartbeat.check
> > > >    set to 100M. This scan returns about 1500 rows/KVs.
> > > >  * HBase configured with hbase.regionserver.regionSplitLimit=1 to remove
> > > >    influence from region splitting
> > > >
> > > > In this test, scanning seems over 50% slower on HBase 2 compared to HBase 1.
> > > >
> > > > I tried flushing & major-compacting before doing the scan, in which case
> > > > the scan finishes faster, but the difference between the two HBase versions
> > > > stays about the same.
> > > >
> > > > The test code is written in Java, using the client libraries from the
> > > > corresponding HBase versions.
> > > >
> > > > Besides the above scan test, I also tested write performance through
> > > > BufferedMutator, scans without the filter (thus passing much more data over
> > > > the rpc), and sequential random Get requests. They all seem quite a bit
> > > > slower on HBase 2. Interestingly, using the HBase 1.6 client to talk to the
> > > > HBase 2.2.4 server is faster than using the HBase 2.2.4 client. So it seems
> > > > the rpc latency of the new client is worse.
> > > >
> > > > So my question is, is such a large performance drop to be expected when
> > > > migrating to HBase 2? Are there any special settings we need to be aware of?
> > > >
> > > > Thanks!
> > >
> >
> >
> > --
> > Bruno Dumon
> > NGDATA
> > http://www.ngdata.com/
> >
>


-- 
Bruno Dumon
NGDATA
http://www.ngdata.com/
