When I ran the test, I found that the compaction thread pushed one CPU core to 100% while the other cores were not fully utilized. I checked the HBase code and found that a single thread does all compactions. That needs to change. The 0.21 version of HBase does plan to use multiple compaction threads, but it appears it will take a long time for us to get there.

Another issue I found with HBase is that when the number of regions reaches around 1000 per regionserver, the regionserver begins to shut itself down for various reasons: most of the time an IO issue with a datanode, occasionally a session-expiration issue with ZooKeeper. This is true regardless of what key I use for the table.
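(One knob that can keep the per-server region count down is the region split size in hbase-site.xml. The value below is purely illustrative, not a recommendation:

```xml
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- A region splits once a store file exceeds this size; raising it
       from the 0.20 default of 256MB yields fewer, larger regions
       per regionserver. Illustrative value: 1GB. -->
  <value>1073741824</value>
</property>
```

Fewer regions also means fewer store files competing for the single compaction thread.)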

Jimmy.

--------------------------------------------------
From: "Jeff Whiting" <[email protected]>
Sent: Friday, September 10, 2010 9:44 AM
To: <[email protected]>
Subject: Re: ycsb test on hbase

We were having the exact same problem when we were doing our own load testing with HBase. We found that a memstore would reach its hbase.hstore.blockingStoreFiles limit or its hbase.hregion.memstore.block.multiplier limit. Hitting either of those limits blocks writes to that specific region, and the client has to pause until a compaction can come through and clean stuff up.
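(For reference, both thresholds live in hbase-site.xml. A sketch of the kind of tuning being discussed; the values here are illustrative, not recommendations:

```xml
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <!-- Block updates to a region once one of its stores has this many
       StoreFiles awaiting compaction (0.20 default is 7). -->
  <value>15</value>
</property>
<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <!-- Block updates once the memstore grows to multiplier times the
       configured flush size (default multiplier is 2). -->
  <value>4</value>
</property>
```

Raising either limit trades longer pauses later for fewer blocks now, which is why the priority-queue fix below matters.)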

However, the biggest problem was that there would be a decent-sized compaction queue; we'd hit one of those limits, get put on the *back* of the queue, and have to wait *minutes* before the compaction we needed to stop the blocking finally ran. I created a JIRA to address the issue: HBASE-2646. There is a patch in the JIRA for 0.20.4 that creates a priority compaction queue, which greatly helped our problem; in fact, we saw little to no pausing after applying the patch. In the comments of the JIRA you can see some of the settings we used to mitigate the problem without the patch.
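(The idea behind a priority compaction queue can be sketched in a few lines of Java. This is only an illustration of the ordering trick, not the actual HBASE-2646 patch; the names CompactionRequest, PRIORITY_BLOCKED, and PRIORITY_NORMAL are made up for this example:

```java
import java.util.concurrent.PriorityBlockingQueue;

public class PriorityCompactionQueueSketch {

    static final int PRIORITY_BLOCKED = 0;  // region is blocking writes: jump the queue
    static final int PRIORITY_NORMAL = 10;  // routine background compaction

    // One queued compaction; lower priority value means it runs sooner.
    static class CompactionRequest implements Comparable<CompactionRequest> {
        final String region;
        final int priority;

        CompactionRequest(String region, int priority) {
            this.region = region;
            this.priority = priority;
        }

        @Override
        public int compareTo(CompactionRequest other) {
            return Integer.compare(this.priority, other.priority);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // PriorityBlockingQueue orders by compareTo instead of FIFO.
        PriorityBlockingQueue<CompactionRequest> queue = new PriorityBlockingQueue<>();
        queue.put(new CompactionRequest("region-a", PRIORITY_NORMAL));
        queue.put(new CompactionRequest("region-b", PRIORITY_NORMAL));
        // region-c just hit blockingStoreFiles: it goes to the front, not the back.
        queue.put(new CompactionRequest("region-c", PRIORITY_BLOCKED));

        System.out.println(queue.take().region); // prints "region-c"
    }
}
```

With a plain FIFO queue, region-c would wait behind the two background requests; with the priority ordering it is compacted first, which is exactly the blocking scenario the patch fixes.)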

Apparently there is some work going on to do concurrent priority compactions (Jonathan Gray has been working on it), but I haven't seen anything in HBase yet and don't know the timeline. My personal opinion is that we should integrate the patch into trunk and use it until the more advanced compactions are implemented.

~Jeff

On 9/10/2010 2:27 AM, Jeff Hammerbacher wrote:
We've been brainstorming some ideas to "smooth out" these performance
lapses, so instead of getting a 10 second period of unavailability, you get
a 30 second period of slower performance, which is usually preferable.

Where is this brainstorming taking place? Could we open a JIRA issue to
capture the brainstorming in public and searchable fashion?


--
Jeff Whiting
Qualtrics Senior Software Engineer
[email protected]
