Much thanks to Duo for his review, this will be merged shortly to active
branches.

In the PR we brainstormed ways to also optimize *off-heap* ByteBuffs. That
discussion is continuing in
https://issues.apache.org/jira/browse/HBASE-27730 if interested.

On Tue, Mar 14, 2023 at 8:45 PM Bryan Beaudreault <bbeaudrea...@apache.org>
wrote:

> For those following along, I’ve submitted an initial PR for this issue.
> https://github.com/apache/hbase/pull/5104
>
> On Mon, Mar 13, 2023 at 11:46 AM Bryan Beaudreault <bbeaudrea...@gmail.com>
> wrote:
>
>> Thanks everyone for chiming in. I submitted
>> https://issues.apache.org/jira/browse/HBASE-27710 and will work on it in
>> the near future.
>>
>> This does make me wonder what overhead checkRefCount causes in off-heap
>> cases too, but we just dont see it in profiling because the server does so
>> much other work. It has me thinking if there are other optimizations we
>> could do there. For example, currently we do checkRefCount for every single
>> ByteBuff operation (slice/duplicate/get/etc/etc). A lot of the buffer
>> operations happen before creating a Cell for return to user in
>> HFileBlock/DBE. I wonder if we could do all the parsing we need in
>> HFileBlock/DBE without checkRefCount, do a single checkRefCount call when
>> we create the Cell, then enable checkRefCount for all future ByteBuff calls
>> before returning a cell to upper levels. This would probably be a larger
>> project, so not investigating now. Just an idea i had.
>>
>> On Mon, Mar 13, 2023 at 2:06 AM Viraj Jasani <vjas...@apache.org> wrote:
>>
>>> +1, as the numbers suggest significant perf improvement.
>>>
>>>
>>> On Thu, Mar 9, 2023 at 9:36 AM Bryan Beaudreault <
>>> bbeaudrea...@apache.org>
>>> wrote:
>>>
>>> > Hey all,
>>> >
>>> > We have a use-case for HFile.Reader at my company. The use-case in
>>> question
>>> > is scanning many hfiles for a small subset of cells, so it mostly is
>>> > just iterating a large number of HFiles and discarding most of the
>>> cells.
>>> > We recently upgraded that deployable from super old cdh 5.16 (hbase
>>> > 1.2-ish) to hbase 2.5.3. In doing so, we noticed a performance
>>> regression
>>> > of around 4x. I imagine this regression would also affect
>>> > ClientSideRegionScanner.
>>> >
>>> > We did some profiling and noticed that a large amount of time is spent
>>> in
>>> > SingleByteBuff.checkRefCnt. It seems like every SingleByteBuff method
>>> calls
>>> > checkRefCnt and this checks compares a volatile
>>> > in AtomicIntegerFieldUpdater in the netty code.
>>> >
>>> > I believe ref counting is mostly necessary for off-heap buffers, but
>>> > on-heap buffers are also wrapped in SingleByteBuff and so also go
>>> through
>>> > checkRefCnt. We removed the checkRefCnt call, and the regression
>>> > disappeared.
>>> >
>>> > We created a simple test method which just does HFile.createReader,
>>> > reader.getScanner(), and then iterates the scanner counting the total
>>> > cells. The test reads an hfile with 100M cells and takes  over 1 minute
>>> > with checkRefCnt. Removing checkRefCnt brings the runtime down to 20s.
>>> >
>>> > It's worth noting that the regression is most prominent on java17. It's
>>> > slightly less obvious on java11, with runtime being 40s vs 28s.
>>> >
>>> > Thoughts on updating SingleByteBuff to skip the checkRefCnt call for
>>> > on-heap buffers? We can handle this in the constructor, when wrapping
>>> an
>>> > on-heap buffer here [1].
>>> >
>>> > [1]
>>> >
>>> >
>>> https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300
>>> >
>>>
>>

Reply via email to