Thanks everyone for chiming in. I submitted
https://issues.apache.org/jira/browse/HBASE-27710 and will work on it in
the near future.

This does make me wonder what overhead checkRefCount causes in off-heap
cases too, but we just dont see it in profiling because the server does so
much other work. It has me thinking if there are other optimizations we
could do there. For example, currently we do checkRefCount for every single
ByteBuff operation (slice/duplicate/get/etc/etc). A lot of the buffer
operations happen before creating a Cell for return to user in
HFileBlock/DBE. I wonder if we could do all the parsing we need in
HFileBlock/DBE without checkRefCount, do a single checkRefCount call when
we create the Cell, then enable checkRefCount for all future ByteBuff calls
before returning a cell to upper levels. This would probably be a larger
project, so not investigating now. Just an idea i had.

On Mon, Mar 13, 2023 at 2:06 AM Viraj Jasani <vjas...@apache.org> wrote:

> +1, as the numbers suggest significant perf improvement.
>
>
> On Thu, Mar 9, 2023 at 9:36 AM Bryan Beaudreault <bbeaudrea...@apache.org>
> wrote:
>
> > Hey all,
> >
> > We have a use-case for HFile.Reader at my company. The use-case in
> question
> > is scanning many hfiles for a small subset of cells, so it mostly is
> > just iterating a large number of HFiles and discarding most of the cells.
> > We recently upgraded that deployable from super old cdh 5.16 (hbase
> > 1.2-ish) to hbase 2.5.3. In doing so, we noticed a performance regression
> > of around 4x. I imagine this regression would also affect
> > ClientSideRegionScanner.
> >
> > We did some profiling and noticed that a large amount of time is spent in
> > SingleByteBuff.checkRefCnt. It seems like every SingleByteBuff method
> calls
> > checkRefCnt and this checks compares a volatile
> > in AtomicIntegerFieldUpdater in the netty code.
> >
> > I believe ref counting is mostly necessary for off-heap buffers, but
> > on-heap buffers are also wrapped in SingleByteBuff and so also go through
> > checkRefCnt. We removed the checkRefCnt call, and the regression
> > disappeared.
> >
> > We created a simple test method which just does HFile.createReader,
> > reader.getScanner(), and then iterates the scanner counting the total
> > cells. The test reads an hfile with 100M cells and takes  over 1 minute
> > with checkRefCnt. Removing checkRefCnt brings the runtime down to 20s.
> >
> > It's worth noting that the regression is most prominent on java17. It's
> > slightly less obvious on java11, with runtime being 40s vs 28s.
> >
> > Thoughts on updating SingleByteBuff to skip the checkRefCnt call for
> > on-heap buffers? We can handle this in the constructor, when wrapping an
> > on-heap buffer here [1].
> >
> > [1]
> >
> >
> https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300
> >
>

Reply via email to