Bryan Beaudreault created HBASE-27730:
-----------------------------------------
Summary: Optimize reference counting in off-heap ByteBuff
Key: HBASE-27730
URL: https://issues.apache.org/jira/browse/HBASE-27730
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault
In HBASE-27710 we uncovered a performance regression in reference counting of
ByteBuff. This was especially prominent in on-heap buffers when doing a simple
HFile.Reader iteration of a file. For that case, we saw a 4x regression when
reference counting was in play.
It stands to reason that the same regression exists in off-heap buffers, and
I've run a microbenchmark which indeed shows the same issue. With the existing
reference counting, scanning a 20 GB HFile takes 40s. With an optimized version,
scanning the same file takes 20s. We don't typically see this when profiling
live regionservers, where so much else is going on, but optimizing it would
eliminate some CPU cycles.
It's worth noting that netty saw this same regression a few years ago:
[https://github.com/netty/netty/pull/8895]. Hat tip to [~zhangduo] for pointing
this out.
One of the fixes there was to copy some internal code from deeper in the ref
counting, so that the call stack was smaller and inlining was possible. We
can't really do that.
Another thing they did was add a boolean field in their CompositeByteBuffer,
which gets set to true when the buffer is recycled. That way they don't need to
do reference counting on every operation; they can just check a boolean.
I tried adding a boolean to our RefCnt.java, and it indeed fixes the
regression. The problem is that, due to object alignment in Java, adding this
boolean field increases the heap size of RefCnt from 24 to 32 bytes. This seems
non-trivial given it's used in bucket cache, where there could be many millions
of them.
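To make the 24 to 32 byte jump concrete, here is a rough sketch of the arithmetic. It assumes a 64-bit HotSpot JVM with a 12-byte object header and 8-byte object alignment. The field layout (one int plus two compressed references) is an assumption chosen to match the 24 bytes quoted above, not something read off RefCnt itself. ObjectSizeSketch and alignUp are hypothetical names.

```java
/** Hypothetical helper illustrating object size padding; not part of HBase. */
final class ObjectSizeSketch {
  /** Rounds size up to the next multiple of alignment. */
  static long alignUp(long size, long alignment) {
    return (size + alignment - 1) / alignment * alignment;
  }

  public static void main(String[] args) {
    long header = 12;        // assumed: 64-bit HotSpot with compressed class pointers
    long fields = 4 + 4 + 4; // assumed: one int and two compressed references
    long alignment = 8;      // assumed: default object alignment

    // 12 + 12 = 24, already a multiple of 8
    System.out.println(alignUp(header + fields, alignment));     // prints 24
    // one extra boolean byte gives 25, which pads out to 32
    System.out.println(alignUp(header + fields + 1, alignment)); // prints 32
  }
}
```

Under those assumptions a single extra byte costs a full 8 bytes per instance, which is where the concern about millions of instances comes from.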
I think we can get around this by simply nulling out the recycler in RefCnt
after it has been called. Then, instead of doing a boolean check we can do a
null check. This performs similarly to the boolean, but without any extra
memory.
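A minimal sketch of the null-check idea, to show the shape of it. This is not the actual RefCnt code: NullableRecyclerRefCnt, its Recycler interface, and checkAccessible are hypothetical names, and the sketch does not address how the cleared field becomes visible to other threads.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted handle; not HBase's RefCnt. */
final class NullableRecyclerRefCnt {
  /** Hypothetical callback that returns the buffer to its pool. */
  interface Recycler {
    void free();
  }

  private final AtomicInteger refCnt = new AtomicInteger(1);

  // Not final so it can be cleared. A null value doubles as the
  // "already recycled" marker, so no extra boolean field is needed.
  // A real implementation would have to decide how this write is
  // published to other threads; this sketch leaves that out.
  private Recycler recycler;

  NullableRecyclerRefCnt(Recycler recycler) {
    this.recycler = Objects.requireNonNull(recycler);
  }

  void retain() {
    refCnt.incrementAndGet();
  }

  /** Returns true if this call dropped the count to zero and recycled. */
  boolean release() {
    int remaining = refCnt.decrementAndGet();
    if (remaining < 0) {
      throw new IllegalStateException("released more times than retained");
    }
    if (remaining == 0) {
      recycler.free();
      recycler = null; // mark as recycled
      return true;
    }
    return false;
  }

  /** Cheap per-operation check: a null test instead of reading the count. */
  void checkAccessible() {
    if (recycler == null) {
      throw new IllegalStateException("buffer already recycled");
    }
  }
}
```

The per-operation path then only touches the recycler field, and the object keeps the same number of fields as before.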