Bryan Beaudreault created HBASE-27730:
-----------------------------------------

             Summary: Optimize reference counting in off-heap ByteBuff
                 Key: HBASE-27730
                 URL: https://issues.apache.org/jira/browse/HBASE-27730
             Project: HBase
          Issue Type: Improvement
            Reporter: Bryan Beaudreault


In HBASE-27710 we uncovered a performance regression in reference counting of 
ByteBuff. This was especially prominent in on-heap buffers when doing a simple 
HFile.Reader iteration of a file. For that case, we saw a 4x regression when 
reference counting was in play.

It stands to reason that the same regression exists in off-heap buffers, and a 
microbenchmark I ran confirms it. With the existing reference counting, 
scanning a 20 GB HFile takes 40s. With an optimized version, scanning the same 
file takes 20s. We don't typically see this when profiling live regionservers, 
where so much else is going on, but optimizing it would save some CPU cycles.

It's worth noting that Netty saw this same regression a few years ago: 
[https://github.com/netty/netty/pull/8895]. Hat tip to [~zhangduo] for pointing 
this out.

One of the fixes there was to copy some internal code from deeper in the ref 
counting, so that the call stack was smaller and inlining was possible. We 
can't really do that.

Another thing they did was add a boolean field to their CompositeByteBuf, 
which gets set to true when the buffer is recycled. That way they don't need 
to consult the reference count on every operation; they can just check a 
boolean.
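
To make the boolean-flag idea concrete, here is a minimal sketch. FlaggedBuffer 
and all of its members are hypothetical names made up for illustration; this is 
not Netty's or HBase's actual code, just the general shape of the technique.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not Netty's or HBase's actual class.
final class FlaggedBuffer {
  private final AtomicInteger refCnt = new AtomicInteger(1);
  // Plain (non-volatile) flag: cheap to read on the hot path, but its
  // visibility to other threads is best-effort.
  private boolean freed;
  private final byte[] data;

  FlaggedBuffer(int size) {
    this.data = new byte[size];
  }

  FlaggedBuffer retain() {
    if (refCnt.getAndIncrement() <= 0) {
      refCnt.getAndDecrement(); // undo the increment before failing
      throw new IllegalStateException("already released");
    }
    return this;
  }

  // Returns true when this call dropped the last reference.
  boolean release() {
    int remaining = refCnt.decrementAndGet();
    if (remaining < 0) {
      refCnt.getAndIncrement(); // undo the decrement before failing
      throw new IllegalStateException("released too many times");
    }
    if (remaining == 0) {
      freed = true; // set once, at deallocation time
      return true;
    }
    return false;
  }

  byte get(int index) {
    // Hot path: a single field read instead of an atomic refCnt read.
    if (freed) {
      throw new IllegalStateException("buffer was released");
    }
    return data[index];
  }
}
```

The accessor only reads a plain field, so the per-operation cost no longer 
depends on the atomic counter.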

I tried adding a boolean to our RefCnt.java, and it indeed fixes the 
regression. The problem is that, due to object alignment in Java, adding this 
boolean field increases the heap size of RefCnt from 24 to 32 bytes. This seems 
non-trivial given that RefCnt is used in the bucket cache, where there could be 
many millions of instances.
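
The jump from 24 to 32 bytes can be reproduced with some rough arithmetic. The 
sketch below assumes a 12-byte object header and 8-byte object alignment 
(typical for 64-bit HotSpot with compressed class pointers), plus 12 bytes of 
existing instance fields, a figure chosen only to match the 24 bytes quoted 
above. ObjectSizeSketch and alignUp are hypothetical names.

```java
// Rough size arithmetic under the assumptions stated above; real layouts
// depend on the JVM and its flags.
final class ObjectSizeSketch {
  static final long HEADER_BYTES = 12; // assumed object header size
  static final long ALIGNMENT = 8;     // assumed object alignment

  // Round size up to the next multiple of alignment.
  static long alignUp(long size, long alignment) {
    return (size + alignment - 1) / alignment * alignment;
  }

  public static void main(String[] args) {
    long existingFields = 12; // assumed, to match the 24-byte figure
    long before = alignUp(HEADER_BYTES + existingFields, ALIGNMENT);
    long after = alignUp(HEADER_BYTES + existingFields + 1, ALIGNMENT); // + boolean
    System.out.println(before + " -> " + after);
  }
}
```

Under these assumptions the object is already exactly at an alignment boundary, 
so even a one-byte field pushes it to the next 8-byte multiple.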

I think we can get around this by simply nulling out the recycler in RefCnt 
after it has been called. Then, instead of doing a boolean check we can do a 
null check. This performs similarly to the boolean, but without any extra 
memory.
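
A minimal sketch of the null-check idea follows. NullingRefCnt, its Runnable 
recycler, and the method names are all hypothetical and chosen for 
illustration; the real RefCnt has a different shape, and this is not the 
actual patch.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration only; not the actual HBase RefCnt.
final class NullingRefCnt {
  private final AtomicInteger refCnt = new AtomicInteger(1);
  // Doubles as the "still alive" marker: non-null until recycled.
  private Runnable recycler;

  NullingRefCnt(Runnable recycler) {
    this.recycler = Objects.requireNonNull(recycler);
  }

  // Returns true when this call dropped the last reference.
  boolean release() {
    int remaining = refCnt.decrementAndGet();
    if (remaining < 0) {
      refCnt.getAndIncrement(); // undo the decrement before failing
      throw new IllegalStateException("released too many times");
    }
    if (remaining > 0) {
      return false;
    }
    recycler.run();
    recycler = null; // null it out after it has been called
    return true;
  }

  boolean isRecycled() {
    // Hot path: a null check on an existing field, so no extra memory.
    return recycler == null;
  }

  void checkAccessible() {
    if (isRecycled()) {
      throw new IllegalStateException("buffer was recycled");
    }
  }
}
```

Because the recycler reference already exists on the object, reusing it as the 
liveness marker leaves the object size unchanged.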



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
