[ https://issues.apache.org/jira/browse/HBASE-27730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17801953#comment-17801953 ]
Becker Ewing commented on HBASE-27730:
--------------------------------------
{noformat}
Benchmark              (shouldUseNoneRecycler)  (useOffHeapBuffer)  Mode  Cnt     Score     Error  Units
ByteBuffBenchmark.get                     true                true  avgt    5  8715.026 ± 196.680  ns/op
ByteBuffBenchmark.get                     true               false  avgt    5  9156.038 ± 879.701  ns/op
ByteBuffBenchmark.get                    false                true  avgt    5  8746.112 ±  91.118  ns/op
ByteBuffBenchmark.get                    false               false  avgt    5  9067.323 ± 190.359  ns/op

Benchmark                                         (shouldUseNoneRecycler)              (vlong)  Mode  Cnt   Score   Error  Units
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true                    9  avgt    5   2.254 ± 0.077  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true                  512  avgt    5   5.305 ± 0.169  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true                80000  avgt    5   7.014 ± 0.929  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true         548755813887  avgt    5   6.730 ± 0.037  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true        1700104028981  avgt    5   7.497 ± 0.133  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                     true  9123372036854775807  avgt    5  10.984 ± 0.445  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false                    9  avgt    5   2.242 ± 0.031  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false                  512  avgt    5   5.349 ± 0.075  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false                80000  avgt    5   6.677 ± 0.041  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false         548755813887  avgt    5   6.772 ± 0.528  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false        1700104028981  avgt    5   7.436 ± 0.128  ns/op
ReadVLongBenchmark.readVLongHBase14186_OffHeapBB                    false  9123372036854775807  avgt    5  11.173 ± 0.217  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true                    9  avgt    5   2.495 ± 0.268  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true                  512  avgt    5   5.335 ± 0.166  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true                80000  avgt    5   7.014 ± 0.177  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true         548755813887  avgt    5   6.990 ± 0.083  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true        1700104028981  avgt    5   7.085 ± 0.093  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                      true  9123372036854775807  avgt    5  10.319 ± 0.229  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false                    9  avgt    5   2.420 ± 0.126  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false                  512  avgt    5   5.508 ± 0.500  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false                80000  avgt    5   7.137 ± 0.108  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false         548755813887  avgt    5   6.943 ± 0.040  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false        1700104028981  avgt    5   6.921 ± 0.734  ns/op
ReadVLongBenchmark.readVLongHBase14186_OnHeapBB                     false  9123372036854775807  avgt    5  10.250 ± 0.364  ns/op
{noformat}
> Optimize reference counting in off-heap ByteBuff
> ------------------------------------------------
>
> Key: HBASE-27730
> URL: https://issues.apache.org/jira/browse/HBASE-27730
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Assignee: Becker Ewing
> Priority: Major
> Attachments: HBASE-27730-prelim.patch
>
>
> In HBASE-27710 we uncovered a performance regression in reference counting of
> ByteBuff. This was especially prominent in on-heap buffers when doing a
> simple HFile.Reader iteration of a file. For that case, we saw a 4x
> regression when reference counting was in play.
> It stands to reason that this same regression exists in off-heap buffers, and
> I've run a microbenchmark which indeed shows the same issue. With existing
> reference counting, scanning a 20 GB HFile takes 40s. With an optimized
> version, scanning the same file takes 20s. We don't typically see this in
> profiling live regionservers where so much else goes on, but optimizing this
> would eliminate some CPU cycles.
> It's worth noting that Netty saw this same regression a few years ago:
> [https://github.com/netty/netty/pull/8895]. Hat tip to [~zhangduo] for
> pointing this out.
> One of the fixes there was to copy some internal code from deeper in the ref
> counting, so that the call stack was smaller and inlining was possible. We
> can't really do that.
> Another thing they did was add a boolean field in their CompositeByteBuffer,
> which gets set to true when the buffer is recycled. That way they don't need to
> do reference counting on every operation; they can just check a boolean.
> I tried adding a boolean to our RefCnt.java, and it indeed fixes the
> regression. The problem is, due to object alignment in Java, adding
> this boolean field increases the heap size of RefCnt from 24 to 32 bytes.
> This seems non-trivial given it's used in bucket cache where there could be
> many millions of them.
> I think we can get around this by simply nulling out the recycler in RefCnt
> after it has been called. Then, instead of doing a boolean check we can do a
> null check. This performs similarly to the boolean, but without any extra
> memory.
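
For readers following along, here is a minimal Java sketch of the null-check idea described in the issue. It is not the actual HBase code or the attached patch: SketchRefCnt, SketchBuffer, and the Recycler interface are hypothetical names made up for illustration, and the volatile field is an assumption about visibility rather than something stated in the issue.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a buffer recycler; not the HBase interface. */
interface Recycler {
  void free();
}

/**
 * Illustrative sketch only: a reference counter that nulls out its recycler
 * once the recycler has been called, so liveness can be tested with a null check.
 */
final class SketchRefCnt {
  private final AtomicInteger refCnt = new AtomicInteger(1);
  // Assumption: volatile so that a release on one thread is visible to readers on another.
  private volatile Recycler recycler;

  SketchRefCnt(Recycler recycler) {
    this.recycler = recycler;
  }

  void retain() {
    // CAS loop so that a count which already reached zero is never incremented again.
    for (;;) {
      int current = refCnt.get();
      if (current <= 0) {
        throw new IllegalStateException("already released");
      }
      if (refCnt.compareAndSet(current, current + 1)) {
        return;
      }
    }
  }

  /** Returns true if this call dropped the last reference and ran the recycler. */
  boolean release() {
    int remaining = refCnt.decrementAndGet();
    if (remaining < 0) {
      throw new IllegalStateException("released more often than retained");
    }
    if (remaining == 0) {
      Recycler r = recycler;
      r.free();
      recycler = null; // reuse the existing field as the "recycled" flag; no extra field needed
      return true;
    }
    return false;
  }

  /** Hot-path check: one field read and a null comparison, no counter access. */
  boolean isRecycled() {
    return recycler == null;
  }
}

/** Hypothetical buffer whose accessors use the cheap check instead of reading the count. */
final class SketchBuffer {
  private final byte[] data;
  private final SketchRefCnt refCnt;

  SketchBuffer(byte[] data, SketchRefCnt refCnt) {
    this.data = data;
    this.refCnt = refCnt;
  }

  byte get(int index) {
    if (refCnt.isRecycled()) {
      throw new IllegalStateException("buffer already recycled");
    }
    return data[index];
  }
}
```

The point of the sketch is that get() never touches the atomic counter; it only reads a field the object already has, which is why this variant adds no memory compared with a separate boolean.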
--
This message was sent by Atlassian Jira
(v8.20.10#820010)