[
https://issues.apache.org/jira/browse/OAK-12094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18058411#comment-18058411
]
Julian Sedding edited comment on OAK-12094 at 2/13/26 12:02 PM:
----------------------------------------------------------------
h2. Analysis
In order to get comparable heap dumps, I ran the tests at the commit _before_
OAK-12040 with only {{-Xmx432m}}. This caused the tests to fail with an
{{OutOfMemoryError}} at roughly the same point during test execution as the
failure _after_ OAK-12040 with {{-Xmx496m}}.
Comparing the "retained objects" of both heap dumps (left=before OAK-12040,
right=after OAK-12040), it is evident that most of the objects look very
similar, but after the change the heap occupied by {{byte[]}} objects is ~66MB
larger.
!retained-objects-before-OAK-12040.png|width=560!
!retained-objects-after-OAK-12040.png|width=560!
Looking into the byte arrays reveals 16 {{byte[]}} instances of 4.19MB each
(total: 67.04MB), all of which are referenced by an
{{io.netty.buffer.PoolChunk}}.
!largest-byte-array-incoming-reference-after-OAK-12040.png|width=1120!
It appears that the change in how the Azure SDK is invoked to upload the
segment data alters how netty is called and how it buffers the data.
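These numbers are consistent with a pooling allocator: released buffers are returned to a pool instead of becoming garbage-collectable, so their backing {{byte[]}} chunks stay reachable. The following is a simplified sketch of that behaviour, not netty's actual implementation; the 4MiB chunk size (4,194,304 bytes, which a profiler reports as 4.19MB) is an assumption inferred from the heap dump:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified sketch of a pooling allocator: released chunks are kept in a
// pool for reuse rather than being returned to the GC, so the heap they
// occupy remains "retained" even when no buffer is in use.
class PoolingAllocator {
    // Assumed chunk size, matching the 4.19MB byte arrays seen in the dump.
    static final int CHUNK_SIZE = 4 * 1024 * 1024;

    private final Deque<byte[]> pool = new ArrayDeque<>();
    long retainedBytes = 0;

    byte[] allocate() {
        byte[] chunk = pool.pollFirst();
        if (chunk == null) {
            chunk = new byte[CHUNK_SIZE];
            retainedBytes += CHUNK_SIZE;
        }
        return chunk;
    }

    void release(byte[] chunk) {
        pool.addFirst(chunk); // chunk stays reachable via the pool
    }

    public static void main(String[] args) {
        PoolingAllocator alloc = new PoolingAllocator();
        byte[][] inFlight = new byte[16][];
        for (int i = 0; i < 16; i++) {
            inFlight[i] = alloc.allocate();
        }
        for (byte[] chunk : inFlight) {
            alloc.release(chunk);
        }
        // All 16 chunks remain reachable after release, just like the 16
        // byte[] instances referenced by io.netty.buffer.PoolChunk above.
        System.out.println(alloc.retainedBytes / (1024 * 1024) + " MB retained");
    }
}
```

In a heap dump this shows up exactly as observed: the arrays are not leaked in the classic sense, but they are anchored by the pool and therefore never reclaimed.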
It is possible to instruct netty to use a different allocator by setting
{{-Dio.netty.allocator.type=unpooled}}. And indeed, the tests run to completion
with {{-Xmx464m}} (just like before OAK-12040) when invoked with this option:
{{mvn clean test -Dtest.opts.memory="-Xmx464m
-Dio.netty.allocator.type=unpooled" -pl oak-segment-azure}}.
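The same switch can also be made programmatically, e.g. in test setup code. This is a sketch under the assumption that netty reads the {{io.netty.allocator.type}} system property once, during static initialization of its buffer classes, so it must be set before any netty class is loaded:

```java
public class NettyAllocatorConfig {
    // Equivalent to passing -Dio.netty.allocator.type=unpooled on the
    // command line. Assumption: this runs before any netty buffer class
    // is initialized, as the property is only read at that point.
    public static void configureUnpooledAllocator() {
        System.setProperty("io.netty.allocator.type", "unpooled");
    }

    public static void main(String[] args) {
        configureUnpooledAllocator();
        System.out.println(System.getProperty("io.netty.allocator.type"));
    }
}
```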
This confirms that the change for OAK-12040 is indeed responsible for the
increased heap usage, and that the increase originates in how the netty library
is used (via the Azure SDK).
> segment-azure: increased heap usage due to OAK-12040
> ----------------------------------------------------
>
> Key: OAK-12094
> URL: https://issues.apache.org/jira/browse/OAK-12094
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-azure
> Reporter: Julian Sedding
> Assignee: Julian Sedding
> Priority: Minor
> Attachments:
> largest-byte-array-incoming-reference-after-OAK-12040.png,
> retained-objects-after-OAK-12040.png, retained-objects-before-OAK-12040.png
>
>
> The change for OAK-12040 accidentally introduced increased heap usage,
> presumably due to the way the Azure SDK is invoked.
> Running all unit tests in {{oak-segment-azure}} via {{mvn clean test
> -Dtest.opts.memory=-Xmx496m -pl oak-segment-azure}} reliably causes an
> {{OutOfMemoryError}} (on my machine using java 21), which wasn't the case
> before this change.
>