littlexyw opened a new pull request, #3249:
URL: https://github.com/apache/celeborn/pull/3249
…lushBuffer
### What changes were proposed in this pull request?
Backport #3224 to branch-0.5.
Remove the redundant release of data after an OutOfDirectMemoryError is thrown in
flushBuffer.addComponent.
### Why are the changes needed?
OutOfDirectMemoryError can be thrown in flushBuffer.addComponent because, after
adding a new component, CompositeByteBuf checks whether the number of components
exceeds the maximum limit. If it does, the existing components are merged into
one large component, which requires allocating new off-heap memory. If there is
not enough memory at that point, OutOfDirectMemoryError is thrown, but the new
component has already been added to flushBuffer. Releasing the new component at
this point causes a refcnt error.
Leaving the component unreleased here does not cause a memory leak, because it
is released normally in returnBuffer (on flush, file destroy, or file close).
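
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use Netty: `Buf` and `Composite` are hypothetical stand-ins for a reference-counted buffer and for CompositeByteBuf, and a plain `OutOfMemoryError` stands in for OutOfDirectMemoryError (which is a subclass of it). The only point is the order of events: the component is added first, consolidation fails afterwards, and an extra release in the catch block leads to a refcnt error when the composite later releases its components.

```java
import java.util.ArrayList;
import java.util.List;

public class RefCountSketch {
    /** Hypothetical stand-in for a reference-counted buffer (not Netty's ByteBuf). */
    static final class Buf {
        private int refCnt = 1;

        void release() {
            if (refCnt <= 0) {
                throw new IllegalStateException("refCnt: " + refCnt + ", decrement: 1");
            }
            refCnt--;
        }
    }

    /** Hypothetical stand-in for a composite buffer that consolidates past a component limit. */
    static final class Composite {
        private final List<Buf> components = new ArrayList<>();
        private final int maxComponents;

        Composite(int maxComponents) {
            this.maxComponents = maxComponents;
        }

        void addComponent(Buf buf) {
            components.add(buf); // the composite owns the component from here on
            if (components.size() > maxComponents) {
                // Consolidation needs new memory; simulate that the allocation fails.
                throw new OutOfMemoryError("simulated direct memory exhaustion");
            }
        }

        /** Plays the role of returnBuffer: releases every component exactly once. */
        void releaseAll() {
            for (Buf b : components) {
                b.release();
            }
            components.clear();
        }
    }

    /** Returns "ok" or "refcnt error" depending on whether the caller releases in the catch. */
    static String simulate(boolean releaseInCatch) {
        Composite flushBuffer = new Composite(1);
        flushBuffer.addComponent(new Buf());
        Buf data = new Buf();
        try {
            flushBuffer.addComponent(data); // added, then the simulated consolidation fails
        } catch (OutOfMemoryError e) {
            if (releaseInCatch) {
                data.release(); // redundant: flushBuffer already holds this component
            }
        }
        try {
            flushBuffer.releaseAll();
            return "ok";
        } catch (IllegalStateException e) {
            return "refcnt error";
        }
    }

    public static void main(String[] args) {
        System.out.println("release in catch:    " + simulate(true));
        System.out.println("no release in catch: " + simulate(false));
    }
}
```

In this sketch, skipping the release in the catch block leaves exactly one release per component, performed by `releaseAll`.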
If writeLocalData does not catch OutOfDirectMemoryError, the impact is as
follows:
In the single-replica case, if
https://github.com/apache/celeborn/pull/3049 is not merged, commitfile will
block in waitPendingWrites and fail, because writeLocalData does not
correctly call decrementPendingWrites. However, this does not keep flushBuffer
in memory for a long time: when the shuffle expires, the file is destroyed,
flushBuffer is returned, and this memory is released.
In the dual-replica case, in addition to the problem above, the thread of the
EventLoop to which replicate-client belongs blocks at
Await.result(writePromise.future, Duration.Inf) because writePromise is never
completed. As a result, this thread stops processing the other PushData
messages that worker-data-replicator writes to the channels of that EventLoop.
This data accumulates in the taskQueue of the EventLoop and cannot be canceled,
which is the cause of the memory leak.
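
The blocking behavior can be reproduced with JDK classes alone. In the sketch below, a single-threaded executor is a hypothetical stand-in for the EventLoop and a `CompletableFuture` stands in for writePromise; none of this is Celeborn or Netty code. The first task waits on the promise without a timeout, so every task queued behind it stays in the queue until the promise is completed.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class BlockedLoopSketch {
    /** Returns {tasks processed while blocked, tasks processed after the promise failed}. */
    static int[] run() {
        ExecutorService eventLoop = Executors.newSingleThreadExecutor();
        CompletableFuture<Void> writePromise = new CompletableFuture<>();
        AtomicInteger processed = new AtomicInteger();

        // Waits with no timeout, similar in spirit to Await.result(..., Duration.Inf).
        eventLoop.execute(() -> {
            try {
                writePromise.get();
            } catch (Exception e) {
                // The promise was failed, so the loop thread is free again.
            }
        });
        // Later work for the same loop; it can only queue up behind the waiting task.
        for (int i = 0; i < 3; i++) {
            eventLoop.execute(() -> processed.incrementAndGet());
        }

        try {
            Thread.sleep(200);
            int whileBlocked = processed.get();

            // Failing the promise is what lets the queued tasks run.
            writePromise.completeExceptionally(new OutOfMemoryError("simulated"));
            eventLoop.shutdown();
            eventLoop.awaitTermination(5, TimeUnit.SECONDS);
            return new int[] {whileBlocked, processed.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        int[] result = run();
        System.out.println("processed while blocked: " + result[0]);
        System.out.println("processed after failure: " + result[1]);
    }
}
```

If the promise were never completed, the three queued tasks would stay in the executor's queue indefinitely, which mirrors the accumulation in taskQueue described above.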
Therefore, to fix the memory leak that follows an OutOfDirectMemoryError in
flushBuffer.addComponent, you only need to catch OutOfDirectMemoryError in
writeLocalData; there is no need to release data after addComponent.
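
A minimal sketch of that error-handling shape, using only JDK types, might look as follows. The class, the `Runnable` parameter, and the counter are hypothetical simplifications, not the actual Celeborn implementation (which is written in Scala), and `OutOfMemoryError` stands in for its subclass OutOfDirectMemoryError. The sketch only shows the bookkeeping: the promise is completed and the pending-write count is decremented on every path, and the catch block does not release the data.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class WriteLocalDataSketch {
    /** Hypothetical counter that a commit would wait on until it reaches zero. */
    final AtomicInteger pendingWrites = new AtomicInteger();

    /** Hypothetical write step; the Runnable stands in for the real write into flushBuffer. */
    CompletableFuture<Void> writeLocalData(Runnable write) {
        CompletableFuture<Void> writePromise = new CompletableFuture<>();
        pendingWrites.incrementAndGet();
        try {
            write.run();
            writePromise.complete(null);
        } catch (OutOfMemoryError e) {
            // No release of the data here: it is already owned by the flush buffer.
            writePromise.completeExceptionally(e);
        } finally {
            // Always decrement, so a waiter on pending writes cannot hang.
            pendingWrites.decrementAndGet();
        }
        return writePromise;
    }

    public static void main(String[] args) {
        WriteLocalDataSketch sketch = new WriteLocalDataSketch();
        CompletableFuture<Void> failed = sketch.writeLocalData(() -> {
            throw new OutOfMemoryError("simulated direct memory exhaustion");
        });
        System.out.println("completed exceptionally: " + failed.isCompletedExceptionally());
        System.out.println("pending writes: " + sketch.pendingWrites.get());
    }
}
```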
I simulated the scenario where addComponent throws an OutOfDirectMemoryError
and the data is released afterwards; a refcnt error occurred.
[oom_fix_error_release.log](https://github.com/user-attachments/files/19863484/oom_fix_error_release.log)
I also simulated the scenario where addComponent throws an
OutOfDirectMemoryError and the data is not released afterwards. No refcnt error
occurred, commitfiles succeeded, the Spark task succeeded, and after
commitfiles the worker diskbuffercount dropped to 0.
[celeborn_1760_followup_worker.log](https://github.com/user-attachments/files/19864486/celeborn_1760_followup_worker.log)
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Manual test.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]