[
https://issues.apache.org/jira/browse/HUDI-9055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17930745#comment-17930745
]
Geser Dugarov edited comment on HUDI-9055 at 2/26/25 4:26 PM:
--------------------------------------------------------------
I suppose that when we postpone the creation of Java String objects in the
writers, a huge number of objects gets created at once when our buffers are
full, because at that point we also need to create a lot of Strings. This
causes a momentary spike in memory consumption and more full GC runs, which
decreases overall performance; see the "Heap analysis - comment.png" attachment.
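
To illustrate the supposition above, here is a minimal sketch, not Hudi code.
The class `DeferredStringSketch`, its methods, and the buffer size are all
hypothetical and exist only for this illustration. It compares how many
Strings are materialized in a single step when decoding eagerly with how many
are materialized when decoding is deferred until a buffer fills up. It models
neither the real writers nor GC behaviour.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Hudi or Flink.
public class DeferredStringSketch {

    // Eager strategy: decode every record as it arrives, so at most one
    // String is materialized per step.
    static int peakBurstEager(int records) {
        int peak = 0;
        for (int i = 0; i < records; i++) {
            // Input generation only; stands in for bytes received by a writer.
            byte[] raw = ("key-" + i).getBytes(StandardCharsets.UTF_8);
            String decoded = new String(raw, StandardCharsets.UTF_8);
            peak = Math.max(peak, decoded.isEmpty() ? 0 : 1);
        }
        return peak;
    }

    // Deferred strategy: keep raw bytes and decode them all when the buffer
    // is full, so a whole buffer of Strings is materialized in one step.
    static int peakBurstDeferred(int records, int bufferSize) {
        List<byte[]> pending = new ArrayList<>(bufferSize);
        int peak = 0;
        for (int i = 0; i < records; i++) {
            pending.add(("key-" + i).getBytes(StandardCharsets.UTF_8));
            if (pending.size() == bufferSize) {
                peak = Math.max(peak, flush(pending).size());
            }
        }
        if (!pending.isEmpty()) {
            peak = Math.max(peak, flush(pending).size());
        }
        return peak;
    }

    // Decodes everything that is pending and empties the buffer.
    static List<String> flush(List<byte[]> pending) {
        List<String> decoded = new ArrayList<>(pending.size());
        for (byte[] raw : pending) {
            decoded.add(new String(raw, StandardCharsets.UTF_8));
        }
        pending.clear();
        return decoded;
    }

    public static void main(String[] args) {
        System.out.println("eager peak burst:    " + peakBurstEager(1000));
        System.out.println("deferred peak burst: " + peakBurstDeferred(1000, 100));
    }
}
```

In this sketch the deferred variant creates a full buffer of Strings at the
moment of the flush, which is the kind of allocation burst the comment
suspects of triggering additional full GC runs.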
> Analyze performance decrease with `StringData` constructor in
> `HoodieFlinkInternalRow`
> --------------------------------------------------------------------------------------
>
> Key: HUDI-9055
> URL: https://issues.apache.org/jira/browse/HUDI-9055
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Geser Dugarov
> Assignee: Geser Dugarov
> Priority: Major
> Attachments: Heap analysis - comment.png
>
>
> https://github.com/apache/hudi/pull/12796#discussion_r1961685697
> Add a constructor that accepts `StringData`.
> Investigate why write time increased for the non-bucket case.