-1 on this.

We found that the new fg reader caches the Spark GenericInternalRow
directly in the records cache, which is 5x larger than the original
Avro-bytes-based payload records. As a result, the records are more
prone to spilling. Spilling is a bottleneck on the compaction/regular
reader read path, so this actually causes a performance regression. I
think we should mark this as a blocker for 1.0.2. The cache also keeps
a metadata map for each record, which takes a lot of memory as well
(the map object itself is memory-heavy). To address this issue, I have
filed a JIRA task:
https://issues.apache.org/jira/browse/HUDI-9318

Best,
Danny

On Tue, Apr 22, 2025 at 09:59, Voon <v...@apache.org> wrote:
>
> Hi everyone,
>
> Please review and vote on the release candidate #2 for the version 1.0.2,
> as follows:
>
> [ ] +1, Approve the release
>
> [ ] -1, Do not approve the release (please provide specific comments)
>
>
>
> The complete staging area is available for your review, which includes:
>
> * JIRA release notes [1],
>
> * the official Apache source release and binary convenience releases to be
> deployed to dist.apache.org [2], which are signed with the key with
> fingerprint B8DC892C439CCB5C0CCA3BEA68050B561D9AFB32 [3],
>
> * all artifacts to be deployed to the Maven Central Repository [4],
>
> * source code tag "1.0.2-rc2" [5],
>
>
>
> The vote will be open for at least 72 hours. It is adopted by majority
> approval, with at least 3 PMC affirmative votes.
>
>
>
> Thanks,
>
> Release Manager
>
>
>
> [1]
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12322822&version=12355558
>
> [2] https://dist.apache.org/repos/dist/dev/hudi/hudi-1.0.2-rc2/
>
> [3] https://dist.apache.org/repos/dist/release/hudi/KEYS
>
> [4] https://repository.apache.org/content/repositories/orgapachehudi-1149/
>
> [5] https://github.com/apache/hudi/releases/tag/release-1.0.2-rc2
