[
https://issues.apache.org/jira/browse/FLINK-32410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17810418#comment-17810418
]
Matthias Pohl commented on FLINK-32410:
---------------------------------------
[~srichter] is this still an ongoing effort? It looks like the PR made it into
master and release-1.18 with
[ab9445ac|https://github.com/apache/flink/commit/ab9445ac]. Could we set the
fixVersion to 1.18.0 and close this issue?
> Allocate hash-based collections with sufficient capacity for expected size
> --------------------------------------------------------------------------
>
> Key: FLINK-32410
> URL: https://issues.apache.org/jira/browse/FLINK-32410
> Project: Flink
> Issue Type: Improvement
> Reporter: Stefan Richter
> Assignee: Stefan Richter
> Priority: Major
> Labels: pull-request-available, stale-assigned
> Fix For: 1.19.0
>
>
> The JDK API to create hash-based collections with a certain capacity is
> arguably misleading because it doesn't size the collections to "hold a
> specific number of items" like you'd expect. Instead, it sizes them to hold
> only load-factor (0.75 by default) times the specified number before resizing.
> For the common pattern of allocating a hash-based collection with the expected
> number of elements to avoid rehashes, this means that a rehash is essentially
> guaranteed.
> We should introduce helper methods (similar to Guava's
> `Maps.newHashMapWithExpectedSize(int)`) that allocate for an expected size and
> replace the direct constructor calls with those.
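For illustration, a minimal sketch of the kind of helper the description above
asks for. The class and method names (ExpectedSizeMaps, capacityFor) are
hypothetical and are not taken from the Flink code base or the linked commit;
the sketch assumes the JDK default load factor of 0.75.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not the actual Flink implementation.
final class ExpectedSizeMaps {

    // Assumption: the JDK default load factor for HashMap.
    private static final float DEFAULT_LOAD_FACTOR = 0.75f;

    private ExpectedSizeMaps() {}

    // Smallest initial capacity c such that c * loadFactor >= expectedSize,
    // so that inserting expectedSize entries does not trigger a resize.
    static int capacityFor(int expectedSize) {
        if (expectedSize < 0) {
            throw new IllegalArgumentException("expectedSize must be >= 0");
        }
        // The double-to-int cast saturates at Integer.MAX_VALUE, and HashMap
        // itself caps the table size internally.
        return (int) Math.ceil(expectedSize / (double) DEFAULT_LOAD_FACTOR);
    }

    static <K, V> Map<K, V> newHashMapWithExpectedSize(int expectedSize) {
        return new HashMap<>(capacityFor(expectedSize), DEFAULT_LOAD_FACTOR);
    }
}
```

With this sketch, ExpectedSizeMaps.newHashMapWithExpectedSize(100) requests a
capacity of 134 instead of 100. By contrast, new HashMap<>(100) rounds the
table up to 128 buckets, and 128 * 0.75 = 96, so the 97th insertion already
forces a rehash.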