[
https://issues.apache.org/jira/browse/FLINK-32896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias Pohl resolved FLINK-32896.
-----------------------------------
Fix Version/s: 1.19.0
Resolution: Fixed
I didn't push for backports because it doesn't seem to cause issues. Therefore,
it's rather a code cleanup than a bug.
master (1.19): 30e8b3de05c1d6b75d8f27b9188a1d34f1589ac5
> Incorrect `Map.computeIfAbsent(..., ...::new)` usage which misinterprets key
> as initial capacity
> ------------------------------------------------------------------------------------------------
>
> Key: FLINK-32896
> URL: https://issues.apache.org/jira/browse/FLINK-32896
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.18.0, 1.17.1
> Reporter: Marcono1234
> Assignee: ZY tang
> Priority: Minor
> Labels: pull-request-available, starter
> Fix For: 1.19.0
>
>
> The are multiple cases in the code which look like this:
> {code}
> map.computeIfAbsent(..., ArrayList::new)
> {code}
> Not only does this create a new collection (here an {{ArrayList}}), but
> {{computeIfAbsent}} also passes the map key as argument to the mapping
> function, so instead of calling the no-args constructor such as {{new
> ArrayList<>()}}, this actually calls the constructor with {{int}} initial
> capacity parameter, such as {{new ArrayList<>(initialCapacity)}}.
> For some cases where this occurs in the Flink code I am not sure if it is
> intended, but in same cases this does not seem to be intended. Here are the
> cases I found:
> -
> [{{HsFileDataIndexImpl:163}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsFileDataIndexImpl.java#L163C70-L163C84]
> -
> [{{HsSpillingStrategy:128}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L128C68-L128C82]
> -
> [{{HsSpillingStrategy:134}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L134C63-L134C77]
> -
> [{{HsSpillingStrategy:140}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L140C63-L140C77]
> -
> [{{HsSpillingStrategy:145}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L145C70-L145C84]
> -
> [{{HsSpillingStrategy:151}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L151C65-L151C79]
> -
> [{{HsSpillingStrategy:157}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L157C65-L157C79]
> -
> [{{HsSpillingStrategyUtils:76}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategyUtils.java#L76C74-L76C88]
> -
> [{{ProducerMergedPartitionFileIndex:171}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/file/ProducerMergedPartitionFileIndex.java#L171C75-L171C89]
> -
> [{{TestingSpillingInfoProvider:201}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L201C56-L201C70]
> -
> [{{TestingSpillingInfoProvider:208}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L208C54-L208C66]
> -
> [{{TestingSpillingInfoProvider:216}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L216C54-L216C66]
> -
> [{{SourceCoordinatorConcurrentAttemptsTest:269}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/source/coordinator/SourceCoordinatorConcurrentAttemptsTest.java#L269C53-L269C65]
> This can lead either to runtime exceptions in case the map key is negative,
> since the collection constructors reject negative initial capacity values, or
> it can lead to bad performance if the key (which is misinterpreted as initial
> capacity) is pretty low, such as 0, or is pretty large and therefore
> pre-allocates a lot of memory for the collection.
> Regardless of whether or not these cases are intended it might be good to
> replace them with lambda expressions to make this more explicit:
> {code}
> map.computeIfAbsent(..., capacity -> new ArrayList<>(capacity))
> {code}
> or
> {code}
> map.computeIfAbsent(..., k -> new ArrayList<>())
> {code}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)