[ 
https://issues.apache.org/jira/browse/FLINK-32896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated FLINK-32896:
-----------------------------------
    Labels: pull-request-available starter  (was: starter)

> Incorrect `Map.computeIfAbsent(..., ...::new)` usage which misinterprets key 
> as initial capacity
> ------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-32896
>                 URL: https://issues.apache.org/jira/browse/FLINK-32896
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Coordination
>    Affects Versions: 1.18.0, 1.17.1
>            Reporter: Marcono1234
>            Assignee: ZY tang
>            Priority: Minor
>              Labels: pull-request-available, starter
>
> The are multiple cases in the code which look like this:
> {code}
> map.computeIfAbsent(..., ArrayList::new)
> {code}
> Not only does this create a new collection (here an {{ArrayList}}), but 
> {{computeIfAbsent}} also passes the map key as argument to the mapping 
> function, so instead of calling the no-args constructor such as {{new 
> ArrayList<>()}}, this actually calls the constructor with {{int}} initial 
> capacity parameter, such as {{new ArrayList<>(initialCapacity)}}.
> For some cases where this occurs in the Flink code I am not sure if it is 
> intended, but in same cases this does not seem to be intended. Here are the 
> cases I found:
> - 
> [{{HsFileDataIndexImpl:163}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsFileDataIndexImpl.java#L163C70-L163C84]
> - 
> [{{HsSpillingStrategy:128}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L128C68-L128C82]
> - 
> [{{HsSpillingStrategy:134}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L134C63-L134C77]
> - 
> [{{HsSpillingStrategy:140}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L140C63-L140C77]
> - 
> [{{HsSpillingStrategy:145}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L145C70-L145C84]
> - 
> [{{HsSpillingStrategy:151}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L151C65-L151C79]
> - 
> [{{HsSpillingStrategy:157}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategy.java#L157C65-L157C79]
> - 
> [{{HsSpillingStrategyUtils:76}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/HsSpillingStrategyUtils.java#L76C74-L76C88]
> - 
> [{{ProducerMergedPartitionFileIndex:171}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/main/java/org/apache/flink/runtime/io/network/partition/hybrid/tiered/file/ProducerMergedPartitionFileIndex.java#L171C75-L171C89]
> - 
> [{{TestingSpillingInfoProvider:201}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L201C56-L201C70]
> - 
> [{{TestingSpillingInfoProvider:208}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L208C54-L208C66]
> - 
> [{{TestingSpillingInfoProvider:216}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/io/network/partition/hybrid/TestingSpillingInfoProvider.java#L216C54-L216C66]
> - 
> [{{SourceCoordinatorConcurrentAttemptsTest:269}}|https://github.com/apache/flink/blob/9546f8243a24e7b45582b6de6702f819f1d73f97/flink-runtime/src/test/java/org/apache/flink/runtime/source/coordinator/SourceCoordinatorConcurrentAttemptsTest.java#L269C53-L269C65]
> This can lead either to runtime exceptions in case the map key is negative, 
> since the collection constructors reject negative initial capacity values, or 
> it can lead to bad performance if the key (which is misinterpreted as initial 
> capacity) is pretty low, such as 0, or is pretty large and therefore 
> pre-allocates a lot of memory for the collection.
> Regardless of whether or not these cases are intended it might be good to 
> replace them with lambda expressions to make this more explicit:
> {code}
> map.computeIfAbsent(..., capacity -> new ArrayList<>(capacity))
> {code}
> or
> {code}
> map.computeIfAbsent(..., k -> new ArrayList<>())
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to