[
https://issues.apache.org/jira/browse/FLINK-37949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferenc Csaky closed FLINK-37949.
--------------------------------
Resolution: Duplicate
> FlinkKinesisConsumer on EFO mode runs into Netty deadlock on client close
> during stop-with-savepoint
> -----------------------------------------------------------------------------------------------------
>
> Key: FLINK-37949
> URL: https://issues.apache.org/jira/browse/FLINK-37949
> Project: Flink
> Issue Type: Bug
> Components: Connectors / Kinesis
> Affects Versions: 1.20.0
> Reporter: Keith Lee
> Priority: Major
>
> We ran into the following issue
> * Flink job was continuously failing over and stop-with-savepoint savepoint
> was failing.
> * Savepoint was failing because task failed
> * Task failed because of TimeoutException thrown from netty client when
> Kinesis connector attempts to close netty client during Stop-with-savepoint
> operation
> {quote} java.lang.RuntimeException: java.util.concurrent.TimeoutException
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.AwaitCloseChannelPoolMap.close(AwaitCloseChannelPoolMap.java:177)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.utils.NettyUtils.runAndLogError(NettyUtils.java:386)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.NettyNioAsyncHttpClient.close(NettyNioAsyncHttpClient.java:198)
> at
> org.apache.flink.streaming.connectors.kinesis.proxy.KinesisProxyAsyncV2.close(KinesisProxyAsyncV2.java:72)
> at
> org.apache.flink.streaming.connectors.kinesis.internals.publisher.fanout.FanOutRecordPublisherFactory.close(FanOutRecordPublisherFactory.java:101)
> at
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.closeRecordPublisherFactory(KinesisDataFetcher.java:839)
> at
> org.apache.flink.streaming.connectors.kinesis.internals.KinesisDataFetcher.shutdownFetcher(KinesisDataFetcher.java:813)
> at
> org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.cancel(FlinkKinesisConsumer.java:422)
> at
> org.apache.flink.streaming.api.operators.StreamSource.stop(StreamSource.java:131)
> at
> org.apache.flink.streaming.runtime.tasks.SourceStreamTask.stopOperatorForStopWithSavepoint(SourceStreamTask.java:311)
> ...
> Caused by: java.util.concurrent.TimeoutException
> at
> java.base/java.util.concurrent.CompletableFuture.timedGet(CompletableFuture.java:1886)
> at
> java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2021)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.AwaitCloseChannelPoolMap.close(AwaitCloseChannelPoolMap.java:172)
> ...{quote}
> * Netty client failed to close probably due to deadlock. See exception
> thrown by checkDeadLock() within netty
> {quote}
> org.apache.flink.kinesis.shaded.io.netty.util.concurrent.BlockingOperationException:
> DefaultChannelPromise@731d6e17(incomplete)
> at
> org.apache.flink.kinesis.shaded.io.netty.util.concurrent.DefaultPromise.checkDeadLock(DefaultPromise.java:463)
> at
> org.apache.flink.kinesis.shaded.io.netty.channel.DefaultChannelPromise.checkDeadLock(DefaultChannelPromise.java:159)
> at
> org.apache.flink.kinesis.shaded.io.netty.util.concurrent.DefaultPromise.awaitUninterruptibly(DefaultPromise.java:269)
> at
> org.apache.flink.kinesis.shaded.io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:137)
> at
> org.apache.flink.kinesis.shaded.io.netty.channel.DefaultChannelPromise.awaitUninterruptibly(DefaultChannelPromise.java:30)
> at
> org.apache.flink.kinesis.shaded.io.netty.channel.pool.SimpleChannelPool.close(SimpleChannelPool.java:408)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.BetterSimpleChannelPool.close(BetterSimpleChannelPool.java:38)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.HonorCloseOnReleaseChannelPool.close(HonorCloseOnReleaseChannelPool.java:80)
> at
> org.apache.flink.kinesis.shaded.software.amazon.awssdk.http.nio.netty.internal.http2.Http2MultiplexedChannelPool.lambda$doClose$11(Http2MultiplexedChannelPool.java:419)
> ...{quote}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)