[
https://issues.apache.org/jira/browse/FLINK-39002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18055395#comment-18055395
]
Sergey Nuyanzin commented on FLINK-39002:
-----------------------------------------
[~jark], [~xtsong] this seems to be related to the Alibaba environment on
which the Azure tests run.
Since only Alibaba people have access to it, could you please help look into this?
> Could not resolve ResourceManager because of NPE
> ------------------------------------------------
>
> Key: FLINK-39002
> URL: https://issues.apache.org/jira/browse/FLINK-39002
> Project: Flink
> Issue Type: Bug
> Components: Deployment / Kubernetes, Test Infrastructure, Tests
> Reporter: Sergey Nuyanzin
> Priority: Critical
> Labels: test-stability
> Attachments: logs-ci-e2e_2_ci-1769733446.zip
>
>
> e2e tests 2 are failing with
> {noformat}
> [pekko.ssl.tcp://flink@localhost:6123]] Caused by:
> [javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is
> disabled or cipher suites are inappropriate)]
> 2026-01-30 00:57:01,235 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*, retrying in
> 10000 ms: Could not connect to rpc endpoint under address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*.
> 2026-01-30 00:57:11,256 WARN
> org.apache.pekko.remote.ReliableDeliverySupervisor [] - Association
> with remote system [pekko.ssl.tcp://flink@localhost:6123] has failed, address
> is now gated for [50] ms. Reason: [Association failed with
> [pekko.ssl.tcp://flink@localhost:6123]] Caused by:
> [javax.net.ssl.SSLHandshakeException: No appropriate protocol (protocol is
> disabled or cipher suites are inappropriate)]
> 2026-01-30 00:57:11,259 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*, retrying in
> 10000 ms: Could not connect to rpc endpoint under address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*.
> 2026-01-30 00:57:21,286 WARN
> org.apache.pekko.remote.ReliableDeliverySupervisor [] - Association
> with remote system [pekko.ssl.tcp://flink@localhost:6123] has failed, address
> is now gated for [50] ms. Reason: [Association failed with
> [pekko.ssl.tcp://flink@localhost:6123]] Caused by:
> [java.lang.NullPointerException: Cannot invoke
> "org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.handshakeFuture()"
> because the return value of
> "org.apache.flink.shaded.netty4.io.netty.channel.ChannelPipeline.get(java.lang.Class)"
> is null]
> 2026-01-30 00:57:21,286 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*, retrying in
> 10000 ms: Could not connect to rpc endpoint under address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*.
> 2026-01-30 00:57:31,312 WARN
> org.apache.pekko.remote.ReliableDeliverySupervisor [] - Association
> with remote system [pekko.ssl.tcp://flink@localhost:6123] has failed, address
> is now gated for [50] ms. Reason: [Association failed with
> [pekko.ssl.tcp://flink@localhost:6123]] Caused by:
> [java.lang.NullPointerException: Cannot invoke
> "org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.handshakeFuture()"
> because the return value of
> "org.apache.flink.shaded.netty4.io.netty.channel.ChannelPipeline.get(java.lang.Class)"
> is null]
> 2026-01-30 00:57:31,316 INFO
> org.apache.flink.runtime.taskexecutor.TaskExecutor [] - Could not
> resolve ResourceManager address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*, retrying in
> 10000 ms: Could not connect to rpc endpoint under address
> pekko.ssl.tcp://flink@localhost:6123/user/rpc/resourcemanager_*.
> 2026-01-30 00:57:32,277 WARN
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner [] - RECEIVED
> SIGNAL 15: SIGTERM. Shutting down as requested.
> 2026-01-30 00:57:32,280 INFO
> org.apache.flink.runtime.blob.TransientBlobCache [] - Shutting
> down BLOB cache
> 2026-01-30 00:57:32,280 INFO
> org.apache.flink.runtime.state.TaskExecutorStateChangelogStoragesManager [] -
> Shutting down TaskExecutorStateChangelogStoragesManager.
> {noformat}
> including master
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=71966&view=logs&j=ef799394-2d67-5ff4-b2e5-410b80c9c0af&t=9e5768bc-daae-5f5f-1861-e58617922c7a&s=ab6e269b-88b2-5ded-2544-4aa5b1124530
> I noticed there were similar issues in the past; however, none of them failed
> with an NPE so far.