[
https://issues.apache.org/jira/browse/FLINK-30908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17684615#comment-17684615
]
Xintong Song commented on FLINK-30908:
--------------------------------------
This is indeed related to FLINK-20988. The {{InterruptedIOException}} is
expected and is included in the whitelist for prohibited string checking.
However, this should not be handled as a fatal error.
I think this is a blocker. Depending on which finishes first (the graceful
shutdown or the fatal error handling), the process may terminate with an
error exit code, which leads to a restart of the application that is being
shut down.
I'll provide a fix for it ASAP.
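
To illustrate the intended behaviour, here is a minimal sketch of an error handler that does not escalate interruption-related exceptions once a graceful shutdown has started. It is not the actual fix and not Flink's code: the class {{ShutdownAwareErrorHandler}}, its methods, and the {{Decision}} enum are hypothetical names, and it assumes a shutdown flag is set before the heartbeat thread gets interrupted.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical sketch; names and structure are assumptions, not Flink's implementation. */
final class ShutdownAwareErrorHandler {

    /** What to do with an error reported by a background thread. */
    enum Decision { IGNORE, FATAL }

    // Set to true once a graceful shutdown has been initiated.
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    void startShutdown() {
        shuttingDown.set(true);
    }

    /** Returns true if the error or any of its causes signals a thread interruption. */
    static boolean causedByInterruption(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof InterruptedException || t instanceof InterruptedIOException) {
                return true;
            }
        }
        return false;
    }

    /** Interruptions are expected during shutdown; everything else stays fatal. */
    Decision onError(Throwable error) {
        if (shuttingDown.get() && causedByInterruption(error)) {
            return Decision.IGNORE;
        }
        return Decision.FATAL;
    }
}
```

With such a check, the {{InterruptedIOException}} from the heartbeat thread would be ignored only while shutting down, so the fatal error handling could no longer race with the graceful shutdown over the exit code. The same exception outside of shutdown would still be treated as fatal.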
> Fatal error in ResourceManager caused
> YARNSessionFIFOSecuredITCase.testDetachedMode to fail
> -------------------------------------------------------------------------------------------
>
> Key: FLINK-30908
> URL: https://issues.apache.org/jira/browse/FLINK-30908
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN, Runtime / Coordination
> Affects Versions: 1.17.0
> Reporter: Matthias Pohl
> Priority: Critical
> Labels: test-stability
> Attachments: mvn-1.FLINK-30908.log
>
>
> There's a build failure in {{YARNSessionFIFOSecuredITCase.testDetachedMode}}
> which is caused by a fatal error in the ResourceManager:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45720&view=logs&j=245e1f2e-ba5b-5570-d689-25ae21e5302f&t=d04c9862-880c-52f5-574b-a7a79fef8e0f&l=29869
> {code}
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send
> RPC request to server
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send
> RPC request to server
> Feb 05 02:41:58 at org.apache.hadoop.ipc.Client.call(Client.java:1480)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at org.apache.hadoop.ipc.Client.call(Client.java:1422)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at com.sun.proxy.$Proxy31.allocate(Unknown Source)
> ~[?:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
> ~[hadoop-yarn-common-3.2.3.jar:?]
> Feb 05 02:41:58 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native
> Method) ~[?:1.8.0_292]
> Feb 05 02:41:58 at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> ~[?:1.8.0_292]
> Feb 05 02:41:58 at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> ~[?:1.8.0_292]
> Feb 05 02:41:58 at java.lang.reflect.Method.invoke(Method.java:498)
> ~[?:1.8.0_292]
> Feb 05 02:41:58 at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at com.sun.proxy.$Proxy32.allocate(Unknown Source)
> ~[?:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:325)
> ~[hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 at
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:311)
> [hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 Caused by: java.lang.InterruptedException
> Feb 05 02:41:58 at
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
> Feb 05 02:41:58 at
> java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
> Feb 05 02:41:58 at
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1180)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 at org.apache.hadoop.ipc.Client.call(Client.java:1475)
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58 ... 17 more
> {code}