[ 
https://issues.apache.org/jira/browse/FLINK-30908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17684615#comment-17684615
 ] 

Xintong Song commented on FLINK-30908:
--------------------------------------

This is indeed related to FLINK-20988. The {{InterruptedIOException}} is 
expected and is included in the whitelist for prohibited string checking. 
However, this should not be handled as a fatal error.

I think this is a blocker. Depending on which finishes first (the graceful 
shutdown or the fatal error handling), the process may terminate with an 
error exit code, which leads to a restart of the application that is being 
shut down.

I'll provide a fix for it ASAP.
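
For illustration only, the idea above can be sketched as an error handler that ignores interruption-caused errors once a graceful shutdown has started, so they never race with the shutdown as fatal errors. This is a minimal sketch under stated assumptions, not the actual fix: the class {{ShutdownAwareErrorHandler}} and all of its methods are hypothetical and do not come from the Flink code base.

```java
import java.io.InterruptedIOException;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these names are taken from the Flink code base.
class ShutdownAwareErrorHandler {

    // Set once the graceful shutdown has been requested.
    private final AtomicBoolean shuttingDown = new AtomicBoolean(false);

    // Stands in for "terminate the process with an error exit code".
    private final AtomicBoolean fatalErrorReported = new AtomicBoolean(false);

    void beginShutdown() {
        shuttingDown.set(true);
    }

    // Callback for errors raised by a background thread (for example, a heartbeat thread).
    void onError(Throwable error) {
        if (shuttingDown.get() && isCausedByInterruption(error)) {
            // Expected while shutting down: the thread was interrupted on purpose.
            return;
        }
        fatalErrorReported.set(true);
    }

    boolean isFatalErrorReported() {
        return fatalErrorReported.get();
    }

    // Walks the cause chain looking for an interruption.
    static boolean isCausedByInterruption(Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (t instanceof InterruptedException || t instanceof InterruptedIOException) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, the same {{InterruptedIOException}} is still treated as fatal if it arrives before {{beginShutdown()}} is called, which keeps genuine failures visible during normal operation.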

> Fatal error in ResourceManager caused 
> YARNSessionFIFOSecuredITCase.testDetachedMode to fail
> -------------------------------------------------------------------------------------------
>
>                 Key: FLINK-30908
>                 URL: https://issues.apache.org/jira/browse/FLINK-30908
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Runtime / Coordination
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Critical
>              Labels: test-stability
>         Attachments: mvn-1.FLINK-30908.log
>
>
> There's a build failure in {{YARNSessionFIFOSecuredITCase.testDetachedMode}} 
> which is caused by a fatal error in the ResourceManager:
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=45720&view=logs&j=245e1f2e-ba5b-5570-d689-25ae21e5302f&t=d04c9862-880c-52f5-574b-a7a79fef8e0f&l=29869
> {code}
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58 java.io.InterruptedIOException: Interrupted waiting to send 
> RPC request to server
> Feb 05 02:41:58       at org.apache.hadoop.ipc.Client.call(Client.java:1480) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at org.apache.hadoop.ipc.Client.call(Client.java:1422) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at com.sun.proxy.$Proxy31.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.allocate(ApplicationMasterProtocolPBClientImpl.java:77)
>  ~[hadoop-yarn-common-3.2.3.jar:?]
> Feb 05 02:41:58       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native 
> Method) ~[?:1.8.0_292]
> Feb 05 02:41:58       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_292]
> Feb 05 02:41:58       at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[?:1.8.0_292]
> Feb 05 02:41:58       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at com.sun.proxy.$Proxy32.allocate(Unknown Source) 
> ~[?:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.allocate(AMRMClientImpl.java:325)
>  ~[hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58       at 
> org.apache.hadoop.yarn.client.api.async.impl.AMRMClientAsyncImpl$HeartbeatThread.run(AMRMClientAsyncImpl.java:311)
>  [hadoop-yarn-client-3.2.3.jar:?]
> Feb 05 02:41:58 Caused by: java.lang.InterruptedException
> Feb 05 02:41:58       at 
> java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404) ~[?:1.8.0_292]
> Feb 05 02:41:58       at 
> java.util.concurrent.FutureTask.get(FutureTask.java:191) ~[?:1.8.0_292]
> Feb 05 02:41:58       at 
> org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1180) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       at org.apache.hadoop.ipc.Client.call(Client.java:1475) 
> ~[hadoop-common-3.2.3.jar:?]
> Feb 05 02:41:58       ... 17 more
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
