[ 
https://issues.apache.org/jira/browse/FLINK-28199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17750260#comment-17750260
 ] 

Matthias Pohl commented on FLINK-28199:
---------------------------------------

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=51893&view=logs&j=298e20ef-7951-5965-0e79-ea664ddc435e&t=d4c90338-c843-57b0-3232-10ae74f00347&l=27086

This time, only {{testClusterClientRetrieval}} failed. The JM process 
finished without any issues at {{2023-08-02 03:21:25,901}}, and the test 
triggered the cleanup, but the application wasn't cleared:
{code}
Aug 02 03:22:02 [ERROR] org.apache.flink.yarn.YARNHighAvailabilityITCase.testClusterClientRetrieval  Time elapsed: 29.494 s  <<< FAILURE!
Aug 02 03:22:02 java.lang.AssertionError: There is at least one application on the cluster that is not finished.[App application_1690946369165_0003 is in state RUNNING.]
Aug 02 03:22:02         at org.apache.flink.yarn.YarnTestBase$CleanupYarnApplication.close(YarnTestBase.java:336)
Aug 02 03:22:02         at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:300)
Aug 02 03:22:02         at org.apache.flink.yarn.YARNHighAvailabilityITCase.testClusterClientRetrieval(YARNHighAvailabilityITCase.java:221)
Aug 02 03:22:02         at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[...]
{code}

What about increasing the deadline for shutting down the YARN applications? 
It's currently set to 10s (see 
[apache/flink:org.apache.flink.yarn.YarnTestBase:310|https://github.com/apache/flink/blob/c8ae39d4ac73f81873e1d8ac37e17c29ae330b23/flink-yarn-tests/src/test/java/org/apache/flink/yarn/YarnTestBase.java#L310]).
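
A rough sketch of what a longer shutdown deadline could look like as a plain polling loop. This is only an illustration and not the code in {{YarnTestBase}}: the class {{ShutdownDeadline}}, the method {{awaitAllFinished}}, and the 30s value are hypothetical, and the supplier stands in for whatever check reports that no YARN application is still running.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper for illustration; not part of Flink's YarnTestBase. */
final class ShutdownDeadline {

    /**
     * Polls {@code allFinished} until it returns true or {@code timeout} elapses.
     *
     * @return true if the condition held before the deadline, false on timeout
     *     or if the waiting thread was interrupted
     */
    static boolean awaitAllFinished(
            BooleanSupplier allFinished, Duration timeout, Duration pollInterval) {
        final long deadlineNanos = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (allFinished.getAsBoolean()) {
                return true;
            }
            final long remainingNanos = deadlineNanos - System.nanoTime();
            if (remainingNanos <= 0) {
                return false;
            }
            // Never sleep past the deadline; sleep at least 1 ms to avoid spinning.
            final long remainingMillis = Math.max(1L, remainingNanos / 1_000_000L);
            try {
                Thread.sleep(Math.min(pollInterval.toMillis(), remainingMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Stand-in for "no application is in state RUNNING": true on the 3rd poll.
        AtomicInteger polls = new AtomicInteger();
        BooleanSupplier noAppRunning = () -> polls.incrementAndGet() >= 3;

        // Assumed value: 30s instead of the 10s mentioned above.
        boolean finished =
                awaitAllFinished(noAppRunning, Duration.ofSeconds(30), Duration.ofMillis(50));
        System.out.println("all applications finished: " + finished);
    }
}
```

A deadline-based loop like this only fails the cleanup once the full timeout is used up, so raising the value costs nothing in runs where the application terminates quickly.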

> Failures on YARNHighAvailabilityITCase.testClusterClientRetrieval and 
> YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FLINK-28199
>                 URL: https://issues.apache.org/jira/browse/FLINK-28199
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.16.0
>            Reporter: Martijn Visser
>            Priority: Major
>              Labels: test-stability
>
> {code:java}
> Jun 22 08:57:50 [ERROR] Errors: 
> Jun 22 08:57:50 [ERROR]   YARNHighAvailabilityITCase.testClusterClientRetrieval » Timeout testClusterCli...
> Jun 22 08:57:50 [ERROR]   YARNHighAvailabilityITCase.testKillYarnSessionClusterEntrypoint:156->YarnTestBase.runTest:288->lambda$testKillYarnSessionClusterEntrypoint$0:182->waitForJobTermination:325 » Execution
> Jun 22 08:57:50 [INFO] 
> Jun 22 08:57:50 [ERROR] Tests run: 27, Failures: 0, Errors: 2, Skipped: 0
> {code}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=37037&view=logs&j=fc5181b0-e452-5c8f-68de-1097947f6483&t=995c650b-6573-581c-9ce6-7ad4cc038461&l=29523



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
