[jira] [Updated] (FLINK-29618) YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
[ https://issues.apache.org/jira/browse/FLINK-29618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-29618:
---
    Labels: pull-request-available starter test-stability  (was: starter test-stability)

> YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
> ---
>
>                 Key: FLINK-29618
>                 URL: https://issues.apache.org/jira/browse/FLINK-29618
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Assignee: Wencong Liu
>            Priority: Major
>              Labels: pull-request-available, starter, test-stability
>         Attachments: build-20221012.7.YARNSessionFIFOSecuredITCase.testDetachedMode.log
>
>
> We experienced a [build failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41931=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=30284] that was caused (exclusively) by {{YARNSessionFIFOSecuredITCase.testDetachedMode}} running into a timeout.
> The test-specific logs that were extracted from the build are attached to this Jira issue.
> JUnit tries to stop the thread running the test but fails to do so because it's interrupting a sleep. The {{InterruptedException}} is not properly handled in [YarnTestBase:744|https://github.com/apache/flink/blob/573ed922346c791760d27653543c2b8df56f51f7/flink-yarn-tests/src/test/java/org/apache/flink/yarn/YarnTestBase.java#L744] (it doesn't forward the exception). Therefore, we only see the warning being logged after 60s:
> {code}
> 11:33:51,124 [ForkJoinPool-1-worker-25] WARN  org.apache.flink.yarn.YarnTestBase [] - Interruped
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_292]
>     at org.apache.flink.yarn.YarnTestBase.sleep(YarnTestBase.java:716) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.startWithArgs(YarnTestBase.java:906) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOITCase.runDetachedModeTest(YARNSessionFIFOITCase.java:141) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedMode$2(YARNSessionFIFOSecuredITCase.java:173) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedMode(YARNSessionFIFOSecuredITCase.java:160) ~[test-classes/:?]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
> [...]
> {code}
> The test code itself eventually continues and succeeds (despite the interruption). The job submission takes suspiciously long, though.
> Removing the timeout from the test (as this is the desired approach for tests in general now) should solve this test instability.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
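
The issue description says the sleep helper does not forward the InterruptedException, so the interrupted test thread keeps running. A minimal sketch of the difference between swallowing and forwarding an interruption might look like the following. The class InterruptAwareSleep and both method names are hypothetical and only illustrate the pattern; this is not the actual YarnTestBase code, and the choice of IllegalStateException as the wrapper is an assumption.

```java
/** Hypothetical helper for illustration only; not the actual YarnTestBase code. */
final class InterruptAwareSleep {

    private InterruptAwareSleep() {}

    /**
     * Swallowing variant (assumed to resemble the behavior the issue describes):
     * the interruption is only logged. Thread.sleep clears the interrupt flag when
     * it throws, so the caller cannot tell that it was asked to stop.
     */
    static void sleepSwallowing(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            System.err.println("Interrupted: " + e.getMessage());
        }
    }

    /**
     * Forwarding variant: restores the interrupt flag and rethrows, so a polling
     * loop that calls this method stops instead of continuing to wait.
     */
    static void sleepForwarding(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // Restore the flag so code further up the stack can still observe it.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while sleeping", e);
        }
    }
}
```

With the forwarding variant, a timeout-triggered interrupt would end the wait right away instead of leaving only a warning in the log. Whether to rethrow unchecked or to declare InterruptedException on the method is a design choice this sketch leaves open.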
[jira] [Updated] (FLINK-29618) YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
[ https://issues.apache.org/jira/browse/FLINK-29618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl updated FLINK-29618:
---
    Description: 
We experienced a [build failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41931=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=30284] that was caused (exclusively) by {{YARNSessionFIFOSecuredITCase.testDetachedMode}} running into a timeout.

The test-specific logs that were extracted from the build are attached to this Jira issue.

JUnit tries to stop the thread running the test but fails to do so because it's interrupting a sleep. The {{InterruptedException}} is not properly handled in [YarnTestBase:744|https://github.com/apache/flink/blob/573ed922346c791760d27653543c2b8df56f51f7/flink-yarn-tests/src/test/java/org/apache/flink/yarn/YarnTestBase.java#L744] (it doesn't forward the exception). Therefore, we only see the warning being logged after 60s:
{code}
11:33:51,124 [ForkJoinPool-1-worker-25] WARN  org.apache.flink.yarn.YarnTestBase [] - Interruped
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_292]
    at org.apache.flink.yarn.YarnTestBase.sleep(YarnTestBase.java:716) ~[test-classes/:?]
    at org.apache.flink.yarn.YarnTestBase.startWithArgs(YarnTestBase.java:906) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOITCase.runDetachedModeTest(YARNSessionFIFOITCase.java:141) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedMode$2(YARNSessionFIFOSecuredITCase.java:173) ~[test-classes/:?]
    at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedMode(YARNSessionFIFOSecuredITCase.java:160) ~[test-classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
[...]
{code}
The test code itself eventually continues and succeeds (despite the interruption). The job submission takes suspiciously long, though.

Removing the timeout from the test (as this is the desired approach for tests in general now) should solve this test instability.

  was:
We experienced a [build failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41931=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=30284] that was caused (exclusively) by {{YARNSessionFIFOSecuredITCase.testDetachedMode}} running into a timeout.

The actual issue might be that the test thread failed due to an {{InterruptedException}} while waiting for the job to be submitted:
{code}
11:33:51,124 [ForkJoinPool-1-worker-25] WARN  org.apache.flink.yarn.YarnTestBase [] - Interruped
java.lang.InterruptedException: sleep interrupted
    at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_292]
    at org.apache.flink.yarn.YarnTestBase.sleep(YarnTestBase.java:716) ~[test-classes/:?]
    at org.apache.flink.yarn.YarnTestBase.startWithArgs(YarnTestBase.java:906) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOITCase.runDetachedModeTest(YARNSessionFIFOITCase.java:141) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedMode$2(YARNSessionFIFOSecuredITCase.java:173) ~[test-classes/:?]
    at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) ~[test-classes/:?]
    at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedMode(YARNSessionFIFOSecuredITCase.java:160) ~[test-classes/:?]
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
    at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
[...]
{code}
The test-specific logs that were extracted from the build are attached to this Jira issue.

> YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
> ---
>
>                 Key: FLINK-29618
>                 URL: https://issues.apache.org/jira/browse/FLINK-29618
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>
[jira] [Updated] (FLINK-29618) YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
[ https://issues.apache.org/jira/browse/FLINK-29618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl updated FLINK-29618:
---
    Labels: starter test-stability  (was: test-stability)

> YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
> ---
>
>                 Key: FLINK-29618
>                 URL: https://issues.apache.org/jira/browse/FLINK-29618
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Major
>              Labels: starter, test-stability
>         Attachments: build-20221012.7.YARNSessionFIFOSecuredITCase.testDetachedMode.log
>
>
> We experienced a [build failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41931=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=30284] that was caused (exclusively) by {{YARNSessionFIFOSecuredITCase.testDetachedMode}} running into a timeout.
> The actual issue might be that the test thread failed due to an {{InterruptedException}} while waiting for the job to be submitted:
> {code}
> 11:33:51,124 [ForkJoinPool-1-worker-25] WARN  org.apache.flink.yarn.YarnTestBase [] - Interruped
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_292]
>     at org.apache.flink.yarn.YarnTestBase.sleep(YarnTestBase.java:716) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.startWithArgs(YarnTestBase.java:906) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOITCase.runDetachedModeTest(YARNSessionFIFOITCase.java:141) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedMode$2(YARNSessionFIFOSecuredITCase.java:173) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedMode(YARNSessionFIFOSecuredITCase.java:160) ~[test-classes/:?]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
> [...]
> {code}
> The test-specific logs that were extracted from the build are attached to this Jira issue.
[jira] [Updated] (FLINK-29618) YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
[ https://issues.apache.org/jira/browse/FLINK-29618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias Pohl updated FLINK-29618:
---
    Attachment: build-20221012.7.YARNSessionFIFOSecuredITCase.testDetachedMode.log

> YARNSessionFIFOSecuredITCase.testDetachedMode timed out in Azure CI
> ---
>
>                 Key: FLINK-29618
>                 URL: https://issues.apache.org/jira/browse/FLINK-29618
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN, Tests
>    Affects Versions: 1.17.0
>            Reporter: Matthias Pohl
>            Priority: Major
>              Labels: test-stability
>         Attachments: build-20221012.7.YARNSessionFIFOSecuredITCase.testDetachedMode.log
>
>
> We experienced a [build failure|https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=41931=logs=fc5181b0-e452-5c8f-68de-1097947f6483=995c650b-6573-581c-9ce6-7ad4cc038461=30284] that was caused (exclusively) by {{YARNSessionFIFOSecuredITCase.testDetachedMode}} running into a timeout.
> The actual issue might be that the test thread failed due to an {{InterruptedException}} while waiting for the job to be submitted:
> {code}
> 11:33:51,124 [ForkJoinPool-1-worker-25] WARN  org.apache.flink.yarn.YarnTestBase [] - Interruped
> java.lang.InterruptedException: sleep interrupted
>     at java.lang.Thread.sleep(Native Method) ~[?:1.8.0_292]
>     at org.apache.flink.yarn.YarnTestBase.sleep(YarnTestBase.java:716) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.startWithArgs(YarnTestBase.java:906) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOITCase.runDetachedModeTest(YARNSessionFIFOITCase.java:141) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.lambda$testDetachedMode$2(YARNSessionFIFOSecuredITCase.java:173) ~[test-classes/:?]
>     at org.apache.flink.yarn.YarnTestBase.runTest(YarnTestBase.java:288) ~[test-classes/:?]
>     at org.apache.flink.yarn.YARNSessionFIFOSecuredITCase.testDetachedMode(YARNSessionFIFOSecuredITCase.java:160) ~[test-classes/:?]
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_292]
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_292]
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_292]
>     at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_292]
> [...]
> {code}
> The test-specific logs that were extracted from the build are attached to this Jira issue.