[ https://issues.apache.org/jira/browse/FLINK-30108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17652779#comment-17652779 ]
Zhu Zhu commented on FLINK-30108:
---------------------------------
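The stuck test case is ZooKeeperLeaderElectionConnectionHandlingTest.
It appears to be stuck waiting for leadership to be granted: the main thread is blocked in OneShotLatch.await, called from TestingContender.awaitGrantLeadership. I will take another look.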
{code:java}
Nov 18 01:18:09 "main" #1 prio=5 os_prio=0 tid=0x00007fcf7400b800 nid=0x5b90 in Object.wait() [0x00007fcf7d29b000]
Nov 18 01:18:09 java.lang.Thread.State: WAITING (on object monitor)
Nov 18 01:18:09 at java.lang.Object.wait(Native Method)
Nov 18 01:18:09 at java.lang.Object.wait(Object.java:502)
Nov 18 01:18:09 at org.apache.flink.core.testutils.OneShotLatch.await(OneShotLatch.java:61)
Nov 18 01:18:09 - locked <0x00000000e01a7510> (a java.lang.Object)
Nov 18 01:18:09 at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionConnectionHandlingTest$TestingContender.awaitGrantLeadership(ZooKeeperLeaderElectionConnectionHandlingTest.java:199)
Nov 18 01:18:09 at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionConnectionHandlingTest.runTestWithZooKeeperConnectionProblem(ZooKeeperLeaderElectionConnectionHandlingTest.java:147)
Nov 18 01:18:09 at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionConnectionHandlingTest.runTestWithLostZooKeeperConnection(ZooKeeperLeaderElectionConnectionHandlingTest.java:106)
Nov 18 01:18:09 at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionConnectionHandlingTest.testLoseLeadershipOnLostConnectionIfTolerateSuspendedConnectionsIsEnabled(ZooKeeperLeaderElectionConnectionHandlingTest.java:93)
Nov 18 01:18:09 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...
{code}
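For illustration only, here is a minimal sketch of how a blocking wait like the one in the trace could be bounded by a timeout, so that a missing leadership grant fails the test quickly instead of hanging until the CI watchdog kills the process. It uses plain java.util.concurrent rather than Flink's OneShotLatch, and the names LeadershipWaitSketch and TimedContender are hypothetical stand-ins, not the actual TestingContender from the test.

```java
import java.time.Duration;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch; this is not the TestingContender used by the Flink test. */
public class LeadershipWaitSketch {

    /** Stand-in for a test contender that records when leadership is granted. */
    static final class TimedContender {
        private final CountDownLatch granted = new CountDownLatch(1);

        /** Would be called by the election service once leadership is granted. */
        void grantLeadership() {
            granted.countDown();
        }

        /** Returns true if leadership was granted within the timeout, false otherwise. */
        boolean awaitGrantLeadership(Duration timeout) {
            try {
                return granted.await(timeout.toMillis(), TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report the wait as unsuccessful.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        TimedContender contender = new TimedContender();

        // Nothing has granted leadership yet, so this wait gives up after the
        // timeout instead of blocking indefinitely.
        boolean before = contender.awaitGrantLeadership(Duration.ofMillis(100));
        System.out.println("granted before signal: " + before);

        contender.grantLeadership();

        // The latch is already released, so this wait returns without blocking.
        boolean after = contender.awaitGrantLeadership(Duration.ofMillis(100));
        System.out.println("granted after signal: " + after);
    }
}
```

A bounded wait would not explain why leadership was never granted here, but it would surface the problem as an ordinary test failure with a stack trace rather than as a 900-second watchdog timeout.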
> flink-core module tests exited with code 143
> --------------------------------------------
>
> Key: FLINK-30108
> URL: https://issues.apache.org/jira/browse/FLINK-30108
> Project: Flink
> Issue Type: Bug
> Components: API / Core, Tests
> Affects Versions: 1.17.0
> Reporter: Leonard Xu
> Priority: Major
>
> {noformat}
> Nov 18 01:02:58 [INFO] Tests run: 16, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 109.22 s - in org.apache.flink.runtime.operators.hash.InPlaceMutableHashTableTest
> Nov 18 01:18:09 ==============================================================================
> Nov 18 01:18:09 Process produced no output for 900 seconds.
> Nov 18 01:18:09 ==============================================================================
> Nov 18 01:18:09 ==============================================================================
> Nov 18 01:18:09 The following Java processes are running (JPS)
> Nov 18 01:18:09 ==============================================================================
> Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
> Nov 18 01:18:09 924 Launcher
> Nov 18 01:18:09 23421 surefirebooter1178962604207099497.jar
> Nov 18 01:18:09 11885 Jps
> Nov 18 01:18:09 ==============================================================================
> Nov 18 01:18:09 Printing stack trace of Java process 924
> Nov 18 01:18:09 ==============================================================================
> Picked up JAVA_TOOL_OPTIONS: -XX:+HeapDumpOnOutOfMemoryError
> Nov 18 01:18:09 2022-11-18 01:18:09
> Nov 18 01:18:09 Full thread dump OpenJDK 64-Bit Server VM (25.292-b10 mixed mode):
> ...
> ...
> ...
> Nov 18 01:18:09 ==============================================================================
> Nov 18 01:18:09 Printing stack trace of Java process 11885
> Nov 18 01:18:09 ==============================================================================
> 11885: No such process
> Nov 18 01:18:09 Killing process with pid=923 and all descendants
> /__w/2/s/tools/ci/watchdog.sh: line 113: 923 Terminated $cmd
> Nov 18 01:18:10 Process exited with EXIT CODE: 143.
> Nov 18 01:18:10 Trying to KILL watchdog (919).
> Nov 18 01:18:10 Searching for .dump, .dumpstream and related files in '/__w/2/s'
> Nov 18 01:18:16 Moving '/__w/2/s/flink-runtime/target/surefire-reports/2022-11-18T00-55-55_041-jvmRun3.dumpstream' to target directory ('/__w/_temp/debug_files')
> Nov 18 01:18:16 Moving '/__w/2/s/flink-runtime/target/surefire-reports/2022-11-18T00-55-55_041-jvmRun3.dump' to target directory ('/__w/_temp/debug_files')
> The STDIO streams did not close within 10 seconds of the exit event from process '/bin/bash'. This may indicate a child process inherited the STDIO streams and has not yet exited.
> ##[error]Bash exited with code '143'.
> Finishing: Test - core
> {noformat}
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=43277&view=logs&j=0e7be18f-84f2-53f0-a32d-4a5e4a174679&t=7c1d86e3-35bd-5fd5-3b7c-30c126a78702