[ https://issues.apache.org/jira/browse/FLINK-23794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17403156#comment-17403156 ]
Roman Khachatryan commented on FLINK-23794:
-------------------------------------------
The lambda that references the task (TaskExecutor$$Lambda$1325#1) is likely
[task::isBackPressured|https://github.com/apache/flink/blob/48f531d290dae7783f44f29f3a7e7eec07a12313/flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java#L739].
To sum up:
* TaskExecutor creates some metrics for each task that reference this task
* Those metrics are added to InMemoryReporter
* When the task is removed, the corresponding metrics are only scheduled for
removal from InMemoryReporter rather than removed immediately
* With many restarts during the test, InMemoryReporter still references all
previous task attempts, which prevents them from being garbage-collected
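The retention chain above can be sketched roughly as follows. This is only an illustration: FakeTask and FakeInMemoryReporter are hypothetical stand-ins, not Flink classes, and the sketch assumes that no removal runs between restarts.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-ins for illustration; none of these are Flink classes.
final class MetricLeakSketch {

    /** Stand-in for one task attempt that holds some state. */
    static final class FakeTask {
        private final byte[] state = new byte[1024];
        private final int attempt;

        FakeTask(int attempt) {
            this.attempt = attempt;
        }

        boolean isBackPressured() {
            return false;
        }
    }

    /** Stand-in for a reporter that keeps every registered gauge in memory. */
    static final class FakeInMemoryReporter {
        private final List<Supplier<Boolean>> gauges = new ArrayList<>();

        void register(Supplier<Boolean> gauge) {
            gauges.add(gauge);
        }

        int size() {
            return gauges.size();
        }
    }

    /** Simulates restarts in which gauges are registered but never removed. */
    static int gaugesRetainedAfter(int restarts) {
        FakeInMemoryReporter reporter = new FakeInMemoryReporter();
        for (int attempt = 0; attempt < restarts; attempt++) {
            FakeTask task = new FakeTask(attempt);
            // The bound method reference captures `task`, so the reporter's
            // list keeps this attempt strongly reachable.
            reporter.register(task::isBackPressured);
        }
        return reporter.size();
    }

    public static void main(String[] args) {
        System.out.println(gaugesRetainedAfter(100));
    }
}
```

Each retained gauge pins one task attempt, so the heap grows with the number of restarts until the gauges are actually dropped.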
I see the following solutions:
1. Disable InMemoryReporter by default
2. Disable InMemoryReporter for this and similar tests
3. Try to clean up references (i.e. null out the task reference in the metric
when the task is removed)
The last option seems very fragile and would modify production code only for
the sake of tests.
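For reference, option 3 could look roughly like the sketch below. ClearableGauge and FakeTask are hypothetical names, not existing Flink code. The sketch assumes the reader is an unbound method reference, so that only the clearable field refers to the task.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

// Hypothetical sketch of option 3; not existing Flink code.
final class ClearableGaugeSketch {

    /** Gauge that reads from a target which can be released later. */
    static final class ClearableGauge<T> {
        private final AtomicReference<T> target;
        private final Predicate<T> reader;
        private final boolean fallback;

        ClearableGauge(T target, Predicate<T> reader, boolean fallback) {
            this.target = new AtomicReference<>(target);
            this.reader = reader;
            this.fallback = fallback;
        }

        /** Returns the live value, or the fallback once the target is cleared. */
        boolean getValue() {
            T current = target.get();
            return current == null ? fallback : reader.test(current);
        }

        /** Drops the only reference this gauge holds to the target. */
        void clear() {
            target.set(null);
        }
    }

    /** Stand-in for a task; always reports back pressure in this sketch. */
    static final class FakeTask {
        boolean isBackPressured() {
            return true;
        }
    }

    /** Returns the gauge value before and after the task reference is cleared. */
    static boolean[] valuesBeforeAndAfterClear() {
        // FakeTask::isBackPressured is unbound here, so it captures no instance.
        ClearableGauge<FakeTask> gauge =
                new ClearableGauge<>(new FakeTask(), FakeTask::isBackPressured, false);
        boolean before = gauge.getValue();
        gauge.clear();
        boolean after = gauge.getValue();
        return new boolean[] {before, after};
    }

    public static void main(String[] args) {
        boolean[] values = valuesBeforeAndAfterClear();
        System.out.println(values[0] + " " + values[1]);
    }
}
```

Every metric that touches the task would need the same treatment, and a single bound lambda such as task::isBackPressured would reintroduce the leak, which is why this approach looks fragile.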
[~arvid] WDYT?
> JdbcExactlyOnceSinkE2eTest JVM crash on Azure
> ---------------------------------------------
>
> Key: FLINK-23794
> URL: https://issues.apache.org/jira/browse/FLINK-23794
> Project: Flink
> Issue Type: Bug
> Components: Connectors / JDBC
> Affects Versions: 1.14.0
> Reporter: Xintong Song
> Assignee: Roman Khachatryan
> Priority: Major
> Labels: test-stability
> Fix For: 1.14.0
>
> Attachments: Screenshot_2021-08-23_13-34-31.png
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22196&view=logs&j=e9af9cde-9a65-5281-a58e-2c8511d36983&t=c520d2c3-4d17-51f1-813b-4b0b74a0c307&l=13960
> {code}
> Aug 14 22:56:30 [ERROR] Failed to execute goal
> org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on
> project flink-connector-jdbc_2.11: There are test failures.
> Aug 14 22:56:30 [ERROR]
> Aug 14 22:56:30 [ERROR] Please refer to
> /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire-reports for
> the individual test results.
> Aug 14 22:56:30 [ERROR] Please refer to dump files (if any exist)
> [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
> Aug 14 22:56:30 [ERROR] ExecutionException The forked VM terminated without
> properly saying goodbye. VM crash or System.exit called?
> Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd
> /__w/1/s/flink-connectors/flink-connector-jdbc &&
> /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m
> -Dmvn.forkNumber=2 -XX:+UseG1GC -jar
> /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar
> /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire
> 2021-08-14T22-14-27_386-jvmRun2 surefire3999990822284944903tmp
> surefire_7612891660133211258241tmp
> Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log
> Aug 14 22:56:30 [ERROR] Process Exit Code: 239
> Aug 14 22:56:30 [ERROR] Crashed tests:
> Aug 14 22:56:30 [ERROR]
> org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest
> Aug 14 22:56:30 [ERROR]
> org.apache.maven.surefire.booter.SurefireBooterForkException:
> ExecutionException The forked VM terminated without properly saying goodbye.
> VM crash or System.exit called?
> Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd
> /__w/1/s/flink-connectors/flink-connector-jdbc &&
> /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m
> -Dmvn.forkNumber=2 -XX:+UseG1GC -jar
> /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar
> /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire
> 2021-08-14T22-14-27_386-jvmRun2 surefire3999990822284944903tmp
> surefire_7612891660133211258241tmp
> Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log
> Aug 14 22:56:30 [ERROR] Process Exit Code: 239
> {code}