[jira] [Commented] (FLINK-15451) TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure failed on azure
[ https://issues.apache.org/jira/browse/FLINK-15451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407229#comment-17407229 ] Arvid Heise commented on FLINK-15451: - I got a case now https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23179=logs=a57e0635-3fad-5b08-57c7-a4142d7d6fa9=2ef0effc-1da1-50e5-c2bd-aab434b1c5b7 . > TaskManagerProcessFailureBatchRecoveryITCase.testTaskManagerProcessFailure > failed on azure > -- > > Key: FLINK-15451 > URL: https://issues.apache.org/jira/browse/FLINK-15451 > Project: Flink > Issue Type: Bug > Components: Runtime / Coordination, Tests >Affects Versions: 1.9.1 >Reporter: Congxian Qiu >Priority: Major > Labels: test-stability > > 2019-12-31T02:43:39.4766254Z [ERROR] Tests run: 2, Failures: 0, Errors: 1, > Skipped: 0, Time elapsed: 42.801 s <<< FAILURE! - in > org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase > 2019-12-31T02:43:39.4768373Z [ERROR] > testTaskManagerProcessFailure[0](org.apache.flink.test.recovery.TaskManagerProcessFailureBatchRecoveryITCase) > Time elapsed: 2.699 s <<< ERROR! 2019-12-31T02:43:39.4768834Z > java.net.BindException: Address already in use 2019-12-31T02:43:39.4769096Z > > > [https://dev.azure.com/rmetzger/5bd3ef0a-4359-41af-abca-811b04098d2e/_apis/build/builds/3995/logs/15] -- This message was sent by Atlassian Jira (v8.3.4#803005)
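For context on the {{java.net.BindException: Address already in use}} above: when a test hard-codes a port, a concurrently running build on the same agent can already hold it. A common remedy, shown here as a generic sketch (not necessarily the fix applied for this ticket), is to bind to port 0 so the OS assigns a free ephemeral port:

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPort {
    // Bind to port 0: the OS picks a currently free ephemeral port,
    // so parallel test runs cannot collide on a hard-coded port number.
    public static int pickFreePort() throws IOException {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(pickFreePort() > 0); // an OS-assigned port is always > 0
    }
}
```

Note the small caveat with this pattern: the port is released when the socket closes, so another process may grab it before the test rebinds it; binding the real server directly to port 0 and asking it for its actual port avoids that race entirely.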
[jira] [Commented] (FLINK-24051) Make consumer.group-id optional for KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-24051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17407202#comment-17407202 ] Arvid Heise commented on FLINK-24051: - Merged into master as c10bd30e42d677447eaf39be1a1b48d0dcf8061d. > Make consumer.group-id optional for KafkaSource > --- > > Key: FLINK-24051 > URL: https://issues.apache.org/jira/browse/FLINK-24051 > Project: Flink > Issue Type: Improvement > Components: Connectors / Kafka >Affects Versions: 1.14.0, 1.12.5, 1.13.2, 1.15.0 >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > > For most users it is not necessary to specify a group-id, and the > source itself can provide a meaningful group-id during startup. -- This message was sent by Atlassian Jira (v8.3.4#803005)
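A hypothetical sketch of what making {{consumer.group-id}} optional can look like: if the user leaves {{group.id}} unset, the source fills in a generated one at startup. The {{KafkaSource-}} prefix and the helper name are illustrative, not Flink's actual implementation:

```java
import java.util.Properties;
import java.util.UUID;

public class GroupIdDefaulting {
    static final String GROUP_ID = "group.id";

    // Return a copy of the user's consumer properties, filling in a
    // generated group id only when the user did not set one.
    public static Properties withDefaultGroupId(Properties userProps) {
        Properties props = new Properties();
        props.putAll(userProps);
        props.computeIfAbsent(GROUP_ID, k -> "KafkaSource-" + UUID.randomUUID());
        return props;
    }

    public static void main(String[] args) {
        Properties defaulted = withDefaultGroupId(new Properties());
        // A user-provided group id must win over the generated default.
        Properties user = new Properties();
        user.setProperty(GROUP_ID, "my-group");
        System.out.println(defaulted.getProperty(GROUP_ID).startsWith("KafkaSource-")); // true
        System.out.println(withDefaultGroupId(user).getProperty(GROUP_ID)); // my-group
    }
}
```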
[jira] [Assigned] (FLINK-23817) Write documentation for standardized operator metrics
[ https://issues.apache.org/jira/browse/FLINK-23817?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23817: --- Assignee: Arvid Heise > Write documentation for standardized operator metrics > - > > Key: FLINK-23817 > URL: https://issues.apache.org/jira/browse/FLINK-23817 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common, Documentation >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Blocker > Fix For: 1.14.0 > > > Incorporate metrics in connector page. Use > [data-templates|https://gohugo.io/templates/data-templates/] for common > metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23796. - Resolution: Fixed > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
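Exit code 137 in the log above decodes as 128 + 9, i.e. the forked JVM was terminated with SIGKILL, which on CI machines usually points to the kernel OOM killer. The decoding can be reproduced on Linux by force-killing a child process (a generic illustration, unrelated to Flink's code):

```java
import java.util.concurrent.TimeUnit;

public class ExitCode137 {
    public static void main(String[] args) throws Exception {
        // Start a long-running child and kill it with SIGKILL,
        // as the kernel OOM killer would.
        Process p = new ProcessBuilder("sleep", "30").start();
        p.destroyForcibly(); // SIGKILL on POSIX systems
        p.waitFor(5, TimeUnit.SECONDS);
        // The JVM reports signal deaths as 128 + signal number: 128 + 9 = 137.
        System.out.println(p.exitValue());
    }
}
```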
[jira] [Commented] (FLINK-23913) UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs
[ https://issues.apache.org/jira/browse/FLINK-23913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406575#comment-17406575 ] Arvid Heise commented on FLINK-23913: - Note that this failure predates the fix in FLINK-23794. If any new case happens now, it has a different cause, as we completely removed InMemoryReporter for this test case. > UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs > > > Key: FLINK-23913 > URL: https://issues.apache.org/jira/browse/FLINK-23913 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.14.0 > Environment: UnalignedCheckpointITCase >Reporter: Robert Metzger >Assignee: Arvid Heise >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > Cases reported in FLINK-23525: > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=10338 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22605=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - ... there are a lot more cases. > The problem seems to have started occurring around August 20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23913) UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs
[ https://issues.apache.org/jira/browse/FLINK-23913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23913. - Resolution: Fixed > UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs > > > Key: FLINK-23913 > URL: https://issues.apache.org/jira/browse/FLINK-23913 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.14.0 > Environment: UnalignedCheckpointITCase >Reporter: Robert Metzger >Assignee: Arvid Heise >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > Cases reported in FLINK-23525: > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=10338 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22605=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - ... there are a lot more cases. > The problem seems to have started occurring around August 20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17406573#comment-17406573 ] Arvid Heise commented on FLINK-23796: - Note that this failure predates the fix in FLINK-23794. If any new case happens now, it has a different cause, as we completely removed InMemoryReporter for this test case. > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23776) Performance regression on 14.08.2021 in FLIP-27
[ https://issues.apache.org/jira/browse/FLINK-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17405965#comment-17405965 ] Arvid Heise commented on FLINK-23776: - Merged an improvement as fd429d084264357d3cfbfd8b2b62cf8327a8fd79. > Performance regression on 14.08.2021 in FLIP-27 > --- > > Key: FLINK-23776 > URL: https://issues.apache.org/jira/browse/FLINK-23776 > Project: Flink > Issue Type: Bug > Components: API / DataStream, Benchmarks >Affects Versions: 1.14.0 >Reporter: Piotr Nowojski >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0 > > > http://codespeed.dak8s.net:8000/timeline/?ben=mapSink.F27_UNBOUNDED=2 > http://codespeed.dak8s.net:8000/timeline/?ben=mapRebalanceMapSink.F27_UNBOUNDED=2 > {noformat} > git ls 7b60a964b1..7f3636f6b4 > 7f3636f6b4f [2 days ago] [FLINK-23652][connectors] Adding common source > metrics. [Arvid Heise] > 97c8f72b813 [3 months ago] [FLINK-23652][connectors] Adding common sink > metrics. [Arvid Heise] > 48da20e8f88 [3 months ago] [FLINK-23652][test] Adding InMemoryMetricReporter > and using it by default in MiniClusterResource. [Arvid Heise] > 63ee60859ca [3 months ago] [FLINK-23652][core/metrics] Extract > Operator(IO)MetricGroup interfaces and expose them in RuntimeContext [Arvid > Heise] > 5d5e39b614b [2 days ago] [refactor][connectors] Only use > MockSplitReader.Builder for instantiation. [Arvid Heise] > b927035610c [3 months ago] [refactor][core] Extract common context creation > in CollectionExecutor [Arvid Heise] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23794) JdbcExactlyOnceSinkE2eTest JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23794. - Resolution: Fixed Merged into master as 3dbd9e785394590bf52fd7dc41c00c5254a4de24..126fafc25d8769e1cdcd8d88e153bc33dbc95c61. > JdbcExactlyOnceSinkE2eTest JVM crash on Azure > - > > Key: FLINK-23794 > URL: https://issues.apache.org/jira/browse/FLINK-23794 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > Attachments: Screenshot_2021-08-23_13-34-31.png > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22196=logs=e9af9cde-9a65-5281-a58e-2c8511d36983=c520d2c3-4d17-51f1-813b-4b0b74a0c307=13960 > {code} > Aug 14 22:56:30 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on > project flink-connector-jdbc_2.11: There are test failures. > Aug 14 22:56:30 [ERROR] > Aug 14 22:56:30 [ERROR] Please refer to > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire-reports for > the individual test results. > Aug 14 22:56:30 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 14 22:56:30 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > Aug 14 22:56:30 [ERROR] Crashed tests: > Aug 14 22:56:30 [ERROR] > org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest > Aug 14 22:56:30 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23945) Cannot start the cluster using S3 as the file system
[ https://issues.apache.org/jira/browse/FLINK-23945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17405625#comment-17405625 ] Arvid Heise commented on FLINK-23945: - {quote} I tried to debug using the source code, and then found that he new a org.apache.flink.core.fs.Path by String "s3:///flink/recovery", and then threw a null uri host exception when calling the getFileSystem method {quote} This is expected: you leave out the bucket name. It should be {{s3://<bucket>/flink/recovery}}. Do we have "s3:///" somewhere in our documentation? > Cannot start the cluster using S3 as the file system > > > Key: FLINK-23945 > URL: https://issues.apache.org/jira/browse/FLINK-23945 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem >Affects Versions: 1.12.3 >Reporter: 钟洋洋 >Priority: Minor > Attachments: 截屏2021-08-27 上午12.24.30.png, 截屏2021-08-27 上午12.31.10.png > > > {{high-availability.storageDir: s3:///flink/recovery}} > *When I performed the above configuration, the following error was reported* > Could not start cluster entrypoint KubernetesSessionClusterEntrypoint. > org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to > initialize the cluster entrypoint KubernetesSessionClusterEntrypoint. 
> at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:201) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:585) > [flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.kubernetes.entrypoint.KubernetesSessionClusterEntrypoint.main(KubernetesSessionClusterEntrypoint.java:61) > [flink-dist_2.12-1.12.3.jar:1.12.3] > Caused by: java.io.IOException: Could not create FileSystem for highly > available storage path (s3:/flink/recovery/flink-native-k8s-session-1) > at > org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:92) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:115) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.initializeServices(ClusterEntrypoint.java:296) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:224) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$1(ClusterEntrypoint.java:178) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_292] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_292] > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) > ~[flink-shaded-hadoop-2-uber-2.8.3-10.0.jar:2.8.3-10.0] > at > 
org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > ... 2 more > Caused by: java.io.IOException: null uri host. > at > org.apache.flink.fs.s3.common.AbstractS3FileSystemFactory.create(AbstractS3FileSystemFactory.java:162) > ~[?:?] > at > org.apache.flink.core.fs.PluginFileSystemFactory.create(PluginFileSystemFactory.java:62) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:507) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:408) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at org.apache.flink.core.fs.Path.getFileSystem(Path.java:274) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.blob.BlobUtils.createFileSystemBlobStore(BlobUtils.java:89) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.blob.BlobUtils.createBlobStoreFromConfig(BlobUtils.java:76) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.highavailability.HighAvailabilityServicesUtils.createHighAvailabilityServices(HighAvailabilityServicesUtils.java:115) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at > org.apache.flink.runtime.entrypoint.ClusterEntrypoint.createHaServices(ClusterEntrypoint.java:338) > ~[flink-dist_2.12-1.12.3.jar:1.12.3] > at >
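The {{null uri host}} error in the stack trace above comes from the empty authority in {{s3:///flink/recovery}}: with no bucket name between {{s3://}} and the path, the URI host is null, which the S3 filesystem factory rejects. A small standalone demonstration of the URI parsing involved ({{my-bucket}} is a placeholder, not a real bucket):

```java
import java.net.URI;

public class S3UriHost {
    public static void main(String[] args) {
        // Three slashes mean an empty authority: no bucket, so no host.
        URI missingBucket = URI.create("s3:///flink/recovery");
        System.out.println(missingBucket.getHost()); // null -> "null uri host" in Flink

        // With the bucket in the authority position, the host is set.
        URI withBucket = URI.create("s3://my-bucket/flink/recovery");
        System.out.println(withBucket.getHost()); // my-bucket
    }
}
```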
[jira] [Commented] (FLINK-23877) Embedded Pulsar broker backend for testing
[ https://issues.apache.org/jira/browse/FLINK-23877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17405157#comment-17405157 ] Arvid Heise commented on FLINK-23877: - Merged into master as d476c5facf7baac5b3c70295599d4fc41c52eedd..3ef9323436fbd7a20742b52120b34a53a82a9675. > Embedded Pulsar broker backend for testing > -- > > Key: FLINK-23877 > URL: https://issues.apache.org/jira/browse/FLINK-23877 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Yufan Sheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The Pulsar connector uses Testcontainers for all the Pulsar-related tests. We > should have a lightweight embedded Pulsar backend for running all the unit > tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23877) Embedded Pulsar broker backend for testing
[ https://issues.apache.org/jira/browse/FLINK-23877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23877. - Resolution: Fixed > Embedded Pulsar broker backend for testing > -- > > Key: FLINK-23877 > URL: https://issues.apache.org/jira/browse/FLINK-23877 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Yufan Sheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The Pulsar connector uses Testcontainers for all the Pulsar-related tests. We > should have a lightweight embedded Pulsar backend for running all the unit > tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23877) Embedded Pulsar broker backend for testing
[ https://issues.apache.org/jira/browse/FLINK-23877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23877: --- Assignee: Yufan Sheng > Embedded Pulsar broker backend for testing > -- > > Key: FLINK-23877 > URL: https://issues.apache.org/jira/browse/FLINK-23877 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Yufan Sheng >Assignee: Yufan Sheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The Pulsar connector uses Testcontainers for all the Pulsar-related tests. We > should have a lightweight embedded Pulsar backend for running all the unit > tests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23969) Test Pulsar source end 2 end
[ https://issues.apache.org/jira/browse/FLINK-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17405107#comment-17405107 ] Arvid Heise commented on FLINK-23969: - I assigned you; please wait for the documentation (it should just be done this week) and check whether it can be improved. > Test Pulsar source end 2 end > > > Key: FLINK-23969 > URL: https://issues.apache.org/jira/browse/FLINK-23969 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Arvid Heise >Assignee: Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Write a test application using Pulsar Source and execute it in distributed > fashion. Check fault-tolerance by crashing and restarting a TM. > Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23969) Test Pulsar source end 2 end
[ https://issues.apache.org/jira/browse/FLINK-23969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23969: --- Assignee: Liu > Test Pulsar source end 2 end > > > Key: FLINK-23969 > URL: https://issues.apache.org/jira/browse/FLINK-23969 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Pulsar >Reporter: Arvid Heise >Assignee: Liu >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Write a test application using Pulsar Source and execute it in distributed > fashion. Check fault-tolerance by crashing and restarting a TM. > Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23969) Test Pulsar source end 2 end
Arvid Heise created FLINK-23969: --- Summary: Test Pulsar source end 2 end Key: FLINK-23969 URL: https://issues.apache.org/jira/browse/FLINK-23969 Project: Flink Issue Type: Sub-task Components: Connectors / Pulsar Reporter: Arvid Heise Fix For: 1.14.0 Write a test application using Pulsar Source and execute it in distributed fashion. Check fault-tolerance by crashing and restarting a TM. Ideally, we test different subscription modes and sticky keys in particular. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-23886) An exception is thrown out when recover job timers from checkpoint file
[ https://issues.apache.org/jira/browse/FLINK-23886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404373#comment-17404373 ] Arvid Heise edited comment on FLINK-23886 at 8/25/21, 11:49 AM: {quote} Flink actually synchronizes invocations of onTimer() and processElement() [see timers description|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/process_function/#timers] via the [mailbox thread model|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L1665-L1671] in StreamTask. As far as I can see, I cannot see a concurrency problem here. Maybe [~arvid] could share more insights here. {quote} This is only true since Flink 1.11. Before that a few things were done concurrently, so I could imagine that there are bugs lingering. [~pnowojski] probably knows more. However, this probably also means that none of the currently supported versions would exhibit the same issue. So even if we manage to find the bug and fix it, there won't be a bugfix release; you'd have to apply a patch on your own. was (Author: arvid): {quote} Flink actually synchronizes invocations of onTimer() and processElement() [see timers description|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/process_function/#timers] via the [mailbox thread model|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L1665-L1671] in StreamTask. As far as I can see, I cannot see a concurrency problem here. Maybe [~arvid] could share more insights here. {quote} This is only true since Flink 1.11. Before that a few things were done concurrently, so I could imagine that there are bugs lingering. [~pnowojski] probably knows more. However, this probably also means that none of the currently supported versions would exhibit the same issue. 
So even we manage to find the bug and fix it, there won't be a bugfix release; you'd have to apply a patch on your own. > An exception is thrown out when recover job timers from checkpoint file > --- > > Key: FLINK-23886 > URL: https://issues.apache.org/jira/browse/FLINK-23886 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: JING ZHANG >Priority: Major > Attachments: image-2021-08-25-16-38-04-023.png, > image-2021-08-25-16-38-12-308.png, image-2021-08-25-17-06-29-806.png, > image-2021-08-25-17-07-38-327.png > > > A user reported the bug on the [mailing list. > |http://mail-archives.apache.org/mod_mbox/flink-user/202108.mbox/%3ccakmsf43j14nkjmgjuy4dh5qn2vbjtw4tfh4pmmuyvcvfhgf...@mail.gmail.com%3E] I > paste the content here. > Setup Specifics: > Version: 1.6.2 > RocksDB Map State > Timers stored in rocksdb > > When we have this job running for long periods of time like > 30 days, if > for some reason the job restarts, we encounter "Error while deserializing the > element". Is this a known issue fixed in later versions? I see some changes > to code for FLINK-10175, but we don't use any queryable state > > Below is the stack trace > > org.apache.flink.util.FlinkRuntimeException: Error while deserializing the > element. 
> at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.deserializeElement(RocksDBCachingPriorityQueueSet.java:389) > at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.peek(RocksDBCachingPriorityQueueSet.java:146) > at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.peek(RocksDBCachingPriorityQueueSet.java:56) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue$InternalPriorityQueueComparator.comparePriority(KeyGroupPartitionedPriorityQueue.java:274) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue$InternalPriorityQueueComparator.comparePriority(KeyGroupPartitionedPriorityQueue.java:261) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.isElementPriorityLessThen(HeapPriorityQueue.java:164) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.siftUp(HeapPriorityQueue.java:121) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.addInternal(HeapPriorityQueue.java:85) > at > org.apache.flink.runtime.state.heap.AbstractHeapPriorityQueue.add(AbstractHeapPriorityQueue.java:73) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue.(KeyGroupPartitionedPriorityQueue.java:89) > at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBPriorityQueueSetFactory.create(RocksDBKeyedStateBackend.java:2792) > at >
[jira] [Commented] (FLINK-23886) An exception is thrown out when recover job timers from checkpoint file
[ https://issues.apache.org/jira/browse/FLINK-23886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404373#comment-17404373 ] Arvid Heise commented on FLINK-23886: - {quote} Flink actually synchronizes invocations of onTimer() and processElement() [see timers description|https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/operators/process_function/#timers] via the [mailbox thread model|https://github.com/apache/flink/blob/master/flink-streaming-java/src/main/java/org/apache/flink/streaming/runtime/tasks/StreamTask.java#L1665-L1671] in StreamTask. As far as I can see, I cannot see a concurrency problem here. Maybe [~arvid] could share more insights here. {quote} This is only true since Flink 1.11. Before that a few things were done concurrently, so I could imagine that there are bugs lingering. [~pnowojski] probably knows more. However, this probably also means that none of the currently supported versions would exhibit the same issue. So even if we manage to find the bug and fix it, there won't be a bugfix release; you'd have to apply a patch on your own. > An exception is thrown out when recover job timers from checkpoint file > --- > > Key: FLINK-23886 > URL: https://issues.apache.org/jira/browse/FLINK-23886 > Project: Flink > Issue Type: Bug > Components: Runtime / State Backends >Reporter: JING ZHANG >Priority: Major > Attachments: image-2021-08-25-16-38-04-023.png, > image-2021-08-25-16-38-12-308.png, image-2021-08-25-17-06-29-806.png, > image-2021-08-25-17-07-38-327.png > > > A user reported the bug on the [mailing list. > |http://mail-archives.apache.org/mod_mbox/flink-user/202108.mbox/%3ccakmsf43j14nkjmgjuy4dh5qn2vbjtw4tfh4pmmuyvcvfhgf...@mail.gmail.com%3E] I > paste the content here. 
> Setup Specifics: > Version: 1.6.2 > RocksDB Map State > Timers stored in rocksdb > > When we have this job running for long periods of time like > 30 days, if > for some reason the job restarts, we encounter "Error while deserializing the > element". Is this a known issue fixed in later versions? I see some changes > to code for FLINK-10175, but we don't use any queryable state > > Below is the stack trace > > org.apache.flink.util.FlinkRuntimeException: Error while deserializing the > element. > at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.deserializeElement(RocksDBCachingPriorityQueueSet.java:389) > at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.peek(RocksDBCachingPriorityQueueSet.java:146) > at > org.apache.flink.contrib.streaming.state.RocksDBCachingPriorityQueueSet.peek(RocksDBCachingPriorityQueueSet.java:56) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue$InternalPriorityQueueComparator.comparePriority(KeyGroupPartitionedPriorityQueue.java:274) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue$InternalPriorityQueueComparator.comparePriority(KeyGroupPartitionedPriorityQueue.java:261) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.isElementPriorityLessThen(HeapPriorityQueue.java:164) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.siftUp(HeapPriorityQueue.java:121) > at > org.apache.flink.runtime.state.heap.HeapPriorityQueue.addInternal(HeapPriorityQueue.java:85) > at > org.apache.flink.runtime.state.heap.AbstractHeapPriorityQueue.add(AbstractHeapPriorityQueue.java:73) > at > org.apache.flink.runtime.state.heap.KeyGroupPartitionedPriorityQueue.<init>(KeyGroupPartitionedPriorityQueue.java:89) > at > org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend$RocksDBPriorityQueueSetFactory.create(RocksDBKeyedStateBackend.java:2792) > at >
org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackend.create(RocksDBKeyedStateBackend.java:450) > at > org.apache.flink.streaming.api.operators.InternalTimeServiceManager.createTimerPriorityQueue(InternalTimeServiceManager.java:121) > at > org.apache.flink.streaming.api.operators.InternalTimeServiceManager.registerOrGetTimerService(InternalTimeServiceManager.java:106) > at > org.apache.flink.streaming.api.operators.InternalTimeServiceManager.getInternalTimerService(InternalTimeServiceManager.java:87) > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.getInternalTimerService(AbstractStreamOperator.java:764) > at > org.apache.flink.streaming.api.operators.KeyedProcessOperator.open(KeyedProcessOperator.java:61) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.openAllOperators(StreamTask.java:424) > at > org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:290) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711) > at java.lang.Thread.run(Thread.java:748) > Caused by:
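The mailbox model mentioned in the comment above can be illustrated with a minimal stdlib-only sketch (this is not Flink's actual StreamTask code, and the class name is hypothetical): every action is enqueued as a "mail" and drained by a single loop, so actions such as processElement() and onTimer() can never run concurrently.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Stdlib-only sketch of a mailbox executor: one loop drains a FIFO queue of
 *  "mails", so any two enqueued actions are strictly serialized. */
public class MailboxSketch {
    private final Deque<Runnable> mailbox = new ArrayDeque<>();

    /** Enqueue an action (e.g. an element to process, or a firing timer). */
    public void submit(Runnable mail) {
        mailbox.add(mail);
    }

    /** Drain all mails in FIFO order on the calling thread. */
    public void runMailboxLoop() {
        Runnable mail;
        while ((mail = mailbox.poll()) != null) {
            mail.run();
        }
    }

    public static List<String> demo() {
        MailboxSketch task = new MailboxSketch();
        List<String> order = new ArrayList<>(); // touched only by the mailbox loop
        task.submit(() -> order.add("processElement"));
        task.submit(() -> order.add("onTimer"));
        task.runMailboxLoop();
        return order;
    }
}
```

The real mailbox additionally uses a thread-safe queue so that other threads can enqueue mails, but the serialization guarantee discussed in the comment is the same: since Flink 1.11, both callbacks run on the single mailbox thread.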
[jira] [Comment Edited] (FLINK-23945) can not open dashboard
[ https://issues.apache.org/jira/browse/FLINK-23945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404366#comment-17404366 ] Arvid Heise edited comment on FLINK-23945 at 8/25/21, 11:41 AM: Your syntax is incorrect. s3://<bucket>/<path> always addresses the bucket at a given endpoint. For MinIO, we explicitly set http://<host>:<port> as the {{s3.endpoint}} and then use the s3 protocol just for the actual bucket. -There is also a combined syntax (Virtual Hosted-Style Requests) where you set it to http(s)://<bucket>.<endpoint>:<port>/ [1]- -[1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html- (cannot be used in Flink) was (Author: arvid): Your syntax is incorrect. s3://<bucket>/<path> always addresses the bucket at a given endpoint. For MinIO, we explicitly set http://<host>:<port> as the {{s3.endpoint}} and then use the s3 protocol just for the actual bucket. - There is also a combined syntax (Virtual Hosted-Style Requests) where you set it to http(s)://<bucket>.<endpoint>:<port>/ [1] [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html- (cannot be used in Flink) > can not open dashboard > --- > > Key: FLINK-23945 > URL: https://issues.apache.org/jira/browse/FLINK-23945 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Deployment / Kubernetes >Affects Versions: 1.12.3 >Reporter: 钟洋洋 >Priority: Minor > > Cannot deploy normally using k8s native session mode, but no error log is > found > > Details are in the comments section > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-23945) can not open dashboard
[ https://issues.apache.org/jira/browse/FLINK-23945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404366#comment-17404366 ] Arvid Heise edited comment on FLINK-23945 at 8/25/21, 11:41 AM: Your syntax is incorrect. s3://<bucket>/<path> always addresses the bucket at a given endpoint. For MinIO, we explicitly set http://<host>:<port> as the {{s3.endpoint}} and then use the s3 protocol just for the actual bucket. - There is also a combined syntax (Virtual Hosted-Style Requests) where you set it to http(s)://<bucket>.<endpoint>:<port>/ [1] [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html- (cannot be used in Flink) was (Author: arvid): Your syntax is incorrect. s3://<bucket>/<path> always addresses the bucket at a given endpoint. For MinIO, we explicitly set http://<host>:<port> as the {{s3.endpoint}} and then use the s3 protocol just for the actual bucket. There is also a combined syntax (Virtual Hosted-Style Requests) where you set it to http(s)://<bucket>.<endpoint>:<port>/ [1] [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html > can not open dashboard > --- > > Key: FLINK-23945 > URL: https://issues.apache.org/jira/browse/FLINK-23945 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Deployment / Kubernetes >Affects Versions: 1.12.3 >Reporter: 钟洋洋 >Priority: Minor > > Cannot deploy normally using k8s native session mode, but no error log is > found > > Details are in the comments section > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23945) can not open dashboard
[ https://issues.apache.org/jira/browse/FLINK-23945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404366#comment-17404366 ] Arvid Heise commented on FLINK-23945: - Your syntax is incorrect. s3://<bucket>/<path> always addresses the bucket at a given endpoint. For MinIO, we explicitly set http://<host>:<port> as the {{s3.endpoint}} and then use the s3 protocol just for the actual bucket. There is also a combined syntax (Virtual Hosted-Style Requests) where you set it to http(s)://<bucket>.<endpoint>:<port>/ [1] [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html > can not open dashboard > --- > > Key: FLINK-23945 > URL: https://issues.apache.org/jira/browse/FLINK-23945 > Project: Flink > Issue Type: Bug > Components: Connectors / FileSystem, Deployment / Kubernetes >Affects Versions: 1.12.3 >Reporter: 钟洋洋 >Priority: Minor > > Cannot deploy normally using k8s native session mode, but no error log is > found > > Details are in the comments section > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
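The configuration described in the comment above would look roughly like the following flink-conf.yaml fragment; host, port, bucket name, and credentials are placeholders, not values from this issue:

```yaml
# Sketch of a flink-conf.yaml fragment for MinIO (all values are placeholders).
# The endpoint carries the host and port; the s3:// scheme then only names the bucket.
s3.endpoint: http://minio-host:9000
s3.path.style.access: true   # MinIO typically requires path-style access
s3.access-key: <access-key>
s3.secret-key: <secret-key>
# Checkpoints then use plain bucket URIs, e.g.:
# state.checkpoints.dir: s3://my-bucket/checkpoints
```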
[jira] [Assigned] (FLINK-23647) UnalignedCheckpointStressITCase crashed on azure
[ https://issues.apache.org/jira/browse/FLINK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23647: --- Assignee: Arvid Heise > UnalignedCheckpointStressITCase crashed on azure > > > Key: FLINK-23647 > URL: https://issues.apache.org/jira/browse/FLINK-23647 > Project: Flink > Issue Type: Bug > Components: Runtime / Network, Tests >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Assignee: Arvid Heise >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21539=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=4855 > When testing DFS changelog implementation in FLINK-23279 and enabling it for > all tests, > UnalignedCheckpointStressITCase crashed with the following exception > {code} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 18.433 s <<< FAILURE! - in > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase > [ERROR] > runStressTest(org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase) > Time elapsed: 17.663 s <<< ERROR! 
> java.io.UncheckedIOException: java.nio.file.NoSuchFileException: > /tmp/junit7860347244680665820/435237 d57439f2ceadfedba74dadd6fa/chk-16 >at > java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88) >at java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104) >at java.util.Iterator.forEachRemaining(Iterator.java:115) >at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) >at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) >at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) >at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) >at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) >at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:288) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:261) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:157) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:498) >at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) >at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) >at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) >at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) >at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) >at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) >at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) >at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) >at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) >at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) >at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) >at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) >at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) >at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) >at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) >at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) >at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at org.junit.rules.RunRules.evaluate(RunRules.java:20) >at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) >at org.junit.runners.ParentRunner.run(ParentRunner.java:413) >at >
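The NoSuchFileException above stems from a race: a checkpoint directory is deleted while the test walks the file tree (Files.walk wraps the error in an UncheckedIOException). A hypothetical, stdlib-only sketch of a tolerant discovery helper, retrying the scan when a directory vanishes mid-walk (this is not the actual test code):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

/** Hypothetical sketch: find the latest retained "chk-N" directory, retrying
 *  when a directory is deleted concurrently with the walk. */
public class CheckpointDiscovery {
    public static Optional<Path> latestCheckpoint(Path root, int maxAttempts) throws IOException {
        for (int attempt = 1; ; attempt++) {
            try (Stream<Path> paths = Files.walk(root)) {
                return paths
                        .filter(p -> p.getFileName().toString().startsWith("chk-"))
                        .max(Comparator.comparingInt(
                                (Path p) -> Integer.parseInt(p.getFileName().toString().substring(4))));
            } catch (UncheckedIOException e) {
                // Files.walk wraps NoSuchFileException thrown during iteration.
                if (!(e.getCause() instanceof NoSuchFileException) || attempt >= maxAttempts) {
                    throw e;
                }
                // Directory vanished mid-walk; rescan.
            }
        }
    }
}
```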
[jira] [Comment Edited] (FLINK-23913) UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs
[ https://issues.apache.org/jira/browse/FLINK-23913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404035#comment-17404035 ] Arvid Heise edited comment on FLINK-23913 at 8/24/21, 8:16 PM: --- This is probably a duplicate of FLINK-23794. was (Author: arvid): This is probably caused by FLINK-23794. > UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs > > > Key: FLINK-23913 > URL: https://issues.apache.org/jira/browse/FLINK-23913 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.14.0 > Environment: UnalignedCheckpointITCase >Reporter: Robert Metzger >Assignee: Arvid Heise >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > Cases reported in FLINK-23525: > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=10338 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22605=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - ... there are a lot more cases. > The problem seems to have started occurring around August 20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23913) UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs
[ https://issues.apache.org/jira/browse/FLINK-23913?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404035#comment-17404035 ] Arvid Heise commented on FLINK-23913: - This is probably caused by FLINK-23794. > UnalignedCheckpointITCase fails with exit code 137 (kernel oom) on Azure VMs > > > Key: FLINK-23913 > URL: https://issues.apache.org/jira/browse/FLINK-23913 > Project: Flink > Issue Type: Bug > Components: Runtime / Network >Affects Versions: 1.14.0 > Environment: UnalignedCheckpointITCase >Reporter: Robert Metzger >Assignee: Arvid Heise >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > Cases reported in FLINK-23525: > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=4d4a0d10-fca2-5507-8eed-c07f0bdf4887=7b25afdf-cc6c-566f-5459-359dc2585798=10338 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22618=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22605=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=4743 > - ... there are a lot more cases. > The problem seems to have started occurring around August 20. -- This message was sent by Atlassian Jira (v8.3.4#803005)
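For readers decoding the failure above: exit code 137 is 128 + 9, i.e. the forked JVM was terminated by SIGKILL, which is the signal the kernel OOM killer sends. A small sketch of that arithmetic:

```java
/** Sketch: decode a POSIX-style "terminated by signal" exit code such as 137.
 *  Shells and process runners report such a termination as 128 + signal number. */
public class ExitCodeSketch {
    public static int signalOf(int exitCode) {
        return exitCode > 128 ? exitCode - 128 : 0; // 0 means: not signal-terminated
    }
}
```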
[jira] [Resolved] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23796. - Resolution: Fixed > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17404034#comment-17404034 ] Arvid Heise commented on FLINK-23796: - I haven't seen any new reports. There is still the possibility that FLINK-23794 also causes this. But will close this ticket for the time being. > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-22889) JdbcExactlyOnceSinkE2eTest.testInsert hangs on azure
[ https://issues.apache.org/jira/browse/FLINK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-22889: --- Assignee: Arvid Heise (was: Roman Khachatryan) > JdbcExactlyOnceSinkE2eTest.testInsert hangs on azure > > > Key: FLINK-22889 > URL: https://issues.apache.org/jira/browse/FLINK-22889 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0, 1.13.1 >Reporter: Dawid Wysakowicz >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=18690=logs=ba53eb01-1462-56a3-8e98-0dd97fbcaab5=bfbc6239-57a0-5db0-63f3-41551b4f7d51=16658 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23794) JdbcExactlyOnceSinkE2eTest JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23794: --- Assignee: Arvid Heise (was: Roman Khachatryan) > JdbcExactlyOnceSinkE2eTest JVM crash on Azure > - > > Key: FLINK-23794 > URL: https://issues.apache.org/jira/browse/FLINK-23794 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > Attachments: Screenshot_2021-08-23_13-34-31.png > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22196=logs=e9af9cde-9a65-5281-a58e-2c8511d36983=c520d2c3-4d17-51f1-813b-4b0b74a0c307=13960 > {code} > Aug 14 22:56:30 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on > project flink-connector-jdbc_2.11: There are test failures. > Aug 14 22:56:30 [ERROR] > Aug 14 22:56:30 [ERROR] Please refer to > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire-reports for > the individual test results. > Aug 14 22:56:30 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 14 22:56:30 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > Aug 14 22:56:30 [ERROR] Crashed tests: > Aug 14 22:56:30 [ERROR] > org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest > Aug 14 22:56:30 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23794) JdbcExactlyOnceSinkE2eTest JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17403204#comment-17403204 ] Arvid Heise commented on FLINK-23794: - Wow, thanks for the in-depth analysis. I think I'll fix that in the InMemoryReporter (remove all metrics eagerly unless explicitly stated in `InMemoryReporterRule`) or overhaul the whole mechanism (only register when there is an `InMemoryReporterRule`). Can I take this ticket? > JdbcExactlyOnceSinkE2eTest JVM crash on Azure > - > > Key: FLINK-23794 > URL: https://issues.apache.org/jira/browse/FLINK-23794 > Project: Flink > Issue Type: Bug > Components: Connectors / JDBC >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Roman Khachatryan >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > Attachments: Screenshot_2021-08-23_13-34-31.png > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22196=logs=e9af9cde-9a65-5281-a58e-2c8511d36983=c520d2c3-4d17-51f1-813b-4b0b74a0c307=13960 > {code} > Aug 14 22:56:30 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test (default-test) on > project flink-connector-jdbc_2.11: There are test failures. > Aug 14 22:56:30 [ERROR] > Aug 14 22:56:30 [ERROR] Please refer to > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire-reports for > the individual test results. > Aug 14 22:56:30 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 14 22:56:30 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > Aug 14 22:56:30 [ERROR] Crashed tests: > Aug 14 22:56:30 [ERROR] > org.apache.flink.connector.jdbc.xa.JdbcExactlyOnceSinkE2eTest > Aug 14 22:56:30 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 14 22:56:30 [ERROR] Command was /bin/sh -c cd > /__w/1/s/flink-connectors/flink-connector-jdbc && > /usr/lib/jvm/adoptopenjdk-11-hotspot-amd64/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=2 -XX:+UseG1GC -jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire/surefirebooter3870491592340940577.jar > /__w/1/s/flink-connectors/flink-connector-jdbc/target/surefire > 2021-08-14T22-14-27_386-jvmRun2 surefire390822284944903tmp > surefire_7612891660133211258241tmp > Aug 14 22:56:30 [ERROR] Error occurred in starting fork, check output in log > Aug 14 22:56:30 [ERROR] Process Exit Code: 239 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
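The fix sketched in the comment above, dropping recorded metrics eagerly unless a test opts in, could look roughly like this; the class name is hypothetical and the real InMemoryReporter lives in Flink's test utilities:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of an opt-in in-memory metrics reporter: unless a test
 *  explicitly retains metrics, they are discarded on job teardown so that long
 *  test runs do not accumulate state and drive the forked JVM into an OOM kill. */
public class OptInMemoryReporter {
    private final Map<String, Object> metrics = new ConcurrentHashMap<>();
    private volatile boolean retainOnTeardown; // set only by an explicit test rule

    public void retainMetrics() { retainOnTeardown = true; }

    public void report(String name, Object value) { metrics.put(name, value); }

    /** Called when a job finishes; eagerly frees memory unless a test opted in. */
    public void onJobTerminated() {
        if (!retainOnTeardown) {
            metrics.clear();
        }
    }

    public int size() { return metrics.size(); }
}
```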
[jira] [Commented] (FLINK-23879) Benchmarks are not compiling
[ https://issues.apache.org/jira/browse/FLINK-23879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402542#comment-17402542 ] Arvid Heise commented on FLINK-23879: - Thanks for fixing this, [~chesnay]! Sorry for creating the mess - I keep forgetting about the benchmark repository. Do we actually still need the separation? I thought there was some clarification around the JMH license that would make it possible to integrate it into the main repo? For example, Apache Commons seems to use it [1]. [1] http://commons.apache.org/proper/commons-rng/commons-rng-examples/commons-rng-examples-jmh/dependencies.html > Benchmarks are not compiling > > > Key: FLINK-23879 > URL: https://issues.apache.org/jira/browse/FLINK-23879 > Project: Flink > Issue Type: Bug > Components: Benchmarks >Reporter: Zhilong Hong >Assignee: Chesnay Schepler >Priority: Blocker > Labels: pull-request-available > > The benchmarks have not been compiling since Aug. 16th, 2021. The error is: > {noformat} > [2021-08-19T23:18:36.242Z] [ERROR] > /home/jenkins/workspace/flink-scheduler-benchmarks/flink-benchmarks/src/main/java/org/apache/flink/benchmark/SortingBoundedInputBenchmarks.java:47:54: > error: package org.apache.flink.streaming.runtime.streamstatus does not exist > [2021-08-19T23:18:36.242Z] [ERROR] > /home/jenkins/workspace/flink-scheduler-benchmarks/flink-benchmarks/src/main/java/org/apache/flink/benchmark/SortingBoundedInputBenchmarks.java:350:40: > error: cannot find symbol{noformat} > It seems to be introduced by FLINK-23767, in which {{StreamStatus}} is > replaced with {{WatermarkStatus}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23640) Create a KafkaRecordSerializationSchemas valueOnly helper
[ https://issues.apache.org/jira/browse/FLINK-23640?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402398#comment-17402398 ] Arvid Heise commented on FLINK-23640: - Merged into master as c6f943ec29608ef3ce66d9b41968fd1136915e13. > Create a KafkaRecordSerializationSchemas valueOnly helper > - > > Key: FLINK-23640 > URL: https://issues.apache.org/jira/browse/FLINK-23640 > Project: Flink > Issue Type: Sub-task >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > > Commonly, users only want to serialize the value of a Kafka record if they are > not using a key schema. For these users, we can provide an easier entry point > comparable to {{KafkaValueOnlyDeserializerWrapper}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23640) Create a KafkaRecordSerializationSchemas valueOnly helper
[ https://issues.apache.org/jira/browse/FLINK-23640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23640. - Fix Version/s: 1.14.0 Resolution: Fixed > Create a KafkaRecordSerializationSchemas valueOnly helper > - > > Key: FLINK-23640 > URL: https://issues.apache.org/jira/browse/FLINK-23640 > Project: Flink > Issue Type: Sub-task >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > Commonly, users only want to serialize the value of a Kafka record if they are > not using a key schema. For these users, we can provide an easier entry point > comparable to {{KafkaValueOnlyDeserializerWrapper}}. -- This message was sent by Atlassian Jira (v8.3.4#803005)
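The value-only idea above can be sketched independently of the Kafka classes (the wrapper and record types here are hypothetical stand-ins, not Flink API): the user supplies only a value serializer, and records are produced with a null key.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Function;

/** Hypothetical sketch of a value-only record serializer: the key is always null. */
public class ValueOnlySerializationSketch<T> {
    /** Minimal stand-in for a produced record: (key bytes, value bytes). */
    public static final class Record {
        public final byte[] key;
        public final byte[] value;
        Record(byte[] key, byte[] value) { this.key = key; this.value = value; }
    }

    private final Function<T, byte[]> valueSerializer;

    public ValueOnlySerializationSketch(Function<T, byte[]> valueSerializer) {
        this.valueSerializer = valueSerializer;
    }

    public Record serialize(T element) {
        // No key schema configured: emit a null key and only serialize the value.
        return new Record(null, valueSerializer.apply(element));
    }

    /** Convenience entry point for the common string-value case. */
    public static ValueOnlySerializationSketch<String> stringValueOnly() {
        return new ValueOnlySerializationSketch<>(s -> s.getBytes(StandardCharsets.UTF_8));
    }
}
```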
[jira] [Assigned] (FLINK-23801) Add FLIP-33 metrics to KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23801: --- Assignee: Qingsheng Ren > Add FLIP-33 metrics to KafkaSource > -- > > Key: FLINK-23801 > URL: https://issues.apache.org/jira/browse/FLINK-23801 > Project: Flink > Issue Type: New Feature > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > KafkaSource should report FLIP-33 standard metrics for monitoring -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23801) Add FLIP-33 metrics to KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23801. - Resolution: Fixed > Add FLIP-33 metrics to KafkaSource > -- > > Key: FLINK-23801 > URL: https://issues.apache.org/jira/browse/FLINK-23801 > Project: Flink > Issue Type: New Feature > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > KafkaSource should report FLIP-33 standard metrics for monitoring -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23801) Add FLIP-33 metrics to KafkaSource
[ https://issues.apache.org/jira/browse/FLINK-23801?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402397#comment-17402397 ] Arvid Heise commented on FLINK-23801: - Merged into master as 0136489c23308992654fc15290e3207c23c91369..a40594f2d441af426e174fbf60825e0d50f9b2a0. > Add FLIP-33 metrics to KafkaSource > -- > > Key: FLINK-23801 > URL: https://issues.apache.org/jira/browse/FLINK-23801 > Project: Flink > Issue Type: New Feature > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > KafkaSource should report FLIP-33 standard metrics for monitoring -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23773) KafkaPartitionSplitReader should remove empty splits from fetcher
[ https://issues.apache.org/jira/browse/FLINK-23773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23773: --- Assignee: Qingsheng Ren > KafkaPartitionSplitReader should remove empty splits from fetcher > - > > Key: FLINK-23773 > URL: https://issues.apache.org/jira/browse/FLINK-23773 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0, 1.13.2 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0, 1.13.3 > > > Currently if a {{KafkaPartitionSplit}} is empty (startingOffset >= > stoppingOffset), split reader only unsubscribes it from consumer, but doesn't > remove it from SplitFetcher. This will lead to consumer complaining some > partitions are not subscribed while fetching. -- This message was sent by Atlassian Jira (v8.3.4#803005)
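The empty-split condition described in FLINK-23773 (startingOffset >= stoppingOffset) can be sketched as a filter step: such a split can never yield records, so it must be dropped from the fetcher entirely rather than merely unsubscribed from the consumer. The names below are illustrative, not Flink's actual KafkaPartitionSplit internals.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the empty-split check: a split whose starting offset has already
 *  reached its stopping offset produces no records and is finished immediately.
 *  Class and field names are illustrative. */
public class EmptySplitFilter {

    static final class Split {
        final String partition;
        final long startingOffset;
        final long stoppingOffset;

        Split(String partition, long startingOffset, long stoppingOffset) {
            this.partition = partition;
            this.startingOffset = startingOffset;
            this.stoppingOffset = stoppingOffset;
        }

        boolean isEmpty() {
            return startingOffset >= stoppingOffset;
        }
    }

    /** Keeps only splits that can still yield records; empty ones should be
     *  reported as finished instead of being handed to the fetcher. */
    static List<Split> nonEmptySplits(List<Split> assigned) {
        List<Split> live = new ArrayList<>();
        for (Split s : assigned) {
            if (!s.isEmpty()) {
                live.add(s);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<Split> assigned = new ArrayList<>();
        assigned.add(new Split("topic-0", 0, 100));    // live
        assigned.add(new Split("topic-1", 100, 100));  // empty: start == stop
        assigned.add(new Split("topic-2", 120, 100));  // empty: start > stop
        System.out.println(nonEmptySplits(assigned).size() + " live split(s)");
    }
}
```

The bug was precisely the mismatch between these two views: the consumer no longer subscribed to the empty partition while the fetcher still believed it owned the split.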
[jira] [Resolved] (FLINK-23838) Add FLIP-33 metrics to new KafkaSink
[ https://issues.apache.org/jira/browse/FLINK-23838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23838. - Resolution: Fixed Merged into master as 8e8a60a7afaba21272aaf2c2e12e1abfc2b12375..545fd7a9e306b2fb32532701bdc80314d315c7c6. > Add FLIP-33 metrics to new KafkaSink > > > Key: FLINK-23838 > URL: https://issues.apache.org/jira/browse/FLINK-23838 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23528) stop-with-savepoint can fail with FlinkKinesisConsumer
[ https://issues.apache.org/jira/browse/FLINK-23528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17402116#comment-17402116 ] Arvid Heise commented on FLINK-23528: - Merged into master as 273dce5b030e12dd3d7bebb2f51036a198d07112. > stop-with-savepoint can fail with FlinkKinesisConsumer > -- > > Key: FLINK-23528 > URL: https://issues.apache.org/jira/browse/FLINK-23528 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.11.3, 1.13.1, 1.12.4 >Reporter: Piotr Nowojski >Assignee: Arvid Heise >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > {{FlinkKinesisConsumer#cancel()}} (inside > {{KinesisDataFetcher#shutdownFetcher()}}) shouldn't be interrupting source > thread. Otherwise, as described in FLINK-23527, network stack can be left in > an invalid state and downstream tasks can encounter deserialisation errors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
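The fix direction stated in the ticket (don't interrupt the source thread from the fetcher shutdown) amounts to cooperative cancellation. A minimal plain-Java sketch of that pattern, with illustrative names rather than the actual KinesisDataFetcher internals:

```java
/** Cooperative-cancellation sketch: instead of interrupting the source thread
 *  (which can abort I/O mid-operation and leave downstream channels in an
 *  inconsistent state), flip a volatile flag the fetch loop checks at safe
 *  points. Names are illustrative. */
public class CooperativeShutdownSketch implements Runnable {

    private volatile boolean running = true;
    private int recordsEmitted;

    /** Called from another thread; note: no Thread.interrupt() involved. */
    public void shutdown() {
        running = false;
    }

    @Override
    public void run() {
        // Each iteration stands in for "fetch a batch and emit it"; the flag
        // is consulted only between batches, i.e. at a consistent point.
        while (running && recordsEmitted < 1_000_000) {
            recordsEmitted++;
        }
    }

    public int recordsEmitted() {
        return recordsEmitted;
    }

    public static void main(String[] args) throws InterruptedException {
        CooperativeShutdownSketch fetcher = new CooperativeShutdownSketch();
        Thread source = new Thread(fetcher, "source");
        source.start();
        fetcher.shutdown();  // loop exits after the in-flight "batch" finishes
        source.join();
        System.out.println("stopped cleanly after "
                + fetcher.recordsEmitted() + " record(s)");
    }
}
```

The design point is that the thread always stops at a record boundary, so no partially written network buffers are left behind for downstream tasks to deserialize.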
[jira] [Assigned] (FLINK-23854) KafkaSink error when restart from the checkpoint with a lower parallelism by exactly-once guarantee
[ https://issues.apache.org/jira/browse/FLINK-23854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23854: --- Assignee: Arvid Heise > KafkaSink error when restart from the checkpoint with a lower parallelism by > exactly-once guarantee > --- > > Key: FLINK-23854 > URL: https://issues.apache.org/jira/browse/FLINK-23854 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Hang Ruan >Assignee: Arvid Heise >Priority: Blocker > Labels: release-testing > > The KafkaSink throws the exception when restarted with a lower parallelism > and the exactly-once guarantee. The exception is like this. > {noformat} > java.lang.IllegalStateException: Internal error: It is expected that state > from previous executions is distributed to the same subtask id. > at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) > at > org.apache.flink.connector.kafka.sink.KafkaWriter.recoverAndInitializeState(KafkaWriter.java:178) > > at > org.apache.flink.connector.kafka.sink.KafkaWriter.(KafkaWriter.java:130) > > at > org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:99) > > at > org.apache.flink.streaming.runtime.operators.sink.SinkOperator.initializeState(SinkOperator.java:134) > > at > org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:118) > > at > org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286) > > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:109) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:690) > > at > org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) > > at > 
org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:666) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:785) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:638) > > at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) > at org.apache.flink.runtime.taskmanager.Task.run(Task.java:572) > at java.lang.Thread.run(Thread.java:748) > Suppressed: java.lang.NullPointerException > at > org.apache.flink.streaming.runtime.operators.sink.SinkOperator.close(SinkOperator.java:195) > > at > org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141) > > at > org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1028) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1014) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:927) > > at > org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:797) > ... 4 more > {noformat} > I start the kafka cluster(kafka_2.13-2.8.0) and the flink cluster in my own > mac. I change the parallelism from 4 to 2 and restart the job from some > completed checkpoint. Then the error occurs. > And the cli command and the code are as follows. 
> {code:java} > // cli command > ./bin/flink run -d -c com.test.KafkaExactlyOnceScaleDownTest -s > /Users/test/checkpointDir/ExactlyOnceTest1/67105fcc1724e147fc6208af0dd90618/chk-1 > /Users/test/project/self/target/test.jar > {code} > {code:java} > public class KafkaExactlyOnceScaleDownTest { > public static void main(String[] args) throws Exception { > final String kafkaSourceTopic = "flinkSourceTest"; > final String kafkaSinkTopic = "flinkSinkExactlyTest1"; > final String groupId = "ExactlyOnceTest1"; > final String brokers = "localhost:9092"; > final String ckDir = "file:///Users/test/checkpointDir/" + groupId; > final StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.enableCheckpointing(6); > > env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); > > env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); > env.getCheckpointConfig().setCheckpointStorage(ckDir); > env.setParallelism(4); > KafkaSource source =
[jira] [Commented] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401591#comment-17401591 ] Arvid Heise commented on FLINK-23796: - On a related note, shouldn't Flink more gracefully exit on OOM? I guess it's a matter of test setup in this case, right? 137 is usually the return code when the process is killed by another process (e.g. Yarn container). > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
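The exit code 137 discussed in this comment follows the POSIX convention that a process terminated by signal N reports exit status 128 + N; 137 therefore means SIGKILL (9), consistent with an external kill such as the kernel OOM killer or a container manager. A small demonstration (assumes a Unix-like OS with a `sleep` binary on the PATH):

```java
/** Decodes "Process Exit Code: 137": on POSIX systems, exit codes above 128
 *  conventionally encode 128 + signal number, so 137 means SIGKILL (9). */
public class ExitCode137 {

    /** Returns the terminating signal number, or 0 for a normal exit. */
    static int signalFromExitCode(int exitCode) {
        return exitCode > 128 ? exitCode - 128 : 0;
    }

    public static void main(String[] args) throws Exception {
        Process child = new ProcessBuilder("sleep", "30").start();
        child.destroyForcibly();   // delivers SIGKILL on POSIX platforms
        int code = child.waitFor();
        System.out.println("exit code " + code
                + " => signal " + signalFromExitCode(code));
    }
}
```

This also explains why the JVM cannot "exit gracefully" in this situation: SIGKILL cannot be caught or handled by the process.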
[jira] [Commented] (FLINK-23796) UnalignedCheckpointRescaleITCase JVM crash on Azure
[ https://issues.apache.org/jira/browse/FLINK-23796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401588#comment-17401588 ] Arvid Heise commented on FLINK-23796: - I have merged FLINK-23776, let's check back if this issue persists. > UnalignedCheckpointRescaleITCase JVM crash on Azure > --- > > Key: FLINK-23796 > URL: https://issues.apache.org/jira/browse/FLINK-23796 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=39d5b1d5-3b41-54dc-6458-1e2ddd1cdcf3=0c010d0c-3dec-5bf1-d408-7b18988b1b2b=5182 > {code} > Aug 16 01:03:17 [ERROR] Failed to execute goal > org.apache.maven.plugins:maven-surefire-plugin:2.22.2:test > (integration-tests) on project flink-tests: There are test failures. > Aug 16 01:03:17 [ERROR] > Aug 16 01:03:17 [ERROR] Please refer to > /__w/1/s/flink-tests/target/surefire-reports for the individual test results. > Aug 16 01:03:17 [ERROR] Please refer to dump files (if any exist) > [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. > Aug 16 01:03:17 [ERROR] ExecutionException The forked VM terminated without > properly saying goodbye. VM crash or System.exit called? 
> Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > Aug 16 01:03:17 [ERROR] Crashed tests: > Aug 16 01:03:17 [ERROR] > org.apache.flink.test.checkpointing.UnalignedCheckpointRescaleITCase > Aug 16 01:03:17 [ERROR] > org.apache.maven.surefire.booter.SurefireBooterForkException: > ExecutionException The forked VM terminated without properly saying goodbye. > VM crash or System.exit called? > Aug 16 01:03:17 [ERROR] Command was /bin/sh -c cd /__w/1/s/flink-tests/target > && /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java -Xms256m -Xmx2048m > -Dmvn.forkNumber=1 -XX:+UseG1GC -jar > /__w/1/s/flink-tests/target/surefire/surefirebooter4077406518503777021.jar > /__w/1/s/flink-tests/target/surefire 2021-08-15T23-59-56_973-jvmRun1 > surefire4438021626717472043tmp surefire_1445134621790231688950tmp > Aug 16 01:03:17 [ERROR] Error occurred in starting fork, check output in log > Aug 16 01:03:17 [ERROR] Process Exit Code: 137 > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23776) Performance regression on 14.08.2021 in FLIP-27
[ https://issues.apache.org/jira/browse/FLINK-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23776. - Resolution: Fixed Merged into master as 976026b63175e63009749936416fc9581a0bf56b. > Performance regression on 14.08.2021 in FLIP-27 > --- > > Key: FLINK-23776 > URL: https://issues.apache.org/jira/browse/FLINK-23776 > Project: Flink > Issue Type: Bug > Components: API / DataStream, Benchmarks >Affects Versions: 1.14.0 >Reporter: Piotr Nowojski >Assignee: Arvid Heise >Priority: Blocker > Labels: pull-request-available > Fix For: 1.14.0 > > > http://codespeed.dak8s.net:8000/timeline/?ben=mapSink.F27_UNBOUNDED=2 > http://codespeed.dak8s.net:8000/timeline/?ben=mapRebalanceMapSink.F27_UNBOUNDED=2 > {noformat} > git ls 7b60a964b1..7f3636f6b4 > 7f3636f6b4f [2 days ago] [FLINK-23652][connectors] Adding common source > metrics. [Arvid Heise] > 97c8f72b813 [3 months ago] [FLINK-23652][connectors] Adding common sink > metrics. [Arvid Heise] > 48da20e8f88 [3 months ago] [FLINK-23652][test] Adding InMemoryMetricReporter > and using it by default in MiniClusterResource. [Arvid Heise] > 63ee60859ca [3 months ago] [FLINK-23652][core/metrics] Extract > Operator(IO)MetricGroup interfaces and expose them in RuntimeContext [Arvid > Heise] > 5d5e39b614b [2 days ago] [refactor][connectors] Only use > MockSplitReader.Builder for instantiation. [Arvid Heise] > b927035610c [3 months ago] [refactor][core] Extract common context creation > in CollectionExecutor [Arvid Heise] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23854) KafkaSink error when restart from the checkpoint with a lower parallelism by exactly-once guarantee
[ https://issues.apache.org/jira/browse/FLINK-23854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23854: Description: The KafkaSink throws the exception when restarted with a lower parallelism and the exactly-once guarantee. The exception is like this. {noformat} java.lang.IllegalStateException: Internal error: It is expected that state from previous executions is distributed to the same subtask id. at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) at org.apache.flink.connector.kafka.sink.KafkaWriter.recoverAndInitializeState(KafkaWriter.java:178) at org.apache.flink.connector.kafka.sink.KafkaWriter.(KafkaWriter.java:130) at org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:99) at org.apache.flink.streaming.runtime.operators.sink.SinkOperator.initializeState(SinkOperator.java:134) at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:118) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:109) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:690) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:666) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:785) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:638) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:572) at java.lang.Thread.run(Thread.java:748) Suppressed: java.lang.NullPointerException at 
org.apache.flink.streaming.runtime.operators.sink.SinkOperator.close(SinkOperator.java:195) at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127) at org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1028) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1014) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:927) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:797) ... 4 more {noformat} I start the kafka cluster(kafka_2.13-2.8.0) and the flink cluster in my own mac. I change the parallelism from 4 to 2 and restart the job from some completed checkpoint. Then the error occurs. And the cli command and the code are as follows. {code:java} // cli command ./bin/flink run -d -c com.test.KafkaExactlyOnceScaleDownTest -s /Users/test/checkpointDir/ExactlyOnceTest1/67105fcc1724e147fc6208af0dd90618/chk-1 /Users/test/project/self/target/test.jar {code} {code:java} public class KafkaExactlyOnceScaleDownTest { public static void main(String[] args) throws Exception { final String kafkaSourceTopic = "flinkSourceTest"; final String kafkaSinkTopic = "flinkSinkExactlyTest1"; final String groupId = "ExactlyOnceTest1"; final String brokers = "localhost:9092"; final String ckDir = "file:///Users/test/checkpointDir/" + groupId; final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(6); env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); env.getCheckpointConfig().setCheckpointStorage(ckDir); env.setParallelism(4); KafkaSource source = 
KafkaSource.builder() .setBootstrapServers(brokers) .setTopics(kafkaSourceTopic) .setGroupId(groupId) .setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest") .setValueOnlyDeserializer(new SimpleStringSchema()) .build(); DataStream flintstones = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source"); DataStream adults = flintstones.filter(s -> s != null && s.length() > 2); Properties props = new Properties(); props.setProperty("transaction.timeout.ms", "90"); adults.sinkTo(KafkaSink.builder() .setBootstrapServers(brokers) .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE) .setTransactionalIdPrefix("tp-test-") .setKafkaProducerConfig(props)
[jira] [Updated] (FLINK-23854) KafkaSink error when restart from the checkpoint with a lower parallelism by exactly-once guarantee
[ https://issues.apache.org/jira/browse/FLINK-23854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23854: Description: The KafkaSink throws the exception when restarted with a lower parallelism and the exactly-once guarantee. The exception is like this. {noformat} java.lang.IllegalStateException: Internal error: It is expected that state from previous executions is distributed to the same subtask id. at org.apache.flink.util.Preconditions.checkState(Preconditions.java:193) at org.apache.flink.connector.kafka.sink.KafkaWriter.recoverAndInitializeState(KafkaWriter.java:178) at org.apache.flink.connector.kafka.sink.KafkaWriter.(KafkaWriter.java:130) at org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:99) at org.apache.flink.streaming.runtime.operators.sink.SinkOperator.initializeState(SinkOperator.java:134) at org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:118) at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:286) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:109) at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:690) at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55) at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:666) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:785) at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:638) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:572) at java.lang.Thread.run(Thread.java:748)Suppressed: java.lang.NullPointerException at 
org.apache.flink.streaming.runtime.operators.sink.SinkOperator.close(SinkOperator.java:195) at org.apache.flink.streaming.runtime.tasks.StreamOperatorWrapper.close(StreamOperatorWrapper.java:141) at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.closeAllOperators(RegularOperatorChain.java:127) at org.apache.flink.streaming.runtime.tasks.StreamTask.closeAllOperators(StreamTask.java:1028) at org.apache.flink.streaming.runtime.tasks.StreamTask.runAndSuppressThrowable(StreamTask.java:1014) at org.apache.flink.streaming.runtime.tasks.StreamTask.cleanUpInvoke(StreamTask.java:927) at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:797) ... 4 more {noformat} I start the kafka cluster(kafka_2.13-2.8.0) and the flink cluster in my own mac. I change the parallelism from 4 to 2 and restart the job from some completed checkpoint. Then the error occurs. And the cli command and the code are as follows. {code:java} // cli command ./bin/flink run -d -c com.test.KafkaExactlyOnceScaleDownTest -s /Users/test/checkpointDir/ExactlyOnceTest1/67105fcc1724e147fc6208af0dd90618/chk-1 /Users/test/project/self/target/test.jar {code} {code:java} public class KafkaExactlyOnceScaleDownTest { public static void main(String[] args) throws Exception { final String kafkaSourceTopic = "flinkSourceTest"; final String kafkaSinkTopic = "flinkSinkExactlyTest1"; final String groupId = "ExactlyOnceTest1"; final String brokers = "localhost:9092"; final String ckDir = "file:///Users/test/checkpointDir/" + groupId; final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(6); env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE); env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); env.getCheckpointConfig().setCheckpointStorage(ckDir); env.setParallelism(4); KafkaSource source = 
KafkaSource.builder() .setBootstrapServers(brokers) .setTopics(kafkaSourceTopic) .setGroupId(groupId) .setProperty(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "earliest") .setValueOnlyDeserializer(new SimpleStringSchema()) .build(); DataStream flintstones = env.fromSource(source, WatermarkStrategy.noWatermarks(), "Kafka Source"); DataStream adults = flintstones.filter(s -> s != null && s.length() > 2); Properties props = new Properties(); props.setProperty("transaction.timeout.ms", "90"); adults.sinkTo(KafkaSink.builder() .setBootstrapServers(brokers) .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE) .setTransactionalIdPrefix("tp-test-") .setKafkaProducerConfig(props)
[jira] [Assigned] (FLINK-23528) stop-with-savepoint can fail with FlinkKinesisConsumer
[ https://issues.apache.org/jira/browse/FLINK-23528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23528: --- Assignee: Arvid Heise > stop-with-savepoint can fail with FlinkKinesisConsumer > -- > > Key: FLINK-23528 > URL: https://issues.apache.org/jira/browse/FLINK-23528 > Project: Flink > Issue Type: Bug > Components: Connectors / Kinesis >Affects Versions: 1.11.3, 1.13.1, 1.12.4 >Reporter: Piotr Nowojski >Assignee: Arvid Heise >Priority: Critical > Labels: pull-request-available > Fix For: 1.14.0 > > > {{FlinkKinesisConsumer#cancel()}} (inside > {{KinesisDataFetcher#shutdownFetcher()}}) shouldn't be interrupting source > thread. Otherwise, as described in FLINK-23527, network stack can be left in > an invalid state and downstream tasks can encounter deserialisation errors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-20731) Pulsar Source
[ https://issues.apache.org/jira/browse/FLINK-20731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17401025#comment-17401025 ] Arvid Heise commented on FLINK-20731: - [~syhily] just to clarify: It would be awesome if you could add the documentation as a markdown in flink /docs folder similar to https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/datastream/kafka/#kafka-source. For the testing, it is enough to create a ticket in this JIRA; maybe as a subtask of FLINK-20726 and let someone else do the testing. Release testing is meant to check if docs and JavaDoc are sufficient for someone else to write a Pulsar-Flink application and it works as expected. Of course, that testing can only happen after you add the docs. > Pulsar Source > - > > Key: FLINK-20731 > URL: https://issues.apache.org/jira/browse/FLINK-20731 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Affects Versions: 1.13.0 >Reporter: Jianyun Zhao >Assignee: Yufan Sheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > This is our implementation based on FLIP-27. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23858) Clarify StreamRecord#timestamp.
[ https://issues.apache.org/jira/browse/FLINK-23858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23858: Description: The new Source apparently changed the way we specify records without timestamps. Previously, we used separate methods to create and maintain timestamp-less records. Now, we are shifting towards using a special value (TimeStampAssigner#NO_TIMESTAMP). We first of all need to document that somewhere; at the very least in the JavaDoc of StreamRecord. We should also revise the consequences: - Do we want to encode it in the {{StreamElementSerializer}}? Currently, we use a flag to indicate no-timestamp on the old path but in the new path we now use 9 bytes to encode NO_TIMESTAMP. - We should check if all code-paths deal with `hasTimestamp() == true && getTimestamp() == TimeStampAssigner#NO_TIMESTAMP`, in particular with sinks. Do we want to deprecate `hasTimestamp` and related methods? was: The new Source apparently changed the way we specify records without timestamps. Previously, we used separate methods to create and maintain timestamp-less records. Now, we are shifting towards using a special value (TimeStampAssigner#NO_TIMESTAMP). We first of all need to document that somewhere; at the very least in the JavaDoc of StreamRecord. We should also revise the consequences: - Do we want to encode it in the {{StreamElementSerializer}}? Currently, we use a flag to indicate no-timestamp on the old path but in the new path we now use 9 bytes to encode NO_TIMESTAMP. - We should check if all code-paths deal with `hasTimestamp() == true && getTimestamp() == TimeStampAssigner#NO_TIMESTAMP`, in particular with sinks. > Clarify StreamRecord#timestamp. 
> --- > > Key: FLINK-23858 > URL: https://issues.apache.org/jira/browse/FLINK-23858 > Project: Flink > Issue Type: Technical Debt > Components: Runtime / Network >Reporter: Arvid Heise >Priority: Major > > > The new Source apparently changed the way we specify records without > timestamps. Previously, we used separate methods to create and maintain > timestamp-less records. > Now, we are shifting towards using a special value > (TimeStampAssigner#NO_TIMESTAMP). > We first of all need to document that somewhere; at the very least in the > JavaDoc of StreamRecord. > We should also revise the consequences: > - Do we want to encode it in the {{StreamElementSerializer}}? Currently, we > use a flag to indicate no-timestamp on the old path but in the new path we > now use 9 bytes to encode NO_TIMESTAMP. > - We should check if all code-paths deal with `hasTimestamp() == true && > getTimestamp() == TimeStampAssigner#NO_TIMESTAMP`, in particular with sinks. > Do we want to deprecate `hasTimestamp` and related methods? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23858) Clarify StreamRecord#timestamp.
Arvid Heise created FLINK-23858: --- Summary: Clarify StreamRecord#timestamp. Key: FLINK-23858 URL: https://issues.apache.org/jira/browse/FLINK-23858 Project: Flink Issue Type: Technical Debt Components: Runtime / Network Reporter: Arvid Heise The new Source apparently changed the way we specify records without timestamps. Previously, we used separate methods to create and maintain timestamp-less records. Now, we are shifting towards using a special value (TimeStampAssigner#NO_TIMESTAMP). We first of all need to document that somewhere; at the very least in the JavaDoc of StreamRecord. We should also revise the consequences: - Do we want to encode it in the {{StreamElementSerializer}}? Currently, we use a flag to indicate no-timestamp on the old path but in the new path we now use 9 bytes to encode NO_TIMESTAMP. - We should check if all code-paths deal with `hasTimestamp() == true && getTimestamp() == TimeStampAssigner#NO_TIMESTAMP`, in particular with sinks. -- This message was sent by Atlassian Jira (v8.3.4#803005)
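The serialization trade-off raised in this ticket (a flag byte versus always writing 9 bytes for a missing timestamp) can be made concrete with in-memory streams. The sentinel value and byte layout here are illustrative, not Flink's actual StreamElementSerializer wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

/** Sketch of the two encodings: a 1-byte flag for "no timestamp" versus a tag
 *  byte plus an 8-byte sentinel long (9 bytes total). Layout is illustrative. */
public class TimestampEncodingSketch {

    static final long NO_TIMESTAMP = Long.MIN_VALUE; // hypothetical sentinel

    /** Old-style path: flag byte, and the long only when a timestamp exists. */
    static byte[] encodeWithFlag(long ts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (ts == NO_TIMESTAMP) {
                out.writeByte(0);        // 1 byte total
            } else {
                out.writeByte(1);
                out.writeLong(ts);       // 9 bytes total
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new AssertionError(e); // in-memory streams cannot fail
        }
    }

    /** New-style path: the sentinel is written like any other value. */
    static byte[] encodeAlways(long ts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeByte(1);
            out.writeLong(ts);           // always 9 bytes, even for NO_TIMESTAMP
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new AssertionError(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("flag encoding, no timestamp:   "
                + encodeWithFlag(NO_TIMESTAMP).length + " byte(s)");
        System.out.println("always encoding, no timestamp: "
                + encodeAlways(NO_TIMESTAMP).length + " byte(s)");
    }
}
```

This shows the cost the ticket mentions: for timestamp-less records, the sentinel approach spends 9 bytes where the flag approach spends 1.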
[jira] [Resolved] (FLINK-23674) flink restart with checkpoint ,kafka producer throw exception
[ https://issues.apache.org/jira/browse/FLINK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23674. - Resolution: Duplicate > flink restart with checkpoint ,kafka producer throw exception > -- > > Key: FLINK-23674 > URL: https://issues.apache.org/jira/browse/FLINK-23674 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.1 > Environment: flink:flink-1.13.1 > kafka: _2.12-2.5.0 > java: 1.8.0_161 >Reporter: meetsong >Priority: Major > > > When I test Flink EOS with a Kafka sink: first I click the cancel button on the Flink > web UI, then I run the following command on the console > {code:java} > bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s > file:/soft/opt/checkpoint/072c0a72343c6e1f06b9bd37c5147cc0/chk-1/_metadata > ./ad-live-process-0.11-jar-with-dependencies.jar > {code} > ; after 10 seconds it throws an exception > {code:java} > Caused by: org.apache.kafka.common.KafkaException: Unexpected error in > InitProducerIdResponse; Producer attempted an operation with an old epoch. > Either there is a newer producer with the same transactionalId, or the > producer's transaction has been expired by the broker.
> {code} > and my code is : > {code:java} > package com.shanjiancaofu.live.job; > import com.alibaba.fastjson.JSON; > import lombok.AllArgsConstructor; > import lombok.Data; > import lombok.NoArgsConstructor; > import lombok.extern.slf4j.Slf4j; > import org.apache.commons.lang.SystemUtils; > import org.apache.flink.api.common.restartstrategy.RestartStrategies; > import org.apache.flink.api.common.serialization.SimpleStringSchema; > import org.apache.flink.api.common.state.ListState; > import org.apache.flink.api.common.state.ListStateDescriptor; > import org.apache.flink.api.common.time.Time; > import org.apache.flink.api.common.typeinfo.TypeHint; > import org.apache.flink.api.common.typeinfo.TypeInformation; > import org.apache.flink.api.java.functions.KeySelector; > import org.apache.flink.configuration.Configuration; > import org.apache.flink.runtime.state.filesystem.FsStateBackend; > import org.apache.flink.streaming.api.CheckpointingMode; > import org.apache.flink.streaming.api.TimeCharacteristic; > import org.apache.flink.streaming.api.environment.CheckpointConfig; > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > import org.apache.flink.streaming.api.functions.KeyedProcessFunction; > import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer; > import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer; > import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema; > import > org.apache.flink.streaming.connectors.kafka.internals.KeyedSerializationSchemaWrapper; > import org.apache.flink.util.Collector; > import org.apache.kafka.clients.consumer.ConsumerConfig; > import org.apache.kafka.clients.consumer.ConsumerRecord; > import org.apache.kafka.clients.producer.ProducerConfig; > import org.apache.kafka.common.IsolationLevel; > import java.util.*; > @Slf4j > public class ChargeJob1 { >static class RecordScheme implements > KafkaDeserializationSchema> { > @Override > public boolean 
isEndOfStream(ConsumerRecord > stringUserEventConsumerRecord) { > return false; > } > @Override > public ConsumerRecord > deserialize(ConsumerRecord consumerRecord) throws Exception { > String key = null; > UserEvent UserEvent = null; > if (consumerRecord.key() != null) { > key = new String(consumerRecord.key()); > } > if (consumerRecord.value() != null) { > UserEvent = JSON.parseObject(new String(consumerRecord.value()), > UserEvent.class); > } > return new ConsumerRecord<>( > consumerRecord.topic(), > consumerRecord.partition(), > consumerRecord.offset(), > consumerRecord.timestamp(), > consumerRecord.timestampType(), > consumerRecord.checksum(), > consumerRecord.serializedKeySize(), > consumerRecord.serializedValueSize(), > key, UserEvent); > } > @Override > public TypeInformation> > getProducedType() { > return TypeInformation.of(new TypeHint UserEvent>>() { > }); > } >} >public static void main(String[] args) throws Exception { > Configuration configuration = new Configuration(); > if (args != null) { > // 传递全局参数 > configuration.setString("args", String.join(" ", args)); > } > StreamExecutionEnvironment env = > StreamExecutionEnvironment.getExecutionEnvironment(); > env.setParallelism(3); >
[jira] [Commented] (FLINK-23674) flink restart with checkpoint ,kafka producer throw exception
[ https://issues.apache.org/jira/browse/FLINK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400147#comment-17400147 ] Arvid Heise commented on FLINK-23674: - This seems to be a duplicate of FLINK-23509. The fix is unfortunately not released yet. You could try out flink-connector-kafka-1.13-SNAPSHOT.jar and see if this solves your issue. See also https://stackoverflow.com/questions/39702621/how-to-import-apache-flink-snapshot-artifacts . > flink restart with checkpoint ,kafka producer throw exception > -- > > Key: FLINK-23674 > URL: https://issues.apache.org/jira/browse/FLINK-23674 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.1 > Environment: flink:flink-1.13.1 > kafka: _2.12-2.5.0 > java: 1.8.0_161 >Reporter: meetsong >Priority: Major > > > When I test Flink EOS with a Kafka sink: first I click the cancel button on the Flink > web UI, then I run the following command on the console > {code:java} > bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s > file:/soft/opt/checkpoint/072c0a72343c6e1f06b9bd37c5147cc0/chk-1/_metadata > ./ad-live-process-0.11-jar-with-dependencies.jar > {code} > ; after 10 seconds it throws an exception > {code:java} > Caused by: org.apache.kafka.common.KafkaException: Unexpected error in > InitProducerIdResponse; Producer attempted an operation with an old epoch. > Either there is a newer producer with the same transactionalId, or the > producer's transaction has been expired by the broker.
> {code} > and my code is : > {code:java} > package com.shanjiancaofu.live.job; > import com.alibaba.fastjson.JSON; > import lombok.AllArgsConstructor; > import lombok.Data; > import lombok.NoArgsConstructor; > import lombok.extern.slf4j.Slf4j; > import org.apache.commons.lang.SystemUtils; > import org.apache.flink.api.common.restartstrategy.RestartStrategies; > import org.apache.flink.api.common.serialization.SimpleStringSchema; > import org.apache.flink.api.common.state.ListState; > import org.apache.flink.api.common.state.ListStateDescriptor; > import org.apache.flink.api.common.time.Time; > import org.apache.flink.api.common.typeinfo.TypeHint; > import org.apache.flink.api.common.typeinfo.TypeInformation; > import org.apache.flink.api.java.functions.KeySelector; > import org.apache.flink.configuration.Configuration; > import org.apache.flink.runtime.state.filesystem.FsStateBackend; > import org.apache.flink.streaming.api.CheckpointingMode; > import org.apache.flink.streaming.api.TimeCharacteristic; > import org.apache.flink.streaming.api.environment.CheckpointConfig; > import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment; > import org.apache.flink.streaming.api.functions.KeyedProcessFunction; > import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer; > import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer; > import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema; > import > org.apache.flink.streaming.connectors.kafka.internals.KeyedSerializationSchemaWrapper; > import org.apache.flink.util.Collector; > import org.apache.kafka.clients.consumer.ConsumerConfig; > import org.apache.kafka.clients.consumer.ConsumerRecord; > import org.apache.kafka.clients.producer.ProducerConfig; > import org.apache.kafka.common.IsolationLevel; > import java.util.*; > @Slf4j > public class ChargeJob1 { >static class RecordScheme implements > KafkaDeserializationSchema> { > @Override > public boolean 
isEndOfStream(ConsumerRecord > stringUserEventConsumerRecord) { > return false; > } > @Override > public ConsumerRecord > deserialize(ConsumerRecord consumerRecord) throws Exception { > String key = null; > UserEvent UserEvent = null; > if (consumerRecord.key() != null) { > key = new String(consumerRecord.key()); > } > if (consumerRecord.value() != null) { > UserEvent = JSON.parseObject(new String(consumerRecord.value()), > UserEvent.class); > } > return new ConsumerRecord<>( > consumerRecord.topic(), > consumerRecord.partition(), > consumerRecord.offset(), > consumerRecord.timestamp(), > consumerRecord.timestampType(), > consumerRecord.checksum(), > consumerRecord.serializedKeySize(), > consumerRecord.serializedValueSize(), > key, UserEvent); > } > @Override > public TypeInformation> > getProducedType() { > return TypeInformation.of(new TypeHint UserEvent>>() { > }); > } >} >public static void main(String[] args) throws Exception { > Configuration
[jira] [Comment Edited] (FLINK-23674) flink restart with checkpoint ,kafka producer throw exception
[ https://issues.apache.org/jira/browse/FLINK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400072#comment-17400072 ] Arvid Heise edited comment on FLINK-23674 at 8/17/21, 5:31 AM: --- Of course {noformat} [root@kafka3 flink-1.13.1]# bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s file:/soft/opt/checkpoint/0d294a9fbed5da53390d048db00d557d/chk-44/_metadata ./ad-live-process-0.11-jar-with-dependencies.jar[root@kafka3 flink-1.13.1]# bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s file:/soft/opt/checkpoint/0d294a9fbed5da53390d048db00d557d/chk-44/_metadata ./ad-live-process-0.11-jar-with-dependencies.jar . ___ _ _ /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ \\/ ___)| |_)| | | | | || (_| | ) ) ) ) ' || .__|_| |_|_| |_\__, | / / / / =|_|==|___/=/_/_/_/ :: Spring Boot :: Job has been submitted with JobID a5859d58669b2ad5ec3798086526e64e The program finished with the following exception: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246) at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054) at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132) at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132) Caused by: 
java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:123) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1834) at com.shanjiancaofu.live.job.ChargeJob.main(ChargeJob.java:155) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) ... 
8 more Caused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125) at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:394) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.client.program.rest.RestClusterClient.lambda$pollResourceAsync$24(RestClusterClient.java:670) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:394) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at
[jira] [Comment Edited] (FLINK-23674) flink restart with checkpoint ,kafka producer throw exception
[ https://issues.apache.org/jira/browse/FLINK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17400072#comment-17400072 ] Arvid Heise edited comment on FLINK-23674 at 8/17/21, 5:30 AM: --- Of course {noformat} [root@kafka3 flink-1.13.1]# bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s file:/soft/opt/checkpoint/0d294a9fbed5da53390d048db00d557d/chk-44/_metadata ./ad-live-process-0.11-jar-with-dependencies.jar[root@kafka3 flink-1.13.1]# bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s file:/soft/opt/checkpoint/0d294a9fbed5da53390d048db00d557d/chk-44/_metadata ./ad-live-process-0.11-jar-with-dependencies.jar . _ __ _ _ /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \ \\/ ___)| |_)| | | | | || (_| | ) ) ) ) ' || .__|_| |_|_| |_\__, | / / / / =|_|==|___/=/_/_/_/ :: Spring Boot :: Job has been submitted with JobID a5859d58669b2ad5ec3798086526e64e The program finished with the following exception: org.apache.flink.client.program.ProgramInvocationException: The main method caused an error: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:372) at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:222) at org.apache.flink.client.ClientUtils.executeProgram(ClientUtils.java:114) at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:812) at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:246) at org.apache.flink.client.cli.CliFrontend.parseAndRun(CliFrontend.java:1054) at org.apache.flink.client.cli.CliFrontend.lambda$main$10(CliFrontend.java:1132) at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1132)Caused by: 
java.util.concurrent.ExecutionException: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) at org.apache.flink.client.program.StreamContextEnvironment.getJobExecutionResult(StreamContextEnvironment.java:123) at org.apache.flink.client.program.StreamContextEnvironment.execute(StreamContextEnvironment.java:80) at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.execute(StreamExecutionEnvironment.java:1834) at com.shanjiancaofu.live.job.ChargeJob.main(ChargeJob.java:155) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:355) ... 
8 moreCaused by: org.apache.flink.client.program.ProgramInvocationException: Job failed (JobID: a5859d58669b2ad5ec3798086526e64e) at org.apache.flink.client.deployment.ClusterClientJobClientAdapter.lambda$null$6(ClusterClientJobClientAdapter.java:125) at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602) at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:394) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.client.program.rest.RestClusterClient.lambda$pollResourceAsync$24(RestClusterClient.java:670) at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:760) at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:736) at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474) at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1962) at org.apache.flink.runtime.concurrent.FutureUtils.lambda$retryOperationWithDelay$9(FutureUtils.java:394) at
[jira] [Resolved] (FLINK-23710) Move sink to org.apache.flink.connector.kafka.sink package
[ https://issues.apache.org/jira/browse/FLINK-23710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23710. - Fix Version/s: 1.14.0 Resolution: Fixed Merged into master as 94da2b587c133621bbdf39d36d55070a64605f56..6c9818323b41a84137c52822d2993df788dbc9bb. > Move sink to org.apache.flink.connector.kafka.sink package > > > Key: FLINK-23710 > URL: https://issues.apache.org/jira/browse/FLINK-23710 > Project: Flink > Issue Type: Sub-task >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The FLIP-27 source for Kafka is already placed under > org.apache.flink.connector.kafka.source. We should also relocate the FLIP-143 sink > to keep it consistent and to ease the deprecation of the old connector.
[jira] [Resolved] (FLINK-20731) Pulsar Source
[ https://issues.apache.org/jira/browse/FLINK-20731?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-20731. - Fix Version/s: 1.14.0 Release Note: Flink now directly provides a way to read data from Pulsar with DataStream API. Assignee: Yufan Sheng Resolution: Fixed Merged into master as c675f786c51038801161e861826d1c54654f0dde. > Pulsar Source > - > > Key: FLINK-20731 > URL: https://issues.apache.org/jira/browse/FLINK-20731 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Affects Versions: 1.13.0 >Reporter: Jianyun Zhao >Assignee: Yufan Sheng >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > This is our implementation based on FLIP-27. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23817) Write documentation for standardized operator metrics
Arvid Heise created FLINK-23817: --- Summary: Write documentation for standardized operator metrics Key: FLINK-23817 URL: https://issues.apache.org/jira/browse/FLINK-23817 Project: Flink Issue Type: Sub-task Components: Connectors / Common Reporter: Arvid Heise Fix For: 1.14.0 Incorporate the metrics into the connector page. Use [data templates|https://gohugo.io/templates/data-templates/] for common metrics.
[jira] [Commented] (FLINK-23647) UnalignedCheckpointStressITCase crashed on azure
[ https://issues.apache.org/jira/browse/FLINK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399936#comment-17399936 ] Arvid Heise commented on FLINK-23647: - > I think it's still a bit fragile: > 1. It assumes that the latest folder with _metadata file won't be removed; > but I think it can be removed if adding to checkpointStore fails I haven't thought about that one - or rather I didn't know it. When does it happen? > 2. The assumption is easy to violate - by reading attrs of some previous > checkpoint (which can be subsumed concurrently) There is no harm in checking a file for existence - even if it has been subsumed. > 3. I couldn't find any docs about the atomicity of listFiles - could you list > any reference? The API does not allow any exceptions at all. If you check the implementation, you can also see that no exceptions are thrown: http://hg.openjdk.java.net/jdk/jdk11/file/1ddf9a99e4ad/src/java.base/unix/native/libjava/UnixFileSystem_md.c#l310 > 4. Reading the files while they are being deleted can prevent deletion Good thing that we are not reading them then unless we are sure that this is the correct checkpoint directory. > UnalignedCheckpointStressITCase crashed on azure > > > Key: FLINK-23647 > URL: https://issues.apache.org/jira/browse/FLINK-23647 > Project: Flink > Issue Type: Bug > Components: Runtime / Network, Tests >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21539=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=4855 > When testing DFS changelog implementation in FLINK-23279 and enabling it for > all tests, > UnalignedCheckpointStressITCase crashed with the following exception > {code} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 18.433 s <<< FAILURE!
- in > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase > [ERROR] > runStressTest(org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase) > Time elapsed: 17.663 s <<< ERROR! > java.io.UncheckedIOException: java.nio.file.NoSuchFileException: > /tmp/junit7860347244680665820/435237 d57439f2ceadfedba74dadd6fa/chk-16 >at > java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88) >at java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104) >at java.util.Iterator.forEachRemaining(Iterator.java:115) >at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) >at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) >at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) >at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) >at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) >at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:288) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:261) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:157) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:498) >at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) >at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) >at > 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) >at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) >at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) >at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) >at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) >at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) >at >
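The discovery strategy debated in the comment above can be sketched as follows. This is not the ITCase's actual code; the layout (chk-<id> directories containing a _metadata file) mirrors Flink's retained-checkpoint convention, and only completed checkpoints - those whose _metadata already exists - are considered, so checking a concurrently subsumed checkpoint is harmless:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

public class CheckpointDiscovery {
    // Returns the chk-* directory with the highest id whose _metadata file
    // exists, i.e. the latest checkpoint known to be complete.
    static Optional<Path> latestCompletedCheckpoint(Path checkpointBase) {
        try (Stream<Path> dirs = Files.list(checkpointBase)) {
            return dirs
                    .filter(p -> p.getFileName().toString().startsWith("chk-"))
                    // harmless existence check, even if the checkpoint is
                    // being subsumed concurrently
                    .filter(p -> Files.exists(p.resolve("_metadata")))
                    .max(Comparator.comparingLong(
                            p -> Long.parseLong(p.getFileName().toString().substring(4))));
        } catch (IOException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("ckpt-demo");
        Files.createDirectories(base.resolve("chk-3"));
        Files.createFile(base.resolve("chk-3").resolve("_metadata"));
        Files.createDirectories(base.resolve("chk-7")); // in progress: no _metadata yet
        System.out.println(latestCompletedCheckpoint(base)); // the chk-3 path
    }
}
```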
[jira] [Created] (FLINK-23816) Test new (Kafka) metrics in cluster
Arvid Heise created FLINK-23816: --- Summary: Test new (Kafka) metrics in cluster Key: FLINK-23816 URL: https://issues.apache.org/jira/browse/FLINK-23816 Project: Flink Issue Type: Sub-task Components: Connectors / Common Reporter: Arvid Heise Fix For: 1.14.0 * Set up a Kafka cluster (a dockerized local setup is fine) * Start a simple Kafka consumer/producer job * Add a bunch of example data (a few minutes' worth of processing) * Look at the metrics and compare them with the [FLIP-33|FLIP-33%3A+Standardize+Connector+Metrics] definitions. * Pay special attention to lag metrics.
[jira] [Updated] (FLINK-22790) HybridSource end-to-end test
[ https://issues.apache.org/jira/browse/FLINK-22790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-22790: Fix Version/s: 1.14.0 > HybridSource end-to-end test > > > Key: FLINK-22790 > URL: https://issues.apache.org/jira/browse/FLINK-22790 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Reporter: Thomas Weise >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > Add a new end-to-end test for HybridSource with files and Kafka. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22790) HybridSource end-to-end test
[ https://issues.apache.org/jira/browse/FLINK-22790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-22790: Labels: release-testing (was: ) > HybridSource end-to-end test > > > Key: FLINK-22790 > URL: https://issues.apache.org/jira/browse/FLINK-22790 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Reporter: Thomas Weise >Priority: Blocker > Labels: release-testing > > Add a new end-to-end test for HybridSource with files and Kafka. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-22790) HybridSource end-to-end test
[ https://issues.apache.org/jira/browse/FLINK-22790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-22790: Priority: Blocker (was: Major) > HybridSource end-to-end test > > > Key: FLINK-22790 > URL: https://issues.apache.org/jira/browse/FLINK-22790 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Common >Reporter: Thomas Weise >Priority: Blocker > > Add a new end-to-end test for HybridSource with files and Kafka. > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23814) Test FLIP-143 KafkaSink
[ https://issues.apache.org/jira/browse/FLINK-23814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23814: Labels: release-testing (was: ) > Test FLIP-143 KafkaSink > --- > > Key: FLINK-23814 > URL: https://issues.apache.org/jira/browse/FLINK-23814 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka >Reporter: Fabian Paul >Priority: Blocker > Labels: release-testing > Fix For: 1.14.0 > > > The following scenarios are worth testing: > * Start a simple job with none/at-least-once delivery guarantee and write > records to a Kafka topic > * Start a simple job with exactly-once delivery guarantee and write records to > a Kafka topic. The records should only be visible to a `read-committed` > consumer > * Stop a job with exactly-once delivery guarantee and restart it with a > different parallelism (scale-down, scale-up) > * Restart/kill a taskmanager while writing in exactly-once mode
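For the exactly-once scenario above, the verification consumer must be configured with isolation.level=read_committed; otherwise it defaults to read_uncommitted and would also see records from open or aborted transactions. A minimal sketch of the consumer properties, where the broker address and group id are placeholders:

```java
import java.util.Properties;

public class ReadCommittedProps {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "eos-verification");        // placeholder
        // Only records from committed Kafka transactions become visible,
        // which is what makes the KafkaSink's exactly-once output observable.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("isolation.level"));
    }
}
```

These properties would be passed to a plain KafkaConsumer (or kafka-console-consumer) when checking the output topic.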
[jira] [Resolved] (FLINK-23767) FLIP-180: Adjust StreamStatus and Idleness definition
[ https://issues.apache.org/jira/browse/FLINK-23767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23767. - Fix Version/s: 1.14.0 Assignee: Arvid Heise Resolution: Fixed Merged into master as 379f28b60afb329c52ef0b0985825a0fd84882a5..d6a8afd98dab90572472dd89fd398db34822647d. > FLIP-180: Adjust StreamStatus and Idleness definition > - > > Key: FLINK-23767 > URL: https://issues.apache.org/jira/browse/FLINK-23767 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The implementation ticket for > [FLIP-180|https://cwiki.apache.org/confluence/display/FLINK/FLIP-180%3A+Adjust+StreamStatus+and+Idleness+definition]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23797) SavepointITCase.testStopSavepointWithBoundedInput fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399754#comment-17399754 ] Arvid Heise commented on FLINK-23797: - [~trohrm...@apache.org] could that be connected to your recent changes to this test? > SavepointITCase.testStopSavepointWithBoundedInput fails on azure > > > Key: FLINK-23797 > URL: https://issues.apache.org/jira/browse/FLINK-23797 > Project: Flink > Issue Type: Bug > Components: Runtime / Checkpointing >Affects Versions: 1.14.0 >Reporter: Xintong Song >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=baf26b34-3c6a-54e8-f93f-cf269b32f802=8c9d126d-57d2-5a9e-a8c8-ff53f7b35cd9=4945 > {code} > Aug 15 22:57:53 [ERROR] Tests run: 15, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 58.799 s <<< FAILURE! - in > org.apache.flink.test.checkpointing.SavepointITCase > Aug 15 22:57:53 [ERROR] testStopSavepointWithBoundedInput Time elapsed: > 4.431 s <<< ERROR! > Aug 15 22:57:53 java.util.concurrent.ExecutionException: > org.apache.flink.runtime.checkpoint.CheckpointException: Checkpoint > triggering task Sink: Unnamed (1/1) of job a27d76d1e3a0145b179b30b3a4cd6564 > is not being executed at the moment. Aborting checkpoint. Failure reason: Not > all required tasks are currently running. 
> Aug 15 22:57:53 at > java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) > Aug 15 22:57:53 at > java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) > Aug 15 22:57:53 at > org.apache.flink.test.checkpointing.SavepointITCase.testStopSavepointWithBoundedInput(SavepointITCase.java:612) > Aug 15 22:57:53 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Aug 15 22:57:53 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Aug 15 22:57:53 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Aug 15 22:57:53 at java.lang.reflect.Method.invoke(Method.java:498) > Aug 15 22:57:53 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > Aug 15 22:57:53 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Aug 15 22:57:53 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > Aug 15 22:57:53 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Aug 15 22:57:53 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Aug 15 22:57:53 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 15 22:57:53 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 15 22:57:53 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 15 22:57:53 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 15 22:57:53 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 15 22:57:53 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 15 
22:57:53 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 15 22:57:53 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Aug 15 22:57:53 at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > Aug 15 22:57:53 at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > Aug 15 22:57:53 at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > Aug 15 22:57:53 at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > Aug 15 22:57:53 at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > Aug 15 22:57:53 at > java.util.Iterator.forEachRemaining(Iterator.java:116) > Aug 15 22:57:53 at >
[jira] [Resolved] (FLINK-23785) SinkITCase.testMetrics fails with ConcurrentModification on Azure
[ https://issues.apache.org/jira/browse/FLINK-23785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23785. - Resolution: Fixed Merged into master as 236bd652c9f2080c490935e3e0f2381fb7850b4a..cab4a9875b9ab5cd4c7da3a7e83ec77cb2a4838f. > SinkITCase.testMetrics fails with ConcurrentModification on Azure > - > > Key: FLINK-23785 > URL: https://issues.apache.org/jira/browse/FLINK-23785 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.14.0 >Reporter: Till Rohrmann >Assignee: Arvid Heise >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > The {{SinkITCase.testMetrics}} fails with a ConcurrentModification > {code} > Aug 15 14:26:40 java.util.ConcurrentModificationException > Aug 15 14:26:40 at > java.util.HashMap$HashIterator.nextNode(HashMap.java:1445) > Aug 15 14:26:40 at java.util.HashMap$KeyIterator.next(HashMap.java:1469) > Aug 15 14:26:40 at java.util.AbstractSet.removeAll(AbstractSet.java:174) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporter.applyRemovals(InMemoryReporter.java:78) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporterRule.afterTestSuccess(InMemoryReporterRule.java:61) > Aug 15 14:26:40 at > org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:57) > Aug 15 14:26:40 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 15 14:26:40 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 15 14:26:40 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 15 14:26:40 at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22215=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=5225 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (FLINK-23807) Use metrics to detect restarts in MiniClusterTestEnvironment#triggerTaskManagerFailover
[ https://issues.apache.org/jira/browse/FLINK-23807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23807: Description: {{MiniClusterTestEnvironment#triggerTaskManagerFailover}} checks the job status to detect a restart {noformat} terminateTaskManager(); CommonTestUtils.waitForJobStatus( jobClient, Arrays.asList(JobStatus.FAILING, JobStatus.FAILED, JobStatus.RESTARTING), Deadline.fromNow(Duration.ofMinutes(5))); afterFailAction.run(); startTaskManager(); {noformat} However, `waitForJobStatus` polls every 100ms while the restart can happen within 100ms and thus can easily miss the actual restart and wait forever (or when the next restart happens because slots are missing). We should rather use the metric `numRestarts`, check before the induced error, and wait until the counter increased. Here is an excerpt from a log where the restart was not detected in time. {noformat} 42769 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job TaskManager Failover Test (543035cf9e19317f92ee559b70ac70bd) switched from state RUNNING to RESTARTING. 42774 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.jobmaster.slotpool.DefaultDeclarativeSlotPool [] - Releasing slot [ead7cad050ec7a264c0dba0b6e6a6ad9]. 42775 [flink-akka.actor.default-dispatcher-23] INFO org.apache.flink.runtime.resourcemanager.slotmanager.DeclarativeSlotManager [] - Clearing resource requirements of job 543035cf9e19317f92ee559b70ac70bd 42776 [flink-akka.actor.default-dispatcher-22] INFO org.apache.flink.runtime.taskexecutor.slot.TaskSlotTableImpl [] - Free slot TaskSlot(index:0, state:RELEASING, resource profile: ResourceProfile{taskHeapMemory=170.667gb (183251937962 bytes), taskOffHeapMemory=170.667gb (183251937962 bytes), managedMemory=13.333mb (13981013 bytes), networkMemory=10.667mb (11184810 bytes)}, allocationId: ead7cad050ec7a264c0dba0b6e6a6ad9, jobId: 543035cf9e19317f92ee559b70ac70bd). 
43780 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Job TaskManager Failover Test (543035cf9e19317f92ee559b70ac70bd) switched from state RESTARTING to RUNNING. 43783 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Restoring job 543035cf9e19317f92ee559b70ac70bd from Checkpoint 11 @ 1629093422900 for 543035cf9e19317f92ee559b70ac70bd located at . 43798 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - No master state to restore 43800 [SourceCoordinator-Source: Tested Source -> Sink: Data stream collect sink] INFO org.apache.flink.runtime.source.coordinator.SourceCoordinator [] - Recovering subtask 0 to checkpoint 11 for source Source: Tested Source -> Sink: Data stream collect sink to checkpoint. 43801 [flink-akka.actor.default-dispatcher-26] INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Tested Source -> Sink: Data stream collect sink (1/1) (35c0ee7183308af02db4b09152f1457e) switched from CREATED to SCHEDULED. {noformat} was: {{MiniClusterTestEnvironment#triggerTaskManagerFailover}} checks the job status to detect a restart {noformat} terminateTaskManager(); CommonTestUtils.waitForJobStatus( jobClient, Arrays.asList(JobStatus.FAILING, JobStatus.FAILED, JobStatus.RESTARTING), Deadline.fromNow(Duration.ofMinutes(5))); afterFailAction.run(); startTaskManager(); {noformat} However, `waitForJobStatus` polls every 100ms while the restart can happen within 10ms and thus can easily miss the actual restart and wait forever (or when the next restart happens because slots are missing). We should rather use the metric `numRestarts`, check before the induced error, and wait until the counter increased. 
> Use metrics to detect restarts in > MiniClusterTestEnvironment#triggerTaskManagerFailover > --- > > Key: FLINK-23807 > URL: https://issues.apache.org/jira/browse/FLINK-23807 > Project: Flink > Issue Type: Bug > Components: Connectors / Common >Reporter: Arvid Heise >Priority: Major > Fix For: 1.14.0 > > > {{MiniClusterTestEnvironment#triggerTaskManagerFailover}} checks the job > status to detect a restart > {noformat} > terminateTaskManager(); > CommonTestUtils.waitForJobStatus( > jobClient, > Arrays.asList(JobStatus.FAILING, JobStatus.FAILED, > JobStatus.RESTARTING), > Deadline.fromNow(Duration.ofMinutes(5))); > afterFailAction.run(); >
[jira] [Created] (FLINK-23807) Use metrics to detect restarts in MiniClusterTestEnvironment#triggerTaskManagerFailover
Arvid Heise created FLINK-23807: --- Summary: Use metrics to detect restarts in MiniClusterTestEnvironment#triggerTaskManagerFailover Key: FLINK-23807 URL: https://issues.apache.org/jira/browse/FLINK-23807 Project: Flink Issue Type: Bug Components: Connectors / Common Reporter: Arvid Heise Fix For: 1.14.0 {{MiniClusterTestEnvironment#triggerTaskManagerFailover}} checks the job status to detect a restart {noformat} terminateTaskManager(); CommonTestUtils.waitForJobStatus( jobClient, Arrays.asList(JobStatus.FAILING, JobStatus.FAILED, JobStatus.RESTARTING), Deadline.fromNow(Duration.ofMinutes(5))); afterFailAction.run(); startTaskManager(); {noformat} However, `waitForJobStatus` polls every 100ms while the restart can happen within 10ms, so it can easily miss the actual restart and wait forever (or until the next restart happens because slots are missing). We should rather use the metric `numRestarts`: read it before the induced error and wait until the counter has increased. -- This message was sent by Atlassian Jira (v8.3.4#803005)
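The counter-based detection proposed above can be sketched as follows. This is a hedged illustration, not Flink test code: `numRestarts` here is a plain `AtomicLong` standing in for reading the job's `numRestarts` metric through a reporter, and `waitForRestart` is a hypothetical helper. The point is that a monotonic counter, snapshotted before the induced failure, cannot miss a restart no matter how fast FAILING -> RESTARTING -> RUNNING completes.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

public class RestartDetection {
    // Stand-in for the job's numRestarts metric; in a real test this would be
    // read through the metric reporter rather than a local AtomicLong.
    static final AtomicLong numRestarts = new AtomicLong();

    // Waits until the counter exceeds the baseline taken before the failure was
    // induced. Unlike polling the job status, this cannot race with a fast
    // restart: the increment is durable, so a later poll always observes it.
    static void waitForRestart(long baseline, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (numRestarts.get() <= baseline) {
            if (System.nanoTime() > deadline) {
                throw new IllegalStateException("no restart observed within " + timeout);
            }
            Thread.onSpinWait(); // busy-wait here for brevity; a real test would sleep between polls
        }
    }

    public static void main(String[] args) {
        long baseline = numRestarts.get();  // 1. snapshot BEFORE inducing the failure
        numRestarts.incrementAndGet();      // 2. simulated failover; may complete in well under 100ms
        waitForRestart(baseline, Duration.ofSeconds(5)); // 3. still detected, no missed transition
        System.out.println("restart detected");
    }
}
```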
[jira] [Created] (FLINK-23803) Improve instantiation of InMemoryReporter
Arvid Heise created FLINK-23803: --- Summary: Improve instantiation of InMemoryReporter Key: FLINK-23803 URL: https://issues.apache.org/jira/browse/FLINK-23803 Project: Flink Issue Type: Technical Debt Components: Test Infrastructure Affects Versions: 1.14.0 Reporter: Arvid Heise Currently, InMemoryReporter assumes it's used with the MiniCluster from the main thread. (It internally uses thread locals.) A better approach would be to create a unique id in the MiniCluster and use the factory to fetch the appropriate instance from a global map. In this way, different threading models and even concurrent MiniCluster instances would be supported. -- This message was sent by Atlassian Jira (v8.3.4#803005)
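A minimal sketch of the id-based lookup proposed above (all names here are hypothetical illustrations, not actual Flink classes): the mini cluster registers its reporter under a freshly generated id, and the factory resolves the instance from a static concurrent map. Because the map is keyed by id rather than by thread, the lookup works from any thread and keeps several concurrently running mini clusters apart.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical replacement for a thread-local lookup: a global, id-keyed map.
public class ReporterRegistry {
    private static final Map<String, Object> REPORTERS = new ConcurrentHashMap<>();

    // Called by the mini cluster: registers its reporter under a fresh unique id.
    // The id would travel through the cluster configuration to the factory.
    static String register(Object reporter) {
        String id = UUID.randomUUID().toString();
        REPORTERS.put(id, reporter);
        return id;
    }

    // Called by the reporter factory, possibly from a different thread or for a
    // different mini cluster running at the same time.
    static Object lookup(String id) {
        return REPORTERS.get(id);
    }

    // Called when the mini cluster shuts down, so the map does not leak entries.
    static void unregister(String id) {
        REPORTERS.remove(id);
    }
}
```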
[jira] [Commented] (FLINK-23795) JoinITCase.testFullJoinWithEqualPkNonEqui fails due to NPE
[ https://issues.apache.org/jira/browse/FLINK-23795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399526#comment-17399526 ] Arvid Heise commented on FLINK-23795: - Probably a duplicate of FLINK-23759. > JoinITCase.testFullJoinWithEqualPkNonEqui fails due to NPE > -- > > Key: FLINK-23795 > URL: https://issues.apache.org/jira/browse/FLINK-23795 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner >Affects Versions: 1.14.0 >Reporter: Xintong Song >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=0=logs=a9db68b9-a7e0-54b6-0f98-010e0aff39e2=cdd32e0b-6047-565b-c58f-14054472f1be=9477 > {code} > Aug 16 00:15:10 [ERROR] Tests run: 104, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 85.455 s <<< FAILURE! - in > org.apache.flink.table.planner.runtime.stream.sql.JoinITCase > Aug 16 00:15:10 [ERROR] testFullJoinWithEqualPkNonEqui[StateBackend=HEAP] > Time elapsed: 1.031 s <<< ERROR! > Aug 16 00:15:10 org.apache.flink.runtime.client.JobExecutionException: Job > execution failed. 
> Aug 16 00:15:10 at > org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) > Aug 16 00:15:10 at > org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > Aug 16 00:15:10 at > org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:250) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > Aug 16 00:15:10 at > org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1389) > Aug 16 00:15:10 at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) > Aug 16 00:15:10 at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) > Aug 16 00:15:10 at > org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) > Aug 16 00:15:10 at > 
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) > Aug 16 00:15:10 at > java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) > Aug 16 00:15:10 at > org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47) > Aug 16 00:15:10 at akka.dispatch.OnComplete.internal(Future.scala:300) > Aug 16 00:15:10 at akka.dispatch.OnComplete.internal(Future.scala:297) > Aug 16 00:15:10 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:224) > Aug 16 00:15:10 at > akka.dispatch.japi$CallbackBridge.apply(Future.scala:221) > Aug 16 00:15:10 at > scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60) > Aug 16 00:15:10 at > org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$DirectExecutionContext.execute(AkkaFutureUtils.java:65) > Aug 16 00:15:10 at > scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68) > Aug 16 00:15:10 at > scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284) > Aug 16 00:15:10 at > scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284) > Aug 16 00:15:10 at > scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284) > Aug 16 00:15:10 at > akka.pattern.PromiseActorRef.$bang(AskSupport.scala:621) > Aug 16 00:15:10 at > akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:24) > Aug 16
[jira] [Closed] (FLINK-23793) SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode fail due to ConcurrentModificationException
[ https://issues.apache.org/jira/browse/FLINK-23793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise closed FLINK-23793. --- Resolution: Duplicate This is a duplicate of FLINK-23785, fix available for review. > SinkITCase.writerAndCommitterAndGlobalCommitterExecuteInStreamingMode fail > due to ConcurrentModificationException > - > > Key: FLINK-23793 > URL: https://issues.apache.org/jira/browse/FLINK-23793 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.14.0 >Reporter: Xintong Song >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22196=logs=f2b08047-82c3-520f-51ee-a30fd6254285=3810d23d-4df2-586c-103c-ec14ede6af00=11756 > {code} > Aug 14 22:06:54 [ERROR] Tests run: 7, Failures: 0, Errors: 1, Skipped: 0, > Time elapsed: 7.458 s <<< FAILURE! - in > org.apache.flink.test.streaming.runtime.SinkITCase > Aug 14 22:06:54 [ERROR] > writerAndCommitterAndGlobalCommitterExecuteInStreamingMode Time elapsed: > 0.468 s <<< ERROR! 
> Aug 14 22:06:54 java.util.ConcurrentModificationException > Aug 14 22:06:54 at > java.util.HashMap$HashIterator.nextNode(HashMap.java:1445) > Aug 14 22:06:54 at java.util.HashMap$KeyIterator.next(HashMap.java:1469) > Aug 14 22:06:54 at java.util.AbstractSet.removeAll(AbstractSet.java:174) > Aug 14 22:06:54 at > org.apache.flink.runtime.testutils.InMemoryReporter.applyRemovals(InMemoryReporter.java:78) > Aug 14 22:06:54 at > org.apache.flink.runtime.testutils.InMemoryReporterRule.afterTestSuccess(InMemoryReporterRule.java:61) > Aug 14 22:06:54 at > org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:57) > Aug 14 22:06:54 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 14 22:06:54 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 14 22:06:54 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 14 22:06:54 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 14 22:06:54 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 14 22:06:54 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Aug 14 22:06:54 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 14 22:06:54 at > 
org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 14 22:06:54 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 14 22:06:54 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) > Aug 14 22:06:54 at org.junit.runner.JUnitCore.run(JUnitCore.java:137) > Aug 14 22:06:54 at org.junit.runner.JUnitCore.run(JUnitCore.java:115) > Aug 14 22:06:54 at > org.junit.vintage.engine.execution.RunnerExecutor.execute(RunnerExecutor.java:43) > Aug 14 22:06:54 at > java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183) > Aug 14 22:06:54 at > java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) > Aug 14 22:06:54 at > java.util.Iterator.forEachRemaining(Iterator.java:116) > Aug 14 22:06:54 at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) > Aug 14 22:06:54 at > java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) > Aug 14 22:06:54 at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) > Aug 14 22:06:54 at > java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150) > Aug 14 22:06:54 at > java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173) > Aug 14 22:06:54 at > java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) >
[jira] [Commented] (FLINK-23776) Performance regression on 14.08.2021 in FLIP-27
[ https://issues.apache.org/jira/browse/FLINK-23776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399522#comment-17399522 ] Arvid Heise commented on FLINK-23776: - Thanks for the report. I will investigate. > Performance regression on 14.08.2021 in FLIP-27 > --- > > Key: FLINK-23776 > URL: https://issues.apache.org/jira/browse/FLINK-23776 > Project: Flink > Issue Type: Bug > Components: API / DataStream, Benchmarks >Affects Versions: 1.14.0 >Reporter: Piotr Nowojski >Assignee: Arvid Heise >Priority: Blocker > Fix For: 1.14.0 > > > http://codespeed.dak8s.net:8000/timeline/?ben=mapSink.F27_UNBOUNDED=2 > http://codespeed.dak8s.net:8000/timeline/?ben=mapRebalanceMapSink.F27_UNBOUNDED=2 > {noformat} > git ls 7b60a964b1..7f3636f6b4 > 7f3636f6b4f [2 days ago] [FLINK-23652][connectors] Adding common source > metrics. [Arvid Heise] > 97c8f72b813 [3 months ago] [FLINK-23652][connectors] Adding common sink > metrics. [Arvid Heise] > 48da20e8f88 [3 months ago] [FLINK-23652][test] Adding InMemoryMetricReporter > and using it by default in MiniClusterResource. [Arvid Heise] > 63ee60859ca [3 months ago] [FLINK-23652][core/metrics] Extract > Operator(IO)MetricGroup interfaces and expose them in RuntimeContext [Arvid > Heise] > 5d5e39b614b [2 days ago] [refactor][connectors] Only use > MockSplitReader.Builder for instantiation. [Arvid Heise] > b927035610c [3 months ago] [refactor][core] Extract common context creation > in CollectionExecutor [Arvid Heise] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23785) SinkITCase.testMetrics fails with ConcurrentModification on Azure
[ https://issues.apache.org/jira/browse/FLINK-23785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17399385#comment-17399385 ] Arvid Heise commented on FLINK-23785: - Synchronizing access to the reporter unfortunately only solves the symptom and not the root cause. The main issue is that apparently some metrics are only unregistered after the test case has already finished. That means that metrics from test case A can spill into test case B using the same mini cluster, i.e. {{env.execute}} returned while the metric system had not yet converged to a stable state. It's similar to {{FLINK-23647}}. So maybe [~roman_khachatryan] is right that we should find a general solution and not work around it in each and every test case. I'll try to come up with a workaround for now. > SinkITCase.testMetrics fails with ConcurrentModification on Azure > - > > Key: FLINK-23785 > URL: https://issues.apache.org/jira/browse/FLINK-23785 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.14.0 >Reporter: Till Rohrmann >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > The {{SinkITCase.testMetrics}} fails with a ConcurrentModification > {code} > Aug 15 14:26:40 java.util.ConcurrentModificationException > Aug 15 14:26:40 at > java.util.HashMap$HashIterator.nextNode(HashMap.java:1445) > Aug 15 14:26:40 at java.util.HashMap$KeyIterator.next(HashMap.java:1469) > Aug 15 14:26:40 at java.util.AbstractSet.removeAll(AbstractSet.java:174) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporter.applyRemovals(InMemoryReporter.java:78) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporterRule.afterTestSuccess(InMemoryReporterRule.java:61) > Aug 15 14:26:40 at > org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:57) > Aug 15 14:26:40 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 15 14:26:40 at > 
org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 15 14:26:40 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22215=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=5225 -- This message was sent by Atlassian Jira (v8.3.4#803005)
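For illustration, the failure mode behind the stack trace above can be reproduced deterministically; this is a hedged reconstruction of the mechanism, not InMemoryReporter's actual code. A fail-fast `HashSet` iterator throws `ConcurrentModificationException` when a metric is added mid-iteration (mimicking a late registration arriving while the test rule cleans up), whereas a concurrent set's weakly consistent iterator survives, which masks the symptom but, as noted, not the late-unregistration root cause.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CmeDemo {
    // Starts iterating `metrics`, then adds an entry mid-iteration (a late
    // registration arriving during cleanup) and reports whether the iterator
    // failed with ConcurrentModificationException.
    static boolean addDuringIterationThrows(Set<String> metrics) {
        Iterator<String> it = metrics.iterator();
        it.next();
        metrics.add("lateMetric"); // structural modification while iterating
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Set<String> plain = new HashSet<>(Arrays.asList("numRestarts", "numRecordsIn", "numRecordsOut"));
        Set<String> concurrent = ConcurrentHashMap.newKeySet();
        concurrent.addAll(plain);

        System.out.println(addDuringIterationThrows(plain));      // fail-fast HashSet iterator: true
        System.out.println(addDuringIterationThrows(concurrent)); // weakly consistent iterator: false
    }
}
```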
[jira] [Assigned] (FLINK-23785) SinkITCase.testMetrics fails with ConcurrentModification on Azure
[ https://issues.apache.org/jira/browse/FLINK-23785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23785: --- Assignee: Arvid Heise > SinkITCase.testMetrics fails with ConcurrentModification on Azure > - > > Key: FLINK-23785 > URL: https://issues.apache.org/jira/browse/FLINK-23785 > Project: Flink > Issue Type: Bug > Components: API / DataStream >Affects Versions: 1.14.0 >Reporter: Till Rohrmann >Assignee: Arvid Heise >Priority: Critical > Labels: test-stability > Fix For: 1.14.0 > > > The {{SinkITCase.testMetrics}} fails with a ConcurrentModification > {code} > Aug 15 14:26:40 java.util.ConcurrentModificationException > Aug 15 14:26:40 at > java.util.HashMap$HashIterator.nextNode(HashMap.java:1445) > Aug 15 14:26:40 at java.util.HashMap$KeyIterator.next(HashMap.java:1469) > Aug 15 14:26:40 at java.util.AbstractSet.removeAll(AbstractSet.java:174) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporter.applyRemovals(InMemoryReporter.java:78) > Aug 15 14:26:40 at > org.apache.flink.runtime.testutils.InMemoryReporterRule.afterTestSuccess(InMemoryReporterRule.java:61) > Aug 15 14:26:40 at > org.apache.flink.util.ExternalResource$1.evaluate(ExternalResource.java:57) > Aug 15 14:26:40 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 15 14:26:40 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 15 14:26:40 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 15 14:26:40 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 15 14:26:40 at > 
org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 15 14:26:40 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > {code} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=22215=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=5225 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23652) Implement FLIP-179: Expose Standardized Operator Metrics
[ https://issues.apache.org/jira/browse/FLINK-23652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23652. - Release Note: Connectors using the unified Source and Sink interfaces expose certain standardized metrics automatically. Resolution: Fixed Merged into master as b927035610c6f55973471d29c19d8fe9405a51db..7f3636f6b4f8bac415a7db85917ad849636bd730. > Implement FLIP-179: Expose Standardized Operator Metrics > > > Key: FLINK-23652 > URL: https://issues.apache.org/jira/browse/FLINK-23652 > Project: Flink > Issue Type: Improvement > Components: API / DataStream >Affects Versions: 1.14.0 >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > This ticket is about implementing > [FLIP-179|https://cwiki.apache.org/confluence/display/FLINK/FLIP-179%3A+Expose+Standardized+Operator+Metrics]. > It will add some metrics out-of-the-box for sources/sinks and help connector > developers easily implement some standardized metrics. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23735) Migrate BufferedUpsertSinkFunction to FLIP-143
[ https://issues.apache.org/jira/browse/FLINK-23735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23735. - Fix Version/s: 1.14.0 Resolution: Fixed Merged into master as 134f388323f0c2b625504fc58e874133f467f781..c082a10d5d3087961184e520476363849e8b3a8a. > Migrate BufferedUpsertSinkFunction to FLIP-143 > -- > > Key: FLINK-23735 > URL: https://issues.apache.org/jira/browse/FLINK-23735 > Project: Flink > Issue Type: Sub-task >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > The BufferedUpsertSinkFunction is still using the old sink interfaces and > relies on the old Kafka DataStream connector FlinkKafkaProducer. > We need to migrate it to the new Sink API to also leverage the new KafkaSink > connector and finally deprecate the FlinkKafkaProducer and all its belongings. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (FLINK-23639) Migrate Table API to new KafkaSink
[ https://issues.apache.org/jira/browse/FLINK-23639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17398757#comment-17398757 ] Arvid Heise edited comment on FLINK-23639 at 8/13/21, 6:19 PM: --- Merged into master as 5cffdbea87d636e2d3afb8a9f539f1dfbb778e7a..c605ce41a60b8d7c1d0148d25b62a99f92e295ff. was (Author: arvid): Merged into master as c605ce41a60b8d7c1d0148d25b62a99f92e295ff. > Migrate Table API to new KafkaSink > -- > > Key: FLINK-23639 > URL: https://issues.apache.org/jira/browse/FLINK-23639 > Project: Flink > Issue Type: Sub-task > Components: Connectors / Kafka, Table SQL / API >Reporter: Fabian Paul >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > With the KafkaSink ported to FLIP-143 we should also adapt the Table API to > leverage the new KafkaSink -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-19554) Unified testing framework for connectors
[ https://issues.apache.org/jira/browse/FLINK-19554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-19554. - Assignee: Qingsheng Ren Resolution: Fixed Merged into master as 4fd33234407e07d99647e495819e009467828161..adee56117778f6962d7892a26b2b507b67ec707a. > Unified testing framework for connectors > > > Key: FLINK-19554 > URL: https://issues.apache.org/jira/browse/FLINK-19554 > Project: Flink > Issue Type: New Feature > Components: Connectors / Common, Test Infrastructure >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.14.0 > > > As the community and ecosystem of Flink grow, more and more > Flink-owned and third-party connectors are developed and added to the Flink > community. In order to provide standardized quality control for all > connectors, it's necessary to develop a unified connector testing framework > to simplify and standardize end-to-end testing of connectors. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23639) Migrate Table API to new KafkaSink
[ https://issues.apache.org/jira/browse/FLINK-23639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23639. - Fix Version/s: 1.14.0 Release Note: The Table API/SQL now writes to Kafka with the new KafkaSink. Resolution: Fixed Merged into master as c605ce41a60b8d7c1d0148d25b62a99f92e295ff. -- This message was sent by Atlassian Jira (v8.3.4#803005)
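For context on what this migration means for users: a plain SQL sink definition along the lines sketched below is now executed through the FLIP-143 KafkaSink under the hood. This is an illustrative, self-contained sketch, not Flink code: the helper `buildKafkaSinkDdl`, the table schema, the topic, and the broker address are invented for the example; only the `WITH` option keys (`connector`, `topic`, `properties.bootstrap.servers`, `format`) mirror the documented Kafka SQL connector options.

```java
// Sketch: assembling the DDL that the migrated Table API Kafka sink accepts.
// The helper and its arguments are hypothetical; only the WITH option keys
// follow Flink's Kafka SQL connector.
public class KafkaDdlSketch {
    static String buildKafkaSinkDdl(String table, String topic, String brokers) {
        return "CREATE TABLE " + table + " (\n"
                + "  user_id BIGINT,\n"
                + "  message STRING\n"
                + ") WITH (\n"
                + "  'connector' = 'kafka',\n"
                + "  'topic' = '" + topic + "',\n"
                + "  'properties.bootstrap.servers' = '" + brokers + "',\n"
                + "  'format' = 'json'\n"
                + ")";
    }

    public static void main(String[] args) {
        System.out.println(buildKafkaSinkDdl("events_sink", "events", "localhost:9092"));
    }
}
```

The point of the migration is that nothing in such a definition changes for the user; only the runtime sink implementation behind the `kafka` connector factory does.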
[jira] [Updated] (FLINK-23767) FLIP-180: Adjust StreamStatus and Idleness definition
[ https://issues.apache.org/jira/browse/FLINK-23767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23767: Issue Type: Improvement (was: Bug) > FLIP-180: Adjust StreamStatus and Idleness definition > - > > Key: FLINK-23767 > URL: https://issues.apache.org/jira/browse/FLINK-23767 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Arvid Heise >Priority: Major > > The implementation ticket for > [FLIP-180|https://cwiki.apache.org/confluence/display/FLINK/FLIP-180%3A+Adjust+StreamStatus+and+Idleness+definition]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23767) FLIP-180: Adjust StreamStatus and Idleness definition
Arvid Heise created FLINK-23767: --- Summary: FLIP-180: Adjust StreamStatus and Idleness definition Key: FLINK-23767 URL: https://issues.apache.org/jira/browse/FLINK-23767 Project: Flink Issue Type: Bug Components: Runtime / Network Reporter: Arvid Heise The implementation ticket for [FLIP-180|https://cwiki.apache.org/confluence/display/FLINK/FLIP-180%3A+Adjust+StreamStatus+and+Idleness+definition]. -- This message was sent by Atlassian Jira (v8.3.4#803005)
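The core idea FLIP-180 adjusts can be illustrated with a small, self-contained simulation (plain Java, no Flink dependencies; all class and method names here are invented): an input that has produced no records within a timeout is treated as idle, and idle inputs are excluded when combining watermarks so they cannot hold the overall watermark back.

```java
import java.util.HashMap;
import java.util.Map;

// Simulation of the idleness concept behind FLIP-180: a channel with no
// activity within IDLE_TIMEOUT_MS is idle, and the combined watermark is the
// minimum over ACTIVE channels only. Names are invented for illustration.
public class IdlenessSketch {
    static final long IDLE_TIMEOUT_MS = 1_000;

    private final Map<String, Long> lastActivity = new HashMap<>();
    private final Map<String, Long> watermarks = new HashMap<>();

    void onRecord(String channel, long watermark, long nowMs) {
        lastActivity.put(channel, nowMs);
        watermarks.put(channel, watermark);
    }

    boolean isIdle(String channel, long nowMs) {
        Long last = lastActivity.get(channel);
        return last == null || nowMs - last > IDLE_TIMEOUT_MS;
    }

    // Idle channels are skipped, so they cannot hold the watermark back.
    long combinedWatermark(long nowMs) {
        return watermarks.entrySet().stream()
                .filter(e -> !isIdle(e.getKey(), nowMs))
                .mapToLong(Map.Entry::getValue)
                .min()
                .orElse(Long.MIN_VALUE);
    }
}
```

With two channels at watermarks 100 and 50, the combined watermark is 50 while both are active; once the slower channel goes idle, the combined watermark advances to the active channel's value.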
[jira] [Updated] (FLINK-23759) notifyCheckpointComplete without corresponding snapshotState
[ https://issues.apache.org/jira/browse/FLINK-23759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23759: Description: In a private run on AZP, I found a run where {{notifyCheckpointComplete}} was invoked without prior {{snapshotState}} after the default ENABLE_CHECKPOINTS_AFTER_TASKS_FINISH was changed to true. https://dev.azure.com/arvidheise0209/arvidheise/_build/results?buildId=1325=logs=43a593e7-535d-554b-08cc-244368da36b4=82d122c0-8bbf-56f3-4c0d-8e3d69630d0f This causes the following NPE because the implementation relies on {{notifyCheckpointComplete}} being called after a corresponding {{snapshotState}} (valid assumption). {noformat} 2021-08-12T19:25:20.6237724Z Aug 12 19:25:20 org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 2021-08-12T19:25:20.6238810Z Aug 12 19:25:20at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) 2021-08-12T19:25:20.6239964Z Aug 12 19:25:20at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137) 2021-08-12T19:25:20.6241075Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) 2021-08-12T19:25:20.6242062Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) 2021-08-12T19:25:20.6243440Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 2021-08-12T19:25:20.6298828Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) 2021-08-12T19:25:20.6300041Z Aug 12 19:25:20at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:250) 2021-08-12T19:25:20.6301124Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) 2021-08-12T19:25:20.6302132Z Aug 12 19:25:20at 
java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) 2021-08-12T19:25:20.6303136Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 2021-08-12T19:25:20.6304395Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) 2021-08-12T19:25:20.6305382Z Aug 12 19:25:20at org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1389) 2021-08-12T19:25:20.6306384Z Aug 12 19:25:20at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) 2021-08-12T19:25:20.6307484Z Aug 12 19:25:20at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) 2021-08-12T19:25:20.6308655Z Aug 12 19:25:20at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) 2021-08-12T19:25:20.6309720Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) 2021-08-12T19:25:20.6310726Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) 2021-08-12T19:25:20.6311740Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) 2021-08-12T19:25:20.6312700Z Aug 12 19:25:20at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) 2021-08-12T19:25:20.6313684Z Aug 12 19:25:20at org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47) 2021-08-12T19:25:20.6314630Z Aug 12 19:25:20at akka.dispatch.OnComplete.internal(Future.scala:300) 2021-08-12T19:25:20.6315487Z Aug 12 19:25:20at akka.dispatch.OnComplete.internal(Future.scala:297) 2021-08-12T19:25:20.6316341Z Aug 12 19:25:20at akka.dispatch.japi$CallbackBridge.apply(Future.scala:224) 2021-08-12T19:25:20.6317222Z Aug 12 19:25:20at 
akka.dispatch.japi$CallbackBridge.apply(Future.scala:221) 2021-08-12T19:25:20.6318115Z Aug 12 19:25:20at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60) 2021-08-12T19:25:20.6319129Z Aug 12 19:25:20at org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$DirectExecutionContext.execute(AkkaFutureUtils.java:65) 2021-08-12T19:25:20.6320196Z Aug 12 19:25:20at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68) 2021-08-12T19:25:20.6321183Z Aug 12 19:25:20at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284) 2021-08-12T19:25:20.6322205Z Aug 12 19:25:20at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284) 2021-08-12T19:25:20.6323208Z Aug 12 19:25:20at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284) 2021-08-12T19:25:20.6324131Z
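The failure mode described above, an NPE when {{notifyCheckpointComplete}} looks up state for a checkpoint id that was never snapshotted, can be shown with a minimal, self-contained sketch (plain Java, not Flink's actual implementation; all names are invented). A naive implementation dereferences the staged entry directly; the defensive variant below commits everything up to the notified id and simply does nothing for unknown ids.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch of the contract discussed above: committables are staged per
// checkpoint in snapshotState() and committed in notifyCheckpointComplete().
// staged.get(checkpointId) would NPE for an id with no prior snapshot; using
// headMap(), which returns an empty view for unknown ids, avoids that.
public class CheckpointCommitSketch {
    private final TreeMap<Long, List<String>> staged = new TreeMap<>();
    final List<String> committed = new ArrayList<>();

    void snapshotState(long checkpointId, List<String> committables) {
        staged.put(checkpointId, new ArrayList<>(committables));
    }

    void notifyCheckpointComplete(long checkpointId) {
        // headMap(id, true) is an empty (never null) view for unknown ids.
        NavigableMap<Long, List<String>> upTo = staged.headMap(checkpointId, true);
        upTo.values().forEach(committed::addAll);
        upTo.clear();
    }
}
```

A notification for checkpoint 1 before any snapshot commits nothing and does not throw; after a snapshot for checkpoint 2, its committables are committed on notification.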
[jira] [Created] (FLINK-23759) notifyCheckpointComplete without corresponding snapshotState
Arvid Heise created FLINK-23759: --- Summary: notifyCheckpointComplete without corresponding snapshotState Key: FLINK-23759 URL: https://issues.apache.org/jira/browse/FLINK-23759 Project: Flink Issue Type: Bug Components: Runtime / Checkpointing Affects Versions: 1.14.0 Reporter: Arvid Heise In a private run on AZP, I found a run where {{notifyCheckpointComplete}} was invoked without prior {{snapshotState}} after the default ENABLE_CHECKPOINTS_AFTER_TASKS_FINISH was changed to true. https://dev.azure.com/arvidheise0209/arvidheise/_build/results?buildId=1325=logs=43a593e7-535d-554b-08cc-244368da36b4=82d122c0-8bbf-56f3-4c0d-8e3d69630d0f This causes the following NPE because the implementation relies on {{notifyCheckpointComplete}} being called after a corresponding {{snapshotState}} (valid assumption). {noformat} Aug 12 19:25:20 [ERROR] Tests run: 104, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 87.848 s <<< FAILURE! - in org.apache.flink.table.planner.runtime.stream.sql.JoinITCase Aug 12 19:25:20 [ERROR] testBigDataOfJoin[StateBackend=HEAP] Time elapsed: 0.792 s <<< ERROR! Aug 12 19:25:20 org.apache.flink.runtime.client.JobExecutionException: Job execution failed. 
Aug 12 19:25:20 at org.apache.flink.runtime.jobmaster.JobResult.toJobExecutionResult(JobResult.java:144) Aug 12 19:25:20 at org.apache.flink.runtime.minicluster.MiniClusterJobClient.lambda$getJobExecutionResult$3(MiniClusterJobClient.java:137) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:616) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) Aug 12 19:25:20 at org.apache.flink.runtime.rpc.akka.AkkaInvocationHandler.lambda$invokeRpc$0(AkkaInvocationHandler.java:250) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) Aug 12 19:25:20 at org.apache.flink.util.concurrent.FutureUtils.doForward(FutureUtils.java:1389) Aug 12 19:25:20 at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$null$1(ClassLoadingUtils.java:93) Aug 12 19:25:20 at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:68) Aug 12 19:25:20 at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.lambda$guardCompletionWithContextClassLoader$2(ClassLoadingUtils.java:92) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture$UniWhenComplete.tryFire(CompletableFuture.java:750) Aug 12 19:25:20 at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) Aug 12 19:25:20 at 
java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) Aug 12 19:25:20 at org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$1.onComplete(AkkaFutureUtils.java:47) Aug 12 19:25:20 at akka.dispatch.OnComplete.internal(Future.scala:300) Aug 12 19:25:20 at akka.dispatch.OnComplete.internal(Future.scala:297) Aug 12 19:25:20 at akka.dispatch.japi$CallbackBridge.apply(Future.scala:224) Aug 12 19:25:20 at akka.dispatch.japi$CallbackBridge.apply(Future.scala:221) Aug 12 19:25:20 at scala.concurrent.impl.CallbackRunnable.run(Promise.scala:60) Aug 12 19:25:20 at org.apache.flink.runtime.concurrent.akka.AkkaFutureUtils$DirectExecutionContext.execute(AkkaFutureUtils.java:65) Aug 12 19:25:20 at scala.concurrent.impl.CallbackRunnable.executeWithValue(Promise.scala:68) Aug 12 19:25:20 at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1(Promise.scala:284) Aug 12 19:25:20 at scala.concurrent.impl.Promise$DefaultPromise.$anonfun$tryComplete$1$adapted(Promise.scala:284) Aug 12 19:25:20 at scala.concurrent.impl.Promise$DefaultPromise.tryComplete(Promise.scala:284) Aug 12 19:25:20 at akka.pattern.PromiseActorRef.$bang(AskSupport.scala:621) Aug 12 19:25:20 at akka.pattern.PipeToSupport$PipeableFuture$$anonfun$pipeTo$1.applyOrElse(PipeToSupport.scala:24) Aug 12 19:25:20 at
[jira] [Commented] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397832#comment-17397832 ] Arvid Heise commented on FLINK-23678: - Ignoring test in ac42bb3d146 on master. > KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure > -- > > Key: FLINK-23678 > URL: https://issues.apache.org/jira/browse/FLINK-23678 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Critical > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > Attachments: extracted.log > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21711=logs=b0097207-033c-5d9a-b48c-6d4796fbe60d=8338a7d2-16f7-52e5-f576-4b7b3071eb3d=6946 > {code} > Aug 07 00:12:18 [ERROR] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 67.431 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase > Aug 07 00:12:18 [ERROR] > testWriteRecordsToKafkaWithExactlyOnceGuarantee(org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase) > Time elapsed: 7.001 s <<< FAILURE! 
> Aug 07 00:12:18 java.lang.AssertionError: expected:<407799> but was:<407798> > Aug 07 00:12:18 at org.junit.Assert.fail(Assert.java:89) > Aug 07 00:12:18 at org.junit.Assert.failNotEquals(Assert.java:835) > Aug 07 00:12:18 at org.junit.Assert.assertEquals(Assert.java:647) > Aug 07 00:12:18 at org.junit.Assert.assertEquals(Assert.java:633) > Aug 07 00:12:18 at > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase.writeRecordsToKafka(KafkaSinkITCase.java:334) > Aug 07 00:12:18 at > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee(KafkaSinkITCase.java:173) > Aug 07 00:12:18 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Aug 07 00:12:18 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Aug 07 00:12:18 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Aug 07 00:12:18 at java.lang.reflect.Method.invoke(Method.java:498) > Aug 07 00:12:18 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > Aug 07 00:12:18 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Aug 07 00:12:18 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > Aug 07 00:12:18 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Aug 07 00:12:18 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Aug 07 00:12:18 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Aug 07 00:12:18 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 07 00:12:18 at > org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) > Aug 07 00:12:18 at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) > Aug 07 00:12:18 at > org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) > 
Aug 07 00:12:18 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 07 00:12:18 at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) > Aug 07 00:12:18 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) > Aug 07 00:12:18 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) > Aug 07 00:12:18 at > org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30) > Aug 07 00:12:18 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) > Aug 07 00:12:18 at > org.junit.runners.ParentRunner.run(ParentRunner.java:413) >
[jira] [Commented] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397830#comment-17397830 ] Arvid Heise commented on FLINK-23678: - The new errors are related to FLINK-23408. We do not see a notifyCheckpointCompleted for checkpoint 2 (grep for notifyCheckpoint) even though we see Completed checkpoint 2. [~pnowojski] PTAL https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21936=logs=c5f0071e-1851-543e-9a45-9ac140befc32=15a22db7-8faa-5b34-3920-d33c9f0ca23c [^extracted.log]
[jira] [Updated] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23678: Attachment: extracted.log
[jira] [Comment Edited] (FLINK-23722) S3 Tests fail on AZP: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a regio
[ https://issues.apache.org/jira/browse/FLINK-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397606#comment-17397606 ] Arvid Heise edited comment on FLINK-23722 at 8/11/21, 8:13 PM: --- Merged into master as 5d568cb0022e69d66e1c4f92f31f77ebd64fe385..ff8e48d95f805acbc210e7091d5a75f6ba1d1dc9. was (Author: arvid): Merged into master as ff8e48d95f805acbc210e7091d5a75f6ba1d1dc9. > S3 Tests fail on AZP: Unable to find a region via the region provider chain. > Must provide an explicit region in the builder or setup environment to supply > a region. > > > Key: FLINK-23722 > URL: https://issues.apache.org/jira/browse/FLINK-23722 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.14.0 >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > E2E and integration tests fail with > {noformat} > Aug 11 09:11:32 Caused by: com.amazonaws.SdkClientException: Unable to find a > region via the region provider chain. Must provide an explicit region in the > builder or setup environment to supply a region. > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.buildAmazonS3Client(DefaultS3ClientFactory.java:144) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:96) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:753) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:446) > Aug 11 09:11:32 ... 
44 more > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21884=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d -- This message was sent by Atlassian Jira (v8.3.4#803005)
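The {{SdkClientException}} above means the lookup fell through every source the region provider chain consults, which is why the fix supplies a region explicitly. The resolution order can be illustrated with a self-contained simulation (plain Java, not the AWS SDK; `resolveRegion` and its arguments are invented, though the {{AWS_REGION}} environment variable is a real fallback the chain consults): an explicit setting wins, then the environment, and if neither is present the lookup fails with the same kind of error.

```java
import java.util.Map;
import java.util.Optional;

// Simulation of the region-provider-chain lookup that fails above: explicit
// configuration first, then the AWS_REGION environment variable, else an
// error mirroring the SDK's message. Names are invented for illustration.
public class RegionChainSketch {
    static String resolveRegion(String explicitRegion, Map<String, String> env) {
        return Optional.ofNullable(explicitRegion)
                .or(() -> Optional.ofNullable(env.get("AWS_REGION")))
                .orElseThrow(() -> new IllegalStateException(
                        "Unable to find a region via the region provider chain."));
    }
}
```

On CI machines without instance metadata, exporting {{AWS_REGION}} (or passing an explicit region in configuration) is what makes the chain succeed.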
[jira] [Updated] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23678: Priority: Critical (was: Major)
[jira] [Updated] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise updated FLINK-23678: Priority: Major (was: Blocker) > KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure > -- > > Key: FLINK-23678 > URL: https://issues.apache.org/jira/browse/FLINK-23678 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Major > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21711=logs=b0097207-033c-5d9a-b48c-6d4796fbe60d=8338a7d2-16f7-52e5-f576-4b7b3071eb3d=6946 > {code} > Aug 07 00:12:18 [ERROR] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 67.431 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase > Aug 07 00:12:18 [ERROR] > testWriteRecordsToKafkaWithExactlyOnceGuarantee(org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase) > Time elapsed: 7.001 s <<< FAILURE! 
[jira] [Resolved] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23678. - Resolution: Fixed Merged into master as 58ff344e5e5fdd397e0f9276561b746ddabbadea. > KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure > -- > > Key: FLINK-23678 > URL: https://issues.apache.org/jira/browse/FLINK-23678 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Blocker > Labels: pull-request-available, test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21711=logs=b0097207-033c-5d9a-b48c-6d4796fbe60d=8338a7d2-16f7-52e5-f576-4b7b3071eb3d=6946 > {code} > Aug 07 00:12:18 [ERROR] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 67.431 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase > Aug 07 00:12:18 [ERROR] > testWriteRecordsToKafkaWithExactlyOnceGuarantee(org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase) > Time elapsed: 7.001 s <<< FAILURE! 
[jira] [Resolved] (FLINK-23722) S3 Tests fail on AZP: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
[ https://issues.apache.org/jira/browse/FLINK-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23722. - Fix Version/s: 1.14.0 Resolution: Fixed Merged into master as ff8e48d95f805acbc210e7091d5a75f6ba1d1dc9. > S3 Tests fail on AZP: Unable to find a region via the region provider chain. > Must provide an explicit region in the builder or setup environment to supply > a region. > > > Key: FLINK-23722 > URL: https://issues.apache.org/jira/browse/FLINK-23722 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.14.0 >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > E2E and integration tests fail with > {noformat} > Aug 11 09:11:32 Caused by: com.amazonaws.SdkClientException: Unable to find a > region via the region provider chain. Must provide an explicit region in the > builder or setup environment to supply a region. > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.buildAmazonS3Client(DefaultS3ClientFactory.java:144) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:96) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:753) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:446) > Aug 11 09:11:32 ... 
44 more > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21884=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (FLINK-23722) S3 Tests fail on AZP: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
[ https://issues.apache.org/jira/browse/FLINK-23722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397370#comment-17397370 ] Arvid Heise commented on FLINK-23722: - Downgrading to 3.3.0 yields HADOOP-17337. 3.2.2 looks good though. > S3 Tests fail on AZP: Unable to find a region via the region provider chain. > Must provide an explicit region in the builder or setup environment to supply > a region. > > > Key: FLINK-23722 > URL: https://issues.apache.org/jira/browse/FLINK-23722 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.14.0 >Reporter: Arvid Heise >Assignee: Arvid Heise >Priority: Major > Labels: pull-request-available > > E2E and integration tests fail with > {noformat} > Aug 11 09:11:32 Caused by: com.amazonaws.SdkClientException: Unable to find a > region via the region provider chain. Must provide an explicit region in the > builder or setup environment to supply a region. > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424) > Aug 11 09:11:32 at > com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.buildAmazonS3Client(DefaultS3ClientFactory.java:144) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:96) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:753) > Aug 11 09:11:32 at > org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:446) > Aug 11 09:11:32 ... 
44 more > {noformat} > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21884=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-23722) S3 Tests fail on AZP: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
Arvid Heise created FLINK-23722: --- Summary: S3 Tests fail on AZP: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region. Key: FLINK-23722 URL: https://issues.apache.org/jira/browse/FLINK-23722 Project: Flink Issue Type: Improvement Components: Connectors / FileSystem Affects Versions: 1.14.0 Reporter: Arvid Heise Assignee: Arvid Heise E2E and integration tests fail with {noformat} Aug 11 09:11:32 Caused by: com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region. Aug 11 09:11:32 at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462) Aug 11 09:11:32 at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424) Aug 11 09:11:32 at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46) Aug 11 09:11:32 at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.buildAmazonS3Client(DefaultS3ClientFactory.java:144) Aug 11 09:11:32 at org.apache.hadoop.fs.s3a.DefaultS3ClientFactory.createS3Client(DefaultS3ClientFactory.java:96) Aug 11 09:11:32 at org.apache.hadoop.fs.s3a.S3AFileSystem.bindAWSClient(S3AFileSystem.java:753) Aug 11 09:11:32 at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:446) Aug 11 09:11:32 ... 44 more {noformat} https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21884=logs=d44f43ce-542c-597d-bf94-b0718c71e5e8=ed165f3f-d0f6-524b-5279-86f8ee7d0e2d -- This message was sent by Atlassian Jira (v8.3.4#803005)
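The "region provider chain" named in the error can be pictured as an ordered list of candidate sources that are tried until one yields a region. This is an illustrative stand-in, not the actual AWS SDK code:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/**
 * Illustrative stand-in for the AWS SDK's region provider chain: candidate
 * sources (environment variable, system property, explicit builder setting)
 * are consulted in order; the SdkClientException from the stack trace is
 * thrown only when every source comes up empty.
 */
public class RegionChainSketch {
    static Optional<String> resolveRegion(List<Optional<String>> candidates) {
        for (Optional<String> candidate : candidates) {
            if (candidate.isPresent()) {
                return candidate; // first source that yields a region wins
            }
        }
        return Optional.empty(); // SDK: throw SdkClientException("Unable to find a region ...")
    }

    public static void main(String[] args) {
        // Env var and system property unset on CI; an explicit region in the
        // builder (the fix the error message suggests) resolves the chain.
        Optional<String> region = resolveRegion(Arrays.<Optional<String>>asList(
                Optional.empty(),           // e.g. AWS_REGION environment variable
                Optional.empty(),           // e.g. aws.region system property
                Optional.of("us-east-1"))); // explicit region in the builder
        System.out.println(region.get()); // prints "us-east-1"
    }
}
```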
[jira] [Commented] (FLINK-23462) Translate the abfs documentation to chinese
[ https://issues.apache.org/jira/browse/FLINK-23462?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397220#comment-17397220 ] Arvid Heise commented on FLINK-23462: - This is now good to go. Thanks [~Liebing]! > Translate the abfs documentation to chinese > --- > > Key: FLINK-23462 > URL: https://issues.apache.org/jira/browse/FLINK-23462 > Project: Flink > Issue Type: Bug > Components: chinese-translation, Documentation >Reporter: Srinivasulu Punuru >Assignee: Liebing Yu >Priority: Major > Labels: pull-request-available > > Translate the documentation changes that were made in this PR to chinese > https://github.com/apache/flink/pull/16559/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-18562) Add Support for Azure Data Lake Store Gen 2 in Flink File System
[ https://issues.apache.org/jira/browse/FLINK-18562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-18562. - Release Note: Flink now supports reading from Azure Data Lake Store Gen 2 with the flink-azure-fs-hadoop filesystem using the abfs(s) scheme. Resolution: Fixed Merged into master as 0acac5391bb9c8e9de88a3bfab24ba6c2e8d41be. Thanks again! > Add Support for Azure Data Lake Store Gen 2 in Flink File System > > > Key: FLINK-18562 > URL: https://issues.apache.org/jira/browse/FLINK-18562 > Project: Flink > Issue Type: Improvement > Components: Connectors / FileSystem >Affects Versions: 1.12.0 >Reporter: Israel Ekpo >Assignee: Srinivasulu Punuru >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, > pull-request-available > Fix For: 1.14.0 > > > The objective of this improvement is to add support for Azure Data Lake Store > Gen 2 (ADLS Gen2) [1] in the Flink File System [2] > This will allow include the abfs(s) scheme/protocol via ADLS Gen2 to be > available as one of the distributed filesystems that can be used for > savepointing, checkpointing, data sources and sinks in Flink jobs. > It will also simplify the implementation of managed identities [3] which is > an added security benefit in environments where system assigned and user > assigned managed identities can be leveraged. > [1] https://hadoop.apache.org/docs/current/hadoop-azure/abfs.html > [2] https://github.com/apache/flink/tree/master/flink-filesystems > [3] > https://hadoop.apache.org/docs/current/hadoop-azure-datalake/index.html#Using_MSI_.28Managed_Service_Identity.29 -- This message was sent by Atlassian Jira (v8.3.4#803005)
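Per the release note above, checkpointing to ADLS Gen2 via the abfs(s) scheme can then be configured roughly as follows (a sketch; account, container, and key values are placeholders):

```yaml
# flink-conf.yaml fragment — hypothetical values; requires the
# flink-azure-fs-hadoop filesystem mentioned in the release note.
state.checkpoints.dir: abfss://<container>@<account>.dfs.core.windows.net/flink/checkpoints
fs.azure.account.key.<account>.dfs.core.windows.net: <storage-account-access-key>
```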
[jira] [Commented] (FLINK-23647) UnalignedCheckpointStressITCase crashed on azure
[ https://issues.apache.org/jira/browse/FLINK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396853#comment-17396853 ] Arvid Heise commented on FLINK-23647: - Here is my solution to solve the test instability in code https://github.com/apache/flink/commit/08b29798548ed6ee915113b2e000923d5d275259 with some explanations. > UnalignedCheckpointStressITCase crashed on azure > > > Key: FLINK-23647 > URL: https://issues.apache.org/jira/browse/FLINK-23647 > Project: Flink > Issue Type: Bug > Components: Runtime / Network, Tests >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21539=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=4855 > When testing DFS changelog implementation in FLINK-23279 and enabling it for > all tests, > UnalignedCheckpointStressITCase crashed with the following exception > {code} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 18.433 s <<< FAILURE! - in > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase > [ERROR] > runStressTest(org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase) > Time elapsed: 17.663 s <<< ERROR! 
> java.io.UncheckedIOException: java.nio.file.NoSuchFileException: > /tmp/junit7860347244680665820/435237 d57439f2ceadfedba74dadd6fa/chk-16 >at > java.nio.file.FileTreeIterator.fetchNextIfNeeded(FileTreeIterator.java:88) >at java.nio.file.FileTreeIterator.hasNext(FileTreeIterator.java:104) >at java.util.Iterator.forEachRemaining(Iterator.java:115) >at > java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801) >at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) >at > java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) >at > java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) >at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) >at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:546) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.discoverRetainedCheckpoint(UnalignedCheckpointStressITCase.java:288) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runAndTakeExternalCheckpoint(UnalignedCheckpointStressITCase.java:261) >at > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase.runStressTest(UnalignedCheckpointStressITCase.java:157) >at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) >at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) >at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) >at java.lang.reflect.Method.invoke(Method.java:498) >at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) >at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) >at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) >at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) >at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) >at > 
org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at > org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45) >at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61) >at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) >at > org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100) >at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366) >at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103) >at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63) >at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331) >at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79) >at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329) >at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66) >at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293) >at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54) >at org.junit.rules.RunRules.evaluate(RunRules.java:20) >at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306) >at
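The race in this stack trace (a chk- directory deleted while FileTreeIterator is still walking the tree) can be tolerated by retrying the discovery when the walk dies with a NoSuchFileException. A minimal sketch under that assumption, not the actual fix that was merged:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.Optional;
import java.util.stream.Stream;

/** Sketch: discover a retained chk-<n> directory, retrying if a directory
 *  vanishes mid-walk (the UncheckedIOException seen in the stack trace). */
public class RetainedCheckpointDiscovery {
    static Optional<Path> discoverRetainedCheckpoint(Path root, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try (Stream<Path> files = Files.walk(root)) {
                Optional<Path> checkpoint = files
                        .filter(Files::isDirectory)
                        .filter(p -> p.getFileName().toString().startsWith("chk-"))
                        .findFirst();
                if (checkpoint.isPresent()) {
                    return checkpoint;
                }
            } catch (UncheckedIOException e) {
                // A directory was deleted concurrently; retry the walk.
                if (!(e.getCause() instanceof NoSuchFileException)) {
                    throw e;
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("retained-checkpoints");
        Files.createDirectories(root.resolve("job-id").resolve("chk-16"));
        System.out.println(discoverRetainedCheckpoint(root, 3).isPresent()); // prints "true"
    }
}
```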
[jira] [Commented] (FLINK-23647) UnalignedCheckpointStressITCase crashed on azure
[ https://issues.apache.org/jira/browse/FLINK-23647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17396845#comment-17396845 ] Arvid Heise commented on FLINK-23647: - Yes, it would also avoid ugly hacks like https://github.com/AHeise/flink/blob/master/flink-tests/src/test/java/org/apache/flink/test/checkpointing/UnalignedCheckpointTestBase.java#L212-L212 . However, when I just waited for a proper shutdown of the minicluster, it added 1-2s to each ITCase. So we need to be aware that some ITCases take 3x the time then. > UnalignedCheckpointStressITCase crashed on azure > > > Key: FLINK-23647 > URL: https://issues.apache.org/jira/browse/FLINK-23647 > Project: Flink > Issue Type: Bug > Components: Runtime / Network, Tests >Affects Versions: 1.14.0 >Reporter: Roman Khachatryan >Priority: Major > Labels: test-stability > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21539=logs=5c8e7682-d68f-54d1-16a2-a09310218a49=86f654fa-ab48-5c1a-25f4-7e7f6afb9bba=4855 > When testing DFS changelog implementation in FLINK-23279 and enabling it for > all tests, > UnalignedCheckpointStressITCase crashed with the following exception > {code} > [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: > 18.433 s <<< FAILURE! - in > org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase > [ERROR] > runStressTest(org.apache.flink.test.checkpointing.UnalignedCheckpointStressITCase) > Time elapsed: 17.663 s <<< ERROR! 
[jira] [Assigned] (FLINK-23696) RMQSourceTest.testRedeliveredSessionIDsAck fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23696: --- Assignee: Fabian Paul > RMQSourceTest.testRedeliveredSessionIDsAck fails on azure > - > > Key: FLINK-23696 > URL: https://issues.apache.org/jira/browse/FLINK-23696 > Project: Flink > Issue Type: Bug > Components: Connectors/ RabbitMQ >Affects Versions: 1.13.2 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Major > Labels: test-stability > Fix For: 1.13.3 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21792=logs=e1276d0f-df12-55ec-86b5-c0ad597d83c9=906e9244-f3be-5604-1979-e767c8a6f6d9=13297 > {code} > Aug 10 01:15:35 [ERROR] Tests run: 16, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 3.181 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.rabbitmq.RMQSourceTest > Aug 10 01:15:35 [ERROR] > testRedeliveredSessionIDsAck(org.apache.flink.streaming.connectors.rabbitmq.RMQSourceTest) > Time elapsed: 0.269 s <<< FAILURE! 
> Aug 10 01:15:35 java.lang.AssertionError: expected:<25> but was:<27> > Aug 10 01:15:35 at org.junit.Assert.fail(Assert.java:88) > Aug 10 01:15:35 at org.junit.Assert.failNotEquals(Assert.java:834) > Aug 10 01:15:35 at org.junit.Assert.assertEquals(Assert.java:645) > Aug 10 01:15:35 at org.junit.Assert.assertEquals(Assert.java:631) > Aug 10 01:15:35 at > org.apache.flink.streaming.connectors.rabbitmq.RMQSourceTest.testRedeliveredSessionIDsAck(RMQSourceTest.java:407) > Aug 10 01:15:35 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native > Method) > Aug 10 01:15:35 at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > Aug 10 01:15:35 at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > Aug 10 01:15:35 at java.lang.reflect.Method.invoke(Method.java:498) > Aug 10 01:15:35 at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50) > Aug 10 01:15:35 at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > Aug 10 01:15:35 at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47) > Aug 10 01:15:35 at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > Aug 10 01:15:35 at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > Aug 10 01:15:35 at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) > Aug 10 01:15:35 at > org.junit.rules.ExpectedException$ExpectedExceptionStatement.evaluate(ExpectedException.java:239) > Aug 10 01:15:35 at org.junit.rules.RunRules.evaluate(RunRules.java:20) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325) > Aug 10 01:15:35 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78) > Aug 10 01:15:35 at > org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57) > Aug 10 01:15:35 at > 
org.junit.runners.ParentRunner$3.run(ParentRunner.java:290) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner.access$000(ParentRunner.java:58) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268) > Aug 10 01:15:35 at > org.junit.runners.ParentRunner.run(ParentRunner.java:363) > Aug 10 01:15:35 at > org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) > Aug 10 01:15:35 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) > Aug 10 01:15:35 at > org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) > Aug 10 01:15:35 at > org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) > Aug 10 01:15:35 at > org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384) > Aug 10 01:15:35 at > org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345) > Aug 10 01:15:35 at > org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126) > Aug 10 01:15:35 at > org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418){code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (FLINK-23255) Add JUnit 5 jupiter and vintage engine
[ https://issues.apache.org/jira/browse/FLINK-23255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise resolved FLINK-23255. - Resolution: Fixed Merged into master as f8ce4d792a9725e550941fb9d5f4579055e80cf8. > Add JUnit 5 jupiter and vintage engine > -- > > Key: FLINK-23255 > URL: https://issues.apache.org/jira/browse/FLINK-23255 > Project: Flink > Issue Type: Sub-task > Components: Tests >Affects Versions: 1.14.0 >Reporter: Qingsheng Ren >Assignee: Qingsheng Ren >Priority: Major > Labels: pull-request-available > Fix For: 1.14.0 > > > Add dependencies for JUnit 5 jupiter for supporting JUnit 5 tests, and > vintage engine for supporting test cases in JUnit 4 style -- This message was sent by Atlassian Jira (v8.3.4#803005)
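For context, the dependency change described in this ticket is the standard way to run JUnit 4 and JUnit 5 tests side by side: the jupiter artifact provides the JUnit 5 API and engine, while the vintage engine executes existing JUnit 4-style tests on the JUnit 5 platform. A sketch of the Maven dependencies involved (the version property is illustrative, not taken from the Flink build):

```xml
<!-- JUnit 5 API + engine for new tests -->
<dependency>
    <groupId>org.junit.jupiter</groupId>
    <artifactId>junit-jupiter</artifactId>
    <version>${junit5.version}</version>
    <scope>test</scope>
</dependency>
<!-- Vintage engine: runs existing JUnit 4 tests on the JUnit 5 platform -->
<dependency>
    <groupId>org.junit.vintage</groupId>
    <artifactId>junit-vintage-engine</artifactId>
    <version>${junit5.version}</version>
    <scope>test</scope>
</dependency>
```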
[jira] [Commented] (FLINK-23674) flink restart with checkpoint, kafka producer throw exception
[ https://issues.apache.org/jira/browse/FLINK-23674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17395843#comment-17395843 ] Arvid Heise commented on FLINK-23674: - Can you please provide the full log? > flink restart with checkpoint, kafka producer throw exception > -- > > Key: FLINK-23674 > URL: https://issues.apache.org/jira/browse/FLINK-23674 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.13.1 > Environment: flink: flink-1.13.1 > kafka: _2.12-2.5.0 > java: 1.8.0_161 >Reporter: meetsong >Priority: Major > > > When I test Flink exactly-once (EOS) with a Kafka sink: first I click the Cancel button on the Flink web UI, then I run the following command on the console
> {code:java}
> bin/flink run -n -c com.shanjiancaofu.live.job.ChargeJob -s file:/soft/opt/checkpoint/072c0a72343c6e1f06b9bd37c5147cc0/chk-1/_metadata ./ad-live-process-0.11-jar-with-dependencies.jar
> {code}
> After about 10 seconds it throws an exception:
> {code:java}
> Caused by: org.apache.kafka.common.KafkaException: Unexpected error in InitProducerIdResponse; Producer attempted an operation with an old epoch. Either there is a newer producer with the same transactionalId, or the producer's transaction has been expired by the broker.
> {code}
> and my code is:
> {code:java}
> package com.shanjiancaofu.live.job;
>
> import com.alibaba.fastjson.JSON;
> import lombok.AllArgsConstructor;
> import lombok.Data;
> import lombok.NoArgsConstructor;
> import lombok.extern.slf4j.Slf4j;
> import org.apache.commons.lang.SystemUtils;
> import org.apache.flink.api.common.restartstrategy.RestartStrategies;
> import org.apache.flink.api.common.serialization.SimpleStringSchema;
> import org.apache.flink.api.common.state.ListState;
> import org.apache.flink.api.common.state.ListStateDescriptor;
> import org.apache.flink.api.common.time.Time;
> import org.apache.flink.api.common.typeinfo.TypeHint;
> import org.apache.flink.api.common.typeinfo.TypeInformation;
> import org.apache.flink.api.java.functions.KeySelector;
> import org.apache.flink.configuration.Configuration;
> import org.apache.flink.runtime.state.filesystem.FsStateBackend;
> import org.apache.flink.streaming.api.CheckpointingMode;
> import org.apache.flink.streaming.api.TimeCharacteristic;
> import org.apache.flink.streaming.api.environment.CheckpointConfig;
> import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
> import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
> import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
> import org.apache.flink.streaming.connectors.kafka.KafkaDeserializationSchema;
> import org.apache.flink.streaming.connectors.kafka.internals.KeyedSerializationSchemaWrapper;
> import org.apache.flink.util.Collector;
> import org.apache.kafka.clients.consumer.ConsumerConfig;
> import org.apache.kafka.clients.consumer.ConsumerRecord;
> import org.apache.kafka.clients.producer.ProducerConfig;
> import org.apache.kafka.common.IsolationLevel;
>
> import java.util.*;
>
> @Slf4j
> public class ChargeJob1 {
>     static class RecordScheme implements KafkaDeserializationSchema<ConsumerRecord<String, UserEvent>> {
>         @Override
>         public boolean isEndOfStream(ConsumerRecord<String, UserEvent> stringUserEventConsumerRecord) {
>             return false;
>         }
>
>         @Override
>         public ConsumerRecord<String, UserEvent> deserialize(ConsumerRecord<byte[], byte[]> consumerRecord) throws Exception {
>             String key = null;
>             UserEvent UserEvent = null;
>             if (consumerRecord.key() != null) {
>                 key = new String(consumerRecord.key());
>             }
>             if (consumerRecord.value() != null) {
>                 UserEvent = JSON.parseObject(new String(consumerRecord.value()), UserEvent.class);
>             }
>             return new ConsumerRecord<>(
>                 consumerRecord.topic(),
>                 consumerRecord.partition(),
>                 consumerRecord.offset(),
>                 consumerRecord.timestamp(),
>                 consumerRecord.timestampType(),
>                 consumerRecord.checksum(),
>                 consumerRecord.serializedKeySize(),
>                 consumerRecord.serializedValueSize(),
>                 key, UserEvent);
>         }
>
>         @Override
>         public TypeInformation<ConsumerRecord<String, UserEvent>> getProducedType() {
>             return TypeInformation.of(new TypeHint<ConsumerRecord<String, UserEvent>>() {
>             });
>         }
>     }
>
>     public static void main(String[] args) throws Exception {
>         Configuration configuration = new Configuration();
>         if (args != null) {
>             // pass global parameters
>             configuration.setString("args", String.join(" ", args));
>         }
>         StreamExecutionEnvironment env =
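The "old epoch" error quoted above generally means the broker has already fenced or expired the transaction that the restored job tries to recover: if a checkpoint-to-restart gap exceeds the producer's transaction timeout, the broker aborts the transaction and bumps the epoch, and the recovering producer is rejected. A minimal sketch of the producer settings usually involved (broker address and values are illustrative assumptions, not taken from the reporter's job):

```java
import java.util.Properties;

public class KafkaEosProps {
    // Sketch: producer properties relevant to the InitProducerIdResponse /
    // "old epoch" error above. Values here are illustrative assumptions.
    static Properties buildProducerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
        // transaction.timeout.ms must cover the longest expected gap between a
        // checkpoint and the restart that recovers it, and must not exceed the
        // broker-side transaction.max.timeout.ms (broker default: 15 min =
        // 900000 ms); otherwise the broker expires the transaction and fences
        // the recovering producer.
        props.setProperty("transaction.timeout.ms", "900000");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(buildProducerProps().getProperty("transaction.timeout.ms"));
    }
}
```

These properties would be passed to the `FlinkKafkaProducer` constructor together with `FlinkKafkaProducer.Semantic.EXACTLY_ONCE`; note that Flink's default producer transaction timeout is longer than the broker's default maximum, which is a common source of this mismatch.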
[jira] [Assigned] (FLINK-23683) KafkaSinkITCase hangs on azure
[ https://issues.apache.org/jira/browse/FLINK-23683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23683: --- Assignee: Fabian Paul > KafkaSinkITCase hangs on azure > -- > > Key: FLINK-23683 > URL: https://issues.apache.org/jira/browse/FLINK-23683 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Major > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21738=logs=4be4ed2b-549a-533d-aa33-09e28e360cc8=f7d83ad5-3324-5307-0eb3-819065cdcb38=7886 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (FLINK-23678) KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure
[ https://issues.apache.org/jira/browse/FLINK-23678?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arvid Heise reassigned FLINK-23678: --- Assignee: Fabian Paul > KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee fails on azure > -- > > Key: FLINK-23678 > URL: https://issues.apache.org/jira/browse/FLINK-23678 > Project: Flink > Issue Type: Bug > Components: Connectors / Kafka >Affects Versions: 1.14.0 >Reporter: Xintong Song >Assignee: Fabian Paul >Priority: Blocker > Labels: test-stability > Fix For: 1.14.0 > > > https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=21711=logs=b0097207-033c-5d9a-b48c-6d4796fbe60d=8338a7d2-16f7-52e5-f576-4b7b3071eb3d=6946 > {code} > Aug 07 00:12:18 [ERROR] Tests run: 7, Failures: 1, Errors: 0, Skipped: 0, > Time elapsed: 67.431 s <<< FAILURE! - in > org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase > Aug 07 00:12:18 [ERROR] > testWriteRecordsToKafkaWithExactlyOnceGuarantee(org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase) > Time elapsed: 7.001 s <<< FAILURE! 
> Aug 07 00:12:18 java.lang.AssertionError: expected:<407799> but was:<407798>
> Aug 07 00:12:18 	at org.junit.Assert.fail(Assert.java:89)
> Aug 07 00:12:18 	at org.junit.Assert.failNotEquals(Assert.java:835)
> Aug 07 00:12:18 	at org.junit.Assert.assertEquals(Assert.java:647)
> Aug 07 00:12:18 	at org.junit.Assert.assertEquals(Assert.java:633)
> Aug 07 00:12:18 	at org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase.writeRecordsToKafka(KafkaSinkITCase.java:334)
> Aug 07 00:12:18 	at org.apache.flink.streaming.connectors.kafka.sink.KafkaSinkITCase.testWriteRecordsToKafkaWithExactlyOnceGuarantee(KafkaSinkITCase.java:173)
> Aug 07 00:12:18 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> Aug 07 00:12:18 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> Aug 07 00:12:18 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> Aug 07 00:12:18 	at java.lang.reflect.Method.invoke(Method.java:498)
> Aug 07 00:12:18 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
> Aug 07 00:12:18 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> Aug 07 00:12:18 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
> Aug 07 00:12:18 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> Aug 07 00:12:18 	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
> Aug 07 00:12:18 	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
> Aug 07 00:12:18 	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> Aug 07 00:12:18 	at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
> Aug 07 00:12:18 	at org.apache.flink.util.TestNameProvider$1.evaluate(TestNameProvider.java:45)
> Aug 07 00:12:18 	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Aug 07 00:12:18 	at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
> Aug 07 00:12:18 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
> Aug 07 00:12:18 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
> Aug 07 00:12:18 	at org.testcontainers.containers.FailureDetectingExternalResource$1.evaluate(FailureDetectingExternalResource.java:30)
> Aug 07 00:12:18 	at org.junit.rules.RunRules.evaluate(RunRules.java:20)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
> Aug 07 00:12:18 	at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
> Aug 07 00:12:18 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> Aug