Matthias Pohl created FLINK-33574:
-------------------------------------
Summary: testRecoverAfterMultiplePersistsStateWithMultiPart
and testRecoverAfterMultiplePersistsStateWithMultiPart run into timeouts
Key: FLINK-33574
URL: https://issues.apache.org/jira/browse/FLINK-33574
Project: Flink
Issue Type: Bug
Components: Connectors / FileSystem
Affects Versions: 1.19.0
Reporter: Matthias Pohl
[https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54446&view=logs&j=4eda0b4a-bd0d-521a-0916-8285b9be9bb5&t=2ff6d5fa-53a6-53ac-bff7-fa524ea361a9]
Multiple connect_1 stages fail due to a timeout:
{code:java}
Nov 09 02:09:33 "main" #1 prio=5 os_prio=0 tid=0x00007efd5400b800 nid=0x7c0e waiting on condition [0x00007efd5ccd8000]
Nov 09 02:09:33 java.lang.Thread.State: WAITING (parking)
Nov 09 02:09:33 at sun.misc.Unsafe.park(Native Method)
Nov 09 02:09:33 - parking to wait for <0x00000000b762d130> (a java.util.concurrent.CompletableFuture$Signaller)
Nov 09 02:09:33 at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
Nov 09 02:09:33 at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1707)
Nov 09 02:09:33 at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3323)
Nov 09 02:09:33 at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1742)
Nov 09 02:09:33 at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:122)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetCommitter(RecoverableMultiPartUploadImpl.java:56)
Nov 09 02:09:33 at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.closeForCommit(S3RecoverableFsDataOutputStream.java:178)
Nov 09 02:09:33 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:375)
Nov 09 02:09:33 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
Nov 09 02:09:33 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverAfterMultiplePersistsStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:318)
[...]{code}
And
{code:java}
Nov 09 01:53:59 "main" #1 prio=5 os_prio=0 cpu=3732.81ms elapsed=1707.61s tid=0x00007f7bec028000 nid=0x3e5 waiting on condition [0x00007f7bf2c80000]
Nov 09 01:53:59 java.lang.Thread.State: WAITING (parking)
Nov 09 01:53:59 at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
Nov 09 01:53:59 - parking to wait for <0x00000000aff7e730> (a java.util.concurrent.CompletableFuture$Signaller)
Nov 09 01:53:59 at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:194)
Nov 09 01:53:59 at java.util.concurrent.CompletableFuture$Signaller.block([email protected]/CompletableFuture.java:1796)
Nov 09 01:53:59 at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3128)
Nov 09 01:53:59 at java.util.concurrent.CompletableFuture.waitingGet([email protected]/CompletableFuture.java:1823)
Nov 09 01:53:59 at java.util.concurrent.CompletableFuture.get([email protected]/CompletableFuture.java:1998)
Nov 09 01:53:59 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete(RecoverableMultiPartUploadImpl.java:233)
Nov 09 01:53:59 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.awaitPendingPartsUpload(RecoverableMultiPartUploadImpl.java:223)
Nov 09 01:53:59 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:152)
Nov 09 01:53:59 at org.apache.flink.fs.s3.common.writer.RecoverableMultiPartUploadImpl.snapshotAndGetRecoverable(RecoverableMultiPartUploadImpl.java:56)
Nov 09 01:53:59 at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.persist(S3RecoverableFsDataOutputStream.java:167)
Nov 09 01:53:59 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersist(AbstractHadoopRecoverableWriterITCase.java:351)
Nov 09 01:53:59 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testResumeAfterMultiplePersistWithMultiPartUploads(AbstractHadoopRecoverableWriterITCase.java:330)
Nov 09 01:53:59 at org.apache.flink.runtime.fs.hdfs.AbstractHadoopRecoverableWriterITCase.testRecoverFromIntermWithoutAdditionalStateWithMultiPart(AbstractHadoopRecoverableWriterITCase.java:312)
[...]{code}
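In both dumps the main thread is parked in CompletableFuture.get() without a timeout, called from RecoverableMultiPartUploadImpl.awaitPendingPartUploadToComplete, so an upload future that never completes blocks the test until the CI watchdog kills it. The following is only a minimal sketch of a bounded wait, not Flink code and not a proposed fix: the class PendingUploadWaitSketch and the method awaitWithTimeout are hypothetical names, and the 100 ms bound is an arbitrary value chosen for illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration only; this class does not exist in Flink. */
public class PendingUploadWaitSketch {

    /**
     * Waits for a pending upload future with an upper bound instead of an
     * unbounded get(). Returns true if the future completed in time and
     * false if the timeout elapsed first.
     */
    public static boolean awaitWithTimeout(CompletableFuture<?> pendingUpload, long timeoutMillis)
            throws InterruptedException, ExecutionException {
        try {
            pendingUpload.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            // The caller can now fail fast or log diagnostics instead of hanging.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // A future that nobody completes, standing in for a part upload that never finishes.
        CompletableFuture<String> stuck = new CompletableFuture<>();
        System.out.println("stuck upload finished: " + awaitWithTimeout(stuck, 100));

        // An already completed future, standing in for a successful part upload.
        CompletableFuture<String> done = CompletableFuture.completedFuture("etag-1");
        System.out.println("completed upload finished: " + awaitWithTimeout(done, 100));
    }
}
```

With a bound like this, a hanging upload would surface as a failed wait close to the point of the problem rather than as a stage-level timeout. Whether a timeout is appropriate in the production code path is a separate question that this sketch does not answer.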