[ 
https://issues.apache.org/jira/browse/FLINK-24163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17478366#comment-17478366
 ] 

Yun Gao edited comment on FLINK-24163 at 1/19/22, 7:18 AM:
-----------------------------------------------------------

This seems to be due to a different reason.

Hi [~roman], a binary search over the commits suggests that with 
https://issues.apache.org/jira/browse/FLINK-25395 the running time of 
PartiallyFinishedSourcesITCase#test[complex graph SINGLE_SUBTASK, failover: 
true, strategy: region] increased from 2s to about 1 minute. The test is 
blocked on restoring state after failover:
{code:java}
"transform-2-keyed (1/4)#1" #1517 prio=5 os_prio=31 tid=0x00007f862136a000 nid=0x10423 runnable [0x0000700011fee000]
   java.lang.Thread.State: RUNNABLE
        at java.io.FileInputStream.readBytes(Native Method)
        at java.io.FileInputStream.read(FileInputStream.java:255)
        at org.apache.flink.core.fs.local.LocalDataInputStream.read(LocalDataInputStream.java:73)
        at org.apache.flink.core.fs.FSDataInputStreamWrapper.read(FSDataInputStreamWrapper.java:60)
        at org.apache.flink.runtime.util.ForwardingInputStream.read(ForwardingInputStream.java:52)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.flink.api.java.typeutils.runtime.DataInputViewStream.read(DataInputViewStream.java:68)
        at com.esotericsoftware.kryo.io.Input.fill(Input.java:146)
        at org.apache.flink.api.java.typeutils.runtime.NoFetchingInput.require(NoFetchingInput.java:77)
        at com.esotericsoftware.kryo.io.Input.readAscii_slow(Input.java:598)
        at com.esotericsoftware.kryo.io.Input.readAscii(Input.java:576)
        at com.esotericsoftware.kryo.io.Input.readString(Input.java:454)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:177)
        at com.esotericsoftware.kryo.serializers.DefaultSerializers$StringSerializer.read(DefaultSerializers.java:166)
        at com.esotericsoftware.kryo.Kryo.readObjectOrNull(Kryo.java:730)
        at com.esotericsoftware.kryo.serializers.ObjectField.read(ObjectField.java:113)
        at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:528)
        at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
        at org.apache.flink.api.java.typeutils.runtime.kryo.KryoSerializer.deserialize(KryoSerializer.java:402)
        at org.apache.flink.runtime.state.heap.StateTableByKeyGroupReaders.lambda$createV2PlusReader$0(StateTableByKeyGroupReaders.java:78)
        at org.apache.flink.runtime.state.heap.StateTableByKeyGroupReaders$$Lambda$2196/1169355256.readElement(Unknown Source)
        at org.apache.flink.runtime.state.KeyGroupPartitioner$PartitioningResultKeyGroupReader.readMappingsInKeyGroup(KeyGroupPartitioner.java:297)
        at org.apache.flink.runtime.state.heap.HeapRestoreOperation.readKeyGroupStateData(HeapRestoreOperation.java:258)
        at org.apache.flink.runtime.state.heap.HeapRestoreOperation.readStateHandleStateData(HeapRestoreOperation.java:220)
        at org.apache.flink.runtime.state.heap.HeapRestoreOperation.restore(HeapRestoreOperation.java:166)
        at org.apache.flink.runtime.state.heap.HeapRestoreOperation.restore(HeapRestoreOperation.java:62)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.restoreState(HeapKeyedStateBackendBuilder.java:169)
        at org.apache.flink.runtime.state.heap.HeapKeyedStateBackendBuilder.build(HeapKeyedStateBackendBuilder.java:106)
        at org.apache.flink.runtime.state.hashmap.HashMapStateBackend.createKeyedStateBackend(HashMapStateBackend.java:137)
 {code}
Could you help take a look?

The commit before the PR on the master branch is 
265a0a0708ae743c63505bb02e0659984a565fbb and the commit right after the PR is 
4691b66545010ed812624a259869c7a522663720.
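
For anyone repeating the search between those two commits, the idea can be sketched roughly as follows. This is only an illustration and not part of Flink: the class `CommitBisect`, the placeholder commit names `c0`, `c1`, `c4` and the `isBad` predicate are hypothetical. In practice the predicate would stand for "check out this commit, run the test, and see whether it takes about a minute", and `git bisect` does the same bookkeeping. The sketch assumes the history is ordered oldest to newest, the first commit is good, the last is bad, and the regression persists once it appears.

```java
import java.util.List;
import java.util.function.Predicate;

/** Sketch: binary search for the first commit at which a regression appears. */
public final class CommitBisect {

    /**
     * Returns the index of the first "bad" commit.
     * Assumes commits.get(0) is good and the last commit is bad.
     */
    public static int firstBad(List<String> commits, Predicate<String> isBad) {
        if (commits.size() < 2) {
            throw new IllegalArgumentException("need a known-good and a known-bad commit");
        }
        int good = 0;                  // index of a commit known to be good
        int bad = commits.size() - 1;  // index of a commit known to be bad
        while (bad - good > 1) {
            int mid = good + (bad - good) / 2;
            if (isBad.test(commits.get(mid))) {
                bad = mid;             // regression is at mid or earlier
            } else {
                good = mid;            // regression is after mid
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Hypothetical history; only the two full hashes come from this comment.
        List<String> history = List.of(
                "c0",
                "c1",
                "265a0a0708ae743c63505bb02e0659984a565fbb",
                "4691b66545010ed812624a259869c7a522663720",
                "c4");
        // Stand-in for actually building the commit and timing the test.
        int firstSlow = 3;
        int idx = firstBad(history, c -> history.indexOf(c) >= firstSlow);
        System.out.println(history.get(idx));
    }
}
```

Each step halves the remaining range, so even a few hundred commits need fewer than ten test runs.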


> PartiallyFinishedSourcesITCase fails due to timeout
> ---------------------------------------------------
>
>                 Key: FLINK-24163
>                 URL: https://issues.apache.org/jira/browse/FLINK-24163
>             Project: Flink
>          Issue Type: Bug
>          Components: API / DataStream
>    Affects Versions: 1.14.0, 1.15.0
>            Reporter: Xintong Song
>            Assignee: Yun Gao
>            Priority: Blocker
>              Labels: pull-request-available, test-stability
>             Fix For: 1.14.0, 1.15.0
>
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=23529&view=logs&j=4d4a0d10-fca2-5507-8eed-c07f0bdf4887&t=7b25afdf-cc6c-566f-5459-359dc2585798&l=10996
> {code}
> Sep 04 04:35:28 [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 155.236 s <<< FAILURE! - in org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase
> Sep 04 04:35:28 [ERROR] test[complex graph ALL_SUBTASKS, failover: false]  Time elapsed: 65.999 s  <<< ERROR!
> Sep 04 04:35:28 java.util.concurrent.TimeoutException: Condition was not met in given timeout.
> Sep 04 04:35:28       at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:164)
> Sep 04 04:35:28       at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:142)
> Sep 04 04:35:28       at org.apache.flink.runtime.testutils.CommonTestUtils.waitUntilCondition(CommonTestUtils.java:134)
> Sep 04 04:35:28       at org.apache.flink.runtime.testutils.CommonTestUtils.waitForSubtasksToFinish(CommonTestUtils.java:297)
> Sep 04 04:35:28       at org.apache.flink.runtime.operators.lifecycle.TestJobExecutor.waitForSubtasksToFinish(TestJobExecutor.java:219)
> Sep 04 04:35:28       at org.apache.flink.runtime.operators.lifecycle.PartiallyFinishedSourcesITCase.test(PartiallyFinishedSourcesITCase.java:82)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
