[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-11-23 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17237359#comment-17237359
 ] 

Arvid Heise commented on FLINK-19791:
-

I think the second issue is a duplicate of 
https://issues.apache.org/jira/browse/FLINK-19925 . I'm closing this ticket as 
resolved.

> PartitionRequestClientFactoryTest.testInterruptsNotCached fails with 
> NullPointerException
> -
>
>              Key: FLINK-19791
>              URL: https://issues.apache.org/jira/browse/FLINK-19791
>          Project: Flink
>       Issue Type: Bug
>       Components: Runtime / Network
> Affects Versions: 1.12.0
>         Reporter: Robert Metzger
>         Assignee: Roman Khachatryan
>         Priority: Major
>           Labels: pull-request-available, test-stability
>          Fix For: 1.12.0
>
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8517=logs=6e58d712-c5cc-52fb-0895-6ff7bd56c46b=f30a8e80-b2cf-535c-9952-7f521a4ae374
> {code}
> 2020-10-23T13:25:12.0774554Z [ERROR] 
> testInterruptsNotCached(org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactoryTest)
>   Time elapsed: 0.762 s  <<< ERROR!
> 2020-10-23T13:25:12.0775695Z java.io.IOException: 
> java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connecting to remote task manager '934dfa03c743/172.18.0.2:8080' has failed. 
> This might indicate that the remote task manager has been lost.
> 2020-10-23T13:25:12.0776455Z  at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:95)
> 2020-10-23T13:25:12.0777038Z  at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactoryTest.testInterruptsNotCached(PartitionRequestClientFactoryTest.java:72)
> 2020-10-23T13:25:12.0777465Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-10-23T13:25:12.0777815Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-10-23T13:25:12.0778221Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-10-23T13:25:12.0778581Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-10-23T13:25:12.0778921Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-10-23T13:25:12.0779331Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-10-23T13:25:12.0779733Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-10-23T13:25:12.0780117Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-10-23T13:25:12.0780484Z  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2020-10-23T13:25:12.0780851Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2020-10-23T13:25:12.0781236Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2020-10-23T13:25:12.0781600Z  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2020-10-23T13:25:12.0781937Z  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2020-10-23T13:25:12.0782431Z  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2020-10-23T13:25:12.0782877Z  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2020-10-23T13:25:12.0783223Z  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2020-10-23T13:25:12.0783541Z  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2020-10-23T13:25:12.0783905Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 2020-10-23T13:25:12.0784315Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 2020-10-23T13:25:12.0784718Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 2020-10-23T13:25:12.0785125Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 2020-10-23T13:25:12.0785552Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 2020-10-23T13:25:12.0785980Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 2020-10-23T13:25:12.0786379Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> 2020-10-23T13:25:12.0786763Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> 2020-10-23T13:25:12.0787922Z Caused by: 
> java.util.concurrent.ExecutionException: 
> 

[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-11-16 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17232974#comment-17232974
 ] 

Roman Khachatryan commented on FLINK-19791:
---

The original issue was caused by an incorrect test. Here, the test doesn't seem 
to be involved.

[~rmetzger], if you suspect this is not a network problem, could you please 
open a separate ticket and close this one?


[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-11-13 Thread Roman Khachatryan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231546#comment-17231546
 ] 

Roman Khachatryan commented on FLINK-19791:
---

I think it's indeed a network error as the message suggests.

The wrapped NullPointerException just means that the channel isn't connected 
(because of some previous error).
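
To illustrate the wrapping described above, here is a minimal sketch (not Flink's actual code) of how a failure inside a `CompletableFuture` ends up nested under an `ExecutionException` and an `IOException`, and how the innermost cause can be recovered. The class name `CauseChainSketch`, the helper `simulateFailedConnect`, and the use of `IllegalStateException` as a stand-in for `RemoteTransportException` are assumptions made for illustration only.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class CauseChainSketch {

    /** Returns the simple class names along the cause chain, outermost first. */
    static List<String> causeChain(Throwable t) {
        List<String> names = new ArrayList<>();
        for (Throwable cur = t; cur != null; cur = cur.getCause()) {
            names.add(cur.getClass().getSimpleName());
        }
        return names;
    }

    /** Follows getCause() until the innermost throwable is reached. */
    static Throwable rootCause(Throwable t) {
        Throwable cur = t;
        while (cur.getCause() != null) {
            cur = cur.getCause();
        }
        return cur;
    }

    /**
     * Hypothetical stand-in for a failed connect: an NPE is wrapped in a
     * transport-level exception (IllegalStateException here, as an assumption),
     * surfaces from get() as ExecutionException, and is rethrown as IOException.
     */
    static IOException simulateFailedConnect() {
        CompletableFuture<Void> connect = new CompletableFuture<>();
        connect.completeExceptionally(
                new IllegalStateException(
                        "Connecting to remote task manager has failed.",
                        new NullPointerException("channel not connected")));
        try {
            connect.get();
            throw new AssertionError("future should have failed");
        } catch (ExecutionException e) {
            // get() wraps the exception the future was completed with
            return new IOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return new IOException(e);
        }
    }

    public static void main(String[] args) {
        IOException failure = simulateFailedConnect();
        System.out.println(causeChain(failure));
        System.out.println(rootCause(failure));
    }
}
```

Walking the chain this way makes it easier to see that the outer exceptions only report the failed connection, while the innermost one hints at what was missing at that moment.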


[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-11-13 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17231456#comment-17231456
 ] 

Robert Metzger commented on FLINK-19791:


I'm not sure this problem has really been fixed. While testing RC1 of 
Flink 1.12.0, I saw the following exception:

{code}
2020-11-13 14:39:15,566 INFO  org.apache.flink.runtime.executiongraph.ExecutionGraph   [] - Co-Flat Map (1/4) (0602ab4f0306596872a928c6375bd153) switched from RUNNING to FAILED on org.apache.flink.runtime.jobmaster.slotpool.SingleLogicalSlot@4102bd05.
org.apache.flink.runtime.io.network.partition.consumer.PartitionConnectionException: Connection for partition be51d31b9b1185e636f8b0e964615117#1@96cf744116e8d64d20ca53ccedac43c3 not reachable.
    at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:163) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.internalRequestPartitions(SingleInputGate.java:314) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.requestPartitions(SingleInputGate.java:286) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.taskmanager.InputGateWithMetrics.requestPartitions(InputGateWithMetrics.java:94) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.runThrowing(StreamTaskActionExecutor.java:47) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.Mail.run(Mail.java:78) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.processMail(MailboxProcessor.java:283) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:184) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:577) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:541) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:722) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:547) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_222]
Caused by: java.io.IOException: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/192.168.1.25:57359' has failed. This might indicate that the remote task manager has been lost.
    at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:95) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:160) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    ... 12 more
Caused by: java.util.concurrent.ExecutionException: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/192.168.1.25:57359' has failed. This might indicate that the remote task manager has been lost.
    at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) ~[?:1.8.0_222]
    at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1895) ~[?:1.8.0_222]
    at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:88) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.createPartitionRequestClient(NettyConnectionManager.java:67) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    at org.apache.flink.runtime.io.network.partition.consumer.RemoteInputChannel.requestSubpartition(RemoteInputChannel.java:160) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
    ... 12 more
Caused by: org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: Connecting to remote task manager '/192.168.1.25:57359' has failed. This might indicate that the remote task manager has been lost.
    at org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.connect(PartitionRequestClientFactory.java:134) ~[flink-dist_2.11-1.12.0.jar:1.12.0]
at 

[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-10-26 Thread Arvid Heise (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220523#comment-17220523
 ] 

Arvid Heise commented on FLINK-19791:
-

Merged a fix into master as 0184672733fb2417ca9c23c30f5183bb3dff5dd0 .


[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-10-25 Thread Dian Fu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17220435#comment-17220435
 ] 

Dian Fu commented on FLINK-19791:
-

https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=8249=logs=f0ac5c25-1168-55a5-07ff-0e88223afed9=0dbaca5d-7c38-52e6-f4fe-2fb69ccb3ada


[jira] [Commented] (FLINK-19791) PartitionRequestClientFactoryTest.testInterruptsNotCached fails with NullPointerException

2020-10-23 Thread Robert Metzger (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-19791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17219680#comment-17219680
 ] 

Robert Metzger commented on FLINK-19791:


CC [~roman_khachatryan]

> PartitionRequestClientFactoryTest.testInterruptsNotCached fails with 
> NullPointerException
> -
>
>              Key: FLINK-19791
>              URL: https://issues.apache.org/jira/browse/FLINK-19791
>          Project: Flink
>       Issue Type: Bug
>       Components: Runtime / Network
> Affects Versions: 1.12.0
>         Reporter: Robert Metzger
>         Priority: Major
>           Labels: test-stability
>
> https://dev.azure.com/rmetzger/Flink/_build/results?buildId=8517=logs=6e58d712-c5cc-52fb-0895-6ff7bd56c46b=f30a8e80-b2cf-535c-9952-7f521a4ae374
> {code}
> 2020-10-23T13:25:12.0774554Z [ERROR] 
> testInterruptsNotCached(org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactoryTest)
>   Time elapsed: 0.762 s  <<< ERROR!
> 2020-10-23T13:25:12.0775695Z java.io.IOException: 
> java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connecting to remote task manager '934dfa03c743/172.18.0.2:8080' has failed. 
> This might indicate that the remote task manager has been lost.
> 2020-10-23T13:25:12.0776455Z  at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactory.createPartitionRequestClient(PartitionRequestClientFactory.java:95)
> 2020-10-23T13:25:12.0777038Z  at 
> org.apache.flink.runtime.io.network.netty.PartitionRequestClientFactoryTest.testInterruptsNotCached(PartitionRequestClientFactoryTest.java:72)
> 2020-10-23T13:25:12.0777465Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 2020-10-23T13:25:12.0777815Z  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 2020-10-23T13:25:12.0778221Z  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 2020-10-23T13:25:12.0778581Z  at 
> java.lang.reflect.Method.invoke(Method.java:498)
> 2020-10-23T13:25:12.0778921Z  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 2020-10-23T13:25:12.0779331Z  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 2020-10-23T13:25:12.0779733Z  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 2020-10-23T13:25:12.0780117Z  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 2020-10-23T13:25:12.0780484Z  at 
> org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
> 2020-10-23T13:25:12.0780851Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
> 2020-10-23T13:25:12.0781236Z  at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
> 2020-10-23T13:25:12.0781600Z  at 
> org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
> 2020-10-23T13:25:12.0781937Z  at 
> org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
> 2020-10-23T13:25:12.0782431Z  at 
> org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
> 2020-10-23T13:25:12.0782877Z  at 
> org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
> 2020-10-23T13:25:12.0783223Z  at 
> org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
> 2020-10-23T13:25:12.0783541Z  at 
> org.junit.runners.ParentRunner.run(ParentRunner.java:363)
> 2020-10-23T13:25:12.0783905Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365)
> 2020-10-23T13:25:12.0784315Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273)
> 2020-10-23T13:25:12.0784718Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238)
> 2020-10-23T13:25:12.0785125Z  at 
> org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159)
> 2020-10-23T13:25:12.0785552Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:384)
> 2020-10-23T13:25:12.0785980Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:345)
> 2020-10-23T13:25:12.0786379Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:126)
> 2020-10-23T13:25:12.0786763Z  at 
> org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:418)
> 2020-10-23T13:25:12.0787922Z Caused by: 
> java.util.concurrent.ExecutionException: 
> org.apache.flink.runtime.io.network.netty.exception.RemoteTransportException: 
> Connecting to remote task manager '934dfa03c743/172.18.0.2:8080' has failed. 
> This might indicate that the remote task manager has been lost.
>