[jira] [Commented] (FLINK-5999) MiniClusterITCase.runJobWithMultipleRpcServices fails

2017-03-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15933146#comment-15933146
 ] 

ASF GitHub Bot commented on FLINK-5999:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/3526


> MiniClusterITCase.runJobWithMultipleRpcServices fails
> -----------------------------------------------------
>
> Key: FLINK-5999
> URL: https://issues.apache.org/jira/browse/FLINK-5999
> Project: Flink
> Issue Type: Test
> Components: Distributed Coordination, Tests
> Reporter: Ufuk Celebi
> Assignee: Till Rohrmann
> Priority: Critical
> Labels: test-stability
>
> In a branch with unrelated changes to the web frontend I've seen the
> following test fail:
> {code}
> runJobWithMultipleRpcServices(org.apache.flink.runtime.minicluster.MiniClusterITCase)
>   Time elapsed: 1.145 sec  <<< ERROR!
> java.util.ConcurrentModificationException: null
>   at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
>   at java.util.HashMap$ValueIterator.next(HashMap.java:1458)
>   at org.apache.flink.runtime.resourcemanager.JobLeaderIdService.clear(JobLeaderIdService.java:114)
>   at org.apache.flink.runtime.resourcemanager.JobLeaderIdService.stop(JobLeaderIdService.java:92)
>   at org.apache.flink.runtime.resourcemanager.ResourceManager.shutDown(ResourceManager.java:182)
>   at org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDownInternally(ResourceManagerRunner.java:83)
>   at org.apache.flink.runtime.resourcemanager.ResourceManagerRunner.shutDown(ResourceManagerRunner.java:78)
>   at org.apache.flink.runtime.minicluster.MiniCluster.shutdownInternally(MiniCluster.java:313)
>   at org.apache.flink.runtime.minicluster.MiniCluster.shutdown(MiniCluster.java:281)
>   at org.apache.flink.runtime.minicluster.MiniClusterITCase.runJobWithMultipleRpcServices(MiniClusterITCase.java:72)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (FLINK-5999) MiniClusterITCase.runJobWithMultipleRpcServices fails

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930363#comment-15930363
 ] 

ASF GitHub Bot commented on FLINK-5999:
---

Github user StephanEwen commented on the issue:

https://github.com/apache/flink/pull/3526
  
Looks good to me, +1 to merge




[jira] [Commented] (FLINK-5999) MiniClusterITCase.runJobWithMultipleRpcServices fails

2017-03-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15930080#comment-15930080
 ] 

ASF GitHub Bot commented on FLINK-5999:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/3526
  
Travis passed. Rebasing the PR. If Travis passes on the latest master, I will merge the PR.




[jira] [Commented] (FLINK-5999) MiniClusterITCase.runJobWithMultipleRpcServices fails

2017-03-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15907612#comment-15907612
 ] 

ASF GitHub Bot commented on FLINK-5999:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/3526

[FLINK-5999] [resMgnr] Move JobLeaderIdService shut down into ResourceManagerRunner

The JobLeaderIdService is created by the ResourceManagerRunner and then
handed to a ResourceManager. Previously, the ResourceManager stopped the
service before being stopped itself. This could lead to a
ConcurrentModificationException when a state-changing action executed by
the actor thread modified the service while it was being cleared. To avoid
this concurrent modification, the service is now shut down after the
ResourceManager has been shut down.
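
The ordering described above can be illustrated with a minimal Java sketch.
It is not the Flink code: LeaderIdRegistry, mutationDuringIterationFails and
shutDownInOrder are hypothetical names, and a single-threaded executor stands
in for the actor thread. The first method reproduces the fail-fast check from
the stack trace deterministically by changing a HashMap while iterating over
it. The second applies the ordering from the fix: stop the thread that
mutates the state first, then clear the state.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch; none of these names are taken from the Flink code base.
class ShutdownOrderSketch {

    /** Stand-in for a service that keeps per-job state in a plain, unsynchronized HashMap. */
    static final class LeaderIdRegistry {
        private final Map<String, String> listeners = new HashMap<>();

        void add(String jobId) {
            listeners.put(jobId, "listener-" + jobId);
        }

        /** Iterates over every listener, then empties the map; returns what was stopped. */
        List<String> clear() {
            List<String> stopped = new ArrayList<>();
            for (String listener : listeners.values()) {
                stopped.add(listener);
            }
            listeners.clear();
            return stopped;
        }
    }

    /** Triggers the HashMap fail-fast check by adding a key in the middle of an iteration. */
    static boolean mutationDuringIterationFails() {
        Map<String, String> map = new HashMap<>();
        map.put("job-0", "listener-0");
        map.put("job-1", "listener-1");
        try {
            for (String ignored : map.values()) {
                map.put("job-2", "listener-2"); // structural change while iterating
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Stops the mutating thread first and clears the registry only afterwards. */
    static int shutDownInOrder(int jobs) {
        LeaderIdRegistry registry = new LeaderIdRegistry();
        // Assumption: a single-threaded executor plays the role of the actor thread.
        ExecutorService mainThread = Executors.newSingleThreadExecutor();
        for (int i = 0; i < jobs; i++) {
            String jobId = "job-" + i;
            mainThread.execute(() -> registry.add(jobId));
        }
        mainThread.shutdown(); // already submitted tasks still run to completion
        try {
            if (!mainThread.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("main thread did not terminate in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for shutdown", e);
        }
        // No other thread can touch the registry any more, so iterating is safe.
        return registry.clear().size();
    }

    public static void main(String[] args) {
        System.out.println("mutation during iteration fails: " + mutationDuringIterationFails());
        System.out.println("listeners stopped after ordered shutdown: " + shutDownInOrder(3));
    }
}
```

Calling clear() while the executor is still running tasks would race with
add() and could fail in the same way as the reported test, but only
intermittently, which is why the sketch shows the failure mode in a single
thread instead.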

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink resourceManagerServiceLifecycle

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/3526.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3526


commit 978ad4d55c0b52931c00d994c676dfd1d57b45b0
Author: Till Rohrmann 
Date:   2017-03-13T14:55:02Z

[FLINK-5999] [resMgnr] Move JobLeaderIdService shut down into ResourceManagerRunner






[jira] [Commented] (FLINK-5999) MiniClusterITCase.runJobWithMultipleRpcServices fails

2017-03-08 Thread Ufuk Celebi (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15901330#comment-15901330
 ] 

Ufuk Celebi commented on FLINK-5999:


https://s3.amazonaws.com/archive.travis-ci.org/jobs/208981739/log.txt
