[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2017-09-14 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16166301#comment-16166301
 ] 

Till Rohrmann commented on FLINK-2733:
--

[~tonycox], I think you were not actually running the fixed code when you 
observed the test failure. I will therefore close the issue.

> ZooKeeperLeaderElectionTest.testZooKeeperReelection fails
> -
>
> Key: FLINK-2733
> URL: https://issues.apache.org/jira/browse/FLINK-2733
> Project: Flink
> Issue Type: Bug
> Components: Distributed Coordination
> Affects Versions: 0.10.0
> Reporter: Robert Metzger
> Assignee: Till Rohrmann
> Labels: test-stability
>
> I observed a test failure in this run: 
> https://travis-ci.org/rmetzger/flink/jobs/81571914
> {code}
> testZooKeeperReelection(org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest)  Time elapsed: 109.794 sec  <<< FAILURE!
> java.lang.AssertionError: expected: but was:
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionTest.testZooKeeperReelection(ZooKeeperLeaderElectionTest.java:171)
> Results :
> Failed tests: 
>   ZooKeeperLeaderElectionTest.testZooKeeperReelection:171 expected: but was:
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335976#comment-15335976
 ] 

ASF GitHub Bot commented on FLINK-2733:
---

Github user asfgit closed the pull request at:

https://github.com/apache/flink/pull/2103




[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15335970#comment-15335970
 ] 

ASF GitHub Bot commented on FLINK-2733:
---

Github user tillrohrmann commented on the issue:

https://github.com/apache/flink/pull/2103
  
Ran 10 Travis builds and couldn't reproduce the ZooKeeperLeaderElectionTest 
failure. Thus, I assume that this PR fixes/hardens the test case. Will merge it 
now.




[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-15 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331478#comment-15331478
 ] 

Till Rohrmann commented on FLINK-2733:
--

My earlier analysis was not correct. I checked, and Curator does of course try 
to reconnect to ZooKeeper if the connection is lost. The actual problem is the 
following (hopefully I got it right this time ;-)

The test uses a leader retrieval service to find out which of the current 
leader contenders is the leader. Assume contender 0 is granted leadership and 
the leader retrieval service is informed about it. Shortly afterwards, 
contender 0's leadership is revoked, but the retrieval service has not yet been 
informed. The test now compares the leader session id reported by the retrieval 
service with that of the contender whose address was returned. Since the 
leadership was revoked, the leader session id returned by the contender is 
null, so the assertion fails. I modified the test to tolerate false-positive 
leaders being returned by the leader retrieval service.
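
To illustrate the idea, here is a minimal, self-contained sketch of such a 
tolerant check. It is not the code from the actual test: the class and method 
names (StaleLeaderCheck, Contender, LeaderInfo, awaitConfirmedLeader) and the 
retry limit are assumptions made up for this example, and the retrieval service 
is modelled as a plain Supplier rather than Flink's real interface.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.UUID;
import java.util.function.Supplier;

class StaleLeaderCheck {

    /** Hypothetical stand-in for a leader contender (not Flink's test class). */
    static final class Contender {
        final String address;
        // null while this contender does not hold leadership
        volatile UUID leaderSessionId;

        Contender(String address) {
            this.address = address;
        }
    }

    /** Hypothetical snapshot of what the retrieval service last reported. */
    static final class LeaderInfo {
        final String address;
        final UUID leaderSessionId;

        LeaderInfo(String address, UUID leaderSessionId) {
            this.address = address;
            this.leaderSessionId = leaderSessionId;
        }
    }

    /**
     * Queries the retrieval service up to maxAttempts times and returns the
     * first report whose session id matches the named contender's current
     * session id. Returns null if every report was stale.
     */
    static LeaderInfo awaitConfirmedLeader(
            Supplier<LeaderInfo> retrievalService,
            Map<String, Contender> contenders,
            int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            LeaderInfo reported = retrievalService.get();
            Contender named = contenders.get(reported.address);
            if (named != null
                    && reported.leaderSessionId != null
                    && reported.leaderSessionId.equals(named.leaderSessionId)) {
                return reported;
            }
            // Stale report (e.g. leadership already revoked): ask again
            // instead of failing. A real test would block here until the
            // listener receives its next notification.
        }
        return null;
    }

    public static void main(String[] args) {
        Contender c0 = new Contender("contender-0");
        Contender c1 = new Contender("contender-1");
        Map<String, Contender> contenders = new HashMap<>();
        contenders.put(c0.address, c0);
        contenders.put(c1.address, c1);

        UUID oldSession = new UUID(0L, 1L);
        UUID newSession = new UUID(0L, 2L);
        // contender-0 was already revoked (session id null); contender-1 leads.
        c1.leaderSessionId = newSession;

        // The first report is out of date, the second reflects the new leader.
        Iterator<LeaderInfo> reports = Arrays.asList(
                new LeaderInfo("contender-0", oldSession),
                new LeaderInfo("contender-1", newSession)).iterator();

        LeaderInfo leader = awaitConfirmedLeader(reports::next, contenders, 2);
        System.out.println(leader.address); // prints "contender-1"
    }
}
```

The point of the sketch is only that a report naming a contender whose session 
id is already null is treated as out of date and triggers another query, 
rather than an immediate assertion failure.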



[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15331453#comment-15331453
 ] 

ASF GitHub Bot commented on FLINK-2733:
---

GitHub user tillrohrmann opened a pull request:

https://github.com/apache/flink/pull/2103

[FLINK-2733] Harden ZooKeeperLeaderElectionTest

Hardens ZooKeeperLeaderElectionTest by allowing the testing listener to return
outdated leader information. This can happen if the ZooKeeper connection
was suspended and the new leader information has not been sent to the
testing listener. In this case, the testing listener will be queried again
to return the actual leader information.

Add debug statements to ZooKeeperLeaderElectionTest.testZooKeeperReelection

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/tillrohrmann/flink fixZooKeeperLeaderElectionTest

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2103.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2103


commit 37127ed6ad198d010c81b4c725c2dd14a8b11872
Author: Till Rohrmann 
Date:   2016-06-06T15:18:59Z

[FLINK-2733] [tests] Harden ZooKeeperLeaderElectionTest

Hardens ZooKeeperLeaderElectionTest by allowing the testing listener to return
outdated leader information. This can happen if the ZooKeeper connection
was suspended and the new leader information has not been sent to the
testing listener. In this case, the testing listener will be queried again
to return the actual leader information.

Add debug statements to ZooKeeperLeaderElectionTest.testZooKeeperReelection






[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-06 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316636#comment-15316636
 ] 

Till Rohrmann commented on FLINK-2733:
--

The problem seems to be a connection loss to the {{ZooKeeper}} testing server. 
Two curator clients lose their connection to the testing server. I suspect that 
the testing listener is one of the affected components. As a consequence the 
testing listener is no longer notified about the changing leader election and 
the test fails.

I assume that this has something to do with the Travis instances and their 
resource consumption, since I couldn't reproduce the problem locally. I propose 
to decrease the number of concurrently connected instances and to increase the 
connection timeout in order to harden the test case.



[jira] [Commented] (FLINK-2733) ZooKeeperLeaderElectionTest.testZooKeeperReelection fails

2016-06-06 Thread Till Rohrmann (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316557#comment-15316557
 ] 

Till Rohrmann commented on FLINK-2733:
--

Another instance: 
https://s3.amazonaws.com/archive.travis-ci.org/jobs/134215896/log.txt
