[jira] [Updated] (ZOOKEEPER-3036) Unexpected exception in zookeeper

2018-07-29 Thread Michael Han (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-3036:
---
Component/s: (was: jmx)
 server
 quorum

> Unexpected exception in zookeeper
> -
>
> Key: ZOOKEEPER-3036
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: quorum, server
>Affects Versions: 3.4.10
> Environment: 3 Zookeepers, 5 kafka servers
>Reporter: Oded
>Priority: Critical
>
> We got an issue with one of the ZooKeepers (the leader), causing the entire
> Kafka cluster to fail:
> 2018-05-09 02:29:01,730 [myid:3] - ERROR 
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected 
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>     at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>     at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
> 2018-05-09 02:29:01,730 [myid:3] - WARN  
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - *** GOODBYE 
> /192.168.0.91:42490 
>  
> We would expect ZooKeeper to elect another leader and the Kafka cluster to
> continue working as expected, but that was not the case.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper

2018-07-29 Thread Michael Han (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561417#comment-16561417
 ] 

Michael Han commented on ZOOKEEPER-3036:


What is the issue related to ZooKeeper in this case? When a learner thread
dies, the leader should be able to start another learner thread once the
follower / observer corresponding to the dead learner thread comes back.



[GitHub] zookeeper pull request #545: ZOOKEEPER-2261 When only secureClientPort is co...

2018-07-29 Thread hanm
Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/545#discussion_r206012592
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -866,6 +866,9 @@ public void setServerCnxnFactory(ServerCnxnFactory 
factory) {
 }
 
 public ServerCnxnFactory getServerCnxnFactory() {
+if (secureServerCnxnFactory != null) {
+return secureServerCnxnFactory;
+}
 return serverCnxnFactory;
 }
 
--- End diff --

I think alternatively we can kill `setSecureServerCnxnFactory` and have
something like:

    public void setServerCnxnFactory(ServerCnxnFactory factory) {
        if (secure) {
            secureServerCnxnFactory = factory;
        } else {
            serverCnxnFactory = factory;
        }
    }

The basic idea is to make the code base consistent and easier to read. Having
mixed methods just costs more brain power to reason about (albeit not a big
one in this case).
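To make the idea concrete, here is a minimal, self-contained sketch of the consistent setter/getter pair being discussed. The class and field names are stand-ins, not the actual ZooKeeperServer code, and the real `ServerCnxnFactory` type is modeled as `Object` to keep the sketch dependency-free:

```java
// Sketch (assumed names): one setter routed by a `secure` flag, paired with a
// getter that prefers the secure factory when only secureClientPort is set.
public class CnxnFactoryHolder {
    private Object serverCnxnFactory;
    private Object secureServerCnxnFactory;

    public void setServerCnxnFactory(Object factory, boolean secure) {
        if (secure) {
            secureServerCnxnFactory = factory;
        } else {
            serverCnxnFactory = factory;
        }
    }

    public Object getServerCnxnFactory() {
        // When only the secure port is configured, only the secure factory
        // exists, so return it first.
        return secureServerCnxnFactory != null ? secureServerCnxnFactory
                                               : serverCnxnFactory;
    }

    public static void main(String[] args) {
        CnxnFactoryHolder holder = new CnxnFactoryHolder();
        Object secureFactory = new Object();
        holder.setServerCnxnFactory(secureFactory, true);
        System.out.println(holder.getServerCnxnFactory() == secureFactory); // true
    }
}
```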


---


[jira] [Commented] (ZOOKEEPER-3082) Fix server snapshot behavior when out of disk space

2018-07-29 Thread Michael Han (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561414#comment-16561414
 ] 

Michael Han commented on ZOOKEEPER-3082:


Committed to 3.6. Merge conflicts with branch-3.5, need a separate pull request 
to get this in 3.5.

> Fix server snapshot behavior when out of disk space
> ---
>
> Key: ZOOKEEPER-3082
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3082
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.6.0, 3.4.12, 3.5.5
>Reporter: Brian Nixon
>Assignee: Brian Nixon
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When the ZK server tries to make a snapshot and the machine is out of disk
> space, snapshot creation fails and throws an IOException. An empty snapshot
> file is created (probably because the server is able to create an entry in
> the directory), but the server is not able to write to the file.
>  
> If snapshot creation fails, the server commits suicide. When it restarts, it
> will do so from the last known good snapshot. However, when it tries to make
> a snapshot again, the same thing happens. This results in lots of empty
> snapshot files being created. If the DataDirCleanupManager eventually
> garbage-collects the good snapshot files, then only the empty files remain.
> At this point, the server is well and truly screwed.
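One common way to avoid leaving empty or partial snapshot files behind is to write to a temporary file and atomically rename it into place only after a successful, synced write. This is a minimal sketch of that pattern, not the actual ZooKeeper `FileTxnSnapLog` code; the file names and helper method are hypothetical:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicSnapshotSketch {
    // Write the snapshot bytes to a temp file first; rename into place only
    // on success, and delete the partial file on failure, so a disk-full
    // IOException never leaves an empty snapshot file in the data dir.
    static void writeSnapshot(Path dir, String name, byte[] data) throws IOException {
        Path tmp = dir.resolve(name + ".tmp");
        try {
            Files.write(tmp, data,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE,
                    StandardOpenOption.SYNC);
            Files.move(tmp, dir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            Files.deleteIfExists(tmp); // clean up the partial file
            throw e;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("snapdir");
        writeSnapshot(dir, "snapshot.100000001", new byte[] {1, 2, 3});
        System.out.println(Files.size(dir.resolve("snapshot.100000001"))); // 3
    }
}
```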





[GitHub] zookeeper issue #560: ZOOKEEPER-3082: Fix server snapshot behavior when out ...

2018-07-29 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/560
  
committed to master. thanks @enixon !


---


[jira] [Resolved] (ZOOKEEPER-3082) Fix server snapshot behavior when out of disk space

2018-07-29 Thread Michael Han (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han resolved ZOOKEEPER-3082.

   Resolution: Fixed
Fix Version/s: 3.6.0



[GitHub] zookeeper pull request #560: ZOOKEEPER-3082: Fix server snapshot behavior wh...

2018-07-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/560


---


[GitHub] zookeeper issue #447: [ZOOKEEPER-2926] Fix potential data consistency issue ...

2018-07-29 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/447
  
Logic looks good to me. To summarize the change in one sentence: move the
global session commit from the prep processor to the final processor, so a
global session will not be applied to the ZKDB until the upgrade is finished
(global session creation committed to quorum).


---


[GitHub] zookeeper issue #584: ZOOKEEPER-3102: Potential race condition when create e...

2018-07-29 Thread MichaelScofield
Github user MichaelScofield commented on the issue:

https://github.com/apache/zookeeper/pull/584
  
@anmolnar Thanks for the `computeIfAbsent` suggestion. I modified the code
accordingly.
@breed @lvfangmin I'm aware that `createNode` is single-threaded, and the
race condition I revealed will most likely never happen in the real world.
But the original code already assumed the ephemerals map is used from
multiple threads, so I think it's more reasonable to make its behavior
concurrency safe.
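For context, the `computeIfAbsent` pattern under discussion replaces a racy check-then-act sequence on a concurrent map with a single atomic call. A self-contained sketch (the map and method names are illustrative, not the actual DataTree code):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ComputeIfAbsentSketch {
    // Illustrative session-id -> ephemeral-paths map (not the real field).
    static final Map<Long, Set<String>> ephemerals = new ConcurrentHashMap<>();

    // Racy check-then-act: two threads can both see null for the same session,
    // and one thread's freshly created set (with its path) can be overwritten.
    static void addRacy(long sessionId, String path) {
        Set<String> paths = ephemerals.get(sessionId);
        if (paths == null) {
            paths = ConcurrentHashMap.newKeySet();
            ephemerals.put(sessionId, paths); // may clobber another thread's set
        }
        paths.add(path);
    }

    // Atomic: computeIfAbsent creates the set at most once per session id.
    static void addSafe(long sessionId, String path) {
        ephemerals.computeIfAbsent(sessionId, id -> ConcurrentHashMap.newKeySet())
                  .add(path);
    }

    public static void main(String[] args) {
        addSafe(0x1L, "/app/lock-0000000001");
        addSafe(0x1L, "/app/lock-0000000002");
        System.out.println(ephemerals.get(0x1L).size()); // 2
    }
}
```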


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-29 Thread hanm
Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r206010420
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java ---
@@ -85,31 +85,43 @@ public boolean isGlobalSession(long sessionId) {
 return globalSessionTracker.isTrackingSession(sessionId);
 }
 
-public boolean addGlobalSession(long sessionId, int sessionTimeout) {
-boolean added =
-globalSessionTracker.addSession(sessionId, sessionTimeout);
-if (localSessionsEnabled && added) {
+public boolean trackSession(long sessionId, int sessionTimeout) {
+boolean tracked =
+globalSessionTracker.trackSession(sessionId, sessionTimeout);
+if (localSessionsEnabled && tracked) {
 // Only do extra logging so we know what kind of session this 
is
 // if we're supporting both kinds of sessions
-LOG.info("Adding global session 0x" + 
Long.toHexString(sessionId));
+LOG.info("Tracking global session 0x" + 
Long.toHexString(sessionId));
 }
-return added;
+return tracked;
 }
 
-public boolean addSession(long sessionId, int sessionTimeout) {
-boolean added;
-if (localSessionsEnabled && !isGlobalSession(sessionId)) {
-added = localSessionTracker.addSession(sessionId, 
sessionTimeout);
--- End diff --

I see now. Basically, each session tracker implementation maintains a local
session tracker, a global session tracker, or both, depending on the session
tracker type, and the tracking of the different session types is delegated to
the local/global session trackers owned by the actual session tracker
implementations.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-29 Thread hanm
Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r206009813
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTracker.java ---
@@ -47,21 +47,20 @@
 long createSession(int sessionTimeout);
 
 /**
- * Add a global session to those being tracked.
+ * Track the session expire, not add to ZkDb.
  * @param id sessionId
  * @param to sessionTimeout
  * @return whether the session was newly added (if false, already 
existed)
  */
-boolean addGlobalSession(long id, int to);
+boolean trackSession(long id, int to);
 
 /**
- * Add a session to those being tracked. The session is added as a 
local
- * session if they are enabled, otherwise as global.
+ * Add the session to the under layer storage.
--- End diff --

Sounds good to me with updated comment. 


---


[jira] [Created] (ZOOKEEPER-3106) Zookeeper client supports IPv6 address and document the "IPV6 feature"

2018-07-29 Thread maoling (JIRA)
maoling created ZOOKEEPER-3106:
--

 Summary: Zookeeper client supports IPv6 address and document the 
"IPV6 feature"
 Key: ZOOKEEPER-3106
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3106
 Project: ZooKeeper
  Issue Type: Test
  Components: documentation, java client
Reporter: maoling
Assignee: maoling








[jira] [Updated] (ZOOKEEPER-3106) Zookeeper client supports IPv6 address and document the "IPV6 feature"

2018-07-29 Thread maoling (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maoling updated ZOOKEEPER-3106:
---
Issue Type: Improvement  (was: Test)

> Zookeeper client supports IPv6 address and document the "IPV6 feature"
> --
>
> Key: ZOOKEEPER-3106
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3106
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: documentation, java client
>Reporter: maoling
>Assignee: maoling
>Priority: Major
>






[jira] [Updated] (ZOOKEEPER-1011) fix Java Barrier Documentation example's race condition issue and polish up the Barrier Documentation

2018-07-29 Thread maoling (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maoling updated ZOOKEEPER-1011:
---
Summary: fix Java Barrier Documentation example's race condition issue and 
polish up the Barrier Documentation  (was: Java Barrier Documentation example 
has a race condition issue)

> fix Java Barrier Documentation example's race condition issue and polish up 
> the Barrier Documentation
> -
>
> Key: ZOOKEEPER-1011
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1011
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Semih Salihoglu
>Assignee: maoling
>Priority: Trivial
>
> There is a race condition in the Barrier example of the java doc: 
> http://hadoop.apache.org/zookeeper/docs/current/zookeeperTutorial.html. It's 
> in the enter() method. Here's the original example:
> boolean enter() throws KeeperException, InterruptedException {
>     zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
>             CreateMode.EPHEMERAL_SEQUENTIAL);
>     while (true) {
>         synchronized (mutex) {
>             List list = zk.getChildren(root, true);
>             if (list.size() < size) {
>                 mutex.wait();
>             } else {
>                 return true;
>             }
>         }
>     }
> }
> Here's the race condition scenario:
> Let's say there are two machines/nodes, node1 and node2, that will use this
> code to synchronize over ZK. Let's say the following steps take place:
> node1 calls the zk.create method, then reads the number of children, sees
> that it's 1, and starts waiting.
> node2 calls the zk.create method (it doesn't call the zk.getChildren method
> yet; let's say it's very slow).
> node1 is notified that the number of children on the znode changed; it checks
> that the size is 2, so it passes the barrier, does its work, and then leaves
> the barrier, deleting its node.
> node2 calls zk.getChildren and, because node1 has already left, it sees that
> the number of children is equal to 1. Since node1 will never enter the
> barrier again, node2 will keep waiting.
> --- End of scenario ---
> Here are Flavio's fix suggestions (copied from the email thread):
> ...
> I see two possible action points out of this discussion:
>   
> 1- State clearly in the beginning that the example discussed is not correct 
> under the assumption that a process may finish the computation before another 
> has started, and the example is there for illustration purposes;
> 2- Have another example following the current one that discusses the problem 
> and shows how to fix it. This is an interesting option that illustrates how 
> one could reason about a solution when developing with zookeeper.
> ...
> We'll go with the 2nd option.
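A toy, in-memory illustration of the "ready marker" idea that a corrected example can use: processes wait on a persistent ready flag set by the last arriver instead of re-counting the (transient) children, so a slow process cannot miss the barrier. This is deliberately not real ZooKeeper client code; the single-threaded replay below just walks through the reported scenario:

```java
import java.util.Set;
import java.util.TreeSet;

public class BarrierReadyFlagDemo {
    // Toy stand-ins for the barrier znode's children and a persistent
    // "ready" marker znode (no real ZK client involved).
    final Set<String> children = new TreeSet<>();
    boolean ready = false;
    final int size;

    BarrierReadyFlagDemo(int size) {
        this.size = size;
    }

    // Entry check: the last arriver "creates" the persistent ready marker;
    // everyone tests the marker, not the transient child count, so a late
    // reader cannot miss the barrier.
    boolean mayPass(String name) {
        children.add(name);
        if (children.size() >= size) {
            ready = true; // create the ready znode
        }
        return ready;
    }

    public static void main(String[] args) {
        BarrierReadyFlagDemo barrier = new BarrierReadyFlagDemo(2);
        boolean node1Passes = barrier.mayPass("node1"); // arrives first: waits
        boolean node2Passes = barrier.mayPass("node2"); // last arriver: sets ready
        barrier.children.remove("node1");               // node1 works, then leaves
        // Count-based logic would now make node2 wait forever on size() == 1;
        // the ready marker still lets it (and node1's re-check) pass.
        System.out.println(node1Passes + " " + node2Passes + " " + barrier.ready);
    }
}
```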





[jira] [Assigned] (ZOOKEEPER-1011) Java Barrier Documentation example has a race condition issue

2018-07-29 Thread maoling (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

maoling reassigned ZOOKEEPER-1011:
--

Assignee: maoling  (was: Semih Salihoglu)

> Java Barrier Documentation example has a race condition issue
> -
>
> Key: ZOOKEEPER-1011
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1011
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: documentation
>Reporter: Semih Salihoglu
>Assignee: maoling
>Priority: Trivial
>


[jira] [Resolved] (ZOOKEEPER-3105) Character coding problem occur when create a node using python3

2018-07-29 Thread yang hao (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yang hao resolved ZOOKEEPER-3105.
-
Resolution: Fixed

> Character coding problem occur when create a node using python3
> ---
>
> Key: ZOOKEEPER-3105
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3105
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: contrib
>Affects Versions: 3.5.0
> Environment: linux
>Reporter: yang hao
>Priority: Major
> Fix For: 3.5.0
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When creating a node using Python 3, an InvalidACLException occurs all the
> time. It's caused by an incompatible way of parsing the ACL passed through
> the Python 3 API.





ZooKeeper_branch34_jdk8 - Build # 1478 - Failure

2018-07-29 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_jdk8/1478/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 40.50 KB...]
[junit] Running org.apache.zookeeper.test.RepeatStartupTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
5.872 sec
[junit] Running org.apache.zookeeper.test.RestoreCommittedLogTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
18.752 sec
[junit] Running org.apache.zookeeper.test.SaslAuthDesignatedClientTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.896 sec
[junit] Running org.apache.zookeeper.test.SaslAuthDesignatedServerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.895 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailDesignatedClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.953 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailNotifyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.826 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.905 sec
[junit] Running org.apache.zookeeper.test.SaslAuthMissingClientConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.803 sec
[junit] Running org.apache.zookeeper.test.SaslClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.119 sec
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.883 sec
[junit] Running org.apache.zookeeper.test.SessionTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
11.065 sec
[junit] Running org.apache.zookeeper.test.SessionTimeoutTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.113 sec
[junit] Running org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.031 sec
[junit] Running org.apache.zookeeper.test.StatTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.05 sec
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.853 sec
[junit] Running org.apache.zookeeper.test.SyncCallTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.876 sec
[junit] Running org.apache.zookeeper.test.TruncateTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
9.876 sec
[junit] Running org.apache.zookeeper.test.UpgradeTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.215 sec
[junit] Running org.apache.zookeeper.test.WatchedEventTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.094 sec
[junit] Running org.apache.zookeeper.test.WatcherFuncTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.596 sec
[junit] Running org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
30.65 sec
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
11.867 sec
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.071 sec

fail.build.on.test.failure:

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk8/build.xml:1393: 
The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk8/build.xml:1396: 
Tests failed!

Total time: 40 minutes 19 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
2 tests failed.
FAILED:  org.apache.zookeeper.server.ServerStatsTest.testLatencyMetrics

Error Message:
Min latency check
Expected: a value equal to or greater than <1001L>
 but: <1000L> was less than <1001L>

Stack Trace:
junit.framework.AssertionFailedError: Min latency check
Expected: a value equal to or greater than <1001L>
 but: <1000L> was less than <1001L>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at 
org.apache.zookeeper.server.ServerStatsTest.testLatencyMetrics(ServerStatsTest.java:77)
at 

too many flaky tests result from connection refused

2018-07-29 Thread 岭秀
I saw that too many flaky tests result from "connection refused". What is the
reason? Network congestion?
maoling
Beijing, China



[GitHub] zookeeper pull request #:

2018-07-29 Thread maoling
Github user maoling commented on the pull request:


https://github.com/apache/zookeeper/commit/a2623a625a4778720f7d5482d0a66e9b37ae556f#commitcomment-29873077
  
@nkalmar Could you please create a JIRA to document this metric
`zk_fsync_threshold_exceed_count` in the `mntr` section of
`zookeeperAdmin.html`?


---