[GitHub] zookeeper pull request #567: ZOOKEEPER-3071: Add a config parameter to contr...

2018-07-27 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/567#discussion_r205933566
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java ---
@@ -102,6 +102,21 @@
 /** Maximum time we allow for elapsed fsync before WARNing */
 private final static long fsyncWarningThresholdMS;
 
+/**
+ * This parameter limit the size of each txnlog to a given limit (KB).
+ * It does not affect how often the system will take a snapshot [zookeeper.snapCount]
+ * We roll the txnlog when either of the two limits are reached.
+ * Also since we only roll the logs at transaction boundaries, actual file size can exceed
+ * this limit by the maximum size of a serialized transaction.
+ * The feature is disabled by default (-1)
+ */
+public static final String LOG_SIZE_LIMIT = "zookeeper.txnlogSizeLimitInKb";
--- End diff --

I mean `zookeeperAdmin.xml`.
Html and pdf files are generated from xml.


---


[GitHub] zookeeper pull request #567: ZOOKEEPER-3071: Add a config parameter to contr...

2018-07-27 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/567#discussion_r205933558
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java ---
@@ -102,6 +102,21 @@
 /** Maximum time we allow for elapsed fsync before WARNing */
 private final static long fsyncWarningThresholdMS;
 
+/**
+ * This parameter limit the size of each txnlog to a given limit (KB).
+ * It does not affect how often the system will take a snapshot [zookeeper.snapCount]
+ * We roll the txnlog when either of the two limits are reached.
+ * Also since we only roll the logs at transaction boundaries, actual file size can exceed
+ * this limit by the maximum size of a serialized transaction.
+ * The feature is disabled by default (-1)
+ */
+public static final String LOG_SIZE_LIMIT = "zookeeper.txnlogSizeLimitInKb";
--- End diff --

Yes. This new feature should be documented in zookeeperAdmin. Are you 
planning to address the docs here or in a separate patch?


---


[jira] [Commented] (ZOOKEEPER-3061) add more details to 'Unhandled scenario for peer' log.warn message

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560592#comment-16560592
 ] 

Hudson commented on ZOOKEEPER-3061:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #125 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/125/])
ZOOKEEPER-3061: add more details to 'Unhandled scenario for peer' (breed: rev 
726587ef50339f071960d153cc4599882aa71ac7)
* (edit) src/java/main/org/apache/zookeeper/server/quorum/LearnerHandler.java


> add more details to 'Unhandled scenario for peer' log.warn message
> --
>
> Key: ZOOKEEPER-3061
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3061
> Project: ZooKeeper
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-3061.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A few lines earlier the {{LOG.info("Synchronizing with Follower sid: ...}} 
> logging already contains most relevant details but it would be convenient to 
> more directly have full details in the {{LOG.warn("Unhandled scenario for 
> peer sid: ...}} itself.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3095) Connect string fix for non-existent hosts

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560594#comment-16560594
 ] 

Hudson commented on ZOOKEEPER-3095:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #125 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/125/])
ZOOKEEPER-3095: Connect string fix for non-existent hosts (breed: rev 
932fee861001c158343c0bc64dbe80f253f0bb6d)
* (edit) src/c/tests/TestClient.cc
* (edit) src/c/src/zookeeper.c
* (edit) src/c/tests/zkServer.sh


> Connect string fix for non-existent hosts
> -
>
> Key: ZOOKEEPER-3095
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3095
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: other
>Affects Versions: 3.4.0
>Reporter: Mohamed Jeelani
>Assignee: Mohamed Jeelani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Connect string fix for non-existent hosts





[jira] [Commented] (ZOOKEEPER-3072) Race condition in throttling

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16560593#comment-16560593
 ] 

Hudson commented on ZOOKEEPER-3072:
---

SUCCESS: Integrated in Jenkins build ZooKeeper-trunk #125 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/125/])
ZOOKEEPER-3072: Throttle race condition fix (breed: rev 
2a372fcdce3c0142c0bb23f06098a2c1a49f807e)
* (edit) src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java


> Race condition in throttling
> 
>
> Key: ZOOKEEPER-3072
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3072
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4
>Reporter: Botond Hejj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.4, 3.6.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> There is a race condition in the server throttling code. It is possible that 
> the disableRecv is called after enableRecv.
> Basically, the I/O work thread does this in processPacket: 
> [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java#L1102]
>  
>     submitRequest(si);
>     }
>     }
>     cnxn.incrOutstandingRequests(h);
>     }
>  
> incrOutstandingRequests() checks for limit breach, and potentially turns on 
> throttling, 
> [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java#L384]
>  
> submitRequest() will create a logical request and en-queue it so that 
> Processor thread can pick it up. After being de-queued by Processor thread, 
> it does necessary handling, and then calls this 
> [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/FinalRequestProcessor.java#L459]
>  :
>  
>     cnxn.sendResponse(hdr, rsp, "response");
>  
> and in sendResponse(), it first appends to outgoing buffer, and then checks 
> if un-throttle is needed:  
> [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java#L708]
>  
> However, if there is a context switch between submitRequest() and 
> cnxn.incrOutstandingRequests(), so that Processor thread completes 
> cnxn.sendResponse() call before I/O thread switches back, then enableRecv() 
> will happen before disableRecv(), and enableRecv() will fail the CAS ops, 
> while disableRecv() will succeed, resulting in a deadlock: un-throttle is 
> needed for letting in requests, and sendResponse is needed to trigger 
> un-throttle, but sendResponse() requires an incoming message. From that point 
> on, ZK server will no longer select the affected client socket for read, 
> leading to the observed client-side failure in the subject.
> If you would like to reproduce this, then setting globalOutstandingLimit 
> down to 1 makes it easier to reproduce, as throttling starts with fewer 
> requests. 
>  
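
The interleaving described in the issue above can be replayed deterministically. The following is a minimal sketch under stated assumptions: `ThrottleRaceSketch` and its single `AtomicBoolean` flag are a hypothetical, simplified stand-in for the connection's throttle state, not the actual NIOServerCnxn code. It only shows why calling enableRecv() before disableRecv() leaves the connection throttled with nothing left to un-throttle it.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical, simplified model of the throttle flag; not the real NIOServerCnxn.
class ThrottleRaceSketch {
    // true means the server has stopped selecting this socket for reads
    private final AtomicBoolean throttled = new AtomicBoolean(false);

    // I/O thread: called when the outstanding-request limit is breached.
    boolean disableRecv() {
        return throttled.compareAndSet(false, true);
    }

    // Processor thread: called after a response has been sent.
    boolean enableRecv() {
        return throttled.compareAndSet(true, false);
    }

    boolean isThrottled() {
        return throttled.get();
    }

    public static void main(String[] args) {
        ThrottleRaceSketch cnxn = new ThrottleRaceSketch();
        // The response is sent before the I/O thread reaches incrOutstandingRequests().
        boolean enabled = cnxn.enableRecv();   // CAS fails: nothing is throttled yet
        boolean disabled = cnxn.disableRecv(); // CAS succeeds: reads are now switched off
        System.out.println("enableRecv=" + enabled
                + " disableRecv=" + disabled
                + " throttled=" + cnxn.isThrottled());
    }
}
```

In this model the final state is throttled with no response pending, which matches the deadlock described in the issue: no further request can arrive to trigger the un-throttle.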





[GitHub] zookeeper issue #579: [ZOOKEEPER-3095] Connect string fix for non-existent h...

2018-07-27 Thread breed
Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/579
  
hey @mjeelanimsft when i did the commit, your email address came up wrong. 
i think i fixed it, but you might want to check your git config.

thanx for the submission!


---


[jira] [Resolved] (ZOOKEEPER-3095) Connect string fix for non-existent hosts

2018-07-27 Thread Benjamin Reed (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed resolved ZOOKEEPER-3095.
--
Resolution: Fixed

Issue resolved by pull request 579
[https://github.com/apache/zookeeper/pull/579]

> Connect string fix for non-existent hosts
> -
>
> Key: ZOOKEEPER-3095
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3095
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: other
>Affects Versions: 3.4.0
>Reporter: Mohamed Jeelani
>Assignee: Mohamed Jeelani
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Connect string fix for non-existent hosts





[GitHub] zookeeper pull request #579: [ZOOKEEPER-3095] Connect string fix for non-exi...

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/579


---


[GitHub] zookeeper issue #583: [ZOOKEEPER-3104] Fix data inconsistency due to NEWLEAD...

2018-07-27 Thread breed
Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/583
  
ok @lvfangmin i'll commit this once the conflict is resolved.


---


[jira] [Resolved] (ZOOKEEPER-3072) Race condition in throttling

2018-07-27 Thread Benjamin Reed (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed resolved ZOOKEEPER-3072.
--
Resolution: Fixed

> Race condition in throttling
> 
>
> Key: ZOOKEEPER-3072
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3072
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4
>Reporter: Botond Hejj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.6.0, 3.5.4
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h





[jira] [Updated] (ZOOKEEPER-3072) Race condition in throttling

2018-07-27 Thread Benjamin Reed (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed updated ZOOKEEPER-3072:
-
Fix Version/s: 3.5.4
   3.6.0

> Race condition in throttling
> 
>
> Key: ZOOKEEPER-3072
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3072
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4
>Reporter: Botond Hejj
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.5.4, 3.6.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h





[GitHub] zookeeper pull request #563: ZOOKEEPER-3072: Throttle race condition fix

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/563


---


[jira] [Resolved] (ZOOKEEPER-3061) add more details to 'Unhandled scenario for peer' log.warn message

2018-07-27 Thread Benjamin Reed (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Reed resolved ZOOKEEPER-3061.
--
   Resolution: Fixed
Fix Version/s: 3.6.0

Issue resolved by pull request 555
[https://github.com/apache/zookeeper/pull/555]

> add more details to 'Unhandled scenario for peer' log.warn message
> --
>
> Key: ZOOKEEPER-3061
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3061
> Project: ZooKeeper
>  Issue Type: Task
>Reporter: Christine Poerschke
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
> Attachments: ZOOKEEPER-3061.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A few lines earlier the {{LOG.info("Synchronizing with Follower sid: ...}} 
> logging already contains most relevant details but it would be convenient to 
> more directly have full details in the {{LOG.warn("Unhandled scenario for 
> peer sid: ...}} itself.





[GitHub] zookeeper pull request #555: ZOOKEEPER-3061: add more details to 'Unhandled ...

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/555


---


[GitHub] zookeeper pull request #579: [ZOOKEEPER-3095] Connect string fix for non-exi...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/579#discussion_r205924496
  
--- Diff: src/c/tests/zkServer.sh ---
@@ -77,7 +77,7 @@ fi
 
 if [ "x${base_dir}" == "x" ]
 then
-zk_base="../../"
+zk_base="../../../"
--- End diff --

Yes, it seems the old code does not work if base_dir is not specified. 

We were using this to test the change manually, so this change is there for 
testing purposes. 


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205922937
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java ---
@@ -85,31 +85,43 @@ public boolean isGlobalSession(long sessionId) {
 return globalSessionTracker.isTrackingSession(sessionId);
 }
 
-public boolean addGlobalSession(long sessionId, int sessionTimeout) {
-boolean added =
-globalSessionTracker.addSession(sessionId, sessionTimeout);
-if (localSessionsEnabled && added) {
+public boolean trackSession(long sessionId, int sessionTimeout) {
+boolean tracked =
+globalSessionTracker.trackSession(sessionId, sessionTimeout);
+if (localSessionsEnabled && tracked) {
 // Only do extra logging so we know what kind of session this 
is
 // if we're supporting both kinds of sessions
-LOG.info("Adding global session 0x" + 
Long.toHexString(sessionId));
+LOG.info("Tracking global session 0x" + 
Long.toHexString(sessionId));
 }
-return added;
+return tracked;
 }
 
-public boolean addSession(long sessionId, int sessionTimeout) {
-boolean added;
-if (localSessionsEnabled && !isGlobalSession(sessionId)) {
-added = localSessionTracker.addSession(sessionId, 
sessionTimeout);
--- End diff --

When the local session feature is enabled, the createSession method in 
ZooKeeperServer will create, track, and 'commit' (update the in-memory local 
session map) the local session immediately.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205922808
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
---
@@ -280,6 +275,11 @@ public synchronized boolean addSession(long id, int 
sessionTimeout) {
 return added;
 }
 
+public synchronized boolean commitSession(long id, int sessionTimeout) 
{
+sessionsWithTimeout.put(id, sessionTimeout);
+return true;
--- End diff --

The LeaderSessionTracker.commitSession will return whether it has 
successfully added the new session, but the return value is not being used 
anywhere in the code currently.

I'll update this to reflect that as well.
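
One way the return value could reflect whether the session was newly added is sketched below. This is an assumption about what such an update might look like, not the change that was made in the PR: `CommitSessionSketch` is a hypothetical class, and it simply reuses the `put(...) == null` idiom that appears in the LearnerSessionTracker diff elsewhere in this thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical tracker used only to illustrate the return-value contract.
class CommitSessionSketch {
    private final ConcurrentMap<Long, Integer> sessionsWithTimeout =
            new ConcurrentHashMap<>();

    // Returns true only if the session id was not already present.
    // An existing entry still has its timeout overwritten.
    synchronized boolean commitSession(long id, int sessionTimeout) {
        return sessionsWithTimeout.put(id, sessionTimeout) == null;
    }

    public static void main(String[] args) {
        CommitSessionSketch tracker = new CommitSessionSketch();
        System.out.println(tracker.commitSession(0x1L, 30000)); // first commit of this id
        System.out.println(tracker.commitSession(0x1L, 30000)); // same id committed again
    }
}
```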


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205922968
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTracker.java ---
@@ -47,21 +47,20 @@
 long createSession(int sessionTimeout);
 
 /**
- * Add a global session to those being tracked.
+ * Track the session expire, not add to ZkDb.
  * @param id sessionId
  * @param to sessionTimeout
  * @return whether the session was newly added (if false, already 
existed)
  */
-boolean addGlobalSession(long id, int to);
+boolean trackSession(long id, int to);
 
 /**
- * Add a session to those being tracked. The session is added as a 
local
- * session if they are enabled, otherwise as global.
+ * Add the session to the under layer storage.
  * @param id sessionId
  * @param to sessionTimeout
  * @return whether the session was newly added (if false, already 
existed)
--- End diff --

This comment about the return value is still correct


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205923291
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/LearnerSessionTracker.java ---
@@ -101,33 +100,44 @@ public boolean isGlobalSession(long sessionId) {
 return globalSessionsWithTimeouts.containsKey(sessionId);
 }
 
-public boolean addGlobalSession(long sessionId, int sessionTimeout) {
+public boolean trackSession(long sessionId, int sessionTimeout) {
+// Learner doesn't track global session, do nothing here
+return false;
+}
+
+/**
+ * Synchronized on this to avoid race condition of adding a local 
session
+ * after committed global session, which may cause the same session 
being
+ * tracked on this server and leader.
+ */
+public synchronized boolean commitSession(
+long sessionId, int sessionTimeout) {
 boolean added =
 globalSessionsWithTimeouts.put(sessionId, sessionTimeout) == 
null;
-if (localSessionsEnabled && added) {
+
+if (added) {
 // Only do extra logging so we know what kind of session this 
is
 // if we're supporting both kinds of sessions
-LOG.info("Adding global session 0x" + 
Long.toHexString(sessionId));
+LOG.info("Committing global session 0x" + 
Long.toHexString(sessionId));
 }
-touchTable.get().put(sessionId, sessionTimeout);
-return added;
-}
 
-public boolean addSession(long sessionId, int sessionTimeout) {
--- End diff --

Explained in the previous comment, createSession will add and track it.


---


[GitHub] zookeeper pull request #567: ZOOKEEPER-3071: Add a config parameter to contr...

2018-07-27 Thread suyogmapara
Github user suyogmapara commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/567#discussion_r205923249
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java ---
@@ -102,6 +102,21 @@
 /** Maximum time we allow for elapsed fsync before WARNing */
 private final static long fsyncWarningThresholdMS;
 
+/**
+ * This parameter limit the size of each txnlog to a given limit (KB).
+ * It does not affect how often the system will take a snapshot [zookeeper.snapCount]
+ * We roll the txnlog when either of the two limits are reached.
+ * Also since we only roll the logs at transaction boundaries, actual file size can exceed
+ * this limit by the maximum size of a serialized transaction.
+ * The feature is disabled by default (-1)
+ */
+public static final String LOG_SIZE_LIMIT = "zookeeper.txnlogSizeLimitInKb";
--- End diff --

Thanks @breed, are you referring to 
https://github.com/apache/zookeeper/blob/master/docs/zookeeperAdmin.html?


---


[GitHub] zookeeper pull request #567: ZOOKEEPER-3071: Add a config parameter to contr...

2018-07-27 Thread suyogmapara
Github user suyogmapara commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/567#discussion_r205922930
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java ---
@@ -127,14 +127,11 @@
 
 Long logSize = Long.getLong(LOG_SIZE_LIMIT, -1);
 if (logSize > 0) {
+LOG.info("{} = {}", LOG_SIZE_LIMIT, logSize);
+
--- End diff --

I renamed the property as per the other comment; I think the log is now 
readable as is. Please let me know if you think otherwise.
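
For readers following the thread, the roll rule from the Javadoc quoted in this PR (roll when either the size limit or the count limit is reached, size limit disabled at -1) can be sketched as below. This is a rough illustration only: `TxnLogRollSketch`, `shouldRoll`, and the plain transaction-count threshold are hypothetical names and simplifications, not FileTxnLog's implementation. Only the property name and the `Long.getLong(LOG_SIZE_LIMIT, -1)` call come from the diff.

```java
// Hypothetical helper; names other than LOG_SIZE_LIMIT are illustrative.
class TxnLogRollSketch {
    static final String LOG_SIZE_LIMIT = "zookeeper.txnlogSizeLimitInKb";

    private final long sizeLimitBytes; // -1 means the size limit is disabled
    private final long countLimit;     // assumed count-based limit (snapCount-like)

    TxnLogRollSketch(long sizeLimitKb, long countLimit) {
        this.sizeLimitBytes = sizeLimitKb > 0 ? sizeLimitKb * 1024 : -1;
        this.countLimit = countLimit;
    }

    // Reads the size limit the same way the diff does; default -1 disables it.
    static TxnLogRollSketch fromSystemProperties(long countLimit) {
        return new TxnLogRollSketch(Long.getLong(LOG_SIZE_LIMIT, -1), countLimit);
    }

    // Meant to be checked at a transaction boundary, so the file can already
    // exceed the limit by up to one serialized transaction.
    boolean shouldRoll(long currentLogBytes, long txnsSinceLastRoll) {
        boolean sizeReached = sizeLimitBytes > 0 && currentLogBytes >= sizeLimitBytes;
        boolean countReached = txnsSinceLastRoll >= countLimit;
        return sizeReached || countReached;
    }

    public static void main(String[] args) {
        TxnLogRollSketch policy = new TxnLogRollSketch(64, 100000);
        System.out.println(policy.shouldRoll(64 * 1024, 10)); // size limit reached
        System.out.println(policy.shouldRoll(1024, 10));      // neither limit reached
    }
}
```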


---


[GitHub] zookeeper issue #567: ZOOKEEPER-3071: Add a config parameter to control tran...

2018-07-27 Thread suyogmapara
Github user suyogmapara commented on the issue:

https://github.com/apache/zookeeper/pull/567
  
@maoling Thanks for the review. Regarding the concern around preAllocSize: 
thanks to ZOOKEEPER-2249, setting txnLogSizeLimit to less than preAllocSize 
should not cause any issue. I can add an explicit test for it if you think it 
is valuable. 


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205922455
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTrackerImpl.java 
---
@@ -280,6 +275,11 @@ public synchronized boolean addSession(long id, int 
sessionTimeout) {
 return added;
 }
 
+public synchronized boolean commitSession(long id, int sessionTimeout) 
{
+sessionsWithTimeout.put(id, sessionTimeout);
+return true;
--- End diff --

The LeaderSessionTracker.commitSession will return whether it has 
successfully added the new session.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205920893
  
--- Diff: src/java/test/org/apache/zookeeper/test/ClientBase.java ---
@@ -72,7 +72,7 @@
 static final File BASETEST =
 new File(System.getProperty("build.test.dir", "build"));
 
-protected String hostPort = "127.0.0.1:" + PortAssignment.unique();
+public String hostPort = "127.0.0.1:" + PortAssignment.unique();
--- End diff --

I was trying to change these settings to public so I could reference them in 
SessionUpgradeQuorumTest.java. I guess I found a different way to do that and 
forgot to revert this change. I will verify and remove this change if it's not 
necessary.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205921489
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTracker.java ---
@@ -47,21 +47,20 @@
 long createSession(int sessionTimeout);
 
 /**
- * Add a global session to those being tracked.
+ * Track the session expire, not add to ZkDb.
  * @param id sessionId
  * @param to sessionTimeout
  * @return whether the session was newly added (if false, already 
existed)
  */
-boolean addGlobalSession(long id, int to);
+boolean trackSession(long id, int to);
 
 /**
- * Add a session to those being tracked. The session is added as a 
local
- * session if they are enabled, otherwise as global.
+ * Add the session to the under layer storage.
--- End diff --

In LocalSessionTracker, commitSession is used to update the in-memory local 
session map, which is not in zkDB. How about changing it to:

"Add the session to the local session map or global one in zkDB."


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205920948
  
--- Diff: src/java/test/org/apache/zookeeper/test/QuorumBase.java ---
@@ -53,32 +53,32 @@
 protected int port3;
 protected int port4;
 protected int port5;
-
+
 protected int portLE1;
 protected int portLE2;
 protected int portLE3;
 protected int portLE4;
 protected int portLE5;
-
+
 protected int portClient1;
 protected int portClient2;
 protected int portClient3;
 protected int portClient4;
 protected int portClient5;
 
-protected boolean localSessionsEnabled = false;
-protected boolean localSessionsUpgradingEnabled = false;
+public boolean localSessionsEnabled = false;
+public boolean localSessionsUpgradingEnabled = false;
--- End diff --

Same reason, will check and remove it if it's not required.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205921085
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/UpgradeableSessionTracker.java 
---
@@ -19,6 +19,8 @@
 
 import java.util.concurrent.ConcurrentHashMap;
 import java.util.concurrent.ConcurrentMap;
+import java.util.Set;
+import java.util.HashSet;
--- End diff --

Yes, dangling import after implementation change, will remove. 


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205922234
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/LeaderSessionTracker.java ---
@@ -85,31 +85,43 @@ public boolean isGlobalSession(long sessionId) {
 return globalSessionTracker.isTrackingSession(sessionId);
 }
 
-public boolean addGlobalSession(long sessionId, int sessionTimeout) {
-boolean added =
-globalSessionTracker.addSession(sessionId, sessionTimeout);
-if (localSessionsEnabled && added) {
+public boolean trackSession(long sessionId, int sessionTimeout) {
+boolean tracked =
+globalSessionTracker.trackSession(sessionId, sessionTimeout);
+if (localSessionsEnabled && tracked) {
 // Only do extra logging so we know what kind of session this 
is
 // if we're supporting both kinds of sessions
-LOG.info("Adding global session 0x" + 
Long.toHexString(sessionId));
+LOG.info("Tracking global session 0x" + 
Long.toHexString(sessionId));
 }
-return added;
+return tracked;
 }
 
-public boolean addSession(long sessionId, int sessionTimeout) {
-boolean added;
-if (localSessionsEnabled && !isGlobalSession(sessionId)) {
-added = localSessionTracker.addSession(sessionId, 
sessionTimeout);
--- End diff --

When the local session feature is enabled, the createSession method in 
ZooKeeperServer will create, track, and 'commit' (update the in-memory local 
session map) the local session immediately.


---


[GitHub] zookeeper pull request #447: [ZOOKEEPER-2926] Fix potential data consistency...

2018-07-27 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/447#discussion_r205921331
  
--- Diff: src/java/main/org/apache/zookeeper/server/SessionTracker.java ---
@@ -47,21 +47,20 @@
 long createSession(int sessionTimeout);
 
 /**
- * Add a global session to those being tracked.
+ * Track the session expire, not add to ZkDb.
  * @param id sessionId
  * @param to sessionTimeout
  * @return whether the session was newly added (if false, already 
existed)
  */
-boolean addGlobalSession(long id, int to);
+boolean trackSession(long id, int to);
--- End diff --

The local session tracker starts tracking the session when the session is created.


---


[GitHub] zookeeper pull request #496: ZOOKEEPER-3008: Potential NPE in SaslQuorumAuth...

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/496#discussion_r205904126
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/quorum/auth/SaslQuorumAuthLearner.java
 ---
@@ -66,8 +67,8 @@ public SaslQuorumAuthLearner(boolean quorumRequireSasl,
  + "section '" + loginContext
  + "' could not be found.");
 }
-this.learnerLogin = new Login(loginContext,
-new SaslClientCallbackHandler(null, 
"QuorumLearner"), new ZKConfig());
+this.learnerLogin = loginFactory.createLogin(loginContext,
+new SaslClientCallbackHandler(null, "QuorumLearner"), 
new ZKConfig());
--- End diff --

This can be put on the line above, which makes it more readable.


---


[GitHub] zookeeper pull request #582: ZOOKEEPER-3103 Pluggable metrics system for Zoo...

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/582#discussion_r205903481
  
--- Diff: src/java/main/org/apache/zookeeper/metrics/Summary.java ---
@@ -0,0 +1,34 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.metrics;
+
+/**
+ * Summaries track the size and number of events.
+ * They are able to publish minimum, maximum, average values, depending on 
the capabilities of the MetricsProvider.
+ */
+public interface Summary {
+
+ /**
+  * Register a value.
+  *
+  * @param value current value
+  */
+ public void registerValue(long value);
--- End diff --

no public.


---


[GitHub] zookeeper pull request #582: ZOOKEEPER-3103 Pluggable metrics system for Zoo...

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/582#discussion_r205902926
  
--- Diff: src/java/main/org/apache/zookeeper/metrics/Gauge.java ---
@@ -0,0 +1,35 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.metrics;
+
+/**
+ * A Gauge is an application provided object which will be called by the 
framework in order to sample the value
+ * of an integer value.
+ */
+public interface Gauge {
+
+/**
+ * Returns the current value associated with this gauge.
+ * The MetricsProvider will call this callback without taking care of 
synchronization, it is up to the application
+ * to handle thread safety.
+ *
+ * @return the current value for the gauge
+ */
+public long getCurrentValue();
--- End diff --

no public.


---


[GitHub] zookeeper pull request #582: ZOOKEEPER-3103 Pluggable metrics system for Zoo...

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/582#discussion_r205902898
  
--- Diff: src/java/main/org/apache/zookeeper/metrics/Counter.java ---
@@ -0,0 +1,47 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.metrics;
+
+/**
+ * A counter refers to a value which can only increase.
+ * Usually the value is reset when the process starts.
+ */
+public interface Counter {
+
+/**
+ * Increment the value by one.
+ */
+public default void inc() {
--- End diff --

The `public` keyword is not needed in interfaces. All interface methods are 
public by default.
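The point can be checked with a small standalone sketch. `SketchCounter` is a hypothetical stand-in for the interface under review, and its `add(long)` method is an assumption made for illustration; only `inc()` appears in the diff above. Reflection reports both methods as public even though neither carries the keyword.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical stand-in for the Counter interface under review.
interface SketchCounter {
    // No modifier: an abstract interface method is implicitly public.
    void add(long delta);

    // A default method is implicitly public as well.
    default void inc() {
        add(1);
    }
}

class InterfaceModifierDemo {
    /** Returns true if every method declared on SketchCounter is public. */
    static boolean allMethodsPublic() {
        for (Method m : SketchCounter.class.getDeclaredMethods()) {
            if (!Modifier.isPublic(m.getModifiers())) {
                return false;
            }
        }
        return true;
    }
}
```

Calling `InterfaceModifierDemo.allMethodsPublic()` should return true, which is why the explicit modifier is redundant here.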


---


[GitHub] zookeeper pull request #582: ZOOKEEPER-3103 Pluggable metrics system for Zoo...

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/582#discussion_r205903511
  
--- Diff: src/java/main/org/apache/zookeeper/metrics/MetricsProvider.java 
---
@@ -0,0 +1,64 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.metrics;
+
+import java.util.Properties;
+
+/**
+ * A MetricsProvider is a system which collects Metrics and publishes 
current values to external facilities.
+ *
+ * The system will create an instance of the configured class using the 
default constructor, which must be public.
+ * After the instantiation of the provider, the system will call {@link 
#configure(java.util.Properties) } in order to provide configuration,
+ * and then when the system is ready to work it will call {@link #start() 
}.
+ * 
+ * Providers can be used both on ZooKeeper servers and on ZooKeeper 
clients.
+ */
+public interface MetricsProvider {
+
+/**
+ * Configure the provider.
+ *
+ * @param configuration the configuration.
+ *
+ * @throws MetricsProviderLifeCycleException in case of invalid 
configuration.
+ */
+public void configure(Properties configuration) throws 
MetricsProviderLifeCycleException;
--- End diff --

no public


---


[GitHub] zookeeper pull request #572: ZOOKEEPER-3085 define exit codes in enum

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/572#discussion_r205901936
  
--- Diff: src/java/main/org/apache/zookeeper/server/ExitCode.java ---
@@ -20,8 +20,35 @@
 /**
  * Exit code used to exit server
  */
-public class ExitCode {
+public enum ExitCode {
+
+/* Execution finished normally */
+EXECUTION_FINISHED(0),
+/* Unexpected errors like IO Exceptions */
+UNEXPECTED_ERROR(1),
+/* Invalid arguments during invocations */
+INVALID_INVOCATION(2),
+/* Cannot access datadir when trying to replicate server */
+UNABLE_TO_ACCESS_DATADIR(3),
+/* Unable to start admin server at ZooKeeper startup */
+ERROR_STARTING_ADMIN_SERVER(4),
+/* Severe error during snapshot IO */
+TXNLOG_ERROR_TAKING_SNAPSHOT(10),
+/* zxid from COMMIT does not match the one from pendingTxns queue */
+UNMATCHED_TXN_COMMIT(12),
+/* Unexpected packet from leader, or unable to truncate log on 
Leader.TRUNC */
+QUORUM_PACKET_ERROR(13),
+/* Unable to bind to the quorum (election) port after multiple retry */
+UNABLE_TO_BIND_QUORUM_PORT(14);
+
+private final int value;
+
+ExitCode(final int newValue) {
+value = newValue;
+}
+
+public int getValue() {
--- End diff --

do we have test cases asserting on the value fields for each of these enums?
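A minimal sketch of what such a check could look like, assuming the values shown in the diff above. `ExitCodeSketch` is a local copy of the enum so that the example compiles on its own, and `ExitCodeValueCheck` is a hypothetical name; in the project this would more likely be a JUnit test against the real `ExitCode`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Local copy of the enum from the diff, for illustration only.
enum ExitCodeSketch {
    EXECUTION_FINISHED(0),
    UNEXPECTED_ERROR(1),
    INVALID_INVOCATION(2),
    UNABLE_TO_ACCESS_DATADIR(3),
    ERROR_STARTING_ADMIN_SERVER(4),
    TXNLOG_ERROR_TAKING_SNAPSHOT(10),
    UNMATCHED_TXN_COMMIT(12),
    QUORUM_PACKET_ERROR(13),
    UNABLE_TO_BIND_QUORUM_PORT(14);

    private final int value;

    ExitCodeSketch(final int value) {
        this.value = value;
    }

    int getValue() {
        return value;
    }
}

class ExitCodeValueCheck {
    /** The numeric values that scripts and operators may already depend on. */
    static Map<ExitCodeSketch, Integer> expectedValues() {
        Map<ExitCodeSketch, Integer> expected = new LinkedHashMap<>();
        expected.put(ExitCodeSketch.EXECUTION_FINISHED, 0);
        expected.put(ExitCodeSketch.UNEXPECTED_ERROR, 1);
        expected.put(ExitCodeSketch.INVALID_INVOCATION, 2);
        expected.put(ExitCodeSketch.UNABLE_TO_ACCESS_DATADIR, 3);
        expected.put(ExitCodeSketch.ERROR_STARTING_ADMIN_SERVER, 4);
        expected.put(ExitCodeSketch.TXNLOG_ERROR_TAKING_SNAPSHOT, 10);
        expected.put(ExitCodeSketch.UNMATCHED_TXN_COMMIT, 12);
        expected.put(ExitCodeSketch.QUORUM_PACKET_ERROR, 13);
        expected.put(ExitCodeSketch.UNABLE_TO_BIND_QUORUM_PORT, 14);
        return expected;
    }

    /** Throws AssertionError if a constant is missing or its value has changed. */
    static void verify() {
        Map<ExitCodeSketch, Integer> expected = expectedValues();
        if (expected.size() != ExitCodeSketch.values().length) {
            throw new AssertionError("constants and expectations are out of sync");
        }
        for (Map.Entry<ExitCodeSketch, Integer> e : expected.entrySet()) {
            if (e.getKey().getValue() != e.getValue()) {
                throw new AssertionError("unexpected value for " + e.getKey());
            }
        }
    }
}
```

Pinning the values this way guards against an accidental renumbering, and the size check flags a newly added constant that has no expectation yet.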


---


[GitHub] zookeeper pull request #572: ZOOKEEPER-3085 define exit codes in enum

2018-07-27 Thread pravsingh
Github user pravsingh commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/572#discussion_r205901511
  
--- Diff: src/java/main/org/apache/zookeeper/server/ExitCode.java ---
@@ -20,8 +20,35 @@
 /**
  * Exit code used to exit server
  */
-public class ExitCode {
+public enum ExitCode {
+
+/* Execution finished normally */
--- End diff --

I would suggest using a Javadoc comment (/** ... */).


---


[GitHub] zookeeper pull request #567: ZOOKEEPER-3071: Add a config parameter to contr...

2018-07-27 Thread suyogmapara
Github user suyogmapara commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/567#discussion_r205897728
  
--- Diff: src/java/test/org/apache/zookeeper/test/TxnLogSizeLimitTest.java 
---
@@ -0,0 +1,173 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.File;
+import java.util.HashSet;
+import java.util.Random;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.data.Stat;
+import org.apache.zookeeper.proto.CreateRequest;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.ZKDatabase;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnLog;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.apache.zookeeper.txn.TxnHeader;
+import org.junit.Assert;
+import org.junit.Test;
+
+/**
+ * Test loading committed proposal from txnlog. Learner uses these 
proposals to
+ * catch-up with leader
+ */
+public class TxnLogSizeLimitTest extends ZKTestCase implements Watcher {
+private static final Logger LOG = Logger
--- End diff --

The unit tests define some constants. Would you suggest I move them into each test? 
Also, what is the general guideline for creating test classes?


---


[jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper

2018-07-27 Thread Oded (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560215#comment-16560215
 ] 

Oded commented on ZOOKEEPER-3036:
-

We are running with kafka 1.1.0




> Unexpected exception in zookeeper
> -
>
> Key: ZOOKEEPER-3036
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx
>Affects Versions: 3.4.10
> Environment: 3 Zookeepers, 5 kafka servers
>Reporter: Oded
>Priority: Critical
>
> We got an issue with one of the zookeepers (Leader), causing the entire kafka 
> cluster to fail:
> 2018-05-09 02:29:01,730 [myid:3] - ERROR 
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected 
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>     at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>     at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
> 2018-05-09 02:29:01,730 [myid:3] - WARN  
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - *** GOODBYE 
> /192.168.0.91:42490 
>  
> We would expect that zookeeper will choose another Leader and the Kafka 
> cluster will continue to work as expected, but that was not the case.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper

2018-07-27 Thread Kevin Lu (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560186#comment-16560186
 ] 

Kevin Lu commented on ZOOKEEPER-3036:
-

[~o...@coralogix.com] yes multiple brokers think they are the controller.

What version of Kafka are you using? We found this issue in 0.10.2.0, and 
upgrading to 1.1.1 seems to have fixed the problem. It is stable now, but not 
sure if it will happen again.



[jira] [Commented] (ZOOKEEPER-3100) ZooKeeper client times out due to random choice of resolved addresses

2018-07-27 Thread Andor Molnar (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16560168#comment-16560168
 ] 

Andor Molnar commented on ZOOKEEPER-3100:
-

Got it. Sorry, I keep forgetting that this is with embedded ZooKeeper.

> ZooKeeper client times out due to random choice of resolved addresses
> -
>
> Key: ZOOKEEPER-3100
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3100
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.13
>Reporter: Rajini Sivaram
>Assignee: Andor Molnar
>Priority: Major
>
> The changes to ZooKeeper clients to re-resolve hosts made under 
> ZOOKEEPER-2184 results in delays when only a subset of the addresses that a 
> host resolves to are actually reachable. This can result in connection 
> timeouts on the client.
> For example, when running tests with a single ZooKeeper server accepting 
> connections on 127.0.0.1 on a host that has both IPv4 and IPv6, we have seen 
> connection timeouts in tests if client connects using `localhost` rather than 
> `127.0.0.1`. ZooKeeper client resolves `localhost` to both the IPv4 and IPv6 
> addresses and chooses a random one. If IPv6 was chosen, a fixed one second 
> backoff is applied before retry since there is only one hostname specified. 
> After backoff, 'localhost' is resolved again and a random address chosen, 
> which could also be the unconnectable IPv6 address.
> For the list of host names specified for connection, the clients do 
> round-robin without backoffs until connections to all hostnames are 
> attempted. Can we also do the same for addresses that each of the hosts 
> resolves to, so that backoffs are only applied after connection to each 
> address is attempted once and every address is connected to once using 
> round-robin rather than random selection? This will avoid delays in cases 
> where at least one address can be connected to.
>  





[GitHub] zookeeper issue #47: Update zookeeperOver.html

2018-07-27 Thread cpoerschke
Github user cpoerschke commented on the issue:

https://github.com/apache/zookeeper/pull/47
  
Ah, oops, @ghost is indeed a ghost!

Hmm, so is it then unclear how, or by whom, this pull request could be 
closed?

Except perhaps the customary "Closes #47" in a commit which would result in 
the ASF bot closing it?


---


[GitHub] zookeeper issue #47: Update zookeeperOver.html

2018-07-27 Thread cpoerschke
Github user cpoerschke commented on the issue:

https://github.com/apache/zookeeper/pull/47
  
@phunt looks like I don't have the necessary privileges to close this pull 
request.

@ghost as creator of the pull request, would you have a moment perhaps to 
close it? thank you.


---


[GitHub] zookeeper issue #566: ZOOKEEPER-3062: mention fsync.warningthresholdms in Fi...

2018-07-27 Thread cpoerschke
Github user cpoerschke commented on the issue:

https://github.com/apache/zookeeper/pull/566
  
Thanks everyone for your feedback!

> ... are you ok with removing the extra words in the log message?

Hmm, ok, done. Should I update the pull request and 
https://issues.apache.org/jira/browse/ZOOKEEPER-3062 title(s) to reflect the 
changed scope or would that unnecessarily confuse the ticket history and the 
continuous integration tools?


---


ZooKeeper_branch34_openjdk7 - Build # 2004 - Failure

2018-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/2004/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 41.20 KB...]
[junit] Running org.apache.zookeeper.test.SaslAuthDesignatedServerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.677 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailDesignatedClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.369 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailNotifyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.803 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.752 sec
[junit] Running org.apache.zookeeper.test.SaslAuthMissingClientConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.813 sec
[junit] Running org.apache.zookeeper.test.SaslClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.075 sec
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.765 sec
[junit] Running org.apache.zookeeper.test.SessionTest
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
11.458 sec
[junit] Running org.apache.zookeeper.test.SessionTimeoutTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.936 sec
[junit] Running org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.886 sec
[junit] Running org.apache.zookeeper.test.StatTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.171 sec
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.706 sec
[junit] Running org.apache.zookeeper.test.SyncCallTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.812 sec
[junit] Running org.apache.zookeeper.test.TruncateTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
9.502 sec
[junit] Running org.apache.zookeeper.test.UpgradeTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.456 sec
[junit] Running org.apache.zookeeper.test.WatchedEventTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.126 sec
[junit] Running org.apache.zookeeper.test.WatcherFuncTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.334 sec
[junit] Running org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
28.959 sec
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
9.406 sec
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.905 sec

fail.build.on.test.failure:

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/build.xml:1393:
 The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/build.xml:1396:
 Tests failed!

Total time: 32 minutes 20 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/
Recording test results
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/
Setting OPENJDK_7_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-7-openjdk-amd64/



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff

Error Message:
expected:<4294967298> but was:<0>

Stack Trace:
junit.framework.AssertionFailedError: expected:<4294967298> but was:<0>
at 
org.apache.zookeeper.server.quorum.Zab1_0Test$5.converseWithFollower(Zab1_0Test.java:796)
at 
org.apache.zookeeper.server.quorum.Zab1_0Test.testFollowerConversation(Zab1_0Test.java:442)
at 
org.apache.zookeeper.server.quorum.Zab1_0Test.testNormalFollowerRunWithDiff(Zab1_0Test.java:712)

[GitHub] zookeeper issue #563: ZOOKEEPER-3072: Throttle race condition fix

2018-07-27 Thread breed
Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/563
  
thank you @bothejjms !


---


[jira] [Commented] (ZOOKEEPER-3100) ZooKeeper client times out due to random choice of resolved addresses

2018-07-27 Thread Rajini Sivaram (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559837#comment-16559837
 ] 

Rajini Sivaram commented on ZOOKEEPER-3100:
---

[~andorm] In the failing Kafka test, ZooKeeper was not listening on the wildcard 
address; it was listening specifically on 127.0.0.1 
(https://github.com/apache/kafka/blob/trunk/core/src/test/scala/unit/kafka/zk/EmbeddedZookeeper.scala).
Hence the connection to the IPv6 address was failing. I think in the example 
above, you were running ZooKeeper on the wildcard address and hence it worked 
for both IPv4 and IPv6.

We have fixed the Kafka tests by changing clients to connect to `127.0.0.1` 
instead of `localhost` and that is a reasonable workaround since the server is 
bound explicitly to that address. But since this used to work before, perhaps 
it would be better to guarantee that all possible addresses are attempted 
before applying a backoff as large as a second?
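The behaviour asked for here can be sketched roughly as follows. This is not the ZooKeeper client's host provider; `RoundRobinCursor`, `forHost`, and `backoffDue` are hypothetical names, and the only real API used is `InetAddress.getAllByName`. The cursor hands out each resolved address in order and reports that a backoff is due only after a full pass.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: cycles through candidates in order and signals
// a backoff only once every candidate has been tried.
class RoundRobinCursor<T> {
    private final List<T> candidates;
    private int nextIndex = 0;
    private int attemptsInPass = 0;

    RoundRobinCursor(List<T> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("at least one candidate is required");
        }
        this.candidates = new ArrayList<>(candidates);
    }

    /** Builds a cursor over every address the host name resolves to. */
    static RoundRobinCursor<InetAddress> forHost(String host) throws UnknownHostException {
        return new RoundRobinCursor<>(Arrays.asList(InetAddress.getAllByName(host)));
    }

    /** Returns the next candidate to try, in round-robin order. */
    T next() {
        T candidate = candidates.get(nextIndex);
        nextIndex = (nextIndex + 1) % candidates.size();
        attemptsInPass++;
        return candidate;
    }

    /** True once all candidates were tried since the last backoff; starts a new pass. */
    boolean backoffDue() {
        if (attemptsInPass >= candidates.size()) {
            attemptsInPass = 0;
            return true;
        }
        return false;
    }
}
```

With `localhost` resolving to both `::1` and `127.0.0.1`, a caller following this sketch would try both addresses before sleeping, so a server bound only to `127.0.0.1` would be reached without paying the one-second backoff.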



[GitHub] zookeeper issue #563: ZOOKEEPER-3072: Throttle race condition fix

2018-07-27 Thread bothejjms
Github user bothejjms commented on the issue:

https://github.com/apache/zookeeper/pull/563
  
I have refactored the branches as suggested.


---


[GitHub] zookeeper pull request #563: ZOOKEEPER-3072: Throttle race condition fix

2018-07-27 Thread bothejjms
Github user bothejjms commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/563#discussion_r205778339
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -1128,9 +1128,9 @@ public void processPacket(ServerCnxn cnxn, ByteBuffer 
incomingBuffer) throws IOE
 Record rsp = processSasl(incomingBuffer,cnxn);
 ReplyHeader rh = new ReplyHeader(h.getXid(), 0, 
KeeperException.Code.OK.intValue());
 cnxn.sendResponse(rh,rsp, "response"); // not sure about 
3rd arg..what is it?
-return;
--- End diff --

I have refactored like that.
Returns are actually unnecessary but I have consistently added them now.


---


[jira] [Commented] (ZOOKEEPER-3100) ZooKeeper client times out due to random choice of resolved addresses

2018-07-27 Thread Andor Molnar (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559776#comment-16559776
 ] 

Andor Molnar commented on ZOOKEEPER-3100:
-

[~rsivaram]

I've run a few tests with current 3.4 and 3.5 versions of ZooKeeper and I got 
the same results:

I spoke a little too soon regarding the wildcard address, because ZooKeeper 
opens a unified socket this way. Although netstat shows that ZK is listening 
only on the v6 socket, clients are able to connect with both protocols:
{noformat}
[andor@andor-centos zkconf]$ sudo netstat -plnt | grep 2181
tcp6 0 0 :::2181 :::* LISTEN 9249/java 

[andor@andor-centos zkconf]$ echo "stat" | nc -4 -v localhost 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to 127.0.0.1:2181.
stat is not executed because it is not in the whitelist.
Ncat: 5 bytes sent, 57 bytes received in 0.01 seconds.

[andor@andor-centos zkconf]$ echo "stat" | nc -6 -v localhost 2181
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connected to ::1:2181.
stat is not executed because it is not in the whitelist.
Ncat: 5 bytes sent, 57 bytes received in 0.01 seconds.{noformat}
So back to your original issue, I'm not able to repro it. CLI also works 
perfectly for me.

I need to look into the Kafka ticket, it must be something specific to that 
client.

 

 

 



[GitHub] zookeeper pull request #584: ZOOKEEPER-3102: Potential race condition when c...

2018-07-27 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/584#discussion_r205749086
  
--- Diff: src/java/main/org/apache/zookeeper/server/DataTree.java ---
@@ -478,7 +478,10 @@ public void createNode(final String path, byte data[], 
List acl,
 HashSet list = ephemerals.get(ephemeralOwner);
 if (list == null) {
 list = new HashSet();
-ephemerals.put(ephemeralOwner, list);
+HashSet _list;
--- End diff --

You can use `computeIfAbsent()` like this:
```java
HashSet<String> list = ephemerals.computeIfAbsent(ephemeralOwner, k -> new HashSet<String>());
```
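
For context, a self-contained sketch of why `computeIfAbsent()` avoids the check-then-put race. The class `EphemeralIndex` is a hypothetical stand-in, not the actual `DataTree` code, and it assumes the map is a `ConcurrentHashMap`; it also swaps the `HashSet` for a concurrent set so the example is thread-safe on its own.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a session-id to ephemeral-paths index. */
final class EphemeralIndex {
    private final Map<Long, Set<String>> ephemerals = new ConcurrentHashMap<>();

    /**
     * Registers a path under its owner. computeIfAbsent creates the set
     * atomically, so two threads can never install two different sets
     * for the same owner.
     */
    void add(long owner, String path) {
        Set<String> paths = ephemerals.computeIfAbsent(owner, k -> ConcurrentHashMap.newKeySet());
        paths.add(path);
    }

    /** Returns the paths owned by the session, or an empty set if none. */
    Set<String> pathsOf(long owner) {
        return ephemerals.getOrDefault(owner, Collections.emptySet());
    }
}
```

The separate `get()` / `put()` sequence in the diff leaves a window in which another thread can insert its own set between the two calls; the single atomic call closes that window.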


---


[jira] [Commented] (ZOOKEEPER-3057) Fix IPv6 literal usage

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559635#comment-16559635
 ] 

Hudson commented on ZOOKEEPER-3057:
---

FAILURE: Integrated in Jenkins build ZooKeeper-trunk #123 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/123/])
ZOOKEEPER-3057: Fix IPv6 literal usage (andor: rev 
ba8932dccb227b5b52de98e33c46054014f951b7)
* (edit) 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java
* (edit) src/java/main/org/apache/zookeeper/server/util/ConfigUtils.java
* (edit) src/java/main/org/apache/zookeeper/server/quorum/LocalPeerBean.java
* (add) src/java/test/org/apache/zookeeper/common/NetUtilsTest.java
* (edit) src/java/test/org/apache/zookeeper/server/quorum/LocalPeerBeanTest.java
* (add) src/java/test/org/apache/zookeeper/server/util/ConfigUtilsTest.java
* (edit) src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java
* (edit) src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java
* (edit) src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java
* (add) src/java/main/org/apache/zookeeper/common/NetUtils.java


> Fix IPv6 literal usage
> --
>
> Key: ZOOKEEPER-3057
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3057
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: other
>Affects Versions: 3.4.12
>Reporter: Mohamed Jeelani
>Assignee: Mohamed Jeelani
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 5h
>  Remaining Estimate: 0h
>
> IPv6 literals are not parsed correctly, which can lead to errors and is, 
> at the very least, an eyesore. We need to parse and display them correctly.
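
As an illustration of the display side of the problem, an IPv6 literal needs brackets when combined with a port, otherwise the port separator is ambiguous. The helper below is a hypothetical sketch (the name `HostPortFormat` is made up for this example and is not claimed to match the `NetUtils` class added by the patch):

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.InetSocketAddress;

/** Hypothetical helper that renders host:port with IPv6 literals in brackets. */
final class HostPortFormat {
    static String format(InetSocketAddress addr) {
        InetAddress ip = addr.getAddress();
        if (ip == null) {
            // Unresolved address: fall back to the host string as given.
            return addr.getHostString() + ":" + addr.getPort();
        }
        String host = ip.getHostAddress();
        if (ip instanceof Inet6Address) {
            // Brackets separate the colons of the address from the port colon.
            return "[" + host + "]:" + addr.getPort();
        }
        return host + ":" + addr.getPort();
    }
}
```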





[jira] [Commented] (ZOOKEEPER-3067) Optionally suppress client environment logging.

2018-07-27 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559636#comment-16559636
 ] 

Hudson commented on ZOOKEEPER-3067:
---

FAILURE: Integrated in Jenkins build ZooKeeper-trunk #123 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/123/])
ZOOKEEPER-3067: Optionally disable client environment logging. (andor: rev 
01132edbd4df3df4472a2f8d3077fe585035cce1)
* (edit) src/c/CMakeLists.txt
* (edit) src/c/include/zookeeper.h
* (edit) src/c/Makefile.am
* (edit) src/c/src/zookeeper.c
* (add) src/c/tests/TestLogClientEnv.cc


> Optionally suppress client environment logging.
> ---
>
> Key: ZOOKEEPER-3067
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3067
> Project: ZooKeeper
>  Issue Type: Task
>  Components: c client
>Reporter: James Peach
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> It would be helpful to add a {{zookeeper_init}} flag to suppress the client 
> environment logging. In our deployment, this causes LDAP lookups for the 
> current user ID, which is otherwise an unnecessary service dependency for 
> ZooKeeper clients.





ZooKeeper-trunk - Build # 123 - Failure

2018-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/123/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 144.25 KB...]
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
3.427 sec, Thread: 8, Class: org.apache.zookeeper.test.ServerCnxnTest
[junit] Running org.apache.zookeeper.test.SessionTest in thread 5
[junit] Running org.apache.zookeeper.test.SessionTimeoutTest in thread 8
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.335 sec, Thread: 8, Class: org.apache.zookeeper.test.SessionTimeoutTest
[junit] Running org.apache.zookeeper.test.SessionTrackerCheckTest in thread 
8
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.102 sec, Thread: 8, Class: org.apache.zookeeper.test.SessionTrackerCheckTest
[junit] Running org.apache.zookeeper.test.SessionUpgradeTest in thread 8
[junit] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
14.109 sec, Thread: 5, Class: org.apache.zookeeper.test.SessionTest
[junit] Running org.apache.zookeeper.test.StandaloneTest in thread 5
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.777 sec, Thread: 5, Class: org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
129.838 sec, Thread: 6, Class: org.apache.zookeeper.test.RecoveryTest
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest in thread 5
[junit] Running org.apache.zookeeper.test.StatTest in thread 6
[junit] Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.102 sec, Thread: 5, Class: org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.115 sec, Thread: 6, Class: org.apache.zookeeper.test.StatTest
[junit] Running org.apache.zookeeper.test.StringUtilTest in thread 5
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.106 sec, Thread: 5, Class: org.apache.zookeeper.test.StringUtilTest
[junit] Running org.apache.zookeeper.test.SyncCallTest in thread 6
[junit] Running org.apache.zookeeper.test.TruncateTest in thread 5
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.228 sec, Thread: 6, Class: org.apache.zookeeper.test.SyncCallTest
[junit] Running org.apache.zookeeper.test.WatchEventWhenAutoResetTest in 
thread 6
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
21.182 sec, Thread: 8, Class: org.apache.zookeeper.test.SessionUpgradeTest
[junit] Running org.apache.zookeeper.test.WatchedEventTest in thread 8
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.103 sec, Thread: 8, Class: org.apache.zookeeper.test.WatchedEventTest
[junit] Running org.apache.zookeeper.test.WatcherFuncTest in thread 8
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
5.073 sec, Thread: 8, Class: org.apache.zookeeper.test.WatcherFuncTest
[junit] Running org.apache.zookeeper.test.WatcherTest in thread 8
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
83.942 sec, Thread: 2, Class: org.apache.zookeeper.test.RestoreCommittedLogTest
[junit] Running org.apache.zookeeper.test.X509AuthTest in thread 2
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.094 sec, Thread: 2, Class: org.apache.zookeeper.test.X509AuthTest
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest in 
thread 2
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
18.002 sec, Thread: 5, Class: org.apache.zookeeper.test.TruncateTest
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest in thread 5
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.319 sec, Thread: 5, Class: org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
22.217 sec, Thread: 6, Class: 
org.apache.zookeeper.test.WatchEventWhenAutoResetTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
14.727 sec, Thread: 2, Class: org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
34.413 sec, Thread: 8, Class: org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
290.124 sec, Thread: 4, Class: org.apache.zookeeper.test.ReconfigTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
618.727 sec, Thread: 3, Class: org.apache.zookeeper.test.DisconnectedWatcherTest
[junit] Tests run: 105, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
561.808 sec, Thread: 7, Class: org.apache.zookeeper.test.NettyNettySuiteTest

ZooKeeper_branch34_openjdk8 - Build # 2 - Still Failing

2018-07-27 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk8/2/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 3.54 KB...]
at java.lang.Thread.run(Thread.java:748)
Suppressed: hudson.remoting.Channel$CallSiteStackTrace: Remote call to 
ubuntu-eu2
at 
hudson.remoting.Channel.attachCallSiteStackTrace(Channel.java:1741)
at 
hudson.remoting.UserRequest$ExceptionResponse.retrieve(UserRequest.java:357)
at hudson.remoting.Channel.call(Channel.java:955)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.execute(RemoteGitImpl.java:146)
at sun.reflect.GeneratedMethodAccessor1143.invoke(Unknown 
Source)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.jenkinsci.plugins.gitclient.RemoteGitImpl$CommandInvocationHandler.invoke(RemoteGitImpl.java:132)
at com.sun.proxy.$Proxy118.execute(Unknown Source)
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:886)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:1155)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1186)
at hudson.scm.SCM.checkout(SCM.java:504)
at 
hudson.model.AbstractProject.checkout(AbstractProject.java:1208)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:574)
at 
jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:499)
at hudson.model.Run.execute(Run.java:1794)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at 
hudson.model.ResourceController.execute(ResourceController.java:97)
at hudson.model.Executor.run(Executor.java:429)
ERROR: Error fetching remote repo 'origin'
Retrying after 10 seconds
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
No valid HEAD. Skipping the resetting
 > git clean -fdx # timeout=10
Fetching upstream changes from git://git.apache.org/zookeeper.git
 > git --version # timeout=10
 > git fetch --tags --progress git://git.apache.org/zookeeper.git 
 > +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/branch-3.4^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/branch-3.4^{commit} # timeout=10
Checking out Revision fc346f2a0ce2df0a50eb601d7617d8dd6a7e6e69 
(refs/remotes/origin/branch-3.4)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f fc346f2a0ce2df0a50eb601d7617d8dd6a7e6e69
Commit message: "ZOOKEEPER-3009: fix the related bugs in branch-3.4"
 > git rev-list --no-walk fc346f2a0ce2df0a50eb601d7617d8dd6a7e6e69 # timeout=10
No emails were triggered.
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
[ZooKeeper_branch34_openjdk8] $ 
/home/jenkins/tools/ant/apache-ant-1.9.7/bin/ant -Dtest.junit.maxmem=2g 
-Dtest.output=no -Dtest.junit.threads=8 -Dtest.junit.output.format=xml 
-Djavac.target=1.8 clean test-core-java
Error: JAVA_HOME is not defined correctly.
  We cannot execute /usr/lib/jvm/java-8-openjdk-amd64//bin/java
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Recording test results
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
ERROR: Step ‘Publish JUnit test result report’ failed: No test report files 
were found. Configuration error?
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/
Setting OPENJDK_8_ON_UBUNTU_ONLY__HOME=/usr/lib/jvm/java-8-openjdk-amd64/



###
## FAILED TESTS (if any) 
##
No tests ran.

[GitHub] zookeeper issue #565: ZOOKEEPER-3067: Optionally disable client environment ...

2018-07-27 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/565
  
Committed to master branch.
Thanks @jpeach !


---


[GitHub] zookeeper pull request #565: ZOOKEEPER-3067: Optionally disable client envir...

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/565


---


[jira] [Resolved] (ZOOKEEPER-3067) Optionally suppress client environment logging.

2018-07-27 Thread Andor Molnar (JIRA)


 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andor Molnar resolved ZOOKEEPER-3067.
-
   Resolution: Fixed
Fix Version/s: 3.6.0

Issue resolved by pull request 565
[https://github.com/apache/zookeeper/pull/565]

> Optionally suppress client environment logging.
> ---
>
> Key: ZOOKEEPER-3067
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3067
> Project: ZooKeeper
>  Issue Type: Task
>  Components: c client
>Reporter: James Peach
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 3.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> It would be helpful to add a {{zookeeper_init}} flag to suppress the client 
> environment logging. In our deployment, this causes LDAP lookups for the 
> current user ID, which is otherwise an unnecessary service dependency for 
> ZooKeeper clients.





[GitHub] zookeeper issue #548: [ZOOKEEPER-3057] Fix IPv6 literal usage

2018-07-27 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/548
  
Committed to the master branch only, because it conflicts with 3.5.
@mjeelanimsft Would you please create separate pull requests for branch-3.5 
and branch-3.4?


---


[GitHub] zookeeper pull request #548: [ZOOKEEPER-3057] Fix IPv6 literal usage

2018-07-27 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/zookeeper/pull/548


---


[GitHub] zookeeper issue #548: [ZOOKEEPER-3057] Fix IPv6 literal usage

2018-07-27 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/548
  
Jenkins is green and we got 2 approvals. Committing.


---


[GitHub] zookeeper issue #583: [ZOOKEEPER-3104] Fix data inconsistency due to NEWLEAD...

2018-07-27 Thread nkalmar
Github user nkalmar commented on the issue:

https://github.com/apache/zookeeper/pull/583
  
It's not a blocker for me. I agree it would be nice to have a unified 
format, but that's pretty hard to achieve on an Apache project, I think :( 

Anyway, that's why I wrote a "comment" review, not a request for change.

Thanks for the fix @lvfangmin ! :)


---


[GitHub] zookeeper pull request #545: ZOOKEEPER-2261 When only secureClientPort is co...

2018-07-27 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/545#discussion_r205729889
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/persistence/UtilTest.java ---
@@ -0,0 +1,91 @@
+/**
--- End diff --

I had to make a bunch of refactorings to move all of these unit tests to 
the right place. Sorry for polluting the pull request, I can move it to a 
separate ticket if you want.


---


[GitHub] zookeeper pull request #545: ZOOKEEPER-2261 When only secureClientPort is co...

2018-07-27 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/545#discussion_r205729651
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -866,6 +866,9 @@ public void setServerCnxnFactory(ServerCnxnFactory 
factory) {
 }
 
 public ServerCnxnFactory getServerCnxnFactory() {
+if (secureServerCnxnFactory != null) {
+return secureServerCnxnFactory;
+}
 return serverCnxnFactory;
 }
 
--- End diff --

I will look into it. The original issue was on the caller side, which didn't 
have the logic to probe which ServerCnxnFactory is available, so I probably 
have to make it smarter.

Though I'm not entirely convinced that it should be the caller's 
responsibility to deal with the problem. The caller just wants something that 
implements the interface. Why should we make this their problem?


---


[GitHub] zookeeper pull request #586: Zookeeper 3105:Character coding problem occur w...

2018-07-27 Thread lordofkey
GitHub user lordofkey opened a pull request:

https://github.com/apache/zookeeper/pull/586

Zookeeper 3105:Character coding problem occur when create a node using 
python3

When creating a node using python3, an InvalidACLException occurs every 
time. It's caused by an incompatible way of parsing the ACL passed through the 
python3 API, so

```
acls->data[i].id.id = strdup( PyUnicode_AsUnicode( PyDict_GetItemString( a, "id" ) ) );
acls->data[i].id.scheme = strdup( PyUnicode_AsUnicode( PyDict_GetItemString( a, "scheme" ) ) );
```
is changed to

```
acls->data[i].id.id = strdup( PyBytes_AS_STRING( PyUnicode_AsASCIIString( PyDict_GetItemString( a, "id" ) ) ) );
acls->data[i].id.scheme = strdup( PyBytes_AS_STRING( PyUnicode_AsASCIIString( PyDict_GetItemString( a, "scheme" ) ) ) );
```

because `acls->data[i].id.id` and `acls->data[i].id.scheme` must be 
ASCII strings.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/lordofkey/zookeeper ZOOKEEPER-3105

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/586.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #586


commit 5a441ed6058740458b8ec549fd32931757ce4e3a
Author: yanghao 
Date:   2018-07-27T07:35:41Z

ZOOKEEPER-3105:Character coding problem occur when create a node using 
python3




---


[GitHub] zookeeper pull request #585: Zookeeper 3105:Character coding problem occur w...

2018-07-27 Thread lordofkey
Github user lordofkey closed the pull request at:

https://github.com/apache/zookeeper/pull/585


---


[jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper

2018-07-27 Thread Oded (JIRA)


[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16559328#comment-16559328
 ] 

Oded commented on ZOOKEEPER-3036:
-

Hi,

We did the same, but the issue returned. It happens to us from time to
time, and we handle it manually.
The main problem is that it causes a "split brain" in Kafka, where the
cluster believes it has more than one controller.





> Unexpected exception in zookeeper
> -
>
> Key: ZOOKEEPER-3036
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: jmx
>Affects Versions: 3.4.10
> Environment: 3 Zookeepers, 5 kafka servers
>Reporter: Oded
>Priority: Critical
>
> We had an issue with one of the ZooKeeper servers (the Leader), causing the 
> entire Kafka cluster to fail:
> 2018-05-09 02:29:01,730 [myid:3] - ERROR 
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected 
> exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>     at java.net.SocketInputStream.socketRead0(Native Method)
>     at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>     at java.net.SocketInputStream.read(SocketInputStream.java:171)
>     at java.net.SocketInputStream.read(SocketInputStream.java:141)
>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>     at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>     at java.io.DataInputStream.readInt(DataInputStream.java:387)
>     at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>     at 
> org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>     at 
> org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>     at 
> org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
> 2018-05-09 02:29:01,730 [myid:3] - WARN  
> [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - *** GOODBYE 
> /192.168.0.91:42490 
>  
> We would expect ZooKeeper to elect another Leader and the Kafka 
> cluster to continue working as expected, but that was not the case.
>  


