ZooKeeper-trunk-WinVS2008 - Build # 2345 - Still Failing

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/2345/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 78 lines...]

ivy-retrieve:
[ivy:retrieve] :: Apache Ivy 2.4.0 - 20141213170938 :: 
http://ant.apache.org/ivy/ ::
[ivy:retrieve] :: loading settings :: file = 
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\ivysettings.xml
[ivy:retrieve] :: resolving dependencies :: 
org.apache.zookeeper#zookeeper;3.6.0-SNAPSHOT
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  found jline#jline;2.11 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-server;9.2.18.v20160721 in maven2
[ivy:retrieve]  found javax.servlet#javax.servlet-api;3.1.0 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-http;9.2.18.v20160721 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-util;9.2.18.v20160721 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-io;9.2.18.v20160721 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-servlet;9.2.18.v20160721 in maven2
[ivy:retrieve]  found org.eclipse.jetty#jetty-security;9.2.18.v20160721 in 
maven2
[ivy:retrieve]  found org.codehaus.jackson#jackson-mapper-asl;1.9.11 in maven2
[ivy:retrieve]  found org.codehaus.jackson#jackson-core-asl;1.9.11 in maven2
[ivy:retrieve]  found org.slf4j#slf4j-api;1.7.5 in maven2
[ivy:retrieve]  found org.slf4j#slf4j-log4j12;1.7.5 in maven2
[ivy:retrieve]  found commons-cli#commons-cli;1.2 in maven2
[ivy:retrieve]  found log4j#log4j;1.2.17 in maven2
[ivy:retrieve]  found io.netty#netty;3.10.5.Final in maven2
[ivy:retrieve]  found net.java.dev.javacc#javacc;5.0 in maven2
[ivy:retrieve] :: resolution report :: resolve 456ms :: artifacts dl 22ms
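---------------------------------------------------------------------
|                  |            modules            ||   artifacts   |
|       conf       | number| search|dwnlded|evicted|| number|dwnlded|
---------------------------------------------------------------------
|      default     |   16  |   0   |   0   |   0   ||   16  |   0   |
---------------------------------------------------------------------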
[ivy:retrieve] :: retrieving :: org.apache.zookeeper#zookeeper
[ivy:retrieve]  confs: [default]
[ivy:retrieve]  16 artifacts copied, 0 already retrieved (4635kB/34ms)

generate_jute_parser:
[mkdir] Created dir: 
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\build\jute_compiler\org\apache\jute\compiler\generated
[ivy:artifactproperty] DEPRECATED: 'ivy.conf.file' is deprecated, use 
'ivy.settings.file' instead
[ivy:artifactproperty] :: loading settings :: file = 
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\ivysettings.xml
 [move] Moving 1 file to 
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\build\lib
   [javacc] Java Compiler Compiler Version 5.0 (Parser Generator)
   [javacc] (type "javacc" with no arguments for help)
   [javacc] Reading from file 
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\src\java\main\org\apache\jute\compiler\generated\rcc.jj
 . . .
   [javacc] File "TokenMgrError.java" does not exist.  Will create one.
   [javacc] File "ParseException.java" does not exist.  Will create one.
   [javacc] File "Token.java" does not exist.  Will create one.
   [javacc] File "SimpleCharStream.java" does not exist.  Will create one.
   [javacc] Parser generated successfully.

jute:

BUILD FAILED
f:\jenkins\jenkins-slave\workspace\ZooKeeper-trunk-WinVS2008\build.xml:273: 
Unable to find a javac compiler;
com.sun.tools.javac.Main is not on the classpath.
Perhaps JAVA_HOME does not point to the JDK.
It is currently set to "C:\Program Files\Java\jre1.8.0_92"

Total time: 3 seconds
Build step 'Invoke Ant' marked build as failure
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.
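
The failure above comes from JAVA_HOME pointing at a JRE ("jre1.8.0_92"), which ships no compiler. As a rough illustration, the sketch below checks whether the running JVM exposes a system Java compiler. It uses javax.tools.ToolProvider, which is a related but different check from the com.sun.tools.javac.Main lookup that Ant reports; the class name JdkCheck and the helper verdict() are made up for this example.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

final class JdkCheck {
    /** True when the running JVM exposes a system Java compiler (i.e. it is a JDK). */
    static boolean hasCompiler() {
        // getSystemJavaCompiler() returns null when no compiler is provided.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return compiler != null;
    }

    /** Builds a readable verdict for a given Java home and compiler availability. */
    static String verdict(String javaHome, boolean compilerAvailable) {
        return compilerAvailable
                ? "JDK detected at " + javaHome + ": javac is available"
                : "JRE only at " + javaHome + ": point JAVA_HOME at a JDK instead";
    }

    public static void main(String[] args) {
        System.out.println(verdict(System.getProperty("java.home"), hasCompiler()));
    }
}
```

Running this with the same Java installation that the build slave uses should indicate whether a JDK or only a JRE is configured.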

[jira] [Commented] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Mel Martinez (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713743#comment-15713743
 ] 

Mel Martinez commented on ZOOKEEPER-2251:
-

Michael Han - we are using version 3.4.5.  

We would almost certainly upgrade to a newer version if we knew this problem 
would be addressed. 



> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
>Priority: Critical
>  Labels: fault
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>
> I came across an issue related to client-side packet response timeouts. In my 
> cluster, many packets were dropped for some time.
> One observation is that the ZooKeeper client hung. According to the thread dump, 
> it was waiting for the response/ACK for the operation performed (the synchronous 
> API was used here).
> I am using 
> zookeeper.serverCnxnFactory=org.apache.zookeeper.server.NIOServerCnxnFactory
> Since only a few packets were lost, no DISCONNECTED event occurred.
> We need to add a "response timeout" for the operations or packets.
> *Comments from [~rakeshr]*
> My observations about the problem:
> * We can use tools like 'Wireshark' to simulate artificial packet loss.
> * Assume there is only one packet in the 'outgoingQueue' and, unfortunately, 
> the server response packet is lost. The client will then wait indefinitely. 
> https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1515
> * We can probably discuss this problem and possible solutions (add a 
> packet ACK timeout or another, better approach) in the JIRA.
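
To make the requested "response timeout" concrete, here is a minimal sketch of a bounded wait on an in-flight request. It is not the attached patch and not the actual ClientCnxn code; the names ResponseTimeoutSketch, Packet, markFinished and awaitReply are hypothetical and exist only for this illustration. The point is simply that the caller waits against a deadline instead of waiting forever.

```java
import java.util.concurrent.TimeoutException;

final class ResponseTimeoutSketch {
    /** Hypothetical stand-in for a request that is waiting for the server's reply. */
    static final class Packet {
        private boolean finished;

        /** Called by the (hypothetical) receive path once the reply has arrived. */
        synchronized void markFinished() {
            finished = true;
            notifyAll();
        }

        /** Blocks until the reply arrives, or fails once timeoutMs has elapsed. */
        synchronized void awaitReply(long timeoutMs)
                throws InterruptedException, TimeoutException {
            long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
            while (!finished) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    throw new TimeoutException(
                            "no response within " + timeoutMs + " ms");
                }
                // Wait at least 1 ms; wait(0) would block indefinitely.
                wait(Math.max(1L, remainingNanos / 1_000_000L));
            }
        }
    }
}
```

With a bounded wait like this, a lost reply surfaces as an error the application can handle (for example by closing the connection and reconnecting) rather than as a hung thread.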



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2573) Modify Info.REVISION to adapt git repo

2016-12-01 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2573?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713597#comment-15713597
 ] 

Michael Han commented on ZOOKEEPER-2573:


bq. I think it is better to change org.apache.zookeeper.version.Info.REVISION 
from int to String and store the git revision.

+1 on the idea. [~arshad.mohammad], are you still interested in working on a fix 
for this issue? If so, please go ahead and submit a pull request, so we can get 
this in for 3.5.3.


> Modify Info.REVISION to adapt git repo
> --
>
> Key: ZOOKEEPER-2573
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2573
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build, server
>Reporter: Arshad Mohammad
> Fix For: 3.5.3, 3.6.0
>
>
> Modify {{org.apache.zookeeper.version.Info.REVISION}} to store the git repo 
> revision.
> Currently {{org.apache.zookeeper.version.Info.REVISION}} stores the svn repo 
> revision, which is of type int.
> But after migrating to the git repo, the git revision (commit 
> 63f5132716c08b3d8f18993bf98eb46eb42f80fb) cannot be stored in this variable.
> So we should either change this variable to a String, or introduce a new 
> variable to store the git revision and leave the svn revision variable unchanged.
> build.xml and org.apache.zookeeper.version.util.VerGen also need to be 
> modified. 
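
The second option in the description (keep the int, add a String) could look roughly like the sketch below. This is only an illustration: the class RevisionInfoSketch, the field REVISION_HASH and both helper methods are hypothetical names, not what VerGen generates today and not necessarily what the eventual fix will use.

```java
final class RevisionInfoSketch {
    /** Legacy numeric field; -1 is used here (as an assumption) to mean "not an svn build". */
    static final int REVISION = -1;

    /** Hypothetical new field holding the full git commit hash. */
    static final String REVISION_HASH = "63f5132716c08b3d8f18993bf98eb46eb42f80fb";

    /** Shows why an int cannot hold a git hash: parsing 40 hex digits overflows. */
    static boolean fitsInInt(String hexRevision) {
        try {
            Integer.parseInt(hexRevision, 16);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Prefers the git hash when present, otherwise falls back to the svn number. */
    static String describe(int svnRevision, String gitHash) {
        if (gitHash != null && !gitHash.isEmpty()) {
            return "git:" + gitHash;
        }
        return "svn:r" + svnRevision;
    }
}
```

Keeping the int field around avoids breaking callers that still read REVISION, at the cost of carrying two fields.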





[jira] [Commented] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713491#comment-15713491
 ] 

Michael Han commented on ZOOKEEPER-2251:


bq. Is the problem simply that the patch needs to be updated to match the 
latest code?

[~m.martinez] Thanks for bumping this up. The community has been working on a 
couple of high-priority issues to prepare the upcoming 3.4.10 and 3.5.3 releases. 
This issue was not labelled with version info, so it did not get much 
visibility in the queue. I just updated the JIRA and am reviewing the patch.

bq. We are definitely running into what looks to be this problem
[~m.martinez] Which version of ZooKeeper are you running?

On a side note, the patch does look outdated (i.e. the doc changes refer to 
3.5.2, which is already released) and needs to be rebased. [~arshad.mohammad] 
Would you mind updating the patch and sending a pull request on git? We can start 
iterating from there.

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
>Priority: Critical
>  Labels: fault
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2251:
---
Labels: fault  (was: )

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
>Priority: Critical
>  Labels: fault
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2251:
---
Component/s: java client

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2251:
---
Fix Version/s: 3.6.0
   3.5.3

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2251:
---
Priority: Critical  (was: Major)

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
>Priority: Critical
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Updated] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2016-12-01 Thread Michael Han (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Han updated ZOOKEEPER-2251:
---
Affects Version/s: 3.4.9
   3.5.2

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.9, 3.5.2
>Reporter: nijel
>Assignee: Arshad Mohammad
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>





[jira] [Commented] (ZOOKEEPER-2637) Netty related test failures

2016-12-01 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713458#comment-15713458
 ] 

Michael Han commented on ZOOKEEPER-2637:


Netty tests are known to be flaky. Linking with the ZOOKEEPER-2135 umbrella JIRA 
for tracking purposes.

> Netty related test failures
> ---
>
> Key: ZOOKEEPER-2637
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2637
> Project: ZooKeeper
>  Issue Type: Test
>  Components: tests
>Affects Versions: 3.6.0
> Environment: rhel ppc64le
>Reporter: Amita Chaudhary
>
> I am getting test failures related to Netty:
> [junit] Running org.apache.zookeeper.test.NettyNettySuiteHammerTest
> [junit] Running org.apache.zookeeper.test.NettyNettySuiteHammerTest
> [junit] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0 sec
> [junit] Test org.apache.zookeeper.test.NettyNettySuiteHammerTest FAILED 
> (crashed)
> [junit] Running org.apache.zookeeper.test.NettyNettySuiteTest
> [junit] Tests run: 101, Failures: 0, Errors: 26, Skipped: 0, Time elapsed: 
> 247.238 sec
> [junit] Test org.apache.zookeeper.test.NettyNettySuiteTest FAILED
> [junit] Running org.apache.zookeeper.test.NioNettySuiteHammerTest
> [junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
> 75.519 sec
> This is on a RHEL ppc64le machine, for the master branch.





Failed: ZOOKEEPER- PreCommit Build #96

2016-12-01 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 486293 lines...]
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 3.0.1) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 383f39b450f1ff81c233aed33bb0ab07c723ac77 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] mv: 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 and 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 are the same file

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1628:
 exec returned: 1

Total time: 15 minutes 49 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Compressed 573.88 KB of artifacts by 61.3% relative to #95
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2325
Putting comment on the pull request
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.zookeeper.server.quorum.ReconfigDuringLeaderSyncTest.testDuringLeaderSync

Error Message:
zoo.cfg.dynamic.next is not deleted.

Stack Trace:
junit.framework.AssertionFailedError: zoo.cfg.dynamic.next is not deleted.
at 
org.apache.zookeeper.server.quorum.ReconfigDuringLeaderSyncTest.testDuringLeaderSync(ReconfigDuringLeaderSyncTest.java:165)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)




[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713392#comment-15713392
 ] 

Hadoop QA commented on ZOOKEEPER-2325:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 8 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/96//console

This message is automatically generated.

> Data inconsistency if all snapshots empty or missing
> 
>
> Key: ZOOKEEPER-2325
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2325
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Andrew Grasso
>Assignee: Andrew Grasso
>Priority: Critical
> Attachments: ZOOKEEPER-2325-test.patch, ZOOKEEPER-2325.001.patch, 
> zk.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When loading state from snapshots on startup, FileTxnSnapLog.java ignores the 
> result of FileSnap.deserialize, which is -1L if no valid snapshots are found. 
> Recovery proceeds with dt.lastProcessed == 0, its initial value.
> The result is that Zookeeper will process the transaction logs and then begin 
> serving requests with a different state than the rest of the ensemble.
> To reproduce:
> In a healthy zookeeper cluster of size >= 3, shut down one node.
> Either delete all snapshots for this node or change all to be empty files.
> Restart the node.
> We believe this can happen organically if a node runs out of disk space.
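
The missing check described above can be sketched as follows. This is a simplified illustration under assumptions, not the attached patch and not the real FileTxnSnapLog code: SnapshotLoader is a hypothetical stand-in for the snapshot reader, and the policy shown (fail when no valid snapshot exists but transaction logs do) is just one possible way to avoid silently starting from an empty state.

```java
import java.io.IOException;

final class SnapshotRestoreSketch {
    /** Hypothetical stand-in: returns the last zxid of the snapshot, or -1 if none is valid. */
    interface SnapshotLoader {
        long deserialize() throws IOException;
    }

    /**
     * Returns the zxid to start replaying from, instead of ignoring the
     * loader's result and silently proceeding from 0.
     */
    static long restore(SnapshotLoader loader, boolean txnLogsExist) throws IOException {
        long snapZxid = loader.deserialize();
        if (snapZxid == -1L && txnLogsExist) {
            // Logs without any valid snapshot suggest lost or truncated snapshots.
            throw new IOException(
                    "No valid snapshot found, but transaction logs exist");
        }
        // A brand-new server (no snapshot, no logs) legitimately starts at 0.
        return Math.max(snapZxid, 0L);
    }
}
```

Failing fast here would force an operator (or a resync from the leader) to repair the node rather than letting it serve diverged data.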





[jira] [Commented] (ZOOKEEPER-2549) As NettyServerCnxn.sendResponse() allows all the exception to bubble up it can stop main ZK requests processing thread

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713302#comment-15713302
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2549:
---

Github user yufeldman commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90553827
  
--- Diff: src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java ---
@@ -165,31 +163,35 @@ public void process(WatchedEvent event) {
     @Override
     public void sendResponse(ReplyHeader h, Record r, String tag)
             throws IOException {
-        if (!channel.isOpen()) {
-            return;
-        }
-        ByteArrayOutputStream baos = new ByteArrayOutputStream();
-        // Make space for length
-        BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
         try {
-            baos.write(fourBytes);
-            bos.writeRecord(h, "header");
-            if (r != null) {
-                bos.writeRecord(r, tag);
+            if (!channel.isOpen()) {
+                return;
             }
-            baos.close();
-        } catch (IOException e) {
-            LOG.error("Error serializing response");
-        }
-        byte b[] = baos.toByteArray();
-        ByteBuffer bb = ByteBuffer.wrap(b);
-        bb.putInt(b.length - 4).rewind();
-        sendBuffer(bb);
-        if (h.getXid() > 0) {
-            // zks cannot be null otherwise we would not have gotten here!
-            if (!zkServer.shouldThrottle(outstandingCount.decrementAndGet())) {
-                enableRecv();
+            ByteArrayOutputStream baos = new ByteArrayOutputStream();
+            // Make space for length
+            BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
+            try {
+                baos.write(fourBytes);
+                bos.writeRecord(h, "header");
+                if (r != null) {
+                    bos.writeRecord(r, tag);
+                }
+                baos.close();
+            } catch (IOException e) {
--- End diff --

I did not modify this code; it was like that before. But yes, it potentially 
makes sense to rethrow.
I would say there are multiple places I came across where exceptions are 
swallowed.


> As NettyServerCnxn.sendResponse() allows all the exception to bubble up it 
> can stop main ZK requests processing thread
> --
>
> Key: ZOOKEEPER-2549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Yuliya Feldman
>Assignee: Yuliya Feldman
> Attachments: ZOOKEEPER-2549-2.patch, ZOOKEEPER-2549-3.patch, 
> ZOOKEEPER-2549-3.patch, ZOOKEEPER-2549-4.patch, ZOOKEEPER-2549.patch, 
> ZOOKEEPER-2549.patch, zookeeper-2549-1.patch
>
>
> Because NettyServerCnxn.sendResponse() allows all exceptions to bubble up, it 
> can stop the main ZK request processing thread and make the ZooKeeper server 
> look like it is hanging, when in fact it just cannot process any more requests.
> The idea is to catch all exceptions in NettyServerCnxn.sendResponse(), 
> convert them to IOException, and allow that to propagate up.
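
The idea in the description (catch everything, surface it as IOException) can be sketched in isolation as below. This is not the code from pull request #99; SendResponseSketch and ResponseWriter are hypothetical names, and the real method would serialize and send the reply where this sketch just invokes a callback. Note that it catches Exception, not Throwable, so Errors such as OutOfMemoryError still propagate unchanged.

```java
import java.io.IOException;

final class SendResponseSketch {
    /** Hypothetical stand-in for the serialize-and-send work done in sendResponse(). */
    interface ResponseWriter {
        void write() throws Exception;
    }

    /** Runs the writer and reports every failure as an IOException. */
    static void sendResponse(ResponseWriter writer) throws IOException {
        try {
            writer.write();
        } catch (IOException e) {
            // Already the declared type: rethrow instead of swallowing it.
            throw e;
        } catch (Exception e) {
            // Convert unexpected runtime failures, keeping the original as the cause.
            throw new IOException("Unexpected error while sending response", e);
        }
    }
}
```

Callers then only need to handle the declared IOException, and an unexpected runtime failure in one response no longer escapes as an unchecked exception into the request processing thread.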





[GitHub] zookeeper pull request #99: ZOOKEEPER-2549 Add exception handling to sendRes...

2016-12-01 Thread yufeldman
Github user yufeldman commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90553827
  
--- Diff: src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java ---
@@ -165,31 +163,35 @@ public void process(WatchedEvent event) {
     @Override
     public void sendResponse(ReplyHeader h, Record r, String tag)
             throws IOException {
-        if (!channel.isOpen()) {
-            return;
-        }
-        ByteArrayOutputStream baos = new ByteArrayOutputStream();
-        // Make space for length
-        BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
         try {
-            baos.write(fourBytes);
-            bos.writeRecord(h, "header");
-            if (r != null) {
-                bos.writeRecord(r, tag);
+            if (!channel.isOpen()) {
+                return;
             }
-            baos.close();
-        } catch (IOException e) {
-            LOG.error("Error serializing response");
-        }
-        byte b[] = baos.toByteArray();
-        ByteBuffer bb = ByteBuffer.wrap(b);
-        bb.putInt(b.length - 4).rewind();
-        sendBuffer(bb);
-        if (h.getXid() > 0) {
-            // zks cannot be null otherwise we would not have gotten here!
-            if (!zkServer.shouldThrottle(outstandingCount.decrementAndGet())) {
-                enableRecv();
+            ByteArrayOutputStream baos = new ByteArrayOutputStream();
+            // Make space for length
+            BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
+            try {
+                baos.write(fourBytes);
+                bos.writeRecord(h, "header");
+                if (r != null) {
+                    bos.writeRecord(r, tag);
+                }
+                baos.close();
+            } catch (IOException e) {
--- End diff --

I did not modify this code; it was like that before. But yes, it potentially 
makes sense to rethrow.
I would say there are multiple places I came across where exceptions are 
swallowed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (ZOOKEEPER-2549) As NettyServerCnxn.sendResponse() allows all the exception to bubble up it can stop main ZK requests processing thread

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713266#comment-15713266
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2549:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90550930
  
--- Diff: src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java ---
@@ -165,31 +163,35 @@ public void process(WatchedEvent event) {
 @Override
 public void sendResponse(ReplyHeader h, Record r, String tag)
 throws IOException {
-if (!channel.isOpen()) {
-return;
-}
-ByteArrayOutputStream baos = new ByteArrayOutputStream();
-// Make space for length
-BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
 try {
-baos.write(fourBytes);
-bos.writeRecord(h, "header");
-if (r != null) {
-bos.writeRecord(r, tag);
+if (!channel.isOpen()) {
+return;
 }
-baos.close();
-} catch (IOException e) {
-LOG.error("Error serializing response");
-}
-byte b[] = baos.toByteArray();
-ByteBuffer bb = ByteBuffer.wrap(b);
-bb.putInt(b.length - 4).rewind();
-sendBuffer(bb);
-if (h.getXid() > 0) {
-// zks cannot be null otherwise we would not have gotten here!
-if (!zkServer.shouldThrottle(outstandingCount.decrementAndGet())) {
-enableRecv();
+ByteArrayOutputStream baos = new ByteArrayOutputStream();
+// Make space for length
+BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
+try {
+baos.write(fourBytes);
+bos.writeRecord(h, "header");
+if (r != null) {
+bos.writeRecord(r, tag);
+}
+baos.close();
+} catch (IOException e) {
--- End diff --

This IOException is swallowed as well; should we re-throw it?


> As NettyServerCnxn.sendResponse() allows all the exception to bubble up it 
> can stop main ZK requests processing thread
> --
>
> Key: ZOOKEEPER-2549
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2549
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.1
>Reporter: Yuliya Feldman
>Assignee: Yuliya Feldman
> Attachments: ZOOKEEPER-2549-2.patch, ZOOKEEPER-2549-3.patch, 
> ZOOKEEPER-2549-3.patch, ZOOKEEPER-2549-4.patch, ZOOKEEPER-2549.patch, 
> ZOOKEEPER-2549.patch, zookeeper-2549-1.patch
>
>
> Because NettyServerCnxn.sendResponse() allows all exceptions to bubble up, it
> can stop the main ZK request-processing thread and make the ZooKeeper server
> look like it is hanging, when it simply cannot process any more requests.
> The idea is to catch all exceptions in NettyServerCnxn.sendResponse(),
> convert them to IOException, and let that propagate up.
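
The idea in the issue description (catch everything, convert to IOException, propagate) can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class ResponseSenderSketch and the Send callback are hypothetical stand-ins for the serialization and channel-write steps, and the sketch catches Exception rather than Throwable, which is an assumption about what "all the exceptions" should cover.

```java
import java.io.IOException;

/** Hypothetical stand-in for a connection class; not ZooKeeper's NettyServerCnxn. */
final class ResponseSenderSketch {

    /** Hypothetical callback representing serialization plus the channel write. */
    interface Send {
        void run() throws Exception;
    }

    /**
     * Runs the send step so that callers only ever see IOException.
     * IOExceptions pass through unchanged; any other exception is wrapped,
     * with the original kept as the cause so nothing is swallowed.
     */
    static void sendResponse(Send send) throws IOException {
        try {
            send.run();
        } catch (IOException e) {
            throw e; // already the declared type: rethrow instead of only logging
        } catch (Exception e) {
            throw new IOException("Failed to send response", e);
        }
    }
}
```

With this shape the caller decides whether to log, close the connection, or retry, instead of an unexpected runtime exception escaping into the request-processing thread.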



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2549) As NettyServerCnxn.sendResponse() allows all the exception to bubble up it can stop main ZK requests processing thread

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713267#comment-15713267
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2549:
---

Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90549964
  
--- Diff: src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java ---
@@ -716,7 +716,12 @@ public void process(WatchedEvent event) {
 // Convert WatchedEvent to a type that can be sent over the wire
 WatcherEvent e = event.getWrapper();
 
-sendResponse(h, e, "notification");
+try {
+sendResponse(h, e, "notification");
+} catch (IOException ex) {
+LOG.debug("Problem sending to " + getRemoteSocketAddress(), ex);
--- End diff --

We're using LOG.debug, so it shouldn't be an issue on prod.




[GitHub] zookeeper pull request #99: ZOOKEEPER-2549 Add exception handling to sendRes...

2016-12-01 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90549964
  
--- Diff: src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java ---
@@ -716,7 +716,12 @@ public void process(WatchedEvent event) {
 // Convert WatchedEvent to a type that can be sent over the wire
 WatcherEvent e = event.getWrapper();
 
-sendResponse(h, e, "notification");
+try {
+sendResponse(h, e, "notification");
+} catch (IOException ex) {
+LOG.debug("Problem sending to " + getRemoteSocketAddress(), ex);
--- End diff --

We're using LOG.debug, so it shouldn't be an issue on prod.




[GitHub] zookeeper pull request #99: ZOOKEEPER-2549 Add exception handling to sendRes...

2016-12-01 Thread lvfangmin
Github user lvfangmin commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/99#discussion_r90550930
  
--- Diff: src/java/main/org/apache/zookeeper/server/NettyServerCnxn.java ---
@@ -165,31 +163,35 @@ public void process(WatchedEvent event) {
 @Override
 public void sendResponse(ReplyHeader h, Record r, String tag)
 throws IOException {
-if (!channel.isOpen()) {
-return;
-}
-ByteArrayOutputStream baos = new ByteArrayOutputStream();
-// Make space for length
-BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
 try {
-baos.write(fourBytes);
-bos.writeRecord(h, "header");
-if (r != null) {
-bos.writeRecord(r, tag);
+if (!channel.isOpen()) {
+return;
 }
-baos.close();
-} catch (IOException e) {
-LOG.error("Error serializing response");
-}
-byte b[] = baos.toByteArray();
-ByteBuffer bb = ByteBuffer.wrap(b);
-bb.putInt(b.length - 4).rewind();
-sendBuffer(bb);
-if (h.getXid() > 0) {
-// zks cannot be null otherwise we would not have gotten here!
-if (!zkServer.shouldThrottle(outstandingCount.decrementAndGet())) {
-enableRecv();
+ByteArrayOutputStream baos = new ByteArrayOutputStream();
+// Make space for length
+BinaryOutputArchive bos = BinaryOutputArchive.getArchive(baos);
+try {
+baos.write(fourBytes);
+bos.writeRecord(h, "header");
+if (r != null) {
+bos.writeRecord(r, tag);
+}
+baos.close();
+} catch (IOException e) {
--- End diff --

This IOException is swallowed as well; should we re-throw it?




[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713018#comment-15713018
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
@breed I think I commented by starting a review but never submitted it; I
haven't used this GitHub feature until now. I just submitted my comments, so
they should appear now.


> Data inconsistency if all snapshots empty or missing
> 
>
> Key: ZOOKEEPER-2325
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2325
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Andrew Grasso
>Assignee: Andrew Grasso
>Priority: Critical
> Attachments: ZOOKEEPER-2325-test.patch, ZOOKEEPER-2325.001.patch, 
> zk.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When loading state from snapshots on startup, FileTxnSnapLog.java ignores the 
> result of FileSnap.deserialize, which is -1L if no valid snapshots are found. 
> Recovery proceeds with dt.lastProcessed == 0, its initial value.
> The result is that Zookeeper will process the transaction logs and then begin 
> serving requests with a different state than the rest of the ensemble.
> To reproduce:
> In a healthy zookeeper cluster of size >= 3, shut down one node.
> Either delete all snapshots for this node or change all to be empty files.
> Restart the node.
> We believe this can happen organically if a node runs out of disk space.





[GitHub] zookeeper issue #117: ZOOKEEPER-2325: Data inconsistency if all snapshots em...

2016-12-01 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
@breed I think I commented by starting a review but never submitted it; I
haven't used this GitHub feature until now. I just submitted my comments, so
they should appear now.




[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713016#comment-15713016
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90516950
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
+if (txnLog.getLastLoggedZxid() != -1) {
+throw new IOException(
+"No snapshot found, but there are log entries. " +
+"Something is broken!");
+}
+/* TODO: (br33d) we should either put a ConcurrentHashMap on restore()
+ *   or use Map on save() */
+save(dt, (ConcurrentHashMap)sessions);
--- End diff --

I think we need it here because if we get here then the zxid of this
server must be -1, so it would not win leader election if at least one other
server is sane (with a valid snapshot/txn log to recover from). This server
will then become a follower and sync the (non-empty) snapshot from the leader.
If all servers have empty snapshots, then this save is also required to
bootstrap the recovery process.
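
The branch being discussed can be summarised as a small decision function. The sketch below is only an illustration of the logic visible in the diff above; RestoreSketch, Action, and decide are hypothetical names, and it assumes (as the diff does) that -1 signals both "no snapshot found" and "no entries in the transaction log".

```java
import java.io.IOException;

/** Sketch of the restore decision; all names here are hypothetical. */
final class RestoreSketch {
    static final long NOT_FOUND = -1L;

    enum Action { REPLAY_LOG, SAVE_EMPTY_SNAPSHOT }

    /**
     * @param deserializeResult zxid of the loaded snapshot, or -1 if none was found
     * @param lastLoggedZxid    highest zxid in the transaction log, or -1 if it is empty
     */
    static Action decide(long deserializeResult, long lastLoggedZxid) throws IOException {
        if (deserializeResult != NOT_FOUND) {
            // Normal path: a snapshot was loaded, so replay newer transactions on top of it.
            return Action.REPLAY_LOG;
        }
        if (lastLoggedZxid != NOT_FOUND) {
            // Log entries without any snapshot: refuse to start rather than diverge.
            throw new IOException(
                    "No snapshot found, but there are log entries. Something is broken!");
        }
        // Fresh or fully emptied server: write an empty snapshot to bootstrap recovery.
        return Action.SAVE_EMPTY_SNAPSHOT;
    }
}
```

Keeping the decision separate from the I/O makes the three cases (snapshot present, snapshot missing with log entries, both missing) easy to test in isolation.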




[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread hanm
Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90516233
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transactiong log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements Watcher {
+private static final Logger LOG = Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
--- End diff --

Test coverage improvement suggestion: here we don't cover the case where
both the transaction log files and the snapshot files are missing (either
deleted or empty), in which case the ZK server should recover without
problems. Something like this should work:
`runTest(boolean leaveEmptySnapshotFile, boolean leaveEmptyTxnLogFile)`.
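
One way the corruption step of such a two-flag test could look is sketched below. This is an assumption-laden illustration rather than code from the pull request: CorruptFilesSketch and its methods are hypothetical, and the sketch takes the snapshot and transaction-log files as plain paths instead of obtaining them from FileTxnSnapLog.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Hypothetical test helper; not part of the pull request. */
final class CorruptFilesSketch {

    /** Either truncates each file to zero bytes or deletes it outright. */
    static void emptyOrDelete(List<Path> files, boolean leaveEmptyFile) throws IOException {
        for (Path file : files) {
            if (leaveEmptyFile) {
                // Default open options create the file if needed and truncate it.
                Files.write(file, new byte[0]);
            } else {
                Files.deleteIfExists(file);
            }
        }
    }

    /** Applies the chosen corruption mode to snapshots and transaction logs independently. */
    static void corrupt(List<Path> snapshots, List<Path> txnLogs,
                        boolean leaveEmptySnapshotFile,
                        boolean leaveEmptyTxnLogFile) throws IOException {
        emptyOrDelete(snapshots, leaveEmptySnapshotFile);
        emptyOrDelete(txnLogs, leaveEmptyTxnLogFile);
    }
}
```

A test could then call corrupt(...) with each of the four flag combinations before restarting the server and checking the resulting zxid.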




[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread hanm
Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90516950
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
+if (txnLog.getLastLoggedZxid() != -1) {
+throw new IOException(
+"No snapshot found, but there are log entries. " +
+"Something is broken!");
+}
+/* TODO: (br33d) we should either put a ConcurrentHashMap on restore()
+ *   or use Map on save() */
+save(dt, (ConcurrentHashMap)sessions);
--- End diff --

I think we need it here because if we get here then the zxid of this
server must be -1, so it would not win leader election if at least one other
server is sane (with a valid snapshot/txn log to recover from). This server
will then become a follower and sync the (non-empty) snapshot from the leader.
If all servers have empty snapshots, then this save is also required to
bootstrap the recovery process.




[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15713015#comment-15713015
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user hanm commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90516233
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transactiong log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements Watcher {
+private static final Logger LOG = Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
--- End diff --

Test coverage improvement suggestion: here we don't cover the case where
both the transaction log files and the snapshot files are missing (either
deleted or empty), in which case the ZK server should recover without
problems. Something like this should work:
`runTest(boolean leaveEmptySnapshotFile, boolean leaveEmptyTxnLogFile)`.




[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712961#comment-15712961
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
@hanm i can't seem to find your comment :)




[GitHub] zookeeper issue #117: ZOOKEEPER-2325: Data inconsistency if all snapshots em...

2016-12-01 Thread breed
Github user breed commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
@hanm i can't seem to find your comment :)




Re: committing doc changes

2016-12-01 Thread Michael Han
Running forrest check only takes a few seconds, so it seems worthwhile to add
it to the QA target to have some sanity checks on doc changes.

On Thu, Dec 1, 2016 at 3:20 AM, Flavio Junqueira  wrote:

> We currently do it for the trunk build:
>
> 
>
> but not for pull request or patch QA:
>
> 
>
> "forrest.check" only checks if the forrest.home variable is defined.
>
> Is that enough that we run it as part of the trunk build?
>
> -Flavio
>
> > On 01 Dec 2016, at 01:04, Benjamin Reed  wrote:
> >
> > we could also build the doc as part of the tests.
> >
> > On Wed, Nov 30, 2016 at 3:26 PM, Flavio Junqueira wrote:
> >> As part of the release process, we only copy the documentation, see it here:
> >>
> >> https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease
> >>
> >> I think the reason we have gone this way is to avoid issues compiling
> >> the documentation at the time that we are preparing a release candidate or
> >> after voting on a release candidate. We could for sure build the
> >> documentation right before generating the first rc for a release and create
> >> blocker jiras in the case there is any issue.
> >>
> >> -Flavio
> >>
> >>> On 30 Nov 2016, at 23:12, Benjamin Reed  wrote:
> >>>
> >>> yeah, that's a deeper question. pat or flavio can correct me on this,
> >>> but i think the reason we check it in is so that the website's "trunk"
> >>> documentation will work. now that we moved to git, i don't think it
> >>> works though... i also would just like to only build it when we do
> >>> releases.
> >>>
> >>> On Wed, Nov 30, 2016 at 2:24 PM, Jordan Zimmerman
> >>>  wrote:
> >>>> I wondered about that myself. Why bother building the docs? Isn’t
> >>>> that only needed for packaging/deployment? It ends up making PRs ugly
> >>>> because you have all the unnecessary docs in the diff.
> >>>>
> >>>> -Jordan
> >>>>
> >>>>> On Nov 30, 2016, at 11:23 PM, Benjamin Reed wrote:
> >>>>>
> >>>>> when we commit pull requests with doc changes, i think we should
> >>>>> commit the generated doc as a separate commit. what do you all think?
> >>>>> i would like to do that to keep the change from the contributors
> >>>>> pristine :) and i think it simplifies things a bit.
> >>>>>
> >>>>> ben
> >>>>
> >>
>
>


-- 
Cheers
Michael.


[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712905#comment-15712905
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user breed commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90523812
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transactiong log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements Watcher {
+private static final Logger LOG = Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
+File tmpSnapDir = ClientBase.createTmpDir();
+File tmpLogDir  = ClientBase.createTmpDir();
+ClientBase.setupTestEnv();
+ZooKeeperServer zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 3000);
+SyncRequestProcessor.setSnapCount(SNAP_COUNT);
+final int PORT = Integer.parseInt(HOSTPORT.split(":")[1]);
+ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
+f.startup(zks);
+Assert.assertTrue("waiting for server being up ",
+ClientBase.waitForServerUp(HOSTPORT,CONNECTION_TIMEOUT));
+ZooKeeper zk = new ZooKeeper(HOSTPORT, CONNECTION_TIMEOUT, this);
+try {
+for (int i = 0; i< N_TRANSACTIONS; i++) {
+zk.create("/node-" + i, new byte[0], Ids.OPEN_ACL_UNSAFE,
+CreateMode.PERSISTENT);
+}
+} finally {
+zk.close();
+}
+f.shutdown();
+zks.shutdown();
+Assert.assertTrue("waiting for server to shutdown",
+ClientBase.waitForServerDown(HOSTPORT, CONNECTION_TIMEOUT));
+
+// start server again with intact database
+zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 3000);
+zks.startdata();
+long zxid = zks.getZKDatabase().getDataTreeLastProcessedZxid();
+LOG.info("After clean restart, zxid = " + zxid);
+Assert.assertTrue("zxid > 0", zxid > 0);
+zks.shutdown();
+
+// Make all snapshots empty
+FileTxnSnapLog txnLogFactory = zks.getTxnLogFactory();
+List snapshots = txnLogFactory.findNRecentSnapshots(10);
+Assert.assertTrue("We have a snapshot to corrupt", snapshots.size() > 0);
+for (File file: snapshots) {
+if 

[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread breed
Github user breed commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90523812
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transaction log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements  
Watcher {
+private static final Logger LOG = 
Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + 
PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
+File tmpSnapDir = ClientBase.createTmpDir();
+File tmpLogDir  = ClientBase.createTmpDir();
+ClientBase.setupTestEnv();
+ZooKeeperServer zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 
3000);
+SyncRequestProcessor.setSnapCount(SNAP_COUNT);
+final int PORT = Integer.parseInt(HOSTPORT.split(":")[1]);
+ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
+f.startup(zks);
+Assert.assertTrue("waiting for server being up ",
+ClientBase.waitForServerUp(HOSTPORT,CONNECTION_TIMEOUT));
+ZooKeeper zk = new ZooKeeper(HOSTPORT, CONNECTION_TIMEOUT, this);
+try {
+for (int i = 0; i< N_TRANSACTIONS; i++) {
+zk.create("/node-" + i, new byte[0], Ids.OPEN_ACL_UNSAFE,
+CreateMode.PERSISTENT);
+}
+} finally {
+zk.close();
+}
+f.shutdown();
+zks.shutdown();
+Assert.assertTrue("waiting for server to shutdown",
+ClientBase.waitForServerDown(HOSTPORT, 
CONNECTION_TIMEOUT));
+
+// start server again with intact database
+zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 3000);
+zks.startdata();
+long zxid = zks.getZKDatabase().getDataTreeLastProcessedZxid();
+LOG.info("After clean restart, zxid = " + zxid);
+Assert.assertTrue("zxid > 0", zxid > 0);
+zks.shutdown();
+
+// Make all snapshots empty
+FileTxnSnapLog txnLogFactory = zks.getTxnLogFactory();
+List snapshots = txnLogFactory.findNRecentSnapshots(10);
+Assert.assertTrue("We have a snapshot to corrupt", 
snapshots.size() > 0);
+for (File file: snapshots) {
+if (leaveEmptyFile) {
+new PrintWriter(file).close ();
+}
+else {
--- End diff --

yes that is ugly!



[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712900#comment-15712900
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user breed commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90523518
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
+if (txnLog.getLastLoggedZxid() != -1) {
+throw new IOException(
+"No snapshot found, but there are log entries. " +
+"Something is broken!");
+}
+/* TODO: (br33d) we should either put a ConcurrentHashMap on 
restore()
+ *   or use Map on save() */
+save(dt, (ConcurrentHashMap)sessions);
--- End diff --

Yes, ZOOKEEPER-261 is still a problem. Is that what you are referring to?
Brian has a patch coming that builds on this one to fix that problem.


> Data inconsistency if all snapshots empty or missing
> 
>
> Key: ZOOKEEPER-2325
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2325
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Andrew Grasso
>Assignee: Andrew Grasso
>Priority: Critical
> Attachments: ZOOKEEPER-2325-test.patch, ZOOKEEPER-2325.001.patch, 
> zk.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When loading state from snapshots on startup, FileTxnSnapLog.java ignores the 
> result of FileSnap.deserialize, which is -1L if no valid snapshots are found. 
> Recovery proceeds with dt.lastProcessed == 0, its initial value.
> The result is that Zookeeper will process the transaction logs and then begin 
> serving requests with a different state than the rest of the ensemble.
> To reproduce:
> In a healthy zookeeper cluster of size >= 3, shut down one node.
> Either delete all snapshots for this node or change all to be empty files.
> Restart the node.
> We believe this can happen organically if a node runs out of disk space.
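
The failure mode described above can be sketched as a small guard. This is only an illustration under stated assumptions, not the project's code: the class name, method name, and the choice to return 0 for a brand-new server are hypothetical, and only the -1L sentinel and the "snapshot missing but log entries present" condition come from the issue text and the patch under review.

```java
import java.io.IOException;

/** Hypothetical stand-in for the restore() guard discussed in ZOOKEEPER-2325. */
final class RestoreGuardSketch {
    /** Sentinel meaning "nothing found", mirroring the -1L mentioned in the issue. */
    static final long NONE = -1L;

    /**
     * Decides where recovery may start.
     *
     * @param snapshotZxid   zxid restored from a snapshot, or NONE if no valid snapshot exists
     * @param lastLoggedZxid highest zxid in the transaction log, or NONE if the log is empty
     * @return the zxid that recovery starts from
     * @throws IOException if log entries exist but no snapshot could be loaded
     */
    static long startingZxid(long snapshotZxid, long lastLoggedZxid) throws IOException {
        if (snapshotZxid == NONE) {
            if (lastLoggedZxid != NONE) {
                // Replaying logs onto an empty tree could diverge from the ensemble.
                throw new IOException("No snapshot found, but there are log entries.");
            }
            // Assumption: a server with no snapshot and no log starts from an empty database.
            return 0L;
        }
        return snapshotZxid;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(startingZxid(100L, 150L));
        System.out.println(startingZxid(NONE, NONE));
        try {
            startingZxid(NONE, 150L);
        } catch (IOException e) {
            System.out.println("refused: " + e.getMessage());
        }
    }
}
```

The point of the sketch is that the sentinel is checked before any log replay, so a node with emptied or deleted snapshots fails fast instead of serving a divergent state.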



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread breed
Github user breed commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90523518
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
+if (txnLog.getLastLoggedZxid() != -1) {
+throw new IOException(
+"No snapshot found, but there are log entries. " +
+"Something is broken!");
+}
+/* TODO: (br33d) we should either put a ConcurrentHashMap on 
restore()
+ *   or use Map on save() */
+save(dt, (ConcurrentHashMap)sessions);
--- End diff --

Yes, ZOOKEEPER-261 is still a problem. Is that what you are referring to?
Brian has a patch coming that builds on this one to fix that problem.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


ZooKeeper-trunk-openjdk7 - Build # 1261 - Still Failing

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/1261/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on H17 (ubuntu) in workspace 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
Resetting working tree
 > git reset --hard # timeout=10
 > git clean -fdx # timeout=10
Fetching upstream changes from git://git.apache.org/zookeeper.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > git://git.apache.org/zookeeper.git +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/master^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10
Checking out Revision d72f27279a13986ee0c011e1e5b34edf3a310da9 
(refs/remotes/origin/master)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f d72f27279a13986ee0c011e1e5b34edf3a310da9
 > git rev-list d72f27279a13986ee0c011e1e5b34edf3a310da9 # timeout=10
No emails were triggered.
[ZooKeeper-trunk-openjdk7] $ /home/jenkins/tools/ant/latest/bin/ant 
-Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml 
-Djavac.target=1.7 clean test-core-java
Error: JAVA_HOME is not defined correctly.
  We cannot execute /usr/lib/jvm/java-7-openjdk-amd64//bin/java
Build step 'Invoke Ant' marked build as failure
Recording test results
ERROR: Step 'Publish JUnit test result report' failed: No test report files
were found. Configuration error?
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712832#comment-15712832
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
+1, just one comment on test coverage.


> Data inconsistency if all snapshots empty or missing
> 
>
> Key: ZOOKEEPER-2325
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2325
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Andrew Grasso
>Assignee: Andrew Grasso
>Priority: Critical
> Attachments: ZOOKEEPER-2325-test.patch, ZOOKEEPER-2325.001.patch, 
> zk.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When loading state from snapshots on startup, FileTxnSnapLog.java ignores the 
> result of FileSnap.deserialize, which is -1L if no valid snapshots are found. 
> Recovery proceeds with dt.lastProcessed == 0, its initial value.
> The result is that Zookeeper will process the transaction logs and then begin 
> serving requests with a different state than the rest of the ensemble.
> To reproduce:
> In a healthy zookeeper cluster of size >= 3, shut down one node.
> Either delete all snapshots for this node or change all to be empty files.
> Restart the node.
> We believe this can happen organically if a node runs out of disk space.





[GitHub] zookeeper issue #117: ZOOKEEPER-2325: Data inconsistency if all snapshots em...

2016-12-01 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/117
  
+1, just one comment on test coverage.




[jira] [Commented] (ZOOKEEPER-2517) jute.maxbuffer is ignored

2016-12-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712611#comment-15712611
 ] 

Hadoop QA commented on ZOOKEEPER-2517:
--

+1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12829061/ZOOKEEPER-2517-04.patch
  against trunk revision d72f27279a13986ee0c011e1e5b34edf3a310da9.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 5 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

+1 core tests.  The patch passed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//console

This message is automatically generated.

> jute.maxbuffer is ignored
> -
>
> Key: ZOOKEEPER-2517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2517
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.2
>Reporter: Benjamin Jaton
>Assignee: Arshad Mohammad
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2517-01.patch, ZOOKEEPER-2517-02.patch, 
> ZOOKEEPER-2517-03.patch, ZOOKEEPER-2517-04.patch, ZOOKEEPER-2517.patch
>
>
> In ClientCnxnSocket.java the parsing of the system property is erroneous:
> {code}packetLen = Integer.getInteger(
>   clientConfig.getProperty(ZKConfig.JUTE_MAXBUFFER),
>   ZKClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT
> );{code}
> Javadoc of Integer.getInteger states "The first argument is treated as the 
> name of a system property", whereas here the value of the property is passed.
> Instead I believe the author meant to write something like:
> {code}packetLen = Integer.parseInt(
>   clientConfig.getProperty(
> ZKConfig.JUTE_MAXBUFFER,
> String.valueOf(ZKClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT)
>   )
> );{code}
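
To make the difference between the two calls concrete, here is a minimal, self-contained sketch. The class name, the literal key string, and the default value are placeholders chosen for illustration (they are not taken from ZKConfig or ZKClientConfig); only the two JDK calls, Integer.getInteger and Integer.parseInt, are the ones named in the description.

```java
import java.util.Properties;

/** Hypothetical demo of the two parsing patterns quoted in ZOOKEEPER-2517. */
final class MaxBufferParsingSketch {
    static final String KEY = "jute.maxbuffer";
    /** Placeholder default, not the real ZooKeeper constant. */
    static final int DEFAULT_LEN = 4096 * 1024;

    /** Reported pattern: the configured VALUE is treated as a system property NAME. */
    static int viaGetInteger(Properties config) {
        return Integer.getInteger(config.getProperty(KEY), DEFAULT_LEN);
    }

    /** Suggested pattern: parse the configured value itself, with a string default. */
    static int viaParseInt(Properties config) {
        return Integer.parseInt(config.getProperty(KEY, String.valueOf(DEFAULT_LEN)));
    }

    public static void main(String[] args) {
        Properties config = new Properties();
        config.setProperty(KEY, "1048576");
        // Unless a system property literally named "1048576" exists, this falls back to the default.
        System.out.println("getInteger: " + viaGetInteger(config));
        // This honours the configured value.
        System.out.println("parseInt:   " + viaParseInt(config));
    }
}
```

With the first pattern the configured limit is silently ignored, which matches the issue title; the second pattern returns the configured value and only falls back to the default when the key is absent.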





Success: ZOOKEEPER-2517 PreCommit Build #3537

2016-12-01 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2517
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 485598 lines...]
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 5 new or modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/3537//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 47709ba83111de3f9570a3347b1341f0ccb47fbc logged out
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/patchprocess' are the same file

BUILD SUCCESSFUL
Total time: 22 minutes 10 seconds
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2517
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Success
Sending email for trigger: Success
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-2517) jute.maxbuffer is ignored

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712555#comment-15712555
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2517:
---

Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/113
  
+1, looks good. We should definitely get this in before cutting RC build 
for 3.5.3.


> jute.maxbuffer is ignored
> -
>
> Key: ZOOKEEPER-2517
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2517
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.2
>Reporter: Benjamin Jaton
>Assignee: Arshad Mohammad
>Priority: Blocker
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2517-01.patch, ZOOKEEPER-2517-02.patch, 
> ZOOKEEPER-2517-03.patch, ZOOKEEPER-2517-04.patch, ZOOKEEPER-2517.patch
>
>
> In ClientCnxnSocket.java the parsing of the system property is erroneous:
> {code}packetLen = Integer.getInteger(
>   clientConfig.getProperty(ZKConfig.JUTE_MAXBUFFER),
>   ZKClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT
> );{code}
> Javadoc of Integer.getInteger states "The first argument is treated as the 
> name of a system property", whereas here the value of the property is passed.
> Instead I believe the author meant to write something like:
> {code}packetLen = Integer.parseInt(
>   clientConfig.getProperty(
> ZKConfig.JUTE_MAXBUFFER,
> String.valueOf(ZKClientConfig.CLIENT_MAX_PACKET_LENGTH_DEFAULT)
>   )
> );{code}





[GitHub] zookeeper issue #113: ZOOKEEPER-2517:jute.maxbuffer is ignored

2016-12-01 Thread hanm
Github user hanm commented on the issue:

https://github.com/apache/zookeeper/pull/113
  
+1, looks good. We should definitely get this in before cutting RC build 
for 3.5.3.




ZooKeeper_branch35_solaris - Build # 336 - Still Failing

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_solaris/336/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 460700 lines...]
[junit] 2016-12-01 17:17:12,266 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-12-01 17:17:12,266 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-12-01 17:17:12,266 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-12-01 17:17:12,267 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-12-01 17:17:12,267 [myid:] - INFO  [main:ZooKeeperServer@893] 
- minSessionTimeout set to 6000
[junit] 2016-12-01 17:17:12,267 [myid:] - INFO  [main:ZooKeeperServer@902] 
- maxSessionTimeout set to 6
[junit] 2016-12-01 17:17:12,268 [myid:] - INFO  [main:ZooKeeperServer@159] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test559103919706770117.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test559103919706770117.junit.dir/version-2
[junit] 2016-12-01 17:17:12,268 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test559103919706770117.junit.dir/version-2/snapshot.b
[junit] 2016-12-01 17:17:12,270 [myid:] - INFO  [main:FileTxnSnapLog@306] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch35_solaris/build/test/tmp/test559103919706770117.junit.dir/version-2/snapshot.b
[junit] 2016-12-01 17:17:12,271 [myid:] - ERROR [main:ZooKeeperServer@505] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-12-01 17:17:12,272 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-12-01 17:17:12,272 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:42199
[junit] 2016-12-01 17:17:12,273 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:42199
[junit] 2016-12-01 17:17:12,273 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-12-01 17:17:12,273 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:42199 (no session established for client)
[junit] 2016-12-01 17:17:12,273 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-12-01 17:17:12,275 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-12-01 17:17:12,275 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-12-01 17:17:12,275 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-12-01 17:17:12,275 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-12-01 17:17:12,275 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17745
[junit] 2016-12-01 17:17:12,276 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-12-01 17:17:12,276 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-12-01 17:17:12,276 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  [main:ZooKeeper@1311] - 
Session: 0x125385df588 closed
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x125385df588
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-12-01 17:17:12,352 [myid:] - INFO  
[ConnnectionExpirer:NIOServerCnxnFactory$ConnectionExpirerThread@583] - 

[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread rgs1
Github user rgs1 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90491384
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transaction log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements  
Watcher {
+private static final Logger LOG = 
Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + 
PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
+File tmpSnapDir = ClientBase.createTmpDir();
+File tmpLogDir  = ClientBase.createTmpDir();
+ClientBase.setupTestEnv();
+ZooKeeperServer zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 
3000);
+SyncRequestProcessor.setSnapCount(SNAP_COUNT);
+final int PORT = Integer.parseInt(HOSTPORT.split(":")[1]);
+ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
+f.startup(zks);
+Assert.assertTrue("waiting for server being up ",
+ClientBase.waitForServerUp(HOSTPORT,CONNECTION_TIMEOUT));
+ZooKeeper zk = new ZooKeeper(HOSTPORT, CONNECTION_TIMEOUT, this);
+try {
+for (int i = 0; i< N_TRANSACTIONS; i++) {
+zk.create("/node-" + i, new byte[0], Ids.OPEN_ACL_UNSAFE,
+CreateMode.PERSISTENT);
+}
+} finally {
+zk.close();
+}
+f.shutdown();
+zks.shutdown();
+Assert.assertTrue("waiting for server to shutdown",
+ClientBase.waitForServerDown(HOSTPORT, 
CONNECTION_TIMEOUT));
+
+// start server again with intact database
+zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 3000);
+zks.startdata();
+long zxid = zks.getZKDatabase().getDataTreeLastProcessedZxid();
+LOG.info("After clean restart, zxid = " + zxid);
+Assert.assertTrue("zxid > 0", zxid > 0);
+zks.shutdown();
+
+// Make all snapshots empty
+FileTxnSnapLog txnLogFactory = zks.getTxnLogFactory();
+List snapshots = txnLogFactory.findNRecentSnapshots(10);
+Assert.assertTrue("We have a snapshot to corrupt", 
snapshots.size() > 0);
+for (File file: snapshots) {
+if (leaveEmptyFile) {
+new PrintWriter(file).close ();
+}
+else {
--- End diff --

nit: coding style `} else {`
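
For illustration, a rough sketch of the loop with that brace style. The quoted diff is cut off at the else, so the delete branch here is an assumption inferred from the test's "emptied/deleted" wording, and the class and method names are hypothetical rather than part of the patch.

```java
import java.io.File;
import java.io.IOException;
import java.io.PrintWriter;
import java.nio.file.Files;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper showing the "} else {" style on the snapshot loop. */
final class SnapshotCorruptionSketch {
    /**
     * Empties or deletes every file in the list.
     * The delete branch is an assumption; the quoted diff ends before it.
     */
    static void corrupt(List<File> snapshots, boolean leaveEmptyFile) throws IOException {
        for (File file : snapshots) {
            if (leaveEmptyFile) {
                // Opening the file for writing truncates it to zero bytes.
                new PrintWriter(file).close();
            } else {
                if (!file.delete()) {
                    throw new IOException("could not delete " + file);
                }
            }
        }
    }

    public static void main(String[] args) throws IOException {
        File snapshot = File.createTempFile("snapshot", ".tmp");
        Files.write(snapshot.toPath(), new byte[] {1, 2, 3});

        corrupt(Collections.singletonList(snapshot), true);
        System.out.println("length after emptying: " + snapshot.length());

        corrupt(Collections.singletonList(snapshot), false);
        System.out.println("exists after deleting: " + snapshot.exists());
    }
}
```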



[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712490#comment-15712490
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user rgs1 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90491384
  
--- Diff: 
src/java/test/org/apache/zookeeper/test/EmptiedSnapshotRecoveryTest.java ---
@@ -0,0 +1,134 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.test;
+
+import java.io.IOException;
+import java.io.File;
+import java.io.PrintWriter;
+import java.util.List;
+import java.util.LinkedList;
+
+import org.apache.log4j.Logger;
+import org.apache.zookeeper.CreateMode;
+import org.apache.zookeeper.PortAssignment;
+import org.apache.zookeeper.WatchedEvent;
+import org.apache.zookeeper.Watcher;
+import org.apache.zookeeper.ZKTestCase;
+import org.apache.zookeeper.ZooKeeper;
+import org.apache.zookeeper.ZooDefs.Ids;
+import org.apache.zookeeper.server.quorum.Leader.Proposal;
+import org.apache.zookeeper.server.ServerCnxnFactory;
+import org.apache.zookeeper.server.SyncRequestProcessor;
+import org.apache.zookeeper.server.ZooKeeperServer;
+import org.apache.zookeeper.server.persistence.FileTxnSnapLog;
+import org.junit.Assert;
+import org.junit.Test;
+
+/** If snapshots are corrupted to the empty file or deleted, Zookeeper should
+ *  not proceed to read its transaction log files
+ *  Test that zxid == -1 in the presence of emptied/deleted snapshots
+ */
+public class EmptiedSnapshotRecoveryTest extends ZKTestCase implements  
Watcher {
+private static final Logger LOG = 
Logger.getLogger(RestoreCommittedLogTest.class);
+private static String HOSTPORT = "127.0.0.1:" + 
PortAssignment.unique();
+private static final int CONNECTION_TIMEOUT = 3000;
+private static final int N_TRANSACTIONS = 150;
+private static final int SNAP_COUNT = 100;
+
+public void runTest(boolean leaveEmptyFile) throws Exception {
+File tmpSnapDir = ClientBase.createTmpDir();
+File tmpLogDir  = ClientBase.createTmpDir();
+ClientBase.setupTestEnv();
+ZooKeeperServer zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 
3000);
+SyncRequestProcessor.setSnapCount(SNAP_COUNT);
+final int PORT = Integer.parseInt(HOSTPORT.split(":")[1]);
+ServerCnxnFactory f = ServerCnxnFactory.createFactory(PORT, -1);
+f.startup(zks);
+Assert.assertTrue("waiting for server being up ",
+ClientBase.waitForServerUp(HOSTPORT,CONNECTION_TIMEOUT));
+ZooKeeper zk = new ZooKeeper(HOSTPORT, CONNECTION_TIMEOUT, this);
+try {
+for (int i = 0; i< N_TRANSACTIONS; i++) {
+zk.create("/node-" + i, new byte[0], Ids.OPEN_ACL_UNSAFE,
+CreateMode.PERSISTENT);
+}
+} finally {
+zk.close();
+}
+f.shutdown();
+zks.shutdown();
+Assert.assertTrue("waiting for server to shutdown",
+ClientBase.waitForServerDown(HOSTPORT, 
CONNECTION_TIMEOUT));
+
+// start server again with intact database
+zks = new ZooKeeperServer(tmpSnapDir, tmpLogDir, 3000);
+zks.startdata();
+long zxid = zks.getZKDatabase().getDataTreeLastProcessedZxid();
+LOG.info("After clean restart, zxid = " + zxid);
+Assert.assertTrue("zxid > 0", zxid > 0);
+zks.shutdown();
+
+// Make all snapshots empty
+FileTxnSnapLog txnLogFactory = zks.getTxnLogFactory();
+List<File> snapshots = txnLogFactory.findNRecentSnapshots(10);
+Assert.assertTrue("We have a snapshot to corrupt", 
snapshots.size() > 0);
+for (File file: snapshots) {
+if 

[jira] [Commented] (ZOOKEEPER-2325) Data inconsistency if all snapshots empty or missing

2016-12-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15712489#comment-15712489
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2325:
---

Github user rgs1 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90490925
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
--- End diff --

nit: can you add a reference to ZOOKEEPER-2325 here?


> Data inconsistency if all snapshots empty or missing
> 
>
> Key: ZOOKEEPER-2325
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2325
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.6
>Reporter: Andrew Grasso
>Assignee: Andrew Grasso
>Priority: Critical
> Attachments: ZOOKEEPER-2325-test.patch, ZOOKEEPER-2325.001.patch, 
> zk.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> When loading state from snapshots on startup, FileTxnSnapLog.java ignores the 
> result of FileSnap.deserialize, which is -1L if no valid snapshots are found. 
> Recovery proceeds with dt.lastProcessed == 0, its initial value.
> The result is that Zookeeper will process the transaction logs and then begin 
> serving requests with a different state than the rest of the ensemble.
> To reproduce:
> In a healthy zookeeper cluster of size >= 3, shut down one node.
> Either delete all snapshots for this node or change all to be empty files.
> Restart the node.
> We believe this can happen organically if a node runs out of disk space.
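
The failure mode described above can be illustrated with a small, self-contained toy model. This is only a sketch under stated assumptions, not the actual patch: the class name, the restore signature, and the choice to throw an exception are all hypothetical. The only detail taken from the report is that -1L is the "no valid snapshot" result and that recovery should not go on to replay the transaction logs in that case.

```java
import java.util.List;

/** Toy model of snapshot-then-log recovery. All names here are hypothetical. */
class SnapshotRestoreSketch {
    /** Sentinel for "no valid snapshot found", mirroring the -1L in the report. */
    static final long NO_SNAPSHOT = -1L;

    /**
     * @param snapshotZxid zxid of the newest valid snapshot, or NO_SNAPSHOT
     * @param loggedZxids  zxids present in the transaction log
     * @return the last processed zxid after recovery
     */
    static long restore(long snapshotZxid, List<Long> loggedZxids) {
        if (snapshotZxid == NO_SNAPSHOT) {
            if (!loggedZxids.isEmpty()) {
                // Assumed policy: replaying logs without a base snapshot could
                // yield a state that differs from the ensemble, so refuse.
                throw new IllegalStateException(
                        "no valid snapshot found, but the transaction log is not empty");
            }
            // Nothing on disk at all: treat as a brand-new, empty database.
            return 0L;
        }
        // Normal path: start from the snapshot and apply newer log entries.
        long lastProcessed = snapshotZxid;
        for (long zxid : loggedZxids) {
            if (zxid > lastProcessed) {
                lastProcessed = zxid;
            }
        }
        return lastProcessed;
    }
}
```

In this model, restore(100L, [101, 102]) advances to 102, restore(-1L, []) starts an empty database at 0, and restore(-1L, [1]) fails fast instead of silently replaying the log on top of an empty tree.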





[GitHub] zookeeper pull request #117: ZOOKEEPER-2325: Data inconsistency if all snaps...

2016-12-01 Thread rgs1
Github user rgs1 commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/117#discussion_r90490925
  
--- Diff: 
src/java/main/org/apache/zookeeper/server/persistence/FileTxnSnapLog.java ---
@@ -165,8 +165,22 @@ public File getSnapDir() {
  */
 public long restore(DataTree dt, Map sessions,
 PlayBackListener listener) throws IOException {
-snapLog.deserialize(dt, sessions);
+long deserializeResult = snapLog.deserialize(dt, sessions);
 FileTxnLog txnLog = new FileTxnLog(dataDir);
+if (-1L == deserializeResult) {
+/* this means that we couldn't find any snapshot, so we need to
+ * initialize an empty database */
--- End diff --

nit: can you add a reference to ZOOKEEPER-2325 here?




ZooKeeper_branch34_openjdk7 - Build # 1292 - Failure

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/1292/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on H15 (ubuntu) in workspace 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7
Cloning the remote Git repository
Cloning repository git://git.apache.org/zookeeper.git
 > git init /home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7 # 
 > timeout=10
Fetching upstream changes from git://git.apache.org/zookeeper.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > git://git.apache.org/zookeeper.git +refs/heads/*:refs/remotes/origin/*
 > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10
 > git config --add remote.origin.fetch +refs/heads/*:refs/remotes/origin/* # 
 > timeout=10
 > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10
Cleaning workspace
 > git rev-parse --verify HEAD # timeout=10
No valid HEAD. Skipping the resetting
 > git clean -fdx # timeout=10
Fetching upstream changes from git://git.apache.org/zookeeper.git
 > git -c core.askpass=true fetch --tags --progress 
 > git://git.apache.org/zookeeper.git +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/branch-3.4^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/branch-3.4^{commit} # timeout=10
Checking out Revision 967c3a71bd8eaf1ac29b2702173115976874bd8e 
(refs/remotes/origin/branch-3.4)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 967c3a71bd8eaf1ac29b2702173115976874bd8e
 > git rev-list 967c3a71bd8eaf1ac29b2702173115976874bd8e # timeout=10
No emails were triggered.
[ZooKeeper_branch34_openjdk7] $ /home/jenkins/tools/ant/latest/bin/ant 
-Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml 
-Djavac.target=1.7 clean test-core-java
Error: JAVA_HOME is not defined correctly.
  We cannot execute /usr/lib/jvm/java-7-openjdk-amd64//bin/java
Build step 'Invoke Ant' marked build as failure
Recording test results
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

ZooKeeper-trunk-jdk8 - Build # 841 - Failure

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-jdk8/841/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 487660 lines...]
[junit] 2016-12-01 12:07:42,229 [myid:127.0.0.1:19376] - INFO  
[main-SendThread(127.0.0.1:19376):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:19376. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 12:07:42,229 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:/127.0.0.1:19376:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:41820
[junit] 2016-12-01 12:07:42,229 [myid:127.0.0.1:19376] - INFO  
[main-SendThread(127.0.0.1:19376):ClientCnxn$SendThread@948] - Socket 
connection established, initiating session, client: /127.0.0.1:41820, server: 
127.0.0.1/127.0.0.1:19376
[junit] 2016-12-01 12:07:42,230 [myid:] - WARN  
[NIOWorkerThread-3:NIOServerCnxn@369] - Exception causing close of session 0x0: 
ZooKeeperServer not running
[junit] 2016-12-01 12:07:42,230 [myid:] - INFO  
[NIOWorkerThread-3:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:41820 (no session established for client)
[junit] 2016-12-01 12:07:42,230 [myid:127.0.0.1:19376] - INFO  
[main-SendThread(127.0.0.1:19376):ClientCnxn$SendThread@1231] - Unable to read 
additional data from server sessionid 0x0, likely server has closed socket, 
closing socket connection and attempting reconnect
[junit] 2016-12-01 12:07:42,733 [myid:127.0.0.1:19349] - INFO  
[main-SendThread(127.0.0.1:19349):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:19349. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 12:07:42,733 [myid:127.0.0.1:19349] - WARN  
[main-SendThread(127.0.0.1:19349):ClientCnxn$SendThread@1235] - Session 
0x1014df33121 for server 127.0.0.1/127.0.0.1:19349, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-12-01 12:07:43,073 [myid:127.0.0.1:19355] - INFO  
[main-SendThread(127.0.0.1:19355):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:19355. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 12:07:43,074 [myid:127.0.0.1:19355] - WARN  
[main-SendThread(127.0.0.1:19355):ClientCnxn$SendThread@1235] - Session 
0x3014df32ff9 for server 127.0.0.1/127.0.0.1:19355, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-12-01 12:07:43,326 [myid:127.0.0.1:19352] - INFO  
[main-SendThread(127.0.0.1:19352):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:19352. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 12:07:43,326 [myid:127.0.0.1:19352] - WARN  
[main-SendThread(127.0.0.1:19352):ClientCnxn$SendThread@1235] - Session 
0x2014df3322f for server 127.0.0.1/127.0.0.1:19352, unexpected error, 
closing socket connection and attempting reconnect
[junit] java.net.ConnectException: Connection refused
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
[junit] at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:357)
[junit] at 
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1214)
[junit] 2016-12-01 12:07:43,744 [myid:127.0.0.1:19376] - INFO  
[main-SendThread(127.0.0.1:19376):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:19376. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 12:07:43,744 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:/127.0.0.1:19376:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:41880
[junit] 2016-12-01 12:07:43,744 [myid:127.0.0.1:19376] - INFO  
[main-SendThread(127.0.0.1:19376):ClientCnxn$SendThread@948] - Socket 

Re: committing doc changes

2016-12-01 Thread Flavio Junqueira
We currently do it for the trunk build:



but not for pull request or patch QA:



"forrest.check" only checks if the forrest.home variable is defined.

Is it enough that we run it as part of the trunk build?

-Flavio 

> On 01 Dec 2016, at 01:04, Benjamin Reed  wrote:
> 
> we could also build the doc as part of the tests.
> 
> On Wed, Nov 30, 2016 at 3:26 PM, Flavio Junqueira  wrote:
>> As part of the release process, we only copy the documentation, see it here:
>> 
>> https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToRelease 
>> 
>> 
>> I think the reason we have gone this way is to avoid issues compiling the 
>> documentation at the time that we are preparing a release candidate or after 
>> voting on a release candidate. We could for sure build the documentation 
>> right before generating the first rc for a release and create blocker jiras 
>> in the case there is any issue.
>> 
>> -Flavio
>> 
>>> On 30 Nov 2016, at 23:12, Benjamin Reed  wrote:
>>> 
>>> yeah, that's a deeper question. pat or flavio can correct me on this,
>>> but i think the reason we check it in is so that the website's "trunk"
>>> documentation will work. now that we moved to git, i don't think it
>>> works though... i also would just like to only build it when we do
>>> releases.
>>> 
>>> On Wed, Nov 30, 2016 at 2:24 PM, Jordan Zimmerman
>>>  wrote:
>>>> I wondered about that myself. Why bother building the docs? Isn’t that 
>>>> only needed for packaging/deployment? It ends up making PRs ugly because 
>>>> you have all the unnecessary docs in the diff.
>>>> 
>>>> -Jordan
>>>> 
>>>>> On Nov 30, 2016, at 11:23 PM, Benjamin Reed  wrote:
>>>>> 
>>>>> when we commit pull requests with doc changes, i think we should
>>>>> commit the generated doc as a separate commit. what do you all think?
>>>>> i would like to do that to keep the change from the contributors
>>>>> pristine :) and i think it simplifies things a bit.
>>>>> 
>>>>> ben
>>>> 
>> 



ZooKeeper_branch35_openjdk7 - Build # 316 - Still Failing

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_openjdk7/316/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by timer
[EnvInject] - Loading node environment variables.
Building remotely on H16 (ubuntu) in workspace 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch35_openjdk7
 > git rev-parse --is-inside-work-tree # timeout=10
Fetching changes from the remote Git repository
 > git config remote.origin.url git://git.apache.org/zookeeper.git # timeout=10
Fetching upstream changes from git://git.apache.org/zookeeper.git
 > git --version # timeout=10
 > git -c core.askpass=true fetch --tags --progress 
 > git://git.apache.org/zookeeper.git +refs/heads/*:refs/remotes/origin/*
 > git rev-parse refs/remotes/origin/branch-3.5^{commit} # timeout=10
 > git rev-parse refs/remotes/origin/origin/branch-3.5^{commit} # timeout=10
Checking out Revision 8f2a869c2efa91a9687c43360abd28da1ba1314e 
(refs/remotes/origin/branch-3.5)
 > git config core.sparsecheckout # timeout=10
 > git checkout -f 8f2a869c2efa91a9687c43360abd28da1ba1314e
 > git rev-list 8f2a869c2efa91a9687c43360abd28da1ba1314e # timeout=10
No emails were triggered.
[ZooKeeper_branch35_openjdk7] $ /home/jenkins/tools/ant/latest/bin/ant 
-Dtest.output=yes -Dtest.junit.threads=8 -Dtest.junit.output.format=xml 
-Djavac.target=1.7 clean test-core-java
Error: JAVA_HOME is not defined correctly.
  We cannot execute /usr/lib/jvm/java-7-openjdk-amd64//bin/java
Build step 'Invoke Ant' marked build as failure
Recording test results
ERROR: Step 'Publish JUnit test result report' failed: No test report files 
were found. Configuration error?
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
No tests ran.

ZooKeeper_branch35_jdk7 - Build # 751 - Failure

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_jdk7/751/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 482563 lines...]
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-12-01 08:51:34,563 [myid:] - WARN  [New I/O boss 
#9966:ClientCnxnSocketNetty$ZKClientHandler@439] - Exception caught: [id: 
0x1b81232c] EXCEPTION: java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14036
[junit] java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14036
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-12-01 08:51:34,564 [myid:] - INFO  [New I/O boss 
#9966:ClientCnxnSocketNetty@208] - channel is told closing
[junit] 2016-12-01 08:51:34,564 [myid:127.0.0.1:14036] - INFO  
[main-SendThread(127.0.0.1:14036):ClientCnxn$SendThread@1231] - channel for 
sessionid 0x100d11ba5f6 is lost, closing socket connection and attempting 
reconnect
[junit] 2016-12-01 08:51:34,840 [myid:127.0.0.1:14039] - INFO  
[main-SendThread(127.0.0.1:14039):ClientCnxn$SendThread@1113] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:14039. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2016-12-01 08:51:34,841 [myid:] - INFO  [New I/O boss 
#:ClientCnxnSocketNetty$1@127] - future isn't success, cause: {}
[junit] java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14039
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)
[junit] at 
org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
[junit] at 
org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
[junit] at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
[junit] at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
[junit] at java.lang.Thread.run(Thread.java:745)
[junit] 2016-12-01 08:51:34,841 [myid:] - WARN  [New I/O boss 
#:ClientCnxnSocketNetty$ZKClientHandler@439] - Exception caught: [id: 
0x489ef2c4] EXCEPTION: java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14039
[junit] java.net.ConnectException: Connection refused: 
127.0.0.1/127.0.0.1:14039
[junit] at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
[junit] at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.connect(NioClientBoss.java:152)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.processSelectedKeys(NioClientBoss.java:105)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.process(NioClientBoss.java:79)
[junit] at 
org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:337)
[junit] at 
org.jboss.netty.channel.socket.nio.NioClientBoss.run(NioClientBoss.java:42)

ZooKeeper-trunk-solaris - Build # 1406 - Still Failing

2016-12-01 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/1406/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 471625 lines...]
[junit] 2016-12-01 08:49:20,667 [myid:] - INFO  [main:ClientBase@386] - 
CREATING server instance 127.0.0.1:11222
[junit] 2016-12-01 08:49:20,667 [myid:] - INFO  
[main:NIOServerCnxnFactory@673] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2016-12-01 08:49:20,668 [myid:] - INFO  
[main:NIOServerCnxnFactory@686] - binding to port 0.0.0.0/0.0.0.0:11222
[junit] 2016-12-01 08:49:20,669 [myid:] - INFO  [main:ClientBase@361] - 
STARTING server instance 127.0.0.1:11222
[junit] 2016-12-01 08:49:20,669 [myid:] - INFO  [main:ZooKeeperServer@894] 
- minSessionTimeout set to 6000
[junit] 2016-12-01 08:49:20,669 [myid:] - INFO  [main:ZooKeeperServer@903] 
- maxSessionTimeout set to 6
[junit] 2016-12-01 08:49:20,669 [myid:] - INFO  [main:ZooKeeperServer@160] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test7292794833737315629.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test7292794833737315629.junit.dir/version-2
[junit] 2016-12-01 08:49:20,670 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test7292794833737315629.junit.dir/version-2/snapshot.b
[junit] 2016-12-01 08:49:20,672 [myid:] - INFO  [main:FileTxnSnapLog@306] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/build/test/tmp/test7292794833737315629.junit.dir/version-2/snapshot.b
[junit] 2016-12-01 08:49:20,674 [myid:] - ERROR [main:ZooKeeperServer@506] 
- ZKShutdownHandler is not registered, so ZooKeeper server won't take any 
action on ERROR or SHUTDOWN server state changes
[junit] 2016-12-01 08:49:20,674 [myid:] - INFO  
[main:FourLetterWordMain@85] - connecting to 127.0.0.1 11222
[junit] 2016-12-01 08:49:20,674 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:48505
[junit] 2016-12-01 08:49:20,675 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@485] - Processing stat command from 
/127.0.0.1:48505
[junit] 2016-12-01 08:49:20,675 [myid:] - INFO  
[NIOWorkerThread-1:StatCommand@49] - Stat command output
[junit] 2016-12-01 08:49:20,676 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@607] - Closed socket connection for client 
/127.0.0.1:48505 (no session established for client)
[junit] 2016-12-01 08:49:20,681 [myid:] - INFO  [main:JMXEnv@228] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2016-12-01 08:49:20,682 [myid:] - INFO  [main:JMXEnv@245] - 
expect:InMemoryDataTree
[junit] 2016-12-01 08:49:20,682 [myid:] - INFO  [main:JMXEnv@249] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2016-12-01 08:49:20,682 [myid:] - INFO  [main:JMXEnv@245] - 
expect:StandaloneServer_port
[junit] 2016-12-01 08:49:20,683 [myid:] - INFO  [main:JMXEnv@249] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2016-12-01 08:49:20,683 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@82] - Memory used 17820
[junit] 2016-12-01 08:49:20,683 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@87] - Number of threads 24
[junit] 2016-12-01 08:49:20,683 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@102] - FINISHED TEST METHOD 
testQuota
[junit] 2016-12-01 08:49:20,683 [myid:] - INFO  [main:ClientBase@543] - 
tearDown starting
[junit] 2016-12-01 08:49:20,742 [myid:] - INFO  [main:ZooKeeper@1313] - 
Session: 0x125368cff4c closed
[junit] 2016-12-01 08:49:20,742 [myid:] - INFO  [main:ClientBase@513] - 
STOPPING server
[junit] 2016-12-01 08:49:20,742 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down for 
session: 0x125368cff4c
[junit] 2016-12-01 08:49:20,743 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread exitted run method
[junit] 2016-12-01 08:49:20,743 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@420] 
- selector thread exitted run method
[junit] 2016-12-01 08:49:20,743 [myid:] - INFO