[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2017-12-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306483#comment-16306483
 ] 

Hadoop QA commented on ZOOKEEPER-1621:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//console

This message is automatically generated.

> ZooKeeper does not recover from crash when disk was full
> 
>
> Key: ZOOKEEPER-1621
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.3
> Environment: Ubuntu 12.04, Amazon EC2 instance
>Reporter: David Arthur
>Assignee: Michi Mutsuzaki
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-1621.2.patch, ZOOKEEPER-1621.patch, 
> zookeeper.log.gz
>
>
> The disk that ZooKeeper was using filled up. During a snapshot write, I got 
> the following exception
> 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
> Severe unrecoverable error, exiting
> java.io.IOException: No space left on device
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:282)
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
> at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
> at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
> at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
> Then many subsequent exceptions like:
> 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
> partial.
> 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
> exception, exiting abnormally
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.(FileTxnLog.java:504)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
> at 
> org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
> at 
> org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
> at 
> org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
> at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
> at 
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
> at 
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
> at 
> 

Failed: ZOOKEEPER- PreCommit Build #1392

2017-12-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 78.91 MB...]
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 3.0.1) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1392//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 08fad879f3b64947d84a45928f5731150566b0a1 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] mv: 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 and 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 are the same file

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1722:
 exec returned: 1

Total time: 19 minutes 0 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-1621
Putting comment on the pull request
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig

Error Message:
waiting for server 3 being up

Stack Trace:
junit.framework.AssertionFailedError: waiting for server 3 being up
at 
org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig(ReconfigRecoveryTest.java:224)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)

[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306470#comment-16306470
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1621:
---

GitHub user abhishekrai opened a pull request:

https://github.com/apache/zookeeper/pull/439

ZOOKEEPER-1621: Delete and skip txn log with incomplete header

Based on the patch by Michi Mutsuzaki.

When Zookeeper server encounters a txn log with incomplete header,
the old behavior was to crash due to the resulting EOFException.
The new behavior is catch the exception and skip the txn log.

Additionally, the txn log is deleted to ensure that it does not
influence future loads/PurgeTxnLog in believing that this is
the only txn log before the following snapshot that they need to
load/retain.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/abhishekrai/zookeeper ZOOKEEPER-1621

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #439


commit 6b457a069ccdb01e1ee77537b02db80f3005f5b1
Author: Abhishek Rai 
Date:   2017-12-29T17:38:52Z

ZOOKEEPER-1621: Delete and skip txn log with incomplete header

Based on the patch by Michi Mutsuzaki.

When Zookeeper server encounters a txn log with incomplete header,
the old behavior was to crash due to the resulting EOFException.
The new behavior is catch the exception and skip the txn log.

Additionally, the txn log is deleted to ensure that it does not
influence future loads/PurgeTxnLog in believing that this is
the only txn log before the following snapshot that they need to
load/retain.




> ZooKeeper does not recover from crash when disk was full
> 
>
> Key: ZOOKEEPER-1621
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.4.3
> Environment: Ubuntu 12.04, Amazon EC2 instance
>Reporter: David Arthur
>Assignee: Michi Mutsuzaki
> Fix For: 3.5.4, 3.6.0
>
> Attachments: ZOOKEEPER-1621.2.patch, ZOOKEEPER-1621.patch, 
> zookeeper.log.gz
>
>
> The disk that ZooKeeper was using filled up. During a snapshot write, I got 
> the following exception
> 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
> Severe unrecoverable error, exiting
> java.io.IOException: No space left on device
> at java.io.FileOutputStream.writeBytes(Native Method)
> at java.io.FileOutputStream.write(FileOutputStream.java:282)
> at 
> java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
> at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
> at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
> at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
> at 
> org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
> at 
> org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
> Then many subsequent exceptions like:
> 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
> partial.
> 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
> exception, exiting abnormally
> java.io.EOFException
> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> at 
> org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> at 
> org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.(FileTxnLog.java:504)
> at 
> org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
> at 
> 

[GitHub] zookeeper pull request #439: ZOOKEEPER-1621: Delete and skip txn log with in...

2017-12-29 Thread abhishekrai
GitHub user abhishekrai opened a pull request:

https://github.com/apache/zookeeper/pull/439

ZOOKEEPER-1621: Delete and skip txn log with incomplete header

Based on the patch by Michi Mutsuzaki.

When Zookeeper server encounters a txn log with incomplete header,
the old behavior was to crash due to the resulting EOFException.
The new behavior is catch the exception and skip the txn log.

Additionally, the txn log is deleted to ensure that it does not
influence future loads/PurgeTxnLog in believing that this is
the only txn log before the following snapshot that they need to
load/retain.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/abhishekrai/zookeeper ZOOKEEPER-1621

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/439.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #439


commit 6b457a069ccdb01e1ee77537b02db80f3005f5b1
Author: Abhishek Rai 
Date:   2017-12-29T17:38:52Z

ZOOKEEPER-1621: Delete and skip txn log with incomplete header

Based on the patch by Michi Mutsuzaki.

When Zookeeper server encounters a txn log with incomplete header,
the old behavior was to crash due to the resulting EOFException.
The new behavior is catch the exception and skip the txn log.

Additionally, the txn log is deleted to ensure that it does not
influence future loads/PurgeTxnLog in believing that this is
the only txn log before the following snapshot that they need to
load/retain.




---


[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306451#comment-16306451
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2901:
---

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r159091495
  
--- Diff: src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml ---
@@ -949,14 +949,15 @@ server.3=zoo3:2888:3888
   
 
   
-ttlNodesEnabled
+zookeeper.extendedTypesEnabled
--- End diff --

I only thought that we might be able to use the setting for other things in 
the future. But, I'm OK either way. Let me know.


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] zookeeper pull request #377: [ZOOKEEPER-2901] TTL Nodes don't work with Serv...

2017-12-29 Thread Randgalt
Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r159091495
  
--- Diff: src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml ---
@@ -949,14 +949,15 @@ server.3=zoo3:2888:3888
   
 
   
-ttlNodesEnabled
+zookeeper.extendedTypesEnabled
--- End diff --

I only thought that we might be able to use the setting for other things in 
the future. But, I'm OK either way. Let me know.


---


[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306450#comment-16306450
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2901:
---

Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r159091284
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -476,9 +474,12 @@ public ZooKeeperServerListener 
getZooKeeperServerListener() {
 return listener;
 }
 
+// Visible for testing
+static volatile int serverId = 1;
--- End diff --

@phunt I'm not sure what you mean about quorum peer. I could find another 
way to set this value. Also, it's only for testing and clearly marked. What do 
you suggest?


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] zookeeper pull request #377: [ZOOKEEPER-2901] TTL Nodes don't work with Serv...

2017-12-29 Thread Randgalt
Github user Randgalt commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r159091284
  
--- Diff: src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java ---
@@ -476,9 +474,12 @@ public ZooKeeperServerListener 
getZooKeeperServerListener() {
 return listener;
 }
 
+// Visible for testing
+static volatile int serverId = 1;
--- End diff --

@phunt I'm not sure what you mean about quorum peer. I could find another 
way to set this value. Also, it's only for testing and clearly marked. What do 
you suggest?


---


Failed: ZOOKEEPER- PreCommit Build #1391

2017-12-29 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 78.61 MB...]
 [exec] +0 tests included.  The patch appears to be a documentation 
patch that doesn't require tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 3.0.1) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 97ab23202f713250cf36a5ddd3a7d6f87839b26b logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] mv: 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 and 
'/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess'
 are the same file

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/build.xml:1722:
 exec returned: 1

Total time: 17 minutes 52 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2959
Putting comment on the pull request
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  org.apache.zookeeper.test.AsyncHammerTest.testHammer

Error Message:
null

Stack Trace:
junit.framework.AssertionFailedError
at 
org.apache.zookeeper.test.AsyncHammerTest.testHammer(AsyncHammerTest.java:185)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79)

[jira] [Commented] (ZOOKEEPER-2959) ignore accepted epoch and LEADERINFO ack from observers when a newly elected leader computes new epoch

2017-12-29 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306154#comment-16306154
 ] 

Hadoop QA commented on ZOOKEEPER-2959:
--

-1 overall.  GitHub Pull Request  Build
  

+1 @author.  The patch does not contain any @author tags.

+0 tests included.  The patch appears to be a documentation patch that 
doesn't require tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1391//console

This message is automatically generated.

> ignore accepted epoch and LEADERINFO ack from observers when a newly elected 
> leader computes new epoch
> --
>
> Key: ZOOKEEPER-2959
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.10, 3.5.3
>Reporter: xiangyq000
>
> Once the ZooKeeper cluster finishes the election for new leader, all learners 
> report their accepted epoch to the leader for the computation of new cluster 
> epoch.
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose
> {code:java}
> private final HashSet connectingFollowers = new HashSet();
> public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
> InterruptedException, IOException {
> synchronized(connectingFollowers) {
> if (!waitingForNewEpoch) {
> return epoch;
> }
> if (lastAcceptedEpoch >= epoch) {
> epoch = lastAcceptedEpoch+1;
> }
> connectingFollowers.add(sid);
> QuorumVerifier verifier = self.getQuorumVerifier();
> if (connectingFollowers.contains(self.getId()) &&
> 
> verifier.containsQuorum(connectingFollowers)) {
> waitingForNewEpoch = false;
> self.setAcceptedEpoch(epoch);
> connectingFollowers.notifyAll();
> } else {
> long start = Time.currentElapsedTime();
> long cur = start;
> long end = start + self.getInitLimit()*self.getTickTime();
> while(waitingForNewEpoch && cur < end) {
> connectingFollowers.wait(end - cur);
> cur = Time.currentElapsedTime();
> }
> if (waitingForNewEpoch) {
> throw new InterruptedException("Timeout while waiting for 
> epoch from quorum");
> }
> }
> return epoch;
> }
> }
> {code}
> The computation will get an outcome once :
> # The leader has call method "getEpochToPropose"
> # The number of all reporters is greater than half of participants.
> The problem is, an observer server will also send its accepted epoch to the 
> leader, while this procedure treat observers as participants.
> Supposed that the cluster consists of 1 leader, 2 followers and 1 observer, 
> and now the leader and the observer have reported their accepted epochs while 
> neither of the followers has. Thus, the connectingFollowers set consists of 
> two elements, resulting in a size of 2, which is greater than half quorum, 
> namely, 2. Then QuorumVerifier#containsQuorum will return true, because it 
> does not check whether the elements of the parameter are participants.
> The same flaw exists in 
> org.apache.zookeeper.server.quorum.Leader#waitForEpochAck



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (ZOOKEEPER-2959) ignore accepted epoch and LEADERINFO ack from observers when a newly elected leader computes new epoch

2017-12-29 Thread xiangyq000 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangyq000 updated ZOOKEEPER-2959:
--
Summary: ignore accepted epoch and LEADERINFO ack from observers when a 
newly elected leader computes new epoch  (was: ignore epoch proposal and ack 
from observers when a newly elected leader computes new epoch)

> ignore accepted epoch and LEADERINFO ack from observers when a newly elected 
> leader computes new epoch
> --
>
> Key: ZOOKEEPER-2959
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.10, 3.5.3
>Reporter: xiangyq000
>
> Once the ZooKeeper cluster finishes the election for new leader, all learners 
> report their accepted epoch to the leader for the computation of new cluster 
> epoch.
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose
> {code:java}
> private final HashSet connectingFollowers = new HashSet();
> public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
> InterruptedException, IOException {
> synchronized(connectingFollowers) {
> if (!waitingForNewEpoch) {
> return epoch;
> }
> if (lastAcceptedEpoch >= epoch) {
> epoch = lastAcceptedEpoch+1;
> }
> connectingFollowers.add(sid);
> QuorumVerifier verifier = self.getQuorumVerifier();
> if (connectingFollowers.contains(self.getId()) &&
> 
> verifier.containsQuorum(connectingFollowers)) {
> waitingForNewEpoch = false;
> self.setAcceptedEpoch(epoch);
> connectingFollowers.notifyAll();
> } else {
> long start = Time.currentElapsedTime();
> long cur = start;
> long end = start + self.getInitLimit()*self.getTickTime();
> while(waitingForNewEpoch && cur < end) {
> connectingFollowers.wait(end - cur);
> cur = Time.currentElapsedTime();
> }
> if (waitingForNewEpoch) {
> throw new InterruptedException("Timeout while waiting for 
> epoch from quorum");
> }
> }
> return epoch;
> }
> }
> {code}
> The computation will get an outcome once :
> # The leader has call method "getEpochToPropose"
> # The number of all reporters is greater than half of participants.
> The problem is, an observer server will also send its accepted epoch to the 
> leader, while this procedure treat observers as participants.
> Supposed that the cluster consists of 1 leader, 2 followers and 1 observer, 
> and now the leader and the observer have reported their accepted epochs while 
> neither of the followers has. Thus, the connectingFollowers set consists of 
> two elements, resulting in a size of 2, which is greater than half quorum, 
> namely, 2. Then QuorumVerifier#containsQuorum will return true, because it 
> does not check whether the elements of the parameter are participants.
> The same flaw exists in 
> org.apache.zookeeper.server.quorum.Leader#waitForEpochAck



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ZOOKEEPER-2959) ignore epoch proposal and ack from observers when a newly elected leader computes new epoch

2017-12-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306144#comment-16306144
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2959:
---

GitHub user xyq000 opened a pull request:

https://github.com/apache/zookeeper/pull/438

ZOOKEEPER-2959: ignore accepted epoch and ack from observers

https://issues.apache.org/jira/browse/ZOOKEEPER-2959
After a round of elections completes, followers and observers send their 
accepted epochs to the leader to determine a final epoch.
Since `QuorumVerifier#containsQuorum(Set set)` does not check whether the 
elements of argument `set` exactly represent participants, this pull request is 
intended to ignore reported epochs and acks from observers for logical 
consistency.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xyq000/zookeeper ZOOKEEPER-2959

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #438


commit 647061aa7ba1182b83b44b7f2671508012a30b4c
Author: Yongqiang Xiang 
Date:   2017-12-29T08:20:06Z

ignore accepted epoch and ack from observers




> ignore epoch proposal and ack from observers when a newly elected leader 
> computes new epoch
> ---
>
> Key: ZOOKEEPER-2959
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2959
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.4.10, 3.5.3
>Reporter: xiangyq000
>
> Once the ZooKeeper cluster finishes the election for new leader, all learners 
> report their accepted epoch to the leader for the computation of new cluster 
> epoch.
> org.apache.zookeeper.server.quorum.Leader#getEpochToPropose
> {code:java}
> private final HashSet connectingFollowers = new HashSet();
> public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
> InterruptedException, IOException {
> synchronized(connectingFollowers) {
> if (!waitingForNewEpoch) {
> return epoch;
> }
> if (lastAcceptedEpoch >= epoch) {
> epoch = lastAcceptedEpoch+1;
> }
> connectingFollowers.add(sid);
> QuorumVerifier verifier = self.getQuorumVerifier();
> if (connectingFollowers.contains(self.getId()) &&
> 
> verifier.containsQuorum(connectingFollowers)) {
> waitingForNewEpoch = false;
> self.setAcceptedEpoch(epoch);
> connectingFollowers.notifyAll();
> } else {
> long start = Time.currentElapsedTime();
> long cur = start;
> long end = start + self.getInitLimit()*self.getTickTime();
> while(waitingForNewEpoch && cur < end) {
> connectingFollowers.wait(end - cur);
> cur = Time.currentElapsedTime();
> }
> if (waitingForNewEpoch) {
> throw new InterruptedException("Timeout while waiting for 
> epoch from quorum");
> }
> }
> return epoch;
> }
> }
> {code}
> The computation will get an outcome once :
> # The leader has call method "getEpochToPropose"
> # The number of all reporters is greater than half of participants.
> The problem is, an observer server will also send its accepted epoch to the 
> leader, while this procedure treat observers as participants.
> Supposed that the cluster consists of 1 leader, 2 followers and 1 observer, 
> and now the leader and the observer have reported their accepted epochs while 
> neither of the followers has. Thus, the connectingFollowers set consists of 
> two elements, resulting in a size of 2, which is greater than half quorum, 
> namely, 2. Then QuorumVerifier#containsQuorum will return true, because it 
> does not check whether the elements of the parameter are participants.
> The same flaw exists in 
> org.apache.zookeeper.server.quorum.Leader#waitForEpochAck



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] zookeeper pull request #438: ZOOKEEPER-2959: ignore accepted epoch and ack f...

2017-12-29 Thread xyq000
GitHub user xyq000 opened a pull request:

https://github.com/apache/zookeeper/pull/438

ZOOKEEPER-2959: ignore accepted epoch and ack from observers

https://issues.apache.org/jira/browse/ZOOKEEPER-2959
After a round of elections completes, followers and observers send their 
accepted epochs to the leader to determine a final epoch.
Since `QuorumVerifier#containsQuorum(Set set)` does not check whether the 
elements of argument `set` exactly represent participants, this pull request is 
intended to ignore reported epochs and acks from observers for logical 
consistency.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xyq000/zookeeper ZOOKEEPER-2959

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/438.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #438


commit 647061aa7ba1182b83b44b7f2671508012a30b4c
Author: Yongqiang Xiang 
Date:   2017-12-29T08:20:06Z

ignore accepted epoch and ack from observers




---


[jira] [Updated] (ZOOKEEPER-2959) ignore epoch proposal and ack from observers when a newly elected leader computes new epoch

2017-12-29 Thread xiangyq000 (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiangyq000 updated ZOOKEEPER-2959:
--
Description: 
Once the ZooKeeper cluster finishes the election for new leader, all learners 
report their accepted epoch to the leader for the computation of new cluster 
epoch.

org.apache.zookeeper.server.quorum.Leader#getEpochToPropose
{code:java}
private final HashSet connectingFollowers = new HashSet();
public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
InterruptedException, IOException {
synchronized(connectingFollowers) {
if (!waitingForNewEpoch) {
return epoch;
}
if (lastAcceptedEpoch >= epoch) {
epoch = lastAcceptedEpoch+1;
}
connectingFollowers.add(sid);
QuorumVerifier verifier = self.getQuorumVerifier();
if (connectingFollowers.contains(self.getId()) &&

verifier.containsQuorum(connectingFollowers)) {
waitingForNewEpoch = false;
self.setAcceptedEpoch(epoch);
connectingFollowers.notifyAll();
} else {
long start = Time.currentElapsedTime();
long cur = start;
long end = start + self.getInitLimit()*self.getTickTime();
while(waitingForNewEpoch && cur < end) {
connectingFollowers.wait(end - cur);
cur = Time.currentElapsedTime();
}
if (waitingForNewEpoch) {
throw new InterruptedException("Timeout while waiting for 
epoch from quorum");
}
}
return epoch;
}
}
{code}

The computation will get an outcome once :
# The leader has call method "getEpochToPropose"
# The number of all reporters is greater than half of participants.

The problem is, an observer server will also send its accepted epoch to the 
leader, while this procedure treat observers as participants.

Supposed that the cluster consists of 1 leader, 2 followers and 1 observer, and 
now the leader and the observer have reported their accepted epochs while 
neither of the followers has. Thus, the connectingFollowers set consists of two 
elements, resulting in a size of 2, which is greater than half quorum, namely, 
2. Then QuorumVerifier#containsQuorum will return true, because it does not 
check whether the elements of the parameter are participants.

The same flaw exists in 
org.apache.zookeeper.server.quorum.Leader#waitForEpochAck

  was:
Once the ZooKeeper cluster finishes the election for new leader, all learners 
report their accepted epoch to the leader for the computation of new cluster 
epoch.

org.apache.zookeeper.server.quorum.Leader#getEpochToPropose
{code:java}
private final HashSet connectingFollowers = new HashSet();
public long getEpochToPropose(long sid, long lastAcceptedEpoch) throws 
InterruptedException, IOException {
synchronized(connectingFollowers) {
if (!waitingForNewEpoch) {
return epoch;
}
if (lastAcceptedEpoch >= epoch) {
epoch = lastAcceptedEpoch+1;
}
connectingFollowers.add(sid);
QuorumVerifier verifier = self.getQuorumVerifier();
if (connectingFollowers.contains(self.getId()) &&

verifier.containsQuorum(connectingFollowers)) {
waitingForNewEpoch = false;
self.setAcceptedEpoch(epoch);
connectingFollowers.notifyAll();
} else {
long start = Time.currentElapsedTime();
long cur = start;
long end = start + self.getInitLimit()*self.getTickTime();
while(waitingForNewEpoch && cur < end) {
connectingFollowers.wait(end - cur);
cur = Time.currentElapsedTime();
}
if (waitingForNewEpoch) {
throw new InterruptedException("Timeout while waiting for 
epoch from quorum");
}
}
return epoch;
}
}
{code}

The computation will get an outcome once :
# The leader has call method "getEpochToPropose"
# The number of all reporters is greater than half of participants.

The problem is, an observer server will also send its accepted epoch to the 
leader, while this procedure treat observers as participants.

Supposed that the cluster consists of 1 leader, 2 followers and 1 observer, and 
now the leader and the observer have reported their accepted epochs while 
neither of the followers has. Thus, the connectingFollowers set consists of two 
elements, resulting in a size of 2, which is greater than half quorum, namely, 
2. Then