ZooKeeper_branch34_jdk7 - Build # 1843 - Failure

2018-03-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_jdk7/1843/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 38.48 KB...]
[junit] Running org.apache.zookeeper.test.RecoveryTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.366 sec
[junit] Running org.apache.zookeeper.test.RepeatStartupTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.236 sec
[junit] Running org.apache.zookeeper.test.RestoreCommittedLogTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.089 sec
[junit] Running org.apache.zookeeper.test.SaslAuthDesignatedClientTest
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.887 sec
[junit] Running org.apache.zookeeper.test.SaslAuthDesignatedServerTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.711 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailDesignatedClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.53 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailNotifyTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.608 sec
[junit] Running org.apache.zookeeper.test.SaslAuthFailTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.718 sec
[junit] Running org.apache.zookeeper.test.SaslAuthMissingClientConfigTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.631 sec
[junit] Running org.apache.zookeeper.test.SaslClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.078 sec
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.717 sec
[junit] Running org.apache.zookeeper.test.SessionTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.984 sec
[junit] Running org.apache.zookeeper.test.StandaloneTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.869 sec
[junit] Running org.apache.zookeeper.test.StatTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.971 sec
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.327 sec
[junit] Running org.apache.zookeeper.test.SyncCallTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.662 sec
[junit] Running org.apache.zookeeper.test.TruncateTest
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 10.471 sec
[junit] Running org.apache.zookeeper.test.UpgradeTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.38 sec
[junit] Running org.apache.zookeeper.test.WatchedEventTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.118 sec
[junit] Running org.apache.zookeeper.test.WatcherFuncTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.579 sec
[junit] Running org.apache.zookeeper.test.WatcherTest
[junit] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.505 sec
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.369 sec
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.758 sec

fail.build.on.test.failure:

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk7/build.xml:1382: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_jdk7/build.xml:1385: Tests failed!

Total time: 57 minutes 53 seconds
Build step 'Invoke Ant' marked build as failure
Archiving artifacts
Recording test results
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any



###
## FAILED TESTS (if any) 
##
3 tests failed.
FAILED:  org.apache.zookeeper.server.PurgeTxnTest.testPurgeWhenLogRollingInProgress

Error Message:
ZkClient ops is not finished!

Stack Trace:
junit.framework.AssertionFailedError: ZkClient ops is not finished!
at org.apache.zookeeper.server.PurgeTxnTest.manyClientOps(PurgeTxnTest.java:591)
at org.apache.zookeeper.server.PurgeTxnTest.testPurgeWhenLogRollingInProgress(PurgeTxnTest.java:152)
at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:55)


FAILED:  

Success: ZOOKEEPER- PreCommit Build #1528

2018-03-06 Thread Apache Jenkins Server
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1528/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 39.82 MB...]
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs (version 3.0.1) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the total number of release audit warnings.
 [exec] 
 [exec] +1 core tests.  The patch passed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1528//testReport/
 [exec] Findbugs warnings: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1528//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/1528//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Adding comment to Jira.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Unable to log in to server: https://issues.apache.org/jira/rpc/soap/jirasoapservice-v2 with user: hadoopqa.
 [exec]  Cause: ; nested exception is: 
 [exec] javax.net.ssl.SSLException: Received fatal alert: protocol_version
 [exec] 
 [exec] 
 [exec] ==
 [exec] ==
 [exec] Finished build.
 [exec] ==
 [exec] ==
 [exec] 
 [exec] 
 [exec] Unable to log in to server: https://issues.apache.org/jira/rpc/soap/jirasoapservice-v2 with user: hadoopqa.
 [exec]  Cause: ; nested exception is: 
 [exec] javax.net.ssl.SSLException: Received fatal alert: protocol_version
 [exec] mv: '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' and '/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-github-pr-build/patchprocess' are the same file

BUILD SUCCESSFUL
Total time: 36 minutes 32 seconds
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
[description-setter] Description set: ZOOKEEPER-2962
Putting comment on the pull request
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Success
Sending email for trigger: Success
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-2962) The function queueEmpty() in FastLeaderElection.Messenger is not used, should be removed.

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388872#comment-16388872
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2962:
---

GitHub user asutosh936 opened a pull request:

https://github.com/apache/zookeeper/pull/482

ZOOKEEPER-2962 - Removed Unused method.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/asutosh936/zookeeper branch-3.4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/482.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #482


commit c33de596c0df5bb050663a449c00ef08c2621e2c
Author: asutosh936 
Date:   2018-03-07T01:49:29Z

ZOOKEEPER-2962 - Removed Unused method.




> The function queueEmpty() in FastLeaderElection.Messenger is not used, should 
> be removed.
> -
>
> Key: ZOOKEEPER-2962
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2962
> Project: ZooKeeper
>  Issue Type: Improvement
>  Components: leaderElection
>Affects Versions: 3.4.11
>Reporter: Jiafu Jiang
>Priority: Minor
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper pull request #482: ZOOKEEPER-2962 - Removed Unused method.

2018-03-06 Thread asutosh936
GitHub user asutosh936 opened a pull request:

https://github.com/apache/zookeeper/pull/482

ZOOKEEPER-2962 - Removed Unused method.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/asutosh936/zookeeper branch-3.4

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/482.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #482


commit c33de596c0df5bb050663a449c00ef08c2621e2c
Author: asutosh936 
Date:   2018-03-07T01:49:29Z

ZOOKEEPER-2962 - Removed Unused method.




---


ZooKeeper-trunk - Build # 3758 - Still Failing

2018-03-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/3758/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 256.84 KB...]
 [exec] Log Message Received: [2018-03-06 23:29:21,120:29505(0x2b79e5988f40):ZOO_INFO@log_env@1072: Client environment:host.name=asf904.gq1.ygridcore.net]
 [exec] Log Message Received: [2018-03-06 23:29:21,120:29505(0x2b79e5988f40):ZOO_INFO@log_env@1079: Client environment:os.name=Linux]
 [exec] Log Message Received: [2018-03-06 23:29:21,120:29505(0x2b79e5988f40):ZOO_INFO@log_env@1080: Client environment:os.arch=3.13.0-135-generic]
 [exec] Log Message Received: [2018-03-06 23:29:21,120:29505(0x2b79e5988f40):ZOO_INFO@log_env@1081: Client environment:os.version=#184-Ubuntu SMP Wed Oct 18 11:55:51 UTC 2017]
 [exec] Log Message Received: [2018-03-06 23:29:21,120:29505(0x2b79e5988f40):ZOO_INFO@log_env@1089: Client environment:user.name=jenkins]
 [exec] Log Message Received: [2018-03-06 23:29:21,122:29505(0x2b79e5988f40):ZOO_INFO@log_env@1097: Client environment:user.home=/home/jenkins]
 [exec] Log Message Received: [2018-03-06 23:29:21,122:29505(0x2b79e5988f40):ZOO_INFO@log_env@1109: Client environment:user.dir=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit]
 [exec] Log Message Received: [2018-03-06 23:29:21,122:29505(0x2b79e5988f40):ZOO_INFO@zookeeper_init_internal@1152: Initiating client connection, host=127.0.0.1:22181 sessionTimeout=1 watcher=0x461d20 sessionId=0 sessionPasswd= context=0x7ffc39e16de0 flags=0]
 [exec] Log Message Received: [2018-03-06 23:29:21,123:29505(0x2b79e79eb700):ZOO_INFO@check_events@2439: initiated connection to server [127.0.0.1:22181]]
 [exec] Log Message Received: [2018-03-06 23:29:21,142:29505(0x2b79e79eb700):ZOO_INFO@check_events@2491: session establishment complete on server [127.0.0.1:22181], sessionId=0x1025561ef93000f, negotiated timeout=1 ]
 [exec]  : elapsed 1004 : OK
 [exec] Zookeeper_simpleSystem::testAsyncWatcherAutoReset ZooKeeper server started : elapsed 10533 : OK
 [exec] Zookeeper_simpleSystem::testDeserializeString : elapsed 0 : OK
 [exec] Zookeeper_simpleSystem::testFirstServerDown : elapsed 1000 : OK
 [exec] Zookeeper_simpleSystem::testNullData : elapsed 1026 : OK
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1003 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1007 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1012 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1036 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17195 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1015 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 3044 : OK
 [exec] Zookeeper_simpleSystem::testAuthFAIL: zktest-mt
 [exec] ==
 [exec] 1 of 2 tests failed
 [exec] Please report to u...@zookeeper.apache.org
 [exec] ==
 [exec] make[1]: Leaving directory `/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build/test/test-cppunit'
 [exec] terminate called after throwing an instance of 'CppUnit::Exception'
 [exec]   what():  equality assertion failed
 [exec] - Expected: 0
 [exec] - Actual  : -116
 [exec] 
 [exec] /bin/bash: line 5: 29505 Aborted ZKROOT=/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/src/c/../.. CLASSPATH=$CLASSPATH:$CLOVER_HOME/lib/clover.jar ${dir}$tst
 [exec] make[1]: *** [check-TESTS] Error 1
 [exec] make: *** [check-am] Error 2

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1395: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1355: The following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/build.xml:1365: exec returned: 2

Total time: 13 minutes 30 seconds
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
[WARNINGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording fingerprints
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Recording test results
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Publishing Javadoc
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Email was triggered for: Failure - Any
Sending email for trigger: Failure - Any
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7
Setting JDK_1_7_LATEST__HOME=/home/jenkins/tools/java/latest1.7




[jira] [Commented] (ZOOKEEPER-2985) Expired session may unexpired after leader failover

2018-03-06 Thread Chris Thunes (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16388077#comment-16388077
 ] 

Chris Thunes commented on ZOOKEEPER-2985:
-

The ephemeral nodes do eventually get removed once the new ZK leader marks the 
session as expired and performs the associated session tear down.

One fix may be to have the server close the client connection, _without_ 
sending the Expired event, if it finds the session is in the closing state with 
an uncommitted closeSession entry. Alternatively, session re-validation could 
be blocked for "closing" sessions until their corresponding closeSession entry 
is committed.

> Expired session may unexpired after leader failover
> ---
>
> Key: ZOOKEEPER-2985
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2985
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.3, 3.4.11
>Reporter: Chris Thunes
>Priority: Major
>
> We recently observed an inconsistency in our Kafka cluster which we tracked 
> down to ZooKeeper sessions expiring and then re-appearing after a ZooKeeper 
> leadership failover. The Kafka nodes received session "Expired" events, 
> leading to them starting new sessions and attempting to re-create some 
> ephemeral nodes (broker ID nodes in kafka/brokers/ids specifically). However, 
> between receiving the session Expired event and establishing a new session a 
> leadership failover occurred within the ZooKeeper cluster which resulted in 
> the expired session re-appearing. When Kafka attempted to re-create the 
> ephemeral nodes mentioned above it (unexpectedly) received NODEEXISTS errors.
> This behavior is a result of how session expiration is handled by the leader. 
> Specifically, the expired session is marked as "closing" immediately upon 
> expiration (in SessionTrackerImpl) and _before_ the corresponding 
> "closeSession" entry is committed. A client can therefore receive a session 
> Expired event before its session is fully closed. A leadership failover which 
> results in the loss of the (uncommitted) closeSession entry thus leads to the 
> sessions' ephemeral nodes "re-appearing" until another expiration of the old 
> session on the new leader takes place.
> I'm not certain if this should be considered a bug or an edge case that 
> clients are expected to handle. If it is the latter then I think it would be 
> good to include this in the Programmer's Guide in the documentation.
> If it's helpful I have code to reproduce this on an in-process cluster 
> running 3.4.11 or 3.5.3-beta.





Re: Name resolution in StaticHostProvider

2018-03-06 Thread Andor Molnar
Hi Abe,

Unfortunately we haven't got any feedback yet. What do you think of
implementing Option #3?

Regards,
Andor


On Thu, Feb 22, 2018 at 6:06 PM, Andor Molnar  wrote:

> Did anybody happen to take a quick look by any chance?
>
> I don't want to push this too hard, because I know it's a time consuming
> topic to think about, but this is a blocker in 3.5 which has been hanging
> around for a while and any feedback would be extremely helpful to close it
> quickly.
>
> Thanks,
> Andor
>
>
>
> On Mon, Feb 19, 2018 at 12:18 PM, Andor Molnar  wrote:
>
>> Hi all,
>>
>> We need more eyes and brains on the following PR:
>>
>> https://github.com/apache/zookeeper/pull/451
>>
>> I added a comment a few days ago about the way we currently do DNS name
>> resolution in this class, with a suggestion on how we could simplify things a
>> little bit. We talked about it with Abe Fine, but we're a little bit unsure
>> and cannot reach a conclusion. It would be extremely handy to get more
>> feedback from you.
>>
>> To add some colour to it, let me elaborate on the situation here:
>>
>> In general, the task of StaticHostProvider is to take a list of
>> potentially unresolved InetSocketAddress objects, resolve them and iterate
>> over the resolved objects each time the next() method is called.
>>
>> *Option #1 (current logic)*
>> - Resolve addresses with getAllByName(), which returns the list of IP
>> addresses associated with the name.
>> - Cache all these IPs, shuffle them and iterate over them.
>> - If the client is unable to connect to an IP, remove from the list all IPs
>> which the original server name was resolved to, and re-resolve it.
>>
>> *Option #2 (getByName())*
>> - Resolve the address with getByName() instead, which returns only the first
>> IP address of the name,
>> - Do not cache IPs,
>> - Shuffle the *names* and resolve with getByName() *every time*
>> next() is called,
>> - JDK's built-in caching will prevent name servers from being flooded and
>> will do the re-resolution automatically when the cache expires,
>> - Names with multiple IPs will be handled by DNS servers which (if
>> configured properly) return IPs in a different order on each query (this is
>> called DNS Round Robin), so getByName() will return a different IP on each call.
>>
>> *Option #3*
>> - There's a small problem with Option #2: if the DNS server is not configured
>> properly and handles the round-robin case in a way that it always returns
>> the IP list in the same order, getByName() will never return the next IP,
>> - In order to overcome that, use getAllByName() instead, shuffle the list
>> and return the first IP.
>>
>> All feedback is much appreciated.
>>
>> Thanks,
>> Andor
>>
>>
>>
>>
>
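
For illustration, Option #3 from the quoted message could be sketched roughly as
below. This is only a minimal sketch under stated assumptions, not the actual
StaticHostProvider code: the class ShuffledResolverSketch and its methods
pickOne() and resolve() are hypothetical names made up for this example. Only
InetAddress.getAllByName() and Collections.shuffle() are real JDK calls.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper illustrating Option #3; not part of ZooKeeper.
final class ShuffledResolverSketch {
    private final Random random;

    ShuffledResolverSketch(Random random) {
        this.random = random;
    }

    // Shuffles a copy of the candidates and returns the first element,
    // so the result does not depend on the order the DNS server used.
    <T> T pickOne(List<T> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates to choose from");
        }
        List<T> copy = new ArrayList<T>(candidates);
        Collections.shuffle(copy, random);
        return copy.get(0);
    }

    // Resolves the name on every call (no caching in this class; repeated
    // lookups are left to the JDK's own DNS cache) and picks one address.
    InetAddress resolve(String hostName) throws UnknownHostException {
        InetAddress[] all = InetAddress.getAllByName(hostName);
        return pickOne(Arrays.asList(all));
    }
}
```

A caller would invoke resolve() with a server name each time next() needs an
address. Keeping the shuffle in pickOne() separate from the lookup makes the
selection step testable without touching DNS.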


[jira] [Commented] (ZOOKEEPER-2985) Expired session may unexpired after leader failover

2018-03-06 Thread Andor Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387978#comment-16387978
 ] 

Andor Molnar commented on ZOOKEEPER-2985:
-

[~cthunes]

Thanks for reporting this.

I think this is related to https://issues.apache.org/jira/browse/ZOOKEEPER-1208 
which has intentionally introduced the closing state for events which have been 
expired, but `closeSession` has not been acknowledged by the quorum.

Will the ephemerals be removed eventually once the quorum is established, or 
will they survive forever because of the race condition?

> Expired session may unexpired after leader failover
> ---
>
> Key: ZOOKEEPER-2985
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2985
> Project: ZooKeeper
>  Issue Type: Bug
>Affects Versions: 3.5.3, 3.4.11
>Reporter: Chris Thunes
>Priority: Major
>
> We recently observed an inconsistency in our Kafka cluster which we tracked 
> down to ZooKeeper sessions expiring and then re-appearing after a ZooKeeper 
> leadership failover. The Kafka nodes received session "Expired" events, 
> leading to them starting new sessions and attempting to re-create some 
> ephemeral nodes (broker ID nodes in kafka/brokers/ids specifically). However, 
> between receiving the session Expired event and establishing a new session a 
> leadership failover occurred within the ZooKeeper cluster which resulted in 
> the expired session re-appearing. When Kafka attempted to re-create the 
> ephemeral nodes mentioned above it (unexpectedly) received NODEEXISTS errors.
> This behavior is a result of how session expiration is handled by the leader. 
> Specifically, the expired session is marked as "closing" immediately upon 
> expiration (in SessionTrackerImpl) and _before_ the corresponding 
> "closeSession" entry is committed. A client can therefore receive a session 
> Expired event before its session is fully closed. A leadership failover which 
> results in the loss of the (uncommitted) closeSession entry thus leads to the 
> sessions' ephemeral nodes "re-appearing" until another expiration of the old 
> session on the new leader takes place.
> I'm not certain if this should be considered a bug or an edge case that 
> clients are expected to handle. If it is the latter then I think it would be 
> good to include this in the Programmer's Guide in the documentation.
> If it's helpful I have code to reproduce this on an in-process cluster 
> running 3.4.11 or 3.5.3-beta.





[jira] [Commented] (ZOOKEEPER-2993) .ignore file prevents adding src/java/main/org/apache/jute/compiler/generated dir to git repo

2018-03-06 Thread Andor Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387863#comment-16387863
 ] 

Andor Molnar commented on ZOOKEEPER-2993:
-

[~taoist]

Nice catch, thanks for creating the Jira.

I think the name of the directory (generated) is a little bit misleading, 
because these files have been part of the codebase since the very beginning 
and were mistakenly put on ignore by [~cnauroth] in 
[https://github.com/apache/zookeeper/commit/fa5955afa0962147268241163b7ca47dcdd074e0]

Are you happy to contribute and file a new PR on GitHub to address the issue as 
suggested?

> .ignore file prevents adding src/java/main/org/apache/jute/compiler/generated 
> dir to git repo
> -
>
> Key: ZOOKEEPER-2993
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2993
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: build
>Affects Versions: 3.4.10
>Reporter: jason wang
>Priority: Minor
>
> There are Rcc.java and other required files under the 
> src/java/main/org/apache/jute/compiler/generated directory.
> However, when I tried to add the source distribution to our own git repo, the 
> .gitignore file has "generated" as a keyword on line 55, which prevents the 
> dir and the files under it from being added to the repo.  The compilation later 
> fails due to the missing dir and files.
> compile_jute:
> 19:02:54 [mkdir] Created dir: /home/jenkins/workspace/3PA/PMODS/zookeeper-pgdi-patch-in-maven-repo/src/java/generated
> 19:02:54 [mkdir] Created dir: /home/jenkins/workspace/3PA/PMODS/zookeeper-pgdi-patch-in-maven-repo/src/c/generated
> 19:02:54 [java] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> 19:02:54 [java] Error: Could not find or load main class org.apache.jute.compiler.generated.Rcc
> 19:02:54 [java] Java Result: 1
> 19:02:54 [java] Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
> 19:02:54 [java] Error: Could not find or load main class org.apache.jute.compiler.generated.Rcc
> 19:02:54 [java] Java Result: 1
> 19:02:54 [touch] Creating /home/jenkins/workspace/3PA/PMODS/zookeeper-pgdi-patch-in-maven-repo/src/java/generated/.generated
>  
> The fix is to remove or comment out the "generated" keyword on line 55:
> #
>  #generated
>  #
>  





[jira] [Commented] (ZOOKEEPER-2251) Add Client side packet response timeout to avoid infinite wait.

2018-03-06 Thread Andor Molnar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387776#comment-16387776
 ] 

Andor Molnar commented on ZOOKEEPER-2251:
-

[~awkejiang]

I believe that [~hanm]'s concerns have to be addressed first on GitHub.

> Add Client side packet response timeout to avoid infinite wait.
> ---
>
> Key: ZOOKEEPER-2251
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2251
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: java client
>Affects Versions: 3.4.9, 3.5.2, 3.4.11
>Reporter: nijel
>Assignee: Mohammad Arshad
>Priority: Critical
>  Labels: fault
> Fix For: 3.5.4, 3.6.0, 3.4.12
>
> Attachments: ZOOKEEPER-2251-01.patch, ZOOKEEPER-2251-02.patch, 
> ZOOKEEPER-2251-03.patch, ZOOKEEPER-2251-04.patch
>
>
> I came across an issue related to client-side packet response timeouts. In my 
> cluster, many packet drops happened for some time.
> One observation is that the ZooKeeper client hung. As per the thread dump, it 
> is waiting for the response/ACK for the operation performed (synchronous API 
> used here).
> I am using 
> zookeeper.serverCnxnFactory=org.apache.zookeeper.server.NIOServerCnxnFactory
> Since only a few packets were missed, no DISCONNECTED event occurred.
> We need to add a "response timeout" for the operations or packets.
> *Comments from [~rakeshr]*
> My observation about the problem:
> * Can use tools like 'Wireshark' to simulate the artificial packet loss.
> * Assume there is only one packet in the 'outgoingQueue' and unfortunately 
> the server response packet is lost. Now the client will wait 
> indefinitely. 
> https://github.com/apache/zookeeper/blob/trunk/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1515
> * Probably we can discuss more about this problem and possible solutions(add 
> packet ACK timeout or another better approach) in the jira.
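
To make the "response timeout" idea above concrete, here is a minimal,
self-contained sketch of a bounded wait. It is not ZooKeeper's ClientCnxn code
and not the patch attached to this issue: the class PendingReplySketch and its
methods are hypothetical, and the sketch assumes that a reply is signalled by
some other thread calling markFinished().

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a request that is waiting for a server reply.
final class PendingReplySketch {
    private boolean finished;

    // Would be called by the (assumed) I/O thread when the reply arrives.
    synchronized void markFinished() {
        finished = true;
        notifyAll();
    }

    // Waits for the reply, but gives up once timeoutMs has elapsed instead
    // of blocking forever. Returns true if the reply arrived in time.
    synchronized boolean awaitReply(long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        try {
            while (!finished) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false; // timed out: the caller can fail the request
                }
                // Bounded wait; the loop guards against spurious wakeups.
                TimeUnit.NANOSECONDS.timedWait(this, remainingNanos);
            }
            return true;
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the current state.
            Thread.currentThread().interrupt();
            return finished;
        }
    }
}
```

With an unbounded wait, a lost reply leaves the caller blocked indefinitely,
as described in the issue. With the bounded variant the caller gets a false
result and can surface an error or retry.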





[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387751#comment-16387751
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2901:
---

Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/377
  
@Randgalt @phunt I think this PR is in pretty good shape; we should 
finalize it. Any thoughts or outstanding concerns?


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>public static EphemeralType get(long ephemeralOwner) {
>if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>return CONTAINER;
>}
>if (ephemeralOwner < 0) {
>return TTL;
>}
>return (ephemeralOwner == 0) ? VOID : NORMAL;
>}
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.
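
The misclassification described above can be reproduced with a small
standalone sketch. The get() logic below is copied from the snippet quoted in
the issue, and CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE is the value shown in
the pull request diff for OldEphemeralType. The wrapper class
EphemeralTypeDemo and its nested Type enum are hypothetical names used only
for this example.

```java
// Hypothetical wrapper around the classification logic quoted in the issue.
final class EphemeralTypeDemo {
    enum Type { VOID, NORMAL, CONTAINER, TTL }

    static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;

    // Same decision order as the quoted EphemeralType.get().
    static Type get(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
            return Type.CONTAINER;
        }
        if (ephemeralOwner < 0) {
            // Any negative owner is treated as TTL, including a session ID
            // that merely happens to have its high bit set.
            return Type.TTL;
        }
        return (ephemeralOwner == 0) ? Type.VOID : Type.NORMAL;
    }
}
```

Calling EphemeralTypeDemo.get(-720548323429908480L), the session ID from the
report, yields TTL, while any positive session ID yields NORMAL. That matches
the reporter's observation that the failure depends on the sign of the ID.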





[GitHub] zookeeper issue #377: [ZOOKEEPER-2901] TTL Nodes don't work with Server IDs ...

2018-03-06 Thread anmolnar
Github user anmolnar commented on the issue:

https://github.com/apache/zookeeper/pull/377
  
@Randgalt @phunt I think this PR is in pretty good shape; we should 
finalize it. Any thoughts or outstanding concerns?


---


[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type

2018-03-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387728#comment-16387728
 ] 

ASF GitHub Bot commented on ZOOKEEPER-2901:
---

Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r172509908
  
--- Diff: src/java/main/org/apache/zookeeper/server/OldEphemeralType.java ---
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+/**
+ * See https://issues.apache.org/jira/browse/ZOOKEEPER-2901
+ *
+ * version 3.5.3 introduced bugs associated with how TTL nodes were implemented. version 3.5.4
+ * fixes the problems but makes TTL nodes created in 3.5.3 invalid. OldEphemeralType is a copy
+ * of the old - bad - implementation that is provided as a workaround. {@link EphemeralType#TTL_3_5_3_EMULATION_PROPERTY}
+ * can be used to emulate support of the badly specified TTL nodes.
+ */
+public enum OldEphemeralType {
+/**
+ * Not ephemeral
+ */
+VOID,
+/**
+ * Standard, pre-3.5.x EPHEMERAL
+ */
+NORMAL,
+/**
+ * Container node
+ */
+CONTAINER,
+/**
+ * TTL node
+ */
+TTL;
+
+public static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;
+public static final long MAX_TTL = 0x0fffL;
+public static final long TTL_MASK = 0x8000L;
+
+public static OldEphemeralType get(long ephemeralOwner) {
--- End diff --

Makes sense.
I think it would be slightly more accurate to rename the old enum to 
`EphemeralTypeEmu353`.
What do you think?


> Session ID that is negative causes mis-calculation of Ephemeral Type
> 
>
> Key: ZOOKEEPER-2901
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
> Project: ZooKeeper
>  Issue Type: Bug
>  Components: server
>Affects Versions: 3.5.3
> Environment: Running 3.5.3-beta in Docker container
>Reporter: Mark Johnson
>Assignee: Jordan Zimmerman
>Priority: Blocker
>
> In the code that determines the EphemeralType it is looking at the owner 
> (which is the client ID or connection ID):
> EphemeralType.java:
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
> However my connection ID is:
> header.getClientId(): -720548323429908480
> This causes the code to think this is a TTL Ephemeral node instead of a
> NORMAL Ephemeral node.
> This also explains why this is random - if my client ID is non-negative
> then the node gets added correctly.
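
The misclassification described in the report can be reproduced in isolation. The following is a minimal sketch, not the actual ZooKeeper class: `EphemeralOwnerDemo`, `Kind`, and `classify` are hypothetical names, and only the three checks quoted above are copied from the report. It feeds the reported client ID through those checks.

```java
// Standalone sketch; EphemeralOwnerDemo, Kind and classify are hypothetical
// names used for illustration and are not part of ZooKeeper.
final class EphemeralOwnerDemo {

    enum Kind { VOID, NORMAL, CONTAINER, TTL }

    // Same sentinel value as in the quoted code.
    static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;

    // Mirrors the checks quoted in the report: any negative owner other
    // than the container sentinel is treated as a TTL node.
    static Kind classify(long ephemeralOwner) {
        if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
            return Kind.CONTAINER;
        }
        if (ephemeralOwner < 0) {
            return Kind.TTL;
        }
        return (ephemeralOwner == 0) ? Kind.VOID : Kind.NORMAL;
    }

    public static void main(String[] args) {
        // Client ID taken from the report; it is a valid but negative long.
        long reportedClientId = -720548323429908480L;

        System.out.println(classify(reportedClientId)); // prints TTL
        System.out.println(classify(1L));               // prints NORMAL
        System.out.println(classify(0L));               // prints VOID
        System.out.println(classify(Long.MIN_VALUE));   // prints CONTAINER
    }
}
```

Because the sign of the owner is the only thing that separates TTL from NORMAL in these checks, whether a session is affected depends entirely on whether its ID happens to be negative, which matches the "random" behaviour noted in the report.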



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] zookeeper pull request #377: [ZOOKEEPER-2901] TTL Nodes don't work with Serv...

2018-03-06 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/377#discussion_r172509908
  
--- Diff: src/java/main/org/apache/zookeeper/server/OldEphemeralType.java ---
@@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ * 
+ * http://www.apache.org/licenses/LICENSE-2.0
+ * 
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.zookeeper.server;
+
+/**
+ * See https://issues.apache.org/jira/browse/ZOOKEEPER-2901
+ *
+ * version 3.5.3 introduced bugs associated with how TTL nodes were implemented. version 3.5.4
+ * fixes the problems but makes TTL nodes created in 3.5.3 invalid. OldEphemeralType is a copy
+ * of the old - bad - implementation that is provided as a workaround. {@link EphemeralType#TTL_3_5_3_EMULATION_PROPERTY}
+ * can be used to emulate support of the badly specified TTL nodes.
+ */
+public enum OldEphemeralType {
+/**
+ * Not ephemeral
+ */
+VOID,
+/**
+ * Standard, pre-3.5.x EPHEMERAL
+ */
+NORMAL,
+/**
+ * Container node
+ */
+CONTAINER,
+/**
+ * TTL node
+ */
+TTL;
+
+public static final long CONTAINER_EPHEMERAL_OWNER = Long.MIN_VALUE;
+public static final long MAX_TTL = 0x0fffL;
+public static final long TTL_MASK = 0x8000L;
+
+public static OldEphemeralType get(long ephemeralOwner) {
--- End diff --

Makes sense.
I think it would be slightly more accurate to rename the old enum to
`EphemeralTypeEmu353`.
What do you think?


---


ZooKeeper_branch35_jdk8 - Build # 876 - Still Failing

2018-03-06 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch35_jdk8/876/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 60.59 KB...]
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.091 sec, Thread: 7, Class: org.apache.zookeeper.test.SaslClientTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.665 sec, Thread: 4, Class: 
org.apache.zookeeper.test.SaslAuthMissingClientConfigTest
[junit] Running org.apache.zookeeper.test.SaslSuperUserTest in thread 2
[junit] Running org.apache.zookeeper.test.ServerCnxnTest in thread 7
[junit] Running org.apache.zookeeper.test.SessionInvalidationTest in thread 
4
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.862 sec, Thread: 2, Class: org.apache.zookeeper.test.SaslSuperUserTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.194 sec, Thread: 4, Class: org.apache.zookeeper.test.SessionInvalidationTest
[junit] Running org.apache.zookeeper.test.SessionTest in thread 2
[junit] Running org.apache.zookeeper.test.SessionTrackerCheckTest in thread 
4
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.092 sec, Thread: 4, Class: org.apache.zookeeper.test.SessionTrackerCheckTest
[junit] Running org.apache.zookeeper.test.SessionUpgradeTest in thread 4
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
3.585 sec, Thread: 7, Class: org.apache.zookeeper.test.ServerCnxnTest
[junit] Running org.apache.zookeeper.test.StandaloneTest in thread 7
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.624 sec, Thread: 7, Class: org.apache.zookeeper.test.StandaloneTest
[junit] Running org.apache.zookeeper.test.StatTest in thread 7
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.203 sec, Thread: 7, Class: org.apache.zookeeper.test.StatTest
[junit] Running org.apache.zookeeper.test.StaticHostProviderTest in thread 7
[junit] Tests run: 14, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 
79.381 sec, Thread: 8, Class: org.apache.zookeeper.test.QuorumTest
[junit] Running org.apache.zookeeper.test.StringUtilTest in thread 8
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.066 sec, Thread: 8, Class: org.apache.zookeeper.test.StringUtilTest
[junit] Running org.apache.zookeeper.test.SyncCallTest in thread 8
[junit] Tests run: 13, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
2.914 sec, Thread: 7, Class: org.apache.zookeeper.test.StaticHostProviderTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.83 sec, Thread: 8, Class: org.apache.zookeeper.test.SyncCallTest
[junit] Running org.apache.zookeeper.test.TruncateTest in thread 7
[junit] Running org.apache.zookeeper.test.WatchEventWhenAutoResetTest in 
thread 8
[junit] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
69.22 sec, Thread: 3, Class: org.apache.zookeeper.test.QuorumZxidSyncTest
[junit] Running org.apache.zookeeper.test.WatchedEventTest in thread 3
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.074 sec, Thread: 3, Class: org.apache.zookeeper.test.WatchedEventTest
[junit] Running org.apache.zookeeper.test.WatcherFuncTest in thread 3
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
1.575 sec, Thread: 3, Class: org.apache.zookeeper.test.WatcherFuncTest
[junit] Running org.apache.zookeeper.test.WatcherTest in thread 3
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
6.537 sec, Thread: 7, Class: org.apache.zookeeper.test.TruncateTest
[junit] Running org.apache.zookeeper.test.X509AuthTest in thread 7
[junit] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.092 sec, Thread: 7, Class: org.apache.zookeeper.test.X509AuthTest
[junit] Running org.apache.zookeeper.test.ZkDatabaseCorruptionTest in 
thread 7
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
21.002 sec, Thread: 4, Class: org.apache.zookeeper.test.SessionUpgradeTest
[junit] Running org.apache.zookeeper.test.ZooKeeperQuotaTest in thread 4
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
0.895 sec, Thread: 4, Class: org.apache.zookeeper.test.ZooKeeperQuotaTest
[junit] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
6.569 sec, Thread: 7, Class: org.apache.zookeeper.test.ZkDatabaseCorruptionTest
[junit] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
19.931 sec, Thread: 8, Class: 
org.apache.zookeeper.test.WatchEventWhenAutoResetTest
[junit] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 
34.589 sec, Thread: 2, Class: