ZooKeeper-trunk-WinVS2008 - Build # 754 - Failure

2013-03-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/754/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 168 lines...]
Build FAILED.

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default target) (1) -
(zookeeper target) - 
  cl : Command line warning D9035: option 'Wp64' has been deprecated and will 
be removed in a future release
  .\src\zookeeper.c(502): warning C4996: 'strdup': The POSIX name for this item 
is deprecated. Instead, use the ISO C++ conformant name: _strdup. See online 
help for details.
  .\src\zookeeper.c(750): warning C4996: 'strdup': The POSIX name for this item 
is deprecated. Instead, use the ISO C++ conformant name: _strdup. See online 
help for details.
  .\src\zookeeper.c(1005): warning C4996: 'strdup': The POSIX name for this 
item is deprecated. Instead, use the ISO C++ conformant name: _strdup. See 
online help for details.
  .\src\zookeeper.c(1016): warning C4996: 'strncpy': This function or variable 
may be unsafe. Consider using strncpy_s instead. To disable deprecation, use 
_CRT_SECURE_NO_WARNINGS. See online help for details.
  .\src\zookeeper.c(1022): warning C4996: 'strdup': The POSIX name for this 
item is deprecated. Instead, use the ISO C++ conformant name: _strdup. See 
online help for details.
  .\src\zookeeper.c(1083): warning C4996: 'strdup': The POSIX name for this 
item is deprecated. Instead, use the ISO C++ conformant name: _strdup. See 
online help for details.
  .\src\zk_log.c(120): warning C4996: '_snprintf': This function or variable 
may be unsafe. Consider using _snprintf_s instead. To disable deprecation, use 
_CRT_SECURE_NO_WARNINGS. See online help for details.
  .\src\zk_log.c(134): warning C4996: 'getpid': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _getpid. See online help 
for details.
  .\src\zk_log.c(160): warning C4996: 'vsnprintf': This function or variable 
may be unsafe. Consider using vsnprintf_s instead. To disable deprecation, use 
_CRT_SECURE_NO_WARNINGS. See online help for details.
  .\src\zk_hashtable.c(189): warning C4996: 'strdup': The POSIX name for this 
item is deprecated. Instead, use the ISO C++ conformant name: _strdup. See 
online help for details.
  .\src\zk_hashtable.c(213): warning C4996: 'strdup': The POSIX name for this 
item is deprecated. Instead, use the ISO C++ conformant name: _strdup. See 
online help for details.
  .\src\mt_adaptor.c(498): warning C4244: '=' : conversion from 'time_t' to 
'int32_t', possible loss of data


f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default target) (1) -
(Cli target) - 
  .\src\cli.c(124): warning C4996: 'fopen': This function or variable may be 
unsafe. Consider using fopen_s instead. To disable deprecation, use 
_CRT_SECURE_NO_WARNINGS. See online help for details.
  .\src\cli.c(281): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(282): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(351): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(371): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(386): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(397): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(430): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(433): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(445): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(458): warning C4996: 'strdup': The POSIX name for this item is 
deprecated. Instead, use the ISO C++ conformant name: _strdup. See online help 
for details.
  .\src\cli.c(551): warning C4996: 'strcpy': This function or variable may be 
unsafe. Consider using strcpy_s instead. To disable deprecation, use 
_CRT_SECURE_NO_WARNINGS. See online help for details.
  .\src\cli.c(556): warning C4996: 'fopen': This function or variable 

ZooKeeper-trunk-WinVS2008 - Build # 755 - Still Failing

2013-03-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008/755/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 112 lines...]
compile_jute_uptodate:

compile_jute:
[mkdir] Created dir: 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\java\generated
[mkdir] Created dir: 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\generated
 [java] ../../zookeeper.jute Parsed Successfully
 [java] ../../zookeeper.jute Parsed Successfully
[touch] Creating 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\java\generated\.generated

BUILD SUCCESSFUL
Total time: 8 seconds
[ZooKeeper-trunk-WinVS2008] $ cmd /c call 
C:\Users\hudson\AppData\Local\Temp\hudson6878150785352312376.bat

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008set 
ZOOKEEPER_HOME=f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk 

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008msbuild 
trunk/src/c/zookeeper.sln /p:Configuration=Release 
Microsoft (R) Build Engine Version 3.5.30729.1
[Microsoft .NET Framework, Version 2.0.50727.4223]
Copyright (C) Microsoft Corporation 2007. All rights reserved.

Build started 3/3/2013 11:02:35 AM.
Project 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 on node 0 (default targets).
  Building solution configuration Release|Win32.
cl : Command line warning D9035: option 'Wp64' has been deprecated and will be 
removed in a future release
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_bbddcd05sy': No space left on device
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(120): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_94c14deasy': No space left on device
f:\hudson\hudson-slave\workspace\zookeeper-trunk-winvs2008\trunk\src\c\src\hashtable/hashtable_itr.h(39):
 fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_082f04a3sy': No space left on device
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_d298b055sy': No space left on device
C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_a7c8acacsy': No space left on device
Done Building Project 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default targets) -- FAILED.

Build FAILED.

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default target) (1) -
(zookeeper target) - 
  cl : Command line warning D9035: option 'Wp64' has been deprecated and will 
be removed in a future release


f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default target) (1) -
(zookeeper target) - 
  C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_bbddcd05sy': No space left on device
  C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(120): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_94c14deasy': No space left on device
  
f:\hudson\hudson-slave\workspace\zookeeper-trunk-winvs2008\trunk\src\c\src\hashtable/hashtable_itr.h(39):
 fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_082f04a3sy': No space left on device
  C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_d298b055sy': No space left on device


f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008\trunk\src\c\zookeeper.sln
 (default target) (1) -
(Cli target) - 
  C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\include\time.inl(94): 
fatal error C1088: Cannot flush compiler intermediate file: 
'C:\Users\hudson\AppData\Local\Temp\_CL_a7c8acacsy': No space left on device

1 Warning(s)
5 Error(s)

Time Elapsed 00:00:07.68

f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008exit 1 
Build step 'Execute Windows batch command' marked build as failure
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (ZOOKEEPER-1658) Support SRV records

2013-03-03 Thread Devin Bayer (JIRA)
Devin Bayer created ZOOKEEPER-1658:
--

 Summary: Support SRV records
 Key: ZOOKEEPER-1658
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1658
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Devin Bayer
Priority: Minor


We want to make client configuration easy so the quorum is just a single A 
rrset with multiple IP addresses. This isn't ideal because we need to hard-code 
the IPs off our zookeeper servers and they already have domain names. If 
zookeeper supported SRV, we could just do:

  _zookeeper.example.com. 86400 IN SRV 10 60 2181 worker1
  _zookeeper.example.com. 86400 IN SRV 10 20 2181 worker2
  _zookeeper.example.com. 86400 IN SRV 10 10 2181 worker3

and -Dhbase.zookeeper.quorum=example.com

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


ZooKeeper-trunk-solaris - Build # 486 - Still Failing

2013-03-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/486/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 165013 lines...]
[junit] 2013-03-03 13:40:39,288 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-0:NIOServerCnxnFactory$SelectorThread@413] 
- selector thread exitted run method
[junit] 2013-03-03 13:40:39,289 [myid:] - INFO  [main:ZooKeeperServer@398] 
- shutting down
[junit] 2013-03-03 13:40:39,289 [myid:] - INFO  
[main:SessionTrackerImpl@180] - Shutting down
[junit] 2013-03-03 13:40:39,289 [myid:] - INFO  
[main:PrepRequestProcessor@804] - Shutting down
[junit] 2013-03-03 13:40:39,289 [myid:] - INFO  
[main:SyncRequestProcessor@175] - Shutting down
[junit] 2013-03-03 13:40:39,289 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@144] - PrepRequestProcessor exited loop!
[junit] 2013-03-03 13:40:39,290 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@155] - SyncRequestProcessor exited!
[junit] 2013-03-03 13:40:39,290 [myid:] - INFO  
[main:FinalRequestProcessor@421] - shutdown of request processor complete
[junit] 2013-03-03 13:40:39,291 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-03-03 13:40:39,291 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[]
[junit] 2013-03-03 13:40:39,292 [myid:] - INFO  [main:ClientBase@414] - 
STARTING server
[junit] 2013-03-03 13:40:39,292 [myid:] - INFO  [main:ZooKeeperServer@149] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test444373612090522741.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test444373612090522741.junit.dir/version-2
[junit] 2013-03-03 13:40:39,293 [myid:] - INFO  
[main:NIOServerCnxnFactory@663] - Configuring NIO connection handler with 10s 
sessionless connection timeout, 2 selector thread(s), 16 worker threads, and 64 
kB direct buffers.
[junit] 2013-03-03 13:40:39,293 [myid:] - INFO  
[main:NIOServerCnxnFactory@676] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-03-03 13:40:39,294 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test444373612090522741.junit.dir/version-2/snapshot.b
[junit] 2013-03-03 13:40:39,297 [myid:] - INFO  [main:FileTxnSnapLog@270] - 
Snapshotting: 0xb to 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris/trunk/build/test/tmp/test444373612090522741.junit.dir/version-2/snapshot.b
[junit] 2013-03-03 13:40:39,298 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2013-03-03 13:40:39,299 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@289]
 - Accepted socket connection from /127.0.0.1:34520
[junit] 2013-03-03 13:40:39,300 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@829] - Processing stat command from 
/127.0.0.1:34520
[junit] 2013-03-03 13:40:39,304 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@678] - Stat command output
[junit] 2013-03-03 13:40:39,304 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1000] - Closed socket connection for client 
/127.0.0.1:34520 (no session established for client)
[junit] 2013-03-03 13:40:39,304 [myid:] - INFO  [main:JMXEnv@133] - 
ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] 2013-03-03 13:40:39,306 [myid:] - INFO  [main:JMXEnv@105] - 
expect:InMemoryDataTree
[junit] 2013-03-03 13:40:39,306 [myid:] - INFO  [main:JMXEnv@108] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2013-03-03 13:40:39,306 [myid:] - INFO  [main:JMXEnv@105] - 
expect:StandaloneServer_port
[junit] 2013-03-03 13:40:39,306 [myid:] - INFO  [main:JMXEnv@108] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-03-03 13:40:39,307 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@57] - FINISHED TEST METHOD testQuota
[junit] 2013-03-03 13:40:39,307 [myid:] - INFO  [main:ClientBase@451] - 
tearDown starting
[junit] 2013-03-03 13:40:39,369 [myid:] - INFO  [main:ZooKeeper@744] - 
Session: 0x13d307b978b closed
[junit] 2013-03-03 13:40:39,369 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@513] - EventThread shut down
[junit] 2013-03-03 13:40:39,369 [myid:] - INFO  [main:ClientBase@421] - 
STOPPING server
[junit] 2013-03-03 13:40:39,370 [myid:] - INFO  
[NIOServerCxnFactory.SelectorThread-1:NIOServerCnxnFactory$SelectorThread@413] 

[jira] [Commented] (ZOOKEEPER-1633) Introduce a protocol version to connection initiation message

2013-03-03 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591824#comment-13591824
 ] 

Flavio Junqueira commented on ZOOKEEPER-1633:
-

+1 for me too. does anyone else want to comment on this patch?

 Introduce a protocol version to connection initiation message
 -

 Key: ZOOKEEPER-1633
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1633
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: Alexander Shraer
Assignee: Alexander Shraer
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1633.patch, ZOOKEEPER-1633-v4.patch, 
 ZOOKEEPER-1633-v4.patch, ZOOKEEPER-1633-ver2.patch, ZOOKEEPER-1633-ver3.patch


 Currently the first message a server sends to another server includes just 
 one field - the server's id (long). This is in QuorumCnxManager.java. This 
 makes changes to the information passed during this initial connection very 
 difficult. This patch will change the first field of the message to be a 
 protocol version (a negative number that can't be a server id). The second 
 field will be the server id. The third field is number of bytes in the 
 remainder of the message. A 3.4 server will read the first field as before, 
 but if this is a negative number it will read the second field to find the 
 server id, and then remove the remainder of the message from the stream. This 
 will not affect 3.4 since 3.4 and earlier servers send just the server id (so 
 the code in the patch will not run unless there is a server  3.4 trying to 
 connect). This will, however, provide the necessary flexibility for future 
 releases as well as an upgrade path from 3.4

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Michael Morello (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Morello updated ZOOKEEPER-1382:
---

Attachment: ZOOKEEPER-1382.patch

I managed to create a JUnit test case that demonstrate the watches leak, 
unfortunately i had to promote a few methods to the public scope.

As Neha pointed out the unit test shows that if a client send a connection 
request with an expired session the NIOServerCnxn class still processes 
upcoming messages like OpCode.setWatches.

Actually the fix simply consists to call cnxn.enableRecv() only if the session 
has been validated by the leader.

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch, ZOOKEEPER-1382.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In NIOServerCnxn, I saw that after suspecting that a session is expired, a 
 server removes the cnxn and its 

Failed: ZOOKEEPER-1382 PreCommit Build #1419

2013-03-03 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1419/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 71 lines...]
 [exec] Hunk #1 FAILED at 597.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java.rej
 [exec] patching file 
src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
 [exec] Hunk #1 FAILED at 206.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java.rej
 [exec] patching file src/java/main/org/apache/zookeeper/ClientCnxn.java
 [exec] Hunk #1 succeeded at 231 (offset 4 lines).
 [exec] Hunk #2 succeeded at 258 (offset 4 lines).
 [exec] Hunk #3 succeeded at 277 (offset 4 lines).
 [exec] Hunk #4 succeeded at 304 (offset 4 lines).
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12571815/ZOOKEEPER-1382.patch
 [exec]   against trunk revision 1448007.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1419//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 99f7cbcf806d15cce5ce7d0bf5780c5bb01beb7b logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 43 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1382
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591848#comment-13591848
 ] 

Hadoop QA commented on ZOOKEEPER-1382:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12571815/ZOOKEEPER-1382.patch
  against trunk revision 1448007.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1419//console

This message is automatically generated.

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch, ZOOKEEPER-1382.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In NIOServerCnxn, I saw that after 

[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Michael Morello (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Morello updated ZOOKEEPER-1382:
---

Attachment: (was: ZOOKEEPER-1382.patch)

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In NIOServerCnxn, I saw that after suspecting that a session is expired, a 
 server removes the cnxn and its watches from its internal data structures. 
 But before that it allows more requests to be processed even if the session 
 is expired -
 // Now that the session is ready we can start receiving packets
 synchronized (this.factory) {
 sk.selector().wakeup();
 enableRecv();
 }
 } catch (Exception e) {
 LOG.warn(Exception while establishing session, closing, e);
 close();
 

[jira] [Updated] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Michael Morello (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Morello updated ZOOKEEPER-1382:
---

Attachment: ZOOKEEPER-1382-branch-3.4.patch

Current patch is for branch 3.4
Sorry

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch, 
 ZOOKEEPER-1382-branch-3.4.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In NIOServerCnxn, I saw that after suspecting that a session is expired, a 
 server removes the cnxn and its watches from its internal data structures. 
 But before that it allows more requests to be processed even if the session 
 is expired -
 // Now that the session is ready we can start receiving packets
 synchronized (this.factory) {
 sk.selector().wakeup();
 enableRecv();
 }
 } catch (Exception e) {
 

Failed: ZOOKEEPER-1382 PreCommit Build #1420

2013-03-03 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1420/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 71 lines...]
 [exec] Hunk #1 FAILED at 597.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java.rej
 [exec] patching file 
src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java
 [exec] Hunk #1 FAILED at 206.
 [exec] 1 out of 1 hunk FAILED -- saving rejects to file 
src/java/main/org/apache/zookeeper/server/NIOServerCnxn.java.rej
 [exec] patching file src/java/main/org/apache/zookeeper/ClientCnxn.java
 [exec] Hunk #1 succeeded at 231 (offset 4 lines).
 [exec] Hunk #2 succeeded at 258 (offset 4 lines).
 [exec] Hunk #3 succeeded at 277 (offset 4 lines).
 [exec] Hunk #4 succeeded at 304 (offset 4 lines).
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12571817/ZOOKEEPER-1382-branch-3.4.patch
 [exec]   against trunk revision 1448007.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1420//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] e372528e767e2b6757ed4e2f6bc7f26b87e7bb55 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1623:
 exec returned: 1

Total time: 42 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1382
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13591854#comment-13591854
 ] 

Hadoop QA commented on ZOOKEEPER-1382:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12571817/ZOOKEEPER-1382-branch-3.4.patch
  against trunk revision 1448007.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/1420//console

This message is automatically generated.

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch, 
 ZOOKEEPER-1382-branch-3.4.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In 

ZooKeeper_branch33_solaris - Build # 460 - Failure

2013-03-03 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch33_solaris/460/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 96942 lines...]
[junit] 2013-03-04 07:06:59,076 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2
[junit] 2013-03-04 07:06:59,077 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-03-04 07:06:59,079 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2/snapshot.0
[junit] 2013-03-04 07:06:59,082 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-03-04 07:06:59,084 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-03-04 07:06:59,085 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:45594
[junit] 2013-03-04 07:06:59,086 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:45594
[junit] 2013-03-04 07:06:59,087 - INFO  
[Thread-4:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] 2013-03-04 07:06:59,088 - INFO  [Thread-4:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:45594 (no session established 
for client)
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-03-04 07:06:59,090 - INFO  [main:ClientBase@408] - STOPPING 
server
[junit] 2013-03-04 07:06:59,091 - INFO  
[ProcessThread:-1:PrepRequestProcessor@128] - PrepRequestProcessor exited loop!
[junit] 2013-03-04 07:06:59,091 - INFO  
[SyncThread:0:SyncRequestProcessor@151] - SyncRequestProcessor exited!
[junit] 2013-03-04 07:06:59,092 - INFO  [main:FinalRequestProcessor@370] - 
shutdown of request processor complete
[junit] 2013-03-04 07:06:59,093 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] ensureOnly:[]
[junit] 2013-03-04 07:06:59,095 - INFO  [main:ClientBase@401] - STARTING 
server
[junit] 2013-03-04 07:06:59,095 - INFO  [main:ZooKeeperServer@154] - 
Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2
 snapdir 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2
[junit] 2013-03-04 07:06:59,096 - INFO  [main:NIOServerCnxn$Factory@143] - 
binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2013-03-04 07:06:59,097 - INFO  [main:FileSnap@82] - Reading 
snapshot 
/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/ZooKeeper_branch33_solaris/trunk/build/test/tmp/test6220841508599798586.junit.dir/version-2/snapshot.b
[junit] 2013-03-04 07:06:59,100 - INFO  [main:FileTxnSnapLog@256] - 
Snapshotting: b
[junit] 2013-03-04 07:06:59,102 - INFO  [main:FourLetterWordMain@43] - 
connecting to 127.0.0.1 11221
[junit] 2013-03-04 07:06:59,103 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn$Factory@251] - 
Accepted socket connection from /127.0.0.1:45596
[junit] 2013-03-04 07:06:59,103 - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@1237] - Processing 
stat command from /127.0.0.1:45596
[junit] 2013-03-04 07:06:59,104 - INFO  
[Thread-5:NIOServerCnxn$StatCommand@1153] - Stat command output
[junit] 2013-03-04 07:06:59,105 - INFO  [Thread-5:NIOServerCnxn@1435] - 
Closed socket connection for client /127.0.0.1:45596 (no session established 
for client)
[junit] ensureOnly:[InMemoryDataTree, StandaloneServer_port]
[junit] expect:InMemoryDataTree
[junit] found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] expect:StandaloneServer_port
[junit] found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2013-03-04 07:06:59,107 - INFO  

[jira] [Commented] (ZOOKEEPER-1382) Zookeeper server holds onto dead/expired session ids in the watch data structures

2013-03-03 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592041#comment-13592041
 ] 

Mahadev konar commented on ZOOKEEPER-1382:
--

Michael,
 Would you be able to upload a patch for trunk as well?

 Zookeeper server holds onto dead/expired session ids in the watch data 
 structures
 -

 Key: ZOOKEEPER-1382
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1382
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.5
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Critical
 Fix For: 3.4.6

 Attachments: ZOOKEEPER-1382_3.3.4.patch, 
 ZOOKEEPER-1382-branch-3.4.patch


 I've observed that zookeeper server holds onto expired session ids in the 
 watcher data structures. The result is the wchp command reports session ids 
 that cannot be found through cons/dump and those expired session ids sit 
 there maybe until the server is restarted. Here are snippets from the client 
 and the server logs that lead to this state, for one particular session id 
 0x134485fd7bcb26f -
 There are 4 servers in the zookeeper cluster - 223, 224, 225 (leader), 226 
 and I'm using ZkClient to connect to the cluster
 From the application log -
 application.log.2012-01-26-325.gz:2012/01/26 04:56:36.177 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application Session establishment complete 
 on server 223.prod/172.17.135.38:12913, sessionid = 0x134485fd7bcb26f, 
 negotiated timeout = 6000
 application.log.2012-01-27.gz:2012/01/27 09:52:37.714 INFO [ClientCnxn] 
 [main-SendThread(223.prod:12913)] [application] Client session timed out, 
 have not heard from server in 9827ms for sessionid 0x134485fd7bcb26f, closing 
 socket connection and attempting reconnect
 application.log.2012-01-27.gz:2012/01/27 09:52:38.191 INFO [ClientCnxn] 
 [main-SendThread(226.prod:12913)] [application] Unable to reconnect to 
 ZooKeeper service, session 0x134485fd7bcb26f has expired, closing socket 
 connection
 On the leader zk, 225 -
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [SessionTracker:ZooKeeperServer@314] - Expiring session 0x134485fd7bcb26f, 
 timeout of 6000ms exceeded
 zookeeper.log.2012-01-27-leader-225.gz:2012-01-27 09:52:34,010 - INFO  
 [ProcessThread:-1:PrepRequestProcessor@391] - Processed session termination 
 for sessionid: 0x134485fd7bcb26f
 On the server, the client was initially connected to, 223 -
 zookeeper.log.2012-01-26-223.gz:2012-01-26 04:56:36,173 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1580] - Established session 
 0x134485fd7bcb26f with negotiated timeout 6000 for client /172.17.136.82:45020
 zookeeper.log.2012-01-27-223.gz:2012-01-27 09:52:34,018 - INFO  
 [CommitProcessor:1:NIOServerCnxn@1435] - Closed socket connection for client 
 /172.17.136.82:45020 which had sessionid 0x134485fd7bcb26f
 Here are the log snippets from 226, which is the server, the client 
 reconnected to, before getting session expired event -
 2012-01-27 09:52:38,190 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client 
 attempting to renew session 0x134485fd7bcb26f at /172.17.136.82:49367
 2012-01-27 09:52:38,191 - INFO  
 [QuorumPeer:/0.0.0.0:12913:NIOServerCnxn@1573] - Invalid session 
 0x134485fd7bcb26f for client /172.17.136.82:49367, probably expired
 2012-01-27 09:52:38,191 - INFO  
 [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed 
 socket connection for client /172.17.136.82:49367 which had sessionid 
 0x134485fd7bcb26f
 wchp output from 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*wchp* | wc -l
 3
 wchp output from 223, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*wchp* | wc -l
 0
 cons output from 223 and 226, taken on 01/30 -
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *226.*cons* | wc -l
 0
 nnarkhed-ld:zk-cons-wchp-2012013000 nnarkhed$ grep 0x134485fd7bcb26f 
 *223.*cons* | wc -l
 0
 So, what seems to have happened is that the client was able to re-register 
 the watches on the new server (226), after it got disconnected from 223, 
 inspite of having an expired session id. 
 In NIOServerCnxn, I saw that after suspecting that a session is expired, a 
 server removes the cnxn and its watches from its internal data structures. 
 But before that it allows more requests to be processed even if the session 
 is expired -
 // Now that the session is ready we can start receiving packets
 synchronized (this.factory) {
 sk.selector().wakeup();
 enableRecv();
 }
 } catch 

[jira] [Commented] (ZOOKEEPER-1551) Observer ignore txns that comes after snapshot and UPTODATE

2013-03-03 Thread Mahadev konar (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13592043#comment-13592043
 ] 

Mahadev konar commented on ZOOKEEPER-1551:
--

[~fpj] would you be able to review the latest patch?

 Observer ignore txns that comes after snapshot and UPTODATE 
 

 Key: ZOOKEEPER-1551
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1551
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum, server
Affects Versions: 3.4.3
Reporter: Thawan Kooburat
Assignee: Thawan Kooburat
Priority: Blocker
 Fix For: 3.5.0, 3.4.6

 Attachments: ZOOKEEPER-1551.patch, ZOOKEEPER-1551.patch


 In Learner.java, txns which comes after the learner has taken the snapshot 
 (after NEWLEADER packet) are stored in packetsNotCommitted. The follower has 
 special logic to apply these txns at the end of syncWithLeader() method. 
 However, the observer will ignore these txns completely, causing data 
 inconsistency. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (ZOOKEEPER-1657) Increased CPU usage by unnecessary SASL checks

2013-03-03 Thread Mahadev konar (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mahadev konar updated ZOOKEEPER-1657:
-

Fix Version/s: 3.4.6
   3.5.0

 Increased CPU usage by unnecessary SASL checks
 --

 Key: ZOOKEEPER-1657
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1657
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.4.5
Reporter: Gunnar Wagenknecht
  Labels: performance
 Fix For: 3.5.0, 3.4.6

 Attachments: ZOOKEEPER-1657.patch, ZOOKEEPER-1657.patch, 
 ZOOKEEPER-1657.patch, zookeeper-hotspot.png


 I did some profiling in one of our Java environments and found an interesting 
 footprint in ZooKeeper. The SASL support seems to trigger a lot times on the 
 client although it's not even in use.
 Is there a switch to disable SASL completely?
 The attached screenshot shows a 10-minute profiling session on one of our 
 production Jetty servers. The Jetty server handles ~1k web requests per 
 minute. The average response time per web request is a few milli seconds. The 
 profiling was performed on a machine running for 24h. 
 We noticed a significant CPU increase on our servers when deploying an update 
 from ZooKeeper 3.3.2 to ZooKeeper 3.4.5. Thus, we started investigating. The 
 screenshot shows that only 32% CPU time are spent in Jetty. In contrast, 65% 
 are spend in ZooKeeper. 
 A few notes/thoughts:
 * {{ClientCnxn$SendThread.clientTunneledAuthenticationInProgress}} seems to 
 be the culprit
 * {{javax.security.auth.login.Configuration.getConfiguration}} seems to be 
 called very often?
 * There is quite a bit reflection involved in 
 {{java.security.AccessController.doPrivileged}}
 * No security manager is active in the JVM: I tend to place an if-check in 
 the code before calling {{AccessController.doPrivileged}}. When no SM is 
 installed, the runnable can be called directly which safes cycles.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira