ZooKeeper-trunk-solaris - Build # 898 - Still Failing

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-solaris/898/

###
## LAST 60 LINES OF THE CONSOLE 
###
Started by timer
Building remotely on solaris1 (Solaris) in workspace 
/export/home/hudson/hudson-slave/workspace/ZooKeeper-trunk-solaris
FATAL: hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected 
termination of the channel
hudson.remoting.RequestAbortedException: 
hudson.remoting.RequestAbortedException: java.io.IOException: Unexpected 
termination of the channel
at 
hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:41)
at 
hudson.remoting.RequestAbortedException.wrapForRethrow(RequestAbortedException.java:34)
at hudson.remoting.Request.call(Request.java:174)
at hudson.remoting.Channel.call(Channel.java:739)
at hudson.EnvVars.getRemote(EnvVars.java:404)
at hudson.model.Computer.getEnvironment(Computer.java:912)
at 
jenkins.model.CoreEnvironmentContributor.buildEnvironmentFor(CoreEnvironmentContributor.java:29)
at hudson.model.Run.getEnvironment(Run.java:2221)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:874)
at hudson.scm.SubversionSCM.checkout(SubversionSCM.java:866)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1251)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:604)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at 
hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:513)
at hudson.model.Run.execute(Run.java:1706)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:88)
at hudson.model.Executor.run(Executor.java:231)
Caused by: hudson.remoting.RequestAbortedException: java.io.IOException: 
Unexpected termination of the channel
at hudson.remoting.Request.abort(Request.java:299)
at hudson.remoting.Channel.terminate(Channel.java:802)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:69)
Caused by: java.io.IOException: Unexpected termination of the channel
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:50)
Caused by: java.io.EOFException
at 
java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2328)
at 
java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2797)
at 
java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:802)
at java.io.ObjectInputStream.&lt;init&gt;(ObjectInputStream.java:299)
at 
hudson.remoting.ObjectInputStreamEx.&lt;init&gt;(ObjectInputStreamEx.java:40)
at 
hudson.remoting.AbstractSynchronousByteArrayCommandTransport.read(AbstractSynchronousByteArrayCommandTransport.java:34)
at 
hudson.remoting.SynchronousCommandTransport$ReaderThread.run(SynchronousCommandTransport.java:48)



###
## FAILED TESTS (if any) 
##
No tests ran.

ZooKeeper-3.4-WinVS2008_java - Build # 492 - Failure

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-3.4-WinVS2008_java/492/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 188733 lines...]
[junit] 2014-05-20 10:30:26,834 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2014-05-20 10:30:26,834 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2014-05-20 10:30:26,835 [myid:] - INFO  
[main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2014-05-20 10:30:26,836 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2014-05-20 10:30:26,836 [myid:] - INFO  [main:ZooKeeperServer@162] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
f:\hudson\hudson-slave\workspace\ZooKeeper-3.4-WinVS2008_java\branch-3.4\build\test\tmp\test1982126372498440347.junit.dir\version-2
 snapdir 
f:\hudson\hudson-slave\workspace\ZooKeeper-3.4-WinVS2008_java\branch-3.4\build\test\tmp\test1982126372498440347.junit.dir\version-2
[junit] 2014-05-20 10:30:26,839 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-05-20 10:30:26,839 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - 
Accepted socket connection from /127.0.0.1:51465
[junit] 2014-05-20 10:30:26,840 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:51465
[junit] 2014-05-20 10:30:26,840 [myid:] - INFO  
[Thread-4:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2014-05-20 10:30:26,840 [myid:] - INFO  
[Thread-4:NIOServerCnxn@1007] - Closed socket connection for client 
/127.0.0.1:51465 (no session established for client)
[junit] 2014-05-20 10:30:26,841 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2014-05-20 10:30:26,842 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2014-05-20 10:30:26,842 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2014-05-20 10:30:26,842 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2014-05-20 10:30:26,843 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2014-05-20 10:30:26,843 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@55] - Memory used 9305
[junit] 2014-05-20 10:30:26,843 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@60] - Number of threads 21
[junit] 2014-05-20 10:30:26,843 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@65] - FINISHED TEST METHOD testQuota
[junit] 2014-05-20 10:30:26,843 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2014-05-20 10:30:27,000 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
[junit] 2014-05-20 10:30:27,000 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited loop!
[junit] 2014-05-20 10:30:27,123 [myid:] - INFO  
[main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@852] - Socket 
connection established to 127.0.0.1/127.0.0.1:11221, initiating session
[junit] 2014-05-20 10:30:27,123 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@197] - 
Accepted socket connection from /127.0.0.1:51446
[junit] 2014-05-20 10:30:27,123 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:ZooKeeperServer@861] - Client 
attempting to renew session 0x14618f7b1ac at /127.0.0.1:51446
[junit] 2014-05-20 10:30:27,124 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:ZooKeeperServer@617] - Established 
session 0x14618f7b1ac with negotiated timeout 3 for client 
/127.0.0.1:51446
[junit] 2014-05-20 10:30:27,124 [myid:] - INFO  
[main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@1235] - Session 
establishment complete on server 127.0.0.1/127.0.0.1:11221, sessionid = 
0x14618f7b1ac, negotiated timeout = 3
[junit] 2014-05-20 10:30:27,125 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@494] - Processed session termination for 
sessionid: 0x14618f7b1ac
[junit] 2014-05-20 10:30:27,125 [myid:] - INFO  
[SyncThread:0:FileTxnLog@199] - Creating new log file: log.c
[junit] 2014-05-20 10:30:27,128 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x14618f7b1ac closed
[junit] 2014-05-20 10:30:27,128 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2014-05-20 10:30:27,128 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@512] - EventThread shut down
[junit] 2014-05-20 10:30:27,128 [myid:] - 

[jira] [Commented] (ZOOKEEPER-1576) Zookeeper cluster - failed to connect to cluster if one of the provided IPs causes java.net.UnknownHostException

2014-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003010#comment-14003010
 ] 

Hadoop QA commented on ZOOKEEPER-1576:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12645651/ZOOKEEPER-1576.patch
  against trunk revision 1595561.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//console

This message is automatically generated.

 Zookeeper cluster - failed to connect to cluster if one of the provided IPs 
 causes java.net.UnknownHostException
 

 Key: ZOOKEEPER-1576
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1576
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.5.0
 Environment: Three 3.4.3 zookeeper servers in cluster, linux.
Reporter: Tally Tsabary
Assignee: Edward Ribeiro
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1576-3.4.patch, ZOOKEEPER-1576.3.patch, 
 ZOOKEEPER-1576.4.patch, ZOOKEEPER-1576.5.patch, ZOOKEEPER-1576.patch


 Using a cluster of three 3.4.3 zookeeper servers.
 All the servers are up, but on the client machine, the firewall is blocking 
 one of the  servers.
 The following exception occurs, and the client does not connect to any 
 of the other cluster members.
 The exception: Nov 02, 2012 9:54:32 PM 
 com.netflix.curator.framework.imps.CuratorFrameworkImpl logError
 SEVERE: Background exception was not retry-able or retry gave up
 java.net.UnknownHostException: scnrmq003.myworkday.com
 at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
 at java.net.InetAddress$1.lookupAllHostAddr(Unknown Source)
 at java.net.InetAddress.getAddressesFromNameService(Unknown Source)
 at java.net.InetAddress.getAllByName0(Unknown Source)
 at java.net.InetAddress.getAllByName(Unknown Source)
 at java.net.InetAddress.getAllByName(Unknown Source)
 at 
 org.apache.zookeeper.client.StaticHostProvider.&lt;init&gt;(StaticHostProvider.java:60)
 at org.apache.zookeeper.ZooKeeper.&lt;init&gt;(ZooKeeper.java:440)
 at org.apache.zookeeper.ZooKeeper.&lt;init&gt;(ZooKeeper.java:375)
 The code at 
 org.apache.zookeeper.client.StaticHostProvider.&lt;init&gt;(StaticHostProvider.java:60)
 is:
 public StaticHostProvider(Collection&lt;InetSocketAddress&gt; serverAddresses)
         throws UnknownHostException {
     for (InetSocketAddress address : serverAddresses) {
         InetAddress resolvedAddresses[] = InetAddress.getAllByName(address
                 .getHostName());
         for (InetAddress resolvedAddress : resolvedAddresses) {
             this.serverAddresses.add(new InetSocketAddress(resolvedAddress
                     .getHostAddress(), address.getPort()));
         }
     }
     ...
 The for-loop does not try to resolve the rest of the servers on the list if 
 InetAddress.getAllByName(address.getHostName()) throws an 
 UnknownHostException, and that failure aborts the client connection creation.
 I was expecting the connection to be created for the other members of the 
 cluster.
 Also, InetAddress resolution is a blocking call, and if it takes a very long 
 time (longer than the defined timeout), that too should allow us to continue 
 and try to connect to the other servers on the list.
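
 The tolerant behavior the report asks for can be sketched by catching the
 per-host failure and continuing. This is only an illustration, not the actual
 patch: the Resolver interface and resolveAvailable helper below are
 hypothetical, with the lookup injected so the sketch stays deterministic.

```java
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class TolerantResolution {
    // Hypothetical hook so the lookup can be faked in tests; a real
    // implementation would delegate to InetAddress.getAllByName().
    interface Resolver {
        String[] resolve(String hostName) throws UnknownHostException;
    }

    // Resolve every address we can; skip hosts that fail instead of
    // letting one UnknownHostException abort the whole list.
    static List<InetSocketAddress> resolveAvailable(
            Collection<InetSocketAddress> servers, Resolver resolver)
            throws UnknownHostException {
        List<InetSocketAddress> resolved = new ArrayList<>();
        for (InetSocketAddress address : servers) {
            try {
                for (String ip : resolver.resolve(address.getHostName())) {
                    resolved.add(new InetSocketAddress(ip, address.getPort()));
                }
            } catch (UnknownHostException e) {
                // Keep going: the other servers may still be reachable.
            }
        }
        if (resolved.isEmpty()) {
            throw new UnknownHostException("no server could be resolved");
        }
        return resolved;
    }
}
```

 With this shape, one unresolvable host no longer prevents connecting to the
 rest of the ensemble; the exception is only raised when every host fails.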
 Assuming this is fixed and we get a connection to the currently available 
 servers, I think ZooKeeper should keep retrying the not-yet-connected server 
 of the cluster, so it will be able to use it later when it is back.
 If one of the servers on the list is not available during connection 
 creation, then it should be retried every x time despite the fact that we



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Failed: ZOOKEEPER-1576 PreCommit Build #2104

2014-05-20 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1576
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 238462 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12645651/ZOOKEEPER-1576.patch
 [exec]   against trunk revision 1595561.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2104//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 409b4acd7120f3f3994d6191119a983cc5acee7e logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1696:
 exec returned: 1

Total time: 39 minutes 12 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1576
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown

2014-05-20 Thread Grzegorz Grzybek (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003013#comment-14003013
 ] 

Grzegorz Grzybek commented on ZOOKEEPER-1459:
-

Is this really fixed? I don't see the change here: 
http://svn.apache.org/viewvc/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java?view=markup...

regards
Grzegorz Grzybek

 Standalone ZooKeeperServer is not closing the transaction log files on 
 shutdown
 ---

 Key: ZOOKEEPER-1459
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1459-branch-3_4.patch, 
 ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch


 When shutting down the standalone ZK server, it only clears the zkdatabase 
 and does not close the transaction log streams. When the unit tests then try 
 to delete the temporary files on Windows, they fail.
 ZooKeeperServer.java
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
 }
 {noformat}
 Suggestion: close the zkDb as follows; this in turn will take care of the 
 transaction logs:
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
     try {
         zkDb.close();
     } catch (IOException ie) {
         LOG.warn("Error closing logs", ie);
     }
 }
 {noformat}
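
 The suggested ordering (clear the database, then close it, swallowing only
 the IOException) can be sketched with placeholder types. ZkDb below is a
 stand-in for ZooKeeper's ZKDatabase, not the real class; it just records
 which calls happened.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ShutdownSketch {
    // Stand-in for ZKDatabase: records the order of calls.
    static class ZkDb {
        final List<String> calls = new ArrayList<>();
        void clear() { calls.add("clear"); }
        void close() throws IOException { calls.add("close"); }
    }

    // The suggested shutdown step: clear the in-memory database, then
    // close it so the transaction log streams are released. An
    // IOException on close is logged (here: printed) but not rethrown,
    // so shutdown still completes.
    static void shutdown(ZkDb zkDb) {
        if (zkDb != null) {
            zkDb.clear();
            try {
                zkDb.close();
            } catch (IOException ie) {
                System.err.println("Error closing logs: " + ie);
            }
        }
    }
}
```

 On Windows the open log file handles are exactly what blocks deleting the
 test temp directories, so the close() here is what lets tearDown succeed.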



--
This message was sent by Atlassian JIRA
(v6.2#6252)


ZooKeeper-trunk-WinVS2008_java - Build # 736 - Failure

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-WinVS2008_java/736/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 267892 lines...]
[junit] 2014-05-20 10:52:01,237 [myid:] - INFO  
[main:NIOServerCnxnFactory@683] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2014-05-20 10:52:01,238 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2014-05-20 10:52:01,238 [myid:] - INFO  [main:ZooKeeperServer@766] 
- minSessionTimeout set to 6000
[junit] 2014-05-20 10:52:01,238 [myid:] - INFO  [main:ZooKeeperServer@775] 
- maxSessionTimeout set to 6
[junit] 2014-05-20 10:52:01,238 [myid:] - INFO  [main:ZooKeeperServer@149] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008_java\trunk\build\test\tmp\test8389610335436682810.junit.dir\version-2
 snapdir 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008_java\trunk\build\test\tmp\test8389610335436682810.junit.dir\version-2
[junit] 2014-05-20 10:52:01,240 [myid:] - INFO  [main:FileSnap@83] - 
Reading snapshot 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008_java\trunk\build\test\tmp\test8389610335436682810.junit.dir\version-2\snapshot.b
[junit] 2014-05-20 10:52:01,241 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
f:\hudson\hudson-slave\workspace\ZooKeeper-trunk-WinVS2008_java\trunk\build\test\tmp\test8389610335436682810.junit.dir\version-2\snapshot.b
[junit] 2014-05-20 10:52:01,243 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2014-05-20 10:52:01,244 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:57398
[junit] 2014-05-20 10:52:01,245 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@835] - Processing stat command from 
/127.0.0.1:57398
[junit] 2014-05-20 10:52:01,245 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@684] - Stat command output
[junit] 2014-05-20 10:52:01,246 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for client 
/127.0.0.1:57398 (no session established for client)
[junit] 2014-05-20 10:52:01,246 [myid:] - INFO  [main:JMXEnv@224] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2014-05-20 10:52:01,248 [myid:] - INFO  [main:JMXEnv@241] - 
expect:InMemoryDataTree
[junit] 2014-05-20 10:52:01,248 [myid:] - INFO  [main:JMXEnv@245] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1,name1=InMemoryDataTree
[junit] 2014-05-20 10:52:01,248 [myid:] - INFO  [main:JMXEnv@241] - 
expect:StandaloneServer_port
[junit] 2014-05-20 10:52:01,248 [myid:] - INFO  [main:JMXEnv@245] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port-1
[junit] 2014-05-20 10:52:01,249 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@55] - Memory used 13339
[junit] 2014-05-20 10:52:01,249 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@60] - Number of threads 23
[junit] 2014-05-20 10:52:01,249 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@65] - FINISHED TEST METHOD testQuota
[junit] 2014-05-20 10:52:01,249 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2014-05-20 10:52:01,476 [myid:] - INFO  
[main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@963] - Socket 
connection established to 127.0.0.1/127.0.0.1:11221, initiating session
[junit] 2014-05-20 10:52:01,478 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:57393
[junit] 2014-05-20 10:52:01,479 [myid:] - INFO  
[NIOWorkerThread-2:ZooKeeperServer@858] - Client attempting to renew session 
0x146190b66c9 at /127.0.0.1:57393
[junit] 2014-05-20 10:52:01,480 [myid:] - INFO  
[NIOWorkerThread-2:ZooKeeperServer@604] - Established session 0x146190b66c9 
with negotiated timeout 3 for client /127.0.0.1:57393
[junit] 2014-05-20 10:52:01,480 [myid:] - INFO  
[main-SendThread(127.0.0.1:11221):ClientCnxn$SendThread@1346] - Session 
establishment complete on server 127.0.0.1/127.0.0.1:11221, sessionid = 
0x146190b66c9, negotiated timeout = 3
[junit] 2014-05-20 10:52:01,481 [myid:] - INFO  [ProcessThread(sid:0 
cport:-1)::PrepRequestProcessor@685] - Processed session termination for 
sessionid: 0x146190b66c9
[junit] 2014-05-20 10:52:01,482 [myid:] - INFO  
[SyncThread:0:FileTxnLog@200] - Creating new log file: log.c
[junit] 2014-05-20 10:52:01,503 [myid:] - INFO  [main:ZooKeeper@968] - 
Session: 0x146190b66c9 closed
[junit] 2014-05-20 10:52:01,503 

ZooKeeper-trunk - Build # 2311 - Still Failing

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk/2311/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 238605 lines...]
 [exec] Zookeeper_simpleSystem::testIPV6 : elapsed 1026 : OK
 [exec] Zookeeper_simpleSystem::testCreate : elapsed 1009 : OK
 [exec] Zookeeper_simpleSystem::testPath : elapsed 1018 : OK
 [exec] Zookeeper_simpleSystem::testPathValidation : elapsed 1036 : OK
 [exec] Zookeeper_simpleSystem::testPing : elapsed 17351 : OK
 [exec] Zookeeper_simpleSystem::testAcl : elapsed 1013 : OK
 [exec] Zookeeper_simpleSystem::testChroot : elapsed 3060 : OK
 [exec] Zookeeper_simpleSystem::testAuth ZooKeeper server started ZooKeeper 
server started : elapsed 30567 : OK
 [exec] Zookeeper_simpleSystem::testHangingClient : elapsed 1025 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithGlobal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
14870 : OK
 [exec] Zookeeper_simpleSystem::testWatcherAutoResetWithLocal ZooKeeper 
server started ZooKeeper server started ZooKeeper server started : elapsed 
15908 : OK
 [exec] Zookeeper_simpleSystem::testGetChildren2 : elapsed 1031 : OK
 [exec] Zookeeper_simpleSystem::testLastZxid : elapsed 4538 : OK
 [exec] Zookeeper_simpleSystem::testRemoveWatchers ZooKeeper server started 
: elapsed 4349 : OK
 [exec] Zookeeper_watchers::testDefaultSessionWatcher1 : elapsed 51 : OK
 [exec] Zookeeper_watchers::testDefaultSessionWatcher2 : elapsed 4 : OK
 [exec] Zookeeper_watchers::testObjectSessionWatcher1 : elapsed 53 : OK
 [exec] Zookeeper_watchers::testObjectSessionWatcher2 : elapsed 55 : OK
 [exec] Zookeeper_watchers::testNodeWatcher1 : assertion : elapsed 1033
 [exec] Zookeeper_watchers::testChildWatcher1 : elapsed 54 : OK
 [exec] Zookeeper_watchers::testChildWatcher2 : elapsed 54 : OK
 [exec] 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/src/c/tests/TestWatchers.cc:667:
 Assertion: assertion failed [Expression: ensureCondition( 
deliveryTracker.deliveryCounterEquals(2),1000)&lt;=1000]
 [exec] Failures !!!
 [exec] Run: 71   Failure total: 1   Failures: 1   Errors: 0
 [exec] FAIL: zktest-mt
 [exec] ==
 [exec] 1 of 2 tests failed
 [exec] Please report to u...@zookeeper.apache.org
 [exec] ==
 [exec] make[1]: *** [check-TESTS] Error 1
 [exec] make[1]: Leaving directory 
`/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build/test/test-cppunit'
 [exec] make: *** [check-am] Error 2

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1426: The 
following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1386: The 
following error occurred while executing this line:
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk/trunk/build.xml:1396: 
exec returned: 2

Total time: 37 minutes 21 seconds
Build step 'Execute shell' marked build as failure
[FINDBUGS] Skipping publisher since build result is FAILURE
[WARNINGS] Skipping publisher since build result is FAILURE
Archiving artifacts
Recording fingerprints
Updating ZOOKEEPER-657
Updating ZOOKEEPER-1891
Updating ZOOKEEPER-1864
Updating ZOOKEEPER-1895
Updating ZOOKEEPER-1214
Updating ZOOKEEPER-1797
Updating ZOOKEEPER-1923
Updating ZOOKEEPER-1836
Updating ZOOKEEPER-1791
Updating ZOOKEEPER-1062
Updating ZOOKEEPER-1926
Recording test results
Publishing Javadoc
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
All tests passed

[jira] [Commented] (ZOOKEEPER-1891) StaticHostProviderTest.testUpdateLoadBalancing times out

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003083#comment-14003083
 ] 

Hudson commented on ZOOKEEPER-1891:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1891. StaticHostProviderTest.testUpdateLoadBalancing times out (Michi 
Mutsuzaki via rakeshr) (rakeshr: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593682)
* /zookeeper/trunk/CHANGES.txt
* 
/zookeeper/trunk/src/java/main/org/apache/zookeeper/client/StaticHostProvider.java


 StaticHostProviderTest.testUpdateLoadBalancing times out
 

 Key: ZOOKEEPER-1891
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1891
 Project: ZooKeeper
  Issue Type: Bug
  Components: java client
Affects Versions: 3.5.0
 Environment: ubuntu 13.10
 Server environment:java.version=1.7.0_51
 Server environment:java.vendor=Oracle Corporation
Reporter: Michi Mutsuzaki
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: StaticHostProviderTest.log, ZOOKEEPER-1891.patch, 
 ZOOKEEPER-1891.patch


 StaticHostProviderTest.testUpdateLoadBalancing is consistently timing out on 
 my box. I'll attach a log file.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1895) update all notice files, copyright, etc... with the new year - 2014

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003085#comment-14003085
 ] 

Hudson commented on ZOOKEEPER-1895:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1895. update all notice files, copyright, etc... with the new year - 
2014 (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595273)
* /zookeeper/trunk/NOTICE.txt


 update all notice files, copyright, etc... with the new year - 2014
 ---

 Key: ZOOKEEPER-1895
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1895
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.4.7, 3.5.0
Reporter: Patrick Hunt
Assignee: Michi Mutsuzaki
Priority: Blocker
 Fix For: 3.4.7, 3.5.0

 Attachments: ZOOKEEPER-1895.patch


 From a note on the list:
 Hi folks!
 This is a reminder to update the year in the NOTICE files from 2013 (or 
 older) to 2014.
 From a legal POV this is not that important as some say.
 But nonetheless it's good to update the year.
 LieGrue,
 strub



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-657) Cut down the running time of ZKDatabase corruption.

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003082#comment-14003082
 ] 

Hudson commented on ZOOKEEPER-657:
--

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-657. Cut down the running time of ZKDatabase corruption (Michi 
Mutsuzaki via rakeshr) (rakeshr: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1594755)
* /zookeeper/trunk/CHANGES.txt
* 
/zookeeper/trunk/src/java/test/org/apache/zookeeper/test/ZkDatabaseCorruptionTest.java


 Cut down the running time of ZKDatabase corruption.
 ---

 Key: ZOOKEEPER-657
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-657
 Project: ZooKeeper
  Issue Type: Improvement
  Components: tests
Reporter: Mahadev konar
Assignee: Michi Mutsuzaki
 Fix For: 3.4.7, 3.5.0

 Attachments: ZOOKEEPER-657.patch


 The ZKDatabase corruption test takes around 180 seconds right now. It just 
 brings a quorum cluster up and down and corrupts some snapshots. We need 
 to investigate why it takes that long and make it shorter so that our test 
 run times are smaller.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1864) quorumVerifier is null when creating a QuorumPeerConfig from parsing a Properties object

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003084#comment-14003084
 ] 

Hudson commented on ZOOKEEPER-1864:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1864. quorumVerifier is null when creating a QuorumPeerConfig from 
parsing a Properties object (Michi Mutsuzaki via rakeshr) (rakeshr: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595443)
* /zookeeper/trunk/CHANGES.txt
* 
/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerConfig.java


 quorumVerifier is null when creating a QuorumPeerConfig from parsing a 
 Properties object
 

 Key: ZOOKEEPER-1864
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1864
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Reporter: some one
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: BackwardsCompatCheck.patch, ZOOKEEPER-1864.patch


 This bug was found when using ZK 3.5.0 with curator-test 2.3.0.
 curator-test is building a QuorumPeerConfig from a Properties object and then 
 when we try to run the quorum peer using that configuration, we get an NPE:
 {noformat}
 2014-01-19 21:58:39,768 [myid:] - ERROR 
 [Thread-3:TestingZooKeeperServer$1@138] - From testing server (random state: 
 false)
 java.lang.NullPointerException
   at 
 org.apache.zookeeper.server.quorum.QuorumPeer.setQuorumVerifier(QuorumPeer.java:1320)
   at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:156)
   at 
 org.apache.curator.test.TestingZooKeeperServer$1.run(TestingZooKeeperServer.java:134)
   at java.lang.Thread.run(Thread.java:722)
 {noformat}
 The reason this happens is that QuorumPeerConfig:parseProperties only 
 performs a subset of what 'QuorumPeerConfig:parse(String path)' does. The 
 additional step we need in parseProperties is the dynamic config backwards 
 compatibility check:
 {noformat}
 // backward compatibility - dynamic configuration in the same
 // file as static configuration params
 // see writeDynamicConfig() - we change the config file to new
 // format if reconfig happens
 if (dynamicConfigFileStr == null) {
     configBackwardCompatibilityMode = true;
     configFileStr = path;
     parseDynamicConfig(cfg, electionAlg, true);
     checkValidity();
 }
 {noformat}
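
 The missing branch can be sketched with a toy properties-based parse. This is
 a deliberately minimal illustration, not QuorumPeerConfig's real API: the
 Config class and parseProperties below are hypothetical stand-ins showing
 only the fallback that initializes the quorum verifier when no
 dynamicConfigFile property is present.

```java
import java.util.Properties;

public class ParseSketch {
    // Minimal stand-in for QuorumPeerConfig's parsed state; the real
    // class has many more fields.
    static class Config {
        boolean backwardCompatibilityMode;
        boolean quorumVerifierInitialized;
    }

    // Hypothetical parseProperties that, like parse(String path), falls
    // back to treating the static config as dynamic config when no
    // dynamicConfigFile property is present - so the quorum verifier
    // gets built instead of staying null.
    static Config parseProperties(Properties props) {
        Config cfg = new Config();
        String dynamicConfigFileStr = props.getProperty("dynamicConfigFile");
        if (dynamicConfigFileStr == null) {
            cfg.backwardCompatibilityMode = true;
            // In the real code this is parseDynamicConfig(...) plus
            // checkValidity(), which build the QuorumVerifier from the
            // server.N entries of the static config.
            cfg.quorumVerifierInitialized = true;
        }
        return cfg;
    }
}
```

 With the fallback in place, runFromConfig no longer sees a null verifier and
 the NPE in setQuorumVerifier goes away.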



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1214) QuorumPeer should unregister only its previsously registered MBeans instead of use MBeanRegistry.unregisterAll() method.

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003086#comment-14003086
 ] 

Hudson commented on ZOOKEEPER-1214:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1214. QuorumPeer should unregister only its previously registered 
MBeans instead of using MBeanRegistry.unregisterAll() method. (César Álvarez 
Núñez via michim) (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595561)
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/java/main/org/apache/zookeeper/jmx/MBeanRegistry.java
* 
/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeer.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumUtil.java
* /zookeeper/trunk/src/java/test/org/apache/zookeeper/test/QuorumUtilTest.java


 QuorumPeer should unregister only its previously registered MBeans instead 
 of using MBeanRegistry.unregisterAll() method.
 

 Key: ZOOKEEPER-1214
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1214
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Reporter: César Álvarez Núñez
Assignee: César Álvarez Núñez
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1214.2.patch, ZOOKEEPER-1214.3.patch, 
 ZOOKEEPER-1214.patch, ZOOKEEPER-1214.patch, ZOOKEEPER-1214.patch, 
 ZOOKEEPER-1214.patch


 When a QuorumPeer thread dies, it unregisters *all* ZKMBeanInfo MBeans 
 previously registered in its Java process, including those that were not 
 registered by itself.
 This has no side effect in a production environment, where each server 
 runs in a separate Java process; but it fails when using 
 org.apache.zookeeper.test.QuorumUtil to programmatically start up a 
 ZooKeeper server ensemble and use its provided methods to force Disconnected, 
 SyncConnected or SessionExpired events for basic/functional testing.
 Scenario:
 * QuorumUtil qU = new QuorumUtil(1); // Creates a 3-server ensemble.
 * qU.startAll(); // Starts all servers: 1 leader + 2 followers.
 * qU.shutdown(i); // i is a number from 1 to 3; shuts down one server.
 The last call causes a QuorumPeer to die, invoking the 
 MBeanRegistry.unregisterAll() method.
 As a result, *all* ZKMBeanInfo MBeans are unregistered, including those 
 belonging to the other QuorumPeer instances.
 When trying to restart the previous server (qU.restart(i)), an AssertionError 
 is thrown in the MBeanRegistry.register(ZKMBeanInfo bean, ZKMBeanInfo parent) 
 method, causing the QuorumPeer thread to die.
 To solve it:
 * The MBeanRegistry.unregisterAll() method has been removed.
 * QuorumPeer unregisters only its own ZKMBeanInfo MBeans.
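A minimal sketch of the per-peer bookkeeping this fix implies (hypothetical names, not the actual MBeanRegistry API): each peer remembers which beans it registered and removes only those on shutdown, so other peers in the same process are untouched.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical per-peer bookkeeping: the shared 'server' set models the
// in-process MBean server; each peer removes only the beans it added,
// instead of a global unregisterAll().
class PeerBeans {
    private final Set<String> owned = new HashSet<>();

    void register(Set<String> server, String beanName) {
        server.add(beanName);
        owned.add(beanName); // remember for targeted cleanup
    }

    void unregisterOwn(Set<String> server) {
        server.removeAll(owned); // only this peer's beans are removed
        owned.clear();
    }
}
```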



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1926) Unit tests should only use build/test/data for data

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003092#comment-14003092
 ] 

Hudson commented on ZOOKEEPER-1926:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1926. Unit tests should only use build/test/data for data (Enis 
Soztutar via michim) (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593624
* /zookeeper/trunk/CHANGES.txt
* 
/zookeeper/trunk/src/java/systest/org/apache/zookeeper/test/system/BaseSysTest.java
* 
/zookeeper/trunk/src/java/systest/org/apache/zookeeper/test/system/QuorumPeerInstance.java
* 
/zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/LearnerTest.java
* 
/zookeeper/trunk/src/java/test/org/apache/zookeeper/server/quorum/Zab1_0Test.java


 Unit tests should only use build/test/data for data
 ---

 Key: ZOOKEEPER-1926
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1926
 Project: ZooKeeper
  Issue Type: Bug
  Components: tests
Reporter: Enis Soztutar
Assignee: Enis Soztutar
 Fix For: 3.4.7, 3.5.0

 Attachments: zookeeper-1926_v1-branch-3.4.patch, 
 zookeeper-1926_v1.patch


 Some of the unit tests create temp files under the system tmp dir (/tmp) 
 and put data there.
 We should keep all temporary data from unit tests under 
 build/test/data, so that 'ant clean' removes all data from previous runs.
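The pattern the issue asks for can be sketched with a small helper (the property name and helper are illustrative, not the actual patch): route test temp directories under a project-local base directory instead of the system tmp dir.

```java
import java.io.File;
import java.io.IOException;

// Illustrative helper: create test temp dirs under build/test/data so that
// 'ant clean' (which deletes build/) wipes leftovers from previous runs.
class TestDirs {
    static File baseDir() {
        // "test.data.dir" is a hypothetical override property for illustration
        File dir = new File(System.getProperty("test.data.dir", "build/test/data"));
        dir.mkdirs();
        return dir;
    }

    static File createTempDir() throws IOException {
        // Create a unique name under the base dir, then turn it into a directory.
        File f = File.createTempFile("zktest", "", baseDir());
        if (!f.delete() || !f.mkdir()) {
            throw new IOException("could not create temp dir " + f);
        }
        return f;
    }
}
```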



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1836) addrvec_next() fails to set next parameter if addrvec_hasnext() returns false

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003089#comment-14003089
 ] 

Hudson commented on ZOOKEEPER-1836:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1836. addrvec_next() fails to set next parameter if addrvec_hasnext() 
returns false (Dutch T. Meyer via michim) (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595038
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/c/src/addrvec.c


 addrvec_next() fails to set next parameter if addrvec_hasnext() returns false
 -

 Key: ZOOKEEPER-1836
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1836
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Reporter: Dutch T. Meyer
Assignee: Dutch T. Meyer
Priority: Trivial
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1836.patch, ZOOKEEPER-1836.patch


 There is a relatively innocuous but useless pointer assignment in
 addrvec_next():
 195   void addrvec_next(addrvec_t *avec, struct sockaddr_storage *next)
 
 203   if (!addrvec_hasnext(avec))
 204   {
 205   next = NULL;
 206   return;
 The assignment on line 205 has no effect, as next is a local copy of the 
 pointer that is lost upon function return. It should likely be a memset to 
 zero out the structure the actual parameter points to.
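The distinction is the classic out-parameter pitfall. A Java analogy (illustrative only; the real fix is in the C client): reassigning a parameter changes only the callee's local copy, while mutating the object it refers to is visible to the caller.

```java
import java.util.Arrays;

// Analogy for the C bug in addrvec_next(): 'next = NULL' only changes the
// callee's local copy of the pointer, like reassign() below; zeroing the
// pointed-to memory (memset in C) is what the caller can actually observe.
class OutParamDemo {
    static void reassign(int[] next) {
        next = null;          // invisible to the caller
    }

    static void zeroOut(int[] next) {
        Arrays.fill(next, 0); // visible to the caller, like memset()
    }
}
```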



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1923) A typo in zookeeperStarted document

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003088#comment-14003088
 ] 

Hudson commented on ZOOKEEPER-1923:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1923. A typo in zookeeperStarted document (Chengwei Yang via michim) 
(michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1593428
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/docs/src/documentation/content/xdocs/zookeeperStarted.xml


 A typo in zookeeperStarted document
 ---

 Key: ZOOKEEPER-1923
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1923
 Project: ZooKeeper
  Issue Type: Bug
  Components: documentation
Affects Versions: 3.4.6
 Environment: The trunk branch
Reporter: Chengwei Yang
Assignee: Chengwei Yang
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1923.patch


 There is a typo in the document zookeeperStarted.*, see 
 http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html: in the section 
 *Connecting to ZooKeeper*, the *help* output shows *createpath*, which should 
 be *create path*.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1062) Net-ZooKeeper: Net::ZooKeeper consumes 100% cpu on wait

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003091#comment-14003091
 ] 

Hudson commented on ZOOKEEPER-1062:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1062. Net-ZooKeeper: Net::ZooKeeper consumes 100% cpu on wait (Botond 
Hejj via michim) (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595374
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/src/contrib/zkperl/ZooKeeper.xs


 Net-ZooKeeper: Net::ZooKeeper consumes 100% cpu on wait
 ---

 Key: ZOOKEEPER-1062
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1062
 Project: ZooKeeper
  Issue Type: Bug
  Components: contrib-bindings
Affects Versions: 3.3.1, 3.4.5, 3.4.6
Reporter: Patrick Hunt
Assignee: Botond Hejj
  Labels: patch
 Fix For: 3.4.7, 3.5.0

 Attachments: ZOOKEEPER-1062.patch, ZOOKEEPER-1062.patch


 Reported by a user on the CDH user list (the user reports that the listed fix 
 addressed the issue for him): 
 Net::ZooKeeper consumes 100% CPU when wait is used. On initial 
 inspection, it seems to be related to a mistake in the use of 
 pthread_cond_timedwait.
 https://rt.cpan.org/Public/Bug/Display.html?id=61290



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1791) ZooKeeper package includes unnecessary jars that are part of the package.

2014-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003090#comment-14003090
 ] 

Hudson commented on ZOOKEEPER-1791:
---

FAILURE: Integrated in ZooKeeper-trunk #2311 (See 
[https://builds.apache.org/job/ZooKeeper-trunk/2311/])
ZOOKEEPER-1791. ZooKeeper package includes unnecessary jars that are part of 
the package. (mahadev via michim) (michim: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1595559
* /zookeeper/trunk/CHANGES.txt
* /zookeeper/trunk/ivy.xml
* /zookeeper/trunk/src/contrib/build.xml


 ZooKeeper package includes unnecessary jars that are part of the package.
 -

 Key: ZOOKEEPER-1791
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1791
 Project: ZooKeeper
  Issue Type: Bug
  Components: build
Affects Versions: 3.5.0
Reporter: Mahadev konar
Assignee: Mahadev konar
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1791.patch


 ZooKeeper package includes unnecessary jars that are part of the package.
 Packages like fatjar and 
 {code}
 maven-ant-tasks-2.1.3.jar
 maven-artifact-2.2.1.jar
 maven-artifact-manager-2.2.1.jar
 maven-error-diagnostics-2.2.1.jar
 maven-model-2.2.1.jar
 maven-plugin-registry-2.2.1.jar
 maven-profile-2.2.1.jar
 maven-project-2.2.1.jar
 maven-repository-metadata-2.2.1.jar
 {code}
 are part of the zookeeper package and rpm (via bigtop). 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown

2014-05-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003098#comment-14003098
 ] 

Rakesh R commented on ZOOKEEPER-1459:
-

Hi [~gzres], 
I think you are checking the wrong file. Please see the file below to 
understand more:
https://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/ZooKeeperServerMain.java

 Standalone ZooKeeperServer is not closing the transaction log files on 
 shutdown
 ---

 Key: ZOOKEEPER-1459
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1459-branch-3_4.patch, 
 ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch


 When shutting down the standalone ZK server, it only clears the zkdatabase 
 and does not close the transaction log streams. As a result, deleting the 
 temporary files in unit tests on Windows fails.
 ZooKeeperServer.java
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
 }
 {noformat}
 Suggested fix: close the zkDb as follows; this in turn takes care of the 
 transaction logs:
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
     try {
         zkDb.close();
     } catch (IOException ie) {
         LOG.warn("Error closing logs", ie);
     }
 }
 {noformat}
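The suggested shutdown order can be exercised with a self-contained stand-in (DbLike is hypothetical; the real class is ZKDatabase): clear the in-memory state first, then close, logging rather than rethrowing an IOException.

```java
import java.io.Closeable;
import java.io.IOException;

// Stand-in for ZKDatabase to illustrate the suggested shutdown order:
// clear() drops in-memory state, close() releases the transaction log streams.
class DbLike implements Closeable {
    boolean cleared, closed;
    void clear() { cleared = true; }
    @Override public void close() throws IOException { closed = true; }
}

class ServerShutdown {
    DbLike zkDb = new DbLike();

    void shutdown() {
        if (zkDb != null) {
            zkDb.clear();
            try {
                zkDb.close(); // also closes the underlying log files
            } catch (IOException ie) {
                System.err.println("Error closing logs: " + ie); // log, don't rethrow
            }
        }
    }
}
```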



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown

2014-05-20 Thread Grzegorz Grzybek (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003106#comment-14003106
 ] 

Grzegorz Grzybek commented on ZOOKEEPER-1459:
-

But shouldn't {{ZooKeeperServer}}'s {{shutdown}} do the same? We use 
{{ZooKeeperServer}} here: 
https://github.com/grgrzybek/fabric8/blob/master/fabric/fabric-zookeeper/src/main/java/io/fabric8/zookeeper/bootstrap/ZooKeeperServerFactory.java#L176

 Standalone ZooKeeperServer is not closing the transaction log files on 
 shutdown
 ---

 Key: ZOOKEEPER-1459
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1459-branch-3_4.patch, 
 ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch


 When shutting down the standalone ZK server, it only clears the zkdatabase 
 and does not close the transaction log streams. As a result, deleting the 
 temporary files in unit tests on Windows fails.
 ZooKeeperServer.java
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
 }
 {noformat}
 Suggested fix: close the zkDb as follows; this in turn takes care of the 
 transaction logs:
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
     try {
         zkDb.close();
     } catch (IOException ie) {
         LOG.warn("Error closing logs", ie);
     }
 }
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1459) Standalone ZooKeeperServer is not closing the transaction log files on shutdown

2014-05-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003326#comment-14003326
 ] 

Rakesh R commented on ZOOKEEPER-1459:
-

This change would cause the following exception; I haven't found the reason 
yet. Please run the test ReadOnlyModeTest#testReadOnlyClient and see it.

{code}
2014-05-20 18:02:39,766 [myid:] - ERROR 
[SyncThread:1:ZooKeeperCriticalThread@47] - Severe unrecoverable error, from 
thread : SyncThread:1
java.nio.channels.ClosedChannelException
at sun.nio.ch.FileChannelImpl.ensureOpen(FileChannelImpl.java:88)
at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:243)
at 
org.apache.zookeeper.server.persistence.Util.padLogFile(Util.java:215)
at 
org.apache.zookeeper.server.persistence.FileTxnLog.padFile(FileTxnLog.java:239)
at 
org.apache.zookeeper.server.persistence.FileTxnLog.append(FileTxnLog.java:217)
at 
org.apache.zookeeper.server.persistence.FileTxnSnapLog.append(FileTxnSnapLog.java:372)
at org.apache.zookeeper.server.ZKDatabase.append(ZKDatabase.java:542)
at 
org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:122)
{code}

I've gone through your code. Since you have a handle to the FileTxnSnapLog 
ftxn, please call ftxn.close() after server#shutdown() in your code. Could you 
use ZooKeeperServerMain#initializeAndRun() and ZooKeeperServerMain#shutdown() 
for embedding the standalone server?

I feel ZOOKEEPER-1072 could be addressed to define the interfaces clearly for 
the users.



 Standalone ZooKeeperServer is not closing the transaction log files on 
 shutdown
 ---

 Key: ZOOKEEPER-1459
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1459
 Project: ZooKeeper
  Issue Type: Sub-task
  Components: server
Affects Versions: 3.4.0
Reporter: Rakesh R
Assignee: Rakesh R
 Fix For: 3.4.6, 3.5.0

 Attachments: ZOOKEEPER-1459-branch-3_4.patch, 
 ZOOKEEPER-1459-branch-3_4.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, 
 ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch, ZOOKEEPER-1459.patch


 When shutting down the standalone ZK server, it only clears the zkdatabase 
 and does not close the transaction log streams. As a result, deleting the 
 temporary files in unit tests on Windows fails.
 ZooKeeperServer.java
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
 }
 {noformat}
 Suggested fix: close the zkDb as follows; this in turn takes care of the 
 transaction logs:
 {noformat}
 if (zkDb != null) {
     zkDb.clear();
     try {
         zkDb.close();
     } catch (IOException ie) {
         LOG.warn("Error closing logs", ie);
     }
 }
 {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1659) Add JMX support for dynamic reconfiguration

2014-05-20 Thread Rakesh R (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003426#comment-14003426
 ] 

Rakesh R commented on ZOOKEEPER-1659:
-

After unregistering, if anyone (e.g. a monitoring tool) queries the 
attribute value, the following exception occurs:
{code}
javax.management.InstanceNotFoundException: 
org.apache.ZooKeeperService:name0=ReplicatedServer_id3,name1=replica.3,name2=Leader,name3=InMemoryDataTree
at 
com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
{code}
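This failure mode is easy to reproduce against the platform MBean server without any ZooKeeper code: querying a name that is not (or no longer) registered throws the same exception.

```java
import java.lang.management.ManagementFactory;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Minimal reproduction of what a monitoring tool would hit:
// getAttribute() on an unregistered ObjectName throws InstanceNotFoundException.
class JmxQueryDemo {
    static boolean queryUnregistered() throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName(
                "org.apache.ZooKeeperService:name0=ReplicatedServer_id3");
        try {
            mbs.getAttribute(name, "Version");
            return false; // not reached: the bean was never registered
        } catch (InstanceNotFoundException expected) {
            return true;
        }
    }
}
```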

 Add JMX support for dynamic reconfiguration
 ---

 Key: ZOOKEEPER-1659
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1659
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.5.0
Reporter: Alexander Shraer
Assignee: Rakesh R
Priority: Blocker
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1659.patch


 We need to update JMX during reconfigurations. Currently, reconfiguration 
 changes are not reflected in JConsole.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1659) Add JMX support for dynamic reconfiguration

2014-05-20 Thread Otis Gospodnetic (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1659?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003644#comment-14003644
 ] 

Otis Gospodnetic commented on ZOOKEEPER-1659:
-

+1 for making sure changes are backwards-compatible.  We [monitor ZooKeeper 
with SPM|http://sematext.com/spm/] and would love to be able to use the same 
agent for multiple/all ZK versions instead of having ZK version-specific agents.

 Add JMX support for dynamic reconfiguration
 ---

 Key: ZOOKEEPER-1659
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1659
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.5.0
Reporter: Alexander Shraer
Assignee: Rakesh R
Priority: Blocker
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1659.patch


 We need to update JMX during reconfigurations. Currently, reconfiguration 
 changes are not reflected in JConsole.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2014-05-20 Thread Michi Mutsuzaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michi Mutsuzaki updated ZOOKEEPER-1621:
---

Attachment: ZOOKEEPER-1621.patch

 ZooKeeper does not recover from crash when disk was full
 

 Key: ZOOKEEPER-1621
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3
 Environment: Ubuntu 12.04, Amazon EC2 instance
Reporter: David Arthur
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1621.patch, zookeeper.log.gz


 The disk that ZooKeeper was using filled up. During a snapshot write, I got 
 the following exception:
 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
 Severe unrecoverable error, exiting
 java.io.IOException: No space left on device
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at 
 java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
 at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
 Then many subsequent exceptions like:
 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
 partial.
 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
 exception, exiting abnormally
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:504)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
 at 
 org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
 at 
 org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
 It seems to me that writing the transaction log should be fully atomic to 
 avoid such situations. Is this not the case?
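The write is not atomic at the disk level, so a crash or full disk can leave a partial record at the tail of the log; recovery has to stop at it rather than fail. A self-contained sketch of that tolerance for a simple length-prefixed log (illustration only; the real FileTxnLog format also carries checksums and a header):

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Read length-prefixed records; a truncated record at the tail ends the
// scan instead of aborting recovery.
class TailTolerantReader {
    static List<byte[]> readRecords(byte[] log) {
        List<byte[]> out = new ArrayList<>();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(log));
        try {
            while (in.available() > 0) {
                int len = in.readInt();
                byte[] rec = new byte[len];
                in.readFully(rec);
                out.add(rec);
            }
        } catch (EOFException e) {
            // partial trailing record after a crash/full disk: keep what we have
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```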



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Survey on Project Conventions

2014-05-20 Thread Martin Brandtner
Hello

My name is Martin Brandtner [1] and I’m a software engineering researcher
at the University of Zurich, Switzerland.
Together with Philipp Leitner [2], I currently work on an approach to
detect violations of project conventions based on data from the source code
repository, the issue tracker (e.g. Jira), and the build system (e.g.
Jenkins).

One example for such a project convention is: “You need to make sure that
the commit message contains at least the name of the contributor and
ideally a reference to the Bugzilla or JIRA issue where the patch was
submitted.” [3]

The idea is that our approach can detect violations of such conventions
automatically and thereby support the development process.

First of all we need conventions and that’s why we ask you to take part in
our survey. In the survey, we present five conventions and want you to rate
their relevance in your Apache project. Everybody contributing to your
Apache project can take part in this survey because we also want to see if
different roles may have different opinions about a convention.
The survey is totally anonymous and it will take about 15 minutes to answer
it.

We would be happy if you could fill out our survey under:
http://ww3.unipark.de/uc/SEAL_Research/1abe/ before May 30, 2014.

With the data collected in this survey we will implement convention
violation detection in our tool, SQA-Timeline [4]. If you are
interested in our work, contact us via email or provide your email address
in the survey.

Best regards,
Martin and Philipp

[1] http://www.ifi.uzh.ch/seal/people/brandtner.html
[2] http://www.ifi.uzh.ch/seal/people/leitner.html
[3] http://www.apache.org/dev/committers.html#applying-patches
[4] https://www.youtube.com/watch?v=ZIsOODUapAE


Review Request 21732: ZOOKEEPER-1621

2014-05-20 Thread michi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/21732/
---

Review request for zookeeper.


Repository: zookeeper


Description
---

Modify FileTxnIterator to skip a transaction log file (instead of throwing an 
IOException) if the header is incomplete.
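A sketch of the check the change implies (hypothetical helper; not the actual FileTxnIterator code). The FileHeader is magic, version, and dbid; if the stream ends before a full header is read, the file is treated as skippable rather than fatal:

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

// Hypothetical helper: returns true only if the file contains a complete
// 16-byte header (magic int + version int + dbid long, as in FileHeader).
// An incomplete header marks the file as skippable instead of fatal.
class HeaderCheck {
    static boolean hasCompleteHeader(byte[] file) {
        try (DataInputStream in =
                 new DataInputStream(new ByteArrayInputStream(file))) {
            in.readInt();  // magic
            in.readInt();  // version
            in.readLong(); // dbid
            return true;
        } catch (EOFException truncated) {
            return false;  // caller skips this log file
        } catch (IOException e) {
            return false;
        }
    }
}
```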


Diffs
-

  
http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/main/org/apache/zookeeper/server/persistence/FileTxnLog.java
 1596402 
  
http://svn.apache.org/repos/asf/zookeeper/trunk/src/java/test/org/apache/zookeeper/test/LoadFromLogTest.java
 1596402 

Diff: https://reviews.apache.org/r/21732/diff/


Testing
---

Added 2 testcases.


Thanks,

michim



[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2014-05-20 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003956#comment-14003956
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1621:


https://reviews.apache.org/r/21732/

 ZooKeeper does not recover from crash when disk was full
 

 Key: ZOOKEEPER-1621
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3
 Environment: Ubuntu 12.04, Amazon EC2 instance
Reporter: David Arthur
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1621.patch, zookeeper.log.gz


 The disk that ZooKeeper was using filled up. During a snapshot write, I got 
 the following exception:
 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
 Severe unrecoverable error, exiting
 java.io.IOException: No space left on device
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at 
 java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
 at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
 Then many subsequent exceptions like:
 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
 partial.
 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
 exception, exiting abnormally
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:504)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
 at 
 org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
 at 
 org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
 It seems to me that writing the transaction log should be fully atomic to 
 avoid such situations. Is this not the case?



--
This message was sent by Atlassian JIRA
(v6.2#6252)


Failed: ZOOKEEPER-1621 PreCommit Build #2105

2014-05-20 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 243609 lines...]
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12645856/ZOOKEEPER-1621.patch
 [exec]   against trunk revision 1596284.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 3 new or 
modified tests.
 [exec] 
 [exec] +1 javadoc.  The javadoc tool did not generate any warning 
messages.
 [exec] 
 [exec] +1 javac.  The applied patch does not increase the total number 
of javac compiler warnings.
 [exec] 
 [exec] +1 findbugs.  The patch does not introduce any new Findbugs 
(version 1.3.9) warnings.
 [exec] 
 [exec] +1 release audit.  The applied patch does not increase the 
total number of release audit warnings.
 [exec] 
 [exec] -1 core tests.  The patch failed core unit tests.
 [exec] 
 [exec] +1 contrib tests.  The patch passed contrib unit tests.
 [exec] 
 [exec] Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//testReport/
 [exec] Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 2a6536ea1944bd1c1c757f8d35b0e21fc3637390 logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1696:
 exec returned: 1

Total time: 36 minutes 35 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Description set: ZOOKEEPER-1621
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.zookeeper.server.quorum.StandaloneDisabledTest.startSingleServerTest

Error Message:
client could not connect to reestablished quorum: giving up after 30+ seconds.

Stack Trace:
junit.framework.AssertionFailedError: client could not connect to reestablished 
quorum: giving up after 30+ seconds.
at 
org.apache.zookeeper.test.ReconfigTest.testNormalOperation(ReconfigTest.java:153)
at 
org.apache.zookeeper.server.quorum.StandaloneDisabledTest.startSingleServerTest(StandaloneDisabledTest.java:75)
at 
org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)




[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2014-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14003996#comment-14003996
 ] 

Hadoop QA commented on ZOOKEEPER-1621:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12645856/ZOOKEEPER-1621.patch
  against trunk revision 1596284.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed core unit tests.

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2105//console

This message is automatically generated.

 ZooKeeper does not recover from crash when disk was full
 

 Key: ZOOKEEPER-1621
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3
 Environment: Ubuntu 12.04, Amazon EC2 instance
Reporter: David Arthur
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1621.patch, zookeeper.log.gz


 The disk that ZooKeeper was using filled up. During a snapshot write, I got 
 the following exception
 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
 Severe unrecoverable error, exiting
 java.io.IOException: No space left on device
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at 
 java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
 at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
 Then many subsequent exceptions like:
 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
 partial.
 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
 exception, exiting abnormally
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:504)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
 at 
 org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
 at 
 org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
 

[jira] [Commented] (ZOOKEEPER-1699) Leader should timeout and give up leadership when losing quorum of last proposed configuration

2014-05-20 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004011#comment-14004011
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1699:


Sounds good, I'm checking this in.

 Leader should timeout and give up leadership when losing quorum of last 
 proposed configuration
 --

 Key: ZOOKEEPER-1699
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1699
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.5.0
Reporter: Alexander Shraer
Assignee: Alexander Shraer
Priority: Blocker
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1699-draft.patch, ZOOKEEPER-1699-draft.patch, 
 ZOOKEEPER-1699-v1.patch, ZOOKEEPER-1699-v2.patch, ZOOKEEPER-1699-v3.patch, 
 ZOOKEEPER-1699-v4.patch, ZOOKEEPER-1699-v4.patch, ZOOKEEPER-1699-v5.patch, 
 ZOOKEEPER-1699.patch


 A leader gives up leadership when losing a quorum of the current 
 configuration.
 This doesn't take into account any proposed configuration. So, if
 a reconfig operation is in progress and a quorum of the new configuration is 
 not
 responsive, the leader will just get stuck waiting for it to ACK the reconfig 
 operation, and will never time out. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Resolved] (ZOOKEEPER-1699) Leader should timeout and give up leadership when losing quorum of last proposed configuration

2014-05-20 Thread Michi Mutsuzaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michi Mutsuzaki resolved ZOOKEEPER-1699.


Resolution: Fixed

trunk: http://svn.apache.org/viewvc?view=revision&revision=1596422



[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2014-05-20 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004109#comment-14004109
 ] 

Alexander Shraer commented on ZOOKEEPER-1621:
-

Here's a different option: intuitively, once ZooKeeper fails to write to disk, 
continuing to operate normally violates its promise to users (namely, that if a 
majority acked, the data is always there even across reboots). Once we realize 
the promise can't be kept, it may be better to crash the server at that point 
and violate liveness (no availability) rather than continue and risk coming up 
with a partial log at a later point, violating safety (inconsistent state, lost 
transactions, etc.).


 ZooKeeper does not recover from crash when disk was full
 

 Key: ZOOKEEPER-1621
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1621
 Project: ZooKeeper
  Issue Type: Bug
  Components: server
Affects Versions: 3.4.3
 Environment: Ubuntu 12.04, Amazon EC2 instance
Reporter: David Arthur
Assignee: Michi Mutsuzaki
 Fix For: 3.5.0

 Attachments: ZOOKEEPER-1621.patch, zookeeper.log.gz


 The disk that ZooKeeper was using filled up. During a snapshot write, I got 
 the following exception
 2013-01-16 03:11:14,098 - ERROR [SyncThread:0:SyncRequestProcessor@151] - 
 Severe unrecoverable error, exiting
 java.io.IOException: No space left on device
 at java.io.FileOutputStream.writeBytes(Native Method)
 at java.io.FileOutputStream.write(FileOutputStream.java:282)
 at 
 java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
 at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:123)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:309)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:306)
 at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:484)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:162)
 at 
 org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:101)
 Then many subsequent exceptions like:
 2013-01-16 15:02:23,984 - ERROR [main:Util@239] - Last transaction was 
 partial.
 2013-01-16 15:02:23,985 - ERROR [main:ZooKeeperServerMain@63] - Unexpected 
 exception, exiting abnormally
 java.io.EOFException
 at java.io.DataInputStream.readInt(DataInputStream.java:375)
 at 
 org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
 at 
 org.apache.zookeeper.server.persistence.FileHeader.deserialize(FileHeader.java:64)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.inStreamCreated(FileTxnLog.java:558)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.createInputArchive(FileTxnLog.java:577)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.goToNextLog(FileTxnLog.java:543)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.next(FileTxnLog.java:625)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:529)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog$FileTxnIterator.init(FileTxnLog.java:504)
 at 
 org.apache.zookeeper.server.persistence.FileTxnLog.read(FileTxnLog.java:341)
 at 
 org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:130)
 at 
 org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:259)
 at 
 org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:386)
 at 
 org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:138)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:112)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86)
 at 
 org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116)
 at 
 org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
 It seems to me that writing the transaction log should be fully atomic to 
 avoid such situations. Is this not the case?
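The atomicity the reporter is asking about is usually achieved with the write-temp/fsync/rename pattern: a reader never observes a half-written file, because the final rename is atomic. The sketch below shows the general technique only; it is not ZooKeeper's actual snapshot or log code, and the class and method names are illustrative.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the write-temp/fsync/rename pattern (illustrative, not
// ZooKeeper's real persistence code): the data either appears whole
// under the target name, or not at all.
public class AtomicWrite {
    public static void writeAtomically(Path target, byte[] data) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        try (FileOutputStream out = new FileOutputStream(tmp.toFile())) {
            out.write(data);
            out.getFD().sync();  // force bytes to disk before the rename
        }
        // Atomic on POSIX filesystems: readers see either the old or new file.
        Files.move(tmp, target,
                   StandardCopyOption.ATOMIC_MOVE,
                   StandardCopyOption.REPLACE_EXISTING);
    }
}
```

A disk-full failure in this scheme leaves behind only a `.tmp` file, which can be discarded on restart, rather than a truncated log under the live name.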



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (ZOOKEEPER-1621) ZooKeeper does not recover from crash when disk was full

2014-05-20 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004117#comment-14004117
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1621:


I'm fine with Alex's suggestion. We should document how to manually recover 
when the server doesn't start because the log file doesn't contain the complete 
header.
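The manual recovery Michi mentions amounts to setting aside any transaction log file too short to hold its header. A minimal sketch, assuming the 16-byte FileHeader layout (int magic, int version, long dbid) and the `log.*` naming convention; this is a hypothetical operator helper, not an official ZooKeeper tool, and the names are illustrative:

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical helper (not part of ZooKeeper): quarantine txn log files
// too short to contain a complete 16-byte FileHeader, which otherwise
// make the server fail to start after a disk-full crash.
public class TruncatedLogCheck {
    // Assumption: header = int magic + int version + long dbid = 16 bytes.
    static final int HEADER_BYTES = 16;

    public static void quarantineTruncated(File logDir) throws IOException {
        File[] logs = logDir.listFiles((dir, name) -> name.startsWith("log."));
        if (logs == null) {
            return;
        }
        for (File log : logs) {
            if (log.length() < HEADER_BYTES) {
                // Move aside rather than delete, so the operator can inspect it.
                Files.move(log.toPath(),
                           log.toPath().resolveSibling(log.getName() + ".truncated"));
            }
        }
    }

    public static void main(String[] args) throws IOException {
        quarantineTruncated(new File(args[0]));
    }
}
```

Run against the dataDir's version-2 directory while the server is stopped; logs with a complete header but a partial trailing record are handled by the server's own recovery and should be left alone.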



ZooKeeper-trunk-jdk8 - Build # 24 - Still Failing

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-jdk8/24/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 1441 lines...]
compile_jute:
[mkdir] Created dir: 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/java/generated
[mkdir] Created dir: 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/c/generated
 [java] ../../zookeeper.jute Parsed Successfully
 [java] ../../zookeeper.jute Parsed Successfully
[touch] Creating 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/java/generated/.generated

ver-gen:
[javac] Compiling 1 source file to 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/build/classes
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.5
[javac] warning: [options] source value 1.5 is obsolete and will be removed 
in a future release
[javac] warning: [options] To suppress warnings about obsolete options, use 
-Xlint:-options.
[javac] 3 warnings

svn-revision:
[mkdir] Created dir: 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/.revision

version-info:

process-template:

build-generated:
[javac] Compiling 60 source files to 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/build/classes
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.5
[javac] warning: [options] source value 1.5 is obsolete and will be removed 
in a future release
[javac] warning: [options] To suppress warnings about obsolete options, use 
-Xlint:-options.
[javac] 3 warnings

compile:
[javac] Compiling 185 source files to 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/build/classes
[javac] warning: [options] bootstrap class path not set in conjunction with 
-source 1.5
[javac] warning: [options] source value 1.5 is obsolete and will be removed 
in a future release
[javac] warning: [options] To suppress warnings about obsolete options, use 
-Xlint:-options.
[javac] 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java:65:
 error: cannot find symbol
[javac] static public class Proposal  extends SyncedLearnerTracker {
[javac]   ^
[javac]   symbol:   class SyncedLearnerTracker
[javac]   location: class Leader
[javac] 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/java/main/org/apache/zookeeper/jmx/ManagedUtil.java:62:
 warning: [rawtypes] found raw type: Enumeration
[javac] Enumeration enumer = r.getCurrentLoggers();
[javac] ^
[javac]   missing type arguments for generic class Enumeration<E>
[javac]   where E is a type-variable:
[javac] E extends Object declared in interface Enumeration
[javac] 
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/src/java/main/org/apache/zookeeper/server/quorum/Leader.java:69:
 error: method does not override or implement a method from a supertype
[javac] @Override
[javac] ^

BUILD FAILED
/home/hudson/jenkins-slave/workspace/ZooKeeper-trunk-jdk8/trunk/build.xml:436: 
Compile failed; see the compiler error output for details.

Total time: 10 seconds
Build step 'Execute shell' marked build as failure
[locks-and-latches] Releasing all the locks
[locks-and-latches] All the locks released
[WARNINGS] Skipping publisher since build result is FAILURE
Recording test results
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

Build failed in Jenkins: bookkeeper-trunk #643

2014-05-20 Thread Apache Jenkins Server
See https://builds.apache.org/job/bookkeeper-trunk/643/

--
[...truncated 529 lines...]
---
 T E S T S
---

---
 T E S T S
---

Results :

Tests run: 0, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.3.1:jar (default-jar) @ bookkeeper-stats-api ---
[INFO] Building jar: 
https://builds.apache.org/job/bookkeeper-trunk/ws/bookkeeper-stats/target/bookkeeper-stats-api-4.3.0-SNAPSHOT.jar
[INFO] 
[INFO]  findbugs-maven-plugin:2.5.2:check (default-cli) @ 
bookkeeper-stats-api 
[INFO] 
[INFO] --- findbugs-maven-plugin:2.5.2:findbugs (findbugs) @ 
bookkeeper-stats-api ---
[INFO] Fork Value is true
[INFO] Done FindBugs Analysis
[INFO] 
[INFO]  findbugs-maven-plugin:2.5.2:check (default-cli) @ 
bookkeeper-stats-api 
[INFO] 
[INFO] --- findbugs-maven-plugin:2.5.2:check (default-cli) @ 
bookkeeper-stats-api ---
[INFO] BugInstance size is 0
[INFO] Error size is 0
[INFO] No errors/warnings found
[INFO] 
[INFO] 
[INFO] Building bookkeeper-server 4.3.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ bookkeeper-server ---
[INFO] Deleting 
https://builds.apache.org/job/bookkeeper-trunk/ws/bookkeeper-server (includes 
= [dependency-reduced-pom.xml], excludes = [])
[INFO] 
[INFO] --- apache-rat-plugin:0.7:check (default-cli) @ bookkeeper-server ---
[INFO] Exclude: **/DataFormats.java
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.1:process (default) @ 
bookkeeper-server ---
[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:resources (default-resources) @ 
bookkeeper-server ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 3 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.0:compile (default-compile) @ 
bookkeeper-server ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 174 source files to 
https://builds.apache.org/job/bookkeeper-trunk/ws/bookkeeper-server/target/classes
[INFO] 
[INFO] --- maven-resources-plugin:2.4.3:testResources (default-testResources) @ 
bookkeeper-server ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.0:testCompile (default-testCompile) @ 
bookkeeper-server ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 84 source files to 
https://builds.apache.org/job/bookkeeper-trunk/ws/bookkeeper-server/target/test-classes
[INFO] 
[INFO] --- maven-surefire-plugin:2.9:test (default-test) @ bookkeeper-server ---
[INFO] Surefire report directory: 
https://builds.apache.org/job/bookkeeper-trunk/ws/bookkeeper-server/target/surefire-reports

---
 T E S T S
---

---
 T E S T S
---
Running org.apache.bookkeeper.client.SlowBookieTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.553 sec
Running org.apache.bookkeeper.client.ListLedgersTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.266 sec
Running org.apache.bookkeeper.client.BookieRecoveryTest
Tests run: 72, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 35.856 sec
Running org.apache.bookkeeper.client.TestReadTimeout
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 16.688 sec
Running org.apache.bookkeeper.client.LedgerRecoveryTest
Tests run: 18, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.037 sec
Running org.apache.bookkeeper.client.BookKeeperTest
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.38 sec
Running org.apache.bookkeeper.client.RoundRobinDistributionScheduleTest
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.118 sec
Running org.apache.bookkeeper.client.BookKeeperCloseTest
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.355 sec
Running org.apache.bookkeeper.client.TestFencing
Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.005 sec
Running org.apache.bookkeeper.client.TestLedgerChecker
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.378 sec
Running org.apache.bookkeeper.client.TestRackawareEnsemblePlacementPolicy
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.245 sec
Running org.apache.bookkeeper.client.LedgerCloseTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.381 sec
Running org.apache.bookkeeper.client.TestSpeculativeRead
Tests 

Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Rakesh R


 On April 23, 2014, 10:22 a.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 33
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line33
 
 This should certainly not be an enum. Otherwise we need to bump the 
 protocol version each time we add an error code.
 
 Imagine the scenario where both server and client are running 4.3.0. 
 Then the server is upgraded to 4.3.1, which has a new error EPRINTERONFIRE. 
 It sends this to the client, which throws a decode error.
 
 Sijie Guo wrote:
 how would it not being an enum help here? if it is an integer, the client 
 still has no idea how to interpret it, so it is still an invalid response 
 for a 4.3.0 client. I thought we reached an agreement on enum on the ticket, no?
 
 Ivan Kelly wrote:
 So for version and operationtype, enum is ok. These originate at the 
 client, so if the servers are always upgraded before the clients, there are no 
 interoperability issues. Status codes originate at the server though, so it 
 is possible for the server to send a status code that is unrecognised by a 
 client. The normal way to handle this would be an else or default: branch to 
 pass this up to the client as a BKException.UnexpectedConditionException. If 
 it's an enum, this will throw a decode exception in the netty decoder, which 
 is harder to handle.
 
 To resolve this on the server side, by checking the version and only 
 sending errors valid for that version, implies two things. Firstly, every 
 error code change will require the version to be bumped and secondly, that 
 there will need to be a list maintained for which errors are valid for each 
 version. This goes against the motivation for using protobuf in the first 
 place.
 
 Sijie Guo wrote:
 this is an application-level agreement, no? it doesn't matter whether you are 
 using a protobuf protocol or the current protocol, and it also doesn't 
 matter whether you are using an integer or an enum. in any case, the best way 
 is as you described: you shouldn't send a new status code back to an old 
 client, as the new status code is meaningless to the old client.
 
 Ivan Kelly wrote:
 but how do you know it's an old client? Only by bumping the version number 
 each time you add an error code. In which case you end up with a whole lot of 
 junk like if (client.version == X) { send A } else if (client.version == Y) 
 { send B } else if (client.version ... which is exactly what protobuf was 
 designed to avoid (see A bit of history on 
 https://developers.google.com/protocol-buffers/docs/overview).
 
 Sijie Guo wrote:
 an else or default branch would make the behavior unpredictable, as an old 
 client is treating a new status code as some kind of unknown. as you said, 
 you want to treat them as UnexpectedConditionException. But what does 
 UnexpectedConditionException mean? doesn't it mean the server already breaks 
 backward compatibility, since the server couldn't satisfy the old client's 
 request?
 
 so still, if the server wants to be backward compatible with clients, in any 
 case it needs to know what version of the protocol the client is speaking 
 and handle it accordingly, not just let clients do their job in an 
 unexpected way.
 
 I don't see any elegant solution without detecting the protocol version. if 
 you have one, please describe how not being an enum would avoid this.
 
 Ivan Kelly wrote:
 the default behaviour for an unknown error code is something we already 
 use today.
 
 https://github.com/apache/bookkeeper/blob/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java#L714
 
 The client only needs to know that the request failed. the point of the 
 different error codes is so that the client could take specific recovery 
 steps. the default behaviour is just to pass the error up.
 
 Sijie Guo wrote:
 the default behavior was there just for already-known status codes, 
 but that doesn't mean it is correct for any unknown status code. and when you 
 are saying 'the client only needs to know that the request failed', you are 
 making the assumption that there is only one status code indicating OK and any 
 other status code should be taken as failed. but that isn't true. say in an old 
 protocol we supported range reads, responding with OK, list of entry 
 responses (0 = data, 1 = missing, 2 = missing, 3 = data). if we are going 
 to improve our protocol to make communication more efficient, we might 
 change the protocol to get rid of transferring missing entries, responding 
 with PARTIAL_OK, list of existing entries (0 = data, 3 = data). 
 
 in this case, if the server doesn't distinguish the client's protocol and just 
 responds to every range read with PARTIAL_OK, it would indeed break 
 compatibility with the old protocol, as the old protocol treats it as a 
 failure by default. in order to maintain backward compatibility, the server 
 needs to detect the 
 client's protocol and responds 
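Ivan's default-branch handling, with the wire status carried as a plain integer rather than a protobuf enum, can be sketched as below. An old client maps codes it does not recognise to a generic error instead of failing inside the decoder. The constants and names here are illustrative, not the real BookKeeper status codes:

```java
// Sketch of default-branch status handling (illustrative names, not the
// real BookKeeper constants): unknown wire codes from a newer server are
// surfaced as a generic error instead of a decode failure.
public class StatusDecode {
    static final int EOK = 0;
    static final int ENOLEDGER = 1;
    static final int EIO = 2;

    // Map a raw wire status to a client-visible error name.
    public static String toClientError(int wireStatus) {
        switch (wireStatus) {
            case EOK:       return "OK";
            case ENOLEDGER: return "NoSuchLedger";
            case EIO:       return "IOError";
            default:
                // A newer server sent a code this client predates:
                // pass it up as a generic unexpected-condition failure.
                return "UnexpectedCondition";
        }
    }
}
```

With a proto2 enum field, by contrast, an unknown value is rejected during message parsing, before application code like the switch above ever runs, which is the decode error Ivan describes.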

Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Rakesh R


 On April 24, 2014, 12:19 p.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 81
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line81
 
  ledgerId and entryId should be optional in all requests. It may be the 
  case, that how we specify them changes in the future (like when we flatten 
  the metadata), so it would be good to leave that possibility open.
 
 Sijie Guo wrote:
 for all existing read/write protocols, the ledgerId and entryId are 
 required. I am not sure how you would change the protocol by flattening the 
 metadata, but I guess that would be a pretty new protocol. if so, it would be 
 better to add a new request type, so we don't break any existing systems or 
 make things complicated.
 
 Ivan Kelly wrote:
 I'm not sure how it will change either. What I'm requesting is that the 
 protobuf protocol doesn't lock us into how we are doing it now forever.
 
 Ivan Kelly wrote:
 actually, for a concrete example, let's say we want to do a read request 
 for a whole ledger. from bookie A we request all entries, but it doesn't have 
 every 3rd entry due to striping. In that case we can request all entries from 
 the ledger with entry id modulo 3 from bookie B. In this case, what would I 
 put in the _required_ entry id field for the read to bookie B?
 
 Sijie Guo wrote:
 as I said, doesn't it sound like a new read protocol?
 
 Ivan Kelly wrote:
 it will likely go through the same codepaths though, so we'll end up with 
 a load of duplicate code. my concern with the requiredness of fields is that 
 it's so rigid that in the future we will have to add new messages to make any 
 enhancements, causing the protocol to grow into something huge, with loads of 
 redundancy, and no better than what we have now with the manual encoding.
 
 Sijie Guo wrote:
 single-entry read/write are primitives of the bookie; ledger id and entry id 
 are required for them, as they are fundamental to bookkeeper. all other 
 improvements like streaming or range reads could be built on these primitives. 
 then, if they are built on primitives, I don't see how we would end up with a 
 lot of duplicated code.
 
 Rakesh R wrote:
 As far as compatibility is concerned, making the fields optional is a 
 defensive approach and safer coding, since it avoids parsing issues later. But 
 it should be done very carefully, because the requiredness is then enforced at 
 the code level. For example, the server would have to validate whether a 
 request has both ledgerId/entryId, which of them is mandatory, and so on. On 
 the other side, with a required field, the entity is not open for expansion by 
 removing that field. But we again have options, as Sijie suggested, of 
 defining a new protocol and doing the expansion there.
 
 If we have a better way in hand to avoid code duplication, it is OK to go 
 ahead with required.
 
 Sijie Guo wrote:
 As I said, the current bookie storage is built on single read/add entry 
 primitives; there isn't any reason not to make ledger id and entry id 
 required. If you are going to change the protocol to get rid of ledger id and 
 entry id, you will have to change the bookie storage too, and then I don't 
 think there will be any code duplication.

I agree with this for add entry, but for reads 'entryId' could be optional. 
There is no real functional issue; the only concern is that this will force us 
to create a new protocol if such a requirement comes up in the future.
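
The trade-off being debated can be made concrete with a hypothetical fragment; 
field names and numbers here are illustrative sketches, not the actual 
BookkeeperProtocol.proto:

```proto
// Hypothetical sketch only -- not the real BookkeeperProtocol.proto.
// Field names and numbers are illustrative.
message ReadRequest {
  // As proposed in the patch: both ids required, matching the
  // single-entry primitive of the bookie storage.
  required int64 ledgerId = 1;
  required int64 entryId  = 2;
}

message DefensiveReadRequest {
  // The defensive alternative: optional ids, with requiredness enforced
  // by server-side validation instead of the protobuf decoder. A future
  // "whole ledger" read could then simply omit entryId.
  optional int64 ledgerId = 1;
  optional int64 entryId  = 2;
}
```

With `required`, a request missing either id fails in the decoder; with 
`optional`, it reaches the server code, which must validate it explicitly.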


- Rakesh


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17895/#review41281
---


On April 24, 2014, 7:43 a.m., Sijie Guo wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17895/
 ---
 
 (Updated April 24, 2014, 7:43 a.m.)
 
 
 Review request for bookkeeper and Ivan Kelly.
 
 
 Bugs: BOOKKEEPER-582
 https://issues.apache.org/jira/browse/BOOKKEEPER-582
 
 
 Repository: bookkeeper-git
 
 
 Description
 ---
 
 - introducing protobuf support for bookkeeper
 - for server: introduce packet processor / EnDecoder for different protocol 
 supports
 - for client: change PCBC to use protobuf to send requests
 - misc changes for protobuf support
 
 (the bookie server is able to maintain backward compatibility) 
 
 
 Diffs
 -
 
   bookkeeper-server/pom.xml ebc1198 
   
 bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/IndexInMemPageMgr.java
  56487aa 
   
 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/LedgerChecker.java
  28e23d6 
   
 bookkeeper-server/src/main/java/org/apache/bookkeeper/client/PendingReadOp.java
  fb36b90 
   
 bookkeeper-server/src/main/java/org/apache/bookkeeper/processor/RequestProcessor.java
  241f369 
   
 

[jira] [Commented] (BOOKKEEPER-758) Add TryReadLastAddConfirmed API

2014-05-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14003963#comment-14003963
 ] 

Flavio Junqueira commented on BOOKKEEPER-758:
-

But this code does not benefit at all from the fact that the callback can be 
called multiple times, no? I don't think it is necessarily a big deal, but the 
precise semantics aren't very clear, nor how applications can benefit from 
potentially multiple calls to the callback.

 Add TryReadLastAddConfirmed API
 ---

 Key: BOOKKEEPER-758
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-758
 Project: Bookkeeper
  Issue Type: Improvement
  Components: bookkeeper-client
Reporter: Sijie Guo
Assignee: Sijie Guo
 Fix For: 4.3.0

 Attachments: BOOKKEEPER-758.diff, BOOKKEEPER-758.v2.diff


 add TryReadLastConfirmed to read the last confirmed without coverage 
 checking, as readers that poll the LAC just need the LAC.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (BOOKKEEPER-751) Ensure all the bookkeeper callbacks not run under ledger handle lock

2014-05-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004022#comment-14004022
 ] 

Flavio Junqueira commented on BOOKKEEPER-751:
-

This test case failed for me; could it be related:

Failed tests:   
test10Ledgers200ThreadsRead(org.apache.bookkeeper.test.MultipleThreadReadTest): 
Test failed because we couldn't read entries

 Ensure all the bookkeeper callbacks not run under ledger handle lock
 

 Key: BOOKKEEPER-751
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-751
 Project: Bookkeeper
  Issue Type: Bug
  Components: bookkeeper-client
Reporter: Sijie Guo
Assignee: Sijie Guo
 Fix For: 4.3.0, 4.2.3

 Attachments: BOOKKEEPER-751.diff


 we are running bookkeeper callbacks under the ledger handle lock, which 
 could introduce deadlock if the application calls bookkeeper functions in 
 those callbacks.





[jira] [Commented] (BOOKKEEPER-758) Add TryReadLastAddConfirmed API

2014-05-20 Thread Sijie Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004255#comment-14004255
 ] 

Sijie Guo commented on BOOKKEEPER-758:
--

To be clear, the multiple callbacks are internal to the bk client; the user 
will only get one callback. The benefit to the code here is that when the user 
receives a LAC response from any bookie, it can move on to read the entries 
without waiting for the other bookies' responses, so reading entries can be 
parallelized with receiving LAC responses from the other bookies. That benefit 
isn't the goal of this API, though; the API exists so readLAC does not block 
waiting for LAC responses from multiple bookies.
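
The first-response-wins behaviour described above can be sketched in plain 
Java; `readLacFromBookie` and the bookie list here are hypothetical stand-ins 
for the real client internals, not the actual BookKeeper API:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class TryReadLacSketch {
    // Hypothetical stand-in for sending a readLac request to one bookie;
    // the real client would issue an async network request here.
    static CompletableFuture<Long> readLacFromBookie(long lac) {
        return CompletableFuture.completedFuture(lac);
    }

    // Complete as soon as ANY bookie answers, instead of waiting for a
    // coverage (quorum) check: the reader can start fetching entries
    // while the remaining responses are still in flight.
    static CompletableFuture<Long> tryReadLastAddConfirmed(List<Long> bookies) {
        CompletableFuture<Long> result = new CompletableFuture<>();
        for (long b : bookies) {
            // Only the first completion wins; later completions are no-ops.
            readLacFromBookie(b).thenAccept(result::complete);
        }
        return result;
    }

    public static void main(String[] args) {
        long lac = tryReadLastAddConfirmed(List.of(7L, 9L, 8L)).join();
        System.out.println("first LAC response: " + lac);
    }
}
```

The user-visible callback fires once; only the internal per-bookie responses 
are multiple.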

 Add TryReadLastAddConfirmed API
 ---

 Key: BOOKKEEPER-758
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-758
 Project: Bookkeeper
  Issue Type: Improvement
  Components: bookkeeper-client
Reporter: Sijie Guo
Assignee: Sijie Guo
 Fix For: 4.3.0

 Attachments: BOOKKEEPER-758.diff, BOOKKEEPER-758.v2.diff


 add TryReadLastConfirmed to read the last confirmed without coverage 
 checking, as readers that poll the LAC just need the LAC.





[jira] [Commented] (BOOKKEEPER-751) Ensure all the bookkeeper callbacks not run under ledger handle lock

2014-05-20 Thread Sijie Guo (JIRA)

[ 
https://issues.apache.org/jira/browse/BOOKKEEPER-751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14004259#comment-14004259
 ] 

Sijie Guo commented on BOOKKEEPER-751:
--

test10Ledgers200ThreadsRead is kind of a resource-sensitive test; I have 
observed it failing too. We could improve this test case, but it isn't related 
to this change.

 Ensure all the bookkeeper callbacks not run under ledger handle lock
 

 Key: BOOKKEEPER-751
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-751
 Project: Bookkeeper
  Issue Type: Bug
  Components: bookkeeper-client
Reporter: Sijie Guo
Assignee: Sijie Guo
 Fix For: 4.3.0, 4.2.3

 Attachments: BOOKKEEPER-751.diff


 we are running bookkeeper callbacks under the ledger handle lock, which 
 could introduce deadlock if the application calls bookkeeper functions in 
 those callbacks.





Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Sijie Guo


 On April 24, 2014, 12:19 p.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 81
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line81
 
  ledgerId and entryId should be optional in all requests. It may be the 
  case, that how we specify them changes in the future (like when we flatten 
  the metadata), so it would be good to leave that possibility open.
 
 Sijie Guo wrote:
 for all existing read/write protocols, the ledgerId and entryId are 
 required. I am not sure how you would change the protocol by flattening the 
 metadata, but I guess that would be a fairly new protocol. If so, it would be 
 better to add a new request type, so we don't break any existing systems or 
 make things complicated.
 
 Ivan Kelly wrote:
 I'm not sure how it will change either. What I'm requesting is that the 
 protobuf protocol doesn't lock us into how we are doing it now forever.
 
 Ivan Kelly wrote:
 Actually, for a concrete example: let's say we want to do a read request 
 for a whole ledger. From bookie A we request all entries, but it doesn't have 
 every 3rd entry due to striping. In that case we can request all entries from 
 the ledger with entry id modulo 3 from bookie B. In this case, what would I 
 put in the _required_ entry id field for the read to bookie B?
 
 Sijie Guo wrote:
 as I said, doesn't it sound like a new read protocol?
 
 Ivan Kelly wrote:
 It will likely go through the same codepaths though, so we'll end up with 
 a load of duplicate code. My concern with the requiredness of fields is that 
 it's so rigid that in future we will have to add new messages to make any 
 enhancements, causing the protocol to grow into something huge, with loads of 
 redundancy, and no better than what we have now with the manual encoding.
 
 Sijie Guo wrote:
 Single-entry read/write are primitives of the bookie; ledger id and entry 
 id are required for them, as they are fundamental to bookkeeper. All other 
 improvements like streaming or range reads could be built on these primitives. 
 And if they are built on the primitives, I don't see that we will end up with 
 a lot of duplicated code.
 
 Rakesh R wrote:
 As far as compatibility is concerned, making the fields optional is a 
 defensive approach and safer coding, since it avoids parsing issues later. But 
 it should be done very carefully, because the requiredness is then enforced at 
 the code level. For example, the server would have to validate whether a 
 request has both ledgerId/entryId, which of them is mandatory, and so on. On 
 the other side, with a required field, the entity is not open for expansion by 
 removing that field. But we again have options, as Sijie suggested, of 
 defining a new protocol and doing the expansion there.
 
 If we have a better way in hand to avoid code duplication, it is OK to go 
 ahead with required.
 
 Sijie Guo wrote:
 As I said, the current bookie storage is built on single read/add entry 
 primitives; there isn't any reason not to make ledger id and entry id 
 required. If you are going to change the protocol to get rid of ledger id and 
 entry id, you will have to change the bookie storage too, and then I don't 
 think there will be any code duplication.
 
 Rakesh R wrote:
 I agree with this for add entry, but for reads 'entryId' could be optional. 
 There is no real functional issue; the only concern is that this will force 
 us to create a new protocol if such a requirement comes up in the future.

Again, as I said, the current bookie storage is per entry. If you want to 
support batch reads: 1) if you don't change the bookie storage, you can build 
the batch read protocol on top of the single-read primitive; 2) if you change 
the bookie storage itself to support batch reads inside the storage, then it 
should be a new request type that uses the new method in the bookie storage, so 
the old read still works with storage that only supports single reads. This is 
for backward compatibility. 
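
Option (1) can be sketched with a hypothetical storage interface (the names 
are illustrative, not the real BookKeeper `LedgerStorage` API): the batch read 
is just a loop over the existing single-read primitive, so the storage and the 
single-read path are untouched:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchReadSketch {
    // Hypothetical single-read primitive, as the bookie storage exposes today.
    interface LedgerStorage {
        byte[] readEntry(long ledgerId, long entryId);
    }

    // A batch/range read built purely on the single-read primitive: a new
    // request type at the protocol level, one loop at the server, and no
    // change to the underlying storage.
    static List<byte[]> readRange(LedgerStorage storage, long ledgerId,
                                  long firstEntry, long lastEntry) {
        List<byte[]> entries = new ArrayList<>();
        for (long e = firstEntry; e <= lastEntry; e++) {
            entries.add(storage.readEntry(ledgerId, e));
        }
        return entries;
    }

    public static void main(String[] args) {
        LedgerStorage fake = (lid, eid) -> ("entry-" + eid).getBytes();
        System.out.println(readRange(fake, 1L, 0L, 2L).size()); // 3 entries
    }
}
```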


- Sijie




On April 24, 2014, 7:43 a.m., Sijie Guo wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17895/
 ---
 
 (Updated April 24, 2014, 7:43 a.m.)
 
 
 Review request for bookkeeper and Ivan Kelly.
 
 
 Bugs: BOOKKEEPER-582
 https://issues.apache.org/jira/browse/BOOKKEEPER-582
 
 
 Repository: bookkeeper-git
 
 
 Description
 ---
 
 - introducing protobuf support for bookkeeper
 - for server: introduce packet processor / EnDecoder for different protocol 
 supports
 - for client: change PCBC to use protobuf to send requests
 - misc changes for protobuf support
 
 (bookie server 

Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Sijie Guo


 On April 23, 2014, 10:22 a.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 33
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line33
 
  This should certainly not be an enum; otherwise we need to bump the 
  protocol version each time we add an error code.
  
  Imagine the scenario where both server and client are running 4.3.0. 
  Then the server is upgraded to 4.3.1, which has a new error EPRINTERONFIRE. 
  It sends this to the client, which throws a decode error.
 
 Sijie Guo wrote:
 How could not being an enum help with this? If it is an integer, the client 
 still has no idea how to interpret it, so it is still an invalid response for 
 a 4.3.0 client. I thought we reached agreement on the enum on the ticket, no?
 
 Ivan Kelly wrote:
 So for version and operation type, an enum is OK. These originate at the 
 client, so as long as servers are upgraded before clients there are no 
 interoperability issues. Status codes originate at the server, though, so it 
 is possible for the server to send a status code that is unrecognised by the 
 client. The normal way to handle this would be an else or default: branch 
 that passes it up to the client as a BKException.UnexpectedConditionException. 
 If it's an enum, this will instead throw a decode exception in the netty 
 decoder, which is harder to handle.
 
 Resolving this on the server side, by checking the version and only 
 sending errors valid for that version, implies two things: firstly, every 
 error code change will require the version to be bumped, and secondly, a list 
 will need to be maintained of which errors are valid for each version. This 
 goes against the motivation for using protobuf in the first place.
 
 Sijie Guo wrote:
 This is an application-level agreement, no? It doesn't matter whether you 
 are using a protobuf protocol or the current protocol, and it also doesn't 
 matter whether you are using an integer or an enum. In any case, the best 
 approach is as you described: you shouldn't send a new status code back to an 
 old client, as the new status code is meaningless to the old client.
 
 Ivan Kelly wrote:
 But how do you know it's an old client? Only by bumping the version number 
 each time you add an error code. In which case you end up with a whole lot of 
 junk like if (client.version == X) { send A } else if (client.version == Y) 
 { send B } else if (client.version ..., which is exactly what protobuf was 
 designed to avoid (see the "A bit of history" section on 
 https://developers.google.com/protocol-buffers/docs/overview).
 
 Sijie Guo wrote:
 An else or default branch would make the behavior unpredictable, as an old 
 client would treat a new status code as some kind of unknown. As you said, 
 you want to treat them as UnexpectedConditionException. But what does 
 UnexpectedConditionException mean? Doesn't it mean the server already breaks 
 backward compatibility, since the server couldn't satisfy the old client's 
 request?
 
 So still, if the server wants to be backward compatible with clients, in 
 any case it needs to know what version of the protocol the client is speaking 
 and handle it accordingly, not just leave clients to do their job in an 
 unexpected way.
 
 I don't see any elegant solution without detecting the protocol version. 
 If you have one, please describe how not being an enum would avoid this.
 
 Ivan Kelly wrote:
 The default behaviour for an unknown error code is something we already 
 use today:
 
 https://github.com/apache/bookkeeper/blob/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java#L714
 
 The client only needs to know that the request failed. The point of the 
 different error codes is so that the client can take specific recovery steps; 
 the default behaviour is just to pass the error up.
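
The default-branch handling under discussion can be sketched like this; the 
status values and method names are illustrative, not the real BookKeeper 
protocol constants:

```java
public class StatusDecodeSketch {
    // Illustrative status codes; not the real BookKeeper protocol constants.
    static final int EOK = 0;
    static final int ENOENTRY = 1;

    // If status is a plain integer field, an unrecognised value reaches
    // this switch and falls into the default branch, so the request simply
    // fails. If status were a protobuf enum, decoding an unknown value
    // would already have failed inside the netty decoder instead.
    static String toClientError(int status) {
        switch (status) {
            case EOK:      return "OK";
            case ENOENTRY: return "NoSuchEntryException";
            default:       return "UnexpectedConditionException"; // unknown code
        }
    }

    public static void main(String[] args) {
        // A code this client has never heard of still maps to a failure.
        System.out.println(toClientError(42));
    }
}
```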
 
 Sijie Guo wrote:
 The default behavior was there just for already-known status codes; it 
 doesn't mean it is correct for unknown status codes. And when you say 'the 
 client only needs to know that the request failed', you are assuming that 
 there is only one status code indicating OK and that every other status code 
 should be taken as failure. But that isn't true. Say in an old protocol we 
 supported range reads that responded with OK and a list of entry responses 
 (0 = data, 1 = missing, 2 = missing, 3 = data). If we later improve the 
 protocol to make communication more efficient by no longer transferring 
 missing entries, it would respond with PARTIAL_OK and a list of existing 
 entries (0 = data, 3 = data). 
 
 In this case, if the server doesn't distinguish the client's protocol and 
 just responds to every range read with PARTIAL_OK, it would break 
 compatibility with the old protocol, as the old protocol treats PARTIAL_OK as 
 a failure by its default behavior. In order to maintain backward 
 compatibility, the server needs to detect the client's protocol and responds 

[jira] [Updated] (BOOKKEEPER-756) Use HashedwheelTimer for request timeouts for PCBC

2014-05-20 Thread Sijie Guo (JIRA)

 [ 
https://issues.apache.org/jira/browse/BOOKKEEPER-756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sijie Guo updated BOOKKEEPER-756:
-

Attachment: BOOKKEEPER-756.v2.diff

addressed the comments

 Use HashedwheelTimer for request timeouts for PCBC
 --

 Key: BOOKKEEPER-756
 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-756
 Project: Bookkeeper
  Issue Type: Improvement
  Components: bookkeeper-client
Reporter: Sijie Guo
Assignee: Sijie Guo
 Fix For: 4.3.0, 4.2.3

 Attachments: BOOKKEEPER-756.diff, BOOKKEEPER-756.v2.diff


 The current scheduler-based timeout mechanism is per batch, which isn't 
 efficient; HashedWheelTimer is much better suited for timeouts. So change the 
 PCBC to use HashedWheelTimer for timeouts.
 Besides the HashedWheelTimer change, it also adds multiple-channels-per-bookie 
 support for latency reasons.
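
The hashed-wheel idea behind Netty's HashedWheelTimer can be illustrated with 
a toy single-lap wheel; this is a sketch of the data structure only (ignoring 
multi-lap "rounds", threading, and cancellation), not the real implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class WheelTimerSketch {
    // A toy hashed wheel: an array of buckets advanced one tick at a time.
    // Scheduling and firing a timeout are O(1) bucket operations, which is
    // why one wheel handles per-request timeouts far more cheaply than
    // scheduling a separate executor task per batch of requests.
    static class Wheel {
        private final List<List<Runnable>> buckets = new ArrayList<>();
        private int tick = 0;

        Wheel(int size) {
            for (int i = 0; i < size; i++) buckets.add(new ArrayList<>());
        }

        // Place the task in the bucket the wheel will reach after
        // `ticksFromNow` ticks (single lap only in this sketch).
        void newTimeout(Runnable task, int ticksFromNow) {
            buckets.get((tick + ticksFromNow) % buckets.size()).add(task);
        }

        // Advance one tick and fire everything in the bucket just reached.
        void advance() {
            tick = (tick + 1) % buckets.size();
            List<Runnable> due = buckets.get(tick);
            due.forEach(Runnable::run);
            due.clear();
        }
    }

    public static void main(String[] args) {
        Wheel wheel = new Wheel(8);
        wheel.newTimeout(() -> System.out.println("request timed out"), 3);
        for (int i = 0; i < 3; i++) wheel.advance(); // fires on the 3rd tick
    }
}
```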





Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Rakesh R


 On April 24, 2014, 12:19 p.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 81
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line81
 
  ledgerId and entryId should be optional in all requests. It may be the 
  case, that how we specify them changes in the future (like when we flatten 
  the metadata), so it would be good to leave that possibility open.
 
 Sijie Guo wrote:
 for all existing read/write protocols, the ledgerId and entryId are 
 required. I am not sure how you would change the protocol by flattening the 
 metadata, but I guess that would be a fairly new protocol. If so, it would be 
 better to add a new request type, so we don't break any existing systems or 
 make things complicated.
 
 Ivan Kelly wrote:
 I'm not sure how it will change either. What I'm requesting is that the 
 protobuf protocol doesn't lock us into how we are doing it now forever.
 
 Ivan Kelly wrote:
 Actually, for a concrete example: let's say we want to do a read request 
 for a whole ledger. From bookie A we request all entries, but it doesn't have 
 every 3rd entry due to striping. In that case we can request all entries from 
 the ledger with entry id modulo 3 from bookie B. In this case, what would I 
 put in the _required_ entry id field for the read to bookie B?
 
 Sijie Guo wrote:
 as I said, doesn't it sound like a new read protocol?
 
 Ivan Kelly wrote:
 It will likely go through the same codepaths though, so we'll end up with 
 a load of duplicate code. My concern with the requiredness of fields is that 
 it's so rigid that in future we will have to add new messages to make any 
 enhancements, causing the protocol to grow into something huge, with loads of 
 redundancy, and no better than what we have now with the manual encoding.
 
 Sijie Guo wrote:
 Single-entry read/write are primitives of the bookie; ledger id and entry 
 id are required for them, as they are fundamental to bookkeeper. All other 
 improvements like streaming or range reads could be built on these primitives. 
 And if they are built on the primitives, I don't see that we will end up with 
 a lot of duplicated code.
 
 Rakesh R wrote:
 As far as compatibility is concerned, making the fields optional is a 
 defensive approach and safer coding, since it avoids parsing issues later. But 
 it should be done very carefully, because the requiredness is then enforced at 
 the code level. For example, the server would have to validate whether a 
 request has both ledgerId/entryId, which of them is mandatory, and so on. On 
 the other side, with a required field, the entity is not open for expansion by 
 removing that field. But we again have options, as Sijie suggested, of 
 defining a new protocol and doing the expansion there.
 
 If we have a better way in hand to avoid code duplication, it is OK to go 
 ahead with required.
 
 Sijie Guo wrote:
 As I said, the current bookie storage is built on single read/add entry 
 primitives; there isn't any reason not to make ledger id and entry id 
 required. If you are going to change the protocol to get rid of ledger id and 
 entry id, you will have to change the bookie storage too, and then I don't 
 think there will be any code duplication.
 
 Rakesh R wrote:
 I agree with this for add entry, but for reads 'entryId' could be optional. 
 There is no real functional issue; the only concern is that this will force 
 us to create a new protocol if such a requirement comes up in the future.
 
 Sijie Guo wrote:
 Again, as I said, the current bookie storage is per entry. If you want to 
 support batch reads: 1) if you don't change the bookie storage, you can build 
 the batch read protocol on top of the single-read primitive; 2) if you change 
 the bookie storage itself to support batch reads inside the storage, then it 
 should be a new request type that uses the new method in the bookie storage, 
 so the old read still works with storage that only supports single reads. 
 This is for backward compatibility.
 
 Sijie Guo wrote:
 One more comment: for any protocol requirements, please keep in mind what 
 kinds of operations the bookie storage supports now, and what the 
 backward-compatibility story for the bookie storage is.

OK. makes sense to me.


- Rakesh




On April 24, 2014, 7:43 a.m., Sijie Guo wrote:
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 https://reviews.apache.org/r/17895/
 ---
 
 (Updated April 24, 2014, 7:43 a.m.)
 
 
 Review request for bookkeeper and Ivan Kelly.
 
 
 Bugs: BOOKKEEPER-582
 https://issues.apache.org/jira/browse/BOOKKEEPER-582
 
 
 Repository: bookkeeper-git
 
 
 

Re: Review Request 17895: BOOKKEEPER-582: protobuf support for bookkeeper

2014-05-20 Thread Rakesh R


 On April 23, 2014, 10:22 a.m., Ivan Kelly wrote:
  bookkeeper-server/src/main/proto/BookkeeperProtocol.proto, line 33
  https://reviews.apache.org/r/17895/diff/2/?file=563033#file563033line33
 
  This should certainly not be an enum; otherwise we need to bump the 
  protocol version each time we add an error code.
  
  Imagine the scenario where both server and client are running 4.3.0. 
  Then the server is upgraded to 4.3.1, which has a new error EPRINTERONFIRE. 
  It sends this to the client, which throws a decode error.
 
 Sijie Guo wrote:
 How could not being an enum help with this? If it is an integer, the client 
 still has no idea how to interpret it, so it is still an invalid response for 
 a 4.3.0 client. I thought we reached agreement on the enum on the ticket, no?
 
 Ivan Kelly wrote:
 So for version and operation type, an enum is OK. These originate at the 
 client, so as long as servers are upgraded before clients there are no 
 interoperability issues. Status codes originate at the server, though, so it 
 is possible for the server to send a status code that is unrecognised by the 
 client. The normal way to handle this would be an else or default: branch 
 that passes it up to the client as a BKException.UnexpectedConditionException. 
 If it's an enum, this will instead throw a decode exception in the netty 
 decoder, which is harder to handle.
 
 Resolving this on the server side, by checking the version and only 
 sending errors valid for that version, implies two things: firstly, every 
 error code change will require the version to be bumped, and secondly, a list 
 will need to be maintained of which errors are valid for each version. This 
 goes against the motivation for using protobuf in the first place.
 
 Sijie Guo wrote:
 This is an application-level agreement, no? It doesn't matter whether you 
 are using a protobuf protocol or the current protocol, and it also doesn't 
 matter whether you are using an integer or an enum. In any case, the best 
 approach is as you described: you shouldn't send a new status code back to an 
 old client, as the new status code is meaningless to the old client.
 
 Ivan Kelly wrote:
 But how do you know it's an old client? Only by bumping the version number 
 each time you add an error code. In which case you end up with a whole lot of 
 junk like if (client.version == X) { send A } else if (client.version == Y) 
 { send B } else if (client.version ..., which is exactly what protobuf was 
 designed to avoid (see the "A bit of history" section on 
 https://developers.google.com/protocol-buffers/docs/overview).
 
 Sijie Guo wrote:
 An else or default branch would make the behavior unpredictable, as an old 
 client would treat a new status code as some kind of unknown. As you said, 
 you want to treat them as UnexpectedConditionException. But what does 
 UnexpectedConditionException mean? Doesn't it mean the server already breaks 
 backward compatibility, since the server couldn't satisfy the old client's 
 request?
 
 So still, if the server wants to be backward compatible with clients, in 
 any case it needs to know what version of the protocol the client is speaking 
 and handle it accordingly, not just leave clients to do their job in an 
 unexpected way.
 
 I don't see any elegant solution without detecting the protocol version. 
 If you have one, please describe how not being an enum would avoid this.
 
 Ivan Kelly wrote:
 The default behaviour for an unknown error code is something we already 
 use today:
 
 https://github.com/apache/bookkeeper/blob/trunk/bookkeeper-server/src/main/java/org/apache/bookkeeper/proto/PerChannelBookieClient.java#L714
 
 The client only needs to know that the request failed. The point of the 
 different error codes is so that the client can take specific recovery steps; 
 the default behaviour is just to pass the error up.
 
 Sijie Guo wrote:
 The default behavior was there just for already-known status codes; it 
 doesn't mean it is correct for unknown status codes. And when you say 'the 
 client only needs to know that the request failed', you are assuming that 
 there is only one status code indicating OK and that every other status code 
 should be taken as failure. But that isn't true. Say in an old protocol we 
 supported range reads that responded with OK and a list of entry responses 
 (0 = data, 1 = missing, 2 = missing, 3 = data). If we later improve the 
 protocol to make communication more efficient by no longer transferring 
 missing entries, it would respond with PARTIAL_OK and a list of existing 
 entries (0 = data, 3 = data). 
 
 In this case, if the server doesn't distinguish the client's protocol and 
 just responds to every range read with PARTIAL_OK, it would break 
 compatibility with the old protocol, as the old protocol treats PARTIAL_OK as 
 a failure by its default behavior. In order to maintain backward 
 compatibility, the server needs to detect the client's protocol and responds