[jira] [Commented] (ZOOKEEPER-2170) Zookeeper is not logging as per the configuration in log4j.properties

2015-06-26 Thread Arshad Mohammad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602484#comment-14602484
 ] 

Arshad Mohammad commented on ZOOKEEPER-2170:


Let me further clarify, as I understand it, the problems with the current default 
configuration.

Suppose I download the latest ZooKeeper, install it, and run it without any 
configuration change.
Logs go to the console logger, and the console output is redirected to the file 
zookeeper-root-server-host-name.out.
{color:red}Problem 1:{color} This file keeps growing. If ZooKeeper runs for many 
days, or the logging frequency is high (e.g. in case of errors), the file grows 
to gigabytes, big enough that it cannot be opened.
{color:red}Problem 2:{color} When I restart ZooKeeper, it overwrites the 
previous zookeeper-root-server-host-name.out file and creates a new one, so all 
the log history is gone.
{color:red}Problem 3:{color} After observing Problem 1 and Problem 2, a user 
would go and modify log4j.properties, but it has no effect, as I explained in 
my earlier comments.

You are right [~rgs], [~cnauroth]'s patch in a way aligns with Hadoop's 
configuration.
But it is different from what this JIRA expects. Maybe [~cnauroth]'s patch can 
be committed as part of a different JIRA.

The expectations of this JIRA are:
1) The default logging behaviour should come from log4j.properties
2) It would be good to make ROLLINGFILE the default logger
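As a concrete illustration of expectation (2), a rolling-file default could look roughly like this in conf/log4j.properties (the appender name matches the existing ROLLINGFILE appender; the file path and size limits here are assumptions, not a committed patch):

```properties
# Route the root logger to the rolling file appender by default
zookeeper.root.logger=INFO, ROLLINGFILE

# RollingFileAppender caps the file size and keeps a bounded history,
# which addresses both the unbounded growth and the overwrite-on-restart problems
log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=INFO
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/zookeeper.log
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
```

With something like this as the shipped default, restarts append to a bounded set of rotated files instead of truncating a single .out file.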

 Zookeeper is not logging as per the configuration in log4j.properties
 --

 Key: ZOOKEEPER-2170
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2170
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Arshad Mohammad
Assignee: Chris Nauroth
 Fix For: 3.6.0

 Attachments: ZOOKEEPER-2170.001.patch


 In conf/log4j.properties default root logger is 
 {code}
 zookeeper.root.logger=INFO, CONSOLE
 {code}
 Changing the root logger to the value below, or to any other value, has no 
 effect on logging:
 {code}
 zookeeper.root.logger=DEBUG, ROLLINGFILE
 {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (ZOOKEEPER-2222) Fail fast if `myid` does not exist but server.N properties are defined

2015-06-26 Thread Joe Halliwell (JIRA)
Joe Halliwell created ZOOKEEPER-2222:


 Summary: Fail fast if `myid` does not exist but server.N 
properties are defined
 Key: ZOOKEEPER-2222
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2222
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.4.6
Reporter: Joe Halliwell
Priority: Minor


Under these circumstances the server logs a warning, but starts in standalone 
mode. I think it should exit.
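A minimal sketch of the proposed fail-fast check (the class and method names are hypothetical, for illustration only; ZooKeeper's actual config parsing lives in QuorumPeerConfig):

```java
import java.io.File;
import java.util.Properties;

// Hypothetical illustration of the proposed behaviour: if the config
// defines server.N entries (i.e. quorum mode is intended) but the myid
// file is missing, fail fast instead of silently starting standalone.
public class MyidCheck {
    static boolean hasServerEntries(Properties config) {
        return config.stringPropertyNames().stream()
                .anyMatch(key -> key.matches("server\\.\\d+"));
    }

    static void validate(Properties config, File myidFile) {
        if (hasServerEntries(config) && !myidFile.exists()) {
            throw new IllegalStateException(
                    "server.N entries are defined but " + myidFile + " does not exist");
        }
    }
}
```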





[jira] [Commented] (ZOOKEEPER-2140) NettyServerCnxn and NIOServerCnxn code should be improved

2015-06-26 Thread Arshad Mohammad (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602851#comment-14602851
 ] 

Arshad Mohammad commented on ZOOKEEPER-2140:


Rebased the patch on top of trunk

 NettyServerCnxn and NIOServerCnxn code should be improved
 -

 Key: ZOOKEEPER-2140
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2140
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Arshad Mohammad
 Fix For: 3.6.0

 Attachments: ZOOKEEPER-2140-1.patch, ZOOKEEPER-2140-2.patch, 
 ZOOKEEPER-2140-3.patch


 Classes org.apache.zookeeper.server.NIOServerCnxn and 
 org.apache.zookeeper.server.NettyServerCnxn have the following scope for 
 improvement:
 1) Duplicate code.
   These two classes share around 250 lines of duplicated code; all of the 
 command code is duplicated.
 2) Many improvements/bug fixes were made in one class but not in the other. 
 These changes should be synced.
 For example,
 In NettyServerCnxn
 {code}
// clone should be faster than iteration
// ie give up the cnxns lock faster
AbstractSet<ServerCnxn> cnxns;
synchronized (factory.cnxns) {
    cnxns = new HashSet<ServerCnxn>(factory.cnxns);
}
for (ServerCnxn c : cnxns) {
    c.dumpConnectionInfo(pw, false);
    pw.println();
}
 {code}
 In NIOServerCnxn
 {code}
for (ServerCnxn c : factory.cnxns) {
    c.dumpConnectionInfo(pw, false);
    pw.println();
}
 {code}
 3) The NettyServerCnxn and NIOServerCnxn classes are unnecessarily bulky. 
 The command classes have altogether different functionality and should go in 
 separate class files.
 If this is done, it will be easy to add a new command with minimal change to 
 the existing classes.
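The clone-under-lock idiom shown in the NettyServerCnxn snippet could be factored into one shared helper so both implementations stay in sync. A rough sketch (the class and method names are my own illustration, not the actual patch), using String as a stand-in for ServerCnxn:

```java
import java.util.HashSet;
import java.util.Set;

// Illustrative sketch: a single snapshot helper that both connection
// factories could share, instead of each Cnxn class duplicating the
// clone-under-lock idiom.
public class CnxnSnapshot {
    private final Set<String> cnxns = new HashSet<>();

    public void add(String cnxn) {
        synchronized (cnxns) {
            cnxns.add(cnxn);
        }
    }

    // Clone under the lock, then let callers iterate outside it, so the
    // lock is held only for the copy rather than for the whole dump.
    public Set<String> snapshot() {
        synchronized (cnxns) {
            return new HashSet<>(cnxns);
        }
    }
}
```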





[jira] [Commented] (ZOOKEEPER-2222) Fail fast if `myid` does not exist but server.N properties are defined

2015-06-26 Thread Joe Halliwell (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603153#comment-14603153
 ] 

Joe Halliwell commented on ZOOKEEPER-2222:
--

Looking through the code, it's clearly supposed to exit under these 
circumstances. I'll see if I can provide more details.

 Fail fast if `myid` does not exist but server.N properties are defined
 --

 Key: ZOOKEEPER-2222
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2222
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.4.6
Reporter: Joe Halliwell
Priority: Minor

 Under these circumstances the server logs a warning, but starts in standalone 
 mode. I think it should exit.





[jira] [Commented] (ZOOKEEPER-2140) NettyServerCnxn and NIOServerCnxn code should be improved

2015-06-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602967#comment-14602967
 ] 

Hadoop QA commented on ZOOKEEPER-2140:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12742114/ZOOKEEPER-2140-3.patch
  against trunk revision 1686767.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 4 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2780//console

This message is automatically generated.

 NettyServerCnxn and NIOServerCnxn code should be improved
 -

 Key: ZOOKEEPER-2140
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2140
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Arshad Mohammad
 Fix For: 3.6.0

 Attachments: ZOOKEEPER-2140-1.patch, ZOOKEEPER-2140-2.patch, 
 ZOOKEEPER-2140-3.patch


 Classes org.apache.zookeeper.server.NIOServerCnxn and 
 org.apache.zookeeper.server.NettyServerCnxn have the following scope for 
 improvement:
 1) Duplicate code.
   These two classes share around 250 lines of duplicated code; all of the 
 command code is duplicated.
 2) Many improvements/bug fixes were made in one class but not in the other. 
 These changes should be synced.
 For example,
 In NettyServerCnxn
 {code}
// clone should be faster than iteration
// ie give up the cnxns lock faster
AbstractSet<ServerCnxn> cnxns;
synchronized (factory.cnxns) {
    cnxns = new HashSet<ServerCnxn>(factory.cnxns);
}
for (ServerCnxn c : cnxns) {
    c.dumpConnectionInfo(pw, false);
    pw.println();
}
 {code}
 In NIOServerCnxn
 {code}
for (ServerCnxn c : factory.cnxns) {
    c.dumpConnectionInfo(pw, false);
    pw.println();
}
 {code}
 3) The NettyServerCnxn and NIOServerCnxn classes are unnecessarily bulky. 
 The command classes have altogether different functionality and should go in 
 separate class files.
 If this is done, it will be easy to add a new command with minimal change to 
 the existing classes.





Failed: ZOOKEEPER-2140 PreCommit Build #2780

2015-06-26 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-2140
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2780/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 185 lines...]
 [exec] File to patch: 
 [exec] Skip this patch? [y] 
 [exec] Skipping patch.
 [exec] 1 out of 1 hunk ignored
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12742114/ZOOKEEPER-2140-3.patch
 [exec]   against trunk revision 1686767.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 4 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2780//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] 91638864402fcf8600d1cddc0062ebbe3542b00a logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1782:
 exec returned: 1

Total time: 44 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2752
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 60820 bytes
Compression is 0.0%
Took 7.2 sec
Recording test results
Description set: ZOOKEEPER-2140
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-06-26 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602999#comment-14602999
 ] 

Flavio Junqueira commented on ZOOKEEPER-2193:
-

I remember that we wired QCM so that we could assign ids to observers 
automatically, but frankly I can't remember having finished the feature. I 
believe we still require observers to have unique ids in the config, but I can 
do some investigation to be sure.

 reconfig command completes even if parameter is wrong obviously
 ---

 Key: ZOOKEEPER-2193
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2193
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection, server
Affects Versions: 3.5.0
 Environment: CentOS7 + Java7
Reporter: Yasuhito Fukuda
Assignee: Yasuhito Fukuda
 Attachments: ZOOKEEPER-2193-v2.patch, ZOOKEEPER-2193-v3.patch, 
 ZOOKEEPER-2193-v4.patch, ZOOKEEPER-2193-v5.patch, ZOOKEEPER-2193-v6.patch, 
 ZOOKEEPER-2193-v7.patch, ZOOKEEPER-2193-v8.patch, ZOOKEEPER-2193.patch


 Even if a reconfig parameter is obviously wrong, the command completes.
 Refer to the following:
 - Ensemble consists of four nodes
 {noformat}
 [zk: vm-101:2181(CONNECTED) 0] config
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 version=1
 {noformat}
 - add node by reconfig command
 {noformat}
 [zk: vm-101:2181(CONNECTED) 9] reconfig -add 
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 Committed new configuration:
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 version=30007
 {noformat}
 The IP address of server.4 and server.5 is duplicated.
 In this state, leader election will not work properly.
 Moreover, the ensemble is assumed to end up in an undesirable state.
 I think parameter validation is needed during reconfig.
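A sketch of the kind of validation that could catch this case (the parsing and the class/method names are my own illustration, not the committed patch): reject a server entry whose quorum address duplicates that of another server id.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative duplicate-address check for "server.N=host:port1:port2:role"
// entries: two different server ids must not share the same host and
// quorum port.
public class ReconfigValidator {
    public static void checkUniqueAddresses(List<String> entries) {
        Map<String, String> seen = new HashMap<>(); // "host:port" -> "server.N"
        for (String entry : entries) {
            String[] kv = entry.split("=", 2);          // ["server.N", "host:port1:port2:..."]
            String[] parts = kv[1].split(":");
            String address = parts[0] + ":" + parts[1]; // host + quorum port
            String previous = seen.put(address, kv[0]);
            if (previous != null) {
                throw new IllegalArgumentException(
                        kv[0] + " duplicates the address of " + previous + ": " + address);
            }
        }
    }
}
```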





[jira] [Resolved] (ZOOKEEPER-2222) Fail fast if `myid` does not exist but server.N properties are defined

2015-06-26 Thread Joe Halliwell (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joe Halliwell resolved ZOOKEEPER-2222.
--
Resolution: Invalid

My mistake -- the config the server was using did not define any server.N 
entries.

 Fail fast if `myid` does not exist but server.N properties are defined
 --

 Key: ZOOKEEPER-2222
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2222
 Project: ZooKeeper
  Issue Type: Improvement
  Components: server
Affects Versions: 3.4.6
Reporter: Joe Halliwell
Priority: Minor

 Under these circumstances the server logs a warning, but starts in standalone 
 mode. I think it should exit.





[jira] [Commented] (ZOOKEEPER-2164) fast leader election keeps failing

2015-06-26 Thread Hongchao Deng (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603285#comment-14603285
 ] 

Hongchao Deng commented on ZOOKEEPER-2164:
--

It's on my plan to have a patch for this. I'm currently involved in internal 
stuff and should be able to get to this after that.

In the meantime, it sounds like you have a good testing plan. It would be nice 
if you could share it. :)

 fast leader election keeps failing
 --

 Key: ZOOKEEPER-2164
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.4.5
Reporter: Michi Mutsuzaki
Assignee: Hongchao Deng
 Fix For: 3.5.2, 3.6.0


 I have a 3-node cluster with sids 1, 2 and 3. Originally 2 is the leader. 
 When I shut down 2, 1 and 3 keep going back to leader election. Here is what 
 seems to be happening.
 - Both 1 and 3 elect 3 as the leader.
 - 1 receives votes from 3 and itself, and starts trying to connect to 3 as a 
 follower.
 - 3 doesn't receive votes for 5 seconds because connectOne() to 2 doesn't 
 timeout for 5 seconds: 
 https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L346
 - By the time 3 receives votes, 1 has given up trying to connect to 3: 
 https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L247
 I'm using 3.4.5, but it looks like this part of the code hasn't changed for a 
 while, so I'm guessing later versions have the same issue.





ZooKeeper-trunk-openjdk7 - Build # 854 - Failure

2015-06-26 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper-trunk-openjdk7/854/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 364280 lines...]
[junit] 2015-06-26 20:42:35,376 [myid:] - INFO  [main:FileTxnSnapLog@298] - 
Snapshotting: 0xb to 
/home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/trunk/build/test/tmp/test1787711313594693480.junit.dir/version-2/snapshot.b
[junit] 2015-06-26 20:42:35,416 [myid:] - INFO  
[main:FourLetterWordMain@63] - connecting to 127.0.0.1 11222
[junit] 2015-06-26 20:42:35,417 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:53903
[junit] 2015-06-26 20:42:35,431 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@836] - Processing stat command from 
/127.0.0.1:53903
[junit] 2015-06-26 20:42:35,432 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn$StatCommand@685] - Stat command output
[junit] 2015-06-26 20:42:35,432 [myid:] - INFO  
[NIOWorkerThread-1:NIOServerCnxn@1007] - Closed socket connection for client 
/127.0.0.1:53903 (no session established for client)
[junit] 2015-06-26 20:42:35,433 [myid:] - INFO  [main:JMXEnv@224] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2015-06-26 20:42:35,434 [myid:] - INFO  [main:JMXEnv@241] - 
expect:InMemoryDataTree
[junit] 2015-06-26 20:42:35,435 [myid:] - INFO  [main:JMXEnv@245] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=InMemoryDataTree
[junit] 2015-06-26 20:42:35,435 [myid:] - INFO  [main:JMXEnv@241] - 
expect:StandaloneServer_port
[junit] 2015-06-26 20:42:35,435 [myid:] - INFO  [main:JMXEnv@245] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11222
[junit] 2015-06-26 20:42:35,435 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 85691
[junit] 2015-06-26 20:42:35,436 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 24
[junit] 2015-06-26 20:42:35,436 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2015-06-26 20:42:35,436 [myid:] - INFO  [main:ClientBase@538] - 
tearDown starting
[junit] 2015-06-26 20:42:35,955 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@158] - SessionTrackerImpl exited loop!
[junit] 2015-06-26 20:42:35,955 [myid:] - INFO  
[SessionTracker:SessionTrackerImpl@158] - SessionTrackerImpl exited loop!
[junit] 2015-06-26 20:42:36,749 [myid:] - INFO  
[main-SendThread(127.0.0.1:11222):ClientCnxn$SendThread@1138] - Opening socket 
connection to server 127.0.0.1/127.0.0.1:11222. Will not attempt to 
authenticate using SASL (unknown error)
[junit] 2015-06-26 20:42:36,750 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@296]
 - Accepted socket connection from /127.0.0.1:53906
[junit] 2015-06-26 20:42:36,750 [myid:] - INFO  
[main-SendThread(127.0.0.1:11222):ClientCnxn$SendThread@980] - Socket 
connection established, initiating session, client: /127.0.0.1:53906, server: 
127.0.0.1/127.0.0.1:11222
[junit] 2015-06-26 20:42:36,764 [myid:] - INFO  
[NIOWorkerThread-2:ZooKeeperServer@936] - Client attempting to renew session 
0x101603fe65b at /127.0.0.1:53906
[junit] 2015-06-26 20:42:36,765 [myid:] - INFO  
[NIOWorkerThread-2:ZooKeeperServer@645] - Established session 0x101603fe65b 
with negotiated timeout 3 for client /127.0.0.1:53906
[junit] 2015-06-26 20:42:36,768 [myid:] - INFO  
[main-SendThread(127.0.0.1:11222):ClientCnxn$SendThread@1400] - Session 
establishment complete on server 127.0.0.1/127.0.0.1:11222, sessionid = 
0x101603fe65b, negotiated timeout = 3
[junit] 2015-06-26 20:42:36,774 [myid:] - INFO  [ProcessThread(sid:0 
cport:11222)::PrepRequestProcessor@640] - Processed session termination for 
sessionid: 0x101603fe65b
[junit] 2015-06-26 20:42:36,779 [myid:] - INFO  
[SyncThread:0:FileTxnLog@200] - Creating new log file: log.c
[junit] 2015-06-26 20:42:36,786 [myid:] - INFO  [main:ZooKeeper@1110] - 
Session: 0x101603fe65b closed
[junit] 2015-06-26 20:42:36,786 [myid:] - INFO  [main:ClientBase@508] - 
STOPPING server
[junit] 2015-06-26 20:42:36,786 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@542] - EventThread shut down for 
session: 0x101603fe65b
[junit] 2015-06-26 20:42:36,787 [myid:] - INFO  
[NIOWorkerThread-5:MBeanRegistry@119] - Unregister MBean 
[org.apache.ZooKeeperService:name0=StandaloneServer_port11222,name1=Connections,name2=127.0.0.1,name3=0x101603fe65b]
[junit] 2015-06-26 20:42:36,788 [myid:] - INFO  
[NIOServerCxnFactory.AcceptThread:0.0.0.0/0.0.0.0:11222:NIOServerCnxnFactory$AcceptThread@219]
 - accept thread 

ZooKeeper_branch34_openjdk7 - Build # 919 - Failure

2015-06-26 Thread Apache Jenkins Server
See https://builds.apache.org/job/ZooKeeper_branch34_openjdk7/919/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 215970 lines...]
[junit] 2015-06-26 10:07:29,082 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2015-06-26 10:07:29,082 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2015-06-26 10:07:29,083 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2015-06-26 10:07:29,083 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@219] - 
NIOServerCnxn factory exited run method
[junit] 2015-06-26 10:07:29,083 [myid:] - INFO  [main:ZooKeeperServer@441] 
- shutting down
[junit] 2015-06-26 10:07:29,084 [myid:] - INFO  
[main:SessionTrackerImpl@225] - Shutting down
[junit] 2015-06-26 10:07:29,084 [myid:] - INFO  
[main:PrepRequestProcessor@769] - Shutting down
[junit] 2015-06-26 10:07:29,084 [myid:] - INFO  
[main:SyncRequestProcessor@209] - Shutting down
[junit] 2015-06-26 10:07:29,084 [myid:] - INFO  [ProcessThread(sid:0 
cport:11221)::PrepRequestProcessor@145] - PrepRequestProcessor exited loop!
[junit] 2015-06-26 10:07:29,085 [myid:] - INFO  
[SyncThread:0:SyncRequestProcessor@187] - SyncRequestProcessor exited!
[junit] 2015-06-26 10:07:29,085 [myid:] - INFO  
[main:FinalRequestProcessor@415] - shutdown of request processor complete
[junit] 2015-06-26 10:07:29,086 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2015-06-26 10:07:29,086 [myid:] - INFO  [main:JMXEnv@146] - 
ensureOnly:[]
[junit] 2015-06-26 10:07:29,088 [myid:] - INFO  [main:ClientBase@443] - 
STARTING server
[junit] 2015-06-26 10:07:29,088 [myid:] - INFO  [main:ClientBase@364] - 
CREATING server instance 127.0.0.1:11221
[junit] 2015-06-26 10:07:29,089 [myid:] - INFO  
[main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:11221
[junit] 2015-06-26 10:07:29,089 [myid:] - INFO  [main:ClientBase@339] - 
STARTING server instance 127.0.0.1:11221
[junit] 2015-06-26 10:07:29,089 [myid:] - INFO  [main:ZooKeeperServer@162] 
- Created server with tickTime 3000 minSessionTimeout 6000 maxSessionTimeout 
6 datadir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test6680355425688285248.junit.dir/version-2
 snapdir 
/home/jenkins/jenkins-slave/workspace/ZooKeeper_branch34_openjdk7/branch-3.4/build/test/tmp/test6680355425688285248.junit.dir/version-2
[junit] 2015-06-26 10:07:29,094 [myid:] - INFO  
[main:FourLetterWordMain@43] - connecting to 127.0.0.1 11221
[junit] 2015-06-26 10:07:29,095 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxnFactory@192] - 
Accepted socket connection from /127.0.0.1:38287
[junit] 2015-06-26 10:07:29,095 [myid:] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:11221:NIOServerCnxn@827] - Processing 
stat command from /127.0.0.1:38287
[junit] 2015-06-26 10:07:29,095 [myid:] - INFO  
[Thread-4:NIOServerCnxn$StatCommand@663] - Stat command output
[junit] 2015-06-26 10:07:29,096 [myid:] - INFO  
[Thread-4:NIOServerCnxn@1007] - Closed socket connection for client 
/127.0.0.1:38287 (no session established for client)
[junit] 2015-06-26 10:07:29,096 [myid:] - INFO  [main:JMXEnv@229] - 
ensureParent:[InMemoryDataTree, StandaloneServer_port]
[junit] 2015-06-26 10:07:29,099 [myid:] - INFO  [main:JMXEnv@246] - 
expect:InMemoryDataTree
[junit] 2015-06-26 10:07:29,099 [myid:] - INFO  [main:JMXEnv@250] - 
found:InMemoryDataTree 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221,name1=InMemoryDataTree
[junit] 2015-06-26 10:07:29,099 [myid:] - INFO  [main:JMXEnv@246] - 
expect:StandaloneServer_port
[junit] 2015-06-26 10:07:29,099 [myid:] - INFO  [main:JMXEnv@250] - 
found:StandaloneServer_port 
org.apache.ZooKeeperService:name0=StandaloneServer_port11221
[junit] 2015-06-26 10:07:29,100 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@58] - Memory used 32639
[junit] 2015-06-26 10:07:29,100 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@63] - Number of threads 20
[junit] 2015-06-26 10:07:29,100 [myid:] - INFO  
[main:JUnit4ZKTestRunner$LoggedInvokeMethod@78] - FINISHED TEST METHOD testQuota
[junit] 2015-06-26 10:07:29,101 [myid:] - INFO  [main:ClientBase@520] - 
tearDown starting
[junit] 2015-06-26 10:07:29,167 [myid:] - INFO  [main:ZooKeeper@684] - 
Session: 0x14e2f561a9d closed
[junit] 2015-06-26 10:07:29,167 [myid:] - INFO  [main:ClientBase@490] - 
STOPPING server
[junit] 2015-06-26 10:07:29,167 [myid:] - INFO  
[main-EventThread:ClientCnxn$EventThread@517] - EventThread shut down for 
session: 0x14e2f561a9d
[junit] 2015-06-26 10:07:29,168 [myid:] - INFO  

[jira] [Commented] (ZOOKEEPER-2155) network is not good, the watcher in observer env will clear

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603871#comment-14603871
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2155:
---

Unless we can get a better description of what's going on ... 

 network is not good, the watcher in observer env will clear
 ---

 Key: ZOOKEEPER-2155
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2155
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.4.6
Reporter: linking12
Priority: Critical
  Labels: moreinfo
 Fix For: 3.5.0


 When I set up a ZooKeeper ensemble that uses Observers and the network is not 
 very good, I find that all of the watchers disappear.
 Reading the source code, I found that when the observer connects to the 
 leader, it dumps the DataTree from the leader and rebuilds it on the observer, 
 but this operation clears the dataWatches and childWatches.
 After I changed the code as follows:
 {code}
WatchManager dataWatchers = zk.getZKDatabase().getDataTree().getDataWatches();
WatchManager childWatchers = zk.getZKDatabase().getDataTree().getChildWatches();
zk.getZKDatabase().clear();
zk.getZKDatabase().deserializeSnapshot(leaderIs);
zk.getZKDatabase().getDataTree().setDataWatches(dataWatchers);
zk.getZKDatabase().getDataTree().setChildWatches(childWatchers);
 {code}
 the watchers no longer disappear.





[jira] [Commented] (ZOOKEEPER-1525) Plumb ZooKeeperServer object into auth plugins

2015-06-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603875#comment-14603875
 ] 

Hadoop QA commented on ZOOKEEPER-1525:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12696698/ZOOKEEPER-1525.patch
  against trunk revision 1687876.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 2 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2781//console

This message is automatically generated.

 Plumb ZooKeeperServer object into auth plugins
 --

 Key: ZOOKEEPER-1525
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1525
 Project: ZooKeeper
  Issue Type: Improvement
Affects Versions: 3.5.0
Reporter: Warren Turkal
Assignee: Warren Turkal
 Fix For: 3.5.2, 3.6.0

 Attachments: ZOOKEEPER-1525.patch, ZOOKEEPER-1525.patch, 
 ZOOKEEPER-1525.patch


 I want to plumb the ZooKeeperServer object into the auth plugins so that I 
 can store authentication data in zookeeper itself. With access to the 
 ZooKeeperServer object, I also have access to the ZKDatabase and can look up 
 entries in the local copy of the zookeeper data.
 In order to implement this, I make sure that a ZooKeeperServer instance is 
 passed in to the ProviderRegistry.initialize() method. Then initialize() will 
 try to find a constructor for the AuthenticationProvider that takes a 
 ZooKeeperServer instance. If the constructor is found, it will be used. 
 Otherwise, initialize() will look for a constructor that takes no arguments 
 and use that instead.
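The constructor-selection strategy described above can be sketched with plain reflection (the class names below are stand-ins for illustration, not ZooKeeper's actual ProviderRegistry or AuthenticationProvider types):

```java
import java.lang.reflect.Constructor;

// Illustrative sketch of the described initialization strategy: prefer a
// constructor taking the server object, fall back to the no-arg constructor.
public class PluginLoader {
    // Stand-in for ZooKeeperServer in this illustration.
    public static class Server {}

    // Example plugin that only has a no-arg constructor.
    public static class NoArgPlugin {}

    // Example plugin that wants the server instance plumbed in.
    public static class ServerAwarePlugin {
        public final Server server;
        public ServerAwarePlugin(Server server) { this.server = server; }
    }

    public static Object instantiate(Class<?> pluginClass, Server server) {
        try {
            // First, look for a constructor that accepts the server instance.
            Constructor<?> c = pluginClass.getConstructor(Server.class);
            return c.newInstance(server);
        } catch (NoSuchMethodException e) {
            // Otherwise fall back to the no-argument constructor.
            try {
                return pluginClass.getConstructor().newInstance();
            } catch (ReflectiveOperationException e2) {
                throw new RuntimeException(e2);
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```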





Failed: ZOOKEEPER-1525 PreCommit Build #2781

2015-06-26 Thread Apache Jenkins Server
Jira: https://issues.apache.org/jira/browse/ZOOKEEPER-1525
Build: https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2781/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 98 lines...]
 [exec] Hunk #2 FAILED at 34.
 [exec] Hunk #3 succeeded at 64 (offset 2 lines).
 [exec] 1 out of 3 hunks FAILED -- saving rejects to file 
src/java/main/org/apache/zookeeper/server/auth/ProviderRegistry.java.rej
 [exec] patching file 
src/java/test/org/apache/zookeeper/test/KeyAuthClientTest.java
 [exec] PATCH APPLICATION FAILED
 [exec] 
 [exec] 
 [exec] 
 [exec] 
 [exec] -1 overall.  Here are the results of testing the latest attachment 
 [exec]   
http://issues.apache.org/jira/secure/attachment/12696698/ZOOKEEPER-1525.patch
 [exec]   against trunk revision 1687876.
 [exec] 
 [exec] +1 @author.  The patch does not contain any @author tags.
 [exec] 
 [exec] +1 tests included.  The patch appears to include 2 new or 
modified tests.
 [exec] 
 [exec] -1 patch.  The patch command could not apply the patch.
 [exec] 
 [exec] Console output: 
https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2781//console
 [exec] 
 [exec] This message is automatically generated.
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Adding comment to Jira.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 
 [exec] Comment added.
 [exec] dc734804594abcc5229fbfbcea3736c41a2b574a logged out
 [exec] 
 [exec] 
 [exec] 
==
 [exec] 
==
 [exec] Finished build.
 [exec] 
==
 [exec] 
==
 [exec] 
 [exec] 

BUILD FAILED
/home/jenkins/jenkins-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/build.xml:1782:
 exec returned: 1

Total time: 45 seconds
Build step 'Execute shell' marked build as failure
Archiving artifacts
Sending artifact delta relative to PreCommit-ZOOKEEPER-Build #2752
Archived 1 artifacts
Archive block size is 32768
Received 0 blocks and 60820 bytes
Compression is 0.0%
Took 8.3 sec
Recording test results
Description set: ZOOKEEPER-1525
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603793#comment-14603793
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2193:
---

Ok - thanks for checking [~shralex] and [~fpj]. I'll go ahead and merge then. 

 reconfig command completes even if parameter is wrong obviously
 ---

 Key: ZOOKEEPER-2193
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2193
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection, server
Affects Versions: 3.5.0
 Environment: CentOS7 + Java7
Reporter: Yasuhito Fukuda
Assignee: Yasuhito Fukuda
 Attachments: ZOOKEEPER-2193-v2.patch, ZOOKEEPER-2193-v3.patch, 
 ZOOKEEPER-2193-v4.patch, ZOOKEEPER-2193-v5.patch, ZOOKEEPER-2193-v6.patch, 
 ZOOKEEPER-2193-v7.patch, ZOOKEEPER-2193-v8.patch, ZOOKEEPER-2193.patch


 Even if a reconfig parameter is obviously wrong, the command completes.
 Refer to the following:
 - Ensemble consists of four nodes
 {noformat}
 [zk: vm-101:2181(CONNECTED) 0] config
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 version=1
 {noformat}
 - add node by reconfig command
 {noformat}
 [zk: vm-101:2181(CONNECTED) 9] reconfig -add 
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 Committed new configuration:
 server.1=192.168.100.101:2888:3888:participant
 server.2=192.168.100.102:2888:3888:participant
 server.3=192.168.100.103:2888:3888:participant
 server.4=192.168.100.104:2888:3888:participant
 server.5=192.168.100.104:2888:3888:participant;0.0.0.0:2181
 version=30007
 {noformat}
 The IP addresses of server.4 and server.5 are duplicates.
 In this state, leader election will not work properly.
 Besides, the ensemble is assumed to end up in an undesirable state.
 I think reconfig needs parameter validation.
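The missing check could be sketched as below. This is a hypothetical illustration, not ZooKeeper's actual reconfig code: ReconfigValidator, hasDuplicateEndpoint, and the simplified server-spec handling are invented names for the idea of rejecting a "reconfig -add" whose quorum endpoint duplicates an existing server's.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: reject "reconfig -add" when the new server's
// host:quorumPort duplicates one already in the ensemble. Server specs
// follow the "host:quorumPort:electionPort:role[;clientAddr]" form above.
public class ReconfigValidator {

    // Returns true when `added` reuses a quorum endpoint from `current`.
    public static boolean hasDuplicateEndpoint(List<String> current, String added) {
        Set<String> endpoints = new HashSet<>();
        for (String spec : current) {
            endpoints.add(endpointOf(spec));
        }
        // Set.add returns false when the endpoint is already present.
        return !endpoints.add(endpointOf(added));
    }

    private static String endpointOf(String spec) {
        String[] parts = spec.split(":");
        return parts[0] + ":" + parts[1]; // host + quorum port identify the endpoint
    }
}
```

With the configuration above, adding server.5=192.168.100.104:2888:... would be rejected because server.4 already owns the 192.168.100.104:2888 endpoint.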



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603795#comment-14603795
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2193:
---

I think it's been clarified below by [~fpj] and [~shralex] - we can ignore 
observers using non-unique (i.e.: -1) ids for now. I'll go ahead and merge. 
Thanks [~Yasuhito Fukuda]!



[jira] [Commented] (ZOOKEEPER-2140) NettyServerCnxn and NIOServerCnxn code should be improved

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603823#comment-14603823
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2140:
---

[~arshad.mohammad]: if you are using git, you need to create the patch with:

{code}
git diff --no-prefix HEAD~1.. > ZOOKEEPER-2140.patch
{code}

Otherwise, jenkins won't be able to apply it and run the tests. Please recreate 
the patch and upload it again, thanks!

 NettyServerCnxn and NIOServerCnxn code should be improved
 -

 Key: ZOOKEEPER-2140
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2140
 Project: ZooKeeper
  Issue Type: Improvement
Reporter: Arshad Mohammad
 Fix For: 3.6.0

 Attachments: ZOOKEEPER-2140-1.patch, ZOOKEEPER-2140-2.patch, 
 ZOOKEEPER-2140-3.patch


 Classes org.apache.zookeeper.server.NIOServerCnxn and 
 org.apache.zookeeper.server.NettyServerCnxn have the following scope for 
 improvement:
 1) Duplicate code.
   These two classes share around 250 lines of duplicated code; all of the 
 command code is duplicated.
 2) Many improvements/bug fixes were made in one class but not in the other. 
 These changes should be kept in sync.
 For example
 In NettyServerCnxn
 {code}
 // clone should be faster than iteration
 // ie give up the cnxns lock faster
 AbstractSet<ServerCnxn> cnxns;
 synchronized (factory.cnxns) {
     cnxns = new HashSet<ServerCnxn>(factory.cnxns);
 }
 for (ServerCnxn c : cnxns) {
     c.dumpConnectionInfo(pw, false);
     pw.println();
 }
 {code}
 In NIOServerCnxn
 {code}
 for (ServerCnxn c : factory.cnxns) {
     c.dumpConnectionInfo(pw, false);
     pw.println();
 }
 {code}
 3) NettyServerCnxn and NIOServerCnxn are unnecessarily bulky. The command 
 classes have altogether different functionality and should go into separate 
 class files.
 If this is done, it will be easy to add a new command with minimal change to 
 the existing classes.
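The duplication in point 1 could be factored into one shared helper that both connection factories call. A minimal sketch with invented names (ConnectionDumper and Dumpable stand in for the real ServerCnxn types), not the actual patch:

```java
import java.io.PrintWriter;

// Hypothetical sketch: both NIOServerCnxn and NettyServerCnxn could delegate
// their copied dump loops to one helper. `Dumpable` stands in for ServerCnxn.
public class ConnectionDumper {

    public interface Dumpable {
        void dumpConnectionInfo(PrintWriter pw, boolean brief);
    }

    // Shared loop: dump each connection followed by a blank line -- exactly
    // the code that is currently duplicated in the two factory classes.
    public static void dumpAll(Iterable<? extends Dumpable> cnxns, PrintWriter pw) {
        for (Dumpable c : cnxns) {
            c.dumpConnectionInfo(pw, false);
            pw.println();
        }
        pw.flush();
    }
}
```

Each factory would then only be responsible for snapshotting its own connection set (under its own lock) before calling the helper.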





[jira] [Commented] (ZOOKEEPER-1525) Plumb ZooKeeperServer object into auth plugins

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603822#comment-14603822
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-1525:
---

Thanks for the patch [~timrc]! I added a few comments in the RB. After 
updating that, mind re-attaching the patch here as well so CI can run too? 
Thanks!

 Plumb ZooKeeperServer object into auth plugins
 --

 Key: ZOOKEEPER-1525
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1525
 Project: ZooKeeper
  Issue Type: Improvement
Affects Versions: 3.5.0
Reporter: Warren Turkal
Assignee: Warren Turkal
 Fix For: 3.5.2, 3.6.0

 Attachments: ZOOKEEPER-1525.patch, ZOOKEEPER-1525.patch, 
 ZOOKEEPER-1525.patch


 I want to plumb the ZooKeeperServer object into the auth plugins so that I 
 can store authentication data in zookeeper itself. With access to the 
 ZooKeeperServer object, I also have access to the ZKDatabase and can look up 
 entries in the local copy of the zookeeper data.
 In order to implement this, I make sure that a ZooKeeperServer instance is 
 passed in to the ProviderRegistry.initialize() method. Then initialize() will 
 try to find a constructor for the AuthenticationProvider that takes a 
 ZooKeeperServer instance. If the constructor is found, it will be used. 
 Otherwise, initialize() will look for a constructor that takes no arguments 
 and use that instead.
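The constructor-selection step described above can be sketched with plain reflection. This is an illustration, not the actual ZOOKEEPER-1525 patch: ProviderLoader and FakeServer are invented names (FakeServer stands in for ZooKeeperServer), and the sample provider classes are hypothetical.

```java
import java.lang.reflect.Constructor;

// Hypothetical sketch of the lookup: prefer a constructor taking the server
// object, fall back to the no-arg constructor if it does not exist.
public class ProviderLoader {

    public static class FakeServer {}

    // Invented sample providers for illustration.
    public static class ServerAwareProvider {
        public final FakeServer server;
        public ServerAwareProvider(FakeServer s) { this.server = s; }
    }

    public static class PlainProvider {
        public PlainProvider() {}
    }

    public static Object instantiate(Class<?> providerClass, FakeServer server) {
        try {
            // Prefer the constructor that accepts the server instance.
            Constructor<?> c = providerClass.getConstructor(FakeServer.class);
            return c.newInstance(server);
        } catch (NoSuchMethodException e) {
            try {
                // Otherwise fall back to the no-argument constructor.
                return providerClass.getConstructor().newInstance();
            } catch (ReflectiveOperationException e2) {
                throw new RuntimeException(e2);
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }
}
```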





[jira] [Commented] (ZOOKEEPER-2170) Zookeeper is not logging as per the configuration in log4j.properties

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603828#comment-14603828
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2170:
---

Sounds sensible to me, could you provide a patch for that?

 Zookeeper is not logging as per the configuration in log4j.properties
 --

 Key: ZOOKEEPER-2170
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2170
 Project: ZooKeeper
  Issue Type: Bug
Reporter: Arshad Mohammad
Assignee: Chris Nauroth
 Fix For: 3.6.0

 Attachments: ZOOKEEPER-2170.001.patch


 In conf/log4j.properties default root logger is 
 {code}
 zookeeper.root.logger=INFO, CONSOLE
 {code}
 Changing the root logger to the value below, or any other value, does not 
 change the logging behaviour.
 {code}
 zookeeper.root.logger=DEBUG, ROLLINGFILE
 {code}
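The fix being discussed amounts to making ROLLINGFILE the default appender so the log rotates and survives restarts. A configuration sketch of what conf/log4j.properties could default to; the appender and property names follow the stock file, but treat the exact values as illustrative rather than the committed patch:

```properties
# Sketch only: route logging through a size-bounded rolling file by default
zookeeper.root.logger=INFO, ROLLINGFILE
log4j.rootLogger=${zookeeper.root.logger}

log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.Threshold=INFO
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/${zookeeper.log.file}
# bound file growth and keep history across restarts
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
```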





[jira] [Commented] (ZOOKEEPER-2217) event might be lost before re-watch

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603844#comment-14603844
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2217:
---

getChildren (and getData, etc) do get the data & set their watches atomically, 
so how would inverting the order change anything?

The only way of getting _every_ intermediate state would be by tailing the 
transaction logs, but at that point maybe ZooKeeper is not the right tool for 
the job. 

 event might be lost before re-watch
 

 Key: ZOOKEEPER-2217
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2217
 Project: ZooKeeper
  Issue Type: Improvement
  Components: c client, java client
Affects Versions: 3.4.5, 3.4.6
 Environment: jdk1.7_45 on centos6.5 and ubuntu14.4 
Reporter: Caspian

 I use zk to monitor the children nodes under a path, e.g. /servers. 
 When the client is told that the children changed, I have to re-watch the path 
 again; during that period, it's possible that some children go down, or some 
 come up, and those events will be missed.
 For now, my temporary solution is not to use getChildren(path, true...) to 
 get the children and re-watch the path, but to re-watch the path first, then 
 get the children. Thus no events are missed, but I don't know what the 
 zk server will be like if there are too many clients that act like this.
 What do you think of this problem? Are there any other solutions?





[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-06-26 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603650#comment-14603650
 ] 

Flavio Junqueira commented on ZOOKEEPER-2193:
-

We have added code to QCM so that observers could connect without providing a 
unique id, but I don't think the way we process configuration supports this 
feature currently. For example, we check if a peer is observing by checking the 
server id. The intent was to support this feature, though, to be able to 
connect observers without having to change the configuration.



[jira] [Commented] (ZOOKEEPER-175) needed: docs for ops - how to setup acls & authentication in the server

2015-06-26 Thread Albert Taylor (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603675#comment-14603675
 ] 

Albert Taylor commented on ZOOKEEPER-175:
-

This looks like a pretty good start.
https://ihong5.wordpress.com/2014/07/24/apache-zookeeper-setting-acl-in-zookeeper-client/


 needed: docs for ops - how to setup acls & authentication in the server
 ---

 Key: ZOOKEEPER-175
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-175
 Project: ZooKeeper
  Issue Type: Improvement
  Components: documentation
Reporter: Robbie Scott

 Part of the interest in creating documentation related to security.





[jira] [Commented] (ZOOKEEPER-2221) Zookeeper JettyAdminServer server should start on configured IP.

2015-06-26 Thread Raul Gutierrez Segales (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603851#comment-14603851
 ] 

Raul Gutierrez Segales commented on ZOOKEEPER-2221:
---

Thanks for the patch [~surendrasingh]! A few comments:

* the indentation seems off because of tabs, could you please use spaces for 
indentation to make it consistent with the rest of the file?
* could you document the new property (zookeeper.admin.address) in 
zookeeperAdmin.html?

Thanks!

 Zookeeper JettyAdminServer server should start on configured IP.
 

 Key: ZOOKEEPER-2221
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2221
 Project: ZooKeeper
  Issue Type: Bug
  Components: quorum
Affects Versions: 3.5.0
Reporter: Surendra Singh Lilhore
Assignee: Surendra Singh Lilhore
 Attachments: ZOOKEEPER-2221.patch


 Currently JettyAdminServer starts on the 0.0.0.0 IP. 0.0.0.0 means all IP 
 addresses on the local machine. So, if your webserver machine has two IP 
 addresses, 192.168.1.1 (private) and 10.1.2.1 (public), and you allow a 
 webserver daemon like Apache to listen on 0.0.0.0, it will be reachable at 
 both of those IPs.
 This is a security issue; the webserver should be accessible only from the configured IP.





[jira] [Commented] (ZOOKEEPER-2193) reconfig command completes even if parameter is wrong obviously

2015-06-26 Thread Alexander Shraer (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603689#comment-14603689
 ] 

Alexander Shraer commented on ZOOKEEPER-2193:
-

I'm not sure it would work with reconfig either. In any case, this should be 
discussed in a separate Jira (in case this is important to support)




[jira] [Commented] (ZOOKEEPER-2217) event might be lost before re-watch

2015-06-26 Thread Camille Fournier (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14603922#comment-14603922
 ] 

Camille Fournier commented on ZOOKEEPER-2217:
-

[~caspian] I am closing this jira because this was a fundamental design 
decision of the system and there seems to be some confusion about the intended 
usage and behavior. We're happy to discuss this in more depth on the users or 
dev mailing lists if you are interested in feedback on what you are trying to 
do. Thanks!



[jira] [Resolved] (ZOOKEEPER-2217) event might be lost before re-watch

2015-06-26 Thread Camille Fournier (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2217?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Camille Fournier resolved ZOOKEEPER-2217.
-
Resolution: Not A Problem



[jira] [Commented] (ZOOKEEPER-2172) Cluster crashes when reconfig a new node as a participant

2015-06-26 Thread Akihiro Suda (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602661#comment-14602661
 ] 

Akihiro Suda commented on ZOOKEEPER-2172:
-

Yes, as many logs as possible might be helpful.
Some additional information, such as the exact ZK version, workload 
scripts, or filesystem information, might also be helpful.

I am trying to reproduce the bug by injecting some {{Thread.sleep()}}s into 
syncing-related functions using byteman.
But I have not been able to reproduce the bug so far, as I am not sure which 
function should be injected.

 Cluster crashes when reconfig a new node as a participant
 -

 Key: ZOOKEEPER-2172
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2172
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection, quorum, server
Affects Versions: 3.5.0
 Environment: Ubuntu 12.04 + java 7
Reporter: Ziyou Wang
Priority: Critical
 Attachments: node-1.log, node-2.log, node-3.log, zoo-1.log, 
 zoo-2-1.log, zoo-2-2.log, zoo-2-3.log, zoo-2.log, zoo-2212-1.log, 
 zoo-2212-2.log, zoo-2212-3.log, zoo-3-1.log, zoo-3-2.log, zoo-3-3.log, 
 zoo-3.log, zoo-4-1.log, zoo-4-2.log, zoo-4-3.log, zoo.cfg.dynamic.1005d, 
 zoo.cfg.dynamic.next, zookeeper-1.log, zookeeper-2.log, zookeeper-3.log


 The operations are quite simple: start three zk servers one by one, then 
 reconfig the cluster to add the new one as a participant. When I add the  
 third one, the zk cluster may enter a weird state and cannot recover.
  
   I found “2015-04-20 12:53:48,236 [myid:1] - INFO  [ProcessThread(sid:1 
 cport:-1)::PrepRequestProcessor@547] - Incremental reconfig” in node-1 log. 
 So the first node received the reconfig cmd at 12:53:48. Later, it logged 
 “2015-04-20  12:53:52,230 [myid:1] - ERROR 
 [LearnerHandler-/10.0.0.2:55890:LearnerHandler@580] - Unexpected exception 
 causing shutdown while sock still open” and “2015-04-20 12:53:52,231 [myid:1] 
 - WARN  [LearnerHandler-/10.0.0.2:55890:LearnerHandler@595] - *** GOODBYE 
  /10.0.0.2:55890 ***”. From then on, the first node and second node 
 rejected all client connections and the third node didn’t join the cluster as 
 a participant. The whole cluster was done.
  
  When the problem happened, all three nodes just used the same dynamic 
 config file zoo.cfg.dynamic.1005d which only contained the first two 
 nodes. But there was another unused dynamic config file in node-1 directory 
 zoo.cfg.dynamic.next  which already contained three nodes.
  
  When I extended the waiting time between starting the third node and 
 reconfiguring the cluster, the problem didn’t show again. So it should be a 
 race condition problem.





[jira] [Updated] (ZOOKEEPER-2140) NettyServerCnxn and NIOServerCnxn code should be improved

2015-06-26 Thread Arshad Mohammad (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arshad Mohammad updated ZOOKEEPER-2140:
---
Attachment: ZOOKEEPER-2140-3.patch



[jira] [Commented] (ZOOKEEPER-2164) fast leader election keeps failing

2015-06-26 Thread Filip Deleersnijder (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14602876#comment-14602876
 ] 

Filip Deleersnijder commented on ZOOKEEPER-2164:


We experienced a related problem.  

In a test-setup with 6 servers (3.4.6) with 2 servers shut down, leader 
election could take a very long time ( 1 to 2 minutes ) to complete. Once we 
changed the cnxTO variable from 5000ms to 500ms in the QuorumCnxManager, it 
completed in under 10 seconds again.

In a setup with 8 servers (3.4.6) with 2 servers shut down, leader election 
could take a very long time ( We have experienced more than 10 minutes ! ) to 
complete and frequently started again immediately after completing.
Monday we will test our cnxTO fix on this setup as well.
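For reference, changing cnxTO should not require patching the jar: to the best of my knowledge, QuorumCnxManager in 3.4.x falls back to the "zookeeper.cnxTimeout" system property (milliseconds) for cnxTO. A hedged sketch, assuming the stock zkServer.sh/zkEnv.sh pass JVMFLAGS through to the JVM:

```shell
# Hedged sketch: lower the quorum connection timeout from the default 5000 ms
# to 500 ms via the "zookeeper.cnxTimeout" system property, set through
# JVMFLAGS before starting zkServer.sh.
export JVMFLAGS="-Dzookeeper.cnxTimeout=500 ${JVMFLAGS:-}"
echo "${JVMFLAGS}"
```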


 fast leader election keeps failing
 --

 Key: ZOOKEEPER-2164
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
 Project: ZooKeeper
  Issue Type: Bug
  Components: leaderElection
Affects Versions: 3.4.5
Reporter: Michi Mutsuzaki
Assignee: Hongchao Deng
 Fix For: 3.5.2, 3.6.0


 I have a 3-node cluster with sids 1, 2 and 3. Originally 2 is the leader. 
 When I shut down 2, 1 and 3 keep going back to leader election. Here is what 
 seems to be happening.
 - Both 1 and 3 elect 3 as the leader.
 - 1 receives votes from 3 and itself, and starts trying to connect to 3 as a 
 follower.
 - 3 doesn't receive votes for 5 seconds because connectOne() to 2 doesn't 
 timeout for 5 seconds: 
 https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L346
 - By the time 3 receives votes, 1 has given up trying to connect to 3: 
 https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L247
 I'm using 3.4.5, but it looks like this part of the code hasn't changed for a 
 while, so I'm guessing later versions have the same issue.


