[jira] [Created] (ZOOKEEPER-4516) checkstyle:check is failing
Mohammad Arshad created ZOOKEEPER-4516: -- Summary: checkstyle:check is failing Key: ZOOKEEPER-4516 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4 checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4515) ZK Cli quit command always logs error
Mohammad Arshad created ZOOKEEPER-4515: -- Summary: ZK Cli quit command always logs error Key: ZOOKEEPER-4515 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-04-08-15-47-04-325.png !image-2022-04-08-15-47-04-325.png! * When the connection is in the closing state, this warning log is useless; change it to debug level. * When the JVM exits with status 0, log at info level instead of error. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
Mohammad Arshad created ZOOKEEPER-4514: -- Summary: ClientCnxnSocketNetty throwing NPE Key: ZOOKEEPER-4514 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-04-07-13-27-13-068.png ClientCnxnSocketNetty throws an NPE. This mainly happens when one of the servers is restarting and a client tries to connect. !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
Mohammad Arshad created ZOOKEEPER-4510: -- Summary: dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 Key: ZOOKEEPER-4510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Fix For: 3.7.1, 3.6.4 On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is failing with following errors. {code:java} [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check (default-cli) on project zookeeper-assembly: [ERROR] [ERROR] One or more dependencies were identified with vulnerabilities that have a CVSS score greater than or equal to '0.0': [ERROR] [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4507) Create ZOO_DAEMON_OUT file backup when restarting the server
Mohammad Arshad created ZOOKEEPER-4507: -- Summary: Create ZOO_DAEMON_OUT file backup when restarting the server Key: ZOOKEEPER-4507 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4507 Project: ZooKeeper Issue Type: Improvement Components: scripts Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-03-29-20-33-57-181.png The ZooKeeper server daemon out file zookeeper-$USER-server-$HOSTNAME.out is overwritten on every server restart. As with the other log files, we should create a backup of this file too. The information logged in this file is often useful for issue analysis. For example, it records which transaction log and snapshot files were deleted. This is useful information that we should retain for some time. By default we could keep backups of 5 .out files, as shown below. !image-2022-03-29-20-33-57-181.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4506) Change Server default log4j appender from CONSOLE to ROLLINGFILE
Mohammad Arshad created ZOOKEEPER-4506: -- Summary: Change Server default log4j appender from CONSOLE to ROLLINGFILE Key: ZOOKEEPER-4506 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4506 Project: ZooKeeper Issue Type: Improvement Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 By default the server logs to CONSOLE, and the output is then redirected to the "zookeeper-$USER-server-$HOSTNAME.out" file. This file is overwritten on every server restart, and while the server is running it keeps growing because it is never split by size. I think the default logging appender should be ROLLINGFILE instead of CONSOLE. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
Mohammad Arshad created ZOOKEEPER-4504: -- Summary: ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality Key: ZOOKEEPER-4504 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1

*Problem and Analysis:*
After integrating ZooKeeper 3.6.3 we observed a deadlock in the HDFS HA functionality, as shown in the thread dumps below.
{code:java}
"main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x7f9c017f1000 nid=0x101b waiting for monitor entry [0x7f9bda8a6000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603)
    - waiting to lock <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector)
    at org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582)
{code}
{code:java}
"main" #1 prio=5 os_prio=0 tid=0x7f9c0006 nid=0xea3 waiting on condition [0x7f9c06404000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0xc1b383c8> (a java.util.concurrent.Semaphore$NonfairSync)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306)
    at java.util.concurrent.Semaphore.acquire(Semaphore.java:467)
    at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122)
    at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64)
    at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76)
    at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386)
    at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103)
    at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095)
    at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383)
    - locked <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector)
    at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290)
    at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227)
    at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186)
    at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:360)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498)
    at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182)
    at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220)
{code}
org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is instance synchronized and calls ZKUtil.deleteRecursive(zk, pathRoot). ZKUtil.deleteRecursive uses the async API, and in the callback path it invokes ActiveStandbyElector#processWatchEvent, which is synchronized on the same ActiveStandbyElector instance. So there is a deadlock: clearParentZNode() is waiting for processWatchEvent() to complete, and processWatchEvent() is waiting for clearParentZNode() to complete.

*Why this problem was not happening with earlier versions (3.5.x)?*
In earlier ZooKeeper versions, ZKUtil.deleteRecursive used the sync ZooKeeper API internally, so no callback (processWatchEvent) came into the picture.

*Proposed Fix:*
There are two approaches to fix this problem.
1. Fix the problem in HDFS: modify the HDFS code to avoid the deadlock. But we may get similar bugs in other projects.
2. Fix the problem in ZK: make the API behave as it did before (use the sync API to delete the ZK node) and provide a new overloaded API with the new behavior (use the async API to delete the ZK node).
I propose to fix the
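The lock cycle visible in the two thread dumps can be reproduced in miniature. The following is only a sketch under stated assumptions: DeadlockSketch and its methods are hypothetical stand-ins, not Hadoop or ZooKeeper code; a single-threaded executor plays the role of the client's event thread, and a timeout replaces the unbounded Semaphore.acquire() so that the example terminates.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the elector; not Hadoop or ZooKeeper code. */
public class DeadlockSketch {
    // Single thread standing in for the ZooKeeper client's event thread.
    private final ExecutorService eventThread = Executors.newSingleThreadExecutor();

    // Like processWatchEvent(): needs the instance monitor.
    synchronized void processWatchEvent() {
        // event handling would go here
    }

    // Like clearParentZNode(): holds the monitor while waiting for an async callback.
    synchronized boolean clearParentZNode(long timeoutMs) throws InterruptedException {
        Semaphore done = new Semaphore(0);
        // A watch event arrives first and blocks the event thread on the monitor.
        eventThread.execute(() -> processWatchEvent());
        // The async delete callback is queued behind it on the same thread.
        eventThread.execute(() -> done.release());
        // The code in the thread dump waits forever; a timeout lets the sketch finish.
        return done.tryAcquire(timeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Returns true if the callback ran before the timeout, false if it was stuck. */
    public static boolean runOnce(long timeoutMs) {
        DeadlockSketch sketch = new DeadlockSketch();
        try {
            return sketch.clearParentZNode(timeoutMs);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            // Queued tasks still run once the monitor is released, then the thread exits.
            sketch.eventThread.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("callback completed: " + runOnce(200));
    }
}
```

In this sketch runOnce returns false: the callback cannot run while the caller holds the monitor. With a synchronous delete there is no callback to wait for, which matches the explanation of why 3.5.x was not affected.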
[jira] [Created] (ZOOKEEPER-4282) Redesign quota feature
Mohammad Arshad created ZOOKEEPER-4282: -- Summary: Redesign quota feature Key: ZOOKEEPER-4282 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4282 Project: ZooKeeper Issue Type: New Feature Components: quota Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.8.0

*Quota Use Case:*
Generally, in a big data solution deployment, multiple services (hdfs, yarn, hbase etc.) use a single ZooKeeper cluster, so it is very important to ensure fair usage by all services. Sometimes services unintentionally, mainly because of faulty behavior, create many znodes and impact the overall reliability of the ZooKeeper service. The quota feature is required to ensure fair usage. But this is not the only use case; there are many other use cases for the quota feature.

*Current Problems:*
# Currently, a user can set a quota by updating the znode “/zookeeper/quota/nodepath”, or by using the setquota/delquota CLI commands. This makes the quota setting ineffective. Currently any user can set or delete a quota, which is not proper; it should be an admin operation.
# Users are allowed to modify ZooKeeper system paths like /zookeeper/quota. These are internal to ZooKeeper and should not be open to modification.
# Generally, services create a single top-level znode in ZooKeeper, like /hbase, and create all required znodes under it. It would be better if it were configurable who can create top-level znodes, to control ZooKeeper usage.
# After ZOOKEEPER-231, there are two kinds of quota enforcement limits: 1. hard limit, 2. soft limit. I think there should be only one limit. When enforce quota is enabled, that limit becomes the hard limit; otherwise it is a soft limit, same as the old feature, and just logs warnings.

*Proposed Solution*
# Add setQuota and deleteQuota admin APIs. Add a listQuota normal user API. Modify the quota CLI commands to use these APIs instead of directly modifying the ZooKeeper system path /zookeeper/quota/
# Protect ZooKeeper system paths from outside modification. System paths should only be readable from outside.
# Expose configuration to set the ACL for the root system znode. After this, at the time of ZooKeeper service deployment, the administrator can create the top-level znode for a service and set its quota. This way we can control overall ZooKeeper usage.
# Revert some of the changes in ZOOKEEPER-231 and move to a single quota limit.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
Mohammad Arshad created ZOOKEEPER-4278: -- Summary: dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 Key: ZOOKEEPER-4278 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4273) Forward port ZOOKEEPER-3931: "zkServer.sh version" returns a trailing dash
Mohammad Arshad created ZOOKEEPER-4273: -- Summary: Forward port ZOOKEEPER-3931: "zkServer.sh version" returns a trailing dash Key: ZOOKEEPER-4273 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4273 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.7.1 Reporter: Mohammad Arshad When you run zkServer.sh version the result includes a few spam lines and the version reports a trailing dash {noformat} bin/zkServer.sh version ZooKeeper JMX enabled by default Using config: /xxx/bin/../conf/zoo.cfg Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4270) Flaky test: org.apache.zookeeper.server.quorum.QuorumPeerMainTest#testLeaderOutOfView
Mohammad Arshad created ZOOKEEPER-4270: -- Summary: Flaky test: org.apache.zookeeper.server.quorum.QuorumPeerMainTest#testLeaderOutOfView Key: ZOOKEEPER-4270 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4270 Project: ZooKeeper Issue Type: Sub-task Components: tests Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.6.3, 3.8.0, 3.7.1 org.apache.zookeeper.server.quorum.QuorumPeerMainTest#testLeaderOutOfView is flaky and often fails when I run it in local CI. Failure message: {noformat} java.lang.AssertionError at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderOutOfView(QuorumPeerMainTest.java:937) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4269) acceptedEpoch.tmp rename failure will cause server startup error
Mohammad Arshad created ZOOKEEPER-4269: -- Summary: acceptedEpoch.tmp rename failure will cause server startup error Key: ZOOKEEPER-4269 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4269 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad accepted epoch is first written to temporary file acceptedEpoch.tmp then this file is renamed to acceptedEpoch. Failure, either because of exception or power-off, in renaming the acceptedEpoch.tmp file will cause server startup error with message "The current epoch, x, is older than the last zxid y" To handle this scenario we should read accepted epoch from this temp file as well. For more context, refer https://github.com/apache/zookeeper/pull/1109 -- This message was sent by Atlassian Jira (v8.3.4#803005)
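One way to picture the proposed handling is the sketch below. It is not the ZooKeeper implementation: AcceptedEpochSketch, readOrDefault and the rule "take the larger of the two values" are assumptions made for illustration, and an unreadable or half-written temp file is simply ignored.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch; class and method names are not from ZooKeeper. */
public class AcceptedEpochSketch {
    /** Reads a long from the file, or returns -1 if it is missing or unparsable. */
    static long readOrDefault(Path file) {
        try {
            if (!Files.exists(file)) {
                return -1L;
            }
            String text = new String(Files.readAllBytes(file), StandardCharsets.UTF_8).trim();
            return Long.parseLong(text);
        } catch (IOException | NumberFormatException e) {
            return -1L;
        }
    }

    /** Assumption: when the rename did not happen, the temp file holds the newer epoch. */
    static long readAcceptedEpoch(Path dataDir) {
        long committed = readOrDefault(dataDir.resolve("acceptedEpoch"));
        long pending = readOrDefault(dataDir.resolve("acceptedEpoch.tmp"));
        return Math.max(committed, pending);
    }

    /** Simulates a failed rename: old value in acceptedEpoch, new value left in the temp file. */
    public static long demo() {
        try {
            Path dir = Files.createTempDirectory("epoch-sketch");
            Files.write(dir.resolve("acceptedEpoch"), "5".getBytes(StandardCharsets.UTF_8));
            Files.write(dir.resolve("acceptedEpoch.tmp"), "6".getBytes(StandardCharsets.UTF_8));
            return readAcceptedEpoch(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In demo() the temp file carries epoch 6 while acceptedEpoch still holds 5, so the sketch reports 6 instead of failing on the stale value.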
[jira] [Created] (ZOOKEEPER-4267) Fix check-style issues
Mohammad Arshad created ZOOKEEPER-4267: -- Summary: Fix check-style issues Key: ZOOKEEPER-4267 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4267 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.6.3, 3.8.0, 3.7.1 Currently there are check-style issues reported in the following files, which need to be fixed {noformat} org.apache.zookeeper.common.CertificatesToPlayWith org.apache.zookeeper.server.quorum.QuorumPeerMainTest {noformat} Because of these issues, checkstyle:check is failing in all the branches -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4266) Correct ZooKeeper version in documentation header
Mohammad Arshad created ZOOKEEPER-4266: -- Summary: Correct ZooKeeper version in documentation header Key: ZOOKEEPER-4266 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4266 Project: ZooKeeper Issue Type: Bug Components: documentation Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.8.0, 3.7.1 Attachments: image-2021-03-28-22-25-39-949.png Both Master and branch-3.7 documentation header have ZooKeeper version as 3.6. These should be changed to 3.8 and 3.7 for master and branch-3.7 respectively Master documentation currently: !image-2021-03-28-22-25-39-949.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4257) learner.asyncSending and learner.closeSocketAsync should be configurable in zoo.cfg
Mohammad Arshad created ZOOKEEPER-4257: -- Summary: learner.asyncSending and learner.closeSocketAsync should be configurable in zoo.cfg Key: ZOOKEEPER-4257 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4257 Project: ZooKeeper Issue Type: Sub-task Reporter: Mohammad Arshad Fix For: 3.7.0, 3.8.0 The configurations learner.asyncSending and learner.closeSocketAsync, introduced in ZOOKEEPER-3575 and ZOOKEEPER-3574, are Java system properties only, which means they cannot be configured through the ZooKeeper configuration file zoo.cfg. As these JIRA changes are not released yet, it is better to correct this and make them configurable through zoo.cfg. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4252) Flaky test: RequestPathMetricsCollectorTest#testMultiThreadPerf
Mohammad Arshad created ZOOKEEPER-4252: -- Summary: Flaky test: RequestPathMetricsCollectorTest#testMultiThreadPerf Key: ZOOKEEPER-4252 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4252 Project: ZooKeeper Issue Type: Sub-task Reporter: Mohammad Arshad Attachments: image-2021-03-16-12-31-46-432.png Flakiness=20.0% (3 / 15) !image-2021-03-16-12-31-46-432.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4251) Flaky test: org.apache.zookeeper.test.WatcherTest
Mohammad Arshad created ZOOKEEPER-4251: -- Summary: Flaky test: org.apache.zookeeper.test.WatcherTest Key: ZOOKEEPER-4251 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4251 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Attachments: image-2021-03-16-12-24-27-480.png !image-2021-03-16-12-24-27-480.png! -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4227) X509AuthFailureTest is failing consistently
Mohammad Arshad created ZOOKEEPER-4227: -- Summary: X509AuthFailureTest is failing consistently Key: ZOOKEEPER-4227 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4227 Project: ZooKeeper Issue Type: Bug Components: tests Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.6.3, 3.7.0, 3.8.0 X509AuthFailureTest is failing consistently. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3969) Add whoami API and Cli command
Mohammad Arshad created ZOOKEEPER-3969: -- Summary: Add whoami API and Cli command Key: ZOOKEEPER-3969 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3969 Project: ZooKeeper Issue Type: New Feature Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.0, 3.6.3 When connected to ZooKeeper through the CLI, there is often a need to know who the current user is. This is helpful when investigating ACL-related problems. Finding the current user is not easy; one has to check many configurations at both the client and the server. Personally, I run the three commands below to find the current user {code:java} create /a setAcl /a auth::cdrwa getAcl /a {code} Given all of the above, I think a whoami command would be a good addition to the ZooKeeper command list. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3891) ZKCli commands give wrong error message "Authentication is not valid" for insufficient permissions
Mohammad Arshad created ZOOKEEPER-3891: -- Summary: ZKCli commands give wrong error message "Authentication is not valid" for insufficient permissions Key: ZOOKEEPER-3891 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3891 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad

ZKCli commands give the error message "Authentication is not valid" for insufficient permissions (when KeeperException.NoAuthException is thrown). This message is misleading.
Steps to get the error:
{code:java}
[zk: vm1:2181(CONNECTED) 0] create /b
Created /b
[zk: vm1:2181(CONNECTED) 1] getAcl /b
'world,'anyone : cdrwa
[zk: vm1:2181(CONNECTED) 2] setAcl /b world:anyone:ra
[zk: vm1:2181(CONNECTED) 3] getAcl /b
'world,'anyone : ra
[zk: vm1:2181(CONNECTED) 4] create /b/b1
Authentication is not valid : /b/b1
[zk: vm1:2181(CONNECTED) 5]
{code}
I think we should change the message "Authentication is not valid" to "Insufficient permission"
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3887) In SSL-only server zkServer.sh status command should use secureClientPortAddress instead of clientPortAddress
Mohammad Arshad created ZOOKEEPER-3887: -- Summary: In SSL-only server zkServer.sh status command should use secureClientPortAddress instead of clientPortAddress Key: ZOOKEEPER-3887 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3887 Project: ZooKeeper Issue Type: Bug Components: scripts Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.0, 3.6.2, 3.5.9 When only SSL client port is enabled, zkServer.sh status command should use secureClientPortAddress value instead of clientPortAddress. As clientPortAddress is not configured, zkServer.sh status command tries to connect to localhost and fails. ZOOKEEPER-3818 has addressed the port issue, same way we should address the host issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3886) Client connection string should support IPV6 with or without enclosed in square bracket.
Mohammad Arshad created ZOOKEEPER-3886: -- Summary: Client connection string should support IPV6 with or without enclosed in square bracket. Key: ZOOKEEPER-3886 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3886 Project: ZooKeeper Issue Type: Improvement Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.5.9 Clients should be able to connect to ZooKeeper with or without square bracket around IPV6 in connection string. 127:0:0:0:0:0:0:1:2181 and [127:0:0:0:0:0:0:1]:2181 both should work. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3878) Client connection fails if IPV6 is not enclosed in square brackets
Mohammad Arshad created ZOOKEEPER-3878: -- Summary: Client connection fails if IPV6 is not enclosed in square brackets Key: ZOOKEEPER-3878 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3878 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 4.0.0, 3.6.2 Clients should be able to connect to ZooKeeper with or without square brackets around the IPV6 address in the connection string. 127:0:0:0:0:0:0:1:2181 and [127:0:0:0:0:0:0:1]:2181 should both work. After the ZOOKEEPER-3106 fix, connecting with 127:0:0:0:0:0:0:1:2181 fails. I think we should support IPV6 addresses both with and without square brackets. -- This message was sent by Atlassian Jira (v8.3.4#803005)
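As an illustration of accepting both forms, here is a minimal parsing sketch. ConnectStringSketch and HostPort are hypothetical names and this is not ZooKeeper's connection string parser; it also assumes the port is always present, since without brackets the last colon is otherwise ambiguous.

```java
/** Hypothetical sketch of host/port splitting; not ZooKeeper's parser. */
public class ConnectStringSketch {
    public static final class HostPort {
        public final String host;
        public final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Accepts "[ipv6]:port" as well as "ipv6:port" or "host:port". */
    public static HostPort parse(String server) {
        String host;
        String portText;
        if (server.startsWith("[")) {
            // Bracketed form: the host is everything between '[' and ']'.
            int close = server.indexOf(']');
            if (close < 0 || close + 1 >= server.length() || server.charAt(close + 1) != ':') {
                throw new IllegalArgumentException("Malformed bracketed address: " + server);
            }
            host = server.substring(1, close);
            portText = server.substring(close + 2);
        } else {
            // Unbracketed form: assume the text after the last colon is the port.
            int lastColon = server.lastIndexOf(':');
            if (lastColon < 0) {
                throw new IllegalArgumentException("Missing port: " + server);
            }
            host = server.substring(0, lastColon);
            portText = server.substring(lastColon + 1);
        }
        return new HostPort(host, Integer.parseInt(portText));
    }

    public static void main(String[] args) {
        for (String s : new String[] {"127:0:0:0:0:0:0:1:2181", "[127:0:0:0:0:0:0:1]:2181"}) {
            HostPort hp = parse(s);
            System.out.println(hp.host + " port " + hp.port);
        }
    }
}
```

Both inputs from the issue yield host 127:0:0:0:0:0:0:1 and port 2181 in this sketch.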
[jira] [Created] (ZOOKEEPER-3877) JMX Bean RemotePeerBean should enclose IPV6 host in square bracket same as LocalPeerBean
Mohammad Arshad created ZOOKEEPER-3877: -- Summary: JMX Bean RemotePeerBean should enclose IPV6 host in square bracket same as LocalPeerBean Key: ZOOKEEPER-3877 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3877 Project: ZooKeeper Issue Type: Bug Components: jmx Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 4.0.0 JMX metrics Bean RemotePeerBean should enclose ipv6 host in square bracket same as LocalPeerBean Changes done in ZOOKEEPER-3057 for LocalPeerBean should also be done for RemotePeerBean -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3876) zkServer.sh status command fails when IPV6 is configured
Mohammad Arshad created ZOOKEEPER-3876: -- Summary: zkServer.sh status command fails when IPV6 is configured Key: ZOOKEEPER-3876 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3876 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.0 When server configuration has client IP and port in it as below {code:java} server.1=127:0:0:0:0:0:0:1:2890:3890:participant;127:0:0:0:0:0:0:1:2181 {code} Then zkServer.sh status command fails. It is not able to parse the host and ip. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3815) Support a new comprehensive parent znode watcher
Mohammad Arshad created ZOOKEEPER-3815: -- Summary: Support a new comprehensive parent znode watcher Key: ZOOKEEPER-3815 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3815 Project: ZooKeeper Issue Type: New Feature Reporter: Mohammad Arshad Assignee: Mohammad Arshad

When a client registers this new watcher (for the time being let's call it the comprehensive parent znode watcher; we can give it a better name later) on a parent znode, then
# Client should be notified on the following events
## When a child is added
## When a child is deleted
## When a child is updated
## When the parent is deleted
# Client should be notified with the znode data. This should be optional. There are many scenarios where the znode data is always required. This can avoid unnecessary RPC calls.
# If the client keeps all child znode data in memory, there should be a way to check whether the client data is consistent with the ZooKeeper server data. This is to ensure that no notification is lost.
# This watcher should be a persistent watcher, not a one-time watcher
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3558) Support authentication enforcement
Mohammad Arshad created ZOOKEEPER-3558: -- Summary: Support authentication enforcement Key: ZOOKEEPER-3558 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3558 Project: ZooKeeper Issue Type: New Feature Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.5.7

Provide authentication enforcement in ZooKeeper that is backward compatible and works for any authentication scheme, including custom authentication schemes.

*Problems:*
1. Currently the server starts with the default authentication providers (DigestAuthenticationProvider, IPAuthenticationProvider). These default authentication providers are not really secure.
2. The ZooKeeper server does not check whether authentication has been done before performing any user operation.

*Solutions:*
1. We should not start any authentication provider by default. But this would be a backward incompatible change, so we can provide a configuration for whether to start the default authentication providers or not. By default we can start these authentication providers.
2. Before any user operation, the server should check whether authentication has happened. The client must be authenticated with at least one authentication scheme.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3536) On Windows maven build generates corrupted tarball
Mohammad Arshad created ZOOKEEPER-3536: -- Summary: On Windows maven build generates corrupted tarball Key: ZOOKEEPER-3536 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3536 Project: ZooKeeper Issue Type: Bug Components: build Affects Versions: 3.5.5 Reporter: Mohammad Arshad On Windows, the maven command {code}mvn clean install -DskipTests{code} creates corrupted tarballs. In zookeeper-assembly/pom.xml, the tarLongFileMode value posix is causing the problem. Many people use Windows as their development environment, so it would be better if we could make the tarLongFileMode property configurable or select it based on the OS. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (ZOOKEEPER-3498) In zookeeper-jute project generated source should not be in target\classes folder
Mohammad Arshad created ZOOKEEPER-3498: -- Summary: In zookeeper-jute project generated source should not be in target\classes folder Key: ZOOKEEPER-3498 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3498 Project: ZooKeeper Issue Type: Bug Components: build Reporter: Mohammad Arshad Currently, in the zookeeper-jute project, the jute-generated source code is put in the target\classes folder. In Eclipse, when the project is refreshed or cleaned, the content of this folder gets deleted, which results in compilation errors in other projects -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ZOOKEEPER-3496) Transaction larger than jute.maxbuffer makes ZooKeeper unavailable
Mohammad Arshad created ZOOKEEPER-3496: -- Summary: Transaction larger than jute.maxbuffer makes ZooKeeper unavailable Key: ZOOKEEPER-3496 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3496 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.4.14, 3.5.5 Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.6.0, 3.4.15, 3.5.6

*Problem:*
The ZooKeeper server fails to start and logs the following error
{code:java}
Exception in thread "main" java.io.IOException: Unreasonable length = 1001025
    at org.apache.jute.BinaryInputArchive.checkLength(BinaryInputArchive.java:127)
    at org.apache.jute.BinaryInputArchive.readBuffer(BinaryInputArchive.java:92)
{code}
This indicates that the size of one of the transactions exceeds the configured jute.maxbuffer value. But how was a transaction larger than jute.maxbuffer allowed to be written?

*Analysis:*
At the ZooKeeper server, jute.maxbuffer specifies the maximum size of a transaction. By default it is 1 MB. At the server, jute.maxbuffer is used for the following:
# Size sanity check of an incoming request. The incoming request size must not be more than jute.maxbuffer
# Size sanity check of a transaction while reading from a transaction or snapshot file. The transaction size must not be more than jute.maxbuffer+1024
# Size sanity check of a transaction while reading data from the leader. The transaction size must not be more than jute.maxbuffer+1024

The request size sanity check is done at the beginning of request processing, but later request processing adds additional information to the request and then writes it to the transaction file. The size of this additional information is not considered in the sanity check. This is how transactions larger than jute.maxbuffer are accepted into ZooKeeper. If the additional information is smaller than 1024 bytes, it is OK, as ZooKeeper already allows for it. But if the additional information is larger than 1024 bytes, the request is still allowed, and later, while reading from the transaction/snapshot file or from the leader, the read fails and makes the ZooKeeper service unavailable.

+Example:+
Suppose the incoming request size is 100 bytes
Configured jute.maxbuffer is 100
After processing the request, the ZooKeeper server adds 1025 more bytes
In this case the request will be processed successfully, and 100+1025 bytes will be written to the transaction file. But while reading from the transaction log, 100+1025 bytes cannot be read, as the max allowed length is 100 (effectively 100+1024).

*Solutions:*
If the incoming request size sanity check were done after populating all the additional information, this problem would be solved. But doing the sanity check at a later stage of request processing would defeat the purpose of the sanity check itself, so we cannot do this.
Currently the additional information size is a constant 1024 bytes [Code Reference|https://github.com/apache/zookeeper/blob/branch-3.5/zookeeper-jute/src/main/java/org/apache/jute/BinaryInputArchive.java#L126]. We should increase this value and make it more reasonable. I propose to make the additional information size the same as jute.maxbuffer, and also to make the additional information size configurable.
-- This message was sent by Atlassian JIRA (v7.6.14#76016)
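The arithmetic of the example in this report can be checked with a small sketch. MaxBufferSketch and its two methods are hypothetical helpers that only mirror the checks as described in the issue (request size against jute.maxbuffer, stored transaction size against jute.maxbuffer+1024); they are not the ZooKeeper code.

```java
/** Hypothetical helpers mirroring the checks described in the issue. */
public class MaxBufferSketch {
    // Fixed allowance for the extra information added during request processing.
    static final int EXTRA = 1024;

    /** Incoming request check: size must not exceed jute.maxbuffer. */
    static boolean acceptedOnRequest(int requestSize, int maxBuffer) {
        return requestSize <= maxBuffer;
    }

    /** Read-side check: size must not exceed jute.maxbuffer + 1024. */
    static boolean readableFromLog(int txnSize, int maxBuffer) {
        return txnSize <= maxBuffer + EXTRA;
    }

    public static void main(String[] args) {
        int maxBuffer = 100;  // configured jute.maxbuffer in the example
        int request = 100;    // incoming request size
        int added = 1025;     // bytes added while processing the request

        // The request passes the incoming check (100 <= 100).
        System.out.println("accepted: " + acceptedOnRequest(request, maxBuffer));
        // The stored transaction is 1125 bytes, above the read limit of 1124.
        System.out.println("readable: " + readableFromLog(request + added, maxBuffer));
    }
}
```

The request is accepted on the way in, but the resulting transaction is one byte over the read-side limit, which is the gap the report describes.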
[jira] [Commented] (ZOOKEEPER-2843) auth_to_local should support reading rules from a file
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16756891#comment-16756891 ] Mohammad Arshad commented on ZOOKEEPER-2843: security.auth_to_local can also be configured without having new line character in the Rules like security.auth_to_local=RULE:[1:$1]RULE:[2:$1]DEFAULT. In this case above discussed problems would not occur. So the changes proposed in this JIRA are not required. We can close this jira. > auth_to_local should support reading rules from a file > -- > > Key: ZOOKEEPER-2843 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2843 > Project: ZooKeeper > Issue Type: Improvement > Components: kerberos, server >Affects Versions: 3.4.10, 3.5.3 >Reporter: Lionel Cons >Priority: Major > Labels: pull-request-available > Attachments: ZOOKEEPER-2843.patch > > Time Spent: 10m > Remaining Estimate: 0h > > The current handling of {{zookeeper.security.auth_to_local}} in > {{KerberosName.java}} only supports rules given directly as property value. > These rules must therefore be given on the command line and: > * must be escaped properly to avoid shell expansion > * are visible in the {{ps}} output > It would be much better to put these rules in a file and pass the file path > as the property value. We would then use something like > {{-Dzookeeper.security.auth_to_local=file:/etc/zookeeper/rules}}. > Note that using the {{file:}} prefix allows keeping backward compatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2843) auth_to_local should support reading rules from a file
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753751#comment-16753751 ] Mohammad Arshad commented on ZOOKEEPER-2843: [~lionel.cons], thanks for reporting this and submitting the fix. Can you please rebase your PR and address the review comments? > auth_to_local should support reading rules from a file > -- > > Key: ZOOKEEPER-2843 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2843 > Project: ZooKeeper > Issue Type: Improvement > Components: kerberos, server >Affects Versions: 3.4.10, 3.5.3 >Reporter: Lionel Cons >Priority: Major > Labels: pull-request-available > Attachments: ZOOKEEPER-2843.patch > > Time Spent: 10m > Remaining Estimate: 0h > > The current handling of {{zookeeper.security.auth_to_local}} in > {{KerberosName.java}} only supports rules given directly as property value. > These rules must therefore be given on the command line and: > * must be escaped properly to avoid shell expansion > * are visible in the {{ps}} output > It would be much better to put these rules in a file and pass the file path > as the property value. We would then use something like > {{-Dzookeeper.security.auth_to_local=file:/etc/zookeeper/rules}}. > Note that using the {{file:}} prefix allows keeping backward compatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2843) auth_to_local should support reading rules from a file
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16753740#comment-16753740 ] Mohammad Arshad commented on ZOOKEEPER-2843: Configuring security.auth_to_local as a string in zoo.cfg causes a problem in branch-3.5 and master. *Reason:* Every time the dynamic configuration file version changes, the static configuration file zoo.cfg is rewritten. While rewriting security.auth_to_local, the newline characters are lost, which breaks the functionality. *Scenarios:* * security.auth_to_local=RULE:[1:$1]\nRULE:[2:$1]\nDEFAULT When the ZooKeeper server starts, it changes this property to security.auth_to_local=RULE:[1:$1]nRULE:[2:$1]nDEFAULT, so the rules become invalid. * security.auth_to_local=RULE:[1:$1]\\nRULE:[2:$1] nDEFAULT When the ZooKeeper server starts, it rewrites the property as security.auth_to_local=RULE:[1:$1] RULE:[2:$1] DEFAULT. This does not work either, because in the code security.auth_to_local returns only RULE:[1:$1], not the complete value. > auth_to_local should support reading rules from a file > -- > > Key: ZOOKEEPER-2843 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2843 > Project: ZooKeeper > Issue Type: Improvement > Components: kerberos, server >Affects Versions: 3.4.10, 3.5.3 >Reporter: Lionel Cons >Priority: Major > Attachments: ZOOKEEPER-2843.patch > > > The current handling of {{zookeeper.security.auth_to_local}} in > {{KerberosName.java}} only supports rules given directly as property value. > These rules must therefore be given on the command line and: > * must be escaped properly to avoid shell expansion > * are visible in the {{ps}} output > It would be much better to put these rules in a file and pass the file path > as the property value. We would then use something like > {{-Dzookeeper.security.auth_to_local=file:/etc/zookeeper/rules}}. > Note that using the {{file:}} prefix allows keeping backward compatibility. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
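One way a multi-rule value can be truncated to its first rule after a rewrite is sketched below with plain java.util.Properties. This is an assumption-laden illustration, not the actual zoo.cfg rewrite code: the class name and the naive writer are hypothetical, and the server's real output may differ from what this sketch produces. It only shows that loading turns the \n escape into a real line break, and that writing the value back unescaped leaves just the first rule under the original key.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: load a properties text, rewrite it naively, load it again.
final class AuthToLocalRewriteSketch {
    // Parses properties text; escape sequences such as \n are decoded by Properties.load.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Naive writer (an assumption): emits key=value without re-escaping the value.
    static String rewriteNaively(Properties props) {
        StringBuilder out = new StringBuilder();
        for (String key : props.stringPropertyNames()) {
            out.append(key).append('=').append(props.getProperty(key)).append('\n');
        }
        return out.toString();
    }

    // Value seen for the key after one load -> naive rewrite -> load cycle.
    static String valueAfterRewrite(String original, String key) {
        return load(rewriteNaively(load(original))).getProperty(key);
    }

    public static void main(String[] args) {
        // File content: security.auth_to_local=RULE:[1:$1]\nRULE:[2:$1]\nDEFAULT
        String original = "security.auth_to_local=RULE:[1:$1]\\nRULE:[2:$1]\\nDEFAULT\n";
        System.out.println(valueAfterRewrite(original, "security.auth_to_local"));
    }
}
```

In this sketch the remaining rules end up on their own lines and are parsed as unrelated keys, which is why only the first rule survives under security.auth_to_local.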
[jira] [Commented] (ZOOKEEPER-2307) ZooKeeper not starting because acceptedEpoch is less than the currentEpoch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16665018#comment-16665018 ] Mohammad Arshad commented on ZOOKEEPER-2307: Work on this issue was completed long ago. The last patch now needs to be rebased. I will rebase the changes and raise a PR. > ZooKeeper not starting because acceptedEpoch is less than the currentEpoch > -- > > Key: ZOOKEEPER-2307 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2307 > Project: ZooKeeper > Issue Type: Bug > Components: server >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-2307-01.patch, ZOOKEEPER-2307-02.patch, > ZOOKEEPER-2307-03.patch, ZOOKEEPER-2307-04.patch > > > This issue occurred in one of our test environments where the disk was being > changed to read-only very frequently. > The scenario is as follows: > # Configure a three-node ZooKeeper cluster, let's say the nodes are A, B and C > # Start A and B. Both A and B start successfully, quorum is running. > # Start C, because of IO error C fails to update acceptedEpoch file. 
But C > also starts successfully, joins the quorum as follower > # Stop C > # Start C, the exception below with message "The accepted epoch, 0 is less than > the current epoch, 1" is thrown > {code} > 2015-10-29 16:52:32,942 [myid:3] - ERROR [main:QuorumPeer@784] - Unable to > load database on disk > java.io.IOException: The accepted epoch, 0 is less than the current epoch, 1 > at > org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:781) > at > org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:720) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:202) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:139) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:88) > 2015-10-29 16:52:32,946 [myid:3] - ERROR [main:QuorumPeerMain@111] - > Unexpected exception, exiting abnormally > java.lang.RuntimeException: Unable to run quorum server > at > org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:785) > at > org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:720) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:202) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:139) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:88) > Caused by: java.io.IOException: The accepted epoch, 0 is less than the > current epoch, 1 > at > org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:781) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
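The startup failure in the scenario above comes down to a consistency check between two persisted values. A minimal sketch of such a check follows; the class and method names are hypothetical and the real logic in QuorumPeer.loadDataBase may differ in detail. Only the message text is taken from the stack trace.

```java
import java.io.IOException;

// Hypothetical sketch of the epoch consistency check that fails in step 5.
final class EpochCheckSketch {
    // The persisted acceptedEpoch must never be behind the persisted currentEpoch.
    static boolean isConsistent(long acceptedEpoch, long currentEpoch) {
        return acceptedEpoch >= currentEpoch;
    }

    // Message in the same form as the one in the stack trace above.
    static String describeProblem(long acceptedEpoch, long currentEpoch) {
        return "The accepted epoch, " + acceptedEpoch
                + " is less than the current epoch, " + currentEpoch;
    }

    // Throws when the two values on disk contradict each other.
    static void check(long acceptedEpoch, long currentEpoch) throws IOException {
        if (!isConsistent(acceptedEpoch, currentEpoch)) {
            throw new IOException(describeProblem(acceptedEpoch, currentEpoch));
        }
    }

    public static void main(String[] args) {
        try {
            // Node C in the scenario: acceptedEpoch was never updated (0),
            // while currentEpoch was written as 1.
            check(0, 1);
        } catch (IOException e) {
            System.out.println(e.getMessage());
        }
    }
}
```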
[jira] [Comment Edited] (ZOOKEEPER-3181) ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664716#comment-16664716 ] Mohammad Arshad edited comment on ZOOKEEPER-3181 at 10/26/18 10:39 AM: --- Logically, for this issue change should be done only in curator, not in ZooKeeper. Problem is there because the way ZooKeeper is used by curator. I want to understand how to avoid this problem from zookeeper side. Shall I check before creating new method whether same method is available or not in the downstream projects? I don’t think this is logical thing. Any thoughts on how to avoid this kind of issues in future? was (Author: arshad.mohammad): Logically, for this issue change should be done only in curator, not in ZooKeeper. Problem is there because the way ZooKeeper is used by curator. I want to understand how to avoid this problem from zookeeper side. Shall I check before creating new method whether same method is available or not in the downstream projects? I don’t think this is logical thing. Any thoughts on how to avoid this issue in future?? > ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain > -- > > Key: ZOOKEEPER-3181 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3181 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.3, 3.4.11 >Reporter: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > ZOOKEEPER-2355 added a getQuorumPeer method to QuorumPeerMain > [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java#L194]. > TestingQuorumPeerMain has an identically named method, which is now > unintentionally overridding the one in the base class. > This is fixed by CURATOR-409, however, I'd like this to be fixed in ZooKeeper > as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3181) ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664716#comment-16664716 ] Mohammad Arshad commented on ZOOKEEPER-3181: Logically, for this issue change should be done only in curator, not in ZooKeeper. Problem is there because the way ZooKeeper is used by curator. I want to understand how to avoid this problem from zookeeper side. Shall I check before creating new method whether same method is available or not in the downstream projects? I don’t think this is logical thing. Any thoughts on how to avoid this issue in future?? > ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain > -- > > Key: ZOOKEEPER-3181 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3181 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.3, 3.4.11 >Reporter: Akira Ajisaka >Priority: Major > Labels: pull-request-available > Time Spent: 1.5h > Remaining Estimate: 0h > > ZOOKEEPER-2355 added a getQuorumPeer method to QuorumPeerMain > [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java#L194]. > TestingQuorumPeerMain has an identically named method, which is now > unintentionally overridding the one in the base class. > This is fixed by CURATOR-409, however, I'd like this to be fixed in ZooKeeper > as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3181) ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661792#comment-16661792 ] Mohammad Arshad commented on ZOOKEEPER-3181: Thanks [~ajisakaa] for bringing this issue to our notice. The getQuorumPeer method was added to inject a custom QuorumPeer in test classes. I think adding this method does not violate any backward compatibility requirement. It is purely a coincidence that Curator had exactly the same method name. I see you have changed the method name in Curator to getTestingQuorumPeer. What change do you want to see in ZooKeeper? Do you want to rename the getQuorumPeer method to something else? > ZOOKEEPER-2355 broke Curator TestingQuorumPeerMain > -- > > Key: ZOOKEEPER-3181 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3181 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.3, 3.4.11 >Reporter: Akira Ajisaka >Priority: Major > > ZOOKEEPER-2355 added a getQuorumPeer method to QuorumPeerMain > [https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/server/quorum/QuorumPeerMain.java#L194]. > TestingQuorumPeerMain has an identically named method, which is now > unintentionally overriding the one in the base class. > This is fixed by CURATOR-409, however, I'd like this to be fixed in ZooKeeper > as well. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
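The accidental override described in this issue can be reproduced in a few lines. The sketch below uses hypothetical stand-in classes (BasePeerMain for the library class, DownstreamTestingMain for the downstream subclass) and string return values instead of real QuorumPeer objects; it is not the ZooKeeper or Curator code.

```java
// Hypothetical stand-in for the library class after the new method was added.
class BasePeerMain {
    // New factory method, intended only as an injection point for tests.
    protected Object getQuorumPeer() {
        return "base peer";
    }

    // Existing start-up path that now calls the new method.
    String start() {
        return "started with " + getQuorumPeer();
    }
}

// Hypothetical stand-in for a downstream subclass written before the method existed.
class DownstreamTestingMain extends BasePeerMain {
    // Originally an unrelated accessor; with the same name and parameter list
    // it now silently overrides the base-class method.
    public Object getQuorumPeer() {
        return "downstream peer";
    }
}

final class AccidentalOverrideDemo {
    public static void main(String[] args) {
        System.out.println(new BasePeerMain().start());
        System.out.println(new DownstreamTestingMain().start());
    }
}
```

Because Java dispatches on name and parameter types alone, nothing flags the collision at compile time; the downstream rename to getTestingQuorumPeer mentioned above removes it.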
[jira] [Commented] (ZOOKEEPER-2925) ZooKeeper server fails to start on first-startup due to race to create dataDir & snapDir
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652982#comment-16652982 ] Mohammad Arshad commented on ZOOKEEPER-2925: [~dineshappavoo], no objection at all, please go ahead and raise the pull request > ZooKeeper server fails to start on first-startup due to race to create > dataDir & snapDir > > > Key: ZOOKEEPER-2925 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2925 > Project: ZooKeeper > Issue Type: Bug > Components: other >Affects Versions: 3.4.6 >Reporter: Robert P. Thille >Priority: Major > Labels: easyfix, newbie, patch > Fix For: 3.4.10 > > Attachments: ZOOKEEPER-2925.patch > > > Due to two threads trying to create the dataDir and snapDir, and the > java.io.File.mkdirs() call returning false both for errors and for the > directory already existing, sometimes ZooKeeper will fail to start with the > following stack trace: > {noformat} > 2017-10-25 22:30:40,069 [myid:] - INFO [main:ZooKeeperServerMain@95] - > Starting server > 2017-10-25 22:30:40,075 [myid:] - INFO [main:Environment@100] - Server > environment:zookeeper.version=3.4.6-mdavis8efb625--1, built on 10/25/2017 > 01:12 GMT > [ More 'Server environment:blah blah blah' messages trimmed] > 2017-10-25 22:30:40,077 [myid:] - INFO [main:Environment@100] - Server > environment:user.dir=/ > 2017-10-25 22:30:40,081 [myid:] - ERROR [main:ZooKeeperServerMain@63] - > Unexpected exception, exiting abnormally > java.io.IOException: Unable to create data directory /bp2/data/version-2 > at > org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85) > at > org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:104) > at > org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86) > at > org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52) > at > 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78) > 2017-10-25 22:30:40,085 [myid:] - INFO > [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. > {noformat} > this is caused by the QuorumPeerMain thread and the PurgeTask thread both > competing to create the directories. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
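Since java.io.File.mkdirs() returns false both on failure and when the directory already exists, one common way to tolerate the race is to treat "false, but the directory is there" as success. The sketch below shows that idea only; the class and method names are hypothetical and this is not necessarily the fix that was applied in ZooKeeper.

```java
import java.io.File;

// Hypothetical sketch: directory creation that tolerates a concurrent creator.
final class DirCreationSketch {
    // Returns true if the directory exists once the call completes, whether this
    // thread created it, another thread created it, or it was already there.
    static boolean ensureDirectory(File dir) {
        // mkdirs() == false is ambiguous, so re-check with isDirectory().
        return dir.mkdirs() || dir.isDirectory();
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                "zk-sketch-" + System.nanoTime() + File.separator + "version-2");
        // First call creates the directory; the second finds it already present.
        System.out.println(ensureDirectory(dir));
        System.out.println(ensureDirectory(dir));
    }
}
```

A caller would raise the "Unable to create data directory" error only when ensureDirectory returns false.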
[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612318#comment-16612318 ] Mohammad Arshad commented on ZOOKEEPER-1260: To run the performance utility, go to the ZooKeeper folder and run the following command: {code:java} java -cp zookeeper-3.6.0-SNAPSHOT.jar:lib/*:dist-maven/zookeeper-3.6.0-SNAPSHOT-tests.jar org.apache.zookeeper.audit.ZKAuditLoggerPerformance localhost:2181 / 5000{code} ZKAuditLoggerPerformance is added as part of the audit log feature. To run this command on an old ZooKeeper installation: * Build ZooKeeper with the audit log feature patch * Copy zookeeper-3.6.0-SNAPSHOT-tests.jar to the old ZooKeeper installation's dist-maven folder and then run the above command > Audit logging in ZooKeeper servers. > --- > > Key: ZOOKEEPER-1260 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260 > Project: ZooKeeper > Issue Type: New Feature > Components: server >Reporter: Mahadev konar >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf > > Time Spent: 10m > Remaining Estimate: 0h > > Lots of users have had questions on debugging which client changed what znode > and what updates went through a znode. We should add audit logging as in > Hadoop (look at Namenode Audit logging) to log which client changed what in > the zookeeper servers. This could just be a log4j audit logger. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612315#comment-16612315 ] Mohammad Arshad commented on ZOOKEEPER-1260: From the performance readings, the performance impact seems to be very small when the feature is disabled. > Audit logging in ZooKeeper servers. > --- > > Key: ZOOKEEPER-1260 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260 > Project: ZooKeeper > Issue Type: New Feature > Components: server >Reporter: Mahadev konar >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf > > Time Spent: 10m > Remaining Estimate: 0h > > Lots of users have had questions on debugging which client changed what znode > and what updates went through a znode. We should add audit logging as in > Hadoop (look at Namenode Audit logging) to log which client changed what in > the zookeeper servers. This could just be a log4j audit logger. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612302#comment-16612302 ] Mohammad Arshad commented on ZOOKEEPER-1260: To assess the performance impact of the audit log feature, I created, deleted and set data 5000 times on three types of clusters. Here are the performance readings grouped by cluster type: * ZooKeeper package does not have the audit log feature: create=14776 ms setData=12223 ms delete=12599 ms * ZooKeeper package has the audit log feature but it is disabled by configuration: create=15161 ms setData=13328 ms delete=13046 ms * ZooKeeper package has the audit log feature and it is enabled: create=17364 ms setData=13612 ms delete=14174 ms > Audit logging in ZooKeeper servers. > --- > > Key: ZOOKEEPER-1260 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260 > Project: ZooKeeper > Issue Type: New Feature > Components: server >Reporter: Mahadev konar >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf > > Time Spent: 10m > Remaining Estimate: 0h > > Lots of users have had questions on debugging which client changed what znode > and what updates went through a znode. We should add audit logging as in > Hadoop (look at Namenode Audit logging) to log which client changed what in > the zookeeper servers. This could just be a log4j audit logger. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
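The readings above can be turned into relative overheads against the package without the feature. The small helper below is a hypothetical sketch for doing that arithmetic; it uses only the numbers from the comment and says nothing about run-to-run variance, which the comment does not report.

```java
// Hypothetical helper: relative overhead of a measurement against a baseline.
final class AuditOverheadSketch {
    // Overhead in percent: (measured - baseline) / baseline * 100.
    static double overheadPercent(long baselineMs, long measuredMs) {
        return (measuredMs - baselineMs) * 100.0 / baselineMs;
    }

    public static void main(String[] args) {
        String[] ops = {"create", "setData", "delete"};
        long[] noFeature = {14776, 12223, 12599};  // package without audit log
        long[] disabled  = {15161, 13328, 13046};  // feature present but disabled
        long[] enabled   = {17364, 13612, 14174};  // feature enabled

        for (int i = 0; i < ops.length; i++) {
            System.out.println(String.format("%s: disabled +%.1f%%, enabled +%.1f%%",
                    ops[i],
                    overheadPercent(noFeature[i], disabled[i]),
                    overheadPercent(noFeature[i], enabled[i])));
        }
    }
}
```

For these readings the overhead works out to roughly 2.6% (create), 9.0% (setData) and 3.5% (delete) with the feature disabled, and roughly 17.5%, 11.4% and 12.5% with it enabled.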
[jira] [Commented] (ZOOKEEPER-2284) LogFormatter and SnapshotFormatter does not handle FileNotFoundException gracefully
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16598684#comment-16598684 ] Mohammad Arshad commented on ZOOKEEPER-2284: Please take it. Assigned to you. > LogFormatter and SnapshotFormatter does not handle FileNotFoundException > gracefully > --- > > Key: ZOOKEEPER-2284 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2284 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Mohammad Arshad >Assignee: maoling >Priority: Minor > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-2284-01.patch, ZOOKEEPER-2284-02.patch, > ZOOKEEPER-2284-03.patch, ZOOKEEPER-2284-04.patch > > > {{LogFormatter}} and {{SnapshotFormatter}} do not handle > FileNotFoundException gracefully. If the file does not exist, these classes > propagate the exception to the console. > {code} > Exception in thread "main" java.io.FileNotFoundException: log.1 (The system > cannot find the file specified) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at java.io.FileInputStream.<init>(FileInputStream.java:101) > at org.apache.zookeeper.server.LogFormatter.main(LogFormatter.java:49) > {code} > File existence should be validated and an appropriate message should be > displayed on the console if the file does not exist -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-2284) LogFormatter and SnapshotFormatter does not handle FileNotFoundException gracefully
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-2284: -- Assignee: maoling (was: Mohammad Arshad) > LogFormatter and SnapshotFormatter does not handle FileNotFoundException > gracefully > --- > > Key: ZOOKEEPER-2284 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2284 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Mohammad Arshad >Assignee: maoling >Priority: Minor > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-2284-01.patch, ZOOKEEPER-2284-02.patch, > ZOOKEEPER-2284-03.patch, ZOOKEEPER-2284-04.patch > > > {{LogFormatter}} and {{SnapshotFormatter}} do not handle > FileNotFoundException gracefully. If the file does not exist, these classes > propagate the exception to the console. > {code} > Exception in thread "main" java.io.FileNotFoundException: log.1 (The system > cannot find the file specified) > at java.io.FileInputStream.open(Native Method) > at java.io.FileInputStream.<init>(FileInputStream.java:146) > at java.io.FileInputStream.<init>(FileInputStream.java:101) > at org.apache.zookeeper.server.LogFormatter.main(LogFormatter.java:49) > {code} > File existence should be validated and an appropriate message should be > displayed on the console if the file does not exist -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3128) CLI Commands display Authentication error for Authorization error
Mohammad Arshad created ZOOKEEPER-3128: -- Summary: CLI Commands display Authentication error for Authorization error Key: ZOOKEEPER-3128 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3128 Project: ZooKeeper Issue Type: Bug Components: server Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.6.0, 3.5.5, 3.4.14 CLI commands display "Authentication is not valid : /path123" when the user does not have access to the znode /path123, that is, an authorization failure is reported as an authentication failure. For example, the command {code:java} get /path456 {code} displays the error message {code:java} Authentication is not valid : /path456 {code} if the user does not have read access on znode /path456. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
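The distinction this issue asks for can be sketched as a message lookup that keeps the two failure kinds apart. Everything in the sketch is hypothetical: the enum, the class, and in particular the wording of the authorization message are assumptions for illustration and are not taken from the ZooKeeper CLI; only the "Authentication is not valid" text comes from the report.

```java
// Hypothetical sketch: separate messages for authentication and authorization failures.
final class CliErrorMessageSketch {
    enum Failure { AUTHENTICATION, AUTHORIZATION }

    static String message(Failure failure, String path) {
        switch (failure) {
            case AUTHENTICATION:
                // Wording reported in the issue; appropriate when the client's
                // credentials themselves are rejected.
                return "Authentication is not valid : " + path;
            case AUTHORIZATION:
                // Assumed wording for a client that lacks the required
                // permission on the znode.
                return "Insufficient permission : " + path;
            default:
                throw new IllegalArgumentException("unknown failure: " + failure);
        }
    }

    public static void main(String[] args) {
        System.out.println(message(Failure.AUTHORIZATION, "/path456"));
    }
}
```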
[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16591173#comment-16591173 ] Mohammad Arshad commented on ZOOKEEPER-1260: Thanks everyone for your interest in this feature. I will start working on this feature again. Hopefully this time we can conclude it. > Audit logging in ZooKeeper servers. > --- > > Key: ZOOKEEPER-1260 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260 > Project: ZooKeeper > Issue Type: New Feature > Components: server >Reporter: Mahadev konar >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf > > Time Spent: 10m > Remaining Estimate: 0h > > Lots of users have had questions on debugging which client changed what znode > and what updates went through a znode. We should add audit logging as in > Hadoop (look at Namenode Audit logging) to log which client changed what in > the zookeeper servers. This could just be a log4j audit logger. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ZOOKEEPER-2960) The dataDir and dataLogDir are used opposingly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310965#comment-16310965 ] Mohammad Arshad edited comment on ZOOKEEPER-2960 at 1/4/18 8:22 AM: Thanks [~andorm] for submitting your analysis and patch for this issue. I also will have a look on this issue very soon. was (Author: arshad.mohammad): Thanks [~andorm] for reporting this issue, submitting your analysis and patch. I also will have a look on this issue very soon. > The dataDir and dataLogDir are used opposingly > -- > > Key: ZOOKEEPER-2960 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2960 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.11 > Environment: Not relevant. >Reporter: Dan Milon >Assignee: Andor Molnar >Priority: Critical > > After upgrading from zookeeper 3.4.5, to 3.4.11, without editing {{zoo.cfg}}, > the new version of the server tries to use the {{dataDir}} as the > {{dataLogDir}}, and the {{dataLogDir}} as the {{dataDir}}. Or at least some > parts of the server. > Configuration file has: > {noformat} > $ grep -i data /etc/zookeeper/zoo.cfg > dataLogDir=/var/lib/zookeeper/datalog > dataDir=/var/lib/zookeeper/data > {noformat} > But runtime configuration has: > {noformat} > $ echo conf | nc localhost 2181 | grep -i data > dataDir=/var/lib/zookeeper/datalog/version-2 > dataLogDir=/var/lib/zookeeper/data/version-2 > {noformat} > Also, I got this in the debug logs, so clearly some parts of the server > confuse things. > {noformat} > [PurgeTask:FileTxnSnapLog@79] - Opening datadir:/var/lib/zookeeper/datalog > snapDir:/var/lib/zookeeper/data > [main:FileTxnSnapLog@79] - Opening datadir:/var/lib/zookeeper/data > snapDir:/var/lib/zookeeper/datalog > {noformat} > I tried to look in the code for wrong uses of the directories. 
I only found > [ZookeeperServer.java|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java#L227] > is passing the arguments to {{FileTxnSnapLog}} in the wrong order, but the > code comment says that this is legacy only for tests, so I assume it isn't > the cause for my case. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2960) The dataDir and dataLogDir are used opposingly
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310965#comment-16310965 ] Mohammad Arshad commented on ZOOKEEPER-2960: Thanks [~andorm] for reporting this issue, submitting your analysis and patch. I also will have a look on this issue very soon. > The dataDir and dataLogDir are used opposingly > -- > > Key: ZOOKEEPER-2960 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2960 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.11 > Environment: Not relevant. >Reporter: Dan Milon >Assignee: Andor Molnar >Priority: Critical > > After upgrading from zookeeper 3.4.5, to 3.4.11, without editing {{zoo.cfg}}, > the new version of the server tries to use the {{dataDir}} as the > {{dataLogDir}}, and the {{dataLogDir}} as the {{dataDir}}. Or at least some > parts of the server. > Configuration file has: > {noformat} > $ grep -i data /etc/zookeeper/zoo.cfg > dataLogDir=/var/lib/zookeeper/datalog > dataDir=/var/lib/zookeeper/data > {noformat} > But runtime configuration has: > {noformat} > $ echo conf | nc localhost 2181 | grep -i data > dataDir=/var/lib/zookeeper/datalog/version-2 > dataLogDir=/var/lib/zookeeper/data/version-2 > {noformat} > Also, I got this in the debug logs, so clearly some parts of the server > confuse things. > {noformat} > [PurgeTask:FileTxnSnapLog@79] - Opening datadir:/var/lib/zookeeper/datalog > snapDir:/var/lib/zookeeper/data > [main:FileTxnSnapLog@79] - Opening datadir:/var/lib/zookeeper/data > snapDir:/var/lib/zookeeper/datalog > {noformat} > I tried to look in the code for wrong uses of the directories. I only found > [ZookeeperServer.java|https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/ZooKeeperServer.java#L227] > is passing the arguments to {{FileTxnSnapLog}} in the wrong order, but the > code comment says that this is legacy only for tests, so I assume it isn't > the cause for my case. 
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2925) ZooKeeper server fails to start on first-startup due to race to create dataDir & snapDir
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224381#comment-16224381 ] Mohammad Arshad commented on ZOOKEEPER-2925: One quick comment: instead of depending on the exception and its message, can we synchronize dataDir and snapDir creation? Example: {code}
if (!this.dataDir.exists()) {
    synchronized (FileTxnSnapLog.class) {
        // re-check under the lock: another thread may have created it meanwhile
        if (!this.dataDir.exists()) {
            // other existing code
        }
    }
}
if (!this.snapDir.exists()) {
    synchronized (FileTxnSnapLog.class) {
        // same double check for the snapshot directory
        if (!this.snapDir.exists()) {
            // other existing code
        }
    }
}
{code}
> ZooKeeper server fails to start on first-startup due to race to create > dataDir & snapDir > > > Key: ZOOKEEPER-2925 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2925 > Project: ZooKeeper > Issue Type: Bug > Components: other >Affects Versions: 3.4.6 >Reporter: Robert P. Thille > Labels: easyfix, newbie, patch > Fix For: 3.4.10 > > Attachments: ZOOKEEPER-2925.patch > > > Due to two threads trying to create the dataDir and snapDir, and the > java.io.File.mkdirs() call returning false both for errors and for the > directory already existing, sometimes ZooKeeper will fail to start with the > following stack trace: > {noformat} > 2017-10-25 22:30:40,069 [myid:] - INFO [main:ZooKeeperServerMain@95] - > Starting server > 2017-10-25 22:30:40,075 [myid:] - INFO [main:Environment@100] - Server > environment:zookeeper.version=3.4.6-mdavis8efb625--1, built on 10/25/2017 > 01:12 GMT > [ More 'Server environment:blah blah blah' messages trimmed] > 2017-10-25 22:30:40,077 [myid:] - INFO [main:Environment@100] - Server > environment:user.dir=/ > 2017-10-25 22:30:40,081 [myid:] - ERROR [main:ZooKeeperServerMain@63] - > Unexpected exception, exiting abnormally > java.io.IOException: Unable to create data directory /bp2/data/version-2 > at > org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85) > at > 
org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:104) > at > org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86) > at > org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78) > 2017-10-25 22:30:40,085 [myid:] - INFO > [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. > {noformat} > this is caused by the QuorumPeerMain thread and the PurgeTask thread both > competing to create the directories. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2925) ZooKeeper server fails to start on first-startup due to race to create dataDir & snapDir
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16224377#comment-16224377 ] Mohammad Arshad commented on ZOOKEEPER-2925: [~rthille] Thanks for reporting this issue. I also faced this issue recently. It exists in all the branches. Can you raise a pull request? I will be happy to review it. > ZooKeeper server fails to start on first-startup due to race to create > dataDir & snapDir > > > Key: ZOOKEEPER-2925 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2925 > Project: ZooKeeper > Issue Type: Bug > Components: other >Affects Versions: 3.4.6 >Reporter: Robert P. Thille > Labels: easyfix, newbie, patch > Fix For: 3.4.10 > > Attachments: ZOOKEEPER-2925.patch > > > Due to two threads trying to create the dataDir and snapDir, and the > java.io.File.mkdirs() call returning false both for errors and for the > directory already existing, sometimes ZooKeeper will fail to start with the > following stack trace: > {noformat} > 2017-10-25 22:30:40,069 [myid:] - INFO [main:ZooKeeperServerMain@95] - > Starting server > 2017-10-25 22:30:40,075 [myid:] - INFO [main:Environment@100] - Server > environment:zookeeper.version=3.4.6-mdavis8efb625--1, built on 10/25/2017 > 01:12 GMT > [ More 'Server environment:blah blah blah' messages trimmed] > 2017-10-25 22:30:40,077 [myid:] - INFO [main:Environment@100] - Server > environment:user.dir=/ > 2017-10-25 22:30:40,081 [myid:] - ERROR [main:ZooKeeperServerMain@63] - > Unexpected exception, exiting abnormally > java.io.IOException: Unable to create data directory /bp2/data/version-2 > at > org.apache.zookeeper.server.persistence.FileTxnSnapLog.<init>(FileTxnSnapLog.java:85) > at > org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:104) > at > org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:86) > at > org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:52) > at > 
org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:116) > at > org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78) > 2017-10-25 22:30:40,085 [myid:] - INFO > [PurgeTask:DatadirCleanupManager$PurgeTask@144] - Purge task completed. > {noformat} > this is caused by the QuorumPeerMain thread and the PurgeTask thread both > competing to create the directories. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
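The mkdirs() race reported in ZOOKEEPER-2925 above can be tolerated by re-checking the directory after a false return. The following is a minimal sketch of that idea only, not the attached patch; the class name DataDirs and the method ensureDirectory are hypothetical and do not exist in ZooKeeper.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper, for illustration only (not part of ZooKeeper).
final class DataDirs {
    private DataDirs() {}

    /**
     * Creates dir and any missing parents. Succeeds if the directory exists
     * afterwards, even when another thread created it first.
     */
    static void ensureDirectory(File dir) throws IOException {
        // File.mkdirs() returns false both on a real error and when the
        // directory already exists, so a false result alone is not a failure.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Unable to create data directory " + dir);
        }
    }
}
```

With this check, the thread that loses the race sees mkdirs() return false, finds that the directory exists, and continues instead of aborting startup.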
[jira] [Commented] (ZOOKEEPER-1260) Audit logging in ZooKeeper servers.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16080703#comment-16080703 ] Mohammad Arshad commented on ZOOKEEPER-1260: Thanks [~fpj] for showing interest in this feature. I will re-base it soon and raise a PR. > Audit logging in ZooKeeper servers. > --- > > Key: ZOOKEEPER-1260 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1260 > Project: ZooKeeper > Issue Type: New Feature >Reporter: Mahadev konar >Assignee: Mohammad Arshad > Fix For: 3.5.4, 3.6.0 > > Attachments: ZOOKEEPER-1260-01.patch, zookeeperAuditLogs.pdf > > > Lots of users have had questions on debugging which client changed what znode > and what updates went through a znode. We should add audit logging as in > Hadoop (look at Namenode Audit logging) to log which client changed what in > the zookeeper servers. This could just be a log4j audit logger. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2591) The deletion of Container znode doesn't check ACL delete permission
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16073119#comment-16073119 ] Mohammad Arshad commented on ZOOKEEPER-2591: Adding the "node.stat.getCversion() > 0" check makes sense to me. Is anybody submitting a patch? I will review it. > The deletion of Container znode doesn't check ACL delete permission > --- > > Key: ZOOKEEPER-2591 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2591 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Reporter: Edward Ribeiro >Assignee: Edward Ribeiro > > Container nodes check the ACL before creation, but the deletion doesn't check > the ACL rights. The code below succeeds even though we removed ACL access > permissions for "/a". > {code} > zk.create("/a", null, Ids.OPEN_ACL_UNSAFE, CreateMode.CONTAINER); > ArrayList<ACL> list = new ArrayList<>(); > list.add(new ACL(0, Ids.ANYONE_ID_UNSAFE)); > zk.setACL("/", list, -1); > zk.delete("/a", -1); > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009971#comment-16009971 ] Mohammad Arshad commented on ZOOKEEPER-2775: Submitting the patch for quick review; later I will raise a merge request. 1) The fix is a one-line change. 2) Moved SaslAuthTest to the org.apache.zookeeper package and added a test case for this defect scenario. > ZK Client not able to connect with Xid out of order error > -- > > Key: ZOOKEEPER-2775 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.4.10, 3.5.3, 3.6.0 >Reporter: Bhupendra Kumar Jain >Assignee: Mohammad Arshad >Priority: Critical > Attachments: ZOOKEEPER-2775-01.patch > > > During a network-unreachable scenario in one of the clusters, we observed the "Xid > out of order" and "Nothing in the queue" errors continuously, and the ZK client was > finally not able to connect successfully to the ZK server. > *Logs:* > unexpected error, closing socket connection and attempting reconnect | > org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1447) > java.io.IOException: Xid out of order. 
Got Xid 52 with err 0 expected Xid 53 > for a packet with details: clientPath:null serverPath:null finished:false > header:: 53,101 replyHeader:: 0,0,-4 request:: > 12885502275,v{'/app1/controller,'/app1/config/changes},v{},v{'/app1/config/changes} > response:: null > at > org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:996) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426) > unexpected error, closing socket connection and attempting reconnect > java.io.IOException: Nothing in the queue, but got 1 > at > org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:983) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101) > at > org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:370) > at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1426) > > *Analysis:* > 1) The first time, the client fails to do the SASL login because the network is > unreachable. > 2017-03-29 10:03:59,377 | WARN | [main-SendThread(192.168.130.8:24002)] | > SASL configuration failed: javax.security.auth.login.LoginException: Network > is unreachable (sendto failed) Will continue connection to Zookeeper server > without SASL authentication, if Zookeeper server allows it. | > org.apache.zookeeper.ClientCnxn (ClientCnxn.java:1307) > Here the boolean saslLoginFailed becomes true. > 2) After some time the network connection recovers and the client is able to > log in successfully, but the boolean saslLoginFailed is still not reset to false. > 3) SASL negotiation between client and server then starts, and during > this time no user request will be sent. 
(The socket channel is > closed for writes until SASL negotiation completes.) > 4) The response from the server for the SASL packet is then processed by the client, > and the client assumes that tunnel authentication is finished (tunnelAuthInProgress() checks > the saslLoginFailed boolean; since the boolean is true, it assumes negotiation is done) > and tries to process the packet as an ordinary packet, which results in the above > errors. > *Solution:* Reset the saslLoginFailed boolean every time before client login -- This message was sent by Atlassian JIRA (v6.3.15#6346)
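The stale-flag problem in the analysis above can be reproduced in isolation. This is a minimal sketch under stated assumptions: SaslStateSketch, login, and negotiationFinished are hypothetical names that model only the flag handling described in the issue; it is not the ZooKeeper client code or the attached patch.

```java
// Hypothetical model of the SASL bookkeeping described in ZOOKEEPER-2775.
final class SaslStateSketch {
    private final boolean resetBeforeLogin; // true models the proposed fix
    private boolean saslLoginFailed;
    private boolean negotiationComplete;

    SaslStateSketch(boolean resetBeforeLogin) {
        this.resetBeforeLogin = resetBeforeLogin;
    }

    /** Simulates one (re)connect attempt; loginSucceeds models the login outcome. */
    void login(boolean loginSucceeds) {
        if (resetBeforeLogin) {
            saslLoginFailed = false; // clear the result of any earlier attempt
        }
        negotiationComplete = false;
        if (!loginSucceeds) {
            saslLoginFailed = true;
        }
    }

    void negotiationFinished() {
        negotiationComplete = true;
    }

    /** While this returns true, responses must be treated as SASL packets. */
    boolean tunnelAuthInProgress() {
        if (saslLoginFailed) {
            return false; // a failed login is treated as "nothing to negotiate"
        }
        return !negotiationComplete;
    }
}
```

Without the reset, a failed login followed by a successful one leaves tunnelAuthInProgress() returning false while negotiation is still running, which matches step 4 of the analysis. With the reset, the second login clears the flag and SASL responses are handled as such.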
[jira] [Updated] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2775: --- Attachment: ZOOKEEPER-2775-01.patch > ZK Client not able to connect with Xid out of order error > -- > > Key: ZOOKEEPER-2775 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Bhupendra Kumar Jain >Assignee: Mohammad Arshad >Priority: Critical > Attachments: ZOOKEEPER-2775-01.patch -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16009967#comment-16009967 ] Mohammad Arshad commented on ZOOKEEPER-2775: Discussed with [~Bhupendra] offline, I will submit the fix > ZK Client not able to connect with Xid out of order error > -- > > Key: ZOOKEEPER-2775 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Bhupendra Kumar Jain >Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-2775: -- Assignee: Mohammad Arshad > ZK Client not able to connect with Xid out of order error > -- > > Key: ZOOKEEPER-2775 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Bhupendra Kumar Jain >Assignee: Mohammad Arshad >Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-1932) Remove deprecated LeaderElection class
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16006607#comment-16006607 ] Mohammad Arshad commented on ZOOKEEPER-1932: Thanks [~hanm] for working on this issue. Merged to master: https://github.com/apache/zookeeper/commit/a680655a3569bfc546712cb85eeaea8c9b7de3ad > Remove deprecated LeaderElection class > -- > > Key: ZOOKEEPER-1932 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1932 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.5.0 >Reporter: Michi Mutsuzaki >Assignee: Michael Han > Fix For: 3.6.0 > > Attachments: TEST-org.apache.zookeeper.test.LETest.txt, > ZOOKEEPER-1932.patch, ZOOKEEPER-1932.patch > > > org.apache.zookeeper.test.LETest.testLE is failing on trunk once in a while. > I'm not able to reproduce the failure on my box. I looked at the log, but I > couldn't quite figure out what's going on. > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2315/testReport/ > Update: > == > Because LE is deprecated, there is not much point in spending effort on fixing > it, as discussed in the JIRA. Updated JIRA title to reflect the state of the > issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-1932) Remove deprecated LeaderElection class
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-1932: --- Summary: Remove deprecated LeaderElection class (was: org.apache.zookeeper.test.LETest.testLE fails once in a while) > Remove deprecated LeaderElection class > -- > > Key: ZOOKEEPER-1932 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1932 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection >Affects Versions: 3.5.0 >Reporter: Michi Mutsuzaki >Assignee: Michael Han > Fix For: 3.6.0 > > Attachments: TEST-org.apache.zookeeper.test.LETest.txt, > ZOOKEEPER-1932.patch, ZOOKEEPER-1932.patch > > > org.apache.zookeeper.test.LETest.testLE is failing on trunk once in a while. > I'm not able to reproduce the failure on my box. I looked at the log, but I > couldn't quite figure out what's going on. > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2315/testReport/ > Update: > == > Because LE is deprecated, there is not much point in spending effort on fixing > it, as discussed in the JIRA. Updated JIRA title to reflect the state of the > issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2775) ZK Client not able to connect with Xid out of order error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15997846#comment-15997846 ] Mohammad Arshad commented on ZOOKEEPER-2775: This issue has been discussed many times, many places but it was never concluded. Thanks [~Bhupendra] for analyzing and reporting it. Are you willing to submit the fix? > ZK Client not able to connect with Xid out of order error > -- > > Key: ZOOKEEPER-2775 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2775 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Bhupendra Kumar Jain >Priority: Critical -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2742) Few test cases of org.apache.zookeeper.ZooKeeperTest fails in Windows
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15990005#comment-15990005 ] Mohammad Arshad commented on ZOOKEEPER-2742: This issue is not applicable to branch-3.4. Merged to master: https://github.com/apache/zookeeper/commit/017ca1a24c1fa5988dd3c718f90b4beedb8a6a45 Merged to branch-3.5: https://github.com/apache/zookeeper/commit/acd116af648b962ff12c9289577fa8832ae91f5f Thanks [~a72877] for your contribution. [~rakeshr]/[~fpj]/[~phunt], please add [~a72877] to the contributors list. > Few test cases of org.apache.zookeeper.ZooKeeperTest fails in Windows > - > > Key: ZOOKEEPER-2742 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2742 > Project: ZooKeeper > Issue Type: Test > Components: tests > Environment: Windows >Reporter: Abhishek Kumar >Priority: Trivial > Fix For: 3.5.4, 3.6.0 > > Attachments: ZOOKEEPER-2742-01.patch > > > The following test cases fail in a Windows environment: > 1. org.apache.zookeeper.ZooKeeperTest.testLsrRootCommand() > 2. org.apache.zookeeper.ZooKeeperTest.testLsrCommand() > It seems that the failure is related to the use of a hard-coded "\n" (the newline > character is system dependent) in org.apache.zookeeper.ZooKeeperTest.runCommandExpect(CliCommand, > List) > .. > .. > String result = byteStream.toString(); > assertTrue(result, result.contains( > StringUtils.joinStrings(expectedResults, "\n"))); > .. > .. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
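The newline mismatch described in ZOOKEEPER-2742 above can be avoided by normalizing line endings before comparing. The sketch below shows one way to do that; it is not necessarily what the merged patch does, and the class NewlineAgnosticMatch and its methods are hypothetical names.

```java
import java.util.List;

// Hypothetical test helper, for illustration only (not part of ZooKeeper).
final class NewlineAgnosticMatch {
    private NewlineAgnosticMatch() {}

    /** Converts Windows line endings (CRLF) to a single line feed. */
    static String normalize(String text) {
        return text.replace("\r\n", "\n");
    }

    /** True if the output contains the expected lines consecutively, on any platform. */
    static boolean containsLines(String output, List<String> expectedLines) {
        return normalize(output).contains(String.join("\n", expectedLines));
    }
}
```

An alternative with the same effect is to join the expected lines with System.lineSeparator() instead of normalizing the captured output.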
[jira] [Commented] (ZOOKEEPER-2742) Few test cases of org.apache.zookeeper.ZooKeeperTest fails in Windows.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15958288#comment-15958288 ] Mohammad Arshad commented on ZOOKEEPER-2742: Windows is not fully supported by ZooKeeper, but it makes sense to fix these very obvious test failures. [~a72877], are you willing to submit a fix? > Few test cases of org.apache.zookeeper.ZooKeeperTest fails in Windows. > -- > > Key: ZOOKEEPER-2742 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2742 > Project: ZooKeeper > Issue Type: Test > Components: tests > Environment: Windows >Reporter: Abhishek Kumar >Priority: Trivial > > The following test cases fail in a Windows environment: > 1. org.apache.zookeeper.ZooKeeperTest.testLsrRootCommand() > 2. org.apache.zookeeper.ZooKeeperTest.testLsrCommand() > It seems that the failure is related to the use of a hard-coded "\n" (the newline > character is system dependent) in org.apache.zookeeper.ZooKeeperTest.runCommandExpect(CliCommand, > List) > .. > .. > String result = byteStream.toString(); > assertTrue(result, result.contains( > StringUtils.joinStrings(expectedResults, "\n"))); > .. > .. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2460) Remove javacc dependency from public Maven pom
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15928584#comment-15928584 ] Mohammad Arshad commented on ZOOKEEPER-2460: Merged to master: https://github.com/apache/zookeeper/commit/b4dded46f901fbc2128c7f752107a1391d676968 Merged to branch-3.5: https://github.com/apache/zookeeper/commit/f5aacfcdfd09be337f1e60d20fc86e95958d2b0d Thanks [~eolivelli] for your contribution. [~fpj], can you please add [~eolivelli] to the contributors list and assign the issue to him? > Remove javacc dependency from public Maven pom > -- > > Key: ZOOKEEPER-2460 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2460 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.2 >Reporter: Enrico Olivelli >Priority: Critical > Fix For: 3.5.3, 3.6.0 > > > During the vote on 3.5.2-ALPHA RC 0 we found a Maven dependency on javacc in the > published pom for zookeeper > {code} > <dependency> > <groupId>net.java.dev.javacc</groupId> > <artifactId>javacc</artifactId> > <version>5.0</version><scope>compile</scope> > </dependency> > {code} > This dependency appears not to be useful and should be removed. > This was the tested pom: > https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.2-alpha/zookeeper-3.5.2-alpha.pom -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (ZOOKEEPER-2460) Remove javacc dependency from public Maven pom
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-2460. Resolution: Fixed Fix Version/s: 3.5.3 3.6.0 Issue resolved by pull request 116 [https://github.com/apache/zookeeper/pull/116] > Remove javacc dependency from public Maven pom > -- > > Key: ZOOKEEPER-2460 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2460 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.2 >Reporter: Enrico Olivelli >Priority: Critical > Fix For: 3.6.0, 3.5.3 > > > during the vote of 3.5.2-ALPHA RC 0 we found a Maven dependency to javacc in > published pom for zookeeper > {code} > <dependency> > <groupId>net.java.dev.javacc</groupId> > <artifactId>javacc</artifactId> > <version>5.0</version><scope>compile</scope> > </dependency> > {code} > this dependency appears not to be useful and should be removed > this was the tested pom: > https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.2-alpha/zookeeper-3.5.2-alpha.pom -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2460) Remove javacc dependency from public Maven pom
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925777#comment-15925777 ] Mohammad Arshad commented on ZOOKEEPER-2460: I will have a look today and do the needful > Remove javacc dependency from public Maven pom > -- > > Key: ZOOKEEPER-2460 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2460 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.2 >Reporter: Enrico Olivelli >Priority: Critical > > during the vote of 3.5.2-ALPHA RC 0 we found a Maven dependency to javacc in > published pom for zookeeper > {code} > <dependency> > <groupId>net.java.dev.javacc</groupId> > <artifactId>javacc</artifactId> > <version>5.0</version><scope>compile</scope> > </dependency> > {code} > this dependency appears not to be useful and should be removed > this was the tested pom: > https://repository.apache.org/content/groups/staging/org/apache/zookeeper/zookeeper/3.5.2-alpha/zookeeper-3.5.2-alpha.pom -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2717) org.apache.zookeeper.server.quorum.RaceConditionTest fails intermittently
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905013#comment-15905013 ] Mohammad Arshad commented on ZOOKEEPER-2717: This test case was corrected recently in ZOOKEEPER-2683, and since then it has not failed in ZooKeeper CI. From the log message, I can see you are running the latest RaceConditionTest. Can you please share the complete RaceConditionTest log? > org.apache.zookeeper.server.quorum.RaceConditionTest fails intermittently > - > > Key: ZOOKEEPER-2717 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2717 > Project: ZooKeeper > Issue Type: Bug > Components: quorum, server >Affects Versions: 3.6.0 > Environment: Linux 6945c232192e 3.16.0-30-generic #40~14.04.1-Ubuntu > SMP Thu Jan 15 17:42:36 UTC 2015 ppc64le ppc64le ppc64le GNU/Linux >Reporter: Sneha Kanekar > Labels: ppc64le > > The test-suite org.apache.zookeeper.server.quorum.RaceConditionTest fails > intermittently on ppc64le with following error message: > {code:borderStyle=solid} > org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor > Stacktrace: > Leader failed to transition to new state. Current state is leading > junit.framework.AssertionFailedError: Leader failed to transition to new > state. Current state is leading > at > org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor(RaceConditionTest.java:82) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at java.lang.Thread.run(Thread.java:745) > {code} > Also I have attached the standard output log file. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (ZOOKEEPER-2699) Restrict 4lw commands based on client IP
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-2699. Resolution: Won't Fix > Restrict 4lw commands based on client IP > > > Key: ZOOKEEPER-2699 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2699 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > Currently 4lw commands are executed without authentication and can be > accessed from any IP which has access to ZooKeeper server. ZOOKEEPER-2693 > attempts to limit the 4lw commands which are enabled by default or enabled by > configuration. > In addition to ZOOKEEPER-2693 we should also restrict 4lw commands based on > client IP. This is required for the following scenarios: > # User wants to enable all the 4lw commands > # User wants to limit the access of the commands which are considered to be > safe by default. > > *Implementation:* > We can introduce a new property, 4lw.commands.host.whitelist > # By default we allow all the hosts, but of course only for the 4lw commands > exposed as per ZOOKEEPER-2693 > # It can be configured to allow individual IPs (192.168.1.2, 192.168.1.3, etc.) > # It can also be configured to allow a group of IPs like 192.168.1.* -- This message was sent by Atlassian JIRA (v6.3.15#6346)
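The matching rules proposed in ZOOKEEPER-2699 above (allow all by default, exact IPs, and trailing-wildcard groups) could look roughly like the sketch below. Note that the issue was resolved as Won't Fix, so nothing like this exists in ZooKeeper; HostWhitelist and isAllowed are hypothetical names, and the whitelist string stands in for the proposed 4lw.commands.host.whitelist value.

```java
// Hypothetical matcher for the proposed (and not adopted) whitelist property.
final class HostWhitelist {
    private HostWhitelist() {}

    /**
     * whitelist is a comma-separated list of exact IPs or prefixes ending in '*'.
     * A null or blank whitelist allows every host, as the proposal's default does.
     */
    static boolean isAllowed(String whitelist, String clientIp) {
        if (whitelist == null || whitelist.trim().isEmpty()) {
            return true;
        }
        for (String entry : whitelist.split(",")) {
            String pattern = entry.trim();
            if (pattern.endsWith("*")) {
                // "192.168.1.*" matches any address starting with "192.168.1."
                String prefix = pattern.substring(0, pattern.length() - 1);
                if (clientIp.startsWith(prefix)) {
                    return true;
                }
            } else if (pattern.equals(clientIp)) {
                return true;
            }
        }
        return false;
    }
}
```

String-prefix matching is only the simplest reading of "192.168.1.*"; a real implementation would more likely parse addresses and compare against CIDR ranges.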
[jira] [Commented] (ZOOKEEPER-2699) Restrict 4lw commands based on client IP
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878428#comment-15878428 ] Mohammad Arshad commented on ZOOKEEPER-2699: Thanks [~revans2] for the information. yes, IP based restriction will not be effective. > Restrict 4lw commands based on client IP > > > Key: ZOOKEEPER-2699 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2699 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > Currently 4lw commands are executed without authentication and can be > accessed from any IP which has access to ZooKeeper server. ZOOKEEPER-2693 > attempts to limit the 4lw commands which are enabled by default or enabled by > configuration. > In addition to ZOOKEEPER-2693 we should also restrict 4lw commands based on > client IP as well. It is required for following scenario > # User wants to enable all the 4lw commands > # User wants to limit the access of the commands which are considered to be > safe by default. > > *Implementation:* > we can introduce new property 4lw.commands.host.whitelist > # By default we allow all the hosts, but off course only on the 4lw exposed > commands as per the ZOOKEEPER-2693 > # It can be configured to allow individual IPs(192.168.1.2,192.168.1.3 etc.) > # It can also be configured to allow group of IPs like 192.168.1.* -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2695: --- Attachment: ZOOKEEPER-2695-01.patch Submitting the fix, throwing {{SystemErrorException}} for all errors which are not known to client > Handle unknown error for rolling upgrade old client new server scenario > --- > > Key: ZOOKEEPER-2695 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Attachments: ZOOKEEPER-2695-01.patch > > > In Zookeeper rolling upgrade scenario where server is new but client is old, > when sever sends error code which is not understood by the client, client > throws NullPointerException. > KeeperException.SystemErrorException should be thrown for all unknown error > code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878375#comment-15878375 ] Mohammad Arshad commented on ZOOKEEPER-2695: IllegalArgumentException is thrown only for known error codes; as per the current source code, only 0 and -12 can throw IllegalArgumentException. My earlier understanding of this was wrong. > Handle unknown error for rolling upgrade old client new server scenario > --- > > Key: ZOOKEEPER-2695 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > In Zookeeper rolling upgrade scenario where server is new but client is old, > when server sends error code which is not understood by the client, client > throws NullPointerException. > KeeperException.SystemErrorException should be thrown for all unknown error > code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878376#comment-15878376 ] Mohammad Arshad commented on ZOOKEEPER-2695: For unknown errors, clients always throw NullPointerException. Suppose a new version of the server adds a new error code for a new scenario in the getChildren() API. This new error code will be unknown to old clients, which will throw NullPointerException while processing it: {code} java.lang.NullPointerException at org.apache.zookeeper.KeeperException.create(KeeperException.java:91) at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:2465) {code} > Handle unknown error for rolling upgrade old client new server scenario > --- > > Key: ZOOKEEPER-2695 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > In Zookeeper rolling upgrade scenario where server is new but client is old, > when server sends error code which is not understood by the client, client > throws NullPointerException. > KeeperException.SystemErrorException should be thrown for all unknown error > code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
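The failure mode described in this comment can be illustrated with a small, self-contained sketch. It does not use the ZooKeeper classes: Code, SketchException and create() below are hypothetical stand-ins, and the numeric values are only illustrative. The point is that a lookup of an unrecognised integer returns null, so the factory should check for null and fall back to a system-error exception instead of dereferencing it.

```java
import java.util.HashMap;
import java.util.Map;

public class UnknownErrorCodeSketch {

    /** Hypothetical error codes known to an "old" client. */
    enum Code {
        OK(0), SYSTEMERROR(-1), NONODE(-101);

        private static final Map<Integer, Code> LOOKUP = new HashMap<>();
        static {
            for (Code c : values()) {
                LOOKUP.put(c.value, c);
            }
        }

        final int value;

        Code(int value) {
            this.value = value;
        }

        /** Returns null when the integer is not a known code. */
        static Code get(int value) {
            return LOOKUP.get(value);
        }
    }

    /** Hypothetical exception type carrying the resolved code. */
    static class SketchException extends Exception {
        final Code code;

        SketchException(Code code, String message) {
            super(message);
            this.code = code;
        }
    }

    /** Maps a raw code from the wire to an exception without ever dereferencing null. */
    static SketchException create(int rawCode) {
        Code code = Code.get(rawCode);
        if (code == null) {
            // Unknown to this client: report a system error instead of failing with an NPE.
            return new SketchException(Code.SYSTEMERROR, "Unknown error code " + rawCode);
        }
        return new SketchException(code, "Error " + code);
    }
}
```

With this shape, an old client that receives a code introduced by a newer server still surfaces an exception type that applications already handle.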
[jira] [Updated] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2695: --- Description: In Zookeeper rolling upgrade scenario where server is new but client is old, when sever sends error code which is not understood by the client, client throws NullPointerException. KeeperException.SystemErrorException should be thrown for all unknown error code. was: In Zookeeper rolling upgrade scenario where server is new but client is old, when sever sends error code which is not understood by the client, client throws IllegalArgumentException. Generally IllegalArgumentException is not handled by any of the ZK applications. KeeperException.SystemErrorException should be thrown instead of IllegalArgumentException > Handle unknown error for rolling upgrade old client new server scenario > --- > > Key: ZOOKEEPER-2695 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > In Zookeeper rolling upgrade scenario where server is new but client is old, > when sever sends error code which is not understood by the client, client > throws NullPointerException. > KeeperException.SystemErrorException should be thrown for all unknown error > code. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2694) sync CLI command does not wait for result from server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2694: --- Attachment: ZOOKEEPER-2694-01.patch > sync CLI command does not wait for result from server > - > > Key: ZOOKEEPER-2694 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2694 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Attachments: ZOOKEEPER-2694-01.patch > > > sync CLI command does not wait for result from server. It returns immediately > after invoking the sync's asynchronous API. > Executing bellow command does not give the expected result > {{/bin/zkCli.sh -server host:port sync /}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ZOOKEEPER-2694) sync CLI command does not wait for result from server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-2694: -- Assignee: Mohammad Arshad > sync CLI command does not wait for result from server > - > > Key: ZOOKEEPER-2694 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2694 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > sync CLI command does not wait for result from server. It returns immediately > after invoking the sync's asynchronous API. > Executing bellow command does not give the expected result > {{/bin/zkCli.sh -server host:port sync /}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-2695: -- Assignee: Mohammad Arshad > Handle unknown error for rolling upgrade old client new server scenario > --- > > Key: ZOOKEEPER-2695 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > In Zookeeper rolling upgrade scenario where server is new but client is old, > when sever sends error code which is not understood by the client, client > throws IllegalArgumentException. Generally IllegalArgumentException is not > handled by any of the ZK applications. > KeeperException.SystemErrorException should be thrown instead of > IllegalArgumentException -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2693) DOS attack on wchp/wchc four letter words (4lw)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2693: --- Attachment: ZOOKEEPER-2693-01.patch > DOS attack on wchp/wchc four letter words (4lw) > --- > > Key: ZOOKEEPER-2693 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2693 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Affects Versions: 3.4.0, 3.5.1, 3.5.2 >Reporter: Patrick Hunt >Assignee: Michael Han >Priority: Blocker > Fix For: 3.4.10, 3.5.3 > > Attachments: ZOOKEEPER-2693-01.patch > > > The wchp/wchc four letter words can be exploited in a DOS attack on the ZK > client port - typically 2181. The following POC attack was recently published > on the web: > https://webcache.googleusercontent.com/search?q=cache:_CNGIz10PRYJ:https://www.exploit-db.com/exploits/41277/+=14=en=clnk=us > The most straightforward way to block this attack is to not allow access to > the client port to non-trusted clients - i.e. firewall the ZooKeeper service > and only allow access to trusted applications using it for coordination. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2693) DOS attack on wchp/wchc four letter words (4lw)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15873191#comment-15873191 ] Mohammad Arshad commented on ZOOKEEPER-2693: bq. 3.4: ruok,srvr,crst,srst,isro,mntr, 3.5: There are some 4lw commands which ZooKeeper uses itself. For example # srvr is used in zookeeper/bin/zkServer.sh status # isro is used in org.apache.zookeeper.ClientCnxn.SendThread.pingRwServer() If we do not enable those commands by default, the related functionality will not work, so we have to include them in the default list. But if we enable them, I do not know whether the whole purpose of this fix is defeated, because an attacker can still call these commands; even though they do little work, a connection is still created for every call. Any comments on which option to choose? > DOS attack on wchp/wchc four letter words (4lw) > --- > > Key: ZOOKEEPER-2693 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2693 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Affects Versions: 3.4.0, 3.5.1, 3.5.2 >Reporter: Patrick Hunt >Assignee: Michael Han >Priority: Blocker > Fix For: 3.4.10, 3.5.3 > > > The wchp/wchc four letter words can be exploited in a DOS attack on the ZK > client port - typically 2181. The following POC attack was recently published > on the web: > https://webcache.googleusercontent.com/search?q=cache:_CNGIz10PRYJ:https://www.exploit-db.com/exploits/41277/+=14=en=clnk=us > The most straightforward way to block this attack is to not allow access to > the client port to non-trusted clients - i.e. firewall the ZooKeeper service > and only allow access to trusted applications using it for coordination. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2693) DOS attack on wchp/wchc four letter words (4lw)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15871370#comment-15871370 ] Mohammad Arshad commented on ZOOKEEPER-2693: bq. I propose we get the command white list patch in, and then the release out, and then think about how to improve the overall access control of ZK in the wild, unless the current command white list does not address the security concern raised by this JIRA. [~hanm], This makes sense to me. I have created a new JIRA, ZOOKEEPER-2699, and put more detail there. Sure, we can handle it after this JIRA is merged. I will review this JIRA today. > DOS attack on wchp/wchc four letter words (4lw) > --- > > Key: ZOOKEEPER-2693 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2693 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Affects Versions: 3.4.0, 3.5.1, 3.5.2 >Reporter: Patrick Hunt >Assignee: Michael Han >Priority: Blocker > Fix For: 3.4.10, 3.5.3 > > > The wchp/wchc four letter words can be exploited in a DOS attack on the ZK > client port - typically 2181. The following POC attack was recently published > on the web: > https://webcache.googleusercontent.com/search?q=cache:_CNGIz10PRYJ:https://www.exploit-db.com/exploits/41277/+=14=en=clnk=us > The most straightforward way to block this attack is to not allow access to > the client port to non-trusted clients - i.e. firewall the ZooKeeper service > and only allow access to trusted applications using it for coordination. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (ZOOKEEPER-2699) Restrict 4lw commands based on client IP
Mohammad Arshad created ZOOKEEPER-2699: -- Summary: Restrict 4lw commands based on client IP Key: ZOOKEEPER-2699 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2699 Project: ZooKeeper Issue Type: Bug Components: security, server Reporter: Mohammad Arshad Assignee: Mohammad Arshad Currently 4lw commands are executed without authentication and can be accessed from any IP which has access to the ZooKeeper server. ZOOKEEPER-2693 attempts to limit the 4lw commands which are enabled by default or enabled by configuration. In addition to ZOOKEEPER-2693 we should also restrict 4lw commands based on client IP. This is required for the following scenarios: # User wants to enable all the 4lw commands # User wants to limit the access of the commands which are considered to be safe by default. *Implementation:* We can introduce a new property, 4lw.commands.host.whitelist # By default we allow all the hosts, but of course only on the 4lw commands exposed as per ZOOKEEPER-2693 # It can be configured to allow individual IPs (192.168.1.2,192.168.1.3 etc.) # It can also be configured to allow a group of IPs like 192.168.1.* -- This message was sent by Atlassian JIRA (v6.3.15#6346)
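The matching rules proposed above (allow everything by default, individual IPs, and trailing-wildcard groups such as 192.168.1.*) could be sketched roughly as follows. This is only an illustration of the proposal, not code from any patch: the class HostWhitelistSketch and its methods are hypothetical, and the input string is assumed to be the comma-separated value of the proposed 4lw.commands.host.whitelist property.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class HostWhitelistSketch {

    private final List<String> patterns;

    /** @param configured comma-separated entries, e.g. "192.168.1.2,10.0.0.*"; empty means allow all */
    HostWhitelistSketch(String configured) {
        this.patterns = Arrays.stream(configured.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }

    /** Returns true when the client IP matches an exact entry or a trailing-wildcard entry. */
    boolean isAllowed(String clientIp) {
        if (patterns.isEmpty()) {
            return true; // proposed default: all hosts are allowed
        }
        for (String pattern : patterns) {
            if (pattern.endsWith(".*")) {
                // "192.168.1.*" becomes the prefix "192.168.1."
                String prefix = pattern.substring(0, pattern.length() - 1);
                if (clientIp.startsWith(prefix)) {
                    return true;
                }
            } else if (pattern.equals(clientIp)) {
                return true;
            }
        }
        return false;
    }
}
```

A check like this only compares the address the server sees, so it inherits the usual limits of IP-based filtering.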
[jira] [Commented] (ZOOKEEPER-2693) DOS attack on wchp/wchc four letter words (4lw)
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15869352#comment-15869352 ] Mohammad Arshad commented on ZOOKEEPER-2693: Can we restrict 4lw commands based on IP By default we can allow access to the IP on which server is running. It can be configured to allow individual IPs(192.168.1.2,192.168.1.3 etc) It can also be configured to allow group of IPs like 192.168.1.* > DOS attack on wchp/wchc four letter words (4lw) > --- > > Key: ZOOKEEPER-2693 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2693 > Project: ZooKeeper > Issue Type: Bug > Components: security, server >Affects Versions: 3.4.0, 3.5.1, 3.5.2 >Reporter: Patrick Hunt >Assignee: Michael Han >Priority: Blocker > Fix For: 3.4.10, 3.5.3 > > > The wchp/wchc four letter words can be exploited in a DOS attack on the ZK > client port - typically 2181. The following POC attack was recently published > on the web: > https://webcache.googleusercontent.com/search?q=cache:_CNGIz10PRYJ:https://www.exploit-db.com/exploits/41277/+=14=en=clnk=us > The most straightforward way to block this attack is to not allow access to > the client port to non-trusted clients - i.e. firewall the ZooKeeper service > and only allow access to trusted applications using it for coordination. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (ZOOKEEPER-2695) Handle unknown error for rolling upgrade old client new server scenario
Mohammad Arshad created ZOOKEEPER-2695: -- Summary: Handle unknown error for rolling upgrade old client new server scenario Key: ZOOKEEPER-2695 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2695 Project: ZooKeeper Issue Type: Bug Components: java client Reporter: Mohammad Arshad In a ZooKeeper rolling upgrade scenario where the server is new but the client is old, when the server sends an error code which is not understood by the client, the client throws IllegalArgumentException. Generally IllegalArgumentException is not handled by any of the ZK applications. KeeperException.SystemErrorException should be thrown instead of IllegalArgumentException. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (ZOOKEEPER-2694) sync CLI command does not wait for result from server
Mohammad Arshad created ZOOKEEPER-2694: -- Summary: sync CLI command does not wait for result from server Key: ZOOKEEPER-2694 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2694 Project: ZooKeeper Issue Type: Bug Components: java client Reporter: Mohammad Arshad The sync CLI command does not wait for the result from the server. It returns immediately after invoking sync's asynchronous API. Executing the command below does not give the expected result: {{/bin/zkCli.sh -server host:port sync /}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
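One common way for a command-line wrapper to wait on an asynchronous call is to block on a latch that the callback releases. The sketch below shows that pattern in isolation; it is not the attached patch. asyncSync() and VoidCallback are hypothetical stand-ins for an asynchronous API, simulated here with a single-thread executor, and the timeout value is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class SyncWaitSketch {

    /** Hypothetical callback shape: result code plus the path that was synced. */
    interface VoidCallback {
        void processResult(int rc, String path);
    }

    /** Hypothetical asynchronous API: returns at once and completes on another thread. */
    static void asyncSync(ExecutorService pool, String path, VoidCallback cb) {
        pool.execute(() -> cb.processResult(0, path));
    }

    /** Issues the asynchronous call and blocks until the callback fires or the timeout expires. */
    static int syncAndWait(String path, long timeoutMs) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        AtomicInteger result = new AtomicInteger(Integer.MIN_VALUE);
        try {
            asyncSync(pool, path, (rc, p) -> {
                result.set(rc);
                done.countDown();
            });
            if (!done.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                throw new IllegalStateException("sync timed out for " + path);
            }
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for sync", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the await() step the method would return before the callback had run, which is the behaviour reported for the CLI command.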
[jira] [Commented] (ZOOKEEPER-2687) Deadlock while shutting down the Leader server.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15865369#comment-15865369 ] Mohammad Arshad commented on ZOOKEEPER-2687: [~rakeshr] can you please have a look on the proposed change as you are most familiar with the related defect ZOOKEEPER-2380. > Deadlock while shutting down the Leader server. > --- > > Key: ZOOKEEPER-2687 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2687 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.2, 3.6.0 >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Fix For: 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2687-01.patch > > > Leader server enters into deadlock while shutting down. This happens some > time only. > The reason and deadlock flow is same as ZOOKEEPER-2380. > shutdown was removed from synchronized block in ZOOKEEPER-2380 > Now shutdown is called from synchronized block from another place. > {code} > // check leader running status > if (!this.isRunning()) { > shutdown("Unexpected internal error"); > return; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2683) RaceConditionTest is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864234#comment-15864234 ] Mohammad Arshad commented on ZOOKEEPER-2683: Thanks [~hanm] for review. Old leader is already in QuorumStats.Provider.LEADING_STATE that is why I think it is not good to include in the assert list. > RaceConditionTest is flaky > -- > > Key: ZOOKEEPER-2683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Fix For: 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2683-01.patch > > > *Error Message* > {noformat} > Leader failed to transition to LOOKING or FOLLOWING state > {noformat} > *Stacktrace* > {noformat} > junit.framework.AssertionFailedError: Leader failed to transition to LOOKING > or FOLLOWING state > at > org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor(RaceConditionTest.java:74) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.lang.Thread.run(Thread.java:745) > {noformat} > [CI Failures > Reference|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/279//testReport/org.apache.zookeeper.server.quorum/RaceConditionTest/testRaceConditionBetweenLeaderAndAckRequestProcessor/] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2689) Fix Kerberos Authentication related test cases
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15864160#comment-15864160 ] Mohammad Arshad commented on ZOOKEEPER-2689: Thanks [~rakeshr] for the fix. Thanks [~eribeiro] for reviewing the fix. Committed to branch-3.4: https://git-wip-us.apache.org/repos/asf?p=zookeeper.git;a=commit;h=e8247eec1103e387e02bbb1e8859b4d468688f48 > Fix Kerberos Authentication related test cases > -- > > Key: ZOOKEEPER-2689 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2689 > Project: ZooKeeper > Issue Type: Bug > Components: tests >Affects Versions: 3.4.9 >Reporter: Mohammad Arshad >Assignee: Rakesh R >Priority: Critical > Fix For: 3.4.10 > > > Following test classes failed when branch-3.4 is run on java 6. > {noformat} > org.apache.zookeeper.server.quorum.auth.MiniKdcTest > org.apache.zookeeper.server.quorum.auth.QuorumKerberosAuthTest > org.apache.zookeeper.server.quorum.auth.QuorumKerberosHostBasedAuthTest > {noformat} > Error message is {{org/apache/kerby/kerberos/kerb/KrbException : Unsupported > major.minor version 51.0}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behaviour.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859692#comment-15859692 ] Mohammad Arshad commented on ZOOKEEPER-2680: I created ZOOKEEPER-2689 to track the branch-3.4 test failures. I will be happy to review if any patch given. > Correct DataNode.getChildren() inconsistent behaviour. > -- > > Key: ZOOKEEPER-2680 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2680 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.9, 3.5.1 >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2680-01.patch > > > DataNode.getChildren() API returns null and empty set if there are no > children in it depending on when the API is called. DataNode.getChildren() > API behavior should be changed and it should always return empty set if the > node does not have any child > *DataNode.getChildren() API Current Behavior:* > # returns null initially > When DataNode is created and no children are added yet, > DataNode.getChildren() returns null > # returns empty set after all the children are deleted: > created a Node > add a child > delete the child > DataNode.getChildren() returns empty set. > After fix DataNode.getChildren() should return empty set in all the above > cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
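The behaviour requested in the description (always an empty set, never null) can be sketched with a simplified node class. NodeSketch below is a hypothetical stand-in, not the real DataNode: it keeps the lazily allocated child set but normalises the return value in one place.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class NodeSketch {

    /** Allocated lazily, so it stays null until the first child is added. */
    private Set<String> children;

    synchronized boolean addChild(String child) {
        if (children == null) {
            children = new HashSet<>();
        }
        return children.add(child);
    }

    synchronized boolean removeChild(String child) {
        return children != null && children.remove(child);
    }

    /** Never returns null: a node without children yields an empty, read-only set. */
    synchronized Set<String> getChildren() {
        if (children == null) {
            return Collections.emptySet();
        }
        return Collections.unmodifiableSet(children);
    }
}
```

Callers then see the same result for a fresh node and for a node whose children were all deleted, so no null check is needed at call sites.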
[jira] [Created] (ZOOKEEPER-2689) Fix Kerberos Authentication related test cases
Mohammad Arshad created ZOOKEEPER-2689: -- Summary: Fix Kerberos Authentication related test cases Key: ZOOKEEPER-2689 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2689 Project: ZooKeeper Issue Type: Bug Components: tests Reporter: Mohammad Arshad Following test classes failed when branch-3.4 is run on java 6. {noformat} org.apache.zookeeper.server.quorum.auth.MiniKdcTest org.apache.zookeeper.server.quorum.auth.QuorumKerberosAuthTest org.apache.zookeeper.server.quorum.auth.QuorumKerberosHostBasedAuthTest {noformat} Error message is {{org/apache/kerby/kerberos/kerb/KrbException : Unsupported major.minor version 51.0}} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2687) Deadlock while shutting down the Leader server.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2687: --- Attachment: ZOOKEEPER-2687-01.patch > Deadlock while shutting down the Leader server. > --- > > Key: ZOOKEEPER-2687 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2687 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.2, 3.6.0 >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Attachments: ZOOKEEPER-2687-01.patch > > > Leader server enters into deadlock while shutting down. This happens some > time only. > The reason and deadlock flow is same as ZOOKEEPER-2380. > shutdown was removed from synchronized block in ZOOKEEPER-2380 > Now shutdown is called from synchronized block from another place. > {code} > // check leader running status > if (!this.isRunning()) { > shutdown("Unexpected internal error"); > return; > } > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2574) PurgeTxnLog can inadvertently delete required txn log files
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859266#comment-15859266 ] Mohammad Arshad commented on ZOOKEEPER-2574: The document changes were made in the generated HTML, docs/zookeeperAdmin.html. When the document is generated again these changes will be overwritten. The changes should have been made in src/docs/src/documentation/content/xdocs/zookeeperAdmin.xml. Maybe we can raise a new JIRA and port the changes. > PurgeTxnLog can inadvertently delete required txn log files > --- > > Key: ZOOKEEPER-2574 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2574 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.7, 3.4.8, 3.5.0, 3.5.1, 3.5.2 > Environment: Zookeeper 3.4.8, standalone, and 3-server quorum >Reporter: Abhishek Rai >Assignee: Abhishek Rai > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2574.2.patch, ZOOKEEPER-2574.3.patch, > ZOOKEEPER-2574.4.patch, ZOOKEEPER-2574.5.patch, ZOOKEEPER-2574.6.patch, > ZOOKEEPER-2574.patch > > > As part of the fix for ZOOKEEPER-1797, the call to > FileTxnSnapLog.getSnapshotLogs() was removed from PurgeTxnLog.java. As a > result, some old-looking but required txn log files can be deleted, resulting > in data corruption or loss. > For example, consider the following: > 1. Configuration: > autopurge.snapRetainCount=3 > 2. Following files exist: > log.100 spans transactions from zxid=100 till zxid=140 (inclusive) > snapshot.110 - snapshot as of zxid=110 > snapshot.120 - snapshot as of zxid=120 > snapshot.130 - snapshot as of zxid=130 > Above scenario is possible when snapshotting has happened multiple times but > without accompanying log rollover, which is possible if the server was > running as a learner. > 3. PurgeTxnLog retains all snapshots but deletes log.100 because its zxid is > older than the zxid of the oldest snapshot (110). This results in loss of > transactions in the range 131-140. 
> Before the fix for ZOOKEEPER-1797, this was avoided by the call to > FileTxnSnapLog.getSnapshotLogs() which finds and retains the newest txn log > file with starting zxid < oldest retained snapshot's highest zxid. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
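The retention rule described in the quoted issue can be sketched as a small selection function over log starting zxids. This is an illustration of the rule only, under the assumption that each log file is identified by its starting zxid; PurgeRetentionSketch and logsToRetain() are hypothetical names and do not correspond to the actual PurgeTxnLog code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class PurgeRetentionSketch {

    /**
     * Picks the log files to keep: every log starting at or after the oldest
     * retained snapshot, plus the newest log that starts before it, because
     * that log may still contain transactions newer than the snapshot.
     */
    static List<Long> logsToRetain(List<Long> logStartZxids, long oldestRetainedSnapZxid) {
        TreeSet<Long> sorted = new TreeSet<>(logStartZxids);
        List<Long> keep = new ArrayList<>();

        // Newest log with a starting zxid strictly below the snapshot zxid, if any.
        Long covering = sorted.lower(oldestRetainedSnapZxid);
        if (covering != null) {
            keep.add(covering);
        }
        // All logs that start at or after the snapshot zxid, in ascending order.
        keep.addAll(sorted.tailSet(oldestRetainedSnapZxid, true));
        return keep;
    }
}
```

Applied to the example in the issue (a single log starting at zxid 100 and an oldest retained snapshot at zxid 110), the function keeps the log starting at 100 rather than deleting it.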
[jira] [Created] (ZOOKEEPER-2687) Deadlock while shutting down the Leader server.
Mohammad Arshad created ZOOKEEPER-2687: -- Summary: Deadlock while shutting down the Leader server. Key: ZOOKEEPER-2687 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2687 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.5.2, 3.6.0 Reporter: Mohammad Arshad Assignee: Mohammad Arshad The Leader server enters a deadlock while shutting down. This happens only occasionally. The cause and deadlock flow are the same as in ZOOKEEPER-2380. The shutdown call was moved out of the synchronized block in ZOOKEEPER-2380; now shutdown is called from a synchronized block in another place. {code} // check leader running status if (!this.isRunning()) { shutdown("Unexpected internal error"); return; } {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
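A general way to avoid this class of deadlock is to decide under the lock but act outside it. The sketch below shows that pattern only; it is not the attached patch, and ShutdownOutsideLockSketch with its fields and methods is entirely hypothetical.

```java
public class ShutdownOutsideLockSketch {

    private final Object lock = new Object();
    private boolean running = true;
    private boolean shutdownCalled = false;
    private boolean shutdownHeldLock = false;

    /** Marks the component as stopped; only touches state guarded by the lock. */
    void stop() {
        synchronized (lock) {
            running = false;
        }
    }

    /** Reads the guarded state under the lock, then calls shutdown after releasing it. */
    void checkRunning() {
        boolean needShutdown;
        synchronized (lock) {
            needShutdown = !running;
        }
        if (needShutdown) {
            // The lock is no longer held here, so shutdown may wait on other threads
            // that themselves need the lock without creating a cycle.
            shutdown("Unexpected internal error");
        }
    }

    /** Records whether the caller still held the lock, to make the pattern checkable. */
    void shutdown(String reason) {
        shutdownHeldLock = Thread.holdsLock(lock);
        shutdownCalled = true;
    }

    boolean shutdownCalled() {
        return shutdownCalled;
    }

    boolean shutdownHeldLock() {
        return shutdownHeldLock;
    }
}
```

The trade-off is that the state may change between the check and the call, so shutdown itself has to tolerate being invoked more than once.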
[jira] [Commented] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behaviour.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856528#comment-15856528 ] Mohammad Arshad commented on ZOOKEEPER-2680: Thanks [~rakeshr]. branch-3.5 all tests are passing. branch-3.4 has following failures which are not related to this patch. {noformat} org.apache.zookeeper.server.quorum.auth.MiniKdcTest org.apache.zookeeper.server.quorum.auth.QuorumKerberosAuthTest org.apache.zookeeper.server.quorum.auth.QuorumKerberosHostBasedAuthTest {noformat} with error message {noformat} org/apache/kerby/kerberos/kerb/KrbException : Unsupported major.minor version 51.0 {noformat} I have run branch-3.5 on jdk1.7.0_80 branch-3.4 on jdk1.6.0_45 > Correct DataNode.getChildren() inconsistent behaviour. > -- > > Key: ZOOKEEPER-2680 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2680 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.9, 3.5.1 >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2680-01.patch > > > DataNode.getChildren() API returns null and empty set if there are no > children in it depending on when the API is called. DataNode.getChildren() > API behavior should be changed and it should always return empty set if the > node does not have any child > *DataNode.getChildren() API Current Behavior:* > # returns null initially > When DataNode is created and no children are added yet, > DataNode.getChildren() returns null > # returns empty set after all the children are deleted: > created a Node > add a child > delete the child > DataNode.getChildren() returns empty set. > After fix DataNode.getChildren() should return empty set in all the above > cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2683) RaceConditionTest is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2683: --- Attachment: ZOOKEEPER-2683-01.patch QuorumPeerMain.getQuorumPeer visibility changed to protected to simplify the test case. This is just an improvement; the fix can work without this change, but it is good to have. > RaceConditionTest is flaky > -- > > Key: ZOOKEEPER-2683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Attachments: ZOOKEEPER-2683-01.patch > > > *Error Message* > {noformat} > Leader failed to transition to LOOKING or FOLLOWING state > {noformat} > *Stacktrace* > {noformat} > junit.framework.AssertionFailedError: Leader failed to transition to LOOKING > or FOLLOWING state > at > org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor(RaceConditionTest.java:74) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.lang.Thread.run(Thread.java:745) > {noformat} > [CI Failures > Reference|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/279//testReport/org.apache.zookeeper.server.quorum/RaceConditionTest/testRaceConditionBetweenLeaderAndAckRequestProcessor/] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2683) RaceConditionTest is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854462#comment-15854462 ] Mohammad Arshad commented on ZOOKEEPER-2683: Test case expectation is wrong. Test case is expecting the old leader to be the follower only, which is not correct. Old leader can become the leader again. > RaceConditionTest is flaky > -- > > Key: ZOOKEEPER-2683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > > *Error Message* > {noformat} > Leader failed to transition to LOOKING or FOLLOWING state > {noformat} > *Stacktrace* > {noformat} > junit.framework.AssertionFailedError: Leader failed to transition to LOOKING > or FOLLOWING state > at > org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor(RaceConditionTest.java:74) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > at java.util.concurrent.FutureTask.run(FutureTask.java:262) > at java.lang.Thread.run(Thread.java:745) > {noformat} > [CI Failures > Reference|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/279//testReport/org.apache.zookeeper.server.quorum/RaceConditionTest/testRaceConditionBetweenLeaderAndAckRequestProcessor/] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Issue Comment Deleted] (ZOOKEEPER-2577) Flaky Test: org.apache.zookeeper.server.quorum.ReconfigDuringLeaderSyncTest.testDuringLeaderSync
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2577: --- Comment: was deleted (was: Test case expectation is wrong. Test case is expecting the old leader to be the follower only, which is not correct. Old leader can become the leader again. I will soon raise a merge request.) > Flaky Test: > org.apache.zookeeper.server.quorum.ReconfigDuringLeaderSyncTest.testDuringLeaderSync > > > Key: ZOOKEEPER-2577 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2577 > Project: ZooKeeper > Issue Type: Test > Components: tests >Affects Versions: 3.5.2 >Reporter: Michael Han >Assignee: Mohammad Arshad > Labels: flaky, flaky-test > Fix For: 3.6.0 > > > This failure is new and consistent on jdk7/8 with trunk branch - happened > after build 3070 recently. Not sure if this is caused by svn - git migration > or not. > {noformat} > Error Message > zoo.cfg.dynamic.next is not deleted. > Stacktrace > junit.framework.AssertionFailedError: zoo.cfg.dynamic.next is not deleted. > at > org.apache.zookeeper.server.quorum.ReconfigDuringLeaderSyncTest.testDuringLeaderSync(ReconfigDuringLeaderSyncTest.java:155) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > Standard Output > 2016-09-13 05:09:25,247 [myid:] - INFO [main:JUnit4ZKTestRunner@47] - No > test.method specified. using default methods. > 2016-09-13 05:09:25,349 [myid:] - INFO [main:JUnit4ZKTestRunner@47] - No > test.method specified. using default methods. > 2016-09-13 05:09:25,370 [myid:] - INFO [main:ZKTestCase$1@55] - STARTING > testDuringLeaderSync > 2016-09-13 05:09:25,372 [myid:] - INFO > [main:JUnit4ZKTestRunner$LoggedInvokeMethod@77] - RUNNING TEST METHOD > testDuringLeaderSync > 2016-09-13 05:09:25,375 [myid:] - INFO [main:PortAssignment@151] - Test > process 2/8 using ports from 13914 - 16606. 
> 2016-09-13 05:09:25,380 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13915 from range 13914 - 16606. > 2016-09-13 05:09:25,380 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13916 from range 13914 - 16606. > 2016-09-13 05:09:25,381 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13917 from range 13914 - 16606. > 2016-09-13 05:09:25,381 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13918 from range 13914 - 16606. > 2016-09-13 05:09:25,381 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13919 from range 13914 - 16606. > 2016-09-13 05:09:25,382 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13920 from range 13914 - 16606. > 2016-09-13 05:09:25,382 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13921 from range 13914 - 16606. > 2016-09-13 05:09:25,382 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13922 from range 13914 - 16606. > 2016-09-13 05:09:25,383 [myid:] - INFO [main:PortAssignment@85] - Assigned > port 13923 from range 13914 - 16606. 
> 2016-09-13 05:09:25,406 [myid:] - INFO > [main:QuorumPeerTestBase$MainThread@131] - id = 0 tmpDir = > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test8397079557861207505.junit.dir > clientPort = 13915 adminServerPort = 8080 > 2016-09-13 05:09:25,416 [myid:] - INFO > [main:QuorumPeerTestBase$MainThread@131] - id = 1 tmpDir = > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test176919429940621.junit.dir > clientPort = 13918 adminServerPort = 8080 > 2016-09-13 05:09:25,420 [myid:] - INFO > [main:QuorumPeerTestBase$MainThread@131] - id = 2 tmpDir = > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test5455612786130415623.junit.dir > clientPort = 13921 adminServerPort = 8080 > 2016-09-13 05:09:25,422 [myid:] - INFO [Thread-0:QuorumPeerConfig@116] - > Reading configuration from: > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test8397079557861207505.junit.dir/zoo.cfg > 2016-09-13 05:09:25,422 [myid:] - INFO [Thread-2:QuorumPeerConfig@116] - > Reading configuration from: > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test5455612786130415623.junit.dir/zoo.cfg > 2016-09-13 05:09:25,422 [myid:] - INFO [Thread-1:QuorumPeerConfig@116] - > Reading configuration from: > /home/jenkins/jenkins-slave/workspace/ZooKeeper-trunk-openjdk7/build/test/tmp/test176919429940621.junit.dir/zoo.cfg > 2016-09-13 05:09:25,424 [myid:] - INFO [main:FourLetterWordMain@85] - > connecting to 127.0.0.1 13915 > 2016-09-13 05:09:25,425 [myid:] - INFO [Thread-0:QuorumPeerConfig@318] - > clientPortAddress is 0.0.0.0/0.0.0.0:13915 >
[jira] [Created] (ZOOKEEPER-2683) RaceConditionTest is flaky
Mohammad Arshad created ZOOKEEPER-2683: -- Summary: RaceConditionTest is flaky Key: ZOOKEEPER-2683 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2683 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad *Error Message* {noformat} Leader failed to transition to LOOKING or FOLLOWING state {noformat} *Stacktrace* {noformat} junit.framework.AssertionFailedError: Leader failed to transition to LOOKING or FOLLOWING state at org.apache.zookeeper.server.quorum.RaceConditionTest.testRaceConditionBetweenLeaderAndAckRequestProcessor(RaceConditionTest.java:74) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) at java.util.concurrent.FutureTask.run(FutureTask.java:262) at java.lang.Thread.run(Thread.java:745) {noformat} [CI Failures Reference|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/279//testReport/org.apache.zookeeper.server.quorum/RaceConditionTest/testRaceConditionBetweenLeaderAndAckRequestProcessor/] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2680) Correct DataNode.getChildren() inconsistent behaviour.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853574#comment-15853574 ] Mohammad Arshad commented on ZOOKEEPER-2680: I wanted to run the test cases in my local CI. I need one patch to be committed to complete my CI setup. Can you please have a look at ZOOKEEPER-2682 and commit it? > Correct DataNode.getChildren() inconsistent behaviour. > -- > > Key: ZOOKEEPER-2680 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2680 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.9, 3.5.1 >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2680-01.patch > > > DataNode.getChildren() API returns null and empty set if there are no > children in it depending on when the API is called. DataNode.getChildren() > API behavior should be changed and it should always return empty set if the > node does not have any child > *DataNode.getChildren() API Current Behavior:* > # returns null initially > When DataNode is created and no children are added yet, > DataNode.getChildren() returns null > # returns empty set after all the children are deleted: > created a Node > add a child > delete the child > DataNode.getChildren() returns empty set. > After fix DataNode.getChildren() should return empty set in all the above > cases. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853571#comment-15853571 ] Mohammad Arshad commented on ZOOKEEPER-2682: I verified the patch on my local CI. I ran the test cases of the master branch and it works fine. This patch applies to master and branch-3.5; I will submit a separate patch for branch-3.4. > Make it optional to fail build on test failure. > --- > > Key: ZOOKEEPER-2682 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Minor > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2682-01.patch, ZOOKEEPER-2682-02.patch > > > Currently if there is a test failure, build is marked as failed and exits. > I want to rerun the failed test cases instead of exiting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15853568#comment-15853568 ] Mohammad Arshad commented on ZOOKEEPER-2682: The test failures are not related to this patch. > Make it optional to fail build on test failure. > --- > > Key: ZOOKEEPER-2682 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Minor > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2682-01.patch, ZOOKEEPER-2682-02.patch > > > Currently if there is a test failure, build is marked as failed and exits. > I want to rerun the failed test cases instead of exiting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2682: --- Attachment: ZOOKEEPER-2682-02.patch Generated the patch with the --no-prefix option, which was missed in the earlier patch. > Make it optional to fail build on test failure. > --- > > Key: ZOOKEEPER-2682 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Minor > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2682-01.patch, ZOOKEEPER-2682-02.patch > > > Currently if there is a test failure, build is marked as failed and exits. > I want to rerun the failed test cases instead of exiting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2682: --- Attachment: ZOOKEEPER-2682-01.patch Added new parameter test.junit.failbuild.ontestfailure. Pass -Dtest.junit.failbuild.ontestfailure=false along with the other junit test parameters to pass the build even if some tests fail. By default test.junit.failbuild.ontestfailure is true, so the default behavior is unchanged. > Make it optional to fail build on test failure. > --- > > Key: ZOOKEEPER-2682 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Minor > Fix For: 3.4.10, 3.5.3, 3.6.0 > > Attachments: ZOOKEEPER-2682-01.patch > > > Currently if there is a test failure, build is marked as failed and exits. > I want to rerun the failed test cases instead of exiting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
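The semantics of the new parameter described above can be modelled with a small sketch. This is not the patch itself (which presumably changes the build script, not Java code); the class FailBuildFlag and the method shouldFailBuild are hypothetical names, and only the property name test.junit.failbuild.ontestfailure and its default of true come from the message.

```java
/** Hypothetical model of the decision logic; not part of the actual patch. */
class FailBuildFlag {
    // Property name taken from the JIRA comment.
    static final String PROPERTY = "test.junit.failbuild.ontestfailure";

    /**
     * Decides whether the build should be marked as failed.
     *
     * @param propertyValue raw value of the property, or null when it is unset
     * @param failedTests   number of failed tests in the run
     */
    static boolean shouldFailBuild(String propertyValue, int failedTests) {
        // Unset means true, so the default behaviour stays unchanged.
        // Any value other than "true" (case-insensitive) is treated as false.
        boolean failOnTestFailure =
                propertyValue == null || Boolean.parseBoolean(propertyValue);
        return failOnTestFailure && failedTests > 0;
    }
}
```

A caller would typically pass System.getProperty(FailBuildFlag.PROPERTY) as the first argument; with the property set to false the run completes and the failed tests can be rerun afterwards.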
[jira] [Updated] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-2682: --- Description: Currently if there is a test failure, build is marked as failed and exits. I want to rerun the failed test cases instead of exiting. was: Currently if there is a test failure, build is marked as failed and exits. I want to rerun the failed test cases instead of existing. > Make it optional to fail build on test failure. > --- > > Key: ZOOKEEPER-2682 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 > Project: ZooKeeper > Issue Type: Improvement > Components: build, tests >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Minor > > Currently if there is a test failure, build is marked as failed and exits. > I want to rerun the failed test cases instead of exiting. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (ZOOKEEPER-2682) Make it optional to fail build on test failure.
Mohammad Arshad created ZOOKEEPER-2682: -- Summary: Make it optional to fail build on test failure. Key: ZOOKEEPER-2682 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2682 Project: ZooKeeper Issue Type: Improvement Components: build, tests Reporter: Mohammad Arshad Assignee: Mohammad Arshad Priority: Minor Currently if there is a test failure, build is marked as failed and exits. I want to rerun the failed test cases instead of existing. -- This message was sent by Atlassian JIRA (v6.3.15#6346)