[jira] [Resolved] (ZOOKEEPER-4531) Revert Netty TCNative change
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4531. Fix Version/s: 3.9.0 3.7.1 3.6.4 3.8.1 Resolution: Fixed Issue resolved by pull request 1873 [https://github.com/apache/zookeeper/pull/1873] > Revert Netty TCNative change > > > Key: ZOOKEEPER-4531 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4531 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.9.0, 3.7.1, 3.6.4, 3.8.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > netty-tcnative is a dependency of netty 4.1.73. After upgrading netty to > 4.1.76 we can remove netty-tcnative, as the netty-tcnative upgrade to 2.0.48 > did not resolve any CVEs. > > As netty 4.1.76 does not have the netty-tcnative dependency, the CVEs will also be > resolved. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (ZOOKEEPER-4529) Upgrade netty to 4.1.76.Final
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4529. Fix Version/s: 3.7.1 3.6.4 3.9.0 3.8.1 Resolution: Fixed > Upgrade netty to 4.1.76.Final > - > > Key: ZOOKEEPER-4529 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4529 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > To resolve the CVEs generated due to netty-tcnative-classes:jar:2.0.46.Final > we should upgrade the netty version. The following CVEs come from the dependency of > io.netty:netty-codec:jar:4.1.73.Final on > io.netty:netty-tcnative-classes:jar:2.0.46.Final. > > CVE-2014-3488, CVE-2015-2156, CVE-2019-16869, CVE-2019-20444, CVE-2019-20445, > CVE-2021-21290, CVE-2021-21295, CVE-2021-21409, CVE-2021-37136, > CVE-2021-37137, CVE-2021-43797 -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17531892#comment-17531892 ] Mohammad Arshad commented on ZOOKEEPER-4510: Upgrading dependency-check-maven to the latest release, 7.1.0, solves this false-positive CVE issue. I will raise a PR. > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.4, 3.7. > > Time Spent: 0.5h > Remaining Estimate: 0h > > On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is > failing with the following errors. > {code:java} > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check > (default-cli) on project zookeeper-assembly: > [ERROR] > [ERROR] One or more dependencies were identified with vulnerabilities that > have a CVSS score greater than or equal to '0.0': > [ERROR] > [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (ZOOKEEPER-4482) Fix LICENSE FILES for commons-io and commons-cli
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4482: --- Fix Version/s: 3.9.0 (was: 3.6.4) (was: 3.8.1) > Fix LICENSE FILES for commons-io and commons-cli > > > Key: ZOOKEEPER-4482 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4482 > Project: ZooKeeper > Issue Type: Task > Components: license >Reporter: Enrico Olivelli >Assignee: Enrico Olivelli >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.9.0 > > Time Spent: 1h > Remaining Estimate: 0h > > We should rename from commons-io-2.7 to 2.11.0 and we should also add the > LICENSE file for commons-cli -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Resolved] (ZOOKEEPER-4482) Fix LICENSE FILES for commons-io and commons-cli
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4482. Resolution: Fixed > Fix LICENSE FILES for commons-io and commons-cli > > > Key: ZOOKEEPER-4482 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4482 > Project: ZooKeeper > Issue Type: Task > Components: license >Reporter: Enrico Olivelli >Assignee: Enrico Olivelli >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.9.0 > > Time Spent: 1h > Remaining Estimate: 0h > > We should rename from commons-io-2.7 to 2.11.0 and we should also add the > LICENSE file for commons-cli -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (ZOOKEEPER-4287) Upgrade prometheus client library version to 0.10.0
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4287: --- Fix Version/s: (was: 3.7.1) > Upgrade prometheus client library version to 0.10.0 > --- > > Key: ZOOKEEPER-4287 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4287 > Project: ZooKeeper > Issue Type: Improvement > Components: build >Affects Versions: 3.7.0, 3.8.0 >Reporter: Li Wang >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > Upgrade the client library version to the latest to help investigating the > Prometheus impact issue. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (ZOOKEEPER-4388) Recover from network partition, follower/observer ephemerals nodes is inconsistent with leader
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4388: --- Fix Version/s: (was: 3.7.1) > Recover from network partition, follower/observer ephemerals nodes is > inconsistent with leader > -- > > Key: ZOOKEEPER-4388 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4388 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.6.0, 3.6.3, 3.6.1, 3.6.2 > Environment: zk version 3.6.0 3.6.1 3.6.2 > the follower/observer network disconnection time exceeds session timeout >Reporter: shixiaoxiao >Priority: Major > Labels: inconsistency, partitoned, zookeeper > Attachments: dataInconsistent.png > > > The follower/observer has read-only mode enabled. When the node recovers from > the partition, its ephemeral nodes will be inconsistent with the leader's. > The reason is that the requests to close the timed-out sessions are processed > by the read-only follower or observer while it is partitioned, and the > ephemeral nodes created by those sessions are also deleted. When the leader > then uses a diff to synchronize data with the follower/observer, the > transactions being synchronized do not include the creation of the ephemeral > nodes owned by the sessions the follower closed, so the follower/observer > ephemeral nodes are inconsistent with the leader. -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Updated] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4510: --- Fix Version/s: 3.7. (was: 3.7.1) > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.4, 3.7. > > Time Spent: 0.5h > Remaining Estimate: 0h > > On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is > failing with following errors. > {code:java} > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check > (default-cli) on project zookeeper-assembly: > [ERROR] > [ERROR] One or more dependencies were identified with vulnerabilities that > have a CVSS score greater than or equal to '0.0': > [ERROR] > [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 > {code} -- This message was sent by Atlassian Jira (v8.20.7#820007)
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523064#comment-17523064 ] Mohammad Arshad commented on ZOOKEEPER-1875: Thanks [~jerryhe] for raising and submitting the patches. Thanks [~symat], [~eolivelli] for the reviews. > NullPointerException in ClientCnxn$EventThread.processEvent > --- > > Key: ZOOKEEPER-1875 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1875 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.4.5, 3.4.10 >Reporter: Jerry He >Assignee: Jerry He >Priority: Minor > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: ZOOKEEPER-1875-trunk.patch, ZOOKEEPER-1875.patch, > ZOOKEEPER-1875.patch > > Time Spent: 50m > Remaining Estimate: 0h > > We've been seeing NullPointerException while working on HBase: > {code} > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Client > environment:user.dir=/home/biadmin/hbase-trunk > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Initiating client connection, > connectString=hdtest009:2181 sessionTimeout=9 watcher=null > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Opening socket connection to > server hdtest009/9.30.194.18:2181. Will not attempt to authenticate using > SASL (Unable to locate a login configuration) > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Socket connection established to > hdtest009/9.30.194.18:2181, initiating session > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Session establishment complete > on server hdtest009/9.30.194.18:2181, sessionid = 0x143986213e67e48, > negotiated timeout = 6 > 14/01/30 22:15:25 ERROR zookeeper.ClientCnxn: Error while calling watcher > java.lang.NullPointerException > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > {code} > The reason is the watcher is null in this part of the code: > {code} >private void processEvent(Object event) { > try { > if (event instanceof WatcherSetEventPair) { > // each watcher will process the event > WatcherSetEventPair pair = (WatcherSetEventPair) event; > for (Watcher watcher : pair.watchers) { > try { > watcher.process(pair.event); > } catch (Throwable t) { > LOG.error("Error while calling watcher ", t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523059#comment-17523059 ] Mohammad Arshad commented on ZOOKEEPER-1875: When the watcher is null, the ZooKeeper client app gets a NullPointerException anyway. After this fix, apps will instead get an IllegalArgumentException, which makes it easier to figure out what is wrong in the code and correct it. > NullPointerException in ClientCnxn$EventThread.processEvent > --- > > Key: ZOOKEEPER-1875 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1875 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.4.5, 3.4.10 >Reporter: Jerry He >Assignee: Jerry He >Priority: Minor > Labels: pull-request-available > Attachments: ZOOKEEPER-1875-trunk.patch, ZOOKEEPER-1875.patch, > ZOOKEEPER-1875.patch > > Time Spent: 40m > Remaining Estimate: 0h > > We've been seeing NullPointerException while working on HBase: > {code} > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Client > environment:user.dir=/home/biadmin/hbase-trunk > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Initiating client connection, > connectString=hdtest009:2181 sessionTimeout=9 watcher=null > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Opening socket connection to > server hdtest009/9.30.194.18:2181. Will not attempt to authenticate using > SASL (Unable to locate a login configuration) > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Socket connection established to > hdtest009/9.30.194.18:2181, initiating session > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Session establishment complete > on server hdtest009/9.30.194.18:2181, sessionid = 0x143986213e67e48, > negotiated timeout = 6 > 14/01/30 22:15:25 ERROR zookeeper.ClientCnxn: Error while calling watcher > java.lang.NullPointerException > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > {code} > The reason is the watcher is null in this part of the code: > {code} >private void processEvent(Object event) { > try { > if (event instanceof WatcherSetEventPair) { > // each watcher will process the event > WatcherSetEventPair pair = (WatcherSetEventPair) event; > for (Watcher watcher : pair.watchers) { > try { > watcher.process(pair.event); > } catch (Throwable t) { > LOG.error("Error while calling watcher ", t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
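The comment above describes the fix's effect: rejecting a null watcher early turns a late NullPointerException inside the event thread into an immediate IllegalArgumentException at the call site. A minimal sketch of that fail-fast idea (hypothetical class and method names, not ZooKeeper's actual patch):

```java
// Hypothetical sketch, not ZooKeeper's real classes: a stand-in client
// that validates its default watcher at construction time.
public class WatcherGuardDemo {

    /** Minimal stand-in for org.apache.zookeeper.Watcher. */
    interface Watcher {
        void process(Object event);
    }

    static class Client {
        private final Watcher defaultWatcher;

        Client(Watcher defaultWatcher) {
            // Fail fast: the caller's stack trace points at the bad call,
            // instead of an NPE surfacing later inside the event thread.
            if (defaultWatcher == null) {
                throw new IllegalArgumentException("watcher must not be null");
            }
            this.defaultWatcher = defaultWatcher;
        }

        void dispatch(Object event) {
            defaultWatcher.process(event);
        }
    }

    public static void main(String[] args) {
        try {
            new Client(null);
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

With this guard, `new Client(null)` fails immediately rather than when the first event is delivered, which is what makes the misuse easy to locate.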
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17523057#comment-17523057 ] Mohammad Arshad commented on ZOOKEEPER-1875: I have checked the zk client code carefully; the NPE occurs only when the watcher is set to null, either through the ZooKeeper constructor or through the register method. Now I think we should do exactly what was submitted in the latest ZOOKEEPER-1875.patch. > NullPointerException in ClientCnxn$EventThread.processEvent > --- > > Key: ZOOKEEPER-1875 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1875 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.4.5, 3.4.10 >Reporter: Jerry He >Assignee: Jerry He >Priority: Minor > Labels: pull-request-available > Attachments: ZOOKEEPER-1875-trunk.patch, ZOOKEEPER-1875.patch, > ZOOKEEPER-1875.patch > > Time Spent: 40m > Remaining Estimate: 0h > > We've been seeing NullPointerException while working on HBase: > {code} > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Client > environment:user.dir=/home/biadmin/hbase-trunk > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Initiating client connection, > connectString=hdtest009:2181 sessionTimeout=9 watcher=null > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Opening socket connection to > server hdtest009/9.30.194.18:2181. Will not attempt to authenticate using > SASL (Unable to locate a login configuration) > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Socket connection established to > hdtest009/9.30.194.18:2181, initiating session > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Session establishment complete > on server hdtest009/9.30.194.18:2181, sessionid = 0x143986213e67e48, > negotiated timeout = 6 > 14/01/30 22:15:25 ERROR zookeeper.ClientCnxn: Error while calling watcher > java.lang.NullPointerException > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > {code} > The reason is the watcher is null in this part of the code: > {code} >private void processEvent(Object event) { > try { > if (event instanceof WatcherSetEventPair) { > // each watcher will process the event > WatcherSetEventPair pair = (WatcherSetEventPair) event; > for (Watcher watcher : pair.watchers) { > try { > watcher.process(pair.event); > } catch (Throwable t) { > LOG.error("Error while calling watcher ", t); > } > } > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17520474#comment-17520474 ] Mohammad Arshad commented on ZOOKEEPER-4510: As the CVE false-positive resolution is taking time, let's suppress those CVEs and move on. I have raised a PR. > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Blocker > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4 > > Time Spent: 10m > Remaining Estimate: 0h > > On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is > failing with the following errors. > {code:java} > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check > (default-cli) on project zookeeper-assembly: > [ERROR] > [ERROR] One or more dependencies were identified with vulnerabilities that > have a CVSS score greater than or equal to '0.0': > [ERROR] > [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
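The suppression approach mentioned above uses OWASP dependency-check's suppression file format. A hypothetical entry for these two reload4j false positives might look like the following (the notes text and regex are illustrative, not copied from the actual PR):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<suppressions xmlns="https://jeremylong.github.io/DependencyCheck/dependency-suppression.1.3.xsd">
  <suppress>
    <notes><![CDATA[
      ZOOKEEPER-4510: reported as false positives against reload4j 1.2.19.
    ]]></notes>
    <packageUrl regex="true">^pkg:maven/ch\.qos\.reload4j/reload4j@.*$</packageUrl>
    <cve>CVE-2020-9493</cve>
    <cve>CVE-2022-23307</cve>
  </suppress>
</suppressions>
```

The file is then wired into the dependency-check-maven plugin via its suppression-file configuration, so the build passes while the upstream NVD data is corrected.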
[jira] [Commented] (ZOOKEEPER-4515) ZK Cli quit command always logs error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17520008#comment-17520008 ] Mohammad Arshad commented on ZOOKEEPER-4515: Thanks [~Tison], [~eolivelli] for the review. > ZK Cli quit command always logs error > - > > Key: ZOOKEEPER-4515 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: image-2022-04-08-15-47-04-325.png, screenshot-1.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > !image-2022-04-08-15-47-04-325.png! > * When connection is in closing state, this log warning is entirely useless, > change this log to debug. > * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (ZOOKEEPER-4515) ZK Cli quit command always logs error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4515. Resolution: Fixed Issue resolved by pull request 1856 [https://github.com/apache/zookeeper/pull/1856] > ZK Cli quit command always logs error > - > > Key: ZOOKEEPER-4515 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.9.0, 3.7.1, 3.6.4, 3.8.1 > > Attachments: image-2022-04-08-15-47-04-325.png, screenshot-1.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > !image-2022-04-08-15-47-04-325.png! > * When connection is in closing state, this log warning is entirely useless, > change this log to debug. > * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4510: --- Priority: Blocker (was: Critical) > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Blocker > Fix For: 3.7.1, 3.6.4 > > > On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is > failing with following errors. > {code:java} > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check > (default-cli) on project zookeeper-assembly: > [ERROR] > [ERROR] One or more dependencies were identified with vulnerabilities that > have a CVSS score greater than or equal to '0.0': > [ERROR] > [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4516) checkstyle:check is failing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519711#comment-17519711 ] Mohammad Arshad commented on ZOOKEEPER-4516: Thanks [~symat] for the review. > checkstyle:check is failing > > > Key: ZOOKEEPER-4516 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4 > > Time Spent: 1h > Remaining Estimate: 0h > > checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (ZOOKEEPER-4516) checkstyle:check is failing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4516. Resolution: Fixed Issue resolved by pull request 1858 [https://github.com/apache/zookeeper/pull/1858] > checkstyle:check is failing > > > Key: ZOOKEEPER-4516 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4 > > Time Spent: 20m > Remaining Estimate: 0h > > checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519699#comment-17519699 ] Mohammad Arshad commented on ZOOKEEPER-4514: Thanks [~symat] for the review. > ClientCnxnSocketNetty throwing NPE > -- > > Key: ZOOKEEPER-4514 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: image-2022-04-07-13-27-13-068.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > ClientCnxnSocketNetty throwing NPE. This mainly happens when any of the > server is in restarting state and client tries to connect. > !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4514: --- Fix Version/s: 3.7.1 3.6.4 3.9.0 3.8.1 > ClientCnxnSocketNetty throwing NPE > -- > > Key: ZOOKEEPER-4514 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: image-2022-04-07-13-27-13-068.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > ClientCnxnSocketNetty throwing NPE. This mainly happens when any of the > server is in restarting state and client tries to connect. > !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4514. Resolution: Fixed > ClientCnxnSocketNetty throwing NPE > -- > > Key: ZOOKEEPER-4514 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: image-2022-04-07-13-27-13-068.png > > Time Spent: 1h 10m > Remaining Estimate: 0h > > ClientCnxnSocketNetty throwing NPE. This mainly happens when any of the > server is in restarting state and client tries to connect. > !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4516) checkstyle:check is failing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519532#comment-17519532 ] Mohammad Arshad commented on ZOOKEEPER-4516: branch-3.6 error {noformat} [ERROR] src\test\java\org\apache\zookeeper\KerberosTicketRenewalTest.java:[220,7] (whitespace) WhitespaceAround: 'if' is not followed by whitespace. [ERROR] src\test\java\org\apache\zookeeper\server\quorum\auth\MiniKdcTest.java:[26,8] (imports) UnusedImports: Unused import: java.util.HashMap ... [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.1.0:check (default-cli) on project zookeeper: You have 8 Checkstyle violations. -> [Help 1] {noformat} > checkstyle:check is failing > > > Key: ZOOKEEPER-4516 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4 > > Time Spent: 10m > Remaining Estimate: 0h > > checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4516) checkstyle:check is failing
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519513#comment-17519513 ] Mohammad Arshad commented on ZOOKEEPER-4516: mvn clean install checkstyle:check -DskipTests on branch-3.7 fails with following errors {noformat} [ERROR] src\test\java\org\apache\zookeeper\common\CertificatesToPlayWith.java:[564,90] (whitespace) OperatorWrap: '+' should be on a new line. [ERROR] src\test\java\org\apache\zookeeper\common\CertificatesToPlayWith.java:[565,62] (whitespace) OperatorWrap: '+' should be on a new line. [ERROR] src\test\java\org\apache\zookeeper\ZKUtilTest.java:[31,1] (imports) ImportOrder: Extra separation in import group before 'org.apache.zookeeper.data.Stat' [ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.1.1:check (default-cli) on project zookeeper: You have 438 Checkstyle violations. -> [Help 1] [ERROR] {noformat} > checkstyle:check is failing > > > Key: ZOOKEEPER-4516 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.7.1, 3.6.4 > > > checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
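For context on the OperatorWrap violations quoted above: checkstyle's OperatorWrap check, with its `nl` policy, requires the `+` operator to begin the continuation line rather than end the previous one. A small illustrative snippet (hypothetical code, not the actual test files):

```java
public class OperatorWrapDemo {
    public static void main(String[] args) {
        // Flagged by OperatorWrap (nl policy): '+' ends the first line.
        String flagged = "mvn clean install " +
                "checkstyle:check -DskipTests";

        // Compliant: '+' starts the continuation line.
        String compliant = "mvn clean install "
                + "checkstyle:check -DskipTests";

        // Both layouts build the exact same string.
        System.out.println(flagged.equals(compliant));
    }
}
```

Because the two forms are semantically identical, fixing these violations is a mechanical layout change, which is why the checkstyle failures could be cleaned up quickly on both branches.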
[jira] [Updated] (ZOOKEEPER-4515) ZK Cli quit command always logs error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4515: --- Fix Version/s: 3.7.1 3.6.4 3.9.0 3.8.1 > ZK Cli quit command always logs error > - > > Key: ZOOKEEPER-4515 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Attachments: image-2022-04-08-15-47-04-325.png, screenshot-1.png > > Time Spent: 10m > Remaining Estimate: 0h > > !image-2022-04-08-15-47-04-325.png! > * When connection is in closing state, this log warning is entirely useless, > change this log to debug. > * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4515) ZK Cli quit command always logs error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17519510#comment-17519510 ] Mohammad Arshad commented on ZOOKEEPER-4515: After fix: !screenshot-1.png! > ZK Cli quit command always logs error > - > > Key: ZOOKEEPER-4515 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: image-2022-04-08-15-47-04-325.png, screenshot-1.png > > > !image-2022-04-08-15-47-04-325.png! > * When connection is in closing state, this log warning is entirely useless, > change this log to debug. > * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4515) ZK Cli quit command always logs error
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4515: --- Attachment: screenshot-1.png > ZK Cli quit command always logs error > - > > Key: ZOOKEEPER-4515 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: image-2022-04-08-15-47-04-325.png, screenshot-1.png > > > !image-2022-04-08-15-47-04-325.png! > * When connection is in closing state, this log warning is entirely useless, > change this log to debug. > * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4516) checkstyle:check is failing
Mohammad Arshad created ZOOKEEPER-4516: -- Summary: checkstyle:check is failing Key: ZOOKEEPER-4516 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4516 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4 checkstyle:check is failing on branch-3.7 and branch-3.6 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4515) ZK Cli quit command always logs error
Mohammad Arshad created ZOOKEEPER-4515: -- Summary: ZK Cli quit command always logs error Key: ZOOKEEPER-4515 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4515 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-04-08-15-47-04-325.png !image-2022-04-08-15-47-04-325.png! * When connection is in closing state, this log warning is entirely useless, change this log to debug. * When JVM exiting with 0 log info instead of error -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Comment Edited] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17518799#comment-17518799 ] Mohammad Arshad edited comment on ZOOKEEPER-1875 at 4/7/22 11:15 AM: - I think, as proposed in the first patch, we should skip the watcher processing if either watcher or pair.event is null, but we should also add a warning so that in future we can better understand the scenarios the problem comes from. {code:java} if (watcher != null && pair.event != null) { watcher.process(pair.event); } else { LOG.warn( "Skipping watcher processing as watcher or pair.event" + " is null. watcher={}, pair.event={}", watcher == null ? "null" : watcher.getClass().getName(), pair.event == null ? "null" : pair.event); } {code} > NullPointerException in ClientCnxn$EventThread.processEvent > --- > > Key: ZOOKEEPER-1875 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1875 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.4.5, 3.4.10 >Reporter: Jerry He >Assignee: Jerry He >Priority: Minor > Attachments: ZOOKEEPER-1875-trunk.patch, ZOOKEEPER-1875.patch, > ZOOKEEPER-1875.patch > > > We've been seeing NullPointerException while working on HBase: > {code} > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Client > environment:user.dir=/home/biadmin/hbase-trunk > 14/01/30 22:15:25 INFO zookeeper.ZooKeeper: Initiating client connection, > connectString=hdtest009:2181 sessionTimeout=9 watcher=null > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Opening socket connection to > server hdtest009/9.30.194.18:2181. Will not attempt to authenticate using > SASL (Unable to locate a login configuration) > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Socket connection established to > hdtest009/9.30.194.18:2181, initiating session > 14/01/30 22:15:25 INFO zookeeper.ClientCnxn: Session establishment complete > on server hdtest009/9.30.194.18:2181, sessionid = 0x143986213e67e48, > negotiated timeout = 6 > 14/01/30 22:15:25 ERROR zookeeper.ClientCnxn: Error while calling watcher > java.lang.NullPointerException > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:519) > at > org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:495) > {code} > The reason is the watcher is null in this part of the code: > {code} >private void processEvent(Object event) { > try { > if (event instanceof WatcherSetEventPair) { > // each watcher will process the event > WatcherSetEventPair pair = (WatcherSetEventPair) event; > for (Watcher watcher : pair.watchers) { > try { > watcher.process(pair.event); > } catch (Throwable t) { > LOG.error("Error while calling watcher ", t); > } > } > {code} -- This message was sent by 
Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17518799#comment-17518799 ] Mohammad Arshad commented on ZOOKEEPER-1875: I think, as proposed in the first patch, we should skip the watcher processing if either watcher or pair.event is null, but we should also add a warning so that in future we can better understand the scenarios the problem comes from. {code:java} if (watcher != null && pair.event != null) { watcher.process(pair.event); } else { LOG.warn( "Skipping watcher processing as watcher or pair.event" + " is null. watcher={}, pair.event={}", watcher == null ? "null" : watcher.getClass().getName(), pair.event == null ? "null" : pair.event); } {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17518795#comment-17518795 ] Mohammad Arshad commented on ZOOKEEPER-1875: I had observed this issue in the past, but right now I don't have details of the exact cause. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-4510: -- Assignee: Mohammad Arshad > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Critical > Fix For: 3.7.1, 3.6.4 > > > On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is > failing with following errors. > {code:java} > [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check > (default-cli) on project zookeeper-assembly: > [ERROR] > [ERROR] One or more dependencies were identified with vulnerabilities that > have a CVSS score greater than or equal to '0.0': > [ERROR] > [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 > {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4514?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17518760#comment-17518760 ] Mohammad Arshad commented on ZOOKEEPER-4514: An NPE is thrown because the channel object is null in the ClientCnxnSocketNetty#sendPkt(Packet p, boolean doFlush) method. We should add a null check like the one already done in the sendPacket(ClientCnxn.Packet p) method. Since sendPacket delegates the call to sendPkt, it is better to move the null check into sendPkt to handle all scenarios. > ClientCnxnSocketNetty throwing NPE > -- > > Key: ZOOKEEPER-4514 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Attachments: image-2022-04-07-13-27-13-068.png > > > ClientCnxnSocketNetty throwing NPE. This mainly happens when any of the > server is in restarting state and client tries to connect. > !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
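The fix described in the comment above, moving the channel null check into sendPkt so that sendPacket inherits it, can be sketched as follows. This is an illustration only; the field and parameter types are simplified stand-ins for ClientCnxnSocketNetty's, not the actual patch:

```java
// Hedged sketch of the proposed fix: sendPkt guards against a null channel,
// so every caller (including sendPacket, which delegates to it) is covered.
// Names mirror ClientCnxnSocketNetty, but the types are simplified stand-ins.
public class SendPktSketch {
    Object channel; // stands in for the Netty Channel field

    // Returns true if the packet was handed to the channel.
    boolean sendPkt(Object packet, boolean doFlush) {
        if (channel == null) {
            // The server may be restarting; skip the send instead of throwing NPE.
            return false;
        }
        // The real code would do channel.write(...) / channel.writeAndFlush(...).
        return true;
    }

    boolean sendPacket(Object packet) {
        // No separate null check needed any more: sendPkt handles all scenarios.
        return sendPkt(packet, true);
    }
}
```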
[jira] [Created] (ZOOKEEPER-4514) ClientCnxnSocketNetty throwing NPE
Mohammad Arshad created ZOOKEEPER-4514: -- Summary: ClientCnxnSocketNetty throwing NPE Key: ZOOKEEPER-4514 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4514 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-04-07-13-27-13-068.png ClientCnxnSocketNetty is throwing an NPE. This mainly happens when one of the servers is restarting and a client tries to connect. !image-2022-04-07-13-27-13-068.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-1875) NullPointerException in ClientCnxn$EventThread.processEvent
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17518639#comment-17518639 ] Mohammad Arshad commented on ZOOKEEPER-1875: This issue is still applicable to all versions. Is anybody interested in raising a PR? -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517933#comment-17517933 ] Mohammad Arshad commented on ZOOKEEPER-4504: Thanks [~eolivelli] and [~symat] for the reviews. Merged to master, branch-3.8, branch-3.7 and branch-3.6 > ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality > > > Key: ZOOKEEPER-4504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Critical > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Time Spent: 50m > Remaining Estimate: 0h > > *Problem and Analysis:* > After integrating ZooKeeper 3.6.3 we observed deadlock in HDFS HA > functionality as shown in below thread dumps. > {code:java} > "main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x7f9c017f1000 > nid=0x101b waiting for monitor entry [0x7f9bda8a6000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603) > - waiting to lock <0xc17986c0> (a > org.apache.hadoop.ha.ActiveStandbyElector) > at > org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582) > {code} > {code:java} > "main" #1 prio=5 os_prio=0 tid=0x7f9c0006 nid=0xea3 waiting on > condition [0x7f9c06404000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xc1b383c8> (a > java.util.concurrent.Semaphore$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838) > at > 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306) > at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) > at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122) > at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64) > at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76) > at > org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386) > at > org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383) > at > org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103) > at > org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095) > at > org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383) > - locked <0xc17986c0> (a > org.apache.hadoop.ha.ActiveStandbyElector) > at > org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290) > at > org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227) > at > org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66) > at > org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186) > at > org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498) > at > org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182) > at > org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220) > {code} > 
org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is instance > synchronized and calls ZKUtil.deleteRecursive(zk, pathRoot) > ZKUtil.deleteRecursive is making async delete API call with MultiCallback as > its callback. As processWatchEvent is being processed, pathRoot or one of the child paths > must have set watcher for delete notification. When delete API is called, notification comes first to client then the actual > delete response. In this case both notification and delete response are processed through > callback and
[jira] [Resolved] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4504. Resolution: Fixed Issue resolved by pull request 1843 [https://github.com/apache/zookeeper/pull/1843] > ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality > > > Key: ZOOKEEPER-4504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Critical > Labels: pull-request-available > Fix For: 3.9.0, 3.7.1, 3.6.4, 3.8.1 > > Time Spent: 40m > Remaining Estimate: 0h
[jira] [Updated] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3652: --- Fix Version/s: 3.5.10 > Improper synchronization in ClientCnxn > -- > > Key: ZOOKEEPER-3652 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3652 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.6 >Reporter: Sylvain Wallez >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10, 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > ZOOKEEPER-2111 introduced {{synchronized(state)}} statements in > {{ClientCnxn}} and {{ClientCnxn.SendThread}} to coordinate insertion in > {{outgoingQueue}} and draining it when the client connection isn't alive. > There are several issues with this approach: > - the value of the {{state}} field is not stable, meaning we don't always > synchronize on the same object. > - the {{state}} field is an enum value, and enum values are global objects. So in an > application with several ZooKeeper clients connected to different servers, > this causes some contention between clients. > An easy fix is to change those {{synchronized(state)}} statements to > {{synchronized(outgoingQueue)}} since it is local to each client and is what > we want to coordinate. > I'll be happy to prepare a PR with the above change if this is deemed to be > the correct way to fix it. > > Another issue that makes contention worse is > {{ClientCnxnSocketNIO.cleanup()}} that is called from within the above > synchronized block and contains {{Thread.sleep(100)}}. Why is this sleep > statement needed, and can we remove it? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
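The contention point described in the report, that enum constants are JVM-global so `synchronized(state)` can make unrelated clients share one monitor while a per-instance `outgoingQueue` cannot, can be demonstrated with a small sketch. The class and field names below are illustrative, not ZooKeeper's actual classes:

```java
// Sketch of why synchronized(state) is problematic: an enum constant is a
// single JVM-wide object, so two independent clients holding the same States
// value would lock the very same monitor. A per-instance object such as
// outgoingQueue (the fix the report suggests) keeps the clients independent.
import java.util.LinkedList;
import java.util.Queue;

public class LockScopeSketch {
    enum States { CONNECTED }

    final States state = States.CONNECTED;                  // shared across all instances
    final Queue<String> outgoingQueue = new LinkedList<>(); // unique per instance

    public static void main(String[] args) {
        LockScopeSketch a = new LockScopeSketch();
        LockScopeSketch b = new LockScopeSketch();
        // Both "clients" would synchronize on the SAME object:
        System.out.println(a.state == b.state);
        // Per-instance queues are DISTINCT lock objects:
        System.out.println(a.outgoingQueue == b.outgoingQueue);
    }
}
```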
[jira] [Commented] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517849#comment-17517849 ] Mohammad Arshad commented on ZOOKEEPER-4510: Thanks [~c...@qos.ch] for the good suggestion. I reported the false-positive issue: https://github.com/jeremylong/DependencyCheck/issues/4316 > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Critical > Fix For: 3.7.1, 3.6.4 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517660#comment-17517660 ] Mohammad Arshad commented on ZOOKEEPER-4510: You are right, I can see both the CVEs are marked as fixed: https://github.com/qos-ch/reload4j/issues/21 https://github.com/qos-ch/reload4j/commit/64902fe18ce5a5dd40487051a2f6231d9fbbe9b0 But I don't know why these CVEs are still reported by the dependency check. I think we have to suppress these CVEs to pass the dependency check. > dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, > CVE-2022-23307 > --- > > Key: ZOOKEEPER-4510 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Critical > Fix For: 3.7.1, 3.6.4 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4510) dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307
Mohammad Arshad created ZOOKEEPER-4510: -- Summary: dependency-check:check failing - reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 Key: ZOOKEEPER-4510 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4510 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Fix For: 3.7.1, 3.6.4 On branch-3.7 "mvn clean package -DskipTests dependency-check:check" is failing with the following errors. {code:java} [ERROR] Failed to execute goal org.owasp:dependency-check-maven:6.5.3:check (default-cli) on project zookeeper-assembly: [ERROR] [ERROR] One or more dependencies were identified with vulnerabilities that have a CVSS score greater than or equal to '0.0': [ERROR] [ERROR] reload4j-1.2.19.jar: CVE-2020-9493, CVE-2022-23307 {code} -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17517305#comment-17517305 ] Mohammad Arshad commented on ZOOKEEPER-4504: Hi [~ctubbsii], I have added a few more details in the description, please have a look. bq. It seems like this problem is either caused by a poorly written callback that synchronizes in a way it shouldn't. I observed this issue when I upgraded from ZooKeeper 3.5.6 to 3.6.3 in my cluster; the HDFS code remained the same before and after the upgrade. So, at a high level, it is not an issue of a poorly written callback; it is an issue of the changed behavior of the ZKUtil.deleteRecursive API. Also, please have a look at the proposed change in the PR. It may be helpful in understanding the context. > ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality > > > Key: ZOOKEEPER-4504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Critical > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Time Spent: 10m > Remaining Estimate: 0h
[jira] [Updated] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4504: --- Description: *Problem and Analysis:* After integrating ZooKeeper 3.6.3 we observed deadlock in HDFS HA functionality as shown in below thread dumps. {code:java} "main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x7f9c017f1000 nid=0x101b waiting for monitor entry [0x7f9bda8a6000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603) - waiting to lock <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector) at org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582) {code} {code:java} "main" #1 prio=5 os_prio=0 tid=0x7f9c0006 nid=0xea3 waiting on condition [0x7f9c06404000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xc1b383c8> (a java.util.concurrent.Semaphore$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306) at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122) at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64) at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76) at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386) at 
org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095) at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383) - locked <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector) at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290) at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227) at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498) at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220) {code} org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is instance synchronized and calls ZKUtil.deleteRecursive(zk, pathRoot). ZKUtil.deleteRecursive makes an async delete API call with a MultiCallback as its callback. Since processWatchEvent is being invoked, a watcher must have been set on pathRoot or one of its child paths for the delete notification. When the delete API is called, the notification reaches the client before the actual delete response. In this case both the notification and the delete response are processed through callbacks, one by one via the common waitingEvents queue.
First the notification is processed, but it cannot complete because it cannot acquire the lock on the processWatchEvent() method; the lock is already held by another thread that is calling clearParentZNode(). As the delete notification cannot be processed, the MultiCallback is never taken from the queue for processing. It stays in the queue forever. *Why was this problem not happening with earlier versions (3.5.x)?* In earlier ZK versions, ZKUtil.deleteRecursive used the sync delete API, so the delete response was processed directly, not through the callback. So both clearParentZNode and processWatchEvent completed independently. *Proposed Fix:* There are two approaches to fix this problem. 1. We can fix the problem in HDFS, modify the HDFS code to avoid
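The queuing interaction described above can be reduced to a small, self-contained simulation. This is not ZooKeeper or HDFS code: the monitor stands in for the ActiveStandbyElector instance lock, the semaphore for the one ZKUtil.deleteRecursive waits on, and the single-threaded executor for the client's event thread. Bounding the wait with a timeout lets the demo terminate where the real code blocks forever.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class DeadlockShape {

    // Returns true only if the "delete callback" managed to run.
    static boolean simulate() {
        final Object monitor = new Object();       // plays the ActiveStandbyElector lock
        final Semaphore deleteDone = new Semaphore(0); // plays ZKUtil's wait semaphore
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        boolean completed;
        synchronized (monitor) {                   // clearParentZNode(): holds the instance lock
            // The watch notification is queued first and blocks on the monitor we hold...
            eventThread.submit(() -> { synchronized (monitor) { /* processWatchEvent() */ } });
            // ...so the delete callback queued behind it can never run:
            eventThread.submit(deleteDone::release);
            try {
                // deleteRecursive() would wait here forever; the demo bounds it with a timeout
                completed = deleteDone.tryAcquire(500, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                completed = false;
            }
        }
        eventThread.shutdownNow();
        return completed;
    }

    public static void main(String[] args) {
        System.out.println(simulate() ? "no deadlock" : "deadlocked: callback never ran");
    }
}
```

Running this takes the "deadlocked" branch: the single event thread is stuck in the first task waiting for the monitor, so the release queued behind it never executes until the main thread gives up.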
[jira] [Resolved] (ZOOKEEPER-4505) CVE-2020-36518 - Upgrade jackson databind to 2.13.2.1
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4505?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4505. Fix Version/s: 3.9.0 3.7.1 3.6.4 3.8.1 Resolution: Fixed Issue resolved by pull request 1846 [https://github.com/apache/zookeeper/pull/1846] > CVE-2020-36518 - Upgrade jackson databind to 2.13.2.1 > - > > Key: ZOOKEEPER-4505 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4505 > Project: ZooKeeper > Issue Type: Bug >Reporter: Edwin Hobor >Priority: Major > Labels: pull-request-available, security > Fix For: 3.9.0, 3.7.1, 3.6.4, 3.8.1 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > *CVE-2020-36518* vulnerability affects jackson-databind in Zookeeper (see > [https://github.com/advisories/GHSA-57j2-w4cx-62h2]). > Upgrading to jackson-databind version *2.13.2.1* should address this issue. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514878#comment-17514878 ] Mohammad Arshad commented on ZOOKEEPER-3652: Merged. Thanks [~sylvain] for the contribution. > Improper synchronization in ClientCnxn > -- > > Key: ZOOKEEPER-3652 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3652 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.6 >Reporter: Sylvain Wallez >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > Time Spent: 1.5h > Remaining Estimate: 0h > > ZOOKEEPER-2111 introduced {{synchronized(state)}} statements in > {{ClientCnxn}} and {{ClientCnxn.SendThread}} to coordinate insertion in > {{outgoingQueue}} and draining it when the client connection isn't alive. > There are several issues with this approach: > - the value of the {{state}} field is not stable, meaning we don't always > synchronize on the same object. > - the {{state}} field is an enum value, which are global objects. So in an > application with several ZooKeeper clients connected to different servers, > this causes some contention between clients. > An easy fix is change those {{synchronized(state)}} statements to > {{synchronized(outgoingQueue)}} since it is local to each client and is what > we want to coordinate. > I'll be happy to prepare a PR with the above change if this is deemed to be > the correct way to fix it. > > Another issue that makes contention worse is > {{ClientCnxnSocketNIO.cleanup()}} that is called from within the above > synchronized block and contains {{Thread.sleep(100)}}. Why is this sleep > statement needed, and can we remove it? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
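The two bullet points in the report can be demonstrated in a few lines. FakeClient below is purely illustrative (not the real ClientCnxn): it only shows that enum constants are JVM-global, so two independent clients synchronizing on the same enum value would share one monitor, while a per-instance object such as the outgoingQueue gives each client its own lock.

```java
public class EnumLockDemo {
    enum State { CONNECTING, CONNECTED, CLOSED }

    // Illustrative stand-in for a client, not the real ClientCnxn.
    static class FakeClient {
        volatile State state = State.CONNECTING;   // enum constant: shared across the JVM
        final Object outgoingQueue = new Object(); // per-instance: no cross-client contention
    }

    public static void main(String[] args) {
        FakeClient a = new FakeClient();
        FakeClient b = new FakeClient();
        // Same enum constant -> synchronized(a.state) and synchronized(b.state)
        // would contend on one global monitor:
        System.out.println(a.state == b.state);
        // Distinct per-instance queue objects -> independent locks:
        System.out.println(a.outgoingQueue == b.outgoingQueue);
    }
}
```

This is exactly why switching the statements to synchronized(outgoingQueue) removes the contention between unrelated clients.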
[jira] [Resolved] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-3652. Resolution: Fixed Issue resolved by pull request 1257 [https://github.com/apache/zookeeper/pull/1257] -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4388) Recover from network partition, follower/observer ephemerals nodes is inconsistent with leader
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4388: --- Fix Version/s: 3.7.1 (was: 3.7) > Recover from network partition, follower/observer ephemerals nodes is > inconsistent with leader > -- > > Key: ZOOKEEPER-4388 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4388 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.6.0, 3.6.3, 3.6.1, 3.6.2 > Environment: zk version 3.6.0 3.6.1 3.6.2 > the follower/observer network disconnection time exceeds session timeout >Reporter: shixiaoxiao >Priority: Major > Labels: inconsistency, partitoned, zookeeper > Fix For: 3.7.1 > > Attachments: dataInconsistent.png > > > The follower/observer has read-only mode enabled. When the node recovers from the partition, its ephemeral nodes become inconsistent with the leader's. The reason is that the requests to close the timed-out sessions are processed by the read-only follower or observer while partitioned, and the ephemeral nodes created by those sessions are deleted as well. When the leader later uses a diff to synchronize data with the follower/observer, the transactions to be synchronized do not include the creation of the ephemeral nodes belonging to the sessions the follower already closed. So the follower/observer ephemeral nodes are inconsistent with the leader's. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4287) Upgrade prometheus client library version to 0.10.0
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4287: --- Fix Version/s: 3.7.1 (was: 3.7) > Upgrade prometheus client library version to 0.10.0 > --- > > Key: ZOOKEEPER-4287 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4287 > Project: ZooKeeper > Issue Type: Improvement > Components: build >Affects Versions: 3.7.0, 3.8.0 >Reporter: Li Wang >Priority: Major > Labels: pull-request-available > Fix For: 3.7.1 > > Time Spent: 0.5h > Remaining Estimate: 0h > > Upgrade the client library version to the latest to help investigating the > Prometheus impact issue. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3652: --- Fix Version/s: 3.7.1 3.9.0 3.8.1 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3652: --- Fix Version/s: 3.6.4 (was: 3.6.4,3.7.1,3.8.0,3.9.0) -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514212#comment-17514212 ] Mohammad Arshad commented on ZOOKEEPER-3652: Somehow the wrong version number "3.6.4,3.7.1,3.8.0,3.9.0" got created in the Jira system. I will delete this version number. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514204#comment-17514204 ] Mohammad Arshad commented on ZOOKEEPER-4504: The analysis has been put into the issue description, along with the relevant HDFS code (org.apache.hadoop.ha.ActiveStandbyElector).
[jira] [Updated] (ZOOKEEPER-4289) Reduce the performance impact of Prometheus metrics
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4289: --- Fix Version/s: 3.8.1 (was: 3.8.0) > Reduce the performance impact of Prometheus metrics > --- > > Key: ZOOKEEPER-4289 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4289 > Project: ZooKeeper > Issue Type: Improvement > Components: metric system >Affects Versions: 3.6.3, 3.7.0, 3.6.2, 3.8.0, 3.7.1 >Reporter: Li Wang >Priority: Major > Labels: pull-request-available > Fix For: 3.8.1 > > Time Spent: 3h > Remaining Estimate: 0h > > I have done load comparison tests for Prometheus enabled vs disabled and > found the performance is reduced about 40%-60% for both read and write > operations (i.e. getData, getChildren and createNode). > Looked more into this and found that Prometheus Summary is very expensive and > there are more than 20 Summary metrics added to the new CommitProcessor. > Need a solution to reduce the impact of Prometheus metrics. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514194#comment-17514194 ] Mohammad Arshad commented on ZOOKEEPER-4504: There is no need to create a new sync or async API. There was already an attempt in ZOOKEEPER-3763 to make the deleteRecursive API compatible with older versions, but that patch missed changing the delete API invocation from async to sync. In this jira we can handle that part and make the deleteRecursive API fully compatible with older versions. bq. It's not clear from the discussion above what the actual cause was. Please refer to the analysis along with the HDFS code; hope that will give better clarity on the root cause.
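As a sketch of what "changing the delete API invocation from async to sync" could look like, the snippet below lists the subtree first and then issues blocking deletes leaf-first, so no MultiCallback ever has to run on the event thread. The SyncClient interface is a hypothetical stand-in for the ZooKeeper handle, not the real API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for the ZooKeeper handle; only the two calls the
// sketch needs, both blocking.
interface SyncClient {
    List<String> getChildren(String path);
    void delete(String path);
}

public class SyncRecursiveDelete {
    // Collect the subtree in BFS order, then delete in reverse so every node's
    // children are removed before the node itself.
    public static void deleteRecursive(SyncClient zk, String root) {
        List<String> tree = new ArrayList<>();
        Deque<String> queue = new ArrayDeque<>(List.of(root));
        while (!queue.isEmpty()) {
            String path = queue.poll();
            tree.add(path);
            for (String child : zk.getChildren(path)) {
                queue.add(path + "/" + child);
            }
        }
        for (int i = tree.size() - 1; i >= 0; i--) {
            zk.delete(tree.get(i)); // blocking call: completes before returning
        }
    }
}
```

Because every delete completes before deleteRecursive returns, the caller can hold its own locks (as clearParentZNode does) without waiting on the event thread.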
[jira] [Created] (ZOOKEEPER-4507) Create ZOO_DAEMON_OUT file backup when restarting the server
Mohammad Arshad created ZOOKEEPER-4507: -- Summary: Create ZOO_DAEMON_OUT file backup when restarting the server Key: ZOOKEEPER-4507 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4507 Project: ZooKeeper Issue Type: Improvement Components: scripts Reporter: Mohammad Arshad Assignee: Mohammad Arshad Attachments: image-2022-03-29-20-33-57-181.png The ZooKeeper server daemon out file zookeeper-$USER-server-$HOSTNAME.out is overwritten on every server restart. Like the other log files, we should create a backup of this file as well. The information logged in this file is often useful in issue analysis; for example, it records which transaction and snapshot files were deleted. This is useful information that we should retain for some time. Maybe by default we can back up 5 .out files, as below !image-2022-03-29-20-33-57-181.png! -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (ZOOKEEPER-4506) Change Server default appender from CONSOLE to ROLLINGFILE
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4506: --- Summary: Change Server default appender from CONSOLE to ROLLINGFILE (was: Change Server default log4j appender from CONSOLE to ROLLINGFILE) > Change Server default appender from CONSOLE to ROLLINGFILE > -- > > Key: ZOOKEEPER-4506 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4506 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > > The server by default logs to CONSOLE, and the contents are redirected to the zookeeper-$USER-server-$HOSTNAME.out file. This file is overwritten on every server restart, its size keeps growing, and it is not rolled when it gets large. > I think the default logging appender should be ROLLINGFILE instead of CONSOLE. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4506) Change Server default log4j appender from CONSOLE to ROLLINGFILE
Mohammad Arshad created ZOOKEEPER-4506: -- Summary: Change Server default log4j appender from CONSOLE to ROLLINGFILE Key: ZOOKEEPER-4506 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4506 Project: ZooKeeper Issue Type: Improvement Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 The server by default logs to CONSOLE, and the contents are redirected to the zookeeper-$USER-server-$HOSTNAME.out file. This file is overwritten on every server restart, its size keeps growing, and it is not rolled when it gets large. I think the default logging appender should be ROLLINGFILE instead of CONSOLE. -- This message was sent by Atlassian Jira (v8.20.1#820001)
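For illustration, a log4j 1.x configuration of the kind the issue proposes might look like the following; the file name, size limit, and backup count are illustrative choices, not values taken from the issue.

```properties
# Sketch of a ROLLINGFILE default appender (log4j 1.x property names;
# zookeeper.log.dir is assumed to be set by the launch scripts)
zookeeper.root.logger=INFO, ROLLINGFILE
log4j.rootLogger=${zookeeper.root.logger}

log4j.appender.ROLLINGFILE=org.apache.log4j.RollingFileAppender
log4j.appender.ROLLINGFILE.File=${zookeeper.log.dir}/zookeeper.log
# Roll when the file reaches the size limit, keeping a bounded number of backups
log4j.appender.ROLLINGFILE.MaxFileSize=10MB
log4j.appender.ROLLINGFILE.MaxBackupIndex=10
log4j.appender.ROLLINGFILE.layout=org.apache.log4j.PatternLayout
log4j.appender.ROLLINGFILE.layout.ConversionPattern=%d{ISO8601} [myid:%X{myid}] - %-5p [%t:%C{1}@%L] - %m%n
```

Unlike the redirected CONSOLE output, this rolls the file at a size limit and keeps backups across restarts.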
[jira] [Commented] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17514033#comment-17514033 ] Mohammad Arshad commented on ZOOKEEPER-4504: yes, I am sure problem is not because of low value of rateLimit. If it was because of low concurrency, call would have taken bit longer but not stuck forever. Also the call was not deleting a lot of znode, so there was no need to submit multiple batches, one batch was sufficient. > ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality > > > Key: ZOOKEEPER-4504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Critical > Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 > > > *Problem and Analysis:* > After integrating ZooKeeper 3.6.3 we observed deadlock in HDFS HA > functionality as shown in below thread dumps. > {code:java} > "main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x7f9c017f1000 > nid=0x101b waiting for monitor entry [0x7f9bda8a6000] >java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603) > - waiting to lock <0xc17986c0> (a > org.apache.hadoop.ha.ActiveStandbyElector) > at > org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193) > at > org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626) > at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582) > {code} > {code:java} > "main" #1 prio=5 os_prio=0 tid=0x7f9c0006 nid=0xea3 waiting on > condition [0x7f9c06404000] >java.lang.Thread.State: WAITING (parking) > at sun.misc.Unsafe.park(Native Method) > - parking to wait for <0xc1b383c8> (a > java.util.concurrent.Semaphore$NonfairSync) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) > at > 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306) > at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) > at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122) > at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64) > at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76) > at > org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386) > at > org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383) > at > org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103) > at > org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095) > at > org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383) > - locked <0xc17986c0> (a > org.apache.hadoop.ha.ActiveStandbyElector) > at > org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290) > at > org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227) > at > org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66) > at > org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186) > at > org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:360) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741) > at > org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498) > at > org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182) > at > 
org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220) > {code} > org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is instance > synchronized and calls ZKUtil.deleteRecursive(zk, pathRoot). > ZKUtil.deleteRecursive is an async API call, and in its callback it invokes > ActiveStandbyElector#processWatchEvent, which is synchronized on the > ActiveStandbyElector instance. > So there is a deadlock: clearParentZNode() is waiting for processWatchEvent() to > complete, and processWatchEvent() is waiting for clearParentZNode() to complete.
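The lock cycle in the dumps above can be reduced to a minimal sketch. The class and method names below are borrowed from ActiveStandbyElector for readability only; this is illustrative code, not the actual Hadoop or ZooKeeper implementation. A thread holding the instance monitor parks on a semaphore permit that only an event-thread callback can release, and that callback needs the same monitor.

```java
import java.util.concurrent.Semaphore;

// Minimal illustration of the ZOOKEEPER-4504 lock cycle (names are
// illustrative, not the real implementation).
public class DeadlockSketch {
    private final Semaphore batchPermits = new Semaphore(0);

    // Caller thread: holds the instance monitor (like clearParentZNode),
    // then blocks until the async-delete callback releases a permit.
    // Because the callback needs the same monitor, this parks forever.
    public synchronized void clearParentZNode() throws InterruptedException {
        // (the async batched delete would be submitted here)
        batchPermits.acquire();
    }

    // Event thread: the callback (like processWatchEvent) also synchronizes
    // on "this", so it can never reach release() while clearParentZNode()
    // is parked inside the monitor.
    public synchronized void processWatchEvent() {
        batchPermits.release();
    }

    public int availablePermits() {
        return batchPermits.availablePermits();
    }
}
```

With pre-3.6 ZooKeeper the delete was synchronous, so no callback competed for the monitor and the cycle never formed.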
[jira] [Updated] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3652: --- Fix Version/s: 3.6.4, 3.7.1, 3.8.0, 3.9.0 > Improper synchronization in ClientCnxn > -- > > Key: ZOOKEEPER-3652 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3652 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.6 >Reporter: Sylvain Wallez >Priority: Major > Labels: pull-request-available > Fix For: 3.6.4, 3.7.1, 3.8.0, 3.9.0 > > Time Spent: 1h > Remaining Estimate: 0h > > ZOOKEEPER-2111 introduced {{synchronized(state)}} statements in > {{ClientCnxn}} and {{ClientCnxn.SendThread}} to coordinate insertion into > {{outgoingQueue}} and draining it when the client connection isn't alive. > There are several issues with this approach: > - the value of the {{state}} field is not stable, meaning we don't always > synchronize on the same object. > - the {{state}} field is an enum value, and enum constants are global objects. So in an > application with several ZooKeeper clients connected to different servers, > this causes some contention between clients. > An easy fix is to change those {{synchronized(state)}} statements to > {{synchronized(outgoingQueue)}}, since it is local to each client and is what > we want to coordinate. > I'll be happy to prepare a PR with the above change if this is deemed to be > the correct way to fix it. > > Another issue that makes contention worse is > {{ClientCnxnSocketNIO.cleanup()}}, which is called from within the above > synchronized block and contains {{Thread.sleep(100)}}. Why is this sleep > statement needed, and can we remove it? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
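The second bullet in the report can be demonstrated in a few lines. This is a sketch, not the actual ClientCnxn code: because enum constants are per-JVM singletons, every client instance whose state field holds the same constant locks the very same object, while the per-client outgoingQueue is a safe, local monitor.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the ZOOKEEPER-3652 monitor choice (not the real ClientCnxn).
public class EnumLockSketch {
    enum States { CONNECTING, CONNECTED, CLOSED }

    private volatile States state = States.CONNECTED;
    private final Deque<String> outgoingQueue = new ArrayDeque<>();

    // Problematic: the monitor is a global enum constant shared by every
    // client in this JVM whose state happens to be CONNECTED, and it is a
    // *different* object whenever 'state' changes.
    public void enqueueGlobalLock(String packet) {
        synchronized (state) {
            outgoingQueue.add(packet);
        }
    }

    // Fix suggested in the report: lock the per-client queue instead.
    public void enqueueLocalLock(String packet) {
        synchronized (outgoingQueue) {
            outgoingQueue.add(packet);
        }
    }

    public int queued() {
        return outgoingQueue.size();
    }
}
```

Two unrelated `EnumLockSketch` instances contend on `States.CONNECTED` in the first variant, but never on their own `outgoingQueue` objects in the second.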
[jira] [Commented] (ZOOKEEPER-3652) Improper synchronization in ClientCnxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17513978#comment-17513978 ] Mohammad Arshad commented on ZOOKEEPER-3652: Good finding [~sylvain]. I will review and merge your PR. > Improper synchronization in ClientCnxn > -- > > Key: ZOOKEEPER-3652 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3652 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.6 >Reporter: Sylvain Wallez >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > ZOOKEEPER-2111 introduced {{synchronized(state)}} statements in > {{ClientCnxn}} and {{ClientCnxn.SendThread}} to coordinate insertion into > {{outgoingQueue}} and draining it when the client connection isn't alive. > There are several issues with this approach: > - the value of the {{state}} field is not stable, meaning we don't always > synchronize on the same object. > - the {{state}} field is an enum value, and enum constants are global objects. So in an > application with several ZooKeeper clients connected to different servers, > this causes some contention between clients. > An easy fix is to change those {{synchronized(state)}} statements to > {{synchronized(outgoingQueue)}}, since it is local to each client and is what > we want to coordinate. > I'll be happy to prepare a PR with the above change if this is deemed to be > the correct way to fix it. > > Another issue that makes contention worse is > {{ClientCnxnSocketNIO.cleanup()}}, which is called from within the above > synchronized block and contains {{Thread.sleep(100)}}. Why is this sleep > statement needed, and can we remove it? > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (ZOOKEEPER-4504) ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality
Mohammad Arshad created ZOOKEEPER-4504: -- Summary: ZKUtil#deleteRecursive causing deadlock in HDFS HA functionality Key: ZOOKEEPER-4504 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4504 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.7.1, 3.6.4, 3.9.0, 3.8.1 *Problem and Analysis:* After integrating ZooKeeper 3.6.3 we observed a deadlock in HDFS HA functionality, as shown in the thread dumps below. {code:java} "main-EventThread" #33 daemon prio=5 os_prio=0 tid=0x7f9c017f1000 nid=0x101b waiting for monitor entry [0x7f9bda8a6000] java.lang.Thread.State: BLOCKED (on object monitor) at org.apache.hadoop.ha.ActiveStandbyElector.processWatchEvent(ActiveStandbyElector.java:603) - waiting to lock <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector) at org.apache.hadoop.ha.ActiveStandbyElector$WatcherWithClientRef.process(ActiveStandbyElector.java:1193) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:626) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:582) {code} {code:java} "main" #1 prio=5 os_prio=0 tid=0x7f9c0006 nid=0xea3 waiting on condition [0x7f9c06404000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0xc1b383c8> (a java.util.concurrent.Semaphore$NonfairSync) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175) at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838) at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:999) at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1306) at java.util.concurrent.Semaphore.acquire(Semaphore.java:467) at org.apache.zookeeper.ZKUtil.deleteInBatch(ZKUtil.java:122) at org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:64) at 
org.apache.zookeeper.ZKUtil.deleteRecursive(ZKUtil.java:76) at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:386) at org.apache.hadoop.ha.ActiveStandbyElector$1.run(ActiveStandbyElector.java:383) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1103) at org.apache.hadoop.ha.ActiveStandbyElector.zkDoWithRetries(ActiveStandbyElector.java:1095) at org.apache.hadoop.ha.ActiveStandbyElector.clearParentZNode(ActiveStandbyElector.java:383) - locked <0xc17986c0> (a org.apache.hadoop.ha.ActiveStandbyElector) at org.apache.hadoop.ha.ZKFailoverController.formatZK(ZKFailoverController.java:290) at org.apache.hadoop.ha.ZKFailoverController.doRun(ZKFailoverController.java:227) at org.apache.hadoop.ha.ZKFailoverController.access$000(ZKFailoverController.java:66) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:186) at org.apache.hadoop.ha.ZKFailoverController$1.run(ZKFailoverController.java:182) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:360) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1741) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:498) at org.apache.hadoop.ha.ZKFailoverController.run(ZKFailoverController.java:182) at org.apache.hadoop.hdfs.tools.DFSZKFailoverController.main(DFSZKFailoverController.java:220) {code} org.apache.hadoop.ha.ActiveStandbyElector#clearParentZNode is instance synchronized and calls ZKUtil.deleteRecursive(zk, pathRoot) ZKUtil.deleteRecursive is async API call and in callback it is invoking ActiveStandbyElector#processWatchEvent which is synchronized on ActiveStandbyElector instance. 
So there is a deadlock: clearParentZNode() is waiting for processWatchEvent() to complete, and processWatchEvent() is waiting for clearParentZNode() to complete. *Why was this problem not happening with earlier versions (3.5.x)?* In earlier zk versions, ZKUtil.deleteRecursive used the sync zk API internally, so no callback (processWatchEvent) came into the scenario. *Proposed Fix:* There are two approaches to fix this problem. 1. Fix the problem in HDFS: modify the HDFS code to avoid the deadlock. But we may get similar bugs in other projects. 2. Fix the problem in ZK: make the API behavior the same as the old behavior (use the sync API to delete the ZK node) and provide a new overloaded API with the new behavior (use the async API to delete the ZK node). I propose to fix the
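The second approach could look roughly like this. The signatures are hypothetical and `Zk` is a stripped-down stand-in for the client, not the final ZooKeeper patch: the plain overload keeps the pre-3.6 synchronous semantics (blocking deletes, children before parents, no callback re-entry), while the batched asynchronous behavior moves behind an explicit opt-in overload.

```java
import java.util.List;

// Sketch of the proposed API split for ZOOKEEPER-4504 (hypothetical
// signatures; Zk is a minimal stand-in for the ZooKeeper client).
public class RecursiveDeleteSketch {
    public interface Zk {
        List<String> listSubTreeBFS(String path); // parents before children
        void delete(String path);
    }

    // Old, safe behavior: synchronous deletes, children removed before
    // their parents, so no callback can re-enter caller-held monitors.
    public static void deleteRecursive(Zk zk, String root) {
        List<String> tree = zk.listSubTreeBFS(root);
        for (int i = tree.size() - 1; i >= 0; i--) {
            zk.delete(tree.get(i));
        }
    }

    // New, opt-in behavior: batched async deletes (body elided in this
    // sketch; callers must be prepared for callback-driven completion).
    public static boolean deleteRecursive(Zk zk, String root, int batchSize) {
        throw new UnsupportedOperationException("async batched variant");
    }
}
```

Callers like ActiveStandbyElector that hold a lock across the call would keep using the plain overload and never see a callback.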
[jira] [Resolved] (ZOOKEEPER-4434) Backport ZOOKEEPER-3142 for branch-3.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4434. Fix Version/s: 3.5.10 Resolution: Fixed Issue resolved by pull request 1791 [https://github.com/apache/zookeeper/pull/1791] > Backport ZOOKEEPER-3142 for branch-3.5 > -- > > Key: ZOOKEEPER-4434 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4434 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.5.9 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > h5. Extend SnapshotFormatter to dump data in json format. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (ZOOKEEPER-4433) Backport ZOOKEEPER-2872 for branch-3.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17488971#comment-17488971 ] Mohammad Arshad commented on ZOOKEEPER-4433: Thanks [~ananysin] for submitting the PR. Added you as ZK contributor > Backport ZOOKEEPER-2872 for branch-3.5 > -- > > Key: ZOOKEEPER-4433 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4433 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.9 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Assigned] (ZOOKEEPER-4433) Backport ZOOKEEPER-2872 for branch-3.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-4433: -- Assignee: Ananya Singh > Backport ZOOKEEPER-2872 for branch-3.5 > -- > > Key: ZOOKEEPER-4433 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4433 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.9 >Reporter: Ananya Singh >Assignee: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (ZOOKEEPER-4433) Backport ZOOKEEPER-2872 for branch-3.5
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4433. Fix Version/s: 3.5.10 Resolution: Fixed Issue resolved by pull request 1790 [https://github.com/apache/zookeeper/pull/1790] > Backport ZOOKEEPER-2872 for branch-3.5 > -- > > Key: ZOOKEEPER-4433 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4433 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.9 >Reporter: Ananya Singh >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (ZOOKEEPER-4385) Backport ZOOKEEPER-4278 to branch-3.5 to Address CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4385. Fix Version/s: 3.5.10 Resolution: Fixed Issue resolved by pull request 1762 [https://github.com/apache/zookeeper/pull/1762] > Backport ZOOKEEPER-4278 to branch-3.5 to Address CVE-2021-21409 > --- > > Key: ZOOKEEPER-4385 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4385 > Project: ZooKeeper > Issue Type: Bug >Reporter: Brahma Reddy Battula >Assignee: Brahma Reddy Battula >Priority: Major > Labels: pull-request-available > Fix For: 3.5.10 > > Time Spent: 20m > Remaining Estimate: 0h > > Backport ZOOKEEPER-4278 to branch-3.5 to address CVE-2021-21409 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17419023#comment-17419023 ] Mohammad Arshad commented on ZOOKEEPER-4278: Thanks [~brahmareddy] for creating new bug ZOOKEEPER-4385 to backport to branch-3.5. Pls raise PR as well. > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Ayush Mantri >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.3, 3.8.0, 3.7.1 > > Time Spent: 50m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-4282) Redesign quota feature
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17397435#comment-17397435 ] Mohammad Arshad commented on ZOOKEEPER-4282: [~ztzg] Thanks for sharing the scary story. It supports the need to protect the internal ZooKeeper data structures from outside modification. bq. I would suggest opening another ticket, and creating PRs preventing the server crash for 3.5 and 3.6. WDYT? Should I take care of it? I think preventing the server crash after wrong data has been set in the quota znode will work, but it is better to prevent the cause itself: we should not allow wrong data to be set in quota znodes. But the changes I am proposing will be big and may not go into branch-3.6 and branch-3.5, so I am OK with any workaround solution on those branches. If you are interested, please go ahead. (y) > Redesign quota feature > -- > > Key: ZOOKEEPER-4282 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4282 > Project: ZooKeeper > Issue Type: New Feature > Components: quota >Reporter: Mohammad Arshad >Assignee: Mohammad Arshad >Priority: Major > Fix For: 3.8.0 > > > *Quota Use Case:* > Generally, in a big data solution deployment, multiple services (hdfs, yarn, > hbase etc.) use a single Zookeeper cluster, so it is very important to ensure > fair usage by all services. Sometimes services unintentionally, mainly because > of faulty behavior, create many znodes and impact the overall reliability of > the ZooKeeper service. To ensure fair usage, the quota feature is required. > But this is not the only use case; there are many other use cases for the quota > feature. > *Current Problems:* > # Currently, a user can set quota by updating the znode > “/zookeeper/quota/nodepath”, or using setquota/delquota in the CLI. > This makes the quota setting ineffective. > Currently any user can set/delete quota, which is not proper; it should be an > admin operation. > # A user is allowed to modify zookeeper system paths like /zookeeper/quota. 
> These are internal to zookeeper and should not be modifiable from outside. > # Generally services create a single top-level znode in Zookeeper, like /hbase, > and create all required znodes under it. > It is better if it is configurable who can create top-level znodes, to > control ZooKeeper usage. > # After ZOOKEEPER-231, there are two kinds of quota enforcement limits: 1. Hard limit > 2. Soft limit. > I think there should be only one limit. When enforce quota is enabled, that limit > becomes the hard limit; otherwise it is a soft limit, same as the old feature, which just > logs warnings. > *Proposed Solution* > # Add setQuota and deleteQuota admin APIs. Add a listQuota normal user API. > Modify the quota CLI commands to use these APIs instead of directly modifying > the ZooKeeper system path /zookeeper/quota/ > # Protect ZooKeeper system paths from outside modification. The system paths should > only be readable from outside. > # Expose configuration to set the ACL for the root system znode. > After this, at ZooKeeper service deployment time the administrator can > create a top-level znode for a service and set its quota. This way we can control > overall ZooKeeper usage. > # Revert some of the changes in ZOOKEEPER-231 and move to a single quota limit -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-4345) Avoid NoSuchMethodException caused by shaded zookeeper jar
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4345: --- Summary: Avoid NoSuchMethodException caused by shaded zookeeper jar (was: Avoid NoSuchMethodException caused by shaded) > Avoid NoSuchMethodException caused by shaded zookeeper jar > --- > > Key: ZOOKEEPER-4345 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4345 > Project: ZooKeeper > Issue Type: Bug >Reporter: Bo Cui >Priority: Major > Labels: pull-request-available > Fix For: 3.8.0, 3.7.1, 3.6.4 > > Attachments: image-2021-08-07-17-30-42-883.png, > image-2021-08-07-18-52-00-633.png > > Time Spent: 2h 50m > Remaining Estimate: 0h > > In OSS Flink, Flink relocates zk to > org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.* > [https://github.com/apache/flink-shaded/blob/82f8bb3324864491dc62c4d3e27f1c1ccc49ac84/flink-shaded-zookeeper-parent/pom.xml#L68] > The maven-shade-plugin changes all 'org.apache.zookeeper' to > 'org.apache.flink.shaded.zookeeper3.org.apache.zookeeper', so > if the JVM has -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.*, then with the shaded > zk jar we will get NoSuchMethodException > !image-2021-08-07-18-52-00-633.png! > !image-2021-08-07-17-30-42-883.png! > code: > [https://github.com/apache/zookeeper/blob/9a5da5f9a023e53bf339748b5b7b17278ae36475/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3029] -- This message was sent by Atlassian Jira (v8.3.4#803005)
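The failure mode is easy to reproduce on paper: the shade plugin rewrites string constants inside the jar, but it cannot rewrite a class name the user supplies at runtime via -D, so reflection resolves a class whose constructor signature no longer matches the relocated types. The helper below is hypothetical (not the actual ZOOKEEPER-4345 patch) and only sketches a relocation-tolerant lookup that translates a canonical name to the relocated one before reflection.

```java
// Hypothetical helper illustrating the ZOOKEEPER-4345 problem: a literal
// class name configured via -Dzookeeper.clientCnxnSocket is not rewritten
// by the shade plugin, so it must be translated to the relocated name the
// shaded jar actually contains. Not the actual ZooKeeper fix.
public class ShadedLookupSketch {
    static String resolve(String configured, String shadedPrefix) {
        String unshaded = "org.apache.zookeeper.";
        if (configured.startsWith(unshaded)) {
            // Prepend the relocation prefix the shade plugin applied.
            return shadedPrefix + configured;
        }
        return configured;
    }
}
```

`Class.forName(resolve(name, prefix))` would then find the relocated class, whose constructor takes the relocated `ZKClientConfig` type, instead of failing reflection on the unshaded name.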
[jira] [Resolved] (ZOOKEEPER-4345) Avoid NoSuchMethodException caused by shaded
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4345. Fix Version/s: 3.6.4 3.7.1 3.8.0 Resolution: Fixed Issue resolved by pull request 1736 [https://github.com/apache/zookeeper/pull/1736] > Avoid NoSuchMethodException caused by shaded > - > > Key: ZOOKEEPER-4345 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4345 > Project: ZooKeeper > Issue Type: Bug >Reporter: Bo Cui >Priority: Major > Labels: pull-request-available > Fix For: 3.8.0, 3.7.1, 3.6.4 > > Attachments: image-2021-08-07-17-30-42-883.png, > image-2021-08-07-18-52-00-633.png > > Time Spent: 2h 40m > Remaining Estimate: 0h > > In OSS Flink, Flink relocates zk to > org.apache.flink.shaded.zookeeper3.org.apache.zookeeper.* > [https://github.com/apache/flink-shaded/blob/82f8bb3324864491dc62c4d3e27f1c1ccc49ac84/flink-shaded-zookeeper-parent/pom.xml#L68] > The maven-shade-plugin changes all 'org.apache.zookeeper' to > 'org.apache.flink.shaded.zookeeper3.org.apache.zookeeper', so > if the JVM has -Dzookeeper.clientCnxnSocket=org.apache.zookeeper.*, then with the shaded > zk jar we will get NoSuchMethodException > !image-2021-08-07-18-52-00-633.png! > !image-2021-08-07-17-30-42-883.png! > code: > [https://github.com/apache/zookeeper/blob/9a5da5f9a023e53bf339748b5b7b17278ae36475/zookeeper-server/src/main/java/org/apache/zookeeper/ZooKeeper.java#L3029] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ZOOKEEPER-4275) Slowness in sasl login or subject.doAs() causes zk client to falsely assume that the server did not respond, closes connection and goes to unnecessary retries
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-4275: -- Assignee: Ravi Kishore Valeti > Slowness in sasl login or subject.doAs() causes zk client to falsely assume > that the server did not respond, closes connection and goes to unnecessary > retries > -- > > Key: ZOOKEEPER-4275 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4275 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.9 >Reporter: Ravi Kishore Valeti >Assignee: Ravi Kishore Valeti >Priority: Minor > Labels: pull-request-available > Fix For: 3.5.10, 3.8.0, 3.7.1, 3.6.4 > > Time Spent: 2.5h > Remaining Estimate: 0h > > The Zookeeper client does sasl auth (login and subject.doAs()) as a preliminary step before > attempting a connection to the server. > If there is a delay in sasl auth (possibly due to slow Kerberos > communication), the ZK client falsely assumes that the zk server did not respond > and runs into multiple unnecessary retries. > Client configuration: > "zookeeper.session.timeout" = "3000" > "zookeeper.recovery.retry" = "1" > "zookeeper.recovery.retry.intervalmill" = "500" > This configuration translates to > a connect timeout of 1000ms > and a read timeout of 2000ms. > Example: There was a 3 second delay in logging in the user, as seen from the > logs below. The connection attempt was made later. However, the zk client did not > wait for the server response but logged a timeout (3 seconds > 1 sec connect > timeout), closed the connection and went into retries. Since there was a > consistent delay at the Kerberos master, we have seen these retries go on for as long as > 10 minutes, causing requests to time out/fail. > Logs: > 3/23/21 4:15:*32.389* AM jute.maxbuffer value is x Bytes > 3/23/21 4:15:*35.395* AM Client successfully logged in. > 3/23/21 4:15:35.396 AM TGT refresh sleeping until: Wed Mar 24 00:34:31 GMT > 2021 > 3/23/21 4:15:35.396 AM TGT refresh thread started. 
> 3/23/21 4:15:35.396 AM Client will use GSSAPI as SASL mechanism. > 3/23/21 4:15:35.396 AM TGT expires: xxx Mar xx 04:15:35 GMT > 2021 > 3/23/21 4:15:35.396 AM TGT valid starting at: xxx Mar xx 04:15:35 GMT > 2021 > 3/23/21 4:15:*35.397* AM *Opening socket connection* to server x:2181. > Will attempt to SASL-authenticate using Login Context section 'Client' > 3/23/21 4:15:*35.397* AM *Client session timed out, have not heard from > server in* *3008ms* for sessionid 0x0 > 3/23/21 4:15:35.397 AM Client session timed out, have not heard from server > in 3008ms for sessionid 0x0, closing socket connection and attempting > reconnect > 3/23/21 4:15:35.498 AM TGT renewal thread has been interrupted and will exit. > 3/23/21 4:15:38.503 AM Client successfully logged in. > 3/23/21 4:15:38.503 AM TGT expires: xxx Mar xx 04:15:38 GMT > 2021 > 3/23/21 4:15:38.503 AM Client will use GSSAPI as SASL mechanism. > 3/23/21 4:15:38.503 AM TGT valid starting at: xxx Mar xx 04:15:38 GMT > 2021 > 3/23/21 4:15:38.503 AM TGT refresh thread started. > 3/23/21 4:15:38.503 AM TGT refresh sleeping until: Wed Mar 24 00:10:10 GMT > 2021 > 3/23/21 4:15:38.506 AM Opening socket connection to server x:2181. Will > attempt to SASL-authenticate using Login Context section 'Client' > 3/23/21 4:15:38.506 AM Client session timed out, have not heard from server > in 3009ms for sessionid 0x0, closing socket connection and attempting > reconnect > 3/23/21 4:15:38.506 AM Client session timed out, have not heard from server > in 3009ms for sessionid 0x0 > 3/23/21 4:15:38.606 AM TGT renewal thread has been interrupted and will exit. > 3/23/21 4:15:41.610 AM Client successfully logged in. > 3/23/21 4:15:41.611 AM TGT refresh sleeping until: xxx Mar xx 23:42:03 GMT > 2021 > 3/23/21 4:15:41.611 AM Client will use GSSAPI as SASL mechanism. 
> 3/23/21 4:15:41.611 AM TGT valid starting at: xxx Mar xx 04:15:41 GMT > 2021 > 3/23/21 4:15:41.611 AM TGT expires: xxx Mar xx 04:15:41 GMT > 2021 > 3/23/21 4:15:41.611 AM TGT refresh thread started. > 3/23/21 4:15:41.612 AM Opening socket connection to server x:2181. Will > attempt to SASL-authenticate using Login Context section 'Client' > 3/23/21 4:15:41.613 AM Client session timed out, have not heard from server > in 3006ms for sessionid 0x0 > 3/23/21 4:15:41.613 AM Client session timed out, have not heard from server > in 3006ms for sessionid 0x0, closing socket connection and attempting > reconnect -- This message was sent by Atlassian Jira (v8.3.4#803005)
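The timeout figures in the report follow from how the ZooKeeper client typically splits the session timeout across the connect string: roughly two-thirds for the read timeout and an even share per server for the connect timeout. The sketch below assumes a three-server connect string, which reproduces the report's numbers; the exact formula may vary by version. With a 3000 ms session timeout the connect budget is only 1000 ms, so the 3-second SASL login alone already exceeds it before any network I/O happens.

```java
// How the 1000 ms connect / 2000 ms read timeouts in the log can derive
// from the 3000 ms session timeout. A sketch of the usual client-side
// split, assuming a three-server connect string (not the exact ZooKeeper
// code, which may differ by version).
public class TimeoutSketch {
    public static int readTimeout(int sessionTimeoutMs) {
        return sessionTimeoutMs * 2 / 3;
    }

    public static int connectTimeout(int sessionTimeoutMs, int hostCount) {
        return sessionTimeoutMs / hostCount;
    }
}
```

This is why the log shows "have not heard from server in 3008ms" against a session that never actually reached the network: the SASL login time was charged against a budget sized for the socket connect.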
[jira] [Resolved] (ZOOKEEPER-4247) NPE while processing message from restarted quorum member
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4247. Fix Version/s: 3.6.4 3.7.1 3.8.0 Resolution: Fixed > NPE while processing message from restarted quorum member > - > > Key: ZOOKEEPER-4247 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4247 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.6.2 > Environment: K8S >Reporter: Devarshi Shah >Assignee: Mate Szalay-Beko >Priority: Major > Labels: pull-request-available > Fix For: 3.8.0, 3.7.1, 3.6.4 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > *Problem:* > While upgrading the K8S cluster, the containers running Zookeeper (while serving their > clients) roll over one by one. > During this rollover, a +Null Pointer Exception+ was observed, as below. > After updating to the latest Zookeeper 3.6.2 we still see the problem. > This is happening on a fresh install (and always has). > > *Stack-trace:* > > {code:java} > 2021-02-08T12:42:08.229+ [myid:] - ERROR > [nioEventLoopGroup-4-1:NettyServerCnxnFactory$CnxnChannelHandler@329] - > Unexpected exception in receive > java.lang.NullPointerException: null > at > org.apache.zookeeper.server.NettyServerCnxn.receiveMessage(NettyServerCnxn.java:518) > ~[zookeeper-3.6.2.jar:3.6.2] > at > org.apache.zookeeper.server.NettyServerCnxn.processMessage(NettyServerCnxn.java:368) > ~[zookeeper-3.6.2.jar:3.6.2] > at > org.apache.zookeeper.server.NettyServerCnxnFactory$CnxnChannelHandler.channelRead(NettyServerCnxnFactory.java:326) > [zookeeper-3.6.2.jar:3.6.2] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:714) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:650) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:576) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493) > [netty-transport-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) > [netty-common-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) > [netty-common-4.1.50.Final.jar:4.1.50.Final] > at > io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) > [netty-common-4.1.50.Final.jar:4.1.50.Final] > at java.lang.Thread.run(Thread.java:834) [?:?] 
> {code} > > > *Expectation:* > This scenario should be handled, and Zookeeper should not print a Null Pointer > Exception in the logs when a peer member goes down as part of the upgrade > procedure. > We kindly request the Apache Zookeeper team to fix this issue. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4282) Redesign quota feature
Mohammad Arshad created ZOOKEEPER-4282: -- Summary: Redesign quota feature Key: ZOOKEEPER-4282 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4282 Project: ZooKeeper Issue Type: New Feature Components: quota Reporter: Mohammad Arshad Assignee: Mohammad Arshad Fix For: 3.8.0 *Quota Use Case:* Generally, in a big data solution deployment, multiple services (hdfs, yarn, hbase etc.) use a single Zookeeper cluster, so it is very important to ensure fair usage by all services. Sometimes services unintentionally, mainly because of faulty behavior, create many znodes and impact the overall reliability of the ZooKeeper service. To ensure fair usage, the quota feature is required. But this is not the only use case; there are many other use cases for the quota feature. *Current Problems:* # Currently, a user can set quota by updating the znode “/zookeeper/quota/nodepath”, or using setquota/delquota in the CLI. This makes the quota setting ineffective. Currently any user can set/delete quota, which is not proper; it should be an admin operation. # A user is allowed to modify zookeeper system paths like /zookeeper/quota. These are internal to zookeeper and should not be modifiable from outside. # Generally services create a single top-level znode in Zookeeper, like /hbase, and create all required znodes under it. It is better if it is configurable who can create top-level znodes, to control ZooKeeper usage. # After ZOOKEEPER-231, there are two kinds of quota enforcement limits: 1. Hard limit 2. Soft limit. I think there should be only one limit. When enforce quota is enabled, that limit becomes the hard limit; otherwise it is a soft limit, same as the old feature, which just logs warnings. *Proposed Solution* # Add setQuota and deleteQuota admin APIs. Add a listQuota normal user API. Modify the quota CLI commands to use these APIs instead of directly modifying the ZooKeeper system path /zookeeper/quota/ # Protect ZooKeeper system paths from outside modification. 
System should only be readable from outside # Expose configuration to set ACL for root system znode. After this, at the time of ZooKeeper service deployment administrator can create top level znode for a service and set quota. This way we can control overall ZooKeeper usage # Revert some of the changes in ZOOKEEPER-231 and move to single quota limit -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3841) remove useless codes in the Leader.java
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3841: --- Fix Version/s: (was: 3.7.1,3.8.0) 3.7.1 3.8.0 > remove useless codes in the Leader.java > --- > > Key: ZOOKEEPER-3841 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3841 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Ling Mao >Priority: Minor > Fix For: 3.8.0, 3.7.1 > > Time Spent: 20m > Remaining Estimate: 0h > > - There is some useless code in Leader.java which was commented out. > - Please recheck everything in this class to clean it up, > e.g.: > {code:java} > // Everything is a go, simply start counting the ticks > // WARNING: I couldn't find any wait statement on a synchronized > // block that would be notified by this notifyAll() call, so > // I commented it out > //synchronized (this) { > //notifyAll(); > //} > {code} > {code:java} > //turnOffFollowers(); > {code} > {code:java} > //LOG.warn("designated leader is: " + designatedLeader); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-4265) Download page broken links
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4265: --- Fix Version/s: (was: 3.6.3) > Download page broken links > -- > > Key: ZOOKEEPER-4265 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4265 > Project: ZooKeeper > Issue Type: Bug >Reporter: Sebb >Assignee: Damien Diederen >Priority: Major > Labels: pull-request-available > Fix For: 3.8.0, 3.7.1 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > The download page [1] has broken links for the following release versions: > 3.6.1 > 3.5.9 > Please remove them from the page. > If necessary, they can be linked from the archive server, in which case the > page should make it clear that they historic releases. > [1] https://zookeeper.apache.org/releases.html -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17317650#comment-17317650 ] Mohammad Arshad commented on ZOOKEEPER-4278: Thanks [~ayushmantri] for raising the PR. Please raise PR for branch-3.5 also > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Ayush Mantri >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.3, 3.8.0, 3.7.1 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-4278. Fix Version/s: 3.7.1 3.8.0 Resolution: Fixed > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Assignee: Ayush Mantri >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.3, 3.8.0, 3.7.1 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4278: --- Priority: Blocker (was: Major) > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Blocker > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-4278: --- Fix Version/s: 3.6.3 > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Blocker > Fix For: 3.6.3 > > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17317085#comment-17317085 ] Mohammad Arshad commented on ZOOKEEPER-4278: To fix the CVE anyway we have to upgrade to 4.1.61. 4.1.62 and 4.1.63 are Regression fix releases. As per the release notes there is not much change from 4.1.62 to 4.1.63. https://netty.io/news/2021/03/30/4-1-61-Final.html https://netty.io/news/2021/03/31/4-1-62-Final.html https://netty.io/news/2021/04/01/4-1-63-Final.html > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Comment Edited] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316947#comment-17316947 ] Mohammad Arshad edited comment on ZOOKEEPER-4278 at 4/8/21, 8:25 AM: - Though 4.1.61.Final has fixed the CVE-2021-21409, latest netty release is 4-1-63-Final. I think we should upgrade to the latest version . https://netty.io/news/2021/04/01/4-1-63-Final.html was (Author: arshad.mohammad): Though 4.1.61.Final has fixed the CVE-2021-21409, latest netty release is 4-1-63-Final. I think we should upgrade to this version. https://netty.io/news/2021/04/01/4-1-63-Final.html > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17316947#comment-17316947 ] Mohammad Arshad commented on ZOOKEEPER-4278: Though 4.1.61.Final has fixed the CVE-2021-21409, latest netty release is 4-1-63-Final. I think we should upgrade to this version. https://netty.io/news/2021/04/01/4-1-63-Final.html > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
[ https://issues.apache.org/jira/browse/ZOOKEEPER-4278?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-4278: -- Assignee: (was: Mohammad Arshad) > dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 > - > > Key: ZOOKEEPER-4278 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 > Project: ZooKeeper > Issue Type: Bug >Reporter: Mohammad Arshad >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4278) dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409
Mohammad Arshad created ZOOKEEPER-4278: -- Summary: dependency-check:check failing - netty-transport-4.1.60.Final CVE-2021-21409 Key: ZOOKEEPER-4278 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4278 Project: ZooKeeper Issue Type: Bug Reporter: Mohammad Arshad Assignee: Mohammad Arshad -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3992) addWatch api should check the null watch
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3992: --- Fix Version/s: 3.8.0 3.6.3 > addWatch api should check the null watch > > > Key: ZOOKEEPER-3992 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3992 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Reporter: Ling Mao >Assignee: Damien Diederen >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.3, 3.7.0, 3.8.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > {code:java} > public void addWatch(String basePath, Watcher watcher, AddWatchMode mode) > throws KeeperException, InterruptedException { > PathUtils.validatePath(basePath); > String serverPath = prependChroot(basePath); > RequestHeader h = new RequestHeader(); > h.setType(ZooDefs.OpCode.addWatch); > AddWatchRequest request = new AddWatchRequest(serverPath, mode.getMode()); > ReplyHeader r = cnxn.submitRequest(h, request, new ErrorResponse(), > > {code} > we need to _*validateWatcher(watcher)*_ to ** avoid the case: > {code:java} > zk.addWatch("/a/b", null, PERSISTENT_RECURSIVE); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
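The null check requested in ZOOKEEPER-3992 above can be sketched as follows. This is a standalone illustration, not ZooKeeper's actual code: the nested Watcher interface is a stand-in for org.apache.zookeeper.Watcher, and the validateWatcher helper's name and message are assumptions. The point is simply that addWatch should reject a null watcher up front instead of failing later.

```java
// Illustrative sketch of the guard proposed for ZooKeeper.addWatch.
public class WatcherValidation {
    interface Watcher {}  // stand-in for org.apache.zookeeper.Watcher

    // Hypothetical helper: fail fast on a null watcher, mirroring the
    // validateWatcher(watcher) call the issue asks for.
    static void validateWatcher(Watcher watcher) {
        if (watcher == null) {
            throw new IllegalArgumentException(
                "Invalid Watcher, shouldn't be null!");
        }
    }

    public static void main(String[] args) {
        boolean rejected = false;
        try {
            // Equivalent of zk.addWatch("/a/b", null, PERSISTENT_RECURSIVE)
            validateWatcher(null);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        System.out.println(rejected ? "rejected null watcher"
                                    : "accepted null watcher");
    }
}
```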
[jira] [Updated] (ZOOKEEPER-3980) Fix Jenkinsfiles with new tool names
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3980: --- Fix Version/s: 3.8.0 3.6.3 > Fix Jenkinsfiles with new tool names > > > Key: ZOOKEEPER-3980 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3980 > Project: ZooKeeper > Issue Type: Task > Components: build-infrastructure >Affects Versions: 3.7.0 >Reporter: Enrico Olivelli >Assignee: Enrico Olivelli >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.3, 3.7.0, 3.8.0 > > Time Spent: 40m > Remaining Estimate: 0h > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3957) Create Owasp check build on new Jenkins instance
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3957: --- Fix Version/s: 3.8.0 3.6.3 > Create Owasp check build on new Jenkins instance > > > Key: ZOOKEEPER-3957 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3957 > Project: ZooKeeper > Issue Type: Task > Components: build >Reporter: Andor Molnar >Assignee: Andor Molnar >Priority: Major > Fix For: 3.6.3, 3.7.0, 3.8.0 > > Time Spent: 50m > Remaining Estimate: 0h > > We haven't migrated the owasp build to the new instance yet. > Need to create a new multi-branch Pipeline job here: > https://ci-hadoop.apache.org/view/ZooKeeper/ -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-3931) "zkServer.sh version" returns a trailing dash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314411#comment-17314411 ] Mohammad Arshad commented on ZOOKEEPER-3931: Thanks [~Suraj Naik] for your contribution, Added you as a contributor > "zkServer.sh version" returns a trailing dash > - > > Key: ZOOKEEPER-3931 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3931 > Project: ZooKeeper > Issue Type: Bug >Reporter: Enrico Olivelli >Assignee: Suraj Naik >Priority: Major > Fix For: 3.6.3 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When you run zkServer.sh version the result includes a few spam lines and the > version reports a trailing dash > {noformat} > bin/zkServer.sh version > ZooKeeper JMX enabled by default > Using config: /xxx/bin/../conf/zoo.cfg > Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT > {noformat} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ZOOKEEPER-3931) "zkServer.sh version" returns a trailing dash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-3931: -- Assignee: Suraj Naik > "zkServer.sh version" returns a trailing dash > - > > Key: ZOOKEEPER-3931 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3931 > Project: ZooKeeper > Issue Type: Bug >Reporter: Enrico Olivelli >Assignee: Suraj Naik >Priority: Major > Fix For: 3.6.3 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When you run zkServer.sh version the result includes a few spam lines and the > version reports a trailing dash > {noformat} > bin/zkServer.sh version > ZooKeeper JMX enabled by default > Using config: /xxx/bin/../conf/zoo.cfg > Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT > {noformat} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (ZOOKEEPER-3931) "zkServer.sh version" returns a trailing dash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad resolved ZOOKEEPER-3931. Fix Version/s: 3.6.3 Resolution: Fixed > "zkServer.sh version" returns a trailing dash > - > > Key: ZOOKEEPER-3931 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3931 > Project: ZooKeeper > Issue Type: Bug >Reporter: Enrico Olivelli >Priority: Major > Fix For: 3.6.3 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > When you run zkServer.sh version the result includes a few spam lines and the > version reports a trailing dash > {noformat} > bin/zkServer.sh version > ZooKeeper JMX enabled by default > Using config: /xxx/bin/../conf/zoo.cfg > Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT > {noformat} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-3931) "zkServer.sh version" returns a trailing dash
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314410#comment-17314410 ] Mohammad Arshad commented on ZOOKEEPER-3931: Created https://issues.apache.org/jira/browse/ZOOKEEPER-4273 to forward port in master branch-3.7 > "zkServer.sh version" returns a trailing dash > - > > Key: ZOOKEEPER-3931 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3931 > Project: ZooKeeper > Issue Type: Bug >Reporter: Enrico Olivelli >Priority: Major > Time Spent: 2h 40m > Remaining Estimate: 0h > > When you run zkServer.sh version the result includes a few spam lines and the > version reports a trailing dash > {noformat} > bin/zkServer.sh version > ZooKeeper JMX enabled by default > Using config: /xxx/bin/../conf/zoo.cfg > Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT > {noformat} > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-4273) Forward port ZOOKEEPER-3931: "zkServer.sh version" returns a trailing dash
Mohammad Arshad created ZOOKEEPER-4273: -- Summary: Forward port ZOOKEEPER-3931: "zkServer.sh version" returns a trailing dash Key: ZOOKEEPER-4273 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4273 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.7.1 Reporter: Mohammad Arshad When you run zkServer.sh version the result includes a few spam lines and the version reports a trailing dash {noformat} bin/zkServer.sh version ZooKeeper JMX enabled by default Using config: /xxx/bin/../conf/zoo.cfg Apache ZooKeeper, version 3.6.2- 09/04/2020 12:44 GMT {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3934) upgrade dependency-check to version 6.0.0
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3934: --- Fix Version/s: (was: 3.6.3) (was: 3.7.0) > upgrade dependency-check to version 6.0.0 > - > > Key: ZOOKEEPER-3934 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3934 > Project: ZooKeeper > Issue Type: Improvement > Components: build, security >Affects Versions: 3.7.0, 3.5.8, 3.6.2 >Reporter: Patrick D. Hunt >Assignee: Patrick D. Hunt >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > 6.0.0 is now available. I verified it with 3.5, 3.6,3.7 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3933) owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3933: --- Fix Version/s: (was: 3.6.3) (was: 3.7.0) > owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 > --- > > Key: ZOOKEEPER-3933 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3933 > Project: ZooKeeper > Issue Type: Bug > Components: security >Affects Versions: 3.7.0, 3.5.8, 3.6.2 >Reporter: Patrick D. Hunt >Priority: Blocker > > dependency-check is failing with: > json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ZOOKEEPER-3933) owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-3933: -- Assignee: (was: Mohammad Arshad) > owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 > --- > > Key: ZOOKEEPER-3933 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3933 > Project: ZooKeeper > Issue Type: Bug > Components: security >Affects Versions: 3.7.0, 3.5.8, 3.6.2 >Reporter: Patrick D. Hunt >Priority: Blocker > Fix For: 3.6.3, 3.7.0 > > > dependency-check is failing with: > json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (ZOOKEEPER-3933) owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad reassigned ZOOKEEPER-3933: -- Assignee: Mohammad Arshad > owasp failing with json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 > --- > > Key: ZOOKEEPER-3933 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3933 > Project: ZooKeeper > Issue Type: Bug > Components: security >Affects Versions: 3.7.0, 3.5.8, 3.6.2 >Reporter: Patrick D. Hunt >Assignee: Mohammad Arshad >Priority: Blocker > Fix For: 3.6.3, 3.7.0 > > > dependency-check is failing with: > json-simple-1.1.1.jar: CVE-2020-10663, CVE-2020-7712 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-3841) remove useless codes in the Leader.java
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314403#comment-17314403 ] Mohammad Arshad commented on ZOOKEEPER-3841: Update fix version from 3.6.3 to 3.8.0,3.7.1 as changes are not present in branch 3.6 > remove useless codes in the Leader.java > --- > > Key: ZOOKEEPER-3841 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3841 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Ling Mao >Priority: Minor > Fix For: 3.7.1,3.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > - There are some useless code in the Leader.java which were comment out. > - Pls recheck all the things in this class to clear up > e.g: > {code:java} > // Everything is a go, simply start counting the ticks > // WARNING: I couldn't find any wait statement on a synchronized > // block that would be notified by this notifyAll() call, so > // I commented it out > //synchronized (this) { > //notifyAll(); > //} > {code} > {code:java} > //turnOffFollowers(); > {code} > {code:java} > //LOG.warn("designated leader is: " + designatedLeader); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3841) remove useless codes in the Leader.java
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3841: --- Fix Version/s: (was: 3.6.3) 3.7.1,3.8.0 > remove useless codes in the Leader.java > --- > > Key: ZOOKEEPER-3841 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3841 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Ling Mao >Priority: Minor > Fix For: 3.7.1,3.8.0 > > Time Spent: 20m > Remaining Estimate: 0h > > - There are some useless code in the Leader.java which were comment out. > - Pls recheck all the things in this class to clear up > e.g: > {code:java} > // Everything is a go, simply start counting the ticks > // WARNING: I couldn't find any wait statement on a synchronized > // block that would be notified by this notifyAll() call, so > // I commented it out > //synchronized (this) { > //notifyAll(); > //} > {code} > {code:java} > //turnOffFollowers(); > {code} > {code:java} > //LOG.warn("designated leader is: " + designatedLeader); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (ZOOKEEPER-3798) remove the useless code in the ProposalRequestProcessor#processRequest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mohammad Arshad updated ZOOKEEPER-3798: --- Fix Version/s: (was: 3.6.3) 3.8.0,3.7.1 > remove the useless code in the ProposalRequestProcessor#processRequest > -- > > Key: ZOOKEEPER-3798 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3798 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Ling Mao >Priority: Minor > Labels: pull-request-available > Fix For: 3.8.0,3.7.1 > > Time Spent: 40m > Remaining Estimate: 0h > > remove the following useless codes in the > ProposalRequestProcessor#processRequest > {code:java} > public void processRequest(Request request) throws RequestProcessorException { > // LOG.warn("Ack>>> cxid = " + request.cxid + " type = " + > // request.type + " id = " + request.sessionId); > // request.addRQRec(">prop"); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-3798) remove the useless code in the ProposalRequestProcessor#processRequest
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314402#comment-17314402 ] Mohammad Arshad commented on ZOOKEEPER-3798: Update fix version from 3.6.3 to 3.8.0,3.7.1 as changes are not present in branch 3.6 > remove the useless code in the ProposalRequestProcessor#processRequest > -- > > Key: ZOOKEEPER-3798 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3798 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Ling Mao >Priority: Minor > Labels: pull-request-available > Fix For: 3.8.0,3.7.1 > > Time Spent: 40m > Remaining Estimate: 0h > > remove the following useless codes in the > ProposalRequestProcessor#processRequest > {code:java} > public void processRequest(Request request) throws RequestProcessorException { > // LOG.warn("Ack>>> cxid = " + request.cxid + " type = " + > // request.type + " id = " + request.sessionId); > // request.addRQRec(">prop"); > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (ZOOKEEPER-3774) Close quorum socket asynchronously on the leader to avoid ping being blocked by long socket closing time
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17314401#comment-17314401 ] Mohammad Arshad commented on ZOOKEEPER-3774: Update fix version from 3.6.3 to 3.8.0,3.7.1 as changes are not present in branch 3.6 > Close quorum socket asynchronously on the leader to avoid ping being blocked > by long socket closing time > > > Key: ZOOKEEPER-3774 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3774 > Project: ZooKeeper > Issue Type: Sub-task > Components: server >Reporter: Jie Huang >Assignee: Jie Huang >Priority: Minor > Labels: pull-request-available > Fix For: 3.8.0, 3.7.1 > > Time Spent: 3h > Remaining Estimate: 0h > > In ZOOKEEPER-3574 we close the quorum sockets on followers asynchronously > when a leader is partitioned away so the shutdown process will not be stalled > by long socket closing time and the followers can quickly establish a new > quorum to serve client requests. > We've found that the long socket closing time can cause trouble on the leader > too when a follower is partitioned away if the partition is detected by > PingLaggingDetector. When the ping thread detects partition, it tries to > disconnect the follower. If the socket closing time is long, the ping thread > will be blocked and no ping is sent to any follower--even the ones still > connected to the leader--since the ping thread is responsible for sending > pings to all followers. When followers don't receive pings, they don't send > ping response. When the leader don't receive ping response, the sessions > expire. > To prevent good sessions from expiring, we need to close the socket > asynchronously on the leader too. > -- This message was sent by Atlassian Jira (v8.3.4#803005)