[jira] [Created] (ZOOKEEPER-3981) Flaky test MultipleAddressTest::testGetValidAddressWithNotValid
Michael Han created ZOOKEEPER-3981: -- Summary: Flaky test MultipleAddressTest::testGetValidAddressWithNotValid Key: ZOOKEEPER-3981 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3981 Project: ZooKeeper Issue Type: Task Components: tests Reporter: Michael Han Assignee: Michael Han Problem: Test MultipleAddressTest::testGetValidAddressWithNotValid fails deterministically in environments where the address it uses, 10.0.0.1, is reachable: per https://tools.ietf.org/html/rfc5735, 10.0.0.1 falls in the address space allocated for private network use. In fact, my ISP's router is assigned this IP, so this test always fails for me. Solution: Replace the address with 240.0.0.0, which is reserved for future use and far less likely to be reachable. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3970) Enable ZooKeeperServerController to expire session
Michael Han created ZOOKEEPER-3970: -- Summary: Enable ZooKeeperServerController to expire session Key: ZOOKEEPER-3970 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3970 Project: ZooKeeper Issue Type: Task Components: server, tests Reporter: Michael Han Assignee: Michael Han This is a follow-up to ZOOKEEPER-3948. Here we enable ZooKeeperServerController to expire a global or local session. In our experience this is very useful in integration testing, where we want a controlled session expiration mechanism. This is done by having the session tracker expose both global and local session stats, so a ZooKeeper server can expire the sessions on the controller's behalf. -- This message was sent by Atlassian Jira (v8.3.4#803005)
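A minimal sketch of the idea with hypothetical names (this is not the actual ZooKeeper session tracker API, just an illustration of exposing tracked sessions so a controller can expire them on demand):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: a session tracker that exposes its global and
// local session ids so a controller can force-expire a chosen session.
class ControllableSessionTrackerSketch {
    private final Set<Long> globalSessions = new HashSet<>();
    private final Set<Long> localSessions = new HashSet<>();

    public void addGlobalSession(long sid) { globalSessions.add(sid); }
    public void addLocalSession(long sid)  { localSessions.add(sid); }

    // Expose a snapshot of tracked sessions for the controller to inspect.
    public Set<Long> globalSessions() { return new HashSet<>(globalSessions); }

    /** Expiring is just removal here; a real server would also close the
     *  connection and emit a session-expired event to the client. */
    public boolean expire(long sid) {
        return globalSessions.remove(sid) || localSessions.remove(sid);
    }
}
```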
[jira] [Created] (ZOOKEEPER-3967) Jetty License Update
Michael Han created ZOOKEEPER-3967: -- Summary: Jetty License Update Key: ZOOKEEPER-3967 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3967 Project: ZooKeeper Issue Type: Task Components: license Reporter: Michael Han The ZooKeeper server uses Jetty (Apache License, v2) for the admin server (and for more things in the future), but we didn't include any of Jetty's copyright / notice / license files in the ZooKeeper distribution. This ticket is to figure out whether the Jetty license is indeed missing and, if so, fix it. There were some previous discussions of the Jetty license in ZOOKEEPER-2235, but Jetty somehow did not end up in that patch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3966) Model ZooKeeper data tree using RocksDB primitives to enable on disk data tree storage
Michael Han created ZOOKEEPER-3966: -- Summary: Model ZooKeeper data tree using RocksDB primitives to enable on disk data tree storage Key: ZOOKEEPER-3966 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3966 Project: ZooKeeper Issue Type: Sub-task Components: server Reporter: Michael Han -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3965) Add documentation for RocksDB Snap feature
Michael Han created ZOOKEEPER-3965: -- Summary: Add documentation for RocksDB Snap feature Key: ZOOKEEPER-3965 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3965 Project: ZooKeeper Issue Type: Sub-task Components: documentation Reporter: Michael Han -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3964) Introduce RocksDB snap and implement change data capture to enable incremental snapshot
Michael Han created ZOOKEEPER-3964: -- Summary: Introduce RocksDB snap and implement change data capture to enable incremental snapshot Key: ZOOKEEPER-3964 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3964 Project: ZooKeeper Issue Type: Sub-task Components: rocksdb, server Reporter: Michael Han Assignee: Michael Han This is the first step toward enabling an on-disk storage engine for ZooKeeper: extend the existing Snap interface and implement a RocksDB backed snapshot. Compared to the file based snapshot, a RocksDB based snapshot is superior for a big in-memory data tree, as it supports incremental snapshots by serializing only the data changed between snapshots. High level overview: * Extend the Snap interface so everything that needs serialization is represented on the interface. * Implement the RocksDB based snapshot, plus bidirectional conversions between file based and RocksDB snapshots, for backward / forward compatibility. * Change data capture is implemented by buffering transactions applied to the data tree and applying them to RocksDB as each transaction is processed. An incremental snapshot thus only requires a RocksDB flush. ZK will always do a full snapshot when first loading the data tree during startup. * By default, this feature is disabled. Users opt in by explicitly specifying a Java system property to instantiate RocksDBSnap at runtime. This work is based on top of the patch attached to ZOOKEEPER-3783 (kudos to Fangmin and co at FB), with some bug / test fixes and adjustments so it can cleanly apply to the master branch. -- This message was sent by Atlassian Jira (v8.3.4#803005)
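The change-data-capture idea above can be sketched with a plain map standing in for RocksDB (the real implementation would go through the RocksDB JNI API; all names here are illustrative, not ZooKeeper classes):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: transactions applied to the data tree are buffered, then
// applied to the key-value store on flush. An "incremental snapshot" is
// thus just a flush of the buffered changes, rather than a full
// serialization of the whole tree.
class IncrementalSnapSketch {
    private final Map<String, byte[]> store = new HashMap<>(); // stands in for RocksDB
    private final List<String[]> txnBuffer = new ArrayList<>(); // [path, data]

    public void applyTxn(String path, byte[] data) {
        txnBuffer.add(new String[] { path, new String(data) });
    }

    /** Applies only the buffered changes -- the incremental part. */
    public int flush() {
        int applied = txnBuffer.size();
        for (String[] txn : txnBuffer) {
            store.put(txn[0], txn[1].getBytes());
        }
        txnBuffer.clear();
        return applied;
    }

    public int storedNodes() { return store.size(); }
}
```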
[jira] [Created] (ZOOKEEPER-3948) Introduce a deterministic runtime behavior injection framework for ZooKeeperServer testing
Michael Han created ZOOKEEPER-3948: -- Summary: Introduce a deterministic runtime behavior injection framework for ZooKeeperServer testing Key: ZOOKEEPER-3948 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3948 Project: ZooKeeper Issue Type: New Feature Components: server, tests Reporter: Michael Han Assignee: Michael Han We'd like to understand how applications built on top of ZooKeeper behave under various faulty conditions, which is important for building resilient end to end solutions and avoiding ZooKeeper becoming a single point of failure. We'd also like to achieve this in both unit tests (in process) and integration tests (in and out of process). Traditional methods using external fault injection mechanisms are non-deterministic, require non-trivial setup, and are hard to integrate with unit tests, so here we introduce the ZooKeeperController service, which solves both. The basic idea is to create a controllable ZooKeeperServer which accepts various control commands (such as delay request, drop request, eat request, expire session, shutdown, trigger leader election, and so on) and reacts to incoming commands. The controllable server and the production server share the same underlying machinery (quorum peers, ZooKeeper server, etc.) but their code paths are separate, so this feature has no production impact. The controller system is currently composed of the following pieces: * CommandClient: a convenient HTTP client to send control commands to the controller service. * CommandListener: an embedded HTTP server that listens for incoming commands and dispatches them to the controller service. * Controller Service: the service responsible for creating the controllable ZK server and the controller. * ZooKeeperServerController: the controller that changes the behavior of the ZK server at runtime. * Controllable Cnx / Factory: controllable connections that accept behavior change requests. In the future, more control commands and controllable components can be added on top of this framework. 
This can be used in unit or integration tests as an in-process embedded controllable ZooKeeper server, or as an out-of-process standalone controllable ZooKeeper process. -- This message was sent by Atlassian Jira (v8.3.4#803005)
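The CommandListener-to-controller flow described above can be sketched as a small command dispatcher (the HTTP layer is omitted; the class, enum, and command names here are illustrative, not the actual ZooKeeper controller classes):

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of the dispatch core: the CommandListener (an HTTP
// server in the real design) would parse a request into a ControlCommand
// and hand it to a dispatcher like this, which invokes the registered
// behavior-change handler on the controllable server.
class CommandDispatcherSketch {
    public enum ControlCommand { DELAY_REQUEST, DROP_REQUEST, EXPIRE_SESSION, SHUTDOWN }

    private final Map<ControlCommand, Consumer<String>> handlers =
            new EnumMap<>(ControlCommand.class);

    public void register(ControlCommand cmd, Consumer<String> handler) {
        handlers.put(cmd, handler);
    }

    /** Returns true if a handler was found and invoked for the command. */
    public boolean dispatch(ControlCommand cmd, String arg) {
        Consumer<String> h = handlers.get(cmd);
        if (h == null) return false;
        h.accept(arg);
        return true;
    }
}
```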
[jira] [Created] (ZOOKEEPER-3793) Request throttling is broken when RequestThrottler is disabled or configured incorrectly.
Michael Han created ZOOKEEPER-3793: -- Summary: Request throttling is broken when RequestThrottler is disabled or configured incorrectly. Key: ZOOKEEPER-3793 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3793 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han When RequestThrottler is not enabled, or is enabled but configured incorrectly, the ZooKeeper server will stop throttling. This is a serious bug: without request throttling it's fairly easy to overwhelm ZooKeeper, which leads to all sorts of issues. This is a regression introduced in ZOOKEEPER-3243, where the total number of queued requests in the request processing pipeline is not taken into consideration when deciding whether to throttle, or is only taken into consideration conditionally, based on RequestThrottler's configuration. We should always take the number of queued requests in the request processing pipeline into account before making throttling decisions. -- This message was sent by Atlassian Jira (v8.3.4#803005)
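The decision rule the report argues for can be sketched as follows (illustrative, not the actual patch): the throttle check must always include requests already queued in the processing pipeline, regardless of how RequestThrottler is configured.

```java
// Sketch of the fix: count both the requests queued in the processing
// pipeline and those held by the throttler against the global limit.
// The regression described above effectively dropped queuedInPipeline
// from this sum when the throttler was disabled or misconfigured.
class ThrottleCheckSketch {
    public static boolean shouldThrottle(int queuedInPipeline,
                                         int queuedInThrottler,
                                         int globalOutstandingLimit) {
        return queuedInPipeline + queuedInThrottler >= globalOutstandingLimit;
    }
}
```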
[jira] [Created] (ZOOKEEPER-3561) Generalize target authentication scheme for ZooKeeper authentication enforcement.
Michael Han created ZOOKEEPER-3561: -- Summary: Generalize target authentication scheme for ZooKeeper authentication enforcement. Key: ZOOKEEPER-3561 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3561 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han ZOOKEEPER-1634 introduced an option to let users enforce authentication for ZooKeeper clients, but the enforced authentication scheme in the committed implementation was SASL only. This JIRA is to generalize the authentication scheme so that authentication enforcement on ZooKeeper clients can work with any supported authentication scheme. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (ZOOKEEPER-3560) Add response cache to serve get children (2) requests.
Michael Han created ZOOKEEPER-3560: -- Summary: Add response cache to serve get children (2) requests. Key: ZOOKEEPER-3560 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3560 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Michael Han Assignee: Michael Han ZOOKEEPER-3180 introduced a response cache, but it only covers getData requests. This JIRA extends the response cache, based on the infrastructure set up by ZOOKEEPER-3180, so that responses to get children requests can also be served out of the cache. Some design decisions: * Only OpCode.getChildren2 is supported, as OpCode.getChildren does not have associated stats and the current cache infra relies on stats to invalidate the cache. * The children list is stored in a separate response cache object so it does not pollute the existing data cache serving getData requests; this separation also allows potential separate tuning of each cache based on workload characteristics. * As a result of the cache object separation, new server metrics are added to measure cache hits / misses for get children requests, separate from get data requests. -- This message was sent by Atlassian Jira (v8.3.4#803005)
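The stat-based invalidation that the design above relies on can be sketched like this (illustrative names, not the actual cache classes): a cached child list is served only if the node's current pzxid still matches the pzxid recorded when the entry was cached.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: cache entries carry the pzxid observed at read time; any
// change to the child list bumps the node's pzxid, so a stale entry
// simply stops matching and falls through to a normal read.
class ChildrenCacheSketch {
    private static final class Entry {
        final long pzxid;
        final List<String> children;
        Entry(long pzxid, List<String> children) {
            this.pzxid = pzxid;
            this.children = children;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();

    public void put(String path, long pzxid, List<String> children) {
        cache.put(path, new Entry(pzxid, children));
    }

    /** Returns the cached children only on a pzxid match; null means miss. */
    public List<String> get(String path, long currentPzxid) {
        Entry e = cache.get(path);
        return (e != null && e.pzxid == currentPzxid) ? e.children : null;
    }
}
```

This is also why plain OpCode.getChildren cannot participate: without the stat in the response there is no recorded pzxid to compare against.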
[jira] [Created] (ZOOKEEPER-3548) Redundant zxid check in SnapStream.isValidSnapshot
Michael Han created ZOOKEEPER-3548: -- Summary: Redundant zxid check in SnapStream.isValidSnapshot Key: ZOOKEEPER-3548 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3548 Project: ZooKeeper Issue Type: Improvement Components: server Reporter: Michael Han Assignee: Michael Han getZxidFromName is called twice in isValidSnapshot, and the second call is redundant and should be removed. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (ZOOKEEPER-3483) Flaky test: org.apache.zookeeper.server.util.RequestPathMetricsCollectorTest.testCollectStats
Michael Han created ZOOKEEPER-3483: -- Summary: Flaky test: org.apache.zookeeper.server.util.RequestPathMetricsCollectorTest.testCollectStats Key: ZOOKEEPER-3483 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3483 Project: ZooKeeper Issue Type: Test Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han Test org.apache.zookeeper.server.util.RequestPathMetricsCollectorTest.testCollectStats consistently passes in a local dev environment but frequently fails in the Jenkins pre-commit build. For now, disable the test to unblock a couple of pull requests needing a green build, until it's completely addressed. Error for reference: {code:java} Error Message expected:<845466> but was:<111> Stacktrace java.lang.AssertionError: expected:<845466> but was:<111> at org.apache.zookeeper.server.util.RequestPathMetricsCollectorTest.testCollectStats(RequestPathMetricsCollectorTest.java:248) {code} -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Created] (ZOOKEEPER-3448) Introduce MessageTracker to assist debugging leader and learner connectivity issues
Michael Han created ZOOKEEPER-3448: -- Summary: Introduce MessageTracker to assist debugging leader and learner connectivity issues Key: ZOOKEEPER-3448 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3448 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han We want better insight into the state of the world when learners lose their connection with the leader, so we need to capture more information when that happens. We capture it through MessageTracker, which records the last few sent and received messages at various protocol stages; this information is dumped to log files for further analysis. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
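The "last few sent and received messages" idea is essentially a pair of bounded ring buffers; a minimal sketch (not the actual MessageTracker class) might look like:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Sketch: keep only the last N sent/received messages per connection so
// they can be dumped to the log when the leader/learner link breaks.
// Old entries are dropped, so memory use stays bounded.
class MessageTrackerSketch {
    private final int capacity;
    private final Deque<String> sent = new ArrayDeque<>();
    private final Deque<String> received = new ArrayDeque<>();

    public MessageTrackerSketch(int capacity) { this.capacity = capacity; }

    public void trackSent(String msg)     { track(sent, msg); }
    public void trackReceived(String msg) { track(received, msg); }

    private void track(Deque<String> buf, String msg) {
        if (buf.size() == capacity) buf.removeFirst(); // drop the oldest
        buf.addLast(msg);
    }

    /** Snapshot of the retained sent messages, oldest first. */
    public List<String> dumpSent() { return new ArrayList<>(sent); }
}
```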
[jira] [Created] (ZOOKEEPER-3439) Observability improvements on client / server connection close
Michael Han created ZOOKEEPER-3439: -- Summary: Observability improvements on client / server connection close Key: ZOOKEEPER-3439 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3439 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han Currently, when the server closes a client connection, there is not enough information recorded (except for a few exception logs), which makes it hard to do postmortems. On the other hand, having a complete view of the aggregated connection closing reasons will provide more signals based on which we can better operate the clusters (e.g. predicting that an incident might happen based on trends in the connection closing reasons). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3430) Observability improvement: provide top N read / write path queries
Michael Han created ZOOKEEPER-3430: -- Summary: Observability improvement: provide top N read / write path queries Key: ZOOKEEPER-3430 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3430 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han We would like a better understanding of the types of workloads that hit ZK, and one aspect of that is being able to answer queries for the top N read and top N write request paths. Knowing the hot request paths will allow us to better optimize for such workloads, for example by enabling path specific caching, or by changing the path structure (e.g. breaking a long path into hierarchical paths). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
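The core of a top-N path query is just a per-path counter plus a ranking step; a minimal sketch (illustrative only; the real collector also buckets by read/write and trims the path depth) might be:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch: count requests per path, then answer "top N" by sorting the
// counts descending. For very large path sets a bounded min-heap would
// avoid sorting the full map.
class TopPathsSketch {
    private final Map<String, Long> counts = new HashMap<>();

    public void record(String path) {
        counts.merge(path, 1L, Long::sum);
    }

    public List<String> topN(int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```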
[jira] [Created] (ZOOKEEPER-3427) Introduce SnapshotComparer that assists debugging with snapshots.
Michael Han created ZOOKEEPER-3427: -- Summary: Introduce SnapshotComparer that assists debugging with snapshots. Key: ZOOKEEPER-3427 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3427 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han SnapshotComparer is a tool that loads and compares two snapshots, with a configurable threshold and various filters. It's useful in use cases that involve snapshot analysis, such as offline data consistency checking and data trend analysis (e.g. what's growing under which zNode path, and when). A sample output of the tool (actual numbers removed due to sensitivity): {code:java} Successfully parsed options! Deserialized snapshot in snapshot.0 in seconds Processed data tree in seconds Deserialized snapshot in snapshot.1 in seconds Processed data tree in seconds Node count: Total size: Max depth: Count of nodes at depth 1: Count of nodes at depth 2: Count of nodes at depth 3: Count of nodes at depth 4: Count of nodes at depth 5: Count of nodes at depth 6: Count of nodes at depth 7: Count of nodes at depth 8: Count of nodes at depth 9: Count of nodes at depth 10: Count of nodes at depth 11: Node count: Total size: Max depth: Count of nodes at depth 1: Count of nodes at depth 2: Count of nodes at depth 3: Count of nodes at depth 4: Count of nodes at depth 5: Count of nodes at depth 6: Count of nodes at depth 7: Count of nodes at depth 8: Count of nodes at depth 9: Count of nodes at depth 10: Count of nodes at depth 11: Analysis for depth 0 Analysis for depth 1 Analysis for depth 2 Analysis for depth 3 Analysis for depth 4 Analysis for depth 5 Analysis for depth 6 Analysis for depth 7 Analysis for depth 8 Analysis for depth 9 Analysis for depth 10 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3419) Backup and recovery support
Michael Han created ZOOKEEPER-3419: -- Summary: Backup and recovery support Key: ZOOKEEPER-3419 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3419 Project: ZooKeeper Issue Type: New Feature Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han Historically ZooKeeper has had no intrinsic support for backup and restore. The usual approach is customized scripts that copy data around, or 3rd party tools (exhibitor, etc), which introduce operational burden. This Jira introduces another option: direct support for backup and restore in ZooKeeper itself. It's completely built into ZooKeeper, supports point in time recovery of an entire tree after an oops event, supports recovering a partial tree for test/dev purposes, and can help replay history for bug investigation. It will try to provide a generic interface so backups can be directed to different data storage systems (S3, Kafka, HDFS, etc). This same system has been in production at Twitter for X years and has proved quite helpful for the various use cases mentioned earlier. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3418) Improve quorum throughput through eager ACL checks of requests on local servers
Michael Han created ZOOKEEPER-3418: -- Summary: Improve quorum throughput through eager ACL checks of requests on local servers Key: ZOOKEEPER-3418 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3418 Project: ZooKeeper Issue Type: Improvement Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han Serving write requests that change the state of the system requires quorum operations, and in some cases the quorum operations can be avoided if the requests are doomed to fail. ACL check failure is such a case. To optimize for it, we hoist the ACL check logic and perform an eager ACL check on the local server (where the request is received) and fail fast, before sending the request to the leader. As with any feature, a feature flag can turn it on or off (default off). The feature is also forward compatible: any new op code (and some existing op codes we did not explicitly check against) will pass the check and (potentially) fail on the leader side, instead of being prematurely filtered out on the local server. The end result is better throughput and stability of the quorum for certain workloads. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
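The fail-fast decision, including the forward-compatibility rule for unknown op codes, can be sketched as follows (hypothetical names; the real change lives in the server's request processor chain):

```java
import java.util.Set;

// Sketch of the eager check: on the local server, reject a request
// whose ACL check is doomed to fail before forwarding it to the leader.
// Op codes we don't know how to check locally are forwarded untouched,
// so the leader remains the authority for them (forward compatibility).
class EagerAclCheckSketch {
    public enum Verdict { FAIL_FAST, FORWARD }

    private final Set<Integer> locallyCheckedOps; // op codes checked on the local server

    public EagerAclCheckSketch(Set<Integer> locallyCheckedOps) {
        this.locallyCheckedOps = locallyCheckedOps;
    }

    public Verdict check(int opCode, boolean aclAllows) {
        if (!locallyCheckedOps.contains(opCode)) {
            return Verdict.FORWARD; // unknown op: let the leader decide
        }
        return aclAllows ? Verdict.FORWARD : Verdict.FAIL_FAST;
    }
}
```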
[jira] [Created] (ZOOKEEPER-3416) Remove redundant ServerCnxnFactoryAccessor
Michael Han created ZOOKEEPER-3416: -- Summary: Remove redundant ServerCnxnFactoryAccessor Key: ZOOKEEPER-3416 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3416 Project: ZooKeeper Issue Type: Improvement Components: tests Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han We have two ways to access the private zkServer inside ServerCnxnFactory, and there is really no need to keep maintaining both. We could have removed ServerCnxnFactoryAccessor when we added the public accessor for ServerCnxnFactory in ZOOKEEPER-1346, but we did not. The solution is to consolidate all access to the zkServer through the public accessor of ServerCnxnFactory. The end result is a cleaner code base and less confusion. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-1000) Provide SSL in zookeeper to be able to run cross colos.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-1000. Resolution: Duplicate > Provide SSL in zookeeper to be able to run cross colos. > --- > > Key: ZOOKEEPER-1000 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1000 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar >Priority: Major > Fix For: 3.6.0, 3.5.6 > > > This jira is to track SSL for zookeeper. The inter zookeeper server > communication and the client to server communication should be over ssl so > that zookeeper can be deployed over WAN's. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-1000) Provide SSL in zookeeper to be able to run cross colos.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16845426#comment-16845426 ] Michael Han commented on ZOOKEEPER-1000: Agreed, this sounds like a dup we can close for now. If one day we find the current plain socket based solution is not good enough feature / performance wise, we can revisit this issue which is based on SSL on top of Netty. > Provide SSL in zookeeper to be able to run cross colos. > --- > > Key: ZOOKEEPER-1000 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1000 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Mahadev konar >Assignee: Mahadev konar >Priority: Major > Fix For: 3.6.0, 3.5.6 > > > This jira is to track SSL for zookeeper. The inter zookeeper server > communication and the client to server communication should be over ssl so > that zookeeper can be deployed over WAN's. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3399) Remove logging in getGlobalOutstandingLimit for optimal performance.
Michael Han created ZOOKEEPER-3399: -- Summary: Remove logging in getGlobalOutstandingLimit for optimal performance. Key: ZOOKEEPER-3399 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3399 Project: ZooKeeper Issue Type: Bug Components: server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han Recently we moved some of our production clusters to the top of trunk. One issue we found is a performance regression in read and write latency on clusters where the quorum is also serving traffic. The average read latency increased by 50x, and p99 read latency increased by 300x. The root cause is a log statement introduced in ZOOKEEPER-3177 (PR711), where we added a LOG.info statement in getGlobalOutstandingLimit. getGlobalOutstandingLimit is on the critical code path for request processing, and it is called twice per request (once when processing the packet, once when finalizing the request response). This not only degrades server performance but also bloats the log file when the QPS of a server is high. This only impacts clusters where the quorum (leader + followers) serves traffic; for clusters where only observers serve traffic, no impact is observed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
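The general pattern behind the fix above can be sketched as follows (illustrative only, not the actual ZooKeeper code): resolve and, if needed, log the configured value once, so the per-request hot path does no logging or parsing at all.

```java
// Sketch: getGlobalOutstandingLimit is called twice per request, so any
// per-call work (logging, string formatting, config parsing) multiplies
// with QPS. Resolve the value once at construction and keep the hot
// path trivial.
class OutstandingLimitSketch {
    private final int limit; // resolved once, at construction

    public OutstandingLimitSketch(String configured, int defaultLimit) {
        int parsed = defaultLimit;
        try {
            if (configured != null) parsed = Integer.parseInt(configured);
        } catch (NumberFormatException ignored) {
            // fall back silently here; log the bad config once elsewhere,
            // never on the per-request path
        }
        this.limit = parsed;
    }

    public int getGlobalOutstandingLimit() {
        return limit; // no logging, no parsing on the hot path
    }
}
```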
[jira] [Commented] (ZOOKEEPER-3352) Use LevelDB For Backend
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814023#comment-16814023 ] Michael Han commented on ZOOKEEPER-3352: I don't see an obvious gain from using an LSM tree backend just for the snapshot and txn log. For zk clients, the read path is served directly from memory (think of the zk data tree as a 'memtable' that never flushes); the write path is already sequential for both snapshot and txn log. Reading the snapshot and txn log out of an LSM tree instead of flat files might reduce recovery time, but I doubt the difference is substantial. That said, having an LSM tree backend and building the zk data tree on top of it would make it possible to store a much larger data set per node, as we would no longer store all data in memory. > Use LevelDB For Backend > --- > > Key: ZOOKEEPER-3352 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3352 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: David Mollitor >Assignee: David Mollitor >Priority: Major > Fix For: 4.0.0 > > > Use LevelDB for managing data stored in ZK (transaction logs and snapshots). > https://stackoverflow.com/questions/6779669/does-leveldb-support-java -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3240) Close socket on Learner shutdown to avoid dangling socket
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16741714#comment-16741714 ] Michael Han commented on ZOOKEEPER-3240: [~nixon] : bq. so the Leader is unable to sense the change in Learner status through the status of the network connection A plausible theory :) The ping packets between the leader and learners are designed to solve this exact problem - detecting liveness of the other side. Basically, for each learner, the leader constantly reads packets out of the socket associated with that learner in the corresponding LearnerHandler thread. This read has a timeout configured on the socket on the leader side, so even if the sockets on both sides are valid but there is no traffic (as in this case, where the learner leaks sockets by not properly closing them after shutting down), the leader's read should eventually time out after the sync limit check. Unless: * The leader's socket read timeout has no effect, so the leader blocks on reading a socket indefinitely because there is no traffic from the learner. * The learner process, after restarting, somehow ended up reusing the old leaked learner socket, so the corresponding LearnerHandler thread can't detect any difference (which is expected). I am not sure how likely this case is in practice. In any case, it seems that our ping mechanism failed to detect the network change here. bq. the learner queue size keeps growing Do you mind elaborating a little on which exact queue this is and what caused it to grow? 
> Close socket on Learner shutdown to avoid dangling socket > - > > Key: ZOOKEEPER-3240 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3240 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Brian Nixon >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > There was a Learner that had two connections to the Leader after that Learner > hit an unexpected exception during flush txn to disk, which will shutdown > previous follower instance and restart a new one. > > {quote}2018-10-26 02:31:35,568 ERROR > [SyncThread:3:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from > thread : SyncThread:3 > java.io.IOException: Input/output error > at java.base/sun.nio.ch.FileDispatcherImpl.force0(Native Method) > at > java.base/sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:72) > at > java.base/sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:395) > at > org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:457) > at > org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:548) > at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:769) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:246) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:172) > 2018-10-26 02:31:35,568 INFO [SyncThread:3:ZooKeeperServerListenerImpl@42] - > Thread SyncThread:3 exits, error code 1 > 2018-10-26 02:31:35,568 INFO [SyncThread:3:SyncRequestProcessor@234] - > SyncRequestProcessor exited!{quote} > > It is supposed to close the previous socket, but it doesn't seem to be done > anywhere in the code. This leaves the socket open with no one reading from > it, and caused the queue full and blocked on sender. 
> > Since the LearnerHandler didn't shutdown gracefully, the learner queue size > keeps growing, the JVM heap size on leader keeps growing and added pressure > to the GC, and cause high GC time and latency in the quorum. > > The simple fix is to gracefully shutdown the socket. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
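The leader-side expectation discussed in the comment above - that a blocking read on a socket with SO_TIMEOUT set throws SocketTimeoutException when the peer goes silent, killing the LearnerHandler thread - can be demonstrated in isolation (a generic loopback sketch, not ZooKeeper code):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Sketch: open a loopback connection, set a read timeout on the
// accepted side, and block on read while the client stays silent.
// SO_TIMEOUT fires even though both sockets remain perfectly valid --
// the mechanism a LearnerHandler relies on to detect a dead learner.
class SoTimeoutDemo {
    public static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0); // ephemeral port
             Socket client = new Socket("127.0.0.1", server.getLocalPort());
             Socket accepted = server.accept()) {
            accepted.setSoTimeout(timeoutMillis);
            try {
                accepted.getInputStream().read(); // client never writes...
                return false;
            } catch (SocketTimeoutException e) {
                return true; // ...so the read times out
            }
        } catch (IOException e) {
            return false;
        }
    }
}
```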
[jira] [Commented] (ZOOKEEPER-3240) Close socket on Learner shutdown to avoid dangling socket
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16740161#comment-16740161 ] Michael Han commented on ZOOKEEPER-3240: [~nixon] Good catch, the fix looks reasonable. I've seen a similar issue in my production environment; the fix I made was on the leader side, where I tracked the LearnerHandler threads associated with server ids and made sure each server id has only a single LearnerHandler thread. This also works in cases where the learners don't get a chance to close their sockets, or they did but for some reason the TCP reset never made it to the leader. But in any case, it's good to fix the resource leak on the learner side. I also wonder how we could get into such a case on the leader side in the first place. On the leader, we do set a socket read timeout via setSoTimeout for learner handler threads (after the socket is created via serverSocket.accept), and each learner handler constantly polls / tries to read from the socket afterwards. If a learner dies but leaves a valid socket open, I would expect that on the leader side the LearnerHandler thread trying to read from that dead learner's socket will eventually time out, throw SocketTimeoutException, and cause the LearnerHandler thread on the leader to kill itself. That does not seem to be what I observed. Do you have any insights on this? > Close socket on Learner shutdown to avoid dangling socket > - > > Key: ZOOKEEPER-3240 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3240 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Brian Nixon >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > There was a Learner that had two connections to the Leader after that Learner > hit an unexpected exception during flush txn to disk, which will shutdown > previous follower instance and restart a new one. 
> > {quote}2018-10-26 02:31:35,568 ERROR > [SyncThread:3:ZooKeeperCriticalThread@48] - Severe unrecoverable error, from > thread : SyncThread:3 > java.io.IOException: Input/output error > at java.base/sun.nio.ch.FileDispatcherImpl.force0(Native Method) > at > java.base/sun.nio.ch.FileDispatcherImpl.force(FileDispatcherImpl.java:72) > at > java.base/sun.nio.ch.FileChannelImpl.force(FileChannelImpl.java:395) > at > org.apache.zookeeper.server.persistence.FileTxnLog.commit(FileTxnLog.java:457) > at > org.apache.zookeeper.server.persistence.FileTxnSnapLog.commit(FileTxnSnapLog.java:548) > at org.apache.zookeeper.server.ZKDatabase.commit(ZKDatabase.java:769) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:246) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:172) > 2018-10-26 02:31:35,568 INFO [SyncThread:3:ZooKeeperServerListenerImpl@42] - > Thread SyncThread:3 exits, error code 1 > 2018-10-26 02:31:35,568 INFO [SyncThread:3:SyncRequestProcessor@234] - > SyncRequestProcessor exited!{quote} > > It is supposed to close the previous socket, but it doesn't seem to be done > anywhere in the code. This leaves the socket open with no one reading from > it, and caused the queue full and blocked on sender. > > Since the LearnerHandler didn't shutdown gracefully, the learner queue size > keeps growing, the JVM heap size on leader keeps growing and added pressure > to the GC, and cause high GC time and latency in the quorum. > > The simple fix is to gracefully shutdown the socket. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
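The leader-side mitigation described in the comment above (at most one LearnerHandler per server id) can be sketched as a small registry; all names here are hypothetical, not actual ZooKeeper classes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: track the active handler per learner server id. Registering a
// new handler for a sid returns the stale one, which the caller can then
// shut down -- covering the case where the old learner socket was never
// closed and the TCP reset never reached the leader.
class LearnerHandlerRegistrySketch<H> {
    private final Map<Long, H> active = new ConcurrentHashMap<>();

    /** Returns the previous handler for this sid (to be shut down), or null. */
    public H register(long sid, H handler) {
        return active.put(sid, handler);
    }

    /** Removes the mapping only if this handler is still the active one. */
    public boolean unregister(long sid, H handler) {
        return active.remove(sid, handler);
    }
}
```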
[jira] [Commented] (ZOOKEEPER-3180) Add response cache to improve the throughput of read heavy traffic
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719752#comment-16719752 ] Michael Han commented on ZOOKEEPER-3180: My experience with JVM GC and ZooKeeper is that GC is rarely a real issue in production if tuned correctly (I ran a fairly large ZK fleet that pushed ZK close to its limits). Most GC issues I had were software bugs - such as leaked connections. For this cache case, the current implementation is good enough for my use case, though I am interested in off-heap solutions as well. My concern with an off-heap solution is that it is probably more complicated and carries the overhead of serialization / deserialization between heap and off-heap memory. I'd say we get this patch landed, have more people test it out, then improve it with more options. As for caching in general, it obviously depends a lot on the workload and actual use case, so it's hard to provide a cache solution that works for everyone in the first place... > Add response cache to improve the throughput of read heavy traffic > --- > > Key: ZOOKEEPER-3180 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3180 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Brian Nixon >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 3h 50m > Remaining Estimate: 0h > > On read heavy use case with large response data size, the serialization of > response takes time and added overhead to the GC. > Add response cache helps improving the throughput we can support, which also > reduces the latency in general. > This Jira is going to implement a LRU cache for the response, which shows > some performance gain on some of our production ensembles. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
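For reference, the LRU response cache the ticket describes can be sketched with a LinkedHashMap in access order. This is not the implementation from the patch — the class and method names below are illustrative — just a minimal sketch of the caching idea (cache the serialized response bytes keyed by path, evict least-recently-used):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ResponseCacheSketch {
    private final Map<String, byte[]> cache;

    public ResponseCacheSketch(final int maxEntries) {
        // accessOrder=true makes iteration order least-recently-accessed first,
        // so removeEldestEntry implements LRU eviction.
        this.cache = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    public synchronized byte[] get(String path) { return cache.get(path); }

    public synchronized void put(String path, byte[] serialized) { cache.put(path, serialized); }

    public synchronized int size() { return cache.size(); }
}
```

The gain comes from skipping re-serialization of hot read responses; the synchronized accessors here are the simplest correctness choice, not a statement about how the real patch handles concurrency.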
[jira] [Commented] (ZOOKEEPER-3211) zookeeper standalone mode,found a high level bug in kernel of centos7.0 ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT ,this lead to zk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719736#comment-16719736 ] Michael Han commented on ZOOKEEPER-3211: {quote}Have similar defects been solved in 3.4.13? {quote} There were previous reports about CLOSE_WAIT, but if I remember correctly, most of those cases ended with no action taken because they were hard to reproduce. {quote}It looks like zk Server is deadlocked {quote} The thread dump in the 1.log file indicates some threads are blocked, but that seems to be a symptom rather than the cause. If we run out of available sockets, then ZooKeeper threads that do file I/O or socket I/O will block. {quote}Does this cause CLOSE_WAIT for zk? {quote} Most of the time, long-lived CLOSE_WAIT connections indicate an application-side bug rather than a kernel bug: the connection should be closed, but for some reason the application, after receiving the client's TCP FIN, never closes its end - which effectively leaks connections. The kernel upgrade could be a trigger, though. I am interested to know whether anyone else can reproduce this; I currently don't have an environment to reproduce it. Also, [~yss], can you please use a zip file instead of a rar file when uploading log files? Another thing to try is increasing your open file descriptor limit - it seems it's currently set to 60? If you increase it (ulimit), you may still end up leaking connections, but the server should stay available longer before running out of sockets. 
> zookeeper standalone mode,found a high level bug in kernel of centos7.0 > ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT > ,this lead to zk can't work for client any more > -- > > Key: ZOOKEEPER-3211 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3211 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.5 > Environment: 1.zoo.cfg > server.1=127.0.0.1:2902:2903 > 2.kernel > kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 > 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux > JDK: > java version "1.7.0_181" > OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00) > OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode) > zk: 3.4.5 >Reporter: yeshuangshuang >Priority: Blocker > Fix For: 3.4.5 > > Attachments: 1.log, zklog.rar > > Original Estimate: 168h > Remaining Estimate: 168h > > 1.config--zoo.cfg > server.1=127.0.0.1:2902:2903 > 2.kernel version > version:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 > 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux > JDK: > java version "1.7.0_181" > OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00) > OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode) > zk: 3.4.5 > 3.bug details: > Occasionally,But the recurrence probability is extremely high. At first, the > read-write timeout takes about 6s, and after a few minutes, all connections > (including long ones) will be CLOSE_WAIT state. > 4.:Circumvention scheme: it is found that all connections become close_wait > to restart the zookeeper server side actively -- This message was sent by Atlassian JIRA (v7.6.3#76005)
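The application-side nature of CLOSE_WAIT described above can be demonstrated in a few lines: once the peer closes (sends its FIN), a blocking read returns -1, and from that moment until the application calls close() itself, its end of the connection sits in CLOSE_WAIT. A minimal, self-contained sketch (names are illustrative, not ZooKeeper code):

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class CloseWaitSketch {
    // Returns what read() yields after the peer closes: -1 (EOF) means we have
    // received the peer's FIN. Until we call close() on our side, the kernel
    // keeps our end of the connection in the CLOSE_WAIT state - an application
    // that ignores the -1 and never closes is exactly the leak discussed above.
    public static int readAfterPeerClose() throws IOException {
        try (ServerSocket server = new ServerSocket(0)) {
            Socket client = new Socket("127.0.0.1", server.getLocalPort());
            try (Socket accepted = server.accept()) {
                client.close();                          // peer sends FIN
                return accepted.getInputStream().read(); // -1: close now, or leak CLOSE_WAIT
            }
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readAfterPeerClose()); // prints -1
    }
}
```

On a live system, `ss -tan state close-wait` (or `netstat -tan | grep CLOSE_WAIT`) shows which process is sitting on such half-closed connections.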
[jira] [Commented] (ZOOKEEPER-3211) zookeeper standalone mode,found a high level bug in kernel of centos7.0 ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT ,this lead to zk
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719680#comment-16719680 ] Michael Han commented on ZOOKEEPER-3211: [~yss] Have you tried a newer stable zookeeper release (e.g. 3.4.13), as well as different OS versions? 3.4.5 is a pretty old version. > zookeeper standalone mode,found a high level bug in kernel of centos7.0 > ,zookeeper Server's tcp/ip socket connections(default 60 ) are CLOSE_WAIT > ,this lead to zk can't work for client any more > -- > > Key: ZOOKEEPER-3211 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3211 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.4.5 > Environment: 1.zoo.cfg > server.1=127.0.0.1:2902:2903 > 2.kernel > kernel:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 > 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux > JDK: > java version "1.7.0_181" > OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00) > OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode) > zk: 3.4.5 >Reporter: yeshuangshuang >Priority: Blocker > Fix For: 3.4.5 > > Attachments: 1.log, zklog.rar > > Original Estimate: 168h > Remaining Estimate: 168h > > 1.config--zoo.cfg > server.1=127.0.0.1:2902:2903 > 2.kernel version > version:Linux localhost.localdomain 3.10.0-123.el7.x86_64 #1 SMP Tue Feb 12 > 19:44:50 EST 2019 x86_64 x86_64 x86_64 GNU/Linux > JDK: > java version "1.7.0_181" > OpenJDK Runtime Environment (rhel-2.6.14.5.el7-x86_64 u181-b00) > OpenJDK 64-Bit Server VM (build 24.181-b00, mixed mode) > zk: 3.4.5 > 3.bug details: > Occasionally,But the recurrence probability is extremely high. At first, the > read-write timeout takes about 6s, and after a few minutes, all connections > (including long ones) will be CLOSE_WAIT state. 
> 4.:Circumvention scheme: it is found that all connections become close_wait > to restart the zookeeper server side actively -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3214) Flaky test: org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16719671#comment-16719671 ] Michael Han commented on ZOOKEEPER-3214: Thanks for reporting the flaky test. It's important to keep an eye on flaky tests, which are an important signal of quality. This specific issue was reported before, so I am resolving this Jira and moving the discussion to the original Jira. > Flaky test: > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter > - > > Key: ZOOKEEPER-3214 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3214 > Project: ZooKeeper > Issue Type: Bug > Components: tests >Reporter: maoling >Priority: Minor > > more details in: > https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2901/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reopened ZOOKEEPER-3141: Reopening this issue because a similar symptom was recently observed for this test, as reported in ZOOKEEPER-3214 [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2901/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter/]. > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Sub-task > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. > See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3214) Flaky test: org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3214?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3214. Resolution: Duplicate > Flaky test: > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter > - > > Key: ZOOKEEPER-3214 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3214 > Project: ZooKeeper > Issue Type: Bug > Components: tests >Reporter: maoling >Priority: Minor > > more details in: > https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2901/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter/ -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3188) Improve resilience to network
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16712162#comment-16712162 ] Michael Han commented on ZOOKEEPER-3188: I appreciate the detailed reply, and I agree with the responses on 1 and 2. bq. Such changes should be handled exactly the way they are now and there should be no interactions with the changes to the networking stack. Agreed. I think I was just looking for more elaborate use cases around using reconfig to manipulate multiple server addresses, as the proposal does not go into details beyond 'support dynamic reconfiguration'. I expect dynamic reconfiguration will work out of the box with proper abstractions, without touching much of the reconfiguration code path, but there are some subtleties to consider. A couple of examples: * Properly rebalancing client connections - this was discussed on the dev mailing list. * Avoiding unnecessary leader elections during reconfig - this change will probably alter the abstraction of server addresses (QuorumServer), and we should be careful about how QuorumServers are compared, to avoid unnecessary leader elections in cases where the server set is the same but some servers have new addresses. There might be more cases to consider... bq. The documentation, in particular, should be essentially identical except that an example of adding an address might be nice I am thinking at least [this|https://zookeeper.apache.org/doc/r3.5.4-beta/zookeeperReconfig.html#sc_reconfig_clientport] should be updated to reflect the facts that 1. the config format has changed and 2. multiple server addresses can be manipulated via reconfig. > Improve resilience to network > - > > Key: ZOOKEEPER-3188 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3188 > Project: ZooKeeper > Issue Type: Bug >Reporter: Ted Dunning >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > We propose to add network level resiliency to Zookeeper. 
The ideas that we > have on the topic have been discussed on the mailing list and via a > specification document that is located at > [https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing] > That document is copied to this issue which is being created to report the > results of experimental implementations. > h1. Zookeeper Network Resilience > h2. Background > Zookeeper is designed to help in building distributed systems. It provides a > variety of operations for doing this and all of these operations have rather > strict guarantees on semantics. Zookeeper itself is a distributed system made > up of cluster containing a leader and a number of followers. The leader is > designated in a process known as leader election in which a majority of all > nodes in the cluster must agree on a leader. All subsequent operations are > initiated by the leader and completed when a majority of nodes have confirmed > the operation. Whenever an operation cannot be confirmed by a majority or > whenever the leader goes missing for a time, a new leader election is > conducted and normal operations proceed once a new leader is confirmed. > > The details of this are not important relative to this discussion. What is > important is that the semantics of the operations conducted by a Zookeeper > cluster and the semantics of how client processes communicate with the > cluster depend only on the basic fact that messages sent over TCP connections > will never appear out of order or missing. Central to the design of ZK is > that a server to server network connection is used as long as it works to use > it and a new connection is made when it appears that the old connection isn't > working. > > As currently implemented, however, each member of a Zookeeper cluster can > have only a single address as viewed from some other process. 
This means, > absent network link bonding, that the loss of a single switch or a few > network connections could completely stop the operations of a the Zookeeper > cluster. It is the goal of this work to address this issue by allowing each > server to listen on multiple network interfaces and to connect to other > servers any of several addresses. The effect will be to allow servers to > communicate over redundant network paths to improve resiliency to network > failures without changing any core algorithms. > h2. Proposed Change > Interestingly, the correct operations of a Zookeeper cluster do not depend on > _how_ a TCP connection was made. There is no reason at all not to advertise > multiple addresses for members of a Zookeeper cluster. > > Connections between members of a Zookeeper cluster and between a client and a > cluster member are established by referencing a
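The "same server set, new addresses" concern raised in the comment above can be sketched as a comparison of membership by server id rather than by full address list. This is purely illustrative — not ZooKeeper's QuorumServer/QuorumVerifier code; the class, method, and map shape are hypothetical — but it shows the property a careful comparison would need so that adding a redundant address to an existing server does not look like a membership change and trigger a needless leader election:

```java
import java.util.Map;
import java.util.Set;

public class MembershipCompareSketch {
    // Membership is keyed by server id; the value is that server's set of
    // advertised addresses. Two configs with the same id set describe the same
    // ensemble even if individual servers gained or lost redundant addresses.
    public static boolean sameMembership(Map<Long, Set<String>> a,
                                         Map<Long, Set<String>> b) {
        return a.keySet().equals(b.keySet());
    }
}
```

A real implementation would of course still have to propagate the address changes themselves; the point is only which differences should count as a quorum membership change.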
[jira] [Commented] (ZOOKEEPER-2778) Potential server deadlock between follower sync with leader and follower receiving external connection requests.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16693936#comment-16693936 ] Michael Han commented on ZOOKEEPER-2778: I had to refresh my memory on this issue, but looking back I think the gist is: we want to guarantee a consistent membership view of the ensemble, so we need to hold a lock (QV_LOCK) on the quorum peer whenever we access (read/write) it. Meanwhile, we need another lock on the QCM itself, and the order of acquiring the two locks differs between code paths, causing the deadlock. The fix I proposed earlier, and what PR 707 did, was to remove the QV lock on the read path. The problem is that I am not sure how to validate its correctness given the intertwined code paths :) - while working on this previously I was convinced that removing QV_LOCK on the read path of the three addresses was sound; now I am not sure. We could also try removing one or both synchronized keywords on connectOne of QCM - that seems OK at least for the first connectOne (the two-parameter one), and it should fix this specific deadlock. Another idea is to avoid the QV lock completely by abstracting the quorum verifier as an AtomicReference, similar to what 707 did for the three address fields, if that is feasible. > Potential server deadlock between follower sync with leader and follower > receiving external connection requests. > > > Key: ZOOKEEPER-2778 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2778 > Project: ZooKeeper > Issue Type: Bug > Components: quorum >Affects Versions: 3.5.3 >Reporter: Michael Han >Priority: Blocker > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > It's possible to have a deadlock during recovery phase. > Found this issue by analyzing thread dumps of "flaky" ReconfigRecoveryTest > [1]. . 
Here is a sample thread dump that illustrates the state of the > execution: > {noformat} > [junit] java.lang.Thread.State: BLOCKED > [junit] at > org.apache.zookeeper.server.quorum.QuorumPeer.getElectionAddress(QuorumPeer.java:686) > [junit] at > org.apache.zookeeper.server.quorum.QuorumCnxManager.initiateConnection(QuorumCnxManager.java:265) > [junit] at > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:445) > [junit] at > org.apache.zookeeper.server.quorum.QuorumCnxManager.receiveConnection(QuorumCnxManager.java:369) > [junit] at > org.apache.zookeeper.server.quorum.QuorumCnxManager$Listener.run(QuorumCnxManager.java:642) > [junit] > [junit] java.lang.Thread.State: BLOCKED > [junit] at > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:472) > [junit] at > org.apache.zookeeper.server.quorum.QuorumPeer.connectNewPeers(QuorumPeer.java:1438) > [junit] at > org.apache.zookeeper.server.quorum.QuorumPeer.setLastSeenQuorumVerifier(QuorumPeer.java:1471) > [junit] at > org.apache.zookeeper.server.quorum.Learner.syncWithLeader(Learner.java:520) > [junit] at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:88) > [junit] at > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:1133) > {noformat} > The dead lock happens between the quorum peer thread which running the > follower that doing sync with leader work, and the listener of the qcm of the > same quorum peer that doing the receiving connection work. Basically to > finish sync with leader, the follower needs to synchronize on both QV_LOCK > and the qmc object it owns; while in the receiver thread to finish setup an > incoming connection the thread needs to synchronize on both the qcm object > the quorum peer owns, and the same QV_LOCK. 
It's easy to see the problem here > is the order of acquiring two locks are different, thus depends on timing / > actual execution order, two threads might end up acquiring one lock while > holding another. > [1] > org.apache.zookeeper.server.quorum.ReconfigRecoveryTest.testCurrentServersAreObserversInNextConfig -- This message was sent by Atlassian JIRA (v7.6.3#76005)
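The AtomicReference idea mentioned in the comment can be sketched as follows. This is a hypothetical, heavily simplified stand-in for the address fields, not the actual QuorumPeer code: the point is that if readers get the value through an atomic reference instead of a QV_LOCK-style synchronized block, the read path never participates in the lock-ordering cycle between the peer and the connection manager, so that cycle cannot deadlock on reads:

```java
import java.net.InetSocketAddress;
import java.util.concurrent.atomic.AtomicReference;

public class AddressSketch {
    // Published atomically; readers never take a lock. Writers still need
    // whatever coordination the write path requires, but a reader holding the
    // QCM monitor can no longer block waiting for QV_LOCK (or vice versa).
    private final AtomicReference<InetSocketAddress> electionAddress =
            new AtomicReference<>(new InetSocketAddress("host1", 3888));

    public InetSocketAddress getElectionAddress() {
        return electionAddress.get(); // lock-free read
    }

    public void setElectionAddress(InetSocketAddress addr) {
        electionAddress.set(addr); // atomic publish of the new address
    }
}
```

The trade-off, as the comment notes, is proving that a lock-free read of a slightly stale address is acceptable in every caller, which is the hard part.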
[jira] [Resolved] (ZOOKEEPER-3190) Spell check on the Zookeeper server files
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3190. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 702 [https://github.com/apache/zookeeper/pull/702] > Spell check on the Zookeeper server files > - > > Key: ZOOKEEPER-3190 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3190 > Project: ZooKeeper > Issue Type: Improvement > Components: documentation, other >Reporter: Dinesh Appavoo >Priority: Minor > Labels: newbie, pull-request-available > Fix For: 3.6.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > This JIRA is to do spell check on the zookeeper server files [ > zookeeper/zookeeper-server/src/main/java/org/apache/zookeeper/server ]. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3177) Refactor request throttle logic in NIO and Netty to keep the same behavior and make the code easier to maintain
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3177. Resolution: Fixed Issue resolved by pull request 673 [https://github.com/apache/zookeeper/pull/673] > Refactor request throttle logic in NIO and Netty to keep the same behavior > and make the code easier to maintain > --- > > Key: ZOOKEEPER-3177 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3177 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > There is shouldThrottle logic in zkServer, we should use it in NIO as well, > refactor the code to make it cleaner and easier to maintain in the future. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3188) Improve resilience to network
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3188?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685929#comment-16685929 ] Michael Han commented on ZOOKEEPER-3188: A couple of comments on the high level design: * Did we consider the compatibility requirements here? Will the new configuration format be backward compatible? One concrete use case: a customer upgrades to a new version with this multiple-addresses-per-server capability but wants to roll back without rewriting the config files for the older version. * Did we evaluate the impact of this feature on the existing server-to-server mutual authentication and authorization features (e.g. ZOOKEEPER-1045 for Kerberos, ZOOKEEPER-236 for SSL), and also the impact on operations? For example, how would one configure Kerberos principals and/or SSL certs per host given multiple potential IP addresses and/or FQDNs per server? * Could we provide more details on the expected level of support with regard to the dynamic reconfiguration feature? Examples would be great - for instance: we would support adding, removing, or updating server addresses belonging to a given server via dynamic reconfiguration, along with the expected behavior in each case. For example, adding a new address to an existing ensemble member should not cause any disconnect / reconnect, but removing an in-use address of a server should cause a disconnect. The dynamic reconfig API / CLI / docs will likely need updating because of this. > Improve resilience to network > - > > Key: ZOOKEEPER-3188 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3188 > Project: ZooKeeper > Issue Type: Bug >Reporter: Ted Dunning >Priority: Major > > We propose to add network level resiliency to Zookeeper. 
The ideas that we > have on the topic have been discussed on the mailing list and via a > specification document that is located at > [https://docs.google.com/document/d/1iGVwxeHp57qogwfdodCh9b32P2_kOQaJZ2GDo7j36fI/edit?usp=sharing] > That document is copied to this issue which is being created to report the > results of experimental implementations. > h1. Zookeeper Network Resilience > h2. Background > Zookeeper is designed to help in building distributed systems. It provides a > variety of operations for doing this and all of these operations have rather > strict guarantees on semantics. Zookeeper itself is a distributed system made > up of cluster containing a leader and a number of followers. The leader is > designated in a process known as leader election in which a majority of all > nodes in the cluster must agree on a leader. All subsequent operations are > initiated by the leader and completed when a majority of nodes have confirmed > the operation. Whenever an operation cannot be confirmed by a majority or > whenever the leader goes missing for a time, a new leader election is > conducted and normal operations proceed once a new leader is confirmed. > > The details of this are not important relative to this discussion. What is > important is that the semantics of the operations conducted by a Zookeeper > cluster and the semantics of how client processes communicate with the > cluster depend only on the basic fact that messages sent over TCP connections > will never appear out of order or missing. Central to the design of ZK is > that a server to server network connection is used as long as it works to use > it and a new connection is made when it appears that the old connection isn't > working. > > As currently implemented, however, each member of a Zookeeper cluster can > have only a single address as viewed from some other process. 
This means, > absent network link bonding, that the loss of a single switch or a few > network connections could completely stop the operations of a the Zookeeper > cluster. It is the goal of this work to address this issue by allowing each > server to listen on multiple network interfaces and to connect to other > servers any of several addresses. The effect will be to allow servers to > communicate over redundant network paths to improve resiliency to network > failures without changing any core algorithms. > h2. Proposed Change > Interestingly, the correct operations of a Zookeeper cluster do not depend on > _how_ a TCP connection was made. There is no reason at all not to advertise > multiple addresses for members of a Zookeeper cluster. > > Connections between members of a Zookeeper cluster and between a client and a > cluster member are established by referencing a configuration file (for > cluster members) that specifies the address of all of the nodes in a cluster > or by using a connection string containing possible addresses of Zookeeper > cluster members. As soon as a connection is made, any desired authentication > or encryption
[jira] [Commented] (ZOOKEEPER-1441) Some test cases are failing because Port bind issue.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682124#comment-16682124 ] Michael Han commented on ZOOKEEPER-1441: PortAssignment itself is fine, and if everyone uses it, they should not get conflicts, because PortAssignment is the single source of truth for port allocation. However, the problem here is that not every process running on the test machine uses PortAssignment, even though most, if not all, ZK unit tests do. So if heavy workloads run on the test machine while ZK unit tests are running, port conflicts can occur. >> I never actually got why PortAssigment tries to bind the port before returns What PortAssignment implements is a "reserve and release" pattern for port allocation, which is better than a "choose a port but don't reserve it" approach, because it is very unlikely that the OS, regardless of how it allocates ports to processes, will yield the same port for two consecutive socket bind calls. Thus, by creating the socket via bind and then immediately closing it, we buy some time during which the OS will not reuse the same port for a subsequent socket call. This time varies, however, so there can be a race: by the time we actually bind this port again, it may already have been grabbed by another process. The ZK server requires an unbound port number to be passed to it (otherwise it can't bind the port), but due to the same race it's possible that by the time the server tries to bind, the port is already taken. The only way to guarantee atomicity here is to have the ZK server ask the OS for a port and bind it immediately. > Some test cases are failing because Port bind issue. 
> > > Key: ZOOKEEPER-1441 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441 > Project: ZooKeeper > Issue Type: Test > Components: server, tests >Reporter: kavita sharma >Assignee: Michael Han >Priority: Major > Labels: flaky, flaky-test > > very frequently testcases are failing because of : > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind(Native Method) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111) > at > org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112) > at > org.apache.zookeeper.server.quorum.QuorumPeer.(QuorumPeer.java:514) > at > org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156) > at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103) > at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67) > may be because of Port Assignment so please give me some suggestions if > someone is also facing same problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
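The "reserve and release" pattern described in the comment above boils down to binding an ephemeral port and immediately releasing it. A simplified, self-contained sketch (not the actual PortAssignment code; the class name is illustrative):

```java
import java.io.IOException;
import java.net.ServerSocket;

public class PortReserveSketch {
    // "Reserve and release": bind port 0 so the OS picks a free ephemeral port,
    // note the number, then close the socket and hand the number to the test.
    // The OS is unlikely to reissue it immediately, but nothing stops another
    // process from grabbing it before the ZK server binds it - which is the
    // residual race the comment describes.
    public static int reservePort() throws IOException {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort(); // socket is closed on exit, releasing the port
        }
    }
}
```

The race-free alternative the comment suggests would be for the server itself to bind port 0 and report the port it actually got, collapsing allocation and bind into one atomic step.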
[jira] [Commented] (ZOOKEEPER-1441) Some test cases are failing because Port bind issue.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16682112#comment-16682112 ] Michael Han commented on ZOOKEEPER-1441: PortAssignment itself could also be flakier under Java 11, because it can't guarantee atomicity between the time a port is allocated and the time it is actually bound inside a ZK server. I remember [~lvfangmin] mentioned that at FB they improved PortAssignment by using random ports rather than sequential ones, which might help here. Alternatively, we could let the ZK server atomically allocate and bind a port internally and then return the bound port number to the caller, for testing purposes, rather than requiring a port to be passed in - that would fix the root cause of the issue. > Some test cases are failing because Port bind issue. > > > Key: ZOOKEEPER-1441 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1441 > Project: ZooKeeper > Issue Type: Test > Components: server, tests >Reporter: kavita sharma >Assignee: Michael Han >Priority: Major > Labels: flaky, flaky-test > > very frequently testcases are failing because of : > java.net.BindException: Address already in use > at sun.nio.ch.Net.bind(Native Method) > at > sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) > at > org.apache.zookeeper.server.NIOServerCnxnFactory.configure(NIOServerCnxnFactory.java:111) > at > org.apache.zookeeper.server.ServerCnxnFactory.createFactory(ServerCnxnFactory.java:112) > at > org.apache.zookeeper.server.quorum.QuorumPeer.(QuorumPeer.java:514) > at > org.apache.zookeeper.test.QuorumBase.startServers(QuorumBase.java:156) > at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:103) > at org.apache.zookeeper.test.QuorumBase.setUp(QuorumBase.java:67) > may be because of Port Assignment so please give me some 
suggestions if > someone is also facing same problem. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
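The "atomically allocate and bind" idea from the comment above can be sketched as follows: instead of picking a free port up front (which races against other processes), bind to port 0 and let the OS assign one, then read the actual bound port back. This is only a minimal illustration of the technique, not ZooKeeper's actual PortAssignment code; the class and method names are made up.

```java
import java.io.IOException;
import java.net.ServerSocket;

public class EphemeralPortBind {
    /**
     * Bind a server socket to an OS-assigned ephemeral port.
     * Allocation and binding happen in a single step, so no other
     * process can grab the port between "pick a port" and "bind it".
     */
    public static ServerSocket bindEphemeral() throws IOException {
        // Port 0 asks the OS to choose a currently-free port.
        return new ServerSocket(0);
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket s = bindEphemeral()) {
            // The caller learns the port after the fact, instead of passing one in.
            System.out.println("bound to port " + s.getLocalPort());
        }
    }
}
```

A test would then hand this already-bound socket (or its port) to the server under test, rather than reserving a port and hoping it stays free.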
[jira] [Commented] (ZOOKEEPER-3173) Quorum TLS - support PEM trust/key stores
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664684#comment-16664684 ] Michael Han commented on ZOOKEEPER-3173: [~ilyam] just added your alias into Contributor role group in JIRA. feel free to assign issues to yourself. > Quorum TLS - support PEM trust/key stores > - > > Key: ZOOKEEPER-3173 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3173 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.6.0, 3.5.5 >Reporter: Ilya Maykov >Assignee: Ilya Maykov >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > ZOOKEEPER-236 is landed so there is some TLS support in Zookeeper now, but > only JKS trust stores are supported. JKS is not really used by non-Java > software, where PKCS12 and PEM are more standard. Let's add support for PEM > trust / key stores to make Quorum TLS easier to use. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-3174) Quorum TLS - support reloading trust/key store
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-3174: -- Assignee: Ilya Maykov > Quorum TLS - support reloading trust/key store > -- > > Key: ZOOKEEPER-3174 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3174 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.6.0, 3.5.5 >Reporter: Ilya Maykov >Assignee: Ilya Maykov >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > The Quorum TLS feature recently added in ZOOKEEPER-236 doesn't support > reloading a trust/key store from disk when it changes. In an environment > where short-lived certificates are used and are refreshed by some background > daemon / cron job, this is a problem. Let's support reloading a trust/key > store from disk when the file on disk changes. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-3172) Quorum TLS - fix port unification to allow rolling upgrades
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-3172: -- Assignee: Ilya Maykov > Quorum TLS - fix port unification to allow rolling upgrades > --- > > Key: ZOOKEEPER-3172 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3172 > Project: ZooKeeper > Issue Type: Improvement > Components: security, server >Affects Versions: 3.6.0, 3.5.5 >Reporter: Ilya Maykov >Assignee: Ilya Maykov >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > ZOOKEEPER-236 was committed with port unification support disabled, because > of various issues with the implementation. These issues should be fixed so > port unification can be enabled again. Port unification is necessary to > upgrade an ensemble from plaintext to TLS quorum connections without downtime. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-3173) Quorum TLS - support PEM trust/key stores
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-3173: -- Assignee: Ilya Maykov > Quorum TLS - support PEM trust/key stores > - > > Key: ZOOKEEPER-3173 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3173 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.6.0, 3.5.5 >Reporter: Ilya Maykov >Assignee: Ilya Maykov >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > ZOOKEEPER-236 is landed so there is some TLS support in Zookeeper now, but > only JKS trust stores are supported. JKS is not really used by non-Java > software, where PKCS12 and PEM are more standard. Let's add support for PEM > trust / key stores to make Quorum TLS easier to use. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3180) Add response cache to improve the throughput of read heavy traffic
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16664668#comment-16664668 ] Michael Han commented on ZOOKEEPER-3180: This will be a very useful feature for my prod env as well, where some of our read-heavy workloads require serializing large payloads from (an almost immutable part of) the data tree - in our case it's not the data stored but getChildren calls with tens of thousands of children under a zNode. I'll be glad to review and test the patch in our prod env. > Add response cache to improve the throughput of read heavy traffic > --- > > Key: ZOOKEEPER-3180 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3180 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Priority: Minor > Fix For: 3.6.0 > > > On read heavy use case with large response data size, the serialization of > response takes time and added overhead to the GC. > Add response cache helps improving the throughput we can support, which also > reduces the latency in general. > This Jira is going to implement a LRU cache for the response, which shows > some performance gain on some of our production ensembles. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
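The LRU response cache proposed in ZOOKEEPER-3180 can be illustrated with a small sketch: cache the already-serialized response buffer keyed by path, evicting the least-recently-used entry when full, so repeated reads of the same hot znode skip re-serialization. This is not the patch's actual implementation; the class name, key choice, and sizes below are assumptions for illustration (a real cache would also need to invalidate on writes, e.g. by checking the node's mzxid).

```java
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative LRU cache of serialized responses, keyed by znode path. */
public class ResponseCacheSketch {
    private final int maxEntries;
    private final Map<String, ByteBuffer> cache;

    public ResponseCacheSketch(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true makes LinkedHashMap iterate in LRU order;
        // removeEldestEntry turns it into a bounded LRU cache.
        this.cache = new LinkedHashMap<String, ByteBuffer>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, ByteBuffer> eldest) {
                return size() > ResponseCacheSketch.this.maxEntries;
            }
        };
    }

    /** Store the serialized response bytes for a path. */
    public synchronized void put(String path, ByteBuffer serialized) {
        cache.put(path, serialized);
    }

    /** Return the cached serialized response, or null on a miss. */
    public synchronized ByteBuffer get(String path) {
        return cache.get(path);
    }
}
```

On a hit, the server can write the cached buffer straight to the socket, saving both serialization CPU and the GC pressure of rebuilding the response.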
[jira] [Resolved] (ZOOKEEPER-3163) Use session map to improve the performance when closing session in Netty
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3163. Resolution: Fixed Issue resolved by pull request 665 [https://github.com/apache/zookeeper/pull/665] > Use session map to improve the performance when closing session in Netty > > > Key: ZOOKEEPER-3163 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3163 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > Previously, it needs to go through all the cnxns to find out the session to > close, which is O(N), N is the total connections we have. > This will affect the performance of close session or renew session if there > are lots of connections on this server, this JIRA is going to reuse the > session map code in NIO implementation to improve the performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
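The O(N)-to-O(1) improvement described in ZOOKEEPER-3163 boils down to keeping a session-id-to-connection index alongside the connection list, so closing a session is a single map lookup instead of a scan over every open connection. The sketch below illustrates the idea only; the names are hypothetical and the real change reuses the NIO implementation's session map code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative sessionId -> connection index for O(1) session close. */
public class SessionMapSketch {
    /** Stand-in for a server connection (ServerCnxn in ZooKeeper). */
    interface Cnxn { void close(); }

    private final Map<Long, Cnxn> sessionMap = new ConcurrentHashMap<>();

    /** Record the connection owning a session (on connect / session renew). */
    public void register(long sessionId, Cnxn cnxn) {
        sessionMap.put(sessionId, cnxn);
    }

    /**
     * Close the connection for a session with one map lookup,
     * instead of iterating over all N open connections.
     */
    public boolean closeSession(long sessionId) {
        Cnxn cnxn = sessionMap.remove(sessionId);
        if (cnxn != null) {
            cnxn.close();
            return true;
        }
        return false;
    }
}
```

With many thousands of connections per server, this turns close/renew-session from a scan into constant-time work.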
[jira] [Commented] (ZOOKEEPER-3180) Add response cache to improve the throughput of read heavy traffic
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661452#comment-16661452 ] Michael Han commented on ZOOKEEPER-3180: What will we be caching here? Is it the byte buffers holding the (serialized) response body that are going to be written out to the socket? > Add response cache to improve the throughput of read heavy traffic > --- > > Key: ZOOKEEPER-3180 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3180 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Priority: Minor > Fix For: 3.6.0 > > > On read heavy use case with large response data size, the serialization of > response takes time and added overhead to the GC. > Add response cache helps improving the throughput we can support, which also > reduces the latency in general. > This Jira is going to implement a LRU cache for the response, which shows > some performance gain on some of our production ensembles. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3179) Add snapshot compression to reduce the disk IO
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661432#comment-16661432 ] Michael Han commented on ZOOKEEPER-3179: Good feature. We can also consider providing the option to offload compression / decompression to dedicated hardware - e.g. an FPGA. > Add snapshot compression to reduce the disk IO > -- > > Key: ZOOKEEPER-3179 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3179 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Fangmin Lv >Priority: Major > Fix For: 3.6.0 > > > When the snapshot becomes larger, the periodically snapshot after certain > number of txns will be more expensive. Which will in turn affect the maximum > throughput we can support within SLA, because of the disk contention between > snapshot and txn when they're on the same drive. > > With compression like zstd/snappy/gzip, the actual snapshot size could be > much smaller, the compress ratio depends on the actual data. It might make > the recovery time (loading from disk) faster in some cases, but will take > longer sometimes because of the extra time used to compress/decompress. > > Based on the production traffic, the performance various with different > compress method as well, that's why we provided different implementations, we > can select different compress method for different use cases. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
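The core trade described in ZOOKEEPER-3179 - extra CPU for less disk IO - is just wrapping the snapshot stream in a compressing stream on write and the matching decompressing stream on load. The sketch below uses the JDK's built-in gzip streams only for illustration (the issue also mentions zstd and snappy, which need third-party libraries); these helper names are not ZooKeeper APIs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Illustrative snapshot round-trip through a gzip stream. */
public class SnapshotCompressionSketch {
    /** Compress snapshot bytes before writing: more CPU, less disk IO. */
    public static byte[] compress(byte[] snapshot) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(snapshot);
        }
        return bos.toByteArray();
    }

    /** Decompress on recovery, when loading the database from disk. */
    public static byte[] decompress(byte[] compressed) throws IOException {
        try (GZIPInputStream gz =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }
}
```

As the issue notes, whether this speeds up or slows down recovery depends on the data's compression ratio versus the codec's CPU cost, which is why offering several codecs makes sense.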
[jira] [Resolved] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-2847. Resolution: Fixed Issue resolved by pull request 649 [https://github.com/apache/zookeeper/pull/649] > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-1177) Enabling a large number of watches for a large number of clients
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-1177. Resolution: Fixed Fix Version/s: (was: 3.5.5) Issue resolved by pull request 590 [https://github.com/apache/zookeeper/pull/590] > Enabling a large number of watches for a large number of clients > > > Key: ZOOKEEPER-1177 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1177 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.3.3 >Reporter: Vishal Kathuria >Assignee: Fangmin Lv >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Attachments: ZOOKEEPER-1177.patch, ZOOKEEPER-1177.patch, > ZooKeeper-with-fix-for-findbugs-warning.patch, ZooKeeper.patch, > Zookeeper-after-resolving-merge-conflicts.patch > > Time Spent: 13.5h > Remaining Estimate: 0h > > In my ZooKeeper, I see watch manager consuming several GB of memory and I dug > a bit deeper. > In the scenario I am testing, I have 10K clients connected to an observer. > There are about 20K znodes in ZooKeeper, each is about 1K - so about 20M data > in total. > Each client fetches and puts watches on all the znodes. That is 200 million > watches. > It seems a single watch takes about 100 bytes. I am currently at 14528037 > watches and according to the yourkit profiler, WatchManager has 1.2 G > already. This is not going to work as it might end up needing 20G of RAM just > for the watches. > So we need a more compact way of storing watches. Here are the possible > solutions. > 1. Use a bitmap instead of the current hashmap. In this approach, each znode > would get a unique id when its gets created. For every session, we can keep > track of a bitmap that indicates the set of znodes this session is watching. > A bitmap, assuming a 100K znodes, would be 12K. For 10K sessions, we can keep > track of watches using 120M instead of 20G. > 2. 
This second idea is based on the observation that clients watch znodes in > sets (for example all znodes under a folder). Multiple clients watch the same > set and the total number of sets is a couple of orders of magnitude smaller > than the total number of znodes. In my scenario, there are about 100 sets. So > instead of keeping track of watches at the znode level, keep track of it at > the set level. It may mean that get may also need to be implemented at the > set level. With this, we can save the watches in 100M. > Are there any other suggestions of solutions? > Thanks > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
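Solution 1 above (a per-session bitmap over znode ids) can be sketched with the JDK's BitSet. The arithmetic in the issue checks out: 100K znodes at one bit each is about 12.5 KB per session, versus roughly 100 bytes per watch in a hashmap. The class below is only an illustration of that idea, not ZooKeeper's watch manager; it assumes each znode has been given a small integer id at creation time.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Illustrative per-session watch bitmap: one bit per znode id. */
public class WatchBitmapSketch {
    // For ~100,000 znodes each BitSet is ~12.5 KB, so 10,000 sessions
    // need ~125 MB - versus tens of GB at ~100 bytes per hashmap watch.
    private final Map<Long, BitSet> watchesBySession = new HashMap<>();

    public void addWatch(long sessionId, int znodeId) {
        watchesBySession.computeIfAbsent(sessionId, s -> new BitSet())
                        .set(znodeId);
    }

    public boolean isWatching(long sessionId, int znodeId) {
        BitSet bits = watchesBySession.get(sessionId);
        return bits != null && bits.get(znodeId);
    }

    public void removeSession(long sessionId) {
        watchesBySession.remove(sessionId); // drop all of a session's watches at once
    }
}
```

Triggering a watch on a znode would then mean scanning sessions for a set bit at that id (or keeping a reverse index), which is the kind of trade-off the thread's solution 2 tries to avoid by watching sets of znodes instead.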
[jira] [Commented] (ZOOKEEPER-3157) Improve FuzzySnapshotRelatedTest to avoid flaky due to issues like connection loss
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632139#comment-16632139 ] Michael Han commented on ZOOKEEPER-3157: [~lvfangmin] thanks for making a fix on this. For this specific flaky test, we could either do what I suggested there (by wrapping the getData with some retry logic), or apply junit.RetryRule for this specific test case only since we know the cause and the fix should be retry anyway. I suggest we should not add junit.RetryRule to all test cases / ZKTestCase for reasons I mentioned here https://github.com/apache/zookeeper/pull/605#issuecomment-425496416. > Improve FuzzySnapshotRelatedTest to avoid flaky due to issues like connection > loss > -- > > Key: ZOOKEEPER-3157 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3157 > Project: ZooKeeper > Issue Type: Test > Components: tests >Affects Versions: 3.6.0 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Fix For: 3.6.0 > > > [~hanm] noticed that the test might failure because of ConnectionLoss when > trying to getData, [here is an > example|https://builds.apache.org/job/ZooKeepertrunk/198/testReport/junit/org.apache.zookeeper.server.quorum/FuzzySnapshotRelatedTest/testPZxidUpdatedWhenLoadingSnapshot], > we should catch this and retry to avoid flaky. > Internally, we 'fixed' flaky test by adding junit.RetryRule in ZKTestCase, > which is the base class for most of the tests. I'm not sure this is the right > way to go or not, since it's actually 'hiding' the flaky tests, but this will > help reducing the flaky tests a lot if we're not going to tackle it in the > near time, and we can check the testing history to find out which tests are > flaky and deal with them separately. So let me know if this seems to provide > any benefit in short term, if it is I'll provide a patch to do that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632131#comment-16632131 ] Michael Han commented on ZOOKEEPER-2847: not sure, this test passed in the pre-commit check build (2237) [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2237/testReport/org.apache.zookeeper.server.quorum/ReconfigLegacyTest/testReconfigRemoveClientFromStatic/] though it fails deterministically on my local box (and now on jenkins). > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626539#comment-16626539 ] Michael Han commented on ZOOKEEPER-2847: [~yisong-yue] Thanks for the quick response. bq. by that you mean open another issue, right We can reuse this JIRA issue for the fix. But if you'd like, creating a new issue is also OK - up to you :) > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626486#comment-16626486 ] Michael Han commented on ZOOKEEPER-2847: [~yisong-yue] Do you want to put up another patch to fix the unit test failure of testReconfigRemoveClientFromStatic? btw, I am not sure why a previous build passes https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2237/ - that is why I did not catch this until I ran the build both locally and on Jenkins today. > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reopened ZOOKEEPER-2847: > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16626485#comment-16626485 ] Michael Han commented on ZOOKEEPER-2847: Looks like this patch breaks the recent trunk build; there is at least one reconfig test failing deterministically. To reproduce: {code:java} ant test -Dtestcase=ReconfigLegacyTest -Dtest.method=testReconfigRemoveClientFromStatic test-core-java {code} This test broke because it expected that if "clientPort" was present in the static config file it should be kept there, for compatibility reasons (ZOOKEEPER-1992). The code checks whether that condition is met using QuorumServer.clientAddr, which is null iff there is no clientPort in the static config file. The fix in this patch broke that assumption: QuorumServer.clientAddr is now always assigned a value, so there is no way to differentiate whether the value was assigned by reading the server config portion of the static config file or by the code (if (qs != null && qs.clientAddr == null) qs.clientAddr = clientPortAddress;). As a result, needEraseClientInfoFromStaticConfig now returns true even when a clientPort configuration is present in the static config file, which makes this test fail. I think we can use a dedicated field to represent the "should erase" state for the static config file to fix this. 
> Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
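The "dedicated field" fix suggested in the comment above could look roughly like the sketch below: remember whether clientAddr came from the static config file at parse time, so later code no longer infers that from clientAddr's nullness. All names here are hypothetical illustrations, not the actual ZooKeeper patch.

```java
import java.net.InetSocketAddress;

/**
 * Illustrative: track where clientAddr came from, so reconfig can tell
 * "clientPort was in the static config" apart from "code filled it in".
 */
public class QuorumServerSketch {
    InetSocketAddress clientAddr;
    boolean clientAddrFromStaticConfig; // the proposed dedicated flag

    /** Called when clientPort was parsed out of the static config file. */
    void setClientAddrFromConfig(InetSocketAddress addr) {
        this.clientAddr = addr;
        this.clientAddrFromStaticConfig = true;
    }

    /** Called when code assigns a default; the flag stays false. */
    void setClientAddrDefault(InetSocketAddress addr) {
        this.clientAddr = addr;
    }

    /** Erase only when the port was NOT already in the static config. */
    boolean needEraseClientInfoFromStaticConfig() {
        return !clientAddrFromStaticConfig;
    }
}
```

With this split, clientAddr can always hold a usable value (fixing the rebind bug) without breaking the ZOOKEEPER-1992 compatibility behavior the test checks.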
[jira] [Resolved] (ZOOKEEPER-3146) Limit the maximum client connections per IP in NettyServerCnxnFactory
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3146. Resolution: Fixed Issue resolved by pull request 623 [https://github.com/apache/zookeeper/pull/623] > Limit the maximum client connections per IP in NettyServerCnxnFactory > -- > > Key: ZOOKEEPER-3146 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3146 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > There is maximum connections per IP limit in NIOServerCnxnFactory > implementation, but not exist in Netty, this is useful to avoid spamming > happened on prod ensembles. > This Jira is going to add similar throttling logic in NettyServerCnxnFactory. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-2847) Cannot bind to client port when reconfig based on old static config
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-2847. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 620 [https://github.com/apache/zookeeper/pull/620] > Cannot bind to client port when reconfig based on old static config > --- > > Key: ZOOKEEPER-2847 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2847 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Yisong Yue >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When started the ensemble with old static config that the server string > doesn't have client port, dynamically remove and add the same server from the > ensemble will cause that server cannot bind to client port, and the ZooKeeper > server cannot serve client requests anymore. > From the code, we'll set the clientAddr to null when start up with old static > config, and dynamic config forces to have part, which will > trigger the following rebind code in QuorumPeer#processReconfig, and cause > the address already in used issue. > public boolean processReconfig(QuorumVerifier qv, Long suggestedLeaderId, > Long zxid, boolean restartLE) { > ... > if (myNewQS != null && myNewQS.clientAddr != null > && !myNewQS.clientAddr.equals(oldClientAddr)) { > cnxnFactory.reconfigure(myNewQS.clientAddr); > updateThreadName(); > } > ... > } -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3151) Jenkins github integration is broken if retriggering the precommit job through Jenkins admin web page.
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3151. Resolution: Workaround Provided work around in https://cwiki.apache.org/confluence/display/ZOOKEEPER/HowToContribute > Jenkins github integration is broken if retriggering the precommit job > through Jenkins admin web page. > -- > > Key: ZOOKEEPER-3151 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3151 > Project: ZooKeeper > Issue Type: Bug > Components: build-infrastructure >Reporter: Michael Han >Assignee: Michael Han >Priority: Minor > Labels: pull-request-available > Attachments: screen.png > > Time Spent: 1h > Remaining Estimate: 0h > > When trigger a precommit check Jenkins job directly through the [web > interface|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/] > , the result can't be relayed back on github, after the job finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3151) Jenkins github integration is broken if retriggering the precommit job through Jenkins admin web page.
Michael Han created ZOOKEEPER-3151: -- Summary: Jenkins github integration is broken if retriggering the precommit job through Jenkins admin web page. Key: ZOOKEEPER-3151 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3151 Project: ZooKeeper Issue Type: Bug Components: build-infrastructure Reporter: Michael Han Assignee: Michael Han Attachments: screen.png When trigger a precommit check Jenkins job directly through the [web interface|https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/] , the result can't be relayed back on github, after the job finished. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3098) Add additional server metrics
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3098. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 580 [https://github.com/apache/zookeeper/pull/580] > Add additional server metrics > - > > Key: ZOOKEEPER-3098 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3098 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Joseph Blomstedt >Assignee: Joseph Blomstedt >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 5h 50m > Remaining Estimate: 0h > > This patch adds several new server-side metrics as well as makes it easier to > add new metrics in the future. This patch also includes a handful of other > minor metrics-related changes. > Here's a high-level summary of the changes. > # This patch extends the request latency tracked in {{ServerStats}} to track > {{read}} and {{update}} latency separately. Updates are any request that must > be voted on and can change data, reads are all requests that can be handled > locally and don't change data. > # This patch adds the {{ServerMetrics}} logic and the related > {{AvgMinMaxCounter}} and {{SimpleCounter}} classes. This code is designed to > make it incredibly easy to add new metrics. To add a new metric you just add > one line to {{ServerMetrics}} and then directly reference that new metric > anywhere in the code base. The {{ServerMetrics}} logic handles creating the > metric, properly adding the metric to the JSON output of the {{/monitor}} > admin command, and properly resetting the metric when necessary. The > motivation behind {{ServerMetrics}} is to make things easy enough that it > encourages new metrics to be added liberally. Lack of in-depth > metrics/visibility is a long-standing ZooKeeper weakness. 
At Facebook, most > of our internal changes build on {{ServerMetrics}} and we have nearly 100 > internal metrics at this time – all of which we'll be upstreaming in the > coming months as we publish more internal patches. > # This patch adds 20 new metrics, 14 which are handled by {{ServerMetrics}}. > # This patch replaces some uses of {{synchronized}} in {{ServerStats}} with > atomic operations. > Here's a list of new metrics added in this patch: > - {{uptime}}: time that a peer has been in a stable > leading/following/observing state > - {{leader_uptime}}: uptime for peer in leading state > - {{global_sessions}}: count of global sessions > - {{local_sessions}}: count of local sessions > - {{quorum_size}}: configured ensemble size > - {{synced_observers}}: similar to existing `synced_followers` but for > observers > - {{fsynctime}}: time to fsync transaction log (avg/min/max) > - {{snapshottime}}: time to write a snapshot (avg/min/max) > - {{dbinittime}}: time to reload database – read snapshot + apply > transactions (avg/min/max) > - {{readlatency}}: read request latency (avg/min/max) > - {{updatelatency}}: update request latency (avg/min/max) > - {{propagation_latency}}: end-to-end latency for updates, from proposal on > leader to committed-to-datatree on a given host (avg/min/max) > - {{follower_sync_time}}: time for follower to sync with leader (avg/min/max) > - {{election_time}}: time between entering and leaving election (avg/min/max) > - {{looking_count}}: number of transitions into looking state > - {{diff_count}}: number of diff syncs performed > - {{snap_count}}: number of snap syncs performed > - {{commit_count}}: number of commits performed on leader > - {{connection_request_count}}: number of incoming client connection requests > - {{bytes_received_count}}: similar to existing `packets_received` but > tracks bytes -- This message was sent by Atlassian JIRA (v7.6.3#76005)
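The one-line-per-metric pattern described above centers on counters such as {{AvgMinMaxCounter}}. A rough sketch of such a counter, using atomic operations in place of {{synchronized}} as the patch describes — class and method names are illustrative, not the actual ZooKeeper implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative avg/min/max counter in the spirit of the AvgMinMaxCounter
// described above; not the actual ZooKeeper class.
public class AvgMinMaxCounterSketch {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong min = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    // Record one sample, e.g. a single read latency in milliseconds.
    public void add(long value) {
        count.incrementAndGet();
        total.addAndGet(value);
        min.accumulateAndGet(value, Math::min);
        max.accumulateAndGet(value, Math::max);
    }

    public double getAvg() {
        long c = count.get();
        return c == 0 ? 0.0 : (double) total.get() / c;
    }

    public long getMin() {
        long m = min.get();
        return m == Long.MAX_VALUE ? 0 : m;
    }

    public long getMax() {
        long m = max.get();
        return m == Long.MIN_VALUE ? 0 : m;
    }

    // Reset between reporting intervals, as /monitor-style commands do.
    public void reset() {
        count.set(0);
        total.set(0);
        min.set(Long.MAX_VALUE);
        max.set(Long.MIN_VALUE);
    }
}
```

With a registry of such counters, adding a metric like {{readlatency}} reduces to declaring one instance and calling {{add()}} at the measurement site.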
[jira] [Commented] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615502#comment-16615502 ] Michael Han commented on ZOOKEEPER-3141: Thanks [~lvfangmin], appreciate your help on fixing this. To identify a build where this test fails, you can start at the flaky test dashboard: [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/lastSuccessfulBuild/artifact/report.html] You will see this test is currently ranking as one of the top flaky tests. Then click the show/hide label of the right-most column; it will expand and list the builds. Currently, these builds can be used to triage [179|https://builds.apache.org/job/ZooKeeper-trunk//179] [181|https://builds.apache.org/job/ZooKeeper-trunk//181] [189|https://builds.apache.org/job/ZooKeeper-trunk//189] > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. 
> See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Comment Edited] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615502#comment-16615502 ] Michael Han edited comment on ZOOKEEPER-3141 at 9/14/18 11:23 PM: -- Thanks [~lvfangmin], appreciate your help on fixing this. To identify a build where this test fail, you can start at flaky test dashboard: [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/lastSuccessfulBuild/artifact/report.html] You will see this test is currently ranking as one of the top flaky tests. Then click the show/hide label of the right most column it will expand and list the builds. Currently, these builds can be used to triage [179|https://builds.apache.org/job/ZooKeeper-trunk//179] [181|https://builds.apache.org/job/ZooKeeper-trunk//181] [189|https://builds.apache.org/job/ZooKeeper-trunk//189] was (Author: hanm): Thanks [~lvfangmin], appreciate your help on fixing this. To identify a build where this test fail, you can start at flaky test dashboard: [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/lastSuccessfulBuild/artifact/report.html] You will see this test is currently ranking on of the top flaky tests. Then click the show/hide label of the right most column it will expand and list the builds. Currently, these builds can be used to triage [179|https://builds.apache.org/job/ZooKeeper-trunk//179] [181|https://builds.apache.org/job/ZooKeeper-trunk//181] [189|https://builds.apache.org/job/ZooKeeper-trunk//189] > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. 
> See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-3140) Allow Followers to host Observers
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-3140: -- Assignee: Brian Nixon > Allow Followers to host Observers > - > > Key: ZOOKEEPER-3140 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3140 > Project: ZooKeeper > Issue Type: New Feature > Components: server >Affects Versions: 3.6.0 >Reporter: Brian Nixon >Assignee: Brian Nixon >Priority: Minor > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > Observers function simply as non-voting members of the ensemble, sharing the > Learner interface with Followers and holding only a slightly different > internal pipeline. Both maintain connections along the quorum port with the > Leader, by which they learn of all new proposals on the ensemble. > > There are benefits to allowing Observers to connect to the Followers to plug > into the commit stream in addition to connecting to the Leader. It shifts the > burden of supporting Observers off the Leader and allows it to focus on > coordinating the commit of writes. This means better performance when the > Leader is under high load, particularly high network load such as can happen > after a leader election when many Learners need to sync. It also reduces the > total network connections maintained on the Leader when there are a high > number of observers. On the other end, Observer availability is improved > since it will take less time for a high number of Observers to finish > syncing and start serving client traffic. > > The current implementation only supports scaling the number of Observers > into the hundreds before performance begins to degrade. By opening up > Followers to also host Observers, over a thousand observers can be hosted on > a typical ensemble without major negative impact under both normal operation > and during post-leader election sync. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3142) Extend SnapshotFormatter to dump data in json format
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3142. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 619 [https://github.com/apache/zookeeper/pull/619] > Extend SnapshotFormatter to dump data in json format > > > Key: ZOOKEEPER-3142 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3142 > Project: ZooKeeper > Issue Type: Improvement >Affects Versions: 3.6.0 >Reporter: Brian Nixon >Priority: Trivial > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Json format can be chained into other tools such as ncdu. Extend the > SnapshotFormatter functionality to dump json. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-2122) Implement SSL support in the Zookeeper C client library
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-2122: -- Assignee: shuoshi (was: Ashish Amarnath) > Implement SSL support in the Zookeeper C client library > > > Key: ZOOKEEPER-2122 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2122 > Project: ZooKeeper > Issue Type: Sub-task > Components: c client >Affects Versions: 3.5.0 >Reporter: Ashish Amarnath >Assignee: shuoshi >Priority: Trivial > Labels: build, pull-request-available, security > Fix For: 3.6.0, 3.5.5 > > Time Spent: 10m > Remaining Estimate: 0h > > Implement SSL support in the Zookeeper C client library to work with the > secure server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3144) Potential ephemeral nodes inconsistent due to global session inconsistent with fuzzy snapshot
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3144. Resolution: Fixed Issue resolved by pull request 621 [https://github.com/apache/zookeeper/pull/621] > Potential ephemeral nodes inconsistent due to global session inconsistent > with fuzzy snapshot > - > > Key: ZOOKEEPER-3144 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3144 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.4, 3.6.0, 3.4.13 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Critical > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Found this issue recently when checking another prod issue; the problem is > that the current code updates lastProcessedZxid before it actually makes the > change for the global sessions in the DataTree. > > In case there is a snapshot in progress, and there is a small time stall > between setting lastProcessedZxid and updating the session in the DataTree due to > reasons like a thread context switch or GC, then it's possible the > lastProcessedZxid is actually set to the future and doesn't include the > global session change (add or remove). > > When reloading this snapshot and its txns, the server will replay txns from > lastProcessedZxid + 1, so it won't create the global session anymore, which > could cause data inconsistency. > > When global sessions are inconsistent, ephemerals might be inconsistent > as well, since the leader will delete all the ephemerals locally if there are > no global sessions associated with them; and if another server snapshot-syncs with > it, then that server will not have those ephemerals either, but others will. There > will also be a global session renew issue for that problematic session. 
> > The same issue exists for the closeSession txn; we need to move this global > session update logic before processTxn, so the lastProcessedZxid will not > miss the global session here. > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
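The ordering bug above — advancing lastProcessedZxid before applying the session change — can be sketched with a toy model. The classes below are illustrative, not ZooKeeper's actual DataTree or session tracker; they only show why a fuzzy snapshot taken between the two statements records a zxid that runs ahead of the data:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the race described above; names are illustrative.
public class FuzzySnapshotOrderSketch {
    private final Map<Long, Integer> globalSessions = new HashMap<>();
    private long lastProcessedZxid = 0;

    // Unsafe order: a snapshot taken between the two statements captures a
    // lastProcessedZxid that claims to include the session change but does
    // not, so replay from lastProcessedZxid + 1 skips the createSession txn.
    public void createSessionUnsafe(long zxid, long sessionId, int timeout) {
        lastProcessedZxid = zxid;               // advanced first...
        globalSessions.put(sessionId, timeout); // ...change applied later
    }

    // Safe order (the fix): apply the session change first, then advance the
    // zxid, so a snapshot's zxid never runs ahead of the data it contains.
    public void createSessionSafe(long zxid, long sessionId, int timeout) {
        globalSessions.put(sessionId, timeout);
        lastProcessedZxid = zxid;
    }

    public long getLastProcessedZxid() {
        return lastProcessedZxid;
    }

    public boolean hasSession(long sessionId) {
        return globalSessions.containsKey(sessionId);
    }
}
```

The same reordering applies to closeSession: remove the session before advancing the zxid, so no snapshot can claim a removal it doesn't contain.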
[jira] [Resolved] (ZOOKEEPER-3137) add a utility to truncate logs to a zxid
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3137. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 615 [https://github.com/apache/zookeeper/pull/615] > add a utility to truncate logs to a zxid > > > Key: ZOOKEEPER-3137 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3137 > Project: ZooKeeper > Issue Type: New Feature >Affects Versions: 3.6.0 >Reporter: Brian Nixon >Priority: Trivial > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Add a utility that allows an admin to truncate a given transaction log to a > specified zxid. This can be similar to the existent LogFormatter. > Among the benefits, this allows an admin to put together a point-in-time view > of a data tree by manually mutating files from a saved backup. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
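The core of the truncation utility above is a filter over log records by zxid. A minimal sketch of that idea, over an in-memory list of (zxid, payload) records rather than ZooKeeper's on-disk txn log format — class and method names are illustrative:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal sketch of "truncate a transaction log to a zxid"; illustrative
// names, not ZooKeeper's TxnLog implementation.
public class TxnLogTruncateSketch {

    public static class Txn {
        public final long zxid;
        public final String payload;

        public Txn(long zxid, String payload) {
            this.zxid = zxid;
            this.payload = payload;
        }
    }

    // Keep only records whose zxid is <= targetZxid, dropping the tail.
    public static List<Txn> truncateTo(List<Txn> log, long targetZxid) {
        List<Txn> kept = new ArrayList<>();
        for (Txn t : log) {
            if (t.zxid <= targetZxid) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

A real utility would stream the log file and rewrite it up to the target record, but the point-in-time view it produces is exactly this prefix-by-zxid.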
[jira] [Commented] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16613021#comment-16613021 ] Michael Han commented on ZOOKEEPER-3141: The failure does not reproduce in my stress test job which just ran this single test (https://builds.apache.org/job/Zookeeper_UT_sTRESS/). It can be reproduced on nightly build on trunk. Likely caused by interference from another test case. > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. > See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3147) Enable server tracking client information
Michael Han created ZOOKEEPER-3147: -- Summary: Enable server tracking client information Key: ZOOKEEPER-3147 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3147 Project: ZooKeeper Issue Type: Improvement Components: java client, server Affects Versions: 3.6.0 Reporter: Michael Han Assignee: Michael Han We should consider adding fine-grained tracking information for clients and sending this information to the server side, which will be useful for debugging and in future multi-tenancy support / enforced quota. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3127) Fixing potential data inconsistency due to update last processed zxid with partial multi-op txn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16612780#comment-16612780 ] Michael Han commented on ZOOKEEPER-3127: If there is a conflict, it's expected that the author of the original PR creates a new PR targeting the new branch. If there is no conflict, then the committer can cherry-pick the patch to different branches. This is because the author has the best knowledge of how to deal with the conflict, regardless of whether it's trivial or not; plus, a separate PR will test the patch again through precommit Jenkins. I think this is consistent with what committers were doing in the old days when contributions came in as patches; it was expected that the original author uploaded a new patch to Jira if there was a conflict. > Fixing potential data inconsistency due to update last processed zxid with > partial multi-op txn > --- > > Key: ZOOKEEPER-3127 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3127 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.4, 3.6.0, 3.4.13 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Critical > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Found this issue while checking the code for another issue; this is a > relatively rare case which we haven't seen on prod so far. > Currently, the lastProcessedZxid is updated when applying the first txn of > a multi-op; if there is a snapshot in progress, it's possible that the zxid > associated with the snapshot only includes part of the multi op. > When loading a snapshot, it will only load the txns after the zxid associated > with the snapshot file, which could cause data inconsistency due to missing sub txns. > To avoid this, we only update the lastProcessedZxid when the whole multi-op > txn is applied to the DataTree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
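The multi-op fix quoted in the issue above can be sketched in a few lines: apply every sub-txn first, and advance lastProcessedZxid exactly once afterwards. This is an illustrative model, not ZooKeeper's actual DataTree code:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the ZOOKEEPER-3127 fix: lastProcessedZxid moves only after
// the whole multi-op is applied; names are illustrative.
public class MultiOpZxidSketch {
    private final List<String> dataTree = new ArrayList<>();
    private long lastProcessedZxid = 0;

    public void applyMultiOp(long zxid, List<String> subTxns) {
        for (String txn : subTxns) {
            dataTree.add(txn); // apply each sub-txn without touching the zxid
        }
        // A fuzzy snapshot taken while the loop is running can no longer
        // record a zxid that claims the whole multi-op was applied.
        lastProcessedZxid = zxid;
    }

    public long getLastProcessedZxid() {
        return lastProcessedZxid;
    }

    public int size() {
        return dataTree.size();
    }
}
```

Updating the zxid inside the loop (the old behavior) would let a concurrent snapshot capture a zxid covering only some sub-txns, and replay from lastProcessedZxid + 1 would then silently drop the rest.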
[jira] [Updated] (ZOOKEEPER-2261) When only secureClientPort is configured connections, configuration, connection_stat_reset, and stats admin commands throw NullPointerException
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-2261: --- Fix Version/s: 3.5.5 > When only secureClientPort is configured connections, configuration, > connection_stat_reset, and stats admin commands throw NullPointerException > --- > > Key: ZOOKEEPER-2261 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2261 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Mohammad Arshad >Assignee: Andor Molnar >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 4.5h > Remaining Estimate: 0h > > When only secureClientPort is configured connections, configuration, > connection_stat_reset and stats admin commands throw NullPointerException. > Here is stack trace one of the connections command. > {code} > java.lang.NullPointerException > at > org.apache.zookeeper.server.admin.Commands$ConsCommand.run(Commands.java:177) > at > org.apache.zookeeper.server.admin.Commands.runCommand(Commands.java:92) > at > org.apache.zookeeper.server.admin.JettyAdminServer$CommandServlet.doGet(JettyAdminServer.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-2261) When only secureClientPort is configured connections, configuration, connection_stat_reset, and stats admin commands throw NullPointerException
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611321#comment-16611321 ] Michael Han commented on ZOOKEEPER-2261: I cherry picked the commit to 3.5 just now ([https://github.com/apache/zookeeper/commit/14d0aaab853d535be268a1d7a234c9c47bf4cd25).] > When only secureClientPort is configured connections, configuration, > connection_stat_reset, and stats admin commands throw NullPointerException > --- > > Key: ZOOKEEPER-2261 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2261 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Mohammad Arshad >Assignee: Andor Molnar >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 4.5h > Remaining Estimate: 0h > > When only secureClientPort is configured connections, configuration, > connection_stat_reset and stats admin commands throw NullPointerException. > Here is stack trace one of the connections command. > {code} > java.lang.NullPointerException > at > org.apache.zookeeper.server.admin.Commands$ConsCommand.run(Commands.java:177) > at > org.apache.zookeeper.server.admin.Commands.runCommand(Commands.java:92) > at > org.apache.zookeeper.server.admin.JettyAdminServer$CommandServlet.doGet(JettyAdminServer.java:166) > at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16610124#comment-16610124 ] Michael Han commented on ZOOKEEPER-3141: The address `fee.fii.foo.fum` was from another test case in the same file: [testBadPeerAddressInQuorum| https://github.com/apache/zookeeper/blob/master/src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java#L597]. One possibility is that Apache Jenkins was running multiple test cases and, for some reason, one test case (testBadPeerAddressInQuorum) interfered with the other (testLeaderElectionWithDisloyalVoter_stillHasMajority). I've seen some flaky tests caused by interference between test cases, but this one is new to me. I set up a stress test on Apache Jenkins just to run testLeaderElectionWithDisloyalVoter_stillHasMajority alone; if the failure does not reproduce, then interference between test cases is likely the cause. > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. 
> See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16609863#comment-16609863 ] Michael Han commented on ZOOKEEPER-3141: [~lvfangmin] Do you want to take a look at this flaky test introduced by your patch in ZOOKEEPER-3109? > testLeaderElectionWithDisloyalVoter is flaky > > > Key: ZOOKEEPER-3141 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection, server, tests >Affects Versions: 3.6.0 >Reporter: Michael Han >Priority: Major > > The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. > See > [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] > Recent failure builds: > [https://builds.apache.org/job/ZooKeeper-trunk//181] > [https://builds.apache.org/job/ZooKeeper-trunk//179] > [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] > > > Snapshot of the failure: > {code:java} > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority > Error Message > Server 0 should have joined quorum by now > Stacktrace > junit.framework.AssertionFailedError: Server 0 should have joined quorum by > now > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) > at > org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) > at > org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3141) testLeaderElectionWithDisloyalVoter is flaky
Michael Han created ZOOKEEPER-3141: -- Summary: testLeaderElectionWithDisloyalVoter is flaky Key: ZOOKEEPER-3141 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3141 Project: ZooKeeper Issue Type: Bug Components: leaderElection, server, tests Affects Versions: 3.6.0 Reporter: Michael Han The unit test added in ZOOKEEPER-3109 turns out to be quite flaky. See [https://builds.apache.org/job/zOOkeeper-Find-Flaky-Tests/511/artifact/report.html] Recent failure builds: [https://builds.apache.org/job/ZooKeeper-trunk//181] [https://builds.apache.org/job/ZooKeeper-trunk//179] [https://builds.apache.org/job/PreCommit-ZOOKEEPER-github-pr-build/2123/testReport/junit/org.apache.zookeeper.server.quorum/QuorumPeerMainTest/testLeaderElectionWithDisloyalVoter_stillHasMajority/] Snapshot of the failure: {code:java} org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority Error Message Server 0 should have joined quorum by now Stacktrace junit.framework.AssertionFailedError: Server 0 should have joined quorum by now at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElection(QuorumPeerMainTest.java:1482) at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testLeaderElectionWithDisloyalVoter_stillHasMajority(QuorumPeerMainTest.java:1431) at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:79) {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3129) Improve ZK Client resiliency by throwing a jute.maxbuffer size exception before sending a request to server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16606608#comment-16606608 ] Michael Han commented on ZOOKEEPER-3129: I think it's a good idea. I am leaning towards using a new property for this. I think this will be a client-only property, correct? > Improve ZK Client resiliency by throwing a jute.maxbuffer size exception > before sending a request to server > --- > > Key: ZOOKEEPER-3129 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3129 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Karan Mehta >Priority: Major > > Zookeeper is mostly operated in controlled environments and the client/server > properties are usually known. With this Jira, I would like to propose a new > property on client side that represents the max jute buffer size server is > going to accept. > On the ZKClient, in case of multi Op, the request is serialized and hence we > know the size of complete packet that will be sent. We can use this new > property to determine if we are exceeding the limit and throw some form > of KeeperException. This would be a fail-fast mechanism and the application can > potentially retry by chunking up the request or serializing. > Since the same properties are now present in two locations, over time, two > possibilities can happen. > -- Server jutebuffer accepted value is more than what is specified on client > side > The application might end up serializing it or zkclient can be made > configurable to retry even when it gets this exception > -- Server jutebuffer accepted value is lower than what is specified on client > side > That would have failed previously as well, so there is no change in behavior > This would help silent failures like HBASE-18549 getting avoided. > Thoughts [~apurtell] [~xucang] [~anmolnar] [~hanm] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
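The fail-fast check proposed in the issue above amounts to comparing the serialized request size against a client-side limit before any bytes hit the wire. A hedged sketch — the property name, class, and exception type are all illustrative assumptions, not an actual ZooKeeper API:

```java
// Sketch of a client-side jute buffer size check as proposed above.
// "zookeeper.client.maxbuffer" is a hypothetical property name, and
// IllegalStateException stands in for whatever KeeperException subtype
// a real patch would choose.
public class JuteMaxBufferCheck {
    // Assumed default limit of 1 MB for illustration only.
    public static final int DEFAULT_LIMIT = 1024 * 1024;

    private final int limit;

    public JuteMaxBufferCheck() {
        // Read the hypothetical client-side property, falling back to default.
        this(Integer.getInteger("zookeeper.client.maxbuffer", DEFAULT_LIMIT));
    }

    public JuteMaxBufferCheck(int limit) {
        this.limit = limit;
    }

    // Throw before sending if the serialized packet exceeds the limit,
    // letting the application chunk the multi-op and retry.
    public void check(byte[] serializedRequest) {
        if (serializedRequest.length > limit) {
            throw new IllegalStateException("Packet of " + serializedRequest.length
                    + " bytes exceeds client-side limit of " + limit + " bytes");
        }
    }
}
```

Because multi-op requests are serialized before sending, the client already knows the exact packet size at the point where this check would run.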
[jira] [Resolved] (ZOOKEEPER-3136) Reduce log in ClientBase in case of ConnectException
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3136. Resolution: Fixed Issue resolved by pull request 614 [https://github.com/apache/zookeeper/pull/614] > Reduce log in ClientBase in case of ConnectException > > > Key: ZOOKEEPER-3136 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3136 > Project: ZooKeeper > Issue Type: Task > Components: tests >Reporter: Enrico Olivelli >Assignee: Enrico Olivelli >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 40m > Remaining Estimate: 0h > > While running tests you will always see spammy log lines like the ones below. > As we are expecting the server to be up, it is not useful to log such > stacktraces. > The patch simply reduce the log in this specific case, because it adds no > value and it is very annoying. > > {code:java} > [junit] 2018-08-31 23:31:49,173 [myid:] - INFO [main:ClientBase@292] - > server 127.0.0.1:11222 not up > [junit] java.net.ConnectException: Connection refused (Connection refused) > [junit] at java.net.PlainSocketImpl.socketConnect(Native Method) > [junit] at > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350) > [junit] at > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206) > [junit] at > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188) > [junit] at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > [junit] at java.net.Socket.connect(Socket.java:589) > [junit] at > org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:101) > [junit] at > org.apache.zookeeper.client.FourLetterWordMain.send4LetterWord(FourLetterWordMain.java:71) > [junit] at > org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:285) > [junit] at > org.apache.zookeeper.test.ClientBase.waitForServerUp(ClientBase.java:276) > {code} > -- This message was 
sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3131) org.apache.zookeeper.server.WatchManager resource leak
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3131. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 612 [https://github.com/apache/zookeeper/pull/612] > org.apache.zookeeper.server.WatchManager resource leak > -- > > Key: ZOOKEEPER-3131 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3131 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.5.4, 3.6.0 > Environment: -Xmx512m >Reporter: ChaoWang >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 2h > Remaining Estimate: 0h > > In some cases, the variable _watch2Paths_ in _Class WatchManager_ does not > remove the entry, even if the associated value "HashSet" is already empty. > The type of key in Map _watch2Paths_ is Watcher, an instance of > _NettyServerCnxn._ If it is not removed when the associated set of paths is > empty, it will cause memory usage to increase little by little, until an > OutOfMemoryError is finally triggered. > > *Possible Solution:* > In the following function, logic should be added to remove the entry: > org.apache.zookeeper.server.WatchManager#removeWatcher(java.lang.String, > org.apache.zookeeper.Watcher) > if (paths.isEmpty()) > { watch2Paths.remove(watcher); } > The same applies to the following function: > org.apache.zookeeper.server.WatchManager#triggerWatch(java.lang.String, > org.apache.zookeeper.Watcher.Event.EventType, > java.util.Set) > > Could you please confirm this issue? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
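The one-line fix quoted above can be illustrated with a minimal, self-contained sketch. The WatchBook class below is a hypothetical stand-in for WatchManager's watch2Paths bookkeeping (the real class keys on Watcher objects and holds more state); only the empty-set cleanup is the point:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Stand-in for WatchManager's watcher -> paths map. The leak fix: drop the
// map entry once its path set becomes empty, so dead connections cannot
// accumulate as keys and slowly exhaust the heap.
public class WatchBook {
    private final Map<Object, Set<String>> watch2Paths = new HashMap<>();

    public void addWatch(Object watcher, String path) {
        watch2Paths.computeIfAbsent(watcher, w -> new HashSet<>()).add(path);
    }

    public void removeWatcher(String path, Object watcher) {
        Set<String> paths = watch2Paths.get(watcher);
        if (paths == null) {
            return;
        }
        paths.remove(path);
        if (paths.isEmpty()) {
            watch2Paths.remove(watcher); // the proposed leak fix
        }
    }

    public int watcherCount() {
        return watch2Paths.size();
    }
}
```

Without the `isEmpty()` check, a watcher that registered and then un-registered all of its watches would stay in the map forever, which matches the slow-growth OOM described in the report.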
[jira] [Resolved] (ZOOKEEPER-3127) Fixing potential data inconsistency due to update last processed zxid with partial multi-op txn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3127. Resolution: Fixed Fix Version/s: (was: 3.4.14) (was: 3.5.5) Issue resolved by pull request 606 [https://github.com/apache/zookeeper/pull/606] > Fixing potential data inconsistency due to update last processed zxid with > partial multi-op txn > --- > > Key: ZOOKEEPER-3127 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3127 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.4, 3.6.0, 3.4.13 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Critical > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 50m > Remaining Estimate: 0h > > Found this issue while checking the code for another issue; this is a > relatively rare case which we haven't seen on prod so far. > Currently, the lastProcessedZxid is updated when applying the first txn of > a multi-op. If there is a snapshot in progress, it's possible that the zxid > associated with the snapshot only covers part of the multi-op. > When loading a snapshot, it will only load the txns after the zxid associated > with the snapshot file, which could cause data inconsistency due to missing sub-txns. > To avoid this, we only update the lastProcessedZxid when the whole multi-op > txn is applied to the DataTree. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
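The fix described above, publishing lastProcessedZxid only after the whole multi-op has been applied, can be sketched as follows. The MultiOpApply class and its Runnable sub-txns are illustrative stand-ins, not ZooKeeper's actual DataTree API:

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the ordering fix: apply all sub-txns of a multi-op first, and only
// then publish lastProcessedZxid. A concurrent snapshot reading the zxid can
// then never be stamped with a zxid that covers a half-applied multi-op.
public class MultiOpApply {
    private long lastProcessedZxid = 0;

    public void applyMultiTxn(long zxid, List<Runnable> subTxns) {
        for (Runnable txn : subTxns) {
            txn.run(); // apply each sub-operation to the (stand-in) data tree
        }
        // Publish the zxid only once the whole multi-op is visible.
        lastProcessedZxid = Math.max(lastProcessedZxid, zxid);
    }

    public long getLastProcessedZxid() {
        return lastProcessedZxid;
    }
}
```

Updating the field before the loop (the old behavior) is exactly what let a snapshot claim a zxid whose multi-op was only partially reflected in the tree.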
[jira] [Resolved] (ZOOKEEPER-3109) Avoid long unavailable time due to voter changed mind when activating the leader during election
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3109. Resolution: Fixed Issue resolved by pull request 588 [https://github.com/apache/zookeeper/pull/588] > Avoid long unavailable time due to voter changed mind when activating the > leader during election > > > Key: ZOOKEEPER-3109 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3109 > Project: ZooKeeper > Issue Type: Improvement > Components: quorum, server >Affects Versions: 3.6.0 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 3h > Remaining Estimate: 0h > > Occasionally, we'll find it takes a long time to elect a leader, possibly longer > than 1 minute, depending on how big initLimit and tickTime are set. > > This exposes an issue in the leader election protocol. During leader election, > before a voter goes to the LEADING/FOLLOWING state, it will wait for a > finalizeWait time before changing its state. Depending on the order of > notifications, some voter might change its mind just after voting for a > server. If the server it was previously voting for has a majority of votes after > considering this one, then that server will go to the LEADING state. In some > corner cases, the leader may end up timing out while waiting for the epoch ACK > from the majority, because of the voter that changed its mind. This usually > happens when there is an even number of servers in the ensemble (either because > one of the servers is down, or is being restarted and takes a long time to > restart). If there are 5 servers in the ensemble, then we'll find two of them in > LEADING/FOLLOWING state and another two in LOOKING state, but the LOOKING > servers cannot join the quorum since they're waiting for a majority of servers > FOLLOWING the current leader before changing to FOLLOWING as well. 
> > As far as we know, this voter will change its mind if it receives a vote from > another host which just started and began voting for itself, or if there is a > server that takes a long time to shut down its previous ZK server and starts > voting for itself when beginning the leader election process. > > Also, the follower may abandon the leader if the leader is not ready to accept > the learner connection when the follower tries to connect to it. > > To solve this issue, there are multiple options: > 1. increase the finalizeWait time > 2. smartly detect this state on the leader and quit earlier > > The 1st option is straightforward and easier to change, but it will cause > longer leader election time in common cases. > > The 2nd option is more complex, but it can efficiently solve the problem > without sacrificing performance in common cases. It remembers the first > majority of servers voting for it, checking whether anyone changed their mind > while it's waiting for the epoch ACK. The leader will wait for some time before > quitting the LEADING state, since one changed voter may not be a problem if > there is still a majority of voters voting for it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3108) deprecated myid file and use a new property "server.id" in the zoo.cfg
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16595906#comment-16595906 ] Michael Han commented on ZOOKEEPER-3108: I think we can provide an option to move the unique identifier of the server from the myid file to zoo.cfg - thus avoiding creating the myid file - but I don't feel this approach is much more convenient compared to the current approach of putting the server id in the myid file: the unique id still has to be created for each server, it's just put into a different place. [~maoling] comments? As others also mentioned, regardless of what additional options we are going to add, please keep the current myid approach. It would be a pain for those who operate ZK to upgrade if we just abandon the myid file completely. > deprecated myid file and use a new property "server.id" in the zoo.cfg > --- > > Key: ZOOKEEPER-3108 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3108 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.5.0 >Reporter: maoling >Assignee: maoling >Priority: Major > > When using ZK in distributed mode, we need to touch a myid file in > dataDir and then write a unique number to it. It is inconvenient and not > user-friendly. Look at an example from another distributed system such as > Kafka: it just uses broker.id=0 in server.properties to identify a unique > server node. This issue proposes to abandon the myid file and use a new > property such as server.id=0 in the zoo.cfg. This fix will be applied to the > master branch and branch-3.5+, > keeping branch-3.4 unchanged. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
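For illustration, a zoo.cfg under this proposal might look like the sketch below. The property name server.id is taken from the ticket; the host names and values are made up, and the final syntax was never settled:

```
# today: the id lives in a separate file, e.g. "echo 1 > /var/lib/zookeeper/myid"
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
# proposed: identify this node directly in the config instead of in myid
server.id=1
```

Note that each server would still need a per-host config difference (the id line), which is the commenter's point: the identifier moves, but the per-server bookkeeping does not go away.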
[jira] [Commented] (ZOOKEEPER-3129) Improve ZK Client resiliency by throwing a jute.maxbuffer size exception before sending a request to server
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16594570#comment-16594570 ] Michael Han commented on ZOOKEEPER-3129: We already have the jute.maxbuffer property for the client side (and server side). Would this property fit the needs here? > Improve ZK Client resiliency by throwing a jute.maxbuffer size exception > before sending a request to server > --- > > Key: ZOOKEEPER-3129 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3129 > Project: ZooKeeper > Issue Type: Improvement >Reporter: Karan Mehta >Priority: Major > > Zookeeper is mostly operated in controlled environments and the client/server > properties are usually known. With this Jira, I would like to propose a new > property on the client side that represents the max jute buffer size the server > is going to accept. > On the ZKClient, in case of a multi op, the request is serialized and hence we > know the size of the complete packet that will be sent. We can use this new > property to determine if we are exceeding the limit and throw some form > of KeeperException. This would be a fail-fast mechanism, and the application > could potentially retry by chunking up the request or serializing. > Since the same properties are now present in two locations, over time, two > possibilities can happen. > -- The server's accepted jute buffer value is more than what is specified on > the client side: the application might end up serializing it, or zkclient can > be made configurable to retry even when it gets this exception. > -- The server's accepted jute buffer value is lower than what is specified on > the client side: that would have failed previously as well, so there is no > change in behavior. > This would help avoid silent failures like HBASE-18549. > Thoughts [~apurtell] [~xucang] [~anmolnar] [~hanm] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
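The proposed client-side fail-fast check could look roughly like the sketch below. The class, method names, and exception choice are illustrative assumptions, not ZooKeeper's actual client API; only the idea of comparing the serialized packet length against the server's known limit before sending comes from the ticket:

```java
import java.io.IOException;

// Hedged sketch of a client-side fail-fast check: if the serialized request
// already exceeds the server's known jute.maxbuffer limit, fail locally with a
// clear error instead of letting the server drop the connection later.
public class JuteBufferCheck {
    // Mirrors the server-side limit (jute.maxbuffer defaults to 0xfffff bytes).
    private final int maxBufferBytes;

    public JuteBufferCheck(int maxBufferBytes) {
        this.maxBufferBytes = maxBufferBytes;
    }

    // Pure predicate, convenient for callers that want to chunk the request.
    public static boolean wouldReject(int packetLen, int maxBufferBytes) {
        return packetLen > maxBufferBytes;
    }

    // Throws before any bytes hit the wire; IOException stands in for whatever
    // KeeperException subtype the real change would pick.
    public void validate(byte[] serializedRequest) throws IOException {
        if (wouldReject(serializedRequest.length, maxBufferBytes)) {
            throw new IOException("Packet len " + serializedRequest.length
                    + " is out of range (max " + maxBufferBytes + ")");
        }
    }
}
```

The application can catch this locally and split the multi-op, which is exactly the "retry by chunking" path the reporter describes.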
[jira] [Comment Edited] (ZOOKEEPER-3124) Remove special logic to handle cversion and pzxid in DataTree.processTxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589477#comment-16589477 ] Michael Han edited comment on ZOOKEEPER-3124 at 8/22/18 11:33 PM: -- [~lvfangmin] That code was actually added in ZOOKEEPER-1046. I think it fixes the issue of incorrect cversion of the parent caused by deleting some of its children after taking a snapshot (so the deleted nodes never made it into the snapshot, which caused problems later while replaying tx logs); rather than adding children after the snapshot is serialized. [~fournc] had a detailed analysis on this. Does that make sense to you? The comment in the patch that finally landed sounds confusing to me as well. ZOOKEEPER-1269 just moved the same code from one place to the other. was (Author: hanm): [~lvfangmin] That code was actually added in ZOOKEEPER-1046. I think it fixes the issue of incorrect cversion of the parent caused by deleting some of its children after taking a snapshot (so the deleted nodes never made it into the snapshot, which caused problems later while replaying tx logs); rather than adding children after the snapshot is serialized. [~fournc] had a [detailed analysis|https://issues.apache.org/jira/browse/ZOOKEEPER-1046?focusedCommentId=13020441=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-13020441] on this. Does that make sense to you? The comment in the patch that finally landed sounds confusing to me as well. ZOOKEEPER-1269 just moved the same code from one place to the other. 
> Remove special logic to handle cversion and pzxid in DataTree.processTxn > > > Key: ZOOKEEPER-3124 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Major > Fix For: 3.6.0 > > > There is special logic in DataTree.processTxn to handle NODEEXISTS > in createNode, which is used to handle the cversion and pzxid not being > updated due to a fuzzy snapshot: > https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994. > > But is this a real issue, or is it still an issue now? > In the current code, when serializing a parent node, we'll lock on it and > take a children snapshot at that time. If a child is added after the parent is > serialized to disk, then it won't be written out, so we shouldn't hit the > issue where the child is in the snapshot but the parent's cversion and pzxid > are not changed. > > > I checked the JIRA ZOOKEEPER-1269 which added this code; it won't hit this > issue either. I'm not sure why we added this, am I missing anything? Can we > just get rid of it? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3124) Remove special logic to handle cversion and pzxid in DataTree.processTxn
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16589477#comment-16589477 ] Michael Han commented on ZOOKEEPER-3124: [~lvfangmin] That code was actually added in ZOOKEEPER-1046. I think it fixes the issue of incorrect cversion of the parent caused by deleting some of its children after taking a snapshot (so the deleted nodes never made it into the snapshot, which caused problems later while replaying tx logs); rather than adding children after the snapshot is serialized. [~fournc] had a detailed analysis on this. Does that make sense to you? The comment in the patch that finally landed sounds confusing to me as well. ZOOKEEPER-1269 just moved the same code from one place to the other. > Remove special logic to handle cversion and pzxid in DataTree.processTxn > > > Key: ZOOKEEPER-3124 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3124 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Major > Fix For: 3.6.0 > > > There is special logic in DataTree.processTxn to handle NODEEXISTS > in createNode, which is used to handle the cversion and pzxid not being > updated due to a fuzzy snapshot: > https://github.com/apache/zookeeper/blob/master/src/java/main/org/apache/zookeeper/server/DataTree.java#L962-L994. > > But is this a real issue, or is it still an issue now? > In the current code, when serializing a parent node, we'll lock on it and > take a children snapshot at that time. If a child is added after the parent is > serialized to disk, then it won't be written out, so we shouldn't hit the > issue where the child is in the snapshot but the parent's cversion and pzxid > are not changed. > > > I checked the JIRA ZOOKEEPER-1269 which added this code; it won't hit this > issue either. I'm not sure why we added this, am I missing anything? Can we > just get rid of it? > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3110) Improve the closeSession throughput in PrepRequestProcessor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3110. Resolution: Fixed > Improve the closeSession throughput in PrepRequestProcessor > --- > > Key: ZOOKEEPER-3110 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3110 > Project: ZooKeeper > Issue Type: Improvement > Components: quorum >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 20m > Remaining Estimate: 0h > > On the leader, every expired global session adds 3 lines of logs, which is > pretty heavy, and if the log file is more than a few GB, the logging for > closeSession in PrepRequestProcessor will slow down the whole ensemble's > throughput. > From some use cases, we found the prep request processor can become a > bottleneck when there is a constantly high number of expired sessions or > sessions being closed explicitly. > This Jira removes one of the useless log lines when preparing closeSession > txns, which should give us higher throughput when processing a large number > of expired sessions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ZOOKEEPER-3110) Improve the closeSession throughput in PrepRequestProcessor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-3110: --- Fix Version/s: 3.5.5 > Improve the closeSession throughput in PrepRequestProcessor > --- > > Key: ZOOKEEPER-3110 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3110 > Project: ZooKeeper > Issue Type: Improvement > Components: quorum >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 20m > Remaining Estimate: 0h > > On the leader, every expired global session adds 3 lines of logs, which is > pretty heavy, and if the log file is more than a few GB, the logging for > closeSession in PrepRequestProcessor will slow down the whole ensemble's > throughput. > From some use cases, we found the prep request processor can become a > bottleneck when there is a constantly high number of expired sessions or > sessions being closed explicitly. > This Jira removes one of the useless log lines when preparing closeSession > txns, which should give us higher throughput when processing a large number > of expired sessions. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-2926) Data inconsistency issue due to the flaw in the session management
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-2926. Resolution: Fixed Fix Version/s: 3.6.0 Issue resolved by pull request 447 [https://github.com/apache/zookeeper/pull/447] > Data inconsistency issue due to the flaw in the session management > -- > > Key: ZOOKEEPER-2926 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2926 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.5.3, 3.6.0 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Critical > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 3h > Remaining Estimate: 0h > > The local session upgrading feature will upgrade the session locally before > receiving a quorum commit of creating the global session. It's possible that > the server shuts down before the create-session request is sent to the > leader; if we retained the ZKDatabase, or a snapshot happened just before > shutdown, then only this server will have the global session. > If that server doesn't become leader, then it will have more global sessions > than the others, and those global sessions won't expire as the leader doesn't > know of their existence. If the server becomes leader, it will accept the > client's renew-session request and the client is allowed to create ephemeral > nodes, which means other servers only have the ephemeral nodes but not that > global session. If a follower then does a SNAP sync with it, that follower will > also have the global session. If a server without that global session becomes > the new leader, it will check and delete those dangling ephemeral nodes > before serving traffic. This could introduce issues where an ephemeral node > exists on some servers but not others. 
> There is a dangling global session issue even without the local session > feature, because the leader will update the ZKDatabase when processing a > ConnectionRequest and in the PrepRequestProcessor before it is quorum > committed, which also carries this risk. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ZOOKEEPER-3036) Unexpected exception in zookeeper
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-3036: --- Component/s: (was: jmx) server quorum > Unexpected exception in zookeeper > - > > Key: ZOOKEEPER-3036 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036 > Project: ZooKeeper > Issue Type: Bug > Components: quorum, server >Affects Versions: 3.4.10 > Environment: 3 Zookeepers, 5 kafka servers >Reporter: Oded >Priority: Critical > > We got an issue with one of the zookeepers (Leader), causing the entire kafka > cluster to fail: > 2018-05-09 02:29:01,730 [myid:3] - ERROR > [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected > exception causing shutdown while sock still open > java.net.SocketTimeoutException: Read timed out > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read(BufferedInputStream.java:265) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > at > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) > at > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99) > at > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559) > 2018-05-09 02:29:01,730 [myid:3] - WARN > [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - *** GOODBYE > /192.168.0.91:42490 > > We would expect that zookeeper will choose another Leader and the Kafka > cluster will continue to work as expected, but that was not the case. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561417#comment-16561417 ] Michael Han commented on ZOOKEEPER-3036: What is the issue related to ZooKeeper in this case? When a learner thread dies, the leader should be able to start another learner thread once the follower / observer corresponding to the dead learner thread comes back. > Unexpected exception in zookeeper > - > > Key: ZOOKEEPER-3036 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036 > Project: ZooKeeper > Issue Type: Bug > Components: quorum, server >Affects Versions: 3.4.10 > Environment: 3 Zookeepers, 5 kafka servers >Reporter: Oded >Priority: Critical > > We got an issue with one of the zookeepers (Leader), causing the entire kafka > cluster to fail: > 2018-05-09 02:29:01,730 [myid:3] - ERROR > [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected > exception causing shutdown while sock still open > java.net.SocketTimeoutException: Read timed out > at java.net.SocketInputStream.socketRead0(Native Method) > at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) > at java.net.SocketInputStream.read(SocketInputStream.java:171) > at java.net.SocketInputStream.read(SocketInputStream.java:141) > at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) > at java.io.BufferedInputStream.read(BufferedInputStream.java:265) > at java.io.DataInputStream.readInt(DataInputStream.java:387) > at > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > at > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) > at > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99) > at > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559) > 2018-05-09 02:29:01,730 [myid:3] - WARN > [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - *** GOODBYE > /192.168.0.91:42490 > > We would expect that zookeeper will choose another Leader and the Kafka > cluster will continue to work as expected, but that was not the case. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (ZOOKEEPER-3082) Fix server snapshot behavior when out of disk space
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561414#comment-16561414 ] Michael Han commented on ZOOKEEPER-3082: Committed to 3.6. Merge conflicts with branch-3.5; a separate pull request is needed to get this into 3.5. > Fix server snapshot behavior when out of disk space > --- > > Key: ZOOKEEPER-3082 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3082 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.6.0, 3.4.12, 3.5.5 >Reporter: Brian Nixon >Assignee: Brian Nixon >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When the ZK server tries to make a snapshot and the machine is out of disk > space, the snapshot creation fails and throws an IOException. An empty > snapshot file is created (probably because the server is able to create an > entry in the dir), but the server is not able to write to the file. > > If snapshot creation fails, the server commits suicide. When it restarts, it > will do so from the last known good snapshot. However, when it tries to make > a snapshot again, the same thing happens. This results in lots of empty > snapshot files being created. If eventually the DataDirCleanupManager garbage > collects the good snapshot files, then only the empty files remain. At this > point, the server is well and truly screwed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3082) Fix server snapshot behavior when out of disk space
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3082. Resolution: Fixed Fix Version/s: 3.6.0 > Fix server snapshot behavior when out of disk space > --- > > Key: ZOOKEEPER-3082 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3082 > Project: ZooKeeper > Issue Type: Bug > Components: server >Affects Versions: 3.6.0, 3.4.12, 3.5.5 >Reporter: Brian Nixon >Assignee: Brian Nixon >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0 > > Time Spent: 1h > Remaining Estimate: 0h > > When the ZK server tries to make a snapshot and the machine is out of disk > space, the snapshot creation fails and throws an IOException. An empty > snapshot file is created (probably because the server is able to create an > entry in the dir), but the server is not able to write to the file. > > If snapshot creation fails, the server commits suicide. When it restarts, it > will do so from the last known good snapshot. However, when it tries to make > a snapshot again, the same thing happens. This results in lots of empty > snapshot files being created. If eventually the DataDirCleanupManager garbage > collects the good snapshot files, then only the empty files remain. At this > point, the server is well and truly screwed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
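One defensive pattern against the failure mode described above is to delete the partial file whenever the snapshot write fails, so a full disk never leaves empty snapshot files behind that could outlive the last good one. This sketch is illustrative only and is not necessarily the fix that was committed:

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Sketch: write a snapshot, and on any IO failure (e.g. ENOSPC) remove the
// partial/empty file instead of leaving it in the snapshot directory.
public class SafeSnapshot {
    public static boolean writeSnapshot(File target, byte[] data) {
        try (FileOutputStream out = new FileOutputStream(target)) {
            out.write(data);
            out.getFD().sync(); // surface out-of-space errors before declaring success
            return true;
        } catch (IOException e) {
            // Clean up so the cleanup manager never mistakes this for a real snapshot.
            target.delete();
            return false;
        }
    }
}
```

With this shape, repeated failed attempts leave the data directory unchanged rather than filling it with zero-byte snapshot files.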
[jira] [Updated] (ZOOKEEPER-3094) Make BufferSizeTest reliable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-3094: --- Fix Version/s: 3.4.14 > Make BufferSizeTest reliable > > > Key: ZOOKEEPER-3094 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3094 > Project: ZooKeeper > Issue Type: Improvement > Components: tests >Affects Versions: 3.4.0 >Reporter: Mohamed Jeelani >Assignee: Mohamed Jeelani >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5, 3.4.14 > > Time Spent: 10m > Remaining Estimate: 0h > > Improve reliability of BufferSizeTest. > Changes made to the testStartupFailure test to remember the old directory and > switch back to it after the test has completed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (ZOOKEEPER-3094) Make BufferSizeTest reliable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3094. Resolution: Fixed Fix Version/s: 3.5.5 Issue resolved by pull request 577 [https://github.com/apache/zookeeper/pull/577] > Make BufferSizeTest reliable > > > Key: ZOOKEEPER-3094 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3094 > Project: ZooKeeper > Issue Type: Improvement > Components: tests >Affects Versions: 3.4.0 >Reporter: Mohamed Jeelani >Assignee: Mohamed Jeelani >Priority: Minor > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 10m > Remaining Estimate: 0h > > Improve reliability of BufferSizeTest. > Changes made to the testStartupFailure test to remember the old directory and > switch back to it after the test has completed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (ZOOKEEPER-3009) Potential NPE in NIOServerCnxnFactory
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han updated ZOOKEEPER-3009: --- Fix Version/s: 3.4.14 > Potential NPE in NIOServerCnxnFactory > - > > Key: ZOOKEEPER-3009 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3009 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.6.0, 3.4.12 >Reporter: lujie >Assignee: lujie >Priority: Major > Labels: pull-request-available > Fix For: 3.6.0, 3.5.5, 3.4.14 > > Time Spent: 1h 40m > Remaining Estimate: 0h > > Inspired by ZK-3006, I developed a simple static analysis tool to find other > potential NPEs like ZK-3006. This bug was found by that tool, and I have > carefully studied it. But I am a newbie here, so I may be wrong; I hope > someone can confirm it and help me improve the tool. > h2. Bug description: > class NIOServerCnxn has methods getSocketAddress and > getRemoteSocketAddress that can return null, e.g.: > {code:java} > // code placeholder > if (sock.isOpen() == false) { > return null; > } > {code} > Some of their callers perform a null check; three (listed below) do not. > {code:java} > // ServerCnxn#getConnectionInfo > Map info = new LinkedHashMap(); > info.put("remote_socket_address", getRemoteSocketAddress()); // Map.put will > throw NPE if the parameter is null > // IPAuthenticationProvider#handleAuthentication > String id = cnxn.getRemoteSocketAddress().getAddress().getHostAddress(); > cnxn.addAuthInfo(new Id(getScheme(), id)); // finally calls Set.add (it will > throw NPE if the parameter is null) > // NIOServerCnxnFactory#addCnxn > InetAddress addr = cnxn.getSocketAddress(); > Set set = ipMap.get(addr); // Map.get will throw NPE if the > parameter is null{code} > I think we should add a null check in the above three callers. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
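The most clear-cut of the three call sites is the chained call in IPAuthenticationProvider: once the socket closes and getRemoteSocketAddress() returns null, the chain dereferences null. The guard the reporter suggests could look like this sketch (simplified stand-in, not the actual ZooKeeper classes):

```java
import java.net.InetSocketAddress;

public class SafeAddr {
    // Guarded version of the IPAuthenticationProvider call site:
    // addr.getAddress().getHostAddress() would throw NPE when the
    // connection has vanished and the address (or its InetAddress
    // for an unresolved address) is null.
    static String clientId(InetSocketAddress addr) {
        if (addr == null || addr.getAddress() == null) {
            return null; // caller can skip auth for a closed connection
        }
        return addr.getAddress().getHostAddress();
    }

    public static void main(String[] args) {
        System.out.println(clientId(new InetSocketAddress("127.0.0.1", 2181))); // 127.0.0.1
        System.out.println(clientId(null)); // null, no NPE
    }
}
```

The same null-propagating shape applies to the other two callers: check the address before handing it to Map.put or Map.get.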
[jira] [Resolved] (ZOOKEEPER-3097) Use Runnable instead of Thread for working items in WorkerService to improve the throughput of CommitProcessor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han resolved ZOOKEEPER-3097. Resolution: Fixed Fix Version/s: 3.5.5 Issue resolved by pull request 578 [https://github.com/apache/zookeeper/pull/578] > Use Runnable instead of Thread for working items in WorkerService to improve > the throughput of CommitProcessor > -- > > Key: ZOOKEEPER-3097 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3097 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Fangmin Lv >Assignee: Fangmin Lv >Priority: Minor > Labels: performance, pull-request-available > Fix For: 3.6.0, 3.5.5 > > Time Spent: 10m > Remaining Estimate: 0h > > CommitProcessor uses WorkerService to submit read/write tasks. Each task is > currently initialized as a thread, which is heavy; changing it to a lighter > Runnable object avoids the overhead of initializing a thread and shows promising > improvement in the CommitProcessor. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
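The distinction the patch exploits can be illustrated with a plain ExecutorService: a Runnable or Callable is a cheap object, while constructing a new Thread per work item allocates a stack and native resources that a pooled executor never uses. An illustrative sketch, not the WorkerService code itself:

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class RunnableWorkSketch {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);

        // Heavy anti-pattern: wrapping each work item in a new Thread,
        // paying thread-construction cost even though the pool supplies
        // its own threads. (Thread implements Runnable, so this compiles
        // but the wrapper thread is never started as a thread.)
        // pool.execute(new Thread(() -> handleRequest()));

        // Light: a plain lambda/Runnable carries no thread state of its own,
        // so per-request submission cost is just a small object allocation.
        Future<Integer> f = pool.submit(() -> 1 + 1);
        System.out.println(f.get()); // 2
        pool.shutdown();
    }
}
```

Under a high commit rate, shaving that per-item allocation off the CommitProcessor's hot path is where the throughput gain comes from.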
[jira] [Assigned] (ZOOKEEPER-3098) Add additional server metrics
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-3098: -- Assignee: Joseph Blomstedt > Add additional server metrics > - > > Key: ZOOKEEPER-3098 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3098 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Joseph Blomstedt >Assignee: Joseph Blomstedt >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > This patch adds several new server-side metrics as well as makes it easier to > add new metrics in the future. This patch also includes a handful of other > minor metrics-related changes. > Here's a high-level summary of the changes. > # This patch extends the request latency tracked in {{ServerStats}} to track > {{read}} and {{update}} latency separately. Updates are any request that must > be voted on and can change data, reads are all requests that can be handled > locally and don't change data. > # This patch adds the {{ServerMetrics}} logic and the related > {{AvgMinMaxCounter}} and {{SimpleCounter}} classes. This code is designed to > make it incredibly easy to add new metrics. To add a new metric you just add > one line to {{ServerMetrics}} and then directly reference that new metric > anywhere in the code base. The {{ServerMetrics}} logic handles creating the > metric, properly adding the metric to the JSON output of the {{/monitor}} > admin command, and properly resetting the metric when necessary. The > motivation behind {{ServerMetrics}} is to make things easy enough that it > encourages new metrics to be added liberally. Lack of in-depth > metrics/visibility is a long-standing ZooKeeper weakness. 
At Facebook, most > of our internal changes build on {{ServerMetrics}} and we have nearly 100 > internal metrics at this time – all of which we'll be upstreaming in the > coming months as we publish more internal patches. > # This patch adds 20 new metrics, 14 of which are handled by {{ServerMetrics}}. > # This patch replaces some uses of {{synchronized}} in {{ServerStats}} with > atomic operations. > Here's a list of new metrics added in this patch: > - {{uptime}}: time that a peer has been in a stable > leading/following/observing state > - {{leader_uptime}}: uptime for peer in leading state > - {{global_sessions}}: count of global sessions > - {{local_sessions}}: count of local sessions > - {{quorum_size}}: configured ensemble size > - {{synced_observers}}: similar to existing {{synced_followers}} but for > observers > - {{fsynctime}}: time to fsync transaction log (avg/min/max) > - {{snapshottime}}: time to write a snapshot (avg/min/max) > - {{dbinittime}}: time to reload database – read snapshot + apply > transactions (avg/min/max) > - {{readlatency}}: read request latency (avg/min/max) > - {{updatelatency}}: update request latency (avg/min/max) > - {{propagation_latency}}: end-to-end latency for updates, from proposal on > leader to committed-to-datatree on a given host (avg/min/max) > - {{follower_sync_time}}: time for follower to sync with leader (avg/min/max) > - {{election_time}}: time between entering and leaving election (avg/min/max) > - {{looking_count}}: number of transitions into looking state > - {{diff_count}}: number of diff syncs performed > - {{snap_count}}: number of snap syncs performed > - {{commit_count}}: number of commits performed on leader > - {{connection_request_count}}: number of incoming client connection requests > - {{bytes_received_count}}: similar to existing {{packets_received}} but > tracks bytes -- This message was sent by Atlassian JIRA (v7.6.3#76005)
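An avg/min/max counter of the kind described above can be pictured with a small sketch. This is an illustrative stand-in built on AtomicLong, in the spirit of the {{AvgMinMaxCounter}} the patch adds, not ZooKeeper's actual class:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative avg/min/max counter: lock-free updates via AtomicLong,
// suitable for hot paths like fsynctime or readlatency recording.
public class AvgMinMaxSketch {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong total = new AtomicLong();
    private final AtomicLong min = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    public void add(long value) {
        count.incrementAndGet();
        total.addAndGet(value);
        // accumulateAndGet applies the operator atomically (CAS loop).
        min.accumulateAndGet(value, Math::min);
        max.accumulateAndGet(value, Math::max);
    }

    public double avg() {
        long c = count.get();
        return c == 0 ? 0 : (double) total.get() / c;
    }

    public long min() { return count.get() == 0 ? 0 : min.get(); }
    public long max() { return count.get() == 0 ? 0 : max.get(); }

    public static void main(String[] args) {
        AvgMinMaxSketch fsyncTime = new AvgMinMaxSketch();
        fsyncTime.add(10);
        fsyncTime.add(30);
        System.out.println(fsyncTime.min() + " " + fsyncTime.avg() + " " + fsyncTime.max()); // 10 20.0 30
    }
}
```

Replacing {{synchronized}} blocks with atomics like these is also what the patch's ServerStats change amounts to: per-update cost drops to a few CAS operations.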
[jira] [Commented] (ZOOKEEPER-3098) Add additional server metrics
[ https://issues.apache.org/jira/browse/ZOOKEEPER-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551364#comment-16551364 ] Michael Han commented on ZOOKEEPER-3098: [~eolivelli] My thought is we still need a metric interface to hook up with external reporters - and we also need metric type definitions beyond counter, which is the only type present in the patch. ZOOKEEPER-3092 is more about the general metrics framework infrastructure, while the work in this Jira is more about the actual instrumentation and metrics collection. > Add additional server metrics > - > > Key: ZOOKEEPER-3098 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3098 > Project: ZooKeeper > Issue Type: Improvement > Components: server >Affects Versions: 3.6.0 >Reporter: Joseph Blomstedt >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Han reassigned ZOOKEEPER-2504: -- Assignee: Michael Han (was: Dan Benediktson) > Enforce that server ids are unique in a cluster > --- > > Key: ZOOKEEPER-2504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Dan Benediktson >Assignee: Michael Han >Priority: Major > Attachments: ZOOKEEPER-2504.patch > > > The leader will happily accept connections from learners that have the same > server id (i.e., due to misconfiguration). This can lead to various issues > including non-unique session_ids being generated by these servers. > The leader can enforce that all learners come in with unique server IDs; if a > learner attempts to connect with an id that is already in use, it should be > denied. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
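The enforcement the issue asks for amounts to an atomic "register this server id, reject if already present" check on the leader. A minimal sketch with hypothetical names (the real change would live in the Leader/LearnerHandler handshake, not a standalone class):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the uniqueness check described above: the leader tracks
// connected learner ids and denies a second handshake with the same sid.
public class SidRegistry {
    private final Map<Long, Boolean> connected = new ConcurrentHashMap<>();

    // putIfAbsent is atomic, so two racing learners with the same
    // (misconfigured) sid cannot both be accepted.
    public boolean register(long sid) {
        return connected.putIfAbsent(sid, Boolean.TRUE) == null;
    }

    // Called when a learner's connection is torn down, so a legitimate
    // restart of the same server can reconnect.
    public void unregister(long sid) {
        connected.remove(sid);
    }

    public static void main(String[] args) {
        SidRegistry reg = new SidRegistry();
        System.out.println(reg.register(1)); // true  - first learner accepted
        System.out.println(reg.register(1)); // false - duplicate sid denied
        reg.unregister(1);
        System.out.println(reg.register(1)); // true  - accepted after disconnect
    }
}
```

Denying the duplicate at handshake time also closes the non-unique session-id hole the description mentions, since session ids embed the server id.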
[jira] [Commented] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551122#comment-16551122 ] Michael Han commented on ZOOKEEPER-2504: This patch still has its value, I am taking it over here. > Enforce that server ids are unique in a cluster > --- > > Key: ZOOKEEPER-2504 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2504 > Project: ZooKeeper > Issue Type: Bug >Reporter: Dan Benediktson >Assignee: Dan Benediktson >Priority: Major > Attachments: ZOOKEEPER-2504.patch -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (ZOOKEEPER-3096) Leader should not leak LearnerHandler threads
Michael Han created ZOOKEEPER-3096: -- Summary: Leader should not leak LearnerHandler threads Key: ZOOKEEPER-3096 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3096 Project: ZooKeeper Issue Type: Bug Components: quorum, server Affects Versions: 3.4.13, 3.5.4, 3.6.0 Reporter: Michael Han Assignee: Michael Han Currently we don't track LearnerHandler threads in the leader; we rely on the socket timeout to raise an exception and use that exception as a signal to let the LearnerHandler thread kill itself. In cases where a learner restarts, if the time from the beginning to the end of the restart is less than the socket timeout value (currently hardcoded as initLimit * tickTime), then no exception will be raised and the previous LearnerHandler thread corresponding to this learner will leak. I have a test case and a proposed fix which I will submit later. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
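One way to avoid relying on the socket timeout, sketched here with hypothetical names (not the actual proposed patch), is to index handlers by server id on the leader and proactively shut down the stale handler when the same learner reconnects within the timeout window:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of proactive handler replacement: when a learner with the same
// sid reconnects before the old socket times out, the stale handler is
// shut down instead of leaking until initLimit * tickTime elapses.
public class HandlerTracker {
    interface Handler { void shutdown(); }

    private final Map<Long, Handler> handlers = new ConcurrentHashMap<>();

    public void onLearnerConnected(long sid, Handler fresh) {
        // Map.put returns the previous mapping, which is exactly the
        // handler left over from before the learner's fast restart.
        Handler stale = handlers.put(sid, fresh);
        if (stale != null) {
            stale.shutdown(); // close the leaked predecessor immediately
        }
    }

    public int active() {
        return handlers.size();
    }

    public static void main(String[] args) {
        HandlerTracker tracker = new HandlerTracker();
        boolean[] oldClosed = {false};
        tracker.onLearnerConnected(3, () -> oldClosed[0] = true);
        tracker.onLearnerConnected(3, () -> {}); // same learner restarts quickly
        System.out.println(tracker.active() + " " + oldClosed[0]); // 1 true
    }
}
```

The invariant is that at most one handler per sid is live, so the leak window described above closes regardless of how fast the learner restarts.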