[ https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793074#comment-15793074 ]
ASF GitHub Bot commented on ZOOKEEPER-1416:
-------------------------------------------

GitHub user Randgalt opened a pull request:

    https://github.com/apache/curator/pull/181

    FOR DISCUSSION ONLY - Persistent watch and Cache recipe replacements

    I've pushed an implementation for Persistent recursive watches as a
    ZooKeeper PR for https://issues.apache.org/jira/browse/ZOOKEEPER-1416 -
    if it's accepted, Curator should support this. This PR has
    implementations for:

    - PersistentWatcher
    - CuratorCache

    PersistentWatcher is a wrapper around the new persistent/recursive cache.

    CuratorCache is a replacement for PathChildrenCache, TreeCache and
    NodeCache. With Persistent recursive watches the implementation is
    orders of magnitude simpler and uses far fewer resources (i.e. 1 watch
    for the entire tree).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/curator persistent-watch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/curator/pull/181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #181

----
commit 32a2fb7594510be2ee6d28c3c3a7db3b4ee9ab99
Author: randgalt <randg...@apache.org>
Date:   2016-12-28T19:50:05Z

    wip

commit 94a0205d4c3d34b1e1384ab5af1b997f74d2a912
Author: randgalt <randg...@apache.org>
Date:   2016-12-29T04:10:15Z

    Finished addPersistentWatcher DSL, re-wrote new version of cache code to handle all cases and deprecated other versions

commit 0d9acb6dd4ec4143cf08ae1cf4ab77a0865370e2
Author: randgalt <randg...@apache.org>
Date:   2016-12-29T15:02:15Z

    wip

commit 01652cef64e3cf3cc1e311b7a85f3c613f06ab0a
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T15:26:18Z

    wip, refactoring, testing

commit bf73f0d3999bfc21b1799ce0c9d3e06214479206
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T17:03:41Z

    continued work on porting old PathChildrenCache tests

commit 076583d14506e3e761ca061cd51a358a97c08eb6
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T18:23:14Z

    CacheListener needs to get the affected node. Also, PATH_ONLY still needs to store the stat

commit 5b0a9f56e7d050eedfea0618f90c58d718441d3f
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T19:09:53Z

    refactoring

commit 313fd7d46ccede6bbc9ac1feb0b5a2099fce7a6d
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T20:12:04Z

    Added a composite cache

commit 1f0bdf9265e6f5bfb34520761649240209c17d72
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T21:26:10Z

    renamed rebuildTestExchanger

commit f8f5cafa956da97c5fa177ac64ee003e955887da
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T21:26:23Z

    finished ported tests

commit 38c766310432bd1d6b3f64d2778b3605df434e64
Author: randgalt <randg...@apache.org>
Date:   2016-12-31T21:41:10Z

    More test porting, refinements

commit 6cfd38c25391865503ba4cf35530f1794c777b91
Author: randgalt <randg...@apache.org>
Date:   2016-12-31T22:57:18Z

    More testing and refactoring. Wasn't checking for deleted children after a refresh. Also, allow for different methods of comparing nodes for change.

commit 40a985243d2959a2fff397eeebb9ff844f6a154c
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T00:48:10Z

    finished porting TreeCache tests

commit add0d10bbb58b0dd6eeffde8c6a2bd2df99a7eae
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T15:55:26Z

    Finished porting TestTreeCacheRandomTree. However, it exposed a design issue with separate CacheFilters and RefreshFilters. To do maxDepth properly you need both to be in sync. Need to rethink this.

commit 2cf7c412caf81cb7846a7da5aac3adbc62502d3e
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T18:33:13Z

    Reworked filters. Went back to the CacheSelector multi-method filter used in TreeCache.

commit 72fe88c3b99e43e905cce40c178a08cb2c409b78
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T19:24:04Z

    Ported/finished NodeCache and tests

commit 8565de6d75d34b2dd597878167b51cd921a0a00e
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T22:15:14Z

    Removed composite stuff. Interesting, but gilding the lily

commit 2148b6d1a829c2efd0309e2914b755ce9ebff003
Author: randgalt <randg...@apache.org>
Date:   2017-01-02T02:22:32Z

    Add docs, more refactoring, final testing, etc.

----

> Persistent Recursive Watch
> --------------------------
>
>                 Key: ZOOKEEPER-1416
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: c client, documentation, java client, server
>            Reporter: Phillip Liu
>            Assignee: Jordan Zimmerman
>         Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes
> a Watch event is sent to the client. If there are thousands of znodes being
> watched, when a client (re)connects, it has to send thousands of watch
> requests. At Facebook, we have this problem storing information for thousands
> of db shards. Consequently a naming service that consumes the db shard
> definition issues thousands of watch requests each time the service starts
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent
> means no Watch reset is necessary after a watch-fire. Recursive means the
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side.
> Setting a Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a
> corresponding getData(..) on the znode is called, then the Recursive Watch
> automatically applies the watch on the znode. This maintains the existing
> Watch semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode.
> Practically this means the Recursive Watch Watcher callback is the one
> receiving the event and the event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no
> intermediate watch event until data is read will be maintained. The only
> difference is we will automatically re-add the watch after read. At the same
> time we add the convenience of reducing the need to add multiple watches for
> sibling znodes and in turn reduce the number of watch messages sent from the
> client to the server.
> There are some implementation details that need to be hashed out. Initial
> thinking is to have the Recursive Watch create per-node watches. This will
> cause a lot of watches to be created on the server side. Currently, each
> watch is stored as a single bit in a bit set relative to a session - up to 3
> bits per client per znode. If there are 100m znodes with 100k clients, each
> watching all nodes, then this strategy will consume approximately 3.75TB of
> RAM distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch
> setting can be set each time a watch event from a Recursive Watch is fired.
> The memory utilization is relative to the number of outstanding reads and at
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of the no intermediate watch event until read
> guarantee is required. If the server can send watch events regardless of
> whether one has already been fired without a corresponding read, then the
> server can simply fire watch events without tracking.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
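
Editorial note on the memory estimate quoted above: the 3.75TB figure follows from 3 bits per client per znode, 100m znodes and 100k clients, and the blacklist worst case is one third of that. A minimal Python sketch of the arithmetic, assuming decimal terabytes (10^12 bytes); the function and variable names are illustrative only and do not come from ZooKeeper or Curator:

```python
def watch_bitset_bytes(znodes: int, clients: int, bits_per_pair: int = 3) -> float:
    """Bytes needed if every (client, znode) pair costs `bits_per_pair` bits."""
    return znodes * clients * bits_per_pair / 8


TB = 10**12  # decimal terabyte (assumption); this is what reproduces 3.75TB

# Per-node watches: 100m znodes, 100k clients, each client watching all nodes.
per_node_watches = watch_bitset_bytes(znodes=100_000_000, clients=100_000)

# Blacklist alternative: the description puts the worst case at 1/3 of the above.
blacklist_worst_case = per_node_watches / 3

print(f"per-node watches:     {per_node_watches / TB:.2f} TB")
print(f"blacklist worst case: {blacklist_worst_case / TB:.2f} TB")
```

Under these assumptions the two figures come out to 3.75 TB and 1.25 TB, both totals across all Observers rather than per server.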