[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15793074#comment-15793074
 ] 

ASF GitHub Bot commented on ZOOKEEPER-1416:
-------------------------------------------

GitHub user Randgalt opened a pull request:

    https://github.com/apache/curator/pull/181

    FOR DISCUSSION ONLY - Persistent watch and Cache recipe replacements

    I've pushed an implementation for Persistent recursive watches as a 
ZooKeeper PR for https://issues.apache.org/jira/browse/ZOOKEEPER-1416 - if it's 
accepted, Curator should support this. This PR has implementations for:
    
    - PersistentWatcher
    - CuratorCache
    
    PersistentWatcher is a wrapper around the new persistent/recursive watch.
    
    CuratorCache is a replacement for PathChildrenCache, TreeCache and 
NodeCache. With Persistent recursive watches the implementation is orders of 
magnitude simpler and uses a lot less resources (i.e. 1 watch for the entire 
tree).
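
As a rough illustration of the "1 watch for the entire tree" point, here is a
toy in-memory model, not ZooKeeper or Curator code. The class ToyWatchRegistry
and all of its method names are hypothetical and exist only for this sketch. It
assumes the simplest delivery rule, where every change under the watched root is
delivered, and contrasts that with a one-shot watch that must be re-registered
after each fire.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical toy model; none of these names come from ZooKeeper or Curator.
final class ToyWatchRegistry {
    // One-shot watches: exact path -> callbacks, removed as soon as they fire.
    private final Map<String, List<Consumer<String>>> oneShot = new HashMap<>();
    // Persistent recursive watches: root path -> callbacks, kept after firing.
    private final Map<String, List<Consumer<String>>> recursive = new HashMap<>();

    void addOneShot(String path, Consumer<String> callback) {
        oneShot.computeIfAbsent(path, k -> new ArrayList<>()).add(callback);
    }

    void addPersistentRecursive(String root, Consumer<String> callback) {
        recursive.computeIfAbsent(root, k -> new ArrayList<>()).add(callback);
    }

    // True when path is the root itself or one of its descendants.
    static boolean covers(String root, String path) {
        if (root.equals("/")) {
            return path.startsWith("/");
        }
        return path.equals(root) || path.startsWith(root + "/");
    }

    // Simulates a change to changedPath and notifies the matching watches.
    void fire(String changedPath) {
        List<Consumer<String>> fired = oneShot.remove(changedPath);
        if (fired != null) {
            fired.forEach(cb -> cb.accept(changedPath));
        }
        recursive.forEach((root, callbacks) -> {
            if (covers(root, changedPath)) {
                callbacks.forEach(cb -> cb.accept(changedPath));
            }
        });
    }

    public static void main(String[] args) {
        ToyWatchRegistry registry = new ToyWatchRegistry();
        List<String> oneShotEvents = new ArrayList<>();
        List<String> recursiveEvents = new ArrayList<>();

        registry.addOneShot("/shards/a", oneShotEvents::add);
        registry.addPersistentRecursive("/shards", recursiveEvents::add);

        registry.fire("/shards/a");
        registry.fire("/shards/a"); // the one-shot watch is already gone
        registry.fire("/shards/b"); // never watched individually

        System.out.println("one-shot events:  " + oneShotEvents.size());
        System.out.println("recursive events: " + recursiveEvents.size());
    }
}
```

In this model the one-shot watch sees only the first change to /shards/a, while
the single recursive registration on /shards sees all three changes without
being re-added.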

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/curator persistent-watch

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/curator/pull/181.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #181
    
----
commit 32a2fb7594510be2ee6d28c3c3a7db3b4ee9ab99
Author: randgalt <randg...@apache.org>
Date:   2016-12-28T19:50:05Z

    wip

commit 94a0205d4c3d34b1e1384ab5af1b997f74d2a912
Author: randgalt <randg...@apache.org>
Date:   2016-12-29T04:10:15Z

    Finished addPersistentWatcher DSL, re-wrote new version of cache code to 
handle all cases and deprecated other versions

commit 0d9acb6dd4ec4143cf08ae1cf4ab77a0865370e2
Author: randgalt <randg...@apache.org>
Date:   2016-12-29T15:02:15Z

    wip

commit 01652cef64e3cf3cc1e311b7a85f3c613f06ab0a
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T15:26:18Z

    wip, refactoring, testing

commit bf73f0d3999bfc21b1799ce0c9d3e06214479206
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T17:03:41Z

    continued work on porting old PathChildrenCache tests

commit 076583d14506e3e761ca061cd51a358a97c08eb6
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T18:23:14Z

    CacheListener needs to get the affected node. Also, PATH_ONLY still needs 
to store the stat

commit 5b0a9f56e7d050eedfea0618f90c58d718441d3f
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T19:09:53Z

    refactoring

commit 313fd7d46ccede6bbc9ac1feb0b5a2099fce7a6d
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T20:12:04Z

    Added a composite cache

commit 1f0bdf9265e6f5bfb34520761649240209c17d72
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T21:26:10Z

    renamed rebuildTestExchanger

commit f8f5cafa956da97c5fa177ac64ee003e955887da
Author: randgalt <randg...@apache.org>
Date:   2016-12-30T21:26:23Z

    finished ported tests

commit 38c766310432bd1d6b3f64d2778b3605df434e64
Author: randgalt <randg...@apache.org>
Date:   2016-12-31T21:41:10Z

    More test porting, refinements

commit 6cfd38c25391865503ba4cf35530f1794c777b91
Author: randgalt <randg...@apache.org>
Date:   2016-12-31T22:57:18Z

    More testing and refactoring. Wasn't checking for deleted children after a 
refresh. Also, allow for different methods of comparing nodes for change.

commit 40a985243d2959a2fff397eeebb9ff844f6a154c
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T00:48:10Z

    finished porting TreeCache tests

commit add0d10bbb58b0dd6eeffde8c6a2bd2df99a7eae
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T15:55:26Z

    Finished porting TestTreeCacheRandomTree. However, it exposed a design 
issue with separate CacheFilters and RefreshFilters. To do maxDepth properly 
you need both to be in sync. Need to rethink this.

commit 2cf7c412caf81cb7846a7da5aac3adbc62502d3e
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T18:33:13Z

    Reworked filters. Went back to the CacheSelector multi-method filter used 
in TreeCache.

commit 72fe88c3b99e43e905cce40c178a08cb2c409b78
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T19:24:04Z

    Ported/finished NodeCache and tests

commit 8565de6d75d34b2dd597878167b51cd921a0a00e
Author: randgalt <randg...@apache.org>
Date:   2017-01-01T22:15:14Z

    Removed composite stuff. Interesting, but gilding the lily

commit 2148b6d1a829c2efd0309e2914b755ce9ebff003
Author: randgalt <randg...@apache.org>
Date:   2017-01-02T02:22:32Z

    Add docs, more refactoring, final testing, etc.

----


> Persistent Recursive Watch
> --------------------------
>
>                 Key: ZOOKEEPER-1416
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1416
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: c client, documentation, java client, server
>            Reporter: Phillip Liu
>            Assignee: Jordan Zimmerman
>         Attachments: ZOOKEEPER-1416.patch, ZOOKEEPER-1416.patch
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> h4. The Problem
> A ZooKeeper Watch can be placed on a single znode and when the znode changes 
> a Watch event is sent to the client. If there are thousands of znodes being 
> watched, when a client (re)connects, it would have to send thousands of watch 
> requests. At Facebook, we have this problem storing information for thousands 
> of db shards. Consequently a naming service that consumes the db shard 
> definition issues thousands of watch requests each time the service starts 
> and changes client watcher.
> h4. Proposed Solution
> We add the notion of a Persistent Recursive Watch in ZooKeeper. Persistent 
> means no Watch reset is necessary after a watch-fire. Recursive means the 
> Watch applies to the node and descendant nodes. A Persistent Recursive Watch 
> behaves as follows:
> # Recursive Watch supports all Watch semantics: CHILDREN, DATA, and EXISTS.
> # CHILDREN and DATA Recursive Watches can be placed on any znode.
> # EXISTS Recursive Watches can be placed on any path.
> # A Recursive Watch behaves like an auto-watch registrar on the server side. 
> Setting a Recursive Watch means to set watches on all descendant znodes.
> # When a watch on a descendant fires, no subsequent event is fired until a 
> corresponding getData(..) on the znode is called, then the Recursive Watch 
> automatically applies the watch on the znode. This maintains the existing Watch 
> semantic on an individual znode.
> # A Recursive Watch overrides any watches placed on a descendant znode. 
> Practically this means the Recursive Watch Watcher callback is the one 
> receiving the event and the event is delivered exactly once.
> A goal here is to reduce the number of semantic changes. The guarantee of no 
> intermediate watch event until data is read will be maintained. The only 
> difference is we will automatically re-add the watch after read. At the same 
> time we add the convenience of reducing the need to add multiple watches for 
> sibling znodes and in turn reduce the number of watch messages sent from the 
> client to the server.
> There are some implementation details that need to be hashed out. Initial 
> thinking is to have the Recursive Watch create per-node watches. This will 
> cause a lot of watches to be created on the server side. Currently, each 
> watch is stored as a single bit in a bit set relative to a session - up to 3 
> bits per client per znode. If there are 100m znodes with 100k clients, each 
> watching all nodes, then this strategy will consume approximately 3.75TB of 
> ram distributed across all Observers. Seems expensive.
> Alternatively, a blacklist of paths to not send Watches regardless of Watch 
> setting can be set each time a watch event from a Recursive Watch is fired. 
> The memory utilization is relative to the number of outstanding reads and at 
> worst case it's 1/3 * 3.75TB using the parameters given above.
> Otherwise, a relaxation of the no-intermediate-watch-event-until-read guarantee 
> is required. If the server can send watch events regardless of whether one has 
> already been fired without a corresponding read, then the server can simply 
> fire watch events without tracking.
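
The memory figures quoted in the issue can be checked with a short calculation.
The sketch below is only a back-of-the-envelope check: the class and method
names are hypothetical, TB is taken as 10^12 bytes, and the blacklist worst case
is modelled as one bit per client per znode instead of three, which is an
assumption made here to reproduce the stated 1/3 factor.

```java
// Hypothetical helper for checking the estimate; not part of ZooKeeper.
final class WatchMemoryEstimate {
    // Bytes needed when each client keeps bitsPerClientPerZnode bits per znode.
    static long bytesFor(long znodes, long clients, int bitsPerClientPerZnode) {
        long bits = Math.multiplyExact(
                Math.multiplyExact(znodes, clients), (long) bitsPerClientPerZnode);
        return bits / 8;
    }

    public static void main(String[] args) {
        long znodes = 100_000_000L; // 100m znodes, as in the issue
        long clients = 100_000L;    // 100k clients, each watching all nodes

        // Per-node watches: up to 3 bits per client per znode.
        long perNodeBytes = bytesFor(znodes, clients, 3);
        // Blacklist worst case, assumed here to be 1 bit per client per znode.
        long blacklistBytes = bytesFor(znodes, clients, 1);

        System.out.println("per-node watches (bytes):     " + perNodeBytes);
        System.out.println("blacklist worst case (bytes): " + blacklistBytes);
    }
}
```

With these inputs the per-node strategy comes to 3.75 x 10^12 bytes (3.75TB) and
the assumed blacklist worst case to 1.25 x 10^12 bytes, matching the numbers in
the issue text.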



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
