Solution (2) appeals to me for its conceptual simplicity -- and having a stateless CouchDB layer I feel is super important in simplifying overall CouchDB deployment going forward.
-- Mike. On Mon, 4 Feb 2019, at 20:11, Adam Kocoloski wrote: > Probably good to take a quick step back and note that FoundationDB’s > versionstamps are an elegant and scalable solution to atomically > maintaining the index of documents in the order in which they were most > recently updated. I think that’s what you mean by the first part of the > problem, but I want to make sure that on the ML here we collectively > understand that FoundationDB actually nails this hard part of the > problem *really* well. > > When you say “notify CouchDB about new updates”, are you referring to > the feed=longpoll or feed=continuous options to the _changes API? I > guess I see three different routes that can be taken here. > > One route is to use the same kind machinery that we have in place today > in CouchDB 2.x. As a reminder, the way this works is > > - a client waiting for changes on a DB spawns one local process and > also a rexi RPC process on each node hosting one of the DB shards of > interest (see fabric_db_update_listener). > - those RPC processes register as local couch_event listeners, where > they receive {db_updated, ShardName} messages forwarded to them from > the couch_db_updater processes. > > Of course, in the FoundationDB design we don’t need to serialize > updates in couch_db_updater processes, but individual writers could > just as easily fire off those db_updated messages. This design is > already heavily optimized for large numbers of listeners on large > numbers of databases. The downside that I can see is it means the > *CouchDB layer nodes would need to form a distributed Erlang cluster* > in order for them to learn about the changes being committed from other > nodes in the cluster. > > So let’s say we *didn’t* want to do that and we rather are trying to > design for completely independent layer nodes that have no knowledge of > or communication with one another save through FoundationDB. There’s > definitely something to be said for that constraint. One very simple > approach might be to just poll FoundationDB. If the database is under a > heavy write load there’s no overhead to this approach; every time we > finish sending the output of one range query against the versionstamp > space and we re-acquire a new read version there will be new updates to > consume. Where it gets inefficient is if we have a lot of listeners on > the feed and a very low-throughput database. But one fiddle with > polling intervals, or else have a layer of indirection so only one > process on each layer node is doing the polling and then sending events > to couch_event. I think this could scale quite far. > > The other option (which I think is the one you’re homing in on) is to > leverage FoundationDB’s watchers to get a push notification about > updates to a particular key. I would be cautious about creating a > specific key or set of keys specifically for this purpose, but, if we > find that there’s some other bit of metadata that we are needing to > maintain anyway then this could work nicely. I think same indirection > that I described above (where each layer node has a maximum of one > watcher per database, and it re-broadcasts messages to all interested > clients via couch_event) would help us not be too constrained by the > limit on watches. > > So to recap, the three approaches > > 1. Writers publish db_updated events to couch_event, listeners use > distributed Erlang to subscribe to all nodes > 2. Poll the _changes subspace, scale by nominating a specific process > per node to do the polling > 3. Same as #2 but using a watch on DB metadata that changes with every > update instead of polling > > Adam > > > On Feb 4, 2019, at 2:18 PM, Ilya Khlopotov <iil...@apache.org> wrote: > > > > One of the features of CouchDB, which doesn't map cleanly into FoudationDB > > is changes feed. The essence of the feature is: > > - Subscriber of the feed wants to receive notifications when database is > > updated. > > - The notification includes update_seq for the database and list of changes > > which happen at that time. > > - The change itself includes docid and rev. > > Hi, > > > > There are multiple ways to easily solve this problem. Designing a scalable > > way to do it is way harder. > > > > There are at least two parts to this problem: > > - how to structure secondary indexes so we can provide what we need in > > notification event > > - how to notify CouchDB about new updates > > > > For the second part of the problem we could setup a watcher on one of the > > keys we have to update on every transaction. For example the key which > > tracks the database_size or key which tracks the number of documents or we > > can add our own key. The problem is at some point we would hit a capacity > > limit for atomic updates of a single key (FoundationDB doesn't redistribute > > the load among servers on per key basis). In such case we would have to > > distribute the counter among multiple keys to allow FoundationDB to split > > the hot range. Therefore, we would have to setup multiple watches. > > FoundationDB has a limit on the number of watches the client can setup > > (100000 watches). So we need to keep in mind this number when designing the > > feature. > > > > The single key update rate problem is very theoretical and we might ignore > > it for the PoC version. Then we can measure the impact and change design > > accordingly. The reason I decided to bring it up is to see maybe someone > > has a simple solution to avoid the bottleneck. > > > > best regards, > > iilyak > >