Hi Adam,

In general, I like this idea, especially with the future possibility of adding transactions to CouchDB. What makes me a little nervous is that a user needs a fair amount of knowledge of both CouchDB and FDB to fully understand what is happening, so this could be a place where a user gets it horribly wrong or causes unnecessary issues. I would prefer that a user has to explicitly opt into this functionality, either by changing config or by adding another field in the HTTP header or a query parameter.
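
To make the opt-in idea a bit more concrete, here is a rough Python sketch of how a request handler might decide whether to honour a client-supplied version. Everything named here is hypothetical and only for illustration: the `enabled` config flag, the `X-Couch-Read-Version` header and the `read_version` query parameter are not part of any existing or proposed CouchDB API.

```python
from typing import Mapping, Optional

# Hypothetical names, invented for this sketch only.
HEADER = "x-couch-read-version"
PARAM = "read_version"


def requested_read_version(
    headers: Mapping[str, str],
    query: Mapping[str, str],
    enabled: bool,
) -> Optional[int]:
    """Return the version the client asked for, or None.

    None means "acquire a fresh read version as usual", which is the
    default unless the operator has switched the feature on AND the
    client has explicitly supplied a version.
    """
    if not enabled:
        # Feature is off in config: ignore anything the client sent.
        return None

    raw = None
    for name, value in headers.items():
        if name.lower() == HEADER:  # header names are case-insensitive
            raw = value
    if raw is None:
        raw = query.get(PARAM)
    if raw is None:
        return None

    version = int(raw)  # raises ValueError on non-numeric input
    if version < 0:
        raise ValueError("read version must be non-negative")
    return version
```

With something like this, a request that carries no version, or that arrives while the config flag is off, behaves exactly as it does today.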
Cheers
Garren

On Fri, Sep 20, 2019 at 12:11 AM Adam Kocoloski <kocol...@apache.org> wrote:

> Hi all,
>
> As we’ve gotten more familiar with FoundationDB we’ve come to realize that
> acquiring a read version at the beginning of a transaction is a relatively
> expensive[*] operation. It’s also a challenging one to scale given the
> amount of communication required between proxies and tlogs in order to
> agree on a good version. The prototype CouchDB layer we’ve been working on
> (i.e., the beginnings of CouchDB 4.0) uses a separate FDB transaction with
> a new read version for every request made to CouchDB. I wanted to start a
> discussion about ways we might augment that approach while preserving (or
> even enhancing) the semantics that we can expose to CouchDB users.
>
> One thing we can do is cache known versions that FDB has supplied in the
> past second in the CouchDB layer and reuse those when a client permits us
> to do so. If you like, this is the modern version of `?stale=ok`, but now
> applicable to all types of requests. One big downside of this approach is
> that if you scale out the members of the CouchDB layer they’ll have
> different views of recent FDB versions, and a client whose requests are
> load-balanced across members won’t have any guarantee that time moves
> forward from request to request. You could imagine gossiping versions
> between layer members, but now you’re basically redoing the work that
> FoundationDB is doing itself.
>
> Another approach is to communicate the FDB version as part of the response
> to each request, and allow the client to set an FDB version as part of a
> submitted request. Clients that do this will experience lower latencies for
> requests 2..N that share a version, will have the benefit of a consistent
> snapshot of the database for all the reads that are executed using the same
> version, and can guarantee they read their own writes when interleaving
> those operations (assuming any reads following a write use the new FDB
> version associated with the write).
>
> These techniques are not mutually exclusive; a client could acquire a
> slightly stale FDB version and then use that for a collection of read
> requests that would all observe the same consistent snapshot of the
> database. Also, recall that a CouchDB sequence is now essentially the same
> as an FDB version, with a little extra metadata to ensure sequences are
> always monotonically increasing even when moving a database to a different
> FDB cluster. So if you like, this is about allowing requests to be executed
> as of a certain sequence (provided that sequence is no more than 5 seconds
> old).
>
> I’m refraining from proposing any specific API extensions at this point,
> partly because that’s an easy bikeshed and partly because I think whatever
> API we’d add would be a primitive that client libraries would use to
> construct richer semantics around. I’m also biting my tongue and avoiding
> any detailed discussion of the transactional capabilities that CouchDB
> could offer by surfacing these versions to clients - but that’s definitely
> an interesting topic in its own right!
>
> Curious to hear what you all think. Thanks, Adam
>
> [*]: I don’t want to come off as alarmist; when I say this operation is
> “expensive” I mean it might take a couple of milliseconds depending on FDB
> configuration, and FDB can execute 10s of thousands of these per second
> without much tuning. But it’s always good to be looking for the next
> bottleneck :)
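
For reference, the version caching described in the quoted message (reuse a version FDB supplied within the past second, but only when the client permits it) could be sketched roughly as below. This is a minimal, single-process illustration under assumptions of my own: `ReadVersionCache` and its parameters are made-up names, and `fetch_version` is a stand-in for whatever call the layer uses to obtain a fresh read version from FDB. It does nothing about the cross-member consistency problem raised in the same paragraph.

```python
import time
from typing import Callable, Optional


class ReadVersionCache:
    """Remember the most recent read version for a short time.

    `fetch_version` is a hypothetical stand-in for asking FDB for a
    fresh read version; `max_age` is the staleness bound in seconds.
    """

    def __init__(
        self,
        fetch_version: Callable[[], int],
        max_age: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_version
        self._max_age = max_age
        self._clock = clock
        self._version: Optional[int] = None
        self._fetched_at = 0.0

    def get(self, allow_stale: bool = False) -> int:
        now = self._clock()
        fresh_enough = (
            self._version is not None
            and now - self._fetched_at <= self._max_age
        )
        if allow_stale and fresh_enough:
            # The client opted in, so skip the round trip to FDB.
            return self._version
        # Default path: always ask for a new version, and remember it
        # for later requests that do allow staleness.
        version = self._fetch()
        self._version = version
        self._fetched_at = now
        return version
```

Requests that do not pass `allow_stale=True` always pay for a fresh version, so the current behaviour stays the default.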