Re: [DISCUSS] couchdb 4.0 transactional semantics

Joan Touzet Thu, 16 Jul 2020 13:53:15 -0700



On 2020-07-16 4:50 p.m., Joan Touzet wrote:

On 2020-07-16 2:24 p.m., Robert Samuel Newson wrote:
Agreed on all 4 points. On the final point, it's worth noting that a continuous changes feed was two-phase: the first phase is indeed over a snapshot of the db as of the start of the _changes request, and the second phase is an endless series of subsequent snapshots. The 4.0 behaviour won't exactly match that but it's definitely in the same spirit.
Agreed also on requiring pagination (I've not reviewed the proposed pagination api in sufficient detail to +1 it yet). Would we start the response as rows are retrieved, though? That's my preference, with an unclean termination if we hit txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen such that txn_too_old is vanishingly unlikely.
On compatibility, there's precedent for a minor release of old branches just to add replicator compatibility. For example, the replicator could call _changes again if it received a complete _changes response (i.e., one that ended with a } that completes the json object) that did not include a "last_seq" row. The 4.0 replicator would always do this.
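The retry rule described above can be sketched roughly as follows. This is only an illustration, not the actual replicator code: `fetch_all_changes`, `get_changes` and `fake_get_changes` are hypothetical names, sequences are plain integers rather than CouchDB's opaque sequence values, and resuming from the last row's "seq" is an assumption about how the follow-up request would be made.

```python
import json

def fetch_all_changes(get_changes, since=0):
    """Call get_changes(since) repeatedly until a response has "last_seq".

    get_changes is a hypothetical callable that returns the raw text of a
    _changes response body.
    """
    rows = []
    while True:
        # json.loads raises on a truncated body, so only complete JSON
        # objects ever reach the "last_seq" check below.
        body = json.loads(get_changes(since))
        rows.extend(body["results"])
        if "last_seq" in body:
            return rows, body["last_seq"]
        if not body["results"]:
            raise RuntimeError("no last_seq and no rows: cannot make progress")
        # Assumption: resume from the last sequence already received.
        since = body["results"][-1]["seq"]

# Stand-in server: at most two rows per response, "last_seq" only at the end.
CHANGES = [{"seq": i, "id": f"doc{i}"} for i in range(1, 6)]

def fake_get_changes(since, per_txn=2):
    pending = [c for c in CHANGES if c["seq"] > since]
    body = {"results": pending[:per_txn]}
    if len(pending) <= per_txn:
        body["last_seq"] = CHANGES[-1]["seq"]
    return json.dumps(body)
```

Calling `fetch_all_changes(fake_get_changes)` makes three requests against the stand-in server and returns all five rows together with the final sequence.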
I wouldn't really want to release a new 1.x, would you? Augh.
If we're going to change how replication works, wouldn't it be better to simply say "there is no guaranteed one-shot replication back from 4.x to 1.x?" Or, intentionally break backward compatibility so one-shot replication to un-upgraded old Couches refuses to work at all? This would prevent the confusion by making it clear - you can't do things this way anymore.

Sorry, meant to say we publish that the workaround is you need either a "push" replication from 4.x -> 1.x, or must use a hypothetically patched 3.x+ replicator as a "third party" to replicate successfully from 4.x -> non-patched older CouchDBs.

I'd rather support this scenario than have to support explaining why the "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, is returning results "ahead" of the time at which the one-shot replication was started.

We could do a point release of 3.x, sure.

-Joan
B.
On 16 Jul 2020, at 17:25, Paul Davis <[email protected]> wrote:
From what I'm reading it sounds like we have general consensus on a few things:
1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any
streaming API that hit a transaction boundary (because data
loss/corruption)
3. We're willing to change the API requirements so that 2 is not an issue.
4. None of this applies to continuous changes since that API call was
never a single snapshot.

If everyone generally agrees with that summarization, my suggestion
would be that we just revisit the new pagination APIs and make them
the only behavior rather than having them be opt-in. I believe those
APIs already address all the concerns in this thread and the only
reason we kept the older versions with `restart_tx` was to maintain
API backwards compatibility at the expense of a slight change to
semantics of snapshots. However, if there's a consensus that the
semantics are more important than allowing a blanket `GET
/db/_all_docs` I think it'd make the most sense to just embrace the
pagination APIs that already exist and were written to cover these
issues.
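To make the "one API call, one transaction" idea concrete, here is a minimal sketch of paginated reads. It is not the proposed CouchDB pagination API: `get_page`, `all_docs`, `MAX_LIMIT`, `start_after` and the "next" field are all hypothetical names, and copying a dict stands in for reading inside a single FDB transaction. The server caps the page size so that each page fits comfortably within one transaction, and the client follows the cursor until no pages remain.

```python
MAX_LIMIT = 3  # hypothetical server-side cap, kept tiny for the demo

def get_page(db, start_after=None, limit=MAX_LIMIT):
    """Serve one page from a single consistent view of db."""
    limit = min(limit, MAX_LIMIT)  # never serve more than the cap
    snapshot = dict(db)            # stands in for one FDB transaction
    keys = sorted(k for k in snapshot if start_after is None or k > start_after)
    page = keys[:limit]
    return {
        "rows": [{"id": k, "value": snapshot[k]} for k in page],
        # Cursor for the following page, or None when this is the last one.
        "next": page[-1] if len(keys) > limit else None,
    }

def all_docs(db):
    """Client side: follow "next" until the server reports no more pages."""
    cursor, rows = None, []
    while True:
        resp = get_page(db, start_after=cursor, limit=1000)
        rows.extend(resp["rows"])
        cursor = resp["next"]
        if cursor is None:
            return rows
```

Each page is internally consistent, but consecutive pages may come from different snapshots, which is the trade-off the client accepts by paginating.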

The only thing I'm not 100% on is how to deal with non-continuous
replications. I.e., the older single shot replication. Do we go back
with patches to older replicators to allow 4.0 compatibility? Just
declare that you have to mediate a replication on the newer of the two
CouchDB deployments? Sniff the replicator's UserAgent and behave
differently on 4.x for just that special case?

Paul
On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski <[email protected]> wrote:
Sorry, I also missed that you quoted this specific bit about eagerly requesting a new snapshot. Currently the code will just react to the transaction expiring, then wait till it acquires a new snapshot if “restart_tx” is set (which can take a couple of milliseconds on a FoundationDB cluster that is deployed across multiple AZs in a cloud Region) and then proceed.
Adam
On Jul 15, 2020, at 6:54 PM, Adam Kocoloski <[email protected]> wrote:
Right now the code has an internal “restart_tx” flag that is used to automatically request a new snapshot if the original one expires and continue streaming the response. It can be used for all manner of multi-row responses, not just _changes.
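The effect of such a flag on isolation can be illustrated with a small simulation. This is not the CouchDB code: `Snapshot`, `stream_rows`, `TransactionTooOld` and the read "budget" (a stand-in for the transaction time limit) are hypothetical, and a dict copy stands in for an FDB snapshot.

```python
class TransactionTooOld(Exception):
    """Stand-in for the error raised when a transaction outlives its limit."""

class Snapshot:
    def __init__(self, data, budget):
        self.data = dict(data)  # frozen view of the data at creation time
        self.budget = budget    # reads allowed before the snapshot "expires"

    def read_after(self, key):
        """Return the first (key, value) after `key`, or None at the end."""
        if self.budget == 0:
            raise TransactionTooOld()
        self.budget -= 1
        later = sorted(k for k in self.data if key is None or k > key)
        return (later[0], self.data[later[0]]) if later else None

def stream_rows(db, budget, restart_tx):
    if budget < 1:
        raise ValueError("budget must be at least 1")
    snap = Snapshot(db, budget)
    last_key = None
    while True:
        try:
            row = snap.read_after(last_key)
        except TransactionTooOld:
            if not restart_tx:
                raise                    # unclean termination of the stream
            snap = Snapshot(db, budget)  # new snapshot of the *current* data
            continue                     # resume after the last row sent
        if row is None:
            return
        last_key = row[0]
        yield row
```

If `db` starts as `{"a": 1, "b": 1, "c": 1}` and both "a" and "c" are updated to 2 after the first row has been streamed, then with `budget=2` and `restart_tx=True` the stream yields ("a", 1), ("b", 1), ("c", 2): a mix of two snapshots in one response. With `restart_tx=False` the same stream raises `TransactionTooOld` after two rows instead.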
As this is a pretty big change to the isolation guarantees provided by the database, Bob volunteered to elevate the issue to the mailing list for a deeper discussion.
Cheers, Adam
On Jul 15, 2020, at 11:38 AM, Joan Touzet <[email protected]> wrote:

I'm having trouble following the thread...

On 14/07/2020 14:56, Adam Kocoloski wrote:
For cases where you’re not concerned about the snapshot isolation (e.g. streaming an entire _changes feed), there is a small performance benefit to requesting a new FDB transaction asynchronously before the old one actually times out and swapping over to it. That’s a pattern I’ve seen in other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
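A rough sketch of that eager-swap pattern, using only the Python standard library and no FDB calls: `open_txn`, `stream_with_prefetch`, `reads_per_txn` and `prefetch_at` are hypothetical names, a short sleep stands in for the latency of starting a transaction, and a read count stands in for the transaction time limit.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def open_txn(db):
    """Hypothetical stand-in for starting a transaction: pay a small
    latency, then return a frozen copy of the data."""
    time.sleep(0.01)
    return dict(db)

def stream_with_prefetch(db, keys, reads_per_txn, prefetch_at):
    """Yield (key, value) pairs, swapping to a pre-requested transaction
    once the current one has served reads_per_txn reads."""
    if not 0 <= prefetch_at < reads_per_txn:
        raise ValueError("need 0 <= prefetch_at < reads_per_txn")
    with ThreadPoolExecutor(max_workers=1) as pool:
        txn, used, pending = open_txn(db), 0, None
        for key in keys:
            if used == reads_per_txn:
                # Old transaction is spent; the replacement was requested
                # earlier, so result() usually returns without waiting.
                txn, used, pending = pending.result(), 0, None
            if used == prefetch_at and pending is None:
                # Request the next transaction in the background, early.
                pending = pool.submit(open_txn, db)
            used += 1
            yield key, txn[key]
```

The rows returned are the same as with a reactive restart; the only difference is that the latency of acquiring the next transaction overlaps with serving rows from the current one.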
How does _changes work right now in the proposed 4.0 code?

-Joan
