Agreed on all 4 points. On the final point, it's worth noting that a continuous 
changes feed was two-phase: the first phase is indeed over a snapshot of the db 
as of the start of the _changes request, and the second phase is an endless 
series of subsequent snapshots. The 4.0 behaviour won't exactly match that, but 
it's definitely in the same spirit.

Agreed also on requiring pagination (I've not reviewed the proposed pagination 
API in sufficient detail to +1 it yet). Would we start the response as rows are 
retrieved, though? That's my preference, with an unclean termination if we hit 
txn_too_old, and an upper bound on the "limit" parameter or equivalent chosen 
such that txn_too_old is vanishingly unlikely.
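
To make that preference concrete, here is a rough sketch in Python (not the 
actual CouchDB code, which is Erlang). The names `MAX_LIMIT`, `TxnTooOld` and 
`stream_rows` are hypothetical, and the cap value is an arbitrary placeholder. 
The point is only that rows go out as they are read, and that on txn_too_old 
the closing of the JSON object is never sent, so a client cannot mistake a 
truncated response for a complete one.

```python
import json
from itertools import islice
from typing import Iterable, Iterator

MAX_LIMIT = 2000  # hypothetical upper bound; the real value would need tuning


class TxnTooOld(Exception):
    """Hypothetical stand-in for the FDB transaction expiring mid-read."""


def stream_rows(rows: Iterable[dict], limit: int) -> Iterator[str]:
    """Yield JSON chunks as rows are retrieved from a single transaction."""
    limit = min(limit, MAX_LIMIT)  # clamp whatever the client asked for
    yield '{"rows":['
    try:
        for i, row in enumerate(islice(rows, limit)):
            yield ("," if i else "") + json.dumps(row)
    except TxnTooOld:
        # Unclean termination: stop without emitting the closing "]}",
        # leaving the body deliberately unparseable.
        return
    yield "]}"
```

A client that joins the chunks gets valid JSON only when the whole page was 
read inside one transaction.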

On compatibility, there's precedent for a minor release of old branches just to 
add replicator compatibility. For example, the replicator could call _changes 
again if it received a complete _changes response (i.e., one that ended with a } 
that completes the JSON object) that did not include a "last_seq" row. The 4.0 
replicator would always do this.
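
As a rough illustration of that replicator loop (a Python sketch, not the real 
replicator, which is Erlang): `fetch` is a hypothetical callable that performs 
one _changes request from a given `since` value and returns the response body 
as text, and `max_calls` is an arbitrary safety bound. The sketch assumes a 
response without "last_seq" still carries a "results" array whose rows each 
have a "seq" field.

```python
import json
from typing import Callable, List, Optional, Tuple


def read_changes(
    fetch: Callable[[Optional[str]], str],
    since: Optional[str] = None,
    max_calls: int = 100,
) -> Tuple[List[dict], str]:
    """Call _changes repeatedly until a response includes "last_seq"."""
    rows: List[dict] = []
    for _ in range(max_calls):
        # json.loads raises on a truncated body, so only complete JSON
        # objects get past this line.
        body = json.loads(fetch(since))
        batch = body.get("results", [])
        rows.extend(batch)
        if "last_seq" in body:
            return rows, body["last_seq"]
        if not batch:
            raise RuntimeError("no last_seq and no rows; cannot make progress")
        # Complete response but no "last_seq": resume after the last row seen.
        since = batch[-1]["seq"]
    raise RuntimeError("gave up after max_calls requests")
```

An older replicator patched this way would behave as before against older 
servers, since their complete responses always include "last_seq".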

B.

> On 16 Jul 2020, at 17:25, Paul Davis <paul.joseph.da...@gmail.com> wrote:
> 
> From what I'm reading it sounds like we have general consensus on a few 
> things:
> 
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any
> streaming API that hit a transaction boundary (because data
> loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was
> never a single snapshot.
> 
> If everyone generally agrees with that summarization, my suggestion
> would be that we just revisit the new pagination APIs and make them
> the only behavior rather than having them be opt-in. I believe those
> APIs already address all the concerns in this thread and the only
> reason we kept the older versions with `restart_tx` was to maintain
> API backwards compatibility at the expense of a slight change to
> semantics of snapshots. However, if there's a consensus that the
> semantics are more important than allowing a blanket `GET
> /db/_all_docs` I think it'd make the most sense to just embrace the
> pagination APIs that already exist and were written to cover these
> issues.
> 
> The only thing I'm not 100% on is how to deal with non-continuous
> replications. I.e., the older single shot replication. Do we go back
> with patches to older replicators to allow 4.0 compatibility? Just
> declare that you have to mediate a replication on the newer of the two
> CouchDB deployments? Sniff the replicator's UserAgent and behave
> differently on 4.x for just that special case?
> 
> Paul
> 
> On Wed, Jul 15, 2020 at 7:25 PM Adam Kocoloski <kocol...@apache.org> wrote:
>> 
>> Sorry, I also missed that you quoted this specific bit about eagerly 
>> requesting a new snapshot. Currently the code will just react to the 
>> transaction expiring, then wait till it acquires a new snapshot if 
>> “restart_tx” is set (which can take a couple of milliseconds on a 
>> FoundationDB cluster that is deployed across multiple AZs in a cloud Region) 
>> and then proceed.
>> 
>> Adam
>> 
>>> On Jul 15, 2020, at 6:54 PM, Adam Kocoloski <kocol...@apache.org> wrote:
>>> 
>>> Right now the code has an internal “restart_tx” flag that is used to 
>>> automatically request a new snapshot if the original one expires and 
>>> continue streaming the response. It can be used for all manner of multi-row 
>>> responses, not just _changes.
>>> 
>>> As this is a pretty big change to the isolation guarantees provided by the 
>>> database Bob volunteered to elevate the issue to the mailing list for a 
>>> deeper discussion.
>>> 
>>> Cheers, Adam
>>> 
>>>> On Jul 15, 2020, at 11:38 AM, Joan Touzet <woh...@apache.org> wrote:
>>>> 
>>>> I'm having trouble following the thread...
>>>> 
>>>> On 14/07/2020 14:56, Adam Kocoloski wrote:
>>>>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>>>>> streaming an entire _changes feed), there is a small performance benefit 
>>>>> to requesting a new FDB transaction asynchronously before the old one 
>>>>> actually times out and swapping over to it. That’s a pattern I’ve seen in 
>>>>> other FDB layers but I’m not sure we’ve used it anywhere in CouchDB yet.
>>>> 
>>>> How does _changes work right now in the proposed 4.0 code?
>>>> 
>>>> -Joan
>>> 
>> 
