Re: [DISCUSS] couchdb 4.0 transactional semantics

Adam Kocoloski Tue, 14 Jul 2020 09:01:14 -0700

I think there’s tremendous value in being able to tell our users that each 
response served by CouchDB is constructed from a single isolated snapshot of 
the underlying database. I’d advocate for this being the default behavior of 
4.0.


If folks wanted to add an opt-in compatibility mode to support longer 
responses, I suppose that could be OK. I think we should discourage that access 
pattern in general, though, as it’s somewhat less friendly to various other 
parts of the stack than a pattern of shorter responses and a smart pagination 
API like the one we’re introducing. To wit, I don’t think we’d want to support 
that compatibility mode in IBM Cloud.
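
As a rough sketch of the "shorter responses plus pagination" pattern, assuming a toy in-memory store rather than the real 4.0 API: `Store`, `read_page`, `read_all`, and the `bookmark` value are hypothetical names chosen for illustration only. Each page is served from exactly one snapshot, and the client follows bookmarks to get the rest.

```python
from bisect import bisect_right


class Store:
    """Toy stand-in for a database (hypothetical, not CouchDB code)."""

    def __init__(self, docs):
        self.docs = dict(docs)

    def snapshot(self):
        # A real store would hand out an isolated read version;
        # here we simply take a sorted copy of the current contents.
        return sorted(self.docs.items())


def read_page(store, page_size, bookmark=None):
    """Serve one page from a single snapshot; return (rows, next_bookmark)."""
    snap = store.snapshot()
    keys = [k for k, _ in snap]
    # Resume strictly after the last key the client has already seen.
    start = 0 if bookmark is None else bisect_right(keys, bookmark)
    rows = snap[start:start + page_size]
    has_more = start + page_size < len(snap)
    next_bookmark = rows[-1][0] if rows and has_more else None
    return rows, next_bookmark


def read_all(store, page_size):
    """Client loop: follow bookmarks until the server stops returning one."""
    out, bookmark = [], None
    while True:
        rows, bookmark = read_page(store, page_size, bookmark)
        out.extend(rows)
        if bookmark is None:
            return out
```

In this sketch every individual response is internally consistent, while a client that walks all pages knowingly reads across several snapshots, which is the trade-off the pagination pattern makes explicit.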

Adam

> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson <rnew...@apache.org> wrote:
> 
> Thanks Nick, very helpful, and it vindicates me opening this thread.
> 
> I don't accept Mike Rhodes's argument at all, but I should explain why I don't:
> 
> In CouchDB 1.x, a response was generated from a single .couch file. There was 
> always a window between the start of the request as the client sees it and 
> CouchDB acquiring a snapshot of the relevant database. I don't think that gap 
> is meaningful, and it does not refute our statements of the time that CouchDB 
> responses are from a snapshot (specifically, that no change to the database 
> made _during_ the response will be visible in _this_ response). In CouchDB 
> 2.x (and continuing in 3.x), a CouchDB database typically consists of 
> multiple shards, each of which, once opened, remains snapshotted for the 
> duration of that response. The difference between 1.x and 2.x/3.x is that the 
> window is potentially larger (though the requests are issued in parallel). 
> The response, however much it returned, was impervious to changes in other 
> requests once it had begun.
> 
> I don't think _all_docs, _view or a non-continuous _changes response should 
> allow changes made in other requests to appear midway through them and I want 
> to hear the opinions of folks that have watched over CouchDB from its 
> earliest days on this specific point (If I must name names, at least Adam K, 
> Paul D, Jan L, Joan T). If there's a majority for deviating from this 
> semantic, I will go with the majority.
> 
> If we were to agree to preserve the 'single snapshot' behaviour, what would 
> the behaviour be if we can't honour it because of the FoundationDB 
> transaction limits?
> 
> I see a few options.
> 
> 1) We could end the response uncleanly, mid-response. CouchDB does this when 
> it has no alternative, and it is ugly, but it is usually handled well by 
> clients. They are at least not usually convinced they got a complete response 
> if they are using a competent HTTP client.
> 
> 2) We could disavow the streaming API, as you've suggested, and attempt to 
> gather the full response. If we manage this within the FDB bounds, we return 
> a 200 code and the response body; if we don't, a 400 and an error body.
> 
> 3) We could make the "limit" parameter mandatory and with an upper bound, in 
> combination with 1 or 2, such that a valid request is very likely to be 
> served within the limits.
> 
> I'd like to hear more voices on which way we want to break the unachievable 
> semantic of old where you could read _all_docs on a billion document database 
> over, uptime gods willing, a snapshot of the database.
> 
> B.
> 
>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc <vatam...@gmail.com> wrote:
>> 
>> Thanks for bringing the topic up for the discussion!
>> 
>> For background, this topic was discussed on the mailing list starting
>> in February, 2019
>> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E
>> 
>> The primary reason for the restart_tx option is to provide compatibility
>> for _changes feeds to allow older replicators to handle 4.0 sources.
>> It starts a new transaction after 5 seconds or so (a current FDB
>> limitation, might go up in the future) and transparently continues to
>> stream data where it left off. For example, when streaming [a,b,c,d], if it
>> times out after b, it will continue with c, d, etc. Currently this is also
>> used for other streaming APIs as an alternative to returning mangled
>> JSON after emitting a 200 response and streaming some of the rows.
>> However, it is not used for paginated responses, the new APIs developed
>> by Ilya, so users also have the option of getting the guaranteed snapshot
>> behavior.
>> 
>> And for completeness, if we decide to remove the option, we should
>> specify what happens if we remove it and get a transaction_too_old
>> exception. Currently the behavior would be to restart the transaction,
>> resend all the headers and all the rows again down the socket, which I
>> don't think anyone wants, but is what we'd get if we just set
>> {restart_tx, false}.
>> 
>>> I understand that automatically resetting the FDB txn during a response is 
>>> an attempt to work around that and maintain "compatibility" with CouchDB < 
>>> 4 semantics. I think it fails to do so and is very misleading.
>> 
>> It is a trade-off in order to keep the same API shape as before. Sure,
>> streaming all the docs with _all_docs or _changes feeds is not a great
>> pattern but many applications are implemented that way already.
>> Letting them migrate to 4.0 without having to rewrite the application
>> with the caveat that they might see a document updated in the
>> _all_docs stream after the request has already started, is a nicer
>> choice, I think, than forcing them to rewrite their application, which
>> could lead to a python 2/3 scenario.
>> 
>> Due to having multiple shards (Q>1), as discussed in the original
>> mailing thread by Mike
>> (https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E),
>> we don't provide a strict read-only snapshot guarantee in 2.x and 3.x
>> anyway, so users would already have to handle scenarios where a document
>> that wasn't there at the start of the request might appear in the stream.
>> Granted, that is a much smaller corner case, but I wonder how
>> many users care to handle that...
>> 
>> Currently users do have an option of using the new paginated API which
>> disables restart_tx behavior
>> https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/chttpd/src/chttpd_db.erl#L947,
>> though I am not sure what happens when a transaction_too_old exception
>> is thrown in that case (emit a bookmark?)
>> 
>> So based on the compatibility consideration, I'd vote to keep the
>> restart_tx option (configurable perhaps if we figure out what to do
>> when it is disabled) in order to allow users to migrate their
>> application to 4.0. At least informally we promised users to keep
>> strong API compatibility when we released 3.0 with an eye towards 4.0
>> (https://blog.couchdb.org/2020/02/26/the-road-to-couchdb-3-0/). I'd
>> think not emitting all the data in a _changes or _all_docs response
>> would break that compatibility more than using multiple transactions.
>> 
>> As for what happens when a transaction_too_old is thrown, I could see
>> an option passed in, something like, single_snapshot=true, and then
>> use Adam's suggestion to accumulate all the rows in memory and if we
>> hit the end of the transaction return a 400 error. We won't emit
>> anything out while rows are accumulated, so users don't get partial
>> data, it will be every row requested or a 400 error (so no chance of
>> perceived data loss). Users may retry if they think it was a temporary
>> hiccup or may use a smaller limit value.
>> 
>> Cheers,
>> -Nick
>> 
>> On Mon, Jul 13, 2020 at 2:05 PM Robert Samuel Newson <rnew...@apache.org> 
>> wrote:
>>> 
>>> Hi All,
>>> 
>>> I'm concerned to see the restart_fold function in fabric2_fdb 
>>> (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828)
>>>  in the 4.0 development branch.
>>> 
>>> The upshot of doing this is that a CouchDB response could be taken across 
>>> multiple snapshots of the database, which is not the behaviour of CouchDB 1 
>>> through 3.
>>> 
>>> I don't think this is ok (with the obvious and established exception of a 
>>> continuous changes feed, where new snapshots are continuously visible at 
>>> the end of the response).
>>> 
>>> FoundationDB imposes certain limits on transactions, the most notable being 
>>> the 5 second maximum duration. I understand that automatically resetting 
>>> the FDB txn during a response is an attempt to work around that and 
>>> maintain "compatibility" with CouchDB < 4 semantics. I think it fails to do 
>>> so and is very misleading.
>>> 
>>> Discuss.
>>> 
>>> B.
>>> 
>
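
To make the two behaviours debated in the quoted thread concrete, here is a minimal simulation, assuming a toy store whose transactions expire after a fixed number of row reads (a stand-in for the roughly 5 second FDB limit). `ToyDb`, `Txn`, `fold`, and `on_restart` are hypothetical names; this is not the fabric2_fdb implementation. With `restart_tx=True` the fold resumes after the last key in a fresh transaction; with `restart_tx=False` it buffers rows and reports a 400 if the transaction expires, as in the single_snapshot idea.

```python
class TransactionTooOld(Exception):
    """Stand-in for FoundationDB's transaction_too_old error."""


class Txn:
    def __init__(self, snap, max_reads):
        self.snap, self.left = snap, max_reads

    def rows_after(self, key):
        # Yield rows with keys strictly greater than `key`,
        # failing once the read budget for this transaction is spent.
        for k, v in self.snap:
            if key is not None and k <= key:
                continue
            if self.left == 0:
                raise TransactionTooOld()
            self.left -= 1
            yield k, v


class ToyDb:
    def __init__(self, docs):
        self.docs = dict(docs)

    def begin(self, max_reads):
        # Each transaction reads from a frozen, sorted copy of the data.
        return Txn(sorted(self.docs.items()), max_reads)


def fold(db, max_reads, restart_tx, on_restart=lambda: None):
    """Return (status, rows). Assumes max_reads >= 1."""
    rows, last = [], None
    while True:
        txn = db.begin(max_reads)
        try:
            for k, v in txn.rows_after(last):
                rows.append((k, v))
                last = k
            return 200, rows
        except TransactionTooOld:
            if not restart_tx:
                # Rows were only buffered, so nothing partial was sent.
                return 400, []
            on_restart()  # hook where concurrent writes may land
```

For example, folding over four documents with `max_reads=2` and a writer that updates a later document inside `on_restart` yields a 200 response whose rows come from two different snapshots, whereas the same fold with `restart_tx=False` yields a 400 and no rows.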
