Re: [DISCUSS] couchdb 4.0 transactional semantics

Jan Lehnardt Wed, 15 Jul 2020 06:47:21 -0700


> On 14. Jul 2020, at 18:00, Adam Kocoloski <[email protected]> wrote:
> 
> I think there’s tremendous value in being able to tell our users that each 
> response served by CouchDB is constructed from a single isolated snapshot of 
> the underlying database. I’d advocate for this being the default behavior of 
> 4.0.


I too am in favour of this. I apologise for not speaking up in the earlier 
thread, which I followed closely, but never found the time to respond to.

From rnewson’s options, I’d suggest option 3, the mandatory limit parameter. 
While this does indeed mean a BC break, it teaches the right semantics for 
folks on 4.0 and onwards. For client libraries like our own nano, we can easily 
wrap this behaviour, so the resulting API is still mostly compatible (at least 
when used in streaming mode; less so when buffering a big _all_docs response).
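
To illustrate the kind of wrapping a client library could do, here is a 
minimal Python sketch. It is not nano's actual code: fetch_page is a 
hypothetical callable standing in for a GET on _all_docs with limit and 
startkey, and make_fake_fetch is a made-up in-memory stand-in so the sketch 
runs without a server. It asks for one row more than the page size and uses 
that extra row's key as the next startkey.

```python
from typing import Callable, Dict, Iterator, List, Optional, Tuple

Row = Dict[str, str]
# Hypothetical signature: fetch_page(limit, startkey) -> rows sorted by key.
FetchPage = Callable[[int, Optional[str]], List[Row]]


def iter_all_docs(fetch_page: FetchPage, page_size: int) -> Iterator[Row]:
    """Stream every row while only ever issuing bounded (limit) requests."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    startkey: Optional[str] = None
    while True:
        # Ask for one extra row; it tells us where the next page starts.
        rows = fetch_page(page_size + 1, startkey)
        for row in rows[:page_size]:
            yield row
        if len(rows) <= page_size:
            return  # short page: nothing left to fetch
        startkey = rows[page_size]["key"]


def make_fake_fetch(doc_ids: List[str]) -> Tuple[FetchPage, List[tuple]]:
    """In-memory stand-in for the server (an assumption for this sketch)."""
    ids = sorted(doc_ids)
    calls: List[tuple] = []

    def fetch_page(limit: int, startkey: Optional[str]) -> List[Row]:
        calls.append((limit, startkey))
        matching = [i for i in ids if startkey is None or i >= startkey]
        return [{"id": i, "key": i} for i in matching[:limit]]

    return fetch_page, calls


if __name__ == "__main__":
    fetch, calls = make_fake_fetch(["a", "b", "c", "d", "e"])
    for row in iter_all_docs(fetch, page_size=2):
        print(row["id"])
    print(len(calls), "bounded requests issued")
```

Each page in this sketch is a separate request, so only the individual 
responses are single-snapshot; the stream as a whole can span several 
snapshots, which is why the wrapped API is only mostly compatible.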

> If folks wanted to add an opt-in compatibility mode to support longer 
> responses, I suppose that could be OK. I think we should discourage that 
> access pattern in general, though, as it’s somewhat less friendly to various 
> other parts of the stack than a pattern of shorter responses and a smart 
> pagination API like the one we’re introducing. To wit, I don’t think we’d 
> want to support that compatibility mode in IBM Cloud.

Like Adam, I do not mind a compat mode, either through a different API 
endpoint, or even a config option. I think we will be fine in getting people on 
this path when we document this in our update guide for the 4.0 release. I 
don’t think this will lead to a Python 2/3 situation overall, because the 4.0+ 
features are compelling enough for the relatively small changes required, and 
CouchDB 3.x in its then latest form will continue to be a fine database for 
years to come, for folks who can’t upgrade as easily. So yes, I anticipate 
we’ll live in a two-versions world a little longer than we did during 1.x to 
2.x, but the reasons to leave 1.x behind were a little more severe than the 
improvements of 4.x over 3.x (while still significant, of course).

Best
Jan
—

> 
> Adam
> 
>> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson <[email protected]> 
>> wrote:
>> 
>> Thanks Nick, very helpful, and it vindicates me opening this thread.
>> 
>> I don't accept Mike Rhodes' argument at all, but I should explain why I don't:
>> 
>> In CouchDB 1.x, a response was generated from a single .couch file. There 
>> was always a window between the start of the request as the client sees it 
>> and CouchDB acquiring a snapshot of the relevant database. I don't think 
>> that gap is meaningful, and it does not refute our statements of the time that 
>> CouchDB responses are from a snapshot (specifically, that no change to the 
>> database made _during_ the response will be visible in _this_ response). In 
>> CouchDB 2.x (and continuing in 3.x), a CouchDB database typically consists 
>> of multiple shards, each of which, once opened, remains snapshotted for the 
>> duration of that response. The difference between 1.x and 2.x/3.x is that 
>> the window is potentially larger (though the requests are issued in 
>> parallel). The response, however much it returns, is impervious to changes 
>> made by other requests once it has begun.
>> 
>> I don't think _all_docs, _view or a non-continuous _changes response should 
>> allow changes made in other requests to appear midway through them and I 
>> want to hear the opinions of folks that have watched over CouchDB from its 
>> earliest days on this specific point (If I must name names, at least Adam K, 
>> Paul D, Jan L, Joan T). If there's a majority for deviating from this 
>> semantic, I will go with the majority.
>> 
>> If we were to agree to preserve the 'single snapshot' behaviour, what would 
>> the behaviour be if we can't honour it because of the FoundationDB 
>> transaction limits?
>> 
>> I see a few options.
>> 
>> 1) We could end the response uncleanly, mid-response. CouchDB does this when 
>> it has no alternative, and it is ugly, but it is usually handled well by 
>> clients. They are at least not usually convinced they got a complete 
>> response if they are using a competent HTTP client.
>> 
>> 2) We could disavow the streaming API, as you've suggested, and attempt to 
>> gather the full response. If we can do this within the FDB bounds, we return 
>> a 200 code and the response body; if we can't, a 400 and an error body.
>> 
>> 3) We could make the "limit" parameter mandatory and with an upper bound, in 
>> combination with 1 or 2, such that a valid request is very likely to be 
>> served within the limits.
>> 
>> I'd like to hear more voices on which way we want to break the unachievable 
>> semantic of old where you could read _all_docs on a billion document 
>> database over, uptime gods willing, a snapshot of the database.
>> 
>> B.
>> 
>>> On 13 Jul 2020, at 21:15, Nick Vatamaniuc <[email protected]> wrote:
>>> 
>>> Thanks for bringing the topic up for the discussion!
>>> 
>>> For background, this topic was discussed on the mailing list starting
>>> in February, 2019
>>> https://lists.apache.org/thread.html/r02cee7045cac4722e1682bb69ba0ec791f5cce025597d0099fb34033%40%3Cdev.couchdb.apache.org%3E
>>> 
>>> The primary reason for restart_tx option is to provide compatibility
>>> for _changes feeds to allow older replicators to handle 4.0 sources.
>>> It starts a new transaction after 5 seconds or so (a current FDB
>>> limitation, might go up in the future) and transparently continues to
>>> stream data where it left off. For example, when streaming [a,b,c,d], if
>>> the transaction times out after b, it will continue with c, d, etc.
>>> Currently this is also
>>> used for other streaming APIs as an alternative to returning mangled
>>> JSON after emitting a 200 response and streaming some of the rows.
>>> However, it is not used for paginated responses, the new APIs developed
>>> by Ilya. So users have the option of getting the guaranteed snapshot
>>> behavior as well.
>>> 
>>> And for completeness, if we decide to remove the option, we should
>>> specify what happens if we remove it and get a transaction_too_old
>>> exception. Currently the behavior would be to restart the transaction,
>>> resend all the headers and all the rows again down the socket, which I
>>> don't think anyone wants, but is what we'd get if we just set
>>> {restart_tx, false}.
>>> 
>>>> I understand that automatically resetting the FDB txn during a response is 
>>>> an attempt to work around that and maintain "compatibility" with CouchDB < 
>>>> 4 semantics. I think it fails to do so and is very misleading.
>>> 
>>> It is a trade-off in order to keep the same API shape as before. Sure,
>>> streaming all the docs with _all_docs or _changes feeds is not a great
>>> pattern but many applications are implemented that way already.
>>> Letting them migrate to 4.0 without having to rewrite the application
>>> with the caveat that they might see a document updated in the
>>> _all_docs stream after the request has already started, is a nicer
>>> choice, I think, than forcing them to rewrite their application, which
>>> could lead to a Python 2/3 scenario.
>>> 
>>> Due to having multiple shards (Q>1), as discussed in the original
>>> mailing thread by Mike
>>> (https://lists.apache.org/thread.html/r8345f534a6fa88c107c1085fba13e660e0e2aedfd206c2748e002664%40%3Cdev.couchdb.apache.org%3E),
>>> we don't provide a strict read-only snapshot guarantee in 2.x and 3.x
>>> anyway, so users would have to handle scenarios where a document might
>>> appear in the stream that wasn't there at the start of the request
>>> already. Though, granted, a much smaller corner case but I wonder how
>>> many users care to handle that...
>>> 
>>> Currently users do have an option of using the new paginated API which
>>> disables restart_tx behavior
>>> https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/chttpd/src/chttpd_db.erl#L947,
>>> though I am not sure what happens when a transaction_too_old exception
>>> is thrown then (emit a bookmark?).
>>> 
>>> So based on the compatibility consideration, I'd vote to keep the
>>> restart_tx option (configurable perhaps if we figure out what to do
>>> when it is disabled) in order to allow users to migrate their
>>> application to 4.0. At least informally we promised users to keep a
>>> strong API compatibility when we released 3.0 with an eye towards 4.0
>>> (https://blog.couchdb.org/2020/02/26/the-road-to-couchdb-3-0/). I'd
>>> think not emitting all the data in a _changes or _all_docs response
>>> would break that compatibility more than using multiple transactions.
>>> 
>>> As for what happens when a transaction_too_old is thrown, I could see
>>> an option passed in, something like, single_snapshot=true, and then
>>> use Adam's suggestion to accumulate all the rows in memory and if we
>>> hit the end of the transaction return a 400 error. We won't emit
>>> anything out while rows are accumulated, so users don't get partial
>>> data, it will be every row requested or a 400 error (so no chance of
>>> perceived data loss). Users may retry if they think it was a temporary
>>> hiccup or may use a smaller limit.
>>> 
>>> Cheers,
>>> -Nick
>>> 
>>> On Mon, Jul 13, 2020 at 2:05 PM Robert Samuel Newson <[email protected]> 
>>> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I'm concerned to see the restart_fold function in fabric2_fdb 
>>>> (https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/src/fabric2_fdb.erl#L1828)
>>>> in the 4.0 development branch.
>>>> 
>>>> The upshot of doing this is that a CouchDB response could be taken across 
>>>> multiple snapshots of the database, which is not the behaviour of CouchDB 
>>>> 1 through 3.
>>>> 
>>>> I don't think this is ok (with the obvious and established exception of a 
>>>> continuous changes feed, where new snapshots are continuously visible at 
>>>> the end of the response).
>>>> 
>>>> FoundationDB imposes certain limits on transactions, the most notable 
>>>> being the 5 second maximum duration. I understand that automatically 
>>>> resetting the FDB txn during a response is an attempt to work around that 
>>>> and maintain "compatibility" with CouchDB < 4 semantics. I think it fails 
>>>> to do so and is very misleading.
>>>> 
>>>> Discuss.
>>>> 
>>>> B.
>>>> 
>> 
>
