Re: [VOTE] Release Apache CouchDB 3.3.3

2023-12-02 Thread Robert Newson
+1 

checksum: match
sig: ok
make check: pass
make release: ok
fauxton verify installation: pass

os: macos 14.1.2 (23B92)
erlang: 24.3.4.11
elixir: 1.14.3-otp-24
python: 3.11.6
spidermonkey: 91

> On 1 Dec 2023, at 20:28, Jay Doane  wrote:
> 
> signature: ok
> checksum: match
> make check: pass
> make release: ok
> fauxton verify install: pass
> 
> env:
>   macOS 13.6.2
>   erlang 25.3.2
>   elixir 1.13.4-otp-25
>   python 3.12.0
>   spidermonkey 91
> 
> +1
> 
> On Fri, Dec 1, 2023 at 10:09 AM Jiahui Li  wrote:
> 
>> Hi,
>> 
>> +1
>> 
>> Signature: ok
>> Checksums: match
>> make check: passed
>> make release: ok
>> 
>> env:
>>  - macOS/x86_64 14.1.1, Erlang 24.3.4.14, Elixir 1.15.7, Python 3.11.6,
>> SpiderMonkey 91
>>  - macOS/x86_64 14.1.1, Erlang 25.3.2.7, Elixir 1.15.7, Python 3.11.6,
>> SpiderMonkey 91
>> 
>> Thanks for preparing the release!
>> 
>> 
>> Jiahui Li (Jessica)
>> 
>> From: Nick Vatamaniuc 
>> Sent: Wednesday, November 29, 2023 1:22 PM
>> To: dev@couchdb.apache.org 
>> Subject: [EXTERNAL] [VOTE] Release Apache CouchDB 3.3.3
>> 
>> Dear community,
>> 
>> I would like to propose that we release Apache CouchDB 3.3.3.
>> 
>> Candidate release notes:
>> https://github.com/apache/couchdb/blob/3.3.x/src/docs/src/whatsnew/3.3.rst
>> 
>> We encourage the whole community to download and test these release
>> artefacts so that any critical issues can be resolved before the
>> release is made. Everyone is free to vote on this release, so dig
>> right in! (Only PMC members have binding votes, but they depend on
>> community feedback to gauge if an official release is ready to be
>> made.)
>> 
>> The release artefacts we are voting on are available here:
>> https://dist.apache.org/repos/dist/dev/couchdb/source/3.3.3/rc.1/
>> 
>> There, you will find a tarball, a GPG signature, and the SHA256 checksum.
>> 
>> Please follow the test procedure here:
>> 
>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
>> 
>> Please remember that "RC1" is an annotation. If the vote passes, these
>> artefacts will be released as Apache CouchDB 3.3.3.
>> 
>> Please cast your votes now.
>> 
>> Thanks,
>> -Nick
>> 



Out of disk handler proposal

2023-06-29 Thread Robert Newson
Hi All,

out of disk handler

I propose to enhance CouchDB to monitor disk occupancy and react automatically 
as free space becomes scarce. I've written a working prototype at: 
https://github.com/apache/couchdb/compare/main...out-of-disk-handler

OTP already ships a disk monitor, `disksup` (part of the `os_mon` application), 
and I suggest we use that as the base, since it supports all the platforms we 
support (and a few more).
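
As a rough illustration (not code from the prototype branch), disksup reports 
per-mount usage via disksup:get_disk_data/0, and mapping a CouchDB data 
directory onto its mount point could look like the sketch below; the function 
name and the longest-prefix heuristic are assumptions made for the example:

%% Hypothetical sketch: return the used percentage of the device holding Dir.
%% disksup:get_disk_data/0 returns a list of {MountPoint, TotalKiB, UsedPct}.
used_percentage(Dir) ->
    Data = disksup:get_disk_data(),
    Matching = [{length(Id), Pct} || {Id, _KiB, Pct} <- Data, lists:prefix(Id, Dir)],
    case lists:sort(Matching) of
        [] -> undefined;
        Sorted -> element(2, lists:last(Sorted)) %% longest matching mount wins
    end.

The handler would compare that figure against the thresholds discussed below.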

The patch reacts differently depending on whether it is database_dir or 
view_index_dir that runs out of space (of course they might both run out of 
space at the same time in the common case that the same device is used for 
both), namely;

1) Clustered database updates are prohibited (a 507 Insufficient Storage error 
is returned)
2) Background indexing is suspended (no new jobs will be started)
3) Querying a stale view is prohibited (a 507 Insufficient Storage error is 
returned)
4) Querying an up-to-date view is permitted

The goal is to leave internal replication running (to avoid data loss) and 
compaction running (as the only action that reduces disk occupancy). I can see 
adding an option to suspend _all_ writing at, say, 99% full, in order to avoid 
hitting the actual end of the disk, but I have not coded this up in the branch so far.

At the moment these all activate at once, which I think is not how we want to 
do this.

I suggest that we have configuration options for the following (a sketch of 
possible settings appears after the list):

1) a global toggle to activate the out of disk handler
2) a parameter for the used disk percentage of view_index_dir at which we 
suspend background indexing, defaulting to 80
3) a parameter for the used disk percentage of view_index_dir at which we 
refuse to update stale indexes, defaulting to 90
4) a parameter for the used disk percentage of database_dir at which we suspend 
writes, defaulting to 90.
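
Purely to make the discussion concrete, those settings might look something 
like the following; the section and option names are placeholders I have made 
up, not decisions:

[disk_monitor]
; hypothetical names, for discussion only
enable = false                              ; 1) global toggle for the handler
background_view_indexing_threshold = 80     ; 2) % used of view_index_dir
interactive_view_indexing_threshold = 90    ; 3) % used of view_index_dir
interactive_database_writes_threshold = 90  ; 4) % used of database_dir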

What do we all think?


B.

Re: [DISCUSS] Make Erlang 24 our minimum supported version

2023-04-27 Thread Robert Newson
+1 from me. Since the binary builds include erlang/otp we aren't constrained by 
which version of erlang each distro/platform currently ships. Anyone who wants 
to build from source can install erlang 24 by other means.

B.

> On 27 Apr 2023, at 00:21, Nick Vatamaniuc  wrote:
> 
> What do we think about moving to Erlang 24 as our minimum supported version?
> 
> 24 has been the base version of our package releases for a while, and
> we've also been running it at Cloudant for more than 6 months now
> without any issues.
> 
> Besides speedier JIT, there are some handy-dandy maps functions, and
> raw file delete functionality which would be nice to use without the
> extra macros around them.
> 
> What do we think?
> 
> -Nick



Re: [VOTE] Release Apache CouchDB 3.3.1-RC2

2023-01-09 Thread Robert Newson
+1

Sig: valid
Checksums: match
Make check: passes

Given the bug in 3.3.0, I checked that multiple replications work manually.

macOS 13.1, sm 91, erlang 23, elixir 1.13.4

> On 9 Jan 2023, at 14:56, Jan Lehnardt  wrote:
> 
> +1
> 
> macOS arm64 and x86_64, Erlang 25, SM91.
> 
> Best
> Jan
> —
> 
>> On 9. Jan 2023, at 11:06, Juanjo Rodríguez  wrote:
>> 
>> Hi all,
>> 
>> +1
>> 
>> Sig: ok
>> Checksums: ok
>> make check: passes
>> make release: works
>> Fauxton verify install: works
>> Creating database and docs in Fauxton: works
>> 
>> Tested on Ubuntu 20.04.5, x86-64, Erlang 23, Elixir 1.13.4, Spidermonkey 68
>> 
>> Additional tests:
>> - Forked LightCouch (CouchDB Java Client) tests: pass
>> - Forked Cloudant Sync integration tests: pass
>> 
>> Thanks for the work!!
>> 
>> Juanjo
>> 
>> El vie, 6 ene 2023 a las 20:24, Nick Vatamaniuc ()
>> escribió:
>> 
>>> +1
>>> 
>>> MacOS, x86-64, Erlang 23
>>> 
>>> sig: ok
>>> checksums: ok
>>> make check: ok
>>> make release: ok
>>> fauxton verify: ok
>>> multiple fauxton replication working: ok
>>> 
>>> Cheers,
>>> -Nick
>>> 
>>> On Fri, Jan 6, 2023 at 10:34 AM Jan Lehnardt  wrote:
 
 Convenience macOS binaries are up for arm64 and x86_64:
 
  https://dist.apache.org/repos/dist/dev/couchdb/binary/mac/3.3.1/rc.2/
 
 Best
 Jan
 —
 
> On 6. Jan 2023, at 12:11, Jan Lehnardt  wrote:
> 
> Dear community,
> 
> I would like to propose that we release Apache CouchDB 3.3.1.
> 
> Changes since RC1:
> 
> - allow starting of more than one replication job (D’OH)
> 
> Candidate release notes:
> 
> https://docs.couchdb.org/en/latest/whatsnew/3.3.html
> 
> We encourage the whole community to download and test these release
>>> artefacts so that any critical issues can be resolved before the release is
>>> made. Everyone is free to vote on this release, so dig right in! (Only PMC
>>> members have binding votes, but they depend on community feedback to gauge
>>> if an official release is ready to be made.)
> 
> The release artefacts we are voting on are available here:
> 
> https://dist.apache.org/repos/dist/dev/couchdb/source/3.3.1/rc.2/ <
>>> https://dist.apache.org/repos/dist/dev/couchdb/source/3.3.1/rc.1/>
> 
> There, you will find a tarball, a GPG signature, and SHA256/SHA512
>>> checksums.
> 
> Please follow the test procedure here:
> 
> 
>>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> 
> Please remember that "RC2" is an annotation. If the vote passes, these
>>> artefacts will be released as Apache CouchDB 3.3.1.
> 
> Please cast your votes now.
> 
> Thanks,
> Jan
> —
 
 
>>> 
> 



Re: [VOTE] Release Apache CouchDB 3.3.0 (RC2)

2022-12-27 Thread Robert Newson
+1

MacOS Ventura 13.1, erlang 23.3.4.14, elixir 1.13.4-otp-23 (both via asdf).

Sha256: ok
Sha512: ok
Sig: ok
Make check: pass
Make release: works
Fauxton verify installation: pass

Excellent work, everyone. So much in this release!

B.

> On 26 Dec 2022, at 20:16, Nick Vatamaniuc  wrote:
> 
> My own vote: +1
> 
> Ubuntu 22.04, x86-64, Erlang 23
> 
> Sig: ok
> Checksums: ok
> make check: pass
> make release: works
> Fauxton "Verify Installation": pass
> 
> 
> I think we need one more PMC member +1 vote to make it official.
> 
> Thanks,
> -Nick
> 
> 
> 
> On Fri, Dec 23, 2022 at 2:07 PM Nick Vatamaniuc  wrote:
>> 
>> Thank you, all for testing. It's looking encouraging so far!
>> 
>> We happen to have some deb and rpm packages available as well from the
>> last CI on main
>> https://ci-couchdb.apache.org/blue/organizations/jenkins/jenkins-cm1%2FFullPlatformMatrix/detail/main/438/artifacts/
>> Those are plain rpm and deb packages not full yum or apt repos, but it
>> may be possible to use them for tests when the build dependencies are
>> not present or it's tricky to set things up.
>> 
>> Cheers!
>> -Nick
>> 
>> 
>> On Fri, Dec 23, 2022 at 11:39 AM Ronny Berndt  wrote:
>>> 
>>> Hi,
>>> 
>>> here is the missing „see below“ section:
>>> 
>>> First run with error:
>>> 
>>> PartitionViewUpdateTest [test/elixir/test/partition_view_update_test.exs]
>>>  * test purge removes view rows (212.0ms) [L#80]
>>>  * test view updates properly remove old keys (211.8ms) [L#9]
>>>  * test query with update=false works (3352.6ms) [L#34]
>>> 
>>>  1) test query with update=false works (PartitionViewUpdateTest)
>>> test/elixir/test/partition_view_update_test.exs:34
>>> ** (RuntimeError) timed out after 30371 ms
>>> code: retry_until(fn ->
>>> stacktrace:
>>>   (couchdbtest 0.1.0) test/elixir/lib/couch/dbtest.ex:423: 
>>> Couch.DBTest.retry_until/4
>>>   test/elixir/test/partition_view_update_test.exs:63: (test)
>>> 
>>>  * test purged conflict changes view rows (264.5ms) [L#108]
>>> 
>>> 
>>> Running test again:
>>> 
>>> PartitionViewUpdateTest [test/elixir/test/partition_view_update_test.exs]
>>>  * test view updates properly remove old keys (888.7ms) [L#9]
>>>  * test purged conflict changes view rows (315.8ms) [L#108]
>>>  * test purge removes view rows (264.1ms) [L#80]
>>> 
>>> ReshardHelpers [test/elixir/test/reshard_helpers.exs]
>>> 
>>> PartitionHelpers [test/elixir/test/partition_helpers.exs]
>>> 
>>> 
>>> Finished in 2.1 seconds (0.00s async, 2.1s sync)
>>> 4 tests, 0 failures
>>> 
>>> Merry Christmas.
>>> 
>>> /Ronny
>>> 
>>> 
 On 23.12.2022 at 14:30, Ronny Berndt wrote:
 
 Hi,
 
 +1
 
 Windows 10 Pro / Version 22H2 / OS build 19045.2364
 Erlang 24.3.4.6
 Elixir 1.13.4
 Spidermonkey 91
 
 sig: ok
 checksums: ok
 make check: ok (1 test fail at first run, see below)
 make release: ok
 build msi: ok
 fauxton verify: ok
 db & doc creation: ok
 
 Great work!
 
 /Ronny
 
 
> 
> On Thu, Dec 22, 2022 at 12:11 AM Nick Vatamaniuc 
> wrote:
> 
>> Dear community,
>> 
>> I would like to propose that we release Apache CouchDB 3.3.0.
>> 
>> Candidate release notes:
>> https://docs.couchdb.org/en/latest/whatsnew/3.3.html
>> 
>> Changes since RC1:
>> https://github.com/apache/couchdb/compare/3.3.0-RC1...3.3.0-RC2
>> 
>> We encourage the whole community to download and test these release
>> artefacts so that any critical issues can be resolved before the
>> release is made. Everyone is free to vote on this release, so dig
>> right in! (Only PMC members have binding votes, but they depend on
>> community feedback to gauge if an official release is ready to be
>> made.)
>> 
>> The release artefacts we are voting on are available here:
>> https://dist.apache.org/repos/dist/dev/couchdb/source/3.3.0/rc.2/
>> 
>> There, you will find a tarball, a GPG signature, and SHA256/SHA512
>> checksums.
>> 
>> Please follow the test procedure here:
>> 
>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
>> 
>> Please remember that "RC2" is an annotation. If the vote passes, these
>> artefacts will be released as Apache CouchDB 3.3.0.
>> 
>> Please cast your votes now.
>> 
>> Thanks,
>> -Nick
>> 
 
>>> 



Re: [DISCUSSION] Make Erlang 23 the minimum supported distribution

2022-06-14 Thread Robert Newson
+1

> On 13 Jun 2022, at 21:30, Ilya Khlopotov  wrote:
> 
> It would be great to move to 23. It has quite a few interesting features, 
> which we could use.
> 
> Also as Nick said it would make switching to rebar3 easier.
> 
> +1 to the proposal
> 
> Best regards,
> iilyak
> 
> On 2022/06/13 20:20:09 Nick Vatamaniuc wrote:
>> Hello everyone,
>> 
>> I'd like to propose making Erlang 23 the minimum supported Erlang
>> distribution. We have accumulated a lot of ifdefs and other cruft
>> supporting various APIs and syntactic constructs since Erlang 20. With
>> Erlang 25 just released it could be a good time to do some cleanup.
>> 
>> We could remove ifdefs for crypto, time functions, the non-syntactic
>> get_stacktrace() macro, as well as have access to nifty new
>> functionality like persistent terms, atomic counters and a few other
>> things.
>> 
>> If we want to switch to rebar3, that would also be easier with Erlang
>> 23 being the minimum.
>> 
>> What do we think?
>> 
>> Thanks,
>> -Nick
>> 



Re: [DISCUSSION] Rename 3.x branch to main and include docs in the main repo

2022-06-02 Thread Robert Newson
+1. It's time.

B.

> On 2 Jun 2022, at 20:40, Nick Vatamaniuc  wrote:
> 
> Hi everyone,
> 
> In a #dev slack thread we were discussing improvements to how we tag
> our documentation repo. There were two proposals to simplify the
> development and release process, and we wanted to see what the rest of
> the community thought about it. Those two ideas are:
> 
> 1. Rename couchdb repo's main branch to fdbmain and rename 3.x to main.
> 
> From an ergonomic point of view, there is more development on the 3.x
> branch so having it as main makes more sense. It can help with:
>   * Issues auto-closing when PRs are merged
>   * Github code search works better in the default branch
> 
> 2. Move docs to the main repo.
> 
> We noticed that the docs repo tags/branches can get out-of-sync with
> the main couchdb repo. We have been merging features in main when they
> apply only to 3.2.x and it requires care to keep track of what gets
> merged and ported between branches. The idea is to simplify and make
> it automatic so docs always follow the main repo so merging and
> porting happens in one place only.
> 
> What does everyone think?
> 
> Thanks,
> -Nick



Re: Native Encryption

2022-05-19 Thread Robert Newson
Hi,

My proposal is not about backups, encrypted or otherwise, though I can see 
there's a relationship. Could the built-in encryption of my proposal also be 
suitable for protecting a backup of these files? Yes, I think so. Given key 
rotation we would expect to eventually have backups that need a wrapping key 
that is no longer the current one, hence the need we both perceive for multiple 
key slots. We differ only in that I pictured filling in the empty slots some 
time after file creation, and merely as a way to avoid a lock-step rotation.

You wondered if encryption should be optional. That's a good topic. In my view 
it's a "yes": encryption should be optional, and admins should be able to 
configure it for any subset of databases, from none to all of them. It should 
be possible to configure CouchDB so that it decrypts your databases (via 
compaction). It would also be useful if the wrapping key could vary 
between databases (it doesn't appear to be useful to go more granular than 
that). So perhaps it is DatabaseName in the callback functions and not 
WrappingKeyId.

I agree that we'll need the ability to have multiple key slots. I hadn't 
considered that we'd fill more than one slot at couch_file creation time but I 
don't see why not. We can delegate that to the key manager;

-callback new_key(DatabaseName :: binary()) ->
    {ok, [WrappedKey :: binary()], UnwrappedKey :: binary()} | {error, Reason :: term()}.

The key manager might send back a list of one item or several, and couch_file 
is simply obliged to record them at the start of the file. We would maybe also 
want to ensure there are empty slots available, so there might need to be a 
callback on the lines of;

-callback slot_size() -> pos_integer().

So we can know how much space to leave at the start of the file for empty slots.

The unwrap callback in this scheme would be essentially your revised proposal;

-callback unwrap_key(DatabaseName :: binary(), [WrappedKey :: binary()]) ->
    {ok, UnwrappedKey :: binary()} | {error, Reason :: term()}.

I am wary of adding any code path in couchdb where we write anywhere but the 
end of the file, so the actual process of filling in a preallocated empty slot 
will need more thought. The atomicity of disk writes, in theory and in practice, 
comes into play and will likely force some decisions. For example, we might be 
obliged to round up to the nearest 4 KiB (or the sector size of the storage 
device, if we can retrieve that; though it's probably 4 KiB).

Another option is to store the wrapped keys in the db headers but this presents 
a few difficulties. couch_file itself has no idea what is in the headers, only 
that they are 4 KiB-aligned and have the magic bit set at the start that 
indicates it has found a real header. So there's a layering issue there, but I 
think we can solve that. The other issue, though, is that the header itself 
could not be encrypted. I have a strong preference for encrypting every byte of 
the file.

B.


> On 19 May 2022, at 11:17, Will Young  wrote:
> 
> On Wed, 18 May 2022, 19:31 Robert Newson,  wrote:
> 
>> Hi Will,
>> 
>> I still don't see how public/private encryption helps. It seems the
>> private keys in your idea(s) are needed in all the same places and for the
>> same duration as the secret keys in mine. I think it's clear, though, that
>> I didn't fully comprehend your post.
>> 
> 
> I'm a bit confused here, in the example the node(s) never get any access to
> backup's private key. A node would never need to know any other node's
> private key(s). The nodeN encrypted to its own and backup's public keys so
> they can each decrypt the shard key with their private key. If node1 were
> to lose its own keystore or cease to exist, backup's token might finally
> be plugged in (i.e. to node1 or maybe a recent backup of node1's data
> volume is restored to a replacement host with new token) and then using
> backup's private key inside its token one can begin compacting shards (just
> as usual encrypting to a public key for node1's token and the one for
> backups.)
> 
> Once backup's token is unplugged again from this restore operation, new
> node1 would have no secrets for any past backups; its new key reads only
> these newly updated shards and its own updates to them. Therefore backups
> is holding a master key to restore the history of backups for any node or
> its replacement while each node has a key that could only read back into
> some period of its own backups and can be destroyed whenever we like (as
> long as we are willing to use the backup key).
> 
> I don't see any similar possibility with secured symmetric keys as a
> symmetric key being used as the wrapping key on a node means that either
> only that node has the secret key or the key is improperly secured, i.e.
> many nodes have access to that secr

Re: Native Encryption

2022-05-18 Thread Robert Newson
Hi Will,

I still don't see how public/private encryption helps. It seems the private 
keys in your idea(s) are needed in all the same places and for the same 
duration as the secret keys in mine. I think it's clear, though, that I didn't 
fully comprehend your post.

I can at least see that my proposal's use of key wrapping techniques came 
without an explanation, which I include later in this post.

The core part of 'native encryption' is in the first commit, the mere mechanics 
of using AES in counter mode at the right places in couch_file to cause all the 
bytes on disk to be correctly encrypted and yet correctly decrypted on 
subsequent read, no matter which section of the file is read or in what order.

That bit, I hope, is not controversial (though it does require careful review).

It is the layers above that which we, as a dev community, need to ponder.

To be useful, it needs to be the case that an attacker is prevented from 
reading the data contained within the .couch files under as many scenarios as 
possible. We wouldn't want the confidentiality of data to be compromised easily.

To be secure, we must follow the security guidance for AES and for the selected 
mode (Counter Mode, in this case).  We don't want keys to live forever. We 
don't want to encrypt too much with the same key. We must never use the same 
key and the same counter to encrypt different plaintext, and so on.

CouchDB can open .couch (and .view) files very frequently, so we probably 
cannot afford to contact an external key manager for every single couch_file 
open. There is an LRU for performance reasons but I'm wary of coupling that to 
a potentially slow unwrap service call.

I've used a key wrapping technique for a few reasons. Firstly, so that every 
file has its own key, chosen at random on creation, that is entirely 
independent of any other key. This hugely helps with ensuring we've encrypted 
within the guidance. It also means the counter can be the file offset which, 
for our append-only files, ensures uniqueness. The second reason is that it 
introduces a layer of management keys which can have a longer lifetime, managed 
externally to one degree or another. It is those keys and their management 
which I'm trying to decide on before merging.

Perhaps it would help if I make a strawman proposal for a key manager interface?

Before I do that, imagine a number of small changes to the branch of work I've 
presented already;

1. A key manager interface is defined.
2. The concrete implementation of that interface can be defined for a couchdb 
installation somehow (perhaps in vm.args or default.ini)
3. That couch_file delegates to this interface at the points where it currently 
needs to wrap or unwrap the file key.

An interface might look like this;

-callback new_wrapped_key(WrappingKeyId :: binary()) ->
    {ok, WrappedKey :: binary(), UnwrappedKey :: binary()} | {error, Reason :: term()}.

-callback unwrap_key(WrappingKeyId :: binary(), WrappedKey :: binary()) ->
    {ok, UnwrappedKey :: binary()} | {error, Reason :: term()}.

couch_file would call new_wrapped_key when creating a new file, and would 
receive the wrapped form, for writing to the header, and the unwrapped form, 
for initialising the ciphers held in the state variable.

For existing files, couch_file would read the wrapped key from the file and 
call unwrap_key to retrieve the unwrapped form, for the same purpose as before.

An implementation of this interface could be done in erlang, as I've already 
shown, or could involve a remote network connection to some service that does 
it for us (and, one hopes, does so over HTTPS).
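
For illustration only, a minimal sketch of one Erlang implementation of that 
interface follows. It is not the code in my branch (which uses the NIST AES 
Key Wrap routine from aegis); it wraps the file key with AES-256-GCM under a 
key-encrypting key pulled from the application environment, and the module 
name, the behaviour name and the config layout are all assumptions made up for 
the example:

-module(config_key_manager).
%% -behaviour(couch_file_key_manager).  %% hypothetical behaviour name
-export([new_wrapped_key/1, unwrap_key/2]).

new_wrapped_key(WrappingKeyId) ->
    FileKey = crypto:strong_rand_bytes(32),
    case wrapping_key(WrappingKeyId) of
        {ok, KEK} ->
            IV = crypto:strong_rand_bytes(12),
            {CipherText, Tag} = crypto:crypto_one_time_aead(
                aes_256_gcm, KEK, IV, FileKey, WrappingKeyId, true),
            {ok, <<IV/binary, Tag/binary, CipherText/binary>>, FileKey};
        Error ->
            Error
    end.

unwrap_key(WrappingKeyId, <<IV:12/binary, Tag:16/binary, CipherText/binary>>) ->
    case wrapping_key(WrappingKeyId) of
        {ok, KEK} ->
            case crypto:crypto_one_time_aead(
                     aes_256_gcm, KEK, IV, CipherText, WrappingKeyId, Tag, false) of
                error -> {error, unwrap_failed};
                FileKey -> {ok, FileKey}
            end;
        Error ->
            Error
    end.

%% Illustration only: fetch a 32-byte key-encrypting key from the app env.
wrapping_key(WrappingKeyId) ->
    case application:get_env(couch, wrapping_keys, #{}) of
        #{WrappingKeyId := KEK} when byte_size(KEK) =:= 32 -> {ok, KEK};
        _ -> {error, unknown_wrapping_key}
    end.

A production implementation would replace wrapping_key/1 with a call to 
whatever external key service the operator trusts.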

So the questions I'm most interested in discussing are whether this is the 
right level of abstraction and, if not, what others think it should be.

I hope most folks can see that the above interface could be introduced in my 
branch fairly easily, and the parts of that work which use aegis_keywrap or 
read from config could be consolidated into an implementation of it. I'm happy 
to do that work if it helps.

B.

> On 18 May 2022, at 14:47, Will Young  wrote:
> 
> Hi Robert,
> 
>  I think it is best to try to clear up the matter of non-extractable
> keys since that is where I am confused about the capabilities without
> asymmetric keys. The setup I am used to seeing for non-extractable
> keys looks similar to crypto's OpenSSL engine references where erlang
> says it supports only RSA and the underlying OpenSSL supports RSA or
> some EC-variants. I think that is pretty inline for ~smartcard-chip
> pkcs11 tokens like yubikeys, i.e. I have an old feitian epass2003
> which says it supports some symmetric key algorithms but really it
> supports generating only keypairs in pkcs11-tool and some
> ~acceleration of shared keys.
> 
> Looking at AWS' cloudHSM, OTOH, I see that it supports symmetric
> non-extractable keys, but also backing up, restoring and ending up
> with clones that are copies of one original HSM. I see how that could
> work 

Re: Native Encryption

2022-05-17 Thread Robert Newson
Hi Will,

Thanks for your post and for spending significant time thinking about my 
proposal.

An important aspect of my proposed patch has, I think, been overlooked or 
under-examined. There is no essential need for the wrapping keys to ever be 
present in Erlang memory or, indeed, ever to leave some more secure enclave on 
some remote host or service. In my PR I have used the local couchdb config in 
order to demonstrate the functionality but I would not consider that a 
production mechanism, at least not without some significant refinement.

You seem to imply that storing wrapped keys in the shard files is a security 
concern and I'd like to more clearly understand that concern, as I do not share 
it. The encrypted files can only be decrypted with the right encryption key, 
and the wrapped key at the start of the file can only be unwrapped by another 
key. Guessing either of these keys is equally infeasible.

The motivation behind the (possible) use of multiple key slots is to allow an 
administrator to change the wrapping key in a safer manner. The start of the 
file would be preallocated with multiple slots, only one of which would be 
filled at the file's creation (using the current wrapping key). At any moment 
the administrator can specify a new wrapping key, and we would then wrap the 
existing key (which we'd need to unwrap with the 'old' key) with that new 
wrapping key and store it in a spare slot. If there were any kind of crash (a 
power failure, say), the old wrapped key is still there.
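
Sketched in Erlang, and assuming hypothetical key_manager:unwrap_key/2 and 
key_manager:wrap_key/2 helpers rather than anything in the branch, the 
rotation for a single file would amount to:

%% Illustration only: re-wrap the existing file key under a new wrapping key
%% and write the result into a spare, preallocated slot. The old slot is left
%% untouched, so a crash part way through loses nothing.
rewrap(OldKeyId, NewKeyId, OldWrappedKey) ->
    {ok, FileKey} = key_manager:unwrap_key(OldKeyId, OldWrappedKey),
    {ok, NewWrappedKey} = key_manager:wrap_key(NewKeyId, FileKey),
    {ok, NewWrappedKey}.

The file key itself never changes, so no data needs to be rewritten.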

We could, instead, choose to overwrite the single wrapped key with its new 
value and use the original trick of writing a 4 KiB value (a disk sector) with 
two copies of the same data, and try to exploit atomicity at the disk/disk 
controller level. I'm not a huge fan of that approach.

Any version of this work, before it can be merged, must allow for all keys (or 
passwords/phrases) to be replaced by an administrator without data loss. I have 
no strong opinion on how we achieve this yet, only that we must.

You said;

> I don't think there is ever a point in combining this with a HSM/cryptoki/etc 
> hardware keystore

I have the opposite feeling. The protections I'm proposing to add to .couch and 
.view files benefit most when the wrapping keys are generated, used, and 
exclusively stored within an HSM. There is no need for CouchDB to ever see 
them. All CouchDB needs to be able to do is to request, at any time, that the 
wrapped key, read from the start of the relevant file, is unwrapped. Permission 
to perform that unwrapping could be revoked at any time, and the wrapping key 
itself could be forensically destroyed. While files that are currently _open_ 
would still be readable (by couchdb and anyone able to introspect the erlang VM 
or the host memory), no new file could be opened.

On Asymmetric keys, I don't understand your proposal well enough to usefully 
respond to it. There doesn't seem to be any way to apply them to this problem, 
where couchdb must perform both encryption and decryption of the same data. 
Asymmetric would make sense if these duties were split (if, say, a party were 
encrypting a .couch file to send to another person, they could encrypt it with 
the public key of the recipient, who could use their private key to decrypt 
it). I would be grateful if you could explain how asymmetric encryption could 
be used here in a way that doesn't require every party that holds the public 
key to also hold the corresponding private key (and vice versa).

Your point on future expansion or new formats is well made. I anticipate that a 
little with the encryption header that I write at the very start of the file, 
before the wrapped key or the id of the wrapping key. We could use other values 
to indicate other formats.

> On 17 May 2022, at 13:30, Will Young  wrote:
> 
> Hi Robert,
> 
>  I've taken some time to think over your PR and writeup, and have the
> following comments:
> 
> benefits of the PR
>  I like this idea of native encryption a lot. While lower layers can
> offer encryption, I think there are a lot more situations where the
> lower layer has been delegated through cloud hosting, etc, and one is
> not really sure it is providing the expected capabilities without some
> unexpected caveat. I think native encryption should be very
> appropriate in a situation where the main system volume can be small
> and protected carefully but data volumes need to be cheap, large, easy
> to backup.
> 
> Expunging uncertainty and manual shared key management
> I like systems like the regular recycling of the per-shard key
> trying to somewhat limit something like momentary full system read
> access at one moment from inherently being able to snoop through old
> data that could have been expunged and all future data (after rekeys
> etc). I can understand why performance/design-wise the per-shard key
> is best wrapped and stored in the shard itself, but I find it a bit
> unfortunate for directly 

Native Encryption

2022-05-10 Thread Robert Newson
All,

One feature we built on main (aka 4.x) was native support for encrypted 
databases. As the foundationdb/4.x work is largely halted now I thought I'd 
scratch a personal itch and try to bring this feature to 3.x.

I've posted a draft pull request at https://github.com/apache/couchdb/pull/4019 
which works.

Obviously the encryption and decryption primitives are the least interesting 
part (though I'm pleased with the implementation); the important part is key 
management, which I hope will be the primary focus of thread responses.

The draft PR has numerous commits that can be understood independently, and not 
all of which necessarily make sense in any final version. I'll describe them 
briefly at the end of this post.

In brief, encryption is done with AES in counter mode. This provides two 
properties that make integration with the way we write to couch_files much 
easier than other modes would. Firstly, we can calculate the cipher text for any 
plain text without needing to read any other data (i.e., we don't need to read 
an AES block's worth of data, 16 bytes, just to modify it). Secondly, we can 
write, or read, any subsection of an AES block. Taken together this means we 
can preserve our append-only scheme and do not have to buffer writes or pad 
them out to AES block sizes. There is a penalty in performance, of course, as 
we must encrypt or decrypt a full AES block at minimum, even if we can discard 
sections of the result. Another consequence is that native encryption only 
provides confidentiality, not authenticity (as GCM mode would, for example).
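
As a hedged illustration of that property (this is a simplified sketch, not 
the code in the PR, and the exact counter derivation there may differ), the 
counter block can be computed directly from the byte offset, with the leading 
partial block computed and then thrown away:

%% Sketch only: encrypt (or, identically in CTR mode, decrypt) Data as it
%% would appear at byte Offset of the file.
crypt_at(FileKey, Offset, Data) ->
    FirstBlock = Offset div 16,
    Skip = Offset rem 16,
    IV = <<FirstBlock:128>>,                     %% initial counter block
    Padded = <<0:(Skip * 8), Data/binary>>,
    Out = crypto:crypto_one_time(aes_256_ctr, FileKey, IV, Padded, true),
    <<_:Skip/binary, Result/binary>> = Out,
    Result.

No neighbouring data is read and nothing is buffered or padded on disk; the 
only cost is the discarded prefix of the first block.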

The basics of key management are present in the PR. Each couch_file is 
encrypted by a unique key, generated with crypto:strong_rand_bytes when 
created. This value is wrapped using a secure key wrapping algorithm and stored 
very deliberately at the beginning of the file. Unlike db headers, we do not 
write a further copy later. It lives in the first X bytes. One important 
benefit of this is that the file can be crypto-shredded by overwriting this 
area. The key is unique to the file and does not propagate through compaction. 
Compacting a file encrypts it with a new key. There was no reason to preserve 
the key, so I didn't.

You will see there are commits that hardcode the wrapping key (or "key 
encrypting key" if you prefer); later in the commit set I switch to storing 
these in the config files. I don't think that is a suitable option for any 
version of this, but it is useful to demonstrate the separation of concerns. The 
keys can come from anywhere.

To make things more manageable, later commits introduce the notion of a "key 
id" that is essentially a label for an otherwise random 32-byte value. The key 
id allows us to perform rekeying. That is, we can also change the key that 
wraps the per-couch_file key, by changing the wrapping_key_id in config and 
compacting all existing files. We can also consider a far faster approach where 
we simply overwrite the encryption header at the start of each file. More care 
is needed there, of course.

To the commits themselves;

* demonstrate native encryption

This introduces only the essential changes for native encryption. A single 
"master" key is used. We use the NIST AES Key Wrap algorithm (copied from the 
aegis application on couchdb main branch) to wrap the per-couch_file keys.
 
* encrypt the headers too

I went back and forth on whether to encrypt the headers (really, footers) or 
not. In this commit I begin encrypting them. At this point the entire file is 
encrypted (and `ent` confirms the entire file is statistically random). 

* support unencrypted files

To support migration, this commit allows couch_file to read unencrypted files.
 
* canary value to detect encryption

The previous commit made the assumption that any failure to unwrap a key means 
the file is not encrypted. This isn't necessarily true, so this commit allows 
us to distinguish between an unencrypted file and one where unwrap fails for 
some other reason (tampering, corruption, truncation, cosmic rays). In the 
latter case we return an error. This prevents us from resetting a file 
erroneously.
 
* import https://github.com/whitelynx/erlang-pbkdf2/blob/master/src/pbk…

At this point I wanted more than a hardcoded key. In the absence of our 
community's view on key management I elected to use the config file as a 
stepping stone. In advance of doing that I imported a better PBKDF2 
implementation than the one I originally wrote for CouchDB approximately a 
century ago (rounding up). I also expunged our implementation and delegated to 
the imported version.
 
* encryption password from config

This commit introduces wrapping keys that are derived from user supplied 
values, using PBKDF2 with SHA-256 as the PRF.
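
As a hedged illustration of the derivation step only (the commit uses the 
imported erlang-pbkdf2 module; newer OTP releases, 24.2 and later, also ship 
crypto:pbkdf2_hmac/5, used here for brevity; the salt handling and iteration 
count are assumptions, not what the commit does):

%% Illustration only: derive a 32-byte wrapping key from a config-supplied
%% passphrase with PBKDF2-HMAC-SHA256.
derive_wrapping_key(Passphrase, Salt) ->
    Iterations = 600000,  %% assumed figure
    crypto:pbkdf2_hmac(sha256, Passphrase, Salt, Iterations, 32).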
 
* performance boost and also hides the key from inspection

This one is very cool (which is entirely down to the Erlang/OTP team). We 
switch to the new dyn_iv functions which has both a performance boost 
(approximately 

Re: [ANNOUNCE] Will Young elected as CouchDB committer

2022-04-19 Thread Robert Newson


Congrats Will! Welcome aboard.

B.

> On 19 Apr 2022, at 08:36, Jan Lehnardt  wrote:
> 
> Dear community,
> 
> I am pleased to announce that the CouchDB Project Management Committee has 
> elected Will Young as a CouchDB committer.
> 
>Apache ID: wyoung
> 
>Slack nick: Will Young
> 
> Committers are given a binding vote in certain project decisions, as well as 
> write access to public project infrastructure.
> 
> This election was made in recognition of Will's commitment to the project. We 
> mean this in the sense of being loyal to the project and its interests.
> 
> Please join me in extending a warm welcome to Will!
> 
> On behalf of the CouchDB PMC,
> Jan
> —



Re: Important update on couchdb's foundationdb work

2022-03-14 Thread Robert Newson
Hi,

That already happened. “Pluggable storage engine” was introduced in 2016 
(https://github.com/apache/couchdb/commit/f6a771147ba488f80a7d29491263d19088d0eefb).

No alternative backends have yet been contributed. 

B.

> On 13 Mar 2022, at 16:27, Chintan Mishra from Rebhu  wrote:
> 
> As a user, my team and I were keenly looking forward to CouchDB v4 with 
> FoundationDB.
> 
> Given the current situation, it is only reasonable to come up with a best 
> alternative.
> 
> How about refactoring CouchDB to work with multiple storage engines?
> 
> The default CouchDB will support whatever the PMC agrees upon. Whereas the 
> community can tinker with different backend storage engines. So, the 
> FoundationDB can be one of the backing engines that get used with CouchDB. 
> Other storage engines can be RocksDB, Apache Derby, etc.
> 
> Thank you.
> 
> --
> Chintan Mishra
> 
> On 13/03/22 17:09, Robert Newson wrote:
>> Thank you for this feedback.
>> I think it’s reasonable to worry about tying CouchDB to FoundationDB for 
>> some of the reasons you mentioned, but not all of them. We did worry, at the 
>> start, at the lack of a governance policy around FoundationDB; something 
>> that would help ensure the project is not beholden to a single corporate 
>> entity that might abandon the project or take it in places that make it 
>> unsuitable for CouchDB in the future. There hasn’t been much progress on 
>> that, but likewise the project has stayed true to form.
>> CouchDB is critically dependent on Erlang/OTP, among other components, which 
>> similarly lack the kind of governance or oversight that Apache projects 
>> themselves work within. At no point have I feared the "project will end up 
>> in FoundationDB integrating CouchDB rather than the other way around”. 
>> FoundationDB is not a database, it is explicitly only foundational support 
>> to build databases on top of.
>> "If even you guys weren't treated as a priority, I doubt that my feature 
>> requests and other input will matter even one bit as a user.” - I’m not sure 
>> who you refer to with “you guys”, but I remind everyone that the CouchDB 
>> contributors from IBM Cloudant are the main contributors to CouchDB 2.0 and 
>> 3.0, have been so for years and are in, many cases, either CouchDB 
>> committers or PMC members. They are “us” as much as any other contributor. 
>> That the Cloudant team has moved focus from CouchDB 4.0 (as it would have 
>> been) to 3.0 is a re-establishment of the status quo ante.
>> "I doubt that my feature requests and other input will matter even one bit 
>> as a user.” — I strongly disagree here. Community contributions are hugely 
>> valuable and valued, the rewrite of the lower layers of CouchDB would not 
>> have changed that significantly. CouchDB-FDB is still written in Erlang, the 
>> http layer is largely the same code as before. The parts that interact with 
>> FoundationDB are confined to a single library application (erlfdb) which 
>> exposes the C language bindings as Erlang functions and data structures. 
>> Unless you are working at that level you can mostly ignore it.
>> Finally, while I don’t think we’ve explicitly described it this way, 
>> CouchDB-FDB effectively _is_ a “layer” on top of FDB in the same sense that 
>> their “document layer” (which is mongo-like) is.
>> B.
>>> On 13 Mar 2022, at 11:17, Reddy B.  wrote:
>>> 
>>> Hello!
>>> 
>>> Thanks a lot for this update and overview of the situation. As users (our 
>>> company has been using couchdb since 2015 circa as the main database of our 
>>> 3 tier web apps), I feel it may be preferable to move the couchdb-fdb work 
>>> to a separate project having a different name. As Janh has mentioned, the 
>>> internals and daily management of FDB may with certain regards be at odds 
>>> with the philosophy and user experience that couchdb wants to provide.
>>> 
>>> Moving this effort to a different project would give people interested in 
>>> this effort more flexibility to introduce breaking changes and limitations 
>>> taking full advantage of the philosophy of FDB. I feel the idea that: if 
>>> you have outscaled CouchDB, move to couchdb-fdb (or  another more 
>>> specialized DB) is the right idea. Couchdb-fdb advantage compared to 
>>> alternative would simply be that it implements both the replication 
>>> protocol and the HTTP API.
>>> 
>>> This project may/should even "simply" become something under the umbrella 
>>> of the FoundationDB layer similar to the MongoDB-compati

Re: Important update on couchdb's foundationdb work

2022-03-13 Thread Robert Newson
able pace. Doubling 
>> that effort might be tricky. While we had an influx of contributors 
>> recently, this would probably need more dedicated planning and outreach.
>> 
>> - New API features would have to be implemented twice, if we want to keep a 
>> majority API overlap. This is not a fun proposition for folks who add 
>> features, which is hard enough, but now they have to do it twice, onto two 
>> different subsystems. Some features (say multi-doc-transactions) would only 
>> be possible in one of the projects (FDB-Couch), what would our policy be for 
>> deliberate API feature divergence?
>> 
>> - probably more that elude me at the moment.
>> 
>> While there are non-trivial points among these, they are not impossible 
>> tasks *if* we find enough and the right folks to carry the work forward.
>> 
>> * * *
>> 
>> For myself, I still see a lot of potential in the 3.x codebase and I’m 
>> looking forward to renewed roadmap discussions there. I know I have a long 
>> list of things I’d like to see added.
>> 
>> From my professional observation, the thing that our (Neighbourhoodie) 
>> customers tend to run into the most is the scaling limits of the 
>> database-per-user pattern. We have a proposal for per-doc-authentication 
>> that helps mitigate a subset of those use-cases, which would be a great help 
>> overall. I have worked on a draft PR of this over the years, but it mostly 
>> stalled out during the pandemic. I’m planning to restart work on this 
>> shortly. If anyone wants to contribute with time and/or money, please do get 
>> in touch.
>> 
>> The other major issue with 3.x as reported by IBM is _changes feed rewinds 
>> when nodes are rotated in and out of clusters. We already fixed a number of 
>> changes rewind bugs relatively recently. I don’t know if we got them all 
>> now, or if there are theoretical limits to how far we can take this given 
>> our consistency model, but it’d be worth spending some time on at least 
>> getting rid of all rewind-to-zero cases.
>> 
>> * * *
>> 
>> I’m also looking forward to all your input on the discussion here. I’m sure 
>> this will explode into a lot of detailed discussions quickly, so maybe as a 
>> guide to come back to when get closer to having to make a decision, here are 
>> three ways forward that I see:
>> 
>> 1. Follow IBM in abandoning FDB-Couch, refocus all effort on Erlang-Couch 
>> (3.x).
>> 
>> 2. Take FDB-Couch development over fully, come up with a story for how 
>> FDB-Couch and Erlang-Couch can coexist and when users should choose which 
>> one.
>> 
>> 3. Hand over the FDB-Couch codebase to an independent team that then can do 
>> what they like with it (if this materialises from this discussion).
>> 
>> * * *
>> 
>> Best
>> Jan
>> —
>> 
>> 
>>> On 10. Mar 2022, at 17:24, Robert Newson  wrote:
>>> 
>>> Hi,
>>> 
>>> For those that are following closely, and particularly those that build or 
>>> use CouchDB from our git repo, you'll be aware that CouchDB embarked on an 
>>> attempt to build a next-generation version of CouchDB using the 
>>> FoundationDB database engine as its new base.
>>> 
>>> The principal sponsors of this work, the Cloudant team at IBM, have 
>>> informed us that, unfortunately, they will not be continuing to fund the 
>>> development of this version and are refocusing their efforts on CouchDB 3.x.
>>> 
>>> Cloudant developers will continue to contribute as they always have done 
>>> and the CouchDB PMC thanks them for their efforts.
>>> 
>>> As the Project Management Committee for the CouchDB project, we are now 
>>> asking the developer community how we’d like to proceed in light of this 
>>> new information.
>>> 
>>> Regards,
>>> Robert Newson
>>> Apache CouchDB PMC
>>> 



Re: Important update on couchdb's foundationdb work

2022-03-12 Thread Robert Newson
, the lot. At the moment, CouchDB has just about 
> enough folks contributing to move forward at a reasonable pace. Doubling that 
> effort might be tricky. While we had an influx of contributors recently, this 
> would probably need more dedicated planning and outreach.
> 
> - New API features would have to be implemented twice, if we want to keep a 
> majority API overlap. This is not a fun proposition for folks who add 
> features, which is hard enough, but now they have to do it twice, onto two 
> different subsystems. Some features (say multi-doc-transactions) would only 
> be possible in one of the projects (FDB-Couch), what would our policy be for 
> deliberate API feature divergence?
> 
> - probably more that elude me at the moment.
> 
> While there are non-trivial points among these, they are not impossible tasks 
> *if* we find enough and the right folks to carry the work forward.
> 
> * * *
> 
> For myself, I still see a lot of potential in the 3.x codebase and I’m 
> looking forward to renewed roadmap discussions there. I know I have a long 
> list of things I’d like to see added.
> 
> From my professional observation, the thing that our (Neighbourhoodie) 
> customers tend to run into the most is the scaling limits of the 
> database-per-user pattern. We have a proposal for per-doc-authentication that 
> helps mitigate a subset of those use-cases, which would be a great help 
> overall. I have worked on a draft PR of this over the years, but it mostly 
> stalled out during the pandemic. I’m planning to restart work on this 
> shortly. If anyone wants to contribute with time and/or money, please do get 
> in touch.
> 
> The other major issue with 3.x as reported by IBM is _changes feed rewinds 
> when nodes are rotated in and out of clusters. We already fixed a number of 
> changes rewind bugs relatively recently. I don’t know if we got them all now, 
> or if there are theoretical limits to how far we can take this given our 
> consistency model, but it’d be worth spending some time on at least getting 
> rid of all rewind-to-zero cases.
> 
> * * *
> 
> I’m also looking forward to all your input on the discussion here. I’m sure 
> this will explode into a lot of detailed discussions quickly, so maybe as a 
> guide to come back to when get closer to having to make a decision, here are 
> three ways forward that I see:
> 
> 1. Follow IBM in abandoning FDB-Couch, refocus all effort on Erlang-Couch 
> (3.x).
> 
> 2. Take FDB-Couch development over fully, come up with a story for how 
> FDB-Couch and Erlang-Couch can coexist and when users should choose which one.
> 
> 3. Hand over the FDB-Couch codebase to an independent team that then can do 
> what they like with it (if this materialises from this discussion).
> 
> * * *
> 
> Best
> Jan
> —
> 
> 
>> On 10. Mar 2022, at 17:24, Robert Newson  wrote:
>> 
>> Hi,
>> 
>> For those that are following closely, and particularly those that build or 
>> use CouchDB from our git repo, you'll be aware that CouchDB embarked on an 
>> attempt to build a next-generation version of CouchDB using the FoundationDB 
>> database engine as its new base.
>> 
>> The principal sponsors of this work, the Cloudant team at IBM, have informed 
>> us that, unfortunately, they will not be continuing to fund the development 
>> of this version and are refocusing their efforts on CouchDB 3.x.
>> 
>> Cloudant developers will continue to contribute as they always have done and 
>> the CouchDB PMC thanks them for their efforts.
>> 
>> As the Project Management Committee for the CouchDB project, we are now 
>> asking the developer community how we’d like to proceed in light of this new 
>> information.
>> 
>> Regards,
>> Robert Newson
>> Apache CouchDB PMC
>> 
> 



Important update on couchdb's foundationdb work

2022-03-10 Thread Robert Newson
Hi,

For those that are following closely, and particularly those that build or use 
CouchDB from our git repo, you'll be aware that CouchDB embarked on an attempt 
to build a next-generation version of CouchDB using the FoundationDB database 
engine as its new base.

The principal sponsors of this work, the Cloudant team at IBM, have informed us 
that, unfortunately, they will not be continuing to fund the development of 
this version and are refocusing their efforts on CouchDB 3.x.

Cloudant developers will continue to contribute as they always have done and 
the CouchDB PMC thanks them for their efforts.

As the Project Management Committee for the CouchDB project, we are now asking 
the developer community how we’d like to proceed in light of this new 
information.

Regards,
Robert Newson
Apache CouchDB PMC



Re: [PROPOSAL] Drop support for Ubuntu 16.04

2022-01-15 Thread Robert Newson
+1

> On 15 Jan 2022, at 06:26, Nick V  wrote:
> 
> That sounds great. +1 to drop Ubuntu 16.04
> 
> -Nick
> 
>> On Jan 14, 2022, at 22:41, Adam Kocoloski  wrote:
>> 
>> Hi, I propose that we remove Ubuntu 16.04 (Xenial Xerus) from the CI matrix 
>> and binary package generation systems.
>> 
>> Ubuntu 16.04 stopped being a standard LTS release in April 2021 and is now 
>> only supported through Canonical’s Extended Security Maintenance program. I 
>> think the end of LTS is a reasonable standard to apply for removing support 
>> in Apache CouchDB. If we apply this to Debian / Ubuntu / CentOS I believe we 
>> end up with the following expiration dates:
>> 
>> Debian 9: 06/2022
>> Debian 10: ~07/2024
>> Debian 11: ~08/2026
>> 
>> Ubuntu 18.04: 04/2023
>> Ubuntu 20.04: 04/2025
>> 
>> CentOS 7: 06/2024
>> CentOS 8: 12/2021*
>> 
>> (Red Hat did a thing with CentOS where it switched from a rebuild of RHEL to 
>> being upstream of RHEL, and they accelerated the EOL of CentOS 8 as part of 
>> that).
>> 
>> I’d like to get in the habit of proactively removing these releases from our 
>> build system when they leave LTS rather than waiting around for something to 
>> break. Any objections?
>> 
>> Adam



Re: [DISCUSS] Handle libicu upgrades better

2021-11-19 Thread Robert Newson
Noting that the upgrade channel for views was misconceived (by me) as there is 
no version number in the header for them. You’d need to add it. 

B. 

> On 18 Nov 2021, at 07:12, Nick Vatamaniuc  wrote:
> 
> Thinking more about this issue I wonder if we can avoid resetting and
> rebuilding everything from scratch, and instead, let the upgrade
> happen in the background, while still serving the existing view data.
> 
> The realization was that collation doesn't affect the emitted keys and
> values themselves, only their order in the view b-trees. That means
> we'd just have to rebuild b-trees, and that is exactly what our view
> compactor already does.
> 
> When we detect a libicu version discrepancy we'd submit the view for
> compaction. We even have a dedicated "upgrade" [1] channel in smoosh
> which handles file version format upgrades, but we'll tweak that logic
> to trigger on libicu version mismatches as well.
> 
> Would this work? Does anyone see any issue with that approach?
> 
> [1] 
> https://github.com/apache/couchdb/blob/3.x/src/smoosh/src/smoosh_server.erl#L435-L442
> 
> Cheers,
> -Nick
> 
> 
> 
>> On Fri, Oct 29, 2021 at 7:01 PM Nick Vatamaniuc  wrote:
>> 
>> Hello everyone,
>> 
>> CouchDB by default uses the libicu library to sort its view rows.
>> When views are built, we do not record or track the version of the
>> collation algorithm. The issue is that the ICU library may modify the
>> collation order between major libicu versions, and when that happens,
>> views built with the older versions may experience data loss. I wanted
>> to discuss the option to record the libicu collator version in each
>> view then warn the user when there is a mismatch. Also, optionally
>> ignore the mismatch, or automatically rebuild the views.
>> 
>> Imagine, for example, searching patient records using start/end keys.
>> It could be possible that, say, the first letter of their name now
>> collates differently in a new libicu. That would prevent the patient
>> record from showing up in the view results for some important
>> procedure or medication. Users might not even be aware of this kind of
>> data loss occurring, there won't be any error in the API or warning in
>> the logs.
>> 
>> I was thinking how to solve this. There were a few commits already to
>> cleanup our collation drivers [1], expose libicu and collation
>> algorithm version in the new _versions endpoint [2], and some other
>> minor fixes in that area. As the next steps we could:
>> 
>>  1) Modify our views to keep track of the collation algorithm
>> version. We could attempt to transparently upgrade the view header
>> format -- read the old view file, update the header with an extra
>> libicu collation version field, that updates the signature, and then,
>> save the file with the new header and new signature. This avoids view
>> rebuilds, just records the collator version in the view and moves the
>> files to a new name.
>> 
>>  2) Do what PostgreSQL does, and 2a) emit a warning with the view
>> results when the current libicu version doesn't match the version in
>> the view [3]. That means altering the view results to add a "warning":
>> "..." field. Another alternative 2b) is emit a warning in the
>> _design/$ddoc/_info only. Users would have to know that after an OS
>> version upgrade, or restoring backups, to make sure to look at their
>> _design/$ddoc/_info for each db for each ddoc. Of course, there may be
>> users which used the "raw" collation option, or know they are using
>> just the plain ASCII character sets in their views. So we'd have a
>> configuration setting to ignore the warnings as well.
>> 
>>  3) Users who see the warning, could then either rebuild the view
>> with the new collator library manually, or it could happen
>> automatically based on a configuration option, basically "when
>> collator versions are mismatched, invalidate and rebuild all the
>> views".
>> 
>>  4) We'd have a way for the users to assert (POST a ddoc update) that
>> they double-checked the new ICU version and are convinced that a
>> particular view would not experience data loss with the new collator.
>> That should make the warning go away, and the view to not be rebuilt.
>> This can't be just a naive "collator" option setting as both per-view
>> and per-design options are used when computing the view signature, and
>> any changes there would result in the view being rebuilt. Perhaps we
>> can add it to the design docs as a separate option which is excluded
>> from the signature hash, like the "autoupdate" setting for background
>> index builder ("collation_version_accept"?). PostgreSQL also offers
>> this option with the ALTER COLLATION ... REFRESH VERSION command [3]
>> 
>> What do we think, is this a reasonable approach? Is there something
>> easier / simpler we can do?
>> 
>> Thanks!
>> -Nick
>> 
>> [1] 
>> https://github.com/apache/couchdb/pull/3746/commits/28f26f52fe2e170d98658311dafa8198d96b8061
>> [2] 
>> 

Re: [VOTE] Release Apache CouchDB 3.1.2

2021-09-27 Thread Robert Newson
+1

Sigs, checksums and 'make check' passed for me.

macOS 11.6, erlang 22, elixir 1.9.4.

> On 27 Sep 2021, at 13:53, Nick Vatamaniuc  wrote:
> 
> Dear community,
> 
> I would like to propose that we release Apache CouchDB 3.1.2
> 
> Changes since 3.1.1
> 
>
> https://github.com/apache/couchdb-documentation/compare/3.1.1...3.1.x?expand=1
> 
> This is a minor release where we backport a few features from the
> pending 3.2 release to the 3.1.x branch.
> 
> We encourage the whole community to download and test these release
> artefacts so that any critical issues can be resolved before the
> release is made. Everyone is free to vote on this release, so dig
> right in! (Only PMC members have binding votes, but they depend on
> community feedback to gauge if an official release is ready to be
> made.)
> 
> The release artefacts we are voting on are available here:
> 
>https://dist.apache.org/repos/dist/dev/couchdb/source/3.1.2/rc.1/
> 
> There, you will find a tarball, a GPG signature, and SHA256/SHA512 checksums.
> 
> Please follow the test procedure here:
> 
>
> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> 
> Please remember that "RC1" is an annotation. If the vote passes, these
> artefacts will be released as Apache CouchDB 3.1.2
> 
> Please cast your votes.
> 
> Thanks,
> -Nick



Re: Reformat src files with `erlfmt` on `main`

2021-06-03 Thread Robert Newson
As Paul's involuntary amanuensis he says "go for it" but asks if we've 
confirmed that the abstract syntax tree is unaffected (i.e., the changes are 
purely cosmetic and make no difference to the compiled artifacts).

B.

> On 2 Jun 2021, at 09:18, Bessenyei Balázs Donát  wrote:
> 
>> My only nose-wrinkle was at `->` being placed on its own line under some
> circumstances
> I counted too many occurrences of that to add ignores for them (and people
> would probably forget adding them on new code which would result in a mixed
> state).
> If there are no objections, I'll go ahead with merging it with the
> controversial `->`s on newlines  (as advantages seem to outweigh this
> drawback). As I mentioned earlier, if we can get a config option or change
> to erlfmt, we can always do a quick reformat.
> 
>> local git hook
> I couldn't find a nice way to do it, so I can open a ticket to do that
> later. The PR adds it to CI and people can run the checks (and the
> formatter) themselves locally.
> 
> I have not received a +>=0 from Paul, but as it's been more than a week now
> I'll merge the PR assuming consent. (The PR is already approved on GitHub.)
> The change is not irreversible and I'd be happy to either revert or adjust
> if necessary.
> 
> Thank you all for the support and the contribution!
> 
> 
> Donat
> 
> 
> On Fri, May 28, 2021 at 4:31 PM Ilya Khlopotov  wrote:
> 
>>> Can it also be set up as a local git hook etc?
>> 
>> Few complications here:
>> 1) The CouchDB codebase does not reside 100% in a single repository
>> 2) Which hook manager to use given differences in platforms we support and
>> the fact that none of the hook managers support multiple repositories.
>> There are multiple options:
>> 
>> - https://github.com/frankywahl/super_hooks
>> - https://github.com/Arkweid/lefthook
>> - https://github.com/pre-commit/pre-commit
>> 
>> Do we need a separate ML discussion which hook manager to use?
>> 
>> Another option is to update configure or rebar.config.script to place
>> files (or links) in `.git/hooks/pre-commit`.
>> 
>> Best regards,
>> iilyak
>> 
>> On 2021/05/21 12:25:53, Robert Newson  wrote:
>>> Hi,
>>> 
>>> My only nose-wrinkle was at `->` being placed on its own line under some
>> circumstances. The rest looked good. I agree that uniformity of formatting
>> is a very good thing and this reformat is long overdue.
>>> 
>>> Agree with Donat that the formatting should be enforced by CI tools so
>> there’s no backsliding. Can it also be set up as a local git hook etc?
>>> 
>>> B.
>>> 
>>>> On 21 May 2021, at 12:46, Bessenyei Balázs Donát 
>> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> I believe I've only seen +>=0s so far so I intend to (in the following
>> order):
>>>> * wait for an ok from @Robert Newson and @Paul J. Davis
>>>> * add `erlfmt-ignore`s if necessary to #3568
>>>> * add a check to CI (ideally via `make`) to ensure `erlfmt` is +1 on
>>>> the PRs in #3568
>>>> * create a PR for 3.x analogous to #3568
>>>> 
>>>> Please let me know if I missed anything.
>>>> 
>>>> 
>>>> Donat
>>>> 
>>>> 
>>>> On Fri, May 21, 2021 at 2:27 AM Joan Touzet  wrote:
>>>>> 
>>>>> In general I am +0.5 on the entire thing, but would like to see Bob
>>>>> Newson or Paul Davis speak up. In the past they've been the most vocal
>>>>> about code formatting standards, and I'd at least like to see a +0
>> from
>>>>> both of them.
>>>>> 
>>>>> -Joan
>>>>> 
>>>>> On 20/05/2021 11:53, Ilya Khlopotov wrote:
>>>>>> Good idea Donat!!!
>>>>>> 
>>>>>> Even though I disagree with some of the choices made by erlfmt I
>> appreciate the consistency it provides.
>>>>>> The choices are logical. I really love that every decision is
>> documented and properly discussed. I did read the PR in its entirety and in
>> fact did not even notice the ugly `->` at the beginning of the line until
>> closer to the end of the review process.
>>>>>> I do believe our wetware would adjust in no time to the new formatting,
>> given how easy it is to reason about. I agree with Donat's observation that
>> we are spending too much time and emphasis on formatting issues every time
>> we review PRs. I do believe it is a machine job to provide consis

Re: Reformat src files with `erlfmt` on `main`

2021-05-21 Thread Robert Newson
Hi,

My only nose-wrinkle was at `->` being placed on its own line under some 
circumstances. The rest looked good. I agree that uniformity of formatting is a 
very good thing and this reformat is long overdue.

Agree with Donat that the formatting should be enforced by CI tools so there’s 
no backsliding. Can it also be set up as a local git hook etc?
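
Something along these lines might do as a starting point for the hook side,
e.g. a small escript dropped into .git/hooks/pre-commit (untested sketch; it
assumes erlfmt is on $PATH and offers a --check mode):

    #!/usr/bin/env escript
    %% Sketch of a pre-commit hook that refuses to commit unformatted .erl files.
    main(_) ->
        Out = os:cmd("git diff --cached --name-only --diff-filter=ACM -- '*.erl'"),
        case string:lexemes(Out, "\n") of
            [] ->
                halt(0);
            Files ->
                Cmd = lists:flatten(["erlfmt --check ", lists:join(" ", Files)]),
                Port = erlang:open_port({spawn, Cmd}, [exit_status, stderr_to_stdout]),
                halt(wait_for_exit(Port))
        end.

    wait_for_exit(Port) ->
        receive
            {Port, {data, Data}} -> io:put_chars(Data), wait_for_exit(Port);
            {Port, {exit_status, Status}} -> Status
        end.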

B.

> On 21 May 2021, at 12:46, Bessenyei Balázs Donát  wrote:
> 
> Hi All,
> 
> I believe I've only seen +>=0s so far so I intend to (in the following order):
> * wait for an ok from @Robert Newson and @Paul J. Davis
> * add `erlfmt-ignore`s if necessary to #3568
> * add a check to CI (ideally via `make`) to ensure `erlfmt` is +1 on
> the PRs in #3568
> * create a PR for 3.x analogous to #3568
> 
> Please let me know if I missed anything.
> 
> 
> Donat
> 
> 
> On Fri, May 21, 2021 at 2:27 AM Joan Touzet  wrote:
>> 
>> In general I am +0.5 on the entire thing, but would like to see Bob
>> Newson or Paul Davis speak up. In the past they've been the most vocal
>> about code formatting standards, and I'd at least like to see a +0 from
>> both of them.
>> 
>> -Joan
>> 
>> On 20/05/2021 11:53, Ilya Khlopotov wrote:
>>> Good idea Donat!!!
>>> 
>>> Even though I disagree with some of the choices made by erlfmt I appreciate 
>>> the consistency it provides.
>>> The choices are logical. I really love that every decision is documented 
>>> and properly discussed. I did read the PR in its entirety and in fact did 
>>> not even notice the ugly `->` at the beginning of the line until closer to 
>>> the end of the review process.
>>> I do believe our wetware would adjust in no time to the new formatting, given 
>>> how easy it is to reason about. I agree with Donat's observation that we 
>>> are spending too much time and emphasis on formatting issues every time we 
>>> review PRs. I do believe it is a machine job to provide consistent 
>>> formatting. We humans are better at other things. All in all I vote for 
>>> adopting `erlfmt` for both 3.x and main.
>>> 
>>> Also thank you Donat for providing validation scripts to make sure the 
>>> re-formatted code compiles to the same beam files.
>>> 
>>> Best regards,
>>> iilyak
>>> 
>>> 
>>> On 2021/05/18 18:13:14, Bessenyei Balázs Donát  wrote:
>>>> Hi dev@couchdb,
>>>> 
>>>> To eliminate the need for formatting-related comments and thus
>>>> unnecessary cycles in PRs, I've invested a little time to see if we
>>>> could use a formatter on `main` [1].
>>>> The PR reformats `.erl` files in `src` and the script [2] included
>>>> shows that the compiled binaries match "before" and "after".
>>>> The formatter used in the PR is `erlfmt` [3] which is an opinionated
>>>> [4] tool so it's more of a "take it or leave it" as-is. (We could try
>>>> using other formatters if we want in case people want formatting but
>>>> think the choices `erlfmt` makes are unacceptable.)
>>>> Some members of the CouchDB dev community already left some great
>>>> comments on the PR and I haven't seen any strong opposition so far,
>>>> but I wanted to make sure more people are aware of this.
>>>> If you have any questions, comments or concerns (or objections),
>>>> please let me know.
>>>> 
>>>> 
>>>> Thank you,
>>>> 
>>>> Donat
>>>> 
>>>> 
>>>> [1]: https://github.com/apache/couchdb/pull/3568
>>>> [2]: 
>>>> https://github.com/apache/couchdb/pull/3568/files#diff-7adfbc2d8dba4d4ff49ff2b760b81c006097f20f412ea2007f899042fd0de98a
>>>> [3]: https://github.com/WhatsApp/erlfmt
>>>> [4]: 
>>>> https://github.com/WhatsApp/erlfmt#comparison-with-other-erlang-formatters
>>>> 



Re: [DISCUSSION] Clean up non-functioning applications from main

2021-04-12 Thread Robert Newson
+1 to all the proposed cuts.

I’m keen to see couch_server.erl itself go, so its remaining uses need new 
homes (couch_passwords an obvious choice for the hashing referred to, etc).

I’m inferring that neither purge nor global_changes works on main anyway, but 
they can still be called and will route to 3.x code. Agree that it’s better to 
stub those out (send a 503 I guess?) in the short term and either re-implement 
on FDB or (as Joan said) vote on their permanent removal. (Noting that a much 
better implementation of purge and global_changes seems possible with FDB 
though less clear if the effort is justified).

So, in brief, remove absolutely all the obsoleted, unreachable code as soon as 
possible, then once the dust has settled we can see if there are obvious gaps 
we should fill in before the 4.0 release.
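
For the stubs, something as small as this per endpoint would probably do
(handler name and helper arity below are from memory, so treat it as a sketch
rather than the actual change):

    handle_global_changes_req(Req) ->
        %% Temporary stub until a FoundationDB-backed implementation exists.
        chttpd:send_json(Req, 503, {[
            {<<"error">>, <<"not_implemented">>},
            {<<"reason">>, <<"_db_updates is not available on this branch yet">>}
        ]}).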

B.

> On 12 Apr 2021, at 18:51, Nick Vatamaniuc  wrote:
> 
> The current versions of those apps rely on mem3, clustering, adding
> nodes, etc and they will trail behind the 3.x versions since
> developers wouldn't think to port those updates to main since they are
> simply non-functional there. Most of those apps have to be re-written
> from scratch and it would be better to start from the recent working
> versions on 3.x.  The tests for those apps don't really fail as we get
> green builds on PR branches to main. We simply don't run them at all
> and only run a subset of applications (fabric, couch_jobs, couch_views
> and a few others).
> 
> Don't think this is about a 4.x release per-se. This is mainly about
> cleaning up, reducing the cognitive load of anyone jumping in trying
> to work on main and seeing applications and endpoints calling into
> non-existing applications.
> 
> -Nick
> 
> 
> -Nick
> 
> On Mon, Apr 12, 2021 at 1:13 PM Joan Touzet  wrote:
>> 
>> Generally +1 with one major reservation:
>> 
>> On 12/04/2021 12:25, Nick Vatamaniuc wrote:
>>> * Some applications we want to have in main, but the way they are
>>> implemented currently rely completely or mostly on 3.x code: purge
>>> logic, couch_peruser, global_changes, setup. I am thinking it may be
>>> better to remove them from main as we'll have them on the 3.x branch and
>>> they'll stay recent (working) there. When we're ready to fix them up, we
>>> can copy that code from there to the main branch.
>> 
>> If the intent is to release 4.0 with them, then I would suggest keeping
>> them there and allowing their tests to fail so we know that a "failing
>> main" means that the product isn't ready to release yet.
>> 
>> If we are pushing these out past a 4.0 release, then that decision needs
>> to be made formally.
>> 
>> Parenthetically, we try to avoid "code owners" here, but usually fixes
>> to couch_peruser and setup fall to Jan, while purge and global_changes I
>> *believe* have generally been made by IBM/Cloudant.
>> 
>> -Joan "not sure main is ready to be called 4.0 yet anyway" Touzet



Re: Removing "node" field from replicator "/_scheduler/{jobs | docs}"

2021-04-05 Thread Robert Newson
Several good points in there, Nick.

How about a config toggle? A single config setting that decides if any endpoint 
exposes “erlangness” (so at least node names, pids or registered names). To be 
applied to active_tasks output also. Defaulting to true for compatibility.
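
Roughly something like this, applied wherever we build the job/task JSON (the
config section and key names here are made up purely for illustration):

    maybe_hide_erlang_details({Props}) ->
        case config:get_boolean("api", "expose_erlang_internals", true) of
            true ->
                {Props};
            false ->
                {lists:foldl(fun proplists:delete/2, Props, [<<"node">>, <<"pid">>])}
        end.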

> On 5 Apr 2021, at 17:17, Nick Vatamaniuc  wrote:
> 
> The "node" field can be helpful in determining where the background
> task runs even if nodes are not connected in a mesh. Nodes could still
> be named something like replication_node_1, replication_node_2, etc.
> Even in 3.x, the replicator doesn't rely on the nodes being meshed all
> that much, the jobs start independently on each node based on how
> _replicator doc shards are distributed.
> 
> A few more thoughts around it:
> 
> * If we're going to remove "node", should we also remove the "pid"
> field? It seems odd to have one but not the other...
> 
> * One of the reasons node and pid were added to _scheduler/*
> endpoints was to eliminate the need for the users to ever look in
> _active_tasks to check the status of their replication jobs. In other
> words, having to tell a user "if you want stats look in
> _scheduler/jobs, but if you want to find the node look in
> _active_tasks".
> 
>  * What about _active_tasks and "node" and "pid" there? Both view
> indexing and replication jobs have those fields, should we remove them
> too?
> 
> So I think overall I am neutral (-0) on the idea. I can see how it may
> be odd to have those internal details in there to start with, but I am
> not sure meshing alone is a good reason either way. And if we're going
> to do it, it may be better to be consistent with "pid" and "node" and
> in regards to _active_tasks as well.
> 
> (As a fun aside, the node and pid were once helpful in our test
> environment in discovering that a different kube cluster was picking
> up and executing indexing jobs due to a misconfigured [fabric]
> fdb_directory config. Both clusters were sharing the same FDB cluster
> underneath).
> 
> Cheers,
> -Nick
> 
> On Fri, Apr 2, 2021 at 6:00 PM Bessenyei Balázs Donát  
> wrote:
>> 
>> I support removing obsolete fields from responses.
>> I also support tracking API changes.
>> 
>> 
>> Donat
>> 
>> On Fri, Apr 2, 2021 at 10:23 PM Robert Newson  wrote:
>>> 
>>> +1 to removing “node” on main (but not 3.x branch).
>>> 
>>> B.
>>> 
>>>> On 2 Apr 2021, at 21:11, Ilya Khlopotov  wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> In the FDB world there wouldn't be an Erlang mesh as far as I can tell. In this 
>>>> situation the `node` field in the response from `/_scheduler/jobs` and 
>>>> `/_scheduler/docs` doesn't make sense.
>>>> 
>>>> We could either remove the field or set it to `None`. I propose complete 
>>>> removal.
>>>> 
>>>> I also propose to establish a formal process to track API changes 
>>>> formally. Sooner or latter we would need to compile list of changes 
>>>> between versions. In case of a rewrite on top of FDB I suspect the 
>>>> archeology process wouldn't be easy. We could create a github issue 
>>>> template which would set necessary labels for example.
>>>> 
>>>> Best regards,
>>>> iilyak
>>>> 
>>> 



Re: Removing "node" field from replicator "/_scheduler/{jobs | docs}"

2021-04-02 Thread Robert Newson
+1 to removing “node” on main (but not 3.x branch).

B.

> On 2 Apr 2021, at 21:11, Ilya Khlopotov  wrote:
> 
> Hi, 
> 
> In the FDB world there wouldn't be an Erlang mesh as far as I can tell. In this 
> situation the `node` field in the response from `/_scheduler/jobs` and 
> `/_scheduler/docs` doesn't make sense.
> 
> We could either remove the field or set it to `None`. I propose complete 
> removal. 
> 
> I also propose to establish a formal process to track API changes formally. 
> Sooner or later we would need to compile a list of changes between versions. 
> In case of a rewrite on top of FDB I suspect the archeology process wouldn't 
> be easy. We could create a github issue template which would set necessary 
> labels for example. 
> 
> Best regards,
> iilyak
> 



Re: [VOTE] Set a finite default for max_attachment_size

2021-01-28 Thread Robert Newson
Hi,

I think a gigabyte is _very_ generous given our experience of this feature in 
practice.

In 4.x attachment size will necessarily be much more restrictive, so it seems 
prudent to move toward that limit.

I don’t think many folks (hopefully no one!) are routinely inserting attachments 
over 1 GiB today; I’d be fairly surprised if it even works.
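
For anyone who genuinely depends on very large attachments, the old behaviour
should remain a one-line opt-out in local.ini (value in bytes, or the literal
infinity as today; the exact default is whatever the PR settles on):

    [couchdb]
    ; proposed finite default (1 GiB); set to infinity to restore the old behaviour
    max_attachment_size = 1073741824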

B.

> On 28 Jan 2021, at 19:42, Eric Avdey  wrote:
> 
> There is no justification either here or on the PR for this change, i.e. why 
> this is done. The original infinity default was set to preserve previous 
> behaviour; this change will inadvertently break the workflow for users who upload 
> large attachments and haven't set an explicit default, so why is it fine to do 
> now? There might be some discussion around this somewhere, but it'd be nice 
> to include it here for the sake of people like me who are out of the loop.
> 
> Also the 1G limit seems arbitrary - how was it chosen?
> 
> 
> Thanks,
> Eric
> 
> 
> 
>> On Jan 28, 2021, at 01:46, Bessenyei Balázs Donát  wrote:
>> 
>> Hi All,
>> 
>> In https://github.com/apache/couchdb/pull/3347 I'm proposing to set a
>> finite default for max_attachment_size .
>> The PR is approved, but as per Ilya's request, I'd like to call for a
>> lazy majority vote here.
>> The vote will remain open for at least 72 hours from now.
>> 
>> Please let me know if you have any questions, comments or concerns.
>> 
>> 
>> Donat
> 



Re: [VOTE] Set a finite default for max_attachment_size

2021-01-28 Thread Robert Newson
+1

> On 28 Jan 2021, at 05:46, Bessenyei Balázs Donát  wrote:
> 
> Hi All,
> 
> In https://github.com/apache/couchdb/pull/3347 I'm proposing to set a
> finite default for max_attachment_size .
> The PR is approved, but as per Ilya's request, I'd like to call for a
> lazy majority vote here.
> The vote will remain open for at least 72 hours from now.
> 
> Please let me know if you have any questions, comments or concerns.
> 
> 
> Donat



Re: [DISCUSS] Removing erlang 19 support

2021-01-22 Thread Robert Newson
Interesting. I’m actually surprised at the inversion here (that CouchDB is 
dependent on IBM to confirm CouchDB’s stability). I’ve always agonised over 
even the perception that IBM/Cloudant is calling the shots. I appreciate the 
reassurance that running at scale provides, of course, I just don’t think it 
can be our official position.

On the core point of the thread, it seems there’s no barrier to dropping Erlang 
19 support, so I think we can go to a VOTE thread, perhaps best to wait till 
Monday for others to chime in on this discussion though.

I also think that IBM Cloudant’s chosen Erlang release is in part influenced by 
CouchDB’s lack of support for later versions and its requirement of compatibility 
with older releases, which now appears illusory.

B.


> On 22 Jan 2021, at 21:19, Joan Touzet  wrote:
> 
> On 22/01/2021 15:48, Robert Newson wrote:
>> I’m +1 on dropping Erlang 19 support. Erlang is now on major release 23.
> 
> No problem here.
> 
>> I’d further advocate a general policy of supporting only the most recent 2 
>> or 3 major releases of Erlang/OTP.
>> 
>> The main (I think only?) reason to keep compatibility so far back is because 
>> of the versions supported by some OS’es. I don’t feel that is a strong 
>> reason given the couchdb install, in common with Erlang-based projects, is 
>> self-contained. The existence of Erlang Solutions packages for all common 
>> platforms is also a factor.
> 
> That hasn't been the case for at least 2 years, if not longer.
> 
> As the release engineer, I've been focused on stability for everyone.
> This is largely driven by knowing that IBM/Cloudant largely run CouchDB
> on 20.x at scale. Standing on the shoulders of giants, our releases run
> the latest 20.x release at the time of binary generation.
> 
> A few times recently issues cropped up in 21 and 22 that we didn't
> encounter in our user base because, at scale, we are deployed on
> 20.3.8.something. Some of these issues were non-trivial. (I'm off today,
> so I don't have the time to dig into the specifics until Monday.)
> 
> So my $0.02 is that: if IBM/Cloudant is ready to move to something newer
> at scale, I'm ready to release binaries on a newer Erlang by default.
> 
> The alternative (running newer Erlangs in the binary distributions than
> IBM/Cloudant run in production) could possibly be perceived as treating
> our open source customers as guinea pigs. I'd rather not risk that
> perception, but am willing to be convinced otherwise.
> 
> -Joan
> 
>> 
>> B.
>> 
>>> On 22 Jan 2021, at 19:54, Bessenyei Balázs Donát  wrote:
>>> 
>>> Hi All,
>>> 
>>> CI for https://github.com/apache/couchdb-config appears to be broken.
>>> I wanted to fix it in
>>> https://github.com/apache/couchdb-config/pull/34/files , but I'm
>>> getting issues with erlang 19. Are we okay with dropping 19 support
>>> there?
>>> 
>>> On a different note: are we okay with dropping erlang 19 support
>>> overall in couch project(s)?
>>> 
>>> 
>>> Thank you,
>>> Donat
>> 



Re: [DISCUSS] Removing erlang 19 support

2021-01-22 Thread Robert Newson
I’m +1 on dropping Erlang 19 support. Erlang is now on major release 23.

I’d further advocate a general policy of supporting only the most recent 2 or 3 
major releases of Erlang/OTP.

The main (I think only?) reason to keep compatibility so far back is because of 
the versions supported by some OS’es. I don’t feel that is a strong reason 
given the couchdb install, in common with Erlang-based projects, is 
self-contained. The existence of Erlang Solutions packages for all common 
platforms is also a factor.

B.

> On 22 Jan 2021, at 19:54, Bessenyei Balázs Donát  wrote:
> 
> Hi All,
> 
> CI for https://github.com/apache/couchdb-config appears to be broken.
> I wanted to fix it in
> https://github.com/apache/couchdb-config/pull/34/files , but I'm
> getting issues with erlang 19. Are we okay with dropping 19 support
> there?
> 
> On a different note: are we okay with dropping erlang 19 support
> overall in couch project(s)?
> 
> 
> Thank you,
> Donat



[ANNOUNCE] Bessenyei Balázs Donát elected as CouchDB committer

2021-01-14 Thread Robert Newson
Dear community,

I am pleased to announce that the CouchDB Project Management Committee has 
elected Bessenyei Balázs Donát as a CouchDB committer.

Apache ID: bessbd

Committers are given a binding vote in certain project decisions, as well as 
write access to public project infrastructure.

This election was made in recognition of Donát's commitment to the project. We 
mean this in the sense of being loyal to the project and its interests.

Please join me in extending a warm welcome to Donát!

On behalf of the CouchDB PMC,

Robert Newson



Re: [VOTE] couchdb 4.0 transaction semantics

2021-01-10 Thread Robert Newson
Hi,


There is a fundamental incompatibility between CouchDB using couch_file/btree 
and CouchDB using FDB.

The choice at hand here is between two different forms of compatibility break;

1) All responses that were over a single snapshot in CouchDB 1/2/3 will still 
be over a single snapshot in CouchDB 4, but necessarily limited in the amount 
of data they can return and time they can take. This means that some requests 
will fail that previously succeeded. As an example, hitting _all_docs without a 
suitably small “limit” parameter will either a) return a 400 Bad Request right 
out of the gate or b) be abruptly terminated mid-response when one of the FDB 
limits is reached (5 second transaction duration or 10MB of data: 
https://apple.github.io/foundationdb/known-limitations.html).

2) Some responses that were over a single snapshot in CouchDB 1/2/3 will 
potentially be over multiple snapshots in CouchDB 4, so clients will sometimes 
see incoherent responses. As an example, an _all_docs response running 
concurrently with doc inserts will see none, some or all of those inserts, 
depending on the doc _ids of those inserts and how far along the _all_docs 
response has progressed. Two _all_docs responses running concurrently with 
those inserts could see a different subset of those concurrently running 
inserts (based on when the restart_tx code is called and the GRV grabbed at 
that point).
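
For the avoidance of doubt, the restart pattern in 2) amounts to something like
the untested sketch below (erlfdb option names and return shapes from memory):
each page is read in its own transaction, so rows in different pages can come
from different snapshots.

    %% Stream a key range one page per FDB transaction. Rows within a page come
    %% from a single snapshot; rows across pages do not -- which is exactly the
    %% scenario 2) behaviour described above.
    stream_range(Db, StartKey, EndKey, PageSize, Fun, Acc0) ->
        Rows = erlfdb:transactional(Db, fun(Tx) ->
            erlfdb:get_range(Tx, StartKey, EndKey, [{limit, PageSize}])
        end),
        Acc1 = lists:foldl(Fun, Acc0, Rows),
        case length(Rows) < PageSize of
            true ->
                Acc1;
            false ->
                {LastKey, _} = lists:last(Rows),
                %% resume strictly after the last key seen, in a new transaction
                stream_range(Db, <<LastKey/binary, 0>>, EndKey, PageSize, Fun, Acc1)
        end.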

I vastly prefer the 1) scenario.

To bridge the gap that Nick describes I agree it would be acceptable to drop 
the snapshot requirement for all _changes responses, not just the continuous 
mode. All correctly-implemented consumers of the changes feed should be able to 
handle that.

For 3.x -> 4.x replication, we could make a 3.x minor release that solely 
enhances the replicator in whatever way would be necessary to restore 
replication compatibility. We’ve done this before at least once.

To Will’s points, we would need to decide if CouchDB 4 will have a 
“compatibility mode” at all (taken to mean that no adjustment is needed by the 
client whatsoever). Beyond replication, I don’t see how it could be done, and I 
don’t think it should be a goal. We shouldn’t be _incompatible_ capriciously, 
however. But, at base, this is a major version bump (the classic signal of 
potential incompatibility), a very significant amount of new code, and a 
completely new storage backend with constraints that preclude CouchDB 1.x 
semantics.

Anyway, this is a VOTE thread and not a DISCUSS thread. I think it’s fair to 
say the proposal has failed and so this thread is over. We do need to make a 
project level decision on this topic before CouchDB 4 can be released.

B.

> On 10 Jan 2021, at 07:26, Joan Touzet  wrote:
> 
> If this proposal means v3.x replicators can't replicate one-shot / normal / 
> non-continuous changes from 4.x+ endpoints, that sounds like a big break in 
> compatibility.
> 
> I'm -0.5, tending towards -1, but mostly because I'm having trouble 
> understanding if it's even possible - unless a proposal is being made to 
> release a 3.2 that introduces replication compatibility with 4.x in tandem.
> 
> -Joan
> 
> On 2021-01-09 6:45 p.m., Nick Vatamaniuc wrote:
>>> I withdraw my vote until I can get a clearer view. Nick would you mind
>> re-stating?
>> Not at all! The longer version and other considerations was stated in
>> my last reply to the discussion thread so I assumed that was accepted
>> as a consensus since nobody replied arguing otherwise.
>> https://lists.apache.org/thread.html/r45bff6ca4339f775df631f47e77657afbca83ee0ef03c6aa1a1d45cb%40%3Cdev.couchdb.apache.org%3E
>> But the gist of it is that existing (< 3.x) replicators won't be able
>> to replicate non-continuous (normal) changes from >= 4.x endpoints.
>> Regards,
>> -Nick
>> On Sat, Jan 9, 2021 at 1:26 AM Joan Touzet  wrote:
>>> 
>>> Wait, what? I thought you agreed with this approach in that thread.
>>> 
>>> I withdraw my vote until I can get a clearer view. Nick would you mind
>>> re-stating?
>>> 
>>> -Joan
>>> 
>>> On 2021-01-08 11:37 p.m., Nick V wrote:
>>>> +1 for 1 through 3
>>>> 
>>>> -1 for 4  as I think the exception should apply to normal change feeds as 
>>>> well, as described in the thread
>>>> 
>>>> Cheers,
>>>> -Nick
>>>> 
>>>>> On Jan 8, 2021, at 17:12, Joan Touzet  wrote:
>>>>> 
>>>>> Thanks, then it's a solid +1 from me.
>>>>> 
>>>>> -Joan
>>>>> 
>>>>>> On 2021-01-08 4:13 p.m., Robert Newson wrote:
>>>>>> You are probably thinking of a possible “group

Re: [VOTE] couchdb 4.0 transaction semantics

2021-01-09 Thread Robert Newson
The vote is on the proposal text in the quote. 

> On 9 Jan 2021, at 04:37, Nick V  wrote:
> 
> +1 for 1 through 3
> 
> -1 for 4  as I think the exception should apply to normal change feeds as 
> well, as described in the thread
> 
> Cheers,
> -Nick
> 
>> On Jan 8, 2021, at 17:12, Joan Touzet  wrote:
>> 
>> Thanks, then it's a solid +1 from me.
>> 
>> -Joan
>> 
>>> On 2021-01-08 4:13 p.m., Robert Newson wrote:
>>> You are probably thinking of a possible “group commit”. That is anticipated 
>>> and not contradicted by this proposal. This proposal is explicitly about 
>>> not using multiple states of the database for a single doc lookup, view 
>>> query, etc.
>>>>> On 8 Jan 2021, at 19:53, Joan Touzet  wrote:
>>>> 
>>>> +1.
>>>> 
>>>> This is for now I presume, as I thought that there was feeling about
>>>> relaxing this restriction somewhat for the 5.0 timeframe? Memory's dim.
>>>> 
>>>> -Joan
>>>> 
>>>> On 07/01/2021 06:00, Robert Newson wrote:
>>>>> Hi,
>>>>> 
>>>>> Following on from the discussion at 
>>>>> https://lists.apache.org/thread.html/rac6c90c4ae03dc055c7e8be6eca1c1e173cf2f98d2afe6d018e62d29%40%3Cdev.couchdb.apache.org%3E
>>>>> 
>>>>> The proposal is;
>>>>> 
>>>>> "With the exception of the changes endpoint when in feed=continuous mode, 
>>>>> that all data-bearing responses from CouchDB are constructed from a 
>>>>> single, immutable snapshot of the database at the time of the request.”
>>>>> 
>>>>> Paul Davis summarised the discussion in four bullet points, reiterated 
>>>>> here for context;
>>>>> 
>>>>> 1. A single CouchDB API call should map to a single FDB transaction
>>>>> 2. We absolutely do not want to return a valid JSON response to any
>>>>> streaming API that hit a transaction boundary (because data
>>>>> loss/corruption)
>>>>> 3. We're willing to change the API requirements so that 2 is not an issue.
>>>>> 4. None of this applies to continuous changes since that API call was
>>>>> never a single snapshot.
>>>>> 
>>>>> 
>>>>> Please vote accordingly, we’ll run this as lazy consensus per the bylaws 
>>>>> (https://couchdb.apache.org/bylaws.html#lazy)
>>>>> 
>>>>> B.
>>>>> 
>>>>> 



Re: [VOTE] couchdb 4.0 transaction semantics

2021-01-08 Thread Robert Newson
You are probably thinking of a possible “group commit”. That is anticipated and 
not contradicted by this proposal. This proposal is explicitly about not using 
multiple states of the database for a single doc lookup, view query, etc.

> On 8 Jan 2021, at 19:53, Joan Touzet  wrote:
> 
> +1.
> 
> This is for now I presume, as I thought that there was feeling about
> relaxing this restriction somewhat for the 5.0 timeframe? Memory's dim.
> 
> -Joan
> 
> On 07/01/2021 06:00, Robert Newson wrote:
>> Hi,
>> 
>> Following on from the discussion at 
>> https://lists.apache.org/thread.html/rac6c90c4ae03dc055c7e8be6eca1c1e173cf2f98d2afe6d018e62d29%40%3Cdev.couchdb.apache.org%3E
>> 
>> The proposal is;
>> 
>> "With the exception of the changes endpoint when in feed=continuous mode, 
>> that all data-bearing responses from CouchDB are constructed from a single, 
>> immutable snapshot of the database at the time of the request.”
>> 
>> Paul Davis summarised the discussion in four bullet points, reiterated here 
>> for context;
>> 
>> 1. A single CouchDB API call should map to a single FDB transaction
>> 2. We absolutely do not want to return a valid JSON response to any
>> streaming API that hit a transaction boundary (because data
>> loss/corruption)
>> 3. We're willing to change the API requirements so that 2 is not an issue.
>> 4. None of this applies to continuous changes since that API call was
>> never a single snapshot.
>> 
>> 
>> Please vote accordingly, we’ll run this as lazy consensus per the bylaws 
>> (https://couchdb.apache.org/bylaws.html#lazy)
>> 
>> B.
>> 
>> 



Re: [VOTE] couchdb 4.0 transaction semantics

2021-01-07 Thread Robert Newson
+1

> On 7 Jan 2021, at 11:00, Robert Newson  wrote:
> 
> Hi,
> 
> Following on from the discussion at 
> https://lists.apache.org/thread.html/rac6c90c4ae03dc055c7e8be6eca1c1e173cf2f98d2afe6d018e62d29%40%3Cdev.couchdb.apache.org%3E
> 
> The proposal is;
> 
> "With the exception of the changes endpoint when in feed=continuous mode, 
> that all data-bearing responses from CouchDB are constructed from a single, 
> immutable snapshot of the database at the time of the request.”
> 
> Paul Davis summarised the discussion in four bullet points, reiterated here 
> for context;
> 
> 1. A single CouchDB API call should map to a single FDB transaction
> 2. We absolutely do not want to return a valid JSON response to any
> streaming API that hit a transaction boundary (because data
> loss/corruption)
> 3. We're willing to change the API requirements so that 2 is not an issue.
> 4. None of this applies to continuous changes since that API call was
> never a single snapshot.
> 
> 
> Please vote accordingly, we’ll run this as lazy consensus per the bylaws 
> (https://couchdb.apache.org/bylaws.html#lazy)
> 
> B.
> 



[VOTE] couchdb 4.0 transaction semantics

2021-01-07 Thread Robert Newson
Hi,

Following on from the discussion at 
https://lists.apache.org/thread.html/rac6c90c4ae03dc055c7e8be6eca1c1e173cf2f98d2afe6d018e62d29%40%3Cdev.couchdb.apache.org%3E
 


The proposal is;

"With the exception of the changes endpoint when in feed=continuous mode, that 
all data-bearing responses from CouchDB are constructed from a single, 
immutable snapshot of the database at the time of the request.”

Paul Davis summarised the discussion in four bullet points, reiterated here for 
context;

1. A single CouchDB API call should map to a single FDB transaction
2. We absolutely do not want to return a valid JSON response to any
streaming API that hit a transaction boundary (because data
loss/corruption)
3. We're willing to change the API requirements so that 2 is not an issue.
4. None of this applies to continuous changes since that API call was
never a single snapshot.


Please vote accordingly, we’ll run this as lazy consensus per the bylaws 
(https://couchdb.apache.org/bylaws.html#lazy)

B.



Re: [DISCUSS] couchdb 4.0 transactional semantics

2021-01-07 Thread Robert Newson
Apologies for resurrecting this thread after so long.

I’ve looked over the thread again today and it seems there is general consensus 
on the desired semantics. I will start a vote thread.

B.

> On 24 Jul 2020, at 18:27, Nick Vatamaniuc  wrote:
> 
> Great discussion everyone!
> 
> For normal replications, I think it might be nice to make an exception
> and allow server-side pagination for compatibility at first, with a
> new option to explicitly enable strict snapshots behavior. Then, in a
> later release make it the default to match _all_docs and _view reads.
> In other words, for a short while, we'd support bi-directional
> replications between 4.x and 1/2/3.x on any replicator and document
> that fact, then after a while will switch that capability off and
> users would have to run replications on a 4.x replicator only, or
> specially updated 3.x replicators.
> 
>> I'd rather support this scenario than have to support explaining why the 
>> "one shot" replication back to an old 1.x, when initiated by a 1.x cluster, 
>> is returning results "ahead" of the time at which the one-shot replication 
>> was started.
> 
> Ah, that won't happen in the current fdb prototype branch
> implementation. What might happen is there would be changes present in
> the changes feed that happened _after_ the request has started. That
> won't be any different than if a node where replication runs restarts,
> or there is a network glitch. The changes feed would proceed from the
> last checkpoint and see changes that happened after the initial
> starting sequence and apply them in order (document "a" was deleted,
> then it was updated again then deleted again, every change will be
> applied incrementally to the target, etc).
> 
> We'd have to document the fact that a single snapshot replication from
> 4.x -> 1/2/3.x is impossible anyway (unless we do the trick where we
> compare the update sequence and db was not updated in the meantime or
> the new FDB storage engine allows it).  The question then becomes if
> we allow the pagination to happen on the client or the server. In case
> of normal replication I think it would be nice to allow it to happen
> on the server for a bit to allow for maximum initial replication
> interoperability.
> 
>> For cases where you’re not concerned about the snapshot isolation (e.g. 
>> streaming an entire _changes feed), there is a small performance benefit to 
>> requesting a new FDB transaction asynchronously before the old one actually 
>> times out and swapping over to it. That’s a pattern I’ve seen in other FDB 
>> layers but I’m not sure we’ve used it anywhere in CouchDB yet.
> 
> Good point, Adam. We could optimize that part, yeah. Fetch a GRV after
> 4.9 seconds or so and keep it ready to go for example. So far we tried
> to react to the transaction_too_old exception, as opposed to starting
> a timer there in order to allow us to use the maximum time a tx is
> alive, to save a few seconds or milliseconds. That required some
> tricks such as handling the exception bubbling up from either the
> range read itself, or from the user's callback (say if user code in
> the callback fetched a doc body which blew up with a
> transaction_too_old exception). As an interesting aside, from quick
> experiments I had noticed we were able to stream about 100-150k rows
> from a single tx snapshot, that wasn't too bad I thought.
> 
> Speaking of replication, I am trying to see what the replicator might
> look like in 4.x in the https://github.com/apache/couchdb/pull/3015
> (prototype/fdb-replicator branch). It's very much a wip and hot mess
> currently. Will issue an RFC once I have a better handle on the
> general shape of it. So far it's based on couch_jobs, with a global
> queue and looks like it might be smaller overall, as it's leveraging
> the scheduling capabilities already present in couch_jobs, and but
> once started individual replication job process hierarchy is largely
> the same as before.
> 
> Cheers,
> -Nick
> 
> 
> 
> 
> 
> On Wed, Jul 22, 2020 at 8:48 AM Bessenyei Balázs Donát
>  wrote:
>> 
>> On Tue, 21 Jul 2020 at 18:45, Jan Lehnardt  wrote:
>>> I’m not sure why a URL parameter vs. a path makes a big difference?
>>> 
>>> Do you have an example?
>>> 
>>> Best
>>> Jan
>>> —
>> 
>> Oh, sure! OpenAPI Generator [1] and et al. for example generate Java
>> methods (like [2] out of spec [3]) per path per verb.
>> Java's type safety and the way methods are currently generated don't
>> really provide an easy way to retrieve multiple kinds of responses, so
>> having them separate would help a lot there.
>> 
>> 
>> Donat
>> 
>> PS. I'm getting self-conscious about discussing this in this thread.
>> Should I open a new one?
>> 
>> 
>> [1] https://openapi-generator.tech/
>> [2] 
>> https://github.com/OpenAPITools/openapi-generator/blob/c49d8fd/samples/client/petstore/java/okhttp-gson/src/main/java/org/openapitools/client/api/PetApi.java#L606
>> [3] 
>> 

Re: Jenkins issues, looking for committer volunteer(s)

2020-09-15 Thread Robert Newson
+1

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 15 Sep 2020, at 09:18, Jan Lehnardt wrote:
> 
> 
> > On 15. Sep 2020, at 04:09, Joan Touzet  wrote:
> > 
> > Paul informs me that IBM have discontinued all Power platform hosting at 
> > the level that suits us. He is following up with Adam and others to find a 
> > solution, but...
> > 
> > This directly endangers our ability to release packages and Docker 
> > containers on ppc64le, as this platform will not be in the regression 
> > suite. We've had issues on alternate platforms (such as ARM and ppc64le) 
> > when not performing active testing.
> > 
> > This is especially troubling since IBM are the primary clients for this 
> > platform, or rather, their customers are.
> > 
> > I realize this may seem harsh, but I propose to remove ppc64le from the 
> > packages and the couchdb top-level Docker file by end of 2020, should 
> > replacement machines not be made available.
> > 
> > Please discuss.
> 
> +1.
> 
> Best
> Jan
> —
> > 
> > -Joan
> > 
> > On 2020-09-12 5:01 p.m., Joan Touzet wrote:
> >> Hi Devs,
> >> FYI per Jenkins:
> >> > All nodes of label ‘ppc64le’ are offline
> >> This is one of the reasons causing our Jenkins failures on master.
> >> (The other is our usual heisenbugs in the test suite.)
> >> I really would like it if someone on the PMC (other than me and Paul)
> >> would agree to help keep Jenkins running. It's my weekend and I really
> >> don't have time to stay on top of these things. If you're a committer
> >> we can get you access to the machines fairly readily, and Paul can help
> >> talk you through what's necessary to keep the workers alive.
> >> -Joan "more help is always welcome" Touzet
> 
>


Re: Preparing 3.1.1 release

2020-09-01 Thread Robert Newson
Nice 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 1 Sep 2020, at 16:24, Jan Lehnardt wrote:
> This PR by my coworker Jacoba should address this issue satisfactorily:
> 
>https://github.com/apache/couchdb-fauxton/pull/1292
> 
> Best
> Jan
> —
> 
> > On 27. Aug 2020, at 11:41, Jan Lehnardt  wrote:
> > 
> > In ermouth's defence, I also think that the PR was merged prematurely. But 
> > adding a button with a warning that then conditionally loads the iframe 
> > should not be a lot of work and I'm happy to review a PR there.
> > 
> > Cheers
> > Jan
> > —
> > 
> >> On 27. Aug 2020, at 01:05, Joan Touzet  wrote:
> >> 
> >> A PR to disable the tab via an ini file setting would absolutely be 
> >> merged. Why not work on one?
> >> 
> >> On 2020-08-26 6:45 p.m., ermouth wrote:
>  The blog is controlled by the CouchDB PMC. No one outside of the PMC or
> >>> who they authorize has access to it.
> >>> This is about the wordpress server where the blog lives. The server is
> >>> maintained so impressively that it has shown the default wordpress favicon
> >>> for years and responds with an x-hacker header promoting a jobs aggregator.
> >>> That raises an obvious question about how reliable the server is in terms
> >>> of injections and log protection.
> >>> and logs protection.
> >>> Also the blog pings gravatar, not good.
>  If you don't want to display it, don't click on it, and the iframe won't
> >>> This is not how things are protected, and I know that you know about it.
> >>> ermouth
> >>> чт, 27 авг. 2020 г. в 00:55, Joan Touzet :
>  At the moment, I have no plan to update Fauxton for 3.1.1.
>  
>  The blog is controlled by the CouchDB PMC. No one outside of the PMC or
>  who they authorize has access to it.
>  
>  If you don't want to display it, don't click on it, and the iframe won't
>  load.
>  
>  -Joan
>  
>  On 2020-08-26 11:57 a.m., ermouth wrote:
> > Is that very unsafe PR
> > https://github.com/apache/couchdb-fauxton/pull/1284 going
> > to be included into 3.1.1?
> > 
> > If it will, who exactly controls the wordpress site with those “news”?
> > 
> > ermouth
> > 
> > 
> > вт, 25 авг. 2020 г. в 23:45, Joan Touzet :
> > 
> >> Hello there,
> >> 
> >> I have time to get together a 3.1.1 release now. If you have any
> >> pressing things to get into 3.x, or anything that's on master that
> >> should be backported, please open your PRs now.
> >> 
> >> -Joan "Labor Day! Schools are out and pools are open!" Touzet
> >> 
> > 
>  
> > 
> 
>


Re: [DISCUSS] Reduce on FDB take 3

2020-07-26 Thread Robert Newson
or the other 
> > and reach a point where we eliminate one, or the two could coexist 
> > indefinitely.
> > 
> >> On 24 Jul 2020, at 20:00, Paul Davis  wrote:
> >> 
> >> FWIW, a first pass at views entirely on ebtree turned out to be fairly
> >> straightforward. Almost surprisingly simple in some cases.
> >> 
> >> https://github.com/apache/couchdb/compare/prototype/fdb-layer...prototype/fdb-layer-ebtree-views
> >> 
> >> Its currently passing all tests in `couch_views_map_test.erl` but is
> >> failing on other suites. I only did a quick skim on the failures but
> >> they all look superficial around some APIs I changed.
> >> 
> >> I haven't added the APIs to query reduce functions via HTTP but the
> >> reduce functions are being executed to calculate row counts and KV
> >> sizes. Adding the builtin reduce functions and extending those to user
> >> defined reduce functions should be straightforward.
> >> 
> >> On Fri, Jul 24, 2020 at 9:39 AM Robert Newson  wrote:
> >>> 
> >>> 
> >>> I’m happy to restrict my PR comments to the actual diff, yes. So I’m not 
> >>> +1 yet.
> >>> 
> >>> I fixed the spurious conflicts at 
> >>> https://github.com/apache/couchdb/pull/3033.
> >>> 
> >>> --
> >>> Robert Samuel Newson
> >>> rnew...@apache.org
> >>> 
> >>> On Fri, 24 Jul 2020, at 14:59, Garren Smith wrote:
> >>>> Ok so just to confirm, we keep my PR as-is with ebtree only for reduce. 
> >>>> We
> >>>> can get that ready to merge into fdb master. We can then use that to 
> >>>> battle
> >>>> test ebtree and then look at using it for the map side as well. At that
> >>>> point we would combine the reduce and map index into a ebtree index. Are
> >>>> you happy with that?
> >>>> 
> >>>> Cheers
> >>>> Garren
> >>>> 
> >>>> On Fri, Jul 24, 2020 at 3:48 PM Robert Newson  wrote:
> >>>> 
> >>>>> Hi,
> >>>>> 
> >>>>> It’s not as unknown as you think but certainly we need empirical data to
> >>>>> guide us on the reduce side. I’m also fine with continuing with the
> >>>>> map-only code as it stands today until such time as we demonstrate 
> >>>>> ebtree
> >>>>> meets or exceeds our needs (and I freely accept the possibility that it
> >>>>> might not).
> >>>>> 
> >>>>> I think the principal enhancement to ebtree that would address most
> >>>>> concerns is if it could store the leaf entries vertically (as they are
> >>>>> currently). I have some thoughts which I’ll try to realise as working 
> >>>>> code.
> >>>>> 
> >>>>> I’ve confirmed that I do create spurious conflicts and will have a PR up
> >>>>> today to fix that.
> >>>>> 
> >>>>> --
> >>>>> Robert Samuel Newson
> >>>>> rnew...@apache.org
> >>>>> 
> >>>>> On Fri, 24 Jul 2020, at 12:43, Garren Smith wrote:
> >>>>>> Hi Bob,
> >>>>>> 
> >>>>>> Thanks for that explanation, that is really helpful and it is good we
> >>>>> have
> >>>>>> some options but it also does highlight a lot of unknowns. Whereas our
> >>>>>> current map indexes are really simple and we know its behaviour. There
> >>>>> are
> >>>>>> no real unknowns. Adding ebtree here could make map indexes a fair bit
> >>>>> more
> >>>>>> complicated since we don't know the effect of managing different node
> >>>>>> sizes, concurrent doc updates, and querying performance.
> >>>>>> 
> >>>>>> Could we do a compromise here, could we look at using ebtree only for
> >>>>>> reduces now? That means that we can have reduce indexes working quite
> >>>>> soon
> >>>>>> on CouchDB on FDB. At the same time, we can work on ebtree and run
> >>>>>> performance tests on ebtree for map indexes. Some interesting tests we
> >>>>> can
> >>>>>> do is see if a user emits KV's near the limits (8KB for keys and 50KB 
> >>>>>> for
> >>>>>> v

Re: [DISCUSS] Reduce on FDB take 3

2020-07-24 Thread Robert Newson


I’m happy to restrict my PR comments to the actual diff, yes. So I’m not +1 
yet. 

I fixed the spurious conflicts at https://github.com/apache/couchdb/pull/3033. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 24 Jul 2020, at 14:59, Garren Smith wrote:
> Ok so just to confirm, we keep my PR as-is with ebtree only for reduce. We
> can get that ready to merge into fdb master. We can then use that to battle
> test ebtree and then look at using it for the map side as well. At that
> point we would combine the reduce and map index into a ebtree index. Are
> you happy with that?
> 
> Cheers
> Garren
> 
> On Fri, Jul 24, 2020 at 3:48 PM Robert Newson  wrote:
> 
> > Hi,
> >
> > It’s not as unknown as you think but certainly we need empirical data to
> > guide us on the reduce side. I’m also fine with continuing with the
> > map-only code as it stands today until such time as we demonstrate ebtree
> > meets or exceeds our needs (and I freely accept the possibility that it
> > might not).
> >
> > I think the principal enhancement to ebtree that would address most
> > concerns is if it could store the leaf entries vertically (as they are
> > currently). I have some thoughts which I’ll try to realise as working code.
> >
> > I’ve confirmed that I do create spurious conflicts and will have a PR up
> > today to fix that.
> >
> > --
> >   Robert Samuel Newson
> >   rnew...@apache.org
> >
> > On Fri, 24 Jul 2020, at 12:43, Garren Smith wrote:
> > > Hi Bob,
> > >
> > > Thanks for that explanation, that is really helpful and it is good we
> > have
> > > some options but it also does highlight a lot of unknowns. Whereas our
> > > current map indexes are really simple and we know its behaviour. There
> > are
> > > no real unknowns. Adding ebtree here could make map indexes a fair bit
> > more
> > > complicated since we don't know the effect of managing different node
> > > sizes, concurrent doc updates, and querying performance.
> > >
> > > Could we do a compromise here, could we look at using ebtree only for
> > > reduces now? That means that we can have reduce indexes working quite
> > soon
> > > on CouchDB on FDB. At the same time, we can work on ebtree and run
> > > performance tests on ebtree for map indexes. Some interesting tests we
> > can
> > > do is see if a user emits KV's near the limits (8KB for keys and 50KB for
> > > values) how does ebtree handle that? How it handles being updated in the
> > > doc update transaction. And general query performance. What do you think?
> > >
> > > Cheers
> > > Garren
> > >
> > >
> > > On Fri, Jul 24, 2020 at 10:06 AM Robert Samuel Newson <
> > rnew...@apache.org>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > A short preface at this stage is needed I think:
> > > >
> > > > My goal with ebtree was to implement a complete and correct b+tree that
> > > > also calculated and maintained inner reductions. I consciously chose
> > not to
> > > > go further than that before presenting it to the group for wider debate
> > > > (indeed, the very debate we're having in this thread).
> > > >
> > > > --
> > > >
> > > > Ebtree is not single writer, at least not inherently. Two updates to
> > the
> > > > same ebtree should both succeed as long as they don't modify the same
> > nodes.
> > > >
> > > > Modifying the same node is likely where there's a reduce function,
> > though
> > > > only if the reduction value actually changes, as that percolates up the
> > > > tree. Where the reduce value does not change and when neither
> > transaction
> > > > causes a split, rebalance, or merge that affects the other, they should
> > > > both commit without conflict. There is a rich history of optimizations
> > in
> > > > this space specifically around btrees. ebtree might cause spurious
> > > > conflicts today, I will investigate and propose fixes if so (e.g, I
> > think I
> > > > probably call erlfdb:set on nodes that have not changed).
> > > >
> > > > I certainly envisaged updating ebtree within the same txn as a doc
> > update,
> > > > which is why the first argument to all the public functions can be
> > either
> > > > an erlfdb Db or open Tx.
> > > >
> > > > Parallelising the initial build is more di

Re: [DISCUSS] Reduce on FDB take 3

2020-07-24 Thread Robert Newson
Hi,

It’s not as unknown as you think but certainly we need empirical data to guide 
us on the reduce side. I’m also fine with continuing with the map-only code as 
it stands today until such time as we demonstrate ebtree meets or exceeds our 
needs (and I freely accept the possibility that it might not). 

I think the principal enhancement to ebtree that would address most concerns is 
if it could store the leaf entries vertically (as they are currently). I have 
some thoughts which I’ll try to realise as working code. 

I’ve confirmed that I do create spurious conflicts and will have a PR up today 
to fix that. 
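
The fix presumably boils down to a guard of this shape (names are illustrative,
not the actual ebtree internals): only write a node back when its encoded form
actually changed, so an untouched node contributes no write-conflict range.

    set_node_if_changed(Tx, NodeKey, NewBin) ->
        case erlfdb:wait(erlfdb:get(Tx, NodeKey)) of
            NewBin -> ok;                        % unchanged: skip the write
            _Else -> erlfdb:set(Tx, NodeKey, NewBin)
        end.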

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 24 Jul 2020, at 12:43, Garren Smith wrote:
> Hi Bob,
> 
> Thanks for that explanation, that is really helpful and it is good we have
> some options but it also does highlight a lot of unknowns. Whereas our
> current map indexes are really simple and we know its behaviour. There are
> no real unknowns. Adding ebtree here could make map indexes a fair bit more
> complicated since we don't know the effect of managing different node
> sizes, concurrent doc updates, and querying performance.
> 
> Could we do a compromise here, could we look at using ebtree only for
> reduces now? That means that we can have reduce indexes working quite soon
> on CouchDB on FDB. At the same time, we can work on ebtree and run
> performance tests on ebtree for map indexes. Some interesting tests we can
> do is see if a user emits KV's near the limits (8KB for keys and 50KB for
> values) how does ebtree handle that? How it handles being updated in the
> doc update transaction. And general query performance. What do you think?
> 
> Cheers
> Garren
> 
> 
> On Fri, Jul 24, 2020 at 10:06 AM Robert Samuel Newson 
> wrote:
> 
> > Hi,
> >
> > A short preface at this stage is needed I think:
> >
> > My goal with ebtree was to implement a complete and correct b+tree that
> > also calculated and maintained inner reductions. I consciously chose not to
> > go further than that before presenting it to the group for wider debate
> > (indeed, the very debate we're having in this thread).
> >
> > --
> >
> > Ebtree is not single writer, at least not inherently. Two updates to the
> > same ebtree should both succeed as long as they don't modify the same nodes.
> >
> > Modifying the same node is likely where there's a reduce function, though
> > only if the reduction value actually changes, as that percolates up the
> > tree. Where the reduce value does not change and when neither transaction
> > causes a split, rebalance, or merge that affects the other, they should
> > both commit without conflict. There is a rich history of optimizations in
> > this space specifically around btrees. ebtree might cause spurious
> > conflicts today, I will investigate and propose fixes if so (e.g, I think I
> > probably call erlfdb:set on nodes that have not changed).
> >
> > I certainly envisaged updating ebtree within the same txn as a doc update,
> > which is why the first argument to all the public functions can be either
> > an erlfdb Db or open Tx.
> >
> > Parallelising the initial build is more difficult with ebtree than
> > parallelising the existing couch_views map-only code, though it's worth
> > noting that ebtree:insert benefits a great deal from batching (multiple
> > calls to :insert in the same transaction). For an offline build (i.e, an
> > index that the client cannot see until the entire build is complete) the
> > batch size can be maximised. That is still a serial process in that there
> > is only one transaction at a time updating the ebtree. I can't say offhand
> > how fast that is in practice, but it is clearly less powerful than a fully
> > parallelisable approach could be.
> >
> > Any parallel build would require a way to divide the database into
> > non-overlapping subsets of emitted keys. This is easy and natural if the
> > fdb key is the emitted key, which is the case for the couch_views map-only
> > code. For ebtree it might be enough to simply grab a large chunk of
> > documents, perform the map transform, and then issues multiple transactions
> > on subsets of those.
> >
> > Another common technique for btrees is bulk loading (more or less
> > literally constructing the btree nodes directly from the source, as long as
> > you can sort it), which might be an option as well.
> >
> > Parallelising a build _with_ a reduce function seems hard however we do
> > it. The non-ebtree approach is parallelisable by virtue of paring down the
> > reduce functionality itself (only whole key groups, only those functions
> > that fdb has atomic operations for).
> >
> > I will first of all verify the multi-writer nature of ebtree as it stands
> > today and make a PR which fixes any spurious conflicts, and then ponder
> > further  how true parallel builds might be possible.
> >
> >
> > > On 24 Jul 2020, at 07:30, Garren Smith  wrote:
> > >
> > > We haven't spoken much about updates with 

Re: [DISCUSS] Reduce on FDB take 3

2020-07-21 Thread Robert Newson
Thank you for those kind words. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 21 Jul 2020, at 13:45, Jan Lehnardt wrote:
> Heya Garren and Bob,
> 
> this looks really nice. I remember when this was a twinkle in our
> planning eyes. Seeing the full thing realised is very very cool.
> 
> I’m additionally impressed by the rather pretty and clean code.
> This doesn’t have to be hard :)
> 
> Looking forward to see this in action.
> 
> Best
> Jan
> —
> 
> > On 21. Jul 2020, at 14:01, Garren Smith  wrote:
> > 
> > Hi All
> > 
> > We have a new reduce design for FoundationDB and we think this one will
> > work.
> > Recently I proposed a simpler reduce design [1] and at the same time, Bob
> > (rnewson) looked at implementing a B+tree [2], called ebtree, on top of
> > FoundationDB. The b+tree implementation has turned out really nicely, the
> > code is quite readable and works really well. I would like to propose that
> > instead of using the simpler reduce design I mentioned in the previous
> > email, we rather go with a reduce implementation on top of ebtree. The big
> > advantage of ebtree is that it allows us to keep the behaviour of CouchDB
> > 3.x.
> > 
> > We have run some basic performance tests on the Cloudant performance
> > clusters and so far the performance is looking quite good and performs very
> > similar to my simpler reduce work.
> > 
> > There is an unknown around the ebtree Order value. The Order is the number
> > of key/values stored for a node. We need to determine the optimal order
> > value for ebtree so that it doesn't exceed FoundationDB's key/value limits
> > and still performs well. This is something we will be looking at as we
> > finish up the reduce work. The work in progress for the reduce PR is
> > https://github.com/apache/couchdb/pull/3018.
> > 
> > A great thanks to Bob for implementing the B+tree. I would love to hear
> > your thoughts or questions around this?
> > 
> > Cheers
> > Garren
> > 
> > [1]
> > https://lists.apache.org/thread.html/r1d77cf9bb9c86eddec57ca6ea2aad90f396ee5f0dfe43450f730b1cf%40%3Cdev.couchdb.apache.org%3E
> > 
> > [2] https://github.com/apache/couchdb/pull/3017
> > [3] https://github.com/apache/couchdb/pull/3018
> 
>
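
For a rough sense of the Order question Garren raises above: FoundationDB caps 
keys at about 10 KB and values at about 100 KB, so the serialised node size is 
what bounds the Order. A back-of-the-envelope sketch, illustrative only (the 
per-entry overhead and the example sizes are assumptions, not measurements 
from ebtree):

```
%% Conservative upper bound on ebtree's Order for a given workload. The FDB
%% value limit is ~100 KB; the per-entry overhead is an assumed allowance for
%% term encoding and child pointers, not a measured figure.
-module(ebtree_order_sketch).
-export([max_order/2]).

-define(FDB_VALUE_LIMIT, 100000).
-define(PER_ENTRY_OVERHEAD, 64).

%% AvgKeySize/AvgValSize are the expected byte sizes of an emitted key/value.
max_order(AvgKeySize, AvgValSize) ->
    PerEntry = AvgKeySize + AvgValSize + ?PER_ENTRY_OVERHEAD,
    max(2, ?FDB_VALUE_LIMIT div PerEntry).
```

With 200-byte keys and 500-byte values this allows an Order in the low 
hundreds, but the reduce values stored alongside each node eat into the same 
budget, hence the plan above to determine the optimal value empirically.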


Re: [DISCUSS] couchdb 4.0 transactional semantics

2020-07-15 Thread Robert Newson


Thanks Jan

I would prefer not to have the configuration switch; instead, remove what we 
don’t want. As you said there’ll be a 3 / 4 split for a while (and not just for 
this reason). 
-- 
  Robert Samuel Newson
  rnew...@apache.org

On Wed, 15 Jul 2020, at 14:46, Jan Lehnardt wrote:
> 
> > On 14. Jul 2020, at 18:00, Adam Kocoloski  wrote:
> > 
> > I think there’s tremendous value in being able to tell our users that each 
> > response served by CouchDB is constructed from a single isolated snapshot 
> > of the underlying database. I’d advocate for this being the default 
> > behavior of 4.0.
> 
> I too am in favour of this. I apologise for not speaking up in the 
> earlier thread, which I followed closely, but never found the time to 
> respond to.
> 
> From rnewson’s options, I’d suggest 3. the mandatory limit parameter. 
> While this does indeed mean a BC break, it teaches the right semantics 
> for folks on 4.0 and onwards. For client libraries like our own nano, 
> we can easily wrap this behaviour, so the resulting API is mostly 
> compatible still, at least when used in streaming mode, less so when 
> buffering a big _all_docs response).
> 
> > If folks wanted to add an opt-in compatibility mode to support longer 
> > responses, I suppose that could be OK. I think we should discourage that 
> > access pattern in general, though, as it’s somewhat less friendly to 
> > various other parts of the stack than a pattern of shorter responses and a 
> > smart pagination API like the one we’re introducing. To wit, I don’t think 
> > we’d want to support that compatibility mode in IBM Cloud.
> 
> Like Adam, I do not mind a compat mode, either through a different API 
> endpoint, or even a config option. I think we will be fine in getting 
> people on this path when we document this in our update guide for the 
> 4.0 release. I don’t think this will lead to a Python 2/3 situation 
> overall, because the 4.0+ features are compelling enough for relatively 
> small changes required, and CouchDB 3.x in its then latest form will 
> continue to be a fine database for years to come, for folks who can’t 
> upgrade as easily. So yes, I anticipate we’ll live in a two-versions 
> world a little longer than we did during 1.x to 2.x, but the reasons to 
> leave 1.x behind were a little more severe than the improvements of 4.x 
> over 3.x (while still significant, of course).
> 
> Best
> Jan
> —
> 
> > 
> > Adam
> > 
> >> On Jul 14, 2020, at 10:18 AM, Robert Samuel Newson  
> >> wrote:
> >> 
> >> Thanks Nick, very helpful, and it vindicates me opening this thread.
> >> 
> >> I don't accept Mike Rhodes' argument at all but I should explain why I 
> >> don't;
> >> 
> >> In CouchDB 1.x, a response was generated from a single .couch file. There 
> >> was always a window between the start of the request as the client sees it 
> >> and CouchDB acquiring a snapshot of the relevant database. I don't think 
> >> that gap is meaningful, nor does it refute our statements of the time that 
> >> CouchDB responses are from a snapshot (specifically, that no change to the 
> >> database made _during_ the response will be visible in _this_ response). 
> >> In CouchDB 2.x (and continuing in 3.x), a CouchDB database typically 
> >> consists of multiple shards, each of which, once opened, remain 
> >> snapshotted for the duration of that response. The difference between 1.x 
> >> and 2.x/3.x is that the window is potentially larger (though the requests 
> >> are issued in parallel). The response, however much it returned, was 
> >> impervious to changes in other requests once it had begun.
> >> 
> >> I don't think _all_docs, _view or a non-continuous _changes response 
> >> should allow changes made in other requests to appear midway through them 
> >> and I want to hear the opinions of folks that have watched over CouchDB 
> >> from its earliest days on this specific point (If I must name names, at 
> >> least Adam K, Paul D, Jan L, Joan T). If there's a majority for deviating 
> >> from this semantic, I will go with the majority.
> >> 
> >> If we were to agree to preserve the 'single snapshot' behaviour, what 
> >> would the behaviour be if we can't honour it because of the FoundationDB 
> >> transaction limits?
> >> 
> >> I see a few options.
> >> 
> >> 1) We could end the response uncleanly, mid-response. CouchDB does this 
> >> when it has no alternative, and it is ugly, but it is usually handled well 
> >> by clients. They are at least not usually convinced they got a complete 
> >> response if they are using a competent HTTP client.
> >> 
> >> 2) We could disavow the streaming API, as you've suggested, attempt to 
> >> gather the full response. If we do this within the FDB bounds, return a 
> >> 200 code and the response body. A 400 and an error body if we don't.
> >> 
> >> 3) We could make the "limit" parameter mandatory and with an upper bound, 
> >> in combination with 1 or 2, such that a valid request is very likely to be 
> >> 

Re: [DISCUSS] length restrictions in 4.0

2020-05-12 Thread Robert Newson
I still don’t understand how the internal shard database name format has any 
bearing on our public interface, present or future. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 12 May 2020, at 19:52, Nick Vatamaniuc wrote:
> I still like it. It's only 18 bytes difference but it introduces one
> more compatibility issue. At least for 4.x, it would be nice to have
> less of those and we can always increase it later. But if other
> participants think it's too nitpick-y and odd I am happy to go with
> 256.
> 
> -Nick
> 
> On Tue, May 12, 2020 at 9:24 AM Robert Samuel Newson  
> wrote:
> >
> > Sorry to let this thread drop.
> >
> > Nick, are you still preferring 238?
> >
> > B.
> >
> > > On 4 May 2020, at 21:06, Robert Samuel Newson  wrote:
> > >
> > > Ah, ok, understood. I don't think that's a compelling reason to fix our 
> > > maximum database name length at 238.
> > >
> > > CouchDB 4.0 will be the first version of CouchDB where we're not coupled 
> > > to the filesystem for this list. 256 is very common for a filesystem 
> > > filename length limit (though not universal) so I don't think our history 
> > > should dictate an odd (fine, _even_) choice of 238.
> > >
> > > B.
> > >
> > >
> > >> On 4 May 2020, at 20:41, Nick Vatamaniuc  wrote:
> > >>
> > >> It will prevent replicating from db created in 4.0 which has a name
> > >> longer than 238 (say 250) back to 2.x/3.x if the user intends to keep
> > >> the same database name on both systems, that's what I meant.
> > >>
> > >> On Mon, May 4, 2020 at 3:15 PM Robert Samuel Newson  
> > >> wrote:
> > >>>
> > >>> The 'timestamp in filename' is only on the internal shards, which would 
> > >>> not be part of a replication between 2.x/3.x and 4.x.
> > >>>
> > >>> In any case, Nick is suggesting lowering from 256 chars to 238 chars 
> > >>> to leave room for these things that won't be there. I confess I don't 
> > >>> understand the reasoning.
> > >>>
> > >>> B.
> > >>>
> >  On 4 May 2020, at 20:04, Joan Touzet  wrote:
> > 
> >  I suspect he means when replicating back to a 3.x or 2.x cluster.
> > 
> >  On 2020-05-04 3:03 p.m., Robert Samuel Newson wrote:
> > > But we don't need to add a file extension or a timestamp to database 
> > > names.
> > > B.
> > >> On 4 May 2020, at 18:42, Nick Vatamaniuc  wrote:
> > >>
> > >> Hello everyone,
> > >>
> > >> Good idea, +1 with one minor tweak: database name length in versions
> > >> <4.0 was restricted by the maximum file name on whatever file system
> > >> the server was running on. In practice that was 255, then there is an
> > >> extension and a timestamp in the filename which made the db name 
> > >> limit
> > >> be 238 so I suggest to use that instead.
> > >>
> > >> -Nick
> > >>
> > >> On Mon, May 4, 2020 at 11:51 AM Robert Samuel Newson 
> > >>  wrote:
> > >>>
> > >>> Hi,
> > >>>
> > >>> I think I speak for many in accepting the risk that we're excluding 
> > >>> doc ids formed from 4096-bit RSA signatures.
> > >>>
> > >>> I don't think I made it clear but I think these should be fixed 
> > >>> limits (i.e, not configurable) in order to ensure inter-replication 
> > >>> between couchdb installations wherever they are.
> > >>>
> > >>> B.
> > >>>
> >  On 4 May 2020, at 10:52, Ilya Khlopotov  wrote:
> > 
> >  Hello,
> > 
> >  Thank you Robert for starting this important discussion. I think 
> >  that the values you propose make sense.
> >  I can see a case when user would use hashes as document ids. All 
> >  existent hash functions I am aware of should return data which fit 
> >  into 512 characters. There is only one case which doesn't fit into 
> >  512 limit. If user would decide to use RSA signatures as document 
> >  ids and they use 4096 bytes sized keys the signature size would be 
> >  684 bytes.
> > 
> >  However in this case users can easily replace signatures with 
> >  hashes of signatures. So I wouldn't worry about it to much. 512 
> >  sounds plenty to me.
> > 
> >  +1 to set hard limits on db name size and doc id size with 
> >  proposed values.
> > 
> >  Best regards,
> >  iilyak
> > 
> >  On 2020/05/01 18:36:45, Robert Samuel Newson  
> >  wrote:
> > > Hello,
> > >
> > > There are other threads related to doc size (etc) limits for 
> > > CouchDB 4.0, motivated by restrictions in FoundationDB, but we 
> > > haven't discussed database name length and doc id length limits. 
> > > These are encoded into FoundationDB keys and so we would be wise 
> > > to forcibly limit their length from the start.
> > >
> > > I propose 256 character limit for database name and 512 character 
> > > 

Re: [DISCUSS] Streaming API in CouchDB 4.0

2020-04-23 Thread Robert Newson
cursor has established meaning in other databases and ours would not be very 
close to them. I don’t think it’s a good idea. 

B. 
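
Whatever the endpoint is called, the opaque part of the proposal quoted below 
is cheap to build. A minimal sketch of one way to round-trip a bookmark, 
illustrative only (the real field set and encoding, base64 versus protobuf, 
are exactly what is still being discussed):

```
%% Sketch: an opaque bookmark as base64(term_to_binary(Fields)). Clients must
%% treat the value as a black box and hand it back unchanged.
-module(bookmark_sketch).
-export([encode/1, decode/1]).

%% Fields is a map, e.g. #{last_key => <<"doc0017">>, descending => false}.
encode(Fields) when is_map(Fields) ->
    base64:encode(term_to_binary(Fields)).

decode(Bookmark) when is_binary(Bookmark) ->
    %% 'safe' refuses input that would create new atoms or funs.
    binary_to_term(base64:decode(Bookmark), [safe]).
```

The server can simply reject anything it cannot decode, which would also cover 
dropping bookmarks deemed too old if a timestamp ends up in the field list.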

> On 23 Apr 2020, at 11:50, Ilya Khlopotov  wrote:
> 
> 
>> 
>> The best I could come up with is replacing page with
>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
> Good idea, I like {db}/_all_docs/cursor (or {db}/_all_docs/_cursor).
> 
>> On 2020/04/23 08:54:36, Garren Smith  wrote:
>> I agree with Bob that page doesn't make sense as an endpoint. I'm also
>> rubbish with naming. The best I could come up with is replacing page with
>> cursor - {db}/_all_docs/cursor or possibly {db}/_cursor/_all_docs
>> All the fields in the bookmark make sense except timestamp. Why would it
>> matter if the timestamp is old? What happens if a node's time is an hour
>> behind another node?
>> 
>> 
>>> On Thu, Apr 23, 2020 at 4:55 AM Ilya Khlopotov  wrote:
>>> 
>>> - page is to provide some notion of progress for the user
>>> - timestamp - I was thinking that we should drop requests if the user
>>> tries to pass a bookmark created an hour ago.
>>> 
>>> On 2020/04/22 21:58:40, Robert Samuel Newson  wrote:
 "page" and "page number" are odd to me as these don't exist as concepts,
>>> I'd rather not invent them. I note there's no mention of page size, which
>>> makes "page number" very vague.
 
 What is "timestamp" in the bookmark and what effect does it have when
>>> the bookmark is passed back in?
 
 I guess, why does the bookmark include so much extraneous data? Items
>>> that are not needed to find the fdb key to begin the next response from.
 
 
> On 22 Apr 2020, at 21:18, Ilya Khlopotov  wrote:
> 
> Hello everyone,
> 
> Based on the discussions on the thread I would like to propose a
>>> number of first steps:
> 1) introduce new endpoints
> - {db}/_all_docs/page
> - {db}/_all_docs/queries/page
> - _all_dbs/page
> - _dbs_info/page
> - {db}/_design/{ddoc}/_view/{view}/page
> - {db}/_design/{ddoc}/_view/{view}/queries/page
> - {db}/_find/page
> 
> These new endpoints would act as follows:
> - don't use delayed responses
> - return object with following structure
> ```
> {
>"total": Total,
>"bookmark": base64 encoded opaque value,
>"completed": true | false,
>"update_seq": when available,
>"page": current page number,
>"items": [
>]
> }
> ```
> - the bookmark would include following data (base64 or protobuff???):
> - direction
> - page
> - descending
> - endkey
> - endkey_docid
> - inclusive_end
> - startkey
> - startkey_docid
> - last_key
> - update_seq
> - timestamp
> ```
> 
> 2) Implement per-endpoint configurable max limits
> ```
> _all_docs = 5000
> _all_docs/queries = 5000
> _all_dbs = 5000
> _dbs_info = 5000
> _view = 2500
> _view/queries = 2500
> _find = 2500
> ```
> 
> Later (after a few years) CouchDB would deprecate and remove old
>>> endpoints.
> 
> Best regards,
> iilyak
> 
> On 2020/02/19 22:39:45, Nick Vatamaniuc  wrote:
>> Hello everyone,
>> 
>> I'd like to discuss the shape and behavior of streaming APIs for
>>> CouchDB 4.x
>> 
>> By "streaming APIs" I mean APIs which stream data in row as it gets
>> read from the database. These are the endpoints I was thinking of:
>> 
>> _all_docs, _all_dbs, _dbs_info  and query results
>> 
>> I want to focus on what happens when FoundationDB transactions
>> time-out after 5 seconds. Currently, all those APIs except _changes[1]
>> feeds, will crash or freeze. The reason is because the
>> transaction_too_old error at the end of 5 seconds is retry-able by
>> default, so the request handlers run again and end up shoving the
>> whole request down the socket again, headers and all, which is
>> obviously broken and not what we want.
>> 
>> There are few alternatives discussed in couchdb-dev channel. I'll
>> present some behaviors but feel free to add more. Some ideas might
>> have been discounted on the IRC discussion already but I'll present
>> them anyway in case is sparks further conversation:
>> 
>> A) Do what _changes[1] feeds do. Start a new transaction and continue
>> streaming the data from the next key after last emitted in the
>> previous transaction. Document the API behavior change that it may
>> present a view of the data is never a point-in-time[4] snapshot of the
>> DB.
>> 
>> - Keeps the API shape the same as CouchDB <4.0. Client libraries
>> don't have to change to continue using these CouchDB 4.0 endpoints
>> - This is the easiest to implement since it would re-use the
>> implementation for _changes feed (an extra option passed to the fold
>> function).
>> - Breaks API behavior if users relied on 

Re: [DISCUSS] Mango indexes on FDB

2020-03-24 Thread Robert Newson
No, 425 is something specific

A 503 Service Unavailable seems the only suitable standard code. 

B. 

> On 24 Mar 2020, at 08:48, Glynn Bird  wrote:
> 
> If a user didn't specify the index they wanted to use, leaving the choice
> of index up to CouchDB, I would expect Couch would ignore the partially
> built index and fall back on _all_docs. So +1 on this.
> 
> But we need also consider the API response if a user *specifies* an index
> during a query (with use_index) when that index is not built yet, I think I
> would prefer an instant 4** response indicating that the requested
> resource isn't ready yet, rather than performing a very slow,
> _all_docs-powered search. Is "425 Too Early" a suitable response?
> 
> 
> 
> 
>> On Mon, 23 Mar 2020 at 23:30, Joan Touzet  wrote:
>> 
>> 
>> 
>>> On 2020-03-23 4:46 p.m., Mike Rhodes wrote:
>>> Garren,
>>> 
>>> Very much +1 on this suggestion, as it is, at least for me, what I'd
>> expect to happen if I were leaving the system to select an index -- as you
>> imply, the build process almost certainly takes longer than using the
>> _all_docs index. In addition, for the common case where there is a less
>> optimal but still useful index available, one might expect that index to be
>> used in preference to the "better" but unbuilt one.
>> 
>> I agree.
>> 
>>> But I do think this is important:
>>> 
 We can amend the warning message
 to let them know that they have an index that is building that could
 service the index when it's ready.
>>> 
>>> Otherwise it's a bit too easy to get confused when trying to understand
>> the reason why an index you were _sure_ should've been used in fact was not.
>> 
>> Question: Imagine a node that's been offline for a bit and is just
>> coming back on. (I'm not 100% sure how this works in FDB land.) If
>> there's a (stale) index on disk, and the index is being updated, and the
>> index on disk is kind of stale...what happens?
>> 
>> -Joan
>> 



Re: FDB: Map index key/value limits

2020-01-16 Thread Robert Newson
Option A matches our behaviour for other (persistent) errors during indexing 
and gets my vote. 

Surfacing view build errors (option C) is certainly better but is obviously 
more work. 
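
A minimal sketch of what option A could look like at indexing time, 
illustrative only (the limit values and names are placeholders, and the "index 
none of the doc's rows" choice follows Paul's suggestion further down rather 
than anything already in the RFC):

```
%% Option A sketch: if any emitted key/value for a document exceeds the
%% limits, index none of that document's rows and log the error.
-module(map_limits_sketch).
-export([rows_to_index/2]).

-define(MAX_KEY_SIZE, 8000).     %% placeholder limit, bytes
-define(MAX_VALUE_SIZE, 64000).  %% placeholder limit, bytes

%% KVs is the list of {Key, Value} pairs emitted for one document, already
%% encoded to binaries. Returns the rows to write, possibly none.
rows_to_index(DocId, KVs) ->
    case lists:any(fun too_large/1, KVs) of
        true ->
            error_logger:warning_msg(
                "not indexing doc ~s: emitted key or value exceeds limit~n",
                [DocId]),
            [];
        false ->
            KVs
    end.

too_large({Key, Value}) ->
    byte_size(Key) > ?MAX_KEY_SIZE orelse byte_size(Value) > ?MAX_VALUE_SIZE.
```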

> On 16 Jan 2020, at 16:09, Adam Kocoloski  wrote:
> 
> Right. I sort of assumed an additional endpoint, something like
> 
> GET /db/_design/ddoc/_errors
> 
> Adam
> 
>> On Jan 16, 2020, at 10:56 AM, Garren Smith  wrote:
>> 
>> Option A is similar to what we do currently when a doc fails to be mapped
>> and a user/admin would see the errors in the log.
>> 
>> Keeping an index is a nice idea,
>> But what do we do with it? How would we expose that to the user? I’m
>> guessing we would have to add a new api endpoint or add it to the _info
>> endpoint
>> 
>> 
>>> On Thu, Jan 16, 2020 at 5:35 PM Adam Kocoloski  wrote:
>>> 
>>> Option C - keep a separate index of document IDs that failed indexing.
>>> 
>>> I could be convinced of either Option C or Option A, and tentatively agree
>>> with Paul that the document indexing ought to be atomic for an entire view
>>> group.
>>> 
>>> Adam
>>> 
 On Jan 16, 2020, at 9:48 AM, Paul Davis 
>>> wrote:
 
 For A you also want to consider multiple emitted K/Vs on whether we
 index some or none. I'd assume none as that would match the existing
 equivalent of a doc throwing an exception during indexing.
 
> On Thu, Jan 16, 2020 at 8:45 AM Garren Smith  wrote:
> 
> Hi Everyone,
> 
> We want to impose limits on the size of keys and values for map indexes.
> See the RFC for full details -
> https://github.com/apache/couchdb-documentation/pull/410
> 
> The question I have is what is the best user experience if the user does
> exceed the key or value limit?
> 
> Option A - Do not index the key/value and log the error
> 
> Option B - Throw an error and don't build the index
> 
> Option C - Any other ideas?
> 
> Cheers
> Garren
>>> 
>>> 
> 



Re: CouchDB 3.0 Update - Dec 3rd

2019-12-04 Thread Robert Newson
Hi,

I’m fine with a release either side of Christmas but I agree with Joan’s point. 

I suggest a compromise of a code freeze. Only fixes for 3.0 to be merged until 
the new year then do the release dance. 

> On 4 Dec 2019, at 14:45, support-tiger  wrote:
> 
> fyi: Every year Ruby releases a new major version on Christmas - has become a 
> tradition with the user base - so no need to worry about holidays but 
> obviously must be ready to ship.
> 
> Do not forget about PR - I have seen little on the web about the upcoming 
> major version release.  What are the major new features ?  (not just a 
> changelog).  How about a review from someone trying out the beta or RC.
> 
> One more thing: node express crud example ?  (if you want to attract users to 
> a JSON database)
> 
> And one more thing:  are Fedora, Debian, Ubuntu pkgs ready ?
> 
> 
>> On 12/4/19 6:43 AM, Denitsa Burroughs wrote:
>> Hi Joan,
>> 
>> Point taken. Let's see what the rest of the PMC members thinks. Just to be
>> clear: I had spoken to Bob about helping with the release activities, so I
>> wasn't expecting this to land on you. :)
>> 
>> I think that the biggest challenge would be getting the release notes and
>> documentation ready. I would appreciate some feedback on areas that are
>> lacking (if any) so that I can track it. Happy to open a ticket where we
>> could capture a list if that makes sense.
>> 
>> Thanks,
>> 
>> Deni
>> 
>>> On Wed, Dec 4, 2019 at 1:08 AM Joan Touzet  wrote:
>>> 
>>> Deni,
>>> 
>>> Is it wise to rush out a 3.0 release prior to the holidays? I don't
>>> think so. Practically speaking we have 2 weeks before people start
>>> disappearing (including me, I'm gone as of Dec 18) and I don't think
>>> we'll get either the critical mass for testing, nor the attention from
>>> our release channels, if we rush to get it done before then.
>>> 
>>> Consider this an informal desire to push off the RC/release process
>>> until January 2020. If the rest of the PMC want to push ahead (knowing I
>>> won't be here to help, and are ready to do it themselves), go for it.
>>> 
>>> -Joan
>>> 
 On 2019-12-04 12:47 a.m., Jan Lehnardt wrote:
 
> On 4. Dec 2019, at 05:32, Denitsa Burroughs <
>>> denitsa.burrou...@gmail.com> wrote:
> Hi all,
> 
> We are really close to merging the last few open PRs for CouchDB 3.0.
>>> I'd
> like to propose that we aim to merge all remaining changes *by the end
>>> of
> the week* (Dec 6th) and work on a first RC next week. Please let me
>>> know if
> you don't think you could meet that goal. Also, last call for any
> additional change requests!
> Here's the current status:
> 
> *In progress:*
> - 2167 Remove vestiges of view-based `_changes` feed
>  *(Eric) **- PR
>>> reviewed,
> addressing comments*
> - 1875 Update SpiderMonkey version
>  *(Peng Hui)* *- PR
> reviewed, addressing comments*
> - 2171 Document new management subsystems (smoosh, ioq, ken)
> -ioq left. *(Adam) **-
>>> ioq
> only, doc ticket, not a blocker*
> - 1524 Per-document access control  #1524
>  *(Jan)* -
> *Need an ETA*
 Def not before Christmas, but as stated, happy to leave this for 3.1 or
>>> later. EXCEPT for one patch to accept the _access member in docs.
 I’m travelling internationally until the end of *next* week, so I can't
>>> promise that before Dec 13.
 It is a relatively minor patch, though, so we might be okay with
>>> sneaking it during the RC phase.
 Best
 Jan
 —
 
> - 2249 Cluster setup does not create IOQ stats database
>   *(Adam) **- ETA Dec 6
>>> *
> *Backlog:*
> *- *2191 Tighten up security model
>  *- **(TBD)** change db
> security to admin_only, small change*
> - Release Notes
> - Blog posts (see Jan's email)
> 
> Thanks!
> 
> Deni
> 
> -- 
> Support Dept
> Tiger Nassau, Inc.
> www.tigernassau.com
> 406-624-9310
> 
> 
> 



Re: Move ken / smoosh / ioq applications in-tree

2019-11-22 Thread Robert Newson
+1 

B. 

> On 22 Nov 2019, at 04:33, Adam Kocoloski  wrote:
> 
> Hi all,
> 
> Any complaints about moving these apps into the main CouchDB repo? None of 
> them have any utility outside of CouchDB. Cheers,
> 
> Adam



Re: Batch mode options for CouchDB 4.0

2019-10-29 Thread Robert Newson
I am fine with returning 202 even though we blocked to complete the request. 

B. 

> On 29 Oct 2019, at 10:24, Mike Rhodes  wrote:
> 
> There are a two things I'd like to break down here:
> 
> 1. The non-functional behaviour of the API is changing. What was hopefully a 
> short request could now block for much longer as the client must wait for a 
> write to happen. Among other things, this affects UI latency, as well as the 
> power consumption of low-power devices. Silently changing this behaviour is 
> very hard to debug client side. This is an example where the new behaviour 
> may not be better for some use-cases.
> 2. The request is documented as returning 202 only. We are proposing changing 
> that API contract.
> 
> IMO, the HTTP response code is a fundamental part of any HTTP API, and it's 
> reasonable for clients to listen on the 202 that is documented as the only 
> possible response code in this scenario. For example, the client might want 
> to be sure CouchDB is interpreting the argument they are sending correctly.
> 
> On the question of accepting any 2XX response being desirable, I would agree 
> that perhaps it is better to be liberal in what you accept, but we need to 
> therefore be strict in what we send. CouchDB isn't great at returning 400 
> when there are mutually exclusive parameters supplied in a request, for 
> example.
> 
> If the only reason for retaining this setting is to maintain backwards API 
> compatibility, and we are not worried about API purity, returning 202 seems 
> the appropriate approach to me; it may not be "correct" but it is seemingly 
> the way of achieving the stated goal of silently dropping the param in a 
> safe(ish) manner.
> 
> -- 
> Mike.
> 
>> On Wed, 23 Oct 2019, at 13:32, Jan Lehnardt wrote:
>> 
>> 
 On 23. Oct 2019, at 14:26, Arturo GARCIA-VARGAS  
 wrote:
>>> 
>>> I guess the way I see it (and where I may be wrong) is that batch=ok will 
>>> become a deprecated use of the API.  And if we are to support a deprecated 
>>> behaviour:
>>> 
>>> 1. Behave as before because you are nice, via an explicit config enable; or
>> 
>> The point is, we would be behaving “better than before”
>> 
>>> 2. Stop doing it because it is well..., deprecated.  Update your client.
>> 
>> …and we don’t want to break client software, when we don’t have to.
>> 
>> Best
>> Jan
>> —
>>> 
>>> -A.
>>> 
>>> Again my opinion :-)
>>> 
>>> On 23/10/2019 13:19, Jan Lehnardt wrote:
> On 23. Oct 2019, at 13:56, Arturo GARCIA-VARGAS  
> wrote:
> 
> Maybe my point is not coming across correctly.
> 
> By reading the docs, a consumer would match *explicitly* to a 202 
> response, to acknowledge success.
> 
> We better be consistent and either hard-break this behaviour, or behave 
> as before, but not silently switch the behaviour, even more if the 
> operation behind is a no-op.
 I think I do understand your point, however, the nature of this API allows 
 us to argue for the best of both worlds: batch=ok today says that the 
 client is fine with letting CouchDB decide when to fully commit data. 
 Depending on the circumstances, that decision could be “immediately”, or 
 it could be “some time later”. The proposal here now suggests that we 
 switch this to be always “immediately”, but regardless of batch=ok being 
 present or not, the client doesn’t really care about that. So I don’t 
 think there is a good reason for suggesting a hard break.
 Best
 Jan
 —
> 
> Well, my opinion.
> 
> On 23/10/2019 12:50, Jan Lehnardt wrote:
>>> On 23. Oct 2019, at 13:32, Arturo GARCIA-VARGAS  
>>> wrote:
>>> 
>>> Well, a consumer would be explicitly waiting the the accept response 
>>> code like responseCode === '202' as a sign of "success".  We have 
>>> silently broken the consumer.
>>> 
>>> Granted a consumer should cater for a '201' response, but the docs 
>>> explicitly say you do not get a 201 when using batch=ok.
>> A consumer that can’t deal with different HTTP response codes already 
>> isn’t doing HTTP correctly. They could already equally receive a 400, 
>> 401, 500 or any other variety or responses, so I think we’re fine here.
>>> 
>>> On 23/10/2019 12:29, Jan Lehnardt wrote:
> On 23. Oct 2019, at 13:25, Arturo GARCIA-VARGAS 
>  wrote:
> 
> My opinion
> 
> On 23/10/2019 12:15, Jan Lehnardt wrote:
>> 
>>> On 23. Oct 2019, at 12:40, Robert Samuel Newson 
>>>  wrote:
>>> 
>>> Hi,
>>> 
>>> Just confirming my position on this. We should treat a request with 
>>> batch=ok as if the setting was not there. That is, make the same 
>>> durable commit as normal. We should therefore send a 201 Created 
>>> response code. We should continue to validate the batch setting (it 
>>> can be absent or it can 

Re: [DISCUSS] [PROPOSAL] Accept donation of the IBM Cloudant Weather Report diagnostic tool?

2019-08-14 Thread Robert Newson
I’m for the proposal and am confident IBM will release Custodian under 
the ASLv2 if the community is in favour of the proposal. 

B. 

> On 14 Aug 2019, at 07:19, Jay Doane  wrote:
> 
> In the interest of making CouchDB 3.0 "the best CouchDB Classic possible",
> I'd like to discuss whether to accept a donation from Cloudant of the
> "Weather Report" diagnostic tool. This tool (and dependencies) are OTP
> applications, and it is typically run from an escript which connects to a
> running cluster, gathers numerous diagnostics, and emits various warning
> and errors when it finds something to complain about. It was originally
> ported from a fork of Riaknostic (the Automated diagnostic tools for Riak)
> [1] by Mike Wallace.
> 
> The checks it makes are represented by the following modules:
> 
> weatherreport_check_custodian.erl
> weatherreport_check_disk.erl
> weatherreport_check_internal_replication.erl
> weatherreport_check_ioq.erl
> weatherreport_check_mem3_sync.erl
> weatherreport_check_membership.erl
> weatherreport_check_memory_use.erl
> weatherreport_check_message_queues.erl
> weatherreport_check_node_stats.erl
> weatherreport_check_nodes_connected.erl
> weatherreport_check_process_calls.erl
> weatherreport_check_process_memory.erl
> weatherreport_check_safe_to_rebuild.erl
> weatherreport_check_search.erl
> weatherreport_check_tcp_queues.erl
> 
> While some of these checks are self-contained, check_node_stats,
> check_process_calls, check_process_memory, and check_message_queues all use
> recon [2] under the hood. Similarly, check_custodian
> and check_safe_to_rebuild use another Cloudant OTP application called
> Custodian, which periodically scans the "dbs" database to track the
> location of every shard of every database and can integrate with sensu [3]
> to ensure that operators are aware of any shard that is under-replicated.
> 
> I have created a POC branch [4] that adds Weather Report, Custodian, and
> Recon to CouchDB, and when I ran it in my dev environment (without search
> running), got the following diagnostic output:
> 
> $ ./weatherreport --etc ~/proj/couchdb/dev/lib/node1/etc/ -a
> ['node1@127.0.0.1'] [error] Local search node at 'clouseau@127.0.0.1' not
> responding: pang
> ['node2@127.0.0.1'] [error] Local search node at 'clouseau@127.0.0.1' not
> responding: pang
> ['node3@127.0.0.1'] [error] Local search node at 'clouseau@127.0.0.1' not
> responding: pang
> ['node1@127.0.0.1'] [notice] Data directory
> /Users/jay/proj/couchdb/dev/lib/node1/data is not mounted with 'noatime'.
> Please remount its disk with the 'noatime' flag to improve performance.
> ['node2@127.0.0.1'] [notice] Data directory
> /Users/jay/proj/couchdb/dev/lib/node2/data is not mounted with 'noatime'.
> Please remount its disk with the 'noatime' flag to improve performance.
> ['node3@127.0.0.1'] [notice] Data directory
> /Users/jay/proj/couchdb/dev/lib/node3/data is not mounted with 'noatime'.
> Please remount its disk with the 'noatime' flag to improve performance.
> returned 1
> 
> There is still a little cleanup to be done before these tools would be
> ready to donate, but it seems that overall they already integrate tolerably
> well with CouchDB.
> 
> As far as licenses go, Riaknostic is Apache 2.0. Recon is not [5], but it
> seems like it should be ok to include in CouchDB based on my possibly naive
> reading. Currently Custodian has no license (just Copyright 2013 Cloudant),
> but I assume it would get an Apache license, just like all other donated
> code.
> 
> Would this be a welcome addition to CouchDB? Please let me know what you
> think.
> 
> Thanks,
> Jay
> 
> [1] https://github.com/basho/riaknostic
> [2] http://ferd.github.io/recon/
> [3] https://sensu.io
> [4]
> https://github.com/apache/couchdb/compare/master...cloudant:weatherreport?expand=1
> [5] https://github.com/ferd/recon/blob/master/LICENSE



Re: [VOTE] Adopt FoundationDB

2019-07-30 Thread Robert Newson
+1

B. 

> On 30 Jul 2019, at 09:51, Garren Smith  wrote:
> 
> +1
> 
>> On Tue, Jul 30, 2019 at 10:27 AM Jan Lehnardt  wrote:
>> 
>> Dear CouchDB developers,
>> 
>> This vote decides whether the CouchDB project accepts the proposal[1]
>> to switch our underlying storage and distributed systems technology out
>> for FoundationDB[2].
>> 
>> At the outset, we said that we wanted to cover these topic areas before
>> making a vote:
>> 
>> - Bylaw changes
>>- RFC process: done, passed
>>- Add qualified vote option: done, changes proposed were not
>>  ratified
>> 
>> - Roadmap: proposal done, detailed discussions TBD, includes
>>  deprecations
>> 
>> - Onboarding: ASF onboarding links shared, CouchDB specific onboarding
>>  TBD.
>> 
>> - (Re-)Branding: tentatively: 3.0 is the last release before FDB
>>  CouchDB and 4.0 is the FDB CouchDB. If we need nicknames, we can
>>  decide on those later.
>> 
>> - FoundationDB Governance: FoundationDB is currently loosely organised
>>  between Apple and a few key stakeholder companies invested in the
>>  technology. Apple contributions are trending downwards relatively,
>>  approaching 50%, which means in the future, more non-Apple than Apple
>>  contributions are likely.
>> 
>>  In addition, the CouchDB PMC has requested addition to the current
>>  organisational FDB weekly meeting, which is where any more formal
>>  governance decisions are going to be made and the CouchDB PMC can be
>>  a part of the surrounding discussions.
>> 
>> - FoundationDB Operations knowledge: IBM intends to share this
>>  knowledge as they acquire it in conjunction with Apache CouchDB in
>>  terms of general ops knowledge, best practices and tooling.
>> 
>> - Proj. Mgmt.: RFC process + outline list of TBD RFCs allow for enough
>>  visibility and collaboration opportunities, everyone on dev@ list is
>>  encouraged to participate.
>> 
>> - Tech deep dives: DISCUSS threads and RFCs are covering this, current
>>  list of TBD DISCUSS/RFCs, for the proposal. Most of which were
>>  already discussed on dev@ or RFC’d in our documentation repo:
>> 
>>* JSON doc storage and storage of edit conflicts
>>* revision management
>>* _changes feed
>>* _db_updates
>>* _all_docs
>>* database creation and deletion
>>* attachments
>>* mango indexes (including collation)
>>* map-only views / search / geo
>>* reduces
>>* aggregate metrics (data_size, etc.)
>>* release engineering
>>* local/desktop/dev install security
>> 
>> * * *
>> 
>> As shown above, all topics we wanted to have clarity on have been
>> advanced to a point where we are now ready to make a decision:
>> 
>>  Should Apache CouchDB adopt FoundationDB?
>> 
>> Since this is a big decision, I suggest we make this a Lazy 2/3
>> Majority Vote with PMC Binding Votes, and a 7 day duration (as per our
>> bylaws[3]).
>> 
>> You can cast your votes now.
>> 
>> Best
>> Jan
>> —
>> [1]:
>> https://lists.apache.org/thread.html/04e7889354c077a6beb91fd1292b6d38b7a3f2c6a5dc7d20f5b87c44@%3Cdev.couchdb.apache.org%3E
>> [2]: https://www.foundationdb.org
>> [3]: https://couchdb.apache.org/bylaws.html
>> 
>> 
>> 



Re: CouchDb Rewrite/Fork

2019-07-10 Thread Robert Newson
That’s valuable feedback thank you. 

Best of luck with your new project and a gentle reminder that you may not call 
it CouchDB. 

B. 

> On 10 Jul 2019, at 00:07, Reddy B.  wrote:
> 
> Hi all,
> 
> I've checked the recent discussions and apparently July is the "vision month" 
> lol. Hopefully this email will not saturate the patience of the core team.
> 
> We have been thinking about forking/rewriting CouchDb internally for quite 
> some time now, and this idea has reached a degree of maturity such that I'm 
> pretty confident it will materialize at this point. We hesitated between 
> doing our thing internally to then make our big open-sourcing announcement 
> 5-10 years from now when the product is battle tested, and announcing our 
> intentions here today.
> 
> However, I realized that good things may happen by providing this feedback, 
> and that providing this type of feedback also is a way of giving back to the 
> community.
> 
> The reason for this project is that we have lost confidence in the way the 
> vision of CouchDb aligns with our goals. As far as we are concerned, there 
> are 3 things we loved with CouchDb:
> 
> #Map/Reduce
> 
> We think that the benefits of Map/Reduce are very underrated. Map/reduce 
> forces developers to approach problems differently and results in much more 
> efficient and well-thought-out application architectures and implementations. 
> This is in addition to the performance benefits since indexes are built in 
> advance in a very predictable manner (with a few well-documented caveats). 
> For this reason, our developers are forbidden from using Mango, and we 
> require them to wrap their head around problems until they are able to solve 
> them in map/reduce mode.
> 
> However, we can see that the focus of the CouchDb project is increasingly on 
> Mango, and we have little confidence in the commitment of the project to 
> first-class citizen Map/Reduce support (while this was for us a defining 
> aspect of the identity of CouchDb).
> 
> #Complexity of the codebase
> 
> An open-source software that is too complex to be tweaked and hacked is for 
> all practical purposes closed-source software. You guys are VERY smart. And 
> by nature a database software system is a non-trivial piece of technology.
> 
> Initially we felt confident that the codebase was small enough and clean 
> enough that should we really need to get our hands dirty in an emergency 
> situation, we would be able to do so. Then Mango made the situation a bit 
> blurrier, but we could easily ignore that, especially since we do not use it. 
> However with FoundationDB... this becomes a whole different story.
> 
> The domain model of a database is non-trivial by nature, and now FoundationDb 
> will introduce an additional level of abstraction and indirection, and a very 
> serious one. I've been reading the design discussions since the FoundationDb 
> announcement and there are a lot of impedance mismatches requiring the 
> domain model of CouchDb to be broken up into fictitious entities intended to 
> accommodate FoundationDb abstractions and their limitations (I'll come back to 
> this point in a moment).
> 
> Indirection is also introduced at the business logic level, with additional 
> steps needing to be followed to emulate the desired behavior. All of this is 
> complexity and obfuscation, and to be realistic, if we already struggled with 
> the straight-to-the-point implementation, there is no way we'll be able to 
> navigate (let alone hack), the FoundationDB-based implementation.
> 
> #(Apparent) Non-Alignment of FoundationDb with the reasons that made us love 
> CouchDb
> 
> FoundationDb introduces limitations regarding transactions, document sizes 
> and another number of critical items. One of the main reasons we use CouchDb 
> is because of the way it allows us to develop applications rapidly and 
> flexibly address all the state storage needs of application layers. CouchDb 
> has you covered if you just want to dump large media file streamed with HTTP 
> range requests while you iterate fast and your userbase is small, and 
> replication allows you to seemless scale by distributing load on clusters in 
> advanced ways without needing to redesign your applications. The user nkosi23 
> nicely describes some of the new possibilities enabled by CouchDb:
> 
> https://github.com/apache/couchdb/pull/1253#issuecomment-507043600
> 
> However, the limitations introduced by FoundationDb and the spirit of their 
> project favoring abstraction purity through aggressive constraints, over 
> operational flexibility is the opposite of the reasons we loved CouchDb and 
> believed in it. It is to us pretty clear that the writing is on the wall. We 
> aren't confident in FoundationDb to cover our bases, since covering our bases 
> is explicitly not the goal of their project and their spirit is different 
> from what has made CouchDb unique (ease of use, simple yet powerful and 
> flexible abstractions etc...).
> 

Re: CouchDB and future

2019-07-08 Thread Robert Newson
Hi,

CouchDB 1.x is no longer supported, even for security updates. 

CouchDB 4.0, the one with foundationdb, will reinstate a few couchdb 1.x 
semantics, particularly in the _changes response. 

B. 

> On 8 Jul 2019, at 16:24, Chintan Mishra  wrote:
> 
> Interesting. I started using CouchDB since 2.0+. So, I am not aware of the 
> benefits of the older versions. I will look into those releases. However, it 
> appears that they won't be maintained in the future.
> 
> I couldn't agree more about PouchDB for Web.
> 
> On 08/07/19 8:33 PM, ermouth wrote:
> 
> CouchDB as it is now will be a poor fit for embedded systems/IoTs.
> 
>> This is too bold and broad, sorry. Indeed, 2.x is poor fit, because it
>> demands regular if not daily maintenance and has substantial amount of
>> issues. With no doubts FDB-based release will have even more problems, not
>> because of FDB or IoT by itself, but because any re-architectured solution
>> is full of issues and not yet covered corner cases.
>> 
>> However, 1.x is ok for some IoT scenaria, esp if you use Erlang for
>> CPU-intensive query server functions. Latest 1.x releases have very good
>> balance in terms of reliability/speed, and require no additional SW (except
>> probably nginx) – which is especially valuable.
>> 
>> Having 1.x CouchDB installed on devices, which are physically remote from
>> service is reasonable choice: Couch 1.x is famous for it’s ability to work
>> without requiring administrative intervention for years. Couch ability to
>> receive QS functions updates using regular replication is invaluable for
>> long-running distributed IoT projects.
>> 
>> As for Pouch – it’s a wonderful solution for browsers, however it can be
>> easily knocked out when acts as a server.
>> 
>> Best regards,
>> ermouth
>> 



Re: [DISCUSS] Improve load shedding by enforcing timeouts throughout stack

2019-04-22 Thread Robert Newson
My memory is fuzzy, but those items sound a lot like what happens with rex, 
that motivated us (i.e, Adam) to build rexi, which deliberately does less than 
the stock approach.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 22 Apr 2019, at 18:33, Nick Vatamaniuc wrote:
> Hi everyone,
> 
> We partially implemented the first part (cleaning up rexi workers) for all
> the fabric streaming requests, which covers all_docs, changes, view map, and
> view reduce:
> https://github.com/apache/couchdb/commit/632f303a47bd89a97c831fd0532cb7541b80355d
> 
> The pattern there is the following:
> 
>  - With every request spawn a monitoring process that is in charge of
> keeping track of all the workers as they are spawned.
>  - If regular cleanup takes place, then this monitoring process is killed,
> to avoid sending double the number of kill messages to workers.
>  - If the coordinating process doesn't run cleanup and just dies, the
> monitoring process will perform cleanup on its behalf.
> 
> Cheers,
> -Nick
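
A stripped-down sketch of the monitor-based cleanup described above, 
illustrative only (the real version is in the linked commit and does 
considerably more):

```
%% The coordinator registers every worker it spawns with a cleanup monitor;
%% if the coordinator dies without running cleanup itself, the monitor kills
%% the workers on its behalf.
-module(cleanup_monitor_sketch).
-export([spawn_cleaner/1, add_worker/2, stop_cleaner/1]).

%% Start a cleaner that watches the coordinating process.
spawn_cleaner(Coordinator) ->
    spawn(fun() ->
        MRef = erlang:monitor(process, Coordinator),
        cleaner_loop(MRef, [])
    end).

%% Tell the cleaner about a newly spawned worker.
add_worker(Cleaner, WorkerPid) ->
    Cleaner ! {add_worker, WorkerPid},
    ok.

%% The coordinator ran its own cleanup; stop so workers aren't killed twice.
stop_cleaner(Cleaner) ->
    Cleaner ! stop,
    ok.

cleaner_loop(MRef, Workers) ->
    receive
        {add_worker, Pid} ->
            cleaner_loop(MRef, [Pid | Workers]);
        stop ->
            ok;
        {'DOWN', MRef, process, _Coordinator, _Reason} ->
            %% Coordinator died without cleaning up: kill its workers.
            [exit(Pid, kill) || Pid <- Workers],
            ok
    end.
```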
> 
> 
> 
> On Thu, Apr 18, 2019 at 5:16 PM Robert Samuel Newson 
> wrote:
> 
> > My view is a) the server was unavailable for this request due to all the
> > other requests it’s currently dealing with b) the connection was not idle,
> > the client is not at fault.
> >
> > B.
> >
> > > On 18 Apr 2019, at 22:03, Done Collectively  wrote:
> > >
> > > Any reason 408 would be undesirable?
> > >
> > > https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/408
> > >
> > >
> > > On Thu, Apr 18, 2019 at 10:37 AM Robert Newson 
> > wrote:
> > >
> > >> 503 imo.
> > >>
> > >> --
> > >>  Robert Samuel Newson
> > >>  rnew...@apache.org
> > >>
> > >> On Thu, 18 Apr 2019, at 18:24, Adam Kocoloski wrote:
> > >>> Yes, we should. Currently it’s a 500, maybe there’s something more
> > >> appropriate:
> > >>>
> > >>>
> > >>
> > https://github.com/apache/couchdb/blob/8ef42f7241f8788afc1b6e7255ce78ce5d5ea5c3/src/chttpd/src/chttpd.erl#L947-L949
> > >>>
> > >>> Adam
> > >>>
> > >>>> On Apr 18, 2019, at 12:50 PM, Joan Touzet  wrote:
> > >>>>
> > >>>> What happens when it turns out the client *hasn't* timed out and we
> > >>>> just...hang up on them? Should we consider at least trying to send
> > back
> > >>>> some sort of HTTP status code?
> > >>>>
> > >>>> -Joan
> > >>>>
> > >>>> On 2019-04-18 10:58, Garren Smith wrote:
> > >>>>> I'm +1 on this. With partition queries, we added a few more timeouts
> > >> that
> > >>>>> can be enabled which Cloudant enable. So having the ability to shed
> > >> old
> > >>>>> requests when these timeouts get hit would be great.
> > >>>>>
> > >>>>> Cheers
> > >>>>> Garren
> > >>>>>
> > >>>>> On Tue, Apr 16, 2019 at 2:41 AM Adam Kocoloski 
> > >> wrote:
> > >>>>>
> > >>>>>> Hi all,
> > >>>>>>
> > >>>>>> For once, I’m coming to you with a topic that is not strictly about
> > >>>>>> FoundationDB :)
> > >>>>>>
> > >>>>>> CouchDB offers a few config settings (some of them undocumented) to
> > >> put a
> > >>>>>> limit on how long the server is allowed to take to generate a
> > >> response. The
> > >>>>>> trouble with many of these timeouts is that, when they fire, they do
> > >> not
> > >>>>>> actually clean up all of the work that they initiated. A couple of
> > >> examples:
> > >>>>>>
> > >>>>>> - Each HTTP response coordinated by the “fabric” application spawns
> > >>>>>> several ephemeral processes via “rexi" on different nodes in the
> > >> cluster to
> > >>>>>> retrieve data and send it back to the process coordinating the
> > >> response. If
> > >>>>>> the request timeout fires, the coordinating process will be killed
> > >> off, but
> > >>>>>> the ephemeral workers might not be. In a healthy cluster they’ll
> > >> exit on
> > >>>>>> their

Re: [DISCUSS] Improve load shedding by enforcing timeouts throughout stack

2019-04-18 Thread Robert Newson
503 imo.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Thu, 18 Apr 2019, at 18:24, Adam Kocoloski wrote:
> Yes, we should. Currently it’s a 500, maybe there’s something more 
> appropriate:
> 
> https://github.com/apache/couchdb/blob/8ef42f7241f8788afc1b6e7255ce78ce5d5ea5c3/src/chttpd/src/chttpd.erl#L947-L949
> 
> Adam
> 
> > On Apr 18, 2019, at 12:50 PM, Joan Touzet  wrote:
> > 
> > What happens when it turns out the client *hasn't* timed out and we
> > just...hang up on them? Should we consider at least trying to send back
> > some sort of HTTP status code?
> > 
> > -Joan
> > 
> > On 2019-04-18 10:58, Garren Smith wrote:
> >> I'm +1 on this. With partition queries, we added a few more timeouts that
> >> can be enabled which Cloudant enable. So having the ability to shed old
> >> requests when these timeouts get hit would be great.
> >> 
> >> Cheers
> >> Garren
> >> 
> >> On Tue, Apr 16, 2019 at 2:41 AM Adam Kocoloski  wrote:
> >> 
> >>> Hi all,
> >>> 
> >>> For once, I’m coming to you with a topic that is not strictly about
> >>> FoundationDB :)
> >>> 
> >>> CouchDB offers a few config settings (some of them undocumented) to put a
> >>> limit on how long the server is allowed to take to generate a response. 
> >>> The
> >>> trouble with many of these timeouts is that, when they fire, they do not
> >>> actually clean up all of the work that they initiated. A couple of 
> >>> examples:
> >>> 
> >>> - Each HTTP response coordinated by the “fabric” application spawns
> >>> several ephemeral processes via “rexi" on different nodes in the cluster 
> >>> to
> >>> retrieve data and send it back to the process coordinating the response. 
> >>> If
> >>> the request timeout fires, the coordinating process will be killed off, 
> >>> but
> >>> the ephemeral workers might not be. In a healthy cluster they’ll exit on
> >>> their own when they finish their jobs, but there are conditions under 
> >>> which
> >>> they can sit around for extended periods of time waiting for an overloaded
> >>> gen_server (e.g. couch_server) to respond.
> >>> 
> >>> - Those named gen_servers (like couch_server) responsible for serializing
> >>> access to important data structures will dutifully process messages
> >>> received from old requests without any regard for (of even knowledge of)
> >>> the fact that the client that sent the message timed out long ago. This 
> >>> can
> >>> lead to a sort of death spiral in which the gen_server is ultimately
> >>> spending ~all of its time serving dead clients and every client is timing
> >>> out.
> >>> 
> >>> I’d like to see us introduce a documented maximum request duration for all
> >>> requests except the _changes feed, and then use that information to aid in
> >>> load shedding throughout the stack. We can audit the codebase for
> >>> gen_server calls with long timeouts (I know of a few on the critical path
> >>> that set their timeouts to `infinity`) and we can design servers that
> >>> efficiently drop old requests, knowing that the client who made the 
> >>> request
> >>> must have timed out. A couple of topics for discussion:
> >>> 
> >>> - the “gen_server that sheds old requests” is a very generic pattern, one
> >>> that seems like it could be well-suited to its own behaviour. A cursory
> >>> search of the internet didn’t turn up any prior art here, which surprises
> >>> me a bit. I’m wondering if this is worth bringing up with the broader
> >>> Erlang community.
> >>> 
> >>> - setting and enforcing timeouts is a healthy pattern for read-only
> >>> requests as it gives a lot more feedback to clients about the health of 
> >>> the
> >>> server. When it comes to updates things are a little bit more muddy, just
> >>> because there remains a chance that an update can be committed, but the
> >>> caller times out before learning of the successful commit. We should try 
> >>> to
> >>> minimize the likelihood of that occurring.
> >>> 
> >>> Cheers, Adam
> >>> 
> >>> P.S. I did say that this wasn’t _strictly_ about FoundationDB, but of
> >>> course FDB has a hard 5 second limit on all transactions, so it is a bit 
> >>> of
> >>> a forcing function :). Even putting FoundationDB aside, I would still argue
> >>> to pursue this path based on our Ops experience with the current codebase.
> >> 
> > 
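
The "gen_server that sheds old requests" Adam describes is easy to sketch. 
This is only an illustration of the shape (the module and API names are made 
up), not a proposal for how CouchDB should spell it:

```
%% Callers pass the deadline they are working to; the server drops anything
%% whose deadline has already passed instead of doing work for a client that
%% gave up long ago.
-module(shedding_server).
-behaviour(gen_server).
-export([start_link/0, request/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% TimeoutMs should match the overall request timeout used by the HTTP layer.
request(Work, TimeoutMs) ->
    Deadline = erlang:monotonic_time(millisecond) + TimeoutMs,
    gen_server:call(?MODULE, {Work, Deadline}, TimeoutMs).

init([]) ->
    {ok, nostate}.

handle_call({Work, Deadline}, _From, State) ->
    case erlang:monotonic_time(millisecond) > Deadline of
        true ->
            %% The caller has already timed out; don't reply, don't do the work.
            {noreply, State};
        false ->
            {reply, do_work(Work), State}
    end.

handle_cast(_Msg, State) ->
    {noreply, State}.

%% Stand-in for the real work.
do_work(Work) ->
    {ok, Work}.
```

The same deadline could ride along in rexi messages so the worker side can 
make the identical "is anyone still waiting for this?" check before queueing 
behind a busy gen_server.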
> 
>


Re: [DISCUSS] FDB and CouchDB replication

2019-04-10 Thread Robert Newson
As long as any given replicator node can grab as much work as it can handle, it 
doesn't need to be 'fair' in the way we currently do it. The notion of an 
'owner' node drops away imo. As Nick mentions, the fun part is recognising when 
jobs become unowned due to a resource failure somewhere, but this is a very 
standard thing: a pool of workers competing over jobs.

-- 
  Robert Samuel Newson
  rnew...@apache.org
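
For illustration only, one way to express the deterministic split Nick 
describes below: each node answers "is this job mine?" purely from the role 
membership list read out of fdb (module and function names here are invented):

```
%% Deterministic ownership: every node computes the same owner for a given
%% job id from the same membership list, so no extra coordination is needed
%% to divide the work.
-module(rep_assign_sketch).
-export([owner/2, mine/3]).

%% Owner of a single replication job, identified by its _replicator doc id.
owner(JobId, Nodes) when Nodes =/= [] ->
    Sorted = lists:sort(Nodes),
    Idx = erlang:phash2(JobId, length(Sorted)),
    lists:nth(Idx + 1, Sorted).

%% The subset of jobs this node should be running right now.
mine(ThisNode, JobIds, Nodes) ->
    [J || J <- JobIds, owner(J, Nodes) =:= ThisNode].
```

With [n1, n2, n3] and 60 job ids each node ends up with roughly 20, and a 
membership change just means every node recomputes mine/3. The open problem 
remains the one Nick raises below: noticing that a node died without removing 
itself from the list.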

On Thu, 11 Apr 2019, at 00:43, Adam Kocoloski wrote:
> Hi Nick,
> 
> Good stuff. On the first topic, I wonder if it makes sense to use a 
> dedicated code path for updates to _replicator DB docs, one that would 
> automatically register these replication jobs in a queue in FDB as part 
> of the update transaction. That’d save the overhead and latency of 
> listening to _db_updates and then each _replicator DB’s _changes feed 
> just to discover these updates (and then presumably create jobs in a 
> job queue anyway).
> 
> On the second topic — is it important for each node declaring the 
> replicator role to receive an equal allotment of jobs and manage its 
> own queue, or can the replication worker processes on each node just 
> grab the next job off the global queue in FDB whenever they free up? I 
> could see the latter approach decreasing the tail latency for job 
> execution, and I think there are good patterns for managing high 
> contention dequeue operations in the case where we’ve got more worker 
> processes than jobs to run.
> 
> Regardless, you make a good point about paying special attention to 
> liveness checking now that we’re not relying on Erlang distribution for 
> that purpose. I didn’t grok all the details of approach you have in 
> mind for that yet because I wanted to bring up these two points above 
> and get your perspective.
> 
> Adam
> 
> > On Apr 10, 2019, at 6:21 PM, Nick Vatamaniuc  wrote:
> > 
> > I was thinking how replication would work with FDB and so far there are two
> > main issues I believe would need to be addressed. One deals with how we
> > monitor _replicator db docs for changes, and other one is how replication
> > jobs coordinate so we don't run multiple replication jobs for the same
> > replication document in a cluster.
> > 
> > 1) Shard-level vs fabric-level notifications for _replicator db docs
> > 
> > Currently replicator is monitoring and updating individual _replicator
> > shards. Change notifications are done via change feeds (normal,
> > non-continuous) and couch event server callbacks.
> > https://github.com/apache/couchdb/blob/master/src/couch/src/couch_multidb_changes.erl#L180,
> > https://github.com/apache/couchdb/blob/master/src/couch/src/couch_multidb_changes.erl#L246.
> > With fdb we'd have to get these updates via a fabric changes feeds and rely
> > on the global _db_updates. That could result in a performance impact and
> > would be something to keep an eye on.
> > 
> > 2) Replicator job coordination
> > 
> > Replicator has a basic constraint that there should be only one replication
> > job running for each replicator doc per cluster.
> > 
> > Each replication currently has a single "owner" node. The owner is picked
> > to be one of 3 nodes were the _replicator doc shards live. If nodes connect
> > or disconnect, replicator will reshuffle replication jobs and some nodes
> > will stop running jobs that they don't "own" anymore and then proceed to
> > "rescan" all the replicator docs to possibly start new ones. However, with
> > fdb, there are no connected erlang nodes and no shards. All coordination
> > happens via fdb, so we'd have to somehow coordinate replication job
> > ownership through there.
> > 
> > For discussion, here is a proposal for a worker registration layer to do that
> > job coordination:
> > 
> > The basic idea is erlang fabric nodes would declare, by writing to fdb,
> > that they can take on certain "roles". "replicator" would be one such role.
> > And so, for each role there is a list of nodes. Each node picks a fraction
> > of jobs based on how many other nodes of the same role are in the list.
> > When membership changes, nodes which are alive might have to pick up new
> > jobs or stop running existing jobs since they'd be started by other nodes.
> > 
> > For example, there are 3 nodes with "replicator" roles: [n1, n2, n3]. n1 is
> > currently down so the membership list is [n2, n3]. If there are 60
> > replication jobs then n2 might run 30, and n3 another 30. n1 comes online
> > and adds itself to the roles list, which now looks like [n1, n2, n3]. n1
> > then picks 20 replication jobs. At about the same time n2 and n3 notice n1
> > is online and decide to stop running the jobs that n1 would pick up and
> > they each would end up running roughly 20 jobs.
> > 
> > The difficulty here comes from maintaining liveliness. A node could stop at
> > any time without removing itself from the membership list of its roles.
> > That means all of the sudden a subset of jobs would stop running without
> > anyone picking them 

Re: [DISCUSS] Statistics maintenance in FoundationDB

2019-04-09 Thread Robert Newson
Hi,

I agree with all of this.

On "sizes", we should clean up the various places that the different sizes are 
reported. I suggest we stick with just the "sizes" object, which will have two 
items, 'external' which will be jiffy's estimate of the body as json plus the 
length of all attachments (only if held within fdb) and 'file' which will be 
the sum of the lengths of the keys and values in fdb for the Directory 
(excluding the sum key/value itself). (the long way of saying I agree with what 
you already said).

On "offset", I agree we should remove it. It's of questionable value today, so 
let's call it out as an API change in the appropriate RFC section. The fdb 
release (ostensibly "4.0") is an opportunity to clean up some API cruft. Given 
we know about this one early, we should also remove it in 3.0.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 8 Apr 2019, at 23:33, Adam Kocoloski wrote:
> Hi all, a recent comment from Paul on the revision model RFC reminded 
> me that we should have a discussion on how we maintain aggregate 
> statistics about databases stored in FoundationDB. I’ll ignore the 
> statistics associated with secondary indexes for the moment, assuming 
> that the design we put in place for document data can serve as the 
> basis for an extension there.
> 
> The first class of statistics are the ones we report in GET /, 
> which are documented here:
> 
> http://docs.couchdb.org/en/stable/api/database/common.html#get--db
> 
> These fall into a few different classes:
> 
> doc_count, doc_del_count: these should be maintained using 
> FoundationDB’s atomic operations. The revision model RFC enumerated all 
> the possible update paths and showed that we always have enough 
> information to know whether to increment or decrement each of these 
> counters; i.e., we always know when we’re removing the last 
> deleted=false branch, adding a new branch to a previously-deleted 
> document, etc.
> 
> update_seq: this must _not_ be maintained as its own key; attempting to 
> do so would cause every write to the database to conflict with every 
> other write and kill throughput. Rather, we can do a limit=1 range read 
> on the end of the ?CHANGES space to retrieve the current sequence of 
> the database.
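
That limit=1 read might look something like this sketch (same assumed erlfdb-style bindings; the "changes" subspace name is illustrative):

    %% Read the current update sequence off the end of the changes subspace
    %% instead of maintaining (and conflicting on) a dedicated counter key.
    get_update_seq(Tx, DbPrefix) ->
        {Start, End} = erlfdb_tuple:range({<<"changes">>}, DbPrefix),
        case erlfdb:get_range(Tx, Start, End, [{limit, 1}, {reverse, true}]) of
            [] ->
                null;  % empty database, no updates yet
            [{Key, _Val}] ->
                {<<"changes">>, Seq} = erlfdb_tuple:unpack(Key, DbPrefix),
                Seq
        end.
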
> 
> sizes.*: things get a little weird here. Historically we relied on the 
> relationship between sizes.active and sizes.file to know when to 
> trigger a database compaction, but we don’t yet have a need for 
> compaction in the FDB-based data model and it’s not clear how we should 
> define these two quantities. The sizes.external field has also been a 
> little fuzzy. Ignoring the various definitions of “size” for the 
> moment, let’s agree that we’ll want to be tracking some set of byte 
> counts for each database. I think the way we should do this is by 
> extending the information stored in each edit branch in ?REVISIONS to 
> included the size(s) of the current revision. When we update a document 
> we need to compare the size(s) of the new revision with the size(s) of 
> the parent, and update the database level atomic counter(s) 
> appropriately. This requires an enhancement to RFC 001.
> 
> I’d like to further propose that we track byte counts not just at a 
> database level but also across the entire Directory associated with a 
> single CouchDB deployment, so that FoundationDB administrators managing 
> multiple applications for a single cluster can have a better view of 
> per-Directory resource utilization without walking every single 
> database stored inside.
> 
> Looking past the DB info endpoint, one other statistic worth discussing 
> is the “offset” field included with every response to an _all_docs 
> request. This is not something that we get for free in FoundationDB, 
> and I have to confess it seems to be of limited utility. We could 
> support this by implementing a tree structure by adding additional 
> aggregation keys on top of the keys stored in the _all_docs space, but 
> I’m skeptical that it’s worth baking this extra cost into every 
> database update and _all_docs operation. I’d like to hear others’ 
> thoughts on this one.
> 
> I haven’t yet looked closely at _stats and _system to see if any of 
> those metrics require specific support from FDB.
> 
> Adam


Re: [DISCUSS] Implementing _all_docs on FoundationDB

2019-03-21 Thread Robert Newson
Hi,

Thanks for pushing forward, and I owe feedback on other threads you've started.

Rather feebly, I'm just agreeing with you. Option 3 for include_docs=false and 
option 1 for include_docs=true sounds ideal. Both flavours are very common, so 
it makes sense to build a solution for each. At a pinch we can just do option 3 
+ async doc lookups in a first release and then circle back, but the RFC should 
propose 1 and 3 as our design intention.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Thu, 21 Mar 2019, at 19:50, Adam Kocoloski wrote:
> Hi all, me again. This one will be shorter :) As I see it we have three 
> different options for serving the _all_docs endpoint from FDB: 
> 
> ## Option 1: Read the document data, discard the bodies
> 
> We likely will have the documents stored in docid order already; we 
> could do range reads and discard everything but the ID and _rev by 
> default. This can be a very efficient implementation of 
> include_docs=true (though one needs to be careful about skipping the 
> conflict bodies), but pretty wasteful otherwise.
> 
> ## Option 2: Read the “revisions” subspace
> 
> We also have an entry for every document in ID order in the “revisions” 
> subspace. The disadvantage of this approach is that every deleted edit 
> branch shows up there, too, and some databases will have lots of 
> deleted documents. We may need to build skiplists to know how to scan 
> efficiently. This subspace is also doing a lot of heavy lifting for us 
> already, and if we wanted to toy with alternative revision history 
> representations in the future it could get complicated
> 
> ## Option 3: Add specific entries to support _all_docs
> 
> We can also write an extra KV containing the ID and winning _rev in a 
> special subspace just to support this endpoint. It would be a blind 
> write because we’re already coordinating concurrent transactions 
> through reads on the “revisions” subspace. This would be conceptually 
> quite clean and simple, and the fastest implementation for constructing 
> the default response.
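
As a rough illustration of Option 3 (assumed erlfdb-style bindings again; the key layout and names are mine, not a settled design):

    %% On every document update, blindly (re)write the entry for the winning
    %% revision; deleting the document clears it.
    write_all_docs_entry(Tx, DbPrefix, DocId, WinningRev) ->
        Key = erlfdb_tuple:pack({<<"all_docs">>, DocId}, DbPrefix),
        erlfdb:set(Tx, Key, erlfdb_tuple:pack({WinningRev})).

    %% GET /db/_all_docs is then a single range scan, already in docid order.
    all_docs(Tx, DbPrefix) ->
        {Start, End} = erlfdb_tuple:range({<<"all_docs">>}, DbPrefix),
        [begin
             {<<"all_docs">>, DocId} = erlfdb_tuple:unpack(Key, DbPrefix),
             {Rev} = erlfdb_tuple:unpack(Val),
             {DocId, Rev}
         end || {Key, Val} <- erlfdb:get_range(Tx, Start, End)].
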
> 
> ===
> 
> My sense is Option 2 is a non-starter but I include it for completeness 
> in case anyone else thought of the same. I think Option 3 is a 
> reasonable space / efficiency / simplicity tradeoff, and it might also 
> be worth testing out Option 1 as an optimized implementation for 
> include_docs=true.
> 
> Thoughts? I imagine we can move quickly to an RFC for at least having 
> the extra KVs for Option 3, and in that design also acknowledge the 
> option for scanning the docs space directly to support include_docs.
> 
> Adam


Re: FoundationDB & Multi tenancy model

2019-03-18 Thread Robert Newson
Hi,

Firstly, CouchDB today does not have multi-tenancy as a feature. Cloudant does 
and achieves this by inserting the tenant's name as a prefix on the database 
name (so "rnewson/db1" is a different database to "sleroux/db1"), with 
appropriate stripping of the prefix in various responses. I would like to see 
multi-tenancy carried into CouchDB as first-level feature, though.

With that preamble done, each tenant will have a unique label pretty much by 
definition, and this would be included in all the keys. Running that, or other 
properties, through a cryptographically secure message digest algorithm 
achieves nothing but obfuscation and, as you note, the possibility (however 
remote) of a collision. Crypto isn't magic, even if it looks like magic.

FDB provides the notion of a "Directory" which is a mechanism to help with very 
long keys, given the key length constraint of 10k.

So, instead of representing a doc of {"foo":12} in "db1" of my "rnewson" 
account simply as;

/couchdb/rnewson/db1/doc1/foo => 12

we could create a Directory for the prefix "/couchdb/rnewson/db1" instead;

dirspace/couchdb/rnewson/db1 => 0x01
0x01/doc1/foo => 12
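
In code the effect is simply a short prefix in front of every key. A sketch, with the Directory's allocated prefix hard-coded purely for illustration:

    %% Without a Directory the full tenant/db path is repeated in every key.
    long_key(Tenant, DbName, DocId, Field) ->
        erlfdb_tuple:pack({<<"couchdb">>, Tenant, DbName, DocId, Field}).

    %% With a Directory the path is mapped once to a short prefix. The
    %% Directory layer allocates and remembers the prefix; it is hard-coded
    %% here only to keep the example small.
    short_key(DirPrefix, DocId, Field) ->
        erlfdb_tuple:pack({DocId, Field}, DirPrefix).

    %% short_key(<<16#01>>, <<"doc1">>, <<"foo">>) plays the role of
    %% 0x01/doc1/foo in the example above.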

We're overdue for the Document Model RFC that would make this explicit.

Finally, I think we're past the "proposition" stage as there is broad 
agreement (and no disagreement) from the conversations already had. We are a 
little behind on writing and publishing the RFC's that will describe the full 
work, though.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 18 Mar 2019, at 17:32, Steven Le Roux wrote:
> Hi everyone.
> 
> I'm new here and just discovered the ongoing proposition for CouchDB to
> rely upon FDB.
> 
> With my team, we were considering providing an HTTP API over FDB in the
> form of the CouchDB API definition, so I'm very pleased to see there is
> already an ongoing effort for this (even if still a proposition). I've
> tried to catch up with all the good discussions on how you could make this
> work, mapping to the K/V model, but sorry if I could have missed a point.
> 
> I'm curious on how you're considering to manage multi tenancy while
> ensuring a good scalability and avoiding hotspotting.
> 
> I've read an idea from Mickael with CryptoHash to map the model this way :
> 
> {bucket_id}/{cryptohash}  : value
> 
> We currently use this CryptoHash mecanism to manage some data in a multi
> tenancy context applied to Time Series.
> 
> Here is a simple diagram that summarize it :
> 
> {raw_data} -> ingress component -> {hashed_metadata+data} -> HBase
> -> {crypted_metadata} -> HBase
> -> {crypted_metadata} -> Directory service
> 
> Query -> egress component -> HBase
> 
> raw_data is in the metric{tags} format, like in Prometheus/OpenTSDB/Warp10
> style.
> hashed metadata is a double 64 or 128 bits hashes of hash(metric) +
> hash(tags).
> Default is 64bits but it can lead to collision in the keyspace above 1B
> unique series where 128bits hashes are safer.
> egress will query the Directoy service to get the series list to be read in
> the store.
> 
> While authenticating, a custom "application" label is embedded into a label
> that ends up in the data model and is then hashed, which avoids conflicts
> between users. Hashed metadata are suffixed with a timestamp because it's
> convenient for Time Series data.
> What makes it very useful is :
>  - it can still use scans per series (metrics+tags)
>  - it avoids hotspotting the cluster and ensures a very good distribution
> among nodes
>  - it provides authentication through a directory service that act as an
> indirection
>  - keys are consistent while metrics or tags can be very long
> 
> I think this kind of model can perfectly apply to FDB for documents given
> that Namespace would be a user application/bucket/...  :
> 
> hash ( {NS} + {...} + {DOC_ID} ) / fields / ...
> 
> Drawbacks are that it may require a bit more storage for keys, but hashing
> could be adjusted given the use case. Moreover, managing rights at the
> document level would also require additional fields or few bytes to manage
> this, while using a directory index (could be as memory inside CouchDB,
> outside relying on something like Elastic, or available directly inside FDB)
> 
> I realize that just FDB as a backend is a considerable amount of work and
> pushing multi tenancy adds even more work maybe into CouchDB itself. For
> example, Tokens could embed rights and buckets ids, that would be used by
> CouchDB to authorize and build the underlying data model for storing with
> scalability and optimizations in mind. Also, did anyone considered reaching
> the FDB guys to try to align CouchDB document representation to the
> Document Layer (
> https://foundationdb.github.io/fdb-document-layer/data-modeling.html ).
> This would make CouchDB to be also MongoDB API compatible.
> 
> I don't know where discussions are, but maybe we could help :)
>


Re: [DISCUSS] : things we need to solve/decide : changes feed

2019-03-06 Thread Robert Newson
+1 to both changes, will echo that in the PR.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Wed, 6 Mar 2019, at 00:04, Adam Kocoloski wrote:
> Dredging this thread back up with an eye towards moving to an RFC …
> 
> I was reading through the FoundationDB Record Layer preprint[1] a few 
> weeks ago and noticed an enhancement to their version of _changes that 
> I know would be beneficial to IBM and that I think is worth considering 
> for inclusion in CouchDB directly. Quoting the paper:
> 
> > To implement a sync index, CloudKit leverages the total order on 
> > FoundationDB’s commit versions by using a VERSION index, mapping versions 
> > to record identifiers. To perform a sync, CloudKit simply scans the VERSION 
> > index.
> > 
> > However, commit versions assigned by different FoundationDB clusters are 
> > uncorrelated. This introduces a challenge when migrating data from one 
> > cluster to another; CloudKit periodically moves users to improve load 
> > balance and locality. The sync index must represent the order of updates 
> > across all clusters, so updates committed after the move must be sorted 
> > after updates committed before the move. CloudKit addresses this with an 
> > application-level per-user count of the number of moves, called the 
> > incarnation. Initially, the incarnation is 1, and CloudKit increments it 
> > each time the user’s data is moved to a different cluster. On every record 
> > update, we write the user’s current incarnation to the record’s header; 
> > these values are not modified during a move. The VERSION sync index maps 
> > (incarnation, version) pairs to changed records, sorting the changes first 
> > by incarnation, then by version.
> 
> One of our goals in adopting FoundationDB is to eliminate rewinds of 
> the _changes feed; we make significant progress towards that goal 
> simply by adopting FoundationDB versionstamps as sequence identifiers, 
> but in cases where user data might be migrated from one FoundationDB 
> cluster to another we can lose this total ordering and rewind (or 
> worse, possibly skip updates). The “incarnation” trick of prefixing the 
> versionstamp with an integer which gets bumped whenever a user is moved 
> is a good way to mitigate that. I’ll give some thought to how the 
> per-database incarnation can be recorded and what facility we might 
> have for intelligently bumping it automatically, but I wanted to bring 
> this to folks’ attention and resurrect this ML thread.
> 
> Another thought I had this evening is to record the number of edit 
> branches for a given document in the value of the index. The reason I’d 
> do this is to optimize the popular `style=all_docs` queries to _changes 
> to avoid an extra range read in the very common case where a document 
> has only a single edit branch.
> 
> With the incarnation and branch count in place we’d be looking at a 
> design where the KV pairs have the structure
> 
> (“changes”, Incarnation, Versionstamp) = (ValFormat, DocID, RevFormat, 
> RevPosition, RevHash, BranchCount)
> 
> where ValFormat is an enumeration enabling schema evolution of the 
> value format in the future, and RevFormat, RevPosition, RevHash are 
> associated with the winning edit branch for the document (not 
> necessarily the edit that occurred at this version, matching current 
> CouchDB behavior) and carry the meanings defined in the revision 
> storage RFC[2].
> 
> A regular _changes feed request can respond simply by scanning this 
> index. A style=all_docs request can also be a simple scan if 
> BranchCount is 1; if it’s greater than 1 we would need to do an 
> additional range read of the “revisions” subspace to retrieve the leaf 
> revision identifiers for the document in question. An include_docs=true 
> request would need to do an additional range read in the document 
> storage subspace for this revision.
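
A sketch of the write side with those extra fields. It assumes the erlfdb bindings expose versionstamp-aware tuple packing (erlfdb_tuple:pack_vs/2) and the SET_VERSIONSTAMPED_KEY mutation (erlfdb:set_versionstamped_key/3); subspace names and the ValFormat value are illustrative:

    %% Append a changes entry whose key sorts by (incarnation, commit version).
    %% The incomplete versionstamp is filled in by fdb at commit time.
    record_change(Tx, DbPrefix, Incarnation, DocId, {RevFormat, RevPos, RevHash}, BranchCount) ->
        Incomplete = {versionstamp, 16#FFFFFFFFFFFFFFFF, 16#FFFF},
        Key = erlfdb_tuple:pack_vs({<<"changes">>, Incarnation, Incomplete}, DbPrefix),
        ValFormat = 0,
        Val = erlfdb_tuple:pack({ValFormat, DocId, RevFormat, RevPos, RevHash, BranchCount}),
        erlfdb:set_versionstamped_key(Tx, Key, Val).
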
> 
> I think both the incarnation and the branch count warrant a small 
> update to the revision metadata RFC …
> 
> Adam
> 
> [1]: https://www.foundationdb.org/files/record-layer-paper.pdf
> [2]: https://github.com/apache/couchdb-documentation/pull/397
> 
> 
> > On Feb 5, 2019, at 12:20 PM, Mike Rhodes  wrote:
> > 
> > Solution (2) appeals to me for its conceptual simplicity -- and having a 
> > stateless CouchDB layer I feel is super important in simplifying overall 
> > CouchDB deployment going forward.
> > 
> > -- 
> > Mike.
> > 
> > On Mon, 4 Feb 2019, at 20:11, Adam Kocoloski wrote:
> >> Probably good to take a quick step back and note that FoundationDB’s 
> >> versionstamps are an elegant and scalable solution to atomically 
> >> maintaining the index of documents in the order in which they were most 
> >> recently updated. I think that’s what you mean by the first part of the 
> >> problem, but I want to make sure that on the ML here we collectively 
> >> understand that FoundationDB actually nails this hard part of the 
> >> problem *really* well.
> >> 
> >> When you say “notify CouchDB 

Re: [VOTE] Bylaws change: Establish a Qualified Lazy Majority vote type for the RFC process (revised)

2019-03-05 Thread Robert Newson
-1 for the following reason:

"https://apache.org/foundation/how-it-works.html#hats

INDIVIDUALS COMPOSE THE ASF
All of the ASF including the board, the other officers, the committers, and the 
members, are participating as individuals. That is one strength of the ASF, 
affiliations do not cloud the personal contributions."

The Qualified Lazy Majority explicitly links an individual contributors actions 
with their employee (where they have one) and therefore, in my opinion, 
presumes bad faith.

I agree that some ASF projects appear to have been co-opted by a single 
company, that some people may believe this is true of CouchDB and this deserves 
a response.

I would vote _for_ a bylaw change that addressed this head-on. For example, if 
we established a remedy for when this occurs (temporary or permanent 
suspension, for example) or a definition of what we would consider an 
occurrence. Obviously that is a hard problem and presumably the reason it 
hasn't yet been proposed.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 5 Mar 2019, at 20:29, Joan Touzet wrote:
> (Revised: The Reply-To field on the last email was incorrect. I am
> also including the diff of the proposed changes to this email per
> request.)
> 
> 
> 
> PMC Members,
> 
> This is a VOTE to create or amend our official documents.
> 
> It establishes a new Qualified Lazy Majority voting type, and amends
> the bylaws to use this new type for RFC votes only.
> 
> This vote depends on the RFC vote passing.
> 
> The git branch with the proposed further changes to the bylaws,
> beyond the RFC process itself, is here:
> 
>   
> https://github.com/apache/couchdb-www/compare/add-rfc...add-qualified-lazy-majority
> 
> Per our process, this vote occurs on the Main development list (dev@),
> and requires a Lazy 2/3 majority, meaning it requires three binding +1
> votes and twice as many binding +1 votes as binding -1 votes. Only PMC
> Members can vote on this issue, and no veto is allowed.
> 
> This vote will run for one week, ending on 12 March 2019 23:59 UTC.
> 
> -Joan
> 
> ---
> 
> diff --git a/bylaws.html b/bylaws.html
> index 4aea136..7503b88 100644
> --- a/bylaws.html
> +++ b/bylaws.html
> @@ -262,7 +262,7 @@
> 
>  3.4. Approval Models
> 
> -We use three different approval models for formal voting:
> +We use four different approval models for formal voting:
> 
>
>  RTC (see section 3.5)
> @@ -273,6 +273,11 @@
>
> Requires three binding +1 votes and more binding +1 votes 
> than binding -1 votes
>
> +Qualified lazy majority
> +  
> +Requires three binding +1 votes and more binding +1 votes 
> than binding -1 votes
> +In addition, at least one binding +1 vote must be from an 
> individual not directly affiliated with the proposer's employer (if 
> applicable)
> +  
>  Lazy 2/3 majority
>
> Requires three binding +1 votes and twice as many binding 
> +1 votes as binding -1 votes
> @@ -281,6 +286,8 @@
> 
>  RTC is only ever used in the context of a code review or a pull 
> request, and does not require a separate vote thread. Each of the other 
> approval models requires a vote thread.
> 
> +Qualified lazy majority is only used for the RFC 
> process.
> +
>  A -1 vote is never called a veto except when using the RTC approval 
> model. This is because a single -1 vote never has the power to block a 
> vote outside of RTC.
> 
>  Which approval model to use is dictated by the table in section 
> 3.6. This is project policy, and can be changed by amending this 
> document.
> @@ -316,7 +323,7 @@ The process is:
>
>  Start a [DISCUSS] thread on  href="https://lists.apache.org/list.html?dev@couchdb.apache.org;>the 
> developer mailing list. Discuss your proposal in detail, including 
> which modules/applications are affected, any HTTP API additions and 
> deprecations, and security issues.
>  When there is consensus on the approach from the community,  href="https://s.apache.org/CouchDB-RFC;>complete the RFC template 
> and work through any final revisions to the document, with the support 
> of the developer mailing list.
> -Start the RFC vote on the developer mailing 
> list. Hold the vote according to the lazy majority process: at 
> least 3 +1 votes, and more binding +1 than binding -1 votes.
> +Start the RFC vote on the developer mailing 
> list. Hold the vote according to the qualified lazy majority 
> process: at least 3 +1 votes, more +1 than -1 votes, and at least 
> one +1 vote must be from someone not directly affiliated with the 
> proposer's employer.
>
> 
>  3.7 API changes and deprecations
> @@ -355,7 +362,7 @@ The process is:
>A decision on a specific proposal to alter CouchDB in a 
> significant way.
> href="https://lists.apache.org/list.html?dev@couchdb.apache.org;>Main 
> development list
>No
> -  Lazy majority
> +  Qualified lazy 

Re: [VOTE] Bylaws change: Establish a Request For Comments (RFC) process for major new contribution (revised)

2019-03-05 Thread Robert Newson
+1

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 5 Mar 2019, at 20:18, Joan Touzet wrote:
> (Revised: The Reply-To field on the last email was incorrect. I am
> also including the diff of the proposed changes to this email per
> request.)
> 
> 
> 
> PMC Members,
> 
> This is a VOTE to create or amend our official documents.
> 
> It establishes a new Request For Comments (RFC) process through which
> all major changes to Apache CouchDB must pass.
> 
> This includes modifying the Bylaws to explain the process, as well as
> adding the new RFC template.
> 
> There are also minor changes proposed to the Bylaws to improve links to
> external information resources, such as the new Pony Mail mailing list
> interface.
> 
> The git branch with the proposed changes is here:
> 
>   https://github.com/apache/couchdb-www/compare/asf-site...add-rfc
> 
> Two commits have been used to make it clearer which changes are purely
> cosmetic, and which directly affect the text of the bylaws.
> 
> An important note: this vote explicitly excludes the mechanics of the
> process in GitHub, focusing only on the proposal and voting procedures.
> (Whether we store the RFCs in our main tree or the docs tree, and
> whether we use issues or PRs for them don't have to go in the bylaws.)
> 
> Per our process, this vote occurs on the Main development list (dev@),
> and requires a Lazy 2/3 majority, meaning it requires three binding +1
> votes and twice as many binding +1 votes as binding -1 votes. Only PMC
> Members can vote on this issue, and no veto is allowed.
> 
> This vote will run for one week, ending on 12 March 2019 23:59 UTC.
> 
> -Joan
> 
> 
> ---
> 
> diff --git a/bylaws.html b/bylaws.html
> index 77ad2c0..4aea136 100644
> --- a/bylaws.html
> +++ b/bylaws.html
> @@ -8,7 +8,8 @@
> 
>  CouchDB Bylaws
> 
> -This document was officially adopted by the CouchDB PMC as of 
> 31 July 2014.
> +This document was officially adopted by the CouchDB PMC as of 
> 31 July 2014.
> +A full changelog is available  href="https://github.com/apache/couchdb-www/commits/asf-site/bylaws.html;>on 
> GitHub.
> 
>  Table of Contents
> 
> @@ -35,7 +36,7 @@
> 
>  We value the community more than the code. A strong and healthy 
> community should be a fun and rewarding place for everyone involved. 
> Code, and everything else that goes with that code, will be produced by 
> a healthy community over time.
> 
> -The direction of the project and the decisions we make are up to 
> you. If you are participating on  href="https://mail-archives.apache.org/mod_mbox/#couchdb;>the mailing 
> lists you have the right to make decisions. All decisions about the 
> project are taken on the mailing lists. There are no lead developers, 
> nor is there any one person in charge.
> +The direction of the project and the decisions we make are up to 
> you. If you are participating on  href="https://couchdb.apache.org/#mailing-lists;>the mailing lists 
> you have the right to make decisions. All decisions about the project 
> are taken on the mailing lists. There are no lead developers, nor is 
> there any one person in charge.
> 
>  Anyone can subscribe to the public mailing lists, and in fact, you 
> are encouraged to do so. The development mailing list is not just for 
> developers, for instance. It is for anyone who is interested in the 
> development of the project. Everybody's voice is welcome.
> 
> @@ -59,7 +60,7 @@
> 
>  The most important participants in the project are people 
> who use our software.
> 
> -Users can participate by talking about the project, providing 
> feedback, and helping others. This can be done at the ASF or elsewhere, 
> and includes being active on  href="https://mail-archives.apache.org/mod_mbox/couchdb-user/;>the user 
> mailing list, third-party support forums, blogs, and social media. 
> Users who participate in this way automatically become contributors.
> +Users can participate by talking about the project, providing 
> feedback, and helping others. This can be done at the ASF or elsewhere, 
> and includes being active on  href="https://lists.apache.org/list.html?u...@couchdb.apache.org;>the 
> user mailing list, third-party support forums, blogs, and social 
> media. Users who participate in this way automatically become 
> contributors.
> 
>2.2. Contributors
> 
> @@ -152,9 +153,9 @@
> 
>  Our goal is to build a community of trust, reduce mailing list 
> traffic, and deal with disagreements swiftly when they occur.
> 
> -All decision making must happen on  href="https://mail-archives.apache.org/mod_mbox/#couchdb;>the mailing 
> lists. Any discussion that takes place away from the lists (for 
> example on IRC or in person) must be brought to the lists before 
> anything can be decided. We have a saying: if it's not on the lists, it 
> didn't happen. We take this approach so that the greatest amount of 
> people have a chance to participate.
> +All decision making must 

Re: Maintainig two codebases

2019-03-01 Thread Robert Newson
Hi,

The first decision is to what extent we are supporting the "old" version (once 
the new one exists, of course). I think it would be limited to security fixes. 
If so, this is exactly the same as when we released 1.2, 1.3, and so on. Master 
is always latest (FDB, in this case) and there are branches like 1.2.x or 2.0.x 
if we need to backport security fixes to make a release there.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 1 Mar 2019, at 13:49, Ilya Khlopotov wrote:
> There are ongoing discussions about rebasing CouchDB on top of 
> FoundationDB. It seems like the community is in agreement that it is 
> worth it to try. This would mean that we would be supporting and 
> extending two quite separate codebases. How are we going to do it? 
> 
> Possible options are:
> - brand new repository
> - separate branch which we would treat as master for FDB rebase project
> 
> I think that separate branch approach has a number of disadvantages:
> - CI might require different dependencies
> - It would be awkward to open GH issues since we would have to always 
> refer to the project we are talking about
> - There would be little friction when we open PR since the correct base 
> branch would need to be selected
> 
> Best regards,
> iilyak
>


Re: [DISCUSS] Attachment support in CouchDB with FDB

2019-02-28 Thread Robert Newson
Thanks to you both, and I agree. 

Adam's "I would like to see a basic “native” attachment provider with the 
limitations described in 2), as well as an “object store” provider targeting 
the S3 API." is my position/preference too. 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Thu, 28 Feb 2019, at 11:41, Adam Kocoloski wrote:
> I would like to see a basic “native” attachment provider with the 
> limitations described in 2), as well as an “object store” provider 
> targeting the S3 API. I think the consistency considerations are 
> tractable if you’re comfortable with the possibility that attachments 
> could possibly be orphaned in the object store in the case of a failed 
> transaction.
> 
> I had not considered the “just write them on the file system” provider 
> but that’s probably partly my cloud-native blinders. I think the main 
> question there is redundancy; I would argue against trying to do any 
> sort of replication across local disks. Users who happen to have an 
> NFS-style mount point accessible to all the CouchDB nodes could use 
> this option reliably, though.
> 
> We should calculate a safe maximum attachment size for the native 
> provider — as I understand things the FDB transaction size includes 
> both keys and values, so our effective attachment size limit will be 
> smaller.
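
To put rough, illustrative numbers on that: with 16 KB value chunks and keys on the order of 100 bytes, each chunk costs roughly 16.5 KB of the 10 MB transaction budget, so an upload tops out around 10 * 1024 * 1024 / 16,484 ≈ 636 chunks, or a little under 10 MB of attachment data — before accounting for the document update and revision bookkeeping that share the same transaction. A practical cap for the native provider would therefore likely sit somewhat lower, perhaps around 8 MB, depending on key sizes and what else is written alongside.
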
> 
> Adam
> 
> > On Feb 28, 2019, at 6:21 AM, Robert Newson  wrote:
> > 
> > Hi,
> > 
> > Yes, I agree we should have a framework like that. Folks should be able to 
> > choose S3 or COS (IBM), etc. 
> > 
> > I am personally on the hook for the implementation for CouchDB and for IBM 
> > Cloudant and expect them to be different, so the framework, IMO, is a 
> > given. 
> > 
> > B. 
> > 
> >> On 28 Feb 2019, at 10:33, Jan Lehnardt  wrote:
> >> 
> >> Thanks for getting this started, Bob!
> >> 
> >> In fear of derailing this right off the bat, is there a potential 4) 
> >> approach where on the CouchDB side there is a way to specify “attachment 
> >> backends”, one of which could be 2), but others could be “node local file 
> >> storage”*, others could be S3-API compatible, etc?
> >> 
> >> *a bunch of heavy handwaving about how to ensure consistency and fault 
> >> tolerance here.
> >> 
> >> * * *
> >> 
> >> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3 
> >> first.
> >> 
> >> 
> >> * * *
> >> 
> >> From 1-3, I think 2 is most pragmatic in terms of keeping desirable 
> >> functionality, while limiting it so it can be useful in practice.
> >> 
> >> I feel strongly about not dropping attachment support. While not ideal in 
> >> all cases, it is an extremely useful and reasonably popular feature.
> >> 
> >> Best
> >> Jan
> >> —
> >> 
> >>> On 28. Feb 2019, at 11:22, Robert Newson  wrote:
> >>> 
> >>> Hi All,
> >>> 
> >>> We've not yet discussed attachments in terms of the foundationdb work so 
> >>> here's where we do that.
> >>> 
> >>> Today, CouchDB allows you to store large binary values, stored as a 
> >>> series of much smaller chunks. These "attachments" cannot be indexed, 
> >>> they can only be sent and received (you can fetch the whole thing or you 
> >>> can fetch arbitrary subsets of them).
> >>> 
> >>> On the FDB side, we have a few constraints. A transaction cannot be more 
> >>> than 10MB and cannot take more than 5 seconds.
> >>> 
> >>> Given that, there are a few paths to attachment support going forward;
> >>> 
> >>> 1) Drop native attachment support. 
> >>> 
> >>> I suspect this is not going to be a popular approach but it's worth 
> >>> hearing a range of views. Instead of direct attachment support, a user 
> >>> could store the URL to the large binary content and could simply fetch 
> >>> that URL directly.
> >>> 
> >>> 2) Write attachments into FDB but with limits.
> >>> 
> >>> The next simplest is to write the attachments into FDB as a series of 
> >>> key/value entries, where the key is {database_name, doc_id, 
> >>> attachment_name, 0..N} and the value is a short byte array (say, 16K to 
> >>> match current). The 0..N is just a counter such that we can do an fdb 
> >>> range get / iterator to retrieve the attachment. An embellishment would 
> >>> re

Re: [DISCUSS] Attachment support in CouchDB with FDB

2019-02-28 Thread Robert Newson
Hi,

Yes, I agree we should have a framework like that. Folks should be able to 
choose S3 or COS (IBM), etc. 

I am personally on the hook for the implementation for CouchDB and for IBM 
Cloudant and expect them to be different, so the framework, IMO, is a given. 

B. 

> On 28 Feb 2019, at 10:33, Jan Lehnardt  wrote:
> 
> Thanks for getting this started, Bob!
> 
> In fear of derailing this right off the bat, is there a potential 4) approach 
> where on the CouchDB side there is a way to specify “attachment backends”, 
> one of which could be 2), but others could be “node local file storage”*, 
> others could be S3-API compatible, etc?
> 
> *a bunch of heavy handwaving about how to ensure consistency and fault 
> tolerance here.
> 
> * * *
> 
> My hypothetical 4) could also be a later addition, and we’ll do one of 1-3 
> first.
> 
> 
> * * *
> 
> From 1-3, I think 2 is most pragmatic in terms of keeping desirable 
> functionality, while limiting it so it can be useful in practice.
> 
> I feel strongly about not dropping attachment support. While not ideal in all 
> cases, it is an extremely useful and reasonably popular feature.
> 
> Best
> Jan
> —
> 
>> On 28. Feb 2019, at 11:22, Robert Newson  wrote:
>> 
>> Hi All,
>> 
>> We've not yet discussed attachments in terms of the foundationdb work so 
>> here's where we do that.
>> 
>> Today, CouchDB allows you to store large binary values, stored as a series 
>> of much smaller chunks. These "attachments" cannot be indexed, they can only 
>> be sent and received (you can fetch the whole thing or you can fetch 
>> arbitrary subsets of them).
>> 
>> On the FDB side, we have a few constraints. A transaction cannot be more 
>> than 10MB and cannot take more than 5 seconds.
>> 
>> Given that, there are a few paths to attachment support going forward;
>> 
>> 1) Drop native attachment support. 
>> 
>> I suspect this is not going to be a popular approach but it's worth hearing 
>> a range of views. Instead of direct attachment support, a user could store 
>> the URL to the large binary content and could simply fetch that URL directly.
>> 
>> 2) Write attachments into FDB but with limits.
>> 
>> The next simplest is to write the attachments into FDB as a series of 
>> key/value entries, where the key is {database_name, doc_id, attachment_name, 
>> 0..N} and the value is a short byte array (say, 16K to match current). The 
>> 0..N is just a counter such that we can do an fdb range get / iterator to 
>> retrieve the attachment. An embellishment would restore the http Range 
>> header options, if we still wanted that (disclaimer: I implemented the Range 
>> thing many years ago, I'm happy to drop support if no one really cares for 
>> it in 2019).
>> 
>> This would be subject to the 10mb and 5s limit, which is less than you _can_ 
>> do today with attachments but not, in my opinion, any less than people 
>> actually do (with some notable outliers like npm in the past).
>> 
>> 3) Full functionality
>> 
>> This would be the same as today. Attachments of arbitrary size (up to the 
>> disk capacity of the fdb cluster). It would require some extra cleverness to 
>> work over multiple txn transactions and in such a way that an aborted upload 
>> doesn't leave partially uploaded data in fdb forever. I have not sat down 
>> and designed this yet, hence I would very much like to hear from the 
>> community as to which of these paths are sufficient.
>> 
>> -- 
>> Robert Samuel Newson
>> rnew...@apache.org
> 
> -- 
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
> 



[DISCUSS] Attachment support in CouchDB with FDB

2019-02-28 Thread Robert Newson
Hi All,

We've not yet discussed attachments in terms of the foundationdb work so here's 
where we do that.

Today, CouchDB allows you to store large binary values, stored as a series of 
much smaller chunks. These "attachments" cannot be indexed, they can only be 
sent and received (you can fetch the whole thing or you can fetch arbitrary 
subsets of them).

On the FDB side, we have a few constraints. A transaction cannot be more than 
10MB and cannot take more than 5 seconds.

Given that, there are a few paths to attachment support going forward;

1) Drop native attachment support. 

I suspect this is not going to be a popular approach but it's worth hearing a 
range of views. Instead of direct attachment support, a user could store the 
URL to the large binary content and could simply fetch that URL directly.

2) Write attachments into FDB but with limits.

The next simplest is to write the attachments into FDB as a series of key/value 
entries, where the key is {database_name, doc_id, attachment_name, 0..N} and 
the value is a short byte array (say, 16K to match current). The 0..N is just a 
counter such that we can do an fdb range get / iterator to retrieve the 
attachment. An embellishment would restore the http Range header options, if we 
still wanted that (disclaimer: I implemented the Range thing many years ago, 
I'm happy to drop support if no one really cares for it in 2019).

This would be subject to the 10mb and 5s limit, which is less than you _can_ do 
today with attachments but not, in my opinion, any less than people actually do 
(with some notable outliers like npm in the past).
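
For option 2, the write path might be sketched roughly as below (assumed erlfdb-style bindings; the 16K chunk size and key layout follow the description above, and the 10MB/5s transaction limits bound how large Bin can be):

    -define(CHUNK_SIZE, 16384).

    %% Write an attachment as a run of numbered chunks inside one
    %% transaction, so the upload either commits in full or not at all.
    write_attachment(Db, DbName, DocId, AttName, Bin) ->
        erlfdb:transactional(Db, fun(Tx) ->
            write_chunks(Tx, DbName, DocId, AttName, Bin, 0)
        end).

    write_chunks(_Tx, _DbName, _DocId, _AttName, <<>>, N) ->
        {ok, N};
    write_chunks(Tx, DbName, DocId, AttName, Bin, N) ->
        Size = min(?CHUNK_SIZE, byte_size(Bin)),
        <<Chunk:Size/binary, Rest/binary>> = Bin,
        Key = erlfdb_tuple:pack({DbName, DocId, AttName, N}),
        erlfdb:set(Tx, Key, Chunk),
        write_chunks(Tx, DbName, DocId, AttName, Rest, N + 1).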

3) Full functionality

This would be the same as today. Attachments of arbitrary size (up to the disk 
capacity of the fdb cluster). It would require some extra cleverness to work 
over multiple txn transactions and in such a way that an aborted upload doesn't 
leave partially uploaded data in fdb forever. I have not sat down and designed 
this yet, hence I would very much like to hear from the community as to which 
of these paths are sufficient.

-- 
  Robert Samuel Newson
  rnew...@apache.org


Re: [DISCUSS] Per-doc access control

2019-02-27 Thread Robert Newson
Not to pile on here but "Once you read your available docs into your DB, you 
can grant yourself
write privileges to every document there." does seem to miss the mark.

All replication is doing is making a copy of data you have access to. You can 
modify your own copy as you please, it doesn't violate the security of the 
origin server. If that server allowed you to replicate those changes back, 
sure, but that is wholly within the origin server's control.

The notion that access controls could meaningfully propagate to servers outside 
of the original server's control seems very much like DRM with all its 
doesn't-work-without-litigation problems.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Wed, 27 Feb 2019, at 20:14, Adam Kocoloski wrote:
> 
> > On Feb 27, 2019, at 3:01 PM, Michael Fair  wrote:
> > 
> > On Wed, Feb 27, 2019 at 10:36 AM Adam Kocoloski  wrote:
> > 
> >> Hi Mike, just picking out this one snippet:
> >> 
> >>> On Feb 27, 2019, at 12:16 PM, Michael Fair 
> >> wrote:
> >>> 
> >>> If I get a replica of a database from your server, what, if anything,
> >>> prevents me from granting myself access controls to the entire database?
> >> 
> >> Replication is a client of the API like everyone else and cannot bypass
> >> the access controls on the source. You can only create a replication which
> >> has at most access to all the documents in the database that you can access
> >> yourself; i.e. a replication of a database with per-doc access controls
> >> enabled may only transfer a subset of the documents in the database.
> >> 
> > 
> > Right, but generally speaking READ access is very prevalent, WRITE access
> > is much more restrictive.
> > 
> > Or are these "access controls" really just the binary "exposed"/"not
> > exposed" sort such that had these documents simply gone into a different
> > database in the first place, and the view indexes tracked in a "per
> > database" way then everything would work as expected?
> > 
> > In fact, maybe "the same doc in multiple user centered databases" is not
> > such a bad idea/model to consider.
> > 
> > Currently, I see this idea in Couch that documents belong to a particular
> > "named document collection" called a database.  A view is really just
> > another kind of "named document collection".
> > 
> > What if instead of a "database", there was simply the single universal
> > document store and a "database" is then more like a "view" that documents
> > became a member of just like a user's access controlled "slice" would be?
> > (i.e. the dbname becomes more like of a document access filter and less a
> > grouping and storage boundary)
> > 
> > Users then have a "per user scope/database" which they always access
> > through when they connect to the server.
> > The top level list of "_databases" the user sees is actually just a list of
> > those "named collections" that the user can access.  To the client side
> > APIs nothing really changes for them, what changes is how Couch internally
> > organizes itself to make "databases" a construct of the access control
> > feature and no longer a structural primitive.
> > 
> > Documents now carry the information about what "named collections" they
> > belong to in the same way as what entities are authorized to access them,
> > and "databases" basically become a "view" grouping by the "_collections"
> > array field member values on each document.
> > 
> > 
> > Many people have requested the option to expose the same document to
> > multiple databases and then use database access rights to enforce per user
> > access and share documents between users.  I've even wanted this feature
> > from time to time.  If views have to be modified to hide unauthorized
> > documents anyway, this seems a perfect opportunity to address this at the
> > same time...
> > 
> > 
> > .
> > In this model, a "document collection", whether it be a database, a user, a
> > user group, a role, or a role group becomes the entity that the access
> > control documents are "authorizing".
> > 
> > In this model, a "view" becomes an identifiable "collection" that can be
> > treated like a database.
> > Which would make creating views on top of other views becomes something
> > much easier to define/express.
> > 
> > 
> > I'm envisioning that to implement successful "access control" based views,
> > each "authorized entity" would have to maintain its own view index.
> > Otherwise it's really hard for me to imagine how a reduce function can
> > cache any precomputed results because it has no way of knowing that all the
> > underlying documents used in the reduce were authorized for the accessing
> > entity...  All reduce functions would have to run in real time to ensure it
> > is only using authorized documents...
> > 
> > If there were a separate index for each authorized entity though (and
> > especially if these entities were able to share index buckets where they
> > had all the same document access in common), then the reduce 

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-19 Thread Robert Newson
Good points on revtree, I agree with you we should store that intelligently to 
gain the benefits you mentioned.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 19 Feb 2019, at 18:41, Adam Kocoloski wrote:
> I do not think we should store the revtree as a blob. The design where 
> each edit branch is its own KV should save on network IO and CPU cycles 
> for normal updates. We’ve performed too many heroics to keep 
> couch_key_tree from stalling entire databases when trying to update a 
> single document with a wide revision tree, I would much prefer to ignore 
> other edit branches entirely when all we’re doing is extending one of 
> them.
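
For illustration, the "each edit branch is its own KV" idea might look roughly like this (assumed erlfdb-style bindings; the key layout is made up, not the one from the RFC):

    %% Extending one edit branch touches only that branch's keys: clear the
    %% old leaf entry, write the new one, and never read or rewrite the
    %% sibling branches.
    extend_branch(Tx, DbPrefix, DocId, {OldPos, OldHash}, {NewPos, NewHash}, BranchVal) ->
        OldKey = erlfdb_tuple:pack({<<"revisions">>, DocId, OldPos, OldHash}, DbPrefix),
        NewKey = erlfdb_tuple:pack({<<"revisions">>, DocId, NewPos, NewHash}, DbPrefix),
        erlfdb:clear(Tx, OldKey),
        erlfdb:set(Tx, NewKey, BranchVal).
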
> 
> I also do not think we should store JSON documents as blobs, but it’s a 
> closer call. Some of my reasoning for preferring the exploded path 
> design:
> 
> - it lends itself nicely to sub-document operations, for which Jan 
> crafted an RFC last year: https://github.com/apache/couchdb/issues/1559
> - it optimizes the creation of Mango indexes on existing databases since 
> we only need to retrieve the value(s) we want to index
> - it optimizes Mango queries that use field selectors
> - anyone who wanted to try their hand at GraphQL will find it very 
> handy: https://github.com/apache/couchdb/issues/1499
> - looking further ahead, it lets us play with smarter leaf value types 
> like Counters (yes I’m still on the CRDT bandwagon, sorry)
> 
> A few comments on the thread:
> 
> >>> * Most documents bodies are probably going to be smaller than 100k. So in
> >>> the majority of case it would be one write / one read to update and fetch
> >>> the document body.
> 
> We should test, but I expect reading 50KB of data in a range query is 
> almost as efficient as reading a single 50 KB value. Similarly, writes 
> to a contiguous set of keys should be quite efficient.
> 
> I am concerned about the overhead of the repeated field paths in the 
> keys with the exploded path option in the absence of key prefix 
> compression. That would be my main reason to acquiesce and throw away 
> all the document structure.
> 
> Adam
> 
> > On Feb 19, 2019, at 12:04 PM, Robert Newson  wrote:
> > 
> > I like the idea that we'd reuse the same pattern (but perhaps not the same 
> > _code_) for doc bodies, revtree and attachments.
> > 
> > I hope we still get to delete couch_key_tree.erl, though.
> > 
> > -- 
> >  Robert Samuel Newson
> >  rnew...@apache.org
> > 
> > On Tue, 19 Feb 2019, at 17:03, Jan Lehnardt wrote:
> >> I like the idea from a “trying a simple thing first” perspective, but 
> >> Nick’s points below are especially convincing to with this for now.
> >> 
> >> Best
> >> Jan
> >> —
> >> 
> >>> On 19. Feb 2019, at 17:53, Nick Vatamaniuc  wrote:
> >>> 
> >>> Hi,
> >>> 
> >>> Sorry for jumping in so late, I was following from the sidelines mostly. A
> >>> lot of good discussion happening and am excited about the possibilities
> >>> here.
> >>> 
> >>> I do like the simpler "chunking" approach for a few reasons:
> >>> 
> >>> * Most documents bodies are probably going to be smaller than 100k. So in
> >>> the majority of case it would be one write / one read to update and fetch
> >>> the document body.
> >>> 
> >>> * We could reuse the chunking code for attachment handling and possibly
> >>> revision key trees. So it's the general pattern of upload chunks to some
> >>> prefix, and when finished flip an atomic toggle to make it current.
> >>> 
> >>> * Do the same thing with revision trees and we could re-use the revision
> >>> tree manipulation logic. That is, the key tree in most cases would be 
> >>> small
> >>> enough to fit in 100k but if they get huge, they'd get chunked. This would
> >>> allow us to reuse all the battle tested couch_key_tree code mostly as is.
> >>> We even have property tests for it
> >>> https://github.com/apache/couchdb/blob/master/src/couch/test/couch_key_tree_prop_tests.erl
> >>> 
> >>> * It removes the need to explain the max exploded path length limitation 
> >>> to
> >>> customers.
> >>> 
> >>> Cheers,
> >>> -Nick
> >>> 
> >>> 
> >>> On Tue, Feb 19, 2019 at 11:18 AM Robert Newson  wrote:
> >>> 
> >>>> Hi,
> >>>> 
> >>>> An alternative storage model that we should seriously conside

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-19 Thread Robert Newson
I like the idea that we'd reuse the same pattern (but perhaps not the same 
_code_) for doc bodies, revtree and attachments.

I hope we still get to delete couch_key_tree.erl, though.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 19 Feb 2019, at 17:03, Jan Lehnardt wrote:
> I like the idea from a “trying a simple thing first” perspective, but 
> Nick’s points below are especially convincing to with this for now.
> 
> Best
> Jan
> —
> 
> > On 19. Feb 2019, at 17:53, Nick Vatamaniuc  wrote:
> > 
> > Hi,
> > 
> > Sorry for jumping in so late, I was following from the sidelines mostly. A
> > lot of good discussion happening and am excited about the possibilities
> > here.
> > 
> > I do like the simpler "chunking" approach for a few reasons:
> > 
> > * Most documents bodies are probably going to be smaller than 100k. So in
> > the majority of case it would be one write / one read to update and fetch
> > the document body.
> > 
> > * We could reuse the chunking code for attachment handling and possibly
> > revision key trees. So it's the general pattern of upload chunks to some
> > prefix, and when finished flip an atomic toggle to make it current.
> > 
> > * Do the same thing with revision trees and we could re-use the revision
> > tree manipulation logic. That is, the key tree in most cases would be small
> > enough to fit in 100k but if they get huge, they'd get chunked. This would
> > allow us to reuse all the battle tested couch_key_tree code mostly as is.
> > We even have property tests for it
> > https://github.com/apache/couchdb/blob/master/src/couch/test/couch_key_tree_prop_tests.erl
> > 
> > * It removes the need to explain the max exploded path length limitation to
> > customers.
> > 
> > Cheers,
> > -Nick
> > 
> > 
> > On Tue, Feb 19, 2019 at 11:18 AM Robert Newson  wrote:
> > 
> >> Hi,
> >> 
> >> An alternative storage model that we should seriously consider is to
> >> follow our current approach in couch_file et al. Specifically, that the
> >> document _body_ is stored as an uninterpreted binary value. This would be
> >> much like the obvious plan for attachment storage; a key prefix that
> >> identifies the database and document, with the final item of that key tuple
> >> being an incrementing integer. Each of those keys has a binary value of up to
> >> 100k. Fetching all values with that key prefix, in fdb's natural ordering,
> >> will yield the full document body, which can be JSON decoded for further
> >> processing.
> >> 
> >> I like this idea, and I like Adam's original proposal to explode documents
> >> into property paths. I have a slight preference for the simplicity of the
> >> idea in the previous paragraph, not least because it's close to what we do
> >> today. I also think it will be possible to migrate to alternative storage
> >> models in future, and foundationdb's transaction support means we can do
> >> this migration seamlessly should we come to it.
> >> 
> >> I'm very interested in knowing if anyone else is interested in going this
> >> simple, or considers it a wasted opportunity relative to the 'exploded'
> >> path.
> >> 
> >> B.
> >> 
> >> --
> >>  Robert Samuel Newson
> >>  rnew...@apache.org
> >> 
> >> On Mon, 4 Feb 2019, at 19:59, Robert Newson wrote:
> >>> I've been remiss here in not posting the data model ideas that IBM
> >>> worked up while we were thinking about using FoundationDB so I'm posting
> >>> it now. This is Adam' Kocoloski's original work, I am just transcribing
> >>> it, and this is the context that the folks from the IBM side came in
> >>> with, for full disclosure.
> >>> 
> >>> Basics
> >>> 
> >>> 1. All CouchDB databases are inside a Directory
> >>> 2. Each CouchDB database is a Directory within that Directory
> >>> 3. It's possible to list all subdirectories of a Directory, so
> >>> `_all_dbs` is the list of directories from 1.
> >>> 4. Each Directory representing a CouchdB database has several Subspaces;
> >>> 4a. by_id/ doc subspace: actual document contents
> >>> 4b. by_seq/versionstamp subspace: for the _changes feed
> >>> 4c. index_definitions, indexes, ...
> >>> 
> >>> JSON Mapping
> >>> 
> >>> A hierarchical JSON object naturally maps to multiple KV pairs in FDB:
> >>> 
> >>>

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-19 Thread Robert Newson
Hi,

An alternative storage model that we should seriously consider is to follow our 
current approach in couch_file et al. Specifically, that the document _body_ is 
stored as an uninterpreted binary value. This would be much like the obvious 
plan for attachment storage; a key prefix that identifies the database and 
document, with the final item of that key tuple being an incrementing integer. 
Each of those keys has a binary value of up to 100k. Fetching all values with 
that key prefix, in fdb's natural ordering, will yield the full document body, 
which can be JSON decoded for further processing.
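
The read path for that model is a single range read plus a JSON decode, sketched here with assumed erlfdb-style bindings and jiffy (the JSON library CouchDB already uses); the <<"body">> subspace name is illustrative:

    %% Reassemble a document body stored as numbered binary chunks under a
    %% single (DocId, <<"body">>) prefix and decode it as JSON.
    read_doc_body(Tx, DbPrefix, DocId) ->
        {Start, End} = erlfdb_tuple:range({DocId, <<"body">>}, DbPrefix),
        Chunks = [Val || {_Key, Val} <- erlfdb:get_range(Tx, Start, End)],
        jiffy:decode(iolist_to_binary(Chunks)).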

I like this idea, and I like Adam's original proposal to explode documents into 
property paths. I have a slight preference for the simplicity of the idea in 
the previous paragraph, not least because it's close to what we do today. I 
also think it will be possible to migrate to alternative storage models in 
future, and foundationdb's transaction support means we can do this migration 
seamlessly should we come to it.

I'm very interested in knowing if anyone else is interested in going this 
simple, or considers it a wasted opportunity relative to the 'exploded' path.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 4 Feb 2019, at 19:59, Robert Newson wrote:
> I've been remiss here in not posting the data model ideas that IBM 
> worked up while we were thinking about using FoundationDB so I'm posting 
> it now. This is Adam' Kocoloski's original work, I am just transcribing 
> it, and this is the context that the folks from the IBM side came in 
> with, for full disclosure.
> 
> Basics
> 
> 1. All CouchDB databases are inside a Directory
> 2. Each CouchDB database is a Directory within that Directory
> 3. It's possible to list all subdirectories of a Directory, so 
> `_all_dbs` is the list of directories from 1.
> 4. Each Directory representing a CouchdB database has several Subspaces;
> 4a. by_id/ doc subspace: actual document contents 
> 4b. by_seq/versionstamp subspace: for the _changes feed 
> 4c. index_definitions, indexes, ...
> 
> JSON Mapping
> 
> A hierarchical JSON object naturally maps to multiple KV pairs in FDB:
> 
> { 
> “_id”: “foo”, 
> “owner”: “bob”, 
> “mylist”: [1,3,5], 
> “mymap”: { 
> “blue”: “#FF”, 
> “red”: “#FF” 
> } 
> }
> 
> maps to
> 
> (“foo”, “owner”) = “bob” 
> (“foo”, “mylist”, 0) = 1 
> (“foo”, “mylist”, 1) = 3 
> (“foo”, “mylist”, 2) = 5 
> (“foo”, “mymap”, “blue”) = “#FF” 
> (“foo”, “mymap”, “red”) = “#FF”
> 
> NB: this means that the 100KB limit applies to individual leafs in the 
> JSON object, not the entire doc
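
A small sketch of that mapping, flattening an EJSON-style value into (path, leaf) pairs like the example above (illustrative only; conflicts, revisions and the 100KB leaf limit are ignored):

    %% explode(<<"foo">>, {[{<<"owner">>, <<"bob">>}, {<<"mylist">>, [1,3,5]}]})
    %% -> [{{<<"foo">>, <<"owner">>}, <<"bob">>},
    %%     {{<<"foo">>, <<"mylist">>, 0}, 1}, ...]
    explode(DocId, Body) ->
        explode_value([DocId], Body).

    explode_value(Path, {Props}) when is_list(Props) ->
        lists:append([explode_value(Path ++ [K], V) || {K, V} <- Props]);
    explode_value(Path, List) when is_list(List) ->
        Indexed = lists:zip(lists:seq(0, length(List) - 1), List),
        lists:append([explode_value(Path ++ [I], V) || {I, V} <- Indexed]);
    explode_value(Path, Leaf) ->
        [{list_to_tuple(Path), Leaf}].
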
> 
> Edit Conflicts
> 
> We need to account for the presence of conflicts in various levels of 
> the doc due to replication.
> 
> Proposal is to create a special value indicating that the subtree below 
> our current cursor position is in an unresolvable conflict. Then add 
> additional KV pairs below to describe the conflicting entries.
> 
> KV data model allows us to store these efficiently and minimize 
> duplication of data:
> 
> A document with these two conflicts:
> 
> { 
> “_id”: “foo”, 
> “_rev”: “1-abc”, 
> “owner”: “alice”, 
> “active”: true 
> }
> { 
> “_id”: “foo”, 
> “_rev”: “1-def”, 
> “owner”: “bob”, 
> “active”: true 
> }
> 
> could be stored thus:
> 
> (“foo”, “active”) = true 
> (“foo”, “owner”) = kCONFLICT 
> (“foo”, “owner”, “1-abc”) = “alice” 
> (“foo”, “owner”, “1-def”) = “bob”
> 
> So long as `kCONFLICT` is set at the top of the conflicting subtree this 
> representation can handle conflicts of different data types as well.
> 
> Missing fields need to be handled explicitly:
> 
> { 
>   “_id”: “foo”, 
>   “_rev”: “1-abc”, 
>   “owner”: “alice”, 
>   “active”: true 
> }
> 
> { 
>   “_id”: “foo”, 
>   “_rev”: “1-def”, 
>   “owner”: { 
> “name”: “bob”, 
> “email”: “b...@example.com” 
>   } 
> }
> 
> could be stored thus:
> 
> (“foo”, “active”) = kCONFLICT 
> (“foo”, “active”, “1-abc”) = true 
> (“foo”, “active”, “1-def”) = kMISSING 
> (“foo”, “owner”) = kCONFLICT 
> (“foo”, “owner”, “1-abc”) = “alice” 
> (“foo”, “owner”, “1-def”, “name”) = “bob” 
> (“foo”, “owner”, “1-def”, “email”) = ...
> 
> Revision Metadata
> 
> * CouchDB uses a hash history for revisions 
> ** Each edit is identified by the hash of the content of the edit 
> including the base revision against which it was applied 
> ** Individual edit branches are bounded in length but the number of 
> branches is potentially unbounded 
> 
> * Size limits preclude us from storing the entire key tree as a single 
>

Re: [VOTE] Release Apache CouchDB 2.3.1 RC2

2019-02-19 Thread Robert Newson
now it exists and is the right value :)

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 19 Feb 2019, at 14:30, Robert Newson wrote:
> the sha256 file exists but is zero bytes long.
> 
> -- 
>   Robert Samuel Newson
>   rnew...@apache.org
> 
> On Tue, 19 Feb 2019, at 13:03, Jan Lehnardt wrote:
> > 
> > 
> > > On 19. Feb 2019, at 13:14, Robert Newson  wrote:
> > > 
> > > sha512 checksum - verified
> > > sha256 checksum - missing
> > 
> > musta been a case of the eventual consistencies, the file should be there 
> > now.
> > 
> > > signature - good
> > > 
> > > eunit tests - pass
> > > js tests - crashes at delayed_commits.js with
> > > 
> > > "test/javascript/tests/delayed_commits.js
> > >Error: Failed to execute HTTP request: Failed to connect to 127.0.0.1 
> > > port 15984: Connection refused”
> > 
> > Funny this one, I thought this was local to my setup so I had no one else 
> > complain about it. AFAICT it’s a mac only issue that I went see-no-evil-
> > monkey on since I couldn’t be bothered to have more than a cursory look, 
> > which I had and nothing stood out. Personally, I’m treating this as 
> > “wontfix, wait for elixir suite”. Happy to take it out of the list of 
> > tests until then, but shouldn’t IMHO block the release.
> > 
> > Best
> > Jan
> > —
> > 
> > 
> > > 
> > > 
> > > -- 
> > >  Robert Samuel Newson
> > >  rnew...@apache.org
> > > 
> > > On Tue, 19 Feb 2019, at 11:20, Jan Lehnardt wrote:
> > >> Convenience Mac binary is up here: 
> > >> https://dist.apache.org/repos/dist/dev/couchdb/binary/mac/2.3.1/rc.2/
> > >> 
> > >> Best
> > >> Jan
> > >> —
> > >>> On 19. Feb 2019, at 12:13, Jan Lehnardt  wrote:
> > >>> 
> > >>> Dear community,
> > >>> 
> > >>> I would like to propose that we release Apache CouchDB 2.3.1-RC2.
> > >>> 
> > >>> Candidate release notes:
> > >>> 
> > >>> https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> > >>> 
> > >>> Changes since last time:
> > >>> 
> > >>> - built rebar with Erlang 17.4.1 (h/t Nick)
> > >>> 
> > >>> We encourage the whole community to download and test these release 
> > >>> artefacts so that any critical issues can be resolved before the 
> > >>> release is made. Everyone is free to vote on this release, so dig right 
> > >>> in!
> > >>> 
> > >>> The release artefacts we are voting on are available here:
> > >>> 
> > >>>  https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.2/
> > >>> 
> > >>> There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> > >>> checksums.
> > >>> 
> > >>> Please follow the test procedure here:
> > >>> 
> > >>>  
> > >>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> > >>> 
> > >>> Please remember that "RC2" is an annotation. If the vote passes, these 
> > >>> artefacts will be released as Apache CouchDB 2.3.1.
> > >>> 
> > >>> Please cast your votes now.
> > >>> 
> > >>> Thanks,
> > >>> Jan
> > >>> —
> > >>> 
> > >>>> On 17. Feb 2019, at 19:47, Jan Lehnardt  wrote:
> > >>>> 
> > >>>> Dear community,
> > >>>> 
> > >>>> I would like to propose that we release Apache CouchDB 2.3.1-RC1.
> > >>>> 
> > >>>> Candidate release notes:
> > >>>> 
> > >>>> https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> > >>>> 
> > >>>> We encourage the whole community to download and test these release 
> > >>>> artefacts so that any critical issues can be resolved before the 
> > >>>> release is made. Everyone is free to vote on this release, so dig 
> > >>>> right in!
> > >>>> 
> > >>>> The release artefacts we are voting on are available here:
> > >>>> 
> > >>>>  https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.1/
> > >>>> 
> > >>>> There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> > >>>> checksums.
> > >>>> 
> > >>>> Please follow the test procedure here:
> > >>>> 
> > >>>>  
> > >>>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> > >>>> 
> > >>>> Please remember that "RC1" is an annotation. If the vote passes, these 
> > >>>> artefacts will be released as Apache CouchDB 2.3.1.
> > >>>> 
> > >>>> Please cast your votes now.
> > >>>> 
> > >>>> Thanks,
> > >>>> Jan
> > >>>> —
> > >>> 
> > >>> -- 
> > >>> Professional Support for Apache CouchDB:
> > >>> https://neighbourhood.ie/couchdb-support/
> > >>> 
> > >> 
> > >> -- 
> > >> Professional Support for Apache CouchDB:
> > >> https://neighbourhood.ie/couchdb-support/
> > >> 
> > 
> > -- 
> > Professional Support for Apache CouchDB:
> > https://neighbourhood.ie/couchdb-support/
> > 


Re: [VOTE] Release Apache CouchDB 2.3.1 RC2

2019-02-19 Thread Robert Newson
the sha256 file exists but is zero bytes long.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 19 Feb 2019, at 13:03, Jan Lehnardt wrote:
> 
> 
> > On 19. Feb 2019, at 13:14, Robert Newson  wrote:
> > 
> > sha512 checksum - verified
> > sha256 checksum - missing
> 
> musta been a case of the eventual consistencies, the file should be there now.
> 
> > signature - good
> > 
> > eunit tests - pass
> > js tests - crashes at delayed_commits.js with
> > 
> > "test/javascript/tests/delayed_commits.js
> >Error: Failed to execute HTTP request: Failed to connect to 127.0.0.1 
> > port 15984: Connection refused”
> 
> Funny this one, I thought this was local to my setup so I had no one else 
> complain about it. AFAICT it’s a mac only issue that I went see-no-evil-
> monkey on since I couldn’t be bothered to have more than a cursory look, 
> which I had and nothing stood out. Personally, I’m treating this as 
> “wontfix, wait for elixir suite”. Happy to take it out of the list of 
> tests until then, but shouldn’t IMHO block the release.
> 
> Best
> Jan
> —
> 
> 
> > 
> > 
> > -- 
> >  Robert Samuel Newson
> >  rnew...@apache.org
> > 
> > On Tue, 19 Feb 2019, at 11:20, Jan Lehnardt wrote:
> >> Convenience Mac binary is up here: 
> >> https://dist.apache.org/repos/dist/dev/couchdb/binary/mac/2.3.1/rc.2/
> >> 
> >> Best
> >> Jan
> >> —
> >>> On 19. Feb 2019, at 12:13, Jan Lehnardt  wrote:
> >>> 
> >>> Dear community,
> >>> 
> >>> I would like to propose that we release Apache CouchDB 2.3.1-RC2.
> >>> 
> >>> Candidate release notes:
> >>> 
> >>> https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> >>> 
> >>> Changes since last time:
> >>> 
> >>> - built rebar with Erlang 17.4.1 (h/t Nick)
> >>> 
> >>> We encourage the whole community to download and test these release 
> >>> artefacts so that any critical issues can be resolved before the release 
> >>> is made. Everyone is free to vote on this release, so dig right in!
> >>> 
> >>> The release artefacts we are voting on are available here:
> >>> 
> >>>  https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.2/
> >>> 
> >>> There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> >>> checksums.
> >>> 
> >>> Please follow the test procedure here:
> >>> 
> >>>  
> >>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> >>> 
> >>> Please remember that "RC2" is an annotation. If the vote passes, these 
> >>> artefacts will be released as Apache CouchDB 2.3.1.
> >>> 
> >>> Please cast your votes now.
> >>> 
> >>> Thanks,
> >>> Jan
> >>> —
> >>> 
> >>>> On 17. Feb 2019, at 19:47, Jan Lehnardt  wrote:
> >>>> 
> >>>> Dear community,
> >>>> 
> >>>> I would like to propose that we release Apache CouchDB 2.3.1-RC1.
> >>>> 
> >>>> Candidate release notes:
> >>>> 
> >>>> https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> >>>> 
> >>>> We encourage the whole community to download and test these release 
> >>>> artefacts so that any critical issues can be resolved before the release 
> >>>> is made. Everyone is free to vote on this release, so dig right in!
> >>>> 
> >>>> The release artefacts we are voting on are available here:
> >>>> 
> >>>>  https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.1/
> >>>> 
> >>>> There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> >>>> checksums.
> >>>> 
> >>>> Please follow the test procedure here:
> >>>> 
> >>>>  
> >>>> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> >>>> 
> >>>> Please remember that "RC1" is an annotation. If the vote passes, these 
> >>>> artefacts will be released as Apache CouchDB 2.3.1.
> >>>> 
> >>>> Please cast your votes now.
> >>>> 
> >>>> Thanks,
> >>>> Jan
> >>>> —
> >>> 
> >>> -- 
> >>> Professional Support for Apache CouchDB:
> >>> https://neighbourhood.ie/couchdb-support/
> >>> 
> >> 
> >> -- 
> >> Professional Support for Apache CouchDB:
> >> https://neighbourhood.ie/couchdb-support/
> >> 
> 
> -- 
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
> 


Re: [VOTE] Release Apache CouchDB 2.3.1 RC2

2019-02-19 Thread Robert Newson
sha512 checksum - verified
sha256 checksum - missing
signature - good

eunit tests - pass
js tests - crashes at delayed_commits.js with

"test/javascript/tests/delayed_commits.js
Error: Failed to execute HTTP request: Failed to connect to 127.0.0.1 port 
15984: Connection refused"


-- 
  Robert Samuel Newson
  rnew...@apache.org

On Tue, 19 Feb 2019, at 11:20, Jan Lehnardt wrote:
> Convenience Mac binary is up here: 
> https://dist.apache.org/repos/dist/dev/couchdb/binary/mac/2.3.1/rc.2/
> 
> Best
> Jan
> —
> > On 19. Feb 2019, at 12:13, Jan Lehnardt  wrote:
> > 
> > Dear community,
> > 
> > I would like to propose that we release Apache CouchDB 2.3.1-RC2.
> > 
> > Candidate release notes:
> > 
> >  https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> > 
> > Changes since last time:
> > 
> > - built rebar with Erlang 17.4.1 (h/t Nick)
> > 
> > We encourage the whole community to download and test these release 
> > artefacts so that any critical issues can be resolved before the release is 
> > made. Everyone is free to vote on this release, so dig right in!
> > 
> > The release artefacts we are voting on are available here:
> > 
> >   https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.2/
> > 
> > There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> > checksums.
> > 
> > Please follow the test procedure here:
> > 
> >   
> > https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> > 
> > Please remember that "RC2" is an annotation. If the vote passes, these 
> > artefacts will be released as Apache CouchDB 2.3.1.
> > 
> > Please cast your votes now.
> > 
> > Thanks,
> > Jan
> > —
> > 
> >> On 17. Feb 2019, at 19:47, Jan Lehnardt  wrote:
> >> 
> >> Dear community,
> >> 
> >> I would like to propose that we release Apache CouchDB 2.3.1-RC1.
> >> 
> >> Candidate release notes:
> >> 
> >>  https://docs.couchdb.org/en/2.3.1/whatsnew/2.3.html#version-2-3-1
> >> 
> >> We encourage the whole community to download and test these release 
> >> artefacts so that any critical issues can be resolved before the release 
> >> is made. Everyone is free to vote on this release, so dig right in!
> >> 
> >> The release artefacts we are voting on are available here:
> >> 
> >>   https://dist.apache.org/repos/dist/dev/couchdb/source/2.3.1/rc.1/
> >> 
> >> There, you will find a tarball, a GPG signature, and SHA256/SHA512 
> >> checksums.
> >> 
> >> Please follow the test procedure here:
> >> 
> >>   
> >> https://cwiki.apache.org/confluence/display/COUCHDB/Testing+a+Source+Release
> >> 
> >> Please remember that "RC1" is an annotation. If the vote passes, these 
> >> artefacts will be released as Apache CouchDB 2.3.1.
> >> 
> >> Please cast your votes now.
> >> 
> >> Thanks,
> >> Jan
> >> —
> > 
> > -- 
> > Professional Support for Apache CouchDB:
> > https://neighbourhood.ie/couchdb-support/
> > 
> 
> -- 
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
> 


Re: [DISCUSS] Proposed Bylaws changes

2019-02-15 Thread Robert Newson
https://apache.org/foundation/how-it-works.html#hats

INDIVIDUALS COMPOSE THE ASF
All of the ASF including the board, the other officers, the committers, and the 
members, are participating as individuals. That is one strength of the ASF, 
affiliations do not cloud the personal contributions.

Unless they specifically state otherwise, whatever they post on any mailing 
list is done as themselves. It is the individual point-of-view, wearing their 
personal hat and not as a mouthpiece for whatever company happens to be signing 
their paychecks right now, and not even as a director of the ASF.

All of those ASF people implicitly have multiple hats, especially the Board, 
the other officers, and the PMC chairs. They sometimes need to talk about a 
matter of policy, so to avoid appearing to be expressing a personal opinion, 
they will state that they are talking in their special capacity. However, most 
of the time this is not necessary, personal opinions work well.

Some people declare their hats by using a special footer to their email, others 
enclose their statements in special quotation marks, others use their 
apache.org email address when otherwise they would use their personal one. This 
latter method is not reliable, as many people use their apache.org address all 
of the time.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 15 Feb 2019, at 18:47, Joan Touzet wrote:
> Garren,
> 
> RFCs are intended for major changes to our projects, not for minor
> improvments.
> 
> Do you foresee massive changes to nano and fauxton?
> 
> Do you not see that a single employer driving ~all the development
> of either or both of these as a significant concern re: the health
> of our community?
> 
> -Joan
> 
> - Original Message -
> > From: "Garren Smith" 
> > To: "priv...@couchdb.apache.org Private" , 
> > "Joan Touzet" 
> > Cc: "CouchDB Developers" 
> > Sent: Friday, February 15, 2019 2:56:04 AM
> > Subject: Re: [DISCUSS] Proposed Bylaws changes
> > 
> > I'm also not super keen on the "not directly affiliated with the
> > proposer's
> > employer”. I think this will put unnecessary strain on the community.
> > Take
> > the Fauxton and Nano.js project.  The majority of work on those
> > projects
> > come from IBM affiliated developers. We do have a smaller group of
> > community developers. That small group of community developers would
> > have
> > to review all RFC's and approve them and ideally not hold up
> > development on
> > a feature for a few weeks while they try and find time to get to it.
> > 
> > On Fri, Feb 15, 2019 at 12:49 AM Joan Touzet 
> > wrote:
> > 
> > > Hi,
> > >
> > > Thanks. I'll make another attempt to sway others, and I'd like to
> > > hear
> > > from more people on this thread.
> > >
> > > I don't see the harm in this, it would rarely if ever be invoked,
> > > and
> > > it allows us to point to a concrete, solid action we have taken to
> > > ensure we don't have a runaway project in the future. I would think
> > > it could be a guiding light for other ASF projects that have lost
> > > their
> > > way (where we, I continue to assert, have not).
> > >
> > > Remember that votes on RFCs are the *committer* community, not the
> > > PMC.
> > > I'd be shocked if the PMC remained entirely silent on a proposal,
> > > but
> > > it indeed could be possible that committers could get an RFC
> > > together
> > > "while the PMC isn't looking" (say, over a holiday). Granted it'd
> > > be in
> > > bad form, and the PMC could still take steps to correct things
> > > after
> > > the action,  but it'd be annoying to deal with.
> > >
> > > Again all I am trying to do here is put in a limiter in case the
> > > PMC
> > > and committer base /were/ to get stacked against the community. If
> > > that
> > > were to occur, your argument that the PMC could step in at that
> > > point
> > > is moot, because the PMC would already be stacked in that
> > > direction.
> > > This would protect the community from the negative effects of that
> > > happening.
> > >
> > > -Joan
> > >
> > >
> > >
> > > - Original Message -
> > > > From: "Robert Samuel Newson" 
> > > > To: "Joan Touzet" 
> > > > Cc: "CouchDB Developers" , "CouchDB PMC"
> > > > <
> > > priv...@couchdb.apache.org>
> > > > Sent: Thursday, February 14, 2019 4:46:35 PM
> > > > Subject: Re: [DISCUSS] Proposed Bylaws changes
> > > >
> > > > Hi,
> > > >
> > > > Sure.
> > > >
> > > > Any member of the PMC who is railroading changes through on
> > > > behalf of
> > > > their employer to the detriment of this project should be
> > > > disciplined, ultimately losing their PMC membership (and their
> > > > binding vote on future changes).
> > > >
> > > > The "not directly affiliated with proposer's employer” seems to
> > > > presume bad faith on the part of some of those with binding votes
> > > > at
> > > > worst, and, at best, is stating that the PMC already distrusts
> > > > its
> > > > members that happen to be employed by IBM. If that is currently
> 

Re: [DISCUSSION] Proposed new RFC process

2019-02-12 Thread Robert Newson
Hi,

I like the idea of RFCs and agree with Joan that they should help with the 
actual (and perceived) gaps in cooperation from large corporate vendors.

I would like to see a mandatory "Security Considerations" section added to the 
template. Not every RFC will have anything to say on the matter, but this 
should be asserted explicitly.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 11 Feb 2019, at 17:22, Joan Touzet wrote:
> Hi Paul,
> 
> As you know, I try my hardest to post well-researched comments to this
> mailing list, and this time I fell short of that. Please accept my
> apologies. Let me try and re-frame the problem, and respond to your
> criticisms.
> 
> My point is: we need more public design discussions and review, and we
> need those discussions to have a logical conclusion. I think the RFC,
> coupled with more traffic on dev@, is the answer to that.
> 
> That said, I counted the number of comments on those 4 PRs from the
> general public:
> 
>   * Clustered purge PR #1370 has 0 non-Cloudant comments on it.
> 
>   * PSE PR #496 has one comment from me asking you to write
> documentation (that I don't think landed). That's the only
> non-Cloudant post.
> 
>   * Replicator scheduler PR #470 has a number of community
> comments on it that resulted in a higher quality PR.
> 
>   ...and I'm not going to even attempt to recap the BigCouch
>   mess, but a lot of non-Cloudant people were involved.
> 
> So 50% of the PRs were developed in the open, but they might as well
> have happened on an IBM private repo. That's unfortunate.
> 
> There are a number of possible, valid explanations for why these PRs
> were so unengaging, in my view. It may be a natural reflection of the
> fact that those are the only people who are paid to take an interest in
> the code. Or it may be that the PRs themselves are not as discoverable
> as posts to the mailing list. Perhaps it's because big PRs are
> intimidating and difficult to interpret to those who don't live and
> breathe the CouchDB code base daily. I was wrong to say that there was
> just one reason why this is the case.
> 
> But I don't think I am wrong to point out that something smells wrong
> when features land without community comment on either the design or the
> code itself. I do think it's fair to say that the mailing list
> discussions for these features were minimal as compared to the
> discussions that happened in the PRs, regardless of participant. (Your
> PSE dev@ post got no responses, for instance. Maybe it's a bad example,
> being a somewhat esoteric feature.)
> 
> Recent traffic on FDB and resharding proves to me that the ML is still a
> valid venue to discuss proposals, and that these proposals are getting
> better as a result of those things. The RFC is intended to be a cap to
> those discussions, just a slightly more ritualised way of voting on the
> discussion and writing up the result.
> 
> As to the PR side of things, because PRs go to notifications@, they are
> largely ignored by the dev@ community. Subscribing to all of the GitHub
> emails from all of the CouchDB repos is overwhelming. Even if you were
> to filter that only to new PRs and forward them to dev@ somehow, it's
> still a lot of emails to wade through, so I'm not sure that's a solution
> to the problem. PRs that reference an RFC, though, could be the "happy
> medium" that we need, and again a simple bot could help here.
> 
> As a PMC member, I feel it is my responsibility to try and steer more of
> our community into these discussions, so that the best possible solution
> can be reached. It's less about "Cloudant vs. non-Cloudant" and more
> about serving the needs of our developer and user base.
> 
> (In fact, none of the feature proposals in this thread said anything to
> the user@ mailing list - where we might have reached even more people
> who could have informed the design phase of the work. Something to
> consider.)
> 
> > Yes these were big PRs, and yes they took a long time to review. But
> > there was plenty of time for anyone to do that review (and there were
> > a number of non Cloudant people involved in these listed).
> 
> Being open for a long time, and helping people through reading the PR
> are very different things. Again, not until recently did these PRs
> start including top-level READMEs that helped people understand the code
> involved. Nick's README on the replicator scheduler is a great example
> of something very positive:
> 
> https://github.com/apache/couchdb/pull/470/files#diff-a3be920760d32aca56cc1d2b838d07ef
> 
> I feel the RFC could be the initial README.md, which would then be
> supplemented by a short intro to how the code is written and actually
> works. But one thing at a time ;)
> 
> > While I'm not sure about prototyping, I do think RFCs would help solve
> > this problem. It definitely helps to know what the reason a PR even
> > exists and maybe why various other approaches were discarded before
> > starting to 

Re: # [DISCUSS] : things we need to solve/decide : storage of edit conflicts

2019-02-08 Thread Robert Newson
d as fast as possible.
> >
> > Option 2: Jump straight to get_range_startswith() request using only
> > “docid” as the prefix, then cancel the iteration once we reach a revision
> > not equal to the first one we see. We might transfer too much data, or we
> > might end up doing multiple roundtrips if the default “iterator” streaming
> > mode sends too little data to start (I haven’t checked what the default
> > iteration block is there), but in the typical case of zero edit conflicts
> > we have a good chance of retrieving the full document in one roundtrip.
> >
> > I don’t have a good sense of which option wins out here from a performance
> > perspective, but they’re both operating on the same data model so easy
> > enough to test the alternatives. The important bit is getting the
> > revision-ish things to sort correctly. I think we can do that by generating
> > something like
> >
> > revision-ish = NotDeleted/1bit : RevPos : RevHash
> >
> > with some suitable order-preserving encoding on the RevPos integer.
> >
> > Apologies for the long email. Happy for any comments, either here or over
> > on IRC. Cheers,
> >
> > Adam
> >
> > > On Feb 7, 2019, at 4:52 PM, Robert Newson  wrote:
> > >
> > > I think we should choose simple. We can then see if performance is too
> > low or storage overhead too high and then see what we can do about it.
> > >
> > > B.
> > >
> > > --
> > >  Robert Samuel Newson
> > >  rnew...@apache.org
> > >
> > > On Thu, 7 Feb 2019, at 20:36, Ilya Khlopotov wrote:
> > >> We cannot do simple thing if we want to support sharing of JSON terms.
> > I
> > >> think if we want the simplest path we should move sharing out of the
> > >> scope. The problem with sharing is we need to know the location of
> > >> shared terms when we do write. This means that we have to read full
> > >> document on every write. There might be tricks to replace full document
> > >> read with some sort of hierarchical signature or sketch of a document.
> > >> However these tricks do not fall into simplest solution category. We
> > >> need to choose the design goals:
> > >> - simple
> > >> - performance
> > >> - reduced storage overhead
> > >>
> > >> best regards,
> > >> iilyak
> > >>
> > >> On 2019/02/07 12:45:34, Garren Smith  wrote:
> > >>> I’m also in favor of keeping it really simple and then testing and
> > >>> measuring it.
> > >>>
> > >>> What is the best way to measure that we have something that works? I’m
> > not
> > >>> sure just relying on our current tests will prove that? Should we
> > define
> > >>> and build some more complex situations e.g docs with lots of conflicts
> > or
> > >>> docs with wide revisions and make sure we can solve for those?
> > >>>
> > >>> On Thu, Feb 7, 2019 at 12:33 PM Jan Lehnardt  wrote:
> > >>>
> > >>>> I’m also very much in favour with starting with the simplest thing
> > that
> > >>>> can possibly work and doesn’t go against the advertised best
> > practices of
> > >>>> FoundationDB. Let’s get that going and get a feel for how it all works
> > >>>> together, before trying to optimise things we can’t measure yet.
> > >>>>
> > >>>> Best
> > >>>> Jan
> > >>>> —
> > >>>>
> > >>>>> On 6. Feb 2019, at 16:58, Robert Samuel Newson 
> > >>>> wrote:
> > >>>>>
> > >>>>> Hi,
> > >>>>>
> > >>>>> With the Redwood storage engine under development and with prefix
> > >>>> elision part of its design, I don’t think we should get too hung up on
> > >>>> adding complications and indirections in the key space just yet. We
> > haven’t
> > >>>> written a line of code or run any tests, this is premature
> > optimisation.
> > >>>>>
> > >>>>> I’d like to focus on the simplest solution that yields all required
> > >>>> properties. We can embellish later (if warranted).
> > >>>>>
> > >>>>> I am intrigued by all the ideas that might allow us cheaper inserts
> > and
> > >>>> updates than the current code where there are multiple edit

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-04 Thread Robert Newson
I've been remiss here in not posting the data model ideas that IBM worked up 
while we were thinking about using FoundationDB, so I'm posting them now. This is 
Adam Kocoloski's original work; I am just transcribing it, and this is the 
context that the folks from the IBM side came in with, for full disclosure.

Basics

1. All CouchDB databases are inside a Directory
2. Each CouchDB database is a Directory within that Directory
3. It's possible to list all subdirectories of a Directory, so `_all_dbs` is 
the list of directories from 1.
4. Each Directory representing a CouchDB database has several Subspaces:
4a. by_id/ doc subspace: actual document contents 
4b. by_seq/versionstamp subspace: for the _changes feed 
4c. index_definitions, indexes, ...
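
As a rough illustration of points 1-3 (a sketch only, using the FoundationDB
Python bindings; the 'couchdb' directory name is an assumption, not part of the
proposal):

    import fdb
    fdb.api_version(620)
    db = fdb.open()

    root = fdb.directory.create_or_open(db, ('couchdb',))  # 1. all databases live under one Directory

    def create_db(name):
        return root.create_or_open(db, (name,))            # 2. one sub-Directory per database

    def all_dbs():
        return root.list(db)                                # 3. _all_dbs is just the subdirectory listing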

JSON Mapping

A hierarchical JSON object naturally maps to multiple KV pairs in FDB:

{ 
“_id”: “foo”, 
“owner”: “bob”, 
“mylist”: [1,3,5], 
“mymap”: { 
“blue”: “#FF”, 
“red”: “#FF” 
} 
}

maps to

(“foo”, “owner”) = “bob” 
(“foo”, “mylist”, 0) = 1 
(“foo”, “mylist”, 1) = 3 
(“foo”, “mylist”, 2) = 5 
(“foo”, “mymap”, “blue”) = “#FF” 
(“foo”, “mymap”, “red”) = “#FF”

NB: this means that the 100KB limit applies to individual leaves in the JSON 
object, not the entire doc
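
For illustration, a minimal sketch of that flattening (Python, using the fdb
tuple layer; the subspace name and value encoding are assumptions, not part of
the proposal):

    import fdb
    fdb.api_version(620)

    by_id = fdb.Subspace(('db1', 'by_id'))   # assumed database/doc subspace

    def flatten(value, path=()):
        # Yield (path, scalar) for every terminal value in a JSON document.
        if isinstance(value, dict):
            for k, v in value.items():
                yield from flatten(v, path + (k,))
        elif isinstance(value, list):
            for i, v in enumerate(value):
                yield from flatten(v, path + (i,))
        else:
            yield path, value

    @fdb.transactional
    def write_doc(tr, doc_id, doc):
        for path, value in flatten(doc):
            if path and path[0].startswith('_'):   # _id/_rev handled elsewhere
                continue
            # e.g. ('foo', 'mymap', 'blue') = '#FF'
            tr[by_id.pack((doc_id,) + path)] = fdb.tuple.pack((value,))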

Edit Conflicts

We need to account for the presence of conflicts in various levels of the doc 
due to replication.

Proposal is to create a special value indicating that the subtree below our 
current cursor position is in an unresolvable conflict. Then add additional KV 
pairs below to describe the conflicting entries.

KV data model allows us to store these efficiently and minimize duplication of 
data:

A document with these two conflicts:

{ 
“_id”: “foo”, 
“_rev”: “1-abc”, 
“owner”: “alice”, 
“active”: true 
}
{ 
“_id”: “foo”, 
“_rev”: “1-def”, 
“owner”: “bob”, 
“active”: true 
}

could be stored thus:

(“foo”, “active”) = true 
(“foo”, “owner”) = kCONFLICT 
(“foo”, “owner”, “1-abc”) = “alice” 
(“foo”, “owner”, “1-def”) = “bob”

So long as `kCONFLICT` is set at the top of the conflicting subtree this 
representation can handle conflicts of different data types as well.

Missing fields need to be handled explicitly:

{ 
  “_id”: “foo”, 
  “_rev”: “1-abc”, 
  “owner”: “alice”, 
  “active”: true 
}

{ 
  “_id”: “foo”, 
  “_rev”: “1-def”, 
  “owner”: { 
“name”: “bob”, 
“email”: “b...@example.com" 
  } 
}

could be stored thus:

(“foo”, “active”) = kCONFLICT 
(“foo”, “active”, “1-abc”) = true 
(“foo”, “active”, “1-def”) = kMISSING 
(“foo”, “owner”) = kCONFLICT 
(“foo”, “owner”, “1-abc”) = “alice” 
(“foo”, “owner”, “1-def”, “name”) = “bob” 
(“foo”, “owner”, “1-def”, “email”) = ...
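
To make the read side concrete, a toy sketch of rebuilding the document one
revision sees from rows shaped like the above (pure Python; the kCONFLICT and
kMISSING sentinels and the plain dict standing in for a range read are
illustrative assumptions only):

    kCONFLICT, kMISSING = object(), object()

    def doc_for_rev(rows, rev):
        # rows: {key_tuple: value}, e.g. ('foo', 'owner', '1-abc'): 'alice'.
        # Rebuilds the plain dict that revision `rev` sees.
        out = {}
        for path, value in rows.items():
            if value is kCONFLICT:
                continue                      # real values live one level deeper, keyed by rev
            # If a prefix of this path is marked kCONFLICT, the next element is a rev id.
            for i in range(len(path)):
                if rows.get(path[:i + 1]) is kCONFLICT:
                    if path[i + 1] == rev:
                        path = path[:i + 1] + path[i + 2:]   # drop the rev id from the field path
                    else:
                        path = None                          # another revision's branch
                    break
            if path is None or value is kMISSING:
                continue
            node = out
            for part in path[1:-1]:           # path[0] is the doc id
                node = node.setdefault(part, {})
            node[path[-1]] = value
        return out

    # On the second example, doc_for_rev(rows, '1-def') yields
    # {'owner': {'name': 'bob', 'email': ...}}; 'active' is kMISSING for that rev.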

Revision Metadata

* CouchDB uses a hash history for revisions 
** Each edit is identified by the hash of the content of the edit including the 
base revision against which it was applied 
** Individual edit branches are bounded in length but the number of branches is 
potentially unbounded 

* Size limits preclude us from storing the entire key tree as a single value; 
in pathological situations 
the tree could exceed 100KB (each entry is > 16 bytes) 

* Store each edit branch as a separate KV including deleted status in a special 
subspace 

* Structure key representation so that “winning” revision can be automatically 
retrieved in a limit=1 
key range operation

(“foo”, “_meta”, “deleted=false”, 1, “def”) = [] 
(“foo”, “_meta”, “deleted=false”, 4, “bif”) = [“3-baz”,”2-bar”,”1-foo”]  <-- 
winner
(“foo”, “_meta”, “deleted=true”, 3, “abc”) = [“2-bar”, “1-foo”]
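
A sketch of that limit=1 retrieval against keys shaped like the above (Python
bindings; the subspace layout and the string-encoded deleted flag mirror the
example rather than a final design):

    by_id = fdb.Subspace(('db1', 'by_id'))   # assumed, as above

    @fdb.transactional
    def winning_rev(tr, doc_id):
        # Prefer live edit branches; the highest (pos, hash) sorts last, so a
        # reverse range read with limit=1 returns the winner directly.
        for deleted in ('deleted=false', 'deleted=true'):
            r = by_id.range((doc_id, '_meta', deleted))
            for kv in tr.get_range(r.start, r.stop, limit=1, reverse=True):
                _, _, _, pos, rev_hash = by_id.unpack(kv.key)
                return '%d-%s' % (pos, rev_hash)
        return None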

Changes Feed

* FDB supports a concept called a versionstamp — a 10 byte, unique, 
monotonically (but not sequentially) increasing value for each committed 
transaction. The first 8 bytes are the committed version of the database. The 
last 2 bytes are monotonic in the serialization order for transactions. 

* A transaction can specify a particular index into a key where the following 
10 bytes will be overwritten by the versionstamp at commit time 

* A subspace keyed on versionstamp naturally yields a _changes feed

by_seq subspace 
  (“versionstamp1”) = (“foo”, “1-abc”) 
  (“versionstamp4”) = (“bar”, “4-def”) 

by_id subspace 
  (“bar”, “_vsn”) = “versionstamp4” 
  ... 
  (“foo”, “_vsn”) = “versionstamp1”
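
A sketch of the versionstamped by_seq write and the corresponding range read
(Python bindings; exact versionstamp helpers vary between API versions, and the
by_id bookkeeping that clears a doc's previous seq entry is elided here):

    by_seq = fdb.Subspace(('db1', 'by_seq'))   # assumed subspace

    @fdb.transactional
    def record_change(tr, doc_id, rev):
        # The incomplete Versionstamp is filled in with the commit version at
        # commit time, so by_seq keys sort in commit order with no coordination.
        key = by_seq.pack_with_versionstamp((fdb.tuple.Versionstamp(),))
        tr.set_versionstamped_key(key, fdb.tuple.pack((doc_id, rev)))

    @fdb.transactional
    def changes_since(tr, since_key=None, limit=100):
        # since_key: raw by_seq key of the last change a client saw (its update seq)
        begin = (fdb.KeySelector.first_greater_than(since_key)
                 if since_key else by_seq.range().start)
        return [(kv.key, fdb.tuple.unpack(kv.value))
                for kv in tr.get_range(begin, by_seq.range().stop, limit=limit)]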

JSON Indexes

* “Mango” JSON indexes are defined by
** a list of field names, each of which may be nested,  
** an optional partial_filter_selector which constrains the set of docs that 
contribute 
** an optional name defined by the ddoc field (the name is auto-generated if 
not supplied) 

* Store index definitions in a single subspace to aid query planning 
** ((person,name), title, email) = (“name-title-email”, “{“student”: true}”) 
** Store the values for each index in a dedicated subspace, adding the document 
ID as the last element in the tuple 
*** (“rosie revere”, “engineer”, “ro...@example.com", “foo”) = null
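
A sketch of writing and querying one such index entry (Python bindings; the
index subspace name and the shape of the documents are assumptions):

    idx = fdb.Subspace(('db1', 'indexes', 'name-title-email'))   # assumed layout

    @fdb.transactional
    def index_doc(tr, doc_id, doc):
        # a partial_filter_selector check would go here before emitting
        tr[idx.pack((doc['name'], doc['title'], doc['email'], doc_id))] = b''

    @fdb.transactional
    def find_by_name(tr, name):
        # equality on the leading field is a single prefix range read;
        # the doc id is recovered from the last tuple element
        return [idx.unpack(kv.key)[-1]
                for kv in tr.get_range_startswith(idx.pack((name,)))]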

B.

-- 
  Robert Samuel Newson
  

Re: # [DISCUSS] : things we need to solve/decide : storage of edit conflicts

2019-02-04 Thread Robert Newson
This one is quite tightly coupled to the other thread on the data model; should we 
start much conversation here before that one gets closer to a solution?

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 4 Feb 2019, at 19:25, Ilya Khlopotov wrote:
> This is a beginning of a discussion thread about storage of edit 
> conflicts and everything which relates to revisions.
> 
> 


Re: [DISCUSS] : things we need to solve/decide : changes feed

2019-02-04 Thread Robert Newson
Let's not conflate two things here.

The changes feed is just the documents in the database, including deleted ones, 
in the order they were last written. This is simple to do in FoundationDB and 
Adam already showed the solution using fdb's versionstamps.

The talk of 'watcher' and 'subscriber' is about the continuous mode of the changes 
feed, and this is a stateful part of the current couchdb system (as it is 
mediating the writes to the db). We know the watcher facility in fdb will not 
scale to this usage.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 4 Feb 2019, at 19:18, Ilya Khlopotov wrote:
> One of the features of CouchDB, which doesn't map cleanly into 
> FoudationDB is changes feed. The essence of the feature is: 
> - Subscriber of the feed wants to receive notifications when database is 
> updated. 
> - The notification includes update_seq for the database and list of 
> changes which happen at that time. 
> - The change itself includes docid and rev. 
> Hi, 
> 
> There are multiple ways to easily solve this problem. Designing a 
> scalable way to do it is way harder.  
> 
> There are at least two parts to this problem:
> - how to structure secondary indexes so we can provide what we need in 
> notification event
> - how to notify CouchDB about new updates
> 
> For the second part of the problem we could setup a watcher on one of 
> the keys we have to update on every transaction. For example the key 
> which tracks the database_size or key which tracks the number of 
> documents or we can add our own key. The problem is at some point we 
> would hit a capacity limit for atomic updates of a single key 
> (FoundationDB doesn't redistribute the load among servers on per key 
> basis). In such case we would have to distribute the counter among 
> multiple keys to allow FoundationDB to split the hot range. Therefore, 
> we would have to setup multiple watches. FoundationDB has a limit on the 
> number of watches the client can setup (10 watches). So we need to 
> keep in mind this number when designing the feature. 
> 
> The single key update rate problem is very theoretical and we might 
> ignore it for the PoC version. Then we can measure the impact and change 
> design accordingly. The reason I decided to bring it up is to see maybe 
> someone has a simple solution to avoid the bottleneck. 
> 
> best regards,
> iilyak


Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-04 Thread Robert Newson
I think we're deep in the weeds on this small aspect of the data model problem, 
and haven't touched other aspects yet. The numbers used in your example (1k of 
paths, 100 unique field names, 100 bytes for a value), where are they from? If 
they are not from some empirical data source, I don't see any reason to dwell 
on anything we might infer from them.

I think we should focus on the simplest model that also 'works' (i.e., delivers 
all essential properties) and then prototype so we can see how efficient it is.

I am happy to sacrifice some degree of efficiency for a comprehensible mapping 
of documents to key-value pairs and we have at least three techniques to 
address long keys so far. We also know there are other approaches to this 
problem if necessary that have a much smaller storage overhead (adjacent rows 
of 100k chunks of the couchdb document treated as a blob). 

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 4 Feb 2019, at 18:29, Ilya Khlopotov wrote:
> At some point I changed the number of unique JSON paths and probably 
> forgot to update other conditions.
> The ` - each document is around 10Kb` is not used in the calculations so 
> can be ignored.
> 
> On 2019/02/04 17:46:20, Adam Kocoloski  wrote: 
> > Ugh! We definitely cannot have a model where a 10K JSON document is 
> > exploded into 2MB worth of KV data. I’ve tried several times to follow the 
> > math here but I’m failing. I can’t even get past this first bit:
> > 
> > > - each document is around 10Kb
> > > - each document consists of 1K of unique JSON paths 
> > > - each document has 100 unique JSON field names
> > > - every scalar value is 100 bytes
> > 
> > If each document has 1000 paths, and each path (which leads to a unique 
> > scalar value, right?) has a value of 100 bytes associated with it … how is 
> > the document 10KB? Wouldn’t it need to be at least 100KB just by adding up 
> > all the scalar values?
> > 
> > Adam
> > 
> > > On Feb 4, 2019, at 6:08 AM, Ilya Khlopotov  wrote:
> > > 
> > > Hi Michael,
> > > 
> > >> For example, hears a crazy thought:
> > >> Map every distinct occurence of a key/value instance through a crypto 
> > >> hash
> > >> function to get a set of hashes.
> > >> 
> > >> These can be be precomputed by Couch without any lookups in FDB.  These
> > >> will be spread all over kingdom come in FDB and not lend themselves to
> > >> range search well.
> > >> 
> > >> So what you do is index them for frequency of occurring in the same set.
> > >> In essence, you 'bucket them' statistically, and that bucket id becomes a
> > >> key prefix. A crypto hash value can be copied into more than one bucket.
> > >> The {bucket_id}/{cryptohash} becomes a {val_id}
> > > 
> > >> When writing a document, Couch submits the list/array of cryptohash 
> > >> values
> > >> it computed to FDB and gets back the corresponding  {val_id} (the id with
> > >> the bucket prefixed).  This can get somewhat expensive if there's always 
> > >> a
> > >> lot of app local cache misses.
> > >> 
> > >> A document's value is then a series of {val_id} arrays up to 100k per
> > >> segment.
> > >> 
> > >> When retrieving a document, you get the val_ids, find the distinct 
> > >> buckets
> > >> and min/max entries for this doc, and then parallel query each bucket 
> > >> while
> > >> reconstructing the document.
> > > 
> > > Interesting idea. Let's try to think it through to see if we can make it 
> > > viable. 
> > > Let's go through hypothetical example. Input data for the example:
> > > - 1M of documents
> > > - each document is around 10Kb
> > > - each document consists of 1K of unique JSON paths 
> > > - each document has 100 unique JSON field names
> > > - every scalar value is 100 bytes
> > > - 10% of unique JSON paths for every document already stored in database 
> > > under different doc or different revision of the current one
> > > - we assume 3 independent copies for every key-value pair in FDB
> > > - our hash key size is 32 bytes
> > > - let's assume we can determine if key is already on the storage without 
> > > doing query
> > > - 1% of paths is in cache (unrealistic value, in real live the percentage 
> > > is lower)
> > > - every JSON field name is 20 bytes
> > > - every JSON path is 10 levels deep
> > > - document key prefix length is 50
> > > - every document has 10 revisions
> > > Let's estimate the storage requirements and size of data we need to 
> > > transmit. The calculations are not exact.
> > > 1. storage_size_per_document (we cannot estimate exact numbers since we 
> > > don't know how FDB stores it)
> > >  - 10 * ((10Kb - (10Kb * 10%)) + (1K - (1K * 10%)) * 32 bytes) = 38Kb * 
> > > 10 * 3 = 1140 Kb (11x)
> > > 2. number of independent keys to retrieve on document read (non-range 
> > > queries) per document
> > >  - 1K - (1K * 1%) = 990
> > > 3. number of range queries: 0
> > > 4. data to transmit on read: (1K - (1K * 1%)) * (100 bytes + 32 bytes) = 
> > > 102 Kb (10x) 
> > > 5. read latency (we use 2ms 

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-04 Thread Robert Newson
Hi,

The talk of crypto in the key space is extremely premature in my opinion. It 
is the database's job (foundationdb's in this case) to map meaningful names to 
whatever it takes to efficiently store, index, and retrieve them. Obscuring 
every key with an expensive cryptographic operation works against everything I 
think distinguishes good software.

Keep it simple. The overhead of using readable, meaningful keys can be 
mitigated to a degree with a) the Directory layer, which shortens prefixes at 
the cost of a network round trip, and b) prefix elision in the fdb storage system 
itself (Redwood, which may land before we've completed our work). 

Actual measurements take priority over the speculation in this thread so far, 
and accepting some overhead (defined as the actual storage of a document versus its 
theoretical minimum disk occupancy) is preferable to complicated, "clever", but 
brittle solutions.

I point to my earlier comment on optional document schemas which would reduce 
the length of keys to a scalar value anyway (the offset of the data item within 
the declared schema).

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Mon, 4 Feb 2019, at 11:08, Ilya Khlopotov wrote:
> Hi Michael,
> 
> > For example, hears a crazy thought:
> > Map every distinct occurence of a key/value instance through a crypto hash
> > function to get a set of hashes.
> >
> > These can be be precomputed by Couch without any lookups in FDB.  These
> > will be spread all over kingdom come in FDB and not lend themselves to
> > range search well.
> > 
> > So what you do is index them for frequency of occurring in the same set.
> > In essence, you 'bucket them' statistically, and that bucket id becomes a
> > key prefix. A crypto hash value can be copied into more than one bucket.
> > The {bucket_id}/{cryptohash} becomes a {val_id}
> 
> > When writing a document, Couch submits the list/array of cryptohash values
> > it computed to FDB and gets back the corresponding  {val_id} (the id with
> > the bucket prefixed).  This can get somewhat expensive if there's always a
> > lot of app local cache misses.
> >
> > A document's value is then a series of {val_id} arrays up to 100k per
> > segment.
> > 
> > When retrieving a document, you get the val_ids, find the distinct buckets
> > and min/max entries for this doc, and then parallel query each bucket while
> > reconstructing the document.
> 
> Interesting idea. Let's try to think it through to see if we can make it 
> viable. 
> Let's go through hypothetical example. Input data for the example:
> - 1M of documents
> - each document is around 10Kb
> - each document consists of 1K of unique JSON paths 
> - each document has 100 unique JSON field names
> - every scalar value is 100 bytes
> - 10% of unique JSON paths for every document already stored in database 
> under different doc or different revision of the current one
> - we assume 3 independent copies for every key-value pair in FDB
> - our hash key size is 32 bytes
> - let's assume we can determine if key is already on the storage without 
> doing query
> - 1% of paths is in cache (unrealistic value, in real live the 
> percentage is lower)
> - every JSON field name is 20 bytes
> - every JSON path is 10 levels deep
> - document key prefix length is 50
> - every document has 10 revisions
> Let's estimate the storage requirements and size of data we need to 
> transmit. The calculations are not exact.
> 1. storage_size_per_document (we cannot estimate exact numbers since we 
> don't know how FDB stores it)
>   - 10 * ((10Kb - (10Kb * 10%)) + (1K - (1K * 10%)) * 32 bytes) = 38Kb * 
> 10 * 3 = 1140 Kb (11x)
> 2. number of independent keys to retrieve on document read (non-range 
> queries) per document
>   - 1K - (1K * 1%) = 990
> 3. number of range queries: 0
> 4. data to transmit on read: (1K - (1K * 1%)) * (100 bytes + 32 bytes) = 
> 102 Kb (10x) 
> 5. read latency (we use 2ms per read based on numbers from 
> https://apple.github.io/foundationdb/performance.html)
> - sequential: 990*2ms = 1980ms 
> - range: 0
> Let's compare these numbers with initial proposal (flattened JSON docs 
> without global schema and without cache)
> 1. storage_size_per_document
>   - mapping table size: 100 * (20 + 4(integer size)) = 2400 bytes
>   - key size: (10 * (4 + 1(delimiter))) + 50 = 100 bytes 
>   - storage_size_per_document: 2.4K*10 + 100*1K*10 + 1K*100*10 = 2024K = 
> 1976 Kb * 3 = 5930 Kb (59.3x)
> 2. number of independent keys to retrieve: 0-2 (depending on index 
> structure)
> 3. number of range queries: 1 (1001 of keys in result)
> 4. data to transmit on read: 24K + 1000*100 + 1000*100 = 23.6 Kb (2.4x)  
> 5. read latency (we use 2ms per read based on numbers from 
> https://apple.github.io/foundationdb/performance.html and estimate range 
> read performance based on numbers from 
> https://apple.github.io/foundationdb/benchmarking.html#single-core-read-test)
>   - range read performance: Given read performance is about 305,000 
> 

Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

2019-02-01 Thread Robert Newson
Hi,

Thanks for the links, that's the detail we need.

While we can't exclude the possibility of it, FoundationDB has an extraordinary 
amount of testing. It's far more likely that there will be bugs in the layer of 
code that we write to talk to it, which of course we can also then fix and 
release on our own cadence.

We are working to understand what upgrades to FoundationDB mean for CouchDB 
users. One concern we've already seen is that a brief downtime period is needed to 
upgrade between major versions, as the network messages fdb currently sends have 
no versioning. Happily, this is being actively worked on, and I would be 
surprised if that work hasn't landed in an official fdb release well in advance 
of the first couchdb-on-fdb release.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 1 Feb 2019, at 18:51, Eli Stevens (Gmail) wrote:
> To the best of my recollection, it was these issues:
> 
> https://github.com/apache/couchdb/issues/745 (leading to
> https://github.com/apache/couchdb/pull/1200 and then
> https://github.com/apache/couchdb/pull/1253)
> https://github.com/apache/couchdb/issues/1093 (also points back at 745?)
> 
> I could have sworn there were two separate issues, but maybe it all boiled
> down to the same thing in the end (I left the company in June of last year,
> so I can no longer go back and check my notes). I feel fairly certain that
> I remember 2.1 not working for us, and waiting for 2.1.1 (or 2.1.2?) to
> land expecting a resolution, but there still being a blocking issue that
> took until 2.2 to resolve. We contracted out the investigation and fixes,
> with Joan as our primary point of contact. She might recall more.
> 
> Of course, at least some (all?) of our problems were due to large-ish
> attachments.  :/
> 
> IIRC, the prerelease 2.2 nightly we had installed on some of our testing
> machines was behaving well, but I was gone before 2.2 final landed. I don't
> know what happened after I left. That resulted in a 4-year period where we
> couldn't upgrade DB versions in production. We literally had a customer
> independently find "someone in charge at CouchDB" and contact Jan because
> the customer thought that we were lying to them about there not being any
> DB updates we could use (this specifically was in the gap between 2.0
> landing and the availability of the Ubuntu packaging work we sponsored).
> 
> I worry that having two separate projects with their own, independent
> release cadence is going to make this kind of thing worse ("The fix we need
> is in FDB 7.0, but that's still in alpha, and there are breaking changes to
> some random bit of API, so once that firms up CDB will have to get updated
> to match" etc.). Of course, I'm somewhat playing devil's advocate here, as
> I no longer directly have skin in the game (but I like this project, and
> think that logistic issues like this are an impediment to its greater
> success).
> 
> Let me know if you'd like more detail on anything.  :)
> 
> Cheers,
> Eli
> 
> On Fri, Feb 1, 2019 at 9:29 AM Robert Newson  wrote:
> 
> > Hi,
> >
> > Avoiding unintended regressions is a high priority for the team and we
> > will need to lean on the community here to keep us honest. I'd appreciate a
> > summary of the capability regressions that affected you.
> >
> > The CouchDB PMC is interacting with the FoundationDB team now (you can see
> > the beginnings of that here:
> > https://forums.foundationdb.org/t/couchdb-considering-rearchitecting-as-an-fdb-layer/1088/15).
> > We intend to grow that relationship, in particular around governance but
> > also including the things that concern you.
> >
> > On attachments, I completely agree with you that we should choose one of
> > the two options you mention. I think FoundationDB will allow robust
> > attachment support (
> > https://apple.github.io/foundationdb/data-modeling.html#large-values-and-blobs).
> > That said, CouchDB is not designed as a BLOB storage system. Attachments
> > were originally part of the couchapp manifesto. I would like to see
> > attachment support preserved (but radically improved) but that topic is one
> > for the wider couchdb dev team to consider. The alternative, full removal
> > of the feature, has its advocates too.
> >
> > --
> >   Robert Samuel Newson
> >   rnew...@apache.org
> >
> > On Fri, 1 Feb 2019, at 17:21, Eli Stevens (Gmail) wrote:
> > > A couple other topics that I think would make sense to discuss:
> > >
> > > - How to manage edge-case performance or capability regressions resulting
> > > from the switch. My former team couldn't use 2.x in production until
> > > 2.2 due to a handful of these

Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

2019-02-01 Thread Robert Newson
Hi,

Avoiding unintended regressions is a high priority for the team and we will 
need to lean on the community here to keep us honest. I'd appreciate a summary 
of the capability regressions that affected you.

The CouchDB PMC is interacting with the FoundationDB team now (you can see the 
beginnings of that here: 
https://forums.foundationdb.org/t/couchdb-considering-rearchitecting-as-an-fdb-layer/1088/15).
 We intend to grow that relationship, in particular around governance but also 
including the things that concern you.

On attachments, I completely agree with you that we should choose one of the 
two options you mention. I think FoundationDB will allow robust attachment 
support 
(https://apple.github.io/foundationdb/data-modeling.html#large-values-and-blobs).
 That said, CouchDB is not designed as a BLOB storage system. Attachments were 
originally part of the couchapp manifesto. I would like to see attachment 
support preserved (but radically improved) but that topic is one for the wider 
couchdb dev team to consider. The alternative, full removal of the feature, has 
its advocates too.
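
For reference, the "large values and blobs" pattern linked above boils down to
splitting a blob across adjacent keys; a rough sketch (Python bindings, names
assumed; note the separate ~10MB per-transaction limit means very large
attachments would still need to span multiple transactions):

    import fdb
    fdb.api_version(620)

    atts = fdb.Subspace(('db1', 'attachments'))   # assumed subspace
    CHUNK = 100_000                                # FDB's per-value limit is 100KB

    @fdb.transactional
    def write_attachment(tr, doc_id, name, data):
        tr.clear_range_startswith(atts.pack((doc_id, name)))
        for i in range(0, len(data), CHUNK):
            tr[atts.pack((doc_id, name, i))] = data[i:i + CHUNK]

    @fdb.transactional
    def read_attachment(tr, doc_id, name):
        return b''.join(kv.value for kv in
                        tr.get_range_startswith(atts.pack((doc_id, name))))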

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 1 Feb 2019, at 17:21, Eli Stevens (Gmail) wrote:
> A couple other topics that I think would make sense to discuss:
> 
> - How to manage edge-case performance or capability regressions resulting
> from the switch. My former team couldn't use 2.x in production until
> 2.2 due to a handful of these kinds of issues. What's going to happen when
> users blocked due to things that can only be addressed on the FoundationDB
> side of things? Will CouchDB have a privileged seat at the table when it
> comes to requesting bugfixes or performance improvements from the
> FoundationDB team?
> - What's going to happen to attachments?  I'd really like them to get out
> of the "supported, but conventional wisdom is don't use them" limbo they're
> in now (either by becoming a first-class feature, or by officially
> deprecating them).
> 
> Cheers,
> Eli
> 
> On Thu, Jan 31, 2019 at 9:40 AM Adam Kocoloski  wrote:
> 
> >
> > > On Jan 31, 2019, at 12:27 PM, nicholas a. evans <
> > nicholas.ev...@gmail.com> wrote:
> > >
> > >> I called out the problems with reduce functionality in the first post
> > of this thread specifically to shake out people's concerns there, so thank
> > you for voicing yours. The current approach to reduce only works because we
> > control the writing of the b+tree nodes, including when they're split, etc,
> > so we're able to maintain intermediate values on the inner nodes as the
> > data changes over time. This is not something we can do with FoundationDB
> > directly (or, indeed, anything else). We're looking for a solution here.
> > >
> > > Yes, I don't want to dive too deep into the nitty gritty here (my
> > > experience with FoundationDB is only a quick skim of the docs,
> > > anyway). I was thinking of something along the lines of making a
> > > pseudo-btree (just for reductions, distinct from the map emits) where
> > > each btree node is a FoundationDB value. It might not be useful or
> > > efficient for anything *other* than ranged reduce queries, so perhaps
> > > it could be opt-in per ddoc or per view (and v4.x, where x > 0). It
> > > could be updated within the same transactions as the map emits, or
> > > maybe it could be updated as a cache layer separately from the map
> > > emits.
> >
> > That’s at least the third time I’ve heard someone independently come up
> > with this idea :) I think it could work, if anyone has the cycles to sit
> > down and write that code.
> >
> > Adam
> >
> >


Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-02-01 Thread Robert Newson
Hi,

"rebasing is not just a politically correct way of saying that CouchDb is being 
retired"

Emphatically, no. We see this as an evolution of CouchDB, delivering CouchDB 
1.0 semantics around conflicts and changes feeds but in a way that scales 
better than CouchDB 2.0's approach.

We intend to preserve what makes CouchDB special, which includes being able to 
"drop" documents in without having to declare their format. In my post from 
yesterday I suggested _optional_ schema declarations to improve efficiency and 
to address some of the constraints on doc and field size that might arise based 
on how we plan to map documents into foundationdb key-value entries.

The notion of "schemaless" for CouchDB has never meant that users don't have to 
think about how they map their data into CouchDB documents; it just relieved 
them of the burden of teaching CouchDB about them. That notion will remain.

CouchDB has a long history and a fair few clever ideas at the start are looking 
less relevant today (as you mentioned, couchapps, the _show, _list, _update, 
_rewrite sorts of things), as the ecosystem in which CouchDB lives has been so 
hugely expanded in the last ten years. It is right for the CouchDB project to 
re-evaluate the feature set we present and remove things that are of little 
value or are better done with other technology. That is just basic project 
maintenance, though.

Thank you for raising this concern; you are certainly not adding toxicity. It 
would be toxic if there were no expression of concerns about this change. Please 
continue to follow and contribute to this discussion.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Fri, 1 Feb 2019, at 09:11, Reddy B. wrote:
> By the way, if the FDB migration was to happen, will CouchDb continue to 
> be a schema-less database where we can just drop our documents and map/
> reduce them without further ceremony?
> 
> I mean for the long-term, is there a commitment to keeping this feature? 
> This is a big deal, the basics of CouchDb. I think this is the first 
> assumption you make when you use CouchDb as of today.
> 
> I'm not trying to add toxicity to this very positive, constructive and 
> high quality discussion, but just some humble feedback. As a user, when 
> I see this being questioned, along with the other limitations introduced 
> by FDB I am starting to wonder if rebasing is not just a politically 
> correct way of saying that CouchDb is being retired. For many once core 
> features now become optional extensions to be implemented.
> 
> Which makes me wonder "what's the core" and question the benefit/cost 
> analysis of the switch in light of the current vision of the project. 
> For it's starting to look like FDB may not only be used as an 
> implementation convenience but as a new vision for CouchDb (deprecating 
> the former vision). In light of this the benefit-cost analysis would 
> make sense but such a change in vision has not been publicly announced.
> 
> And this would mean that today's core feature are likely to go the way 
> of Couchapps tomorrow if the vision has indeed changed. This is a very 
> problematic uncertainty as an end-user thinking long-term support for 
> new projects. I totally appreciate that this is dev mailing list where 
> ideas are bounced and technical details worked out, but it's important 
> for us as users to see commitments on vision, thus my question. I also 
> took advantage of this opportunity to voice the more general concern 
> aforementioned.
> 
> But the specific question is: what's the vision for "schema-less" usage 
> of CouchDb.
> 
> Thanks
> 
> 
> 
> 
> De : Ilya Khlopotov 
> Envoyé : mercredi 30 janvier 2019 22:08
> À : dev@couchdb.apache.org
> Objet : Re: [DISCUSS] : things we need to solve/decide : storing JSON 
> documents
> 
> > I think I prefer the idea of indexing all document's keys using the same
> > identifier set.  In general I think applications have the behavior that
> > some keys are referenced far more than other keys and giving those keys in
> > each document the same value I think could eventually prove useful for
> > making many features faster and easier than expected.
> 
> This approach would require an invention of schema evolution features 
> similar to recently open sourced Record Layer 
> https://www.foundationdb.org/files/record-layer-paper.pdf
> I am sure some CouchDB users do (because CouchDB is NoSQL i.e. schema-
> less database):
> - rename fields
> - reuse field names for something else when they update application
> - remove fields
> - have documents of different structure in one database
> 
> > I think regardless of whether the mapping is document local or global, 
> > having
> > FDB return those individual values is faster/easier than having Couch Range
> > fetch the mapping and do the translation work itself.
> in case of global mapping we would do
> - get_schema from different subspace (i.e. contact different nodes)
> - 

Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-01-31 Thread Robert Newson
Thanks! I stress it would be optional; we could add it in a release after the 
main couchdb-on-fdb work in response to pressure from users finding the 10mb (etc) 
limits too restrictive, or we can do it as a neat enhancement in its own right 
(the validation aspect) that just happens to allow us to optimise the lengths 
of keys.

B.

-- 
  Robert Samuel Newson
  rnew...@apache.org

On Thu, 31 Jan 2019, at 17:36, Adam Kocoloski wrote:
> I like the idea, both for the efficiencies it enables in the 
> FoundationDB data model and for the ability to cover a lot of validation 
> functionality without shelling out to JS.
> 
> It’s pretty obviously a big, meaty topic unto itself, one that needs 
> some careful thought and design. Also an awful lot of opportunity for 
> scope creep. But a good topic nonetheless.
> 
> Adam
> 
> > On Jan 31, 2019, at 12:05 PM, Robert Newson  wrote:
> > 
> > Hi,
> > 
> > An enhancement over the first idea (where we flatten JSON documents into 
> > keys and values where the key is the full path to every terminal value, 
> > regardless of depth in the JSON) is to allow users to register schemas.
> > 
> > For documents that match a registered schema (suggestion, a top level field 
> > called "_schema" is required to mark the documents) we can convert to 
> > key/value pairs much more compactly. The key name, barring whatever prefix 
> > identifies the database itself, is just the name of the schema and the 
> > ordinal of the schema item relative to the declaration.
> > 
> > These schema documents (living under /dbname/_schema/$schema_name akin to 
> > design documents) would list all required and optional fields, their types 
> > and perhaps other constraints (like valid ranges or relationships with 
> > other fields). We could get arbitrarily complex in schema definitions over 
> > time. Effectively, these are validate_doc_update functions without the 
> > Javascript evaluation pain.
> > 
> > We don't necessarily need this in the first version, but it feels like a 
> > better response to the worries over the restrictions that the flattening 
> > idea is causing than switching to an opaque series of 100k chunks.
> > 
> > thoughts?
> > B
> > 
> > -- 
> >  Robert Newson
> >  b...@rsn.io
> > 
> > On Thu, 31 Jan 2019, at 16:26, Adam Kocoloski wrote:
> >> 
> >>> On Jan 31, 2019, at 1:47 AM, ermouth  wrote:
> >>> 
> >>>> As I don't see the 10k limitation as having significant merit
> >>> 
> >>> Not sure it’s relevant here, but Mango indexes put selected doc values 
> >>> into
> >>> keys.
> >>> 
> >>> ermouth
> >> 
> >> Totally relevant. Not just because of the possibility of putting a large 
> >> scalar value into the index, but because someone could create an index 
> >> on a field that is itself a container, and Mango could just dump the 
> >> entire container into the index as the key.
> >> 
> >> Presumably there’s a followup discussion dedicated to indexing where we 
> >> can suss out what to do in that scenario.
> >> 
> >> Adam
> 


Re: [DISCUSS] : things we need to solve/decide : storing JSON documents

2019-01-31 Thread Robert Newson
Hi,

An enhancement over the first idea (where we flatten JSON documents into keys 
and values where the key is the full path to every terminal value, regardless 
of depth in the JSON) is to allow users to register schemas.

For documents that match a registered schema (suggestion, a top level field 
called "_schema" is required to mark the documents) we can convert to key/value 
pairs much more compactly. The key name, barring whatever prefix identifies the 
database itself, is just the name of the schema and the ordinal of the schema 
item relative to the declaration.
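
For illustration, a rough sketch (Python, purely illustrative; the document, the
"invoice" schema and the tuple encoding below are made-up examples, not a settled
design) of the difference between flattening on full key paths and flattening
against a registered schema:

    # Illustrative only: encode a document once with full key paths and once
    # against a registered schema that maps field paths to small ordinals.
    doc = {"_schema": "invoice", "customer": {"name": "acme"}, "total": 42}
    schema = ["customer.name", "total"]  # hypothetical registered "invoice" schema

    def flatten(prefix, value):
        """Flatten nested JSON into (path, terminal_value) pairs."""
        if isinstance(value, dict):
            for k, v in value.items():
                yield from flatten(prefix + [k], v)
        else:
            yield ".".join(prefix), value

    # 1) Schema-less: the full path is repeated in every key.
    full = {("dbname", path): v for path, v in flatten([], doc) if path != "_schema"}
    # {('dbname', 'customer.name'): 'acme', ('dbname', 'total'): 42}

    # 2) Schema-based: keys carry only the schema name and the field's ordinal.
    compact = {("dbname", "invoice", schema.index(path)): v
               for path, v in flatten([], doc) if path != "_schema"}
    # {('dbname', 'invoice', 0): 'acme', ('dbname', 'invoice', 1): 42}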

These schema documents (living under /dbname/_schema/$schema_name akin to 
design documents) would list all required and optional fields, their types and 
perhaps other constraints (like valid ranges or relationships with other 
fields). We could get arbitrarily complex in schema definitions over time. 
Effectively, these are validate_doc_update functions without the Javascript 
evaluation pain.

We don't necessarily need this in the first version, but it feels like a better 
response to the worries over the restrictions that the flattening idea is 
causing than switching to an opaque series of 100k chunks.

thoughts?
B

-- 
  Robert Newson
  b...@rsn.io

On Thu, 31 Jan 2019, at 16:26, Adam Kocoloski wrote:
> 
> > On Jan 31, 2019, at 1:47 AM, ermouth  wrote:
> > 
> >> As I don't see the 10k limitation as having significant merit
> > 
> > Not sure it’s relevant here, but Mango indexes put selected doc values into
> > keys.
> > 
> > ermouth
> 
> Totally relevant. Not just because of the possibility of putting a large 
> scalar value into the index, but because someone could create an index 
> on a field that is itself a container, and Mango could just dump the 
> entire container into the index as the key.
> 
> Presumably there’s a followup discussion dedicated to indexing where we 
> can suss out what to do in that scenario.
> 
> Adam


Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

2019-01-31 Thread Robert Newson
Hi Nick,

I don't think anyone responded to your points yet.

I think it would significantly complicate this work to make it a per-database 
decision. I think it has to be a wholesale cutover to a new backend with 
appropriate warnings in release notes and guidance on migration. This is why 
the plan is for a major 3.0 release before the fdb-based release (likely to be 
4.0) as a pathway to that. To your specific point about the pluggable storage 
engine, I believe none of that code makes it over to the couchdb-on-fdb release.

I called out the problems with reduce functionality in the first post of this 
thread specifically to shake out people's concerns there, so thank you for 
voicing yours. The current approach to reduce only works because we control the 
writing of the b+tree nodes, including when they're split, etc, so we're able 
to maintain intermediate values on the inner nodes as the data changes over 
time. This is not something we can do with FoundationDB directly (or, indeed, 
anything else). We're looking for a solution here. The best we have so far 
preserves the group level reduces (including group level of 0, i.e, the reduce 
value of everything in the view). Those group level reduces will be at least as 
efficient as today, I think. For arbitrary start/endkey reduce we might decide 
to not support them, or to support them the expensive way (without the benefit 
of precomputed intermediate values). 
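
As a toy illustration of the "expensive way" (Python, not actual CouchDB or
FoundationDB code): with no precomputed intermediate values, a reduce over an
arbitrary start/endkey range has to read every row in the range and fold the
reduce function over it, so the cost grows with the size of the range rather
than with the depth of a tree.

    # Toy model of a map view: (key, value) rows sorted by key.
    view_rows = [("a", 1), ("b", 2), ("c", 3), ("d", 4), ("e", 5)]

    def reduce_sum(values):
        return sum(values)

    def range_reduce(rows, startkey, endkey, reduce_fun):
        # O(rows in range): every matching value is read and reduced on the fly.
        selected = [v for k, v in rows if startkey <= k <= endkey]
        return reduce_fun(selected)

    print(range_reduce(view_rows, "b", "d", reduce_sum))  # 9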

B.

-- 
  Robert Newson
  rnew...@apache.org

On Sun, 27 Jan 2019, at 14:45, nicholas a. evans wrote:
> Thanks Jan,
> 
> On Sun, Jan 27, 2019, 3:43 AM Jan Lehnardt  
> > The FDB proposal is starting at a higher level than the pluggable storage
> > engines. This isn't just about storage, but also about having a new
> > abstraction over the distributed systems aspects of CouchDB.
> >
> 
> Right, FoundationDB would *also* replace all of the
> internal-cluster-replication, since it is already a scalable distributed
> database. I was just curious if we'd be able to leverage the pluggable
> storage engine work. I.e. could the other parts that change (fabric, etc)
> *also* be swapped out such that foundationdb or legacy
> couch_bt_engine+fabric could be selected on a per db basis. Maybe not, but
> IMO it'd still be interesting if we could somehow try out a foundationdb
> pluggable storage engine as a proof of concept.
> 
> As for reduce: CouchDB will *not* lose reduce. Details are TBD, so let's
> > wait to discuss them for when the technical proposal for that part is out,
> > please.
> >
> 
> > So far, all IBM has mentioned is that in their preliminary exploration of
> > this, they couldn't find a trivial way to support *efficient querying* for
> > *custom reduce functions* (anything that isn't _sum/_count/_stats).
> >
> 
> Yes, I understand all of that. But I *really* *really* need efficient
> querying for custom reduce functions. Ideally, I'd like group_level queries
> to be even *more* efficient than they currently are.
> 
> I wasn't trying to jump into a deep technical proposal. I was just putting
> forward a naive napkin sketch level idea, and wondering if it could
> possibly work as a trivial way to support efficient queries on custom
> reduce functions.
> 
> I either need that to stay (or improve) or else I need some new
> features (i.e. view changes feed, dbcopy, etc) that make it easier and
> worthwhile to rewrite all of my code that relies on it. Because if I had to
> significantly change my codebase to migrate to CouchDB 4.0, I
> honestly think my bosses might opt to replace our storage layer with
> something other than CouchDB rather than upgrade.
> 
> Thanks,
> Nick Evans
> 
> >


Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

2019-01-26 Thread Robert Newson
Hi,

It’s only arbitrary start/end key queries of reduce values that we have no solution 
for yet. For map-only, we can and would supply arbitrary start/end key. 

More explicitly, it’s only _efficient_ reduce results that are lost, because we 
can’t store the intermediate reduce values on inner btree nodes. We could 
calculate the reduce value dynamically by reading the entire range of selected 
keys and calculating the reduce value each time. 

Finally, this is a known gap in our investigation. It doesn’t mean there isn’t 
an answer to be discovered. 
 
B

> On 26 Jan 2019, at 17:20, Reddy B.  wrote:
> 
> Hello,
> 
> Just to add the modest perspective of a user. I appreciate the benefits of 
> taking advantage of the infrastructure provided by FDB both from a quality 
> perspective and from a maintenance and ease of expansion perspective.
> 
> However, this development makes me really worried of being burned as a user 
> so to speak. Losing arbitrary reduce functions would be a big concern. But 
> losing arbitrary startkey and endkey would be an even bigger concern.
> 
> This is not about the inconvenience of updating our codebase, this is about 
> losing the ability to do quite significant things, losing expressiveness so 
> to speak. We make extensive use of startkeys/endkeys for things ranging from 
> geoqueries using a simple application-based geohash implementation, all the 
> way to matching the documents belonging to a tenant using complex keys. So if 
> this is indeed the feature we'd be losing, this is quite a big deal in my 
> opinion. I think all our data access layers would need to be rewritten but I 
> do not even know how.
> 
> For I do not know if we are representative, but we intentionally stay away 
> from Mango to leverage the performance benefits of using precomputed views 
> which see as a key feature of Couchdb. Mango is quite non-deterministic when 
> it comes to performance (defining the right indexes is cumbersome compared to 
> using views, and its difficult to know if a query will be completed by doing 
> in-memory filtering). And people keep reporting a number of troubling bugs. 
> So moving people to Mango is not only about updating applications, this is 
> also losing quite substantial features.
> 
> All in all, my point is that with the changes I hear, I feel like a lot 
> of the technical assumptions we made when we settled on Couchdb would no 
> longer hold. There are patterns we wouldn't be able to use, and I don't even 
> think that it would still be possible to develop real world applications 
> solely relying on the view/reduce pipeline if we do not have the level 
> of expressiveness provided by custom reduce and arbitrary startkeys/endkeys. 
> Without these two structures, we will be burned in certain situations.
> 
> Just wanted to voice this concern to highlight that there are folks like us 
> for whom the API of the view/reduce pipeline is central. So that hopefully 
> this can be taken into account as the merits of this proposal are being 
> reviewed.
> 
> 
> De : Dave Cottlehuber 
> Envoyé : samedi 26 janvier 2019 15:31:24
> À : dev@couchdb.apache.org
> Objet : Re: [DISCUSS] Rebase CouchDB on top of FoundationDB
> 
>> On Fri, 25 Jan 2019, at 09:58, Robert Samuel Newson wrote:
>> 
> 
> Thanks for sharing this Bob, and also thanks everybody who shared their
> thoughts too.
> 
> I'm super excited, partly because we get to keep all our Couchy
> goodness, and also that FDB brings some really interesting operational
> capabilities to the table that normally you spend a decade trying to
> build from scratch. The level of testing that has gone into FDB is
> astounding[1].
> 
> Things like seamless data migration, expanding storage and rebalancing
> shards and nodes, as anybody who's dealt with large or long-lived
> couchdb clusters knows are Hard Problems today.
> 
> There's clearly a lot of work to be done -- it's early days -- and it
> changes a lot of non-visible things like packaging, dependencies,
> cross-platform support, and a markedly different operations model -- but
> I'm most excited about the opportunities here at the storage layer for
> us.
> 
> Regarding handling larger k/v items than what fdb can handle: this is covered
> in the forums already[2] and is similar to how we'd query multiple docs
> from a couchdb view today using an array-based complex/compound key:
> 
> [0, ..] would give you all the docs in that view under key 0
> 
> except that in FDB that query would happen for a single couchdb doc, and
> returning a range query to achieve that. Similar to multiple docs, there
> are some traps around managing that in an atomic fashion at the higher
> layer.
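
A toy sketch of that chunking idea (illustrative Python only, not an FDB client
example; the 100k chunk size echoes the limit mentioned earlier in the thread):
split an oversized value into numbered pieces under one key prefix, then
reassemble it with a single range read.

    CHUNK = 100_000  # stand-in for the per-value size limit

    def to_chunks(doc_id, blob):
        # Key is (doc_id, chunk_number); key ordering keeps the pieces adjacent.
        pieces = [blob[i:i + CHUNK] for i in range(0, len(blob), CHUNK)]
        return [((doc_id, n), piece) for n, piece in enumerate(pieces)]

    def from_range(rows):
        # rows: (key, value) pairs returned by a range read over the doc_id prefix.
        return b"".join(value for _key, value in sorted(rows))

    assert from_range(to_chunks("doc1", b"x" * 250_000)) == b"x" * 250_000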
> 
> I'm sure there are many more things like this we'll need to wrap our
> heads around!
> 
> Especial thanks to the dual-hat-wearing IBM folk who have engaged with
> the community so early in the process -- basically at the napkin
> stage[3].
> 
> [1]: 

Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

2019-01-23 Thread Robert Newson
eering steps.
>>> 
>>> * * *
>>> 
>>> Finally, I want to reiterate Bob’s point: while this proposal is largely
>>> driven by IBM, IBM has no power to unilaterally force the CouchDB project
>>> to accept this proposal and they have already signalled and worked towards
>>> making this a mutually beneficial endeavour. The CouchDB project has
>>> different objectives from IBM and it is up to us to come up with a proposal
>>> that satisfies all of our objectives as well as IBMs, should this motion
>>> pass.
>>> 
>>> Best
>>> Jan
>>> —
>>> 
>>> 
>>>> On 23. Jan 2019, at 11:00, Robert Samuel Newson 
>>> wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> CouchDB 2.0 introduced clustering; the ability to scale a single
>>> database across multiple nodes, increasing both the maximum size of a
>>> database and adding native fault-tolerance. This welcome and considerable
>>> step forward was not without its trade-offs. In the years since 2.0 was
>>> released, users frequently encounter the following issues as a direct
>>> consequence of the 2.0 clustering approach:
>>>> 
>>>> 1. Conflict revisions can be created on normal concurrent updates issued
>>> to a single database, since each replica of a database shard independently
>>> chooses whether to accept a given update, and all replicas will eventually
>>> propagate updates that any one of them has chosen to accept.
>>>> 2. Secondary indexes ("views") do not scale the same way as document
>>> lookups, as they are sharded by doc id, not emitted view key (thus forcing
>>> a consultation of all shard ranges for each query).
>>>> 3. The changes feed is no longer totally ordered and, worse, could
>>> replay earlier changes in the event of a node failure (even a temporary
>>> one).
>>>> 
>>>> The idea is to use FoundationDB as the new CouchDB foundational layer,
>>> letting it take care of data storage and placement. An introduction to
>>> FoundationDB would take up too much space here so I will summarise it as a
>>> highly scalable ordered key-value store with transactional semantics,
>>> provides strong consistency, scaling from a single node to many. It is
>>> licensed under the ASLv2 but is not an Apache project.
>>>> 
>>>> By using FoundationDB we can solve all three of the problems listed
>>> above and deliver semantics much closer to CouchDB 1.x's behaviour while
>>> improving upon the scalability advantages that 2.0 introduced. The
>>> essential character of CouchDB would be preserved (MVCC for documents,
>>> replication between CouchDB databases) but the underlying plumbing would
>>> change significantly. In addition, this new foundation will allow us to add
>>> long wished-for features more easily. For example, multi-document
>>> transactions become possible, as does efficient field-level reading and
>>> writing. A further thought is the ability to update views transactionally
>>> with the database update.
>>>> 
>>>> For those familiar with the CouchDB 2.0 architecture, the proposal is,
>>> in effect, to change all the functions in fabric.erl so that they work
>>> against a (possibly remote) FoundationDB cluster instead of the current
>>> implementation of calling into the original CouchDB 1.x code (couch_btree,
>>> couch_file, etc).
>>>> 
>>>> This is a large change and, for full disclosure, the IBM Cloudant team
>>> are proposing it. We have done our due diligence in investigating
>>> FoundationDB as well as detailed investigation into how CouchDB semantics
>>> would be built on top of FoundationDB. Any and all decisions on that must
>>> take place here on the CouchDB developer mailing list, of course, but we
>>> are confident that this is feasible.
>>>> During those investigations we have identified a small number of CouchDB
>>> features that we do not yet see a way to do on FoundationDB, the main one
>>> being custom (Javascript) reduces. This is a direct consequence of no
>>> longer rolling our own persistence layer (couch_btree and friends) and
>>> would likely apply to any alternative technology.
>>>> 
>>>> I think this would be a great advance for CouchDB, preserving what makes
>>> CouchDB special but taking advantage of the superbly engineered
>>> FoundationDB software at the bottom of the stack.
>>>> 
>>>> Regards,
>>>> Robert Newson
>>> 
>>> --
>>> Professional Support for Apache CouchDB:
>>> https://neighbourhood.ie/couchdb-support/
>>> 
>>> 
> 



Re: Proposal: removing view changes code from mrview

2018-08-04 Thread Robert Newson
+1

Sent from my iPhone

> On 31 Jul 2018, at 13:52, Eiri  wrote:
> 
> Hi all,
> 
> Since we seem to be in agreement and with 2.1.2 released, I'm starting to 
> work on this.
> Just wanted to let everyone know.
> 
> 
> Regards,
> Eric
> 
> 
> 
>> On Apr 3, 2018, at 13:03, Paul Davis  wrote:
>> 
>> +1
>> 
>>> On Tue, Apr 3, 2018 at 9:23 AM, Joan Touzet  wrote:
>>> 
>>> +1.
>>> 
>>> 1. No one has worked on a fix since its contribution prior to 2.0.
>>> 2. The code will always be in git in an older revision if someone is
>>> looking for it.
>>> 3. We have #592 which describes the fundamental problem that needs to be
>>> resolved. (By the way, with my PMC hat on, you should unassign this issue
>>> from yourself unless you're actively working on it *right now*.)
>>> 
>>> - Original Message -
>>> From: "Eiri" 
>>> To: dev@couchdb.apache.org
>>> Sent: Tuesday, April 3, 2018 8:15:21 AM
>>> Subject: Proposal: removing view changes code from mrview
>>> 
>>> Hi all,
>>> 
>>> It is my understanding that a current implementation of view changes in
>>> mrview is conceptually broken. I heard from Robert Newson that he and
>>> Benjamin Bastian found that some time ago doing testing with deletion and
>>> recreation of docs emitting same keys in the views.
>>> 
>>> I propose to remove view changes code from mrview and its mention from
>>> documentation, as it seems that people keep trying to use those for filtered
>>> replication or getting a false impression that it's a simple fix in fabric.
>>> Not to mention that the current implementation quite complicates the mrview
>>> code and takes up space in view files by building unneeded seq and kseq
>>> btrees.
>>> 
>>> We can re-implement this feature later in a more robust way as there is
>>> clearly a demand for it. Please share your opinion.
>>> 
>>> 
>>> Regards,
>>> Eric
>>> 
> 



Re: [PROPOSAL] Drop PDF / texinfo documentation builds

2017-03-18 Thread Robert Newson
+1. Please use fire.

> On 18 Mar 2017, at 17:44, Robert Kowalski  wrote:
> 
> good idea, +1
> 
>> On Sat, Mar 18, 2017 at 9:14 AM, Jan Lehnardt  wrote:
>> 
>> +1
>> 
>> Cheers
>> Jan
>> --
>> 
>>> On 18 Mar 2017, at 05:55, Joan Touzet  wrote:
>>> 
>>> Hi everyone,
>>> 
>>> I'd like to propose dropping the PDF and texinfo targets from our
>>> documentation build, or at the very least, having them not be part of
>>> the default target / not standard deliverables for the project.
>>> 
>>> We'd continue to build HTML documentation as part of the workflow,
>>> naturally, as well as the man pages.
>>> 
>>> I don't have any solid numbers, but I'm fairly sure most people use
>>> https://docs.couchdb.org/ or a locally installed copy for their
>>> documentation for CouchDB rather than the PDF documentation. I
>>> personally can't remember the last time I opened the docs in PDF form. I
>>> also have never seen anyone refer to the PDF docs on the mailing lists,
>>> IRC or Slack when asking for advice or support.
>>> 
>>> I've also never seen anyone use or talk about the texinfo target, and
>>> I've not used them myself.
>>> 
>>> Dropping this dependency will allow us to drop TeX/LaTeX from our build
>>> chain, which speeds up build times by about 90 seconds and reduces the
>>> size of containers currently being built for our Jenkins CI workflow.
>>> It also means CouchDB devs don't have to install 0.5-1.5GB worth of
>>> toolset.
>>> 
>>> I've captured this in JIRA as
>>> https://issues.apache.org/jira/browse/COUCHDB-3329 and have PRs ready
>>> to fire off if people agree.
>>> 
>>> Your thoughts?
>>> 
>>> -Joan
>> 


Re: ransom note - couchdb exploit / privilege escalation ?

2017-01-23 Thread Robert Newson
Hi Vivek,

Thanks for the update and thanks for persevering. I agree with you on likely 
cause here. 

To your follow up question, it's not intentional that _users can be deleted, 
its more a side effect of admin privileges. CouchDB before 2.0 will create that 
db automatically if it's missing. From 2.0 we can't automate it as we need to 
wait until the cluster is joined. 

Sent from my iPhone

> On 23 Jan 2017, at 10:39, Vivek Pathak  wrote:
> 
> As a follow up, I have a design question
> 
> http://docs.couchdb.org/en/2.0.0/intro/security.html#authentication-database 
> says:
> 
> * There is a special design document _auth that cannot be modified
> 
> However it looks like the admin user can delete the authentication database 
> (thereby deleting _auth document as well).
> 
> Is there a convenience benefit of allowing this (eg: admin party is useful 
> when you start off locally and don't care about security)?
> 
> Thanks
> Vivek
> 
> 
>> On 01/23/2017 05:27 AM, Vivek Pathak wrote:
>> Sorry for delayed response (I had to restore the backups and harden the 
>> server a bit in order to deal with the ongoing attempts to grab my data).   
>> And thank you all to those who helped.
>> 
>> Looks like this was a plain password sniffing of admin password. No evidence 
>> of guessing or repeated attempts - and it was not a simple password to guess 
>> or crack.
>> 
>> I believe the admin password could only be sniffed because it was on open 
>> port 5984.   I was careless because the site was in development.
>> 
>> So now I have couchdb listening on 127.0.0.1, and the admin password is now 
>> randomly generated 18 characters (don't know if the centos 7 rng has a 
>> trapdoor though).  The need for replication and UI access via _utils can be 
>> satisfied by setting up an ssh tunnel via a random port, eg:
>> 
>>ssh -N -L 57237:localhost:5984  user@1.2.3.4
>> 
>> Next is to move to https - and that should complete the securing aspect.  
>> Also ended up creating offline backup on a stopped ec2 instance - this 
>> should come handy if the attack become really serious.
>> 
>> Thank you
>> 
>> 
>>> On 01/20/2017 09:09 AM, Thomas Guillet wrote:
>>> @Paul: I agree, it is pretty straightforward to have some basic settings on.
>>> 
>>> Could we rely on the cluster_setup endpoint to secure the instance?
>>> If that is considered to be the first 'mandatory step' of a live
>>> instance, it would be nice as an almost out-of-the-box secure set up.
>>> (Plus, you can always "curl" the endpoint instead of "perl" the local.ini)
>>> 
>>> SSL-only is tricky as the http server can't be deactivated in
>>> local.ini but in default.ini (from memory).
>>> 
>>> @All: What do you consider a sane/secure set up? What are the known
>>> unsecured features/weaknesses of CouchDB.
>>> 
>>> @Vivek: You issue worries me quite a lot. Do you have a better idea of
>>> what happened?
>>> I saw you are using HTTP instead of HTTPS, were you using in encrypted
>>> connection to exchange your credentials and session?
>>> Is your instance behind a proxy? (nginx or alike) They may have other
>>> logs to help us investigate.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 2017-01-20 12:49 GMT+01:00 Paul Hammant :
> tee-hee, that was my wishful thinking, less actual planning :)
> 
> As usual, there is no estimate for now.
> 
 Don't worry - my open source commitments slip by five years at a time, but
 I thought I'd ask just in case.
 
 It might be better to focus on a series of post-install scripts for 2.x
 that lock down a couch.
 
 I was *very* excited by my first (and more or less only) exposure to
 CouchDB for - 
 http://paulhammant.com/2015/12/21/angular-and-svg-and-couchdb.
 As part of that I wanted to make it easy for the reader to turn on CORS:
 
 perl -p -i -e 's/;enable_cors/enable_cors/'
 /usr/local/etc/couchdb/default.ini
 perl -p -i -e 's/enable_cors = false/enable_cors = true/'
 /usr/local/etc/couchdb/default.ini
 perl -p -i -e 's/;origins/origins/' /usr/local/etc/couchdb/default.ini
 perl -p -i -e 's/origins = /origins = */' 
 /usr/local/etc/couchdb/default.ini
 perl -p -i -e 's/origins = \*\*/origins = */'
 /usr/local/etc/couchdb/default.ini
 
 
 That's to turn on CORS (CouchDB v1.6.x), for the blog entry.
 
 I'll bet that it's only another eight "one-liners" (Perl or not) to go
 SSL-only, cancel the AdminParty, and generate a unique admin password.
 
 - Paul
>> 
> 



Re: ransom note - couchdb exploit / privilege escalation ?

2017-01-20 Thread Robert Newson
A reminder that the security sensitive discussion with Vivek is happening 
elsewhere. We don't want to reveal any issue that might be found until we have 
a fix, if  it turns out to be an avoidable fault in couchdb. The discussion of 
improved 3.0 defaults can continue here or in a new thread. 

Sent from my iPhone

> On 20 Jan 2017, at 14:09, Thomas Guillet  wrote:
> 
> @Paul: I agree, it is pretty straightforward to have some basic settings on.
> 
> Could we rely on the cluster_setup endpoint to secure the instance?
> If that is considered to be the first 'mandatory step' of a live
> instance, it would be nice as an almost out-of-the-box secure set up.
> (Plus, you can always "curl" the endpoint instead of "perl" the local.ini)
> 
> SSL-only is tricky as the http server can't be deactivated in
> local.ini but in default.ini (from memory).
> 
> @All: What do you consider a sane/secure set up? What are the known
> unsecured features/weaknesses of CouchDB.
> 
> @Vivek: You issue worries me quite a lot. Do you have a better idea of
> what happened?
> I saw you are using HTTP instead of HTTPS, were you using in encrypted
> connection to exchange your credentials and session?
> Is your instance behind a proxy? (nginx or alike) They may have other
> logs to help us investigate.
> 
> 
> 
> 
> 
> 
> 2017-01-20 12:49 GMT+01:00 Paul Hammant :
>>> 
>>> tee-hee, that was my wishful thinking, less actual planning :)
>>> 
>>> As usual, there is no estimate for now.
>>> 
>> 
>> Don't worry - my open source commitments slip by five years at a time, but
>> I thought I'd ask just in case.
>> 
>> It might be better to focus on a series of post-install scripts for 2.x
>> that lock down a couch.
>> 
>> I was *very* excited by my first (and more or less only) exposure to
>> CouchDB for - http://paulhammant.com/2015/12/21/angular-and-svg-and-couchdb.
>> As part of that I wanted to make it easy for the reader to turn on CORS:
>> 
>> perl -p -i -e 's/;enable_cors/enable_cors/'
>> /usr/local/etc/couchdb/default.ini
>> perl -p -i -e 's/enable_cors = false/enable_cors = true/'
>> /usr/local/etc/couchdb/default.ini
>> perl -p -i -e 's/;origins/origins/' /usr/local/etc/couchdb/default.ini
>> perl -p -i -e 's/origins = /origins = */' /usr/local/etc/couchdb/default.ini
>> perl -p -i -e 's/origins = \*\*/origins = */'
>> /usr/local/etc/couchdb/default.ini
>> 
>> 
>> That's to turn on CORS (CouchDB v1.6.x), for the blog entry.
>> 
>> I'll bet that it's only another eight "one-liners" (Perl or not) to go
>> SSL-only, cancel the AdminParty, and generate a unique admin password.
>> 
>> - Paul



Re: CouchDB 2.0 as Snap

2016-09-19 Thread Robert Newson
Make a separate systemd service for epmd and have the couch one depend on it. 
There is a parameter you can add to couch's vm.args file to prevent it even 
trying to start epmd. 
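
A minimal sketch of that arrangement (unit names and paths are assumptions,
adjust for your install):

    # /etc/systemd/system/epmd.service -- hypothetical standalone epmd unit
    [Unit]
    Description=Erlang Port Mapper Daemon

    [Service]
    ExecStart=/usr/bin/epmd
    Restart=always

    [Install]
    WantedBy=multi-user.target

    # in the couchdb unit, declare the dependency
    [Unit]
    Requires=epmd.service
    After=epmd.service

    # and in couchdb's vm.args, stop the VM from starting its own epmd
    -start_epmd false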

Sent from my iPhone

> On 19 Sep 2016, at 22:47, Michael Hall  wrote:
> 
> Thanks to help from Jan and Wohali on IRC, I was able to manually build
> couchdb from the 2.0.x branch, and then snap-package the resulting
> binary. I have attached the snapcraft.yaml used for this. Put this file
> in a directory with the couchdb directory built in ./rel/, then run
> "snapcraft snap" to build couchdb_2.0_amd64.snap
> 
> The snap package will create a systemd service file for running couchdb
> as a daemon, but due to the way it launches a background epmd process
> this isn't working right (systemd thinks it failed to start and keeps
> trying to restart it until it gives up). Because of that, I've also
> included a /snap/bin/couchdb.run which will manually kick it off, but
> this should only be temporary until the daemon process can be fixed.
> 
> One last caveat, you'll need to copy /snap/couchdb/current/etc/*.ini
> into /var/snap/couchdb/current/ and mkdir /var/snap/couchdb/current/data
> before running it. This could be done at runtime either by couchdb
> itself, or with a custom wrapper script for the snap command.
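
(A hypothetical wrapper along those lines might look like the sketch below;
the paths follow the snap layout described above and are assumptions:)

    #!/bin/sh
    # Sketch: seed writable config and data on first run, then start couchdb.
    set -e
    CONF=/var/snap/couchdb/current
    [ -e "$CONF/default.ini" ] || cp /snap/couchdb/current/etc/*.ini "$CONF/"
    mkdir -p "$CONF/data"
    exec /snap/couchdb/current/bin/couchdb "$@"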
> 
> Michael Hall
> mhall...@gmail.com
> 
>> On 09/19/2016 01:19 PM, Jan Lehnardt wrote:
>> 
>>> On 19 Sep 2016, at 19:13, Michael Hall  wrote:
>>> 
>>> Maybe I'm using the wrong branch, because the Makefile has an "install"
>>> target but not a "release" target. I'm using developer-preview-2.0, if
>>> that's not the correct one, which should I use?
>> 
>> Please use the `2.0.x` branch.
>> 
>> Best
>> Jan
>> --
>> 
>>> 
>>> Michael Hall
>>> mhall...@gmail.com
>>> 
 On 09/19/2016 12:10 PM, Jan Lehnardt wrote:
 Heya, nice effort here :)
 
 CouchDB 2.0 doesn’t use autotools. It mimics them minimally, but only
 insofar as it is useful for CouchDB and not for tools that expect
 autotools-like behaviour.
 
 Over time, we want to make it so that the CouchDB install procedure
 fits right into normal tooling, but we are not there yet.
 
 Especially, `make install` is not available in 2.0. Instead, we
 have `make release` which produces a location independent directory
 `./rel/couchdb` that you can move into your system where you need it.
 
 There is no way to externalise log files or so from a setup perspective
 (although it can be configured in local.ini).
 
 HTH
 
 Best
 Jan
 --
 
> On 19 Sep 2016, at 17:48, Michael Hall  wrote:
> 
> I have attached the snapcraft.yaml file I've started. This is used by
> the snapcraft tool to build and package a .snap file (just run
> `snapcraft snap` in the same directory as this file).
> 
> You can see that most of it is dedicated to grabbing the source,
> specifying build dependencies (build-packages) and runtime dependencies
> (stage-packages). The 'autotools' plugin will run the standard
> "./configure; make; make install" steps on the source, and while the
> output of those claims to be successful, make returns with a non-zero
> status code ($?=2) which causes snapcraft to abort after building.
> 
> As mentioned previously, this could be significantly simplified if it
> could use the build processes already in place. In that case the
> snapcraft.yaml would only need to be pointed to the local directory
> containing the binary files needed to include in the .snap package. If
> somebody wants to give that a try, I can put together a new
> snapcraft.yaml that will do that.
> 
> 
> Michael Hall
> mhall...@gmail.com
> 
>> On 09/19/2016 02:56 AM, Constantin Teodorescu wrote:
>> It would be nice to have two snap packages:
>> - CouchDB 2.0 UN-CLUSTERED
>> - CouchDB 2.0 CLUSTERED VERSION
>> 
>> That will encourage a lot of "standalone" CouchDB users to upgrade to a 
>> 2.0
>> version without the clustering overload stuff, and thus make a big pool 
>> of
>> 2.0 testers and bug-reporters!
>> Teo
>> 
>> 
>>> On Mon, Sep 19, 2016 at 4:47 AM, Michael Hall  
>>> wrote:
>>> 
>>> First off, congratulations on the upcoming 2.0 release!
>>> 
>>> I would love to see this new version available as a Snap package for
>>> users of Ubuntu 16.04 LTS, since the archive version will be frozen on
>>> 1.6.0 for the next 5 years of its lifecycle.
>>> 
>>> Snaps are self-contained packages that include all of the dependencies
>>> they need, which lets them run as you (the upstream) intended across new
>>> releases of Ubuntu, Debian, Arch, and many other distros. They run in a
>>> sandbox that protects them from changes made to the user's system, but
>>> 

Re: Printing passwords in Couch log files?

2016-09-15 Thread Robert Newson
100% agree that we shouldn't but it's hard to guarantee it never happens, hence 
the warning. Passwords are held in process state so we can authenticate to 
remote sources and targets while replicating. Crashes of those processes write 
state dumps to the log. 

We can do better but it will involve some re-engineering of internals. We'll 
get it done but, for now, we can only warn you about the problem. 

Sent from my iPhone

> On 15 Sep 2016, at 11:44, Paul Hammant  wrote:
> 
> In http://guide.couchdb.org/draft/security.html it is disclosed that
> passwords are written to the log if the debug level is 'debug' level. I'm
> not sure that's good practice.  I do not think Couch should log passwords
> at any log level, and I think others might agree.
> 
> At the very least it should be a specific setting in the config:
> 
>  [log]
>  level = debug
>  log-passwords = false  // proposed :)
> 
> Thoughts?
> 
> - Paul


Re: Adding a node to cluster

2016-08-25 Thread Robert Newson
Ok, seems I've confused you. 

Couchdb replication occurs over http or https, as you know. The nodes in a 
couchdb 2.0 cluster do not communicate with each other over http. They use 
Erlang rpc. Erlang rpc can be configured for TLS encryption.  It's in the 
Erlang faq and is fairly simple to set up in newer Erlang releases. 

I feel I owe an example of 2.0 cluster that exclusively uses TLS for all 
communications. 
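
In the meantime, a minimal sketch of the vm.args additions involved, following
the OTP "Using TLS for Erlang Distribution" guide (certificate paths below are
placeholders):

    -proto_dist inet_tls
    -ssl_dist_opt server_certfile /path/to/cert.pem
    -ssl_dist_opt server_keyfile /path/to/key.pem
    -ssl_dist_opt server_secure_renegotiate true client_secure_renegotiate true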

Sent from my iPhone

> On 24 Aug 2016, at 20:47, Joey Samonte  wrote:
> 
> What if we remove the reverse proxy and just set up the CouchDB nodes to 
> allow only SSL connections, port 6984? 
> https://wiki.apache.org/couchdb/How_to_enable_SSL
> 
>> Subject: Re: Adding a node to cluster
>> From: rnew...@apache.org
>> Date: Wed, 24 Aug 2016 19:43:51 +0100
>> To: dev@couchdb.apache.org
>> 
>> Assuming you mean a 2.0 cluster, no, all those nodes need to be able to 
>> communicate with erlang rpc (service discovery over port 4369 and then 
>> whatever port the node is running on).
>> 
>>> On 24 Aug 2016, at 12:36, Joey Samonte  wrote:
>>> 
>>> Good day,
>>> 
>>> Is it possible to add a node to a cluster from Fauxton if the remote host 
>>> is behind a reverse proxy (nginx) configured as HTTPS?
>>> 
>>> Regards,
>>> Joey
> 



Re: Can clustering be setup between nodes that only accept SSL connections?

2016-08-25 Thread Robert Newson
Yes, couchdb can be configured that way but my recommendation is to put 
something like haproxy in front instead. The native ssl support in Erlang has a 
buggy history in my experience, though I believe 18.x is working quite nicely. 
Further, with couchdb 2.0, you'll want a round-robin load balancer in front of 
them to fully enjoy the clustered fault tolerance. 

For < 2.0, you just need to configure the httpsd daemon and comment out the 
httpd one. For 2.0, I'll have to research a little as I'm not sure the chttpd 
service is as easily disabled. 
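
For reference, a sketch of the < 2.0 ini changes along the lines of the wiki
page linked earlier in the thread (file paths are placeholders):

    [daemons]
    httpsd = {couch_httpd, start_link, [https]}

    [ssl]
    cert_file = /etc/couchdb/cert/couchdb.pem
    key_file = /etc/couchdb/cert/privkey.pem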

Sent from my iPhone

> On 24 Aug 2016, at 21:08, Joey Samonte  wrote:
> 
> Good day,
> 
> SSL is a must for us to secure our data. Can the CouchDB nodes in the cluster 
> only allow https, for example, on port 6984?
> 



Re: local.d/*.ini are not read by couchdb 2.0 rc2?

2016-07-31 Thread Robert Newson
I filed a ticket for it. It'll work before final 2.0 release. 

Sent from my iPhone

> On 31 Jul 2016, at 22:48, Bian Ying  wrote:
> 
> Thanks for your response.
> 
> Can this make into rc3 or rc4? It's important for us to have configurations 
> managed in local.d.
> 
> - Ying
> 
>> On Aug 1, 2016, at 00:44, Robert Samuel Newson  wrote:
>> 
>> oooh, you might be right there. Fixable, thanks for catching it!
>> 
>> B.
>> 
>>> On 31 Jul 2016, at 09:39, Bian Ying  wrote:
>>> 
>>> I guess developers here may give some insight to me. ;)
>>> 
>>> - Ying
>>> 
>>> Begin forwarded message:
>>> 
 From: Ying Bian 
 Date: July 30, 2016 at 23:19:27 GMT+8
 To: u...@couchdb.apache.org
 Subject: local.d/*.ini are not read by couchdb 2.0 rc2?
 Reply-To: u...@couchdb.apache.org
 
 Hi,
 
 I’m using the official docker image of couchdb 2.0 rc2 and want to mount 
 some local configuration under local.d. 
 Per document (http://docs.couchdb.org/en/1.6.1/config/intro.html) (does it 
 still apply to 2.0?), it should work. But I
 just can't make it work. This is the command I used to start the container:
 
 $ docker run -it -v /home/ybian/local.d:/opt/couchdb/etc/local.d -p 
 5984:5984 klaemo/couchdb:2.0-rc2
 
 There is one *.ini file under /home/ybian/local.d where I added cors and 
 admin user configuration. I also verified that
 this directory got mounted successfully by inspecting the container. But 
 the configuration does not take any effect.
 Seems to me that the *.ini file under local.d was not read at all.
 
 I also found that `couchdb -c` command mentioned in the doc above does not 
 work to show configuration file chain 
 anymore.
 
 Is local.d still expected to work in 2.0?
 
 -Ying
>> 



Re: Mango full text search is immune to accented letters?

2016-07-30 Thread Robert Newson
The backend of mango FT is Lucene and certainly handles accented characters. It 
all comes down to which analyser you are using. 

Sent from my iPhone

> On 30 Jul 2016, at 13:17, Constantin Teodorescu  wrote:
> 
> Is Mango Full text indexer/search (or would it be) immune to accented
> letters?
> 
> I'm planning to use it for searching "posta" but it may be "poştă" in
> documents!
> SQLite3 FTS4 is able to do that!
> 
> For the moment I'm using CouchDB 1.6 views with explicit "flatten function"
> in JavaScript to create a non-accented index:
> 
>  var translate_re = /[ŞȘŢȚÎĂÂÁşșţțîăâá]/g,
>  translate = {
>'Ş': 'S', 'ş': 's',
>'Ș': 'S', 'ș': 's',
>'Ţ': 'T', 'ţ': 't',
>'Ț': 'T', 'ț': 't',
>'Ă': 'A', 'ă': 'a',
>'Â': 'A', 'â': 'a',
>'Á': 'A', 'á': 'a',
>'Î': 'I', 'î': 'i'
>  };
> 
>function makeSearchString(s) {
>return ( s.replace(translate_re, function(match) {
>  return translate[match];
>}) );
>}
> 
> Teo



Re: CouchDB 2.0 blog series

2016-07-28 Thread Robert Newson
Whenever you want to compact, it's the only current method. 

Given the difficulty there, especially for view shards, we should consider 
adding _compact on 5984 (compact all shards at once). 
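
A hedged example of the current per-shard route (the shard name below is made
up; list the real ones via the node-local port first):

    curl http://localhost:5986/_all_dbs
    curl -X POST -H 'Content-Type: application/json' \
        'http://localhost:5986/shards%2F00000000-1fffffff%2Fmydb.1469000000/_compact'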

Sent from my iPhone

> On 27 Jul 2016, at 18:47, Joan Touzet <woh...@apache.org> wrote:
> 
> We had a question in IRC recently about compaction of individual shards
> and having to go through the 5986 port. Some discussion of when this is
> necessary and why would be most useful.
> 
> -Joan
> 
> - Original Message -
>> From: "Mayya Sharipova" <may...@ca.ibm.com>
>> To: dev@couchdb.apache.org
>> Sent: Wednesday, July 27, 2016 12:59:21 PM
>> Subject: Re: CouchDB 2.0 blog series
>> 
>> Hello everyone!
>> Does anyone know any user-facing changes in compaction between v1.6
>> and v2.0?
>> 
>> Jay and I have identified the following two changes that we would
>> like to address in the blog:
>> 
>> 1) use ioq to separately prioritise compaction requests
>> https://github.com/apache/couchdb-couch/commit/95b60be72c271db1fc4317c9a1aa0a1537798fda
>> 
>> 2) improved compaction efficiency with a temp file:
>> https://github.com/apache/couchdb-couch/commit/9d830590f8a9a699315c78b329a8e80079ed48bd
>> 
>> 
>> Were there any other major changes that worth mentioning in the
>> compaction blog?
>> 
>> Thanks,
>> Mayya
>> 
>> 
>> 
>> 
>> 
>> -Jenn Turner <j...@thehoodiefirm.com> wrote: -
>> To: "dev@couchdb.apache.org" <dev@couchdb.apache.org>
>> From: Jenn Turner <j...@thehoodiefirm.com>
>> Date: 07/25/2016 02:14PM
>> Cc: "dev@couchdb.apache.org" <dev@couchdb.apache.org>,
>> "market...@couchdb.apache.org" <market...@couchdb.apache.org>
>> Subject: Re: CouchDB 2.0 blog series
>> 
>> Hello!
>> 
>> 
>> 
>> Based on the responses to my initial requests for volunteers I’ve put
>> together
>> a tentative schedule for the series. I've also created issues in JIRA
>> and if
>> there aren't any objections, I'll be assigning these dates as the due
>> dates.
>> 
>> 
>> 
>> Please let me know if these dates don’t work for you!
>> 
>> 
>> 
>> Week 1
>> Jul 25: The Road to CouchDB 2.0, Jan Lehnardt
>> Jul 27: Feature: Fauxton, Garren Smith
>> 
>> Week 2
>> Aug 1: The CouchDB 2.0 Architecture, Robert Newson
>> Aug 3: Feature: Mango query, Tony Sun
>> 
>> Week 3
>> Aug 8: Release Candidates, Joan Touzet
>> Aug 10: Feature: compactor, Maaya Sharipova
>> 
>> Week 4
>> Aug 15: Feature: replicator, Nick Vatamaniuc
>> Aug 17: Migration Guide, (need volunteer)
>> 
>> Week 5
>> Aug 22: Miscellaneous improvements and bugfixes, Jan Lehnardt
>> 
>> 
>> 
>> Also – For the Migration Guide post, we had a volunteer, but I'd like
>> to pair
>> them up with someone who has been on the project a bit longer, is
>> there anyone
>> who wants to volunteer to do that?
>> 
>> 
>> 
>> Thanks again to everyone who has volunteered, you're awesome :D
>> 
>> 
>> 
>> Jenn Turner
>> 
>> The Neighbourhoodie Software GmbH
>> Adalbertstr. 7-8, 10999 Berlin
>> neighbourhood.ie (http://neighbourhood.ie/)
>> 
>> 
>> Handelsregister HRB 157851 B Amtsgericht Charlottenburg
>> Geschäftsführung: Jan Lehnardt
>> 
>> 
>> 
>> On Jul 25 2016, at 2:19 am, Andy Wenk <andyw...@apache.org> wrote:
>> 
>>> awesome  Spread the word everybody !
>> 
>> 
>>> Cheers
>> 
>> 
>>> Andy
>> 
>> 
>>> --
>>> Andy Wenk
>>> RockIt!
>> 
>> 
>>> Hamburg / Germany
>> 
>> 
>>> GPG public key:
>> https://pgp.mit.edu/pks/lookup?op=get&search=0x4F1D0C59BC90917D
>> 
>> 
>>>  On 25 Jul 2016, at 11:14, Jan Lehnardt <j...@apache.org> wrote:
>>> 
>>>  And we’re live:
>>> https://blog.couchdb.org/2016/07/25/the-road-to-couchdb-2-0/
>>> 
>>>  Thanks everyone for their comments! <3
>>> 
>>>  Best
>>>  Jan
>>>  --
>>> 
>>>  On 24 Jul 2016, at 18:43, Jan Lehnardt <j...@apache.org> wrote:
>>> 
