I'd like to request that there be threads where it's appropriate to discuss:
- Managing the refactoring/merge process to avoid the previous situation
  where 1.x was mostly dead, but 2.x wasn't going to land for a few years.
- Other features to deprecate at the same time as losing JS reduce (I assume
  that this really means "all external query servers" are going away?).
- What the support for users who will be stuck on 2.x will be.

Apologies for the noise if those are already on the list of topics. :)

Cheers,
Eli

On Wed, Jan 23, 2019 at 5:33 AM Jan Lehnardt <j...@apache.org> wrote:

> Hi Bob,
>
> this is all very exciting!
>
> First up, full disclosure, the CouchDB PMC has had about two weeks to
> think about this already, so if any of the following doesn’t sound like a
> knee-jerk reaction, that’s why.
>
> I’m personally tentatively optimistic about this proposal and I’m willing
> to work through all open questions from governance, contribution management
> to the technical bits to see if we as the CouchDB project arrive at a point
> where we are comfortable going down this path.
>
> The PMC has already identified a set of discussion areas for this dev@
> mailing list to go through before any definite decision can be made.
> Separate emails for those discussions are going to be posted on this list
> shortly, so I won’t go into further detail here.
>
> If anyone sees a need for discussion beyond the threads that will appear
> here, please speak up at your earliest convenience. This proposal would
> mean a big step for our project, and we must make sure to hear all voices.
>
> Once we’ve gone through all this, the resulting answers to all the open
> questions coming up will end up in a consensus finding process on this
> mailing list, which will signify the final project decision.
>
> * * *
>
> That said, I’d like to highlight one of these topics: IBM/Cloudant’s
> contributions going forward.
>
> Looking at how 2.0 came to be, the contributions were mostly taken on good
> faith (and legal review), and from the trust Cloudant built up operating a
> large number of large instances of clusters of what would eventually become
> CouchDB 2.0. It has clearly paid off for CouchDB, and our current level of
> success wouldn’t be possible without IBM/Cloudant.
>
> However, some of the ways we work with the IBM team leave something to be
> desired. Specifically, the Apache CouchDB community is frequently not
> involved in design discussions around new features. Those happen inside IBM
> and we “only” get a PR that then goes through the regular review process.
> Again, this has served us well, but we can do even better, so I’d like to
> take the opportunity of this larger proposal to suggest we actually do
> better. As promised, a more detailed thread about this is going to come up,
> and it’ll be the right place to go through the minutiae of this.
>
> With this structural change, I believe we are in a great position to work
> through the details of this proposal and the subsequent design and
> engineering steps.
>
> * * *
>
> Finally, I want to reiterate Bob’s point: while this proposal is largely
> driven by IBM, IBM has no power to unilaterally force the CouchDB project
> to accept this proposal and they have already signalled and worked towards
> making this a mutually beneficial endeavour. The CouchDB project has
> different objectives from IBM and it is up to us to come up with a proposal
> that satisfies all of our objectives as well as IBM’s, should this motion
> pass.
>
> Best
> Jan
> --
>
> > On 23. Jan 2019, at 11:00, Robert Samuel Newson <rnew...@apache.org> wrote:
> >
> > Hi,
> >
> > CouchDB 2.0 introduced clustering: the ability to scale a single
> > database across multiple nodes, increasing both the maximum size of a
> > database and adding native fault-tolerance. This welcome and considerable
> > step forward was not without its trade-offs.
> > In the years since 2.0 was released, users have frequently encountered
> > the following issues as a direct consequence of the 2.0 clustering
> > approach:
> >
> > 1. Conflict revisions can be created on normal concurrent updates issued
> >    to a single database, since each replica of a database shard
> >    independently chooses whether to accept a given update, and all
> >    replicas will eventually propagate updates that any one of them has
> >    chosen to accept.
> > 2. Secondary indexes ("views") do not scale the same way as document
> >    lookups, as they are sharded by doc id, not emitted view key (thus
> >    forcing a consultation of all shard ranges for each query).
> > 3. The changes feed is no longer totally ordered and, worse, could
> >    replay earlier changes in the event of a node failure (even a
> >    temporary one).
> >
> > The idea is to use FoundationDB as the new CouchDB foundational layer,
> > letting it take care of data storage and placement. An introduction to
> > FoundationDB would take up too much space here, so I will summarise it as
> > a highly scalable, ordered key-value store with transactional semantics
> > that provides strong consistency and scales from a single node to many.
> > It is licensed under the ASLv2 but is not an Apache project.
> >
> > By using FoundationDB we can solve all three of the problems listed
> > above and deliver semantics much closer to CouchDB 1.x's behaviour while
> > improving upon the scalability advantages that 2.0 introduced. The
> > essential character of CouchDB would be preserved (MVCC for documents,
> > replication between CouchDB databases) but the underlying plumbing would
> > change significantly. In addition, this new foundation will allow us to
> > add long wished-for features more easily. For example, multi-document
> > transactions become possible, as does efficient field-level reading and
> > writing. A further thought is the ability to update views transactionally
> > with the database update.
> >
> > For those familiar with the CouchDB 2.0 architecture, the proposal is,
> > in effect, to change all the functions in fabric.erl so that they work
> > against a (possibly remote) FoundationDB cluster instead of the current
> > implementation of calling into the original CouchDB 1.x code
> > (couch_btree, couch_file, etc).
> >
> > This is a large change and, for full disclosure, the IBM Cloudant team
> > are proposing it. We have done our due diligence in investigating
> > FoundationDB as well as detailed investigation into how CouchDB semantics
> > would be built on top of FoundationDB. Any and all decisions on that must
> > take place here on the CouchDB developer mailing list, of course, but we
> > are confident that this is feasible.
> > During those investigations we have identified a small number of CouchDB
> > features that we do not yet see a way to do on FoundationDB, the main one
> > being custom (JavaScript) reduces. This is a direct consequence of no
> > longer rolling our own persistence layer (couch_btree and friends) and
> > would likely apply to any alternative technology.
> >
> > I think this would be a great advance for CouchDB, preserving what makes
> > CouchDB special but taking advantage of the superbly engineered
> > FoundationDB software at the bottom of the stack.
> >
> > Regards,
> > Robert Newson
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/