Re: [DISCUSS] Rebase CouchDB on top of FoundationDB

Eli Stevens (Gmail) Wed, 23 Jan 2019 09:55:50 -0800

I'd like to request that there be threads where it's appropriate to discuss:


- Managing the refactoring/merge process to avoid the previous situation
where 1.x was mostly dead, but 2.x wasn't going to land for a few years.
- Other features to deprecate at the same time as losing JS reduce (I
assume that this really means "all external query servers" are going away?).
- What the support for users who will be stuck on 2.x will be.

Apologies for the noise if those are already on the list of topics.  :)

Cheers,
Eli

On Wed, Jan 23, 2019 at 5:33 AM Jan Lehnardt <[email protected]> wrote:

> Hi Bob,
>
> this is all very exciting!
>
> First up, full disclosure, the CouchDB PMC has had about two weeks to
> think about this already, so if any of the following doesn’t sound like a
> knee-jerk reaction, that’s why.
>
> I’m personally tentatively optimistic about this proposal and I’m willing
> to work through all open questions, from governance and contribution
> management to the technical bits, to see if we as the CouchDB project arrive at a point
> where we are comfortable going down this path.
>
> The PMC has already identified a set of discussion areas for this dev@
> mailing list to go through before any definite decision can be made.
> Separate emails for those discussions are going to be posted on this list
> shortly, so I won’t go into further detail here.
>
> If anyone sees a need for discussion beyond the threads that will appear
> here, please speak up at your earliest convenience. This proposal would
> mean a big step for our project, and we must make sure to hear all voices.
>
> Once we’ve gone through all this, the resulting answers to all the open
> questions coming up will end up in a consensus finding process on this
> mailing list, which will signify the final project decision.
>
> * * *
>
> That said, I’d like to highlight one of these topics: IBM/Cloudant’s
> contributions going forward.
>
> Looking at how 2.0 came to be, the contributions were mostly taken on good
> faith (and legal review), and from the trust Cloudant built up operating a
> large number of large instances of clusters of what would eventually become
> CouchDB 2.0. It has clearly paid off for CouchDB and our current level of
> success wouldn’t have been possible without IBM/Cloudant.
>
> However, some of the ways we work with the IBM team leave things to be
> desired. Specifically, the Apache CouchDB community is frequently not
> involved in design discussions around new features. Those happen inside IBM
> and we “only” get a PR that then goes through the regular review process.
> Again, this has served us well, but we can do even better, so I’d like to
> take the opportunity of this larger proposal to suggest we actually do
> better. As promised, a more detailed thread about this is going to come up,
> and it’ll be the right place to go through the minutiae of this.
>
> With this structural change, I believe we are in a great position to work
> through the details of this proposal and the subsequent design and
> engineering steps.
>
> * * *
>
> Finally, I want to reiterate Bob’s point: while this proposal is largely
> driven by IBM, IBM has no power to unilaterally force the CouchDB project
> to accept this proposal and they have already signalled and worked towards
> making this a mutually beneficial endeavour. The CouchDB project has
> different objectives from IBM and it is up to us to come up with a proposal
> that satisfies all of our objectives as well as IBM’s, should this motion
> pass.
>
> Best
> Jan
> —
>
>
> > On 23. Jan 2019, at 11:00, Robert Samuel Newson <[email protected]> wrote:
> >
> > Hi,
> >
> > CouchDB 2.0 introduced clustering: the ability to scale a single
> > database across multiple nodes, increasing the maximum size of a
> > database and adding native fault tolerance. This welcome and
> > considerable step forward was not without its trade-offs. In the years
> > since 2.0 was released, users have frequently encountered the following
> > issues as a direct consequence of the 2.0 clustering approach:
> >
> > 1. Conflict revisions can be created on normal concurrent updates issued
> >    to a single database, since each replica of a database shard
> >    independently chooses whether to accept a given update, and all
> >    replicas will eventually propagate updates that any one of them has
> >    chosen to accept.
> > 2. Secondary indexes ("views") do not scale the same way as document
> >    lookups, as they are sharded by doc id, not emitted view key (thus
> >    forcing a consultation of all shard ranges for each query).
> > 3. The changes feed is no longer totally ordered and, worse, could replay
> >    earlier changes in the event of a node failure (even a temporary one).
> >
> > The idea is to use FoundationDB as the new CouchDB foundational layer,
> > letting it take care of data storage and placement. An introduction to
> > FoundationDB would take up too much space here, so I will summarise it as
> > a highly scalable ordered key-value store with transactional semantics
> > that provides strong consistency and scales from a single node to many.
> > It is licensed under the ASLv2 but is not an Apache project.
> >
> > By using FoundationDB we can solve all three of the problems listed
> > above and deliver semantics much closer to CouchDB 1.x's behaviour while
> > improving upon the scalability advantages that 2.0 introduced. The
> > essential character of CouchDB would be preserved (MVCC for documents,
> > replication between CouchDB databases) but the underlying plumbing would
> > change significantly. In addition, this new foundation will allow us to
> > add long wished-for features more easily. For example, multi-document
> > transactions become possible, as does efficient field-level reading and
> > writing. A further thought is the ability to update views
> > transactionally with the database update.
> >
> > For those familiar with the CouchDB 2.0 architecture, the proposal is,
> > in effect, to change all the functions in fabric.erl so that they work
> > against a (possibly remote) FoundationDB cluster instead of the current
> > implementation of calling into the original CouchDB 1.x code
> > (couch_btree, couch_file, etc).
> >
> > This is a large change and, for full disclosure, the IBM Cloudant team
> > are proposing it. We have done our due diligence in investigating
> > FoundationDB as well as detailed investigation into how CouchDB
> > semantics would be built on top of FoundationDB. Any and all decisions
> > on that must take place here on the CouchDB developer mailing list, of
> > course, but we are confident that this is feasible.
> > During those investigations we have identified a small number of CouchDB
> > features that we do not yet see a way to do on FoundationDB, the main
> > one being custom (JavaScript) reduces. This is a direct consequence of
> > no longer rolling our own persistence layer (couch_btree and friends)
> > and would likely apply to any alternative technology.
> >
> > I think this would be a great advance for CouchDB, preserving what makes
> > CouchDB special but taking advantage of the superbly engineered
> > FoundationDB software at the bottom of the stack.
> >
> > Regards,
> > Robert Newson
>
> --
> Professional Support for Apache CouchDB:
> https://neighbourhood.ie/couchdb-support/
>
>
