Re: Important update on couchdb's foundationdb work

Will Young Tue, 15 Mar 2022 05:33:22 -0700

Hi,

  Personally, I had no specific need in mind for foundationdb to
address; I'm well served by the current storage atop ZFS, so I mostly
see downsides in adding any storage that adds new restrictions,
non-portable dependencies, or puts groups of features into silos. E.g.
foundationdb's requirement for avx was a headache for me as I have
some older x86 as well as Arm, where avx2neon seemed experimental.
Several host-local DBs like RocksDB and SQLite seem like they might
provide a modest benefit while being less likely to create a split somewhere
between anyone using couch for embedded, anyone building a modest
cluster and anyone building a full scale cloud DB service. So if there
is an interest in continuing to develop an alternative to the current
storage, I would prefer investigations into the benefits of a
direct/local RocksDB storage engine rather than having a portion of
the community invest more time into continuing a CouchDB-FDB project.


-Will



On Mon, 14 Mar 2022 at 12:19, Robert Newson <rnew...@apache.org> wrote:
>
> Hi,
>
> That already happened. “Pluggable storage engine” was introduced in 2016 
> (https://github.com/apache/couchdb/commit/f6a771147ba488f80a7d29491263d19088d0eefb).
>
> No alternative backends have yet been contributed.
>
> B.
>
> > On 13 Mar 2022, at 16:27, Chintan Mishra from Rebhu <chin...@rebhu.com> 
> > wrote:
> >
> > As a user, my team and I were keenly looking forward to CouchDB v4 with 
> > FoundationDB.
> >
> > Given the current situation, it is only reasonable to come up with a best 
> > alternative.
> >
> > How about refactoring CouchDB to work with multiple storage engines?
> >
> > The default CouchDB will support whatever the PMC agrees upon. Whereas the 
> > community can tinker with different backend storage engines. So, the 
> > FoundationDB can be one of the backing engines that get used with CouchDB. 
> > Other storage engines can be RocksDB, Apache Derby, etc.
> >
> > Thank you.
> >
> > --
> > Chintan Mishra
> >
> > On 13/03/22 17:09, Robert Newson wrote:
> >> Thank you for this feedback.
> >> I think it’s reasonable to worry about tying CouchDB to FoundationDB for 
> >> some of the reasons you mentioned, but not all of them. We did worry, at 
> > >> the start, about the lack of a governance policy around FoundationDB; 
> >> something that would help ensure the project is not beholden to a single 
> >> corporate entity that might abandon the project or take it in places that 
> >> make it unsuitable for CouchDB in the future. There hasn’t been much 
> >> progress on that, but likewise the project has stayed true to form.
> >> CouchDB is critically dependent on Erlang/OTP, among other components, 
> >> which similarly lack the kind of governance or oversight that Apache 
> > >> projects themselves work within. At no point have I feared the “project 
> >> will end up in FoundationDB integrating CouchDB rather than the other way 
> >> around”. FoundationDB is not a database, it is explicitly only 
> > >> foundational support to build databases on top of.
> > >> “If even you guys weren't treated as a priority, I doubt that my feature 
> >> requests and other input will matter even one bit as a user.” - I’m not 
> >> sure who you refer to with “you guys”, but I remind everyone that the 
> >> CouchDB contributors from IBM Cloudant are the main contributors to 
> > >> CouchDB 2.0 and 3.0, have been so for years and are, in many cases, either 
> >> CouchDB committers or PMC members. They are “us” as much as any other 
> >> contributor. That the Cloudant team has moved focus from CouchDB 4.0 (as 
> > >> it would have been) to 3.0 is a re-establishment of the status quo ante.
> > >> “I doubt that my feature requests and other input will matter even one bit 
> > >> as a user.” - I strongly disagree here. Community contributions are hugely 
> > >> valuable and valued; the rewrite of the lower layers of CouchDB would not 
> >> have changed that significantly. CouchDB-FDB is still written in Erlang, 
> >> the http layer is largely the same code as before. The parts that interact 
> >> with FoundationDB are confined to a single library application (erlfdb) 
> >> which exposes the C language bindings as Erlang functions and data 
> >> structures. Unless you are working at that level you can mostly ignore it.
> >> Finally, while I don’t think we’ve explicitly described it this way, 
> >> CouchDB-FDB effectively _is_ a “layer” on top of FDB in the same sense 
> >> that their “document layer” (which is mongo-like) is.
> >> B.
> >>> On 13 Mar 2022, at 11:17, Reddy B. <redd...@live.fr> wrote:
> >>>
> >>> Hello!
> >>>
> >>> Thanks a lot for this update and overview of the situation. As users (our 
> > >>> company has been using couchdb since circa 2015 as the main database of 
> >>> our 3 tier web apps), I feel it may be preferable to move the couchdb-fdb 
> >>> work to a separate project having a different name. As Janh has 
> >>> mentioned, the internals and daily management of FDB may with certain 
> >>> regards be at odds with the philosophy and user experience that couchdb 
> >>> wants to provide.
> >>>
> >>> Moving this effort to a different project would give people interested in 
> >>> this effort more flexibility to introduce breaking changes and 
> >>> limitations taking full advantage of the philosophy of FDB. I feel the 
> > >>> idea that, if you have outscaled CouchDB, you move to couchdb-fdb (or 
> > >>> another more specialized DB) is the right one. Couchdb-fdb's advantage 
> > >>> compared to alternatives would simply be that it implements both the 
> >>> replication protocol and the HTTP API.
> >>>
> >>> This project may/should even "simply" become something under the umbrella 
> >>> of the FoundationDB layer similar to the MongoDB-compatible document 
> >>> layer of FoundationDB [1].
> >>>
> > >>> And this fact is also the cause of the unease I personally have with this 
> >>> FoundationDB migration: it looks like CouchDB will have much less control 
> >>> over its destiny and even philosophy. This is different from say an 
> >>> encrypted messaging app deciding to replace its home-made encryption with 
> >>> an established and more robust open-source solution. From day 1, I feel 
> >>> like this project will end up in FoundationDB integrating CouchDB rather 
> >>> than the other way around. I even suspected that maybe the dev team was 
> >>> no longer interested in CouchDB and wanted to find it a new home.
> >>>
> >>> My friendly feedback as a user is that I trust the Apache governance 
> >>> model much more than I trust Apple, especially when the welcome meal they 
> > >>> have offered me is that features will be removed and limitations 
> >>> introduced. The political background and what I would call "corporate 
> >>> risk" (key capabilities not implemented by upstream, change in priority 
> >>> or vision, difficulty to affect the roadmap of upstream etc...), is also 
> >>> a key factor when choosing a DB solution as a user.
> >>>
> >>> If even you guys weren't treated as a priority, I doubt that my feature 
> >>> requests and other input will matter even one bit as a user. And I would 
> > >>> have zero chance of having the expertise required to modify the FDB core 
> > >>> myself and get my changes approved to make my CouchDB Layer-related 
> > >>> request possible. While right now I can get my hands dirty and 
> >>> eventually get something done if I really want to. The governance here is 
> >>> very friendly, welcoming and inspiring trust.
> >>>
> >>> So to summarize, I feel that to realize the full potential of this vision 
> >>> rather than settling on compromises not satisfying anyone, it may be 
> >>> better to treat it as a separate project and let CouchDB remain CouchDB. 
> >>> I also feel that the project would lose too much control and sovereignty 
> >>> with such a migration, especially in light of the facts reported.
> >>>
> >>> The scaling challenges and limitations that motivated this effort may 
> >>> probably be addressed differently with a fresh outlook. For example, 
> >>> nowadays, there are even application-level middleware libraries like 
> >>> Microsoft Orleans being able to coordinate ACID distributed transactions 
> >>> from the application layer. My point is, challenges may be able to be 
> > >>> overcome over time by approaching things in a creative manner.
> >>>
> > >>> Users may be able to work around some of them by adjusting the topology of 
> >>> their clusters (using single writer, huge single node with distributed 
> >>> file systems etc...), for other challenges application-layer solutions 
> >>> may exist, or the solution may simply be shipping extremely user friendly 
> >>> graphical management tools making for example things such as conflict 
> >>> resolution a breeze for the admin.
> >>>
> >>> My 2 cents
> >>>
> >>> [1]: https://github.com/FoundationDB/fdb-document-layer
> >>>
> >>>
> >>>
> > >>> On 12 Mar 2022 at 10:26:35, Jan Lehnardt <j...@apache.org> wrote:
> >>>
> >>>> Thanks Bob for passing this along.
> >>>>
> >>>> I’m looking forward to renewed interest in the 3.x codebase :)
> >>>>
> >>>> For our 4.x plans, we’ll have to discuss here what we want to do with it 
> >>>> and I’m looking at everyone for input here. Even if you’ve never spoken 
> > >>>> up on this list before, I’d like to hear from you.
> >>>>
> >>>> * * *
> >>>>
> > >>>> First off, as a project, CouchDB is not obliged to follow IBM’s lead and 
> >>>> abandon the FDB-CouchDB effort. At the same time, it is not obliged to 
> >>>> take what they leave behind and finish it.
> >>>>
> >>>> I know for some the 4.x release is highly anticipated and we as a 
> >>>> project hoped to make a generational jump for our underlying storage and 
> >>>> distribution technologies. During initial discussions about FDB-Couch 
> >>>> and during its development, we anticipated certain developments on the 
> >>>> FDB side (especially allowing longer transactions for consistent 
> >>>> _changes responses with their new Redwood storage engine). It is my 
> >>>> understanding that these developments have not materialised in the way 
> > >>>> we would like them. The consequence is that certain API 
> > >>>> guarantees that 3.x CouchDB gives (consistent full-database snapshots in 
> > >>>> _changes) are not possible to build with native FDB features. I can’t 
> >>>> speak to the very specifics of this, and I hope we can dig into all this 
> >>>> together in this thread, but my takeaway from this is that *if* we 
> >>>> continue with FDB-Couch, I think we will have to reevaluate its 
> >>>> compatibility story, as we had hoped to make it mainly a seamless (but 
> >>>> better) API upgrade from 3.x.
> >>>>
> >>>> We also learned that operating a FDB cluster is a significant effort 
> >>>> that somewhat goes against CouchDB’s mostly “just works” nature. We had 
> >>>> asked the IBM team to share their operational FDB learnings with the 
> > >>>> CouchDB project, so we can build up community knowledge around this, but 
> >>>> this has not materialised either.
> >>>>
> >>>> I’m personally still excited about the opportunities we have with 
> >>>> FDB-Couch, but as a project, we might have to come up with a more 
> >>>> realistic positioning of FDB-CouchDB. Less a “new and improved drop-in 
> >>>> replacement” and maybe more a “if you exceed the scale/capacity of 3.x 
> >>>> CouchDB, you can upgrade to FDB-CouchDB at the expense of a few API 
> >>>> differences and higher operational cost”. This might be worth a 
> >>>> trade-off for large users of CouchDB and thus it might be worth having 
> >>>> both of these codebases live alongside each other.
> >>>>
> >>>> However, that comes with a number of consequences:
> >>>>
> >>>> - The 3.x/4.x naming doesn’t quite work if these are meant to continue 
> >>>> alongside each other.
> >>>>
> >>>> - Maybe FDB-Couch gets its own separate project name and versioning, 
> >>>> with a clear delineation between them.
> >>>>
> >>>> - We would have to maintain two projects complete with release 
> >>>> management, vulnerability management, the lot. At the moment, CouchDB 
> >>>> has just about enough folks contributing to move forward at a reasonable 
> >>>> pace. Doubling that effort might be tricky. While we had an influx of 
> >>>> contributors recently, this would probably need more dedicated planning 
> >>>> and outreach.
> >>>>
> >>>> - New API features would have to be implemented twice, if we want to 
> >>>> keep a majority API overlap. This is not a fun proposition for folks who 
> >>>> add features, which is hard enough, but now they have to do it twice, 
> >>>> onto two different subsystems. Some features (say 
> >>>> multi-doc-transactions) would only be possible in one of the projects 
> > >>>> (FDB-Couch); what would our policy be for deliberate API feature 
> >>>> divergence?
> >>>>
> >>>> - probably more that elude me at the moment.
> >>>>
> >>>> While there are non-trivial points among these, they are not impossible 
> >>>> tasks *if* we find enough and the right folks to carry the work forward.
> >>>>
> >>>> * * *
> >>>>
> >>>> For myself, I still see a lot of potential in the 3.x codebase and I’m 
> >>>> looking forward to renewed roadmap discussions there. I know I have a 
> >>>> long list of things I’d like to see added.
> >>>>
> >>>> From my professional observation, the thing that our (Neighbourhoodie) 
> >>>> customers tend to run into the most is the scaling limits of the 
> >>>> database-per-user pattern. We have a proposal for per-doc-authentication 
> >>>> that helps mitigate a subset of those use-cases, which would be a great 
> >>>> help overall. I have worked on a draft PR of this over the years, but it 
> >>>> mostly stalled out during the pandemic. I’m planning to restart work on 
> >>>> this shortly. If anyone wants to contribute with time and/or money, 
> >>>> please do get in touch.
> >>>>
> >>>> The other major issue with 3.x as reported by IBM is _changes feed 
> >>>> rewinds when nodes are rotated in and out of clusters. We already fixed 
> >>>> a number of changes rewind bugs relatively recently. I don’t know if we 
> >>>> got them all now, or if there are theoretical limits to how far we can 
> >>>> take this given our consistency model, but it’d be worth spending some 
> >>>> time on at least getting rid of all rewind-to-zero cases.
> >>>>
> >>>> * * *
> >>>>
> >>>> I’m also looking forward to all your input on the discussion here. I’m 
> >>>> sure this will explode into a lot of detailed discussions quickly, so 
> > >>>> maybe as a guide to come back to when we get closer to having to make a 
> >>>> decision, here are three ways forward that I see:
> >>>>
> >>>> 1. Follow IBM in abandoning FDB-Couch, refocus all effort on 
> >>>> Erlang-Couch (3.x).
> >>>>
> >>>> 2. Take FDB-Couch development over fully, come up with a story for how 
> >>>> FDB-Couch and Erlang-Couch can coexist and when users should choose 
> >>>> which one.
> >>>>
> >>>> 3. Hand over the FDB-Couch codebase to an independent team that then can 
> >>>> do what they like with it (if this materialises from this discussion).
> >>>>
> >>>> * * *
> >>>>
> >>>> Best
> >>>> Jan
> >>>> —
> >>>>
> >>>>
> >>>>> On 10. Mar 2022, at 17:24, Robert Newson <rnew...@apache.org> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> For those that are following closely, and particularly those that build 
> >>>>> or use CouchDB from our git repo, you'll be aware that CouchDB embarked 
> >>>>> on an attempt to build a next-generation version of CouchDB using the 
> >>>>> FoundationDB database engine as its new base.
> >>>>>
> >>>>> The principal sponsors of this work, the Cloudant team at IBM, have 
> >>>>> informed us that, unfortunately, they will not be continuing to fund 
> >>>>> the development of this version and are refocusing their efforts on 
> >>>>> CouchDB 3.x.
> >>>>>
> >>>>> Cloudant developers will continue to contribute as they always have 
> >>>>> done and the CouchDB PMC thanks them for their efforts.
> >>>>>
> >>>>> As the Project Management Committee for the CouchDB project, we are now 
> >>>>> asking the developer community how we’d like to proceed in light of 
> >>>>> this new information.
> >>>>>
> >>>>> Regards,
> >>>>> Robert Newson
> >>>>> Apache CouchDB PMC
> >>>>>
>
