Re: CouchDb Rewrite/Fork

Joan Touzet Wed, 10 Jul 2019 14:01:38 -0700

Sounds like a challenging and intense project!

If you retain compatibility with the CouchDB replication protocol, we'd
certainly be willing to include it in our list of CouchDB ecosystem
partners, once your product is publicly available.


-Joan

On 2019-07-10 10:04, Reddy B. wrote:
> Of course, this will have nothing to do with CouchDB in terms of branding; I 
> don't even think we'll use the codebase. Moreover, our primary goal isn't to offer a 
> competing product to the public. It is to serve our internal needs and reduce 
> our risk. We will only open-source it as a way to give back once the product 
> is very mature (which is also a way to reduce support needs).
> 
> Thanks
> ________________________________
> From: Robert Newson <rnew...@apache.org>
> Sent: Wednesday, 10 July 2019 15:47
> To: dev@couchdb.apache.org
> Subject: Re: CouchDb Rewrite/Fork
> 
> That’s valuable feedback, thank you.
> 
> Best of luck with your new project and a gentle reminder that you may not 
> call it CouchDB.
> 
> B.
> 
>> On 10 Jul 2019, at 00:07, Reddy B. <redd...@live.fr> wrote:
>>
>> Hi all,
>>
>> I've checked the recent discussions and apparently July is the "vision 
>> month" lol. Hopefully this email will not saturate the patience of the core 
>> team.
>>
>> We have been thinking about forking/rewriting CouchDb internally for quite 
>> some time now, and this idea has reached a degree of maturity such that I'm 
>> pretty confident it will materialize at this point. We hesitated between 
>> doing our thing internally to then make our big open-sourcing announcement 
>> 5-10 years from now when the product is battle tested, and announcing our 
>> intentions here today.
>>
>> However, I realized that good things may happen by providing this feedback, 
>> and that providing this type of feedback also is a way of giving back to the 
>> community.
>>
>> The reason for this project is that we have lost confidence in the way the 
>> vision of CouchDb aligns with our goals. As far as we are concerned, there 
>> are 3 things we loved with CouchDb:
>>
>> #Map/Reduce
>>
>> We think that the benefits of Map/Reduce are very underrated. Map/reduce 
>> forces developers to approach problems differently and results in much more 
>> efficient and well-thought-out application architectures and 
>> implementations. This is in addition to the performance benefits since 
>> indexes are built in advance in a very predictable manner (with a few 
>> well-documented caveats). For this reason, our developers are forbidden from 
>> using Mango, and we require them to wrap their head around problems until 
>> they are able to solve them in map/reduce mode.
>>
>> However, we can see that the focus of the CouchDb project is increasingly on 
>> Mango, and we have little confidence in the commitment of the project to 
>> first-class citizen Map/Reduce support (while this was for us a defining 
>> aspect of the identity of CouchDb).
>>
>> #Complexity of the codebase
>>
>> Open-source software that is too complex to be tweaked and hacked is, for 
>> all practical purposes, closed-source software. You guys are VERY smart. And 
>> by nature a database software system is a non-trivial piece of technology.
>>
>> Initially we felt confident that the codebase was small enough and clean 
>> enough that should we really need to get our hands dirty in an emergency 
>> situation, we would be able to do so. Then Mango made the situation a bit 
>> blurrier, but we could easily ignore that, especially since we do not use 
>> it. However with FoundationDB... this becomes a whole different story.
>>
>> The domain model of a database is non-trivial by nature, and now 
>> FoundationDb will introduce an additional level of abstraction and 
>> indirection, and a very serious one. I've been reading the design 
>> discussions since the FoundationDb announcement and there are a lot of 
>> impedance mismatches requiring the domain model of CouchDb to be broken up 
>> into fictitious entities intended to accommodate FoundationDb abstractions and 
>> their limitations (I'll come back to this point in a moment).
>>
>> Indirection is also introduced at the business logic level, with additional 
>> steps needing to be followed to emulate the desired behavior. All of this is 
>> complexity and obfuscation, and to be realistic, if we already struggled 
>> with the straight-to-the-point implementation, there is no way we'll be able 
>> to navigate (let alone hack) the FoundationDB-based implementation.
>>
>> #(Apparent) Non-Alignment of FoundationDb with the reasons that made us love 
>> CouchDb
>>
>> FoundationDb introduces limitations regarding transactions, document sizes 
>> and a number of other critical items. One of the main reasons we use CouchDb 
>> is because of the way it allows us to develop applications rapidly and 
>> flexibly address all the state storage needs of application layers. CouchDb 
>> has you covered if you just want to dump large media files streamed with HTTP 
>> range requests while you iterate fast and your userbase is small, and 
>> replication allows you to seamlessly scale by distributing load on clusters in 
>> advanced ways without needing to redesign your applications. The user 
>> nkosi23 nicely describes some of the new possibilities enabled by CouchDb:
>>
>> https://github.com/apache/couchdb/pull/1253#issuecomment-507043600
>>
>> However, the limitations introduced by FoundationDb and the spirit of their 
>> project, which favors abstraction purity through aggressive constraints over 
>> operational flexibility, are the opposite of the reasons we loved CouchDb and 
>> believed in it. It is to us pretty clear that the writing is on the wall. We 
>> aren't confident in FoundationDb to cover our bases, since covering our 
>> bases is explicitly not the goal of their project and their spirit is 
>> different from what has made CouchDb unique (ease of use, simple yet 
>> powerful and flexible abstractions etc...).
>>
>> #Lack of commitment to the ideas pioneered
>>
>> We feel like CouchDb itself undervalues the wealth of what it has brought to 
>> the table. For example when it comes to architecting load balancing for all 
>> sorts of applications with a single and transparent value store, CouchDb 
>> enables things that simply weren't possible before, and people will need 
>> time to understand how they can take advantage of them.
>>
>> Nowadays we can see sed, awk and such be used in pretty clever ways, but it 
>> took time for people to incorporate the possibilities enabled by these tools 
>> in their thinking process (even though system administration tools are much 
>> easier to deploy than enterprise applications).
>>
>> I think that CouchDb should have a 10 or 20-year outlook on the paradigm 
>> shifts it introduces. There is a need to give more place to faith and less 
>> place to data, since not every usage will be adopted within 3 years. 
>> Sometimes you need to do things because you believe in them and you know you 
>> are right and that eventually people will come. But right now, it feels like 
>> customer statistics from Cloudant have become the main driver of the 
>> project. A balance can probably be found between aligning with business 
>> realities and evangelism realities. I feel IBM guys are totally right to 
>> share their insights, but if there are no faith-zealots to counterbalance 
>> them, then a positive may become a negative.
>>
>> #What we plan to do
>>
>> For all these reasons, CouchDb 3 will likely be the last release we will 
>> use. What we are about to activate is an effort to rewrite CouchDb to focus 
>> on the use case that we think makes CouchDb unique: a one-stop shop for all 
>> data storage needs, no matter the type of application and load. This means 
>> focusing, on the one hand, on working seamlessly with extremely large 
>> attachments and documents of any size, and on the other hand on replication 
>> features (the two go hand in hand).
>>
>> We will also seek to resurrect old features such as list views that we think 
>> need long-term faith. To make it possible from a bandwidth perspective, we 
>> will make a number of radical decisions. The two most important ones may be 
>> the following:
>>
>> - Only map/reduce will be supported. Far from a limitation, we see this as a 
>> way of life and a different way of thinking about designing line-of-business 
>> applications. Our finding is that a line-of-business application never 
>> needs SQL-style flexibility for the main app if the problem space has been 
>> correctly modeled (instead of being Excel in the web browser). When Business 
>> Analytics are really needed, the need is always very localized, and it is 
>> nowadays easy enough to have an ETL pipeline on a separate instance 
>> (especially considering CouchDb filtered replication capabilities).
>> - Rewrite CouchDb in FSharp.
>>
>> Rewriting in FSharp will provide all the benefits of functional programming, 
>> while giving us access to a rich ecosystem of libraries, and a great static 
>> type checking system. All of this will mean more time to focus on the core 
>> features.
>>
>> This is, in a nutshell, pretty much the plan. This is still early stages, and the 
>> way we do things, we would typically roll it out internally for a number of 
>> years before announcing it to the public. So I think there will likely be a 
>> 10-yearish window before you hear about this again.
>>
>> I simply wanted to provide our feedback as a friendly contribution.
> 
>
