I think that's slightly different, although related. It seems like a separate feature where we could have a ddoc with a "drop during compaction" function/mango query thing that would be more useful there.
On Wed, Sep 28, 2016 at 12:48 PM, Adam Kocoloski <[email protected]> wrote:
> Cool. I think we can merge this topic with the “Tombstone Curation” topic I
> posted.
>
> Adam
>
>
>> On Sep 28, 2016, at 1:04 PM, Paul Davis <[email protected]> wrote:
>>
>> Thanks for the write up, Jan! I've only got one major change to add
>> and it's a bit of a doozy.
>>
>> # Update our revision model from a tree to a graph
>>
>> As a bit of background, our current revision model is a standard tree.
>> The biggest issue we've seen customers have with using CouchDB
>> "normally" is when they have a workload that generates conflicts with
>> regularity. This ends up creating revision trees with many thousands
>> of revisions with no general bound on growth of the tree. Eventually a
>> single document can take many seconds to minutes to update as the tree
>> has to be read from disk, updated, and then written back to disk. The
>> only effective solution to this currently is to have an operator
>> manually purge revisions from the document.
>>
>> The best solution I've come across to this issue would be to change
>> our revision model to be a graph. Then instead of resolving conflicts
>> by deleting a revision, the conflict is resolved by making an update
>> that references two or more revisions. In this way a customer that
>> generates a large number of conflicts can easily resolve the situation
>> during their normal conflict resolution process. (Our stemming would
>> change from keeping $revs_limit revisions for each leaf to doing a
>> breadth first search and keeping all revisions for the depth that
>> contains $revs_limit revisions or something.)
>>
>> While this approach is fairly straightforward in theory, the
>> difficult part is how we'd want to handle backwards compatibility with
>> replication.
>> So far I could see having a replicator that could
>> translate new revision graphs to old revision trees by "undoing" merge
>> changes and creating the equivalent of the old "deleted revision"
>> logic. However, I don't see a way that we could go from the old format
>> back to the graph (i.e., think about replicating a new style graph doc
>> through an old CouchDB version back to a new graph doc and ending up with
>> the same doc).
>>
>> Obviously, this is a long term goal but I do think we should start
>> thinking about this and possibly making transition plans long term
>> (assuming everyone thinks this is a good idea).
>>
>> A couple other notes:
>>
>> For the HTTP API upgrades, we should look at this as a step in
>> refactoring the logic quite a bit and working toward clean interfaces
>> internally. If we include that as part of the work then having
>> HTTP/HTTP2/WebSockets/Whatevers interfaces available would become much
>> easier. This then enables follow on features like replication over
>> WebSockets or even easier integration work as Koco suggests. I also
>> agree that this is probably our highest priority major feature.
>>
>> My second highest priority feature would then be the pluggable storage
>> engine work. I believe the current work is solid and is minimally invasive.
>> Mayya Sharipova has also been doing some work at [1] using it to
>> enable an improved purge that will hopefully allow us to provide a
>> purge operation at the cluster level (as a bandaid for the revision
>> tree issues I mentioned above). I'd love to get this in and start
>> having other people hacking on alternative storage engine
>> implementations so we can refine the APIs even further.
>>
>> Lastly, for the smart cluster clients, the work I did for COUCHDB-2791 [2]
>> already implements a bit of that. There's definitely more to add here
>> to flesh it out but the surprisingly simple implementation makes me
>> think that it'd fit in quite nicely with the HTTP refactoring work
>> from above.
>> I'm currently working on improving our API tests in Erlang
>> so that I can eventually turn those branches into PRs.
>>
>> [1] https://github.com/cloudant/couchdb-couch/commits/68275_cluster_purge
>> [2] https://issues.apache.org/jira/browse/COUCHDB-2791
>>
>> On Tue, Sep 27, 2016 at 7:56 AM, Jan Lehnardt <[email protected]> wrote:
>>> Hi all,
>>>
>>> apologies in advance, this is going to be a long email.
>>>
>>>
>>> I’ve been holding this back intentionally in order to be able to focus on
>>> shipping 2.0, but now that that’s out, I feel we should talk about what’s
>>> next.
>>>
>>> This email is separated into areas of work that I think CouchDB could
>>> improve on, some with very concrete plans, some with rather vague ideas.
>>> I’ve been collecting these over the past year or <strike>two</strike>five,
>>> so it’s fairly wide, but I’m sure I’m missing things that other people find
>>> important, so please add to this list.
>>>
>>> After the initial discussion here, I’ll move all of the individual issues
>>> to JIRA, so we can go through our usual process.
>>>
>>> This is basically my wish list, and I’d like this to become everyone’s wish
>>> list, so please add what I’ve been missing. :) Note, this isn’t a
>>> free-for-all: only suggest things that you are prepared to see through
>>> to being shipped, from design and implementation to docs.
>>>
>>> I don’t have a specific order for these in mind, although I have a rough
>>> idea of what we should be doing first. Putting all of this on a roadmap is
>>> going to be a fun future exercise for us, though :)
>>>
>>> One last note: this doesn’t include anything on documentation or testing. I
>>> fully expect us to step up our game from here on out. This list is for the
>>> technical aspects of the project.
>>>
>>> * * *
>>>
>>> These are the areas of work I’ve roughly come up with that my suggestions
>>> fit into:
>>>
>>> - API
>>> - Storage
>>> - Query
>>> - Replication
>>> - Cluster
>>> - Fauxton
>>> - Releases
>>> - Performance
>>> - Internals
>>> - Builds
>>> - Features
>>>
>>> (I’m not claiming these are any good, but it’s what I’ve got)
>>>
>>>
>>> Let’s go.
>>>
>>>
>>> * * *
>>>
>>> # API
>>>
>>> ## HTTP2
>>>
>>> I think this is an obvious first next step. Our HTTP layer needs work and
>>> our existing HTTP server library is not getting HTTP2 support, so it’s time
>>> to attack this head-first. I’m imagining a Cowboy[1]-based HTTP layer that
>>> calls into a unified internals layer and everything will be rose-golden.
>>> HTTP2 support for Cowboy is still in progress. Maybe we can help them
>>> along, or we focus on the internals refactor first and drop Cowboy in later
>>> (not sure how feasible this approach is, but we’ll figure this out).
>>>
>>> In my head, we focus on this and call the result 3.0 in 6-12 months. That
>>> doesn’t mean we *only* do this, but this will be the focus (more on this
>>> later).
>>>
>>> There are a few fun considerations, mainly of the “avoid Python
>>> 2/3-chasm”-type. Do we re-implement the 2.0 API with all its
>>> idiosyncrasies, or do we take the opportunity to clean things up while we
>>> are at it? If yes, how and how long do we support the then old API? Do we
>>> manage this via different ports? If yes, how can this be made to work for
>>> hosting services like Cloudant? Etc. etc.
>>>
>>> [1] https://github.com/ninenines/cowboy
>>>
>>>
>>> ## Sub-Document Operations
>>>
>>> Currently a doc update needs the whole doc body sent to the server. There
>>> are some obvious performance improvements possible. For the longest time, I
>>> wanted to see if we can model sub-document operations via JSON Pointers[2].
>>> These would roughly allow pointing to a JSON value via a URL.
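To make the JSON Pointer idea a little more concrete, here is a minimal Python sketch of resolving an RFC 6901 pointer against a parsed document and setting a value through it. The function names (`parse_pointer`, `resolve`, `set_value`) are invented for this illustration and are not part of CouchDB or of any proposed API. List index handling is simplified compared to the RFC, and the sample values mirror the contact document used in this section.

```python
def parse_pointer(pointer):
    """Split an RFC 6901 JSON Pointer into unescaped reference tokens."""
    if pointer == "":
        return []  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError("a non-empty JSON Pointer must start with '/'")
    # RFC 6901: unescape '~1' to '/' first, then '~0' to '~'.
    return [t.replace("~1", "/").replace("~0", "~")
            for t in pointer[1:].split("/")]


def _step(node, token):
    """Descend one level; lists are indexed by integer, objects by key."""
    if isinstance(node, list):
        # Simplification: int() is more lenient than the RFC's index grammar.
        return node[int(token)]
    return node[token]


def resolve(doc, pointer):
    """Return the value that `pointer` refers to inside `doc`."""
    node = doc
    for token in parse_pointer(pointer):
        node = _step(node, token)
    return node


def set_value(doc, pointer, value):
    """Replace the value at `pointer` in place (hypothetical sub-doc update)."""
    tokens = parse_pointer(pointer)
    if not tokens:
        raise ValueError("this sketch does not replace the whole document")
    parent = doc
    for token in tokens[:-1]:
        parent = _step(parent, token)
    if isinstance(parent, list):
        parent[int(tokens[-1])] = value
    else:
        parent[tokens[-1]] = value


doc = {
    "_id": "123abc",
    "contact": {"address": {"street": "Long Street", "nr": 123, "zip": "12345"}},
}
set_value(doc, "/contact/address/zip", "54321")
print(resolve(doc, "/contact/address"))
```

In a server-side implementation the interesting part would be doing this against the stored document under the usual revision check; the sketch only covers the pointer arithmetic.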
>>>
>>> For example in this doc:
>>>
>>> {
>>>   "_id": "123abc",
>>>   "_rev": "zyx987",
>>>   "contact": {
>>>     "name": "",
>>>     "address": {
>>>       "street": "Long Street",
>>>       "nr": 123,
>>>       "zip": "12345"
>>>     }
>>>   }
>>> }
>>>
>>> An update to the zip code could look like this:
>>>
>>> curl -X POST "$SERVER/db/123abc/_jsonpointer/contact/address/zip?rev=zyx987" \
>>>   -d '54321'
>>>
>>> GET/DELETE accordingly. We could shortcut the `_jsonpointer` to just `_` if
>>> we like the short magic.
>>>
>>> JSONPointer can deal with nested objects and lists and works fairly well
>>> for this type of stuff, and it is rather simple to implement (even I could
>>> do it:
>>> https://github.com/janl/erl-jsonpointer/blob/master/src/jsonpointer.erl
>>> and this idea is literally 5 years old, it looks like, so no need to use my
>>> code if there is anything better).
>>>
>>> This is just a raw idea, and I’m happy to solve this any other way, if
>>> somebody has a good approach.
>>>
>>> [2] https://tools.ietf.org/html/rfc6901
>>>
>>>
>>> ## HTTP PATCH / JSON Diff
>>>
>>> Another stab at a similar problem is HTTP PATCH with JSON Diff, but with
>>> the inherent problems of JSON normalisation, I’m leaning towards the
>>> JSONPointer variant as simpler, but I’d be open to this as well, if
>>> someone comes up with a good approach.
>>>
>>>
>>> ## GraphQL[3]
>>>
>>> It’s rather new, but getting good traction[4]. This would be a nice
>>> addition to our API. Somebody might already be hacking on this ;)
>>>
>>> [3]: http://graphql.org
>>> [4]: http://githubengineering.com/the-github-graphql-api/
>>>
>>>
>>> ## Mango for Document Validation
>>>
>>> The only place where we absolutely require writing JS is
>>> validate_doc_update functions. Some security behaviour can only be enforced
>>> there. With their inherent performance problems, I’d like to get doc
>>> validations out of the path of the query server and would love to find a
>>> way to validate document updates through Mango.
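One way to picture validation through Mango is a declarative selector that every incoming document must match, evaluated without calling out to the query server. The Python sketch below is only an illustration under that assumption: it evaluates a small subset of Mango-style selector operators (`$eq`, `$gt`, `$gte`, `$lt`, `$lte`, `$in`, `$exists`) plus nested objects. The names `matches`, `validate_update` and `RULES` are hypothetical, and no such validation hook exists in CouchDB as described in this thread.

```python
_MISSING = object()  # sentinel meaning "field not present in the document"

# Subset of Mango-style comparison operators. Simplification: comparing values
# of different types raises TypeError here, whereas real Mango defines a
# cross-type ordering.
OPERATORS = {
    "$eq": lambda value, arg: value == arg,
    "$gt": lambda value, arg: value > arg,
    "$gte": lambda value, arg: value >= arg,
    "$lt": lambda value, arg: value < arg,
    "$lte": lambda value, arg: value <= arg,
    "$in": lambda value, arg: value in arg,
}


def is_operator_map(condition):
    """True for conditions like {"$gt": 0}, False for nested sub-selectors."""
    return (isinstance(condition, dict) and len(condition) > 0
            and all(key.startswith("$") for key in condition))


def matches(doc, selector):
    """Return True if `doc` satisfies every condition in `selector`."""
    for field, condition in selector.items():
        value = doc.get(field, _MISSING)
        if is_operator_map(condition):
            for op, arg in condition.items():
                if op == "$exists":
                    if (value is not _MISSING) != arg:
                        return False
                elif value is _MISSING or not OPERATORS[op](value, arg):
                    return False
        elif isinstance(condition, dict):
            # Nested object: recurse into the sub-document.
            if not isinstance(value, dict) or not matches(value, condition):
                return False
        elif value is _MISSING or value != condition:
            # A bare value is an implicit equality check.
            return False
    return True


def validate_update(new_doc, selector):
    """Hypothetical hook: reject writes that do not match the selector."""
    if not matches(new_doc, selector):
        raise ValueError("forbidden: document does not match validation selector")


RULES = {
    "type": "contact",
    "contact": {"name": {"$exists": True}, "address": {"nr": {"$gt": 0}}},
}

validate_update(
    {"type": "contact", "contact": {"name": "Ada", "address": {"nr": 123}}},
    RULES,
)
```

A selector like this cannot express everything a validate_doc_update function can (for example checks against the previous revision or the user context), which is part of what would need designing.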
>>>
>>>
>>> ## Redesign Security System
>>>
>>> Our security system has grown slowly and was not coherently designed. We
>>> should start over. I have many ideas and opinions, but they are out of scope
>>> for this. I think everybody here agrees that we can do better. This *very
>>> likely* will *not* include per-document ACLs as per the often stated issues
>>> with that approach in our data model.
>>>
>>> * * *
>>>
>>>
>>> # Replication
>>>
>>> This is our flagship feature of course, and there are a few things we can
>>> do better.
>>>
>>>
>>> ## Mobile-optimised extension or new version of the protocol
>>>
>>> The original protocol design didn’t take mobile devices into account and
>>> through PouchDB et al. we are now learning that there are a number of
>>> downsides to our protocol. We’ve helped a lot with introducing
>>> _bulk_get/_revs, but that’s more a bandaid than a considered strategy ;)
>>>
>>> That new version could also be HTTP2-only, to take advantage of the new
>>> connection semantics there.
>>>
>>>
>>> ## Easy way to skip deletes on sync
>>>
>>> This one is self-explanatory: mobile clients usually don’t need to sync
>>> deletes from a year ago first. Mango filters might already get us there,
>>> maybe we can do better.
>>>
>>>
>>> ## Sync a rolling subset
>>>
>>> Say you always want to keep the last 90 days of email on a mobile device
>>> with optionally back-loading older documents on user-request. It is
>>> something I could see getting a lot of traction.
>>>
>>> Today, this can be built on 1.x with clever use of _purge, but that’s
>>> hardly a good experience. I don’t know if it can be done in a cluster.
>>>
>>>
>>> ## Selective Sync
>>>
>>> There might be other criteria than “last 90 days”, so the more general
>>> solution to this problem class would be arbitrary (e.g. client-directed)
>>> selective sync, but this might be really hard as opposed to the merely very
>>> hard “last 90 days” one, so happy to punt on this first.
>>> But filters are
>>> generally not the answer, especially with large data sets. Maybe proper
>>> sync from views _changes is the answer.
>>>
>>>
>>> ## A _db_updates powered _replicator DB
>>>
>>> Running thousands+ of replications on a server is not really resource
>>> friendly today; we should teach the replicator to only run replication on
>>> active databases via _db_updates. Somebody might already be looking into
>>> this one.
>>>
>>> * * *
>>>
>>>
>>> # Storage
>>>
>>>
>>> ## Pluggable Storage Engines
>>>
>>> Paul Davis already showed some work on allowing multiple different storage
>>> backends. I’d like to see this land.
>>>
>>> ## Different Storage Backends
>>>
>>> These don’t all have to be supported by the main project, but I’d really
>>> like to see some experimentation with different backends like
>>> LevelDB[5]/RocksDB[6], InnoDB[7], SQLite[8], or a native-Erlang one that is
>>> optimised for space usage and not performance (I don’t want to budge on
>>> safety). Similarly, it’d be fun to see if there is a compression format
>>> that we can use as a storage backend directly, so we get full-DB
>>> compression as opposed to just per-doc compression.
>>>
>>> [5]: http://leveldb.org
>>> [6]: http://rocksdb.org
>>> [7]: https://en.wikipedia.org/wiki/InnoDB
>>> [8]: https://www.sqlite.org
>>>
>>> * * *
>>>
>>>
>>> # Query
>>>
>>> ## Teach Mango JOINs and result sorting
>>>
>>> It’s the natural path for query languages. We should make these happen.
>>> Once we have the basics, we might even be able to find a way to compile
>>> basic SQL into Mango, it’s going to be glorious :)
>>>
>>>
>>> ## “No-JavaScript”-mode
>>>
>>> I’ve hinted at this above, but I’d really like a way for users to use
>>> CouchDB productively without having to write a line of JavaScript. My main
>>> motivation is the poor performance characteristics of the Query Server
>>> (hello CGI[9]?).
>>> But even with one that is improved, it will always be faster
>>> to do, say, any filtering or validation operations in native Erlang. I don’t
>>> know if we can expand Mango to cover all this, and I’m not really concerned
>>> about the specifics, as long as we get there.
>>>
>>> Of course, for pro-users, the JS-variant will still be around.
>>>
>>> [9]: https://en.wikipedia.org/wiki/Common_Gateway_Interface
>>>
>>>
>>> ## Query Server V2
>>>
>>> We need to revamp the Query Server. It is hardcoded to an out-of-date
>>> version of SpiderMonkey and we are stuck with C-bindings that barely anyone
>>> dares to look at, let alone iterate on.
>>>
>>> I believe the way forward is revamping the query server protocol to use
>>> streaming IO instead of blocking batches like we do now, and to use a
>>> JS-native implementation of the JS-side instead of C-bindings.
>>>
>>> I’m partial to doing this straight in Node, because there is a ton of
>>> support for things we need already, and I believe we’ve solved the
>>> isolation issues required for secure MapReduce, but I’m happy to use any
>>> other thing as well, if it helps.
>>>
>>> Other benefits would be support for emerging JS features that devs will
>>> want to use.
>>>
>>> And we can have two modes: standalone QS like now, and embedded QS where,
>>> say, V8 is compiled into the Erlang VM. Not everybody will want to run
>>> this, but it’ll be neat for those who do.
>>>
>>>
>>> * * *
>>>
>>>
>>> # Cluster
>>>
>>> ## Rebalancing
>>>
>>> With this we will be able to grow clusters one node at a time instead of
>>> hitting a wall when eventually each shard lives on a single machine. E.g.
>>> when you add a node to the cluster, all other nodes share 1/Nth of their
>>> data with the new node, and everything can keep going. Same for removing a
>>> node and shrinking the cluster.
>>>
>>> Couchbase has this and it is really nice.
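The "all other nodes share 1/Nth of their data with the new node" idea can be sketched as a small planning step. The Python below is a toy model only, under stated assumptions: a cluster is a plain mapping from node name to a list of shard names, shard copies (replicas) are ignored, and `add_node` is a made-up helper, not an existing CouchDB function or the way its shard map is stored.

```python
def add_node(assignment, new_node):
    """Return a new shard assignment in which `new_node` holds about 1/N of
    all shards, taken one at a time from whichever node currently has most."""
    # Copy so the caller's assignment is left untouched.
    result = {node: list(shards) for node, shards in assignment.items()}
    result[new_node] = []

    total = sum(len(shards) for shards in result.values())
    target = total // len(result)  # fair share for the new node

    while len(result[new_node]) < target:
        donor = max(
            (node for node in result if node != new_node),
            key=lambda node: len(result[node]),
        )
        result[new_node].append(result[donor].pop())
    return result


cluster = {
    "node1": ["s00", "s01", "s02", "s03"],
    "node2": ["s04", "s05", "s06", "s07"],
    "node3": ["s08", "s09", "s10", "s11"],
}
rebalanced = add_node(cluster, "node4")
for node, shards in sorted(rebalanced.items()):
    print(node, len(shards), shards)
```

A real rebalancer would also have to copy the shard data, keep replicas of the same range on different nodes, and switch the shard map over without interrupting reads and writes; the sketch only shows the bookkeeping.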
>>>
>>>
>>> ## Setup
>>>
>>> Even without rebalancing, we need a nice Fauxton UI to manage the cluster.
>>> So far we only have a simple setup procedure (which is great, don’t get me
>>> wrong), but users will want to do more elaborate cluster management and we
>>> should make that easy with a slick UI.
>>>
>>>
>>> ## Cluster-Aware Clients
>>>
>>> This might end up being not a good idea, but I’d like some experimentation
>>> here. Say you’d have a CouchDB client that could be hooked into the cluster
>>> topology so it’d know which nodes to query for which data, then we can save
>>> a proxy-hop, and build clients that have lower-latency access to CouchDB.
>>> Again, this is something that Couchbase does and I think is worth exploring.
>>>
>>>
>>>
>>> * * *
>>>
>>>
>>> # Fauxton
>>>
>>> Fauxton is great, but it could be better too, I think. I’m mostly concerned
>>> about the number of clicks/taps required for more specialised actions (like
>>> setting the group_level of a reduce query, it’s like 15 or so). More
>>> cluster info would also be nice, and maybe a specialised dashboard for
>>> db-per-user setups.
>>>
>>>
>>> * * *
>>>
>>>
>>> # Releases
>>>
>>>
>>> ## Six-Week Release Trains
>>>
>>> We need to get back to frequent releases and I propose to go back to our
>>> six-week-release train plans from three years ago. Whatever lands within a
>>> release train time frame goes out. The nature of the change dictates the
>>> version number increment as per semver, and we just ship a new version
>>> every six weeks, even if it only includes a single bug fix. We should
>>> automate most of this infrastructure, so actual releases are cheap. We are
>>> reasonably close with this, but we need some more folks to step up on using
>>> and maintaining our CI systems.
>>>
>>>
>>> ## One major feature per major version
>>>
>>> I also propose to keep the scope of future major versions small, so we
>>> don’t have to wait another 3-5 years for 3.0.
>>> In particular, I think we
>>> should focus on a single major feature per major version and get that
>>> shipped within 6-12 months tops. If anything needs more time, it needs to
>>> be broken up. Of course we continue to add features and fix things while
>>> this happens, but as a project, there is *one* major feature we push. For
>>> example, for 3.0 I see our push being behind HTTP2 support. There is a lot
>>> of subsequent work required to make that happen, so it’ll be a worthwhile
>>> 3.0, but we can ship it in 6-12 months (hopefully).
>>>
>>> Best case scenario, we have CouchDB 4.0 coming out 12 months from now with
>>> two new major features. That would be amazing.
>>>
>>>
>>> * * *
>>>
>>>
>>> # Performance
>>>
>>> ## Perf Team
>>>
>>> We need a team to look comprehensively at CouchDB performance. There is a
>>> lot of low-hanging fruit, like Robert Kowalski showed a while back, and we
>>> should get back into this. I’m mostly inspired by SQLite who’ve done a
>>> release a while back that only focussed on 1-2% performance improvements,
>>> but got like 20-30 of those and made the thing a lot faster across the
>>> board. I can’t remember where I read about this, but I’ll update this once
>>> I find the link.
>>>
>>>
>>> ## Benchmark Suite
>>>
>>> We need a benchmark suite that tests a variety of different workloads. The
>>> goal here is to run different versions of CouchDB against the same suite on
>>> the same hardware, to see where we are going. I’m imagining a
>>> http://arewefastyet.com style dashboard where we can track this, and even
>>> run this on Pull Requests and not allow them if they significantly impact
>>> performance.
>>>
>>>
>>> ## Synthetic Load Suite
>>>
>>> This one is for end users. I’d like to be able to say: My app produces
>>> mostly 10-20kb-sized docs, but millions of those in a single database, or
>>> across 1000s of databases, with these views etc. and then run this on
>>> target hardware so I’d know, e.g.
>>> how many nodes I need for a cluster with
>>> my estimated workload. I know this can only be done in approximation, but I
>>> think this could make a big difference in CouchDB adoption and feed back
>>> into the Perf Team mentioned above.
>>>
>>> * * *
>>>
>>>
>>> # Internals
>>>
>>> ## Consolidate Repositories
>>>
>>> With 2.0 we started to experiment with radically small modules for our
>>> components and I think we’ve come to the conclusion that some consolidation
>>> is better for us going forward. Obvious candidates for separate repos are
>>> docs, Fauxton etc. but also some of the Erlang modules that other projects
>>> reasonably would use.
>>>
>>>
>>> ## Elixir
>>>
>>> I’d like it very much if we elevate Elixir as a prime target language for
>>> writing CouchDB internals. I believe this would get us an influx of new
>>> developers that we badly need to get all the things I’m listing here done.
>>> Somebody might be looking into the technical aspects of this already, but
>>> we need to decide as a project if we are okay with that.
>>>
>>>
>>> ## GitHub Issues
>>>
>>> I hope we can transition to GitHub Issues soon.
>>>
>>> * * *
>>>
>>>
>>> # Builds
>>>
>>> I’d like automated builds for source, Docker et al., rpm, deb, brew, ports,
>>> Mac Binary, etc. with proper release channels for people to subscribe to,
>>> all powered by CI for nightly builds, so people can test in-development
>>> versions easily.
>>>
>>> I’d also like builds that include popular community plugins like Geo or
>>> Fulltext Search.
>>>
>>>
>>>
>>> * * *
>>>
>>>
>>> # Features
>>>
>>> ## Better Support for db-per-user
>>>
>>> I don’t know what this will look like, but this is a pattern, and we need
>>> to support it better.
>>>
>>> One approach could be “virtual dbs” that are backed by a single database,
>>> but that’s usually at odds with views, so we could make this an XOR and
>>> disable views on these dbs. Since this usually powers client-heavy apps,
>>> querying usually happens there anyway.
>>>
>>> Another approach would be better / easier cross-db aggregation or querying.
>>> There are a few approaches, but nothing really slick.
>>>
>>>
>>> ## Schema Extraction
>>>
>>> I have half an (old) patch that extracts top level fields from a document
>>> and stores them with a hash in an “attachment” to the database header. So
>>> we only end up storing doc values and the schema hash. First of all this
>>> trades storage for CPU time (I haven’t measured anything yet), but more
>>> interestingly, we could use that schema data to do smart things like
>>> auto-generating a validation function / mango expression based on the data
>>> that is already in the database. And other fun things like easier schema
>>> migration operations that are native in CouchDB and thus a lot faster than
>>> external ones. For the curious ones, I’ve got the idea from V8’s property
>>> access optimisation strategy[10].
>>>
>>> [10]: https://github.com/v8/v8/wiki/Design%20Elements#fast-property-access
>>>
>>> * * *
>>>
>>> Alright, that’s it for now. Can’t wait for your feedback!
>>>
>>> Best
>>> Jan
>>> --
>>> Professional Support for Apache CouchDB:
>>> https://neighbourhood.ie/couchdb-support/
>>>
>
