Adding:
- Listening on multiple network addresses: Having MochiWeb bind to
multiple (mixed IPv4 and IPv6) IP addresses and ports would help with
setting up CouchDB in highly customized setups. We have a patch for
listening on a second port on the same IP address; this, combined with
a bit of couch_config magic, should do the trick.
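For illustration, a rough Python sketch of the multi-listener idea
(the addresses, ports, and the BINDS name are placeholders, not the
actual MochiWeb patch):

    import socket

    # Hypothetical bind list mixing IPv4 and IPv6 addresses and ports,
    # analogous to what a multi-listener couch_config section might hold.
    BINDS = [
        (socket.AF_INET, "127.0.0.1", 5984),
        (socket.AF_INET, "192.168.0.10", 5985),
        (socket.AF_INET6, "::1", 5984),
    ]

    listeners = []
    for family, addr, port in BINDS:
        s = socket.socket(family, socket.SOCK_STREAM)
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind((addr, port))  # a (host, port) tuple works for both families
        s.listen(128)
        listeners.append(s)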
Cheers
Jan
==
On 2 Dec 2008, at 20:34, Damien Katz wrote:
Here is some stuff I'd like to see in a 1.0.0 release. Everything is
open for discussion.
- Built-in reduce functions to avoid unnecessary JS overhead -
Count, Sum, Avg, Min, Max, Std dev. others?
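For illustration, a rough Python sketch of what the reduce/re-reduce
pairs could look like (function names are placeholders; avg and std
dev derive from count/sum/sum-of-squares so partial results can be
combined during re-reduce):

    # Follows CouchDB's reduce(keys, values, rereduce) convention.
    def stats_reduce(keys, values, rereduce):
        if not rereduce:
            # First pass: values are the raw numbers emitted by the map.
            return {
                "count": len(values),
                "sum": sum(values),
                "sumsqr": sum(v * v for v in values),
                "min": min(values),
                "max": max(values),
            }
        # Re-reduce pass: values are intermediate stats objects.
        return {
            "count": sum(s["count"] for s in values),
            "sum": sum(s["sum"] for s in values),
            "sumsqr": sum(s["sumsqr"] for s in values),
            "min": min(s["min"] for s in values),
            "max": max(s["max"] for s in values),
        }

    def finalize(s):
        # Avg and (population) std dev fall out of the combined stats.
        avg = s["sum"] / s["count"]
        variance = s["sumsqr"] / s["count"] - avg * avg
        return {"avg": avg, "stddev": variance ** 0.5}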
- Restrict database read access -
Right now any user can read any database; we need to be able to
restrict that, at least at the whole-database level.
- Replication performance enhancements -
Adam Kocoloski has some replication patches that greatly improve
replication performance.
- Revision stemming: It should be possible to limit the number of
revisions tracked -
By default each document edit produces a revision id that is tracked
indefinitely. This guarantees that conflicts versus subsequent edits
can always be distinguished in ad-hoc replication, however the ever-
growing list of revisions isn't always desirable. This can be
addressed by limiting the number tracked and purging the oldest
revisions. The downside is that if the revision tracking limit is
N, then anyone who hasn't replicated a document since its last N
edits will see a spurious edit conflict.
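A tiny Python sketch of the stemming rule and the failure mode just
described (rev_limit and the revision ids here are made up):

    def stem_revisions(rev_history, rev_limit):
        # rev_history is ordered newest-first; keep only the newest N.
        return rev_history[:rev_limit]

    history = ["5-e", "4-d", "3-c", "2-b", "1-a"]
    stemmed = stem_revisions(history, 3)  # ["5-e", "4-d", "3-c"]
    # A replica that last saw revision "2-b" can no longer prove its
    # edit descends from ours, so its next push shows up as a
    # spurious conflict.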
- Lucene/Full-text indexing integration -
We have this working in side patches; it needs to be integrated into
trunk and with the view engine.
- Incremental document replication -
We need at the minimum the ability to incrementally replicate only
the attachments that have changed in a document. This will save lots
of network I/O, and CouchDB can act as a version control system with
document diffs added as attachments.
This can work for document fields too, but the overhead may not be
worth it.
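For illustration, a Python sketch of the attachment-level diff (the
per-attachment "digest" field is an assumption about what metadata
would be available):

    def changed_attachments(source_doc, target_doc):
        # Compare attachment checksums; only changed bodies need
        # to cross the wire.
        src = source_doc.get("_attachments", {})
        tgt = target_doc.get("_attachments", {})
        for name, meta in src.items():
            if name not in tgt or tgt[name]["digest"] != meta["digest"]:
                yield name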
- Built-in authentication module(s) -
The ability to host a CouchDB database used to back HTTP
authentication schemes. If storing passwords, they would need to be
stored encrypted and decrypted on demand by the authentication
process.
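As a sketch of the "encrypted at rest, decrypted on demand" idea,
here's Python using the cryptography package's Fernet (key handling
is deliberately simplified; in practice the key would live in server
config, not the database):

    from cryptography.fernet import Fernet

    key = Fernet.generate_key()      # server-side secret
    box = Fernet(key)

    stored = box.encrypt(b"s3cret")  # what the auth database would hold
    # The auth process decrypts on demand, e.g. to compute an
    # HTTP Digest response.
    assert box.decrypt(stored) == b"s3cret"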
- View server enhancements (stale/partial index option) -
Chris Anderson has a side branch for this that we need to finish and
put into trunk.
- View index compaction -
View indexes grow forever, and need to be compacted in a similar way
to how the storage files are compacted. This work will tie into the
view server enhancements.
- Document integrity/deterministic revid -
For the sake of end-to-end document integrity, we need a way to hash
a document's contents, and since we already have revision ids, I
think the revision ids should be the hashes. The hashed document
should be a canonical JSON representation, and it should have the
_id and _rev fields in it. The _rev will be the PREVIOUS revision id/
hash the edit is based on, or blank for a new edit. Then the _rev is
replaced with the new hash value.
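A minimal Python sketch of the scheme (MD5 is just a placeholder
digest; the canonical form here is sorted keys with no extra
whitespace):

    import hashlib, json

    def new_rev(doc, prev_rev=""):
        # Base the edit on the PREVIOUS revision (or blank if new),
        # serialize canonically, then hash to get the new revid.
        doc = dict(doc, _rev=prev_rev)
        canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
        return hashlib.md5(canonical.encode("utf-8")).hexdigest()

    doc = {"_id": "mydoc", "value": 42}
    rev1 = new_rev(doc)                 # first edit: _rev was blank
    rev2 = new_rev(doc, prev_rev=rev1)  # next edit chains off rev1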
- Fully tail append writes -
CouchDB uses zero-overwrite storage, but not fully tail-append
storage. Document JSON bodies are stored in internal buffers, written
consecutively, one after another, until the buffer is completely
full, then another buffer is created at the end of the file for more
documents. File attachments are written to similar buffers as well.
Btree updates are always tail-append; each update to a btree, even
if it's a deletion, causes new writes to the end of the file. Once
the documents, attachments and indexes are committed (fsync), the
header is then written and flushed to disk, and that is always
stored right at the beginning of the file (requiring another seek).
Document updates to CouchDB require 2 fsyncs with ~3 seeks for full
committal and index consistency. This is true whether you write 1 or
1000 documents in a single transaction (bulk update); you still need
~3 seeks. Using conventional transaction journalling, it's possible
to get the committal down to a single seek and fsync, and worry
about ensuring file and index consistency asynchronously, often in
batch mode with other committed updates. This can perform very well,
but it has downsides: extra complexity, increased memory usage as
data is cached waiting to be flushed to disk, and the need for
special consistency checks and fix-ups on startup after a crash.
If CouchDB used tail-append storage for everything, then all
document updates could be completely flushed with full file
consistency with a single seek and, depending on the file system, a
single fsync. All the disk updates (documents, file attachments,
indexes and the file header) occur as appends to the end of the file.
The biggest changes will be in how file attachments and the headers
are written and read, and the performance characteristics of view
indexing as documents will no longer be packed into contiguous
buffers.
File attachments will be written in chunks, with the last chunk
being an index to the other chunks.
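Roughly, in Python (the chunk size and the JSON index format are
placeholders):

    import json

    CHUNK = 4096

    def append_attachment(fh, data):
        index = []
        for i in range(0, len(data), CHUNK):
            piece = data[i:i + CHUNK]
            index.append((fh.tell(), len(piece)))  # where it landed
            fh.write(piece)
        index_pos = fh.tell()
        fh.write(json.dumps(index).encode())  # last chunk = the index
        return index_pos  # the btree stores only this offset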
Headers will be specially signed blocks written to the end of the
file. Reading the header on database open will require scanning the
file from the end, since the tail of the file might contain partial
updates that didn't complete since the last commit.
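A Python sketch of the backward scan (the block size, magic marker,
and checksum layout are all assumptions):

    import hashlib

    BLOCK = 4096
    MAGIC = b"HEAD"

    def find_header(fh):
        fh.seek(0, 2)                       # jump to end of file
        pos = (fh.tell() // BLOCK) * BLOCK  # last block boundary
        while pos >= 0:
            fh.seek(pos)
            block = fh.read(BLOCK)
            if block[:4] == MAGIC:
                body, digest = block[4:-16], block[-16:]
                if hashlib.md5(body).digest() == digest:
                    return body             # newest committed header
            pos -= BLOCK                    # skip torn/partial writes
        raise IOError("no valid header found")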
The performance of the views will be impacted, as the documents are
more likely to be fragmented across the storage file. But they will
still be in the order in which they are accessed for indexing, so
the read seeks always move forward. Also, the act of compacting
the storage file will result in the documents being tightly packed
again.
- Streaming document updates with attachment writes -
Using MIME multipart encoding, it should be possible to send all
parts of a document in a single HTTP request, with the JSON and
binary attachments sent as different MIME parts. Attachments can be
streamed to disk as bytes are received, keeping total memory
overhead to a minimum. Attachments can also be written to disk in
compressed form and served over HTTP in that compressed form by
default, spending no CPU on compression at read time, though they
will require decompression if the client doesn't support the
compression format.
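For illustration, a Python sketch of assembling such a request body
(the multipart/related layout and part headers are assumptions about
how the API might look):

    import json

    def build_multipart(doc, attachments, boundary="abc123"):
        # First part: the document JSON; following parts: attachments.
        parts = [(f"--{boundary}\r\n"
                  f"Content-Type: application/json\r\n\r\n"
                  f"{json.dumps(doc)}\r\n").encode()]
        for name, (ctype, data) in attachments.items():
            parts.append(
                (f"--{boundary}\r\n"
                 f"Content-Type: {ctype}\r\n"
                 f"Content-Disposition: attachment; filename={name}"
                 f"\r\n\r\n").encode() + data + b"\r\n")
        parts.append(f"--{boundary}--\r\n".encode())
        return b"".join(parts)

    body = build_multipart(
        {"_id": "mydoc", "title": "hello"},
        {"photo.png": ("image/png", b"\x89PNG...")},
    )
    # Sent as e.g. PUT /db/mydoc with
    # Content-Type: multipart/related; boundary=abc123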
- Partitioning/Clustering Support -
Clustering for failover and load balancing is a priority. Large
database support via partitioning may not make 1.0.