While we're throwing stuff out there: some sort of URL prettifier for CouchDB-hosted applications. Perhaps some sort of mechanism for setting up Routes-type mappings?
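Just to make that concrete, here's a rough sketch of what I mean by Routes-type mappings. Everything in it is made up (the patterns, the rewrite table, the idea of a little dispatcher in front of CouchDB); it's a sketch of the idea, not a proposed implementation:

    # Hypothetical rewrite table: pretty URL pattern -> real CouchDB path.
    # A small dispatcher in front of CouchDB could translate incoming
    # paths like this before proxying the request.
    import re

    ROUTES = [
        (r"^/blog/(?P<id>[^/]+)$", "/blogdb/{id}"),
        (r"^/blog/(?P<id>[^/]+)/comments$",
         "/blogdb/_design/blog/_view/comments?key=\"{id}\""),
    ]

    def rewrite(path):
        """Return the backend CouchDB path for a pretty URL, or None."""
        for pattern, template in ROUTES:
            match = re.match(pattern, path)
            if match:
                return template.format(**match.groupdict())
        return None

    assert rewrite("/blog/my-post") == "/blogdb/my-post"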
Not sure if I have this right in my head, but AFAIK replication is a push from one database to _bulk_docs on another. We might want a pull-oriented mode as well, to account for things like NAT.

On Wed, Dec 3, 2008 at 10:17 AM, Jan Lehnardt <[EMAIL PROTECTED]> wrote:
> Adding:
>
> - Listening on multiple network addresses: Having MochiWeb bind to multiple
> (mixed IPv4 and IPv6) IP addresses and ports would help with setting CouchDB
> up in highly customized setups. We have a patch for listening on a second
> port on the same IP address; this, combined with a bit of couch_config
> magic, should do the trick.
>
> Cheers
> Jan
> ==
>
> On 2 Dec 2008, at 20:34, Damien Katz wrote:
>
>> Here is some stuff I'd like to see in a 1.0.0 release. Everything is open
>> for discussion.
>>
>> - Built-in reduce functions to avoid unnecessary JS overhead -
>>
>> Count, Sum, Avg, Min, Max, Std dev. Others?
>>
>> - Restrict database read access -
>>
>> Right now any user can read any database; we need to be able to restrict
>> that, at least at the whole-database level.
>>
>> - Replication performance enhancements -
>>
>> Adam Kocoloski has some replication patches that greatly improve
>> replication performance.
>>
>> - Revision stemming: it should be possible to limit the number of
>> revisions tracked -
>>
>> By default each document edit produces a revision id that is tracked
>> indefinitely. This guarantees conflicts versus subsequent edits can always
>> be distinguished in ad-hoc replication, but the forever-growing list of
>> revisions isn't always desirable. This can be addressed by limiting the
>> number tracked and purging the oldest revisions. The downside is that if
>> the revision tracking limit is N, then anyone who hasn't replicated a
>> document since its last N edits will see a spurious edit conflict.
>>
>> - Lucene/full-text indexing integration -
>>
>> We have this working in side patches; it needs to be integrated into trunk
>> and with the view engine.
>>
>> - Incremental document replication -
>>
>> We need, at minimum, the ability to incrementally replicate only the
>> attachments that have changed in a document. This will save lots of network
>> IO, and CouchDB can then be a version control system with document diffs
>> added as attachments.
>>
>> This could work for document fields too, but the overhead may not be
>> worth it.
>>
>> - Built-in authentication module(s) -
>>
>> The ability to host a CouchDB database used for HTTP authentication
>> schemes. If storing passwords, they would need to be stored encrypted and
>> decrypted on demand by the authentication process.
>>
>> - View server enhancements (stale/partial index option) -
>>
>> Chris Anderson has a side branch for this; we need to finish it and put it
>> into trunk.
>>
>> - View index compaction -
>>
>> View indexes expand forever, and need to be compacted in a similar way to
>> how the storage files are compacted. This work will tie into the view
>> server enhancements.
>>
>> - Document integrity/deterministic revid -
>>
>> For the sake of end-to-end document integrity, we need a way to hash a
>> document's contents, and since we already have revision ids, I think the
>> revision ids should be the hashes. The hashed document should be a
>> canonical JSON representation, and it should have the _id and _rev fields
>> in it. The _rev will be the PREVIOUS revision id/hash the edit is based on,
>> or blank for a new edit. The _rev is then replaced with the new hash value.
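To make the deterministic-revid idea above concrete, here's a minimal sketch. The canonicalization rules (sorted keys, no insignificant whitespace) and the choice of hash function are assumptions on my part, not something that's been settled:

    # Sketch: derive the new _rev from a canonical JSON form of the doc.
    # While hashing, _rev holds the PREVIOUS revision hash (or "" for a
    # brand-new document); afterwards it is replaced by the new hash.
    import hashlib
    import json

    def next_rev(doc, prev_rev=""):
        staged = dict(doc, _rev=prev_rev)
        canonical = json.dumps(staged, sort_keys=True,
                               separators=(",", ":"))
        return hashlib.sha1(canonical.encode("utf-8")).hexdigest()

    doc = {"_id": "mydoc", "title": "hello"}
    doc["_rev"] = next_rev(doc)                # first edit: blank _rev
    doc["_rev"] = next_rev(doc, doc["_rev"])   # edit based on previous rev

The nice property is that the same edit applied independently on two nodes produces the same revid, so they converge instead of looking like a conflict.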
>>
>> - Fully tail-append writes -
>>
>> CouchDB uses zero-overwrite storage, but not fully tail-append storage.
>> Document JSON bodies are stored in internal buffers, written consecutively,
>> one after another, until the buffer is completely full; then another buffer
>> is created at the end of the file for more documents. File attachments are
>> written to similar buffers as well. Btree updates are always tail-append:
>> each update to a btree, even if it's a deletion, causes new writes to the
>> end of the file. Once the documents, attachments and indexes are committed
>> (fsync), the header is written and flushed to disk, and that is always
>> stored right at the beginning of the file (requiring another seek).
>>
>> Document updates to CouchDB require 2 fsyncs with ~3 seeks for full
>> committal and index consistency. This is true whether you write 1 or 1000
>> documents in a single transaction (bulk update); you still need ~3 seeks.
>> Using conventional transaction journalling, it's possible to get the
>> committal down to a single seek and fsync, and to ensure file and index
>> consistency asynchronously, often in batch mode with other committed
>> updates. This can perform very well, but it has downsides: extra
>> complexity, increased memory usage as data is cached waiting to be flushed
>> to disk, and special consistency checks and fix-ups on startup if there is
>> a crash.
>>
>> If CouchDB used tail-append storage for everything, then all document
>> updates could be completely flushed with full file consistency with a
>> single seek and, depending on the file system, a single fsync. All the disk
>> updates (documents, file attachments, indexes and the file header) occur as
>> appends to the end of the file.
>>
>> The biggest changes will be in how file attachments and headers are written
>> and read, and in the performance characteristics of view indexing, since
>> documents will no longer be packed into contiguous buffers.
>>
>> File attachments will be written in chunks, with the last chunk being an
>> index to the other chunks.
>>
>> Headers will be specially signed blocks written to the end of the file.
>> Reading the header on database open will require scanning the file from the
>> end, since the file might have partial updates that didn't complete since
>> the last update.
>>
>> The performance of the views will be impacted, as the documents are more
>> likely to be fragmented across the storage file. But they will still be in
>> the order they will be accessed for indexing, so the read seeks are always
>> moving forward. Also, the act of compacting the storage file will result in
>> the documents being tightly packed again.
>>
>> - Streaming document updates with attachment writes -
>>
>> Using MIME multipart encoding, it should be possible to send all parts of a
>> document in a single HTTP request, with the JSON and binary attachments
>> sent as different MIME parts. Attachments can be streamed to disk as bytes
>> are received, keeping total memory overhead to a minimum. Attachments can
>> also be written to disk in compressed format and served over HTTP by
>> default in that compressed format, using 0% CPU for compression at read
>> time, though clients that don't support the compression format will require
>> decompression.
>>
>> - Partitioning/Clustering Support -
>>
>> Clustering for failover and load balancing is a priority. Large database
>> support via partitioning may not make 1.0.
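On the header-scanning point above, here's roughly how reading the newest valid header from the tail of the file could work. The block size, magic marker and checksum scheme are invented for illustration; the real on-disk format would be whatever gets settled on:

    # Sketch: scan backwards from EOF for the last intact header block.
    # A torn write from a crash fails the checksum and we keep scanning.
    import hashlib

    BLOCK = 4096
    MAGIC = b"couch-header-v1"

    def read_last_header(path):
        with open(path, "rb") as f:
            f.seek(0, 2)                          # jump to end of file
            pos = (f.tell() // BLOCK) * BLOCK     # last block boundary
            while pos >= 0:
                f.seek(pos)
                block = f.read(BLOCK)
                if block.startswith(MAGIC):
                    digest = block[len(MAGIC):len(MAGIC) + 16]
                    body = block[len(MAGIC) + 16:]
                    if hashlib.md5(body).digest() == digest:
                        return body               # newest committed header
                pos -= BLOCK
        return None

The appealing property is that crash recovery is just this scan: no journal replay, no fix-up pass; the first block that checksums clean is the last committed state.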

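And on the streaming multipart updates: the wire format below is hypothetical (the "follows" marker, the content types, the URL are all my invention), but it shows the shape of the thing, i.e. one PUT carrying the JSON body plus attachments, with attachment bytes produced in chunks so neither side has to hold a whole attachment in memory:

    # Sketch: one HTTP request carrying JSON + attachments as MIME parts.
    # The generator yields the body incrementally; http.client sends it
    # with chunked transfer encoding, so attachments stream from disk.
    import json
    import http.client

    BOUNDARY = "couch-doc-boundary"
    doc = {"_id": "mydoc",
           "_attachments": {"photo.jpg": {"follows": True}}}

    def body_parts():
        yield ("--%s\r\nContent-Type: application/json\r\n\r\n%s\r\n"
               % (BOUNDARY, json.dumps(doc))).encode()
        yield ("--%s\r\nContent-Type: image/jpeg\r\n\r\n"
               % BOUNDARY).encode()
        with open("photo.jpg", "rb") as f:
            while True:
                chunk = f.read(64 * 1024)
                if not chunk:
                    break
                yield chunk
        yield ("\r\n--%s--\r\n" % BOUNDARY).encode()

    conn = http.client.HTTPConnection("localhost", 5984)
    conn.request("PUT", "/db/mydoc", body=body_parts(),
                 headers={"Content-Type":
                          "multipart/related; boundary=" + BOUNDARY})
    print(conn.getresponse().status)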