On 3 Dec 2008, at 12:47, Volker Mische wrote:

An additional feature would be the ability to return arbitrary JSON to
the view that gets attached to the resulting document. An example would
be returning the distance between a point specified in the query and a
geometry in a document.

as opposed to the "rank" the protocol uses now, which is "limited" to
full-text search.

+1

Cheers
Jan
--

Damien Katz wrote:
Here is some stuff I'd like to see in a 1.0.0 release. Everything is
open for discussion.

- Built-in reduce functions to avoid unnecessary JS overhead -

Count, Sum, Avg, Min, Max, Std dev. others?
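
For incremental view updates these have to combine partial results from
btree subtrees as well as raw mapped values. As a rough Python sketch of
that reduce/rereduce contract (the function names and the
(count, sum, sumsq) triple are illustrative only, not an API proposal):

import math

def reduce_stats(keys, values, rereduce):
    """Combine mapped numbers (or partial results) into (count, sum, sumsq)."""
    if not rereduce:
        # First pass: `values` are the raw numbers emitted by the map step.
        return (len(values), sum(values), sum(x * x for x in values))
    # Rereduce pass: `values` are (count, sum, sumsq) triples from subtrees.
    return (sum(v[0] for v in values),
            sum(v[1] for v in values),
            sum(v[2] for v in values))

def finalize(count, total, sumsq):
    """Derive Avg and population Std dev from the combined triple."""
    avg = total / count
    var = max(sumsq / count - avg * avg, 0.0)  # guard against FP rounding
    return {"count": count, "sum": total, "avg": avg,
            "stddev": math.sqrt(var)}

# Two btree nodes reduced separately, then rereduced at their parent:
a = reduce_stats(None, [1, 2, 3], rereduce=False)
b = reduce_stats(None, [4, 5], rereduce=False)
print(finalize(*reduce_stats(None, [a, b], rereduce=True)))

Min and Max combine the same way under rereduce, and Count/Sum/Avg/Std
dev all fall out of the single (count, sum, sumsq) triple.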

- Restrict database read access -

Right now any user can read any database, we need to be able to restrict
that at least on a whole database level.

- Replication performance enhancements -

Adam Kocoloski has some replication patches that greatly improve
replication performance.

- Revision stemming: It should be possible to limit the number of
revisions tracked -

By default, each document edit produces a revision id that is tracked
indefinitely. This guarantees that conflicts can always be distinguished
from subsequent edits in ad-hoc replication; however, the forever-growing
list of revisions isn't always desirable. This can be addressed by
limiting the number of revisions tracked and purging the oldest. The
downside is that if the revision tracking limit is N, then anyone who
hasn't replicated a document since its last N edits will see a spurious
edit conflict.
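
A minimal sketch of the tradeoff (all names hypothetical; a document's
revision path is its newest-first list of revision ids):

REVS_LIMIT = 3  # the "N" from above

def stem(rev_path, limit=REVS_LIMIT):
    """Keep only the newest `limit` revisions; older ones are purged."""
    return rev_path[:limit]

def is_known_ancestor(local_path, incoming_base_rev):
    """Replication can tell an ordinary edit from a conflict only if the
    incoming edit's base revision still appears in the kept path."""
    return incoming_base_rev in local_path

local = stem(["5-e", "4-d", "3-c", "2-b", "1-a"])  # -> ["5-e", "4-d", "3-c"]
print(is_known_ancestor(local, "4-d"))  # True: recognized as a normal edit
# A replica that last saw "1-a" pushes an edit based on it; "1-a" was
# stemmed away, so the update surfaces as a spurious conflict:
print(is_known_ancestor(local, "1-a"))  # False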

- Lucene/Full-text indexing integration -

We have this working in side patches; it needs to be integrated into
trunk and with the view engine.

- Incremental document replication -

We need at the minimum the ability to incrementally replicate only the
attachments that have changed in a document. This will save lots of
network IO, and CouchDB can be a version control system with document
diffs added as attachments.

This can work for document fields too, but the overhead may not be worth
it.
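
A sketch of the attachment case, assuming each attachment stub carries a
content digest so unchanged attachments can be recognized without
transferring any bytes (field and helper names are hypothetical):

def changed_attachments(source_doc, target_doc):
    """Return the names of attachments whose bytes actually need copying."""
    src = source_doc.get("_attachments", {})
    tgt = target_doc.get("_attachments", {})
    return [name for name, meta in src.items()
            if name not in tgt or tgt[name]["digest"] != meta["digest"]]

source = {"_attachments": {"a.txt": {"digest": "md5-111"},
                           "b.bin": {"digest": "md5-222"}}}
target = {"_attachments": {"a.txt": {"digest": "md5-111"},
                           "b.bin": {"digest": "md5-OLD"}}}
print(changed_attachments(source, target))  # ['b.bin'] -- only this is sent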

- Built-in authentication module(s) -

The ability to host a CouchDB database used for HTTP authentication
schemes. If passwords are stored, they would need to be stored encrypted
and decrypted on demand by the authentication process.
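
A sketch of that encrypt-at-rest / decrypt-on-demand scheme, with the
third-party Python cryptography package standing in for whatever the
server side would actually use (purely illustrative):

import hmac
from cryptography.fernet import Fernet

server_key = Fernet.generate_key()  # held by the server, not in the db
box = Fernet(server_key)

# Writing a user doc: the password field is stored encrypted.
user_doc = {"name": "alice", "password": box.encrypt(b"s3cret").decode()}

def authenticate(doc, supplied_password):
    """Decrypt the stored password on demand and compare."""
    stored = box.decrypt(doc["password"].encode())
    return hmac.compare_digest(stored, supplied_password.encode())

print(authenticate(user_doc, "s3cret"))  # True
print(authenticate(user_doc, "wrong"))   # False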

- View server enhancements (stale/partial index option) -

Chris Anderson has a side branch for this that we need to finish and put
into trunk.
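
The idea is that a client can ask for whatever is already indexed instead
of blocking while the view catches up. From the client that could look
something like this (the stale=ok parameter name and the view URL layout
are assumptions about the eventual API):

import json
from urllib.request import urlopen

# Return the index as-is rather than waiting for pending updates.
url = "http://127.0.0.1:5984/mydb/_design/app/_view/by_date?stale=ok"
with urlopen(url) as resp:
    rows = json.load(resp)["rows"]
print(len(rows), "rows from the possibly out-of-date index")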

- View index compaction -

View indexes grow forever and need to be compacted in a similar way to
how the storage files are compacted. This work will tie into the view
server enhancements.

- Document integrity/deterministic revid -

For the sake of end-to-end document integrity, we need a way to hash a
document's contents, and since we already have revision ids, I think the
revision ids should be the hashes. The hashed document should be a
canonical JSON representation, and it should have the _id and _rev
fields in it. The _rev will be the PREVIOUS revision id/hash the edit is
based on, or blank for a new edit. The _rev is then replaced with the
new hash value.
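
A sketch of how that could work; the canonicalization rules (sorted keys,
compact separators, UTF-8) and the choice of hash here are assumptions,
not a spec:

import hashlib
import json

def deterministic_rev(doc, prev_rev=""):
    body = dict(doc)
    body["_rev"] = prev_rev  # the base revision, or "" for a new edit
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()

doc = {"_id": "foo", "value": 1}
rev1 = deterministic_rev(doc)  # first edit: based on ""
rev2 = deterministic_rev({"_id": "foo", "value": 2}, prev_rev=rev1)
# Any node making the same edit from the same base derives the same rev:
assert rev2 == deterministic_rev({"_id": "foo", "value": 2}, prev_rev=rev1)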

- Fully tail append writes -

CouchDB uses zero-overwrite storage, but not fully tail-append storage.
Document JSON bodies are stored in internal buffers, written
consecutively, one after another, until the buffer is completely full;
then another buffer is created at the end of the file for more
documents. File attachments are written to similar buffers as well.
Btree updates are always tail-append: each update to a btree, even if
it's a deletion, causes new writes to the end of the file. Once the
documents, attachments and indexes are committed (fsync), the header is
written and flushed to disk, and that is always stored right at the
beginning of the file (requiring another seek).

Document updates in CouchDB require 2 fsyncs with ~3 seeks for full
committal and index consistency. This is true whether you write 1 or
1000 documents in a single transaction (bulk update); you still need ~3
seeks. Using conventional transaction journalling, it's possible to get
the committal down to a single seek and fsync, and to ensure file and
index consistency asynchronously, often in batch mode with other
committed updates. This can perform very well, but it has downsides:
extra complexity, increased memory usage as data is cached waiting to be
flushed to disk, and special consistency checks and fix-ups on startup
after a crash.

If CouchDB used tail-append storage for everything, then all document
updates could be completely flushed, with full file consistency, in a
single seek and, depending on the file system, a single fsync. All the
disk updates (documents, file attachments, indexes and the file header)
occur as appends to the end of the file.

The biggest changes will be in how file attachments and headers are
written and read, and in the performance characteristics of view
indexing, as documents will no longer be packed into contiguous buffers.

File attachments will be written in chunks, with the last chunk being an
index to the other chunks.
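
A sketch of that chunking scheme (the on-disk framing is purely
illustrative):

import struct

CHUNK = 4096

def append_attachment(f, stream):
    """Append incoming attachment data as chunks, then an index chunk."""
    offsets = []
    while True:
        data = stream.read(CHUNK)
        if not data:
            break
        offsets.append((f.tell(), len(data)))
        f.write(data)
    # Last chunk: an index of (offset, length) pairs for the data chunks.
    index_pos = f.tell()
    f.write(struct.pack(">I", len(offsets)))
    for off, length in offsets:
        f.write(struct.pack(">QI", off, length))
    return index_pos

The document's attachment metadata then only needs to record the offset
of that final index chunk.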

Headers will be specially signed blocks written to the end of the file.
Reading the header on database open will require scanning the file from
the end, since the tail of the file might contain partial updates that
didn't complete.
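
A sketch of that recovery scan (block size, magic bytes and checksum
scheme are all illustrative):

import hashlib
import os

BLOCK = 4096
MAGIC = b"couchhdr"

def last_valid_header(f):
    """Walk backwards from EOF until a block with a valid signature and
    checksum is found; that is the newest fully-written header."""
    f.seek(0, os.SEEK_END)
    pos = (f.tell() // BLOCK) * BLOCK  # last block boundary
    while pos >= 0:
        f.seek(pos)
        block = f.read(BLOCK)
        # Header block layout: MAGIC + 16-byte md5 of payload + payload.
        if block[:8] == MAGIC:
            digest, payload = block[8:24], block[24:]
            if hashlib.md5(payload).digest() == digest:
                return pos, payload
        pos -= BLOCK  # partial tail write or data block: keep walking back
    raise IOError("no valid header found")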

The performance of the views will be impacted, as documents are more
likely to be fragmented across the storage file. But they will still be
in the order they are accessed for indexing, so the read seeks always
move forward. Also, compacting the storage file will result in the
documents being tightly packed again.

- Streaming document updates with attachment writes -

Using MIME multipart encoding, it should be possible to send all parts
of a document in a single HTTP request, with the JSON and binary
attachments sent as different MIME parts. Attachments can be streamed to
disk as bytes are received, keeping total memory overhead to a minimum.
Attachments can also be written to disk in compressed form and served
over HTTP in that compressed format by default, using 0% CPU for
compression at read time, but requiring decompression if the client
doesn't support the compression format.
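
From the client, such a request could look something like this (the
multipart/related layout and the URL are assumptions about how the
finished API might look):

import http.client
import json

boundary = "abc123"
payload = b"\x00\x01\x02\x03"
doc = {"_id": "report",
       "_attachments": {"data.bin": {"content_type":
                                     "application/octet-stream",
                                     "length": len(payload)}}}

# JSON body and each attachment travel as separate MIME parts.
body = ((f"--{boundary}\r\nContent-Type: application/json\r\n\r\n"
         f"{json.dumps(doc)}\r\n"
         f"--{boundary}\r\n"
         f"Content-Type: application/octet-stream\r\n\r\n").encode()
        + payload
        + f"\r\n--{boundary}--\r\n".encode())

conn = http.client.HTTPConnection("127.0.0.1", 5984)
conn.request("PUT", "/mydb/report", body=body,
             headers={"Content-Type":
                      f"multipart/related; boundary={boundary}"})
print(conn.getresponse().status)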


- Partitioning/Clustering Support -

Clustering for failover and load balancing is a priority. Large database
support via partitioning may not make 1.0.
