Re: CouchDB 1.0 work

Jan Lehnardt Wed, 30 Apr 2008 13:20:38 -0700

Heya Ted,
we definitely want to do a 0.8.0 before going 1.0.

See http://mail-archives.apache.org/mod_mbox/incubator-couchdb-dev/200804.mbox/[EMAIL PROTECTED]ff.

for details.


Summary: Wait for cmlenz to get back home :)

Cheers
Jan
--

On Apr 30, 2008, at 22:11, Ted Leung wrote:

What about trying to make a 0.8 release from the ASF repository? Or would you rather do this starting at 1.0?
Ted

On Apr 28, 2008, at 9:27 AM, Damien Katz wrote:
Here are my thoughts on what we need before we can get to CouchDB 1.0. Feedback please.
Must have:
Incremental reduce: Maybe the single biggest outstanding work item. Probably 2 weeks of development to get to a testable state.
Security/Document validation: We need a way to control who can update what documents and to validate the updates are correct. This is absolutely necessary for offline replication, where replicated updates to the database do not come through the application layer.
View index compaction/management: View indexes currently just grow; we need a compaction similar to storage compaction. Also, there is no way to purge old unused indexes, except via the OS.
File sync problem: file:sync(), a call that flushes all uncommitted writes to disk before returning, doesn't work fully or at all on some platforms (usually we just lack the flags to tell the OS to write to disk). Should be fixable by either patching the existing Erlang driver source, or using a replacement file driver.
Optimizations: Right now HTTP overhead is huge, with HTTP latency/overhead at about 80% of our document read time when loaded from a local client (same machine). Once we can get this down to below 50%, we'll focus on optimizing the database and other components. Most core database operations (document reads, updates and view indexing) are completely unoptimized so far, with the update speed being the biggest complaint.
Testing: We need lots more tests. By the time we ship 1.0, we should have far more test suite code than production code. And we need to do load testing. Can the current browser-based test suite scale for this kind of heavy testing?
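
To illustrate the "incremental reduce" item above, here is a minimal Python sketch of one possible reading of it. It is not CouchDB's implementation. It assumes the reduce function can also be applied to its own partial results (as sum can), and the class name, chunk_size parameter and chunks_recomputed counter are all hypothetical. A reduced value is cached per chunk of rows, so changing one row re-reduces only that chunk rather than every row.

```python
from typing import Callable, List


class IncrementalReduce:
    """Caches one reduced value per fixed-size chunk of rows (hypothetical helper)."""

    def __init__(self, rows: List[int],
                 reduce_fn: Callable[[List[int]], int],
                 chunk_size: int = 4):
        self.rows = list(rows)
        self.reduce_fn = reduce_fn
        self.chunk_size = chunk_size
        self.chunks_recomputed = 0  # counts how much reduce work was done
        chunk_count = (len(self.rows) + chunk_size - 1) // chunk_size
        self.partials = [self._reduce_chunk(i) for i in range(chunk_count)]

    def _reduce_chunk(self, chunk_index: int) -> int:
        # Reduce only the rows that belong to one chunk.
        start = chunk_index * self.chunk_size
        self.chunks_recomputed += 1
        return self.reduce_fn(self.rows[start:start + self.chunk_size])

    def update(self, row_index: int, value: int) -> None:
        # Changing a row invalidates just the chunk that contains it.
        self.rows[row_index] = value
        chunk_index = row_index // self.chunk_size
        self.partials[chunk_index] = self._reduce_chunk(chunk_index)

    def result(self) -> int:
        # "Re-reduce": combine the cached partial values, not the raw rows.
        return self.reduce_fn(self.partials)


view = IncrementalReduce(list(range(10)), sum, chunk_size=4)
print(view.result())            # reduce over all ten rows
view.update(5, 100)             # touches only the second chunk
print(view.result(), view.chunks_recomputed)
```

A real index would presumably keep these partial values next to the index structure on disk; the flat list here only shows why an update need not re-reduce every row.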
Nice to have:
Plug-ins: Erlang module plug-in architecture, to make adding new server-side code easy. Right now the code that maps special URLs (_view, _compact, _search, etc.) to the appropriate Erlang call is messy and convoluted, and getting worse as we go. We need a standard way to map the special URLs to the appropriate Erlang call.
Tail committed database headers: To optimize the updating of the database by reducing the number and length of seeks required, the file header should be written to the end of the file, rather than the beginning. Depending on platform this can remove a full head seek, and in the best case scenario a document insert/update can require zero head seeks (if the head is already positioned at the end of the file). But this can slow file opening speed, as it may need to do a search in the file for the most recent valid header. In the event of a crash, the header scan/search cost at database open can be linear or logarithmic, depending on the exact implementation.
Clustering: The ability to cluster CouchDB servers, to increase both reliability (failover clustering) and client scalability (more servers to handle more concurrent user load). Clustering does not increase data scalability (that's partitioning/sharding).
Selective document purging/compaction: Deletion stubs are kept around for replication purposes. Need a way to purge the records of documents that are old or deleted.
Revision path pruning: Each document keeps a list of all previous revisions. We need a way to prune the oldest records of document revisions and remerge pruned lists during replication.
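
The "tail committed database headers" item above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CouchDB's file format: the MAGIC marker, the header layout (document offset plus CRC32) and both function names are hypothetical. Each commit appends the document and then a small header, and opening the file scans backward from the end for the most recent header whose checksum is valid. That is the linear variant of the scan cost mentioned above.

```python
import os
import struct
import zlib

MAGIC = b"HDRv0001"            # hypothetical 8-byte header marker
HEADER = struct.Struct(">QI")  # document offset (8 bytes), CRC32 (4 bytes)


def append_commit(path, doc):
    """Append a document, then a header pointing at it, to the end of the file."""
    with open(path, "ab") as f:
        f.seek(0, os.SEEK_END)
        doc_offset = f.tell()
        f.write(doc)
        crc = zlib.crc32(MAGIC + struct.pack(">Q", doc_offset))
        f.write(MAGIC + HEADER.pack(doc_offset, crc))
        f.flush()
        os.fsync(f.fileno())   # ask the OS to push the write to disk
    return doc_offset


def find_latest_header(path):
    """Scan backward for the newest header with a valid checksum.

    Returns the document offset it records, or None if no valid header exists.
    Reads the whole file into memory, which is only reasonable for a sketch.
    """
    with open(path, "rb") as f:
        data = f.read()
    pos = data.rfind(MAGIC)
    while pos != -1:
        body = data[pos + len(MAGIC):pos + len(MAGIC) + HEADER.size]
        if len(body) == HEADER.size:
            doc_offset, crc = HEADER.unpack(body)
            if crc == zlib.crc32(MAGIC + struct.pack(">Q", doc_offset)):
                return doc_offset
        # Torn or corrupt header: keep looking earlier in the file.
        pos = data.rfind(MAGIC, 0, pos)
    return None
```

A header that was only partially written before a crash fails the length or checksum test, so the scan falls back to the previous commit. Writing headers at fixed block boundaries would be one way to make the search cheaper than a byte-wise scan, but that is a design choice beyond what the message specifies.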
Don't Need:
Authentication. We can go to 1.0 without authentication, relying instead on local proxies to provide authentication.
Partitioning. Partitioning is a big project with lots of considerations. It's best to move this post 1.0.
