On Dec 21, 2010, at 4:55 AM, Bob Clary wrote:

> Large Initial View sizes: Several of my views are initially created with
> sizes which are 10-20 times the size of the compacted view. For example, I
> have one view which when initially created can take 95G but when compacted
> uses less than 5G. This has caused several out of disk space conditions when
> I've had to regenerate views for the database. I know commodity disks are
> relatively cheap these days, but due to my current hosting environment I am
> using relatively expensive networked storage. Asking for sufficient storage
> for my expected database size was difficult enough, but asking for 10 or more
> times that amount just to deal with temporary explosive view sizes is
> probably a non-starter.
This one is being worked on in
https://issues.apache.org/jira/browse/COUCHDB-700 . Guaranteeing a minimum
batch size results in a smaller index file and also speeds up indexing in
many circumstances.

> CouchDB 1.0.x: My experience with attempting to use the 1.0.x branch was a
> failure due to the crashing immediately upon view compaction completion which
> caused the views to begin indexing from scratch.

I agree with Paul that the timeout dropping a ref counter at the end of view
compaction is a significant bug. I'm guessing whether it triggers depends on
the particular deployment and the size of the file being deleted. There have
been multiple attempts [1,2] to rewrite the reference counting system; one of
those should probably be merged for 1.2.0. We might be able to ship a stopgap
fix for 1.0.x and 1.1.x.

I also have to agree with Mike and Paul that BigCouch would help you a lot
here. Even if you use it in a single-node setup, the ability to split a large
monolithic database into an arbitrary number of shards can help tremendously
when building and compacting indexes.

Regards,
Adam

[1]: https://github.com/tilgovi/couchdb/tree/ets_ref_count
[2]: https://github.com/cloudant/bigcouch/blob/master/apps/couch/src/couch_file.erl#L483
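
To put a rough number on the batching effect: below is a toy model (Python)
of an append-only B-tree's file growth at different batch sizes. The node
size, tree depth, and key count are all invented for illustration; this
sketches the idea behind COUCHDB-700 and is not the actual couch_btree code.

# Toy model: CouchDB's view index is an append-only B-tree, so every
# update rewrites the modified leaf and its ancestors at the end of the
# file, leaving the old copies as garbage until compaction. Batching
# more keys per update amortizes those rewrites. All constants here are
# invented for illustration.

NODE_SIZE = 4096       # assumed bytes appended per rewritten node
TREE_DEPTH = 3         # assumed root-to-leaf path length
TOTAL_KEYS = 100_000

def file_growth(batch_size):
    """Bytes appended while inserting TOTAL_KEYS in the given batches."""
    flushes = -(-TOTAL_KEYS // batch_size)  # ceiling division
    # Pretend each flush rewrites a single root-to-leaf path; a real
    # batch touching several leaves rewrites more nodes than this, so
    # the numbers only sketch the trend.
    return flushes * TREE_DEPTH * NODE_SIZE

for batch in (1, 100, 1000):
    mb = file_growth(batch) / 1024 ** 2
    print(f"batch={batch:>5}: ~{mb:,.0f} MB appended before compaction")

The constants are made up, but the trend is the point: the smaller the
batch, the more rewritten paths end up as garbage in the index file.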

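And a back-of-the-envelope sketch of the sharding point (Python again; the
shard count, even key distribution, and 10x build overhead are assumptions
for illustration, and the hash below is not BigCouch's actual partitioning
function):

import hashlib

Q = 8                    # assumed number of shards
COMPACTED_VIEW_GB = 5    # the compacted size reported above
BUILD_OVERHEAD = 10      # assume an uncompacted view is ~10x larger

def shard_of(doc_id: str) -> int:
    """Hash-partition a document id onto one of Q shards (illustrative
    hash only; BigCouch's real partitioning function is different)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % Q

print(f"doc 'user:42' lands on shard {shard_of('user:42')}")

# Each shard's view is built and compacted independently, so rebuilding
# or compacting one shard at a time bounds the temporary disk spike to a
# single shard rather than the whole database (assuming an even key
# distribution).
print(f"monolithic rebuild peak: ~{COMPACTED_VIEW_GB * BUILD_OVERHEAD} GB")
print(f"per-shard rebuild peak:  ~{COMPACTED_VIEW_GB / Q * BUILD_OVERHEAD:.1f} GB")

That per-shard bound is why even a single-node BigCouch setup can ease the
storage pressure described above.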