On Tue, May 26, 2009 at 2:31 PM, Jeff Macdonald <[email protected]> wrote:
> Hi all,
> I've been experimenting with CouchDB. I'm using Net::CouchDB to batch insert
> 20 docs at a time, and I'm simply setting _id to a sequence that is
> incremented for each doc. For just over 9 million rows, where each row is
> just 6 small fields, the resulting DB is 3.4G. When I was letting CouchDB set
> the _id, the resulting database was over 20G. The input source as a tab
> delimited file is just over 500MB.
>
> So is it normal for CouchDB to create such a large database file when it
> assigns document ids?
Yes, currently CouchDB docids are random, which means more of the btree must be
rewritten than if the writes were concentrated, as they are with sequential
ids. For high-performance applications, sequential ids are faster as well.

Compacting may shrink your databases so they are roughly equal in size. You
can trigger compaction from Futon. I'd be interested to see what results you
get.

> --
> Jeff Macdonald
> Ayer, MA

--
Chris Anderson
http://jchrisa.net
http://couch.io
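For anyone following along: the sequential-id scheme Jeff describes can be done entirely client-side. Here's a minimal, hypothetical sketch in Python (the original poster used Perl's Net::CouchDB; the helper name and padding width here are my own) of building a `_bulk_docs`-style payload with zero-padded sequential `_id`s, so ids sort in insert order and new documents land at one edge of the btree:

```python
def make_bulk_docs(rows, start=0, width=10):
    """Build a CouchDB _bulk_docs payload with sequential, zero-padded _ids.

    Zero-padding keeps the string ids sorting in numeric insert order, so
    appends stay concentrated at one edge of the btree instead of scattering
    across it the way random ids do.
    """
    docs = []
    for offset, row in enumerate(rows):
        doc = dict(row)  # copy so the caller's row is untouched
        doc["_id"] = str(start + offset).zfill(width)  # e.g. "0000000042"
        docs.append(doc)
    return {"docs": docs}

# One batch of 20 rows, as in the original post:
batch = make_bulk_docs([{"field": i} for i in range(20)], start=100)
print(batch["docs"][0]["_id"])  # → "0000000100"
```

The resulting dict would be POSTed as JSON to the database's `_bulk_docs` endpoint; the padding width just needs to be large enough for the final row count (10 digits covers 9 million rows comfortably).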
