On Tue, May 26, 2009 at 5:36 PM, Chris Anderson <[email protected]> wrote:
> On Tue, May 26, 2009 at 2:31 PM, Jeff Macdonald <[email protected]> wrote:
> > Hi all,
> >
> > I've been experimenting with CouchDB. I'm using Net::CouchDB to batch
> > insert 20 docs at a time, and I'm simply setting _id to a sequence that
> > is incremented for each doc. For just over 9 million rows, where each
> > row is just 6 small fields, the resulting DB is 3.4G. When I was letting
> > CouchDB set the _id, the resulting database was over 20G. The input
> > source, a tab-delimited file, is just over 500MB.
> >
> > So is it normal for CouchDB to create such a large database file when it
> > assigns document ids?
>
> Yes. Currently CouchDB docids are random, which means more of the btree
> must be rewritten than if the writes were concentrated, as you see with
> sequential ids. For high-performance applications, sequential ids are
> faster as well.
>
> Compacting may shrink your databases so they are roughly equal in size.
> You can trigger compaction from Futon. I'd be interested to see what
> results you get.

Well, it took over a day to do it before. I was, however, only inserting 10
docs at a time then. So right now I'm not motivated to find out how well the
compaction would work. :)

--
Jeff Macdonald
Ayer, MA
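
For anyone wanting to reproduce the sequential-id batch insert without
Net::CouchDB, here is a rough sketch against CouchDB's HTTP _bulk_docs
endpoint, in Python with the requests library. The server URL, database
name, and batch size are assumptions for illustration, not anything from
the thread:

    import requests

    COUCH = "http://localhost:5984"  # assumed local CouchDB instance
    DB = "imported_rows"             # hypothetical database name

    # Create the database; a 409 response means it already exists.
    requests.put(f"{COUCH}/{DB}")

    def bulk_insert(rows, batch_size=20):
        """Insert rows in batches, assigning sequential _ids so writes
        stay concentrated in one region of the btree instead of being
        scattered by random docids."""
        seq = 0
        for start in range(0, len(rows), batch_size):
            batch = []
            for row in rows[start:start + batch_size]:
                doc = dict(row)
                doc["_id"] = f"{seq:012d}"  # zero-padded sequential id
                seq += 1
                batch.append(doc)
            resp = requests.post(f"{COUCH}/{DB}/_bulk_docs",
                                 json={"docs": batch})
            resp.raise_for_status()

Zero-padding the id keeps the lexicographic order of the keys the same as
the numeric order, which is what keeps successive writes adjacent in the
btree.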
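Compaction can also be triggered over HTTP rather than from Futon. A
minimal sketch, using the same assumed COUCH and DB names as above; the
compact_running and disk_size fields are as reported by CouchDB databases
of this era:

    # Trigger compaction (the same operation as the Futon button).
    # CouchDB returns immediately and compacts in the background.
    requests.post(f"{COUCH}/{DB}/_compact",
                  headers={"Content-Type": "application/json"})

    # Poll the database info to watch progress and the on-disk size.
    info = requests.get(f"{COUCH}/{DB}").json()
    print(info["compact_running"], info["disk_size"])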
