On Mon, May 14, 2012 at 03:42:01PM -0400, Tim Tisdall wrote:
> Yes, I did it with a PUT for each id.  When you call for compaction, is
> there a way to see the progress or a way to know if it's done?

The "Status" tool in Futon will show you compaction progress.

Also, two other things.

First, inserting data goes faster if you use the _bulk_docs interface.
To keep things under control, I like to insert about 100 docs at a
time, but it depends on your doc size, really.

Second, I have found in my own totally unscientific testing that large
documents compact better than many small documents.  For example, I
have detector data with one record per 30 seconds.  If I combine data
into daily docs and save, after compaction the resulting database is
much smaller than if I keep one document per observation.

I ran these tests back around the 1.0.1 generation of CouchDB, but I
think the reason compaction doesn't work well for small documents is
the same reason gzip doesn't work well for small documents: if there
is very little repeated information in a document, then gzip and other
compression utilities can't do much.  The larger the doc, the more the
text will have repeats, and the better the compression algorithms
perform.

But all that compression savings will be wasted if you then have to
write a view that explodes each doc back into its smaller docs.

Oh, and don't forget to compact any views you use as well.

Hope that helps,
James

>
> On Mon, May 14, 2012 at 3:20 PM, Paul Davis wrote:
>
> > How did you insert them? If you did a PUT per docid you'll still
> > want to compact afterwards.
> >
> > On Mon, May 14, 2012 at 2:13 PM, Tim Tisdall wrote:
> > > I've got several gigabytes of data that I'm trying to store in a
> > > couchdb on a single machine.  I've placed a section of the data
> > > in an sqlite db and the file is about 5.9gb.  I'm currently
> > > placing the same data into couchdb and while it hasn't finished
> > > yet, the file size is already 10gb and continuing to grow.  The
> > > sqlite database is essentially a table of ids with a json block
> > > of text for each, so I figured the couchdb wouldn't be too much
> > > different in size.
> > >
> > > Does anyone have some recommendations on how to reduce the size
> > > of the db?  Right now I've only inserted data and have not made
> > > any "updates" to documents, so there should be no revision copies
> > > to be cleared away.
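
For anyone who wants to try the suggestions in the reply above (batched inserts, daily docs, compaction) from a script rather than Futon, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the method used in the thread: the server URL, the database name "detector", the record layout with a "ts" field, and all function names are hypothetical, and it assumes a CouchDB that accepts unauthenticated requests. The endpoints used are POST /{db}/_bulk_docs, POST /{db}/_compact, POST /{db}/_compact/{ddoc}, and the "compact_running" flag returned by GET /{db}.

```python
import json
import time
import urllib.request
from collections import defaultdict

BASE = "http://localhost:5984"  # assumption: local CouchDB, no admin auth
DB = "detector"                 # hypothetical database name


def batches(docs, size=100):
    """Yield successive lists of at most `size` docs from a list."""
    for i in range(0, len(docs), size):
        yield docs[i:i + size]


def daily_docs(records):
    """Combine per-observation records into one document per day.

    Assumes each record has an ISO timestamp string under "ts",
    e.g. {"ts": "2012-05-14T03:42:00", "value": 1.5}.
    """
    by_day = defaultdict(list)
    for rec in records:
        by_day[rec["ts"][:10]].append(rec)  # first 10 chars: YYYY-MM-DD
    return [{"_id": day, "observations": obs}
            for day, obs in sorted(by_day.items())]


def post_json(path, payload):
    """POST a JSON body to the server and return the decoded response."""
    req = urllib.request.Request(
        BASE + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def bulk_insert(docs, size=100):
    """Insert docs through _bulk_docs, `size` at a time."""
    for batch in batches(docs, size):
        post_json("/%s/_bulk_docs" % DB, {"docs": batch})


def compact_and_wait(ddoc=None, poll_seconds=5):
    """Start database compaction, optionally also view compaction for
    one design doc, then poll until database compaction has finished.

    The polling only watches the database's own "compact_running" flag;
    it does not wait for view compaction.
    """
    post_json("/%s/_compact" % DB, {})
    if ddoc:
        post_json("/%s/_compact/%s" % (DB, ddoc), {})
    while True:
        with urllib.request.urlopen("%s/%s" % (BASE, DB)) as resp:
            info = json.loads(resp.read().decode("utf-8"))
        if not info.get("compact_running"):
            return info
        time.sleep(poll_seconds)
```

A typical run would be bulk_insert(daily_docs(records)) followed by compact_and_wait(), then comparing the "disk_size" value in the returned info against the size before compaction. Whether daily docs actually come out smaller for a given data set is something to measure rather than assume, and the batch size of 100 is only the starting point suggested above.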
