Thanks, Sharath. For the record, I did misspeak; the default value for checkpoint_after is 10x the buffer size, not 10 (i.e., it's measured in bytes). Cheers,

Adam
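For anyone replaying this thread against their own install: both knobs live in the [database_compaction] section of local.ini. The lines below are only a sketch showing the stock 1.x defaults and the relationship Adam describes (checkpoint_after is in bytes and, when left unset, works out to 10x doc_buffer_size); check the config reference for your own CouchDB version before copying them.

[database_compaction]
; maximum bytes of document data the compactor buffers in memory before flushing
doc_buffer_size = 524288
; checkpoint (fsync) after roughly this many bytes have been copied;
; left unset it defaults to 10 * doc_buffer_size
checkpoint_after = 5242880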
> On Jun 19, 2015, at 10:03 AM, Kiril Stankov <[email protected]> wrote:
>
> Thanks!
> I'll try the suggestions.
> ------------------------------------------------------------------------
> *With best regards,*
> Kiril Stankov,
>
> On 19-Jun-15 6:05 AM, Sharath wrote:
>> Hi Kiril,
>>
>> I came across this issue when I was using CouchDB to store large documents.
>> Various members of this forum helped me.
>>
>> You can find the conversation here:
>> http://qnalist.com/questions/5836043/couchdb-database-size
>>
>> The following settings helped reduce my database file size:
>>
>> checkpoint_after = 5242880000
>> doc_buffer_size = 524288000
>>
>> I haven't had to revisit these settings. However, the drawback is the RAM
>> consumed (largish during compaction). I used to compact twice daily, but now
>> it's once weekly.
>>
>> My application mostly inserts into the database.
>>
>> regards,
>> Sharath
>>
>> On Fri, Jun 19, 2015 at 6:23 AM, Adam Kocoloski <[email protected]> wrote:
>>
>>> Yep, it’s normal. The wasted space is due to the purely copy-on-write
>>> nature of the btree indexes that the database maintains. Two main things
>>> you can do to reduce the overhead:
>>>
>>> * use the _bulk_docs endpoint
>>> * choose a long common prefix for the _ids of the documents in a given
>>>   payload
>>>
>>> Yes, periodic compaction and cleanup is a good practice. Compaction only
>>> requires 1-2 extra file descriptors. It will use up to `doc_buffer_size`
>>> bytes to store docs in memory (default 512k), and will fsync after it fills
>>> the buffer `checkpoint_after` times (default 10). A larger buffer should
>>> result in a slightly faster compaction and a slightly more compact file.
>>> You probably don’t want to bother changing the checkpoint frequency. Cheers,
>>>
>>> Adam
>>>
>>>> On Jun 18, 2015, at 2:11 PM, Kiril Stankov <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm now importing a large number of documents into CouchDB.
>>>> The documents have only a single revision, and they will stay at a single
>>>> rev in one of the DBs.
>>>> I notice that the DB size grows significantly, then drops by about 70%
>>>> after compaction.
>>>> This process of importing single-revision documents will occur once a week.
>>>>
>>>> Why is so much space wasted? Is it normal?
>>>>
>>>> Is it good practice to run compact and cleanup periodically?
>>>>
>>>> Is there some DB size limit after which compact and cleanup may cause
>>>> issues or have problems running, e.g. with file descriptors or memory? How
>>>> should I configure checkpoint_after and doc_buffer_size?
>>>> Thanks in advance.
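As a rough illustration of Adam's two suggestions from earlier in the thread (load through _bulk_docs and give every document in a payload a long common _id prefix), here is a minimal sketch using only the Python standard library. The host, database name, and id scheme are invented for the example, and the target database is assumed to already exist and to be reachable without authentication.

import json
import urllib.request

COUCH = "http://localhost:5984"   # assumed local CouchDB, no auth
DB = "weekly_import"              # hypothetical database; must already exist

def bulk_insert(batch_no, docs):
    # One long shared prefix per payload keeps the generated _ids adjacent
    # in the id btree, which limits copy-on-write churn during the import.
    prefix = "import-2015-06-19-batch-%06d-" % batch_no
    for i, doc in enumerate(docs):
        doc["_id"] = prefix + "%08d" % i
    body = json.dumps({"docs": docs}).encode("utf-8")
    req = urllib.request.Request(
        "%s/%s/_bulk_docs" % (COUCH, DB),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # CouchDB replies with one status object per submitted document.
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    docs = [{"type": "reading", "value": n} for n in range(1000)]
    results = bulk_insert(1, docs)
    print("%d documents submitted" % len(results))

After a weekly import along these lines, a POST to /weekly_import/_compact (and /weekly_import/_view_cleanup if the database has views) should reclaim most of the copy-on-write overhead discussed above.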
