Configurable database compaction settings
-----------------------------------------

                 Key: COUCHDB-1142
                 URL: https://issues.apache.org/jira/browse/COUCHDB-1142
             Project: CouchDB
          Issue Type: Improvement
            Reporter: Filipe Manana
            Assignee: Filipe Manana
             Fix For: 1.2
         Attachments: configurable_db_compaction.patch

Currently the database compaction code uses a hard-coded batch size of 1 000 
doc infos and a checkpoint frequency of 10 000 (a checkpoint is done after 
every 10 000 documents are copied to the compaction file).
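
To make the two settings concrete, here is a minimal Python sketch of a 
batch-and-checkpoint copy loop. It is not CouchDB's actual (Erlang) compaction 
code; the names compact, copy_batch, checkpoint, batch_size and 
checkpoint_after are hypothetical and only illustrate the idea.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for a doc info record; here it is just a doc id.
DocInfo = str


def compact(
    doc_infos: Iterable[DocInfo],
    copy_batch: Callable[[List[DocInfo]], None],
    checkpoint: Callable[[], None],
    batch_size: int = 1000,        # doc infos buffered before each copy
    checkpoint_after: int = 10000, # docs copied between checkpoints
) -> None:
    """Copy doc infos in batches and checkpoint every `checkpoint_after` docs."""
    batch: List[DocInfo] = []
    copied_since_checkpoint = 0
    for info in doc_infos:
        batch.append(info)
        if len(batch) >= batch_size:
            copy_batch(batch)
            copied_since_checkpoint += len(batch)
            batch = []
            if copied_since_checkpoint >= checkpoint_after:
                checkpoint()
                copied_since_checkpoint = 0
    if batch:
        copy_batch(batch)  # flush the final, partial batch
    checkpoint()  # final checkpoint once everything has been copied
```

Making batch_size and checkpoint_after parameters, rather than constants 
inside the loop, is the change this issue asks for.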

As part of the experiment/discussion of COUCHDB-1132, it was agreed to make 
both these values configurable (see the comments of that issue). The results I 
posted there also show that increasing the batch size results in a smaller file 
size. I ran some tests after this on a 2.2Gb database, and I noticed that, at 
least on this machine (32-bit Linux, 5400 rpm spinning disk), increasing the 
batch size increases compaction time - this might not necessarily be true on 
other systems.

Here are some tests (before running each, I cleared my filesystem caches with 
"echo 3 > /proc/sys/vm/drop_caches").
For this particular database, 1 000 doc infos correspond to about 100Kb.
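
As a rough guide to how the byte-based buffer sizes used in the runs below 
compare with the old count of 1 000 doc infos, the conversion can be sketched 
as follows. It assumes 1Kb = 1024 bytes and that the "about 100Kb per 1 000 
doc infos" ratio holds across the whole database; the function name is 
hypothetical.

```python
# About 100Kb per 1 000 doc infos, as measured for this database.
BYTES_PER_1000_DOC_INFOS = 100 * 1024


def approx_doc_infos(buffer_bytes: int) -> int:
    """Estimate how many doc infos fit in a buffer of `buffer_bytes` bytes."""
    return buffer_bytes * 1000 // BYTES_PER_1000_DOC_INFOS


if __name__ == "__main__":
    sizes = [("512Kb", 512 * 1024), ("1Mb", 1024 ** 2),
             ("2Mb", 2 * 1024 ** 2), ("8Mb", 8 * 1024 ** 2)]
    for label, size in sizes:
        print(f"{label:>5} buffer ~ {approx_doc_infos(size)} doc infos")
```

Under these assumptions a 512Kb buffer holds roughly 5 120 doc infos, about 
five times the batch size used by trunk.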

2.2Gb database, 232 551 documents

trunk

[Wed, 20 Apr 2011 23:13:41 GMT] [info] [<0.151.0>] Starting compaction for db "testdb1"
[Wed, 20 Apr 2011 23:24:08 GMT] [info] [<0.151.0>] Compaction for db "testdb1" completed.

final database file size was 1088 Mb


512Kb buffer size, checkpoint after 5Mb

[Wed, 20 Apr 2011 22:50:56 GMT] [info] [<0.154.0>] Starting compaction for db "testdb1"
[Wed, 20 Apr 2011 23:05:54 GMT] [info] [<0.154.0>] Compaction for db "testdb1" completed.

final database file size was 954 Mb


1Mb buffer size, checkpoint after 10Mb

[Wed, 20 Apr 2011 21:33:23 GMT] [info] [<0.162.0>] Starting compaction for db "testdb1"
[Wed, 20 Apr 2011 21:48:46 GMT] [info] [<0.162.0>] Compaction for db "testdb1" completed.

final database file size was 896 Mb


2Mb buffer size, checkpoint after 20Mb

[Wed, 20 Apr 2011 22:29:06 GMT] [info] [<0.150.0>] Starting compaction for db "testdb1"
[Wed, 20 Apr 2011 22:46:41 GMT] [info] [<0.150.0>] Compaction for db "testdb1" completed.

final database file size was 840 Mb


8Mb buffer size, checkpoint after 80Mb

[Wed, 20 Apr 2011 22:03:27 GMT] [info] [<0.152.0>] Starting compaction for db "testdb1"
[Wed, 20 Apr 2011 22:22:45 GMT] [info] [<0.152.0>] Compaction for db "testdb1" completed.

final database file size was 751 Mb
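

To compare the runs side by side, the elapsed time of each compaction can be 
derived from the log timestamps above. The short Python script below is only a 
convenience for reading the numbers; the timestamps and file sizes are copied 
from the logs, and the names RUNS and elapsed_seconds are made up for this 
example.

```python
from email.utils import parsedate_to_datetime

# (configuration, compaction start, compaction end, final file size in Mb)
RUNS = [
    ("trunk",
     "Wed, 20 Apr 2011 23:13:41 GMT", "Wed, 20 Apr 2011 23:24:08 GMT", 1088),
    ("512Kb buffer, checkpoint after 5Mb",
     "Wed, 20 Apr 2011 22:50:56 GMT", "Wed, 20 Apr 2011 23:05:54 GMT", 954),
    ("1Mb buffer, checkpoint after 10Mb",
     "Wed, 20 Apr 2011 21:33:23 GMT", "Wed, 20 Apr 2011 21:48:46 GMT", 896),
    ("2Mb buffer, checkpoint after 20Mb",
     "Wed, 20 Apr 2011 22:29:06 GMT", "Wed, 20 Apr 2011 22:46:41 GMT", 840),
    ("8Mb buffer, checkpoint after 80Mb",
     "Wed, 20 Apr 2011 22:03:27 GMT", "Wed, 20 Apr 2011 22:22:45 GMT", 751),
]


def elapsed_seconds(start: str, end: str) -> int:
    """Seconds between two RFC 2822 style timestamps as used in the logs."""
    delta = parsedate_to_datetime(end) - parsedate_to_datetime(start)
    return int(delta.total_seconds())


if __name__ == "__main__":
    for label, start, end, size_mb in RUNS:
        secs = elapsed_seconds(start, end)
        print(f"{label:<36} {secs // 60:>2}m{secs % 60:02d}s  {size_mb:>4} Mb")
```

On these numbers, compaction time grows from 10m27s on trunk to 19m18s with 
the 8Mb buffer, while the final file size drops from 1088Mb to 751Mb.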


Patch attached.

