[ 
https://issues.apache.org/jira/browse/COUCHDB-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14642611#comment-14642611
 ] 

ASF GitHub Bot commented on COUCHDB-2726:
-----------------------------------------

Github user eiri closed the pull request at:

    https://github.com/apache/couchdb-couch/pull/61


> Remove a compression's over-optimization
> ----------------------------------------
>
>                 Key: COUCHDB-2726
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2726
>             Project: CouchDB
>          Issue Type: Improvement
>      Security Level: public(Regular issues) 
>            Reporter: Eric Avdey
>            Assignee: Eric Avdey
>            Priority: Minor
>
> When file compression is set to snappy, couch performs an additional 
> optimization step: it also compresses the term with deflate, compares the 
> sizes of the two resulting binaries, and keeps the smaller one. As a 
> result, a "winning" deflated term is decompressed and compressed again 
> on each document update, because deflate-compressed terms are not 
> recognized when the file_compression option is set to snappy. The 
> recompression itself exists to allow migration from deflate to snappy.
> However, this optimization is a problem, because couch keeps the `body` 
> field of the #doc record as a 2-element tuple: the compressed body and the 
> compressed list of attachment pointers. If the document has no attachments, 
> the pointers are an empty list, which deflate always compresses better than 
> snappy. In other words, if file_compression is set to snappy, almost 
> every document in every database goes through a decompression/compression 
> cycle on each write.
> A basic test shows that this compression optimization saves, on average, 
> less than one percent of disk space, so the space saved is not worth the 
> CPU cycles spent.
> http://nbviewer.ipython.org/gist/eiri/79d91a797af9c6a6ff6d
> I suggest removing this optimization altogether and simply following the 
> configured option when choosing the compression library.
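
The cycle described in the issue can be sketched as a toy model. This is an illustration under stated assumptions, not CouchDB's code: the one-byte tags, the function names, and the `pick_smaller` flag are hypothetical, and because snappy is not in the Python standard library, `toy_pack` stands in for it by storing the input uncompressed behind a length header. Only the deflate side (`zlib`) is a real codec.

```python
import zlib

# Hypothetical one-byte tags for this sketch; not CouchDB's real on-disk prefixes.
DEFLATE, SNAPPY = b"D", b"S"


def toy_pack(data: bytes) -> bytes:
    """Stand-in for snappy: stores the input uncompressed behind a 4-byte length."""
    return len(data).to_bytes(4, "big") + data


def toy_unpack(blob: bytes) -> bytes:
    """Inverse of toy_pack."""
    size = int.from_bytes(blob[:4], "big")
    return blob[4:4 + size]


CODECS = {
    DEFLATE: (zlib.compress, zlib.decompress),
    SNAPPY: (toy_pack, toy_unpack),
}


def compress(data: bytes, method: bytes, pick_smaller: bool) -> bytes:
    """Compress with the configured method. With pick_smaller set and the
    method being SNAPPY, also try deflate and keep whichever is smaller."""
    best = method + CODECS[method][0](data)
    if pick_smaller and method == SNAPPY:
        alt = DEFLATE + CODECS[DEFLATE][0](data)
        if len(alt) < len(best):
            best = alt
    return best


def decompress(blob: bytes) -> bytes:
    """Dispatch on the leading tag byte."""
    return CODECS[blob[:1]][1](blob[1:])


def update(blob: bytes, method: bytes, pick_smaller: bool):
    """Model one document write. A blob whose tag differs from the configured
    method is treated as needing migration, so it is decompressed and
    compressed again. Returns (blob, recompressed)."""
    if blob[:1] == method:
        return blob, False
    return compress(decompress(blob), method, pick_smaller), True


def count_recompressions(data: bytes, updates: int, pick_smaller: bool) -> int:
    """Count how many of `updates` writes trigger a recompression."""
    blob = compress(data, SNAPPY, pick_smaller)
    count = 0
    for _ in range(updates):
        blob, redone = update(blob, SNAPPY, pick_smaller)
        count += redone
    return count
```

For a payload where deflate wins, such as `b"a" * 100`, `count_recompressions(payload, 5, True)` recompresses on every one of the five writes, because the deflate-tagged result never matches the configured method. With `pick_smaller=False`, which models the proposed change, no write recompresses.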



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
