[ 
https://issues.apache.org/jira/browse/COUCHDB-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695893#action_12695893
 ] 

Paul Joseph Davis commented on COUCHDB-220:
-------------------------------------------

You forgot to mention the part where I said that I have absolutely no idea if 
MinSize is important for some other part of the code ;)

Also, it looks like the issue could be solved by adding a check in
couch_db:doc_flush_binaries/2 for whether the attachment being written has a
known length, and then passing that information to
couch_stream:ensure_buffer/2 so that couch_stream can decide whether to
allocate exactly the requested amount or some extra. The patch should be
relatively trivial, but like I said, I have no idea if there is other
important stuff going on there or not.
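
As a rough illustration of the idea only (Python, not the actual Erlang in
couch_db or couch_stream), the decision could look like the sketch below. The
function name, the length_known flag and the MIN_SIZE value are all
hypothetical; the sketch also assumes the padding comes from a minimum
reservation size, which has not been confirmed here.

```python
# Hypothetical stand-in for the MinSize discussed above; the real value
# used by couch_stream is not stated in this thread.
MIN_SIZE = 4096


def bytes_to_reserve(requested, length_known, min_size=MIN_SIZE):
    """Return how many bytes a stream should reserve for a write.

    requested    -- number of bytes the caller is about to write
    length_known -- True when the attachment length is known up front
    """
    if length_known:
        # Known length: reserve exactly what was asked for, so no
        # zero-filled padding follows the attachment.
        return requested
    # Unknown length (streaming): keep reserving at least min_size so
    # later chunks can still land in the same buffer.
    return max(requested, min_size)


if __name__ == "__main__":
    # 100-byte attachment under both policies.
    print(bytes_to_reserve(100, length_known=True))
    print(bytes_to_reserve(100, length_known=False))
```

Whether the streaming case can safely keep the larger reservation depends on
what else relies on MinSize, which is exactly the open question above.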

> Extreme sparseness in couch files
> ---------------------------------
>
>                 Key: COUCHDB-220
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-220
>             Project: CouchDB
>          Issue Type: Bug
>          Components: Database Core
>    Affects Versions: 0.9
>         Environment: ubuntu 8.10 64-bit, ext3
>            Reporter: Robert Newson
>
> When adding ten thousand documents, each with a small attachment, the 
> discrepancy between reported file size and actual file size becomes huge;
> ls -lh shard0.couch
> 698M 2009-01-23 13:42 shard0.couch
> du -sh shard0.couch
> 57M   shard0.couch
> On filesystems that do not support sparse files (holes), this will cause an
> order of magnitude more I/O.
> I think it was introduced by the streaming attachment patch as each 
> attachment is followed by huge swathes of zeroes when viewed with 'hd -v'.
> Compacting this database reduced it to 7.8 MB, indicating other sparseness 
> besides attachments.
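
The ls/du comparison quoted above can also be made programmatically. The
following is a minimal sketch, assuming a POSIX system where st_blocks is
reported in 512-byte units; the helper name file_sizes is made up for this
example, and the path is the one from the report.

```python
import os


def file_sizes(path):
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes  -- the size 'ls -l' reports (st_size)
    allocated_bytes -- the space actually allocated on disk, which is
                       what 'du' reports (st_blocks * 512 on POSIX)
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512


if __name__ == "__main__":
    path = "shard0.couch"  # file name taken from the report above
    if os.path.exists(path):
        apparent, allocated = file_sizes(path)
        print("apparent: ", apparent)
        print("allocated:", allocated)
```

A large gap between the two numbers indicates a sparse file; on a filesystem
without sparse-file support the two values stay close together.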

