[ https://issues.apache.org/jira/browse/COUCHDB-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12695893#action_12695893 ]
Paul Joseph Davis commented on COUCHDB-220:
-------------------------------------------
You forgot to mention the part where I said that I have absolutely no idea if
MinSize is important for some other part of the code ;)
Also, it looks like we could solve the issue by adding a check in
couch_db:doc_flush_binaries/2 for whether we're streaming an attachment of
unknown length, and then passing that information to
couch_stream:ensure_buffer/2 so that couch_stream can decide whether to
allocate exactly the requested amount or some extra. The patch should be
relatively trivial, but like I said, I have no idea if there is other
important stuff going on there or not.
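
To make the idea concrete, here is a rough model of the allocation decision,
written in Python rather than Erlang. It is only a sketch, not the actual
couch_stream code: the function names, the length_known flag, and the MIN_SIZE
value are all assumptions made for illustration, and the real MinSize value is
not stated in this thread.

```python
# Hypothetical model of the proposed allocation decision.
# Nothing here is taken from the CouchDB source tree.

MIN_SIZE = 64 * 1024  # assumed stand-in for MinSize; the real value is unknown here


def bytes_to_reserve(requested: int, length_known: bool,
                     min_size: int = MIN_SIZE) -> int:
    """Return how many bytes to reserve in the file for one attachment."""
    if requested < 0:
        raise ValueError("requested must be non-negative")
    if length_known:
        # Length is known up front: reserve exactly what is needed.
        return requested
    # Unknown length (streaming): keep the padding so later chunks fit.
    return max(requested, min_size)


def total_reserved(sizes, length_known: bool) -> int:
    """Sum the reserved bytes over a batch of attachments."""
    return sum(bytes_to_reserve(size, length_known) for size in sizes)
```

Under these assumed numbers, many small attachments of known length would
reserve only their own bytes, while the padded path would reserve at least
MIN_SIZE for each one.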
> Extreme sparseness in couch files
> ---------------------------------
>
> Key: COUCHDB-220
> URL: https://issues.apache.org/jira/browse/COUCHDB-220
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Affects Versions: 0.9
> Environment: ubuntu 8.10 64-bit, ext3
> Reporter: Robert Newson
>
> When adding ten thousand documents, each with a small attachment, the
> discrepancy between reported file size and actual file size becomes huge:
> ls -lh shard0.couch
> 698M 2009-01-23 13:42 shard0.couch
> du -sh shard0.couch
> 57M shard0.couch
> On filesystems that do not support sparse files (holes), this will cause an
> order of magnitude more I/O.
> I think it was introduced by the streaming attachment patch as each
> attachment is followed by huge swathes of zeroes when viewed with 'hd -v'.
> Compacting this database reduced it to 7.8 MB, indicating other sparseness
> besides attachments.
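
The ls/du comparison quoted above can also be done from a script. The following
is a minimal sketch, assuming a Linux or other Unix system where st_blocks is
reported in 512-byte units; the helper name file_sizes is made up for this
example.

```python
import os


def file_sizes(path: str) -> tuple[int, int]:
    """Return (apparent_bytes, allocated_bytes) for a file.

    apparent_bytes is the length that ls -l reports; allocated_bytes is the
    space actually backed by disk blocks, which is what du reports.
    """
    st = os.stat(path)
    apparent = st.st_size
    # Assumption: st_blocks counts 512-byte units, as it does on Linux.
    allocated = st.st_blocks * 512
    return apparent, allocated


def is_sparse(path: str) -> bool:
    """True when fewer bytes are allocated than the apparent file length."""
    apparent, allocated = file_sizes(path)
    return allocated < apparent
```

Calling file_sizes("shard0.couch") on the file from the report would give the
two numbers to compare; a large gap between them indicates holes in the file.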