[
https://issues.apache.org/jira/browse/COUCHDB-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12696888#action_12696888
]
Paul Joseph Davis commented on COUCHDB-220:
-------------------------------------------
Chris,
First, I'm pretty certain that this bug only affects document writes that
include an attachment.
You should check whether your Erlang loader is passing the proper attachment
information all the way down into couch_db:doc_flush_binaries. My first
haphazard guess is that it's not. My second random guess is that you could be
seeing the same bug from a different code path. Also, there's another slight
tweak to the patch so that it only uses the 65K allocation when there's a
binary of unknown size.
Either way, I'm fairly certain that while changing min_alloc to a single
byte shows that there is a bug, it's not the proper fix for it.
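For illustration only, the allocation rule mentioned above (fall back to the
65K chunk only when the binary's size is unknown) could be sketched roughly as
follows. This is Python rather than CouchDB's Erlang, and the names
UNKNOWN_SIZE_ALLOC and alloc_size are hypothetical; it also assumes "65K"
means 65536 bytes and that a known-size binary gets exactly its own length.

```python
# Hypothetical sketch of the allocation rule, not CouchDB's actual code.
# Assumption: "65K" means 64 KiB (65536 bytes).
UNKNOWN_SIZE_ALLOC = 65536


def alloc_size(binary_len=None):
    """Return how many bytes to reserve for one attachment.

    binary_len is the attachment length in bytes, or None when the length
    is not known up front (for example, a streamed upload).
    """
    if binary_len is None:
        # Size unknown: reserve the large chunk so streaming can proceed.
        return UNKNOWN_SIZE_ALLOC
    # Size known: reserve exactly what is needed, leaving no padding.
    return binary_len
```

With a rule like this, a small attachment of known length would no longer be
followed by a large run of unused bytes.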
> Extreme sparseness in couch files
> ---------------------------------
>
> Key: COUCHDB-220
> URL: https://issues.apache.org/jira/browse/COUCHDB-220
> Project: CouchDB
> Issue Type: Bug
> Components: Database Core
> Affects Versions: 0.9
> Environment: ubuntu 8.10 64-bit, ext3
> Reporter: Robert Newson
> Attachments: 220.patch, 220.patch, attachment_sparseness.js,
> stream.diff
>
>
> When adding ten thousand documents, each with a small attachment, the
> discrepancy between reported file size and actual file size becomes huge:
> ls -lh shard0.couch
> 698M 2009-01-23 13:42 shard0.couch
> du -sh shard0.couch
> 57M shard0.couch
> On filesystems that do not support write holes, this will cause an order of
> magnitude more I/O.
> I think it was introduced by the streaming attachment patch as each
> attachment is followed by huge swathes of zeroes when viewed with 'hd -v'.
> Compacting this database reduced it to 7.8 MB, indicating other sparseness
> besides attachments.
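The ls/du comparison in the report can also be reproduced programmatically.
The following is a minimal sketch, assuming a Unix-like system where os.stat
reports st_blocks in 512-byte units; the function name file_sparseness is
made up for this example.

```python
import os


def file_sparseness(path):
    """Return (apparent_bytes, allocated_bytes) for the file at path.

    apparent_bytes is the size that `ls -l` reports; allocated_bytes is the
    space actually occupied on disk, which is what `du` reports.
    """
    st = os.stat(path)
    apparent = st.st_size
    # Assumption: st_blocks counts 512-byte units (true on Linux and most
    # Unix-like systems; the attribute is not available on Windows).
    allocated = st.st_blocks * 512
    return apparent, allocated
```

An apparent size far larger than the allocated size, as with the 698M versus
57M figures in the report, indicates that most of the file consists of holes.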