[
https://issues.apache.org/jira/browse/OAK-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282321#comment-14282321
]
Thomas Mueller commented on OAK-2105:
-------------------------------------
Looks good to me now. In one example, I got:
{noformat}
> oak.blobStats();
"count" : 6262,
"size" : 1160,
"storageSize" : 1274,
"bsonSize" : 1028,
"indexSize" : 0
{noformat}
Most binaries are quite large; the BSON size of the documents is at most
2'096'246 bytes. Many have a size of 1'047'670 bytes. I'm not sure what those
are (Lucene?), but that is below 1024 * 1024 (1'048'576), so that's fine as well.
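
As a rough check of the arithmetic above, the following sketch computes how many bytes remain between each reported BSON size and the nearest boundary above it. The class and method names are hypothetical. The 1 MiB boundary comes from the comment; comparing the largest documents against 2 MiB is an assumption, since the comment does not name that limit.

```java
public class BlobSlack {

    // Bytes left between a document's BSON size and an allocation boundary.
    // Hypothetical helper for illustration; not part of Oak or the MongoDB driver.
    static long slack(long bsonSize, long boundary) {
        if (bsonSize > boundary) {
            throw new IllegalArgumentException("document exceeds boundary");
        }
        return boundary - bsonSize;
    }

    public static void main(String[] args) {
        long oneMiB = 1024L * 1024L; // 1'048'576, as stated in the comment

        // Common document size reported in the comment.
        System.out.println(slack(1_047_670L, oneMiB));     // prints 906

        // Largest reported size, compared against 2 MiB (an assumption).
        System.out.println(slack(2_096_246L, 2 * oneMiB)); // prints 906
    }
}
```

Under these assumptions both sizes sit 906 bytes below a power-of-two boundary, which would be consistent with chunks sized to leave a fixed amount of room for per-document overhead.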
> Review padding for blobs collection
> -----------------------------------
>
> Key: OAK-2105
> URL: https://issues.apache.org/jira/browse/OAK-2105
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, mongomk
> Reporter: Marcel Reutegger
> Assignee: Thomas Mueller
> Fix For: 1.2, 1.1.5
>
>
> MongoDB does some default padding when it stores documents. The default
> policy adds some padding and then rounds up to the next power of 2 number
> of bytes. For the blobs collection with documents that are written once
> and never modified, this default behavior may not be optimal. E.g. the
> Oak lucene directory implementation splits data into 32k chunks and stores
> them as multi-valued binary properties. This leads to documents that are
> slightly over 32k bytes in size, so MongoDB will allocate 64k for each of
> them. Almost half of the space is wasted.
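
The effect described in the issue can be sketched with a simplified model that only rounds a document size up to the next power of two. This is an illustration, not MongoDB's actual allocator: it ignores record headers and any extra padding factor, and the class name, method names, and the 100-byte overhead are hypothetical.

```java
public class PaddingSketch {

    // Smallest power of two that is >= n, for n >= 1.
    // Simplified stand-in for the rounding policy described in the issue.
    static long nextPowerOfTwo(long n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be positive");
        }
        long highest = Long.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    // Fraction of the allocated record that holds no document data.
    static double wastedFraction(long documentSize) {
        long allocated = nextPowerOfTwo(documentSize);
        return (allocated - documentSize) / (double) allocated;
    }

    public static void main(String[] args) {
        long chunk = 32 * 1024;        // 32k chunk, as in the issue description
        long document = chunk + 100;   // hypothetical per-document overhead

        System.out.println(nextPowerOfTwo(document)); // prints 65536
        System.out.println(wastedFraction(document)); // just under 0.5
    }
}
```

In this model a document that fits exactly into 32k wastes nothing, while one that is a few bytes larger wastes almost half of its 64k allocation, which is why keeping documents just below a power-of-two boundary matters.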