[
https://issues.apache.org/jira/browse/OAK-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alex Parvulescu updated OAK-2140:
---------------------------------
Fix Version/s: 1.0.9
> Segment Compactor will not compact binaries > 16k
> -------------------------------------------------
>
> Key: OAK-2140
> URL: https://issues.apache.org/jira/browse/OAK-2140
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core, segmentmk
> Reporter: Alex Parvulescu
> Assignee: Alex Parvulescu
> Fix For: 1.0.9, 1.1.3
>
> Attachments: OAK-2140.patch
>
>
> The compaction bit relies on the SegmentBlob#clone method when a binary
> is being processed, but it looks like the #clone contract is not fully
> enforced for streams that qualify as 'long values' (>16k, if I read the
> code correctly).
> What happens is that the stream is initially persisted as chunks in a
> ListRecord. When compaction calls #clone, it gets back the original list of
> record ids, which is then referenced from the compacted node state [0]. This
> makes compaction of large binaries ineffective, as the bulk segments will
> never move from the location where they were created unless the referencing
> node gets deleted.
> I think the original design was set up to prevent large binaries from being
> copied over but, looking at the size problem we have now, it might be a good
> time to reconsider this approach.
> [0]
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentBlob.java#L75
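
The behaviour described in the report can be sketched with a toy model. This is only an illustration under stated assumptions: CompactionSketch, Store, shallowClone, deepClone and references are hypothetical names and are not Oak's API, and record ids are modelled as plain strings. It contrasts a clone that hands back the original record ids with one that rewrites each chunk into the compacted store.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these types exist in Oak.
public class CompactionSketch {

    /** Hypothetical segment store: record id -> chunk bytes. */
    static final class Store {
        final String name;
        final Map<String, byte[]> records = new HashMap<>();

        Store(String name) {
            this.name = name;
        }

        /** Persists a chunk and returns its (made-up) record id. */
        String write(byte[] chunk) {
            String id = name + ":" + records.size();
            records.put(id, chunk);
            return id;
        }
    }

    /** Clone that returns the original ids, so the source store stays referenced. */
    static List<String> shallowClone(List<String> chunkIds) {
        return new ArrayList<>(chunkIds);
    }

    /** Clone that rewrites every chunk into the target store and returns the new ids. */
    static List<String> deepClone(List<String> chunkIds, Store source, Store target) {
        List<String> copied = new ArrayList<>();
        for (String id : chunkIds) {
            copied.add(target.write(source.records.get(id)));
        }
        return copied;
    }

    /** True if any id still points into the given store, i.e. it cannot be reclaimed. */
    static boolean references(List<String> chunkIds, Store store) {
        for (String id : chunkIds) {
            if (store.records.containsKey(id)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Store original = new Store("old");
        Store compacted = new Store("new");

        // A "long value" persisted as a list of chunks in the original store.
        List<String> blob = new ArrayList<>();
        blob.add(original.write(new byte[4096]));
        blob.add(original.write(new byte[4096]));

        // Shallow clone keeps the original store alive; deep clone does not.
        System.out.println(references(shallowClone(blob), original));                // true
        System.out.println(references(deepClone(blob, original, compacted), original)); // false
    }
}
```

In this model the shallow clone is cheap but pins the old store, while the deep clone pays the copy cost so that the old chunks become reclaimable, which is the trade-off the report suggests revisiting.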