[ https://issues.apache.org/jira/browse/OAK-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Parvulescu updated OAK-3133:
---------------------------------
    Attachment: OAK-3133-partial.patch

What I'm seeing after a few rounds of offline compaction (1.0 code only) is 
that a considerable amount of time is lost in the TreeMap insert methods. 
Switching to HashMaps and sorting once after the map has been built gives an 
order-of-magnitude boost. Proper benchmarks are still missing, though.
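
To illustrate the idea (this is not the code from the attached patch), here is 
a minimal standalone sketch. The class name, the method names and the use of 
plain long keys as stand-ins for record ids are all assumptions made for the 
example. It contrasts keeping the keys sorted on every insert with inserting 
into a HashMap and sorting once at the end; both yield the same sorted keys.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; not taken from OAK-3133-partial.patch.
public class CompactionMapSketch {

    // Baseline: the TreeMap keeps its keys ordered on every put,
    // which costs O(log n) comparisons per insert.
    static long[] sortedKeysViaTreeMap(long[] keys) {
        TreeMap<Long, Long> map = new TreeMap<>();
        for (long k : keys) {
            map.put(k, k);
        }
        long[] out = new long[map.size()];
        int i = 0;
        for (long k : map.keySet()) {
            out[i++] = k; // keySet() of a TreeMap iterates in ascending order
        }
        return out;
    }

    // Alternative: unordered HashMap inserts (expected O(1) each),
    // followed by a single sort once the map is complete.
    static long[] sortedKeysViaHashMap(long[] keys) {
        Map<Long, Long> map = new HashMap<>();
        for (long k : keys) {
            map.put(k, k);
        }
        long[] out = new long[map.size()];
        int i = 0;
        for (long k : map.keySet()) {
            out[i++] = k; // iteration order is unspecified here
        }
        Arrays.sort(out); // the "post map building" sorting step
        return out;
    }

    public static void main(String[] args) {
        long[] keys = {42L, 7L, 19L, 7L, 3L};
        long[] a = sortedKeysViaTreeMap(keys);
        long[] b = sortedKeysViaHashMap(keys);
        System.out.println(Arrays.equals(a, b));
    }
}
```

The sketch only shows that the two approaches are interchangeable in terms of 
the result; whether the second one is actually faster for the compaction map 
is what the benchmarks would need to confirm.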

I'm attaching some POC code (1.0 branch) that my ideas are based on, to gather 
some feedback. Ideally I would integrate this into trunk (into the 
InMemoryCompactionMap) and provide some numbers.
FYI [~mduerig], [~frm].

> Make compaction map more efficient for offline compaction
> ---------------------------------------------------------
>
>                 Key: OAK-3133
>                 URL: https://issues.apache.org/jira/browse/OAK-3133
>             Project: Jackrabbit Oak
>          Issue Type: Sub-task
>          Components: run
>            Reporter: Michael Dürig
>            Assignee: Francesco Mari
>              Labels: compaction, gc
>             Fix For: 1.0.20, 1.3.6, 1.2.5
>
>         Attachments: OAK-3133-01.patch, OAK-3133-partial.patch
>
>
> The compaction map might be expensive. See OAK-2862 for the analysis. We 
> should find ways to lower the impact of this on offline compaction. One 
> option would be to make the compress cycle configurable, which would require 
> more heap. Another option would be to leverage the persisted compaction map 
> here also.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
