[ https://issues.apache.org/jira/browse/OAK-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996472#comment-15996472 ]
Chetan Mehrotra commented on OAK-5192:
--------------------------------------
bq. Here content added consists of the number of bytes written to property
values in the segment store (excluding the ones that go to the data store).
If this size delta is mostly due to property values consisting of blobIds,
then yes, it's quite significant. One way we can reduce this is by raising the
chunk size (which defaults to ~1 MB) so that the number of blobIds per Lucene
index file is reduced.
By any chance, can we get the actual delta values? Also, with the recent changes
in OakDirectory, files should no longer be getting inlined.
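
To illustrate the chunk size point above, here is a minimal sketch of the
arithmetic, assuming each chunk of an index file is referenced by exactly one
blobId. The class BlobIdEstimate, the method blobIdCount and the 100 MB file
size are hypothetical and only serve the illustration; none of them are part
of Oak.

```java
// Hypothetical helper, not part of Oak: estimates how many blobIds are
// needed to reference one Lucene index file that is stored in fixed-size chunks.
final class BlobIdEstimate {

    private BlobIdEstimate() {
    }

    /** Number of chunks (one blobId each, by assumption) for a file of the given size. */
    static long blobIdCount(long fileSizeBytes, long chunkSizeBytes) {
        if (fileSizeBytes < 0 || chunkSizeBytes <= 0) {
            throw new IllegalArgumentException(
                    "file size must be >= 0 and chunk size must be > 0");
        }
        // Ceiling division: a trailing partial chunk still needs its own blobId.
        return (fileSizeBytes + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long fileSize = 100 * mb; // assumed example size of a single index file
        for (long chunkSize : new long[] {1 * mb, 4 * mb, 16 * mb}) {
            System.out.println(chunkSize / mb + " MB chunks -> "
                    + blobIdCount(fileSize, chunkSize) + " blobIds");
        }
    }
}
```

Under this assumption the blobId count, and with it the bytes written to
property values in the segment store, shrinks roughly in proportion to the
chunk size, at the cost of larger individual binaries.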
> Reduce Lucene related growth of repository size
> -----------------------------------------------
>
> Key: OAK-5192
> URL: https://issues.apache.org/jira/browse/OAK-5192
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: lucene, segment-tar
> Reporter: Michael Dürig
> Assignee: Tommaso Teofili
> Labels: perfomance, scalability
> Fix For: 1.8, 1.7.3
>
> Attachments: added-bytes-zoom.png
>
>
> I observed Lucene indexing contributing to up to 99% of repository growth.
> While the size of the index itself is well inside reasonable bounds, the
> overall turnover of data being written and removed again can be as much as
> 99%.
> In the case of the TarMK this negatively impacts overall system performance
> due to a fast-growing number of tar files / segments, poor locality of
> reference, cache misses/thrashing when looking up segments, and vastly
> prolonged garbage collection cycles.