[ https://issues.apache.org/jira/browse/OAK-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492820#comment-15492820 ]
Francesco Mari commented on OAK-4812:
-------------------------------------
I guess the per-segment record ID cache is of little use. It would be better
as a segment ID cache, since segment IDs seem to be the bulk of the problem:
creating record IDs is cheap, but creating segment IDs is not.
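
To make the idea concrete, here is a minimal Java sketch of a per-segment
cache of segment IDs. It is not Oak's code: IdTable, SegmentSketch and
referencedSegmentId are hypothetical names, segment IDs are modeled as plain
UUIDs, and the table only counts lookups so the effect is visible.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Illustrative sketch only; none of these types are Oak's real classes. */
public class SegmentIdCacheSketch {

    /** Hypothetical stand-in for the store-wide ID table; counts lookups. */
    static final class IdTable {
        private final Map<UUID, UUID> interned = new HashMap<>();
        int lookups = 0;

        // Models the comparatively expensive, synchronized lookup.
        synchronized UUID newSegmentId(long msb, long lsb) {
            lookups++;
            UUID id = new UUID(msb, lsb);
            UUID existing = interned.putIfAbsent(id, id);
            return existing != null ? existing : id;
        }
    }

    /** Hypothetical segment with external references stored as (msb, lsb). */
    static final class SegmentSketch {
        private final IdTable table;
        private final long[][] references;
        // One cache slot per referenced segment, filled on first use.
        private final UUID[] resolved;

        SegmentSketch(IdTable table, long[][] references) {
            this.table = table;
            this.references = references;
            this.resolved = new UUID[references.length];
        }

        // Not thread-safe; a real implementation would need to handle
        // concurrent readers.
        UUID referencedSegmentId(int index) {
            UUID id = resolved[index];
            if (id == null) {
                long[] ref = references[index];
                id = table.newSegmentId(ref[0], ref[1]);
                resolved[index] = id;
            }
            return id;
        }
    }

    public static void main(String[] args) {
        IdTable table = new IdTable();
        SegmentSketch segment =
                new SegmentSketch(table, new long[][] {{1L, 2L}, {3L, 4L}});
        // Many record references resolve to the same two segments.
        for (int i = 0; i < 1000; i++) {
            segment.referencedSegmentId(i % 2);
        }
        System.out.println("table lookups: " + table.lookups);
    }
}
```

With the cache in place, the 1000 reads in main reach the shared table only
twice, once per referenced segment, instead of once per record reference.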
> Reduce calls to SegmentStore#newSegmentId from the Segment class
> ----------------------------------------------------------------
>
> Key: OAK-4812
> URL: https://issues.apache.org/jira/browse/OAK-4812
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-tar
> Reporter: Alex Parvulescu
> Priority: Minor
>
> OAK-4631 introduced a change in record handling in a segment that
> amplifies the number of calls to {{SegmentStore#newSegmentId}} by the number
> of external references [0]. It is usually the case that many record
> references point to the same segment ID, and the existing
> {{recordIdCache}} does not help much in this case.
> The scenario I'm seeing for offline compaction (might be a bit biased) is a
> full traversal of segments that increases pressure on the {{SegmentIdTable}}
> by calling {{newSegmentId}} for many segments that already exist.
> I'm creating this issue as an 'Improvement' as I think it is interesting to
> look into reducing this pressure. This might be by squeezing more out of the
> {{SegmentIdTable}} bits (I'd like to follow up on this with a benchmark) or
> by revisiting the code paths from the {{Segment}} class.
> [0]
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Segment.java#L405