Francesco Mari commented on OAK-4812:

I think the per-segment record ID cache is of little use here. It would be 
better as a segment ID cache, since segment ID lookups seem to be the bulk of 
the problem. Creating record IDs is cheap, but creating segment IDs is not.
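
A minimal sketch of that idea, not the actual Oak code: cache the resolved 
segment ID per external reference index, so that many record references into 
the same segment trigger only one store lookup. {{CountingStore}}, 
{{SegmentRefCache}} and {{segmentIdAt}} are hypothetical names made up for this 
illustration, a plain {{UUID}} stands in for the real segment ID type, and 
thread safety is ignored.

```java
import java.util.UUID;

/** Hypothetical stand-in for the store; it only counts how often it is asked. */
final class CountingStore {
    int calls = 0;

    /** Illustrative only: models a comparatively expensive lookup by (msb, lsb). */
    UUID newSegmentId(long msb, long lsb) {
        calls++;
        return new UUID(msb, lsb);
    }
}

/** Hypothetical per-segment cache with one slot per external segment reference. */
final class SegmentRefCache {
    private final CountingStore store;
    private final long[] msbs;
    private final long[] lsbs;
    private final UUID[] resolved;

    SegmentRefCache(CountingStore store, long[] msbs, long[] lsbs) {
        this.store = store;
        this.msbs = msbs;
        this.lsbs = lsbs;
        this.resolved = new UUID[msbs.length];
    }

    /** Resolves the reference at refIndex, asking the store at most once per index. */
    UUID segmentIdAt(int refIndex) {
        UUID id = resolved[refIndex];
        if (id == null) {
            id = store.newSegmentId(msbs[refIndex], lsbs[refIndex]);
            resolved[refIndex] = id;
        }
        return id;
    }
}

final class SegmentIdCacheSketch {
    public static void main(String[] args) {
        CountingStore store = new CountingStore();
        SegmentRefCache cache = new SegmentRefCache(
                store, new long[] {1L, 2L}, new long[] {10L, 20L});
        // Six record references that all point into just two segments.
        int[] refs = {0, 0, 1, 0, 1, 1};
        for (int ref : refs) {
            cache.segmentIdAt(ref);
        }
        System.out.println("store lookups: " + store.calls); // prints "store lookups: 2"
    }
}
```

With a cache keyed by record instead, each distinct record would still cost one 
store lookup, even when all of them live in the same referenced segment.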

> Reduce calls to SegmentStore#newSegmentId from the Segment class
> ----------------------------------------------------------------
>                 Key: OAK-4812
>                 URL: https://issues.apache.org/jira/browse/OAK-4812
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Alex Parvulescu
>            Priority: Minor
> OAK-4631 introduced a change in record handling in a segment that will 
> amplify the number of calls to {{SegmentStore#newSegmentId}} by the number of 
> external references [0]. It is usually the case that many record references 
> point to the same segment id, and the existing {{recordIdCache}} would not 
> help much in this case.
> The scenario I'm seeing for offline compaction (might be a bit biased) is a 
> full traversal of segments that increases pressure on the {{SegmentIdTable}} 
> by calling {{newSegmentId}} with a lot of already existing segments.
> I'm creating this issue as an 'Improvement' as I think it is interesting to 
> look into reducing this pressure. This might be done by squeezing more out of 
> the {{SegmentIdTable}} bits (I'd like to follow up on this with a benchmark) 
> or by revisiting the code paths from the {{Segment}} class.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/Segment.java#L405

This message was sent by Atlassian JIRA
