[
https://issues.apache.org/jira/browse/OAK-6912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16242217#comment-16242217
]
Francesco Mari commented on OAK-6912:
-------------------------------------
The only functionality on the primary that interacts with {{SegmentCache}} lives
in {{DefaultStandbyReferencesReader}} and {{DefaultStandbySegmentReader}}. Both
classes build a {{SegmentId}} and invoke {{FileStore#readSegment}}, which in
turn calls {{SegmentCache#getSegment}}.

My wild assumption, not backed by any evidence, is the following. If many new
{{SegmentId}} instances are created for the same MSB/LSB pair,
{{SegmentCache#getSegment}} is called more than once, because
{{SegmentId#segment}} is {{null}} in each instance. But
{{SegmentCache#getSegment}} invokes the provided {{Callable<Segment>}} without
consulting the cache first, causing unnecessary I/O. Moreover, unconditionally
putting the same {{SegmentId}}/{{Segment}} pair into the cache again and again
increases the chance of evicting segments.

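
To make the assumption above concrete, here is a minimal sketch. The classes {{Id}} and {{Cache}} and the method {{countLoads}} are hypothetical stand-ins written for illustration only; they are not the Oak implementation and only mimic the behaviour I am speculating about. The sketch counts how often the loader runs when several fresh id instances for the same MSB/LSB pair are read, once with an unconditional load and once with a cache lookup first.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;

class SegmentCacheSketch {

    // Hypothetical stand-in for SegmentId: equal by MSB/LSB, but the memoised
    // segment lives in the instance, so every fresh instance starts at null.
    static final class Id {
        final long msb;
        final long lsb;
        Object segment;

        Id(long msb, long lsb) {
            this.msb = msb;
            this.lsb = lsb;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Id && ((Id) o).msb == msb && ((Id) o).lsb == lsb;
        }

        @Override
        public int hashCode() {
            return 31 * Long.hashCode(msb) + Long.hashCode(lsb);
        }
    }

    // Hypothetical stand-in for the cache. With checkFirst == false the loader
    // runs on every call, which is the behaviour assumed in the comment above.
    static final class Cache {
        private final Map<Id, Object> entries = new HashMap<>();
        private final boolean checkFirst;
        int loads; // number of times the loader was invoked (simulated I/O)

        Cache(boolean checkFirst) {
            this.checkFirst = checkFirst;
        }

        Object getSegment(Id id, Callable<Object> loader) {
            Object segment = checkFirst ? entries.get(id) : null;
            if (segment == null) {
                try {
                    segment = loader.call();
                } catch (Exception e) {
                    throw new IllegalStateException(e);
                }
                loads++;
            }
            entries.put(id, segment); // re-put of an equal key on every call
            id.segment = segment;     // memoise in this id instance only
            return segment;
        }
    }

    // Reads the same MSB/LSB pair through `freshIds` newly created id instances
    // and returns how many times the loader ran.
    static int countLoads(boolean checkFirst, int freshIds) {
        Cache cache = new Cache(checkFirst);
        for (int i = 0; i < freshIds; i++) {
            Id id = new Id(1L, 2L);   // new instance, so id.segment is null
            if (id.segment == null) {
                cache.getSegment(id, Object::new);
            }
        }
        return cache.loads;
    }

    public static void main(String[] args) {
        System.out.println("unconditional loads: " + countLoads(false, 5));
        System.out.println("check-first loads:   " + countLoads(true, 5));
    }
}
```

In this toy model the unconditional variant loads once per fresh id instance, while the check-first variant loads once per MSB/LSB pair. Whether the primary really creates such duplicate instances is exactly the open question.
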
While this assumption might explain the slowdown, I couldn't find a place in
the code of the primary where {{SegmentId}} instances are created without first
passing through the {{SegmentTracker}}. If I were to hold on to this hypothesis,
I would have to assume that the {{SegmentTracker}} is creating more instances
of {{SegmentId}} than necessary, but this is even more speculative than my
previous point.

> Cold standby performance regression due to segment caching
> ----------------------------------------------------------
>
> Key: OAK-6912
> URL: https://issues.apache.org/jira/browse/OAK-6912
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segment-tar, tarmk-standby
> Affects Versions: 1.7.0
> Reporter: Andrei Dulceanu
> Assignee: Andrei Dulceanu
> Labels: cold-standby, performance, scalability
> Fix For: 1.8, 1.7.12
>
>
> The changes to the segment cache made in r1793527 [0] introduced a
> performance regression on the primary when a standby is attached to it.
> Below is a benchmark duration comparison between a primary without and with
> a standby, for r1793527 (after the segment cache changes) and r1793526
> (before the changes):
> |Oak 1.6 r1793527 (20170502)|{noformat}
> # BasicWriteTest                    C     min     10%     50%     90%     max       N
> Oak-Segment-Tar                     1      19      21      22      26     160    2491
> Oak-Segment-Tar-DS                  1      56      59      63      70     181     919
> Oak-Segment-Tar-Cold(Shared DS)     1      58      66     159     177     372     302
> {noformat}|
> |Oak 1.6 r1793526 (20170502)|{noformat}
> # BasicWriteTest                    C     min     10%     50%     90%     max       N
> Oak-Segment-Tar                     1      19      21      22      25      52    2584
> Oak-Segment-Tar-DS                  1      56      60      63      69     158     925
> Oak-Segment-Tar-Cold(Shared DS)     1      57      60      64      70     122     915
> {noformat}|
> [0] https://github.com/apache/jackrabbit-oak/commit/efafa4e1710621b7f3b8e92d0b2681669185fcd4
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)