[ https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780439#comment-16780439 ]
Andrei Dulceanu commented on OAK-8063:
--------------------------------------

In my previous patch I misinterpreted the direction of the edges between segments and their references. As a result, using a stack to store the final transfer order achieved the opposite of what was intended: a segment was transferred before all of its references. I came back to this and replaced the stack with a linked list to fix this behaviour. I also added an explanation and the same logging we had in the original code.

[~frm], could you please take a look at the second version of the patch?

> The cold standby client doesn't correctly handle backward references
> --------------------------------------------------------------------
>
>                 Key: OAK-8063
>                 URL: https://issues.apache.org/jira/browse/OAK-8063
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar, tarmk-standby
>    Affects Versions: 1.6.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Major
>              Labels: cold-standby
>             Fix For: 1.12, 1.8.12, 1.10.2
>
>         Attachments: OAK-8063-02.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}}
> has a flaw when it comes to "backward references". Suppose we have the
> following data segment graph to be transferred from primary: S1, which
> references \{S2, S3}, and S3, which references S2. Then the correct transfer
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, S3}
> {noformat}
> Therefore, at the end of the loop, the order of the segments to be
> transferred will be S3, S2, S1, which might trigger a
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is
> missing on standby (see OAK-8006).
> /cc [~frm]
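
To illustrate the ordering requirement from the description (every segment transferred only after all the segments it references), here is a minimal sketch. It is not the code from either attached patch. The class name TransferOrderSketch, the method transferOrder and the use of plain strings as segment ids are assumptions made for the example. It uses a depth-first post-order traversal and assumes the reference graph is acyclic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not part of Oak and not the OAK-8063 patch.
public class TransferOrderSketch {

    /**
     * Returns the segments reachable from root, ordered so that each segment
     * appears after all the segments it references (depth-first post-order).
     * Assumes the reference graph contains no cycles.
     */
    public static List<String> transferOrder(String root, Map<String, List<String>> references) {
        List<String> order = new ArrayList<>();
        visit(root, references, new HashSet<>(), order);
        return order;
    }

    private static void visit(String segment, Map<String, List<String>> references,
                              Set<String> visited, List<String> order) {
        if (!visited.add(segment)) {
            return; // already scheduled for transfer
        }
        // Schedule every referenced segment before the segment itself.
        for (String ref : references.getOrDefault(segment, Collections.emptyList())) {
            visit(ref, references, visited, order);
        }
        order.add(segment);
    }

    public static void main(String[] args) {
        // Graph from the issue description: S1 -> {S2, S3}, S3 -> {S2}.
        Map<String, List<String>> references = new HashMap<>();
        references.put("S1", Arrays.asList("S2", "S3"));
        references.put("S3", Arrays.asList("S2"));
        System.out.println(transferOrder("S1", references)); // prints [S2, S3, S1]
    }
}
```

For the graph from the description this yields S2, S3, S1, the order the issue names as correct. The breadth-first trace above produces S3, S2, S1 instead.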