[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-06 Thread Andrei Dulceanu (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16785596#comment-16785596
 ] 

Andrei Dulceanu commented on OAK-8063:
--

Backported to 1.10 at r1854920.

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-05 Thread Andrei Dulceanu (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784424#comment-16784424
 ] 

Andrei Dulceanu commented on OAK-8063:
--

Backported to 1.8 at r1854850.

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-04 Thread Andrei Dulceanu (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783423#comment-16783423
 ] 

Andrei Dulceanu commented on OAK-8063:
--

[~frm], thanks for reviewing!

I fixed this in trunk using [^OAK-8063-03.patch]  at r1854772.

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-03-04 Thread Francesco Mari (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16783356#comment-16783356
 ] 

Francesco Mari commented on OAK-8063:
-

[~dulceanu], I like your patch very much since it greatly simplifies the 
transfer logic. I propose a third version of the patch with a minor change just 
to avoid repeating the same check in two different places and to consistently 
log the root segment ID where the traversal starts.

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.11.0, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063-03.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-02-28 Thread Andrei Dulceanu (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16780439#comment-16780439
 ] 

Andrei Dulceanu commented on OAK-8063:
--

In my previous patch I somehow misinterpreted the edge direction between 
segments and their references. Therefore, by using a stack for storing the 
final transfer order, I actually achieved the opposite: a segment transferred 
before all its references. I came back to this and replaced the stack with a 
linked list to fix this behaviour. Moreover, I added an explanation and the 
same logging we had in the original code.

[~frm], could you please take a look at the second version of the patch?

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.8.12, 1.10.2
>
> Attachments: OAK-8063-02.patch, OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-02-22 Thread JIRA


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775268#comment-16775268
 ] 

Michael Dürig commented on OAK-8063:


{quote}BTW, could really be the case that the segment graph to be transferred 
from primary contains a cycle?
{quote}
Unfortunately yes. Fortunately those loops should be rather small.

Loop can happen when writing in parallel through multiple 
{{SegmentBufferWriter}} instances.

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.10.1, 1.8.12
>
> Attachments: OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8063) The cold standby client doesn't correctly handle backward references

2019-02-22 Thread Andrei Dulceanu (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775105#comment-16775105
 ] 

Andrei Dulceanu commented on OAK-8063:
--

[~frm], in [^OAK-8063.patch] I propose a new, simplified implementation for 
{{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}}. It follows the 
classical topological order algorithm with a twist: by eagerly marking a node 
visited, it breaks possible loops if the graph contains a cycle. BTW, could 
really be the case that the segment graph to be transferred from primary 
contains a cycle? My understanding is that segments only refer previous written 
segments, without any forward references, thus avoiding cycles.

I also kept the optimisation about segments already transferred (i.e. local), 
but removed the one for the "diamond problem" which could not be possible in 
the current implementation. 

 Could you please take a look at the patch?

> The cold standby client doesn't correctly handle backward references
> 
>
> Key: OAK-8063
> URL: https://issues.apache.org/jira/browse/OAK-8063
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, tarmk-standby
>Affects Versions: 1.6.0
>Reporter: Andrei Dulceanu
>Assignee: Andrei Dulceanu
>Priority: Major
>  Labels: cold-standby
> Fix For: 1.12, 1.10.1, 1.8.12
>
> Attachments: OAK-8063.patch
>
>
> The logic from {{StandbyClientSyncExecution#copySegmentHierarchyFromPrimary}} 
> has a flaw when it comes to "backward references". Suppose we have the 
> following data segment graph to be transferred from primary: S1, which 
> references \{S2, S3} and S3 which references S2. Then, the correct transfer 
> order should be S2, S3 and S1.
> Going through the current logic employed by the method, here's what happens:
> {noformat}
> Step 0: batch={S1}
> Step 1: visited={S1}, data={S1}, batch={S2, S3}, queued={S2, S3}
> Step 2: visited={S1, S2}, data={S2, S1}, batch={S3}, queued={S2, S3}
> Step 3: visited={S1, S2, S3}, data={S3, S2, S1}, batch={}, queued={S2, 
> S3}.{noformat}
> Therefore, at the end of the loop, the order of the segments to be 
> transferred will be S3, S2, S1, which might trigger a 
> {{SegmentNotFoundException}} when S3 is further processed, because S2 is 
> missing on standby (see OAK-8006).
> /cc [~frm]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)