[ 
https://issues.apache.org/jira/browse/OAK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16351715#comment-16351715
 ] 

Andrei Dulceanu commented on OAK-7162:
--------------------------------------

[~mduerig], could you take a look at [^OAK-7162-test.patch]? I simulated the 
situation detailed in the issue description by running a parallel task which 
sweeps out everything from the root state, replacing the current revisions 
head with an initial head that has no content. To make this fully testable (at 
least for the kind of test I came up with), the code in 
{{LockBasedScheduler#execute}} needs to be changed to omit the call to 
{{#refreshHead}}, because at that point the other thread might jump in and 
erase the modifications just committed by replacing the correct head with the 
empty one. 

I see no impact of this change on production code, since this kind of 
interaction can happen only with OnRC. The effect is that the next time a 
commit is scheduled, an additional rebase needs to be done.

As expected, the test case fails consistently on the 1.8 branch when the fix 
from OAK-7162 is not applied.

/cc [~frm]

> Race condition on revisions head between compaction and scheduler could 
> result in skipped commit
> ------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7162
>                 URL: https://issues.apache.org/jira/browse/OAK-7162
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: 1.8.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Blocker
>              Labels: scalability
>             Fix For: 1.9.0, 1.10, 1.8.2
>
>         Attachments: OAK-7162-02.patch, OAK-7162-03.patch, 
> OAK-7162-test.patch, OAK-7162.patch
>
>
> There is a race condition on {{TarRevisions#head}} between a running 
> compaction trying to set the new head [0] and the scheduler doing the same 
> after executing a specific commit [1]. If the compaction thread is first, 
> then the head assignment in the scheduler will fail and not be re-attempted. 
> IMO, the simple if statement should be changed to a while loop in which the 
> head is refreshed and the commit is re-applied against the new head, before 
> attempting again to set a new head in {{TarRevisions}}. This is somewhat 
> similar to what we previously had [2], but without the unneeded 
> optimistic/pessimistic strategies involving tokens.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L764
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/scheduler/LockBasedScheduler.java#L253
> [2] 
> https://github.com/apache/jackrabbit-oak/blob/1.6/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentNodeStore.java#L686
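
The difference between the single {{if}} and the proposed {{while}} loop in the description above can be illustrated with a minimal, self-contained sketch. This is not the Oak code: the class {{HeadRetrySketch}}, the methods {{commitOnce}} and {{commitWithRetry}}, and the use of a plain {{AtomicReference<String>}} as a stand-in for {{TarRevisions#head}} are all assumptions made for illustration only.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

public class HeadRetrySketch {
    // Hypothetical stand-in for TarRevisions#head; the real type is not modelled here.
    static final AtomicReference<String> head = new AtomicReference<>("base");

    // Single attempt (the "simple if"): if another thread moved the head
    // since 'expected' was read, the update fails and the commit is dropped.
    static boolean commitOnce(String expected, UnaryOperator<String> commit) {
        return head.compareAndSet(expected, commit.apply(expected));
    }

    // Retry loop (the proposed "while"): refresh the head, re-apply the
    // commit against it, and try to set the new head until it succeeds.
    static String commitWithRetry(UnaryOperator<String> commit) {
        while (true) {
            String current = head.get();          // refresh head
            String next = commit.apply(current);  // re-apply commit on the fresh head
            if (head.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        String stale = head.get();
        head.set("compacted"); // simulate compaction winning the race

        boolean applied = commitOnce(stale, h -> h + "+commit");
        System.out.println(applied + " " + head.get());

        System.out.println(commitWithRetry(h -> h + "+commit"));
    }
}
```

In this sketch the single attempt returns {{false}} and leaves the head untouched once the simulated compaction has replaced it, whereas the retry loop rebases the commit on whatever head is current and only returns after the head has actually been set.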



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
