[ 
https://issues.apache.org/jira/browse/OAK-7162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16344887#comment-16344887
 ] 

Michael Dürig commented on OAK-7162:
------------------------------------

Re. the patch:
 * Shouldn't the initial call to {{refreshHead()}} go into the back-off loop? 
If we end up in a further iteration of that loop, this means that someone 
changed the head and we should dispatch the changes.
 * I would be more explicit with the log messages, i.e. each retry message 
should mention how long it will wait before the next retry and how many times 
the commit has been tried already. The final message should probably say how 
many times the commit was tried and how long this took overall.
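
To illustrate both points, here is a minimal sketch of what I have in mind. It is not the code from the patch and not Oak's actual API: the class {{RetryingCommitSketch}}, the {{String}} head, the {{log}} list and the {{commit}} signature are all assumptions made up for illustration. Only the idea is taken from the discussion: {{refreshHead()}} is called inside the loop, and the messages carry the attempt count, the delay and the overall time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

/** Hypothetical stand-in for the scheduler's commit path; not Oak's actual API. */
class RetryingCommitSketch {
    // Stand-in for TarRevisions#head; a String replaces the real head record.
    final AtomicReference<String> head = new AtomicReference<>("r0");
    // Collected log lines; a real implementation would use a logger instead.
    final List<String> log = new ArrayList<>();

    /** Hypothetical: re-read the head so changes made by others become visible. */
    String refreshHead() {
        return head.get();
    }

    /**
     * Re-applies the change against a freshly read head on every iteration and
     * retries with exponential back-off until the head is set or maxAttempts
     * is reached.
     */
    boolean commit(UnaryOperator<String> change, int maxAttempts, long initialDelayMs) {
        long start = System.nanoTime();
        long delayMs = initialDelayMs;
        int attempt = 0;
        while (attempt < maxAttempts) {
            attempt++;
            // Refresh inside the loop: a further iteration means someone else
            // (e.g. compaction) changed the head in the meantime.
            String before = refreshHead();
            String after = change.apply(before);
            if (head.compareAndSet(before, after)) {
                log.add("Commit succeeded after " + attempt + " attempt(s) in "
                        + elapsedMs(start) + " ms");
                return true;
            }
            if (attempt == maxAttempts) {
                break;
            }
            log.add("Commit attempt " + attempt + " of " + maxAttempts
                    + " failed because the head changed, retrying in " + delayMs + " ms");
            try {
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt and give up
                break;
            }
            delayMs *= 2; // exponential back-off
        }
        log.add("Commit failed after " + attempt + " attempt(s) in "
                + elapsedMs(start) + " ms");
        return false;
    }

    private static long elapsedMs(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000;
    }
}
```

The compare-and-set on an {{AtomicReference}} only stands in for setting the head in {{TarRevisions}}; the real code would also dispatch the changes observed after each refresh, which the sketch leaves out.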

Re. coming up with a test case, let me know if I can be of any help here.

> Race condition on revisions head between compaction and scheduler could 
> result in skipped commit
> ------------------------------------------------------------------------------------------------
>
>                 Key: OAK-7162
>                 URL: https://issues.apache.org/jira/browse/OAK-7162
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-tar
>    Affects Versions: 1.8.0
>            Reporter: Andrei Dulceanu
>            Assignee: Andrei Dulceanu
>            Priority: Blocker
>              Labels: scalability
>             Fix For: 1.9.0, 1.10, 1.8.2
>
>         Attachments: OAK-7162.patch
>
>
> There is a race condition on {{TarRevisions#head}} between a running 
> compaction trying to set the new head [0] and the scheduler doing the same 
> after executing a specific commit [1]. If the compaction thread is first, 
> then the head assignment in the scheduler will fail and not be re-attempted. 
> IMO, the simple if statement should be changed to a while loop in which the 
> head is refreshed and the commit is re-applied against the new head, before 
> attempting again to set a new head in {{TarRevisions}}. This is somewhat 
> similar to what we previously had [2], but without the unneeded 
> optimistic/pessimistic strategies involving tokens.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/file/FileStore.java#L764
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/scheduler/LockBasedScheduler.java#L253
> [2] 
> https://github.com/apache/jackrabbit-oak/blob/1.6/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentNodeStore.java#L686



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
