[ https://issues.apache.org/jira/browse/OAK-560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564351#comment-13564351 ]
Marcel Reutegger commented on OAK-560:
--------------------------------------
Here's an alternative approach I'm considering: make sure we can detect
concurrent or rather conflicting commits as early as possible. The best place
to do this is in MongoDB. Ideally we already know that there is a conflict
before we save the commit. This is possible if we store the revisions of a node
in a single MongoDB document. So, instead of creating a new document for each
revision of a node, we'd update the existing document (or create a new document
if the node never existed before). MongoDB allows you to conditionally update a
document, which we could do to check the expected node base revision for the
update. We'd then know there was no conflict if we were able to insert/update
the nodes successfully. The associated commit document then finalizes the
commit and makes the update visible to others. I haven't yet gone through all
the edge cases that can occur with this model, but wanted to share the idea
early.
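
To make the conditional update more concrete, here is a minimal sketch of the
check on the expected base revision. It is not MongoMK code: NodeDocumentStore,
NodeDocument and conditionalUpdate are hypothetical names, and a
ConcurrentHashMap stands in for the MongoDB collection so the example runs
without a database. It also leaves out the commit document that would make the
change visible to others.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a collection with one document per node.
final class NodeDocumentStore {

    /** Immutable snapshot of a node document: its latest revision and data. */
    static final class NodeDocument {
        final long revision;
        final String data;

        NodeDocument(long revision, String data) {
            this.revision = revision;
            this.data = data;
        }
    }

    private final Map<String, NodeDocument> documents = new ConcurrentHashMap<>();

    /**
     * Applies the update only if the stored revision equals expectedBaseRevision.
     * Pass null as expectedBaseRevision for a node that must not exist yet.
     * Returns true if the update was applied, false if a conflict was detected.
     */
    boolean conditionalUpdate(String path, Long expectedBaseRevision,
                              long newRevision, String data) {
        NodeDocument updated = new NodeDocument(newRevision, data);
        if (expectedBaseRevision == null) {
            // Insert case: succeeds only if no document exists for this path.
            return documents.putIfAbsent(path, updated) == null;
        }
        boolean[] applied = {false};
        // computeIfPresent runs atomically per key on a ConcurrentHashMap.
        documents.computeIfPresent(path, (p, current) -> {
            if (current.revision == expectedBaseRevision) {
                applied[0] = true;
                return updated;
            }
            return current; // someone else changed the node: keep it, report conflict
        });
        return applied[0];
    }

    /** Returns the stored revision for the path, or null if the node is unknown. */
    Long revisionOf(String path) {
        NodeDocument doc = documents.get(path);
        return doc == null ? null : doc.revision;
    }

    public static void main(String[] args) {
        NodeDocumentStore store = new NodeDocumentStore();
        System.out.println(store.conditionalUpdate("/a", null, 1L, "v1"));   // create
        System.out.println(store.conditionalUpdate("/a", 1L, 2L, "v2"));     // base matches
        System.out.println(store.conditionalUpdate("/a", 1L, 3L, "stale"));  // stale base
    }
}
```

In this sketch a writer that still assumes base revision 1 after another writer
moved the node to revision 2 gets false back and can abort before any commit
document is written.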
> MongoMK.commit() not atomic
> ---------------------------
>
> Key: OAK-560
> URL: https://issues.apache.org/jira/browse/OAK-560
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: mongomk
> Affects Versions: 0.5
> Reporter: Marcel Reutegger
> Priority: Critical
>
> I created a test (rev 1433995) while reasoning about the optimization we are
> working on in OAK-535. It seems to uncover a race condition in the commit
> command. I had wondered in the past whether something like this could
> happen, and now it looks like there is indeed a problem.
> Without having further analyzed the sporadic test failure I see, I think it
> is caused by the optimistic commit protocol implemented in CommitCommandNew.
> The internal retry loop saves the nodes, then saves the commit, and finally
> saves the head revision. IIUC this exposes a commit even though it is not yet
> valid, and the command may flag it as failed later when it cannot save the
> head revision. This is indeed the case: when I set a breakpoint for the
> failed commit and look inside the commits collection, I can see that the
> commit in question is flagged as failed.
> The exception then looks like this:
> org.apache.jackrabbit.mk.api.MicroKernelException: java.lang.Exception: Commit with revision 134 could not be found
>     at org.apache.jackrabbit.mongomk.impl.MongoMicroKernel.merge(MongoMicroKernel.java:214)
>     at org.apache.jackrabbit.mongomk.impl.MongoMKBranchMergeTest$1.run(MongoMKBranchMergeTest.java:362)
>     at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.Exception: Commit with revision 134 could not be found
>     at org.apache.jackrabbit.mongomk.impl.action.FetchCommitAction.execute(FetchCommitAction.java:65)
>     at org.apache.jackrabbit.mongomk.impl.command.CommitCommandNew.readBranchIdFromBaseCommit(CommitCommandNew.java:145)
>     at org.apache.jackrabbit.mongomk.impl.command.CommitCommandNew.execute(CommitCommandNew.java:96)
>     at org.apache.jackrabbit.mongomk.impl.command.CommitCommandNew.execute(CommitCommandNew.java:56)
>     at org.apache.jackrabbit.mongomk.impl.command.MergeCommand.execute(MergeCommand.java:118)
>     at org.apache.jackrabbit.mongomk.impl.command.MergeCommand.execute(MergeCommand.java:51)
>     at org.apache.jackrabbit.mongomk.impl.command.DefaultCommandExecutor.execute(DefaultCommandExecutor.java:38)
>     at org.apache.jackrabbit.mongomk.impl.MongoNodeStore.merge(MongoNodeStore.java:146)
>     at org.apache.jackrabbit.mongomk.impl.MongoMicroKernel.merge(MongoMicroKernel.java:212)
>     ... 2 more
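
The interleaving suspected in the issue description can be replayed in a few
lines. This is only a single-threaded sketch under assumptions: the class
OptimisticCommitRace and its methods are hypothetical, a HashMap stands in for
the commits collection, and it assumes that a second writer picks the newest
non-failed commit as its base and that a commit flagged as failed is reported
as not found. Neither assumption has been checked against the MongoMK code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical replay of: save commit, another writer reads it, commit is flagged failed.
final class OptimisticCommitRace {

    // revision -> failed flag; stand-in for the commits collection
    private final Map<Long, Boolean> commits = new HashMap<>();

    /** Saves a commit; it is visible to readers immediately. */
    void saveCommit(long revision) {
        commits.put(revision, Boolean.FALSE);
    }

    /** Flags a previously saved commit as failed. */
    void markFailed(long revision) {
        commits.put(revision, Boolean.TRUE);
    }

    /** Newest commit not flagged as failed (assumed base selection), or -1 if none. */
    long newestVisibleRevision() {
        long newest = -1L;
        for (Map.Entry<Long, Boolean> entry : commits.entrySet()) {
            if (!entry.getValue() && entry.getKey() > newest) {
                newest = entry.getKey();
            }
        }
        return newest;
    }

    /** Assumed lookup behaviour: unknown or failed commits count as not found. */
    boolean fetchCommit(long revision) {
        Boolean failed = commits.get(revision);
        return failed != null && !failed;
    }

    /** Replays the interleaving; returns whether writer B can still fetch its base. */
    static boolean replay() {
        OptimisticCommitRace store = new OptimisticCommitRace();
        store.saveCommit(133L);                     // an earlier, valid commit
        store.saveCommit(134L);                     // writer A: commit saved, head not yet saved
        long base = store.newestVisibleRevision();  // writer B: picks 134 as its base
        store.markFailed(134L);                     // writer A: head save failed, commit flagged
        return store.fetchCommit(base);             // writer B: base can no longer be found
    }

    public static void main(String[] args) {
        System.out.println("base commit found: " + replay());
    }
}
```

Under these assumptions writer B ends up with a base revision that no longer
resolves, which would match the "Commit with revision 134 could not be found"
message in the trace above.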