[
https://issues.apache.org/jira/browse/OAK-638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582223#comment-13582223
]
Marcel Reutegger commented on OAK-638:
--------------------------------------
bq. My take on this is still: branch, rebase, fast forward merge. If the last
operation fails, try again. This does not lead to contention since all conflict
handling takes place on private branches thus not blocking other operations.
But this approach has a liveness issue. It suffers from starvation. While the
retry is in progress other changes may occur again, forcing yet another retry.
I quickly tried it with the new SegmentMK and two oak-run benchmarks with
SetProperty. I had to slightly modify MongoStore and SetProperty. MongoStore
currently always initializes with an empty node, which basically cleans the
repository on startup. I also adapted SetProperty to work on a distinct test
node on each test execution. This allows multiple SetProperty benchmarks to run
concurrently against the same MongoDB.
The second benchmark instance always failed to start on my machine because 10
retries were not enough to run the repository initializer on startup
concurrently with the first benchmark, which was already running. With some help
(pausing the first benchmark) I managed to start the second benchmark, but as
soon as both were running, one of them would fail quite quickly again with
CommitFailedException because of too many retries. Even increasing the number
of retries to 100 didn't help. Keep in mind the changes happen on different
nodes and are not conflicting.
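The starvation can be illustrated with a minimal, deterministic sketch. It is
not Oak code: Trunk, commitWithRetry and the concurrentWriter hook are
hypothetical names, and the trunk head is modelled as a plain counter. The
fast-forward merge is assumed to succeed only if the head has not moved since
the rebase.

```java
import java.util.concurrent.atomic.AtomicLong;

final class RetryStarvationSketch {

    /** Hypothetical stand-in for the trunk; the head revision is a counter. */
    static final class Trunk {
        final AtomicLong head = new AtomicLong(0);
    }

    /**
     * "Branch, rebase, fast-forward merge" with a bounded number of retries.
     * Returns the number of attempts used, or -1 if all retries were used up
     * (the case reported as CommitFailedException above).
     * concurrentWriter simulates what other instances do while we rebase.
     */
    static int commitWithRetry(Trunk trunk, int maxRetries, Runnable concurrentWriter) {
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            long base = trunk.head.get();   // branch / rebase onto current head
            concurrentWriter.run();         // other commits may land meanwhile
            // Fast-forward merge: only possible if the head is still 'base'.
            if (trunk.head.compareAndSet(base, base + 1)) {
                return attempt;
            }
        }
        return -1;
    }
}
```

With a writer that never touches the trunk, the first attempt succeeds. With a
writer that advances the head on every round, the commit never gets through,
no matter how high maxRetries is set and although nothing conflicts.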
Introducing a scheduler would help, but this probably means introducing direct
communication between the Oak instances (e.g. passing a token). I'm not sure
this is desirable.
bq. This is why we came up with the commit hooks, which separate the concerns
of conflict detection (what you call structural consistency) and conflict
resolution (what you call semantic consistency).
This is not what I mean by semantic consistency. It's not just about conflict
resolution, but also validation of application-imposed constraints, which may be
violated even when there are no structural conflicts (like a uniqueness
constraint).
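A small sketch of the uniqueness case, under simplifying assumptions: the tree
is modelled as a flat map from path to property value, and structuralConflict,
unique and apply are hypothetical helpers, not Oak APIs. Two commits touch
different paths, each passes validation against its own base, yet the merged
tree violates the constraint.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

final class UniquenessSketch {

    /** Structural conflict (simplified): both commits write the same path. */
    static boolean structuralConflict(Map<String, String> a, Map<String, String> b) {
        for (String path : a.keySet()) {
            if (b.containsKey(path)) {
                return true;
            }
        }
        return false;
    }

    /** Application constraint: no two paths may carry the same value. */
    static boolean unique(Map<String, String> tree) {
        Set<String> seen = new HashSet<>();
        for (String value : tree.values()) {
            if (!seen.add(value)) {
                return false;
            }
        }
        return true;
    }

    /** Returns a copy of base with the change applied on top. */
    static Map<String, String> apply(Map<String, String> base, Map<String, String> change) {
        Map<String, String> result = new HashMap<>(base);
        result.putAll(change);
        return result;
    }
}
```

For example, a commit writing /users/a/email and another writing
/users/b/email with the same address do not conflict structurally, and each is
valid on its own, but unique() fails once both are applied.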
bq. which often is the nature of optimisation: specialisation.
I should probably change the summary of this issue. It sounds like an
optimization, but I'm actually after scalable writes as outlined in the
description.
To give some more context: this issue originated from a discussion with Thomas.
One of the ideas of the new MongoDB prototype he is working on is to leverage
MongoDB for (structural) conflict detection. This means updating in a single
commit is preferable because that's where MongoDB will indicate when there is a
conflict.
Another assumption we (Thomas and I) made and discussed was that concurrent
commits in distinct areas of the tree do not conflict and can proceed in
parallel. This is in contrast to the strict conflict handling discussed earlier.
It does come with compromises on semantic consistency (the commit hook /
validation stuff), because it does not guarantee that the hook runs on the
revision of the tree that will become the new head.
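The idea of per-node conflict detection can be sketched as follows. This is an
illustration only and not the prototype: a ConcurrentHashMap stands in for the
store, its conditional replace() stands in for whatever conditional update the
backend offers, and PerNodeConflictSketch, read and commit are made-up names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class PerNodeConflictSketch {

    /** Hypothetical store: path -> revision counter of that node. */
    final ConcurrentMap<String, Long> nodeRevisions = new ConcurrentHashMap<>();

    /** Revision the writer bases its change on (0 means "does not exist"). */
    long read(String path) {
        return nodeRevisions.getOrDefault(path, 0L);
    }

    /**
     * Single-step commit directly to the node. It succeeds only if the node
     * still has the revision the writer saw, so a conflict shows up right
     * here, at update time, and only for writers of the same node.
     */
    boolean commit(String path, long expectedRevision) {
        if (expectedRevision == 0L) {
            return nodeRevisions.putIfAbsent(path, 1L) == null;
        }
        return nodeRevisions.replace(path, expectedRevision, expectedRevision + 1);
    }
}
```

Two writers on /a and /b both succeed without retrying, even though the store
changed in between. Two writers that read the same revision of /a collide: the
first commit wins and the second returns false. Note that nothing here runs a
validation against the combined result, which is the compromise on semantic
consistency mentioned above.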
> Avoid branch/merge for small commits
> ------------------------------------
>
> Key: OAK-638
> URL: https://issues.apache.org/jira/browse/OAK-638
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core
> Reporter: Marcel Reutegger
> Priority: Minor
> Attachments: OAK-638.patch
>
>
> The branch/merge features on the MicroKernel were initially introduced to
> stage changes of large commits. Currently oak-core creates a branch even for
> small changes like updating a property. I think this introduces quite some
> overhead for scenarios with highly concurrent updates. E.g. think of a
> Twitter-like application or a forum with comments. Well, basically user
> generated content. These updates tend to be rather small (couple of nodes) but
> frequent and concurrent.
> Right now oak-core always does:
> - MK.branch()
> - MK.commit() to branch
> - MK.merge()
> For small commits, it ideally should do:
> - MK.commit() to trunk