Michael Dürig created OAK-5464:
----------------------------------

             Summary: Improve the transaction rate of the TarMK
                 Key: OAK-5464
                 URL: https://issues.apache.org/jira/browse/OAK-5464
             Project: Jackrabbit Oak
          Issue Type: Epic
          Components: segment-tar
            Reporter: Michael Dürig
             Fix For: 1.8


The TarMK's write throughput is limited by the way concurrent commits are 
processed: rebasing and running the commit hooks happen within a lock without 
any explicit scheduling. This epic covers improving the overall transaction 
rate. The proposed approach is roughly to first make the scheduling of 
transactions explicit, then add monitoring of transactions to gain a better 
understanding, and then experiment with and implement explicit scheduling 
strategies to optimise particular aspects. 
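
As a rough illustration of what "explicit scheduling" plus basic monitoring could look like, here is a minimal sketch. It is not Oak code: {{QueueingCommitScheduler}} and all of its methods are hypothetical names, and a commit is modelled as a plain {{Callable}} rather than a rebase followed by commit hooks. Commits are queued and executed one at a time by a dedicated thread, and the time each commit spends in the queue is recorded.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical sketch (not an Oak API): commits are handed to an explicit
 * queue and executed sequentially by a single worker thread, instead of
 * contending for a lock.
 */
final class QueueingCommitScheduler implements AutoCloseable {
    // A single worker thread preserves the "one commit at a time" semantics.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final AtomicLong dequeued = new AtomicLong();
    private final AtomicLong committed = new AtomicLong();
    private final AtomicLong totalQueueNanos = new AtomicLong();

    /** Enqueues a commit; the returned future completes once it has run. */
    <T> Future<T> schedule(Callable<T> commit) {
        final long enqueuedAt = System.nanoTime();
        return executor.submit(() -> {
            // Record how long the commit waited before being processed.
            totalQueueNanos.addAndGet(System.nanoTime() - enqueuedAt);
            dequeued.incrementAndGet();
            T result = commit.call();
            // Only reached if the commit did not throw.
            committed.incrementAndGet();
            return result;
        });
    }

    /** Number of commits that completed without throwing. */
    long committedCount() {
        return committed.get();
    }

    /** Average time in queue, in milliseconds, over all dequeued commits. */
    double averageQueueMillis() {
        long n = dequeued.get();
        return n == 0 ? 0.0 : totalQueueNanos.get() / 1_000_000.0 / n;
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

Once commits pass through a single entry point like {{schedule}}, other strategies (prioritising, deferring, coalescing) could be tried by replacing the queue, without touching the callers.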

h2. Summary of ideas mentioned in an offline session

h3. Advantages of explicit scheduling:
* Control over the order of commits
* Sophisticated monitoring (commit statistics, e.g. commit rate, time in queue, 
etc.) 
* Favour certain commits (e.g. checkpoints)
* Reorder commits to simplify rebasing
* Suspend the compactor on concurrent commits and have it resume where it left 
off afterwards
* Parallelise certain commits (e.g. by piggy backing)
* Implement a concurrent commit editor. We'd need to take care of proper access 
to the shared state; [~frm] maybe introduce the idea of a common context to 
enforce concurrent access semantics.
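
To illustrate the "favour certain commits" point, here is a minimal sketch of a priority-ordered commit queue. All names ({{PendingCommit}}, {{PrioritisedCommitQueue}}, the integer priority) are assumptions made for illustration and not part of Oak. Commits with a higher priority (e.g. checkpoints) are taken first, and commits of equal priority keep their arrival order.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical description of a commit waiting to be processed. */
final class PendingCommit {
    final String name;
    final int priority;   // higher value = more urgent (assumption)
    final long sequence;  // arrival order, used as a tie-breaker

    PendingCommit(String name, int priority, long sequence) {
        this.name = name;
        this.priority = priority;
        this.sequence = sequence;
    }
}

/** Hypothetical queue that favours high-priority commits such as checkpoints. */
final class PrioritisedCommitQueue {
    // Highest priority first; FIFO among commits of equal priority.
    private static final Comparator<PendingCommit> ORDER =
            Comparator.comparingInt((PendingCommit c) -> c.priority)
                    .reversed()
                    .thenComparingLong(c -> c.sequence);

    private final PriorityBlockingQueue<PendingCommit> queue =
            new PriorityBlockingQueue<>(11, ORDER);
    private final AtomicLong nextSequence = new AtomicLong();

    void add(String name, int priority) {
        queue.add(new PendingCommit(name, priority, nextSequence.getAndIncrement()));
    }

    /** Returns the name of the next commit to process, or null if none is pending. */
    String pollName() {
        PendingCommit next = queue.poll();
        return next == null ? null : next.name;
    }
}
```

The explicit sequence number matters because a priority queue on its own gives no ordering guarantee between elements that compare as equal.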

h3. Scheduler Implementation
* Expedite
* Prioritise
* Defer
* Collapse
* Coalesce
* Parallelise
* Piggy back: can we piggy back commits on top of each other? The idea would be, 
while processing the changes of one commit, to also check them for conflicts 
with the changes of other commits waiting to commit. If a conflict is detected, 
that other commit can be failed immediately (provided the current commit 
doesn't fail).
* Merging non-conflicting commits: given multiple transactions ready to commit 
at the same time, can we process them as one (given they don't conflict) 
instead of one after the other, which requires rebasing the later transaction 
on the former?
* Shield the file store from {{InterruptedException}}, which becomes a concern 
because of the newly introduced thread boundaries
* Implement tests, benchmarks and fixtures for verification
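
The piggy-backing and merging ideas above both rely on cheaply detecting whether pending commits conflict. The following sketch shows one possible shape of that check, under a strong simplifying assumption: each commit is reduced to the set of paths it changes, and two commits conflict exactly when those sets overlap. Real conflict detection in Oak works on node state differences and is more involved; {{CommitBatcher}} and its methods are hypothetical names.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical sketch: decides how many pending commits could be processed
 * as one, assuming a commit is fully described by the paths it changes.
 */
final class CommitBatcher {

    /** Two commits conflict if they change at least one common path (assumption). */
    static boolean conflicts(Set<String> a, Set<String> b) {
        return !Collections.disjoint(a, b);
    }

    /**
     * Returns the length of the longest prefix of pending commits that are
     * mutually non-conflicting. Taking a prefix preserves the commit order.
     */
    static int mergeablePrefix(List<Set<String>> pending) {
        Set<String> touched = new HashSet<>();
        int count = 0;
        for (Set<String> changes : pending) {
            if (conflicts(touched, changes)) {
                break; // this commit has to wait for the next round
            }
            touched.addAll(changes);
            count++;
        }
        return count;
    }
}
```

For pending change sets {{/a}}, {{/b}}, {{/a + /c}} and {{/d}}, in that order, the first two commits would be merged and the third would wait, since it overlaps with the first. A scheduler that is allowed to reorder could also pull the fourth commit into the first batch, at the cost of changing the commit order.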




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
