joerghoh opened a new pull request, #180:
URL: 
https://github.com/apache/sling-org-apache-sling-distribution-journal/pull/180

   Ability to concurrently import packages.
   
   The most important changes are in the BookKeeper, where an offset is only 
stored if no message with a lower offset is still being processed. In other 
words, an offset is persisted only once all "older" messages have been 
processed.
   
   If the concurrency is set to "1" (= serialized), the import semantics do 
not change at all: in this case every older message has already been processed, 
and therefore the offset of every package is stored.
   
   If the concurrency is higher (true parallel import), the offset of a 
processed message may not be persisted because "older" messages (messages 
with a smaller offset) are still being processed. In that case the import 
semantics change from "successfully imported exactly once" to "successfully 
imported at least once". This only works under the assumption that every 
message is idempotent and that re-importing it (in the correct order) is 
possible without side effects.
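
To illustrate the offset rule described above, here is a minimal sketch. It is not the code from this PR: the class `OffsetTracker` and its methods `start` and `finish` are hypothetical names, and the sketch assumes that each offset is registered before its processing begins.

```java
import java.util.Optional;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not the BookKeeper implementation.
final class OffsetTracker {
    // Offsets of messages that are currently being processed.
    private final TreeSet<Long> inFlight = new TreeSet<>();

    // Register a message before its processing starts.
    synchronized void start(long offset) {
        inFlight.add(offset);
    }

    // Mark a message as processed. Returns the offset to persist,
    // or empty if a message with a lower offset is still in flight.
    synchronized Optional<Long> finish(long offset) {
        inFlight.remove(offset);
        // lower() returns the greatest element strictly below the argument, or null.
        Long older = inFlight.lower(offset);
        return older == null ? Optional.of(offset) : Optional.empty();
    }
}
```

In this sketch, if offsets 1 and 2 are in flight and 2 finishes first, `finish(2)` returns an empty result and nothing is stored; `finish(1)` then returns offset 1. After a restart, message 2 would be imported again, which is the "at least once" case. With a concurrency of 1 there is never an older message in flight, so every call to `finish` returns its own offset.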
   
   In the context of distribution, this also means that the packages being 
replicated must not depend on each other (at least not when those packages 
could potentially be processed in parallel).
   
   For the reviewers:
   * Please validate that I have covered all cases. Right now I only handle 
``BookKeeper.importPackage()`` and ``BookKeeper.invalidatePackage()`` in this 
particular way. Skipping a package is always idempotent, so it is ignored 
when deciding whether an offset should be stored.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
