[
https://issues.apache.org/jira/browse/OAK-8613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926586#comment-16926586
]
Axel Hanikel commented on OAK-8613:
-----------------------------------
Cool! :) Do I understand correctly that changes are written to the private repo
first (so they succeed very quickly), and then synced with the shared repo,
where the shared repo requests a rebase if needed? If so, the private
repositories differ from the shared repo not in their content but in their
segments, right? (Not that I think that's a problem; I just want to make
sure I understand the mechanism.)
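
To pin down the mechanism, here is a minimal Python sketch of the flow as described in the issue text quoted below. It is only an illustration under stated assumptions: the names `Server`, `Client`, `merge` and `commit` are hypothetical, plain dicts stand in for the shared and private segment stores, and a direct method call stands in for the gRPC coordination. Nothing in it is taken from the attached patch.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical coordinator: the only writer of the shared repository."""
    shared: dict = field(default_factory=dict)   # revision id -> root content
    journal: list = field(default_factory=list)  # history of head revisions

    def head(self) -> str:
        return self.journal[-1]

    def merge(self, base_rev: str, private_rev: str, private_repo: dict) -> str:
        # If the shared head moved since the client fetched its base root,
        # the client has to rebase and retry.
        if base_rev != self.head():
            return "REBASE"
        # Otherwise read the client's root from its private repository and
        # write it to the shared repository under a new revision id: same
        # content, different "segment". Then update the journal.
        new_rev = f"shared-{len(self.journal)}"
        self.shared[new_rev] = dict(private_repo[private_rev])
        self.journal.append(new_rev)
        return new_rev


@dataclass
class Client:
    """Hypothetical client: reads the shared repo, writes only its private repo."""
    server: Server
    private: dict = field(default_factory=dict)
    counter: int = 0

    def commit(self, change: dict) -> str:
        while True:
            base_rev = self.server.head()              # ask the server for the revision
            root = dict(self.server.shared[base_rev])  # read the root from the shared repo
            root.update(change)                        # apply changes on top of the base
            self.counter += 1
            private_rev = f"private-{self.counter}"
            self.private[private_rev] = root           # write to the private repo first
            result = self.server.merge(base_rev, private_rev, self.private)
            if result != "REBASE":
                return result                          # new shared revision id


# Usage: two clients committing against the same server.
server = Server(shared={"shared-0": {}}, journal=["shared-0"])
first, second = Client(server), Client(server)
first.commit({"/content/a": 1})
second.commit({"/content/b": 2})
```

In this sketch the write to the private repository never waits for the server, and a merge sent with a stale base revision gets `"REBASE"` back. The revision ids in the private and shared stores differ even when the root content is identical.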
> Azure Segment Store clustering PoC
> ----------------------------------
>
> Key: OAK-8613
> URL: https://issues.apache.org/jira/browse/OAK-8613
> Project: Jackrabbit Oak
> Issue Type: Story
> Components: segment-azure
> Reporter: Tomek Rękawek
> Assignee: Tomek Rękawek
> Priority: Major
> Attachments: OAK-8613.patch, remote node store.png
>
>
> Azure Segment Store offers a way to read the same segments concurrently in
> many Oak instances. With a way to coordinate writes, it's possible to
> implement a distributed node store based on the SegmentNS. The solution will
> consist of the following elements:
> * a central server that coordinates the writes to the shared repository,
> * a number of clients that can read directly from the shared repository.
> They also have their own private repositories within the same cloud storage,
> * the shared repository, which represents the current state. It can be read
> by anyone, but only the central server can write to it,
> * the private repositories. Clients can write their own segments there and
> then pass their references to the server.
> As above, every client uses two repositories: shared (in read-only mode)
> and private (in read-write mode). When a client wants to read the current
> root, it asks the server for its revision. Then, it looks in the shared
> repository, reads the segment and creates a node state.
> If the client wants to modify the repository, it converts the fetched node
> state into a node builder. The applied changes will eventually be serialized
> in a new segment, in the client's private repository. In order to merge its
> changes, the client will send two revision ids: of the base root (fetched
> from the shared repo) and of the current root (stored in the private repo).
> The server will check whether the base root was updated in the meantime; if
> so, it requests a rebase. Otherwise, it'll read the current root from the
> private repository, apply the commit changes in the shared repository and
> update the journal.
> gRPC is used for the communication between the server and clients. This is
> used only for the coordination. All the data are actually exchanged via
> segment stores.
> !remote node store.png|width=100%!
> The attached [^OAK-8613.patch] contains the implementation split into 3 parts:
> * oak-store-remote-commons, which contains the gRPC descriptors of the
> services used and embeds the required libraries,
> * oak-store-remote-server, which builds an executable JAR that starts the
> server,
> * oak-store-remote-client, an OSGi bundle that starts a NodeStore connecting
> to the configured server and Azure Storage.
> There are also some changes in oak-segment-tar: new SPIs that allow reading
> segments with their revisions (record ids) and expose the revision in
> the node state.
> gRPC uses Guava 26. I was able to somehow get it running with the other Oak
> bundles, which use Guava 15, but if we want to productize it, we'd need to
> update Oak's Guava.
> There's a new fixture that tests the implementation. It can be run with:
> {noformat}
> mvn clean install -f oak-it/pom.xml -Dnsfixtures=REMOTE -Dtest=NodeStoreTest -Dtest.opts=-Xmx4g
> {noformat}
> This is a prototype. It lacks tests and important resilience features:
> * unit tests, especially for the discovery lite implementation,
> * enable online compaction for the shared repository (the private ones can be
> just thrown away after merge),
> * server resilience and support for disconnecting clients in a clean way
> (e.g. unregistering node observers),
> * client resilience, with support for reconnecting,
> * cleaning up resources on both sides on disconnection (e.g. removing the
> private repository).
> Potential improvements:
> * can we have multiple replicas for the server, in the active-passive mode,
> to increase resilience?