[ 
https://issues.apache.org/jira/browse/OAK-8613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16926598#comment-16926598
 ] 

Tomek Rękawek edited comment on OAK-8613 at 9/10/19 12:50 PM:
--------------------------------------------------------------

[~ahanikel] - in order to make any change in the Oak repo, you need to take the 
current node state (which is a snapshot of the repository at a given moment, 
like a revision in git) and create a node builder on top of it (which is roughly 
the same as a branch in git). The changes you apply are written to the segment 
store as you go, even before merging. Otherwise we'd run out of memory when 
applying a large change set. In oak-segment-tar, the changes are persisted in 
the branch once we have more than 1000 of them or when we call 
nodeBuilder.getNodeState().

Now, when the nodeStore.merge(nodeBuilder) is called, the node store rebases 
the builder on top of the current root and switches the HEAD pointer to the new 
repository root.
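
The snapshot, builder and merge steps described above can be modelled in a few 
lines of plain Java. This is only a toy sketch: ToyNodeStore and its Builder 
are hypothetical names, a flat map stands in for the node tree, and nothing 
here is persisted to a segment store the way oak-segment-tar does it.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: none of these classes are Oak APIs. */
final class ToyNodeStore {
    // Immutable snapshot of the whole "repository", comparable to a git revision.
    private Map<String, String> head = Map.of();

    synchronized Map<String, String> getRoot() {
        return head;
    }

    /** A builder holds private changes on top of a base snapshot (a "branch"). */
    static final class Builder {
        final Map<String, String> base;
        final Map<String, String> changes = new LinkedHashMap<>();

        Builder(Map<String, String> base) {
            this.base = base;
        }

        Builder setProperty(String path, String value) {
            changes.put(path, value);
            return this;
        }

        /** Base plus pending changes; the store's head is not touched. */
        Map<String, String> getNodeState() {
            Map<String, String> view = new HashMap<>(base);
            view.putAll(changes);
            return Map.copyOf(view);
        }
    }

    Builder builder() {
        return new Builder(getRoot());
    }

    /** Rebase the builder's changes onto the current head, then move the head. */
    synchronized Map<String, String> merge(Builder builder) {
        Map<String, String> rebased = new HashMap<>(head);
        rebased.putAll(builder.changes);
        head = Map.copyOf(rebased);
        return head;
    }
}
```

Two builders created from the same root can be changed independently; the 
root returned by getRoot() only moves when merge() is called.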

That's how the classic oak-segment-tar works. In the clustered case, a private 
repository is used to store the node builder changes (before merge() is 
called). When merge() is called, the client sends the record id of the new 
root node. This record id references a node in the private segment store. 
During merge(), the server reads that node from the private tree and applies 
the same changes in the shared repository. It's actually quite a simple 
method:

{noformat}
Legend:
* nodeStore - the underlying shared repository,
* commit.getBaseNodeState() - the base root revision from the incoming commit,
* commit.getHeadNodeState() - the updated root revision from the incoming 
commit,
* newHead - the node state with the new root, read from the private repository,
* builder - the builder created in the shared repository
{noformat}

{code}
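// Current root of the shared repository.
NodeState currentRoot = nodeStore.getRoot();

// Compare the shared root with the base the client started from.
String currentRootRevision = RevisionableUtils.getRevision(currentRoot);
if (!currentRootRevision.equals(commit.getBaseNodeState().getRevision())) {
    // the currentRoot differs from the incoming base state, return
    // ...
}

// Read the client's new root from its private segment store and replay
// the differences onto a builder created in the shared repository.
NodeBuilder builder = currentRoot.builder();
NodeState newHead = privateFileStores.getNodeState(
        commit.getSegmentStoreDir(),
        commit.getHeadNodeState().getRevision());
newHead.compareAgainstBaseState(currentRoot, new ApplyDiff(builder));

// Merge the builder, which moves the shared HEAD to the new root.
newRoot = nodeStore.merge(builder, EmptyHook.INSTANCE,
        CommitInfoUtil.deserialize(commit.getCommitInfo()));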
{code}
[Source|https://github.com/trekawek/jackrabbit-oak/blob/58abf7c07432e69198441ceabb7c2b56f48c70af/oak-store-remote-server/src/main/java/org/apache/jackrabbit/oak/remote/server/NodeStoreService.java#L71-L84]

Clients can apply changes to their private node builders without any 
synchronisation. The merge() operation itself, however, is synchronised on the 
server side and blocking on the client side. Once merge() returns on the 
client, the changes have either been applied or the merge has failed.
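
To illustrate the synchronisation described above, here is a minimal sketch of 
a server that serialises merges and rejects commits built on a stale base. 
ToyMergeServer, Result and the long revision counter are assumptions made for 
illustration; the actual prototype compares revision strings and coordinates 
over gRPC.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; the names do not come from the patch. */
final class ToyMergeServer {
    private long headRevision = 0;
    private Map<String, String> sharedRoot = Map.of();

    /** Outcome of a merge attempt, plus the head revision the server now has. */
    static final class Result {
        final boolean applied;
        final long revision;

        Result(boolean applied, long revision) {
            this.applied = applied;
            this.revision = revision;
        }
    }

    synchronized long getHeadRevision() {
        return headRevision;
    }

    synchronized Map<String, String> getRoot() {
        return sharedRoot;
    }

    /**
     * Only one merge runs at a time. A commit whose base is no longer the
     * head is rejected, so the client has to rebase and try again.
     */
    synchronized Result merge(long baseRevision, Map<String, String> changes) {
        if (baseRevision != headRevision) {
            return new Result(false, headRevision);
        }
        Map<String, String> next = new HashMap<>(sharedRoot);
        next.putAll(changes);
        sharedRoot = Map.copyOf(next);
        headRevision++;
        return new Result(true, headRevision);
    }
}
```

A client whose merge is rejected can use the returned revision as its new base, 
reapply its changes and call merge() again.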



> Azure Segment Store clustering PoC
> ----------------------------------
>
>                 Key: OAK-8613
>                 URL: https://issues.apache.org/jira/browse/OAK-8613
>             Project: Jackrabbit Oak
>          Issue Type: Story
>          Components: segment-azure
>            Reporter: Tomek Rękawek
>            Assignee: Tomek Rękawek
>            Priority: Major
>         Attachments: OAK-8613.patch, remote node store.png
>
>
> Azure Segment Store offers a way to read the same segments concurrently in 
> many Oak instances. With a way to coordinate writes, it's possible to 
> implement a distributed node store based on the SegmentNS. The solution will 
> consist of the following elements:
> * a central server, that'd coordinate the writes to the shared repository,
> * a number of clients, that can read directly from the shared repository. 
> They also have their own, private repositories within the same cloud storage,
> * the shared repository, which represents the current state. It can be read 
> by anyone, but only the central server can write it,
> * the private repositories. Clients can write their own segments there and 
> then pass their references to the server.
> As above, every client uses two repositories: shared (in the read only mode) 
> and private (in read-write mode). When a client wants to read the current 
> root, it asks the server for its revision. Then, it looks in the shared 
> repository, reads the segment and creates a node state.
> If the client wants to modify the repository, it converts the fetched node 
> state into a node builder. The applied changes will eventually be serialized 
> in a new segment, in the client's private repository. In order to merge its 
> changes, the client will send two revision ids: of the base root (fetched 
> from the shared repo) and of the current root (stored in the private repo). 
> The server will check whether the base root was updated in the meantime; if 
> it was, the server requests a rebase. Otherwise, it'll read the current root 
> from the private repository, apply the commit changes in the shared 
> repository and update the journal.
> gRPC is used for the communication between the server and clients. This is 
> used only for the coordination. All the data are actually exchanged via 
> segment stores.
>  !remote node store.png|width=100%! 
> The attached [^OAK-8613.patch] contains the implementation split into 3 parts:
> * oak-store-remote-commons, which contains the gRPC descriptors of used 
> services and embeds the required libraries,
> * oak-store-remote-server builds an executable JAR, that starts the server,
> * oak-store-remote-client is an OSGi bundle that starts NodeStore connecting 
> to the configured server and Azure Storage.
> There are also some changes in oak-segment-tar: new SPIs that allow reading 
> segments with their revisions (record ids) and expose the revision in the 
> node state.
> gRPC uses Guava 26. I was able to somehow get it running with other Oak 
> bundles, which use Guava 15, but if we want to productize it, we'd need to 
> update Oak's Guava.
> There's a new fixture that tests the implementation. It can be run with:
> {noformat}
> mvn clean install -f oak-it/pom.xml -Dnsfixtures=REMOTE -Dtest=NodeStoreTest -Dtest.opts=-Xmx4g
> {noformat}
> This is a prototype. It lacks tests and important resilience features:
> * unit tests, especially for the discovery lite implementation,
> * enable online compaction for the shared repository (the private ones can be 
> just thrown away after merge),
> * server resilience and support for disconnecting clients in a clean way 
> (e.g. unregistering node observers),
> * client resilience, with support for reconnecting,
> * cleaning up resources on both sides on disconnection (e.g. removing the 
> private repository).
> Potential improvements:
> * can we have multiple replicas for the server, in the active-passive mode, 
> to increase resilience?



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
