[ 
https://issues.apache.org/jira/browse/OAK-7932?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Axel Hanikel updated OAK-7932:
------------------------------
    Description: 
h1. Outline

This issue documents some proof-of-concept work for adapting the segment tar 
nodestore to a
 distributed environment. The main idea is to adopt an actor-like model, 
meaning:

-   Communication between actors (services) is done exclusively via messages.
 -   An actor (which could also be a thread) processes one message at a time, 
avoiding sharing
     state with other actors as far as possible.
 -   Segments are kept in RAM and are written to external storage lazily only 
for disaster recovery.
 -   As RAM is a very limited resource, different actors own their share of the 
total segment space.
 -   An actor can also cache a few segments which it does not own but which it 
uses often (such as
     the one containing the root node)
 -   The granularity of operating on whole segments may be too coarse, so 
perhaps reducing the segment
    size would improve performance.
 -   We could even use the segment solely as an addressing component and 
operate at the record level.
     That would avoid copying data around when collecting garbage: garbage 
records would just be
     evicted from RAM.
h1. Implementation

The first idea was to use ZeroMQ for communication because it seems to be a 
high-quality and
 easy to use implementation. A major drawback is that the library is written in 
C and the Java
 library which does the JNI stuff seems hard to set up and did not work for me. 
There is a native
 Java implementation of the ZeroMQ protocol, aptly called jeromq, which seems 
to work well so far,
but I don't know about its performance yet.

There is an attempt to use jeromq in the segment store in a very very very 
early stage at
[https://github.com/ahanikel/jackrabbit-oak/tree/zeromq] . It is based on the 
memory segment store
and currently just replaces direct function calls for reading and writing 
segments with messages being
sent and received.

  was:
# Outline

This issue documents some proof-of-concept work for adapting the segment tar 
nodestore to a
distributed environment. The main idea is to adopt an actor-like model, meaning:

-   Communication between actors (services) is done exclusively via messages.
-   An actor (which could also be a thread) processes one message at a time, 
avoiding sharing
    state with other actors as far as possible.
-   Segments are kept in RAM and are written to external storage lazily only 
for disaster recovery.
-   As RAM is a very limited resource, different actors own their share of the 
total segment space.
-   An actor can also cache a few segments which it does not own but which it 
uses often (such as
    the one containing the root node)
-   The granularity of operating on whole segments may be too coarse, so 
perhaps reducing the segment size would improve performance.
-   We could even use the segment solely as an addressing component and operate 
at the record level.
    That would avoid copying data around when collecting garbage: garbage 
records would just be
    evicted from RAM.

# Implementation

The first idea was to use ZeroMQ for communication because it seems to be a 
high-quality and
easy to use implementation. A major drawback is that the library is written in 
C and the Java
library which does the JNI stuff seems hard to set up and did not work for me. 
There is a native
Java implementation of the ZeroMQ protocol, aptly called jeromq, which seems to 
work well so far, but I don't know about its performance yet.

There is an attempt to use jeromq in the segment store in a very very very 
early stage at
<https://github.com/ahanikel/jackrabbit-oak/tree/zeromq> . It is based on the 
memory segment store and currently just replaces direct function calls for 
reading and writing segments with messages being sent and received.


> A distributed segment store for the cloud
> -----------------------------------------
>
>                 Key: OAK-7932
>                 URL: https://issues.apache.org/jira/browse/OAK-7932
>             Project: Jackrabbit Oak
>          Issue Type: Wish
>          Components: segment-tar
>            Reporter: Axel Hanikel
>            Assignee: Axel Hanikel
>            Priority: Minor
>
> h1. Outline
> This issue documents some proof-of-concept work for adapting the segment tar 
> nodestore to a
>  distributed environment. The main idea is to adopt an actor-like model, 
> meaning:
> -   Communication between actors (services) is done exclusively via messages.
>  -   An actor (which could also be a thread) processes one message at a time, 
> avoiding sharing
>      state with other actors as far as possible.
>  -   Segments are kept in RAM and are written to external storage lazily only 
> for disaster recovery.
>  -   As RAM is a very limited resource, different actors own their share of 
> the total segment space.
>  -   An actor can also cache a few segments which it does not own but which 
> it uses often (such as
>      the one containing the root node)
>  -   The granularity of operating on whole segments may be too coarse, so 
> perhaps reducing the segment
>     size would improve performance.
>  -   We could even use the segment solely as an addressing component and 
> operate at the record level.
>      That would avoid copying data around when collecting garbage: garbage 
> records would just be
>      evicted from RAM.
> h1. Implementation
> The first idea was to use ZeroMQ for communication because it seems to be a 
> high-quality and
>  easy to use implementation. A major drawback is that the library is written 
> in C and the Java
>  library which does the JNI stuff seems hard to set up and did not work for 
> me. There is a native
>  Java implementation of the ZeroMQ protocol, aptly called jeromq, which seems 
> to work well so far,
> but I don't know about its performance yet.
> There is an attempt to use jeromq in the segment store in a very very very 
> early stage at
> [https://github.com/ahanikel/jackrabbit-oak/tree/zeromq] . It is based on the 
> memory segment store
> and currently just replaces direct function calls for reading and writing 
> segments with messages being
> sent and received.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to