Chetan Mehrotra created OAK-4581:
------------------------------------

             Summary: Persistent local journal for more reliable event generation
                 Key: OAK-4581
                 URL: https://issues.apache.org/jira/browse/OAK-4581
             Project: Jackrabbit Oak
          Issue Type: New Feature
          Components: core
            Reporter: Chetan Mehrotra
             Fix For: 1.6


As discussed in OAK-2683, hitting the observation queue limit has multiple 
drawbacks. Quite a bit of work has been done to make diff generation faster. 
However, there is still a chance of the event queue filling up. 

This issue is meant to implement a persistent event journal. The idea is:

# NodeStore would push the diff into a persistent store via a synchronous 
observer
# Observers which are meant to handle such events asynchronously (by virtue of 
being wrapped in BackgroundObserver) would instead pull the events from this 
persisted journal
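
The push/pull split above can be sketched roughly as follows. This is a minimal in-memory illustration, not Oak code: {{LocalJournal}} and {{PullingListener}} are hypothetical names, the entries are plain strings standing in for whatever ends up being persisted, and a real journal would write to disk instead of a list.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the persistent journal. */
final class LocalJournal {
    private final List<String> entries = new ArrayList<>();

    /** Called synchronously on the commit path; returns the entry's sequence number. */
    synchronized long append(String serializedChange) {
        entries.add(serializedChange);
        return entries.size() - 1;
    }

    /** Returns up to max entries starting at sequence number fromSeq. */
    synchronized List<String> read(long fromSeq, int max) {
        int start = (int) Math.max(0, Math.min(fromSeq, entries.size()));
        int end = Math.min(entries.size(), start + Math.max(0, max));
        return new ArrayList<>(entries.subList(start, end));
    }
}

/** Hypothetical async consumer that pulls from the journal at its own pace. */
final class PullingListener {
    private final LocalJournal journal;
    private long next = 0; // sequence number of the next entry to read
    final List<String> seen = new ArrayList<>();

    PullingListener(LocalJournal journal) {
        this.journal = journal;
    }

    /** Pulls one batch; returns how many entries were processed. */
    int poll(int batchSize) {
        List<String> batch = journal.read(next, batchSize);
        seen.addAll(batch); // a real listener would generate events here
        next += batch.size();
        return batch.size();
    }
}
```

With this shape a slow listener only delays its own {{poll}} calls; the commit path pays for one append regardless of how many listeners exist.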

h3. A - What is persisted

h4. 1 - Serialized Root States and CommitInfo

In this approach we just persist the root states in serialized form. 
* DocumentNodeStore - This means storing the root revision vector
* SegmentNodeStore - {color:red}Q1 - What does the serialized form of a 
SegmentNodeStore root state look like?{color} - Possibly the RecordId of the 
"root" state

Note that with OAK-4528 DocumentNodeStore can rely on the persisted remote 
journal to determine the affected paths, which reduces the need for persisting 
the complete diff locally.

h4. 2 - Serialized commit diff and CommitInfo

In this approach we can save the diff in JSOP form. The diff only contains 
information about the affected paths, similar to what is currently stored in 
the DocumentNodeStore journal.
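
To make "only the affected paths" concrete, here is a small sketch that collects changed paths into a tree and renders it as nested JSON objects. The class name {{ChangedPaths}} and the output layout are assumptions chosen for illustration; this is not the actual format used by the DocumentNodeStore journal, and it does no escaping of node names.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical tree of changed paths; one node per path element. */
final class ChangedPaths {
    private final Map<String, ChangedPaths> children = new TreeMap<>();

    /** Records that something changed at or below the given absolute path. */
    void add(String path) {
        ChangedPaths node = this;
        for (String name : path.split("/")) {
            if (!name.isEmpty()) {
                node = node.children.computeIfAbsent(name, k -> new ChangedPaths());
            }
        }
    }

    /** Renders the tree as nested JSON objects (names are not escaped in this sketch). */
    String toJson() {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (Map.Entry<String, ChangedPaths> e : children.entrySet()) {
            sb.append(sep).append('"').append(e.getKey()).append("\":")
              .append(e.getValue().toJson());
            sep = ",";
        }
        return sb.append('}').toString();
    }
}
```

Shared parents are stored once, so a commit touching many siblings stays compact, and a listener can skip a whole entry when none of its paths of interest appear in the tree.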

h4. CommitInfo

The commit info would also need to be serialized, so we need to ensure that 
whatever is stored there can be either serialized or recalculated.
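
One way to approach this, sketched under assumptions: keep a user id, a timestamp, and only those attributes that have an obvious string form, and drop everything else so it has to be recalculated on read. {{CommitMeta}} is a hypothetical holder invented for this sketch, not Oak's CommitInfo class, and the binary layout is arbitrary.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical serializable subset of the commit metadata. */
final class CommitMeta {
    final String userId;
    final long date;
    final Map<String, String> attributes;

    CommitMeta(String userId, long date, Map<String, String> attributes) {
        this.userId = userId;
        this.date = date;
        this.attributes = attributes;
    }

    /** Keeps only String, Number and Boolean values; anything else is dropped. */
    static CommitMeta from(String userId, long date, Map<String, Object> info) {
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : info.entrySet()) {
            Object v = e.getValue();
            if (v instanceof String || v instanceof Number || v instanceof Boolean) {
                kept.put(e.getKey(), v.toString());
            }
        }
        return new CommitMeta(userId, date, kept);
    }

    byte[] toBytes() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeUTF(userId);
            out.writeLong(date);
            out.writeInt(attributes.size());
            for (Map.Entry<String, String> e : attributes.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeUTF(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    static CommitMeta fromBytes(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            String userId = in.readUTF();
            long date = in.readLong();
            int count = in.readInt();
            Map<String, String> attrs = new LinkedHashMap<>();
            for (int i = 0; i < count; i++) {
                String key = in.readUTF();
                String value = in.readUTF();
                attrs.put(key, value);
            }
            return new CommitMeta(userId, date, attrs);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The filter in {{from}} is where the "serialized or recalculated" decision is made: anything it drops must be derivable again by the consumer.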

h3. B - How it is persisted

h4. 1 - Use a secondary segment NodeStore

OAK-4180 makes use of SegmentNodeStore as a secondary store for caching. 
[~mreutegg] suggested that for persisted local journal we can also utilize a 
SegmentNodeStore instance. Care needs to be taken with compaction, either via 
a generation approach or by relying on online compaction.

h4. 2 - Make use of write ahead log implementations

[~ianeboston] suggested that we can make use of some write ahead log 
implementation like [1], [2] or [3]

h3. C - How changes get pulled

Some points to consider for event generation logic
# Would need a way to keep a pointer to a journal entry on a per-listener 
basis. This would allow each listener to "pull" content changes and generate 
the diff at its own pace while keeping in-memory overhead low
# The journal should survive restarts
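
Both points can be illustrated with a small sketch that keeps one position per listener in a properties file, so the positions are still there after a restart. {{ListenerPointers}} and its file layout are assumptions made for this example; rewriting the file on every update is the simplest possible choice and not a tuned design.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical per-listener journal positions, persisted to a properties file. */
final class ListenerPointers {
    private final Path file;
    private final Properties positions = new Properties();

    /** Loads previously saved positions if the file exists. */
    ListenerPointers(Path file) {
        this.file = file;
        if (Files.exists(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                positions.load(in);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** Next sequence number the listener should read; 0 for an unknown listener. */
    synchronized long get(String listenerId) {
        return Long.parseLong(positions.getProperty(listenerId, "0"));
    }

    /** Records progress and writes all positions back to disk. */
    synchronized void advance(String listenerId, long nextSeq) {
        positions.setProperty(listenerId, Long.toString(nextSeq));
        try (OutputStream out = Files.newOutputStream(file)) {
            positions.store(out, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lowest position across listeners; entries before it are no longer needed. */
    synchronized long oldestNeeded() {
        return positions.stringPropertyNames().stream()
                .mapToLong(this::get)
                .min()
                .orElse(0L);
    }
}
```

The value returned by {{oldestNeeded}} is what a cleanup job could use to decide how much of the journal can be discarded, since the slowest listener bounds what must be retained.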

[1] http://www.mapdb.org/javadoc/latest/mapdb/org/mapdb/WriteAheadLog.html
[2] https://github.com/apache/activemq/tree/master/activemq-kahadb-store/src/main/java/org/apache/activemq/store/kahadb/disk/journal
[3] https://github.com/elastic/elasticsearch/tree/master/core/src/main/java/org/elasticsearch/index/translog



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
