Tomek Rękawek created OAK-6571:
----------------------------------
Summary: Prefetching the DocumentStore cache using machine learning
Key: OAK-6571
URL: https://issues.apache.org/jira/browse/OAK-6571
Project: Jackrabbit Oak
Issue Type: Story
Components: cache, documentmk
Reporter: Tomek Rękawek
Fix For: 1.8
The idea is that we can analyse the series of requests made by the
DocumentStore, e.g.:
/content/site/jcr:content
/content/site/jcr:content/left-column
/content/site/jcr:content/left-column/item1
/content/site/jcr:content/left-column/item2
to predict future requests and prefetch the corresponding documents. This way
we can reduce the number of round trips to the backend and the impact of
connection latency.
In order to group the requests together, we can use the thread name as a common
property. For instance, if Oak is used with Sling, then a single HTTP request
is usually served by a single thread, and its name contains the HTTP request
line.
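A minimal sketch of the grouping idea, assuming a hypothetical RequestRecorder
class that is not part of Oak or of the attached patch. It keeps one ordered
list of requested paths per thread name; the overload that takes an explicit
thread name exists only to make the sketch easy to exercise.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not an Oak API): groups requested paths by thread name. */
class RequestRecorder {

    // thread name -> paths requested by that thread, in request order
    private final Map<String, List<String>> sequences = new HashMap<>();

    /** Records a path under the name of the calling thread. */
    public void record(String path) {
        record(Thread.currentThread().getName(), path);
    }

    /** Records a path under an explicit thread name (assumed grouping key). */
    public synchronized void record(String threadName, String path) {
        sequences.computeIfAbsent(threadName, k -> new ArrayList<>()).add(path);
    }

    /** Returns a copy of the sequence recorded for the given thread name. */
    public synchronized List<String> sequenceFor(String threadName) {
        List<String> seq = sequences.get(threadName);
        return seq == null ? Collections.<String>emptyList() : new ArrayList<>(seq);
    }
}
```

In a real deployment the sequences would have to be bounded or expired, since
thread names that embed the request line are effectively unbounded in number.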
Implementing this story will require intercepting the MongoDB/RDB requests made
by the DocumentStore and preparing an algorithm that analyses past calls and
predicts future ones. The attached patch contains a proposed interface which
may be used to join these two parts.
We can start with a simple algorithm that tries to exactly match the current
requests against the already recorded sequences and, if that is not enough,
look for a more sophisticated mechanism.
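The exact-match starting point could look roughly like the sketch below. The
ExactMatchPredictor class and its learn/predict methods are hypothetical names
chosen for illustration, not the interface from the attached patch. It treats
the requests seen so far as a prefix, and returns the remainder of the first
recorded sequence that starts with that prefix.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical exact-match predictor (illustration only, not the patch API). */
class ExactMatchPredictor {

    // previously observed request sequences, in the order they were learned
    private final List<List<String>> known = new ArrayList<>();

    /** Stores a completed request sequence for later matching. */
    public synchronized void learn(List<String> sequence) {
        known.add(new ArrayList<>(sequence));
    }

    /**
     * Returns the paths expected to follow the current requests, or an
     * empty list if no recorded sequence starts with exactly these requests.
     */
    public synchronized List<String> predict(List<String> current) {
        if (current.isEmpty()) {
            return Collections.emptyList();
        }
        for (List<String> seq : known) {
            boolean isPrefix = seq.size() > current.size()
                    && seq.subList(0, current.size()).equals(current);
            if (isPrefix) {
                // everything after the matched prefix is a prefetch candidate
                return new ArrayList<>(seq.subList(current.size(), seq.size()));
            }
        }
        return Collections.emptyList();
    }
}
```

With the four example paths above learned as one sequence, a thread that has
requested the first two would get the two item paths back as prefetch
candidates. The linear scan is only meant to show the matching rule; a trie
keyed by path would be a natural replacement for larger histories.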
Resources:
* [Intelligent web caching using machine learning
methods|http://www.nnw.cz/doi/2011/NNW.2011.21.025.pdf]
* [Hidden Markov Model|https://en.wikipedia.org/wiki/Hidden_Markov_model]
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)