[
https://issues.apache.org/jira/browse/OAK-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904042#comment-13904042
]
Alex Parvulescu commented on OAK-1161:
--------------------------------------
bq. The servlet part is already in oak-http, but probably needs some further
cleanup/fixes.
Good point, I had missed that one.
So far I see one problem (ignoring the auth bits for now) with the startup:
Journal#setHead [0] throws an UnsupportedOperationException, which can crash a
normal Oak startup (OakInitializer#initialize will try to merge in some
changes). Simply returning _true_ seems to fix the issue, although I'm not sure
what the implications are.
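To make the workaround concrete, here is a minimal sketch. It assumes a simplified, hypothetical Journal interface with plain String record ids; it is not the actual Oak code, and the class names are made up for illustration. It shows one likely implication: a setHead that always returns _true_ tells the caller the merge succeeded, while the head never changes.

```java
public class SetHeadSketch {

    /** Hypothetical, simplified stand-in for a journal; ids are plain strings. */
    public interface Journal {
        String getHead();
        boolean setHead(String base, String head);
    }

    /** A local journal: the head only moves if the caller's base is current. */
    public static class LocalJournal implements Journal {
        private String head;

        public LocalJournal(String initialHead) {
            this.head = initialHead;
        }

        @Override
        public synchronized String getHead() {
            return head;
        }

        @Override
        public synchronized boolean setHead(String base, String newHead) {
            if (!head.equals(base)) {
                return false; // someone else moved the head in the meantime
            }
            head = newHead;
            return true;
        }
    }

    /** The workaround: report success without changing anything. */
    public static class NoOpRemoteJournal implements Journal {
        private final String remoteHead;

        public NoOpRemoteJournal(String remoteHead) {
            this.remoteHead = remoteHead;
        }

        @Override
        public String getHead() {
            return remoteHead;
        }

        @Override
        public boolean setHead(String base, String newHead) {
            // Assumption for illustration: writes are silently dropped.
            return true;
        }
    }

    public static void main(String[] args) {
        Journal local = new LocalJournal("r0");
        System.out.println(local.setHead("r0", "r1") + " " + local.getHead());

        Journal remote = new NoOpRemoteJournal("r0");
        // The caller is told the merge worked, but the head is unchanged.
        System.out.println(remote.setHead("r0", "r1") + " " + remote.getHead());
    }
}
```

If the startup merge only sets up initial content that already exists on the master, dropping it may be harmless, but that is an assumption that would need checking.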
On a more general note, I'm not sure how the HttpStore can act as a failover
when it looks like it depends heavily on the master node's availability (the
master needs to respond to GET requests, and there's no local state AFAICT).
Would the failover solution be a combination of a FileStore (possibly in
read-only mode) and an HttpStore for the journal sync?
[0]
https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/http/HttpStore.java#L77
> Simple failover for TarMK-based installations
> ---------------------------------------------
>
> Key: OAK-1161
> URL: https://issues.apache.org/jira/browse/OAK-1161
> Project: Jackrabbit Oak
> Issue Type: New Feature
> Components: segmentmk
> Reporter: Michael Marth
> Assignee: Alex Parvulescu
> Fix For: 0.17
>
>
> At the moment we have a Mongo-based MK impl that Oak uses for scalable
> deployments and TarMK for standalone (performant) deployments. I think it is
> OK to not implement some sort of "scalability" into TarMK, even if I realize
> that the hierarchical journals allow us to do that later if we want to.
> However, it would even now be great to have a failover option for TarMK
> (MongoMK implicitly offers this through replicas). This would not be about
> clustering or scalability, but only about reliability.
> I think there are 2 parts to this:
> # keeping a standby repository (slave) in sync and
> # the actual fail over.
> For the first part there could be a relatively simple way to implement this:
> Let's consider that there is only one slave and that the slave does not
> accept writes. Given the MVCC nature of the tar files we could simply sync
> the (append-only) tar files from the master to the slave on an ongoing basis.
> This could be similar to rsync (or even use actual rsync).
> The slave would keep on receiving and locally persisting these files.
> Also, the slave would need to be either in a state where it blocks writes
> or even in some sort of sleep state.
> I think this synchronization of files could be done in a rather robust way,
> where shaky networks or high latency could be recovered from by choosing a
> proper way of transfer.
> This sync to a remote system could be implemented similarly to a
> TarMK-based incremental backup (OAK-1159).
> For the failover:
> Ideally, we would have 2 implementations: a native failover and an external
> switch (like MBean or via HTTP) that would make the slave stop accepting
> files from master and start up on the last completely received revision. But
> simply having the second option would be a good start.
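
The append-only file sync described in the quoted issue could be sketched roughly as follows. This is only an illustration under stated assumptions: master and slave files are reachable as local paths, the source file is strictly append-only, and the class and method names (AppendOnlySync, syncAppended) are hypothetical rather than anything in Oak.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class AppendOnlySync {

    /**
     * Copies the bytes the source has beyond the current length of the
     * target and returns how many bytes were appended. Assumes the source
     * is append-only, so the target is always a prefix of the source.
     */
    public static long syncAppended(Path source, Path target) throws IOException {
        long alreadyCopied = Files.exists(target) ? Files.size(target) : 0L;
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.APPEND)) {
            long end = in.size();
            long position = alreadyCopied;
            while (position < end) {
                long n = in.transferTo(position, end - position, out);
                if (n <= 0) {
                    break; // nothing more could be transferred right now
                }
                position += n;
            }
            return position - alreadyCopied;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-sketch");
        Path master = dir.resolve("master.tar");
        Path slave = dir.resolve("slave.tar");

        Files.write(master, "abc".getBytes());
        System.out.println(syncAppended(master, slave)); // first pass copies everything

        Files.write(master, "def".getBytes(), StandardOpenOption.APPEND);
        System.out.println(syncAppended(master, slave)); // second pass copies only the tail
    }
}
```

Because each pass resumes from the target's current length, an interrupted transfer can simply be retried, which is the kind of recovery from shaky networks the description asks for. A real implementation would also need to handle partially written entries at the end of a file.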