[ 
https://issues.apache.org/jira/browse/OAK-1161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13915925#comment-13915925
 ] 

Alex Parvulescu commented on OAK-1161:
--------------------------------------

I'm experiencing some issues with the HTTP failover: the backup of the HTTP 
store is throwing some errors [0].
The stack trace is a bit off, because I'm running a patched version locally, 
but the error comes from running the backup a few times on the same directory 
(much like an incremental backup), and the segment that causes this seems to 
have a really small length (284 bytes transferred, or 388 bytes on a different 
occasion).

Should this issue go into a dedicated sub task?

{code}
        at java.nio.Buffer.checkIndex(Buffer.java:520)
        at java.nio.HeapByteBuffer.getLong(HeapByteBuffer.java:391)
        at org.apache.jackrabbit.oak.plugins.segment.Segment.<init>(Segment.java:149)
        at org.apache.jackrabbit.oak.plugins.segment.http.HttpStore.loadSegment(HttpStore.java:121)
        at org.apache.jackrabbit.oak.plugins.segment.AbstractStore.readSegment(AbstractStore.java:123)
        at org.apache.jackrabbit.oak.plugins.segment.Segment.getSegment(Segment.java:266)
        at org.apache.jackrabbit.oak.plugins.segment.Record.getSegment(Record.java:109)
        at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.getTemplate(SegmentNodeState.java:74)
        at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeState.getChildNode(SegmentNodeState.java:333)
        at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStore.refreshHead(SegmentNodeStore.java:108)
        at org.apache.jackrabbit.oak.plugins.segment.SegmentNodeStore.checkpoint(SegmentNodeStore.java:206)
        at org.apache.jackrabbit.oak.plugins.backup.FileStoreBackup.backup(FileStoreBackup.java:49)
{code}
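
The top two frames show an absolute ByteBuffer.getLong(int) failing its index check, which is what happens when fewer than 8 bytes remain at the requested offset. The following is only a minimal sketch of that failure mode and of a possible guard; it does not use any Oak code, and the class and method names (SegmentBoundsSketch, canReadLong, readLongChecked) are hypothetical. The offset 280 is chosen purely for illustration and is not taken from the actual segment.

```java
import java.nio.ByteBuffer;

class SegmentBoundsSketch {

    /** Returns true if an 8-byte long can be read at the given absolute index. */
    static boolean canReadLong(ByteBuffer data, int index) {
        return index >= 0 && index <= data.limit() - Long.BYTES;
    }

    /** Reads a long, failing with a descriptive message instead of a bare index error. */
    static long readLongChecked(ByteBuffer data, int index) {
        if (!canReadLong(data, index)) {
            throw new IllegalStateException("Buffer too short: cannot read 8 bytes at offset "
                    + index + " from " + data.limit() + " bytes");
        }
        return data.getLong(index);
    }

    public static void main(String[] args) {
        // 284 bytes, matching the length of one of the short transfers reported above
        ByteBuffer tiny = ByteBuffer.allocate(284);
        try {
            tiny.getLong(280); // only 4 bytes remain at this offset
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unchecked read failed: " + e);
        }
        try {
            readLongChecked(tiny, 280);
        } catch (IllegalStateException e) {
            System.out.println("checked read failed: " + e.getMessage());
        }
    }
}
```

A check of this kind would not fix a truncated transfer, but it could make the resulting error point at the short segment rather than at the buffer internals.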

> Simple failover for TarMK-based installations
> ---------------------------------------------
>
>                 Key: OAK-1161
>                 URL: https://issues.apache.org/jira/browse/OAK-1161
>             Project: Jackrabbit Oak
>          Issue Type: New Feature
>          Components: segmentmk
>            Reporter: Michael Marth
>            Assignee: Alex Parvulescu
>             Fix For: 0.18
>
>
> At the moment we have a Mongo-based MK impl that Oak uses for scalable 
> deployments and TarMK for standalone (performant) deployments. I think it is 
> OK to not implement some sort of "scalability" into TarMK, even if I realize 
> that the hierarchical journals allow us to do that later if we want to. 
> However, it would even now be great to have a failover option for TarMK 
> (MongoMK implicitly offers this through replicas). This would not be about 
> clustering or scalability, but only about reliability.
> I think there are 2 parts to this:
> # keeping a standby repository (slave) in sync and
> # the actual fail over.
> For the first part there could be a relatively simple way to implement this:
> Let's consider that there is only one slave and that the slave does not 
> accept writes. Given the MVCC nature of the tar files we could simply sync 
> the (append-only) tar files from the master to the slave on an ongoing basis. 
> This could be similar to an rsync (or even use actual rsync).
> The slave would keep on receiving and locally persisting these files.
> Also, the slave would either need to be in a state where it blocks writes 
> or even in some sort of sleep state.
> I think this synchronization of files could be done in a rather robust way where 
> shaky networks or high latency could be recovered from by choosing a proper 
> way of transfer.
> This sync to a remote system could be implemented similarly to a 
> TarMK-based incremental backup (OAK-1159).
> For the failover:
> Ideally, we would have 2 implementations: a native failover and an external 
> switch (like MBean or via HTTP) that would make the slave stop accepting 
> files from master and start up on the last completely received revision. But 
> simply having the second option would be a good start.
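
The append-only sync idea in the description above could be sketched roughly as follows: because the files only grow, the standby needs just the bytes beyond what it already holds. This is a minimal local-filesystem illustration, not Oak code and not a proposal for the actual transport; the class AppendOnlySyncSketch, the method syncAppended, and the file names are all hypothetical, and it assumes the target is always a prefix of the source.

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class AppendOnlySyncSketch {

    /**
     * Copies only the bytes of source that lie beyond the current length of
     * target, and returns how many bytes were copied.
     */
    static long syncAppended(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            long from = out.size();          // bytes the standby already has
            long total = in.size() - from;   // bytes still missing
            if (total <= 0) {
                return 0;
            }
            out.position(from);
            long copied = 0;
            while (copied < total) {
                long n = in.transferTo(from + copied, total - copied, out);
                if (n <= 0) {
                    break;                   // nothing more could be transferred
                }
                copied += n;
            }
            return copied;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sync-sketch");
        Path master = dir.resolve("master.tar");
        Path standby = dir.resolve("standby.tar");

        Files.write(master, "segment-1".getBytes(StandardCharsets.UTF_8));
        System.out.println("first sync copied " + syncAppended(master, standby) + " bytes");

        Files.write(master, "segment-2".getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.APPEND);
        System.out.println("second sync copied " + syncAppended(master, standby) + " bytes");
    }
}
```

Resuming from the target's current length is also what would let an interrupted transfer pick up where it stopped, which fits the point about shaky networks.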



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
