How to replicate TDB indexes using rsync?

Paolo Castagna Thu, 07 Apr 2011 06:46:24 -0700

Hi,
one thing people often do to increase availability is to use replication.
Having replicated RDF stores allows you to load balance requests across
multiple machines in order to increase query throughput as well.
Replicas can act as a sort of backup; however, you need to be careful, since
errors will be replicated as well. Replication therefore does not, in general,
eliminate the need for backups. (What's the best way to back up
a TDB store?)


However, in the presence of updates you are left with the problem of keeping
your replicas in sync.

For simplicity, I was thinking of trying a master/slave(s)
architecture: one Fuseki server acting as master and running with the
--update option, and a few replicas running in read-only mode.
I am thinking of using rsync to sync the TDB indexes between master and slaves.
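
As a rough sketch of the rsync step only, the Python below builds and runs
an rsync command that mirrors a TDB directory onto one replica. The
directory /var/lib/tdb and the host name replica1 are hypothetical
placeholders, and the choice of -a and --delete is an assumption about what
a mirror of the index files would need, not a tested recipe.

```python
import subprocess


def build_rsync_command(src_dir, replica_host, dest_dir):
    """Return the rsync argument list that mirrors src_dir onto a replica."""
    # A trailing slash on the source copies the directory's contents
    # rather than the directory itself.
    source = src_dir.rstrip("/") + "/"
    destination = f"{replica_host}:{dest_dir.rstrip('/')}/"
    # -a preserves permissions and timestamps; --delete removes files on the
    # replica that no longer exist on the master.
    return ["rsync", "-a", "--delete", source, destination]


def sync_to_replica(src_dir, replica_host, dest_dir):
    """Run rsync; raises CalledProcessError if rsync exits non-zero."""
    subprocess.run(build_rsync_command(src_dir, replica_host, dest_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths and host, shown for illustration only.
    print(" ".join(build_rsync_command("/var/lib/tdb", "replica1", "/var/lib/tdb")))
```

Keeping the command construction separate from the call that runs it makes
it easy to print and inspect the command before pointing it at a real store.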

However, the replication needs to be coordinated, and updates must be
forbidden while it is in progress. The master should become read-only and
sync everything to disk; only then can replication start.

A similar thing would need to happen for slaves. Slaves should probably
be taken off-line while replication is going on, and Fuseki restarted as
soon as replication finishes. I expect replication to be quite fast
after the first time, since rsync only needs to transfer what has changed.
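
The coordination described above can be sketched as follows. This is only
an outline of the ordering, under stated assumptions: the master and replica
objects and their methods (set_read_only, set_read_write, stop, start) are
hypothetical stand-ins, not Fuseki or TDB APIs, and how a real master is
made read-only and flushed to disk is left open. Refreshing one replica at
a time is a design choice of the sketch, so that the others keep serving
reads.

```python
from contextlib import contextmanager


@contextmanager
def paused(stop, start):
    """Call stop() on entry and start() on exit, even if the body fails."""
    stop()
    try:
        yield
    finally:
        start()


def replicate(master, replicas, sync):
    """Freeze the master, then refresh each replica in turn.

    master   : object with set_read_only() and set_read_write() (hypothetical);
               set_read_only() is assumed to also flush pending data to disk
    replicas : objects with stop() and start() (hypothetical)
    sync     : callable(replica) that copies the index files, e.g. via rsync
    """
    with paused(master.set_read_only, master.set_read_write):
        for replica in replicas:
            # Only one replica is offline at a time.
            with paused(replica.stop, replica.start):
                sync(replica)
```

Because the stop and start steps are paired in a context manager, the master
returns to read/write mode and each replica is restarted even if a sync
fails partway through.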

I am sending this email to validate my thinking and to ask whether anyone else
has used rsync to manage replicated TDB stores, with or without Fuseki.

Thank you,
Paolo
