Re: sync_log_chain - is it always needed?

ellie timoney Sun, 03 May 2020 18:01:26 -0700

> Just to clarify here - are not IMAP/POP/LMTP actions logged anyway - 
> using sync_log? Won't they grow forever, too?


Yes, but no, because your replica is not taking IMAP/POP/LMTP/etc traffic, 
therefore no such actions are occurring on it.

If it is taking that traffic, it's not a replica, it's a master; if it needs to 
replicate to somewhere else while it's acting as master, then it needs a 
rolling sync_client, which will consume the sync_log, and it won't grow 
forever.  If it doesn't need to replicate anywhere else (i.e. maybe you don't 
have a third server in the set) then it probably wouldn't have sync_log on 
anyway, but if it does, you should turn it off in this situation.

If your replica is accidentally taking that traffic while still being a replica 
for replication purposes, you have a split brain situation, which can be fiddly 
to clean up.  So be careful to avoid this!

> What is a scenario where sync_log and sync_log_chain is used 
> independently? What is the purpose to have sync_log and sync_log_chain 
> as separate options - couldn't we just use sync_log?

sync_log should be enabled on _servers that need to replicate to elsewhere in 
approximately real time_ (let's call this "category A"), and they need to have 
a rolling sync_client active to do this, which consumes the sync logs.

sync_log_chain should be enabled on the subset of servers in category A that 
also receive traffic via replication from elsewhere.

So, sync_log and sync_log_chain are not interchangeable.  sync_log means "I 
am running a rolling sync_client, and it needs a source of events", and 
sync_log_chain means ".... and that should also include replication events".  A 
replica that is chaining needs _both_.

In an M->R1->R2 setup, M needs "sync_log", R1 needs "sync_log" and 
"sync_log_chain", and R2 needs neither.

In an M->R1 + M->R2 setup, M needs "sync_log". R1 and R2 don't need sync 
configuration.  They can safely have sync_log enabled because they're not 
receiving user traffic anyway, but they don't need it.

In an M->R1 (there is no R2) setup, M needs "sync_log".  R1 doesn't need sync 
configuration (but can safely have sync_log enabled), and R2 doesn't exist.
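
As a rough illustration of the rule above, here is a minimal Python sketch that 
works out the minimum options per server from a list of replication links.  The 
function name and the (source, target) list are hypothetical, made up for this 
example; Cyrus reads nothing like this, it only models the reasoning.

```python
def required_sync_options(edges):
    """Return the minimum sync options each server needs.

    `edges` is a list of (source, target) replication links. This
    representation is hypothetical; it is not a Cyrus config format.
    """
    sources = {src for src, _ in edges}  # servers that replicate elsewhere
    targets = {dst for _, dst in edges}  # servers that receive replication
    options = {}
    for server in sorted(sources | targets):
        # sync_log: this server runs a rolling sync_client to somewhere.
        sync_log = server in sources
        # sync_log_chain: it also receives replication and must pass it on.
        sync_log_chain = sync_log and server in targets
        options[server] = {"sync_log": sync_log,
                           "sync_log_chain": sync_log_chain}
    return options


chain = required_sync_options([("M", "R1"), ("R1", "R2")])
fan_out = required_sync_options([("M", "R1"), ("M", "R2")])
single = required_sync_options([("M", "R1")])

for name, result in (("M->R1->R2", chain),
                     ("M->R1 + M->R2", fan_out),
                     ("M->R1", single)):
    print(name)
    for server, opts in result.items():
        print(f"  {server}: {opts}")
```

This reports only what each server needs.  A pure replica marked as not needing 
sync_log can still have it enabled without harm, as long as it really isn't 
taking user traffic.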

Presumably, you set up replication so that if something happens to your usual 
master server, you can restore service by carefully promoting one of your 
replicas into the master role.

In an M->R1->R2 setup, presumably you're planning to switch to R1 in an M-failed 
scenario, since it will have the most up-to-date data.  It's already set up to 
replicate to R2, and while it is acting as master it won't be receiving 
replication traffic, so leaving sync_log_chain on won't hurt, and it will still 
need sync_log on to keep replicating to R2.  (This also implies that, if you 
have multiple data centres, M and R1 should be in different data centres, R1 
and R2 should be in different data centres, but R2 can be in the same data 
centre as M with the caveat that if you lose that whole dc, then the 
promoted-R1 has nowhere to replicate to.  YMMV depending on the size of your 
deployment, your resiliency requirements, your budget, etc etc.)

In an M->R1 + M->R2 setup, both replicas are reasonably up to date, and either 
would make a suitable failover target.  You probably want to configure 
whichever one you promote to now replicate to the other, where previously it 
didn't replicate to anywhere.
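
To make the two failover cases concrete, here is a small Python sketch of how 
the replication links change when M is lost and a replica is promoted.  The 
`promote` function and the (source, target) link list are hypothetical, 
invented for this example; this is a planning aid, not something Cyrus 
provides.

```python
def promote(edges, failed, new_master):
    """Return the replication links after `failed` is lost and
    `new_master` takes over. `edges` is a list of (source, target) pairs.
    """
    # Drop links involving the failed server, and any link that would
    # still feed replication traffic into the new master.
    surviving = [(s, t) for s, t in edges
                 if failed not in (s, t) and t != new_master]
    servers = {x for link in edges for x in link} - {failed}
    # Servers still fed by a surviving link need no new upstream.
    fed = {t for _, t in surviving}
    # Everything else now has to be replicated to by the new master.
    for server in sorted(servers - {new_master} - fed):
        surviving.append((new_master, server))
    return surviving


# M->R1->R2: R1 is already set up to replicate to R2, nothing new to add.
after_chain = promote([("M", "R1"), ("R1", "R2")], "M", "R1")

# M->R1 + M->R2: the promoted replica gains a link to the other one.
after_fan_out = promote([("M", "R1"), ("M", "R2")], "M", "R1")

print(after_chain)
print(after_fan_out)
```

In the second case the promoted replica previously had no outgoing link, so it 
is the one that newly needs sync_log and a rolling sync_client.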

When you get M rebuilt after the disaster, you probably want to initially set 
it up as a replica in the set, so that it can be brought up to date without 
taking user traffic.  Then, if you want to make it the master again, halt 
non-admin traffic to the whole set, perform a final replication to it to make 
sure it's exactly up to date (this should be quick: you already brought it up 
to date, so the current-master will only have a few minutes of changes left to 
send), then flip your configurations/dns/proxy/whatever else you need to do, 
and put your sync configuration back to usual.  Make sure everything's fine, 
then re-enable non-admin traffic.

> Are both types of entries - from sync_log and sync_log_chain used only 
> by rolling replication? Or are they used by sync_client -A too?

No, the sync log (both "types") is only used by the rolling sync_client mode.  
The sync log is how a rolling sync_client knows what to replicate.  Every other 
sync_client mode gets a list of "what to replicate" in its command line 
arguments, and only replicates what it's told.

sync_client -A replicates all users.  "-A" constitutes a "list of what to 
replicate".  Thus it ignores the sync log.

> Sorry for the silly questions, the replication is quite scantly documented.

Have you seen this?  
https://www.cyrusimap.org/imap/reference/admin/sop/replication.html

Pull requests appreciated, of course.

> Thank you for all the explanations so far, regards,

No worries, hope this is helpful.

ellie
----
Cyrus Home Page: http://www.cyrusimap.org/
List Archives/Info: http://lists.andrew.cmu.edu/pipermail/info-cyrus/
To Unsubscribe:
https://lists.andrew.cmu.edu/mailman/listinfo/info-cyrus
