On Wed, 22 Jun 2016 12:25:28 +1000
ellie timoney via Cyrus-devel <cyrus-devel@lists.andrew.cmu.edu> wrote:
> D) Don't add the new sync_action_list.  If any operation returns
> IMAP_MAILBOX_LOCKED, just sync_log() that operation and continue, and
> let the next run deal with it.

I meant to comment on this a while ago, and your latest message just reminded 
me.

The current sync_client has an awful habit of just quitting when things go 
wrong ("Bailing out!"). This is not ideal for a system that is trying hard to 
keep the replica in sync. So we have a script that watches for this happening, 
and restarts it. A problem though with simply restarting is that whatever 
caused the bailing is still there, and it will happen again. So we move the old 
log out of the way, do one more try on that log, then discard it. 
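
Roughly, the move-aside / retry-once / discard step looks like the sketch below. 
This is not our actual script: the function name, the ".retry" suffix and the 
`replay` callback are made up for illustration, and `replay` stands in for 
however you would run sync_client once against the moved log.

```python
import os
import shutil
from typing import Callable


def retry_then_discard(log_path: str, aside_dir: str,
                       replay: Callable[[str], bool]) -> bool:
    """Move a failed log aside, replay it exactly once, then discard it.

    `replay` is a hypothetical callback: it receives the path of the
    moved log and returns True if the replay succeeded.
    """
    if not os.path.exists(log_path):
        return True  # nothing left behind, nothing to retry

    # Move the old log out of the way so a restarted client starts
    # on a fresh log instead of hitting the same problem again.
    os.makedirs(aside_dir, exist_ok=True)
    aside = os.path.join(aside_dir, os.path.basename(log_path) + ".retry")
    shutil.move(log_path, aside)

    try:
        ok = replay(aside)  # one more try on the old log
    except Exception:
        ok = False  # a crashing replay counts as a failed retry
    finally:
        # Discard the old log whether or not the retry worked, so we
        # can never loop on a log that keeps failing.
        if os.path.exists(aside):
            os.remove(aside)
    return ok
```

A False return is the cue for the separate "notice there was a problem" step, 
since whatever changes were in that log are gone at that point.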

This way at least most stuff is kept in sync, and replication is still running. 
We might lose a small amount of changes, but that is preferable to losing a 
large amount of changes when the client dies.

This is where we are now. It's not ideal, but mostly works. Separately we have 
to notice that there was a problem and reconstruct or whatever, and perhaps 
sync the problematic client.

Anyhow, hopefully this is something to keep in mind with your latest changes: 
don't get stuck in a loop if something is corrupted, which does happen 
sometimes; & don't just quit and lose all changes!

Thanks for your work.
g
