On Mon, Oct 15, 2018 at 08:50:21AM +0000, Raymond Sellars wrote:
>
> Hi
>
> Looking for some insight into mdbox index file management and recovery from
> corruptions.
>
>
> I have a two node cluster on NFS with proxy director in front for user
> stickiness. One node (a nominated master) bidirectionally replicates to a 3rd
> node on a DR site.
>
> We periodically get index file corruptions resulting in rebuilds. However the
> user experience is poor as messages read/deleted from months/years ago all
> reappear as unread again.
>
> We've seen corruption because of NFS NTP time sync problems, proxy not being
> sticky, but also the DR node being offline for a while and then tripping
> corruption within production when it comes back on.
>
> Error message example (1 of):
>
> Error: Corrupted dbox file /mailshare/.. (removed) ../home/mail/storage/m.4
> (around offset=993548): EOF reading msg header (got 0/30 bytes)
>
> https://wiki2.dovecot.org/MailboxFormat/dbox - I've read up on all the
> documentation I can find and understand "you must not lose the dbox index
> files, they can't be regenerated without data loss."
>
> Questions:
> #1 Any additional tips for avoiding mdbox index corruptions with dsync? Or
> should I revert to maildir format? I like the performance premise of
> mdbox but these index corruptions are a reliability issue.
>
> #2 I'm guessing read status is one of the metadata items lost. But it seems
> it can't recover it from dovecot.index.backup files either. Any technique to
> preserve that item, as it's key to the user experience?
>
> #3 If index/transaction logs are so critical, is there some kind of checkpoint
> backup I can take? Native dovecot feature or do I need to script
> something?
>
> #4 I've noticed that rebuilding the index does not work if the
> dovecot.index.log file is lost (deleted as a hard test). The
> dovecot.index.cache can be, but once the log file is gone messages are not
> automatically (or manually that I can find) recovered from the storage
> directory.
>
> I've not seen any dovecot.index.log file corruptions but that file seems very
> high risk. Is rebuilding the index done only from the log file or a
> combination process from storage directory?
>
> Is there perhaps an option to just use the transaction log and not the index?
> Although that doesn't sound wise for performance.
>
> #5 In addition to status UNREAD we also notice files moved to the trash
> reappear. Is that expected behavior?
>
> Thanks
> Raymond
>
>
>
What version of Dovecot and what OS are you running? Is the NFS server Linux/BSD/NetApp/etc.?
