On Mon, Oct 15, 2018 at 08:50:21AM +0000, Raymond Sellars wrote:
> 
> Hi 
> 
> Looking for some insight into mdbox index file management and recovery from 
> corruptions.
> 
> 
> I have a two-node cluster on NFS with a proxy director in front for user 
> stickiness. One node (a nominated master) bidirectionally replicates to a 
> 3rd node on a DR site.
> 
> We periodically get index file corruptions resulting in rebuilds. However the 
> user experience is poor as messages read/deleted from months/years ago all 
> reappear as unread again.
> 
> We've seen corruption because of NFS NTP time sync problems and the proxy 
> not being sticky, but also from the DR node being offline for a while and 
> then tripping corruption within production when it comes back online.
> 
> Error message example (1 of):
> 
> Error: Corrupted dbox file /mailshare/.. (removed) ../home/mail/storage/m.4 
> (around offset=993548): EOF reading msg header (got 0/30 bytes)
> 
> https://wiki2.dovecot.org/MailboxFormat/dbox - I've read up on all the 
> documentation I can find and understand "you must not lose the dbox index 
> files, they can't be regenerated without data loss."
> 
> Questions:
> #1 Any additional tips for avoiding mdbox index corruptions with dsync? Or 
> should I revert to maildir format? I like the performance premise of the 
> mdbox but these index corruptions are a reliability issue.
> 
> #2 I'm guessing read status is one of the metadata items lost. But it seems 
> it can't be recovered from the dovecot.index.backup files either. Any 
> technique to preserve that item, as it's key to the user experience?
> 
> #3 If index/transaction logs are so critical, is there some kind of 
> checkpoint backup I can take? Is there a native dovecot feature, or do I 
> need to script something?
> 
> #4 I've noticed that rebuilding the index does not work if the 
> dovecot.index.log file is lost (deleted as a hard test). The 
> dovecot.index.cache can be lost, but once the log file is gone messages are 
> not automatically (or manually, that I can find) recovered from the storage 
> directory.
> 
> I've not seen any dovecot.index.log file corruptions but that file seems very 
> high risk. Is rebuilding the index done only from the log file, or is it a 
> combined process that also uses the storage directory?
> 
> Is there perhaps an option to just use the transaction log and not the index? 
> Although that doesn't sound wise for performance.
> 
> #5 In addition to the UNREAD status, we also notice files moved to the 
> trash reappear. Is that expected behavior?
> 
> Thanks
> Raymond
> 
> 
> 

What version of Dovecot and what OS are you running? Is the NFS server 
Linux/BSD/NetApp/etc.?
