On Wed, 2011-12-07 at 17:02 +0100, Yann Dupont wrote:
> > before doing rm -rf for the user's mails. And in the archiving step you
> > should do it with dsync with mail_attachment_dir disabled in the
> > destination storage, so the attachments get written to the archive
> > directly instead of only referencing SIS.
> >
> Yes, I understand, it will work. But in case of any error (even our
> fault: premature end of script, for example) you can still end up with
> attachments forever lost on the filesystem.
>
> Right, it SHOULD not happen, and it probably won't represent a big
> volume. But still, it could happen under specific circumstances. In that
> case, I don't see any simple way to detect that kind of file.
>
> Do you see how a script could detect some orphaned links?

It wouldn't be simple. The only safe way would be to:

1. Scan through all the attachment HASH-GUID names and save them. This
   scanning step could already detect some orphaned attachments, where the
   hashes/HASH file exists with nlink=1 (i.e. the HASH-GUID* files have been
   deleted, but the HASH itself hasn't been for some reason).

2. Read through the contents of all users' dboxes and get a list of all
   referenced attachment HASH-GUIDs.

3. Delete all attachments that exist in list 1, but not in list 2.

I guess there should be a "doveadm sis rescan" command that does this.
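
Something like the following Python could serve as a rough sketch of steps 1 and 3. It is not an existing tool, and it rests on assumptions: that every file under a directory named "hashes" is a HASH file, that every other file below the attachment root is a HASH-GUID instance, and that step 2 has already produced the referenced names as paths relative to that root. The function names scan_attachments and find_orphans are made up for illustration. It only reports candidates and deletes nothing.

```python
import os
from pathlib import Path


def scan_attachments(root):
    """Step 1: collect HASH-GUID instances and hashes/HASH files with nlink=1.

    Assumed layout (not verified here): HASH files live in directories named
    "hashes"; every other regular file under root is a HASH-GUID instance.
    """
    root = Path(root)
    instances = set()
    lone_hashes = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        in_hashes_dir = Path(dirpath).name == "hashes"
        for name in filenames:
            path = Path(dirpath) / name
            rel = path.relative_to(root).as_posix()
            if in_hashes_dir:
                # nlink == 1: no HASH-GUID instance shares this inode any more.
                if path.stat().st_nlink == 1:
                    lone_hashes.add(rel)
            else:
                instances.add(rel)
    return instances, lone_hashes


def find_orphans(root, referenced):
    """Step 3: instances on disk (list 1) that are missing from list 2.

    `referenced` is the output of step 2, assumed to be an iterable of
    paths relative to root. Returns (unreferenced instances, lone hashes).
    """
    instances, lone_hashes = scan_attachments(root)
    return instances - set(referenced), lone_hashes
```

Step 2 (reading the dboxes to build the referenced list) is left out on purpose, since it depends on the dbox format. Anything the sketch returns should be checked by hand before it is removed.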