This is a reply to an old email, but I *have* been looking in the
archives for a solution to my problem, before writing my own... :-)

On Fri, May 01, 2015 at 05:25:38PM +0200, Oswald Buddenhagen wrote:
> On Fri, May 01, 2015 at 04:53:50PM +0200, Tamas Papp wrote:
> > Maildir error: UID 30838 is beyond highest assigned UID 18
> > 
> > I googled and the workaround seems to be deleting and resyncing the
> > mailbox.
> >
> uh, no, it's not.
> 

I have encountered the issue after switching my last account using
offlineimap (for some weird reason I've been using isync/mbsync and
offlineimap on different accounts in parallel) to mbsync/isync, because
offlineimap was being too annoying after an upgrade of its own.

> search only this list archive with key "duplicate uid".

I've looked for the email with the details, and I assume it is
<20150208105332.ga7...@ugly.fritz.box>.

The critical bit is this:

> first convert back to the native format. then use 'ls *,U=nnn' with the
> complained about uids. find out which message is the moved one, and
> manually strip the infix. rinse and repeat.

Which I understood as "remove everything that follows ,U= in the name
of every email file". That basically brings you back to the starting
point where both the local and the remote have a set of emails, and no
relationship between them is assumed.
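The quoted per-message procedure can be sketched like this (the folder
path, file name and UID below are made up for illustration; this keeps
the :2,... flags suffix and strips only the ,U=nnn infix):

```shell
# work in a throw-away copy of a hypothetical maildir folder
rm -rf /tmp/uidfix && mkdir -p /tmp/uidfix/cur
touch "/tmp/uidfix/cur/1430492738.1234_1.host,U=30838:2,S"

cd /tmp/uidfix/cur
# locate the message with the complained-about UID
ls -- *,U=30838*
# strip just the ,U=nnn infix, keeping the flags suffix
for f in *,U=30838*; do
    mv -- "$f" "$(printf '%s\n' "$f" | sed 's/,U=[0-9][0-9]*//')"
done
```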

So mbsync will sync both ways, which leads to every email being
duplicated, and a lot of bandwidth wasted. (Typing D~= in Mutt is not
that hard, but when you have a few hundred folders, it gets old very
quickly...)

So I came up with the following solution, to avoid copying the local emails
to the remote store (i.e. save half the bandwidth):

    Assuming the MaildirStore is the master, and the IMAPStore is the slave:

    For each folder ($folder):

    1. RESET: Rename all files in cur and new to remove everything after ,U=

       This can be done in bulk using the 'rename' utility:

           rename 's/,U=[1-9][0-9]*.*//' $folder/*/*

       (Use the -n (--nono) option to see what it would do, without
       actually doing it.)
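Note that the 'rename' above is the Perl one (File::Rename); util-linux
ships an incompatible tool of the same name. On systems without it, a
plain-shell equivalent of the bulk reset could look like this (the
folder path and file name are made up):

```shell
# hypothetical folder with one already-synced message
rm -rf /tmp/reset-demo && mkdir -p /tmp/reset-demo/cur /tmp/reset-demo/new
touch "/tmp/reset-demo/cur/1430492738.1234_1.host,U=18:2,S"

# strip everything from ",U=" onward, like s/,U=[1-9][0-9]*.*//
for f in /tmp/reset-demo/cur/* /tmp/reset-demo/new/*; do
    [ -e "$f" ] || continue          # skip unmatched globs
    case $f in
        *,U=*) mv -- "$f" "${f%%,U=*}" ;;
    esac
done
```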

    2. SYNC: Sync from remote to local (this will get all the remote emails,
       including those already in the local store, causing duplication):

           mbsync --push $channel:$folder

       Here I used --push, because I want to propagate from slave (remote)
       to master (local) only.

       (Be careful with options like Flatten: the remote folder might
       have a different name than the local directory.)

    3. DEDUPE: Delete all local duplicates.

       This is a bit tricky, because the mails synced by mbsync will
       have an extra X-TUID header line, which should not be taken into
       account when looking for duplicates. It turns out OfflineIMAP
       also adds its own bookkeeping header.

       This is accomplished with this Perl program:

            #!/usr/bin/env perl
            use strict;
            use warnings;
            use Path::Class;
            use Digest::SHA;

            my $dir = dir(shift);
            my %digest;

            # group files by digest
            for my $subdir (qw( cur new )) {

                # compute the digest with the extra headers
                push @{
                    $digest{ Digest::SHA->new("sha1")->add(
                            grep !m{^X-TUID: [a-zA-Z0-9+/]{12}$},       # isync
                            grep !m{^X-OfflineIMAP: [0-9]+-[0-9]+$},    # offlineimap
                            $_->slurp
                        )->hexdigest
                    }
                  },
                  $_
                  for $dir->subdir($subdir)->children;
            }

            # drop all duplicates but one
            unlink splice @$_, 1 for grep @$_ > 1, values %digest;

      Actually, this might only work by accident, because of the order
      in which the system returns the list of files to the program
      (older first, so the newly synced emails will be kept, and those
      already live on the remote side).

      If that ever becomes a problem, the solution is to detect the
      emails that carry an X-TUID header, and preferentially keep those.
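That preference could be sketched like this, outside the Perl program
(file names and contents are invented; a real version would only look
at the header section, not the whole file):

```shell
# two duplicate copies of the same mail; only the newly synced one
# carries the X-TUID header that mbsync added
rm -rf /tmp/dedup-demo && mkdir -p /tmp/dedup-demo
printf 'Subject: hi\n\nbody\n'                        > /tmp/dedup-demo/old
printf 'X-TUID: abcdefghijkl\nSubject: hi\n\nbody\n'  > /tmp/dedup-demo/new

# among the duplicates, remember the copy that has an X-TUID header
keep=
for f in /tmp/dedup-demo/old /tmp/dedup-demo/new; do
    grep -q '^X-TUID: ' "$f" && keep=$f
done

# delete the other copies of the group
for f in /tmp/dedup-demo/old /tmp/dedup-demo/new; do
    [ "$f" = "$keep" ] || rm -- "$f"
done
```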


I wrapped the above three steps in a shell script that I ran in a loop
over all my folders. I had to download all the emails again (but only in
one direction), and my backups are going to take a hit (all file names
have changed in the Maildir), but it still feels cleaner, and I can run
this again easily if needed.
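For reference, the wrapper amounts to something like this dry-run
sketch (the channel name, Maildir location, folder list, and dedupe
script path are all made up; make run() execute "$@" instead of
logging to actually do the work):

```shell
#!/bin/sh
channel=work                     # hypothetical channel name
maildir=$HOME/Mail/$channel      # hypothetical local Maildir root
dedupe=$HOME/bin/dedupe.pl       # the Perl program above, saved somewhere

rm -f /tmp/mbsync-fix.log
run() { echo "$*" >> /tmp/mbsync-fix.log; }   # log instead of executing

for folder in INBOX archive sent; do
    # 1. RESET: strip everything from ",U=" onward in every file name
    run rename 's/,U=[1-9][0-9]*.*//' "$maildir/$folder"/cur/* "$maildir/$folder"/new/*
    # 2. SYNC: propagate from the IMAP slave to the Maildir master only
    run mbsync --push "$channel:$folder"
    # 3. DEDUPE: drop the local duplicates
    run perl "$dedupe" "$maildir/$folder"
done
```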

If anyone has suggestions for improving my solution, I'm very interested.

-- 
 Philippe Bruhat (BooK)

 In war, the only winners are those who sell the weapons.
                                                 (Moral from Groo #3 (Image))

------------------------------------------------------------------------------
_______________________________________________
isync-devel mailing list
isync-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/isync-devel
