Re: [SLUG] Removing IMAP duplicates

Jake Anderson Thu, 03 Dec 2009 16:14:39 -0800

Nigel Allen wrote:

Greetings
We have been using the Thunderbird plugins "Remove Duplicates" and "Remove Duplicates (Alternate)" for some time.
The situation is as follows. A customer of ours has a single IMAP account under which there are some hundreds of folders. There are (in total) 325,000+ emails in these folders and more are added every day (email archive system). These folders are also used for charging purposes, so on a monthly basis we de-duplicate all the folders and add up all the received and sent in each of the folders for that month.
The folders look like this:

Imap Archive Account
++++++++Customer 0001
++++++++Customer 0002
++++++++Customer 0003
++++++++
++++++++Customer 9999
The problem is that the de-duplication run is taking (as you would expect) longer and longer each month as it compares every email with every other email (we de-duplicate from the IMAP account down and include all folders). It's now up to 24+ hours. The only way I can think of shortening this is to select and de-duplicate each of the individual sub-folders (Customers) as a separate operation. That will be a lot faster in pure comparison but will tie up a valuable resource (me) for hours on end - maybe an entire day.
Does anyone know of a way in which we could run the de-duplication process on all the sub-folders at once on the server itself - which should be way faster?
Currently running CentOS 4.8, sendmail and dovecot.

Thanks in anticipation.

Nigel.

Not really what you were after, I know, but dbmail has the option of removing duplicate emails as it receives them.

Perhaps dovecot has a similar feature?
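
If nothing built in turns up, a per-folder pass on the server could be scripted. What follows is only a rough sketch, not what the Thunderbird plugins do and not a dovecot feature. It assumes the mail is stored as Maildir (an mbox setup would need a different approach), and it assumes two messages are duplicates when their Message-ID headers match. The function names are made up for illustration. Each folder is handled on its own and every message is checked against a set of IDs already seen, so nothing gets compared against every other email.

```python
import hashlib
import mailbox


def message_key(msg):
    """Key used to decide whether two messages are the same mail (assumption: Message-ID)."""
    msg_id = str(msg.get("Message-ID") or "").strip()
    if msg_id:
        return msg_id
    # Fallback for mail without a Message-ID: hash a few identifying headers.
    headers = "\n".join(str(msg.get(h) or "") for h in ("From", "To", "Date", "Subject"))
    return hashlib.sha1(headers.encode("utf-8", "replace")).hexdigest()


def find_duplicates(box):
    """Return the mailbox keys of every message whose key was already seen in this folder."""
    seen = set()
    dupes = []
    for key, msg in box.iteritems():
        ident = message_key(msg)
        if ident in seen:
            dupes.append(key)
        else:
            seen.add(ident)
    return dupes


def dedupe_folder(path, dry_run=True):
    """De-duplicate one Maildir folder; returns the number of duplicates found."""
    box = mailbox.Maildir(path, create=False)
    dupes = find_duplicates(box)
    if not dry_run:
        # Only delete after the scan is complete, and only when asked to.
        for key in dupes:
            box.remove(key)
    box.close()
    return len(dupes)
```

Calling dedupe_folder() once per customer folder, with dry_run=True first, would give a count of duplicates without deleting anything. Which copy of a duplicate survives is arbitrary here, and it would be wise to try this on a copy of the mail store before letting it loose on the real one.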
--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
