I have almost every email (not mailing list mail) I've sent or received since 1997. It's 2+ GB and around 270,000 messages. The only thing that makes it manageable is that I use Pine as my primary email client. On the 1st of each month, Pine asks if I want to move read-mail to read-mail-MON-YYYY. Same for sent-mail. I'll then create a sub-dir for the year and move the 24 mailboxes into, say, 2003/ to further subdivide things.
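
For anyone who would rather script the yearly filing step than do it by hand, here is a minimal Python sketch. It assumes the monthly mailboxes are plain files named like read-mail-jan-2003 and sent-mail-jan-2003 sitting in one mail directory; the exact names Pine produces may differ, and the function name file_by_year is made up for illustration.

```python
import re
import shutil
from pathlib import Path

# Assumed naming scheme: read-mail-MON-YYYY / sent-mail-MON-YYYY.
MONTHLY = re.compile(r"^(read|sent)-mail-[a-z]{3}-(\d{4})$", re.IGNORECASE)


def file_by_year(mail_dir):
    """Move monthly mailboxes in mail_dir into per-year sub-directories.

    Returns the list of new paths. Files that do not match the
    assumed naming scheme are left where they are.
    """
    mail_dir = Path(mail_dir)
    moved = []
    # sorted() snapshots the listing before any directories are created.
    for path in sorted(mail_dir.iterdir()):
        match = MONTHLY.match(path.name)
        if not (path.is_file() and match):
            continue
        year_dir = mail_dir / match.group(2)   # e.g. mail/2003
        year_dir.mkdir(exist_ok=True)
        target = year_dir / path.name
        shutil.move(str(path), str(target))
        moved.append(target)
    return moved
```

Try it on a copy of the mail directory first, since it moves files in place.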
When I need to find something, I normally remember the year, grep the 24 mailboxes for that year, find the month, use vi to find the date, then open it in Pine. Sure, a searchable SQL db would be more efficient... but I've never failed to find an old email. Pine isn't pretty, but it works, and it is amazingly efficient at opening 40,000+ message mailboxes when other clients would just give up and commit suicide.

On another note, I did write an archive system years ago for our shift reports. Basically it would parse out From, Date, Subject, and the body of a message and stick it in MySQL. Then it would "tokenize" the body and stick all the keywords in another table row referencing the original message... this made for pretty fast searching. It doesn't do attachments or anything fancy though. It's been in production for years and I haven't looked at it in as long, but if you want some of the code I'd be happy to send it along (it's all pretty simple). It's a mix of Perl and PHP.

ray

On Sat, 13 Mar 2004, Shannon Roddy wrote:

> I have well over 80,000 emails that I have to keep up with. Makes most
> mail clients choke and attempting to archive it by hand takes way too
> much time every day. I have changed clients several times over the
> years and some have handled filters better than others, so some is
> organized. some isn't.
>
> I use IMAP, which is great, but it makes searches a bit more difficult.
> I have about 1.7 gigs between my mail spool and my imap folders. So
> it is not your average amount of email.
>
> Shannon
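
To illustrate the kind of keyword index described above for the shift-report archive, here is a minimal Python sketch. It is not the original Perl/PHP code. It uses SQLite from the standard library as a stand-in for MySQL, assumes one keyword row per distinct word per message, and the table, column, and function names are all made up for illustration. Like the original, it ignores attachments.

```python
import re
import sqlite3
from email import message_from_string


def create_schema(conn):
    """Create a messages table and a keywords table that points back to it."""
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, sender TEXT, date TEXT, subject TEXT, body TEXT)"
    )
    conn.execute(
        "CREATE TABLE keywords ("
        "word TEXT, message_id INTEGER REFERENCES messages(id))"
    )
    # The index on word is what keeps keyword lookups fast.
    conn.execute("CREATE INDEX idx_keywords_word ON keywords(word)")


def tokenize(text):
    """Return the set of lowercase alphanumeric words in text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def archive(conn, raw):
    """Parse one raw message, store it, and index its body keywords."""
    msg = message_from_string(raw)
    body = msg.get_payload()
    if not isinstance(body, str):
        body = ""  # multipart message: attachments are not handled here
    cur = conn.execute(
        "INSERT INTO messages (sender, date, subject, body) VALUES (?, ?, ?, ?)",
        (msg["From"], msg["Date"], msg["Subject"], body),
    )
    message_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO keywords (word, message_id) VALUES (?, ?)",
        [(word, message_id) for word in tokenize(body)],
    )
    return message_id


def search(conn, word):
    """Return (id, subject) for every message whose body contains word."""
    rows = conn.execute(
        "SELECT m.id, m.subject FROM messages m "
        "JOIN keywords k ON k.message_id = m.id "
        "WHERE k.word = ? ORDER BY m.id",
        (word.lower(),),
    )
    return rows.fetchall()
```

Usage would be along the lines of opening a connection with sqlite3.connect(":memory:"), calling create_schema(conn), feeding each raw message to archive(conn, raw), and then calling search(conn, "router"). Searching hits the small indexed keywords table instead of scanning every message body, which is presumably why this approach felt fast.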
