# Doug Lee:

[ fixed quote-levels ]
> On Sat, Apr 09, 2005 at 05:33:22PM -0400, Chuck Swiger wrote:


[ mail storage backed by DB ]
> > 
> > The advantage is that users get fancy searching.
> >
> > The disadvantage is that you need to provide around 4 times as much disk 
> > space for a DB-based mailstore as you would for a normal mbox/maildir style 
> > representation, you need to provide a lot more server horsepower, you need 
> > to continuously maintain and purge old mail from the database, and you end 
> > up with your mail buried in database tables, so heaven help you if the 
> > database becomes inconsistent and you need to recover.

Whereas you can repair mbox-files with your favorite editor
and employ pretty much the same level of fancy searching
with a couple of scripts.


> But as for increased storage requirements, I've always wondered how
> much could be saved by an intelligent method of behind-the-scenes
> handling of quoting among messages in a thread.  Goodness knows half
> the mail on a lot of lists, and even in a lot of personal mail
> streams, is simply copies of some or all of other messages, perhaps
> shifted over by quote signs like `>' etc.  Seems to me a system could
> be devised to store directions for rebuilding a message instead of the
> message itself with all quoting intact. 

Basically, you could just kill any quotechar, trim headers and 
store the threads as incremental diffs.  You could squeeze redundancy
a bit more, but then you'll cry if some bug decides to eat a byte
or two. ;)
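
A rough sketch of that idea in Python, under some loud assumptions:
quote markers are plain `>' prefixes, each reply is stored relative
to exactly one parent, and the rebuilt reply is allowed to lose its
quote characters.  The names strip_quotes, make_delta and apply_delta
are made up for this illustration.

```python
import re
from difflib import SequenceMatcher

# Assumption: quoting is done with leading '>' markers, possibly nested.
QUOTE_PREFIX = re.compile(r"^(?:\s*>)+ ?")

def strip_quotes(body):
    """Return the lines of body with leading quote markers removed."""
    return [QUOTE_PREFIX.sub("", line) for line in body.splitlines()]

def make_delta(parent_lines, reply_lines):
    """Describe reply_lines as copies from parent_lines plus new text."""
    ops = []
    matcher = SequenceMatcher(a=parent_lines, b=reply_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Quoted material: store only a reference into the parent.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New material: store the text itself.
            ops.append(("insert", reply_lines[j1:j2]))
        # "delete" means parent text the reply dropped; nothing to store.
    return ops

def apply_delta(parent_lines, ops):
    """Rebuild the (quote-stripped) reply from its parent and the delta."""
    out = []
    for op in ops:
        if op[0] == "copy":
            out.extend(parent_lines[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out
```

Note that a single damaged parent breaks every reply stored against
it, which is exactly the "bug eats a byte" problem.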


>  but I wouldn't be surprised if it could reverse the
> increased storage requirements you mention.

Probably.

What's the gain in all that, though?

The mbox-format is simple enough[1], you can just build something
to suit your needs in your favorite scripting language. 

Personally, I'd just build three scripts for that:

 - The first to interactively insert some headers from within my
   MUA (mutt, in this instance), i.e. 'X-Archive-Keywords: ' and
   'X-Archive-Location: '.

 - The second to (as a cron-job)
         i) extract mails from mbox files
        ii) move them into some kind of archive directory tree (based 
            on the above -location-header, i.e. $TREEBASE/$LOCATION)
   and iii) store interesting headers inside a DB.
   
 - The third for searching and cat(1)ing results to stdout
   (which in turn is nothing but a new mbox-file).
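
Scripts two and three might look roughly like this in Python.  This
is only a sketch: the table layout, the one-file-per-mail naming and
the function names archive_mbox and search are assumptions, and it
copies mails instead of moving them, so removing them from the
source mbox (with proper locking) is left out.

```python
import hashlib
import mailbox
import sqlite3
from pathlib import Path

# Hypothetical index layout; pick whatever headers are interesting.
SCHEMA = """CREATE TABLE IF NOT EXISTS mail (
    path TEXT PRIMARY KEY, sender TEXT, subject TEXT,
    date TEXT, keywords TEXT)"""

def archive_mbox(mbox_path, treebase, db_path):
    """Copy tagged mails to treebase/LOCATION and index their headers."""
    db = sqlite3.connect(str(db_path))
    db.execute(SCHEMA)
    box = mailbox.mbox(str(mbox_path), create=False)
    archived = 0
    for msg in box:
        location = (msg.get("X-Archive-Location") or "").strip()
        # Skip untagged mails and locations that would leave the tree.
        if (not location or Path(location).is_absolute()
                or ".." in Path(location).parts):
            continue
        # File name from the Message-ID, or from the whole mail if missing.
        ident = str(msg.get("Message-ID") or "").encode("utf-8", "replace")
        name = hashlib.sha1(ident or msg.as_bytes()).hexdigest() + ".mbox"
        target = Path(treebase) / location / name
        if target.exists():
            continue  # already archived on an earlier run
        target.parent.mkdir(parents=True, exist_ok=True)
        out = mailbox.mbox(str(target))  # one single-mail mbox per message
        out.add(msg)
        out.close()
        db.execute(
            "INSERT OR REPLACE INTO mail VALUES (?, ?, ?, ?, ?)",
            (str(target), str(msg.get("From", "")),
             str(msg.get("Subject", "")), str(msg.get("Date", "")),
             str(msg.get("X-Archive-Keywords", ""))),
        )
        archived += 1
    db.commit()
    db.close()
    box.close()
    return archived

def search(db_path, keyword):
    """Return all mails whose keywords match, concatenated as one mbox."""
    db = sqlite3.connect(str(db_path))
    rows = db.execute(
        "SELECT path FROM mail WHERE keywords LIKE ?", (f"%{keyword}%",)
    ).fetchall()
    db.close()
    return b"".join(Path(path).read_bytes() for (path,) in rows)
```

Writing the result of search() to stdout gives the cat(1)-style
output: a new mbox that the MUA can open directly.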

The hard part about this is integrating it into $MUA, but
there might be some hook around for that.  Actually looks
like a perfect mini-project to learn a new language with. ;)


Cheers.
Mario

[1]:
IIRC: the header of a mail starts with /^From / and
terminates with /^$/ and the other way around for the
body of a mail.  Can't get more simple than that.
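
As a sketch of how little code that takes, here is a naive splitter
in Python (split_mbox is a made-up name).  It assumes that body
lines starting with "From " were already escaped as ">From " by
whatever wrote the mbox, which is the usual convention; for real
work the standard mailbox module is the safer choice.

```python
def split_mbox(text):
    """Split mbox text into a list of (header_lines, body_lines) pairs."""
    messages = []
    headers = body = None
    in_headers = False
    for line in text.splitlines():
        if line.startswith("From "):
            # Separator line: a new mail begins, headers follow.
            headers, body = [], []
            messages.append((headers, body))
            in_headers = True
        elif headers is None:
            continue  # junk before the first separator line
        elif in_headers:
            if line == "":
                in_headers = False  # an empty line ends the header
            else:
                headers.append(line)
        else:
            body.append(line)
    return messages
```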
_______________________________________________
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
