At 11:54 AM -0800 2003/10/29, Peter C. Norton wrote:

 It always confounds me that people will go for database voodoo and
 deride filesystems when a filesystem is a highly specialised database
 in and of itself.

I am aware of that. I was aware of that when I first gave my invited talk entitled "Design and Implementation of Highly Scalable E-mail Systems", which you can find at <http://www.shub-internet.org/brad/papers/dihses/>.


Note that Eric Allman (author of the original Ingres database, among many other things) and Kirk McKusick (author of the Berkeley Fast File System) were in the audience. I did not embarrass myself.

 Databases aren't meant to be storage for abstract binary data.
 They're meant to be a searchable index of data of types they
 understand.

Correct. And despite all claims to the contrary from the vendors, no database properly "understands" binary large objects, nor does any of them offer another datatype that it does understand and that would be suitable for storing e-mail message bodies.


 Assuming I had a clean slate to start a database project for a mail
 store, personally I'd much rather prototype it in something like
 postgresql where I could add data types to deal with email.  I could
 then make header types, text types, mime types classes, etc.  Then I
 could test to see if it was a good idea to implement it.

IMO, that would be an exercise in futility. We've been down this road a million times before. We don't need to go down it again to know that the result is not likely to be successful, especially when we have alternatives that are proven to work well -- we store the message meta-data in the database, and the message bodies in a separate message store akin to INN timecaf/timehash "heaps" (see <http://www.shub-internet.org/brad/papers/dihses/lisa2000/sld090.htm>).
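
To make the split concrete, here is a minimal sketch of the idea, not of any real implementation. It assumes SQLite for the meta-data and a single append-only file standing in for the heap. The class name HeapStore, the table layout, and the column names are all hypothetical, and this is not the actual INN timecaf/timehash on-disk format. The database row holds only an offset and a length; the body bytes never pass through the database.

```python
import os
import sqlite3
import tempfile


class HeapStore:
    """Hypothetical store: bodies in an append-only file, pointers in SQLite."""

    def __init__(self, heap_path, db_path=":memory:"):
        self.heap_path = heap_path
        self.db = sqlite3.connect(db_path)
        # Assumed schema: searchable meta-data plus a pointer into the heap.
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS messages ("
            " id INTEGER PRIMARY KEY,"
            " sender TEXT, subject TEXT,"
            " heap_offset INTEGER, body_length INTEGER)"
        )

    def add(self, sender, subject, body):
        # Append the raw body bytes to the heap and note where they landed.
        with open(self.heap_path, "ab") as heap:
            heap.seek(0, os.SEEK_END)
            offset = heap.tell()
            heap.write(body)
        cur = self.db.execute(
            "INSERT INTO messages (sender, subject, heap_offset, body_length)"
            " VALUES (?, ?, ?, ?)",
            (sender, subject, offset, len(body)),
        )
        self.db.commit()
        return cur.lastrowid

    def body(self, message_id):
        # Look up the pointer in the database, then read the bytes directly.
        row = self.db.execute(
            "SELECT heap_offset, body_length FROM messages WHERE id = ?",
            (message_id,),
        ).fetchone()
        if row is None:
            raise KeyError(message_id)
        offset, length = row
        with open(self.heap_path, "rb") as heap:
            heap.seek(offset)
            return heap.read(length)


if __name__ == "__main__":
    store = HeapStore(os.path.join(tempfile.mkdtemp(), "heap"))
    mid = store.add("brad@example.org", "test", b"Hello, world.\n")
    print(store.body(mid))
```

A real store would also need locking, expiry, and crash recovery, none of which this sketch attempts.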


 I think using a standard sql database for doing mail operations is
 asking for trouble.  Standard databases don't know how to parse
 rfc822/2822 headers and that means that you've got to either write a
 whole lot of stored procedures in a clunky query language (or
 java!?!?!) and then maintain it, or you've got to do it all in the
 imap/pop3/whatever server which means a whole lot of yammering traffic
 between the database and the I/P/W server all the time, which == slow.

You don't ask the database to understand or parse RFC2822 headers or messages. That's up to your application. You just store the meta-data using the formats known to the database, and the message bodies according to the methods above.
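
As a rough illustration of keeping the parsing in the application, the sketch below uses Python's standard email package to pull a few header fields out as plain strings, which any database can store in ordinary text columns. The function name split_message and the choice of fields are assumptions made for the example; the body is split off untouched so it can go to the separate message store.

```python
import re
from email import message_from_bytes
from email.utils import parseaddr


def split_message(raw):
    """Return (metadata dict, body bytes) for one raw RFC 2822 message."""
    msg = message_from_bytes(raw)
    # The application, not the database, interprets the headers.
    metadata = {
        "sender": parseaddr(msg.get("From", ""))[1],
        "subject": msg.get("Subject", ""),
        "message_id": msg.get("Message-ID", ""),
    }
    # Everything after the first blank line is the body, kept byte-for-byte.
    parts = re.split(rb"\r?\n\r?\n", raw, maxsplit=1)
    body = parts[1] if len(parts) == 2 else b""
    return metadata, body


if __name__ == "__main__":
    raw = (
        b"From: Brad <brad@example.org>\r\n"
        b"Subject: test\r\n"
        b"Message-ID: <1@example.org>\r\n"
        b"\r\n"
        b"Body text.\r\n"
    )
    meta, body = split_message(raw)
    print(meta)
    print(body)
```

The database only ever sees simple strings and integers, so nothing in it needs to know what an RFC 2822 header is.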


--
Brad Knowles, <[EMAIL PROTECTED]>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
    -Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI++++$ P+>++ L+ !E-(---) W+++(--) N+
!w--- O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+(++++) DI+(++++) D+(++) G+(++++) e++>++++ h--- r---(+++)* z(+++)

_______________________________________________
Mailman-Developers mailing list
[EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-developers
