I. Which mbox index / mail summary file format to use?
Import into Berkeley DB hash tables. They are fast, easy to use, well supported by many languages, and robust. The data can easily be extracted if necessary, and the tables themselves can easily be reconstructed if necessary.
Failing that, use a mailbox-directory format.
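To make the hash-table idea concrete, here is a rough sketch of what such an import could look like. It is only an illustration under several assumptions: it uses Python's standard-library dbm.dumb module as a stand-in for a real Berkeley DB hash table, it keys the index on Message-ID, it stores a "start:length" byte range as the value, and it treats any line beginning with "From " as a message separator. The function names build_index and fetch are made up for this example.

```python
import dbm.dumb


def build_index(mbox_path, index_path):
    """Scan an mbox file and map each Message-ID to b"start:length".

    Assumes every line starting with b"From " begins a new message
    (i.e. body lines are already ">From "-quoted). Returns the number
    of messages indexed; messages without a Message-ID are skipped.
    """
    count = 0
    with open(mbox_path, "rb") as mbox, dbm.dumb.open(index_path, "n") as db:

        def store(msg_id, start, end):
            # Record the byte range of one complete message.
            nonlocal count
            if start is not None and msg_id is not None:
                db[msg_id] = b"%d:%d" % (start, end - start)
                count += 1

        start = None       # byte offset where the current message begins
        msg_id = None      # Message-ID of the current message, if seen
        in_headers = False
        offset = 0         # byte offset of the line being examined
        for line in mbox:
            if line.startswith(b"From "):
                store(msg_id, start, offset)   # close the previous message
                start, msg_id, in_headers = offset, None, True
            elif in_headers:
                if line in (b"\n", b"\r\n"):
                    in_headers = False         # blank line ends the headers
                elif line.lower().startswith(b"message-id:"):
                    msg_id = line.split(b":", 1)[1].strip()
            offset += len(line)
        store(msg_id, start, offset)           # close the last message
    return count


def fetch(mbox_path, index_path, msg_id):
    """Return the raw bytes of one message, located via the index."""
    with dbm.dumb.open(index_path, "r") as db:
        start, length = map(int, db[msg_id].split(b":"))
    with open(mbox_path, "rb") as mbox:
        mbox.seek(start)
        return mbox.read(length)
```

Because the index holds nothing but offsets derived from the mbox itself, throwing it away and calling build_index again is all that "reconstruction" would amount to in this sketch.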
II. Index / mail summary file performance and maintenance
Mozilla .msf files can be regenerated on the fly, but for a 100MB mailbox this already takes a few minutes (and Python-list's mailbox is 600MB+!). Assuming index file corruption is very rare, this should not be a real problem.
I would be willing to bet that Berkeley DB files could be regenerated even faster -- much faster.
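For what "regenerated on the fly" might mean in practice, here is a minimal sketch that rebuilds an index only when it is missing or older than the mailbox. It assumes the index lives in a single file and that comparing modification times is a good enough staleness check; neither assumption comes from Mozilla or Berkeley DB, and the names index_is_stale, ensure_index, and the rebuild callback are hypothetical.

```python
import os


def index_is_stale(mbox_path, index_path):
    """True if the index file is missing or older than the mailbox."""
    if not os.path.exists(index_path):
        return True
    return os.path.getmtime(index_path) < os.path.getmtime(mbox_path)


def ensure_index(mbox_path, index_path, rebuild):
    """Call rebuild(mbox_path, index_path) only when the index is stale.

    rebuild is a caller-supplied function (an assumption of this sketch)
    that regenerates the index from scratch. Returns True if it ran.
    """
    if index_is_stale(mbox_path, index_path):
        rebuild(mbox_path, index_path)
        return True
    return False
```

With a check like this, the cost of a full rebuild is paid only after the mailbox changes or the index is lost, which is the rare case the argument above relies on.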
-- Brad Knowles, <[EMAIL PROTECTED]>
"They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin, Historical Review of Pennsylvania.