for 250k messages, either solution is poor.  deleting the
first of 250k messages requires rewriting the whole mbox.
dealing with 250k directory entries can be painful, too,
as many fs keep directories as arrays.

if you have that many messages, you might want an index. ;-)

I've said it before, and I'll keep saying it until I'm proved wrong:

  nothing runs faster or scales better than the Cyrus IMAP mail store.

It uses an MH-style filesystem layout (one file per message) with a per-folder index + header cache. I have deployed mail servers with literally millions of user accounts using this layout, and it just works.

These days I don't find large directories to be a problem. A few months ago I did some performance tests on large directory operations (create, unlink, readdir) on Linux and FreeBSD. For both OSes, directory enumeration times were negligible for the tests I ran (in the vicinity of 200K entries). And with everyone caching directory entries these days, the only real difference between directory I/O and file I/O is the locking overhead for directory modifications. (This all assumes local disk. Throw in NFS and everything implodes.)
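A test of this kind can be sketched roughly as below. This is not the original test harness, only a minimal illustration that assumes a local temporary directory and empty files; the function name time_directory_ops and the keys of the result are made up for the example.

```python
import os
import tempfile
import time


def time_directory_ops(n):
    """Time create, readdir and unlink for n empty files in one directory."""
    timings = {}
    with tempfile.TemporaryDirectory() as d:
        # create: n empty files with arbitrary numeric names
        start = time.perf_counter()
        for i in range(n):
            with open(os.path.join(d, str(i)), "w"):
                pass
        timings["create"] = time.perf_counter() - start

        # readdir: enumerate every entry once
        start = time.perf_counter()
        with os.scandir(d) as entries:
            seen = sum(1 for _ in entries)
        timings["readdir"] = time.perf_counter() - start

        # unlink: remove the files one by one
        start = time.perf_counter()
        for i in range(n):
            os.unlink(os.path.join(d, str(i)))
        timings["unlink"] = time.perf_counter() - start

        # sanity check: the directory should be empty again
        timings["left"] = len(os.listdir(d))
    timings["entries"] = seen
    return timings
```

Calling time_directory_ops(200_000) gives numbers in the range discussed here. The results depend heavily on the filesystem and on whether the directory entries are already cached, so treat them as indicative only.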

But scalability aside, the real evilness in mbox is the quoting of '^From '.
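One commonly cited problem with this quoting, shown here as an illustration and not necessarily the exact complaint meant above, is that the classic "mboxo" convention is lossy: a body line that already starts with ">From " cannot be told apart from a quoted "From " line when reading the message back. The "mboxrd" variant avoids this by also quoting lines that already carry ">" prefixes. The function names below are hypothetical.

```python
import re


def quote_mboxo(body):
    # mboxo: prefix '>' only to body lines that start with 'From '
    return re.sub(r"^From ", ">From ", body, flags=re.M)


def unquote_mboxo(body):
    # strips one '>' from every '>From ' line, quoted by us or not
    return re.sub(r"^>From ", "From ", body, flags=re.M)


def quote_mboxrd(body):
    # mboxrd: also quote lines matching '>From ', '>>From ', ...
    return re.sub(r"^(>*From )", r">\1", body, flags=re.M)


def unquote_mboxrd(body):
    # strips exactly the one '>' that quote_mboxrd added
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.M)


original = "From here on\n>From a quoted reply\n"

# mboxo does not round-trip: the second line loses its '>'
mboxo_roundtrip = unquote_mboxo(quote_mboxo(original))

# mboxrd round-trips to the original body
mboxrd_roundtrip = unquote_mboxrd(quote_mboxrd(original))
```

With mboxo the stored message is silently altered; with mboxrd the body comes back intact, but only if every reader and writer of the mailbox agrees on the convention.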

--lyndon
