Jim wrote:
On Tue, Nov 16, 2021 at 11:41 (-0500), Kris Deugau wrote:

Jim wrote:
On Mon, Nov 15, 2021 at 12:25 (-0500), Wietse Venema wrote:

Instead, use Maildir format with one message per file,

I thought about that once, but I decided I have too many e-mail
messages for that.  (I don't want to run out of inodes, nor do I want to
make file accesses too slow because of the number of files in the
directory.)
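The inode worry can be checked before converting. The following is a minimal sketch, not a definitive tool: it asks the filesystem how many inodes are still free and compares that with the number of messages a conversion would turn into files. The function name inode_headroom and its arguments are made up for illustration.

```python
import os

def inode_headroom(path, new_files):
    """Return (free_inodes, enough) for the filesystem that holds `path`.

    `new_files` is the number of message files a maildir conversion
    would create (hypothetical input; count your own messages).
    """
    st = os.statvfs(path)
    free = st.f_favail  # inodes available to unprivileged users
    # Some filesystems allocate inodes dynamically and report a total of 0;
    # in that case there is no fixed inode limit to compare against.
    if st.f_files == 0:
        return free, True
    return free, free >= new_files

if __name__ == "__main__":
    free, enough = inode_headroom(os.path.expanduser("~"), 250_000)
    print(f"free inodes: {free}, room for 250000 more files: {enough}")
```

This is the same figure that "df -i" reports, so the shell command works just as well for a one-off check.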

I converted "local"[*] storage from mbox to maildir a number of
years ago - IIRC I was starting to see performance issues with mbox
in part due to the way I manage my mail and in part simply due to
the number of messages I keep around.

This account has ~13G of mail on my PC, with over 100K messages each
in two folders, several in the tens of thousands, and most dedicated
mailing list folders holding somewhere between about 5K and 8K
messages each.

Thanks for the specifics.

The only performance issues I have are:

a) something sucks in the IMAP protocol such that my mail client keeps
having to create a new connection and reauthenticate - it's not strictly a
timeout, because it doesn't happen on anything remotely resembling a
predictable schedule

At first glance I wouldn't see that related to mbox vs. maildir, but
I've been surprised before.

Hard to tell, since I converted to maildir long before I had this much mail sitting around. IIRC I was at ~20K messages in the biggest folders at the time. I converted more for convenience in doing "grep -r |xargs rm"-ish things - can't really do that with mbox folders.
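For anyone wanting to try the "grep -r | xargs rm" style of cleanup on a maildir, here is a minimal sketch, assuming the usual layout with cur/ and new/ subdirectories holding one message per file. The function name find_matching is made up for illustration, and the sketch only lists matches; deleting them is deliberately left to the caller.

```python
from pathlib import Path

def find_matching(maildir, needle):
    """Yield paths of messages under cur/ and new/ whose raw bytes contain `needle`.

    Like grep, this matches the raw file contents, so text inside
    base64 or quoted-printable encoded parts will not be found.
    """
    root = Path(maildir)
    for sub in ("cur", "new"):
        folder = root / sub
        if not folder.is_dir():
            continue
        for entry in sorted(folder.iterdir()):
            if entry.is_file() and needle in entry.read_bytes():
                yield entry

if __name__ == "__main__":
    import sys
    # Dry run: print matching message files, one per line.
    if len(sys.argv) == 3:
        for path in find_matching(sys.argv[1], sys.argv[2].encode()):
            print(path)
```

Doing the same against an mbox file would mean parsing the file and rewriting it without the matching messages, which is the inconvenience described above.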

I also have the same needs-to-log-in-again-for-no-good-reason issue using Thunderbird against a role account on a central mail platform with "many" - but quite a bit fewer - messages, so my money is definitely on some weird corner case in the IMAP protocol.


Local storage is ext4 on a SATA SSD, although I wouldn't expect a noticeable
performance difference if it were on a conventional hard drive.

I am surprised that accessing files in a directory with 100K entries
is not slow, since (according to what I read) ext4 stores entries in
an "almost linear" list, and thus to find a directory entry you might
have to chew through (on average) 50K entries.  Of course, file system
caching will speed things up immensely, assuming one has enough RAM
(given the other activity on the system) to keep the contents of those
maildirs (that is, the directory contents, not the contents of the
files) in RAM.
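Whether lookups really degrade at this size depends on how the filesystem indexes directories (ext4 has a dir_index feature, for example), so measuring is more reliable than estimating. The following is a rough sketch under stated assumptions: it creates a scratch directory of empty files and times name lookups with os.stat(). The function name time_lookups and its parameters are made up for illustration, and because the files were just created the directory will be in cache, so this measures the warm case only.

```python
import os
import random
import tempfile
import time

def time_lookups(n_files, n_lookups=1000, base_dir=None):
    """Create n_files empty files in a scratch directory and return the
    mean time in seconds of one os.stat() lookup by name.

    n_files must be at least 1. The scratch directory is removed afterwards.
    """
    with tempfile.TemporaryDirectory(dir=base_dir) as scratch:
        names = [f"msg{i:07d}" for i in range(n_files)]
        for name in names:
            with open(os.path.join(scratch, name), "w"):
                pass
        sample = [random.choice(names) for _ in range(n_lookups)]
        start = time.perf_counter()
        for name in sample:
            os.stat(os.path.join(scratch, name))
        elapsed = time.perf_counter() - start
    return elapsed / n_lookups

if __name__ == "__main__":
    for count in (1_000, 10_000, 100_000):
        print(count, "files:", time_lookups(count), "s per lookup")
```

If the per-lookup time stays roughly flat as the file count grows, the directory is not being searched linearly, at least while it is cached.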

That could well be at the root of some of my issues, but the whole-file rewrites needed for mbox would be worse IMO. Aside from whatever strange state Seamonkey gets itself into after running for several weeks I'm not seeing any other slowdowns. Dovecot seems to be quite happy to manage all that baggage - TBH some of Dovecot's indexing may be helping out there by avoiding having to re-read the filesystem's entire directory index very often.

I do also have 32G of physical RAM, and top reports 17G of that is in use for cache...

[*] Due to some legacy mail flow that would be painful to convert, I
pull mail with fetchmail, deliver locally with procmail (sorry),
then expose it to my mail client with a local Dovecot instance.

Again, thanks for your specifics.  Maybe I should give maildir a try
some time and see what happens.  (Or maybe I should just delete a bunch
of email and forget that I ever got it.)

I haven't used actual client-local mail folders for much in a LONG time; both Seamonkey and Thunderbird default to mbox-ish files IIRC (although TB at least has an option to use a maildir-ish format).

-kgd
