OT: maildir usage profiles (Re: feature request: improve vague/incorrect error message)

2021-11-17 Thread Kris Deugau

Jim wrote:

On Tue, Nov 16, 2021 at 11:41 (-0500), Kris Deugau wrote:


Jim wrote:

On Mon, Nov 15, 2021 at 12:25 (-0500), Wietse Venema wrote:



Instead, use Maildir format with one message per file,



I thought about that once, but I decided I have too many e-mail
messages for that.  (I don't want to run out of inodes, nor do I want to
make file accesses too slow because of the number of files in the
directory.)



I converted "local"[*] storage from mbox to maildir a number of
years ago - IIRC I was starting to see performance issues with mbox
in part due to the way I manage my mail and in part simply due to
the number of messages I keep around.



This account has ~13G of mail on my PC, with over 100K messages each
in two folders, several in the tens of thousands, and most dedicated
mailing list folders holding somewhere between about 5K and 8K
messages each.


Thanks for the specifics.


The only performance issues I have are:



a) something sucks in the IMAP protocol such that my mail client keeps
having to create a new connection and reauthenticate - it's not strictly a
timeout, because it's not on anything remotely resembling a predictable
timing


At first glance I wouldn't see that related to mbox vs. maildir, but
I've been surprised before.


Hard to tell, since I converted to maildir long before I had this much 
mail sitting around.  IIRC I was at ~20K messages in the biggest folders 
at the time.  I converted more for convenience in doing "grep -r |xargs 
rm"-ish things - can't really do that with mbox folders.


I also have the same needs-to-log-in-again-for-no-good-reason issue 
using Thunderbird against a role account on a central mail platform with 
"many" - but quite a bit fewer - messages, so my money is definitely on 
some weird corner case in the IMAP protocol.





Local storage is ext4 on a SATA SSD, although I wouldn't expect a noticeable
performance difference if it were on a conventional hard drive.


I am surprised that accessing files in a directory with 100K entries
is not slow, since (according to what I read) ext4 stores entries in
an "almost linear" list, and thus to find a directory entry you might
have to chew through (on average) 50K entries.  Of course, file system
caching will speed things up immensely, assuming one has enough RAM
(given the other activity on the system) to keep the contents of those
maildirs (that is, the directory contents, not the contents of the
files) in RAM.


That could well be at the root of some of my issues, but the whole-file 
rewrites needed for mbox would be worse IMO.  Aside from whatever 
strange state Seamonkey gets itself into after running for several weeks 
I'm not seeing any other slowdowns.  Dovecot seems to be quite happy to 
manage all that baggage - TBH some of Dovecot's indexing may be helping 
out there by avoiding having to re-read the filesystem's entire 
directory index very often.


I do also have 32G of physical RAM, and top reports 17G of that is in 
use for cache...



[*] Due to some legacy mail flow that would be painful to convert, I
pull mail with fetchmail, deliver locally with procmail (sorry),
then expose it to my mail client with a local Dovecot instance.


Again, thanks for your specifics.  Maybe I should give maildir a try
some time and see what happens.  (Or maybe I should just delete a bunch
of email and forget that I ever got it.)


I haven't used actual client-local mail folders for much in a LONG time;
both Seamonkey and Thunderbird default to mbox-ish files IIRC
(although TB at least has an option to use a maildir-ish format).


-kgd


Re: feature request: improve vague/incorrect error message

2021-11-16 Thread Matus UHLAR - fantomas

On 16.11.21 13:16, Jim wrote:

At first glance I wouldn't see that related to mbox vs. maildir, but
I've been surprised before.


if you only need to read the header of each file (maildir), it should be
faster than reading the whole file (mbox).

yes, IMAP sometimes needs to read the whole mail file for its structure, but
afaik not every time.


On Tue, Nov 16, 2021 at 11:41 (-0500), Kris Deugau wrote:

Local storage is ext4 on a SATA SSD, although I wouldn't expect a noticeable
performance difference if it were on a conventional hard drive.



I am surprised that accessing files in a directory with 100K entries
is not slow, since (according to what I read) ext4 stores entries in
an "almost linear" list, and thus to find a directory entry you might
have to chew through (on average) 50K entries.


the dir_index feature (hashed b-tree directory lookups instead of a linear
scan) has been in ext* for years.


[*] Due to some legacy mail flow that would be painful to convert, I
pull mail with fetchmail, deliver locally with procmail (sorry),


procmail/maildrop/dovecot-lda


Again, thanks for your specifics.  Maybe I should give maildir a try
some time and see what happens.  (Or maybe I should just delete a bunch
of email and forget that I ever got it.)

I see that I have lots of inodes on the file system where I keep my
email: although the file system is 48% full, I have used only 3% of
the inodes, so I'm in no danger of running out.  If I used another
160,000 for mail messages I'd still be less than 6% inodes used, so
that turns out not to be a concern for me.  (The last time I
considered doing this I don't think I had such a surplus of inodes.)


last time I checked, the average file size was ~13KB (I guess it's more
now), and the inode_ratio in my mke2fs.conf is 16k, so it should be enough.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Support bacteria - they're the only culture some people have.


Re: feature request: improve vague/incorrect error message

2021-11-16 Thread Jim
On Tue, Nov 16, 2021 at 11:41 (-0500), Kris Deugau wrote:

> Jim wrote:
>> On Mon, Nov 15, 2021 at 12:25 (-0500), Wietse Venema wrote:

>>> Instead, use Maildir format with one message per file,

>> I thought about that once, but I decided I have too many e-mail
>> messages for that.  (I don't want to run out of inodes, nor do I want to
>> make file accesses too slow because of the number of files in the
>> directory.)

> I converted "local"[*] storage from mbox to maildir a number of
> years ago - IIRC I was starting to see performance issues with mbox
> in part due to the way I manage my mail and in part simply due to
> the number of messages I keep around.

> This account has ~13G of mail on my PC, with over 100K messages each
> in two folders, several in the tens of thousands, and most dedicated
> mailing list folders holding somewhere between about 5K and 8K
> messages each.

Thanks for the specifics.

> The only performance issues I have are:

> a) something sucks in the IMAP protocol such that my mail client keeps
> having to create a new connection and reauthenticate - it's not strictly a
> timeout, because it's not on anything remotely resembling a predictable
> timing

At first glance I wouldn't see that related to mbox vs. maildir, but
I've been surprised before.

> Local storage is ext4 on a SATA SSD, although I wouldn't expect a noticeable
> performance difference if it were on a conventional hard drive.

I am surprised that accessing files in a directory with 100K entries
is not slow, since (according to what I read) ext4 stores entries in
an "almost linear" list, and thus to find a directory entry you might
have to chew through (on average) 50K entries.  Of course, file system
caching will speed things up immensely, assuming one has enough RAM
(given the other activity on the system) to keep the contents of those
maildirs (that is, the directory contents, not the contents of the
files) in RAM.

> [*] Due to some legacy mail flow that would be painful to convert, I
> pull mail with fetchmail, deliver locally with procmail (sorry),
> then expose it to my mail client with a local Dovecot instance.

Again, thanks for your specifics.  Maybe I should give maildir a try
some time and see what happens.  (Or maybe I should just delete a bunch
of email and forget that I ever got it.)

I see that I have lots of inodes on the file system where I keep my
email: although the file system is 48% full, I have used only 3% of
the inodes, so I'm in no danger of running out.  If I used another
160,000 for mail messages I'd still be less than 6% inodes used, so
that turns out not to be a concern for me.  (The last time I
considered doing this I don't think I had such a surplus of inodes.)
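
For anyone wanting to run the same check, here is a minimal sketch of the
inode arithmetic above using Python's os.statvfs (POSIX only). The function
names and the choice of $HOME as the mail location are assumptions for
illustration, not something from this thread; "df -i" reports the same
counters from the shell.

```python
import os


def inode_usage(path):
    """Return (used, total) inode counts for the filesystem holding path."""
    st = os.statvfs(path)          # POSIX only
    total = st.f_files             # total inodes on the filesystem
    used = total - st.f_ffree      # f_ffree is the number of free inodes
    return used, total


def percent_after(used, total, extra):
    """Percentage of inodes in use if `extra` more files were created."""
    if total <= 0:
        # Some filesystems do not report a fixed inode count.
        raise ValueError("filesystem does not report a fixed inode count")
    return 100.0 * (used + extra) / total


if __name__ == "__main__":
    mail_root = os.path.expanduser("~")   # assumption: mail lives under $HOME
    used, total = inode_usage(mail_root)
    if total > 0:
        print(f"now: {percent_after(used, total, 0):.1f}% of inodes used")
        print(f"with 160,000 more files: "
              f"{percent_after(used, total, 160_000):.1f}%")
```

The 160,000 figure is the message count mentioned above; one Maildir
message costs one inode, so that is the number to add.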

Cheers.

Jim



Re: feature request: improve vague/incorrect error message

2021-11-16 Thread Kris Deugau

Jim wrote:

On Mon, Nov 15, 2021 at 12:25 (-0500), Wietse Venema wrote:

Finally, if you want to keep lots of mail around, don't keep
everything in one huge mailbox file.


I actually have a bunch of huge mailbox files ;-)
(Yeah, way too much email.)


Instead, use Maildir format with one message per file,


I thought about that once, but I decided I have too many e-mail
messages for that.  (I don't want to run out of inodes, nor do I want to
make file accesses too slow because of the number of files in the
directory.)


I converted "local"[*] storage from mbox to maildir a number of years 
ago - IIRC I was starting to see performance issues with mbox in part 
due to the way I manage my mail and in part simply due to the number of 
messages I keep around.
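
A conversion like the one described here can be sketched with Python's
standard mailbox module. This is only an illustration under assumptions:
the function name and paths are hypothetical, and it is not necessarily how
the conversion in this thread was actually done (Dovecot and other tools
ship their own converters).

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy every message from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)        # fail if mbox is missing
    dst = mailbox.Maildir(maildir_path, create=True)   # makes cur/ new/ tmp/
    count = 0
    try:
        for message in src:
            dst.add(message)    # one file per message, delivered into new/
            count += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return count
```

The source mbox is left untouched, so the result can be checked before the
old file is removed. Once each message is its own file, ordinary tools such
as grep and rm can work on individual messages.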


This account has ~13G of mail on my PC, with over 100K messages each in 
two folders, several in the tens of thousands, and most dedicated 
mailing list folders holding somewhere between about 5K and 8K messages 
each.


The only performance issues I have are:

a) something sucks in the IMAP protocol such that my mail client 
keeps having to create a new connection and reauthenticate - it's not 
strictly a timeout, because it's not on anything remotely resembling a 
predictable timing


b) Seamonkey has a subtle polling issue probably at least partly related 
to a) that (eventually) causes it to suck up gobs of RAM, get slow, and 
crash on a certain action.  This usually takes a couple of weeks.


Local storage is ext4 on a SATA SSD, although I wouldn't expect a 
noticeable performance difference if it were on a conventional hard drive.


-kgd

[*]  Due to some legacy mail flow that would be painful to convert, I 
pull mail with fetchmail, deliver locally with procmail (sorry), then 
expose it to my mail client with a local Dovecot instance.


Re: feature request: improve vague/incorrect error message

2021-11-15 Thread John Stoffel
> "Jim" == Jim   writes:

>> Instead, use Maildir format with one message per file,

Jim> I thought about that once, but I decided I have too many e-mail
Jim> messages for that.  (I don't want to run out of inodes, nor do I want to
Jim> make file accesses too slow because of the number of files in the
Jim> directory.)

If you keep your archive Maildirs on another disk, with a filesystem
explicitly set up for a large number of inodes, then you're all set.  I
too agree that Maildir (or archives split by month or year) is a
better way than just a single large monster mail file.

But to each his own...


Re: feature request: improve vague/incorrect error message

2021-11-15 Thread Jim
Wietse,

On Mon, Nov 15, 2021 at 12:25 (-0500), Wietse Venema wrote:

> Jim:

>> On Artix, the default is 5120.  (Aside: in 1985, that would have

> Postfix has limits on everything, so that the mail system will not
> get stuck. It's really a bad idea to disable them.

I agree that changing it to "unlimited" opens one up to some risks.

And I can't make a definitive argument for any particular default
value.  However, I still think that the limit of 5120 is far too
small in this day and age.

> TL;DR: If you want a better error message, stop using procmail.

Hmmm... Thanks for the summary.  On reflection, I guess I should have
realized that procmail was implicated.

> Procmail returns a status of 73, which is one of 15 status codes
> defined in /usr/include/sysexits.h. This file defines the interface
> between a mail server (such as Postfix) and an external program
> that delivers mail (such as procmail). The problem is not in
> procmail, it is in the sysexits.h interface.

> If you use Postfix itself for mailbox delivery, then the error
> message will be "File too large", one of the dozens of status codes
> defined in /usr/include/errno.h.

Indeed, that would have been more meaningful.

> On Linux, that file is the top of a forest of include files.

> Finally, if you want to keep lots of mail around, don't keep
> everything in one huge mailbox file.

I actually have a bunch of huge mailbox files ;-)
(Yeah, way too much email.)

> Instead, use Maildir format with one message per file,

I thought about that once, but I decided I have too many e-mail
messages for that.  (I don't want to run out of inodes, nor do I want to
make file accesses too slow because of the number of files in the
directory.)

> or rotate files frequently like I do.


On Mon, Nov 15, 2021 at 12:49 (-0500), Wietse Venema wrote:

> In fact, procmail can produce its own logging which may be more
> detailed. Returning fine detail to remote senders is not needed
> (hence "can't create ... file") if there is a local log that can
> record the underlying problem details.

Also thanks for that.  I guess I need to learn more about procmail
than I had really hoped to.


Thanks for the quick replies.

Jim


Re: feature request: improve vague/incorrect error message

2021-11-15 Thread Wietse Venema
Wietse Venema:
> Jim:
> > (This is really for postfix developers, but since I'm not allowed to
> > post this on the devel list, here it is here.)
> > 
> > Background: I recently moved from Ubuntu to Artix.  On Ubuntu, for
> > better or worse, mailbox_size_limit is 0, and I blissfully went around
> > using a large inbox.
> > 
> > On Artix, the default is 5120.  (Aside: in 1985, that would have
> 
> Postfix has limits on everything, so that the mail system will not
> get stuck. It's really a bad idea to disable them.
> 
> > Not realizing this is one of the parameters I would need to configure,
> > I fired up fetchmail and started downloading my email.
> > 
> > This limit caused me to lose a certain number of email messages with
> > the only hint being log messages containing this:
> > ... status=bounced (can't create user output file)
> 
> TL;DR: If you want a better error message, stop using procmail.
> 
> Procmail returns a status of 73, which is one of 15 status codes
> defined in /usr/include/sysexits.h. This file defines the interface
> between a mail server (such as Postfix) and an external program
> that delivers mail (such as procmail). The problem is not in
> procmail, it is in the sysexits.h interface.

In fact, procmail can produce its own logging which may be more
detailed. Returning fine detail to remote senders is not needed
(hence "can't create ... file") if there is a local log that can
record the underlying problem details.

Wietse

> If you use Postfix itself for mailbox delivery, then the error
> message will be "File too large", one of the dozens of status codes
> defined in /usr/include/errno.h. On Linux, that file is the top of
> a forest of include files.
> 
> Finally, if you want to keep lots of mail around, don't keep
> everything in one huge mailbox file. Instead, use Maildir format
> with one message per file, or rotate files frequently like I do.
> 
>   Wietse
> 


Re: feature request: improve vague/incorrect error message

2021-11-15 Thread Wietse Venema
Jim:
> (This is really for postfix developers, but since I'm not allowed to
> post this on the devel list, here it is here.)
> 
> Background: I recently moved from Ubuntu to Artix.  On Ubuntu, for
> better or worse, mailbox_size_limit is 0, and I blissfully went around
> using a large inbox.
> 
> On Artix, the default is 5120.  (Aside: in 1985, that would have

Postfix has limits on everything, so that the mail system will not
get stuck. It's really a bad idea to disable them.

> Not realizing this is one of the parameters I would need to configure,
> I fired up fetchmail and started downloading my email.
> 
> This limit caused me to lose a certain number of email messages with
> the only hint being log messages containing this:
> ... status=bounced (can't create user output file)

TL;DR: If you want a better error message, stop using procmail.

Procmail returns a status of 73, which is one of 15 status codes
defined in /usr/include/sysexits.h. This file defines the interface
between a mail server (such as Postfix) and an external program
that delivers mail (such as procmail). The problem is not in
procmail, it is in the sysexits.h interface.

If you use Postfix itself for mailbox delivery, then the error
message will be "File too large", one of the dozens of status codes
defined in /usr/include/errno.h. On Linux, that file is the top of
a forest of include files.
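
To see the difference between the two interfaces, the following sketch
looks up exit status 73 among the EX_* constants that Python's os module
exposes on Unix, and prints the system's message text for errno EFBIG. The
helper name sysexit_name is made up for this example; it only illustrates
how little detail a sysexits.h status carries compared with an errno value.

```python
import errno
import os


def sysexit_name(status):
    """Map a sysexits.h-style exit status to its EX_* name, or None."""
    for name in dir(os):
        if name.startswith("EX_") and getattr(os, name) == status:
            return name
    return None


if __name__ == "__main__":
    # What an external delivery program such as procmail can report.
    print(sysexit_name(73))
    # What a program that sees the failing system call can report.
    print(os.strerror(errno.EFBIG))
```

The exit status only says which of a handful of broad categories the
failure fell into, whereas the errno value names the specific reason.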

Finally, if you want to keep lots of mail around, don't keep
everything in one huge mailbox file. Instead, use Maildir format
with one message per file, or rotate files frequently like I do.

Wietse


feature request: improve vague/incorrect error message

2021-11-15 Thread Jim
(This is really for postfix developers, but since I'm not allowed to
post this on the devel list, here it is here.)

Background: I recently moved from Ubuntu to Artix.  On Ubuntu, for
better or worse, mailbox_size_limit is 0, and I blissfully went around
using a large inbox.

On Artix, the default is 5120.  (Aside: in 1985, that would have
been a reasonable limit.  But time has moved on, and mass storage
devices have come down in price by many orders of magnitude, and the
size of attachments shipped via email has correspondingly increased.
I don't know whether that is the Artix-imposed limit or Arch-imposed
limit, or whether that is in the postfix distribution, but that is
another question.)

Not realizing this is one of the parameters I would need to configure,
I fired up fetchmail and started downloading my email.

This limit caused me to lose a certain number of email messages with
the only hint being log messages containing this:
... status=bounced (can't create user output file)

Searching the web, I came across a number of messages which pointed to
various permission problems (file ownership, programs needing to be
setgid or setuid, ...), and looking down these paths wasn't useful,
since permissions were not the problem.


In summary: would the powers that be consider improving that error
message so that it contains information about *why* it couldn't create
the user output file?

(In fact, I'd argue the error message is in fact wrong, because
(a) it didn't need to *create* the output file, and
(b) it was able to write to the output file, it just didn't want to.)

Thanks for reading.

Jim