Mehmet Avcioglu:
> Wietse Venema:
> > Then I suspect that the code reaches the 10000 limit because
> > there are ~10000 files in the queue.
> ....
> > Below is a patch that should fix this.
> 
> Thank you for the patch. I had previously said that we were not
> running 'showq' or 'postqueue -p' frequently, but upon further
> investigation found out that the Prometheus exporter was in fact
> running 'showq' or accessing the showq socket every 15 seconds.
> Stopping the exporter fixed the issue.

Well no, it just stopped the requests that run the showq daemon. It
did nothing to fix the showq daemon.

> Would your suggestion be to:
> - not run 'showq' or 'postqueue -p' (and hence a metrics exporter like
> this) on a busy server like this at all
> - run it but run it less frequently on normal non-patched postfix
> - run it less frequently and also apply the patch you had sent

I'm not going to tell you how to run Postfix with a bug, but I can
give you an insight into what would happen in different scenarios.

I suspect that the problem is that showq logs false errors because
it does not properly reset the counter for reverse jumps.

The following examples assume that you have the default Postfix
settings of max_use=100 and max_idle=100s.

- If showq is run once, or less often than once every 100 seconds,
the showq process will handle one request and terminate, and it
could report false errors for queues with ~10000 or more messages.

- If showq is run repeatedly at intervals of less than 100 seconds,
then the same showq process will be reused up to 100 times, and
because the counter is not reset between requests, it could report
false errors for queues with ~100 or more messages (10000/100).
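
The arithmetic behind the two scenarios can be sketched as a toy
model. This is not Postfix source code: the function name is
hypothetical, and it assumes the counter grows by roughly one per
queued message and is never reset during the lifetime of a process.

```python
# Toy model only, not Postfix source code. All names are hypothetical.
LIMIT = 10000    # the limit mentioned earlier in this thread
MAX_USE = 100    # default max_use: requests served by one process


def first_false_error(queue_size, requests=MAX_USE, limit=LIMIT):
    """Return the 1-based request number at which a never-reset counter
    first reaches the limit, or None if that does not happen before the
    process has served all of its requests."""
    counter = 0  # assumed to persist across requests in one process
    for request in range(1, requests + 1):
        # Assumption: each listing adds about one count per queued message.
        counter += queue_size
        if counter >= limit:
            return request
    return None
```

Under these assumptions, first_false_error(10000, requests=1) returns
1 (a single run over ~10000 messages), first_false_error(100) returns
100 (the last reuse of the same process), and first_false_error(99)
returns None.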

If this is the problem then I'm surprised that it has not been
observed before in the 12 years since the code was written.

        Wietse
