Mehmet Avcioglu:
> Wietse Venema:
> > Then I suspect that the code reaches the 10000 limit because
> > there are ~10000 files in the queue.
> ....
> > Below is a patch that should fix this.
>
> Thank you for the patch. I had previously said that we were not
> running 'showq' or 'postqueue -p' frequently, but upon further
> investigation found out that the prometheus exporter was in fact
> running 'showq' or accessing via showq socket every 15 seconds.
> Stopping the exporter fixed the issue.

Well no, it just stopped requests to run showq daemon. It did
nothing to fix the showq daemon.

> Would your suggestion be to;
> - not run 'showq' or 'postqueue -p' (and hence a metrics exporter like
>   this) on a busy server like this at all
> - run it but run it less frequently on normal non-patched postfix
> - run it less frequently and also apply the patch you had sent

I'm not going to tell you how to run Postfix with a bug, but I can
give you an insight into what would happen in different scenarios.

I suspect that the problem is that showq logs false errors because
it does not properly reset the counter for reverse jumps.

The following examples assume that you have the default Postfix
settings of max_use=100 and max_idle=100s.

- If showq is run once, or once per more-than-100 seconds, the
  showq process will run once and terminate, and it could report
  false errors for queues with ~10000 or more messages.

- If showq is run repeatedly every less-than-100 seconds, then you
  will reuse the same showq process up to 100 times, and it could
  report false errors for queues with ~100 or more messages.

If this is the problem then I'm surprised that it has not been
observed before in the 12 years since the code was written.

	Wietse
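
The arithmetic behind the two scenarios above can be sketched as a small Python model. This is an illustration only, not Postfix code. It assumes that the suspected counter grows by roughly one per queued message per request and is never reset while the process lives, which is an inference from the numbers in this thread rather than something confirmed from the source. The function names are hypothetical, and the behavior at an interval of exactly 100 seconds is a guess.

```python
# Illustrative model only; not taken from the Postfix source.
# Assumption: a per-process counter accumulates across requests and a
# false error is logged once it reaches LIMIT.

LIMIT = 10000     # the limit discussed in this thread
MAX_USE = 100     # Postfix default max_use (requests per process)
MAX_IDLE = 100    # Postfix default max_idle, in seconds


def requests_per_process(interval_s, max_use=MAX_USE, max_idle=MAX_IDLE):
    """Requests one showq process serves before it terminates."""
    # Requests arriving less than max_idle seconds apart reuse the same
    # process, up to max_use times. Otherwise the idle process exits
    # and each request gets a fresh one.
    return max_use if interval_s < max_idle else 1


def queue_size_for_false_error(interval_s, limit=LIMIT):
    """Approximate queue size at which the stale counter hits the limit."""
    return limit // requests_per_process(interval_s)


print(queue_size_for_false_error(15))    # exporter polling every 15 s
print(queue_size_for_false_error(300))   # one request every 5 minutes
```

With the defaults this prints 100 for the 15-second exporter and 10000 for the infrequent case, matching the two thresholds given above.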