Re: Monitoring amount of smtpd processes

2018-10-24 Thread Ralf Hildebrandt
> max_idle was the option I was looking for. Thank you.
> 
> I always grepped for something like timeout/daemon/time and I never
> found max_idle. :-)

Lowered here as well...

-- 
[*] sys4 AG

https://sys4.de, +49 (89) 30 90 46 64
Schleißheimer Straße 26/MG, 80333 München
   
Registered office: München, Amtsgericht München: HRB 199263
Executive Board: Patrick Ben Koetter, Marc Schiffbauer
Chairman of the Supervisory Board: Florian Kirstein


Re: Monitoring amount of smtpd processes

2018-10-24 Thread Ralf Hildebrandt
> It would also be great if Postfix did something like this, showing some
> information about the connection:
> 
> smtpd [unused/virgin]
> or
> smtpd [, , , ]
> 
> Could be great for analysis and to get a quick overview about what's
> going on on busy servers.

That's a nice idea on systems where this kind of change is possible!



Re: Monitoring amount of smtpd processes

2018-10-21 Thread Viktor Dukhovni



> On Oct 21, 2018, at 5:14 PM, Peer Heinlein wrote:
> 
> If a client connects to smtpd and then drops the connection because
> the server requires STARTTLS or AUTH, we are left with those idle smtpd
> processes, which make the server look busy when it isn't.
> 
> If there's really a long peak then the server IS busy and I WANT to have
> an alarm.

You could look for "smtpd" processes with "-o stress=yes" on their
command line.  These are spawned by master(8) once the process limit has
been hit.

http://www.postfix.org/STRESS_README.html#adapt
http://www.postfix.org/STRESS_README.html#feature

The document does not mention one detail you may care to know:

/* 
 * When all servers for a public internet service are busy, we start
 * creating server processes with "-o stress=yes" on the command
 * line, and keep creating such processes until the process count is
 * below the limit for at least 1000 seconds. [...]
 */

So it takes 1000 seconds (~17 minutes) without hitting the limit before the
stress setting is "relaxed".
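
As a rough illustration of the check described above, here is a minimal Python sketch that counts smtpd processes and how many of them carry "-o stress=yes". The function names are made up for this example, and it assumes that `ps -A -o args=` prints one command line per process with "smtpd" (optionally with a path) as the first word; verify that against your own `ps` output before relying on it.

```python
import subprocess


def count_smtpd(ps_lines):
    """Return (total, stressed) smtpd counts from a list of command lines."""
    total = stressed = 0
    for line in ps_lines:
        argv = line.split()
        # Skip empty lines and anything that is not an smtpd process.
        if not argv or argv[0].rsplit("/", 1)[-1] != "smtpd":
            continue
        total += 1
        # Look for the adjacent argument pair "-o" "stress=yes".
        if any(a == "-o" and b == "stress=yes" for a, b in zip(argv, argv[1:])):
            stressed += 1
    return total, stressed


def read_ps_lines():
    """Hypothetical helper: fetch all command lines from ps(1)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()
```

A monitoring check could call `count_smtpd(read_ps_lines())` and alarm on a non-zero second value, keeping in mind that stressed processes keep being created for about 1000 seconds after the limit was last hit.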

On a modern server you can reasonably run around 1000 smtpd(8) processes,
and postscreen(8) should help to keep the typical process count lower than
it would be otherwise.

-- 
Viktor.



Re: Monitoring amount of smtpd processes

2018-10-21 Thread Peer Heinlein
On 20.10.2018 at 19:06, Wietse Venema wrote:

Hi,

>> If a client disconnects very early, the smtpd is still "unused" and
>> remains in server memory, waiting for the next connection.
> 
> The Postfix behavior has nothing to do with the duration of an SMTP
> session. It is determined by the max_idle setting in main.cf.

max_idle was the option I was looking for. Thank you.

I always grepped for something like timeout/daemon/time and I never
found max_idle. :-)
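
For monitoring purposes it may help to turn the max_idle value into plain seconds, for example to ignore a high process count until it has lasted longer than max_idle. The following is only a sketch: `parse_postfix_time` is a made-up helper, and it assumes the value uses the usual Postfix time suffixes (s, m, h, d, w) with seconds as the default unit. The value itself could be read with `postconf -h max_idle`.

```python
import re

# Postfix time suffixes: seconds, minutes, hours, days, weeks.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}


def parse_postfix_time(value, default_unit="s"):
    """Convert a Postfix time value such as "100s" or "5m" to seconds."""
    match = re.fullmatch(r"(\d+)([smhdw]?)", value.strip())
    if match is None:
        raise ValueError(f"not a Postfix time value: {value!r}")
    number, unit = match.groups()
    # A bare number falls back to the default unit (assumed to be seconds).
    return int(number) * _UNITS[unit or default_unit]
```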

> You would see the same with a sustained peak of one minute long.
> It does not depend on the length of SMTP sessions.

Yes and no.

If a client connects to smtpd and then drops the connection because
the server requires STARTTLS or AUTH, we are left with those idle smtpd
processes, which make the server look busy when it isn't.

If there's really a long peak then the server IS busy and I WANT to have
an alarm.

>> In that situations we're seeing false positives in our monitoring.
> Please fix your monitoring! 

Yes, I'm doing that -- that's why I'm asking for help (thanks for max_idle)
or for some additional changes that would enable better monitoring.


> Conclusion: prctl(PR_SET_NAME) is safe to use. I would not
> distribute Postfix's own version of the stinking pile of garbage
> that mucks directly with argv[].

:-)

I don't understand much about the differences between the various ways
to implement that. I'm not a coder, just an admin.

I only know that this works really well with Dovecot for me,
and it's very helpful for getting a quick overview of what's going on and
who is eating up your resources.

So if this could be implemented some day... I'd appreciate that.

Peer




Re: Monitoring amount of smtpd processes

2018-10-21 Thread Jan P. Kessler




>> we're monitoring the number of active smtpd processes to make sure that
>> we do not reach the max-proc limit from master.cf.
>
> The number I found most useful to indicate something was going wrong
> is the number of messages in the queue.  For the servers I manage,
> normally that number would be single digit, maybe get to two digits on
> occasion.


The topic here is the number of smtpD processes (which serve *incoming* 
smtp connections). When the number is set too low, you won't get the 
messages in your queue.


To put it plainly: unless you're able to monitor the queues of all
systems that want to send email to you, this is not an option for solving
the described problem. If you are able to do this, I'd be very interested
in that code ;)


Cheers, Jan



Re: Monitoring amount of smtpd processes

2018-10-21 Thread Shawn Heisey

On 10/20/2018 7:24 AM, Peer Heinlein wrote:

> we're monitoring the number of active smtpd processes to make sure that
> we do not reach the max-proc limit from master.cf.
>
> If a client disconnects very early, the smtpd is still "unused" and
> remains in server memory, waiting for the next connection.
>
> If a server was flooded with a short peak of new connections, a server
> could have $process_limit instances remaining ready-to-run in memory.
>
> In those situations we're seeing false positives in our monitoring.


The number I found most useful to indicate something was going wrong is 
the number of messages in the queue.  For the servers I manage, normally 
that number would be single digit, maybe get to two digits on occasion.


When something gets broken, the number of messages in the queue tends to 
balloon.  There are two primary causes I've seen for a large queue:  1) 
A particularly massive email storm, either spam or internally generated 
messages.  2) Delivery problems. There are lots of things that can cause 
delivery problems.  The most common problem I ran into was one of the 
webservers deciding that it needed to send thousands of messages.  
Waiting for those to clear out on their own so normal mail can make it 
through could take DAYS.


I would typically get notified about a problem with email after an hour 
or two where no messages were getting through, which is why I eventually 
added a monitor for the queue size, so I could know about the problem 
BEFORE it was noticed by high-profile people at the company.  With that, 
I could fix the problem quickly and find the right developer to chew out 
for sending thousands of messages.


For a particularly busy server, you probably would want to set the queue 
size alarm threshold at a fairly large number (at least 1000), but for 
one that's not very busy, more than about 100 is probably enough of a 
reason to investigate and see if there's a problem.  Calculating the 
total size of the message queue would be as simple as looking at the 
contents of some of the directories in /var/spool/postfix.  You could 
potentially run the 'mailq' command and parse its output, but I have 
seen that take a REALLY long time to finish, so counting files in the 
spool directories is probably better.
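
Counting files in the spool directories could look roughly like the Python sketch below. The choice of queue directories (incoming, active, deferred), the function names and the default threshold of 100 are assumptions made for this example, not Postfix settings. The check must run as a user that can read the spool, and it walks subdirectories because some queues may be hashed into them.

```python
import os

# Assumption: these are the queues worth alarming on.
QUEUE_DIRS = ("incoming", "active", "deferred")


def count_queue_files(spool="/var/spool/postfix", queues=QUEUE_DIRS):
    """Count files under each queue directory (one queue file per message)."""
    counts = {}
    for queue in queues:
        n = 0
        # os.walk yields nothing for a missing directory, so the count stays 0.
        for _root, _dirs, files in os.walk(os.path.join(spool, queue)):
            n += len(files)
        counts[queue] = n
    return counts


def queue_alarm(counts, threshold=100):
    """Return True when the total number of queued messages exceeds threshold."""
    return sum(counts.values()) > threshold
```

On a busy server the threshold would be raised, as described above, for example `queue_alarm(count_queue_files(), threshold=1000)`.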


Thanks,
Shawn



Re: Monitoring amount of smtpd processes

2018-10-20 Thread Stefan Bauer
We simply monitor established TCP sessions to the smtpd port. If the client
goes away, the TCP session does as well:

lsof -i tcp:25 | grep ESTABLISHED | wc -l
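
One caveat, as far as I can tell: `lsof -i tcp:25` matches port 25 on either end of a connection, so outbound deliveries to remote port 25 would be counted as well. A possible refinement, sketched in Python, is to count only ESTABLISHED sessions owned by smtpd whose local port is 25. The function name is made up, and the parsing assumes numeric output (`lsof -n -P -i tcp:25`) with the usual column layout, where the last two fields are the connection name and the state.

```python
def count_inbound_established(lsof_lines, port="25", command="smtpd"):
    """Count ESTABLISHED sessions of `command` whose local port is `port`."""
    count = 0
    for line in lsof_lines:
        fields = line.split()
        # Keep only lines owned by `command` that end in "(ESTABLISHED)".
        if len(fields) < 2 or fields[0] != command or fields[-1] != "(ESTABLISHED)":
            continue
        # The name field looks like 192.0.2.1:25->198.51.100.7:51234.
        local, arrow, _remote = fields[-2].partition("->")
        if arrow and local.rsplit(":", 1)[-1] == port:
            count += 1
    return count
```

The input lines would come from something like `subprocess.run(["lsof", "-n", "-P", "-i", "tcp:25"], capture_output=True, text=True).stdout.splitlines()`.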



Monitoring amount of smtpd processes

2018-10-20 Thread Peer Heinlein




Hi,

we're monitoring the number of active smtpd processes to make sure that
we do not reach the max-proc limit from master.cf.

If a client disconnects very early, the smtpd is still "unused" and
remains in server memory, waiting for the next connection.

If a server was flooded with a short peak of new connections, it
could have $process_limit instances remaining ready-to-run in memory.

In those situations we're seeing false positives in our monitoring.

I can't see a way to detect those "waiting" smtpd processes and count them
differently in the process list. AFAIK there's no way (unless we
count the number of open connections with lsof/netstat).

What about the idea that Postfix flags those unused processes by
renaming them in the output of "ps"?

Dovecot has a "verbose proctitle" option where pop3/imap processes are
renamed in the process list so that they're showing the logged in user,
the state of TLS, the client IP and the last IMAP-command.

It would also be great if Postfix did something like this, showing some
information about the connection:

smtpd [unused/virgin]
or
smtpd [, , , ]

Could be great for analysis and to get a quick overview about what's
going on on busy servers.

Peer


-- 
Heinlein Support GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-42
Fax: 030 / 405051-19

Mandatory disclosures per §35a GmbHG: HRB 93818 B / Amtsgericht
Berlin-Charlottenburg,
Managing Director: Peer Heinlein -- Registered office: Berlin