Re: Postfix using all CPU after nightly mail submission
Zach Sheppard: > I did as you suggested and opened a console on the VMWare host, did a tail > of the mail log, and it sent mail for a good 5-10 minutes before finally > becoming unresponsive. I tried to Ctrl-C out of tail, nothing. I've done Thanks for confirming that the ENTIRE SYSTEM LOCKS UP. Evidently something cannot handle the load of system calls that Postfix makes. That should be enough cause to look for a better system. I do not expect that the problem is with VMware itself, but I cannot exclude that it is caused by some conservative configuration. Wietse
Re: Postfix using all CPU after nightly mail submission
Was seeing similar behaviour under high tcp load in other applications (unrelated to postfix). It’s possible you’re running into this bug. Try to Set vmxnet3.rev.30=FALSE under the VMs properties and reboot More info here:191201 – Randomly freezes due to VMXNET3 | | | 191201 – Randomly freezes due to VMXNET3 | | | Sent from Yahoo Mail for iPhone On Wednesday, February 21, 2018, 8:30 AM, Zach Sheppardwrote: Hi Wietse, I limited my postfix installation to default_process_limit 5, 4, 3, 2, and even 1, and still saw the same effects. I am thinking it might be either my opendkim milter (which applies the DKIM signature for each mail) or SASL as these are the only other processes on the server. Are you aware of any issues with either related to I/O? I have not seen any configuration settings for opendkim to do any performance throttling. I am still dumbfounded how this continues to occur. I am not sending mail in large quantity - maybe 7,000-8,000 total - just in a short amount of time. The I/O shouldn't be THAT high... at least not to leave the server unresponsive... the mail client connects to my server every evening (around midnight) and sends mails in a burst fashion within an hour or so. I did as you suggested and opened a console on the VMWare host, did a tail of the mail log, and it sent mail for a good 5-10 minutes before finally becoming unresponsive. I tried to Ctrl-C out of tail, nothing. I've done the same monitoring with top and still see no culprit for the sudden halt. I check syslog and other logs on the server and see no crashes or panics. Any other ideas what might be causing this? Further debugging I can do? Thanks On Fri, Jan 19, 2018 at 2:26 PM, Wietse Venema wrote: Zach Sheppard: > Wietse: > > I have not made any changes to rsyslog.conf. All it does it redirect all > mail log messages to one log in /var/log/mail which I rotate with a cron > script nightly. However, I do agree that it really could be the only other > process that could be hanging the server. > > I'm not able to determine what program is consuming the CPU because I can't > login to the console when this occurs. The only way I can recover the > machine is by forcibly powering off. I suspect that heavy I/O from Postfix and syslog is too much for your VM. To diagnose the problem, run screen(1) on a stable machine, and then open a login session into the VM while it still responsive. Then come back to that screen session when things go bad. You're likely to find that when the VM is very slow, all time is spent in the VM's kernel, and the host's VMM. Note that VMs, while fine for CPU-bound jobs, can introduce serious CPU overhead for things that do massive amounts of I/O like Postfix plus syslog. If you can't get a better VM, you can reduce the impact from a 'large' mailing by reducing the number of concurrent Postfix SMTP server and client processes. # postconf default_process_limit=10 # postfix reload Wietse This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
RE: Postfix using all CPU after nightly mail submission
Hi Zach, my postfix box is on Vmware too….Did the folks that manage VmWare see any oddity? No alarms ? No performance spikes ? Also I use Zabbix for monitoring so I get email warnings when certain thresholds are exceeded. Maybe the backend VMware storage is having an issue ? shooting from the hip here… Good Luck. -ALF P.S. not sure if list allows including pics from Win10 snipping tool. ;-) It doesn’t http://tinypic.com/r/25hhh68/9 -ANGELO FAZZINA ITS Service Manager: Spam and Virus Prevention Mass Mailing G Suite/Gmail ang...@uconn.edu University of Connecticut, ITS, SSG, Server Systems 860-486-9075 From: owner-postfix-us...@postfix.org [mailto:owner-postfix-us...@postfix.org] On Behalf Of Zach Sheppard Sent: Wednesday, February 21, 2018 8:31 AM To: Postfix users <postfix-users@postfix.org> Subject: Re: Postfix using all CPU after nightly mail submission Hi Wietse, I limited my postfix installation to default_process_limit 5, 4, 3, 2, and even 1, and still saw the same effects. I am thinking it might be either my opendkim milter (which applies the DKIM signature for each mail) or SASL as these are the only other processes on the server. Are you aware of any issues with either related to I/O? I have not seen any configuration settings for opendkim to do any performance throttling. I am still dumbfounded how this continues to occur. I am not sending mail in large quantity - maybe 7,000-8,000 total - just in a short amount of time. The I/O shouldn't be THAT high... at least not to leave the server unresponsive... the mail client connects to my server every evening (around midnight) and sends mails in a burst fashion within an hour or so. I did as you suggested and opened a console on the VMWare host, did a tail of the mail log, and it sent mail for a good 5-10 minutes before finally becoming unresponsive. I tried to Ctrl-C out of tail, nothing. I've done the same monitoring with top and still see no culprit for the sudden halt. I check syslog and other logs on the server and see no crashes or panics. Any other ideas what might be causing this? Further debugging I can do? Thanks On Fri, Jan 19, 2018 at 2:26 PM, Wietse Venema <wie...@porcupine.org<mailto:wie...@porcupine.org>> wrote: Zach Sheppard: > Wietse: > > I have not made any changes to rsyslog.conf. All it does it redirect all > mail log messages to one log in /var/log/mail which I rotate with a cron > script nightly. However, I do agree that it really could be the only other > process that could be hanging the server. > > I'm not able to determine what program is consuming the CPU because I can't > login to the console when this occurs. The only way I can recover the > machine is by forcibly powering off. I suspect that heavy I/O from Postfix and syslog is too much for your VM. To diagnose the problem, run screen(1) on a stable machine, and then open a login session into the VM while it still responsive. Then come back to that screen session when things go bad. You're likely to find that when the VM is very slow, all time is spent in the VM's kernel, and the host's VMM. Note that VMs, while fine for CPU-bound jobs, can introduce serious CPU overhead for things that do massive amounts of I/O like Postfix plus syslog. If you can't get a better VM, you can reduce the impact from a 'large' mailing by reducing the number of concurrent Postfix SMTP server and client processes. # postconf default_process_limit=10 # postfix reload Wietse This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
Re: Postfix using all CPU after nightly mail submission
Hi Wietse, I limited my postfix installation to default_process_limit 5, 4, 3, 2, and even 1, and still saw the same effects. I am thinking it might be either my opendkim milter (which applies the DKIM signature for each mail) or SASL as these are the only other processes on the server. Are you aware of any issues with either related to I/O? I have not seen any configuration settings for opendkim to do any performance throttling. I am still dumbfounded how this continues to occur. I am not sending mail in large quantity - maybe 7,000-8,000 total - just in a short amount of time. The I/O shouldn't be THAT high... at least not to leave the server unresponsive... the mail client connects to my server every evening (around midnight) and sends mails in a burst fashion within an hour or so. I did as you suggested and opened a console on the VMWare host, did a tail of the mail log, and it sent mail for a good 5-10 minutes before finally becoming unresponsive. I tried to Ctrl-C out of tail, nothing. I've done the same monitoring with top and still see no culprit for the sudden halt. I check syslog and other logs on the server and see no crashes or panics. Any other ideas what might be causing this? Further debugging I can do? Thanks On Fri, Jan 19, 2018 at 2:26 PM, Wietse Venemawrote: > Zach Sheppard: > > Wietse: > > > > I have not made any changes to rsyslog.conf. All it does it redirect all > > mail log messages to one log in /var/log/mail which I rotate with a cron > > script nightly. However, I do agree that it really could be the only > other > > process that could be hanging the server. > > > > I'm not able to determine what program is consuming the CPU because I > can't > > login to the console when this occurs. The only way I can recover the > > machine is by forcibly powering off. > > I suspect that heavy I/O from Postfix and syslog is too much for > your VM. > > To diagnose the problem, run screen(1) on a stable machine, and > then open a login session into the VM while it still responsive. > Then come back to that screen session when things go bad. You're > likely to find that when the VM is very slow, all time is spent in > the VM's kernel, and the host's VMM. > > Note that VMs, while fine for CPU-bound jobs, can introduce serious > CPU overhead for things that do massive amounts of I/O like Postfix > plus syslog. > > If you can't get a better VM, you can reduce the impact from a > 'large' mailing by reducing the number of concurrent Postfix SMTP > server and client processes. > > # postconf default_process_limit=10 > # postfix reload > > Wietse > -- This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
Re: Postfix using all CPU after nightly mail submission
Hi Viktor: I configured that internally generated mail to be filtered through our DKIM milter. This was to allow deliveries of error messages from the server to my inbox. However, I see above how this is not ideal since that can just be delivered to the root user inbox. I have turned off that setting. After making several tweaks from suggestions in this thread I saw much better performance last night. On Fri, Jan 19, 2018 at 2:25 PM, Viktor Dukhovniwrote: > On Fri, Jan 19, 2018 at 02:04:05PM -0500, Zach Sheppard wrote: > > > I have not made any changes to rsyslog.conf. All it does it redirect all > > mail log messages to one log in /var/log/mail which I rotate with a cron > > script nightly. However, I do agree that it really could be the only > other > > process that could be hanging the server. > > Your milter could be another culprit, and you've unwisely configured > filtering of internally generated mail, including notices about > problems. So the first thing is to undo the filtering of internally > generated mail. See also my other comments about your config. > > As for syslog, for MTAs I prefer syslog-ng, to rsyslog, and make > sure that /dev/log is configured as a "dgram" not "stream" socket. > Also make sure that log writes are not synchronous. > > -- > Viktor. > -- This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
Re: Postfix using all CPU after nightly mail submission
> > Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: 8EF0980973: > > from=<...@oconee.k12.sc.us>, size=2408, nrcpt=1 (queue > > active)^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ > > Jan 15 09:31:40 mailrelay opendkim[668]: OpenDKIM Filter v2.11.0 starting > > (args: -x /etc/opendkim.conf) > > What you see as "^@" is how ASCII NUL (the zero byte) is displayed > by "more", "less", "vi", ... It appears that the log file has either > lots of NULs written to it, or perhaps has a "hole" as a result of > truncation of the log file while it was still being written by the > syslog daemon. Perhaps incorrect log rotation... > > As for high CPU, that's a bit hard to explain without further > information, which is difficult to obtain with a truncated log. > Postfix does not normally bring systems to their knees, its > resource limits are specifically designed to avoid that. I've seen similar issues. Machines executing one job - which either goes hogging the CPU or using all memory, then either the OOM killer goes mad or the whole machine panics, leaving the filesystem in an inconsistent state. Often leaving zeros in the log. -- [*] sys4 AG https://sys4.de, +49 (89) 30 90 46 64 Schleißheimer Straße 26/MG, 80333 München Sitz der Gesellschaft: München, Amtsgericht München: HRB 199263 Vorstand: Patrick Ben Koetter, Marc Schiffbauer Aufsichtsratsvorsitzender: Florian Kirstein
Re: Postfix using all CPU after nightly mail submission
Zach Sheppard: > Wietse: > > I have not made any changes to rsyslog.conf. All it does it redirect all > mail log messages to one log in /var/log/mail which I rotate with a cron > script nightly. However, I do agree that it really could be the only other > process that could be hanging the server. > > I'm not able to determine what program is consuming the CPU because I can't > login to the console when this occurs. The only way I can recover the > machine is by forcibly powering off. I suspect that heavy I/O from Postfix and syslog is too much for your VM. To diagnose the problem, run screen(1) on a stable machine, and then open a login session into the VM while it still responsive. Then come back to that screen session when things go bad. You're likely to find that when the VM is very slow, all time is spent in the VM's kernel, and the host's VMM. Note that VMs, while fine for CPU-bound jobs, can introduce serious CPU overhead for things that do massive amounts of I/O like Postfix plus syslog. If you can't get a better VM, you can reduce the impact from a 'large' mailing by reducing the number of concurrent Postfix SMTP server and client processes. # postconf default_process_limit=10 # postfix reload Wietse
Re: Postfix using all CPU after nightly mail submission
On Fri, Jan 19, 2018 at 02:04:05PM -0500, Zach Sheppard wrote: > I have not made any changes to rsyslog.conf. All it does it redirect all > mail log messages to one log in /var/log/mail which I rotate with a cron > script nightly. However, I do agree that it really could be the only other > process that could be hanging the server. Your milter could be another culprit, and you've unwisely configured filtering of internally generated mail, including notices about problems. So the first thing is to undo the filtering of internally generated mail. See also my other comments about your config. As for syslog, for MTAs I prefer syslog-ng, to rsyslog, and make sure that /dev/log is configured as a "dgram" not "stream" socket. Also make sure that log writes are not synchronous. -- Viktor.
Re: Postfix using all CPU after nightly mail submission
Wietse: I have not made any changes to rsyslog.conf. All it does it redirect all mail log messages to one log in /var/log/mail which I rotate with a cron script nightly. However, I do agree that it really could be the only other process that could be hanging the server. I'm not able to determine what program is consuming the CPU because I can't login to the console when this occurs. The only way I can recover the machine is by forcibly powering off. # /etc/rsyslog.confConfiguration file for rsyslog. # # For more information see # /usr/share/doc/rsyslog-doc/html/rsyslog_conf.html # MODULES # module(load="imuxsock") # provides support for local system logging module(load="imklog") # provides kernel logging support #module(load="immark") # provides --MARK-- message capability # provides UDP syslog reception #module(load="imudp") #input(type="imudp" port="514") # provides TCP syslog reception #module(load="imtcp") #input(type="imtcp" port="514") ### GLOBAL DIRECTIVES ### # # Use traditional timestamp format. # To enable high precision timestamps, comment out the following line. # $ActionFileDefaultTemplate RSYSLOG_TraditionalFileFormat # # Set the default permissions for all log files. # $FileOwner root $FileGroup adm $FileCreateMode 0640 $DirCreateMode 0755 $Umask 0022 # # Where to place spool and state files # $WorkDirectory /var/spool/rsyslog # # Include all config files in /etc/rsyslog.d/ # $IncludeConfig /etc/rsyslog.d/*.conf ### RULES ### # # First some standard log files. Log by facility. # auth,authpriv.* /var/log/auth.log *.*;auth,authpriv.none -/var/log/syslog #cron.* /var/log/cron.log daemon.*-/var/log/daemon.log kern.* -/var/log/kern.log lpr.* -/var/log/lpr.log mail.* -/var/log/mail/mail.log user.* -/var/log/user.log # # Logging for the mail system. Split it up so that # it is easy to write scripts to parse these files. # #mail.info -/var/log/mail.info #mail.warn -/var/log/mail.warn #mail.err /var/log/mail.err # # Some "catch-all" log files. # *.=debug;\ auth,authpriv.none;\ news.none;mail.none -/var/log/debug *.=info;*.=notice;*.=warn;\ auth,authpriv.none;\ cron,daemon.none;\ mail,news.none -/var/log/messages # # Emergencies are sent to everybody logged in. # *.emerg :omusrmsg:* On Fri, Jan 19, 2018 at 1:20 PM, Wietse Venemawrote: > Zach Sheppard: > > However, whenever Postfix gets a large e-mail load it processes the > e-mails > > for around 30-45 minutes and then consistently uses around 70-80% of the > > CPU effectively locking up the entire system so much so that I can't even > > login and debug without either disabling the network card in VMWare or > > rebooting the system totally. The VM has 2 CPU cores and 4GB RAM > allocated > > to it which I thought was plenty for this application. > > Did you determine out WHAT PROGRAM is consuming CPU? As Postfix is > totally I/O limited, I suspect that there may be some other system > component that is messing up. > > For example, it is well known that one badly configured syslog > service can use up lots more CPU than all of Postfix combined. > > Wietse > -- This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.
Re: Postfix using all CPU after nightly mail submission
Zach Sheppard: > However, whenever Postfix gets a large e-mail load it processes the e-mails > for around 30-45 minutes and then consistently uses around 70-80% of the > CPU effectively locking up the entire system so much so that I can't even > login and debug without either disabling the network card in VMWare or > rebooting the system totally. The VM has 2 CPU cores and 4GB RAM allocated > to it which I thought was plenty for this application. Did you determine out WHAT PROGRAM is consuming CPU? As Postfix is totally I/O limited, I suspect that there may be some other system component that is messing up. For example, it is well known that one badly configured syslog service can use up lots more CPU than all of Postfix combined. Wietse
Re: Postfix using all CPU after nightly mail submission
On 19 January 2018 at 16:02, Viktor Dukhovniwrote: > > >> On Jan 19, 2018, at 10:58 AM, Dominic Raferd wrote: >> >>> The pipes to "sort" should not be needed. The output of "postconf" is >>> pre-sorted. >> >> yes I thought that - but without piping through sort I see: >> $ comm -1 -2 <(postconf -n) <(postconf -d) >> comm: file 2 is not in sorted order >> comm: file 1 is not in sorted order > > Perhaps your locale sorts "_" the output differently than the C locale. > Does the issue go away with: > > export LC_COLLATE=C Ah yes it does. On my system LC_COLLATE is normally undefined. This avoids the unwanted messages: LC_COLLATE=C comm -1 -2 <(postconf -n) <(postconf -d)
Re: Postfix using all CPU after nightly mail submission
> On Jan 19, 2018, at 10:58 AM, Dominic Raferdwrote: > >> The pipes to "sort" should not be needed. The output of "postconf" is >> pre-sorted. > > yes I thought that - but without piping through sort I see: > $ comm -1 -2 <(postconf -n) <(postconf -d) > comm: file 2 is not in sorted order > comm: file 1 is not in sorted order Perhaps your locale sorts "_" the output differently than the C locale. Does the issue go away with: export LC_COLLATE=C -- Viktor.
Re: Postfix using all CPU after nightly mail submission
On 19 January 2018 at 15:55, Viktor Dukhovniwrote: > > >> On Jan 19, 2018, at 10:46 AM, Dominic Raferd wrote: >> >> Here's a way to check for explicit settings in main.cf that are >> actually defaults and so could be removed (works under bash): >> >> comm -1 -2 <(postconf -n|sort) <(postconf -d|sort) > > The pipes to "sort" should not be needed. The output of "postconf" is > pre-sorted. yes I thought that - but without piping through sort I see: $ comm -1 -2 <(postconf -n) <(postconf -d) comm: file 2 is not in sorted order comm: file 1 is not in sorted order
Re: Postfix using all CPU after nightly mail submission
> On Jan 19, 2018, at 10:46 AM, Dominic Raferdwrote: > > Here's a way to check for explicit settings in main.cf that are > actually defaults and so could be removed (works under bash): > > comm -1 -2 <(postconf -n|sort) <(postconf -d|sort) The pipes to "sort" should not be needed. The output of "postconf" is pre-sorted. -- Viktor.
Re: Postfix using all CPU after nightly mail submission
On 19 January 2018 at 15:21, Viktor Dukhovniwrote: > > > > > default_destination_concurrency_limit = 50 > > This is the default, remove the setting. > > ... Here's a way to check for explicit settings in main.cf that are actually defaults and so could be removed (works under bash): comm -1 -2 <(postconf -n|sort) <(postconf -d|sort)
Re: Postfix using all CPU after nightly mail submission
> On Jan 19, 2018, at 9:55 AM, Zach Sheppard> wrote: > > bounce_queue_lifetime = 0 This is a bad idea. Don't do that. Allow transient delivery failures to be retried. > default_destination_concurrency_limit = 50 This is the default, remove the setting. > internal_mail_filter_classes = bounce, notify Leave this setting at its default value by removing it from main.cf. > notify_classes = bounce, resource, software Consider setting this empty, when sending bulk mail, your source of notices is the mail log, not per-message notices. > smtp_connect_timeout = 10s This is may be a bit short, but should be OK unless some destination is under sustained heavy load. > smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache You can/should set this empty, and leave session caching to the client via session tickets. > smtp_helo_timeout = 30s This too can be a bit short for overloaded peers. > smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache This is your client session cache, correctly configured, DO NOT also clear this when clearing the smtpd setting. > --master.cf-- > smtp inet n - y - - smtpd > pickupunix n - y 60 1 pickup > cleanup unix n - y - 0 cleanup > qmgr unix n - n 300 1 qmgr > tlsmgrunix - - y 1000? 1 tlsmgr > rewrite unix - - y - - trivial-rewrite > bounceunix - - y - 0 bounce > defer unix - - y - 0 bounce > trace unix - - y - 0 bounce > verifyunix - - y - 1 verify > flush unix n - y 1000? 0 flush > proxymap unix - - n - - proxymap > proxywrite unix - - n - 1 proxymap > smtp unix - - y - - smtp > relay unix - - y - - smtp > showq unix n - y - - showq > error unix - - y - - error > retry unix - - y - - error > discard unix - - y - - discard > local unix - n n - - local > virtual unix - n n - - virtual > lmtp unix - - y - - lmtp > anvil unix - - y - 1 anvil > scacheunix - - y - 1 scache > [...] The master.cf file looks fairly normal, but uses chroot for many services. This can make for fragile management of the system when the chroot jail is subtly misconfigured, but is not necessarily a problem (when done right). -- Viktor.
Re: Postfix using all CPU after nightly mail submission
> On Jan 19, 2018, at 8:52 AM, Zach Sheppard> wrote: > > Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: 8EF0980973: > from=<...@oconee.k12.sc.us>, size=2408, nrcpt=1 (queue > active)^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ > Jan 15 09:31:40 mailrelay opendkim[668]: OpenDKIM Filter v2.11.0 starting > (args: -x /etc/opendkim.conf) What you see as "^@" is how ASCII NUL (the zero byte) is displayed by "more", "less", "vi", ... It appears that the log file has either lots of NULs written to it, or perhaps has a "hole" as a result of truncation of the log file while it was still being written by the syslog daemon. Perhaps incorrect log rotation... As for high CPU, that's a bit hard to explain without further information, which is difficult to obtain with a truncated log. Postfix does not normally bring systems to their knees, its resource limits are specifically designed to avoid that. -- Viktor.
Re: Postfix using all CPU after nightly mail submission
Hi Bastian, I did read that debug readme and saw the following: >> If you can't use a test email address, please anonymize email addresses and host names consistently. Replace each letter by "A", each digit by "D" so that the helpers can still recognize syntactical errors. ...and is why I replaced the e-mail address and host IP addresses. The from address for this particular message was and is the same for all messages as this server only sends mail from one internal application. However, the to addresses are parents, staff, and other people I didn't want to identify. As for the statement that is tagged to all of our outgoing mail. I apologize for that being included. I wasn't sure whether the other debug readme recommendations were needed but I have the postfinger output below: ./postfinger postfinger - postfix configuration on Fri Jan 19 09:50:34 EST 2018 version: 1.30 Warning: postfinger output may show private configuration information, such as ip addresses and/or domain names which you do not want to show to the public. If this is the case it is your responsibility to modify the output to hide this private information. [Remove this warning with the --nowarn option.] --System Parameters-- mail_version = 3.1.6 hostname = uname = Linux 4.9.0-5-amd64 #1 SMP Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux --Packaging information-- looks like this postfix comes from deb package: postfix-3.1.6-0+deb9u1 --main.cf non-default parameters-- alias_maps = hash:/etc/postfix/aliases append_dot_mydomain = no biff = no bounce_notice_recipient = root@localhost bounce_queue_lifetime = 0 compatibility_level = 2 default_destination_concurrency_limit = 50 error_notice_recipient = root@localhost inet_protocols = ipv4 in_flow_delay = 2s internal_mail_filter_classes = bounce, notify masquerade_domains = $mydomain milter_default_action = reject milter_protocol = 2 mydestination = $myhostname, localhost mydomain = oconee.k12.sc.us myhostname = .fqdn mynetworks = localhost, 127.0.0.1, [::1] myorigin = $mydomain non_smtpd_milters = inet:localhost:8892 notify_classes = bounce, resource, software proxy_interfaces = readme_directory = no smtp_connect_timeout = 10s smtpd_banner = $myhostname ESMTP $mail_name (Debian/GNU) smtpd_client_restrictions = permit_mynetworks, permit_sasl_authenticated, reject smtpd_milters = inet:localhost:8892 smtpd_reject_unlisted_recipient = no smtpd_sasl_auth_enable = yes smtpd_sasl_authenticated_header = yes smtpd_sasl_local_domain = oconee.k12.sc.us smtpd_tls_cert_file = /etc/postfix/ssl/oconee.pem smtpd_tls_key_file = /etc/postfix/ssl/oconee.key smtpd_tls_security_level = may smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache smtpd_use_tls = yes smtp_generic_maps = hash:/etc/postfix/generic_maps smtp_helo_timeout = 30s smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache --master.cf-- smtp inet n - y - - smtpd pickupunix n - y 60 1 pickup cleanup unix n - y - 0 cleanup qmgr unix n - n 300 1 qmgr tlsmgrunix - - y 1000? 1 tlsmgr rewrite unix - - y - - trivial-rewrite bounceunix - - y - 0 bounce defer unix - - y - 0 bounce trace unix - - y - 0 bounce verifyunix - - y - 1 verify flush unix n - y 1000? 0 flush proxymap unix - - n - - proxymap proxywrite unix - - n - 1 proxymap smtp unix - - y - - smtp relay unix - - y - - smtp showq unix n - y - - showq error unix - - y - - error retry unix - - y - - error discard unix - - y - - discard local unix - n n - - local virtual unix - n n - - virtual lmtp unix - - y - - lmtp anvil unix - - y - 1 anvil scacheunix - - y - 1 scache maildrop unix - n n - - pipe flags=DRhu user=vmail argv=/usr/bin/maildrop -d ${recipient} uucp unix - n n - - pipe flags=Fqhu user=uucp argv=uux -r -n -z -a$sender - $nexthop!rmail ($recipient) ifmailunix - n n - - pipe flags=F user=ftn argv=/usr/lib/ifmail/ifmail -r $nexthop ($recipient) bsmtp unix - n n - - pipe flags=Fq. user=bsmtp argv=/usr/lib/bsmtp/bsmtp -t$nexthop -f$sender $recipient scalemail-backend unix - n n
Re: Postfix using all CPU after nightly mail submission
On Fri, 19 Jan 2018, Bastian Blank wrote: Date: Fri, 19 Jan 2018 22:06:15 From: Bastian Blank <bastian+postfix-users=postfix@waldi.eu.org> To: postfix-users@postfix.org Subject: Re: Postfix using all CPU after nightly mail submission On Fri, Jan 19, 2018 at 08:52:39AM -0500, Zach Sheppard wrote: I see no errors or alarms in the mail log nor syslog. The only odd message I see is a series of ^@ symbols before the system freezes: Please read http://www.postfix.org/DEBUG_README.html#mail about how to report problems. Jan 15 00:42:41 mailrelay opendkim[1643]: 385B38094E: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:41 mailrelay postfix/qmgr[5601]: 385B38094E: from=< ...@oconee.k12.sc.us>, size=2282, nrcpt=1 (queue active) Jan 15 00:42:41 mailrelay postfix/smtpd[6128]: 8BFCC80968: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us You have a broken system, nothing reports "DDD.DD.DD.DD" as ip address. This message may contain confidential information and is intended only for the individuals named. If you are not the named addressee you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately by e-mail if you have received this e-mail by mistake and delete this e-mail from your system. If you are not the intended recipient you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited. Please remove this statement. This mailing-list is already public archived. Bastian -- Those who hate and fight must stop themselves -- otherwise it is not stopped. -- Spock, "Day of the Dove", stardate unknown I do not know whether the From address characters (non-alphanumeric, other than the '@' character) are the cause, but the above message is one of those messages that endlessly loops between filters for malicious email messages, and is only accidentally found by switching off filters. -- Bret Busby Armadale West Australia .. "So once you do know what the question actually is, you'll know what the answer means." - Deep Thought, Chapter 28 of Book 1 of "The Hitchhiker's Guide to the Galaxy: A Trilogy In Four Parts", written by Douglas Adams, published by Pan Books, 1992
Re: Postfix using all CPU after nightly mail submission
On Fri, Jan 19, 2018 at 08:52:39AM -0500, Zach Sheppard wrote: > I see no errors or alarms in the mail log nor syslog. The only odd message > I see is a series of ^@ symbols before the system freezes: Please read http://www.postfix.org/DEBUG_README.html#mail about how to report problems. > Jan 15 00:42:41 mailrelay opendkim[1643]: 385B38094E: DKIM-Signature field > added (s=mailrelay, d=oconee.k12.sc.us) > Jan 15 00:42:41 mailrelay postfix/qmgr[5601]: 385B38094E: from=< > ...@oconee.k12.sc.us>, size=2282, nrcpt=1 (queue active) > Jan 15 00:42:41 mailrelay postfix/smtpd[6128]: 8BFCC80968: > client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ > oconee.k12.sc.us You have a broken system, nothing reports "DDD.DD.DD.DD" as ip address. > This message may contain confidential information and is intended only for > the individuals named. If you are not the named addressee you should not > disseminate, distribute or copy this e-mail. Please notify the sender > immediately by e-mail if you have received this e-mail by mistake and > delete this e-mail from your system. If you are not the intended recipient > you are notified that disclosing, copying, distributing or taking any > action in reliance on the contents of this information is strictly > prohibited. Please remove this statement. This mailing-list is already public archived. Bastian -- Those who hate and fight must stop themselves -- otherwise it is not stopped. -- Spock, "Day of the Dove", stardate unknown
Postfix using all CPU after nightly mail submission
Hello, We use a Postfix installation on Debian 9 in our network to serve as a send-only SMTP server for a very large application in our school district. This application does not have its own built-in SMTP server and requires a relay to send notification e-mails to students, teachers, and parents topping out at around 10,000 e-mails per day. Our hosted e-mail solution (G Suite) has a sending cap that we hit with this application so we setup this Postfix installation to serve as an alternative. However, whenever Postfix gets a large e-mail load it processes the e-mails for around 30-45 minutes and then consistently uses around 70-80% of the CPU effectively locking up the entire system so much so that I can't even login and debug without either disabling the network card in VMWare or rebooting the system totally. The VM has 2 CPU cores and 4GB RAM allocated to it which I thought was plenty for this application. I see no errors or alarms in the mail log nor syslog. The only odd message I see is a series of ^@ symbols before the system freezes: Jan 15 00:42:41 mailrelay opendkim[1643]: 385B38094E: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:41 mailrelay postfix/qmgr[5601]: 385B38094E: from=< ...@oconee.k12.sc.us>, size=2282, nrcpt=1 (queue active) Jan 15 00:42:41 mailrelay postfix/smtpd[6128]: 8BFCC80968: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:41 mailrelay postfix/cleanup[6375]: 8BFCC80968: message-id=< 734906173.4337.1515994961560.javamail....@oconee.k12.sc.us> Jan 15 00:42:41 mailrelay opendkim[1643]: 8BFCC80968: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:41 mailrelay postfix/qmgr[5601]: 8BFCC80968: from=< ...@oconee.k12.sc.us>, size=2370, nrcpt=1 (queue active) Jan 15 00:42:41 mailrelay postfix/smtpd[6128]: B66B38096E: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:41 mailrelay postfix/cleanup[6375]: B66B38096E: message-id=< 902964141.4338.1515994961733.javamail....@oconee.k12.sc.us> Jan 15 00:42:41 mailrelay opendkim[1643]: B66B38096E: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:41 mailrelay postfix/qmgr[5601]: B66B38096E: from=< ...@oconee.k12.sc.us>, size=3705, nrcpt=1 (queue active) Jan 15 00:42:41 mailrelay postfix/smtpd[6128]: E39EA8096F: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:41 mailrelay postfix/cleanup[6375]: E39EA8096F: message-id=< 1233892182.4339.1515994961918.javamail....@oconee.k12.sc.us> Jan 15 00:42:41 mailrelay opendkim[1643]: E39EA8096F: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: E39EA8096F: from=< ...@oconee.k12.sc.us>, size=2281, nrcpt=1 (queue active) Jan 15 00:42:42 mailrelay postfix/smtpd[6128]: 25EFC80970: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:42 mailrelay postfix/cleanup[6375]: 25EFC80970: message-id=< 2097536797.4340.1515994962141.javamail....@oconee.k12.sc.us> Jan 15 00:42:42 mailrelay opendkim[1643]: 25EFC80970: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: 25EFC80970: from=< ...@oconee.k12.sc.us>, size=2486, nrcpt=1 (queue active) Jan 15 00:42:42 mailrelay postfix/smtpd[6128]: 5329380972: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:42 mailrelay postfix/cleanup[6375]: 5329380972: message-id=< 951323567.4341.1515994962327.javamail....@oconee.k12.sc.us> Jan 15 00:42:42 mailrelay opendkim[1643]: 5329380972: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: 5329380972: from=< ...@oconee.k12.sc.us>, size=2326, nrcpt=1 (queue active) Jan 15 00:42:42 mailrelay postfix/smtpd[6128]: 8EF0980973: client=unknown[DDD.DD.DD.DD], sasl_method=LOGIN, sasl_username=AAA@ oconee.k12.sc.us Jan 15 00:42:42 mailrelay postfix/cleanup[6375]: 8EF0980973: message-id=< 484411487.4342.1515994962572.javamail....@oconee.k12.sc.us> Jan 15 00:42:42 mailrelay opendkim[1643]: 8EF0980973: DKIM-Signature field added (s=mailrelay, d=oconee.k12.sc.us) Jan 15 00:42:42 mailrelay postfix/qmgr[5601]: 8EF0980973: from=< ...@oconee.k12.sc.us>, size=2408, nrcpt=1 (queue active)^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^ @^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@