On Mon, 2018-02-12 at 22:10 +0100, [email protected] wrote: > Hello, > > I'm using spamassassin and amavisd-new on a Debian 9. (and some > other > software that might be relevant : Postfix, dovecot, shorewall) > Spamassassin is set to restart amavis upon any DB update, at around > 4h20 > (am). (This is Debian default config, I don't know why it really need > to > be done). > Since approx. 5 months, amavis would sometime crash upon this > restart, > with the following error (in /var/log/syslog. A bit more context > added > at the end of this mail): > > Feb 11 04:23:15 myserver amavis[23850]: (!)Net::Server: > 2018/02/11-04:23:15 Can't connect to TCP port 10024 on 127.0.0.1 > [Address already in use]\n at line 68 in file > /usr/share/perl5/Net/Server/Proto/TCP.pm > > (amavis is indeed configured to listen on 10024)
# netstat -tulpn | grep ":10024" when this error occurs is most likely revealing that amavis is running on this port. But still knowing is knowing, guessing is guessing. > > This crash is very rare, I would say about once a month, if not less. > I typically never managed to trigger it when restarting amavis > manually. > However when it happens, it is quite bothersome : all incoming e- > mail > gets rejected, including the alerts that warns me about the amavis > failure to restart... > > When Spamassassin restarts amavis, it calls "/etc/init.d/amavis > restart" > which in turn calls sequentially "stop" then "start", using > start-stop-daemon, with a PID file. > I've tried to set a delay (20s) between the stop and start, without > much > luck. Actually systemd is running a generator to interpret /etc/init.d/* and dynamically creates .service files, which are executed. The generator is somewhere at /lib/systemd/system-generators/systemd-sysv-generator and it's results should be at /run/systemd/generator.late/* . Reading your description it sounds like dynamically created spamassassin.service is called, which calls /etc/init.d/amavis, which is intercepted by systemd (LSB) to run amavis.service #headaches#. Perhaps there's a problem with the interpretation of the PID or the dependencies create a cycle? I'm no expert in systemd though, so I'm afraid to be of no great help. "If a service is requested under a certain name but no unit configuration file is found, systemd looks for a SysV init script by the same name (with the .service suffix removed) and dynamically creates a service unit from that script. This is useful for compatibility with SysV. Note that this compatibility is quite comprehensive but not 100%. For details about the incompatibilities, see the Incompatibilities with SysV[1] document." [1] http://www.freedesktop.org/wiki/Software/systemd/Incompatibilities I had some time "fun" with another systemd generator used for /etc/fstab, not mounting iSCSI-drives. In the end I created my own .mount directives within systemd's config method. Btw. you have an interesting configuration, as I thought amavisd is it's own spamd daemon (using perl API) and therefore process spamd isn't needed at all. > > I can post my amavis and/or spamassassin config, but nothing changed > since at least 1 or 2 years, so I don't think the culprit is there. > From my admin notes, the only thing that might have changed on this > machine (and that could be relevant) is it's ipv6 connectivity. > > Any idea of what might be going on ? > Any idea of where I might be looking for clues ? Any debug option I > could enable ? > > Thanks for reading ! > > N > > > P.S. : the error, from /var/log/syslog, with a bit more context > (real > server name is redacted). As you can see everything else seems to be > ok... : > > Feb 11 04:22:36 myserver systemd[1]: Reloading Perl-based spam > filter > using text analysis. > Feb 11 04:22:36 myserver systemd[1]: Reloaded Perl-based spam filter > using text analysis. > Feb 11 04:22:37 myserver systemd[1]: Stopping LSB: Starts amavisd- > new > mailfilter... > Feb 11 04:22:39 myserver spamd[2957]: spamd: server hit by SIGHUP, > restarting > Feb 11 04:22:41 myserver spamd[2957]: spamd: child [2979] killed > successfully: interrupted, signal 2 (0002) > Feb 11 04:22:41 myserver spamd[2957]: spamd: child [2978] killed > successfully: interrupted, signal 2 (0002) > Feb 11 04:22:44 myserver spamd[2957]: spamd: server socket closed, > type > IO::Socket::IP > Feb 11 04:22:44 myserver spamd[2957]: spamd: server socket closed, > type > IO::Socket::IP > Feb 11 04:22:47 myserver spamd.pid --create-prefs --max-children 5 > --helper-home-dir: spamd: restarting using '/usr/sbin/spamd -d > --pidfile=/var/run/spamd.pid --create-prefs --max-children 5 > --helper-home-dir' > Feb 11 04:22:47 myserver spamd[2957]: logger: removing stderr method > Feb 11 04:22:55 myserver amavis[23809]: Stopping amavisd: amavisd- > new. > Feb 11 04:22:55 myserver systemd[1]: Stopped LSB: Starts amavisd-new > mailfilter. > Feb 11 04:22:55 myserver systemd[1]: Starting LSB: Starts amavisd- > new > mailfilter... > Feb 11 04:22:56 myserver spamd[23816]: zoom: able to use 347/347 > 'body_0' compiled rules (100%) > Feb 11 04:23:05 myserver spamd[23816]: spamd: server started on > IO::Socket::IP [::1]:783, IO::Socket::IP [127.0.0.1]:783 (running > version 3.4.1) > Feb 11 04:23:05 myserver spamd[23816]: spamd: server pid: 23816 > Feb 11 04:23:05 myserver spamd[23816]: spamd: server successfully > spawned child process, pid 23846 > Feb 11 04:23:05 myserver spamd[23816]: spamd: server successfully > spawned child process, pid 23847 > Feb 11 04:23:05 myserver spamd[23816]: prefork: child states: IS > Feb 11 04:23:05 myserver spamd[23816]: prefork: child states: II > Feb 11 04:23:05 myserver amavis[23834]: starting. /usr/sbin/amavisd- > new > at myserver.mydomain.com amavisd-new-2.10.1 (20141025), Unicode > aware, > LC_ALL="C", LANG="fr_FR.UTF-8" > Feb 11 04:23:15 myserver amavis[23827]: Starting amavisd: amavisd- > new. > Feb 11 04:23:15 myserver systemd[1]: Started LSB: Starts amavisd-new > mailfilter. > Feb 11 04:23:15 myserver amavis[23850]: (!)Net::Server: > 2018/02/11-04:23:15 Can't connect to TCP port 10024 on 127.0.0.1 > [Address already in use]\n at line 68 in file > /usr/share/perl5/Net/Server/Proto/TCP.pm > > to be compared with a "normal" (here, triggered manually) restart : > > Feb 9 05:00:02 myserver systemd[1]: Stopping LSB: Starts amavisd- > new > mailfilter... > Feb 9 05:00:16 myserver amavis[6488]: Stopping amavisd: amavisd-new. > Feb 9 05:00:16 myserver systemd[1]: Stopped LSB: Starts amavisd-new > mailfilter. > Feb 9 05:00:16 myserver systemd[1]: Starting LSB: Starts amavisd- > new > mailfilter... > Feb 9 05:00:33 myserver amavis[8917]: starting. /usr/sbin/amavisd- > new > at myserver.mydomain.com amavisd-new-2.10.1 (20141025), Unicode > aware, > LC_ALL="C", LANG="fr_FR.UTF-8" > Feb 9 05:00:46 myserver amavis[8703]: Starting amavisd: amavisd-new. > Feb 9 05:00:46 myserver systemd[1]: Started LSB: Starts amavisd-new > mailfilter. > Feb 9 05:00:46 myserver amavis[11922]: Net::Server: Group Not > Defined. > Defaulting to EGID '129 129' > Feb 9 05:00:46 myserver amavis[11922]: Net::Server: User Not > Defined. > Defaulting to EUID '119' > Feb 9 05:00:46 myserver amavis[11922]: Module > Amavis::Conf 2.404 > (...) > (... more module loading ...) > (...) > Feb 9 05:00:48 laborit amavis[11922]: Using primary internal av > scanner > code for ClamAV-clamd > Feb 9 05:00:48 laborit amavis[11922]: Found secondary av scanner > ClamAV-clamscan at /usr/bin/clamscan > Feb 9 05:00:48 laborit amavis[11922]: Deleting db files > __db.002,nanny.db,__db.003,__db.001,snmp.db in /var/lib/amavis/db > Feb 9 05:00:48 laborit amavis[11922]: Creating db in > /var/lib/amavis/db/; BerkeleyDB 0.55, libdb 5.3
