On 3/9/2021 12:07 PM, Phillip Carroll wrote:
On 3/8/2021 10:24 AM, Phillip Carroll wrote:
I will just need to check it for manual restart whenever csf does one
of its automatic updates.
After further analysis, I finally realized why the csf update kills
fail2ban, and also why fail2ban fails to restart after the csf update
completes.
The problem stemmed from the systemd fail2ban.service definition. Most
CentOS sysops run firewalld.service as their firewall, whereas the
firewall service on my system is lfd.service. (csf is the tool used to
configure lfd.)
When csf updates, it stops lfd.service, indirectly stopping the
iptables and ipset services that fail2ban states that it needs in its
service definition file. That in turn causes systemd to kill
fail2ban. All OK.
However, the default fail2ban.service definition doesn't contain
anything that would cause systemd to restart it when lfd is restarted
after the update.
I have now changed the following two lines in the fail2ban service
definition:
After=network.target iptables.service firewalld.service ip6tables.service ipset.service nftables.service
PartOf=firewalld.service
Those lines now read:
After=network.target iptables.service lfd.service ip6tables.service ipset.service nftables.service
PartOf=lfd.service
After making this change, I tested it by restarting lfd. As expected,
systemd shut down fail2ban, and then after restarting lfd, it also
started fail2ban.
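
If the file being edited is the packaged unit file, a package update may
put the original lines back. The substitution described above can be
sketched as a small Python function so it is easy to re-apply. This is
only an illustration: the function name retarget_unit is made up here,
and it works on unit-file text rather than on any particular path.

```python
# Directives whose unit lists should name lfd.service instead of firewalld.service.
DIRECTIVES = ("After", "PartOf")


def retarget_unit(text, old="firewalld.service", new="lfd.service"):
    """Return unit-file text with `old` swapped for `new` on After=/PartOf= lines only."""
    out = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() in DIRECTIVES:
            # Replace whole unit names only, leaving the other entries untouched.
            units = [new if unit == old else unit for unit in value.split()]
            line = f"{key.strip()}={' '.join(units)}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    original = (
        "[Unit]\n"
        "After=network.target iptables.service firewalld.service "
        "ip6tables.service ipset.service nftables.service\n"
        "PartOf=firewalld.service\n"
    )
    print(retarget_unit(original), end="")
```

Lines other than After= and PartOf= pass through unchanged, so a
Description= that happens to mention firewalld would not be rewritten.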
Someday, I may get to a state where I actually fully understand this OS.
LOL
Phil
Unfortunately, my solution to the restart problem apparently missed
another small detail.
When cron.daily ran, the first thing scripted was daily log rotation,
followed by a restart of the lfd.service.
The following sequence was logged in syslog:
Mar 9 19:55:27 enablingsimplicity systemd: Stopping Fail2Ban Service...
Mar 9 19:55:28 enablingsimplicity fail2ban-client: Shutdown successful
Mar 9 19:55:28 enablingsimplicity systemd: Stopped Fail2Ban Service.
Mar 9 19:55:28 enablingsimplicity systemd: Stopping ConfigServer Firewall & Security - lfd...
Mar 9 19:55:28 enablingsimplicity systemd: lfd.service: main process exited, code=killed, status=9/KILL
Mar 9 19:55:28 enablingsimplicity systemd: Stopped ConfigServer Firewall & Security - lfd.
Mar 9 19:55:28 enablingsimplicity systemd: Unit lfd.service entered failed state.
Mar 9 19:55:28 enablingsimplicity systemd: lfd.service failed.
Mar 9 19:55:28 enablingsimplicity systemd: Starting ConfigServer Firewall & Security - lfd...
Mar 9 19:55:28 enablingsimplicity systemd: Can't open PID file /run/lfd.pid (yet?) after start: No such file or directory
Mar 9 19:55:28 enablingsimplicity systemd: Started ConfigServer Firewall & Security - lfd.
Mar 9 19:55:28 enablingsimplicity systemd: Starting Fail2Ban Service...
Mar 9 19:55:28 enablingsimplicity systemd: Started Fail2Ban Service.
Mar 9 19:55:28 enablingsimplicity fail2ban-server: 2021-03-09 19:55:28,963 fail2ban [7844]: ERROR Failed during configuration: Have not found any log file for exim-reject jail
Mar 9 19:55:28 enablingsimplicity systemd: fail2ban.service: main process exited, code=exited, status=255/n/a
Mar 9 19:55:28 enablingsimplicity fail2ban-server: 2021-03-09 19:55:28,964 fail2ban [7844]: ERROR Async configuration of server failed
Mar 9 19:55:28 enablingsimplicity systemd: Unit fail2ban.service entered failed state.
Mar 9 19:55:28 enablingsimplicity systemd: fail2ban.service failed.
Apparently, fail2ban expects log files it is watching to actually exist
when it starts! (/s)
However, my logrotate sequence for exim doesn't create any log files.
Rotation of the reject log is configured to ignore it if it doesn't exist.
Previously (with no stopping of the fail2ban service just after daily
cron), fail2ban was completely happy to exist for a while with a missing
log. When exim created a new reject log, f2b logged the following in its
own log (typical log from a few days ago):
2021-03-04 00:22:38,261 fail2ban.filterpyinotif [806]: DEBUG Event queue size: 32
2021-03-04 00:22:38,262 fail2ban.filterpyinotif [806]: DEBUG <_RawEvent cookie=0 mask=0x100 name=reject.log wd=1 >
2021-03-04 00:22:38,262 fail2ban.filterpyinotif [806]: DEBUG Non-existing file watcher 6 for file /var/log/exim/reject.log
2021-03-04 00:22:38,262 fail2ban.filterpyinotif [806]: DEBUG Removed file watcher for /var/log/exim/reject.log
2021-03-04 00:22:38,262 fail2ban.filterpyinotif [806]: DEBUG New <Watch wd=7 path=/var/log/exim/reject.log mask=2 proc_fun=None auto_add=False exclude_filter=<function <lambda> at 0x7f480143b6e0> dir=False >
2021-03-04 00:22:38,262 fail2ban.filterpyinotif [806]: DEBUG Added file watcher for /var/log/exim/reject.log
2021-03-04 00:22:38,262 fail2ban.filter [806]: MSG Log rotation detected for /var/log/exim/reject.log
2021-03-04 00:22:38,263 fail2ban.filter [806]: DEBUG Processing line with time:1614842558.0 and ip:191.236.129.137
2021-03-04 00:22:38,979 fail2ban.filter [806]: INFO [exim-reject] Found 191.236.129.137 - 2021-03-04 00:22:38
2021-03-04 00:22:38,980 fail2ban.failmanager [806]: DEBUG Total # of detected failures: 18. Current failures from 1 IPs (IP:count): 191.236.129.137:1
2021-03-04 00:22:39,060 fail2ban.actions [806]: NOTICE [exim-reject] Ban 191.236.129.137
2021-03-04 00:22:39,066 fail2ban.utils [806]: DEBUG 7f480140fcc8 -- returned successfully 0
2021-03-04 00:22:39,066 fail2ban.actions [806]: DEBUG Banned 1 / 26, 8 ticket(s) in 'exim-reject'
2021-03-04 00:22:39,105 fail2ban.observer [806]: DEBUG [exim-reject] Observer: ban found 191.236.129.137, 86400
etc, etc, etc.
I am not sure precisely how to interpret that last sequence, other than
that it recognized that the reject.log file it was watching no longer
existed, but that there was a new one with that name, so there must have
been a rollover, and it handled that gracefully. But at service start
time, I guess I will need to ensure the file to watch exists.
The only problem I have with this requirement is I am not totally
certain whether exim will actually use an empty reject.log, or create
another. A few weeks ago I tried to solve a letsencrypt missing log
issue by using "touch", which resulted in the log directory holding TWO
files named letsencrypt.log, one of which was empty, the other being
logged to. Log rotation would then rotate the empty log and delete
both! (Or, if I attempted to delete the empty one, that also resulted
in deleting both. Linux is a strange beast.)
The question I now have is, in the case of f2b watching multiple log
files, whether I need to make sure ALL of them exist before any
start/restart of fail2ban?
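
Whatever the answer turns out to be, a pre-start check along the
following lines would at least show which watched logs are missing
before fail2ban is restarted. It is a rough sketch under assumptions:
the helper name missing_logs is made up, the path list is maintained by
hand (only /var/log/exim/reject.log comes from the logs above), and it
does not read the fail2ban jail configuration. It only reports missing
files and deliberately does not create them, since I am not sure how
exim treats a pre-created empty log.

```python
from pathlib import Path


def missing_logs(paths):
    """Return the entries of `paths` that do not exist as regular files."""
    return [p for p in map(Path, paths) if not p.is_file()]


if __name__ == "__main__":
    # Hand-maintained list of logs the jails are expected to watch (assumption).
    watched = ["/var/log/exim/reject.log"]
    for path in missing_logs(watched):
        print(f"missing: {path}")
```

Running it just before a manual restart, or from the same cron script
that restarts lfd, would show whether exim-reject is the only jail
affected.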
Phil
_______________________________________________
Fail2ban-users mailing list
Fail2ban-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fail2ban-users