Public bug reported:
Hi,
We're using ufw 0.36.2 on OpenSUSE 15.6. We've been using ufw for over a
decade without issues until today.
One of our servers got unexpectedly restarted several times. At some
point, it became inaccessible to all outside traffic. Upon further
analysis, I discovered that some of the ufw rules seemed to have
disappeared, which prevented traffic from reaching the server.
We also use fail2ban, a popular utility that bans IPs after failed SSH
logins; our bans lasted 365 days. Every time fail2ban is restarted, it
first removes from ufw all the IPs it had added, then re-adds them all.
This process is highly inefficient: it removes and adds the rules one at
a time.
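For context, fail2ban's stock ufw banaction issues one ufw command per
address. A simplified sketch of what its action.d/ufw.conf amounts to
(not verbatim - exact flags and block type vary by fail2ban version):
```
# simplified sketch of fail2ban's action.d/ufw.conf (not verbatim)
[Definition]
actionban   = ufw insert 1 reject from <ip> to any
actionunban = ufw delete reject from <ip> to any
```
With 3,000+ banned IPs, a restart therefore means thousands of
back-to-back ufw invocations in quick succession - plenty of opportunity
for a race if ufw's rule files are rewritten non-atomically.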
To resolve the network issue today, I wiped ufw clean and re-added our
base rules. I then reduced fail2ban's bantime from 365 days to 30 and
restarted it.
Here is the smoking gun. While fail2ban was restarting and re-adding the
IPs one by one, I kept refreshing ufw status, and once in a while I
would catch a version of the rules with our base IPv4 rules missing
entirely. Thankfully, another refresh of the status would show them all
again.
My theory is that there is some sort of rule corruption that could
happen due to a race condition. I'm not sure how to reproduce this
reliably, but I thought I would file this bug and let the team know.
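To make the "keep refreshing ufw status" step repeatable, here is a
hypothetical polling helper (my sketch, not part of the original
report). It checks each snapshot of ufw status for the IPv4 base rules
quoted below and flags any snapshot where they are missing:
```
#!/bin/sh
# Hypothetical repro helper: flag ufw status snapshots whose IPv4
# base rules are missing. Port list matches the base rules below.

check_snapshot() {
    # Read one `ufw status` snapshot from stdin; print a line for
    # each expected port that has no IPv4 rule (i.e. only "(v6)"
    # entries or nothing). Returns non-zero if anything is missing.
    snapshot=$(cat)
    missing=0
    for port in 80 222 443 8080 8443; do
        if ! printf '%s\n' "$snapshot" | grep "^$port[[:space:]]" | grep -qv '(v6)'; then
            echo "missing IPv4 rule for port $port"
            missing=1
        fi
    done
    return $missing
}

# To hunt for the transient state (needs root and an active ufw):
# while :; do ufw status | check_snapshot || date; sleep 0.2; done
```
A loop like the commented one, timestamping every incomplete snapshot
while fail2ban re-adds its bans, might help narrow down the race.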
In our case, the list of IPs to ban was over 3,000. We use the ufw
integration of fail2ban, like so:
cat /etc/fail2ban/jail.local
```
# Do all your modifications to the jail's configuration in jail.local!
[sshd]
enabled = true
port = 222
filter = sshd
logpath = /var/log/fail2ban.log
maxretry = 2
findtime = 1d
bantime = 30d
banaction = ufw
```
The base ufw rules look something like this:
```
ufw allow 80 # nginx
ufw allow 222 # ssh
ufw allow 443 # nginx
ufw allow 8080 # apache
ufw allow 8443 # apache
```
and usually result in ufw status something like this:
```
80 ALLOW Anywhere
222 ALLOW Anywhere
443 ALLOW Anywhere
8080 ALLOW Anywhere
8443 ALLOW Anywhere
80 (v6) ALLOW Anywhere (v6)
222 (v6) ALLOW Anywhere (v6)
443 (v6) ALLOW Anywhere (v6)
8080 (v6) ALLOW Anywhere (v6)
8443 (v6) ALLOW Anywhere (v6)
```
However, at one point a suspicious ufw status return looked like this:
```
Anywhere REJECT 101.126.130.226 # by Fail2Ban
after 2 attempts against sshd
Anywhere REJECT 39.144.129.95 # by Fail2Ban
after 2 attempts agai
80 (v6) ALLOW Anywhere (v6)
222 (v6) ALLOW Anywhere (v6)
443 (v6) ALLOW Anywhere (v6)
8080 (v6) ALLOW Anywhere (v6)
8443 (v6) ALLOW Anywhere (v6)
```
Note the absence of the IPv4 rules and the incomplete last REJECT line -
it cuts off after "agai".
I should point out that when I saw this, I ran ufw status again and the
rules came back, so no persistent corruption occurred in that case. It's
possible that a combination of other commands that ran when the server
got rebooted is what corrupted the configuration.
I hope this helps, even if just a data point for future reports of
similar corruption.
** Affects: ufw (Ubuntu)
Importance: Undecided
Status: New
** Tags: corruption fail2ban ufw
https://bugs.launchpad.net/bugs/2126805
Title:
ufw can corrupt its own configuration when fail2ban is adding/removing
ufw rules