On Wed, Dec 17, 2025 at 11:19:27AM +0900, Robert Smith wrote:
> This is a production firewall. It is currently on OpenBSD 7.8 with all patches 
> installed, and was at the time of the panic. 
> 
> syspatch78-001_syspatch.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-001_syspatch.tgz>
> syspatch78-002_xserver.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-002_xserver.tgz>
> syspatch78-003_unbound.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-003_unbound.tgz>
> syspatch78-004_libssl.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-004_libssl.tgz>
> syspatch78-005_smtpd.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-005_smtpd.tgz>
> syspatch78-006_libunwind.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-006_libunwind.tgz>
> syspatch78-007_drm.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-007_drm.tgz>
> syspatch78-008_libpng.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-008_libpng.tgz>
> syspatch78-009_xkbcomp.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-009_xkbcomp.tgz>
> syspatch78-010_unbound.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-010_unbound.tgz>
> syspatch78-011_nd6.tgz 
> <http://netinstall-op01.wansecurity.com/OS/OpenBSD/syspatch/7.8/amd64/syspatch78-011_nd6.tgz>
> 
> Previously it was running OpenBSD 7.5 with all patches, and it ran fine under 
> Xen for a couple of years with no problems.
> 
> Within 8 days of upgrading to 7.8 and applying all patches with syspatch, we 
> got this kernel panic. 
> 
> I apologize that I could not go through the traces or gather additional debug 
> information. Due to internal delays we did not see the customer's ticket for 
> over 3 hours and had to boot sync immediately. The firewall is used by a very 
> important customer, and there could not be any more delays in getting their 
> service back online.
> 
> Here are the screenshots.
> 
> If it happens again we will need to downgrade back to 7.5 because we cannot 
> take any chances with this customer. The problem is that we have no way to 
> reproduce it or know if it will happen again.
> 
> Please let me know if you would like any additional information.

I cannot see anything in this code that would prevent this particular
assertion from getting hit, which looks like an oversight. Assertions help
during development, but they are not supposed to be hit in production.

However, the assertion did catch a problem.
It appears that the xnf driver was trying to send a packet stored in an
mbuf chain that was too long for the free capacity on the Tx ring.
There is a related check in xnf_start(), but it operates on a different
set of variables and evidently did not catch the problem in your
particular case.

The patch below might help, with the caveat that I cannot test xnf myself.
The driver should no longer crash, at the cost of a small performance hit
whenever m_defrag() gets called.

In this diff, sc_tx_frags is a constant indicating how many DMA buffer
fragments the underlying Xen machinery can handle per packet (worst case:
just 1, best case: 18), while sc_tx_avail holds the number of unused Tx ring
slots. Each mbuf chain element uses up one such Tx slot, so we must also
defrag the mbuf (reducing the chain to a single element) when the elements
won't all fit on the Tx ring. We know that at least one slot is free, so a
single element will always fit.

diff /usr/src
path + /usr/src
commit - ba79eea579e19dd0aa4b855217dd8cfbd2b863d2
blob - 3c423aee2e37c66eb446889acb80c2dd19a4648a
file + sys/dev/pv/if_xnf.c
--- sys/dev/pv/if_xnf.c
+++ sys/dev/pv/if_xnf.c
@@ -564,11 +564,13 @@ xnf_encap(struct xnf_softc *sc, struct mbuf *m_head, u
        struct mbuf *m, **next;
        uint32_t oprod = *prod;
        uint16_t id;
-       int i, flags, n, used = 0;
+       int i, flags, n, used = 0, frags;
 
-       if ((xnf_fragcount(m_head) > sc->sc_tx_frags) &&
-           m_defrag(m_head, M_DONTWAIT))
-               return (ENOBUFS);
+       frags = xnf_fragcount(m_head);
+       if (frags > sc->sc_tx_frags || frags > sc->sc_tx_avail) {
+               if (m_defrag(m_head, M_DONTWAIT))
+                       return (ENOBUFS);
+       }
 
        flags = (sc->sc_domid << 16) | BUS_DMA_WRITE | BUS_DMA_NOWAIT;
 
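To spell out the intent of the new check outside the driver context, here is
a minimal sketch; it is not part of the patch. chain_elements() is a
simplified stand-in for xnf_fragcount() that only counts mbuf chain elements,
and the helper names and the tx_frags/tx_avail parameters are made up for the
illustration (standing in for sc->sc_tx_frags and sc->sc_tx_avail); only the
decision logic mirrors the diff above.

/*
 * Sketch only: defrag an outgoing mbuf chain if it exceeds either the
 * per-packet fragment limit or the free Tx ring slots.
 */
#include <sys/param.h>
#include <sys/mbuf.h>
#include <sys/errno.h>

/* Count the chain elements; each one would consume a Tx ring slot. */
static int
chain_elements(struct mbuf *m_head)
{
	struct mbuf *m;
	int n = 0;

	for (m = m_head; m != NULL; m = m->m_next)
		n++;
	return (n);
}

static int
fit_or_defrag(struct mbuf *m_head, int tx_frags, int tx_avail)
{
	int frags = chain_elements(m_head);

	/*
	 * m_defrag() compacts the chain into a single mbuf, which fits
	 * the one slot we know is free; it returns non-zero on failure.
	 */
	if (frags > tx_frags || frags > tx_avail) {
		if (m_defrag(m_head, M_DONTWAIT))
			return (ENOBUFS);
	}
	return (0);
}

The point of also comparing against the free-slot count is that a chain can
satisfy the per-packet fragment limit and still be longer than the remaining
free Tx slots, which appears to be the case the existing check misses.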
