Re: panic: kernel diagnostic assertion "uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", line 1064
* Martin Pieuchot [2022-01-19 03:12]:
>
> On 18/01/22(Tue) 22:46, Ralf Horstmann wrote:
...
> > db_enter () at db_enter+0x10
> > panic(81e5053e) at panic+0xbf
> > __assert(81ebcdc6,81e23741,428,81e718b1) at
> > __assert+0x25
> > uvm_page_unbusy(800022738d90,10) at uvm_page_unbusy+0x20e
> > uvm_aio_aiodone(fd81cd592360) at uvm_aio_aiodone+0x252
> > uvm_aiodone_daemon(8000fffefa40) at uvm_aiodone_daemon+0x124
> > end trace frame: 0x0, count: 9
>
> This is caused by an incorrect lock assertion. I just committed a fix.
>
> The problem can be triggered when swapping anon. It should be fixed in
> the next snapshot.

Thanks for the quick fix! I am testing with a local kernel for now. I
need a couple of days of stability before I can jump onto the snapshots
again :-)

Regards
Ralf
Re: panic: kernel diagnostic assertion "uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", line 1064
Thanks for the report.

On 18/01/22(Tue) 22:46, Ralf Horstmann wrote:
> >Synopsis:panic: kernel diagnostic assertion
> >"uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c",
> >line 1064
> >Category:kernel
> >Environment:
> System : OpenBSD 7.0
> Details : OpenBSD 7.0-current (GENERIC.MP) #248: Tue Jan 11
> 10:12:07 MST 2022
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> The panic typically happens while using some X programs, in
> most cases chromium. It can take days of uptime and occasional
> system usage for the problem to show, sometimes but not always
> with suspend and resume between boot and panic.
>
> I have seen the same diagnostics assertion with snapshots from
> 2021-31-12, 2022-01-05 and now 2022-01-11.
>
> Even though I do have ddb enabled by default, the system does
> not always enter ddb and print a backtrace. But for the most
> recent case I have the following details and a backtrace
> (typed from screen):
>
> Stopped at db_enter+0x10: popq %rbp
> TIDPID UIDPRFLAGSPFLAGS CPUCOMMAND
> *310366 5500400x14000 0x200 0K aiodoned
> 508990 6254100x14000 0x200 2srdis
> db_enter () at db_enter+0x10
> panic(81e5053e) at panic+0xbf
> __assert(81ebcdc6,81e23741,428,81e718b1) at
> __assert+0x25
> uvm_page_unbusy(800022738d90,10) at uvm_page_unbusy+0x20e
> uvm_aio_aiodone(fd81cd592360) at uvm_aio_aiodone+0x252
> uvm_aiodone_daemon(8000fffefa40) at uvm_aiodone_daemon+0x124
> end trace frame: 0x0, count: 9

This is caused by an incorrect lock assertion. I just committed a fix.

The problem can be triggered when swapping anon. It should be fixed in
the next snapshot.

Thanks again,
Martin
panic: kernel diagnostic assertion "uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", line 1064
>Synopsis: panic: kernel diagnostic assertion >"uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", line >1064 >Category: kernel >Environment: System : OpenBSD 7.0 Details : OpenBSD 7.0-current (GENERIC.MP) #248: Tue Jan 11 10:12:07 MST 2022 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP Architecture: OpenBSD.amd64 Machine : amd64 >Description: The panic typically happens while using some X programs, in most cases chromium. It can take days of uptime and occasional system usage for the problem to show, sometimes but not always with suspend and resume between boot and panic. I have seen the same diagnostics assertion with snapshots from 2021-31-12, 2022-01-05 and now 2022-01-11. Even though I do have ddb enabled by default, the system does not always enter ddb and print a backtrace. But for the most recent case I have the following details and a backtrace (typed from screen): Stopped at db_enter+0x10: popq %rbp TIDPID UIDPRFLAGSPFLAGS CPUCOMMAND *310366 5500400x14000 0x200 0K aiodoned 508990 6254100x14000 0x200 2srdis db_enter () at db_enter+0x10 panic(81e5053e) at panic+0xbf __assert(81ebcdc6,81e23741,428,81e718b1) at __assert+0x25 uvm_page_unbusy(800022738d90,10) at uvm_page_unbusy+0x20e uvm_aio_aiodone(fd81cd592360) at uvm_aio_aiodone+0x252 uvm_aiodone_daemon(8000fffefa40) at uvm_aiodone_daemon+0x124 end trace frame: 0x0, count: 9 >How-To-Repeat: No reliable reproducer identified yet, other than using chromium with many tabs for a couple of days. >Fix: dmesg: OpenBSD 7.0-current (GENERIC.MP) #248: Tue Jan 11 10:12:07 MST 2022 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP real mem = 8250834944 (7868MB) avail mem = 7984787456 (7614MB) random: good seed from bootblocks mpath0 at root scsibus0 at mpath0: 256 targets mainbus0 at root bios0 at mainbus0: SMBIOS rev. 
2.7 @ 0xcc7fd000 (65 entries) bios0: vendor LENOVO version "JBET73WW (1.37 )" date 08/14/2019 bios0: LENOVO 20BWS3WY01 acpi0 at bios0: ACPI 5.0 acpi0: sleep states S0 S3 S4 S5 acpi0: tables DSDT FACP SLIC ASF! HPET ECDT APIC MCFG SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT PCCT SSDT TCPA SSDT UEFI MSDM BATB FPDT UEFI DMAR acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3) acpitimer0 at acpi0: 3579545 Hz, 24 bits acpihpet0 at acpi0: 14318179 Hz acpiec0 at acpi0 acpimadt0 at acpi0 addr 0xfee0: PC-AT compat cpu0 at mainbus0: apid 0 (boot processor) cpu0: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 2195.20 MHz, 06-3d-04 cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu0: 256KB 64b/line 8-way L2 cache cpu0: smt 0, core 0, package 0 mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges cpu0: apic clock running at 99MHz cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE cpu1 at mainbus0: apid 1 (application processor) cpu1: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 2194.92 MHz, 06-3d-04 cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu1: 256KB 64b/line 8-way L2 cache 
cpu1: smt 1, core 0, package 0 cpu2 at mainbus0: apid 2 (application processor) cpu2: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, 2194.93 MHz, 06-3d-04 cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN cpu2: 256KB 64b/line 8-way L2 cache cpu2: smt 0, core 1, package 0 cpu3 at mainbus0: apid 3 (application processor)
Re: wireguard-related mbuf panic (was: Re: panic: ieee80211_has_seq(wh) assertion failed)
Hi Alexander,

On Tue, Jan 18, 2022 at 01:51:45PM +0100, Alexander Bluhm wrote:
> On Tue, Jan 18, 2022 at 12:59:25PM +0100, Christian Ehrhardt wrote:
> > If so, any volunteers to commit this?
>
> Done. Thanks for fixing this.
>
> Two more things:
>
> - We need a man page update. I think the missing documentation
>   of this feature was the reason that this bug was introduced.
> - The strange M_EXT check with the XXX comment in ip input was the
>   reason I looked into this topic long time ago. I think a M_READONLY
>   would make the intention more obvious.
>
> ok?
>
> bluhm
>
>
> Index: share/man/man9/mbuf.9
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/share/man/man9/mbuf.9,v
> retrieving revision 1.123
> diff -u -p -r1.123 mbuf.9
> --- share/man/man9/mbuf.9	8 Mar 2021 02:47:26 -0000	1.123
> +++ share/man/man9/mbuf.9	13 Jan 2022 14:36:53 -0000
> @@ -570,7 +570,7 @@ is freed.
>  Ensure that the data in the mbuf chain starting at the beginning of
>  the chain and ending at
>  .Fa len
> -will be put in continuous memory region.
> +will be put in continuous and writable memory region.

This is not true. If the data is already contiguous (the default) the
buffer where the data lives may still be read-only.

regards
Christian
Re: wireguard-related mbuf panic (was: Re: panic: ieee80211_has_seq(wh) assertion failed)
On Tue, Jan 18, 2022 at 12:59:25PM +0100, Christian Ehrhardt wrote:
> If so, any volunteers to commit this?

Done. Thanks for fixing this.

Two more things:

- We need a man page update. I think the missing documentation
  of this feature was the reason that this bug was introduced.
- The strange M_EXT check with the XXX comment in ip input was the
  reason I looked into this topic long time ago. I think a M_READONLY
  would make the intention more obvious.

ok?

bluhm

Index: share/man/man9/mbuf.9
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/share/man/man9/mbuf.9,v
retrieving revision 1.123
diff -u -p -r1.123 mbuf.9
--- share/man/man9/mbuf.9	8 Mar 2021 02:47:26 -0000	1.123
+++ share/man/man9/mbuf.9	13 Jan 2022 14:36:53 -0000
@@ -570,7 +570,7 @@ is freed.
 Ensure that the data in the mbuf chain starting at the beginning of
 the chain and ending at
 .Fa len
-will be put in continuous memory region.
+will be put in continuous and writable memory region.
 If memory must be allocated, then it will fail if the
 .Fa len
 argument is greater than MAXMCLBYTES.
Index: netinet/ip_input.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.364
diff -u -p -r1.364 ip_input.c
--- netinet/ip_input.c	22 Nov 2021 13:47:10 -0000	1.364
+++ netinet/ip_input.c	18 Jan 2022 12:42:14 -0000
@@ -415,7 +415,7 @@ ip_input_if(struct mbuf **mp, int *offp,
 	if (ipmforwarding && ip_mrouter[ifp->if_rdomain]) {
 		int error;
 
-		if (m->m_flags & M_EXT) {
+		if (M_READONLY(m)) {
 			if ((m = *mp = m_pullup(m, hlen)) == NULL) {
 				ipstat_inc(ips_toosmall);
 				goto bad;
@@ -532,7 +532,7 @@ ip_ours(struct mbuf **mp, int *offp, int
 	 * but it's not worth the time; just let them time out.)
 	 */
 	if (ip->ip_off &~ htons(IP_DF | IP_RF)) {
-		if (m->m_flags & M_EXT) {		/* XXX */
+		if (M_READONLY(m)) {
 			if ((m = *mp = m_pullup(m, hlen)) == NULL) {
 				ipstat_inc(ips_toosmall);
 				return IPPROTO_DONE;
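
Editorial sketch (not part of the thread): the difference between testing "has external storage" and testing "is read-only" can be illustrated with a toy model. Everything below is an assumption made for illustration: the `toy_` names, the flag values, and the reference counter are hypothetical and do not reproduce the kernel's `struct mbuf` or its `M_READONLY` macro. The model only assumes that external storage counts as read-only when it is neither marked writable nor a pool cluster, or when it is shared with another mbuf.

```c
#include <stdbool.h>

/* Hypothetical flag values for this toy model; not the kernel's. */
#define TOY_EXT      0x1   /* data lives in external storage */
#define TOY_EXTWR    0x2   /* external storage is explicitly writable */
#define TOY_CLUSTER  0x4   /* external storage is a cluster from the pool */

struct toy_mbuf {
    unsigned flags;
    int ext_refs;          /* how many mbufs share the external storage */
};

/* Coarse test: any external storage at all. */
bool toy_has_ext(const struct toy_mbuf *m)
{
    return (m->flags & TOY_EXT) != 0;
}

/* Intent-revealing test: external storage that must not be written to. */
bool toy_readonly(const struct toy_mbuf *m)
{
    if ((m->flags & TOY_EXT) == 0)
        return false;                  /* inline data is always writable */
    if ((m->flags & (TOY_EXTWR | TOY_CLUSTER)) == 0)
        return true;                   /* foreign buffer, not marked writable */
    return m->ext_refs > 1;            /* shared storage is read-only */
}
```

In this model a private cluster satisfies `toy_has_ext` but not `toy_readonly`, which is the case where the coarse check would force an unnecessary pullup.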
Re: wireguard-related mbuf panic (was: Re: panic: ieee80211_has_seq(wh) assertion failed)
Hi Claudio,

On Fri, Jan 14, 2022 at 11:02:18AM +0100, Claudio Jeker wrote:
> One small comment below.
>
> > diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c
> > index 5e4cb5ba88..21ae5059b0 100644
> > --- a/sys/kern/uipc_mbuf.c
> > +++ b/sys/kern/uipc_mbuf.c
> > [ ... ]
> > @@ -983,14 +982,18 @@ m_pullup(struct mbuf *m0, int len)
> >
> >  		len -= m0->m_len;
> >  	} else {
> > -		/* the first mbuf is too small so make a new one */
> > +		/* the first mbuf is too small or read-only, make a new one */
> >  		space = adj + len;
> >
> >  		if (space > MAXMCLBYTES)
> >  			goto bad;
> >
> > -		m0->m_next = m;
> > -		m = m0;
> > +		if (m0->m_len == 0) {
> > +			m_free(m0);
> > +		} else {
> > +			m0->m_next = m;
> > +			m = m0;
> > +		}
>
> This change is not really needed. The for (;;) loop below that does the
> copy will handle an empty initial mbuf just fine and m_free() it.
> I would not change this since I think it makes the code more complex for
> little gain.

True. Here's an updated version.

ok? If so, any volunteers to commit this?

regards
Christian

commit de01ce44f59450c9ddb0c2362bbc93eafb8cfd0a
Author: Christian Ehrhardt
Date:   Tue Jan 11 10:31:46 2022 +0100

    m_pullup: Properly handle read-only clusters

    If the first mbuf of a chain in m_pullup is a cluster, check if
    the cluster is read-only (shared or an external buffer).
    If so, don't touch it and create a new mbuf for the pullup data.

diff --git a/sys/kern/uipc_mbuf.c b/sys/kern/uipc_mbuf.c
index 5e4cb5ba88..45bf2b2cc2 100644
--- a/sys/kern/uipc_mbuf.c
+++ b/sys/kern/uipc_mbuf.c
@@ -957,8 +957,6 @@ m_pullup(struct mbuf *m0, int len)
 
 	head = M_DATABUF(m0);
 	if (m0->m_len == 0) {
-		m0->m_data = head;
-
 		while (m->m_len == 0) {
 			m = m_free(m);
 			if (m == NULL)
@@ -972,10 +970,11 @@ m_pullup(struct mbuf *m0, int len)
 	tail = head + M_SIZE(m0);
 	head += adj;
 
-	if (len <= tail - head) {
-		/* there's enough space in the first mbuf */
-
-		if (len > tail - mtod(m0, caddr_t)) {
+	if (!M_READONLY(m0) && len <= tail - head) {
+		/* we can copy everything into the first mbuf */
+		if (m0->m_len == 0) {
+			m0->m_data = head;
+		} else if (len > tail - mtod(m0, caddr_t)) {
 			/* need to memmove to make space at the end */
 			memmove(head, mtod(m0, caddr_t), m0->m_len);
 			m0->m_data = head;
@@ -983,7 +982,7 @@ m_pullup(struct mbuf *m0, int len)
 
 		len -= m0->m_len;
 	} else {
-		/* the first mbuf is too small so make a new one */
+		/* the first mbuf is too small or read-only, make a new one */
 		space = adj + len;
 
 		if (space > MAXMCLBYTES)
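
Editorial sketch (not part of the thread): the branch structure of the patched m_pullup can be modelled as a small decision function. This is a simplified illustration under stated assumptions, not the kernel code: `toy_plan_pullup`, the enum, and `TOY_MAXMCLBYTES` are hypothetical names, and the buffer is reduced to its size, the alignment offset `adj`, and a read-only flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's MAXMCLBYTES limit. */
#define TOY_MAXMCLBYTES ((size_t)64 * 1024)

enum toy_plan {
    TOY_IN_PLACE,   /* gather the data into the first buffer */
    TOY_NEW_MBUF,   /* allocate a fresh buffer of adj + len bytes */
    TOY_FAIL        /* request is larger than any buffer we can allocate */
};

/*
 * Decide how to make `len` bytes contiguous.
 * readonly: first buffer is shared or otherwise must not be written
 * bufsize:  total size of the first buffer
 * adj:      alignment offset kept free at the start of the buffer
 */
enum toy_plan
toy_plan_pullup(bool readonly, size_t bufsize, size_t adj, size_t len)
{
    /* In-place only if we may write and the data fits after the offset. */
    if (!readonly && adj <= bufsize && len <= bufsize - adj)
        return TOY_IN_PLACE;

    /* Otherwise a new buffer is needed, subject to the size limit. */
    if (adj + len > TOY_MAXMCLBYTES)
        return TOY_FAIL;
    return TOY_NEW_MBUF;
}
```

The point of the fix shows up in the second argument combination below: a large enough but read-only first buffer now yields a new mbuf instead of an in-place write.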
Re: Replace cos and avoid FPU trigonometry (was: tanf returns NaN for large inputs)
> From: Greg Steuck
> Date: Mon, 10 Jan 2022 20:59:17 -0800
>
> Greg Steuck writes:
>
> > This failure can be reduced to a trivial program which does change
> > its behavior for the worse if s_cos.S is taken out:
> >
> > #include <stdio.h>
> > #include <math.h>
> >
> > int main(int a, char**b) {
> >   double y = -0.34061437849088045332;
> >   printf("cos(%lf)=%le delta=%e\n", y, cos(y), 0.94254960031831729956 -
> > cos(y));
> > }
> >
> > In HEAD:
> >
> > cos(-0.340614)=9.425496e-01 delta=-1.110223e-16
> >
> > while with the patch below:
> >
> > cos(-0.340614)=9.425496e-01 delta=0.000000e+00
>
> As Daniel noted, I swapped the cases. The HEAD is at 0.0 delta whereas
> the patch used to make it worse.
>
> I went looking for why things are better on FreeBSD and they have a
> different (simpler) implementation of cos. I copied it over. Given the
> common provenance, I expect the copyright situation to be unambiguous.

I think you will also need the changes done in FreeBSD commit
4339c67c485f.

> With the two patches things look almost universally better in
> regress/libm. I attached both logs from amd64.
>
> Anybody has ideas for other tests that make sense to do? Maybe people
> can help me run regress on less common platforms?
>
> Thanks
> Greg
>
> >From a0b065bd3f5d48786f77f654dfb53cbf2617b0b3 Mon Sep 17 00:00:00 2001
> From: Greg Steuck
> Date: Mon, 10 Jan 2022 20:22:07 -0800
> Subject: [PATCH 1/2] Copy cos(3) software implementation from FreeBSD-13
>
> The result passes more tests from msun suite. In particular,
> testacc(cos, -0.34061437849088045332L, 0.94254960031831729956L,
> ALL_STD_EXCEPT, FE_INEXACT);
> matches instead of being 1e-16 off.
> ---
>  lib/libm/src/k_cos.c | 45 ++--
>  lib/libm/src/s_cos.c | 6 +-
>  2 files changed, 23 insertions(+), 28 deletions(-)
>
> diff --git a/lib/libm/src/k_cos.c b/lib/libm/src/k_cos.c
> index 8f3882b6a00..0839243e90c 100644
> --- a/lib/libm/src/k_cos.c
> +++ b/lib/libm/src/k_cos.c
> @@ -36,13 +36,17 @@
>   *		   ~ cos(x) - x*y,
>   *	   a correction term is necessary in cos(x) and hence
>   *		cos(x+y) = 1 - (x*x/2 - (r - x*y))
> - *	   For better accuracy when x > 0.3, let qx = |x|/4 with
> - *	   the last 32 bits mask off, and if x > 0.78125, let qx = 0.28125.
> - *	   Then
> - *		cos(x+y) = (1-qx) - ((x*x/2-qx) - (r-x*y)).
> - *	   Note that 1-qx and (x*x/2-qx) is EXACT here, and the
> - *	   magnitude of the latter is at least a quarter of x*x/2,
> - *	   thus, reducing the rounding error in the subtraction.
> + *	   For better accuracy, rearrange to
> + *		cos(x+y) ~ w + (tmp + (r-x*y))
> + *	   where w = 1 - x*x/2 and tmp is a tiny correction term
> + *	   (1 - x*x/2 == w + tmp exactly in infinite precision).
> + *	   The exactness of w + tmp in infinite precision depends on w
> + *	   and tmp having the same precision as x.  If they have extra
> + *	   precision due to compiler bugs, then the extra precision is
> + *	   only good provided it is retained in all terms of the final
> + *	   expression for cos().  Retention happens in all cases tested
> + *	   under FreeBSD, so don't pessimize things by forcibly clipping
> + *	   any extra precision in w.
>   */
>
>  #include "math.h"
> @@ -60,25 +64,12 @@ C6  = -1.13596475577881948265e-11; /* 0xBDA8FAE9,
> 0xBE8838D4 */
>  double
>  __kernel_cos(double x, double y)
>  {
> -	double a,hz,z,r,qx;
> -	int32_t ix;
> -	GET_HIGH_WORD(ix,x);
> -	ix &= 0x7fffffff;			/* ix = |x|'s high word*/
> -	if(ix<0x3e400000) {			/* if x < 2**27 */
> -	    if(((int)x)==0) return one;		/* generate inexact */
> -	}
> +	double hz,z,r,w;
> +
>  	z  = x*x;
> -	r  = z*(C1+z*(C2+z*(C3+z*(C4+z*(C5+z*C6)))));
> -	if(ix < 0x3FD33333)			/* if |x| < 0.3 */
> -	    return one - (0.5*z - (z*r - x*y));
> -	else {
> -	    if(ix > 0x3fe90000) {		/* x > 0.78125 */
> -		qx = 0.28125;
> -	    } else {
> -	        INSERT_WORDS(qx,ix-0x00200000,0);	/* x/4 */
> -	    }
> -	    hz = 0.5*z-qx;
> -	    a  = one-qx;
> -	    return a - (hz - (z*r-x*y));
> -	}
> +	w  = z*z;
> +	r  = z*(C1+z*(C2+z*C3)) + w*w*(C4+z*(C5+z*C6));
> +	hz = 0.5*z;
> +	w  = one-hz;
> +	return w + (((one-w)-hz) + (z*r-x*y));
>  }
> diff --git a/lib/libm/src/s_cos.c b/lib/libm/src/s_cos.c
> index 8b923d5fe61..1406504e9ab 100644
> --- a/lib/libm/src/s_cos.c
> +++ b/lib/libm/src/s_cos.c
> @@ -57,7 +57,11 @@ cos(double x)
>
>      /* |x| ~< pi/4 */
>  	ix &= 0x7fffffff;
> -	if(ix <= 0x3fe921fb) return __kernel_cos(x,z);
> +	if(ix <= 0x3fe921fb) {
> +	    if(ix<0x3e46a09e)			/* if x < 2**-27 * sqrt(2) */
> +		if(((int)x)==0) return 1.0;	/* generate inexact */
> +		return __kernel_cos(x,z);
> +	}
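
Editorial sketch (not part of the thread): the `w + tmp` rearrangement described in the new k_cos.c comment can be tried in isolation. The helper below is hypothetical (`split_one_minus_half_sq` and `struct cos_split` are names made up here) and only reproduces the two lines `w = one-hz` and `(one-w)-hz` from the patch. It assumes IEEE 754 double arithmetic in round-to-nearest without value-changing optimisations such as `-ffast-math`. Under those assumptions `tmp` is the rounding error of `1 - hz`, so `w + tmp` equals `1 - x*x/2` in exact arithmetic.

```c
/* Rounded head and rounding-error tail of 1 - x*x/2. */
struct cos_split {
    double w;    /* 1 - hz, rounded to double */
    double tmp;  /* what the rounding of w lost */
};

/* Hypothetical helper mirroring the patch's "w = one-hz; (one-w)-hz". */
struct cos_split split_one_minus_half_sq(double x)
{
    struct cos_split s;
    double hz = 0.5 * x * x;   /* x*x/2 */

    s.w = 1.0 - hz;            /* leading term, may round */
    s.tmp = (1.0 - s.w) - hz;  /* recovers the part lost to rounding */
    return s;
}
```

For a tiny argument such as `0x1p-35`, `hz` is far below the precision of 1.0, so `w` rounds to exactly 1.0 and `tmp` carries all of `-hz`. For `x = 0.5` nothing is lost and `tmp` is zero. The patched `__kernel_cos` adds `tmp` to the small polynomial term before adding `w`, so the lost bits are not discarded.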