disruptive amd64 snapshot coming

2023-10-26 Thread Theo de Raadt
There is a pretty disruptive amd64 snapshot coming, so anyone who is
using snapshots for critical stuff should take a pause.  (This warning
about a development step is unusual; I won't make it common practice.)



Re: snmpd: Fix close after protocol error case

2023-10-26 Thread Theo Buehler
On Thu, Oct 26, 2023 at 10:47:36AM +0200, Martijn van Duren wrote:
> So here's an elusive one that can be triggered every now and then by the
> new regression test. Once an AgentX session is opened and we send an
> invalid packet, appl_agentx_recv() goes to appl_agentx_free(), since
> there's no recovery. appl_agentx_free() tries to neatly close all
> open sessions by sending a close-pdu, followed by calling
> appl_agentx_send() directly.
> However, if the socket has been closed in the meantime we hit
> appl_agentx_send()'s error path, which also calls appl_agentx_free().
> This in turn leads to use after free cases.
> 
> To fix this, don't call appl_agentx_send() directly anymore, but just
> schedule it via conn_wev. To make sure as much data as possible is
> written out, do a last unchecked courtesy flush before definitively
> freeing the connection. Since appl_agentx_forceclose() arms conn_wev,
> move the event_del() calls down in appl_agentx_free().
> 
> Other calls of appl_agentx_send() should be fine, but just convert
> all of them to be consistent and safe.

ok tb



Re: snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Theo Buehler
On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> This case is covered by the new regress' backend_get_toofew and
> backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> to the max it's really hard to trigger (I had to run
> backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> sequentially in a tight loop killing snmpd between iterations for the
> best chance).
> 
> When we receive an invalid varbindlist in a response we set the invalid
> variable. This in turn calls appl_varbind_error(), but the avi_state
> of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> we call appl_request_downstream_free(), which sees that the varbinds
> haven't been resolved, triggering a call to
> appl_request_upstream_resolve(). This call in turn sees that the
> error has been set and just sends out the error-response to the client
> and frees the appl_request_upstream. From here we return back to
> appl_response(), which also calls appl_request_upstream_resolve(),
> resulting in a use after free.
> 
> The main tool for fixing this issue is making use of
> appl_request_upstream's aru_locked member, which will cause
> appl_request_upstream_resolve() to return instantly. The simplest fix is
> to set aru_locked before calling appl_request_downstream_free() and
> unsetting it directly afterwards inside appl_response().
> 
> The second one is the diff proposed below, which shrinks the code.
> 
> appl_request_upstream_free() is only called once from
> appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> is only called by appl_request_upstream_resolve().
> appl_request_upstream_resolve() is called in 3 places:
> - appl_processpdu(): to kick things off
> - appl_request_downstream_free(): For when a backend disappears with
> outstanding requests
> - appl_response(): To kickstart the next round of resolving.
> 
> Since appl_request_downstream_free() is always called from
> appl_response(), we can leverage that function and make it call
> appl_request_upstream_resolve() unconditionally.
> 
> appl_request_downstream_free() is called from the following locations:
> - appl_close(): When a backend has disappeared.
> - appl_request_upstream_free(): We send out a reply early, because an
> error has been detected.
> - appl_response(): We received a response
> 
> appl_request_upstream_free() can't reenter into
> appl_request_upstream_resolve(), or it would potentially trigger new
> appl_request_downstreams. This can be prevented by setting aru_locked
> before calling appl_request_downstream_free().
> For all other cases we should rely on appl_request_upstream_resolve()'s
> logic to handle varbinds in any state, so there's no reason to make calls
> from other contexts conditional.

Your description of the bug makes sense and your choice of resolving it
as well. Thanks for the in-depth explanation, that helped a lot. Still,
I must say that I don't really feel at ease with the amount of complexity
and entanglement here. I simply can't fit all of this into my head within
a reasonable amount of time.

ok tb (fwiw)

since it resolves a real problem and simplifies things a bit.



Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Martin Pieuchot
On 26/10/23(Thu) 07:06, Miod Vallat wrote:
> > I wonder if the diff below makes a difference.  It's hard to debug and it
> > might be worth adding a counter for bad swap slots.
> 
> It did not help (but your diff is probably correct).

In that case I'd like to put both diffs in, are you ok with that?

> > Index: uvm/uvm_anon.c
> > ===================================================================
> > RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> > retrieving revision 1.56
> > diff -u -p -r1.56 uvm_anon.c
> > --- uvm/uvm_anon.c	2 Sep 2023 08:24:40 -0000	1.56
> > +++ uvm/uvm_anon.c	22 Oct 2023 21:27:42 -0000
> > @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
> > uvm_unlock_pageq(); /* free the daemon */
> > }
> > } else {
> > -   if (anon->an_swslot != 0) {
> > +   if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
> > /* This page is no longer only in swap. */
> > KASSERT(uvmexp.swpgonly > 0);
> > atomic_dec_int(&uvmexp.swpgonly);



Re: relayd.conf.5: less SSL

2023-10-26 Thread Klemens Nanni
On Tue, Oct 24, 2023 at 09:09:21AM +0200, Peter N. M. Hansteen wrote:
> On Tue, Oct 24, 2023 at 06:54:30AM +0000, Klemens Nanni wrote:
> > - parse.y still accepting undocumented "ssl" with a warning since 2014
> > - more "SSL/TLS" instead of "TLS" in manual and code comments
> 
> my take would be that while it's fine to streamline the documentation to use
> the modern terminology, I suspect there may still be ancient configurations
> out there that use the "ssl" keyword, so removing the last bit of support for
> that option should be accompanied by or preceded by a warning on relevant
> mailing lists or at least in the commit message. 
> 
> And I think undeadly.org would be more than happy to help spread the word :)

current.html entry should do for a deprecated keyword we've been warning
about for almost ten years...  I've checked faq/upgrade*.html for previous
notes, but couldn't find any.

Here's a first try, relayd regress is also happy.

Index: usr.sbin/relayd/parse.y
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.254
diff -u -p -r1.254 parse.y
--- usr.sbin/relayd/parse.y	3 Jul 2023 09:38:08 -0000	1.254
+++ usr.sbin/relayd/parse.y	26 Oct 2023 06:07:08 -0000
@@ -175,7 +175,7 @@ typedef struct {
 %token LOOKUP METHOD MODE NAT NO DESTINATION NODELAY NOTHING ON PARENT PATH
 %token PFTAG PORT PREFORK PRIORITY PROTO QUERYSTR REAL REDIRECT RELAY REMOVE
 %token REQUEST RESPONSE RETRY QUICK RETURN ROUNDROBIN ROUTE SACK SCRIPT SEND
-%token SESSION SOCKET SPLICE SSL STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
+%token SESSION SOCKET SPLICE STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
 %token TIMEOUT TLS TO ROUTER RTLABEL TRANSPARENT URL WITH TTL RTABLE
 %token MATCH PARAMS RANDOM LEASTSTATES SRCHASH KEY CERTIFICATE PASSWORD ECDHE
 %token EDH TICKETS CONNECTION CONNECTIONS CONTEXT ERRORS STATE CHANGES CHECKS
@@ -227,21 +227,12 @@ include   : INCLUDE STRING{
}
;
 
-ssltls : SSL   {
-   log_warnx("%s:%d: %s",
-   file->name, yylval.lineno,
-   "please use the \"tls\" keyword"
-   " instead of \"ssl\"");
-   }
-   | TLS
-   ;
-
 opttls : /*empty*/ { $$ = 0; }
-   | ssltls{ $$ = 1; }
+   | TLS   { $$ = 1; }
;
 
 opttlsclient   : /*empty*/ { $$ = 0; }
-   | WITH ssltls   { $$ = 1; }
+   | WITH TLS  { $$ = 1; }
;
 
 http_type  : HTTP  { $$ = 0; }
@@ -905,7 +896,7 @@ hashkey : /* empty */   {
 
 tablecheck : ICMP  { table->conf.check = CHECK_ICMP; }
| TCP   { table->conf.check = CHECK_TCP; }
-   | ssltls{
+   | TLS   {
table->conf.check = CHECK_TCP;
conf->sc_conf.flags |= F_TLS;
table->conf.flags |= F_TLS;
@@ -1114,7 +1105,7 @@ protopts_l: protopts_l protoptsl nl
| protoptsl optnl
;
 
-protoptsl  : ssltls {
+protoptsl  : TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -1122,7 +1113,7 @@ protoptsl : ssltls {
YYERROR;
}
} tlsflags
-   | ssltls {
+   | TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -2492,7 +2483,6 @@ lookup(char *s)
{ "socket", SOCKET },
{ "source-hash",SRCHASH },
{ "splice", SPLICE },
-   { "ssl",SSL },
{ "state",  STATE },
{ "sticky-address", STICKYADDR },
{ "strip",  STRIP },
Index: usr.sbin/relayd/relay.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.257
diff -u -p -r1.257 relay.c
--- usr.sbin/relayd/relay.c	3 Sep 2023 10:22:03 -0000	1.257
+++ usr.sbin/relayd/relay.c	26 Oct 2023 05:49:22 -0000
@@ -2064,7 +2064,7 @@ relay_tls_ctx_create_proto(struct protoc
 {
uint32_t protocols = 0;
 
-   /* Set the allowed SSL protocols */
+   /* Set the allowed TLS protocols */
if (proto->tlsflags & TLSFLAG_TLSV1_2)
protocols |= TLS_PROTOCOL_TLSv1_2;
if (proto->tlsflags & TLSFLAG_TLSV1_3)
@@ -2186,7 
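
A minimal sketch of what the lookup() hunk above means for old
configurations, assuming the usual parse.y pattern of a sorted keyword
table searched with bsearch(3) and a fallback to a plain string token.
The token names and the shortened table are hypothetical stand-ins, not
relayd's real ones:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical token values; relayd's real ones are generated by yacc. */
enum token { TOK_STRING, TOK_SOCKET, TOK_SPLICE, TOK_STATE, TOK_TLS };

struct keyword {
	const char	*k_name;
	enum token	 k_val;
};

/* Must stay sorted for bsearch(); the "ssl" entry is intentionally gone. */
static const struct keyword keywords[] = {
	{ "socket",	TOK_SOCKET },
	{ "splice",	TOK_SPLICE },
	{ "state",	TOK_STATE },
	{ "tls",	TOK_TLS },
};

/* Compare the search string against one table entry. */
static int
kw_cmp(const void *k, const void *e)
{
	return strcmp(k, ((const struct keyword *)e)->k_name);
}

/* Return the keyword token for s, or TOK_STRING if s is not a keyword. */
enum token
lookup(const char *s)
{
	const struct keyword *p;

	assert(s != NULL);
	p = bsearch(s, keywords, sizeof(keywords) / sizeof(keywords[0]),
	    sizeof(keywords[0]), kw_cmp);
	return p != NULL ? p->k_val : TOK_STRING;
}
```

In this model lookup("ssl") no longer yields a keyword token, so an old
"with ssl" line would presumably be treated as an ordinary string and
rejected by the grammar instead of producing the former warning.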

snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Martijn van Duren
This case is covered by the new regress' backend_get_toofew and
backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
to the max it's really hard to trigger (I had to run
backend_get_wrongorder, backend_get_toofew, backend_get_toomany
sequentially in a tight loop killing snmpd between iterations for the
best chance).

When we receive an invalid varbindlist in a response we set the invalid
variable. This in turn calls appl_varbind_error(), but the avi_state
of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
we call appl_request_downstream_free(), which sees that the varbinds
haven't been resolved, triggering a call to
appl_request_upstream_resolve(). This call in turn sees that the
error has been set and just sends out the error-response to the client
and frees the appl_request_upstream. From here we return back to
appl_response(), which also calls appl_request_upstream_resolve(),
resulting in a use after free.

The main tool for fixing this issue is making use of
appl_request_upstream's aru_locked member, which will cause
appl_request_upstream_resolve() to return instantly. The simplest fix is
to set aru_locked before calling appl_request_downstream_free() and
unsetting it directly afterwards inside appl_response().

The second one is the diff proposed below, which shrinks the code.

appl_request_upstream_free() is only called once from
appl_request_upstream_reply(). appl_request_upstream_reply() in turn
is only called by appl_request_upstream_resolve().
appl_request_upstream_resolve() is called in 3 places:
- appl_processpdu(): to kick things off
- appl_request_downstream_free(): For when a backend disappears with
outstanding requests
- appl_response(): To kickstart the next round of resolving.

Since appl_request_downstream_free() is always called from
appl_response(), we can leverage that function and make it call
appl_request_upstream_resolve() unconditionally.

appl_request_downstream_free() is called from the following locations:
- appl_close(): When a backend has disappeared.
- appl_request_upstream_free(): We send out a reply early, because an
error has been detected.
- appl_response(): We received a response

appl_request_upstream_free() can't reenter into
appl_request_upstream_resolve(), or it would potentially trigger new
appl_request_downstreams. This can be prevented by setting aru_locked
before calling appl_request_downstream_free().
For all other cases we should rely on appl_request_upstream_resolve()'s
logic to handle varbinds in any state, so there's no reason to make calls
from other contexts conditional.
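
The role of aru_locked described above can be sketched in isolation.
This is a toy model, not snmpd's code: struct upstream, resolve_runs and
the lock_first switch are made up for illustration, and only the early
return on the locked flag mirrors the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy upstream request; only "locked" corresponds to aru_locked. */
struct upstream {
	bool	locked;		/* set while the request is being torn down */
	int	resolve_runs;	/* times resolve got past the lock check */
};

/* Resolve returns instantly when the request is locked. */
static void
upstream_resolve(struct upstream *u)
{
	assert(u != NULL);
	if (u->locked)
		return;
	u->resolve_runs++;	/* real code could start new downstreams here */
}

/* Freeing a downstream request always asks the upstream to resolve. */
static void
downstream_free(struct upstream *u)
{
	upstream_resolve(u);
}

/* Teardown: lock first so downstream_free() cannot re-enter resolve. */
static void
upstream_free(struct upstream *u, bool lock_first)
{
	if (lock_first)
		u->locked = true;
	downstream_free(u);
}

/* Count how often resolve did work while the upstream was being freed. */
int
resolves_during_free(bool lock_first)
{
	struct upstream u = { false, 0 };

	upstream_free(&u, lock_first);
	return u.resolve_runs;
}
```

resolves_during_free(true) stays at zero, while resolves_during_free(false)
shows the re-entry that the lock is meant to prevent.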

OK?

martijn@

Index: application.c
===================================================================
RCS file: /cvs/src/usr.sbin/snmpd/application.c,v
retrieving revision 1.24
diff -u -p -r1.24 application.c
--- application.c	24 Oct 2023 14:21:58 -0000	1.24
+++ application.c	26 Oct 2023 09:40:23 -0000
@@ -710,6 +710,7 @@ appl_request_upstream_free(struct appl_r
if (ureq == NULL)
return;
 
+   ureq->aru_locked = 1;
for (i = 0; i < ureq->aru_varbindlen && ureq->aru_vblist != NULL; i++) {
vb = &(ureq->aru_vblist[i]);
ober_free_elements(vb->avi_varbind.av_value);
@@ -726,7 +727,6 @@ void
 appl_request_downstream_free(struct appl_request_downstream *dreq)
 {
struct appl_varbind_internal *vb;
-   int retry = 0;
 
if (dreq == NULL)
return;
@@ -736,14 +736,11 @@ appl_request_downstream_free(struct appl
 
for (vb = dreq->ard_vblist; vb != NULL; vb = vb->avi_next) {
vb->avi_request_downstream = NULL;
-   if (vb->avi_state == APPL_VBSTATE_PENDING) {
+   if (vb->avi_state == APPL_VBSTATE_PENDING)
vb->avi_state = APPL_VBSTATE_NEW;
-   retry = 1;
-   }
}
 
-   if (retry)
-   appl_request_upstream_resolve(dreq->ard_request);
+   appl_request_upstream_resolve(dreq->ard_request);
free(dreq);
 }
 
@@ -1172,9 +1169,6 @@ appl_response(struct appl_backend *backe
backend->ab_name);
backend->ab_fn->ab_close(backend, APPL_CLOSE_REASONPARSEERROR);
}
-
-   if (ureq != NULL)
-   appl_request_upstream_resolve(ureq);
 }
 
 int



snmpd: Fix close after protocol error case

2023-10-26 Thread Martijn van Duren
So here's an elusive one that can be triggered every now and then by the
new regression test. Once an AgentX session is opened and we send an
invalid packet, appl_agentx_recv() goes to appl_agentx_free(), since
there's no recovery. appl_agentx_free() tries to neatly close all
open sessions by sending a close-pdu, followed by calling
appl_agentx_send() directly.
However, if the socket has been closed in the meantime we hit
appl_agentx_send()'s error path, which also calls appl_agentx_free().
This in turn leads to use after free cases.

To fix this, don't call appl_agentx_send() directly anymore, but just
schedule it via conn_wev. To make sure as much data as possible is
written out, do a last unchecked courtesy flush before definitively
freeing the connection. Since appl_agentx_forceclose() arms conn_wev,
move the event_del() calls down in appl_agentx_free().

Other calls of appl_agentx_send() should be fine, but just convert
all of them to be consistent and safe.
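
The re-entrancy described above can be modelled without libevent. The
following is a sketch under loud assumptions: struct conn and all of its
fields are invented for illustration, write_armed merely stands in for
adding conn_wev, and a counter replaces the actual free so the model
itself never touches freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy connection; the fields are illustrative, not snmpd's real struct. */
struct conn {
	bool	socket_open;	/* false once the peer has gone away */
	bool	defer_writes;	/* true: arm the write event; false: send now */
	bool	write_armed;	/* models "conn_wev is scheduled" */
	int	queued;		/* PDUs waiting to be written */
	int	free_calls;	/* how many times teardown was entered */
};

static void	conn_free(struct conn *);

/* Write handler: a write error tears the connection down. */
static void
conn_send(struct conn *c)
{
	c->write_armed = false;
	if (!c->socket_open) {
		conn_free(c);
		return;
	}
	c->queued = 0;
}

/* Queue one PDU, then either schedule the write or perform it directly. */
static void
conn_queue(struct conn *c)
{
	c->queued++;
	if (c->defer_writes)
		c->write_armed = true;	/* stands in for event_add() */
	else
		conn_send(c);		/* may re-enter conn_free() */
}

static void
conn_free(struct conn *c)
{
	assert(c != NULL);
	if (++c->free_calls > 1)
		return;		/* in real code: a use after free */
	conn_queue(c);		/* the close-pdu */
	if (c->defer_writes && c->socket_open)
		c->queued = 0;	/* last unchecked courtesy flush */
	c->write_armed = false;	/* stands in for event_del() */
}

/* Tear down a connection whose socket is already closed. */
int
teardown_entries(bool defer_writes)
{
	struct conn c = { .socket_open = false, .defer_writes = defer_writes };

	conn_free(&c);
	return c.free_calls;
}
```

teardown_entries(false) enters conn_free() twice, which is the pattern
that leads to the use after free, while teardown_entries(true) enters it
once because the write is only scheduled.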

OK?

martijn@

Index: usr.sbin/snmpd/application_agentx.c
===================================================================
RCS file: /cvs/src/usr.sbin/snmpd/application_agentx.c,v
retrieving revision 1.12
diff -u -p -r1.12 application_agentx.c
--- usr.sbin/snmpd/application_agentx.c	24 Oct 2023 14:11:14 -0000	1.12
+++ usr.sbin/snmpd/application_agentx.c	26 Oct 2023 08:43:02 -0000
@@ -254,9 +254,6 @@ appl_agentx_free(struct appl_agentx_conn
 {
struct appl_agentx_session *session;
 
-   event_del(&(conn->conn_rev));
-   event_del(&(conn->conn_wev));
-
while ((session = TAILQ_FIRST(&(conn->conn_sessions))) != NULL) {
if (conn->conn_ax == NULL)
appl_agentx_session_free(session);
@@ -265,7 +262,12 @@ appl_agentx_free(struct appl_agentx_conn
reason);
}
 
+   event_del(&(conn->conn_rev));
+   event_del(&(conn->conn_wev));
+
RB_REMOVE(appl_agentx_conns, &appl_agentx_conns, conn);
+   if (conn->conn_ax != NULL)
+   (void)ax_send(conn->conn_ax);
ax_free(conn->conn_ax);
if (conn->conn_backend)
fatalx("AgentX(%"PRIu32"): disappeared unexpected",
@@ -419,7 +421,7 @@ appl_agentx_recv(int fd, short event, vo
pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, smi_getticks(),
APPL_ERROR_NOERROR, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
break;
case AX_PDU_TYPE_INDEXALLOCATE:
case AX_PDU_TYPE_INDEXDEALLOCATE:
@@ -431,7 +433,7 @@ appl_agentx_recv(int fd, short event, vo
APPL_ERROR_PROCESSINGERROR, 1,
pdu->ap_payload.ap_vbl.ap_varbind,
pdu->ap_payload.ap_vbl.ap_nvarbind);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
break;
case AX_PDU_TYPE_ADDAGENTCAPS:
case AX_PDU_TYPE_REMOVEAGENTCAPS:
@@ -451,7 +453,7 @@ appl_agentx_recv(int fd, short event, vo
pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, smi_getticks(),
error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
ax_pdu_free(pdu);
 
if (session == NULL || error != APPL_ERROR_PARSEERROR)
@@ -560,13 +562,13 @@ appl_agentx_open(struct appl_agentx_conn
ax_response(conn->conn_ax, session->sess_id,
pdu->ap_header.aph_transactionid, pdu->ap_header.aph_packetid,
smi_getticks(), APPL_ERROR_NOERROR, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
 
return;
  fail:
ax_response(conn->conn_ax, 0, pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, 0, error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
if (session != NULL)
free(session->sess_descr.aos_string);
free(session);
@@ -592,7 +594,7 @@ appl_agentx_close(struct appl_agentx_ses
ax_response(conn->conn_ax, pdu->ap_header.aph_sessionid,
pdu->ap_header.aph_transactionid, pdu->ap_header.aph_packetid,
smi_getticks(), error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
if (error == APPL_ERROR_NOERROR)
return;
 
@@ -612,7 +614,7 @@ appl_agentx_forceclose(struct appl_backe
session->sess_conn->conn_ax->ax_byteorder = session->sess_byteorder;
ax_close(session->sess_conn->conn_ax, session->sess_id,
(enum ax_close_reason) reason);
-   appl_agentx_send(-1, EV_WRITE, session->sess_conn);
+   event_add(&(session->sess_conn->conn_wev), NULL);
 
strlcpy(name, session->sess_backend.ab_name, sizeof(name));

apu4 real com0 boot - not working to install 7.4

2023-10-26 Thread harold felton
apologies for crossposting - feel free to direct me to the correct place
and/or ignore other versions...  i started a reddit-thread HERE which
didnt specify which list specifically - but provided a very-useful
help/ftp-site to bisect when the problem occurred...

i have a pcengines apu4 which has been running 7.1-stable with all updates
just fine...  i had been avoiding doing an upgrade - so was thinking to
reinstall based on 7.4-release this week...  i was unable to do an install
from the bsd.rd (for 7.4) - and have bisected the issue to somewhere around
the 9th or 10th of september...

i will enclose the bug-report-info i just now collected from the
running-machine...  the symptom is: the bsd.rd seems to time-out/fail at
the very-beginning from com0 and then forces a reboot to the system...

onscreen - the com0 output for (two) successful bsd.rd looks as follows:

  [00:15:43 - Thu Oct 26] {-ksh:729}
--hfeltonad...@apu4.hfelton.net:~/snaps/s20230908/usr[729]
$ doas reboot
doas (hfeltonad...@apu4.hfelton.net) password:
stopping package daemons: fossild.
syncing disks... done
rebooting...
PC Engines apu4
coreboot build 20230131
BIOS version v4.19.0.1
4080 MB ECC DRAM
SeaBIOS (version rel-1.16.0.1-0-g77603a32)

Press F10 key now for boot menu

Booting from Hard Disk...
Using drive 0, partition 3.
Loading..
probing: pc0 com0 com1 com2 com3 mem[639K 3325M 752M a20=on]
disk: hd0+
>> OpenBSD/amd64 BOOT 3.53
switching console to com>> OpenBSD/amd64 BOOT 3.53
boot> 0
b bsd.rd
booting hd0a:bsd.rd: 3891908+1614848+3895112+0+708608
[109+435984+290736]=0xa57ab0
entry point at 0x81001000
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2022 OpenBSD. All rights reserved.
https://www.OpenBSD.org

OpenBSD 7.1 (RAMDISK_CD) #440: Mon Apr 11 18:09:13 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
real mem = 4259930112 (4062MB)
avail mem = 4126810112 (3935MB)
random: good seed from bootblocks
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xcfe97040 (13 entries)
bios0: vendor coreboot version "v4.19.0.1" date 01/31/2023
bios0: PC Engines apu4
acpi0 at bios0: ACPI 6.0
acpi0: tables DSDT FACP SSDT MCFG TPM2 APIC HEST SSDT SSDT DRTM HPET
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD GX-412TC SOC, 998.27 MHz, 16-30-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,T
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB
64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, IBE
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins
ioapic1 at mainbus0: apid 5 pa 0xfec2, version 21, 32 pins
acpihpet0 at acpi0: 14318180 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PBR4)
acpiprt2 at acpi0: bus 2 (PBR5)
acpiprt3 at acpi0: bus 3 (PBR6)
acpiprt4 at acpi0: bus 4 (PBR7)
acpiprt5 at acpi0: bus -1 (PBR8)
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
com0 at acpi0 COM1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at acpi0 COM2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
amdgpio0 at acpi0 GPIO uid 0 addr 0xfed81500/0x300 irq 7, 184 pins
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"BOOT" at acpi0 not configured
acpitz at acpi0 not configured
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "AMD 16h Root Complex" rev 0x00
vendor "AMD", unknown product 0x1567 (class system subclass IOMMU, rev
0x00) at pci0 dev 0 function 2 not configured
pchb1 at pci0 dev 2 function 0 "AMD 16h Host" rev 0x00
ppb0 at pci0 dev 2 function 1 "AMD 16h PCIE" rev 0x00: msi
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03: msi, address
00:0d:b9:55:bb:64
ppb1 at pci0 dev 2 function 2 "AMD 16h PCIE" rev 0x00: msi
pci2 at ppb1 bus 2
em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03: msi, address
00:0d:b9:55:bb:65
ppb2 at pci0 dev 2 function 3 "AMD 16h PCIE" rev 0x00: msi
pci3 at ppb2 bus 3
em2 at pci3 dev 0 function 0 "Intel I211" rev 

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Miod Vallat
> I wonder if the diff below makes a difference.  It's hard to debug and it
> might be worth adding a counter for bad swap slots.

It did not help (but your diff is probably correct).

> Index: uvm/uvm_anon.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> retrieving revision 1.56
> diff -u -p -r1.56 uvm_anon.c
> --- uvm/uvm_anon.c	2 Sep 2023 08:24:40 -0000	1.56
> +++ uvm/uvm_anon.c	22 Oct 2023 21:27:42 -0000
> @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
>   uvm_unlock_pageq(); /* free the daemon */
>   }
>   } else {
> - if (anon->an_swslot != 0) {
> + if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
>   /* This page is no longer only in swap. */
>   KASSERT(uvmexp.swpgonly > 0);
>   atomic_dec_int(&uvmexp.swpgonly);