subject:"Re\:"

Re: BUFSIZ-related pessimization in fvwrite.c

2024-03-29 Thread Christos Zoulas

In article ,
enh   wrote:
>(if anyone knows the equivalent freebsd list, please add them --- this
>code is the same between all three bsds.)
>
>Apple patched their copy of the FreeBSD fvwrite.c to change BUFSIZ to
>INT_MAX to massively improve the performance of large unbuffered
>writes[1], and patched the buffered case to use the largest multiple
>of the buffer size to massively improve the performance of large
>buffered writes:
>
>https://github.com/apple-oss-distributions/Libc/commit/c5a3293354e22262702a3add5b2dfc9bb0b93b85#diff-3b844a19cfb0aab1a23f7fbc457d3bce7453513730c489a72f66ff89d6557ff3
>
>i've tested similar changes to openbsd's fvwrite.c with great results.
>
>is it a _requirement_ that buffered writes _only_ happen in multiples
>of the buffer size? that seems unlikely to me (a) because of short
>writes and (b) musl just does a writev() of what's in the buffer and
>what it was just given in case of a larger-than-buffer write and (c)
>we already changed the openbsd fread() to read directly into the
>caller's buffer regardless of size, so the equivalent behavior on the
>_write_ side seems reasonable to me?

Makes sense, committed.

christos

Re: snmpd: Move SNMPv2-SMI::snmpV2 to application_internal.c

2023-10-29 Thread Martijn van Duren

On Sun, 2023-10-29 at 20:11 +0100, Martijn van Duren wrote:
> Similar reasoning and questions as the move of SNMPv2-MIB::snmp.
> This moves SNMP-FRAMEWORK-MIB::snmpEngine and
> SNMP-USER-BASED-SM-MIB::usmStats from mib.c to application_internal.c,
> under SNMPv2-SMI::snmpV2. The reason for this broader umbrella is that
> other backends have no business fiddling around under this region.
> Since different backends bite eachother with exclusive regions, both
> snmpEngine and usmStats must be moved at the same time.
> 
> This change also breaks snmpd.sh from regress. This places a few custom
> OIDs in the usmUserTable, which is under snmpV2. Since these tests are
> mostly there to test snmp(1) I think it's worth removing these tests and
> build a proper snmp(1) regress suite another day (regress part in next
> mail)
> 
> OK?
> 
> martijn@
> 
Here's the regress diff

Index: snmpd.sh
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/snmpd.sh,v
retrieving revision 1.18
diff -u -p -r1.18 snmpd.sh
--- snmpd.sh19 Jan 2022 11:02:38 -  1.18
+++ snmpd.sh29 Oct 2023 19:14:33 -
@@ -271,8 +271,6 @@ read-write community non-default-rw
 oid 1.3.6.1.4.1.30155.42.1 name myName read-only string "humppa"
 oid 1.3.6.1.4.1.30155.42.2 name myStatus read-only integer 1
 # No need to place a full index, we just need the object
-oid 1.3.6.1.6.3.15.1.2.2.1.3.1 name Reyk read-only string "Reyk Fl${oe}ter"
-oid 1.3.6.1.6.3.15.1.2.2.1.3.2 name broken read-only string "br${boe}ken"
 EOF
 
 (cd ${OBJDIR} && nohup snmpd -dvf ./snmpd.conf > snmpd.log 2>&1) &
@@ -344,38 +342,6 @@ fi
 #  echo "Setting of a ro custom oid test unexpectedly succeeded."
 #  FAILED=1
 #fi
-
-snmp_command="snmp get -Oqv -v2c -c non-default-rw localhost \
-usmUserSecurityName.1.0"
-echo === $snmp_command
-reyk="$(eval LC_ALL=en_US.UTF-8 $snmp_command)"
-if [ "$reyk" != "Reyk Fl${oe}ter" ]
-then
-   echo "Printing of UTF-8 string in UTF-8 locale failed"
-   FAILED=1
-fi
-reyk="$(eval LC_ALL=C $snmp_command)"
-if [ "$reyk" != "Reyk Fl.ter" ]
-then
-   echo "Printing of UTF-8 string in C locale failed"
-   FAILED=1
-fi
-
-snmp_command="snmp get -Oqv -v2c -c non-default-rw localhost \
-usmUserSecurityName.2.0"
-echo === $snmp_command
-broken="$(eval LC_ALL=en_US.UTF-8 $snmp_command)"
-if [ "$broken" != "br${replacement}ken" ]
-then
-   echo "Printing of UTF-8 replacement character failed"
-   FAILED=1
-fi
-broken="$(eval LC_ALL=C $snmp_command)"
-if [ "$broken" != "br?ken" ]
-then
-   echo "Printing of question mark in C locale failed"
-   FAILED=1
-fi
 
 kill $(pgrep snmpd) >/dev/null 2>&1

Re: azalia: use HDMI as second fallback

2023-10-29 Thread Peter Hessler

Yes please!


On 2023 Oct 29 (Sun) at 14:55:09 +0100 (+0100), Christopher Zimmermann wrote:
:Hi,
:
:for me azalia HDMI audio playback works fine. According to [1] it had or
:still has problems. For machines where the default audio should not be
:(possibly broken) rsnd/0, but rsnd/1 this can be configured in sndiod or
:AUDIODEVICE.
:
:This diff would attach azalia even when there are only HDMI codecs available.
:
:
:Christopher
:
:[1] 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/azalia.c?rev=1.155=text/x-cvsweb-markup
:
:
:
:Index: azalia.c
:===
:RCS file: /cvs/src/sys/dev/pci/azalia.c,v
:retrieving revision 1.284
:diff -u -p -r1.284 azalia.c
:--- azalia.c   30 Jul 2023 08:46:03 -  1.284
:+++ azalia.c   29 Oct 2023 13:39:04 -
:@@ -950,28 +950,19 @@ azalia_init_codecs(azalia_t *az)
:   return(1);
:   }
:-  /* Use the first codec capable of analog I/O.  If there are none,
:-   * use the first codec capable of digital I/O.  Skip HDMI codecs.
:-   */
:+  /* Prefer analog over digital codecs over HDMI codecs. */
:   c = -1;
:   for (i = 0; i < az->ncodecs; i++) {
:-  codec = >codecs[i];
:-  if ((codec->audiofunc < 0) ||
:-  (codec->codec_type == AZ_CODEC_TYPE_HDMI))
:-  continue;
:-  if (codec->codec_type == AZ_CODEC_TYPE_DIGITAL) {
:-  if (c < 0)
:-  c = i;
:-  } else {
:+  if (az->codecs[i].audiofunc >= 0 &&
:+  (c = -1 ||
:+  az->codecs[i].codec_type < az->codecs[c].codec_type))
:   c = i;
:-  break;
:-  }
:   }
:-  az->codecno = c;
:-  if (az->codecno < 0) {
:+  if (c == -1) {
:   printf("%s: no supported codecs\n", XNAME(az));
:   return(1);
:   }
:+  az->codecno = c;
:   printf("%s: codecs: ", XNAME(az));
:   for (i = 0; i < az->ncodecs; i++) {
:

Re: Removing syscall(2) from libc and kernel

2023-10-29 Thread Theo de Raadt

break;
-   default:
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
-
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp = sysent + code;
argsize = (callp->sy_argsize >> 3) + argoff;
if (argsize) {
switch (MIN(argsize, 6)) {
@@ -620,7 +603,7 @@ syscall(struct trapframe *frame)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, argp, rval);
+   error = mi_syscall(p, code, callp, argp, rval);
 
switch (error) {
case 0:
Index: sys/arch/arm/arm/syscall.c
===
RCS file: /cvs/src/sys/arch/arm/arm/syscall.c,v
diff -u -p -u -r1.26 syscall.c
--- sys/arch/arm/arm/syscall.c  11 Feb 2023 23:07:26 -  1.26
+++ sys/arch/arm/arm/syscall.c  28 Oct 2023 15:11:05 -
@@ -93,8 +93,8 @@ void
 swi_handler(trapframe_t *frame)
 {
struct proc *p = curproc;
-   const struct sysent *callp;
-   int code, error, indirect = -1;
+   const struct sysent *callp = sysent;
+   int code, error = ENOSYS;
u_int nap = 4, nargs;
register_t *ap, *args, copyargs[MAXARGS], rval[2];
 
@@ -103,32 +103,19 @@ swi_handler(trapframe_t *frame)
/* Before enabling interrupts, save FPU state */
vfp_save();
 
-   /* Re-enable interrupts if they were enabled previously */
-   if (__predict_true((frame->tf_spsr & PSR_I) == 0))
-   enable_interrupts(PSR_I);
+   enable_interrupts(PSR_I);
 
p->p_addr->u_pcb.pcb_tf = frame;
 
/* Skip over speculation-blocking barrier. */
frame->tf_pc += 8;
 
-   code = frame->tf_r12;
-
ap = >tf_r0;
 
-   switch (code) { 
-   case SYS_syscall:
-   indirect = code;
-   code = *ap++;
-   nap--;
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   code = frame->tf_r12;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
nargs = callp->sy_argsize / sizeof(register_t);
if (nargs <= nap) {
@@ -145,27 +132,23 @@ swi_handler(trapframe_t *frame)
rval[0] = 0;
rval[1] = frame->tf_r1;
 
-   error = mi_syscall(p, code, indirect, callp, args, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
frame->tf_r0 = rval[0];
frame->tf_r1 = rval[1];
-
frame->tf_spsr &= ~PSR_C;   /* carry bit */
break;
-
case ERESTART:
/*
 * Reconstruct the pc to point at the swi.
 */
frame->tf_pc -= 12;
break;
-
case EJUSTRETURN:
/* nothing to do */
break;
-
default:
bad:
frame->tf_r0 = error;
Index: sys/arch/arm64/arm64/syscall.c
===
RCS file: /cvs/src/sys/arch/arm64/arm64/syscall.c,v
diff -u -p -u -r1.14 syscall.c
--- sys/arch/arm64/arm64/syscall.c  13 Apr 2023 02:19:04 -  1.14
+++ sys/arch/arm64/arm64/syscall.c  28 Oct 2023 15:11:05 -
@@ -33,7 +33,7 @@ svc_handler(trapframe_t *frame)
 {
struct proc *p = curproc;
const struct sysent *callp;
-   int code, error, indirect = -1;
+   int code, error = ENOSYS;
u_int nap = 8, nargs;
register_t *ap, *args, copyargs[MAXARGS], rval[2];
 
@@ -50,19 +50,9 @@ svc_handler(trapframe_t *frame)
 
ap = >tf_x[0];
 
-   switch (code) { 
-   case SYS_syscall:
-   indirect = code;
-   code = *ap++;
-   nap--;
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp = sysent + code;
 
nargs = callp->sy_argsize / sizeof(register_t);
if (nargs <= nap) {
@@ -79,25 +69,22 @@ svc_handler(trapframe_t *frame)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
frame->tf_x[0] = rval[0];
frame->tf_spsr &= ~PSR_C;   /* carry bit */
break;
-
case ERESTART:
/*
 *

Re: relayd.conf.5: less SSL

2023-10-28 Thread Sebastian Benoit

Klemens Nanni(k...@openbsd.org) on 2023.10.26 13:28:42 +:
> On Tue, Oct 24, 2023 at 09:09:21AM +0200, Peter N. M. Hansteen wrote:
> > On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> > > - parse.y still accepting undocumented "ssl" with a warning since 2014
> > > - more "SSL/TLS" instead of "TLS" in manual and code comments
> > 
> > my take would be that while it's fine to streamline the documentation to use
> > the modern terminology, I suspect there may still be ancient configurations
> > out there that use the "ssl" keyword, so removing the last bit of support 
> > for
> > that option should be accompanied by or preceded by a warning on relevant
> > mailing lists or at least in the commit message. 
> > 
> > And I think undeadly.org would be more than happy to help spread the word :)
> 
> current.html entry should do for a deprecated keyword we've been warning
> about for almost ten years...

Yes, please kick it where it belongs.

> I've checked faq/upgrade*.html for previous
> notes, but couldn't find any.

no, because it wasnt removed after 2 releases with the warning.

> Here's a first try, relayd regress is also happy.

ok benno@

> Index: usr.sbin/relayd/parse.y
> ===
> RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
> retrieving revision 1.254
> diff -u -p -r1.254 parse.y
> --- usr.sbin/relayd/parse.y   3 Jul 2023 09:38:08 -   1.254
> +++ usr.sbin/relayd/parse.y   26 Oct 2023 06:07:08 -
> @@ -175,7 +175,7 @@ typedef struct {
>  %token   LOOKUP METHOD MODE NAT NO DESTINATION NODELAY NOTHING ON PARENT 
> PATH
>  %token   PFTAG PORT PREFORK PRIORITY PROTO QUERYSTR REAL REDIRECT RELAY 
> REMOVE
>  %token   REQUEST RESPONSE RETRY QUICK RETURN ROUNDROBIN ROUTE SACK 
> SCRIPT SEND
> -%token   SESSION SOCKET SPLICE SSL STICKYADDR STRIP STYLE TABLE TAG 
> TAGGED TCP
> +%token   SESSION SOCKET SPLICE STICKYADDR STRIP STYLE TABLE TAG TAGGED 
> TCP
>  %token   TIMEOUT TLS TO ROUTER RTLABEL TRANSPARENT URL WITH TTL RTABLE
>  %token   MATCH PARAMS RANDOM LEASTSTATES SRCHASH KEY CERTIFICATE 
> PASSWORD ECDHE
>  %token   EDH TICKETS CONNECTION CONNECTIONS CONTEXT ERRORS STATE CHANGES 
> CHECKS
> @@ -227,21 +227,12 @@ include : INCLUDE STRING{
>   }
>   ;
>  
> -ssltls   : SSL   {
> - log_warnx("%s:%d: %s",
> - file->name, yylval.lineno,
> - "please use the \"tls\" keyword"
> - " instead of \"ssl\"");
> - }
> - | TLS
> - ;
> -
>  opttls   : /*empty*/ { $$ = 0; }
> - | ssltls{ $$ = 1; }
> + | TLS   { $$ = 1; }
>   ;
>  
>  opttlsclient : /*empty*/ { $$ = 0; }
> - | WITH ssltls   { $$ = 1; }
> + | WITH TLS  { $$ = 1; }
>   ;
>  
>  http_type: HTTP  { $$ = 0; }
> @@ -905,7 +896,7 @@ hashkey   : /* empty */   {
>  
>  tablecheck   : ICMP  { table->conf.check = CHECK_ICMP; }
>   | TCP   { table->conf.check = CHECK_TCP; }
> - | ssltls{
> + | TLS   {
>   table->conf.check = CHECK_TCP;
>   conf->sc_conf.flags |= F_TLS;
>   table->conf.flags |= F_TLS;
> @@ -1114,7 +1105,7 @@ protopts_l  : protopts_l protoptsl nl
>   | protoptsl optnl
>   ;
>  
> -protoptsl: ssltls {
> +protoptsl: TLS {
>   if (!(proto->type == RELAY_PROTO_TCP ||
>   proto->type == RELAY_PROTO_HTTP)) {
>   yyerror("can set tls options only for "
> @@ -1122,7 +1113,7 @@ protoptsl   : ssltls {
>   YYERROR;
>   }
>   } tlsflags
> - | ssltls {
> + | TLS {
>   if (!(proto->type == RELAY_PROTO_TCP ||
>   proto->type == RELAY_PROTO_HTTP)) {
>   yyerror("can set tls options only for "
> @@ -2492,7 +2483,6 @@ lookup(char *s)
>   { "socket", SOCKET },
>   { "source-hash",SRCHASH },
>   { "splice", SPLICE },
> - { "ssl",SSL },
>   { "state",  STATE },
>   { "sticky-address", STICKYADDR },
>   { "strip",  STRIP },
> Index: usr.sbin/relayd/relay.c
> ===
> RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
> retrieving revision 1.257
> diff -u -p -r1.257 relay.c
> --- usr.sbin/relayd/relay.c   3 Sep 2023 10:22:03 -   1.257
> +++ usr.sbin/relayd/relay.c   26 Oct 2023 05:49:22

Re: snmpd: don't move back in oid tree when on/below instance below region

2023-10-28 Thread Theo Buehler

On Sat, Oct 28, 2023 at 09:59:25AM +0200, Martijn van Duren wrote:
> Right now when we register a region with below that an instance we
> can revert back in the tree. When we request below the instance we
> currently use appl_region_next() to find the next region and simply
> set the to be find oid to the the oid of the new region. In the
> situation described above this means that we return the parent
> region and set the oid to that of the parent region, which is smaller
> than the instance oid.
> 
> The easiest fix is to increment the searched oid to nextsibling of the
> instance and let the normal appl_varbind_backend() logic handle it from
> there.

ok tb

Re: libagentx: don't return responses >= searchrange.end

2023-10-28 Thread Theo Buehler

On Sat, Oct 28, 2023 at 09:41:55AM +0200, Martijn van Duren wrote:
> In most cases when a region is registered we have the full ownership.
> As soon as a region has been registered below prior mentioned region we
> could loose ownership halfway through. This case currently isn't fully
> tested and with indices we can return OIDs >= searchrange.end.
> The easiest way is to test this case is in agentx_varbind_finalize()
> and simply reset a varbind to EOMV if we're outside our range.

Makes sense.

> I've pulled apart the agentx_request_type cases for clarity and control.

Yes, that's more readable.

ok tb

> 
> OK?
> 
> martijn@
> 
> Index: agentx.c
> ===
> RCS file: /cvs/src/lib/libagentx/agentx.c,v
> retrieving revision 1.23
> diff -u -p -r1.23 agentx.c
> --- agentx.c  24 Oct 2023 08:54:52 -  1.23
> +++ agentx.c  28 Oct 2023 07:40:59 -
> @@ -2822,7 +2822,7 @@ getnext:
>   while (axo != NULL && axo->axo_cstate != AX_CSTATE_OPEN)
>   axo = RB_NEXT(axc_objects, &(axc->axc_objects), axo);
>   if (axo == NULL ||
> - ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) > 0) {
> + ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) >= 0) {
>   agentx_varbind_endofmibview(axv);
>   return;
>   }
> @@ -3349,19 +3349,53 @@ agentx_varbind_finalize(struct agentx_va
>  #endif
>   }
>   }
> - cmp = ax_oid_cmp(&(axv->axv_vb.avb_oid), );
> - if ((agentx_varbind_request(axv) == AGENTX_REQUEST_TYPE_GETNEXT &&
> - cmp >= 0) || cmp > 0) {
> + cmp = ax_oid_cmp(, &(axv->axv_vb.avb_oid));
> + switch (agentx_varbind_request(axv)) {
> + case AGENTX_REQUEST_TYPE_GET:
> + if (cmp != 0) {
>  #ifdef AX_DEBUG
> - agentx_log_axg_fatalx(axg, "indices not incremented");
> + agentx_log_axg_fatalx(axg, "index changed");
>  #else
> - agentx_log_axg_warnx(axg, "indices not incremented");
> - bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> - sizeof(axv->axv_start));
> - axv->axv_error = AX_PDU_ERROR_GENERR;
> + agentx_log_axg_warnx(axg, "index changed");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
>  #endif
> - } else
> + }
> + break;
> + case AGENTX_REQUEST_TYPE_GETNEXT:
> + if (cmp <= 0) {
> +#ifdef AX_DEBUG
> + agentx_log_axg_fatalx(axg, "indices not incremented");
> +#else
> + agentx_log_axg_warnx(axg, "indices not incremented");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
> +#endif
> + }
> + /* FALLTHROUGH */
> + case AGENTX_REQUEST_TYPE_GETNEXTINCLUSIVE:
> + if (cmp < 0) {
> +#ifdef AX_DEBUG
> + agentx_log_axg_fatalx(axg, "index decremented");
> +#else
> + agentx_log_axg_warnx(axg, "index decremented");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
> +#endif
> + }
> + if (axv->axv_end.aoi_idlen != 0 &&
> + ax_oid_cmp(, &(axv->axv_end)) >= 0) {
> + agentx_varbind_endofmibview(axv);
> + return;
> + }
>   bcopy(, &(axv->axv_vb.avb_oid), sizeof(oid));
> + }
>  done:
>   agentx_object_unlock(axv->axv_axo);
>   agentx_get_finalize(axv->axv_axg);
>

Re: snmpd: reject OIDS equal to searchrange.end

2023-10-28 Thread Martijn van Duren

On Sat, 2023-10-28 at 09:45 +0200, Martijn van Duren wrote:
> RFC 2741 section 5.2 states that searchrange.end is non-inclusive.
> appl_varbind_valid() and appl_response() currently tests inclusive.
> The appl_varbind_valid() case is for backends that support
> searchrange.end (like agentx) and the appl_response() case for
> those who do not and need a fixup (basically everything else).
> 
> OK?
> 
> martijn
> 
Here's the regress part I forgot to add.

Index: Makefile
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/Makefile,v
retrieving revision 1.6
diff -u -p -r1.6 Makefile
--- Makefile27 Oct 2023 10:26:20 -  1.6
+++ Makefile28 Oct 2023 07:49:31 -
@@ -170,6 +170,7 @@ BACKEND_TARGETS+=   backend_getnext_stale
 BACKEND_TARGETS+=  backend_getnext_inclusive_backwards
 BACKEND_TARGETS+=  backend_getnext_toofew
 BACKEND_TARGETS+=  backend_getnext_toomany
+BACKEND_TARGETS+=  backend_getnext_response_equal_end
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_one
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_two
 BACKEND_TARGETS+=  backend_getbulk_nonrep_one_maxrep_one
Index: backend.c
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/backend.c,v
retrieving revision 1.1
diff -u -p -r1.1 backend.c
--- backend.c   24 Oct 2023 14:34:40 -  1.1
+++ backend.c   28 Oct 2023 07:49:31 -
@@ -2633,6 +2633,62 @@ backend_getnext_toomany(void)
 }
 
 void
+backend_getnext_response_equal_end(void)
+{
+   struct sockaddr_storage ss;
+   struct sockaddr *sa = (struct sockaddr *)
+   socklen_t salen;
+   int snmp_s, ax_s;
+   uint32_t sessionid1, sessionid2;
+   struct varbind varbind[] = {
+   {
+   .type = TYPE_NULL,
+   .name = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+   .data.int32 = 1
+   },
+   };
+   struct searchrange searchrange[] = {
+   {
+   .start = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+   .end = OID_STRUCT(MIB_BACKEND_GETNEXT, 26, 1, 1)
+   },
+   };
+   int32_t requestid;
+   char buf[1024];
+   size_t n;
+
+   ax_s = agentx_connect(axsocket);
+   sessionid1 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 26, 1),
+   "backend_getnext_end_equal.1");
+   sessionid2 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 26, 2),
+   "backend_getnext_end_equal.2");
+   agentx_register(ax_s, sessionid1, 0, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 26), 0);
+   agentx_register(ax_s, sessionid2, 0, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 26, 1, 1), 0);
+
+   salen = snmp_resolve(SOCK_DGRAM, hostname, servname, sa);
+   snmp_s = snmp_connect(SOCK_DGRAM, sa, salen);
+   requestid = snmpv2_getnext(snmp_s, community, 0, varbind, 1);
+
+   /* Fool agentx_getnext_handle() */
+   varbind[0].name.subid[varbind[0].name.n_subid++] = 1;
+   varbind[0].type = TYPE_INTEGER;
+   n = agentx_read(ax_s, buf, sizeof(buf), 1000);
+   agentx_getnext_handle(__func__, buf, n, 0, sessionid1, searchrange,
+   varbind, 1);
+   varbind[0].name = searchrange[0].end;
+   agentx_response(ax_s, buf, NOERROR, 0, varbind, 1);
+   varbind[0].type = TYPE_NULL;
+   varbind[0].name = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+
+   snmpv2_response_validate(snmp_s, 1000, community, requestid, GENERR, 1,
+   varbind, 1);
+}
+
+void
 backend_getbulk_nonrep_zero_maxrep_one(void)
 {
struct sockaddr_storage ss;
Index: regress.h
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/regress.h,v
retrieving revision 1.2
diff -u -p -r1.2 regress.h
--- regress.h   27 Oct 2023 10:26:20 -  1.2
+++ regress.h   28 Oct 2023 07:49:31 -
@@ -293,6 +293,7 @@ void backend_getnext_stale(void);
 void backend_getnext_inclusive_backwards(void);
 void backend_getnext_toofew(void);
 void backend_getnext_toomany(void);
+void backend_getnext_response_equal_end(void);
 void backend_getbulk_nonrep_zero_maxrep_one(void);
 void backend_getbulk_nonrep_zero_maxrep_two(void);
 void backend_getbulk_nonrep_one_maxrep_one(void);
Index: snmpd_regress.c
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/snmpd_regress.c,v
retrieving revision 1.2
diff -u -p -r1.2 snmpd_regress.c
--- snmpd_regress.c 27 Oct 2023 10:26:20 -  1.2
+++ snmpd_regress.c 28 Oct 2023 07:49:31 -
@@ -143,6 +143,7 @@ const struct {
{ "backend_getnext_inclusive_backwards", 
backend_getnext_inclusive_backwards },
{ "backend_getnext_toofew",

Re: snmpd: reject OIDS equal to searchrange.end

2023-10-28 Thread Theo Buehler

On Sat, Oct 28, 2023 at 09:45:25AM +0200, Martijn van Duren wrote:
> RFC 2741 section 5.2 states that searchrange.end is non-inclusive.
> appl_varbind_valid() and appl_response() currently tests inclusive.
> The appl_varbind_valid() case is for backends that support
> searchrange.end (like agentx) and the appl_response() case for
> those who do not and need a fixup (basically everything else).
> 
> OK?

ok tb

> 
> martijn
> 
> diff --git a/application.c b/application.c
> index 33143d6..b3a2603 100644
> --- a/application.c
> +++ b/application.c
> @@ -1100,7 +1100,7 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>*/
>   eomv |= !backend->ab_range && next &&
>   ober_oid_cmp(&(vb->av_oid),
> - &(origvb->avi_varbind.av_oid_end)) > 0;
> + &(origvb->avi_varbind.av_oid_end)) >= 0;
>   /* RFC 3584 section 4.2.2.1 */
>   if (ureq->aru_pduversion == SNMP_V1 &&
>   vb->av_value != NULL &&
> @@ -1283,7 +1283,7 @@ appl_varbind_valid(struct appl_varbind *varbind,
>   }
>   }
>   if (range && ober_oid_cmp(&(varbind->av_oid),
> - &(request->avi_varbind.av_oid_end)) > 0) {
> + &(request->avi_varbind.av_oid_end)) >= 0) {
>   *errstr = "end oid not honoured";
>   return 0;
>   }
>

Re: snmpd; Fix use after free for appl_request_upstream

2023-10-27 Thread Martijn van Duren

On Thu, 2023-10-26 at 21:38 +0200, Theo Buehler wrote:
> On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> > This case is covered by the new regress' backend_get_toofew and
> > backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> > to the max it's really hard to trigger (I had to run
> > backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> > sequentially in a tight loop killing snmpd between iterations for the
> > best chance).
> > 
> > When we receive an invalid varbindlist in a response we set the invalid
> > variable. This in turn calls appl_varbind_error(), but the avi_state
> > of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> > we call appl_request_downstream_free(), which sees that the varbinds
> > haven't been resolved, triggering a call to
> > appl_request_upstream_resolve(). This call in turn sees that the
> > error has been set and just sends out the error-response to the client
> > and frees the appl_request_upstream. From here we return back to
> > appl_response(), which also calls appl_request_upstream_resolve(),
> > resulting in a use after free.
> > 
> > The main tool for fixing this issue is making use of
> > appl_request_upstream's aru_locked member, which will cause
> > appl_request_upstream_resolve() to return instantly. The simplest fix is
> > to set aru_locked before calling appl_request_downstream_free() and
> > unsetting it directly afterwards inside appl_response().
> > 
> > The second one is the diff proposed below, which shrinks the code.
> > 
> > appl_request_upstream_free() is only called once from
> > appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> > is only called by appl_request_upstream_resolve().
> > appl_request_upstream_resolve() is called in 3 places:
> > - appl_processpdu(): to kick things off
> > - appl_request_downstream_free(): For when a backend disappears with
> > outstanding requests
> > - appl_response(): To kickstart the next round of resolving.
> > 
> > Since appl_request_downstream_free() is always called from
> > appl_response(), we can leverage that function and make it call
> > appl_request_upstream_resolve() unconditionally.
> > 
> > appl_request_downstream_free() is called from the following locations:
> > - appl_close(): When a backend has disappeared.
> > - appl_request_upstream_free(): We send out a reply early, because an
> > error has been detected.
> > - appl_response(): We received a response
> > 
> > appl_request_upstream_free() can't reenter into
> > appl_request_upstream_resolve(), or it would potentially trigger new
> > appl_request_downstreams. This can be prevented by setting aru_locked
> > before calling appl_request_downstream_free().
> > For all other cases we should rely on appl_request_upstream_resolve()'s
> > logic to handle varbinds in any state, so there's no reason make calls
> > from other contexts conditional.
> 
> Your description of the bug makes sense and your choice of resolving it
> as well. Thanks for the in-depth explanation, that helped a lot. Still,
> I must say that I don't really feel at ease with the amount of complexity
> and entanglement here. I simply can't fit all of this into my head within
> a reasonable amount of time.
> 
Well, the S in SNMP doesn't stand for "I want to pull my hair out" for
no reason.

So for people interested, this is the high-over flow of the code:
When a message is received snmpe.c sends the PDU part to
appl_processpdu(). This function pulls apart the request in structs that
can be worked with internally (1 struct appl_request_upstream + n
struct appl_varbind_internal).
After things are pulled apart appl_processpdu() calls
appl_request_upstream_resolve(), which is responsible for retrieving
the values for the requested varbinds. It does this by going over all
varbinds in the APPL_VBSTATE_NEW state, and calling
appl_varbind_backend() to find a matching backend for the request. In
the case of a getnext/getbulk request it can also increment the oid
to the next matching region (if needed) and it also fills the end oid
where the ownership of the backend ends. After all APPL_VBSTATE_NEW
varbinds have found their backend, they are grouped together by backend
in a struct appl_request_downstream and send out via
appl_request_downstream_send().

Once a backend has a response available it calls appl_response(). This
function verifies the returned data via appl_varbind_valid() and
appl_error_valid(), and checks if the data matches the supported
datatypes (e.g. counter64 on SNMPv1). After that the data is put into
the corresponding struct appl_varbind_internal, or in case of an EOMV
we increment the OID to the end OID of the last request and reset the
varbind to APPL_VBSTATE_NEW. After that the struct
appl_request_downstream is freed and we return back to
appl_request_upstream_resolve(). If all varbinds have reached
APPL_VBSTATE_DONE (or an error occured) we call
appl_request_upstream_reply() which

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Lucas Gabriel Vuotto

On Fri, Oct 27, 2023 at 09:36:25AM -0600, Theo de Raadt wrote:
> Index: sys/arch/m88k/m88k/trap.c
> ===
> RCS file: /cvs/src/sys/arch/m88k/m88k/trap.c,v
> diff -u -p -u -r1.128 trap.c
> --- sys/arch/m88k/m88k/trap.c 2 Aug 2023 06:14:46 -   1.128
> +++ sys/arch/m88k/m88k/trap.c 27 Oct 2023 03:17:47 -
> @@ -1153,9 +1153,9 @@ void
>  m88100_syscall(register_t code, struct trapframe *tf)
>  {
>   int i, nap;
> - const struct sysent *callp;
> + const struct sysent *callp = sysent;
>   struct proc *p = curproc;
> - int error, indirect = -1;
> + int error;
>   register_t args[8] __aligned(8);
>   register_t rval[2] __aligned(8);
>   register_t *ap;
> @@ -1172,19 +1172,9 @@ m88100_syscall(register_t code, struct t
>   ap = >tf_r[2];
>   nap = 8; /* r2-r9 */
>  
> - switch (code) {
> - case SYS_syscall:
> - indirect = code;
> - code = *ap++;
> - nap--;
> - break;
> - }
> -
> - callp = sysent;
> - if (code < 0 || code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> - callp += code;
> + if (code <= 0 || code >= SYS_MAXSYSCALL)
> + goto bad;
> + callp += code;
>  
>   i = callp->sy_argsize / sizeof(register_t);
>   if (i > sizeof(args) / sizeof(register_t))
> @@ -1200,7 +1190,7 @@ m88100_syscall(register_t code, struct t
>   rval[0] = 0;
>   rval[1] = tf->tf_r[3];
>  
> - error = mi_syscall(p, code, indirect, callp, args, rval);
> + error = mi_syscall(p, code, callp, args, rval);
>  
>   /*
>* system call will look like:
> @@ -1266,7 +1256,7 @@ void
>  m88110_syscall(register_t code, struct trapframe *tf)
>  {
>   int i, nap;
> - const struct sysent *callp;
> + const struct sysent *callp = sysent;
>   struct proc *p = curproc;
>   int error;
>   register_t args[8] __aligned(8);
> @@ -1285,17 +1275,8 @@ m88110_syscall(register_t code, struct t
>   ap = >tf_r[2];
>   nap = 8;/* r2-r9 */
>  
> - switch (code) {
> - case SYS_syscall:
> - code = *ap++;
> - nap--;
> - break;
> - }
> -
> - callp = sysent;
> - if (code < 0 || code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> + // XXX out of range stays on syscall0, which we assume is enosys
> + if (code >= 0 || code <= SYS_MAXSYSCALL)
>   callp += code;
>  
>   i = callp->sy_argsize / sizeof(register_t);

Shouldn't this be

if (code > 0 && code < SYS_MAXSYSCALL)

?

All the other places in the diff modify callp under that condition. Also
all the values of code >= SYS_MAXSYSCALL will be covered by code >= 0.

Out of curiosity, any reason why m88k doesn't do the goto bad early on?
Other than RTFM for the calling convention.

> Index: sys/arch/mips64/mips64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/mips64/mips64/trap.c,v
> diff -u -p -u -r1.167 trap.c
> --- sys/arch/mips64/mips64/trap.c 26 Apr 2023 16:53:59 -  1.167
> +++ sys/arch/mips64/mips64/trap.c 27 Oct 2023 03:17:44 -
> @@ -396,14 +396,12 @@ fault_common_no_miss:
>   case T_SYSCALL+T_USER:
>   {
>   struct trapframe *locr0 = p->p_md.md_regs;
> - const struct sysent *callp;
> - unsigned int code, indirect = -1;
> + const struct sysent *callp = sysent;
> + unsigned int code;
>   register_t tpc;
>   uint32_t branch = 0;
>   int error, numarg;
> - struct args {
> - register_t i[8];
> - } args;
> + register_t args[8];
>   register_t rval[2];
>  
>   atomic_inc_int();
> @@ -422,51 +420,22 @@ fault_common_no_miss:
>   trapframe->pc, 0, branch);
>   } else
>   locr0->pc += 4;
> - callp = sysent;
>   code = locr0->v0;
> - switch (code) {
> - case SYS_syscall:
> - /*
> -  * Code is first argument, followed by actual args.
> -  */
> - indirect = code;
> - code = locr0->a0;
> - if (code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> - callp += code;
> - numarg = callp->sy_argsize / sizeof(register_t);
> - args.i[0] = locr0->a1;
> - args.i[1] = locr0->a2;
> - args.i[2] = locr0->a3;
> - if (numarg > 3) {
> - args.i[3] = locr0->a4;
> - args.i[4] = locr0->a5;
> -

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Theo de Raadt

  switch (error) {
case 0:
Index: sys/arch/arm/arm/syscall.c
===
RCS file: /cvs/src/sys/arch/arm/arm/syscall.c,v
diff -u -p -u -r1.26 syscall.c
--- sys/arch/arm/arm/syscall.c  11 Feb 2023 23:07:26 -  1.26
+++ sys/arch/arm/arm/syscall.c  27 Oct 2023 16:16:49 -
@@ -93,8 +93,8 @@ void
 swi_handler(trapframe_t *frame)
 {
struct proc *p = curproc;
-   const struct sysent *callp;
-   int code, error, indirect = -1;
+   const struct sysent *callp = sysent;
+   int code, error = ENOSYS;
u_int nap = 4, nargs;
register_t *ap, *args, copyargs[MAXARGS], rval[2];
 
@@ -103,32 +103,19 @@ swi_handler(trapframe_t *frame)
/* Before enabling interrupts, save FPU state */
vfp_save();
 
-   /* Re-enable interrupts if they were enabled previously */
-   if (__predict_true((frame->tf_spsr & PSR_I) == 0))
-   enable_interrupts(PSR_I);
+   enable_interrupts(PSR_I);
 
p->p_addr->u_pcb.pcb_tf = frame;
 
/* Skip over speculation-blocking barrier. */
frame->tf_pc += 8;
 
-   code = frame->tf_r12;
-
ap = >tf_r0;
 
-   switch (code) { 
-   case SYS_syscall:
-   indirect = code;
-   code = *ap++;
-   nap--;
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   code = frame->tf_r12;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
nargs = callp->sy_argsize / sizeof(register_t);
if (nargs <= nap) {
@@ -145,27 +132,23 @@ swi_handler(trapframe_t *frame)
rval[0] = 0;
rval[1] = frame->tf_r1;
 
-   error = mi_syscall(p, code, indirect, callp, args, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
frame->tf_r0 = rval[0];
frame->tf_r1 = rval[1];
-
frame->tf_spsr &= ~PSR_C;   /* carry bit */
break;
-
case ERESTART:
/*
 * Reconstruct the pc to point at the swi.
 */
frame->tf_pc -= 12;
break;
-
case EJUSTRETURN:
/* nothing to do */
break;
-
default:
bad:
frame->tf_r0 = error;
Index: sys/arch/arm64/arm64/syscall.c
===
RCS file: /cvs/src/sys/arch/arm64/arm64/syscall.c,v
diff -u -p -u -r1.14 syscall.c
--- sys/arch/arm64/arm64/syscall.c  13 Apr 2023 02:19:04 -  1.14
+++ sys/arch/arm64/arm64/syscall.c  27 Oct 2023 03:26:49 -
@@ -33,7 +33,7 @@ svc_handler(trapframe_t *frame)
 {
struct proc *p = curproc;
const struct sysent *callp;
-   int code, error, indirect = -1;
+   int code, error = ENOSYS;
u_int nap = 8, nargs;
register_t *ap, *args, copyargs[MAXARGS], rval[2];
 
@@ -50,19 +50,9 @@ svc_handler(trapframe_t *frame)
 
ap = >tf_x[0];
 
-   switch (code) { 
-   case SYS_syscall:
-   indirect = code;
-   code = *ap++;
-   nap--;
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp = sysent + code;
 
nargs = callp->sy_argsize / sizeof(register_t);
if (nargs <= nap) {
@@ -79,25 +69,22 @@ svc_handler(trapframe_t *frame)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
frame->tf_x[0] = rval[0];
frame->tf_spsr &= ~PSR_C;   /* carry bit */
break;
-
case ERESTART:
/*
 * Reconstruct the pc to point at the svc.
 */
frame->tf_elr -= 12;
break;
-
case EJUSTRETURN:
/* nothing to do */
break;
-
default:
bad:
frame->tf_x[0] = error;
Index: sys/arch/hppa/hppa/trap.c
===
RCS file: /cvs/src/sys/arch/hppa/hppa/trap.c,v
diff -u -p -u -r1.161 trap.c
--- sys/arch/hppa/hppa/trap.c   11 Feb 2023 23:07:26 -  1.161
+++ sys/arch/hppa/hppa/trap.c   27 Oct 2023 16:16:38 -
@@ -764,8 +764,8 @@ void
 syscall(struct trapframe *frame)
 {
struct proc *p = curproc;
-   const struc

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Theo de Raadt

mission.
-.\"
-.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
-.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
-.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
-.\" ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
-.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
-.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
-.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
-.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
-.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
-.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
-.\" SUCH DAMAGE.
-.\"
-.\" @(#)syscall.2  8.1 (Berkeley) 6/16/93
-.\"
-.Dd $Mdocdate: February 22 2023 $
-.Dt SYSCALL 2
-.Os
-.Sh NAME
-.Nm syscall
-.Nd indirect system call
-.Sh SYNOPSIS
-.In sys/syscall.h
-.In unistd.h
-.Ft int
-.Fn syscall "int number" "..."
-.Sh DESCRIPTION
-.Fn syscall
-performs the system call whose assembly language
-interface has the specified
-.Fa number
-with the specified arguments.
-Symbolic constants for system calls can be found in the header file
-.In sys/syscall.h .
-.Sh RETURN VALUES
-The return values are defined by the system call being invoked.
-In general, for system calls returning
-.Va int ,
-a 0 return value indicates success.
-A \-1 return value indicates an error,
-and an error code is stored in
-.Va errno .
-.Sh HISTORY
-The predecessor of these functions, the former
-.Fn indir
-system call, first appeared in
-.At v4 .
-The
-.Fn syscall
-function first appeared in
-.Bx 3 .
Index: share/man/man9/ktrace.9
===
RCS file: /cvs/src/share/man/man9/ktrace.9,v
diff -u -p -u -r1.13 ktrace.9
--- share/man/man9/ktrace.9 4 Aug 2022 06:20:24 -   1.13
+++ share/man/man9/ktrace.9 24 Oct 2023 01:09:47 -
@@ -145,7 +145,6 @@ The process tracing facility is implemen
 .Sh SEE ALSO
 .Xr errno 2 ,
 .Xr ktrace 2 ,
-.Xr syscall 2 ,
 .Xr namei 9 ,
 .Xr syscall 9
 .Sh HISTORY
Index: share/man/man9/syscall.9
===
RCS file: /cvs/src/share/man/man9/syscall.9,v
diff -u -p -u -r1.15 syscall.9
--- share/man/man9/syscall.914 May 2019 13:17:09 -  1.15
+++ share/man/man9/syscall.924 Oct 2023 01:09:43 -
@@ -235,7 +235,6 @@ Machine-independent syscall entry end re
 .El
 .Sh SEE ALSO
 .Xr ktrace 2 ,
-.Xr syscall 2 ,
 .Xr ktrace 9 ,
 .Xr sysctl_int 9
 .Sh HISTORY
Index: sys/arch/alpha/alpha/trap.c
===
RCS file: /cvs/src/sys/arch/alpha/alpha/trap.c,v
diff -u -p -u -r1.108 trap.c
--- sys/arch/alpha/alpha/trap.c 8 Mar 2023 04:43:07 -   1.108
+++ sys/arch/alpha/alpha/trap.c 27 Oct 2023 03:18:04 -
@@ -497,17 +497,15 @@ dopanic:
  * a3, and v0 from the frame before returning to the user process.
  */
 void
-syscall(code, framep)
-   u_int64_t code;
-   struct trapframe *framep;
+syscall(u_int64_t code, struct trapframe *framep)
 {
-   const struct sysent *callp;
+   const struct sysent *callp = sysent;
struct proc *p;
-   int error, indirect = -1;
+   int error;
u_int64_t opc;
u_long rval[2];
u_long args[10];/* XXX */
-   u_int hidden, nargs;
+   u_int nargs;
 
atomic_add_int(, 1);
p = curproc;
@@ -515,24 +513,11 @@ syscall(code, framep)
framep->tf_regs[FRAME_SP] = alpha_pal_rdusp();
opc = framep->tf_regs[FRAME_PC] - 4;
 
-   switch(code) {
-   case SYS_syscall:
-   indirect = code;
-   code = framep->tf_regs[FRAME_A0];
-   hidden = 1;
-   break;
-   default:
-   hidden = 0;
-   }
-
-   error = 0;
-   callp = sysent;
-   if (code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
-   nargs = callp->sy_narg + hidden;
+   nargs = callp->sy_narg;
switch (nargs) {
default:
if (nargs > 10) /* XXX */
@@ -559,7 +544,7 @@ syscall(code, framep)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args + hidden, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
Index: sys/arch/amd64/amd64/locore.S
===
RCS file: /cvs/src/sys/arch/amd64/amd64/locore.S,v
diff -u -p -u -r1.141 locore.S
--- sys/arch/amd64/amd64

Re: bgpd: cleanup optparamlen handling in session_open

2023-10-27 Thread Theo Buehler

On Fri, Oct 27, 2023 at 01:06:31PM +0200, Claudio Jeker wrote:
> In the big ibuf API refactor I also broke the optparamlen handling
> by using one variable for two things.
> 
> All the size handling in session_open() can be simplified since
> ibuf_size() is cheap to call.
> 
> I think the result is cleaner than the code before. It is still somewhat
> funky because there are a fair amount of conditions to cover now.

There's no way to avoid catching some funk if the spec has a groovy
legacy heritage...

Diff makes sense and matches my understanding of that RFC's hack.

ok

> 
> -- 
> :wq Claudio
> 
> Index: session.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.452
> diff -u -p -r1.452 session.c
> --- session.c 27 Oct 2023 09:40:27 -  1.452
> +++ session.c 27 Oct 2023 11:00:13 -
> @@ -1471,8 +1471,9 @@ session_open(struct peer *p)
>  {
>   struct bgp_msg  *buf;
>   struct ibuf *opb;
> - uint16_t len, optparamlen = 0, holdtime;
> - uint8_t  i, op_type;
> + size_t   optparamlen;
> + uint16_t holdtime;
> + uint8_t  i;
>   int  errs = 0, extlen = 0;
>   int  mpcapa = 0;
>  
> @@ -1556,16 +1557,16 @@ session_open(struct peer *p)
>   if (optparamlen == 0) {
>   /* nothing */
>   } else if (optparamlen + 2 >= 255) {
> - /* RFC9072: 2 byte length instead of 1 + 3 byte extra header */
> - optparamlen += sizeof(op_type) + 2 + 3;
> + /* RFC9072: use 255 as magic size and request extra header */
>   optparamlen = 255;
>   extlen = 1;
>   } else {
> - optparamlen += sizeof(op_type) + 1;
> + /* regular capabilities header */
> + optparamlen += 2;
>   }
>  
> - len = MSGSIZE_OPEN_MIN + optparamlen;
> - if (errs || (buf = session_newmsg(OPEN, len)) == NULL) {
> + if (errs || (buf = session_newmsg(OPEN,
> + MSGSIZE_OPEN_MIN + optparamlen)) == NULL) {
>   ibuf_free(opb);
>   bgp_fsm(p, EVNT_CON_FATAL);
>   return;
> @@ -1584,20 +1585,19 @@ session_open(struct peer *p)
>   errs += ibuf_add_n8(buf->buf, optparamlen);
>  
>   if (extlen) {
> - /* write RFC9072 extra header */
> + /* RFC9072 extra header which spans over the capabilities hdr */
>   errs += ibuf_add_n8(buf->buf, OPT_PARAM_EXT_LEN);
> - errs += ibuf_add_n16(buf->buf, optparamlen - 3);
> + errs += ibuf_add_n16(buf->buf, ibuf_size(opb) + 1 + 2);
>   }
>  
>   if (optparamlen) {
>   errs += ibuf_add_n8(buf->buf, OPT_PARAM_CAPABILITIES);
>  
> - optparamlen = ibuf_size(opb);
>   if (extlen) {
>   /* RFC9072: 2-byte extended length */
> - errs += ibuf_add_n16(buf->buf, optparamlen);
> + errs += ibuf_add_n16(buf->buf, ibuf_size(opb));
>   } else {
> - errs += ibuf_add_n8(buf->buf, optparamlen);
> + errs += ibuf_add_n8(buf->buf, ibuf_size(opb));
>   }
>   errs += ibuf_add_buf(buf->buf, opb);
>   }
>

Re: bgpd fix holdtime in session_open

2023-10-27 Thread Theo Buehler

On Fri, Oct 27, 2023 at 11:27:11AM +0200, Claudio Jeker wrote:
> While looking for something else I noticed this error.
> 
> The holdtime is written into the buffer with ibuf_add_n16() so the htons()
> call is done in the ibuf add call. So there is no need for the htons()
> when assigning holdtime.

ok tb

> 
> -- 
> :wq Claudio
> 
> Index: session.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.451
> diff -u -p -u -5 -r1.451 session.c
> --- session.c 19 Oct 2023 07:02:45 -  1.451
> +++ session.c 27 Oct 2023 09:24:20 -
> @@ -1570,13 +1570,13 @@ session_open(struct peer *p)
>   bgp_fsm(p, EVNT_CON_FATAL);
>   return;
>   }
>  
>   if (p->conf.holdtime)
> - holdtime = htons(p->conf.holdtime);
> + holdtime = p->conf.holdtime;
>   else
> - holdtime = htons(conf->holdtime);
> + holdtime = conf->holdtime;
>  
>   errs += ibuf_add_n8(buf->buf, 4);
>   errs += ibuf_add_n16(buf->buf, p->conf.local_short_as);
>   errs += ibuf_add_n16(buf->buf, holdtime);
>   /* is already in network byte order */
>

Re: snmpd: Fix close after protocol error case

2023-10-26 Thread Theo Buehler

On Thu, Oct 26, 2023 at 10:47:36AM +0200, Martijn van Duren wrote:
> So here's an elusive one that can be triggered every now and then by the
> new regression test. Once an AgentX session is opened and we send an
> invalid packet appl_agentx_recv() goes to appl_agentx_free(), since
> there's no recovery. appl_agentx_free() tries to neatly close all
> open sessions by sending a close-pdu, followed by calling
> appl_agentx_send() directly.
> However, if the socket has been closed in the meantime we hit
> appl_agentx_send()'s error path, which also calls appl_agentx_free().
> This in turn leads to use after free cases.
> 
> To fix this don't call appl_agentx_send() directly anymore, but just
> schedule it via conn_wev. To make sure as much data as possible is
> written out do a last unchecked courtesy flush before definitively
> freeing the connection. Since appl_agentx_forceclose() arms conn_wev
> move the event_del() calls down in appl_agentx_free().
> 
> Other calls of appl_agentx_send() should be fine, but just convert
> all of them to be consistent and safe.

ok tb

Re: snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Theo Buehler

On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> This case is covered by the new regress' backend_get_toofew and
> backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> to the max it's really hard to trigger (I had to run
> backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> sequentially in a tight loop killing snmpd between iterations for the
> best chance).
> 
> When we receive an invalid varbindlist in a response we set the invalid
> variable. This in turn calls appl_varbind_error(), but the avi_state
> of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> we call appl_request_downstream_free(), which sees that the varbinds
> haven't been resolved, triggering a call to
> appl_request_upstream_resolve(). This call in turn sees that the
> error has been set and just sends out the error-response to the client
> and frees the appl_request_upstream. From here we return back to
> appl_response(), which also calls appl_request_upstream_resolve(),
> resulting in a use after free.
> 
> The main tool for fixing this issue is making use of
> appl_request_upstream's aru_locked member, which will cause
> appl_request_upstream_resolve() to return instantly. The simplest fix is
> to set aru_locked before calling appl_request_downstream_free() and
> unsetting it directly afterwards inside appl_response().
> 
> The second one is the diff proposed below, which shrinks the code.
> 
> appl_request_upstream_free() is only called once from
> appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> is only called by appl_request_upstream_resolve().
> appl_request_upstream_resolve() is called in 3 places:
> - appl_processpdu(): to kick things off
> - appl_request_downstream_free(): For when a backend disappears with
> outstanding requests
> - appl_response(): To kickstart the next round of resolving.
> 
> Since appl_request_downstream_free() is always called from
> appl_response(), we can leverage that function and make it call
> appl_request_upstream_resolve() unconditionally.
> 
> appl_request_downstream_free() is called from the following locations:
> - appl_close(): When a backend has disappeared.
> - appl_request_upstream_free(): We send out a reply early, because an
> error has been detected.
> - appl_response(): We received a response
> 
> appl_request_upstream_free() can't reenter into
> appl_request_upstream_resolve(), or it would potentially trigger new
> appl_request_downstreams. This can be prevented by setting aru_locked
> before calling appl_request_downstream_free().
> For all other cases we should rely on appl_request_upstream_resolve()'s
> logic to handle varbinds in any state, so there's no reason make calls
> from other contexts conditional.

Your description of the bug makes sense and your choice of resolving it
as well. Thanks for the in-depth explanation, that helped a lot. Still,
I must say that I don't really feel at ease with the amount of complexity
and entanglement here. I simply can't fit all of this into my head within
a reasonable amount of time.

ok tb (fwiw)

since it resolves a real problem and simplifies things a bit.

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Martin Pieuchot

On 26/10/23(Thu) 07:06, Miod Vallat wrote:
> > I wonder if the diff below makes a difference.  It's hard to debug and it
> > might be worth adding a counter for bad swap slots.
> 
> It did not help (but your diff is probably correct).

In that case I'd like to put both diffs in, are you ok with that?

> > Index: uvm/uvm_anon.c
> > ===
> > RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> > retrieving revision 1.56
> > diff -u -p -r1.56 uvm_anon.c
> > --- uvm/uvm_anon.c  2 Sep 2023 08:24:40 -   1.56
> > +++ uvm/uvm_anon.c  22 Oct 2023 21:27:42 -
> > @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
> > uvm_unlock_pageq(); /* free the daemon */
> > }
> > } else {
> > -   if (anon->an_swslot != 0) {
> > +   if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
> > /* This page is no longer only in swap. */
> > KASSERT(uvmexp.swpgonly > 0);
> > atomic_dec_int();

Re: relayd.conf.5: less SSL

2023-10-26 Thread Klemens Nanni

On Tue, Oct 24, 2023 at 09:09:21AM +0200, Peter N. M. Hansteen wrote:
> On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> > - parse.y still accepting undocumented "ssl" with a warning since 2014
> > - more "SSL/TLS" instead of "TLS" in manual and code comments
> 
> my take would be that while it's fine to streamline the documentation to use
> the modern terminology, I suspect there may still be ancient configurations
> out there that use the "ssl" keyword, so removing the last bit of support for
> that option should be accompanied by or preceded by a warning on relevant
> mailing lists or at least in the commit message. 
> 
> And I think undeadly.org would be more than happy to help spread the word :)

current.html entry should do for a deprecated keyword we've been warning
about for almost ten years...  I've checked faq/upgrade*.html for previous
notes, but couldn't find any.

Here's a first try, relayd regress is also happy.

Index: usr.sbin/relayd/parse.y
===
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.254
diff -u -p -r1.254 parse.y
--- usr.sbin/relayd/parse.y 3 Jul 2023 09:38:08 -   1.254
+++ usr.sbin/relayd/parse.y 26 Oct 2023 06:07:08 -
@@ -175,7 +175,7 @@ typedef struct {
 %token LOOKUP METHOD MODE NAT NO DESTINATION NODELAY NOTHING ON PARENT PATH
 %token PFTAG PORT PREFORK PRIORITY PROTO QUERYSTR REAL REDIRECT RELAY REMOVE
 %token REQUEST RESPONSE RETRY QUICK RETURN ROUNDROBIN ROUTE SACK SCRIPT SEND
-%token SESSION SOCKET SPLICE SSL STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
+%token SESSION SOCKET SPLICE STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
 %token TIMEOUT TLS TO ROUTER RTLABEL TRANSPARENT URL WITH TTL RTABLE
 %token MATCH PARAMS RANDOM LEASTSTATES SRCHASH KEY CERTIFICATE PASSWORD ECDHE
 %token EDH TICKETS CONNECTION CONNECTIONS CONTEXT ERRORS STATE CHANGES CHECKS
@@ -227,21 +227,12 @@ include   : INCLUDE STRING{
}
;
 
-ssltls : SSL   {
-   log_warnx("%s:%d: %s",
-   file->name, yylval.lineno,
-   "please use the \"tls\" keyword"
-   " instead of \"ssl\"");
-   }
-   | TLS
-   ;
-
 opttls : /*empty*/ { $$ = 0; }
-   | ssltls{ $$ = 1; }
+   | TLS   { $$ = 1; }
;
 
 opttlsclient   : /*empty*/ { $$ = 0; }
-   | WITH ssltls   { $$ = 1; }
+   | WITH TLS  { $$ = 1; }
;
 
 http_type  : HTTP  { $$ = 0; }
@@ -905,7 +896,7 @@ hashkey : /* empty */   {
 
 tablecheck : ICMP  { table->conf.check = CHECK_ICMP; }
| TCP   { table->conf.check = CHECK_TCP; }
-   | ssltls{
+   | TLS   {
table->conf.check = CHECK_TCP;
conf->sc_conf.flags |= F_TLS;
table->conf.flags |= F_TLS;
@@ -1114,7 +1105,7 @@ protopts_l: protopts_l protoptsl nl
| protoptsl optnl
;
 
-protoptsl  : ssltls {
+protoptsl  : TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -1122,7 +1113,7 @@ protoptsl : ssltls {
YYERROR;
}
} tlsflags
-   | ssltls {
+   | TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -2492,7 +2483,6 @@ lookup(char *s)
{ "socket", SOCKET },
{ "source-hash",SRCHASH },
{ "splice", SPLICE },
-   { "ssl",SSL },
{ "state",  STATE },
{ "sticky-address", STICKYADDR },
{ "strip",  STRIP },
Index: usr.sbin/relayd/relay.c
===
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.257
diff -u -p -r1.257 relay.c
--- usr.sbin/relayd/relay.c 3 Sep 2023 10:22:03 -   1.257
+++ usr.sbin/relayd/relay.c 26 Oct 2023 05:49:22 -
@@ -2064,7 +2064,7 @@ relay_tls_ctx_create_proto(struct protoc
 {
uint32_t protocols = 0;
 
-   /* Set the allowed SSL protocols */
+   /* Set the allowed TLS protocols */
if (proto->tlsflags & TLSFLAG_TLSV1_2)
protocols |= TLS_PROTOCOL_TLSv1_2;
if (proto->tlsflags & TLSFLAG_TLSV1_3)
@@ -2186,7

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Miod Vallat

> I wonder if the diff below makes a difference.  It's hard to debug and it
> might be worth adding a counter for bad swap slots.

It did not help (but your diff is probably correct).

> Index: uvm/uvm_anon.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> retrieving revision 1.56
> diff -u -p -r1.56 uvm_anon.c
> --- uvm/uvm_anon.c2 Sep 2023 08:24:40 -   1.56
> +++ uvm/uvm_anon.c22 Oct 2023 21:27:42 -
> @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
>   uvm_unlock_pageq(); /* free the daemon */
>   }
>   } else {
> - if (anon->an_swslot != 0) {
> + if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
>   /* This page is no longer only in swap. */
>   KASSERT(uvmexp.swpgonly > 0);
>   atomic_dec_int();

Re: ibuf free fd on close

2023-10-25 Thread Theo Buehler

On Tue, Oct 24, 2023 at 04:00:42PM +0200, Claudio Jeker wrote:
> On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> > On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > > close any fd still on the buffer. This way even if a fd is unexpectedly
> > > passed nothing will happen.
> > > 
> > > That code was disabled at start because userland was not fully ready. In
> > > particular rpki-client did not handle that well. All of this is to my
> > > knowledge fixed so there is no reason to keep the NOTYET :)
> > > 
> > > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > > Code not doing so will break because ibuf_free() will close the fd which
> > > is probably still in use somewhere else.
> > 
> > Nothing in base outside of libutil seems to reach directly for the fd
> > (checked by compiling with that struct member renamed in the public
> > header).
> > 
> > The internal uses are addressed by this diff, so
> > 
> > ok tb
> > 
> > I can put the fd rename through a bulk to catch some ports in a couple
> > of days but I don't think there is a need to wait.
> 
> Thanks. Do we have a list of ports that use ibuf / imsg? 

mdnsd sets an ibuf fd to -1, which is harmless:
https://github.com/haesbaert/mdnsd/blob/master/libmdns/mdnsl.c#L513

got's lib/privsep.c code has a similar pattern of setting the fd before
imsg_close(). This won't be affected directly by this change as far as I
can see, but the privsep code probably needs an fd audit, ibuf related
and otherwise. Just skimming the code I saw a few potential fd leaks,
e.g. request_tree() leaks if got_privsep_send_tree_req() exits early.

PS: It is quite easy to hit "unexpected end of file" if one clicks on
links in the web view. Clicking on "i can't count" here:
https://got.gameoftrees.org/?action=summary=got.git

Re: malloc: micro optimizations

2023-10-25 Thread Masato Asou

Hi,

It works fine for me.
ok asou@
--
ASOU Masato

From: Otto Moerbeek 
Date: Wed, 25 Oct 2023 11:04:01 +0200

> Hi,
> 
> a few micro-optimization, including getting rid of some statistics
> that are not actualy very interesting.
> 
> Speedup amounts to a few tenths of percents to a few percents,
> depending on how biased the benchmark is.
> 
>   -Otto
> 
> Index: stdlib/malloc.c
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.291
> diff -u -p -r1.291 malloc.c
> --- stdlib/malloc.c   22 Oct 2023 12:19:26 -  1.291
> +++ stdlib/malloc.c   24 Oct 2023 14:05:37 -
> @@ -169,16 +169,12 @@ struct dir_info {
>   void *caller;
>   size_t inserts;
>   size_t insert_collisions;
> - size_t finds;
> - size_t find_collisions;
>   size_t deletes;
>   size_t delete_moves;
>   size_t cheap_realloc_tries;
>   size_t cheap_reallocs;
>   size_t malloc_used; /* bytes allocated */
>   size_t malloc_guarded;  /* bytes used for guards */
> - size_t pool_searches;   /* searches for pool */
> - size_t other_pool;  /* searches in other pool */
>  #define STATS_ADD(x,y)   ((x) += (y))
>  #define STATS_SUB(x,y)   ((x) -= (y))
>  #define STATS_INC(x) ((x)++)
> @@ -209,12 +205,14 @@ static void unmap(struct dir_info *d, vo
>  struct chunk_info {
>   LIST_ENTRY(chunk_info) entries;
>   void *page; /* pointer to the page */
> + /* number of shorts should add up to 8, check alloc_chunk_info() */
>   u_short canary;
>   u_short bucket;
>   u_short free;   /* how many free chunks */
>   u_short total;  /* how many chunks */
>   u_short offset; /* requested size table offset */
> - u_short bits[1];/* which chunks are free */
> +#define CHUNK_INFO_TAIL  3
> + u_short bits[CHUNK_INFO_TAIL];  /* which chunks are free */
>  };
>  
>  #define CHUNK_FREE(i, n) ((i)->bits[(n) / MALLOC_BITS] & (1U << ((n) % 
> MALLOC_BITS)))
> @@ -656,12 +654,10 @@ find(struct dir_info *d, void *p)
>   index = hash(p) & mask;
>   r = d->r[index].p;
>   q = MASK_POINTER(r);
> - STATS_INC(d->finds);
>   while (q != p && r != NULL) {
>   index = (index - 1) & mask;
>   r = d->r[index].p;
>   q = MASK_POINTER(r);
> - STATS_INC(d->find_collisions);
>   }
>   return (q == p && r != NULL) ? >r[index] : NULL;
>  }
> @@ -949,7 +945,7 @@ init_chunk_info(struct dir_info *d, stru
>  
>   p->bucket = bucket;
>   p->total = p->free = MALLOC_PAGESIZE / B2ALLOC(bucket);
> - p->offset = bucket == 0 ? 0xdead : howmany(p->total, MALLOC_BITS);
> + p->offset = howmany(p->total, MALLOC_BITS);
>   p->canary = (u_short)d->canary1;
>  
>   /* set all valid bits in the bitmap */
> @@ -971,8 +967,13 @@ alloc_chunk_info(struct dir_info *d, u_i
>   count = MALLOC_PAGESIZE / B2ALLOC(bucket);
>  
>   size = howmany(count, MALLOC_BITS);
> - size = sizeof(struct chunk_info) + (size - 1) * sizeof(u_short);
> - if (mopts.chunk_canaries)
> + /* see declaration of struct chunk_info */
> + if (size <= CHUNK_INFO_TAIL)
> + size = 0;
> + else
> + size -= CHUNK_INFO_TAIL;
> + size = sizeof(struct chunk_info) + size * sizeof(u_short);
> + if (mopts.chunk_canaries && bucket > 0)
>   size += count * sizeof(u_short);
>   size = _ALIGN(size);
>   count = MALLOC_PAGESIZE / size;
> @@ -1129,8 +1130,7 @@ fill_canary(char *ptr, size_t sz, size_t
>  static void *
>  malloc_bytes(struct dir_info *d, size_t size)
>  {
> - u_int i, r, bucket, listnum;
> - size_t k;
> + u_int i, k, r, bucket, listnum;
>   u_short *lp;
>   struct chunk_info *bp;
>   void *p;
> @@ -1170,7 +1170,7 @@ malloc_bytes(struct dir_info *d, size_t 
>   /* no bit halfway, go to next full short */
>   i /= MALLOC_BITS;
>   for (;;) {
> - if (++i >= howmany(bp->total, MALLOC_BITS))
> + if (++i >= bp->offset)
>   i = 0;
>   lp = >bits[i];
>   if (*lp) {
> @@ -1228,7 +1228,7 @@ validate_canary(struct dir_info *d, u_ch
>   }
>  }
>  
> -static uint32_t
> +static inline uint32_t
>  find_chunknum(struct dir_info *d, struct chunk_info *info, void *ptr, int 
> check)
>  {
>   uint32_t chunknum;
> @@ -1532,12 +1532,10 @@ findpool(void *p, struct dir_info *argpo
>   struct dir_info *pool = argpool;
>   struct region_info *r = find(pool, p);
>  
> - STATS_INC(pool->pool_searches);
>   if (r == NULL) {
>   u_int i, nmutexes;
>  
>   nmutexes =

Re: dhcpd(8): Parse lease database after dropping privileges

2023-10-25 Thread Stephen Fox

Hello,

I just wanted to double check if there is any additional feedback on the patch.

I am happy to make changes or discuss alternative approaches.

Thank you,
Stephen

On Tue, Oct 3, 2023 at 11:33 PM Stephen Fox  wrote:
>
> Hello,
>
> I received feedback from deraadt that the first two unveil(2) calls were
> unnecessary because pledge(2) automatically unveils "/usr/share/zoneinfo".
>
> This updated patch removes the unneeded unveil(2) calls.
>
> Best regards,
> Stephen
> ---
>  usr.sbin/dhcpd/confpars.c | 41 ++-
>  usr.sbin/dhcpd/db.c   | 19 ++
>  usr.sbin/dhcpd/dhcpd.c| 30 ++--
>  usr.sbin/dhcpd/dhcpd.h|  2 ++
>  4 files changed, 76 insertions(+), 16 deletions(-)
>
> diff --git usr.sbin/dhcpd/confpars.c usr.sbin/dhcpd/confpars.c
> index 1bdffb38842..bb245fc663e 100644
> --- usr.sbin/dhcpd/confpars.c
> +++ usr.sbin/dhcpd/confpars.c
> @@ -57,6 +57,8 @@
>  #include "dhctoken.h"
>  #include "log.h"
>
> +FILE   *db_file_ro;
> +
>  intparse_cidr(FILE *, unsigned char *, unsigned char *);
>
>  /*
> @@ -106,18 +108,11 @@ readconf(void)
>  }
>
>  /*
> - * lease-file :== lease-declarations EOF
> - * lease-statments :== 
> - *| lease-declaration
> - *| lease-declarations lease-declaration
> + * Open and retain a read-only file descriptor to the lease database file.
>   */
>  void
> -read_leases(void)
> +open_leases(void)
>  {
> -   FILE *cfile;
> -   char *val;
> -   int token;
> -
> new_parse(path_dhcpd_db);
>
> /*
> @@ -131,23 +126,40 @@ read_leases(void)
>  * thinking that no leases have been assigned to anybody, which
>  * could create severe network chaos.
>  */
> -   if ((cfile = fopen(path_dhcpd_db, "r")) == NULL) {
> +   if ((db_file_ro = fopen(path_dhcpd_db, "r")) == NULL) {
> log_warn("Can't open lease database (%s)", path_dhcpd_db);
> log_warnx("check for failed database rewrite attempt!");
> log_warnx("Please read the dhcpd.leases manual page if you");
> fatalx("don't know what to do about this.");
> }
> +}
> +
> +/*
> + * lease-file :== lease-declarations EOF
> + * lease-statments :== 
> + *| lease-declaration
> + *| lease-declarations lease-declaration
> + */
> +void
> +read_leases(void)
> +{
> +   char *val;
> +   int token;
> +
> +   if (!db_file_ro) {
> +   fatalx("db_file_ro is NULL");
> +   }
>
> do {
> -   token = next_token(, cfile);
> +   token = next_token(, db_file_ro);
> if (token == EOF)
> break;
> if (token != TOK_LEASE) {
> log_warnx("Corrupt lease file - possible data loss!");
> -   skip_to_semi(cfile);
> +   skip_to_semi(db_file_ro);
> } else {
> struct lease *lease;
> -   lease = parse_lease_declaration(cfile);
> +   lease = parse_lease_declaration(db_file_ro);
> if (lease)
> enter_lease(lease);
> else
> @@ -155,7 +167,8 @@ read_leases(void)
> }
>
> } while (1);
> -   fclose(cfile);
> +   fclose(db_file_ro);
> +   db_file_ro = NULL;
>  }
>
>  /*
> diff --git usr.sbin/dhcpd/db.c usr.sbin/dhcpd/db.c
> index 295e522b1ce..634ec8a593a 100644
> --- usr.sbin/dhcpd/db.c
> +++ usr.sbin/dhcpd/db.c
> @@ -190,6 +190,16 @@ commit_leases(void)
> return (1);
>  }
>
> +/*
> + * Open and retain two file descriptors to the lease database file:
> + *
> + * - A read-only fd for learning existing leases
> + * - A write-only fd for writing new leases
> + *
> + * Reading and parsing data from the read-only fd is done separately.
> + * This allows privilege drop to happen *before* parsing untrusted
> + * client data from the lease database file.
> + */
>  void
>  db_startup(void)
>  {
> @@ -202,6 +212,15 @@ db_startup(void)
> if ((db_file = fdopen(db_fd, "w")) == NULL)
> fatalx("Can't fdopen new lease file!");
>
> +   open_leases();
> +}
> +
> +/*
> + * Read and parse the existing leases from the lease database file.
> + */
> +void
> +db_parse(void)
> +{
> /* Read in the existing lease file... */
> read_leases();
> time(_time);
> diff --git usr.sbin/dhcpd/dhcpd.c usr.sbin/dhcpd/dhcpd.c
> index b3562dce34f..7759f7839e0 100644
> --- usr.sbin/dhcpd/dhcpd.c
> +++ usr.sbin/dhcpd/dhcpd.c
> @@ -244,8 +244,6 @@ main(int argc, char *argv[])
>
> icmp_startup(1, lease_pinged);
>
> -   if (chroot(pw->pw_dir) == -1)
> -   fatal("chroot %s", pw->pw_dir);
> if (chdir("/") == -1)
> fatal("chdir(\"/\")");
> if (setgroups(1, >pw_gid) ||
> @@ -253,6 +251,34

Re: patch unveil fail

2023-10-25 Thread Alexander Bluhm

On Wed, Oct 25, 2023 at 07:00:28PM +0200, Omar Polo wrote:
> On 2023/10/25 13:38:37 +0200, Alexander Bluhm  wrote:
> > @@ -213,11 +214,27 @@ main(int argc, char *argv[])
> > perror("unveil");
> > my_exit(2);
> > }
> > -   if (filearg[0] != NULL)
> > +   if (filearg[0] != NULL) {
> > +   char *origdir;
> > +
> > if (unveil(filearg[0], "rwc") == -1) {
> > perror("unveil");
> > my_exit(2);
> > }
> > +   if ((origdir = dirname(filearg[0])) == NULL) {
> 
> Not sure if we're interested in it, but dirname(3) theoretically alter
> the passed string.  our dirname doesn't do it, but per posix it can,
> IIUC.  This could cause issues since filearg[0] is used later.
> 
> If we care about portability here, we should pass a copy to dirname.
> don't know if we care thought.

unveil(2) is not portable code anyway.  And dirname(3) is only used
for that.

> > +   perror("dirname");
> > +   my_exit(2);
> > +   }
> > +   if (unveil(origdir, "rwc") == -1) {
> > +   perror("unveil");
> > +   my_exit(2);
> > +   }
> > +   } else {
> > +   if (unveil(".", "rwc") == -1) {
> > +   perror("unveil");
> > +   my_exit(2);
> > +   }
> > +   }
> > if (filearg[1] != NULL)
> > if (unveil(filearg[1], "r") == -1) {
> > perror("unveil");

Re: patch unveil fail

2023-10-25 Thread Omar Polo

On 2023/10/25 13:38:37 +0200, Alexander Bluhm  wrote:
> Index: patch.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> diff -u -p -r1.74 patch.c
> --- patch.c   19 Jul 2023 13:26:20 -  1.74
> +++ patch.c   24 Oct 2023 17:13:28 -
> @@ -32,6 +32,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -213,11 +214,27 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (filearg[0] != NULL)
> + if (filearg[0] != NULL) {
> + char *origdir;
> +
>   if (unveil(filearg[0], "rwc") == -1) {
>   perror("unveil");
>   my_exit(2);
>   }
> + if ((origdir = dirname(filearg[0])) == NULL) {

Not sure if we're interested in it, but dirname(3) theoretically alter
the passed string.  our dirname doesn't do it, but per posix it can,
IIUC.  This could cause issues since filearg[0] is used later.

If we care about portability here, we should pass a copy to dirname.
don't know if we care thought.

> + perror("dirname");
> + my_exit(2);
> + }
> + if (unveil(origdir, "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + } else {
> + if (unveil(".", "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + }
>   if (filearg[1] != NULL)
>   if (unveil(filearg[1], "r") == -1) {
>   perror("unveil");

Re: patch unveil fail

2023-10-25 Thread Todd C . Miller

On Wed, 25 Oct 2023 13:38:37 +0200, Alexander Bluhm wrote:

> Since 7.4 patch(1) does not work if an explicit patchfile is given on
> command line.
>
> https://marc.info/?l=openbsd-cvs=168941770509379=2

OK millert@

 - todd

Re: patch unveil fail

2023-10-25 Thread Florian Obser

reads correct, OK florian

On 2023-10-25 13:38 +02, Alexander Bluhm  wrote:
> Hi,
>
> Since 7.4 patch(1) does not work if an explicit patchfile is given on
> command line.
>
> https://marc.info/?l=openbsd-cvs=168941770509379=2
>
> root@ot14:.../~# patch /usr/src/usr.bin/patch/patch.c patch-unveil.diff
> Hmm...  Looks like a unified diff to me...
> The text leading up to this was:
> --
> |Index: patch.c
> |===
> |RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> |diff -u -p -r1.74 patch.c
> |--- patch.c19 Jul 2023 13:26:20 -  1.74
> |+++ patch.c24 Oct 2023 17:13:28 -
> --
> Patching file /usr/src/usr.bin/patch/patch.c using Plan A...
> Hunk #1 succeeded at 32.
> Hunk #2 succeeded at 214.
> Hunk #3 succeeded at 245.
> Can't backup /usr/src/usr.bin/patch/patch.c, output is in 
> /tmp/patchoorjYymLKcM: No such file or directory
> done
>
> A backup file should be created in the directory of the original
> file, but only the current directory is unveiled.  Then the patched
> file is created in /tmp and does not replace the original patchfile
> in place.
>
> Diff below fixes it.
>
> ok?
>
> bluhm
>
> Index: patch.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> diff -u -p -r1.74 patch.c
> --- patch.c   19 Jul 2023 13:26:20 -  1.74
> +++ patch.c   24 Oct 2023 17:13:28 -
> @@ -32,6 +32,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -213,11 +214,27 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (filearg[0] != NULL)
> + if (filearg[0] != NULL) {
> + char *origdir;
> +
>   if (unveil(filearg[0], "rwc") == -1) {
>   perror("unveil");
>   my_exit(2);
>   }
> + if ((origdir = dirname(filearg[0])) == NULL) {
> + perror("dirname");
> + my_exit(2);
> + }
> + if (unveil(origdir, "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + } else {
> + if (unveil(".", "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + }
>   if (filearg[1] != NULL)
>   if (unveil(filearg[1], "r") == -1) {
>   perror("unveil");
> @@ -228,10 +245,6 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (unveil(".", "rwc") == -1) {
> - perror("unveil");
> - my_exit(2);
> - }
>   if (*rejname != '\0')
>   if (unveil(rejname, "rwc") == -1) {
>   perror("unveil");
>

-- 
In my defence, I have been left unsupervised.

Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-25 Thread Alexander Bluhm

On Thu, Oct 19, 2023 at 04:04:26PM +0200, Jan Klemkow wrote:
> On Wed, Oct 18, 2023 at 08:53:44PM +0200, Alexander Bluhm wrote:
> > On Wed, Oct 18, 2023 at 08:19:29PM +0200, Mischa wrote:
> > > It's indeed something like that: ix -> vlan (tagged) -> veb
> > 
> > When vlan is added to veb, kernel should disable LRO on ix.
> > All testing before release did not find this code path :-(
> > 
> > Is it possible to add vlan to veb first, and then add or change the
> > vlan parent to ix?  If it works, that should also disable LRO.
> > 
> > Jan said he will have a look tomorrow.
> > 
> > trunk, carp, ... in veb or bridge might have the same issue.
> 
> First round of fixes for vlan(4), vxlan(4), nvgre(4) and bpe(4).

Don't know much about nvgre(4) and bpe(4).

Maybe we should call ifsetlro(ifp0, 0) unconditionally in
vxlan_set_parent().  Otherwise we may get large UDP packets with
large TCP frames.

For vlan(4) this diff is correct.  For the others it is at least
an improvement.

OK bluhm@

> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.708
> diff -u -p -r1.708 if.c
> --- net/if.c  16 Sep 2023 09:33:27 -  1.708
> +++ net/if.c  19 Oct 2023 13:03:33 -
> @@ -3243,6 +3243,17 @@ ifsetlro(struct ifnet *ifp, int on)
>   struct ifreq ifrq;
>   int error = 0;
>   int s = splnet();
> + struct if_parent parent;
> +
> + memset(, 0, sizeof(parent));
> + if ((*ifp->if_ioctl)(ifp, SIOCGIFPARENT, (caddr_t)) != -1) {
> + struct ifnet *ifp0 = if_unit(parent.ifp_parent);
> +
> + if (ifp0 != NULL) {
> + ifsetlro(ifp0, on);
> + if_put(ifp0);
> + }
> + }
>  
>   if (!ISSET(ifp->if_capabilities, IFCAP_LRO)) {
>   error = ENOTSUP;
> Index: net/if_bpe.c
> ===
> RCS file: /cvs/src/sys/net/if_bpe.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 if_bpe.c
> --- net/if_bpe.c  8 Nov 2021 04:54:44 -   1.19
> +++ net/if_bpe.c  19 Oct 2023 13:20:18 -
> @@ -631,6 +631,9 @@ bpe_set_parent(struct bpe_softc *sc, con
>   goto put;
>   }
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_key.k_if = ifp0->if_index;
>   etherbridge_flush(>sc_eb, IFBF_FLUSHALL);
> Index: net/if_gre.c
> ===
> RCS file: /cvs/src/sys/net/if_gre.c,v
> retrieving revision 1.174
> diff -u -p -r1.174 if_gre.c
> --- net/if_gre.c  13 May 2023 13:35:17 -  1.174
> +++ net/if_gre.c  19 Oct 2023 13:24:56 -
> @@ -3544,6 +3544,9 @@ nvgre_set_parent(struct nvgre_softc *sc,
>   return (EPROTONOSUPPORT);
>   }
>  
> + if (ether_brport_isset(>sc_ac.ac_if))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_ifp0 = ifp0->if_index;
>   if_put(ifp0);
> Index: net/if_vlan.c
> ===
> RCS file: /cvs/src/sys/net/if_vlan.c,v
> retrieving revision 1.215
> diff -u -p -r1.215 if_vlan.c
> --- net/if_vlan.c 16 May 2023 14:32:54 -  1.215
> +++ net/if_vlan.c 19 Oct 2023 11:08:23 -
> @@ -937,6 +937,9 @@ vlan_set_parent(struct vlan_softc *sc, c
>   if (error != 0)
>   goto put;
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_ifidx0 = ifp0->if_index;
>   if (!ISSET(sc->sc_flags, IFVF_LLADDR))
> Index: net/if_vxlan.c
> ===
> RCS file: /cvs/src/sys/net/if_vxlan.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 if_vxlan.c
> --- net/if_vxlan.c3 Aug 2023 09:49:08 -   1.93
> +++ net/if_vxlan.c19 Oct 2023 13:18:47 -
> @@ -1582,6 +1582,9 @@ vxlan_set_parent(struct vxlan_softc *sc,
>   goto put;
>   }
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_if_index0 = ifp0->if_index;
>   etherbridge_flush(>sc_eb, IFBF_FLUSHALL);

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-25 Thread Klemens Nanni

10/24/23 14:03, Crystal Kolipe пишет:
> On Tue, Oct 24, 2023 at 01:44:08AM +, Klemens Nanni wrote:
>> Rereading the code, I now question why it checks the 'a' label type at all.
>> 
>> Taking your sd0d example through devboot():
>> 
>> |#ifdef SOFTRAID
>> |/*
>> | * Determine the partition type for the 'a' partition of the
>> | * boot device.
>> | */
>> |TAILQ_FOREACH(dip, , list)
>> |if (dip->bios_info.bios_number == bootdev &&
>> |(dip->bios_info.flags & BDI_BADLABEL) == 0)
>> |part_type = dip->disklabel.d_partitions[0].p_fstype;
>> 
>> Whatever sd0a's type is...
>> 
>> |
>> |/*
>> | * See if we booted from a disk that is a member of a bootable
>> | * softraid volume.
>> | */
>> |SLIST_FOREACH(bv, _volumes, sbv_link) {
>> |if (bv->sbv_flags & BIOC_SCBOOTABLE)
>> |SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
>> |if (bc->sbc_disk == bootdev)
>> 
>> ... the boot loader sees we booted from a bootable softraid chunk,
>> matching disk sd0 by number and not partition.
>> 
>> |sr_boot_vol = bv->sbv_unit;
>> |if (sr_boot_vol != -1)
>> |break;
>> |}
>> |#endif
>> |
>> |if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
>> 
>> With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
>> but that's still enough for the boot loader to stick to softraid on sd0!
>> 
>> |*p++ = 's';
>> |*p++ = 'r';
>> |*p++ = '0' + sr_boot_vol;
>> |} else if (bootdev & 0x100) {
>> 
>> So why check the 'a' label type if that's not relevant?
> 
> The only reason I can see for that check is to better support the unusual
> configuration of booting from a disk which happens to be part of a bootable
> softraid volume, but which also has a regular FFS partition outside of the
> RAID, and for whatever reason the user wants to automatically boot from that
> instead.
> 
> Maybe that was useful before support for booting from softraid crypto volumes
> was introduced, (when it was raid-1 only).  It doesn't seem very useful now.
> 
>> It is confusing and 
>> Doubling down on the assumption that bootable chunks are always on 'a'
>> like my diff did shows that's a mistake and dropping the check actually
>> makes more sense to me now.
> 
> I agree it is confusing.
> 
> Effectively the code is there to treat part_type == FS_BSDFFS an 'edge case'
> as described above.
> 
>> This boots with root on softraid on sd0a and sd0d.
>> 
>> Thoughts?
>> Am I missing something?
> 
> It works OK here, so I'm fine with removing this.
> 

That matches sys/arch/amd64/stand/libsa/dev_i386.c r1.12 from 2012
which introduced this code;  my second diff would be a revert:

If we have booted from a disk that is a member of a bootable
softraid volume, always select the srX device unless the 'a'
partition of the disk is FFS.

(Other architectures copied that, except sparc64 checking FS_RAID.)

Re: snmpd_metrics: differentiate between hrSWRunName and hrSWRunPath

2023-10-24 Thread Theo Buehler

On Wed, Oct 18, 2023 at 01:49:00PM +0200, Martijn van Duren wrote:
> Right now we return the same value for both hrSWRunName and hrSWRunPath.
> hrSWRunPath should return the full path of the binary, and hrSWRunName
> a description of the running software.
> 
> Afaik there's no proper way to retrieve the full path of the running
> binary.

No there isn't. It's been discussed many times.

> However, in a lot of cases argv[0] can contain the full or
> relative path. But if argv[0] gets overwritten (like most of our
> daemons' children) it gives a more descriptive name, which is more in
> line with hrSWRunName.
> 
> netsnmp's snmpd uses argv[0] for hrSWRunPath and kinfo_proc's p_comm for
> hrSWRunName, and snmptop defaults to hrSWRunName. top(1) also defaults
> to p_comm, but contrary to top(1), snmptop allows us to switch between
> hrSWRunName, and hrSWRunPath and toggling of hrSWRunParameters
> independently, where top(1) toggles between p_comm and argv[].
> 
> So there's an argument to be made either way, but for this diff I stuck
> with netsnmp's choices.

I am fine with your choices and people feeling strongly about this had
week to object.

> 
> While here, change the buffer length from 128 to 129. HOST-RESOURCES-MIB
> allows up to 128 characters in the response, so make room for the
> terminating NUL.
> 
> Thoughts? OK?

ok tb

Re: snmpd_metrics: add HOST-RESOURCES-MIB:hrSWRunPerfTable

2023-10-24 Thread Theo Buehler

On Wed, Oct 18, 2023 at 11:54:06AM +0200, Martijn van Duren wrote:
> This diff adds the two entries from the hrSWRunPerfTable:
> hrSWRunPerfCPU, and hrSWRunPerfMem. This allows snmptop from the
> net-snmp package to work. Math stolen^Wcopied from top/machine.c.
> 
> OK?

ok

> 
> martijn@
> 
> Index: mib.c
> ===
> RCS file: /cvs/src/libexec/snmpd/snmpd_metrics/mib.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 mib.c
> --- mib.c 4 Jul 2023 11:34:19 -   1.4
> +++ mib.c 18 Oct 2023 09:52:07 -
> @@ -69,6 +69,10 @@ struct eventconnev;
>  const char   *agentxsocket = NULL;
>  int   agentxfd = -1;
>  
> +int   pageshift;
> +#define pagetok(size) ((size) << pageshift)
> +
> +void  pageshift_init(void);
>  void  snmp_connect(struct agentx *, void *, int);
>  void  snmp_tryconnect(int, short, void *);
>  void  snmp_read(int, short, void *);
> @@ -89,6 +93,7 @@ struct agentx_object *hrProcessorLoad;
>  struct agentx_index *hrSWRunIdx;
>  struct agentx_object *hrSWRunIndex, *hrSWRunName, *hrSWRunID, *hrSWRunPath;
>  struct agentx_object *hrSWRunParameters, *hrSWRunType, *hrSWRunStatus;
> +struct agentx_object *hrSWRunPerfCPU, *hrSWRunPerfMem;
>  
>  void  mib_hrsystemuptime(struct agentx_varbind *);
>  void  mib_hrsystemdate(struct agentx_varbind *);
> @@ -634,6 +639,7 @@ mib_hrswrun(struct agentx_varbind *vb)
>   struct agentx_object*obj;
>   enum agentx_request_type req;
>   int32_t  idx;
> + int32_t  time;
>   struct kinfo_proc   *kinfo;
>   char*s;
>  
> @@ -711,6 +717,13 @@ mib_hrswrun(struct agentx_varbind *vb)
>   agentx_varbind_integer(vb, 4);
>   break;
>   }
> + } else if (obj == hrSWRunPerfCPU) {
> + time = kinfo->p_rtime_sec * 100;
> + time += (kinfo->p_rtime_usec + 5000) / 1;
> + agentx_varbind_integer(vb, time);
> + } else if (obj == hrSWRunPerfMem) {
> + agentx_varbind_integer(vb, pagetok(kinfo->p_vm_tsize +
> + kinfo->p_vm_dsize + kinfo->p_vm_ssize));
>   } else
>   fatal("%s: Unexpected object", __func__);
>  }
> @@ -3278,6 +3291,7 @@ main(int argc, char *argv[])
>   kr_init();
>   pf_init();
>   timer_init();
> + pageshift_init();
>  
>   if (agentxsocket != NULL) {
>   if (strlcpy(agentxsocketdir, agentxsocket,
> @@ -3375,6 +3389,10 @@ main(int argc, char *argv[])
>   (hrSWRunType = agentx_object(host, AGENTX_OID(HRSWRUNTYPE),
>   , 1, 0, mib_hrswrun)) == NULL ||
>   (hrSWRunStatus = agentx_object(host, AGENTX_OID(HRSWRUNSTATUS),
> + , 1, 0, mib_hrswrun)) == NULL ||
> + (hrSWRunPerfCPU = agentx_object(host, AGENTX_OID(HRSWRUNPERFCPU),
> + , 1, 0, mib_hrswrun)) == NULL ||
> + (hrSWRunPerfMem = agentx_object(host, AGENTX_OID(HRSWRUNPERFMEM),
>   , 1, 0, mib_hrswrun)) == NULL)
>   fatal("agentx_object");
>  
> @@ -4318,6 +4336,22 @@ main(int argc, char *argv[])
>   log_setverbose(verbose);
>  
>   event_dispatch();
> +}
> +
> +#define LOG1024   10
> +void
> +pageshift_init(void)
> +{
> + long pagesize;
> +
> + if ((pagesize = sysconf(_SC_PAGESIZE)) == -1)
> + fatal("sysconf(_SC_PAGESIZE)");
> + while (pagesize > 1) {
> + pageshift++;
> + pagesize >>= 1;
> + }
> + /* we only need the amount of log(2)1024 for our conversion */
> + pageshift -= LOG1024;
>  }
>  
>  void
>

Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler

> Thanks. Do we have a list of ports that use ibuf / imsg? 

Here's a list obtained from the nm output of all ports by grepping for
'^ibuf_.* U' and '^imsg_.* U', respectively, then weeding out the
obvious false positives::

ibuf:

mail/opensmtpd-extras,-main
net/ladvd

imsg:

audio/amused
devel/got,-main
mail/opensmtpd-extras,-main
mail/pop3d
mail/smtp-vilter
mail/smtp-vilter,ldap
net/gelatod
net/gmid
net/openmdns
net/telescope
net/thingsd
sysutils/tabled
sysutils/tmate

Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler

On Tue, Oct 24, 2023 at 04:00:42PM +0200, Claudio Jeker wrote:
> On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> > On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > > close any fd still on the buffer. This way even if a fd is unexpectedly
> > > passed nothing will happen.
> > > 
> > > That code was disabled at start because userland was not fully ready. In
> > > particular rpki-client did not handle that well. All of this is to my
> > > knowledge fixed so there is no reason to keep the NOTYET :)
> > > 
> > > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > > Code not doing so will break because ibuf_free() will close the fd which
> > > is probably still in use somewhere else.
> > 
> > Nothing in base outside of libutil seems to reach directly for the fd
> > (checked by compiling with that struct member renamed in the public
> > header).
> > 
> > The internal uses are addressed by this diff, so
> > 
> > ok tb
> > 
> > I can put the fd rename through a bulk to catch some ports in a couple
> > of days but I don't think there is a need to wait.
> 
> Thanks. Do we have a list of ports that use ibuf / imsg? 

I don't think so. The list of ports using libutil is rather long. my sql
is nonexistent but I get a few hundred.

Re: ibuf free fd on close

2023-10-24 Thread Claudio Jeker

On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > close any fd still on the buffer. This way even if a fd is unexpectedly
> > passed nothing will happen.
> > 
> > That code was disabled at start because userland was not fully ready. In
> > particular rpki-client did not handle that well. All of this is to my
> > knowledge fixed so there is no reason to keep the NOTYET :)
> > 
> > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > Code not doing so will break because ibuf_free() will close the fd which
> > is probably still in use somewhere else.
> 
> Nothing in base outside of libutil seems to reach directly for the fd
> (checked by compiling with that struct member renamed in the public
> header).
> 
> The internal uses are addressed by this diff, so
> 
> ok tb
> 
> I can put the fd rename through a bulk to catch some ports in a couple
> of days but I don't think there is a need to wait.

Thanks. Do we have a list of ports that use ibuf / imsg? 

-- 
:wq Claudio

Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler

On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> close any fd still on the buffer. This way even if a fd is unexpectedly
> passed nothing will happen.
> 
> That code was disabled at start because userland was not fully ready. In
> particular rpki-client did not handle that well. All of this is to my
> knowledge fixed so there is no reason to keep the NOTYET :)
> 
> With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> Code not doing so will break because ibuf_free() will close the fd which
> is probably still in use somewhere else.

Nothing in base outside of libutil seems to reach directly for the fd
(checked by compiling with that struct member renamed in the public
header).

The internal uses are addressed by this diff, so

ok tb

I can put the fd rename through a bulk to catch some ports in a couple
of days but I don't think there is a need to wait.

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-24 Thread Crystal Kolipe

On Tue, Oct 24, 2023 at 01:44:08AM +, Klemens Nanni wrote:
> Rereading the code, I now question why it checks the 'a' label type at all.
> 
> Taking your sd0d example through devboot():
> 
> |#ifdef SOFTRAID
> | /*
> |  * Determine the partition type for the 'a' partition of the
> |  * boot device.
> |  */
> | TAILQ_FOREACH(dip, , list)
> | if (dip->bios_info.bios_number == bootdev &&
> | (dip->bios_info.flags & BDI_BADLABEL) == 0)
> | part_type = dip->disklabel.d_partitions[0].p_fstype;
> 
> Whatever sd0a's type is...
> 
> |
> | /*
> |  * See if we booted from a disk that is a member of a bootable
> |  * softraid volume.
> |  */
> | SLIST_FOREACH(bv, _volumes, sbv_link) {
> | if (bv->sbv_flags & BIOC_SCBOOTABLE)
> | SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
> | if (bc->sbc_disk == bootdev)
> 
> ... the boot loader sees we booted from a bootable softraid chunk,
> matching disk sd0 by number and not partition.
> 
> | sr_boot_vol = bv->sbv_unit;
> | if (sr_boot_vol != -1)
> | break;
> | }
> |#endif
> |
> | if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> 
> With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
> but that's still enough for the boot loader to stick to softraid on sd0!
> 
> | *p++ = 's';
> | *p++ = 'r';
> | *p++ = '0' + sr_boot_vol;
> | } else if (bootdev & 0x100) {
> 
> So why check the 'a' label type if that's not relevant?

The only reason I can see for that check is to better support the unusual
configuration of booting from a disk which happens to be part of a bootable
softraid volume, but which also has a regular FFS partition outside of the
RAID, and for whatever reason the user wants to automatically boot from that
instead.

Maybe that was useful before support for booting from softraid crypto volumes
was introduced, (when it was raid-1 only).  It doesn't seem very useful now.

> It is confusing and 
> Doubling down on the assumption that bootable chunks are always on 'a'
> like my diff did shows that's a mistake and dropping the check actually
> makes more sense to me now.

I agree it is confusing.

Effectively the code is there to treat part_type == FS_BSDFFS an 'edge case'
as described above.

> This boots with root on softraid on sd0a and sd0d.
> 
> Thoughts?
> Am I missing something?

It works OK here, so I'm fine with removing this.

Re: snmpd [15/16]: When we have an error, all oids must be identical to the request

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:31:20PM +0200, Martijn van Duren wrote:
> RFC3416 section 4.2.1 (and others) tells us that if an error occurs the
> varbindlist in the response must be identical to the original request.
> 
> There might be other edge-cases here, but let's at least make sure that
> the OIDs haven't been tampered with.

ok

Re: snmpd [14/16]: Validate the returned error code

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:28:12PM +0200, Martijn van Duren wrote:
> 
> Certain error codes are only intended for certain request-types. Add an
> appl_error_valid() function to test for this.

ok tb

Re: snmpd [13/16]: registered instances must not return below OID

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:25:29PM +0200, Martijn van Duren wrote:
> If a backend registers as an instance it must never return OIDs below
> their registration. Add a test for this in appl_varbind_valid().
> 
> OK?

ok with a tiny nit inline

> 
> martijn@
> 
> diff --git a/application.c b/application.c
> index dfa7220..6ddeb39 100644
> --- a/application.c
> +++ b/application.c
> @@ -128,8 +128,8 @@ void appl_request_upstream_resolve(struct 
> appl_request_upstream *);
>  void appl_request_downstream_send(struct appl_request_downstream *);
>  void appl_request_downstream_timeout(int, short, void *);
>  void appl_request_upstream_reply(struct appl_request_upstream *);
> -int appl_varbind_valid(struct appl_varbind *, struct appl_varbind *, int, 
> int,
> -int, const char **);
> +int appl_varbind_valid(struct appl_varbind *, struct appl_varbind_internal *,
> +int, int, int, const char **);
>  int appl_varbind_backend(struct appl_varbind_internal *);
>  void appl_varbind_error(struct appl_varbind_internal *, enum appl_error);
>  void appl_report(struct snmp_message *, int32_t, struct ber_oid *,
> @@ -1073,9 +1073,9 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>  
>   vb = vblist;
>   for (i = 1; vb != NULL; vb = vb->av_next, i++) {
> -if (!appl_varbind_valid(vb, origvb == NULL ?
> - NULL : &(origvb->avi_varbind), next,
> -error != APPL_ERROR_NOERROR, backend->ab_range, 
> )) {
> +if (!appl_varbind_valid(vb, origvb == NULL ?  NULL : origvb,

could that not just be

if (!appl_varbind_valid(vb, origvb,

(also: use tabs for indent)

> + next, error != APPL_ERROR_NOERROR, backend->ab_range,
> + )) {
>   smi_oid2string(&(vb->av_oid), oidbuf,
>   sizeof(oidbuf), 0);
>   log_warnx("%s: %"PRIu32" %s: %s",
> @@ -1173,8 +1173,9 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>  }
>  
>  int
> -appl_varbind_valid(struct appl_varbind *varbind, struct appl_varbind 
> *request,
> -int next, int null, int range, const char **errstr)
> +appl_varbind_valid(struct appl_varbind *varbind,
> +struct appl_varbind_internal *request, int next, int null, int range,
> +const char **errstr)
>  {
>   int cmp;
>   int eomv = 0;
> @@ -1259,24 +1260,32 @@ appl_varbind_valid(struct appl_varbind *varbind, 
> struct appl_varbind *request,
>   if (request == NULL)
>   return 1;
>  
> - cmp = ober_oid_cmp(&(request->av_oid), &(varbind->av_oid));
> - if (next && !eomv) {
> - if (request->av_include) {
> - if (cmp > 0) {
> - *errstr = "oid not incrementing";
> - return 0;
> + cmp = ober_oid_cmp(&(request->avi_varbind.av_oid), &(varbind->av_oid));
> + if (next) {
> + if (request->avi_region->ar_instance &&
> + ober_oid_cmp(&(request->avi_region->ar_oid),
> + &(varbind->av_oid)) != 0) {
> + *errstr = "oid below instance";
> + return 0;
> + }
> + if (!eomv) {
> + if (request->avi_varbind.av_include) {
> + if (cmp > 0) {
> + *errstr = "oid not incrementing";
> + return 0;
> + }
> + } else {
> + if (cmp >= 0) {
> + *errstr = "oid not incrementing";
> + return 0;
> + }
>   }
> - } else {
> - if (cmp >= 0) {
> - *errstr = "oid not incrementing";
> + if (range && ober_oid_cmp(&(varbind->av_oid),
> + &(request->avi_varbind.av_oid_end)) > 0) {
> + *errstr = "end oid not honoured";
>   return 0;
>   }
>   }
> - if (range && ober_oid_cmp(&(varbind->av_oid),
> - &(request->av_oid_end)) > 0) {
> - *errstr = "end oid not honoured";
> - return 0;
> - }
>   } else {
>   if (cmp != 0) {
>   *errstr = "oids not equal";
>

Re: snmpd [12/16]: Make ab_range in application_agentx explicit 1

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:21:45PM +0200, Martijn van Duren wrote:
> appl_agentx_session doesn't set ab_range explicitly to 1, and thus
> relies on malloc randomness to set it. Sit it explicitly.

ok, but I haven't verified that all the session's members are now
initialized.

> 
> OK?
> 
> martijn@
> 
> diff --git a/application_agentx.c b/application_agentx.c
> index 680725d..79900d6 100644
> --- a/application_agentx.c
> +++ b/application_agentx.c
> @@ -548,6 +548,7 @@ appl_agentx_open(struct appl_agentx_connection *conn, 
> struct ax_pdu *pdu)
>   session->sess_backend.ab_cookie = session;
>   session->sess_backend.ab_retries = 0;
>   session->sess_backend.ab_fn = _agentx_functions;
> + session->sess_backend.ab_range = 1;
>   RB_INIT(&(session->sess_backend.ab_requests));
>   TAILQ_INSERT_TAIL(&(conn->conn_sessions), session, sess_conn_entry);
>  
>

Re: snmpd [11/16]: When a request results in EOMV we must return the requesting OID

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:17:19PM +0200, Martijn van Duren wrote:
> According to RFC3416 section 4.2.2 and 4.2.3 case "(2)" when an
> endOfMibView is returned the OID must be identical to originally
> requested OID. Currently this can fail when the original request
> is in a !last registered region and all subsequent regions also
> return EOMV.
> 
> Store the original OID and use that on EOMV.

ok tb

Re: snmpd [10/16]: Make retries on open session where connection is closed return early

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:13:57PM +0200, Martijn van Duren wrote:
> Here's a special case unlikely to be found in the wild:
> When opening 2 sessions on an agentx connection (already unusual) and
> registering 2 overlapping regions on the different sessions, e.g. by
> differing in priority (even more unusual) and we close the underlying
> connection with an outstanding request to the dominant region we will
> call appl_agentx_free(), which sequentially closes all sessions.
> If the session with the outstanding request is closed before the
> second session the request is retried before said session is cleaned
> up and it will try to send it over a conn_ax which at that point has
> been set to NULL, resulting in a SIGSEGV.
> 
> Simply return early and let this second request be cancelled by the
> cleanup of the second session.

Makes total sense.

ok tb

Re: snmpd [9/16]: Fix range handling with appl_unregister

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:11:18PM +0200, Martijn van Duren wrote:
> Right now (un)registering a region with range_subid set to !0 will
> fail. Apparently nothing in the wild uses this, but let's fix it.
> 
> This is the unregister part.

ok tb

Re: snmpd [8/16]: Fix range handling with appl_register

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:08:00PM +0200, Martijn van Duren wrote:
> Right now (un)registering a region with range_subid set to !0 will
> fail. Apparently nothing in the wild uses this, but let's fix it.
> 
> This is the register part.

ok tb

Re: snmpd [7/16]: Treat agentx-close-pdu with reasonByManager as a parseerror

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 03:03:16PM +0200, Martijn van Duren wrote:
> 
> RFC2741 section 6.2.2 says that reasonByManager can only be used by the
> agentx master. Treat this reason as a parseerror.

ok tb

Re: snmpd [6/16]: support close reason for appl_agentx_free

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 02:59:52PM +0200, Martijn van Duren wrote:
> appl_agentx_free() closes any potential open sessions before closing the
> connection and cleaning up. This function is called from multiple
> contexts and the current APPL_CLOSE_REASONSHUTDOWN is not always
> applicable. Add a second reason parameter that can be passed onto
> appl_agentx_forceclose().

ok tb

Re: snmpd [5/16]: Check context existence in appl_agentx_recv

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 02:56:59PM +0200, Martijn van Duren wrote:
> application.c checks the context where applicable, but not every
> agentx-pdu goes through there (e.g. agentx-ping-pdu). Make sure
> we always check the context in appl_agentx_recv()

ok tb

Re: snmpd [4/16]: check agentx-pdu-header flags for validity

2023-10-24 Thread Theo Buehler

On Tue, Oct 17, 2023 at 02:54:07PM +0200, Martijn van Duren wrote:
> RFC2741 section 6.1 specifies which PDUs can contain which header flags.
> Check that that incoming agentx PDUs have valid flags in
> appl_agentx_recv(). While here I cleaned up a few log messages some
> minor restructuring to prevent the function growing too large.

ok tb

Re: relayd.conf.5: less SSL

2023-10-24 Thread Peter N. M. Hansteen

On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> - parse.y still accepting undocumented "ssl" with a warning since 2014
> - more "SSL/TLS" instead of "TLS" in manual and code comments

my take would be that while it's fine to streamline the documentation to use
the modern terminology, I suspect there may still be ancient configurations
out there that use the "ssl" keyword, so removing the last bit of support for
that option should be accompanied by or preceded by a warning on relevant
mailing lists or at least in the commit message. 

And I think undeadly.org would be more than happy to help spread the word :)

- Peter

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Klemens Nanni

On Mon, Oct 23, 2023 at 06:36:21PM -0300, Crystal Kolipe wrote:
> On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> > 10/16/23 04:02, Klemens Nanni ??:
> > > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > > thus using chunks with different partition types is not possible:
> > > 
> > >   # vmctl create -s100M disk.img
> > >   # vnd=`vnconfig disk.img`
> > >   # echo 'swap *' | disklabel -wAT- vnd0
> > > 
> > >   # disklabel $vnd | grep swap
> > > a:   2048000swap
> > >   # bioctl -c c -l ${vnd}a softraid0
> > >   softraid0: invalid metadata format
> > > 
> > > Correct the check.
> > > I don't expect this to break anything.
> > > amd64 biosboot boots off standard RAID 'a' as before.
> > > 
> > > Feedback? Objection? OK?
> > 
> > Ping.
> 
> This breaks booting off of a RAID that is not on partition 'a', on amd64.
> 
> Was this intentional?

No.

> 
> For example, if you have a RAID on 'd', with no 'a' partition at all, then
> with your patch the machine becomes unbootable.
> 
> The second stage bootloader doesn't automatically find the softraid volume.
> Manually booting the kernel from it results in a kernel panic when the
> kernel can't find the root filesystem.
> 
> Although booting from a RAID on a non-'a' partition is not supported on all
> archs, it has worked fine on amd64 for a long time, so it's quite possible
> that people have machines deployed that boot from other RAID partitions.

We don't recommend it, but this sentence in softraid(4) is the only thing I
can find actually talking about boot requirements wrt. chunks and labels:

On sparc64, bootable chunks must be RAID partitions using the letter 
‘a’.

It does imply that other architectures can boot from non-'a' partitions,
I've done so myself, you example shows it and I don't intent to change that.

> 
> This change would unexpectedly break them, and it would potentially be quite
> painful for any users who upgrade to 7.5 and find out afterwards that their
> machine doesn't boot, because the work-around would likely be to boot the
> ramdisk kernel, and unpack mdec/boot from the base package of the previous
> release then re-run installboot specifying the old mdec/boot.
> 
> That wouldn't be at all obvious to users without a lot of OpenBSD experience.

Rereading the code, I now question why it checks the 'a' label type at all.

Taking your sd0d example through devboot():

|#ifdef SOFTRAID
|   /*
|* Determine the partition type for the 'a' partition of the
|* boot device.
|*/
|   TAILQ_FOREACH(dip, , list)
|   if (dip->bios_info.bios_number == bootdev &&
|   (dip->bios_info.flags & BDI_BADLABEL) == 0)
|   part_type = dip->disklabel.d_partitions[0].p_fstype;

Whatever sd0a's type is...

|
|   /*
|* See if we booted from a disk that is a member of a bootable
|* softraid volume.
|*/
|   SLIST_FOREACH(bv, _volumes, sbv_link) {
|   if (bv->sbv_flags & BIOC_SCBOOTABLE)
|   SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
|   if (bc->sbc_disk == bootdev)

... the boot loader sees we booted from a bootable softraid chunk,
matching disk sd0 by number and not partition.

|   sr_boot_vol = bv->sbv_unit;
|   if (sr_boot_vol != -1)
|   break;
|   }
|#endif
|
|   if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {

With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
but that's still enough for the boot loader to stick to softraid on sd0!

|   *p++ = 's';
|   *p++ = 'r';
|   *p++ = '0' + sr_boot_vol;
|   } else if (bootdev & 0x100) {

So why check the 'a' label type if that's not relevant?
It is confusing and 
Doubling down on the assumption that bootable chunks are always on 'a'
like my diff did shows that's a mistake and dropping the check actually
makes more sense to me now.

This boots with root on softraid on sd0a and sd0d.

Thoughts?
Am I missing something?

Index: sys/arch/amd64//stand/libsa/dev_i386.c
===
RCS file: /cvs/src/sys/arch/amd64/stand/libsa/dev_i386.c,v
retrieving revision 1.23
diff -u -p -r1.23 dev_i386.c
--- sys/arch/amd64//stand/libsa/dev_i386.c  10 May 2019 21:20:43 -  
1.23
+++ sys/arch/amd64//stand/libsa/dev_i386.c  24 Oct 2023 01:43:06 -
@@ -103,22 +103,11 @@ devboot(dev_t bootdev, char *p)
 #ifdef S

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Theo de Raadt

Crystal Kolipe  wrote:

`> Hi Theo, it's a long time since we last conversed.
> 
> On Mon, Oct 23, 2023 at 03:44:17PM -0600, Theo de Raadt wrote:
> > What user without OpenBSD experience is booting from 'd'?
> > 
> > Which also poses the question -- what user with OpenBSD experience
> > is booting from 'd'?
> > 
> > Why?
> 
> Some disklabel partitions have traditionally had specific meanings:
> 
> a - root fs
> b - swap
> c - whole disk
> d - on non-OpenBSD systems is 'whole disk' where c is not
> i - often used for non FFS partitions on OpenBSD
> 
> Why would you automatically make a RAID as parition 'a'?
> 
> It's not a root fs.
> 
> I don't see any logic behind RAID partition = 'a'.
> 
> What if you want more than one?
> 
> Booting from non-'a' softraids has never been discouraged on amd64.
> 
> It's been noted that on other archs it doesn't work, but there has
> never been a general advisory that booting from softraid volumes should
> only be done from 'a'.
> 
> > You say "quite possible that people have machines deployed that boot
> > from other RAID partitions"
> 
> After 10+ years of it working, do you really think that nobody has
> ever done it?



The way you write, it sounds personal.

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Crystal Kolipe

Hi Theo, it's a long time since we last conversed.

On Mon, Oct 23, 2023 at 03:44:17PM -0600, Theo de Raadt wrote:
> What user without OpenBSD experience is booting from 'd'?
> 
> Which also poses the question -- what user with OpenBSD experience
> is booting from 'd'?
> 
> Why?

Some disklabel partitions have traditionally had specific meanings:

a - root fs
b - swap
c - whole disk
d - on non-OpenBSD systems is 'whole disk' where c is not
i - often used for non FFS partitions on OpenBSD

Why would you automatically make a RAID as parition 'a'?

It's not a root fs.

I don't see any logic behind RAID partition = 'a'.

What if you want more than one?

Booting from non-'a' softraids has never been discouraged on amd64.

It's been noted that on other archs it doesn't work, but there has
never been a general advisory that booting from softraid volumes should
only be done from 'a'.

> You say "quite possible that people have machines deployed that boot
> from other RAID partitions"

After 10+ years of it working, do you really think that nobody has
ever done it?

> Who?  Just you?

If it's just me, I can easy fix my own machine.

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Theo de Raadt

Crystal Kolipe  wrote:

> On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> > 10/16/23 04:02, Klemens Nanni ??:
> > > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > > thus using chunks with different partition types is not possible:
> > > 
> > >   # vmctl create -s100M disk.img
> > >   # vnd=`vnconfig disk.img`
> > >   # echo 'swap *' | disklabel -wAT- vnd0
> > > 
> > >   # disklabel $vnd | grep swap
> > > a:   2048000swap
> > >   # bioctl -c c -l ${vnd}a softraid0
> > >   softraid0: invalid metadata format
> > > 
> > > Correct the check.
> > > I don't expect this to break anything.
> > > amd64 biosboot boots off standard RAID 'a' as before.
> > > 
> > > Feedback? Objection? OK?
> > 
> > Ping.
> 
> This breaks booting off of a RAID that is not on partition 'a', on amd64.
> 
> Was this intentional?
> 
> For example, if you have a RAID on 'd', with no 'a' partition at all, then
> with your patch the machine becomes unbootable.
> 
> The second stage bootloader doesn't automatically find the softraid volume.
> Manually booting the kernel from it results in a kernel panic when the
> kernel can't find the root filesystem.
> 
> Although booting from a RAID on a non-'a' partition is not supported on all
> archs, it has worked fine on amd64 for a long time, so it's quite possible
> that people have machines deployed that boot from other RAID partitions.
> 
> This change would unexpectedly break them, and it would potentially be quite
> painful for any users who upgrade to 7.5 and find out afterwards that their
> machine doesn't boot, because the work-around would likely be to boot the
> ramdisk kernel, and unpack mdec/boot from the base package of the previous
> release then re-run installboot specifying the old mdec/boot.
> 
> That wouldn't be at all obvious to users without a lot of OpenBSD experience.


What user without OpenBSD experience is booting from 'd'?

Which also poses the question -- what user with OpenBSD experience
is booting from 'd'?

Why?


You say "quite possible that people have machines deployed that boot
from other RAID partitions"

Who?  Just you?

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Crystal Kolipe

On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> 10/16/23 04:02, Klemens Nanni ??:
> > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > thus using chunks with different partition types is not possible:
> > 
> > # vmctl create -s100M disk.img
> > # vnd=`vnconfig disk.img`
> > # echo 'swap *' | disklabel -wAT- vnd0
> > 
> > # disklabel $vnd | grep swap
> >   a:   2048000swap
> > # bioctl -c c -l ${vnd}a softraid0
> > softraid0: invalid metadata format
> > 
> > Correct the check.
> > I don't expect this to break anything.
> > amd64 biosboot boots off standard RAID 'a' as before.
> > 
> > Feedback? Objection? OK?
> 
> Ping.

This breaks booting off of a RAID that is not on partition 'a', on amd64.

Was this intentional?

For example, if you have a RAID on 'd', with no 'a' partition at all, then
with your patch the machine becomes unbootable.

The second stage bootloader doesn't automatically find the softraid volume.
Manually booting the kernel from it results in a kernel panic when the
kernel can't find the root filesystem.

Although booting from a RAID on a non-'a' partition is not supported on all
archs, it has worked fine on amd64 for a long time, so it's quite possible
that people have machines deployed that boot from other RAID partitions.

This change would unexpectedly break them, and it would potentially be quite
painful for any users who upgrade to 7.5 and find out afterwards that their
machine doesn't boot, because the work-around would likely be to boot the
ramdisk kernel, and unpack mdec/boot from the base package of the previous
release then re-run installboot specifying the old mdec/boot.

That wouldn't be at all obvious to users without a lot of OpenBSD experience.

Re: fix an error in flowspec_get_addr()

2023-10-23 Thread Theo Buehler

On Mon, Oct 23, 2023 at 12:50:39PM +0200, Claudio Jeker wrote:
> So flowspec_get_addr() in the IPv6 case is utterly complicated.

Indeed. It would probably be worthwhile to add a bit of regress
exercising the branches of this function.

> Since matching can be done on some sub-part of the prefix.
> So there is this shift_right() call that moves takes care of this special
> offset.
> 
> Now the shift_right call uses *olen but should actually use xoff instead.
> *olen is set much later in the code.

much = 3 lines? :)

> This should fix:
> https://github.com/openbgpd-portable/openbgpd-portable/security/code-scanning/2

Indeed.

ok

> -- 
> :wq Claudio
> 
> Index: flowspec.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/flowspec.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 flowspec.c
> --- flowspec.c19 Apr 2023 09:31:58 -  1.4
> +++ flowspec.c23 Oct 2023 10:44:22 -
> @@ -366,7 +366,7 @@ flowspec_get_addr(const uint8_t *flow, i
>   if (extract_prefix(comp + 2, complen - 2, buf, xlen,
>   sizeof(buf)) == -1)
>   return -1;
> - shift_right(addr->v6.s6_addr, buf, *olen, xlen);
> + shift_right(addr->v6.s6_addr, buf, xoff, xlen);
>   *plen = comp[0];
>   if (olen != NULL)
>   *olen = comp[1];
>

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Klemens Nanni

10/16/23 04:02, Klemens Nanni пишет:
> The current check implies one could use, e.g. SWAP or MSDOS partitions
> as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> thus using chunks with different partition types is not possible:
> 
>   # vmctl create -s100M disk.img
>   # vnd=`vnconfig disk.img`
>   # echo 'swap *' | disklabel -wAT- vnd0
> 
>   # disklabel $vnd | grep swap
> a:   2048000swap
>   # bioctl -c c -l ${vnd}a softraid0
>   softraid0: invalid metadata format
> 
> Correct the check.
> I don't expect this to break anything.
> amd64 biosboot boots off standard RAID 'a' as before.
> 
> Feedback? Objection? OK?

Ping.

> 
> Index: arch/amd64/stand/efiboot/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/efiboot/dev_i386.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 dev_i386.c
> --- arch/amd64/stand/efiboot/dev_i386.c   10 May 2019 21:20:42 -  
> 1.1
> +++ arch/amd64/stand/efiboot/dev_i386.c   16 Oct 2023 00:33:14 -
> @@ -149,7 +149,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/amd64/stand/libsa/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/libsa/dev_i386.c,v
> retrieving revision 1.23
> diff -u -p -r1.23 dev_i386.c
> --- arch/amd64/stand/libsa/dev_i386.c 10 May 2019 21:20:43 -  1.23
> +++ arch/amd64/stand/libsa/dev_i386.c 16 Oct 2023 00:31:35 -
> @@ -132,7 +132,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/arm64/stand/efiboot/efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/stand/efiboot/efiboot.c,v
> retrieving revision 1.48
> diff -u -p -r1.48 efiboot.c
> --- arch/arm64/stand/efiboot/efiboot.c12 May 2023 16:43:00 -  
> 1.48
> +++ arch/arm64/stand/efiboot/efiboot.c16 Oct 2023 00:34:17 -
> @@ -864,7 +864,7 @@ devboot(dev_t dev, char *p)
>   break;
>   }
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   strlcpy(p, "sr0a", 5);
>   p[2] = '0' + sr_boot_vol;
>   return;
> Index: arch/i386/stand/libsa/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/i386/stand/libsa/dev_i386.c,v
> retrieving revision 1.43
> diff -u -p -r1.43 dev_i386.c
> --- arch/i386/stand/libsa/dev_i386.c  11 Sep 2016 17:52:47 -  1.43
> +++ arch/i386/stand/libsa/dev_i386.c  16 Oct 2023 00:34:45 -
> @@ -132,7 +132,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/riscv64/stand/efiboot/efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/riscv64/stand/efiboot/efiboot.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 efiboot.c
> --- arch/riscv64/stand/efiboot/efiboot.c  5 Jul 2023 09:25:55 -   
> 1.6
> +++ arch/riscv64/stand/efiboot/efiboot.c  16 Oct 2023 00:35:14 -
> @@ -742,7 +742,7 @@ devboot(dev_t dev, char *p)
>   break;
>   }
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   strlcpy(p, "sr0a", 5);
>   p[2] = '0' + sr_boot_vol;
>   return;
>

Re: Term::Cap full revamp

2023-10-22 Thread Matthieu Herrb

On Fri, Oct 20, 2023 at 11:52:25AM +0200, Marc Espie wrote:
> I guess i will probably leave it alone after this.
> This does quite a few things compared to my former patches.
> 
> - totally get rid of eval, it doen't make sense anymore
> - declare variables before they get used, which tends to
> simplify things.
> - change quaint formatting to something more BSD like
> - update documentation to new style of doing OO
> - use defined logic on entry and such
> - always try to run infocmp as a last resort, even if
> we have a path.
> - run infocmp with the best options we have to get a good termcap
> - use \Q\E, which gets rid of termpat entirely
> - dedup the path along the way: for us, /etc/termcap
> and /usr/share/misc/termcap are the same.
> - redo recursion logic by just recording which term values we
> already saw, the max=32 value overflow was absurd, proper parsing
> yields roughly 10 or so tc redirections for xterm, not >32.

I eventually got an occation to test it. Except for the extra debug
print that sneaked. it works well for my use case. (pkg_* tools on a
terminal application from ports that ships its own terminfo entry).

ok matthieu@ with the print removed (see inline).

> 
> Index: Cap.pm
> ===
> RCS file: /cvs/src/gnu/usr.bin/perl/cpan/Term-Cap/Cap.pm,v
> retrieving revision 1.3
> diff -u -p -r1.3 Cap.pm
> --- Cap.pm18 Oct 2023 01:49:26 -  1.3
> +++ Cap.pm20 Oct 2023 09:47:05 -
> @@ -16,8 +16,8 @@ sub croak
>  
>  use strict;
>  
> +use v5.16;
>  use vars qw($VERSION $VMS_TERMCAP);
> -use vars qw($termpat $state $first $entry);
>  
>  $VERSION = '1.17';
>  
> @@ -33,7 +33,7 @@ Term::Cap - Perl termcap interface
>  =head1 SYNOPSIS
>  
>  require Term::Cap;
> -$terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed };
> +$terminal = Term::Cap->Tgetent({ TERM => undef, OSPEED => $ospeed });
>  $terminal->Trequire(qw/ce ku kd/);
>  $terminal->Tgoto('cm', $col, $row, $FH);
>  $terminal->Tputs('dl', $count, $FH);
> @@ -75,10 +75,10 @@ if ( $^O eq 'VMS' )
>  
>  sub termcap_path
>  {## private
> -my @termcap_path;
> +my @l;
>  
>  # $TERMCAP, if it's a filespec
> -push( @termcap_path, $ENV{TERMCAP} )
> +push(@l, $ENV{TERMCAP})
>if (
>  ( exists $ENV{TERMCAP} )
>  && (
> @@ -87,23 +87,27 @@ sub termcap_path
>  : $ENV{TERMCAP} =~ /^\//s
>  )
>);
> -if ( ( exists $ENV{TERMPATH} ) && ( $ENV{TERMPATH} ) )
> -{
> -
> +if (exists $ENV{TERMPATH} && $ENV{TERMPATH}) {
>  # Add the users $TERMPATH
> -push( @termcap_path, split( /(:|\s+)/, $ENV{TERMPATH} ) );
> -}
> -else
> -{
> -
> +push(@l, split( /(:|\s+)/, $ENV{TERMPATH}));
> +} else {
>  # Defaults
> -push( @termcap_path,
> -exists $ENV{'HOME'} ? $ENV{'HOME'} . '/.termcap' : undef,
> -'/etc/termcap', '/usr/share/misc/termcap', );
> + if (exists $ENV{HOME}) {
> + push(@l, $ENV{HOME}.'/.termcap');
> + }
> +push(@l, '/etc/termcap', '/usr/share/misc/termcap', );
> +}
> +my @termcap_path;
> +my $seen = {};
> +for my $i (@l) {
> + next unless -f $i;
> + my $k = join(',', (stat _)[0,1]);
> + next if $seen->{$k};
> + push(@termcap_path, $i);
> + $seen->{$k} = 1;
>  }
>  
> -# return the list of those termcaps that exist
> -return grep { defined $_ && -f $_ } @termcap_path;
> +return @termcap_path;
>  }
>  
>  =over 4
> @@ -164,195 +168,158 @@ It calls C on failure.
>  
>  sub Tgetent
>  {## public -- static method
> -my $class = shift;
> -my ($self) = @_;
> +my ($class, $self) = @_;
>  
>  $self = {} unless defined $self;
>  bless $self, $class;
>  
> -my ( $term, $cap, $search, $field, $max, $tmp_term, $TERMCAP );
> -local ( $termpat, $state, $first, $entry );# used inside eval
> +my ($cap, $field);
> + 
>  local $_;
>  
>  # Compute PADDING factor from OSPEED (to be used by Tpad)
> -if ( !$self->{OSPEED} )
> -{
> -if ($^W)
> -{
> +if (!$self->{OSPEED}) {
> +if ($^W) {
>  carp "OSPEED was not set, defaulting to 9600";
>  }
>  $self->{OSPEED} = 9600;
>  }
> -if ( $self->{OSPEED} < 16 )
> -{
> -
> +if ($self->{OSPEED} < 16) {
>  # delays for old style speeds
>  my @pad = (
>  0,200, 133.3, 90.9, 74.3, 66.7, 50, 33.3,
>  16.7, 8.3, 5.5,   4.1,  2,1,.5, .2
>  );
>  $self->{PADDING} = $pad[ $self->{OSPEED} ];
> -}
> -else
> -{
> +} else {
>  $self->{PADDING} = 1 / $self->{OSPEED};
>  }
>  
> -unless ( $self->{TERM} )
> -{
> -   if ( $ENV{TERM} )
> -   {
> - $self->{TERM} =  $ENV{TERM} ;
> -   }
> -   else
> -   {
> -  if ( $^O eq

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-22 Thread Martin Pieuchot

On 22/10/23(Sun) 20:29, Miod Vallat wrote:
> > On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > > > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > > > are seeing.  It might not.
> > > 
> > > It makes the aforementioned issue disappear on the affected machine.
> > 
> > Thanks at lot for testing!
> 
> Spoke too soon. I have just hit
> 
> panic: kernel diagnostic assertion "uvmexp.swpgonly > 0" failed: file 
> "/usr/src/sys/uvm/uvm_anon.c", line 121
> Stopped at  db_enter+0x8:   add #0x4, r14
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *235984  11904  0 0x14000  0x2000  reaper
> db_enter() at db_enter+0x8
> panic() at panic+0x74
> __assert() at __assert+0x1c
> uvm_anfree_list() at uvm_anfree_list+0x156
> amap_wipeout() at amap_wipeout+0xe6
> uvm_unmap_detach() at uvm_unmap_detach+0x42
> uvm_map_teardown() at uvm_map_teardown+0x104
> uvmspace_free() at uvmspace_free+0x2a
> reaper() at reaper+0x86
> ddb> show uvmexp
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   14875 VM pages: 376 active, 2076 inactive, 1 wired, 7418 free (924
> zero)
>   min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
>   freemin=495, free-target=660, inactive-target=2809, wired-max=4958
>   faults=73331603, traps=39755714, intrs=33863551, ctxswitch=11641480
> fpuswitch
> =0
>   softint=15742561, syscalls=39755712, kmapent=11
>   fault counts:
> noram=1, noanon=0, noamap=0, pgwait=1629, pgrele=0
> ok relocks(total)=1523991(1524022), anget(retries)=23905247(950233),
> amapco
> py=9049749
> neighbor anon/obj pg=12025732/40041442,
> gets(lock/unlock)=12859247/574102
> cases: anon=20680175, anoncow=3225049, obj=11467884, prcopy=1391019,
> przero
> =36545783
>   daemon and swap counts:
> woke=6868, revs=6246, scans=3525644, obscans=511526, anscans=2930634
> busy=0, freed=1973275, reactivate=83484, deactivate=3941988
> pageouts=94506, pending=94506, nswget=949421
> nswapdev=1
> swpages=4194415, swpginuse=621, swpgonly=0 paging=0
>   kernel pointers:
> objs(kern)=0x8c3ca94c

I wonder if the diff below makes a difference.  It's hard to debug and it
might be worth adding a counter for bad swap slots.

Index: uvm/uvm_anon.c
===
RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
retrieving revision 1.56
diff -u -p -r1.56 uvm_anon.c
--- uvm/uvm_anon.c  2 Sep 2023 08:24:40 -   1.56
+++ uvm/uvm_anon.c  22 Oct 2023 21:27:42 -
@@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
uvm_unlock_pageq(); /* free the daemon */
}
} else {
-   if (anon->an_swslot != 0) {
+   if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
/* This page is no longer only in swap. */
KASSERT(uvmexp.swpgonly > 0);
atomic_dec_int();

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-22 Thread Miod Vallat

> On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > > are seeing.  It might not.
> > 
> > It makes the aforementioned issue disappear on the affected machine.
> 
> Thanks at lot for testing!

Spoke too soon. I have just hit

panic: kernel diagnostic assertion "uvmexp.swpgonly > 0" failed: file 
"/usr/src/sys/uvm/uvm_anon.c", line 121
Stopped at  db_enter+0x8:   add #0x4, r14
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
*235984  11904  0 0x14000  0x2000  reaper
db_enter() at db_enter+0x8
panic() at panic+0x74
__assert() at __assert+0x1c
uvm_anfree_list() at uvm_anfree_list+0x156
amap_wipeout() at amap_wipeout+0xe6
uvm_unmap_detach() at uvm_unmap_detach+0x42
uvm_map_teardown() at uvm_map_teardown+0x104
uvmspace_free() at uvmspace_free+0x2a
reaper() at reaper+0x86
ddb> show uvmexp
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  14875 VM pages: 376 active, 2076 inactive, 1 wired, 7418 free (924
zero)
  min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
  freemin=495, free-target=660, inactive-target=2809, wired-max=4958
  faults=73331603, traps=39755714, intrs=33863551, ctxswitch=11641480
fpuswitch
=0
  softint=15742561, syscalls=39755712, kmapent=11
  fault counts:
noram=1, noanon=0, noamap=0, pgwait=1629, pgrele=0
ok relocks(total)=1523991(1524022), anget(retries)=23905247(950233),
amapco
py=9049749
neighbor anon/obj pg=12025732/40041442,
gets(lock/unlock)=12859247/574102
cases: anon=20680175, anoncow=3225049, obj=11467884, prcopy=1391019,
przero
=36545783
  daemon and swap counts:
woke=6868, revs=6246, scans=3525644, obscans=511526, anscans=2930634
busy=0, freed=1973275, reactivate=83484, deactivate=3941988
pageouts=94506, pending=94506, nswget=949421
nswapdev=1
swpages=4194415, swpginuse=621, swpgonly=0 paging=0
  kernel pointers:
objs(kern)=0x8c3ca94c

Re: nfsd: don't clear SB_NOINTR flag

2023-10-22 Thread Vitaliy Makkoveev

On Fri, Oct 20, 2023 at 10:51:46PM +0200, Alexander Bluhm wrote:
> On Mon, Oct 16, 2023 at 10:17:50PM +0300, Vitaliy Makkoveev wrote:
> > This socket comes from userland, so this flag is never set. This makes
> > SB_NOINTR flag immutable, because we only set this bit on NFS client
> > socket buffers for all it's lifetime.
> > 
> > I want to do this because this flag modifies sblock() behaviour and it's
> > not clean which lock should protect it for standalone sblock().
> > 
> > ok?
> 
> This code came here:
> https://svnweb.freebsd.org/csrg/sys/nfs/nfs_syscalls.c?revision=41903=markup
> "update from Rick Macklem adding TCP support to NFS"
> 
> I would guess that this flag must be cleared to allow to kill the
> NFS server if a client does not respond.  Otherwise it may hang in
> sbwait() in soreceive() or sosend().
> 
> As the flags are never set, your diff does not change behavior.
> 
> > I want to completely remove SB_NOINTR flag. Only NFS client sets it, but
> > since the socket never passed to userland, this flag is useless, because
> > we can't send singnal to kernel thread. So for this sblock()/sbwait()
> > sleep is uninterruptible. But I want to not mix NFS server and NFS
> > client diffs.
> 
> The NFS client does not run as kernel thread, but as the process
> that does file system access.  You can Ctrl-C it if mount uses
> "intr" option.  NFSMNT_INT sets some PCATCH and sb_timeo_nsecs, but
> I think signal should also abort sbwait().  That is why NFS client
> sets SB_NOINTR.
> 
> As this flag is only set during mount or reconnect.  It is a problem
> for MP?  Is it modified in a non-initialization path?
> 
> bluhm
> 

Sorry for non clean explanation.

This time solock() protects `sb_flags', so the following SB_NOINTR flag
check is fine:

sblock(struct socket *so, struct sockbuf *sb, int flags)
{
int error, prio = PSOCK;

soassertlocked(so);

if ((sb->sb_flags & SB_LOCK) == 0) {
sb->sb_flags |= SB_LOCK;
return (0);
}
if ((flags & SBL_WAIT) == 0)
return (EWOULDBLOCK);
if (!(flags & SBL_NOINTR || sb->sb_flags & SB_NOINTR))
prio |= PCATCH;

However, solock() will not protect `sb_flags' for standalone sblock().
It is not yet implemented, so here is the hypothetical sblock(). I
suppose the `sb_lock' rwlock(9) will protect the whole sockbuf data
include `sb_flags'. And there is the problem. As you can see, the
SB_NOINTR check presceeds the `sb_lock' acquisition, because the
rw_enter(9) behaviour depends on this flag.

sblock(struct sockbuf *sb, int flags)
{   
int lockflags = RW_WRITE;

if (flags & SBL_NOINTR || sb->sb_flags & SB_NOINTR)
lockflags |= RW_INTR;
if ((flags & SBL_WAIT) == 0)
lockflags |= RW_NOSLEEP;
return rw_enter(>sb_lock, lockflags);
}

I the case, when SB_NOINTR flag is immutable, there is no problem with
this unlocked check. Since this flag is really immutable and it's only
cleanup is the null op, I proposed to remove the flag cleanup.

However, this is not the only solution. This time this flag is not true
for sblock() because you always need to acquire solock() before. The
solock() acquisition is always uninterruptible, so ^C will not work. But
we assume the rwlock(9) acquisition doesn't take significant time,
right?

So, hypothetical sblock() could avoid the SB_NOINTR check. We have
SBL_NOINTR passed with `flags' for the sorflush() cases. This flag is
significant for sbwait(), but with the socket buffers standalone
locking, obviously sblock() should be taken before sbwait(), so the
`sb_flags' check will be protected:

sblock(struct sockbuf *sb, int flags)
{   
int lockflags = RW_WRITE;

if (flags & SBL_NOINTR)
lockflags |= RW_INTR;
if ((flags & SBL_WAIT) == 0)
lockflags |= RW_NOSLEEP;
return rw_enter(>sb_lock, lockflags);
}

sbwait(struct sockbuf *sb)
{   
int prio = (sb->sb_flags & SB_NOINTR) ? PSOCK : PSOCK | PCATCH;

sbassertlocked(sb);

return rwsleep_nsec(>sb_cc, >sb_lock, prio, "sbwait",
sb->sb_timeo_nsecs);
}

Of course this is hypothetical. Also, SB_NOINTR flag has sense for TCP
and UDP sockets only. I don't think they will be the first sockets with
standalone locking ability for buffers. The unlocked SB_NOINTR flag
check is fine for all other sockets, so it could be left as is for now.

Re: relayd does not delete control socket on shutdown

2023-10-22 Thread Theo de Raadt

Otto Moerbeek  wrote:

> On Sat, Oct 21, 2023 at 10:40:45PM +0300, Kapetanakis Giannis wrote:
> 
> > On 21/10/2023 20:39, Florian Obser wrote:
> > > Which was 8 years ago. I don't understand why you see a change in 7.4.
> > > 
> > > Anyway, we decided to not clean up control sockets in any of our
> > > privsep daemons because leaving them behind does not cause any issues.
> > 
> > I just noticed it today when I tried to use the socket in a script and
> > noticed that it stayed there even after shutdown and though it was after 7.4
> > but I was wrong about that.
> > 
> > Your commit made it that clear.
> > 
> > Agree it's not a big case if it stays there.
> > 
> > Would the unlink succeed if the socket was owned by _relayd?
> > 
> > G
> 
> Unlinking somthing requires write permissions to the directory it is
> in.

Which means an attacker who gains control, but otherwise can't do a bunch
of other things becuase of the privsep design -- could still fill the directory
and filesystem.

So a few years ago we asked ourselves -- is the tradeoff worth it?

Re: relayd does not delete control socket on shutdown

2023-10-22 Thread Otto Moerbeek

On Sat, Oct 21, 2023 at 10:40:45PM +0300, Kapetanakis Giannis wrote:

> On 21/10/2023 20:39, Florian Obser wrote:
> > Which was 8 years ago. I don't understand why you see a change in 7.4.
> > 
> > Anyway, we decided to not clean up control sockets in any of our
> > privsep daemons because leaving them behind does not cause any issues.
> 
> I just noticed it today when I tried to use the socket in a script and
> noticed that it stayed there even after shutdown and though it was after 7.4
> but I was wrong about that.
> 
> Your commit made it that clear.
> 
> Agree it's not a big case if it stays there.
> 
> Would the unlink succeed if the socket was owned by _relayd?
> 
> G

Unlinking somthing requires write permissions to the directory it is
in.

-Otto

Re: malloc: more info in error message for write-after-free with option D

2023-10-22 Thread Masato Asou

C_PAGESIZE, 0);
>   if (pp == MAP_FAILED)
>   return NULL;
> + if (DO_STATS) {
> + ff = map(d, MALLOC_PAGESIZE, 0);
> + if (ff == MAP_FAILED)
> + goto err;
> + memset(ff, 0, sizeof(void *) * MALLOC_PAGESIZE / 
> B2ALLOC(bucket));
> + }
>  
>   /* memory protect the page allocated in the malloc(0) case */
>   if (bucket == 0 && mprotect(pp, MALLOC_PAGESIZE, PROT_NONE) == -1)
> @@ -1011,7 +1035,7 @@ omalloc_make_chunks(struct dir_info *d, 
>   bp->page = pp;
>  
>   if (insert(d, (void *)((uintptr_t)pp | (bucket + 1)), (uintptr_t)bp,
> - NULL))
> + ff))
>   goto err;
>   LIST_INSERT_HEAD(>chunk_dir[bucket][listnum], bp, entries);
>  
> @@ -1022,6 +1046,8 @@ omalloc_make_chunks(struct dir_info *d, 
>  
>  err:
>   unmap(d, pp, MALLOC_PAGESIZE, 0);
> + if (ff != NULL && ff != MAP_FAILED)
> + unmap(d, ff, MALLOC_PAGESIZE, 0);
>   return NULL;
>  }
>  
> @@ -1101,7 +1127,7 @@ fill_canary(char *ptr, size_t sz, size_t
>   * Allocate a chunk
>   */
>  static void *
> -malloc_bytes(struct dir_info *d, size_t size, void *f)
> +malloc_bytes(struct dir_info *d, size_t size)
>  {
>   u_int i, r, bucket, listnum;
>   size_t k;
> @@ -1153,11 +1179,6 @@ malloc_bytes(struct dir_info *d, size_t 
>   }
>   }
>  found:
> - if (i == 0 && k == 0 && DO_STATS) {
> - struct region_info *r = find(d, bp->page);
> - STATS_SETF(r, f);
> - }
> -
>   *lp ^= 1 << k;
>  
>   /* If there are no more free, remove from free-list */
> @@ -1170,6 +1191,11 @@ found:
>   if (mopts.chunk_canaries && size > 0)
>   bp->bits[bp->offset + k] = size;
>  
> + if (DO_STATS) {
> + struct region_info *r = find(d, bp->page);
> + STATS_SETFN(r, k, d->caller);
> + }
> +
>   k *= B2ALLOC(bp->bucket);
>  
>   p = (char *)bp->page + k;
> @@ -1194,8 +1220,8 @@ validate_canary(struct dir_info *d, u_ch
>  
>   while (p < q) {
>   if (*p != (u_char)mopts.chunk_canaries && *p != SOME_JUNK) {
> - wrterror(d, "canary corrupted %p %#tx@%#zx%s",
> - ptr, p - ptr, sz,
> + wrterror(d, "canary corrupted %p[%tu]@%zu/%zu%s",
> + ptr, p - ptr, sz, allocated,
>   *p == SOME_FREEJUNK ? " (double free?)" : "");
>   }
>   p++;
> @@ -1215,8 +1241,7 @@ find_chunknum(struct dir_info *d, struct
>  
>   if ((uintptr_t)ptr & (MALLOC_MINSIZE - 1))
>   wrterror(d, "modified chunk-pointer %p", ptr);
> - if (info->bits[chunknum / MALLOC_BITS] &
> - (1U << (chunknum % MALLOC_BITS)))
> + if (CHUNK_FREE(info, chunknum))
>   wrterror(d, "double free %p", ptr);
>   if (check && info->bucket > 0) {
>   validate_canary(d, ptr, info->bits[info->offset + chunknum],
> @@ -1239,9 +1264,6 @@ free_bytes(struct dir_info *d, struct re
>   info = (struct chunk_info *)r->size;
>   chunknum = find_chunknum(d, info, ptr, 0);
>  
> - if (chunknum == 0)
> - STATS_SETF(r, NULL);
> -
>   info->bits[chunknum / MALLOC_BITS] |= 1U << (chunknum % MALLOC_BITS);
>   info->free++;
>  
> @@ -1261,18 +1283,22 @@ free_bytes(struct dir_info *d, struct re
>   if (info->bucket == 0 && !mopts.malloc_freeunmap)
>   mprotect(info->page, MALLOC_PAGESIZE, PROT_READ | PROT_WRITE);
>   unmap(d, info->page, MALLOC_PAGESIZE, 0);
> +#ifdef MALLOC_STATS
> + if (r->f != NULL) {
> + unmap(d, r->f, MALLOC_PAGESIZE, MALLOC_PAGESIZE);
> + r->f = NULL;
> + }
> +#endif
>  
>   delete(d, r);
>   mp = >chunk_info_list[info->bucket];
>   LIST_INSERT_HEAD(mp, info, entries);
>  }
>  
> -
> -
>  static void *
> -omalloc(struct dir_info *pool, size_t sz, int zero_fill, void *f)
> +omalloc(struct dir_info *pool, size_t sz, int zero_fill)
>  {
> - void *p;
> + void *p, *caller = NULL;
>   size_t psz;
>  
>   if (sz > MALLOC_MAXCHUNK) {
> @@ -1287,7 +1313,11 @@ omalloc(struct dir_info *pool, size_t sz
>   errno = ENOMEM;
>   return NULL;
>   }
> - if (insert(pool, p, sz, f)) {
> +#ifdef MALLOC_STATS
> +

Re: bt(5), btrace(8): execute END probe and print maps after exit() statement

2023-10-21 Thread Scott Cheloha

On Sat, Oct 21, 2023 at 07:17:05PM +0200, Martin Pieuchot wrote:
> On 18/10/23(Wed) 12:56, Scott Cheloha wrote:
> > Hi,
> > 
> > A bt(5) exit() statement causes the btrace(8) interpreter to exit(3)
> > immediately.
> > 
> > A BPFtrace exit() statement is more nuanced: the END probe is executed
> > and the contents of all maps are printed before the interpreter exits.
> > 
> > This patch adds a halting check after the execution of each bt(5)
> > statement.  If a statement causes the program to halt, the halt
> > bubbles up to the top-level rule evaluation loop and terminates
> > execution.  rules_teardown() then runs, just as if the program had
> > received SIGTERM.
> > 
> > Two edge-like cases:
> > 
> > 1. You can exit() from BEGIN.  rules_setup() returns non-zero if this
> >happens so the main loop knows to halt immediately.
> > 
> > 2. You can exit() from END.  This is just an early-return: the END probe
> >doesn't run again.
> > 
> > Thoughts?
> 
> Makes sense to ease the transition from bpftrace scripts.  Ok with me if
> you make sure the regression tests still pass.  Some outputs might
> depend on the actual behavior and would need to be updated.

Agh, my mistake, there are two tests that depend on the current
behavior.  I have updated them below.

ok with the test fixes?

Index: usr.sbin/btrace/btrace.c
===
RCS file: /cvs/src/usr.sbin/btrace/btrace.c,v
retrieving revision 1.79
diff -u -p -r1.79 btrace.c
--- usr.sbin/btrace/btrace.c12 Oct 2023 15:16:44 -  1.79
+++ usr.sbin/btrace/btrace.c22 Oct 2023 01:21:21 -
@@ -71,10 +71,10 @@ struct dtioc_probe_info *dtpi_get_by_val
  * Main loop and rule evaluation.
  */
 voidrules_do(int);
-voidrules_setup(int);
-voidrules_apply(int, struct dt_evt *);
+int rules_setup(int);
+int rules_apply(int, struct dt_evt *);
 voidrules_teardown(int);
-voidrule_eval(struct bt_rule *, struct dt_evt *);
+int rule_eval(struct bt_rule *, struct dt_evt *);
 voidrule_printmaps(struct bt_rule *);
 
 /*
@@ -84,7 +84,7 @@ uint64_t   builtin_nsecs(struct dt_evt *
 const char *builtin_kstack(struct dt_evt *);
 const char *builtin_arg(struct dt_evt *, enum bt_argtype);
 struct bt_arg  *fn_str(struct bt_arg *, struct dt_evt *, char *);
-voidstmt_eval(struct bt_stmt *, struct dt_evt *);
+int stmt_eval(struct bt_stmt *, struct dt_evt *);
 voidstmt_bucketize(struct bt_stmt *, struct dt_evt *);
 voidstmt_clear(struct bt_stmt *);
 voidstmt_delete(struct bt_stmt *, struct dt_evt *);
@@ -405,6 +405,7 @@ void
 rules_do(int fd)
 {
struct sigaction sa;
+   int halt = 0;
 
memset(, 0, sizeof(sa));
sigemptyset(_mask);
@@ -415,9 +416,9 @@ rules_do(int fd)
if (sigaction(SIGTERM, , NULL))
err(1, "sigaction");
 
-   rules_setup(fd);
+   halt = rules_setup(fd);
 
-   while (!quit_pending && g_nprobes > 0) {
+   while (!quit_pending && !halt && g_nprobes > 0) {
static struct dt_evt devtbuf[64];
ssize_t rlen;
size_t i;
@@ -434,8 +435,11 @@ rules_do(int fd)
if ((rlen % sizeof(struct dt_evt)) != 0)
err(1, "incorrect read");
 
-   for (i = 0; i < rlen / sizeof(struct dt_evt); i++)
-   rules_apply(fd, [i]);
+   for (i = 0; i < rlen / sizeof(struct dt_evt); i++) {
+   halt = rules_apply(fd, [i]);
+   if (halt)
+   break;
+   }
}
 
rules_teardown(fd);
@@ -484,7 +488,7 @@ rules_action_scan(struct bt_stmt *bs)
return evtflags;
 }
 
-void
+int
 rules_setup(int fd)
 {
struct dtioc_probe_info *dtpi;
@@ -493,7 +497,7 @@ rules_setup(int fd)
struct bt_probe *bp;
struct bt_stmt *bs;
struct bt_arg *ba;
-   int dokstack = 0, on = 1;
+   int dokstack = 0, halt = 0, on = 1;
uint64_t evtflags;
 
TAILQ_FOREACH(r, _rules, br_next) {
@@ -553,7 +557,7 @@ rules_setup(int fd)
clock_gettime(CLOCK_REALTIME, _devt.dtev_tsp);
 
if (rbegin)
-   rule_eval(rbegin, _devt);
+   halt = rule_eval(rbegin, _devt);
 
/* Enable all probes */
TAILQ_FOREACH(r, _rules, br_next) {
@@ -571,9 +575,14 @@ rules_setup(int fd)
if (ioctl(fd, DTIOCRECORD, ))
err(1, "DTIOCRECORD");
}
+
+   return halt;
 }
 
-void
+/*
+ * Returns non-zero if the program should halt.
+ */
+int
 rules_apply(int fd, struct dt_evt *dtev)
 {
struct bt_rule *r;
@@ -586,9 +595,11 @@ rules_apply(int fd,

Re: Virtio fix for testing

2023-10-21 Thread Andrew Cagney

On Mon, 21 Aug 2023 at 15:31, Andrew Cagney  wrote:
>
> On Sun, 20 Aug 2023 at 06:23, Stefan Fritsch  wrote:
> >
> > Am 13.08.23 um 17:38 schrieb Tobias Heider:

> > You could try something like
> >
> > -device virtio-scsi-pci,id=scsi
> > -drive file=install73.iso,format=raw,id=cdinst,if=none
> > -device scsi-cd,drive=cdinst
>
> good idea (just need to translate it to virt-install speak)
>
> > That depends on your seabios having support for virtio cdroms. Not sure
> > if that is the default by now.
> >
> > Or maybe try a SATA cdrom, but you would need to figure out the qemu
> > options for that yourself.

This is what I did.  Using virt-install and replacing --cdrom with
something like:

+KVM_OPENBSD_VIRT_INSTALL_FLAGS = \
+ --disk path=$(KVM_OPENBSD_BASE_ISO),readonly=on,device=cdrom,target.bus=sata \
+ --install bootdev=cdrom

I can boot/install 7.4's ISO.

thanks,
Andrew

Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Kapetanakis Giannis


On 21/10/2023 20:39, Florian Obser wrote:

Which was 8 years ago. I don't understand why you see a change in 7.4.

Anyway, we decided to not clean up control sockets in any of our
privsep daemons because leaving them behind does not cause any issues.


I just noticed it today when I tried to use the socket in a script and 
noticed that it stayed there even after shutdown and though it was after 
7.4 but I was wrong about that.


Your commit made it that clear.

Agree it's not a big case if it stays there.

Would the unlink succeed if the socket was owned by _relayd?

G

Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Florian Obser

On 2023-10-21 14:49 +03, Kapetanakis Giannis  wrote:
> Rev 1.140 by florian@ seems to have changed that.
>
> Do not try to unlink the control socket in an unprivileged child
> process on shutdown.
> Found while working ontame(2)  .
> OK benno@
>

Which was 8 years ago. I don't understand why you see a change in 7.4.

Anyway, we decided to not clean up control sockets in any of our
privsep daemons because leaving them behind does not cause any issues.

> G
>
>
> On 21/10/2023 14:41, Kapetanakis Giannis wrote:
>> After 7.4 relayd does not unlink it's socket
>>
>> I've added the following but it's probably not enough. unveil?
>>
>> G
>>
>> Index: relayd.c
>> ===
>> RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
>> retrieving revision 1.191
>> diff -u -p -r1.191 relayd.c
>> --- relayd.c    25 Jun 2023 08:07:38 -    1.191
>> +++ relayd.c    21 Oct 2023 11:39:44 -
>> @@ -382,6 +382,8 @@ parent_shutdown(struct relayd *env)
>>  free(env->sc_ps);
>>  free(env);
>>
>> +    unlink(env->sc_ps->ps_csock.cs_name);
>> +
>>  log_info("parent terminating, pid %d", getpid());
>>
>>  exit(0);
>>
>

-- 
In my defence, I have been left unsupervised.

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-21 Thread Martin Pieuchot

On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > are seeing.  It might not.
> 
> It makes the aforementioned issue disappear on the affected machine.

Thanks at lot for testing!

> > Comments, ok?
> 
> > diff --git sys/uvm/uvm_pdaemon.c sys/uvm/uvm_pdaemon.c
> > index 284211d226c..a26a776df67 100644
> > --- sys/uvm/uvm_pdaemon.c
> > +++ sys/uvm/uvm_pdaemon.c
> 
> > @@ -917,9 +914,7 @@ uvmpd_scan(struct uvm_pmalloc *pma, struct 
> > uvm_constraint_range *constraint)
> >  */
> > free = uvmexp.free - BUFPAGES_DEFICIT;
> > swap_shortage = 0;
> > -   if (free < uvmexp.freetarg &&
> > -   uvmexp.swpginuse == uvmexp.swpages &&
> > -   !uvm_swapisfull() &&
> > +   if (free < uvmexp.freetarg && uvm_swapisfilled() && !uvm_swapisfull() &&
> > pages_freed == 0) {
> > swap_shortage = uvmexp.freetarg - free;
> > }
> 
> It's unfortunate that you now invoke two uvm_swapisxxx() routines, which
> will both acquire a mutex. Maybe a third uvm_swapisxxx routine could be
> introduced to compute the swapisfilled && !swapisfull condition at once?

I'm mot interested in such micro optimization yet.  Not acquiring a mutex
twice is IMHO not worth making half-shiny. 

However is someone is interested in going down in this direction, I'd
suggest try placing `uvmexp.freetarg' under the same lock and deal with
all its occurrences.  This is a possible next step to reduce the scope of
the uvm_lock_pageq() which is currently responsible for most of the
contention on MP in UVM.

Re: bt(5), btrace(8): execute END probe and print maps after exit() statement

2023-10-21 Thread Martin Pieuchot

On 18/10/23(Wed) 12:56, Scott Cheloha wrote:
> Hi,
> 
> A bt(5) exit() statement causes the btrace(8) interpreter to exit(3)
> immediately.
> 
> A BPFtrace exit() statement is more nuanced: the END probe is executed
> and the contents of all maps are printed before the interpreter exits.
> 
> This patch adds a halting check after the execution of each bt(5)
> statement.  If a statement causes the program to halt, the halt
> bubbles up to the top-level rule evaluation loop and terminates
> execution.  rules_teardown() then runs, just as if the program had
> received SIGTERM.
> 
> Two edge-like cases:
> 
> 1. You can exit() from BEGIN.  rules_setup() returns non-zero if this
>happens so the main loop knows to halt immediately.
> 
> 2. You can exit() from END.  This is just an early-return: the END probe
>doesn't run again.
> 
> Thoughts?

Makes sense to ease the transition from bpftrace scripts.  Ok with me if
you make sure the regression tests still pass.  Some outputs might
depend on the actual behavior and would need to be updated.

> 
> $ btrace -e '
> BEGIN {
>   @[probe] = "reached";
>   exit();
>   @[probe] = "not reached";
> }
> END {
>   @[probe] = "reached";
>   exit();
>   @[probe] = "not reached";
> }'
> 
> Index: btrace.c
> ===
> RCS file: /cvs/src/usr.sbin/btrace/btrace.c,v
> retrieving revision 1.79
> diff -u -p -r1.79 btrace.c
> --- btrace.c  12 Oct 2023 15:16:44 -  1.79
> +++ btrace.c  18 Oct 2023 17:54:16 -
> @@ -71,10 +71,10 @@ struct dtioc_probe_info   *dtpi_get_by_val
>   * Main loop and rule evaluation.
>   */
>  void  rules_do(int);
> -void  rules_setup(int);
> -void  rules_apply(int, struct dt_evt *);
> +int   rules_setup(int);
> +int   rules_apply(int, struct dt_evt *);
>  void  rules_teardown(int);
> -void  rule_eval(struct bt_rule *, struct dt_evt *);
> +int   rule_eval(struct bt_rule *, struct dt_evt *);
>  void  rule_printmaps(struct bt_rule *);
>  
>  /*
> @@ -84,7 +84,7 @@ uint64_t builtin_nsecs(struct dt_evt *
>  const char   *builtin_kstack(struct dt_evt *);
>  const char   *builtin_arg(struct dt_evt *, enum bt_argtype);
>  struct bt_arg*fn_str(struct bt_arg *, struct dt_evt *, char 
> *);
> -void  stmt_eval(struct bt_stmt *, struct dt_evt *);
> +int   stmt_eval(struct bt_stmt *, struct dt_evt *);
>  void  stmt_bucketize(struct bt_stmt *, struct dt_evt *);
>  void  stmt_clear(struct bt_stmt *);
>  void  stmt_delete(struct bt_stmt *, struct dt_evt *);
> @@ -405,6 +405,7 @@ void
>  rules_do(int fd)
>  {
>   struct sigaction sa;
> + int halt = 0;
>  
>   memset(, 0, sizeof(sa));
>   sigemptyset(_mask);
> @@ -415,9 +416,9 @@ rules_do(int fd)
>   if (sigaction(SIGTERM, , NULL))
>   err(1, "sigaction");
>  
> - rules_setup(fd);
> + halt = rules_setup(fd);
>  
> - while (!quit_pending && g_nprobes > 0) {
> + while (!quit_pending && !halt && g_nprobes > 0) {
>   static struct dt_evt devtbuf[64];
>   ssize_t rlen;
>   size_t i;
> @@ -434,8 +435,11 @@ rules_do(int fd)
>   if ((rlen % sizeof(struct dt_evt)) != 0)
>   err(1, "incorrect read");
>  
> - for (i = 0; i < rlen / sizeof(struct dt_evt); i++)
> - rules_apply(fd, [i]);
> + for (i = 0; i < rlen / sizeof(struct dt_evt); i++) {
> + halt = rules_apply(fd, [i]);
> + if (halt)
> + break;
> + }
>   }
>  
>   rules_teardown(fd);
> @@ -484,7 +488,7 @@ rules_action_scan(struct bt_stmt *bs)
>   return evtflags;
>  }
>  
> -void
> +int
>  rules_setup(int fd)
>  {
>   struct dtioc_probe_info *dtpi;
> @@ -493,7 +497,7 @@ rules_setup(int fd)
>   struct bt_probe *bp;
>   struct bt_stmt *bs;
>   struct bt_arg *ba;
> - int dokstack = 0, on = 1;
> + int dokstack = 0, halt = 0, on = 1;
>   uint64_t evtflags;
>  
>   TAILQ_FOREACH(r, _rules, br_next) {
> @@ -553,7 +557,7 @@ rules_setup(int fd)
>   clock_gettime(CLOCK_REALTIME, _devt.dtev_tsp);
>  
>   if (rbegin)
> - rule_eval(rbegin, _devt);
> + halt = rule_eval(rbegin, _devt);
>  
>   /* Enable all probes */
>   TAILQ_FOREACH(r, _rules, br_next) {
> @@ -571,9 +575,14 @@ rules_setup(int fd)
>   if (ioctl(fd, DTIOCRECORD, ))
>   err(1, "DTIOCRECORD");
>   }
> +
> + return halt;
>  }
>  
> -void
> +/*
> + * Returns non-zero if the program should halt.
> + */
> +int
>  rules_apply(int fd, struct dt_evt *dtev)
>  {
>   struct bt_rule *r;
> @@ -586,9 +595,11 @@

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-21 Thread Miod Vallat

> Stuart, Miod, I wonder if this also help for the off-by-one issue you
> are seeing.  It might not.

It makes the aforementioned issue disappear on the affected machine.

> Comments, ok?

> diff --git sys/uvm/uvm_pdaemon.c sys/uvm/uvm_pdaemon.c
> index 284211d226c..a26a776df67 100644
> --- sys/uvm/uvm_pdaemon.c
> +++ sys/uvm/uvm_pdaemon.c

> @@ -917,9 +914,7 @@ uvmpd_scan(struct uvm_pmalloc *pma, struct 
> uvm_constraint_range *constraint)
>*/
>   free = uvmexp.free - BUFPAGES_DEFICIT;
>   swap_shortage = 0;
> - if (free < uvmexp.freetarg &&
> - uvmexp.swpginuse == uvmexp.swpages &&
> - !uvm_swapisfull() &&
> + if (free < uvmexp.freetarg && uvm_swapisfilled() && !uvm_swapisfull() &&
>   pages_freed == 0) {
>   swap_shortage = uvmexp.freetarg - free;
>   }

It's unfortunate that you now invoke two uvm_swapisxxx() routines, which
will both acquire a mutex. Maybe a third uvm_swapisxxx routine could be
introduced to compute the swapisfilled && !swapisfull condition at once?

Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Kapetanakis Giannis


Rev 1.140 by florian@ seems to have changed that.

Do not try to unlink the control socket in an unprivileged child
process on shutdown.
Found while working ontame(2)  .
OK benno@

G


On 21/10/2023 14:41, Kapetanakis Giannis wrote:

After 7.4 relayd does not unlink it's socket

I've added the following but it's probably not enough. unveil?

G

Index: relayd.c
===
RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
retrieving revision 1.191
diff -u -p -r1.191 relayd.c
--- relayd.c    25 Jun 2023 08:07:38 -    1.191
+++ relayd.c    21 Oct 2023 11:39:44 -
@@ -382,6 +382,8 @@ parent_shutdown(struct relayd *env)
 free(env->sc_ps);
 free(env);

+    unlink(env->sc_ps->ps_csock.cs_name);
+
 log_info("parent terminating, pid %d", getpid());

 exit(0);

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-21 Thread Stuart Henderson

On 2023/10/21 09:31, Silamael Darkomen wrote:
> On 20 Oct 2023 19:33, Stuart Henderson wrote:
> > After a few hours digging around, I eventually figured out where the
> > relevant sockets are created and have added a patch (to 7.4-stable and
> > -current) to bump buffers on them.
> 
> Hi Stuart,
> 
> Meanwhile I also did some digging and opened a bug at Squid with
> a patch attached:
> https://bugs.squid-cache.org/show_bug.cgi?id=5308
> I hope Squid will fix this on their side for future releases.
> 
> Greetings,
> Matthias
> 

I had another thought about this, we should probably fetch the
existing buffer size and avoid decreasing (in case the admin
already raised it system-wide)

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-21 Thread Silamael Darkomen


On 20 Oct 2023 19:33, Stuart Henderson wrote:

After a few hours digging around, I eventually figured out where the
relevant sockets are created and have added a patch (to 7.4-stable and
-current) to bump buffers on them.


Hi Stuart,

Meanwhile I also did some digging and opened a bug at Squid with a patch 
attached: https://bugs.squid-cache.org/show_bug.cgi?id=5308

I hope Squid will fix this on their side for future releases.

Greetings,
Matthias

Re: nfsd: don't clear SB_NOINTR flag

2023-10-20 Thread Alexander Bluhm

On Mon, Oct 16, 2023 at 10:17:50PM +0300, Vitaliy Makkoveev wrote:
> This socket comes from userland, so this flag is never set. This makes
> SB_NOINTR flag immutable, because we only set this bit on NFS client
> socket buffers for all it's lifetime.
> 
> I want to do this because this flag modifies sblock() behaviour and it's
> not clean which lock should protect it for standalone sblock().
> 
> ok?

This code came here:
https://svnweb.freebsd.org/csrg/sys/nfs/nfs_syscalls.c?revision=41903=markup
"update from Rick Macklem adding TCP support to NFS"

I would guess that this flag must be cleared to allow to kill the
NFS server if a client does not respond.  Otherwise it may hang in
sbwait() in soreceive() or sosend().

As the flags are never set, your diff does not change behavior.

> I want to completely remove SB_NOINTR flag. Only NFS client sets it, but
> since the socket never passed to userland, this flag is useless, because
> we can't send singnal to kernel thread. So for this sblock()/sbwait()
> sleep is uninterruptible. But I want to not mix NFS server and NFS
> client diffs.

The NFS client does not run as kernel thread, but as the process
that does file system access.  You can Ctrl-C it if mount uses
"intr" option.  NFSMNT_INT sets some PCATCH and sb_timeo_nsecs, but
I think signal should also abort sbwait().  That is why NFS client
sets SB_NOINTR.

As this flag is only set during mount or reconnect.  It is a problem
for MP?  Is it modified in a non-initialization path?

bluhm

> Index: sys/nfs/nfs_syscalls.c
> ===
> RCS file: /cvs/src/sys/nfs/nfs_syscalls.c,v
> retrieving revision 1.119
> diff -u -p -r1.119 nfs_syscalls.c
> --- sys/nfs/nfs_syscalls.c3 Aug 2023 09:49:09 -   1.119
> +++ sys/nfs/nfs_syscalls.c16 Oct 2023 19:00:02 -
> @@ -276,9 +276,7 @@ nfssvc_addsock(struct file *fp, struct m
>   m_freem(m);
>   }
>   solock(so);
> - so->so_rcv.sb_flags &= ~SB_NOINTR;
>   so->so_rcv.sb_timeo_nsecs = INFSLP;
> - so->so_snd.sb_flags &= ~SB_NOINTR;
>   so->so_snd.sb_timeo_nsecs = INFSLP;
>   sounlock(so);
>   if (tslp)

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-20 Thread Stuart Henderson

After a few hours digging around, I eventually figured out where the
relevant sockets are created and have added a patch (to 7.4-stable and
-current) to bump buffers on them.

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-20 Thread Stuart Henderson

On 2023/10/20 13:57, Stuart Henderson wrote:
> On 2023/10/19 15:09, Silamael Darkomen wrote:
> > Hi,
> > 
> > Today I upgraded to the brand new Squid version 6.3 from ports and noticed,
> > that Squid no longer starts properly if configured with multiple worker
> > processes.
> > 
> > After some debugging the limit from net.unix.dgram.sendspace came up as
> > cause. The 2k default is way to low.
> 
> I haven't normally been using multiple workers, but have just given this
> a spin with 4 workers on my usual setup (which is doing memory caching
> only, no disk cache) and can't trigger it there.
> 
> If you're using disk cache, I assume that would be rock for multiple
> workers, in which case this is expected anyway and already documented:
> 
> https://wiki.squid-cache.org/Features/SmpScale#write-failure-40-message-too-long
> https://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/www/squid/pkg/README-main?rev=HEAD=text/x-cvsweb-markup
> 

Ah, I'd forgotten that I've already bump dgram limits on that machine
as it also runs unbound and I've run into problems with that before..

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-20 Thread Stuart Henderson

On 2023/10/19 15:09, Silamael Darkomen wrote:
> Hi,
> 
> Today I upgraded to the brand new Squid version 6.3 from ports and noticed,
> that Squid no longer starts properly if configured with multiple worker
> processes.
> 
> After some debugging the limit from net.unix.dgram.sendspace came up as
> cause. The 2k default is way to low.

I haven't normally been using multiple workers, but have just given this
a spin with 4 workers on my usual setup (which is doing memory caching
only, no disk cache) and can't trigger it there.

If you're using disk cache, I assume that would be rock for multiple
workers, in which case this is expected anyway and already documented:

https://wiki.squid-cache.org/Features/SmpScale#write-failure-40-message-too-long
https://cvsweb.openbsd.org/cgi-bin/cvsweb/ports/www/squid/pkg/README-main?rev=HEAD=text/x-cvsweb-markup

Re: HAMMER2 filesystem for OpenBSD

2023-10-20 Thread Stuart Henderson

On 2023/10/20 11:51, Chris Narkiewicz wrote:
> On Thu, 2023-10-19 at 15:03 +0200, Denis Fondras wrote:
> > 
> > https://marc.info/?l=openbsd-misc=169272174500676=2
> > 
> > 
> 
> Thank you, I missed that one.
> 
> So the key piece of information from that thread is that
> Kusumi's implementation of HAMMER2 for OpenBSD is read-only.

It was read-only then. That changed in 1.1.5. But the netbsd_hammer2
and openbsd_hammer2 repos say

"This repository will be abandoned once Linux or FreeBSD is stabilized
with write support. NetBSD is not the main target."

and

"This repository will be abandoned once Linux or FreeBSD is stabilized
with write support. OpenBSD is not the main area of interest."

Re: HAMMER2 filesystem for OpenBSD

2023-10-20 Thread Chris Narkiewicz

On Thu, 2023-10-19 at 15:03 +0200, Denis Fondras wrote:
> 
> https://marc.info/?l=openbsd-misc=169272174500676=2
> 
> 

Thank you, I missed that one.

So the key piece of information from that thread is that
Kusumi's implementation of HAMMER2 for OpenBSD is read-only.

Cheers,
Chris

Re: 7.3: Squid 6.3 with multiple workers - net.unix.dgram.sendspace too low

2023-10-19 Thread Claudio Jeker

On Thu, Oct 19, 2023 at 03:09:08PM +0200, Silamael Darkomen wrote:
> Hi,
> 
> Today I upgraded to the brand new Squid version 6.3 from ports and noticed,
> that Squid no longer starts properly if configured with multiple worker
> processes.
> 
> After some debugging the limit from net.unix.dgram.sendspace came up as
> cause. The 2k default is way to low.
> In ktrace I saw sendmessage calls with messages slightly over 4k.
> 
> Increasing this limit to 16k as net.unix.dgram.recvspace fixes the problem
> and Squid can start.
> 
> Perhaps this historically low limit should be adjusted accordingly?
> All for all other Unix sockets, sendspace and recvspace share the same
> limits, just dgram sockets are out of line.
> 
> PS: Tested this with 7.3 but 7.4 seems to have the same limitations.
> 

It is a SOCK_DGRAM socket, it is supposed to be limited.
>From the man-page:
A SOCK_DGRAM socket supports datagrams (connectionless,
unreliable messages of a fixed (typically small) maximum length).

The program should set the socket buffer size via setsockopt() using
SO_SNDBUF. It seems squid just YOLOs this and hopes for the best.

So the best way to fix this is in squid itself.
-- 
:wq Claudio

Re: 7.3: Squid 6.3 with multiple workers not starting - net.unix.dgram.sendspace too low

2023-10-19 Thread Silamael Darkomen


On 19 Oct 2023 16:36, Silamael Darkomen wrote:

Hi,

I just upgraded to Squid 6.3 under 7.3 and noticed that it no longer 
starts if configured to use multiple worker processes.


After some debugging I found that net.unix.dgram.sendspace with its 2k 
limit is the reason.
Squid uses Unix sockets for IPC and sends messages larger than those 2k 
(saw slightly more than 4k in ktrace).


As this limit seems pretty historical and all other Unix sockets use 
larger limits and have sendspace same as recvspace I would suggest to 
increase net.unix.dgram.sendspace to 16k as net.unix.dgram.recvspace.


PS: Tested on 7.3 but 7.4 seems to use the same old limits.

Greetings,
Matthias



Sorry for the second mail.

Re: snmpd: remove filter-pf-addresses support

2023-10-19 Thread Theo Buehler

On Thu, Oct 19, 2023 at 04:13:41PM +0200, Martijn van Duren wrote:
> OpenBSD 7.4 is here. upgrade72.html already mentions it's deprecation.
> 
> OK?

ok

> 
> martijn@
> 
> Index: parse.y
> ===
> RCS file: /cvs/src/usr.sbin/snmpd/parse.y,v
> retrieving revision 1.78
> diff -u -p -r1.78 parse.y
> --- parse.y   6 Oct 2022 14:41:08 -   1.78
> +++ parse.y   19 Oct 2023 14:12:55 -
> @@ -140,7 +140,7 @@ typedef struct {
>  %token   SYSTEM CONTACT DESCR LOCATION NAME OBJECTID SERVICES RTFILTER
>  %token   READONLY READWRITE OCTETSTRING INTEGER COMMUNITY TRAP RECEIVER
>  %token   SECLEVEL NONE AUTH ENC USER AUTHKEY ENCKEY ERROR
> -%token   HANDLE DEFAULT SRCADDR TCP UDP PFADDRFILTER BLOCKLIST PORT
> +%token   HANDLE DEFAULT SRCADDR TCP UDP BLOCKLIST PORT
>  %token STRING
>  %token NUMBER
>  %type  usmuser community optcommunity
> @@ -336,26 +336,6 @@ main : LISTEN ON listen_udptcp
>   else
>   conf->sc_rtfilter = 0;
>   }
> - /* XXX Remove after 7.4 */
> - | PFADDRFILTER yesno{
> - struct ber_oid *blocklist;
> -
> - log_warnx("filter-pf-addresses is deprecated. "
> - "Please use blocklist pfTblAddrTable instead.");
> - if ($2) {
> - blocklist = recallocarray(conf->sc_blocklist,
> - conf->sc_nblocklist,
> - conf->sc_nblocklist + 1,
> - sizeof(*blocklist));
> - if (blocklist == NULL) {
> - yyerror("malloc");
> - YYERROR;
> - }
> - conf->sc_blocklist = blocklist;
> - smi_string2oid("pfTblAddrTable",
> - &(blocklist[conf->sc_nblocklist++]));
> - }
> - }
>   | seclevel {
>   conf->sc_min_seclevel = $1;
>   }
> @@ -1195,7 +1175,6 @@ lookup(char *s)
>   { "enc",ENC },
>   { "enckey", ENCKEY },
>   { "engineid",   ENGINEID },
> - { "filter-pf-addresses",PFADDRFILTER },
>   { "filter-routes",  RTFILTER },
>   { "group",  GROUP },
>   { "handle", HANDLE },
>

Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-19 Thread Jan Klemkow

On Wed, Oct 18, 2023 at 08:53:44PM +0200, Alexander Bluhm wrote:
> On Wed, Oct 18, 2023 at 08:19:29PM +0200, Mischa wrote:
> > It's indeed something like that: ix -> vlan (tagged) -> veb
> 
> When vlan is added to veb, kernel should disable LRO on ix.
> All testing before release did not find this code path :-(
> 
> Is it possible to add vlan to veb first, and then add or change the
> vlan parent to ix?  If it works, that should also disable LRO.
> 
> Jan said he will have a look tomorrow.
> 
> trunk, carp, ... in veb or bridge might have the same issue.

First round of fixes for vlan(4), vxlan(4), nvgre(4) and bpe(4).

ok?

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.708
diff -u -p -r1.708 if.c
--- net/if.c16 Sep 2023 09:33:27 -  1.708
+++ net/if.c19 Oct 2023 13:03:33 -
@@ -3243,6 +3243,17 @@ ifsetlro(struct ifnet *ifp, int on)
struct ifreq ifrq;
int error = 0;
int s = splnet();
+   struct if_parent parent;
+
+   memset(, 0, sizeof(parent));
+   if ((*ifp->if_ioctl)(ifp, SIOCGIFPARENT, (caddr_t)) != -1) {
+   struct ifnet *ifp0 = if_unit(parent.ifp_parent);
+
+   if (ifp0 != NULL) {
+   ifsetlro(ifp0, on);
+   if_put(ifp0);
+   }
+   }
 
if (!ISSET(ifp->if_capabilities, IFCAP_LRO)) {
error = ENOTSUP;
Index: net/if_bpe.c
===
RCS file: /cvs/src/sys/net/if_bpe.c,v
retrieving revision 1.19
diff -u -p -r1.19 if_bpe.c
--- net/if_bpe.c8 Nov 2021 04:54:44 -   1.19
+++ net/if_bpe.c19 Oct 2023 13:20:18 -
@@ -631,6 +631,9 @@ bpe_set_parent(struct bpe_softc *sc, con
goto put;
}
 
+   if (ether_brport_isset(ifp))
+   ifsetlro(ifp0, 0);
+
/* commit */
sc->sc_key.k_if = ifp0->if_index;
etherbridge_flush(>sc_eb, IFBF_FLUSHALL);
Index: net/if_gre.c
===
RCS file: /cvs/src/sys/net/if_gre.c,v
retrieving revision 1.174
diff -u -p -r1.174 if_gre.c
--- net/if_gre.c13 May 2023 13:35:17 -  1.174
+++ net/if_gre.c19 Oct 2023 13:24:56 -
@@ -3544,6 +3544,9 @@ nvgre_set_parent(struct nvgre_softc *sc,
return (EPROTONOSUPPORT);
}
 
+   if (ether_brport_isset(>sc_ac.ac_if))
+   ifsetlro(ifp0, 0);
+
/* commit */
sc->sc_ifp0 = ifp0->if_index;
if_put(ifp0);
Index: net/if_vlan.c
===
RCS file: /cvs/src/sys/net/if_vlan.c,v
retrieving revision 1.215
diff -u -p -r1.215 if_vlan.c
--- net/if_vlan.c   16 May 2023 14:32:54 -  1.215
+++ net/if_vlan.c   19 Oct 2023 11:08:23 -
@@ -937,6 +937,9 @@ vlan_set_parent(struct vlan_softc *sc, c
if (error != 0)
goto put;
 
+   if (ether_brport_isset(ifp))
+   ifsetlro(ifp0, 0);
+
/* commit */
sc->sc_ifidx0 = ifp0->if_index;
if (!ISSET(sc->sc_flags, IFVF_LLADDR))
Index: net/if_vxlan.c
===
RCS file: /cvs/src/sys/net/if_vxlan.c,v
retrieving revision 1.93
diff -u -p -r1.93 if_vxlan.c
--- net/if_vxlan.c  3 Aug 2023 09:49:08 -   1.93
+++ net/if_vxlan.c  19 Oct 2023 13:18:47 -
@@ -1582,6 +1582,9 @@ vxlan_set_parent(struct vxlan_softc *sc,
goto put;
}
 
+   if (ether_brport_isset(ifp))
+   ifsetlro(ifp0, 0);
+
/* commit */
sc->sc_if_index0 = ifp0->if_index;
etherbridge_flush(>sc_eb, IFBF_FLUSHALL);

Re: HAMMER2 filesystem for OpenBSD

2023-10-19 Thread Denis Fondras

Le Tue, Oct 17, 2023 at 10:14:25PM +0100, Chris Narkiewicz a écrit :
> Hi,
> 
> Tomohiro Kusumi is currently working on HAMMER2 implementation
> for OpenBSD, FreeBSD and NetBSD.
> 
> The repository is here:
> https://github.com/kusumi/openbsd_hammer2
> 
> 
> He maintains repositories for NetBSD, FreeBSD and OpenBSD, which
> suggests that the implementation is portable. He also
> provides a patch for OpenBSD 7.3:
> 
> https://github.com/kusumi/openbsd_hammer2/blob/master/patch/openbsd73.patch
> 
> The patch looks very minimal to me, with no deeper changes to the
> kernel.
> 
> I haven't found any discussion about HAMMER2 in list archives, so I'd
> like to bring it to devs attention, kindly asking for your opinion.

https://marc.info/?l=openbsd-misc=169272174500676=2

> Does it look like it's worth bringing in? Does it require more work?
> 
> I'd appreciate any opinions from more knowledgable crowd.
> 
> Cheers,
> Chris
>

Re: bgpd convert rtr_proto.c to new ibuf API

2023-10-19 Thread Theo Buehler

On Thu, Oct 19, 2023 at 01:26:49PM +0200, Claudio Jeker wrote:
> On Thu, Oct 19, 2023 at 12:59:17PM +0200, Theo Buehler wrote:
> > On Thu, Oct 19, 2023 at 10:41:07AM +0200, Claudio Jeker wrote:
> > > More ibuf cleanup. rtr_proto.c still uses ibuf_add() where it could use
> > > the new functions.
> > > 
> > > Two bits I'm unsure about:
> > > - I had to change some sizeof() to use native types (I especially dislike
> > >   the sizeof(struct rtr_header).
> > 
> > Yes, that's not very nice.
> > 
> > > - ibuf_add_nXX() can fail if the value is too large. Which should be
> > >   impossible but still maybe it would be better to check for errors.
> > 
> > While it should be impossible, the length calculations are non-trivial,
> > so it seems wiser to check.
> > 
> > It's a bit longer than what you have now, but maybe it's an option
> > to combine the length calculation with the errs += idiom.
> > 
> > len += sizeof(rs->version);
> > len += sizeof(type);
> > len += sizeof(session_id);
> > len += sizeof(len);
> > 
> > if ((buf = ibuf_open(len)) == NULL)
> > return NULL;
> > 
> > errs += ibuf_add_n8(buf, rs->version);
> > errs += ibuf_add_n8(buf, type);
> > errs += ibuf_add_n16(buf, session_id);
> > errs += ibuf_add_n32(buf, len);
> > 
> > if (errs) {
> > ibuf_free(ibuf);
> > return NULL;
> > }
> > 
> > I'm ok with the diff as it is and you can ponder how you want to shave
> > this particular Yak.
> 
> I like my yaks shaved like this...

I like this better than the errs dance, too.

ok tb

> The sizeof yak is still in queue... not sure about it.

It's not that horrible.

> -- 
> :wq Claudio
> 
> Index: rtr_proto.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rtr_proto.c,v
> retrieving revision 1.18
> diff -u -p -r1.18 rtr_proto.c
> --- rtr_proto.c   19 Oct 2023 11:12:10 -  1.18
> +++ rtr_proto.c   19 Oct 2023 11:24:56 -
> @@ -233,6 +233,7 @@ rtr_newmsg(struct rtr_session *rs, enum 
>  uint16_t session_id)
>  {
>   struct ibuf *buf;
> + int saved_errno;
>  
>   if (len > RTR_MAX_LEN) {
>   errno = ERANGE;
> @@ -240,15 +241,23 @@ rtr_newmsg(struct rtr_session *rs, enum 
>   }
>   len += sizeof(struct rtr_header);
>   if ((buf = ibuf_open(len)) == NULL)
> - return NULL;
> -
> - /* cannot fail with fixed buffers */
> - ibuf_add_n8(buf, rs->version);
> - ibuf_add_n8(buf, type);
> - ibuf_add_n16(buf, session_id);
> - ibuf_add_n32(buf, len);
> + goto fail;
> + if (ibuf_add_n8(buf, rs->version) == -1)
> + goto fail;
> + if (ibuf_add_n8(buf, type) == -1)
> + goto fail;
> + if (ibuf_add_n16(buf, session_id) == -1)
> + goto fail;
> + if (ibuf_add_n32(buf, len) == -1)
> + goto fail;
>  
>   return buf;
> +
> + fail:
> + saved_errno = errno;
> + ibuf_free(buf);
> + errno = saved_errno;
> + return NULL;
>  }
>  
>  /*
> @@ -271,22 +280,27 @@ rtr_send_error(struct rtr_session *rs, e
>  
>   buf = rtr_newmsg(rs, ERROR_REPORT, 2 * sizeof(uint32_t) + len + mlen,
>   err);
> - if (buf == NULL) {
> - log_warn("rtr %s: send error report", log_rtr(rs));
> - return;
> - }
> -
> - /* cannot fail with fixed buffers */
> - ibuf_add_n32(buf, len);
> - ibuf_add(buf, pdu, len);
> - ibuf_add_n32(buf, mlen);
> - ibuf_add(buf, msg, mlen);
> + if (buf == NULL)
> + goto fail;
> + if (ibuf_add_n32(buf, len) == -1)
> + goto fail;
> + if (ibuf_add(buf, pdu, len) == -1)
> + goto fail;
> + if (ibuf_add_n32(buf, mlen) == -1)
> + goto fail;
> + if (ibuf_add(buf, msg, mlen) == -1)
> + goto fail;
>   ibuf_close(>w, buf);
>  
>   log_warnx("rtr %s: sending error report[%u] %s", log_rtr(rs), err,
>   msg ? msg : "");
>  
>   rtr_fsm(rs, RTR_EVNT_SEND_ERROR);
> + return;
> +
> + fail:
> + log_warn("rtr %s: send error report", log_rtr(rs));
> + ibuf_free(buf);
>  }
>  
>  static void
> @@ -309,15 +323,17 @@ rtr_send_serial_query(struct rtr_session
>   struct ibuf *buf;
>  
>   buf = rtr_newmsg(rs, SERIAL_QUERY, sizeof(uint32_t), rs->session_id);
> - if (buf == NULL) {
> - log_warn("rtr %s: send serial query", log_rtr(rs));
> - rtr_send_error(rs, INTERNAL_ERROR, "out of memory", NULL, 0);
> - return;
> - }
> -
> - /* cannot fail with fixed buffers */
> - ibuf_add_n32(buf, rs->serial);
> + if (buf == NULL)
> + goto fail;
> + if (ibuf_add_n32(buf, rs->serial) == -1)
> + goto fail;
>   ibuf_close(>w, buf);
> + return;
> +
> + fail:
> + log_warn("rtr %s: send serial query", log_rtr(rs));
> + ibuf_free(buf);
> + rtr_send_error(rs, INTERNAL_ERROR, "out of memory",

Re: bgpd convert rtr_proto.c to new ibuf API

2023-10-19 Thread Claudio Jeker

On Thu, Oct 19, 2023 at 12:59:17PM +0200, Theo Buehler wrote:
> On Thu, Oct 19, 2023 at 10:41:07AM +0200, Claudio Jeker wrote:
> > More ibuf cleanup. rtr_proto.c still uses ibuf_add() where it could use
> > the new functions.
> > 
> > Two bits I'm unsure about:
> > - I had to change some sizeof() to use native types (I especially dislike
> >   the sizeof(struct rtr_header).
> 
> Yes, that's not very nice.
> 
> > - ibuf_add_nXX() can fail if the value is too large. Which should be
> >   impossible but still maybe it would be better to check for errors.
> 
> While it should be impossible, the length calculations are non-trivial,
> so it seems wiser to check.
> 
> It's a bit longer than what you have now, but maybe it's an option
> to combine the length calculation with the errs += idiom.
> 
>   len += sizeof(rs->version);
>   len += sizeof(type);
>   len += sizeof(session_id);
>   len += sizeof(len);
> 
>   if ((buf = ibuf_open(len)) == NULL)
>   return NULL;
> 
>   errs += ibuf_add_n8(buf, rs->version);
>   errs += ibuf_add_n8(buf, type);
>   errs += ibuf_add_n16(buf, session_id);
>   errs += ibuf_add_n32(buf, len);
> 
>   if (errs) {
>   ibuf_free(ibuf);
>   return NULL;
>   }
> 
> I'm ok with the diff as it is and you can ponder how you want to shave
> this particular Yak.

I like my yaks shaved like this...

The sizeof yak is still in queue... not sure about it.
-- 
:wq Claudio

Index: rtr_proto.c
===
RCS file: /cvs/src/usr.sbin/bgpd/rtr_proto.c,v
retrieving revision 1.18
diff -u -p -r1.18 rtr_proto.c
--- rtr_proto.c 19 Oct 2023 11:12:10 -  1.18
+++ rtr_proto.c 19 Oct 2023 11:24:56 -
@@ -233,6 +233,7 @@ rtr_newmsg(struct rtr_session *rs, enum 
 uint16_t session_id)
 {
struct ibuf *buf;
+   int saved_errno;
 
if (len > RTR_MAX_LEN) {
errno = ERANGE;
@@ -240,15 +241,23 @@ rtr_newmsg(struct rtr_session *rs, enum 
}
len += sizeof(struct rtr_header);
if ((buf = ibuf_open(len)) == NULL)
-   return NULL;
-
-   /* cannot fail with fixed buffers */
-   ibuf_add_n8(buf, rs->version);
-   ibuf_add_n8(buf, type);
-   ibuf_add_n16(buf, session_id);
-   ibuf_add_n32(buf, len);
+   goto fail;
+   if (ibuf_add_n8(buf, rs->version) == -1)
+   goto fail;
+   if (ibuf_add_n8(buf, type) == -1)
+   goto fail;
+   if (ibuf_add_n16(buf, session_id) == -1)
+   goto fail;
+   if (ibuf_add_n32(buf, len) == -1)
+   goto fail;
 
return buf;
+
+ fail:
+   saved_errno = errno;
+   ibuf_free(buf);
+   errno = saved_errno;
+   return NULL;
 }
 
 /*
@@ -271,22 +280,27 @@ rtr_send_error(struct rtr_session *rs, e
 
buf = rtr_newmsg(rs, ERROR_REPORT, 2 * sizeof(uint32_t) + len + mlen,
err);
-   if (buf == NULL) {
-   log_warn("rtr %s: send error report", log_rtr(rs));
-   return;
-   }
-
-   /* cannot fail with fixed buffers */
-   ibuf_add_n32(buf, len);
-   ibuf_add(buf, pdu, len);
-   ibuf_add_n32(buf, mlen);
-   ibuf_add(buf, msg, mlen);
+   if (buf == NULL)
+   goto fail;
+   if (ibuf_add_n32(buf, len) == -1)
+   goto fail;
+   if (ibuf_add(buf, pdu, len) == -1)
+   goto fail;
+   if (ibuf_add_n32(buf, mlen) == -1)
+   goto fail;
+   if (ibuf_add(buf, msg, mlen) == -1)
+   goto fail;
ibuf_close(>w, buf);
 
log_warnx("rtr %s: sending error report[%u] %s", log_rtr(rs), err,
msg ? msg : "");
 
rtr_fsm(rs, RTR_EVNT_SEND_ERROR);
+   return;
+
+ fail:
+   log_warn("rtr %s: send error report", log_rtr(rs));
+   ibuf_free(buf);
 }
 
 static void
@@ -309,15 +323,17 @@ rtr_send_serial_query(struct rtr_session
struct ibuf *buf;
 
buf = rtr_newmsg(rs, SERIAL_QUERY, sizeof(uint32_t), rs->session_id);
-   if (buf == NULL) {
-   log_warn("rtr %s: send serial query", log_rtr(rs));
-   rtr_send_error(rs, INTERNAL_ERROR, "out of memory", NULL, 0);
-   return;
-   }
-
-   /* cannot fail with fixed buffers */
-   ibuf_add_n32(buf, rs->serial);
+   if (buf == NULL)
+   goto fail;
+   if (ibuf_add_n32(buf, rs->serial) == -1)
+   goto fail;
ibuf_close(>w, buf);
+   return;
+
+ fail:
+   log_warn("rtr %s: send serial query", log_rtr(rs));
+   ibuf_free(buf);
+   rtr_send_error(rs, INTERNAL_ERROR, "out of memory", NULL, 0);
 }
 
 /*

Re: bgpd convert rtr_proto.c to new ibuf API

2023-10-19 Thread Theo Buehler

On Thu, Oct 19, 2023 at 10:41:07AM +0200, Claudio Jeker wrote:
> More ibuf cleanup. rtr_proto.c still uses ibuf_add() where it could use
> the new functions.
> 
> Two bits I'm unsure about:
> - I had to change some sizeof() to use native types (I especially dislike
>   the sizeof(struct rtr_header).

Yes, that's not very nice.

> - ibuf_add_nXX() can fail if the value is too large. Which should be
>   impossible but still maybe it would be better to check for errors.

While it should be impossible, the length calculations are non-trivial,
so it seems wiser to check.

It's a bit longer than what you have now, but maybe it's an option
to combine the length calculation with the errs += idiom.

len += sizeof(rs->version);
len += sizeof(type);
len += sizeof(session_id);
len += sizeof(len);

if ((buf = ibuf_open(len)) == NULL)
return NULL;

errs += ibuf_add_n8(buf, rs->version);
errs += ibuf_add_n8(buf, type);
errs += ibuf_add_n16(buf, session_id);
errs += ibuf_add_n32(buf, len);

if (errs) {
ibuf_free(ibuf);
return NULL;
}

I'm ok with the diff as it is and you can ponder how you want to shave
this particular Yak.

> 
> -- 
> :wq Claudio
> 
> Index: rtr_proto.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/rtr_proto.c,v
> retrieving revision 1.17
> diff -u -p -r1.17 rtr_proto.c
> --- rtr_proto.c   16 Aug 2023 08:26:35 -  1.17
> +++ rtr_proto.c   19 Oct 2023 08:35:49 -
> @@ -233,24 +233,21 @@ rtr_newmsg(struct rtr_session *rs, enum 
>  uint16_t session_id)
>  {
>   struct ibuf *buf;
> - struct rtr_header rh;
>  
>   if (len > RTR_MAX_LEN) {
>   errno = ERANGE;
>   return NULL;
>   }
> - len += sizeof(rh);
> + len += sizeof(struct rtr_header);
>   if ((buf = ibuf_open(len)) == NULL)
>   return NULL;
>  
> - memset(, 0, sizeof(rh));
> - rh.version = rs->version;
> - rh.type = type;
> - rh.session_id = htons(session_id);
> - rh.length = htonl(len);
> -
>   /* cannot fail with fixed buffers */
> - ibuf_add(buf, , sizeof(rh));
> + ibuf_add_n8(buf, rs->version);
> + ibuf_add_n8(buf, type);
> + ibuf_add_n16(buf, session_id);
> + ibuf_add_n32(buf, len);
> +
>   return buf;
>  }
>  
> @@ -264,7 +261,6 @@ rtr_send_error(struct rtr_session *rs, e
>  {
>   struct ibuf *buf;
>   size_t mlen = 0;
> - uint32_t hdrlen;
>  
>   rs->last_sent_error = err;
>   if (msg) {
> @@ -273,7 +269,7 @@ rtr_send_error(struct rtr_session *rs, e
>   } else
>   memset(rs->last_sent_msg, 0, sizeof(rs->last_sent_msg));
>  
> - buf = rtr_newmsg(rs, ERROR_REPORT, 2 * sizeof(hdrlen) + len + mlen,
> + buf = rtr_newmsg(rs, ERROR_REPORT, 2 * sizeof(uint32_t) + len + mlen,
>   err);
>   if (buf == NULL) {
>   log_warn("rtr %s: send error report", log_rtr(rs));
> @@ -281,11 +277,9 @@ rtr_send_error(struct rtr_session *rs, e
>   }
>  
>   /* cannot fail with fixed buffers */
> - hdrlen = ntohl(len);
> - ibuf_add(buf, , sizeof(hdrlen));
> + ibuf_add_n32(buf, len);
>   ibuf_add(buf, pdu, len);
> - hdrlen = ntohl(mlen);
> - ibuf_add(buf, , sizeof(hdrlen));
> + ibuf_add_n32(buf, mlen);
>   ibuf_add(buf, msg, mlen);
>   ibuf_close(>w, buf);
>  
> @@ -313,9 +307,8 @@ static void
>  rtr_send_serial_query(struct rtr_session *rs)
>  {
>   struct ibuf *buf;
> - uint32_t s;
>  
> - buf = rtr_newmsg(rs, SERIAL_QUERY, sizeof(s), rs->session_id);
> + buf = rtr_newmsg(rs, SERIAL_QUERY, sizeof(uint32_t), rs->session_id);
>   if (buf == NULL) {
>   log_warn("rtr %s: send serial query", log_rtr(rs));
>   rtr_send_error(rs, INTERNAL_ERROR, "out of memory", NULL, 0);
> @@ -323,8 +316,7 @@ rtr_send_serial_query(struct rtr_session
>   }
>  
>   /* cannot fail with fixed buffers */
> - s = htonl(rs->serial);
> - ibuf_add(buf, , sizeof(s));
> + ibuf_add_n32(buf, rs->serial);
>   ibuf_close(>w, buf);
>  }
>  
>

Re: log.c use buffered IO

2023-10-19 Thread Theo Buehler

On Wed, Oct 18, 2023 at 11:34:09AM +0200, Claudio Jeker wrote:
> On Tue, Oct 17, 2023 at 10:06:54AM +0200, Sebastian Benoit wrote:
> > Theo Buehler(t...@theobuehler.org) on 2023.10.17 09:13:15 +0200:
> > > On Mon, Oct 16, 2023 at 12:19:17PM +0200, Claudio Jeker wrote:
> > > > I dislike how log.c does all these asprintf() calls with dubious
> > > > workaround calls in case asprintf() fails.
> > > 
> > > You're not alone.
> > > 
> > > > IMO it is easier to use the stdio provided buffers and fflush() to get
> > > > "atomic" writes on stderr. At least from my understanding this is the
> > > > reason to go all this lenght to do a single fprintf call.
> > > 
> > > This makes sense, but I don't know the history here.
> > 
> > as far as i can remember, it was done so it would still be able to work
> > somewhat when out of memeory.
> >  
> 
> After some input off-list here another idea.
> Require that logit() is called with a \n and by that simplify a lot of
> code around it. vsyslog() handles both so having the \n should not cause
> any breakage.

Really hard to say which is the lesser evil: requiring \n or the ugly
asprintf dances. I don't think it is flat out objectionable, but I can
see people disliking this. The churn you outline below seems indeed
rather limited, so I guess we can go this way.

Definitely ok with using a stack buffer rather than a static one.

> 
> Now logit() is mostly used internally but in bgpd it is also used in
> logmsg.c and parse.y. Fixing those is simple.
> 
> Also this uses a stack buffer for all the log_* cases now. This should
> make the code more thread safe.
> 
> Also this removes vlog() from the API.  I had a quick look at all the
> other log.c users and apart from ldapd this can be added to all of them
> without much issues. Nothing else uses vlog() and the logit() is also very
> minimal (mostly the same parse.y change as below).
> 
> -- 
> :wq Claudio
> 
> Index: log.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/log.c,v
> retrieving revision 1.64
> diff -u -p -r1.64 log.c
> --- log.c 21 Mar 2017 12:06:55 -  1.64
> +++ log.c 18 Oct 2023 07:10:32 -
> @@ -26,6 +26,8 @@
>  
>  #include "log.h"
>  
> +#define MAX_LOGLEN   4096
> +
>  static intdebug;
>  static intverbose;
>  static const char*log_procname;
> @@ -68,30 +70,15 @@ void
>  logit(int pri, const char *fmt, ...)
>  {
>   va_list ap;
> + int saved_errno = errno;
>  
>   va_start(ap, fmt);
> - vlog(pri, fmt, ap);
> - va_end(ap);
> -}
> -
> -void
> -vlog(int pri, const char *fmt, va_list ap)
> -{
> - char*nfmt;
> - int  saved_errno = errno;
> -
>   if (debug) {
> - /* best effort in out of mem situations */
> - if (asprintf(, "%s\n", fmt) == -1) {
> - vfprintf(stderr, fmt, ap);
> - fprintf(stderr, "\n");
> - } else {
> - vfprintf(stderr, nfmt, ap);
> - free(nfmt);
> - }
> + vfprintf(stderr, fmt, ap);
>   fflush(stderr);
>   } else
>   vsyslog(pri, fmt, ap);
> + va_end(ap);
>  
>   errno = saved_errno;
>  }
> @@ -99,26 +86,18 @@ vlog(int pri, const char *fmt, va_list a
>  void
>  log_warn(const char *emsg, ...)
>  {
> - char*nfmt;
> - va_list  ap;
> - int  saved_errno = errno;
> + charfmtbuf[MAX_LOGLEN];
> + va_list ap;
> + int saved_errno = errno;
>  
>   /* best effort to even work in out of memory situations */
>   if (emsg == NULL)
> - logit(LOG_ERR, "%s", strerror(saved_errno));
> + logit(LOG_ERR, "%s\n", strerror(saved_errno));
>   else {
>   va_start(ap, emsg);
> -
> - if (asprintf(, "%s: %s", emsg,
> - strerror(saved_errno)) == -1) {
> - /* we tried it... */
> - vlog(LOG_ERR, emsg, ap);
> - logit(LOG_ERR, "%s", strerror(saved_errno));
> - } else {
> - vlog(LOG_ERR, nfmt, ap);
> - free(nfmt);
> - }
> + (void)vsnprintf(fmtbuf, sizeof(fmtbuf), emsg, ap);
>   va_end(ap);
> + logit(LOG_ERR, "%s: %s\n", fmtbuf, strerror(saved_errno));
>   }
>  
>   errno = saved_errno;
> @@ -127,53 +106,65 @@ log_warn(const char *emsg, ...)
>  void
>  log_warnx(const char *emsg, ...)
>  {
> - va_list  ap;
> + charfmtbuf[MAX_LOGLEN];
> + va_list ap;
> + int saved_errno = errno;
>  
>   va_start(ap, emsg);
> - vlog(LOG_ERR, emsg, ap);
> + (void)vsnprintf(fmtbuf, sizeof(fmtbuf), emsg, ap);
>   va_end(ap);
> + logit(LOG_ERR, "%s\n", fmtbuf);
> +
> + errno = saved_errno;
>  }
>  
>  void
>  log_info(const char *emsg, ...)
>  {
> - va_list  ap;
> + char

Re: installer: support encryption with key disks

2023-10-18 Thread Andrew Hewus Fresh

On Mon, Oct 16, 2023 at 07:46:10PM +, Klemens Nanni wrote:
> On Mon, Sep 04, 2023 at 09:57:40PM +, Klemens Nanni wrote:
> > Extend the yes/no question to no/passphrase/keydisk and have users pick an
> > existing, preformated RAID partition;  no support (yet) for creating one.
> > 
> > Thanks to how ask_which() works, users can always say 'done' to land back
> > at question to either skip crypto or use a passphrase instead.
> > 
> > All code remains contained behind interactive non-default installations.
> > Code is straight forward, I've not been able to break it;  rest unchanged.
> > 
> > Example install with root disk sd0 and ready-to-use key disk sd1:
> > 
> > Available disks are: sd0 sd1.
> > Which disk is the root disk? ('?' for details) [sd0] 
> > Encrypt the root disk with a (p)assphrase or (k)eydisk? [no] k
> > Available disks are: sd1.
> > Which disk contains the key disk? (or 'done') [sd1] 
> > Available sd1 partitions are: a.
> > Which sd1 partition is the key disk? (or 'done') [a] 
> > 
> > Configuring the crypto chunk sd0...
> > 
> > No valid MBR or GPT.
> > Use (W)hole disk MBR, whole disk (G)PT or (E)dit? [whole] 
> > Setting OpenBSD MBR partition to whole sd0...done.
> > sd2 at scsibus2 targ 1 lun 0: 
> > sd2: 1023MB, 512 bytes/sector, 2096560 sectors
> > 
> > Configuring the root disk sd2...
> > 
> > No valid MBR or GPT.
> > Use (W)hole disk MBR, whole disk (G)PT or (E)dit? [whole] 
> > 
> > 
> > Feedback? OK?

This seems to work well for me and the implementation looks reasonable.

I think we want to update INSTALL with some extra docs as well, likely
pointing folks to how to create a keydisk.

OK afresh1@



> 
> Ping.
> 
> Index: install.sub
> ===
> RCS file: /cvs/src/distrib/miniroot/install.sub,v
> retrieving revision 1.1255
> diff -u -p -r1.1255 install.sub
> --- install.sub   21 Aug 2023 14:33:55 -  1.1255
> +++ install.sub   16 Oct 2023 19:36:55 -
> @@ -3074,8 +3074,32 @@ do_autoinstall() {
>   exec reboot
>  }
>  
> +# Chose an existing partition as key disk and set global $KEYDISK on success,
> +# otherwise return non-zero.
> +pick_keydisk() {
> + KEYDISK=
> + local _disk _label
> +
> + ask_which disk 'contains the key disk' '$(rmel $ROOTDISK $(get_dkdevs))'
> + [[ $resp == done ]] && return 1
> + _disk=$resp
> +
> + make_dev $_disk
> + if disklabel $_disk 2>/dev/null | ! grep -qw RAID; then
> + echo "$_disk must contain a RAID partition."
> + return 1
> + fi
> +
> + ask_which "$_disk partition" 'is the key disk' \
> + "\$(disklabel $_disk 2>/dev/null |
> + sed -En 's/^  ([a-p]):.*RAID.*$/\1/p')"
> + [[ $resp == done ]] && return 1
> + _label=$resp
> + KEYDISK=$_disk$_label
> +}
> +
>  encrypt_root() {
> - local _chunk=$ROOTDISK
> + local _args _chunk=$ROOTDISK
>  
>   [[ $MDBOOTSR == y ]] || return
>  
> @@ -3088,13 +3112,30 @@ encrypt_root() {
>   # e.g. auto-assembled at boot or done in (S)hell.
>   [[ -z $(get_softraid_volumes) ]] || return
>  
> - ask_yn 'Encrypt the root disk with a passphrase?' || return
> + while :; do
> + ask 'Encrypt the root disk with a (p)assphrase or (k)eydisk?' no
> + case $resp in
> + # Retry on failure to allow passphrase or skip.
> + [kK]*)
> + pick_keydisk || continue
> + _args=-k$KEYDISK
> + break
> + ;;
> + # Do nothing, bioctl(8) will handle the passphrase.
> + [pP]*)  break
> + ;;
> + [nN]*)  return
> + ;;
> + *)  echo "'$resp' is not a valid choice."
> + ;;
> + esac
> + done
>  
>   echo "\nConfiguring the crypto chunk $_chunk...\n"
>   md_prep_fdisk $_chunk
>   echo 'RAID *' | disklabel -w -A -T- $_chunk
>  
> - bioctl -Cforce -cC -l${_chunk}a softraid0 >/dev/null
> + bioctl -Cforce -cC -l${_chunk}a $_args softraid0 >/dev/null
>  
>   # No volumes existed before asking, but we just created one.
>   ROOTDISK=$(get_softraid_volumes)
> 

-- 
andrew

($do || !$do) && undef($try) ;  # Master of Perl, Yoda is.  H?

Re: dwge(4): don't panic on truncated input packet

2023-10-18 Thread Mark Kettenis

> Date: Wed, 18 Oct 2023 16:40:20 +
> From: Miod Vallat 
> 
> I had the misfortune of hitting a KASSERT in dwge:
> 
> panic: kernel diagnostic assertion "len > 0" failed: file 
> "/usr/src/sys/dev/fdt
> /if_dwge.c", line 1102
> Stopped at  panic+0x106:addia0,zero,256TIDPIDUID 
> PR
> FLAGS PFLAGS  CPU  COMMAND
> *405136  98879   1500 0x3  00  git
>  301471  67731  0 0x14000  0x2001  softnet0
> panic() at panic+0x106
> panic() at panic
> dwge_rx_proc() at dwge_rx_proc+0x1d4
> dwge_intr() at dwge_intr+0x4e
> plic_irq_dispatch() at plic_irq_dispatch+0xec
> plic_irq_handler() at plic_irq_handler+0x56
> riscv_cpu_intr() at riscv_cpu_intr+0x22
> cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x7a
> pmap_copy_page() at pmap_copy_page+0x94
> uvm_pagerealloc_multi() at uvm_pagerealloc_multi+0x24e
> buf_realloc_pages() at buf_realloc_pages+0xa0
> buf_flip_high() at buf_flip_high+0x64
> bufcache_recover_dmapages() at bufcache_recover_dmapages+0x13a
> buf_get() at buf_get+0xec
> end trace frame: 0xffc04d9ac870, count: 0
> 
> The following diff should count the error and skirt the panic.

You probably hit a receive error and hit this code path because we
don't check the error bit in the receive descriptor.  I fixed this in
dwqe(4) recently, and would prefer a similar fix here.  The (untested)
diff for that is attached here.

We may still want to check the length on top of that (and make a
similar change for dwqe(4)).  I think I saw packets with the error bit
set that had a length > 0 on dwqe(4) and this makes me suspect that it
happens on dwge(4) as well.


Index: dev/fdt/if_dwge.c
===
RCS file: /cvs/src/sys/dev/fdt/if_dwge.c,v
retrieving revision 1.19
diff -u -p -r1.19 if_dwge.c
--- dev/fdt/if_dwge.c   15 Aug 2023 08:27:30 -  1.19
+++ dev/fdt/if_dwge.c   18 Oct 2023 22:29:16 -
@@ -214,6 +214,7 @@ struct dwge_desc {
 #define RDES0_OE   (1 << 11)
 #define RDES0_SAF  (1 << 13)
 #define RDES0_DE   (1 << 14)
+#define RDES0_ES   (1 << 15)
 #define RDES0_FL_MASK  0x3fff
 #define RDES0_FL_SHIFT 16
 #define RDES0_AFM  (1 << 30)
@@ -1097,15 +1098,20 @@ dwge_rx_proc(struct dwge_softc *sc)
len, BUS_DMASYNC_POSTREAD);
bus_dmamap_unload(sc->sc_dmat, rxb->tb_map);
 
-   /* Strip off CRC. */
-   len -= ETHER_CRC_LEN;
-   KASSERT(len > 0);
-
m = rxb->tb_m;
rxb->tb_m = NULL;
-   m->m_pkthdr.len = m->m_len = len;
+   if (rxd->sd_status & RDES0_ES) {
+   ifp->if_ierrors++;
+   m_freem(m);
+   } else {
+   /* Strip off CRC. */
+   len -= ETHER_CRC_LEN;
+   KASSERT(len > 0);
 
-   ml_enqueue(, m);
+   m->m_pkthdr.len = m->m_len = len;
+
+   ml_enqueue(, m);
+   }
 
put++;
if (sc->sc_rx_cons == (DWGE_NRXDESC - 1))


> 
> Index: if_dwge.c
> ===
> RCS file: /OpenBSD/src/sys/dev/fdt/if_dwge.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 if_dwge.c
> --- if_dwge.c 15 Aug 2023 08:27:30 -  1.19
> +++ if_dwge.c 18 Oct 2023 16:37:59 -
> @@ -1099,13 +1101,15 @@ dwge_rx_proc(struct dwge_softc *sc)
>  
>   /* Strip off CRC. */
>   len -= ETHER_CRC_LEN;
> - KASSERT(len > 0);
> -
> - m = rxb->tb_m;
> - rxb->tb_m = NULL;
> - m->m_pkthdr.len = m->m_len = len;
> + if (len <= 0) {
> + ifp->if_ierrors++;
> + } else {
> + m = rxb->tb_m;
> + rxb->tb_m = NULL;
> + m->m_pkthdr.len = m->m_len = len;
>  
> - ml_enqueue(, m);
> + ml_enqueue(, m);
> + }
>  
>   put++;
>   if (sc->sc_rx_cons == (DWGE_NRXDESC - 1))
> 
>

Re: better fix for Term::Cap

2023-10-18 Thread Andrew Hewus Fresh

On Wed, Oct 18, 2023 at 10:25:59AM +0200, Marc Espie wrote:
> Instead of an archaic limit, just keep track of tc's.
> This is actually slightly faster, because the termcap links is a tree,
> not a list.

I like it, but I don't want to carry this patch against upstream.  I
will happily pull in this change from there as soon as they merge it
though.

Changing the 32 to 64 is an easy patch to keep up on and to solve merge
conflicts with when upgrading.  This larger change is much less so.

That said, there is so much in this file that could be improved, why
stop here?

> 
> Please test.
> 
> Index: Cap.pm
> ===
> RCS file: /cvs/src/gnu/usr.bin/perl/cpan/Term-Cap/Cap.pm,v
> retrieving revision 1.3
> diff -u -p -r1.3 Cap.pm
> --- Cap.pm18 Oct 2023 01:49:26 -  1.3
> +++ Cap.pm18 Oct 2023 08:24:54 -
> @@ -280,7 +280,7 @@ sub Tgetent
>  
>  $first = 0;# first entry (keeps term name)
>  
> -$max = 64; # max :tc=...:'s
> +my $seen = {};   # keep track of :tc=...:s
>  
>  if ($entry)
>  {
> @@ -291,6 +291,7 @@ sub Tgetent
>  if ( $entry =~ s/:tc=([^:]+):/:/ )
>  {
>  $tmp_term = $1;
> + $seen->{$tmp_term} = 1;
>  
>  # protect any pattern metacharacters in $tmp_term
>  $termpat = $tmp_term;
> @@ -332,10 +333,7 @@ sub Tgetent
>  }
>  else
>  {
> -
>  # do the same file again
> -# prevent endless recursion
> -$max-- || croak "failed termcap loop at $tmp_term";
>  $state = 1;# ok, maybe do a new file next time
>  }
>  
> @@ -345,11 +343,20 @@ sub Tgetent
>  close TERMCAP;
>  
>  # If :tc=...: found then search this file again
> -$entry =~ s/:tc=([^:]+):/:/ && ( $tmp_term = $1, $state = 2 );
> +while ($entry =~ s/:tc=([^:]+):/:/) {
> + $tmp_term = $1;
> + if ($seen->{$tmp_term}) {
> + # XXX first version of this croaked, but we can actually
> + # get several intermediate entries with the same tc !
> + next;
> + }
> + $seen->{$tmp_term} = 1;
> + $state = 2;
>  
> -# protect any pattern metacharacters in $tmp_term
> -$termpat = $tmp_term;
> -$termpat =~ s/(\W)/\\$1/g;
> + # protect any pattern metacharacters in $tmp_term
> + $termpat = $tmp_term;
> + $termpat =~ s/(\W)/\\$1/g;
> + }
>  }
>  
>  croak "Can't find $term" if $entry eq '';
> 

-- 
andrew

"Programming today is a race between software engineers striving to
build bigger and better idiot-proof programs, and the Universe
trying to produce bigger and better idiots. So far, the Universe is
winning." -- Rich Cook

Re: TSO for ixl(4)

2023-10-18 Thread Alexander Bluhm

On Wed, Oct 18, 2023 at 05:29:41PM +0200, Jan Klemkow wrote:
> This diff implements TCP Segmentation Offloading for ixl(4).  I tested
> it successfully on amd64 and sparc64 with Intel X710.  It should
> increase the TCP bulk performance to 10 Gbit/s.  On sparc64 I got an
> increase from 600 MBit/s to 2.000 Gbit/s.
> 
> Further testing is welcome.

tested on amd64
OK bluhm@

> Index: dev/pci/if_ixl.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ixl.c,v
> retrieving revision 1.89
> diff -u -p -r1.89 if_ixl.c
> --- dev/pci/if_ixl.c  29 Sep 2023 19:44:47 -  1.89
> +++ dev/pci/if_ixl.c  18 Oct 2023 15:15:30 -
> @@ -71,6 +71,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  
>  #if NBPFILTER > 0
> @@ -85,6 +86,8 @@
>  #include 
>  #include 
>  #include 
> +#include 
> +#include 
>  #include 
>  #include 
>  
> @@ -827,6 +830,10 @@ struct ixl_tx_desc {
>  #define IXL_TX_DESC_BSIZE_MASK   \
>   (IXL_TX_DESC_BSIZE_MAX << IXL_TX_DESC_BSIZE_SHIFT)
>  
> +#define IXL_TX_CTX_DESC_CMD_TSO  0x10
> +#define IXL_TX_CTX_DESC_TLEN_SHIFT   30
> +#define IXL_TX_CTX_DESC_MSS_SHIFT50
> +
>  #define IXL_TX_DESC_L2TAG1_SHIFT 48
>  } __packed __aligned(16);
>  
> @@ -893,11 +900,19 @@ struct ixl_rx_wb_desc_32 {
>   uint64_tqword3;
>  } __packed __aligned(16);
>  
> -#define IXL_TX_PKT_DESCS 8
> +#define IXL_TX_PKT_DESCS 32
>  #define IXL_TX_QUEUE_ALIGN   128
>  #define IXL_RX_QUEUE_ALIGN   128
>  
>  #define IXL_HARDMTU  9712 /* 9726 - ETHER_HDR_LEN */
> +#define IXL_TSO_SIZE ((255 * 1024) - 1)
> +#define IXL_MAX_DMA_SEG_SIZE ((16 * 1024) - 1)
> +
> +/*
> + * Our TCP/IP Stack could not handle packets greater than MAXMCLBYTES.
> + * This interface could not handle packets greater than IXL_TSO_SIZE.
> + */
> +CTASSERT(MAXMCLBYTES < IXL_TSO_SIZE);
>  
>  #define IXL_PCIREG   PCI_MAPREG_START
>  
> @@ -1958,6 +1973,7 @@ ixl_attach(struct device *parent, struct
>   ifp->if_capabilities |= IFCAP_CSUM_IPv4 |
>   IFCAP_CSUM_TCPv4 | IFCAP_CSUM_UDPv4 |
>   IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
> + ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
>  
>   ifmedia_init(>sc_media, 0, ixl_media_change, ixl_media_status);
>  
> @@ -2603,7 +2619,7 @@ ixl_txr_alloc(struct ixl_softc *sc, unsi
>   txm = [i];
>  
>   if (bus_dmamap_create(sc->sc_dmat,
> - IXL_HARDMTU, IXL_TX_PKT_DESCS, IXL_HARDMTU, 0,
> + MAXMCLBYTES, IXL_TX_PKT_DESCS, IXL_MAX_DMA_SEG_SIZE, 0,
>   BUS_DMA_WAITOK | BUS_DMA_ALLOCNOW | BUS_DMA_64BIT,
>   >txm_map) != 0)
>   goto uncreate;
> @@ -2787,7 +2803,8 @@ ixl_load_mbuf(bus_dma_tag_t dmat, bus_dm
>  }
>  
>  static uint64_t
> -ixl_tx_setup_offload(struct mbuf *m0)
> +ixl_tx_setup_offload(struct mbuf *m0, struct ixl_tx_ring *txr,
> +unsigned int prod)
>  {
>   struct ether_extracted ext;
>   uint64_t hlen;
> @@ -2800,7 +2817,7 @@ ixl_tx_setup_offload(struct mbuf *m0)
>   }
>  
>   if (!ISSET(m0->m_pkthdr.csum_flags,
> - M_IPV4_CSUM_OUT|M_TCP_CSUM_OUT|M_UDP_CSUM_OUT))
> + M_IPV4_CSUM_OUT|M_TCP_CSUM_OUT|M_UDP_CSUM_OUT|M_TCP_TSO))
>   return (offload);
>  
>   ether_extract_headers(m0, );
> @@ -2833,6 +2850,28 @@ ixl_tx_setup_offload(struct mbuf *m0)
>   offload |= (sizeof(*ext.udp) >> 2) << IXL_TX_DESC_L4LEN_SHIFT;
>   }
>  
> + if (ISSET(m0->m_pkthdr.csum_flags, M_TCP_TSO)) {
> + if (ext.tcp) {
> + struct ixl_tx_desc *ring, *txd;
> + uint64_t cmd = 0;
> +
> + hlen += ext.tcp->th_off << 2;
> + ring = IXL_DMA_KVA(>txr_mem);
> + txd = [prod];
> +
> + cmd |= IXL_TX_DESC_DTYPE_CONTEXT;
> + cmd |= IXL_TX_CTX_DESC_CMD_TSO;
> + cmd |= (uint64_t)(m0->m_pkthdr.len - ETHER_HDR_LEN
> + - hlen) << IXL_TX_CTX_DESC_TLEN_SHIFT;
> + cmd |= (uint64_t)(m0->m_pkthdr.ph_mss)
> + << IXL_TX_CTX_DESC_MSS_SHIFT;
> +
> + htolem64(>addr, 0);
> + htolem64(>cmd, cmd);
> + } else
> + tcpstat_inc(tcps_outbadtso);
> + }
> +
>   return (offload);
>  }
>  
> @@ -2873,7 +2912,8 @@ ixl_start(struct ifqueue *ifq)
>   mask = sc->sc_tx_ring_ndescs - 1;
>  
>   for (;;) {
> - if (free <= IXL_TX_PKT_DESCS) {
> + /* We need one extra descriptor for TSO packets. */
> + if (free <= (IXL_TX_PKT_DESCS + 1)) {
>   ifq_set_oactive(ifq);
>   break;
>   }
> @@ -2882,10 +2922,16 @@ ixl_start(struct ifqueue *ifq)
>

Re: bgpd session.c convert to new ibuf API

2023-10-18 Thread Theo Buehler

On Wed, Oct 18, 2023 at 05:19:29PM +0200, Claudio Jeker wrote:
> This is a bit overdue. Convert session.c to also use the new ibuf API.
> This simplifies some code since there is no need for local variables.
> Also kill the struct msg_header and especially msg_open. The are of very
> little use.
> 
> Regress passes so I think this should be fine :)

Can't spot anything wrong with the conversion, two log nits below.

> -- 
> :wq Claudio
> 
> Index: session.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.450
> diff -u -p -r1.450 session.c
> --- session.c 17 Oct 2023 17:58:15 -  1.450
> +++ session.c 18 Oct 2023 15:09:29 -
> @@ -1324,24 +1324,26 @@ session_capa_add(struct ibuf *opb, uint8
>  {
>   int errs = 0;
>  
> - errs += ibuf_add(opb, _code, sizeof(capa_code));
> - errs += ibuf_add(opb, _len, sizeof(capa_len));
> + errs += ibuf_add_n8(opb, capa_code);
> + errs += ibuf_add_n8(opb, capa_len);
>   return (errs);
>  }
>  
>  int
>  session_capa_add_mp(struct ibuf *buf, uint8_t aid)
>  {
> - uint8_t  safi, pad = 0;
>   uint16_t afi;
> + uint8_t  safi;
>   int  errs = 0;
>  
> - if (aid2afi(aid, , ) == -1)
> - fatalx("session_capa_add_mp: bad afi/safi pair");
> - afi = htons(afi);
> - errs += ibuf_add(buf, , sizeof(afi));
> - errs += ibuf_add(buf, , sizeof(pad));
> - errs += ibuf_add(buf, , sizeof(safi));
> + if (aid2afi(aid, , ) == -1) {
> + log_warn("session_capa_add_afi: bad AID");

wrong function name. Should this be

log_warn("%s: bad AID", __func__);

> + return (-1);
> + }
> +
> + errs += ibuf_add_n16(buf, afi);
> + errs += ibuf_add_zero(buf, 1);
> + errs += ibuf_add_n8(buf, safi);
>  
>   return (errs);
>  }
> @@ -1356,13 +1358,12 @@ session_capa_add_afi(struct peer *p, str
>  
>   if (aid2afi(aid, , )) {
>   log_warn("session_capa_add_afi: bad AID");

use __func__?

> - return (1);
> + return (-1);

fun

>   }
>  
> - afi = htons(afi);
> - errs += ibuf_add(b, , sizeof(afi));
> - errs += ibuf_add(b, , sizeof(safi));
> - errs += ibuf_add(b, , sizeof(flags));
> + errs += ibuf_add_n16(b, afi);
> + errs += ibuf_add_n8(b, safi);
> + errs += ibuf_add_n8(b, flags);
>  
>   return (errs);
>  }
> @@ -1370,21 +1371,19 @@ session_capa_add_afi(struct peer *p, str
>  struct bgp_msg *
>  session_newmsg(enum msg_type msgtype, uint16_t len)
>  {
> + u_char   marker[MSGSIZE_HEADER_MARKER];
>   struct bgp_msg  *msg;
> - struct msg_headerhdr;
>   struct ibuf *buf;
>   int  errs = 0;
>  
> - memset(, 0xff, sizeof(hdr.marker));
> - hdr.len = htons(len);
> - hdr.type = msgtype;
> + memset(marker, 0xff, sizeof(marker));
>  
>   if ((buf = ibuf_open(len)) == NULL)
>   return (NULL);
>  
> - errs += ibuf_add(buf, , sizeof(hdr.marker));
> - errs += ibuf_add(buf, , sizeof(hdr.len));
> - errs += ibuf_add(buf, , sizeof(hdr.type));
> + errs += ibuf_add(buf, marker, sizeof(marker));
> + errs += ibuf_add_n16(buf, len);
> + errs += ibuf_add_n8(buf, msgtype);
>  
>   if (errs || (msg = calloc(1, sizeof(*msg))) == NULL) {
>   ibuf_free(buf);
> @@ -1472,8 +1471,7 @@ session_open(struct peer *p)
>  {
>   struct bgp_msg  *buf;
>   struct ibuf *opb;
> - struct msg_open  msg;
> - uint16_t len, optparamlen = 0;
> + uint16_t len, optparamlen = 0, holdtime;
>   uint8_t  i, op_type;
>   int  errs = 0, extlen = 0;
>   int  mpcapa = 0;
> @@ -1501,10 +1499,8 @@ session_open(struct peer *p)
>   p->conf.role != ROLE_NONE &&
>   (p->capa.ann.mp[AID_INET] || p->capa.ann.mp[AID_INET6] ||
>   mpcapa == 0)) {
> - uint8_t val;
> - val = role2capa(p->conf.role);
>   errs += session_capa_add(opb, CAPA_ROLE, 1);
> - errs += ibuf_add(opb, , 1);
> + errs += ibuf_add_n8(opb, role2capa(p->conf.role));
>   }
>  
>   /* graceful restart and End-of-RIB marker, RFC 4724 */
> @@ -1520,19 +1516,14 @@ session_open(struct peer *p)
>   /* Only set the R-flag if no graceful restart is ongoing */
>   if (!rst)
>   hdr |= CAPA_GR_R_FLAG;
> - hdr = htons(hdr);
> -
>   errs += session_capa_add(opb, CAPA_RESTART, sizeof(hdr));
> - errs += ibuf_add(opb, , sizeof(hdr));
> + errs += ibuf_add_n16(opb, hdr);
>   }
>  
>   /* 4-bytes AS numbers, RFC6793 */
>   if (p->capa.ann.as4byte) {  /* 4 bytes data */
>

Re: smtpd: implement nullmx RFC 7505

2023-10-18 Thread Philipp

[2023-10-18 11:42] Omar Polo 
> On 2023/10/18 08:40:14 +0100, Stuart Henderson  wrote:
> > On 2023/10/17 22:27, Philipp wrote:
> > > [2023-10-17 17:32] Omar Polo 
> > > > [...]
>
> > > But I don't think your proposed patch is a good solution, because the
> > > result depend on the order of the RR in the repsonse. The problem is
> > > when the first entry in the response is a Null MX your patch truncate
> > > the other results and a bounce is generated. But when the Null MX is
> > > later in the response the other entries are used to deliver the mail.
>
> that's true; I fell for a shorter diff.  I've attached below a version
> that doesn't depend on the order.

Your patch looks mostly good. I only have a small nitpick style comment:

> +
> + if (rr.rr.mx.preference == 0 && !strcmp(buf, "")) {

strcmp doesn't return a bool like value. Something like:

if (rr.rr.mx.preference == 0 && strcmp(buf, "") == 0)

is imho better to read.

Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-18 Thread Mischa Peters



> On Oct 18, 2023, at 15:44, Hrvoje Popovski  wrote:
> 
> On 18.10.2023. 15:35, Mischa wrote:
>> Hi All,
>> 
>> Just upgraded a couple of machines to 7.4. smooth as always!!
>> 
>> I am however seeing issues with IPv4, slowness or no throughput at all.
>> The machines I have upgraded are using an Intel X540-T network card and
>> is connected on 10G.
>> 
>> ix0 at pci5 dev 0 function 0 "Intel X540T" rev 0x01, msix, 16 queues,
>> address b8:ca:3a:62:ee:40
>> ix1 at pci5 dev 0 function 1 "Intel X540T" rev 0x01, msix, 16 queues,
>> address b8:ca:3a:62:ee:42
>> 
>> root@n2:~ # sysctl kern.version
>> kern.version=OpenBSD 7.4 (GENERIC.MP) #1397: Tue Oct 10 09:02:37 MDT 2023
>> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> 
>> There are a bunch of VMs running on top of it, as soon as I want to
>> fetch something with ftp, for example, I don't get anything over IPv4,
>> with IPv6 everything is normal.
>> 
>> mischa@www2:~ $ ftp -4
>> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
>> Trying 46.23.88.18...
>> Requesting
>> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
>>   0% | |   512 KB  - stalled
>> -^C
>> 
>> A trace on mirror / n2:
>> 
>> n2:~ # tcpdump -i vport880 host 46.23.88.32
>> tcpdump: listening on vport880, link-type EN10MB
>> 15:16:08.730274 www2.high5.nl.1828 > n2.high5.nl.https: S
>> 2182224746:2182224746(0) win 16384 > 6,nop,nop,timestamp 2899683458 0> (DF)
>> 15:16:08.730297 arp who-has www2.high5.nl tell n2.high5.nl
>> 15:16:08.731535 arp reply www2.high5.nl is-at fe:51:bb:1e:12:11
>> 15:16:08.731540 n2.high5.nl.https > www2.high5.nl.1828: S
>> 633749938:633749938(0) ack 2182224747 win 16384 > 1460,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3129955106
>> 2899683458> (DF)
>> 15:16:08.732017 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1 win 256
>>  (DF)
>> 15:16:08.785752 www2.high5.nl.1828 > n2.high5.nl.https: P 1:312(311) ack
>> 1 win 256  (DF)
>> 15:16:08.786092 n2.high5.nl.https > www2.high5.nl.1828: P 1:128(127) ack
>> 312 win 271  (DF)
>> 15:16:08.786376 n2.high5.nl.https > www2.high5.nl.1828: P 128:134(6) ack
>> 312 win 271  (DF)
>> 15:16:08.786396 n2.high5.nl.https > www2.high5.nl.1828: P 134:166(32)
>> ack 312 win 271  (DF)
>> 15:16:08.786455 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
>> ack 312 win 271  (DF)
>> 15:16:08.786457 n2.high5.nl.https > www2.high5.nl.1828: .
>> 1614:3062(1448) ack 312 win 271 > 2899683510> (DF)
>> 15:16:08.786460 n2.high5.nl.https > www2.high5.nl.1828: P 3062:3803(741)
>> ack 312 win 271  (DF)
>> 15:16:08.786943 www2.high5.nl.1828 > n2.high5.nl.https: . ack 134 win
>> 255  (DF)
>> 15:16:08.796534 n2.high5.nl.https > www2.high5.nl.1828: P 3803:4345(542)
>> ack 312 win 271  (DF)
>> 15:16:08.796577 n2.high5.nl.https > www2.high5.nl.1828: P 4345:4403(58)
>> ack 312 win 271  (DF)
>> 15:16:08.797518 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
>> 256 >> (DF)
>> 15:16:08.797522 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
>> 256 >> (DF)
>> 15:16:09.790297 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
>> ack 312 win 271  (DF)
>> 15:16:09.790902 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1614 win
>> 233 >> (DF)
>> 15:16:09.790917 n2.high5.nl.https > www2.high5.nl.1828: .
>> 1614:3062(1448) ack 312 win 271 > 2899684519> (DF)
>> 15:16:09.790923 n2.high5.nl.https > www2.high5.nl.1828: P
>> 3062:4403(1341) ack 312 win 271 > 2899684519> (DF)
>> 15:16:10.790299 n2.high5.nl.https > www2.high5.nl.1828: .
>> 1614:3062(1448) ack 312 win 271 > 2899684519> (DF)
>> 15:16:10.791204 www2.high5.nl.1828 > n2.high5.nl.https: . ack 3062 win
>> 233 >> (DF)
>> 15:16:10.791223 n2.high5.nl.https > www2.high5.nl.1828: P
>> 3062:4403(1341) ack 312 win 271 > 2899685520> (DF)
>> 15:16:10.791692 www2.high5.nl.1828 > n2.high5.nl.https: . ack 4403 win
>> 235  (DF)
>> 15:16:10.802647 www2.high5.nl.1828 > n2.high5.nl.https: P 312:318(6) ack
>> 4403 win 256  (DF)
>> 15:16:11.000297 n2.high5.nl.https > www2.high5.nl.1828: . ack 318 win
>> 271  (DF)
>> 15:16:11.001162 www2.high5.nl.1828 > n2.high5.nl.https: P 318:527(209)
>> ack 4403 win 256  (DF)
>> 15:16:11.001860 n2.high5.nl.https > www2.high5.nl.1828: P 4403:5059(656)
>> ack 527 win 271  (DF)
>> 15:16:11.001989 n2.high5.nl.https > www2.high5.nl.1828: .
>> 5059:6507(1448) ack 527 win 271 > 2899685730> (DF)
>> 15:16:11.001992 n2.high5.nl.https > www2.high5.nl.1828: .
>> 6507:7955(1448) ack 527 win 271 > 2899685730> (DF)
>> 15:16:11.195431 www2.high5.nl.1828 > n2.high5.nl.https: . ack 5059 win
>> 256  (DF)
>> 15:16:11.195447 n2.high5.nl.https > www2.high5.nl.1828: .
>> 7955:9403(1448) ack 527 win 271 > 2899685924> (DF)
>> 
>> 
>> Running a trace on www2 I am seeing:
>> 
>> www2:~ # tcpdump -i vio0 host 46.23.88.18
>> tcpdump: listening on vio0, link-type EN10MB
>> 15:16:08.729974 www2.high5.nl.1828 > n2.high5.nl.https: S
>> 2182224746:2182224746(0) win 16384 > 6,nop,nop,timestamp

Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-18 Thread Hrvoje Popovski

On 18.10.2023. 15:35, Mischa wrote:
> Hi All,
> 
> Just upgraded a couple of machines to 7.4. smooth as always!!
> 
> I am however seeing issues with IPv4, slowness or no throughput at all.
> The machines I have upgraded are using an Intel X540-T network card and
> is connected on 10G.
> 
> ix0 at pci5 dev 0 function 0 "Intel X540T" rev 0x01, msix, 16 queues,
> address b8:ca:3a:62:ee:40
> ix1 at pci5 dev 0 function 1 "Intel X540T" rev 0x01, msix, 16 queues,
> address b8:ca:3a:62:ee:42
> 
> root@n2:~ # sysctl kern.version
> kern.version=OpenBSD 7.4 (GENERIC.MP) #1397: Tue Oct 10 09:02:37 MDT 2023
>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> There are a bunch of VMs running on top of it, as soon as I want to
> fetch something with ftp, for example, I don't get anything over IPv4,
> with IPv6 everything is normal.
> 
> mischa@www2:~ $ ftp -4
> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
> Trying 46.23.88.18...
> Requesting
> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
>   0% | |   512 KB  - stalled
> -^C
> 
> A trace on mirror / n2:
> 
> n2:~ # tcpdump -i vport880 host 46.23.88.32
> tcpdump: listening on vport880, link-type EN10MB
> 15:16:08.730274 www2.high5.nl.1828 > n2.high5.nl.https: S
> 2182224746:2182224746(0) win 16384  6,nop,nop,timestamp 2899683458 0> (DF)
> 15:16:08.730297 arp who-has www2.high5.nl tell n2.high5.nl
> 15:16:08.731535 arp reply www2.high5.nl is-at fe:51:bb:1e:12:11
> 15:16:08.731540 n2.high5.nl.https > www2.high5.nl.1828: S
> 633749938:633749938(0) ack 2182224747 win 16384  1460,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3129955106
> 2899683458> (DF)
> 15:16:08.732017 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1 win 256
>  (DF)
> 15:16:08.785752 www2.high5.nl.1828 > n2.high5.nl.https: P 1:312(311) ack
> 1 win 256  (DF)
> 15:16:08.786092 n2.high5.nl.https > www2.high5.nl.1828: P 1:128(127) ack
> 312 win 271  (DF)
> 15:16:08.786376 n2.high5.nl.https > www2.high5.nl.1828: P 128:134(6) ack
> 312 win 271  (DF)
> 15:16:08.786396 n2.high5.nl.https > www2.high5.nl.1828: P 134:166(32)
> ack 312 win 271  (DF)
> 15:16:08.786455 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
> ack 312 win 271  (DF)
> 15:16:08.786457 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899683510> (DF)
> 15:16:08.786460 n2.high5.nl.https > www2.high5.nl.1828: P 3062:3803(741)
> ack 312 win 271  (DF)
> 15:16:08.786943 www2.high5.nl.1828 > n2.high5.nl.https: . ack 134 win
> 255  (DF)
> 15:16:08.796534 n2.high5.nl.https > www2.high5.nl.1828: P 3803:4345(542)
> ack 312 win 271  (DF)
> 15:16:08.796577 n2.high5.nl.https > www2.high5.nl.1828: P 4345:4403(58)
> ack 312 win 271  (DF)
> 15:16:08.797518 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
> 256 > (DF)
> 15:16:08.797522 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
> 256 > (DF)
> 15:16:09.790297 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
> ack 312 win 271  (DF)
> 15:16:09.790902 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1614 win
> 233 > (DF)
> 15:16:09.790917 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899684519> (DF)
> 15:16:09.790923 n2.high5.nl.https > www2.high5.nl.1828: P
> 3062:4403(1341) ack 312 win 271  2899684519> (DF)
> 15:16:10.790299 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899684519> (DF)
> 15:16:10.791204 www2.high5.nl.1828 > n2.high5.nl.https: . ack 3062 win
> 233 > (DF)
> 15:16:10.791223 n2.high5.nl.https > www2.high5.nl.1828: P
> 3062:4403(1341) ack 312 win 271  2899685520> (DF)
> 15:16:10.791692 www2.high5.nl.1828 > n2.high5.nl.https: . ack 4403 win
> 235  (DF)
> 15:16:10.802647 www2.high5.nl.1828 > n2.high5.nl.https: P 312:318(6) ack
> 4403 win 256  (DF)
> 15:16:11.000297 n2.high5.nl.https > www2.high5.nl.1828: . ack 318 win
> 271  (DF)
> 15:16:11.001162 www2.high5.nl.1828 > n2.high5.nl.https: P 318:527(209)
> ack 4403 win 256  (DF)
> 15:16:11.001860 n2.high5.nl.https > www2.high5.nl.1828: P 4403:5059(656)
> ack 527 win 271  (DF)
> 15:16:11.001989 n2.high5.nl.https > www2.high5.nl.1828: .
> 5059:6507(1448) ack 527 win 271  2899685730> (DF)
> 15:16:11.001992 n2.high5.nl.https > www2.high5.nl.1828: .
> 6507:7955(1448) ack 527 win 271  2899685730> (DF)
> 15:16:11.195431 www2.high5.nl.1828 > n2.high5.nl.https: . ack 5059 win
> 256  (DF)
> 15:16:11.195447 n2.high5.nl.https > www2.high5.nl.1828: .
> 7955:9403(1448) ack 527 win 271  2899685924> (DF)
> 
> 
> Running a trace on www2 I am seeing:
> 
> www2:~ # tcpdump -i vio0 host 46.23.88.18
> tcpdump: listening on vio0, link-type EN10MB
> 15:16:08.729974 www2.high5.nl.1828 > n2.high5.nl.https: S
> 2182224746:2182224746(0) win 16384  6,nop,nop,timestamp 2899683458 0> (DF)
> 15:16:08.731114 arp who-has www2.high5.nl tell n2.high5.nl
> 15:16:08.731229 arp reply www2.high5.nl is-at fe:51:bb:1e:12:11
> 15:16:08.731631 n2.high5.nl.https >

Re: smtpd: implement nullmx RFC 7505

2023-10-18 Thread Omar Polo

On 2023/10/18 08:40:14 +0100, Stuart Henderson  wrote:
> On 2023/10/17 22:27, Philipp wrote:
> > [2023-10-17 17:32] Omar Polo 
> > > There is one part of the RFC7505 that I'd like to quote and discuss
> > > with you however.  The last paragraph of the section 3 says:
> > >
> > > : A domain that advertises a null MX MUST NOT advertise any other MX
> > > : RR.
> > >
> > > Your implementation handles the case where there are more than one MX
> > > by setting the `nullmx' field in the shared struct when we encounter
> > > one, and then don't process the other MXs when that field is set.
> > > This is fine.
> > >
> > > However, I think we can simplify here by not honouring the Null MX
> > > when there are other MX records.  We're not violating the RFC.  See
> > > diff below to see what I mean.  The only difference is in dns.c, the
> > > rest is the same with yours.
> > 
> > My reasaning for that was: When you set Null MX you explicitly opt out
> > from reciving mail even when you violate the spec. But if you want
> > it diffrent I can change it.
> 
> I think in that situation, the recipient's DNS admin has *not* set a
> valid nullmx, and delivery to other MXes should still be attempted.

That's what I though too.

> > But I don't think your proposed patch is a good solution, because the
> > result depend on the order of the RR in the repsonse. The problem is
> > when the first entry in the response is a Null MX your patch truncate
> > the other results and a bounce is generated. But when the Null MX is
> > later in the response the other entries are used to deliver the mail.

that's true; I fell for a shorter diff.  I've attached below a version
that doesn't depend on the order.

> > > So far I've only compile-tested the code.  I don't have a spare domain
> > > to test and don't know of anyone using a Null MX.
> > 
> > To test Null MX you can use example.com. To test the localhost patch a
> > friend of me has set up mxtest.yannikenss.de.

thanks!

> [...]
> > [1] I still belive OpenSMTPD should only connect to public routable
> > addresses for relay. So don't connect to something like 127.0.0.1/8,
> > ::1/128, RFC1918 addresses, ULA, ...
> > But this is independent of this patch.
> 
> Support for nullmx makes sense to me. Perhaps also the localhost
> handling. But some intranet setups may require connections to
> RFC1918/ULA addresses - I don't think an MTA should prevent these.

Completely agree.



diff 2d025d839f99dc09ee525c11a4ed09a0f3bbe7d0 
02bb94351d3865e61483023cab9fa02bcac2970d
commit - 2d025d839f99dc09ee525c11a4ed09a0f3bbe7d0
commit + 02bb94351d3865e61483023cab9fa02bcac2970d
blob - 4cf5d23d1d14b5400c6f4429dae0a4f6490073d4
blob + 552a5cf9115401b4fac3d131d6e5c022ebb8b7fd
--- usr.sbin/smtpd/dns.c
+++ usr.sbin/smtpd/dns.c
@@ -232,10 +232,10 @@ dns_dispatch_mx(struct asr_result *ar, void *arg)
struct dns_rrrr;
char buf[512];
size_t   found;
+   int  nullmx = 0;
 
if (ar->ar_h_errno && ar->ar_h_errno != NO_DATA &&
ar->ar_h_errno != NOTIMP) {
-
m_create(s->p,  IMSG_MTA_DNS_HOST_END, 0, 0, -1);
m_add_id(s->p, s->reqid);
if (ar->ar_rcode == NXDOMAIN)
@@ -259,13 +259,29 @@ dns_dispatch_mx(struct asr_result *ar, void *arg)
unpack_rr(, );
if (rr.rr_type != T_MX)
continue;
+
print_dname(rr.rr.mx.exchange, buf, sizeof(buf));
buf[strlen(buf) - 1] = '\0';
+
+   if (rr.rr.mx.preference == 0 && !strcmp(buf, "")) {
+   nullmx = 1;
+   continue;
+   }
+
dns_lookup_host(s, buf, rr.rr.mx.preference);
found++;
}
free(ar->ar_data);
 
+   if (nullmx && found == 0) {
+   m_create(s->p, IMSG_MTA_DNS_HOST_END, 0, 0, -1);
+   m_add_id(s->p, s->reqid);
+   m_add_int(s->p, DNS_NULLMX);
+   m_close(s->p);
+   free(s);
+   return;
+   }
+
/* fallback to host if no MX is found. */
if (found == 0)
dns_lookup_host(s, s->name, 0);
blob - c0d0fc02931b90409441035d2744923af9e42df1
blob + 25e158b68a88c8485428a2476c1c557f8c60404d
--- usr.sbin/smtpd/mta.c
+++ usr.sbin/smtpd/mta.c
@@ -1088,6 +1088,10 @@ mta_on_mx(void *tag, void *arg, void *data)
else
relay->failstr = "No MX found for domain";
break;
+   case DNS_NULLMX:
+   relay->fail = IMSG_MTA_DELIVERY_PERMFAIL;
+   relay->failstr = "Domain does not accept mail";
+   break;
default:
fatalx("bad DNS lookup error code");
break;
blob - 6781286928da45e6159bce81ff2437011ebdef72
blob + d5cbe88feeaf9e91eca119b31dda957aca68c031
--- usr.sbin/smtpd/smtpd.h
+++ usr.sbin/smtpd/smtpd.h
@@

Re: log.c use buffered IO

2023-10-18 Thread Claudio Jeker

On Tue, Oct 17, 2023 at 10:06:54AM +0200, Sebastian Benoit wrote:
> Theo Buehler(t...@theobuehler.org) on 2023.10.17 09:13:15 +0200:
> > On Mon, Oct 16, 2023 at 12:19:17PM +0200, Claudio Jeker wrote:
> > > I dislike how log.c does all these asprintf() calls with dubious
> > > workaround calls in case asprintf() fails.
> > 
> > You're not alone.
> > 
> > > IMO it is easier to use the stdio provided buffers and fflush() to get
> > > "atomic" writes on stderr. At least from my understanding this is the
> > > reason to go all this lenght to do a single fprintf call.
> > 
> > This makes sense, but I don't know the history here.
> 
> as far as i can remember, it was done so it would still be able to work
> somewhat when out of memeory.
>  

After some input off-list here another idea.
Require that logit() is called with a \n and by that simplify a lot of
code around it. vsyslog() handles both so having the \n should not cause
any breakage.

Now logit() is mostly used internally but in bgpd it is also used in
logmsg.c and parse.y. Fixing those is simple.

Also this uses a stack buffer for all the log_* cases now. This should
make the code more thread safe.

Also this removes vlog() from the API.  I had a quick look at all the
other log.c users and apart from ldapd this can be added to all of them
without much issues. Nothing else uses vlog() and the logit() is also very
minimal (mostly the same parse.y change as below).

-- 
:wq Claudio

Index: log.c
===
RCS file: /cvs/src/usr.sbin/bgpd/log.c,v
retrieving revision 1.64
diff -u -p -r1.64 log.c
--- log.c   21 Mar 2017 12:06:55 -  1.64
+++ log.c   18 Oct 2023 07:10:32 -
@@ -26,6 +26,8 @@
 
 #include "log.h"
 
+#define MAX_LOGLEN 4096
+
 static int  debug;
 static int  verbose;
 static const char  *log_procname;
@@ -68,30 +70,15 @@ void
 logit(int pri, const char *fmt, ...)
 {
va_list ap;
+   int saved_errno = errno;
 
va_start(ap, fmt);
-   vlog(pri, fmt, ap);
-   va_end(ap);
-}
-
-void
-vlog(int pri, const char *fmt, va_list ap)
-{
-   char*nfmt;
-   int  saved_errno = errno;
-
if (debug) {
-   /* best effort in out of mem situations */
-   if (asprintf(, "%s\n", fmt) == -1) {
-   vfprintf(stderr, fmt, ap);
-   fprintf(stderr, "\n");
-   } else {
-   vfprintf(stderr, nfmt, ap);
-   free(nfmt);
-   }
+   vfprintf(stderr, fmt, ap);
fflush(stderr);
} else
vsyslog(pri, fmt, ap);
+   va_end(ap);
 
errno = saved_errno;
 }
@@ -99,26 +86,18 @@ vlog(int pri, const char *fmt, va_list a
 void
 log_warn(const char *emsg, ...)
 {
-   char*nfmt;
-   va_list  ap;
-   int  saved_errno = errno;
+   charfmtbuf[MAX_LOGLEN];
+   va_list ap;
+   int saved_errno = errno;
 
/* best effort to even work in out of memory situations */
if (emsg == NULL)
-   logit(LOG_ERR, "%s", strerror(saved_errno));
+   logit(LOG_ERR, "%s\n", strerror(saved_errno));
else {
va_start(ap, emsg);
-
-   if (asprintf(, "%s: %s", emsg,
-   strerror(saved_errno)) == -1) {
-   /* we tried it... */
-   vlog(LOG_ERR, emsg, ap);
-   logit(LOG_ERR, "%s", strerror(saved_errno));
-   } else {
-   vlog(LOG_ERR, nfmt, ap);
-   free(nfmt);
-   }
+   (void)vsnprintf(fmtbuf, sizeof(fmtbuf), emsg, ap);
va_end(ap);
+   logit(LOG_ERR, "%s: %s\n", fmtbuf, strerror(saved_errno));
}
 
errno = saved_errno;
@@ -127,53 +106,65 @@ log_warn(const char *emsg, ...)
 void
 log_warnx(const char *emsg, ...)
 {
-   va_list  ap;
+   charfmtbuf[MAX_LOGLEN];
+   va_list ap;
+   int saved_errno = errno;
 
va_start(ap, emsg);
-   vlog(LOG_ERR, emsg, ap);
+   (void)vsnprintf(fmtbuf, sizeof(fmtbuf), emsg, ap);
va_end(ap);
+   logit(LOG_ERR, "%s\n", fmtbuf);
+
+   errno = saved_errno;
 }
 
 void
 log_info(const char *emsg, ...)
 {
-   va_list  ap;
+   charfmtbuf[MAX_LOGLEN];
+   va_list ap;
+   int saved_errno = errno;
 
va_start(ap, emsg);
-   vlog(LOG_INFO, emsg, ap);
+   (void)vsnprintf(fmtbuf, sizeof(fmtbuf), emsg, ap);
va_end(ap);
+   logit(LOG_INFO, "%s\n", fmtbuf);
+
+   errno = saved_errno;
 }
 
 void
 log_debug(const char *emsg, ...)
 {
-   va_list  ap;
+   charfmtbuf[MAX_LOGLEN];
+   va_list ap;
 
if (verbose) {
va_start(ap, emsg);
-   vlog(LOG_DEBUG, emsg, ap);
+

Re: smtpd: implement nullmx RFC 7505

2023-10-18 Thread Stuart Henderson

On 2023/10/17 22:27, Philipp wrote:
> [2023-10-17 17:32] Omar Polo 
> >
> > There is one part of the RFC7505 that I'd like to quote and discuss
> > with you however.  The last paragraph of the section 3 says:
> >
> > : A domain that advertises a null MX MUST NOT advertise any other MX
> > : RR.
> >
> > Your implementation handles the case where there are more than one MX
> > by setting the `nullmx' field in the shared struct when we encounter
> > one, and then don't process the other MXs when that field is set.
> > This is fine.
> >
> > However, I think we can simplify here by not honouring the Null MX
> > when there are other MX records.  We're not violating the RFC.  See
> > diff below to see what I mean.  The only difference is in dns.c, the
> > rest is the same with yours.
> 
> My reasaning for that was: When you set Null MX you explicitly opt out
> from reciving mail even when you violate the spec. But if you want
> it diffrent I can change it.

I think in that situation, the recipient's DNS admin has *not* set a
valid nullmx, and delivery to other MXes should still be attempted.

> But I don't think your proposed patch is a good solution, because the
> result depend on the order of the RR in the repsonse. The problem is
> when the first entry in the response is a Null MX your patch truncate
> the other results and a bounce is generated. But when the Null MX is
> later in the response the other entries are used to deliver the mail.
> 
> > So far I've only compile-tested the code.  I don't have a spare domain
> > to test and don't know of anyone using a Null MX.
> 
> To test Null MX you can use example.com. To test the localhost patch a
> friend of me has set up mxtest.yannikenss.de.
> 
> > > Because some domains set the MX record to "localhost." to get a similar
> > > efect the secound patch ignores "localhost." MX entries and handles a MX
> > > containing only "localhost." records like a Null MX.
> >
> > This could be simplified too.  On top of diff below, it would need
> > just something among the lines of:
> 
> When localhost and Null MX should be handled the same this could be
> simplified in one check, yes. But that "localhost." and Null MX are
> handled different was on purpose. Because I would say setting a Null MX
> is a explicit opt out from reciving mail and setting MX to localhost
> is implicit. As said before, I have no problem which changing this.
> 
> > I'm still not convinced about this part however.  It feels wrong to me
> > to hardcode such a check, it seems too much paranoic.  The follow-up
> > would be to check for localhost in dns_dispatch_host too?  ;)
> 
> The idea[1] about this part was because setting "localhost." as MX is used
> in a similiar way as Null MX[0]. So handle this in a simmilar way as
> Null MX make sense. This way the sender will get a better bounce message.
> 
> Philipp
> 
> [0] https://www.netmeister.org/blog/mx-diversity.html
> 
> [1] I still belive OpenSMTPD should only connect to public routable
> addresses for relay. So don't connect to something like 127.0.0.1/8,
> ::1/128, RFC1918 addresses, ULA, ...
> But this is independent of this patch.

Support for nullmx makes sense to me. Perhaps also the localhost
handling. But some intranet setups may require connections to
RFC1918/ULA addresses - I don't think an MTA should prevent these.

BTW, you can already block those at the DNS resolver level in a
configurable way - see unbound's "private-address" feature.

Re: Why store pointers for some functions in malloc.c?

2023-10-17 Thread Masato Asou

From: Otto Moerbeek 
Date: Wed, 18 Oct 2023 07:28:47 +0200

> On Wed, Oct 18, 2023 at 09:23:49AM +0900, Masato Asou wrote:
> 
>> Hello tech@ and otto,
>> 
>> Why do only some calling functions store the pinttes in region_info as
>> below:
>> 
>> static void *
>> malloc_bytes(struct dir_info *d, size_t size, void *f)
>> {
>> 
>> found:
>> if (i == 0 && k == 0 && DO_STATS) {
>> struct region_info *r = find(d, bp->page);
>> STATS_SETF(r, f);
>> }
>> 
>> I found following mail from otto:
>> https://marc.info/?l=openbsd-tech=168171382927798=2
>> > The null "f" values (call sites) are due to the sampling nature of
>> > small allocations. Recording all call sites of all potential leaks
>> > introduces too much overhead.
>> 
>> Is this the answer to my question?
>> --
>> ASOU Masato
> 
> Yes.
>  
> The reason is that (in the existing code) there's only one pointer per
> region_info available to store callers. So for a chunk page (which has
> many small alocations) ony slot 0 gets recorded.

OK.

> But there's a diff I posted last week on tech@ that will change this
> so that all call sites are recorded (in a different location and only
> if D is used). It will also report more details when a write of a free
> chunk is detected.  That diff could use some review/testing.

I'll checkt the your posted diff.

Thank your for your information!
--
ASOU Masato

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 50116 matches

Mail list logo