Re: BUFSIZ-related pessimization in fvwrite.c

2024-03-29 Thread Christos Zoulas
In article ,
enh   wrote:
>(if anyone knows the equivalent freebsd list, please add them --- this
>code is the same between all three bsds.)
>
>Apple patched their copy of the FreeBSD fvwrite.c to change BUFSIZ to
>INT_MAX to massively improve the performance of large unbuffered
>writes[1], and patched the buffered case to use the largest multiple
>of the buffer size to massively improve the performance of large
>buffered writes:
>
>https://github.com/apple-oss-distributions/Libc/commit/c5a3293354e22262702a3add5b2dfc9bb0b93b85#diff-3b844a19cfb0aab1a23f7fbc457d3bce7453513730c489a72f66ff89d6557ff3
>
>i've tested similar changes to openbsd's fvwrite.c with great results.
>
>is it a _requirement_ that buffered writes _only_ happen in multiples
>of the buffer size? that seems unlikely to me (a) because of short
>writes and (b) musl just does a writev() of what's in the buffer and
>what it was just given in case of a larger-than-buffer write and (c)
>we already changed the openbsd fread() to read directly into the
>caller's buffer regardless of size, so the equivalent behavior on the
>_write_ side seems reasonable to me?

Makes sense, committed.

christos



BUFSIZ-related pessimization in fvwrite.c

2024-03-29 Thread enh
(if anyone knows the equivalent freebsd list, please add them --- this
code is the same between all three bsds.)

Apple patched their copy of the FreeBSD fvwrite.c to change BUFSIZ to
INT_MAX to massively improve the performance of large unbuffered
writes[1], and patched the buffered case to use the largest multiple
of the buffer size to massively improve the performance of large
buffered writes:

https://github.com/apple-oss-distributions/Libc/commit/c5a3293354e22262702a3add5b2dfc9bb0b93b85#diff-3b844a19cfb0aab1a23f7fbc457d3bce7453513730c489a72f66ff89d6557ff3

i've tested similar changes to openbsd's fvwrite.c with great results.

is it a _requirement_ that buffered writes _only_ happen in multiples
of the buffer size? that seems unlikely to me (a) because of short
writes and (b) musl just does a writev() of what's in the buffer and
what it was just given in case of a larger-than-buffer write and (c)
we already changed the openbsd fread() to read directly into the
caller's buffer regardless of size, so the equivalent behavior on the
_write_ side seems reasonable to me?


1. BUFSIZ is only 1024 bytes on the three BSDs and Android/iOS. glibc uses 8KiB.


sys/sched.h: conceal schedstate_percpu definition from userspace

2023-10-29 Thread Scott Cheloha
struct schedstate_percpu contains clockintr pointers.  struct
clockintr is not defined int userspace, so we need to conceal
the schedstate_percpu definition from userspace.  Nothing in
base userspace currently uses the struct.

I think we should leave  in place for now.  Something
might depend upon it to compile.  We could circle back move it under
_KERNEL in a separate patch.

Preferences?  ok?

Index: sched.h
===
RCS file: /cvs/src/sys/sys/sched.h,v
diff -u -p -r1.67 sched.h
--- sched.h 24 Oct 2023 13:20:11 -  1.67
+++ sched.h 29 Oct 2023 20:44:50 -
@@ -88,6 +88,15 @@
 #define CP_IDLE5
 #define CPUSTATES  6
 
+struct cpustats {
+   uint64_tcs_time[CPUSTATES]; /* CPU state statistics */
+   uint64_tcs_flags;   /* see below */
+};
+
+#define CPUSTATS_ONLINE0x0001  /* CPU is schedulable */
+
+#ifdef _KERNEL
+
 #defineSCHED_NQS   32  /* 32 run queues. */
 
 struct clockintr;
@@ -123,15 +132,6 @@ struct schedstate_percpu {
 * without delay */
u_char spc_smrgp;   /* this CPU's view of grace period */
 };
-
-struct cpustats {
-   uint64_tcs_time[CPUSTATES]; /* CPU state statistics */
-   uint64_tcs_flags;   /* see below */
-};
-
-#define CPUSTATS_ONLINE0x0001  /* CPU is schedulable */
-
-#ifdef _KERNEL
 
 /* spc_flags */
 #define SPCF_SEENRR 0x0001  /* process has seen roundrobin() */



Interest in patches for changing /usr/games/* to use $XDG_* directories?

2023-10-29 Thread Tim Chase
Howdy,

There was discussion on Reddit[1] asking about how to get various
games to use alternate directories for the various files, particularly
the FreeDesktop.org $XDG_* directory layout.  These mostly break
down into a couple categories:

- save-games in $XDG_DATA_HOME

- configuration in $XDG_CONFIG_DIR (e.g. .huntd.conf)

- high-score files in $XDG_STATE_HOME

Any such modifications would likely need to do some fallback logic,
attempting the $XDG file first, and falling back to $HOME in a
second attempt.

I too appreciate the goals of moving such files away from cluttering
$HOME into well-known subdirectories.

Would there be any interest in such XDG-dir patches for the relevant
games[3] if I attempted to implement them and provide such a
patch-set?

-tkc

[1]
https://www.reddit.com/r/openbsd/comments/17j0zi9/change_game_score_default_directory/

[2]
https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html

[3]
atc(6), battlestar(6), canfield(6), hack(6), hunt(6), robots(6),
snake(6), and tetris(6) all seem to refer to $HOME








Re: snmpd: Move SNMPv2-SMI::snmpV2 to application_internal.c

2023-10-29 Thread Martijn van Duren
On Sun, 2023-10-29 at 20:11 +0100, Martijn van Duren wrote:
> Similar reasoning and questions as the move of SNMPv2-MIB::snmp.
> This moves SNMP-FRAMEWORK-MIB::snmpEngine and
> SNMP-USER-BASED-SM-MIB::usmStats from mib.c to application_internal.c,
> under SNMPv2-SMI::snmpV2. The reason for this broader umbrella is that
> other backends have no business fiddling around under this region.
> Since different backends bite eachother with exclusive regions, both
> snmpEngine and usmStats must be moved at the same time.
> 
> This change also breaks snmpd.sh from regress. This places a few custom
> OIDs in the usmUserTable, which is under snmpV2. Since these tests are
> mostly there to test snmp(1) I think it's worth removing these tests and
> build a proper snmp(1) regress suite another day (regress part in next
> mail)
> 
> OK?
> 
> martijn@
> 
Here's the regress diff

Index: snmpd.sh
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/snmpd.sh,v
retrieving revision 1.18
diff -u -p -r1.18 snmpd.sh
--- snmpd.sh19 Jan 2022 11:02:38 -  1.18
+++ snmpd.sh29 Oct 2023 19:14:33 -
@@ -271,8 +271,6 @@ read-write community non-default-rw
 oid 1.3.6.1.4.1.30155.42.1 name myName read-only string "humppa"
 oid 1.3.6.1.4.1.30155.42.2 name myStatus read-only integer 1
 # No need to place a full index, we just need the object
-oid 1.3.6.1.6.3.15.1.2.2.1.3.1 name Reyk read-only string "Reyk Fl${oe}ter"
-oid 1.3.6.1.6.3.15.1.2.2.1.3.2 name broken read-only string "br${boe}ken"
 EOF
 
 (cd ${OBJDIR} && nohup snmpd -dvf ./snmpd.conf > snmpd.log 2>&1) &
@@ -344,38 +342,6 @@ fi
 #  echo "Setting of a ro custom oid test unexpectedly succeeded."
 #  FAILED=1
 #fi
-
-snmp_command="snmp get -Oqv -v2c -c non-default-rw localhost \
-usmUserSecurityName.1.0"
-echo === $snmp_command
-reyk="$(eval LC_ALL=en_US.UTF-8 $snmp_command)"
-if [ "$reyk" != "Reyk Fl${oe}ter" ]
-then
-   echo "Printing of UTF-8 string in UTF-8 locale failed"
-   FAILED=1
-fi
-reyk="$(eval LC_ALL=C $snmp_command)"
-if [ "$reyk" != "Reyk Fl.ter" ]
-then
-   echo "Printing of UTF-8 string in C locale failed"
-   FAILED=1
-fi
-
-snmp_command="snmp get -Oqv -v2c -c non-default-rw localhost \
-usmUserSecurityName.2.0"
-echo === $snmp_command
-broken="$(eval LC_ALL=en_US.UTF-8 $snmp_command)"
-if [ "$broken" != "br${replacement}ken" ]
-then
-   echo "Printing of UTF-8 replacement character failed"
-   FAILED=1
-fi
-broken="$(eval LC_ALL=C $snmp_command)"
-if [ "$broken" != "br?ken" ]
-then
-   echo "Printing of question mark in C locale failed"
-   FAILED=1
-fi
 
 kill $(pgrep snmpd) >/dev/null 2>&1
 



Re: azalia: use HDMI as second fallback

2023-10-29 Thread Peter Hessler
Yes please!


On 2023 Oct 29 (Sun) at 14:55:09 +0100 (+0100), Christopher Zimmermann wrote:
:Hi,
:
:for me azalia HDMI audio playback works fine. According to [1] it had or
:still has problems. For machines where the default audio should not be
:(possibly broken) rsnd/0, but rsnd/1 this can be configured in sndiod or
:AUDIODEVICE.
:
:This diff would attach azalia even when there are only HDMI codecs available.
:
:
:Christopher
:
:[1] 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/azalia.c?rev=1.155=text/x-cvsweb-markup
:
:
:
:Index: azalia.c
:===
:RCS file: /cvs/src/sys/dev/pci/azalia.c,v
:retrieving revision 1.284
:diff -u -p -r1.284 azalia.c
:--- azalia.c   30 Jul 2023 08:46:03 -  1.284
:+++ azalia.c   29 Oct 2023 13:39:04 -
:@@ -950,28 +950,19 @@ azalia_init_codecs(azalia_t *az)
:   return(1);
:   }
:-  /* Use the first codec capable of analog I/O.  If there are none,
:-   * use the first codec capable of digital I/O.  Skip HDMI codecs.
:-   */
:+  /* Prefer analog over digital codecs over HDMI codecs. */
:   c = -1;
:   for (i = 0; i < az->ncodecs; i++) {
:-  codec = >codecs[i];
:-  if ((codec->audiofunc < 0) ||
:-  (codec->codec_type == AZ_CODEC_TYPE_HDMI))
:-  continue;
:-  if (codec->codec_type == AZ_CODEC_TYPE_DIGITAL) {
:-  if (c < 0)
:-  c = i;
:-  } else {
:+  if (az->codecs[i].audiofunc >= 0 &&
:+  (c = -1 ||
:+  az->codecs[i].codec_type < az->codecs[c].codec_type))
:   c = i;
:-  break;
:-  }
:   }
:-  az->codecno = c;
:-  if (az->codecno < 0) {
:+  if (c == -1) {
:   printf("%s: no supported codecs\n", XNAME(az));
:   return(1);
:   }
:+  az->codecno = c;
:   printf("%s: codecs: ", XNAME(az));
:   for (i = 0; i < az->ncodecs; i++) {
:



snmpd: Move SNMPv2-MIB::system to application_internal.c

2023-10-29 Thread Martijn van Duren
This one is a little more involved than the previous 2. Since the
snmpd.conf's system variables are stored inside a 'struct oid', I want
to move these into their own struct inside 'struct snmpd'.

If we also move mib_tree to smi.c we can completely remove mib.c
As stated in my previous mail, this removes the sysORTable. I'll add
these back later in a proper way. (mib.c removal omitted from diff)

With these the only consumer of mps.c left is the oid keyword of
snmpd.conf.

OK?

martijn@

diff --git a/Makefile b/Makefile
index 3522a38..98bdda5 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ MAN=snmpd.8 snmpd.conf.5
 SRCS=  parse.y log.c snmpe.c application.c application_legacy.c \
application_blocklist.c application_internal.c \
application_agentx.c ax.c \
-   mps.c trap.c mib.c smi.c snmpd.c \
+   mps.c trap.c smi.c snmpd.c \
proc.c usm.c traphandler.c util.c
 
 LDADD= -levent -lutil -lcrypto
diff --git a/application_internal.c b/application_internal.c
index 36ab8ef..e682cc7 100644
--- a/application_internal.c
+++ b/application_internal.c
@@ -47,6 +47,7 @@ void appl_internal_getnext(struct appl_backend *, int32_t, 
int32_t,
 struct ber_element *appl_internal_snmp(struct ber_oid *);
 struct ber_element *appl_internal_engine(struct ber_oid *);
 struct ber_element *appl_internal_usmstats(struct ber_oid *);
+struct ber_element *appl_internal_system(struct ber_oid *);
 struct appl_internal_object *appl_internal_object_parent(struct ber_oid *);
 int appl_internal_object_cmp(struct appl_internal_object *,
 struct appl_internal_object *);
@@ -73,6 +74,15 @@ RB_PROTOTYPE_STATIC(appl_internal_objects, 
appl_internal_object, entry,
 void
 appl_internal_init(void)
 {
+   appl_internal_region((MIB_system));
+   appl_internal_object((MIB_sysDescr), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysOID), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysUpTime), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysContact), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysName), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysLocation), appl_internal_system, NULL);
+   appl_internal_object((MIB_sysServices), appl_internal_system, NULL);
+
appl_internal_region((MIB_snmp));
appl_internal_object((MIB_snmpInPkts), appl_internal_snmp, NULL);
appl_internal_object((MIB_snmpOutPkts), appl_internal_snmp, NULL);
@@ -441,6 +451,30 @@ appl_internal_usmstats(struct ber_oid *oid)
return value;
 }
 
+struct ber_element *
+appl_internal_system(struct ber_oid *oid)
+{
+   struct snmp_system *s = _env->sc_system;
+   struct ber_element *value = NULL;
+
+   if (ober_oid_cmp((MIB_sysDescr, 0), oid) == 0)
+   return ober_add_string(NULL, s->sys_descr);
+   else if (ober_oid_cmp((MIB_sysOID, 0), oid) == 0)
+   return ober_add_oid(NULL, >sys_oid);
+   else if (ober_oid_cmp((MIB_sysUpTime, 0), oid) == 0) {
+   value = ober_add_integer(NULL, smi_getticks());
+   ober_set_header(value, BER_CLASS_APPLICATION, SNMP_T_TIMETICKS);
+   } else if (ober_oid_cmp((MIB_sysContact, 0), oid) == 0)
+   return ober_add_string(NULL, s->sys_contact);
+   else if (ober_oid_cmp((MIB_sysName, 0), oid) == 0)
+   return ober_add_string(NULL, s->sys_name);
+   else if (ober_oid_cmp((MIB_sysLocation, 0), oid) == 0)
+   return ober_add_string(NULL, s->sys_location);
+   else if (ober_oid_cmp((MIB_sysServices, 0), oid) == 0)
+   return ober_add_integer(NULL, s->sys_services);
+   return value;
+}
+
 struct appl_internal_object *
 appl_internal_object_parent(struct ber_oid *oid)
 {
diff --git a/parse.y b/parse.y
index 8f9bad0..7ae7ce1 100644
--- a/parse.y
+++ b/parse.y
@@ -28,6 +28,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -760,28 +761,87 @@ system: SYSTEM sysmib
;
 
 sysmib : CONTACT STRING{
-   struct ber_oid   o = OID(MIB_sysContact);
-   mps_set(, $2, strlen($2));
+   if (conf->sc_system.sys_contact[0] != '\0') {
+   yyerror("system contact already defined");
+   free($2);
+   YYERROR;
+   }
+   if (strlcpy(conf->sc_system.sys_contact, $2,
+   sizeof(conf->sc_system.sys_contact)) >=
+   sizeof(conf->sc_system.sys_contact)) {
+   yyerror("system contact too long");
+   free($2);
+   YYERROR;
+   }
+   free($2);
}
| DESCR 

snmpd: Move SNMPv2-SMI::snmpV2 to application_internal.c

2023-10-29 Thread Martijn van Duren
Similar reasoning and questions as the move of SNMPv2-MIB::snmp.
This moves SNMP-FRAMEWORK-MIB::snmpEngine and
SNMP-USER-BASED-SM-MIB::usmStats from mib.c to application_internal.c,
under SNMPv2-SMI::snmpV2. The reason for this broader umbrella is that
other backends have no business fiddling around under this region.
Since different backends bite eachother with exclusive regions, both
snmpEngine and usmStats must be moved at the same time.

This change also breaks snmpd.sh from regress. This places a few custom
OIDs in the usmUserTable, which is under snmpV2. Since these tests are
mostly there to test snmp(1) I think it's worth removing these tests and
build a proper snmp(1) regress suite another day (regress part in next
mail)

OK?

martijn@

diff --git a/application_internal.c b/application_internal.c
index c76e8ef..36ab8ef 100644
--- a/application_internal.c
+++ b/application_internal.c
@@ -45,6 +45,8 @@ void appl_internal_get(struct appl_backend *, int32_t, 
int32_t, const char *,
 void appl_internal_getnext(struct appl_backend *, int32_t, int32_t,
 const char *, struct appl_varbind *);
 struct ber_element *appl_internal_snmp(struct ber_oid *);
+struct ber_element *appl_internal_engine(struct ber_oid *);
+struct ber_element *appl_internal_usmstats(struct ber_oid *);
 struct appl_internal_object *appl_internal_object_parent(struct ber_oid *);
 int appl_internal_object_cmp(struct appl_internal_object *,
 struct appl_internal_object *);
@@ -128,6 +130,29 @@ appl_internal_init(void)
NULL);
appl_internal_object((MIB_snmpProxyDrops), appl_internal_snmp,
NULL);
+
+   appl_internal_region((MIB_snmpV2));
+   appl_internal_object((MIB_snmpEngineID), appl_internal_engine,
+   NULL);
+   appl_internal_object((MIB_snmpEngineBoots), appl_internal_engine,
+   NULL);
+   appl_internal_object((MIB_snmpEngineTime), appl_internal_engine,
+   NULL);
+   appl_internal_object((MIB_snmpEngineMaxMsgSize),
+   appl_internal_engine, NULL);
+
+   appl_internal_object((MIB_usmStatsUnsupportedSecLevels),
+   appl_internal_usmstats, NULL);
+   appl_internal_object((MIB_usmStatsNotInTimeWindow),
+   appl_internal_usmstats, NULL);
+   appl_internal_object((MIB_usmStatsUnknownUserNames),
+   appl_internal_usmstats, NULL);
+   appl_internal_object((MIB_usmStatsUnknownEngineId),
+   appl_internal_usmstats, NULL);
+   appl_internal_object((MIB_usmStatsWrongDigests),
+   appl_internal_usmstats, NULL);
+   appl_internal_object((MIB_usmStatsDecryptionErrors),
+   appl_internal_usmstats, NULL);
 }
 
 void
@@ -376,6 +401,46 @@ appl_internal_snmp(struct ber_oid *oid)
return value;
 }
 
+struct ber_element *
+appl_internal_engine(struct ber_oid *oid)
+{
+   if (ober_oid_cmp((MIB_snmpEngineID, 0), oid) == 0)
+   return ober_add_nstring(NULL, snmpd_env->sc_engineid,
+   snmpd_env->sc_engineid_len);
+   else if (ober_oid_cmp((MIB_snmpEngineBoots, 0), oid) == 0)
+   return ober_add_integer(NULL, snmpd_env->sc_engine_boots);
+   else if (ober_oid_cmp((MIB_snmpEngineTime, 0), oid) == 0)
+   return ober_add_integer(NULL, snmpd_engine_time());
+   else if (ober_oid_cmp((MIB_snmpEngineMaxMsgSize, 0), oid) == 0)
+   return ober_add_integer(NULL, READ_BUF_SIZE);
+   return NULL;
+}
+
+struct ber_element *
+appl_internal_usmstats(struct ber_oid *oid)
+{
+   struct snmp_stats *stats = _env->sc_stats;
+   struct ber_element *value = NULL;
+
+   if (ober_oid_cmp((MIB_usmStatsUnsupportedSecLevels, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmbadseclevel);
+   else if (ober_oid_cmp((MIB_usmStatsNotInTimeWindow, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmtimewindow);
+   else if (ober_oid_cmp((MIB_usmStatsUnknownUserNames, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmnosuchuser);
+   else if (ober_oid_cmp((MIB_usmStatsUnknownEngineId, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmnosuchengine);
+   else if (ober_oid_cmp((MIB_usmStatsWrongDigests, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmwrongdigest);
+   else if (ober_oid_cmp((MIB_usmStatsDecryptionErrors, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_usmdecrypterr);
+
+   if (value != NULL)
+   ober_set_header(value, BER_CLASS_APPLICATION, SNMP_T_COUNTER32);
+
+   return value;
+}
+
 struct appl_internal_object *
 appl_internal_object_parent(struct ber_oid *oid)
 {
diff --git a/mib.c b/mib.c
index ef34983..481460c 100644
--- a/mib.c
+++ b/mib.c
@@ -216,79 +216,6 @@ mib_sysor(struct oid *oid, struct ber_oid *o, struct 
ber_element **elm)
return (0);
 }
 
-/*
- * Defined in SNMP-USER-BASED-SM-MIB.txt (RFC 3414)

snmpd: Move SNMPv2-MIB::snmp to application_internal.c

2023-10-29 Thread Martijn van Duren
I think the subject says it all. I'm not a 100% convinced that 
appl_internal_snmp()
should live in application_internal.c, maybe snmpe.c, where these metrics are 
set
is a better place. But since things come from mib.c, let's keep it simply here 
and
we can always shuffle the deckchairs later.

OK?

martijn@

diff --git a/application_internal.c b/application_internal.c
index ff9611e..c76e8ef 100644
--- a/application_internal.c
+++ b/application_internal.c
@@ -44,6 +44,7 @@ void appl_internal_get(struct appl_backend *, int32_t, 
int32_t, const char *,
 struct appl_varbind *);
 void appl_internal_getnext(struct appl_backend *, int32_t, int32_t,
 const char *, struct appl_varbind *);
+struct ber_element *appl_internal_snmp(struct ber_oid *);
 struct appl_internal_object *appl_internal_object_parent(struct ber_oid *);
 int appl_internal_object_cmp(struct appl_internal_object *,
 struct appl_internal_object *);
@@ -70,6 +71,63 @@ RB_PROTOTYPE_STATIC(appl_internal_objects, 
appl_internal_object, entry,
 void
 appl_internal_init(void)
 {
+   appl_internal_region((MIB_snmp));
+   appl_internal_object((MIB_snmpInPkts), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpOutPkts), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpInBadVersions), appl_internal_snmp,
+  NULL);
+   appl_internal_object((MIB_snmpInBadCommunityNames),
+  appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpInBadCommunityUses),
+  appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpInASNParseErrs), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInTooBigs), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpInNoSuchNames), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInBadValues), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInReadOnlys), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInReadOnlys), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInGenErrs), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpInTotalReqVars), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInTotalSetVars), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInGetRequests), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInGetNexts), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInSetRequests), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInGetResponses), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpInTraps), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpOutTooBigs), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutNoSuchNames), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutBadValues), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutGenErrs), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutGetRequests), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutGetNexts), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutSetRequests), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutGetResponses), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpOutTraps), appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpEnableAuthenTraps),
+   appl_internal_snmp, NULL);
+   appl_internal_object((MIB_snmpSilentDrops), appl_internal_snmp,
+   NULL);
+   appl_internal_object((MIB_snmpProxyDrops), appl_internal_snmp,
+   NULL);
 }
 
 void
@@ -245,6 +303,79 @@ appl_internal_getnext(struct appl_backend *backend,
appl_response(backend, requestid, APPL_ERROR_GENERR, i + 1, vblist);
 }
 
+struct ber_element *
+appl_internal_snmp(struct ber_oid *oid)
+{
+   struct snmp_stats *stats = _env->sc_stats;
+   struct ber_element *value = NULL;
+
+   if (ober_oid_cmp(oid, (MIB_snmpEnableAuthenTraps, 0)) == 0)
+   return ober_add_integer(NULL,
+   stats->snmp_enableauthentraps ? 1 : 2);
+   if (ober_oid_cmp((MIB_snmpInPkts, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_inpkts);
+   else if (ober_oid_cmp((MIB_snmpOutPkts, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_outpkts);
+   else if (ober_oid_cmp((MIB_snmpInBadVersions, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_inbadversions);
+   else if (ober_oid_cmp((MIB_snmpInBadCommunityNames, 0), oid) == 0)
+   value = ober_add_integer(NULL, stats->snmp_inbadcommunitynames);
+   else if (ober_oid_cmp((MIB_snmpInBadCommunityUses, 

snmpd: introduce application_internal.c

2023-10-29 Thread Martijn van Duren
In my quest to move away from the old code of mps.c and mib.c, here's
the next step: Introduce a new application_internal.c. This is basically
a very simplified version of the libagentx interface.
We register a region at a certain point with application.c, and use an
internal database of objects that can be walked like leaves that
application.c doesn't know about.

Apart from the code being a little easier to read than mps.c (imho), this
new interface has a few advantages over application_legacy.c:
- no objects are registered inside application.c as instance. Since
  application.c needs to account for overlapping regions it's quite a
  bit slower in finding the correct regions. appl_internal just deals
  with objects, which means we can just walk the leaves. This also makes
  the startup logs a bit quieter.
- Since we can now register an entire region we can block reserved
  regions from being claimed by other backends.
- We can now better distinguish between noSuchObject and
  noSuchInstance for GetRequests.

Since our only internal table atm is sysORTable, which only contains
invalid values I left out struct appl_internal_object's getnext() call,
but it should be "easy" enough to add later (I have some code here that
needs a bit of an extra shine before sending out). getnext() is going to
be called for non-scalars, where struct appl_internal_object can't know
the index.

This diff just adds the plumbing. Migrating the remaining mib.c code
will be done in 3 followup diffs, after which mib.c can be removed.
mps.c does need to stick around for a little longer, because of the
oid keyword in snmpd.conf.

OK?

martijn@

diff --git a/Makefile b/Makefile
index 261e89b..3522a38 100644
--- a/Makefile
+++ b/Makefile
@@ -3,7 +3,7 @@
 PROG=  snmpd
 MAN=   snmpd.8 snmpd.conf.5
 SRCS=  parse.y log.c snmpe.c application.c application_legacy.c \
-   application_blocklist.c \
+   application_blocklist.c application_internal.c \
application_agentx.c ax.c \
mps.c trap.c mib.c smi.c snmpd.c \
proc.c usm.c traphandler.c util.c
diff --git a/application.c b/application.c
index c36f059..2359943 100644
--- a/application.c
+++ b/application.c
@@ -159,6 +159,7 @@ void
 appl_init(void)
 {
appl_blocklist_init();
+   appl_internal_init();
appl_legacy_init();
appl_agentx_init();
 }
@@ -169,6 +170,7 @@ appl_shutdown(void)
struct appl_context *ctx, *tctx;
 
appl_blocklist_shutdown();
+   appl_internal_shutdown();
appl_legacy_shutdown();
appl_agentx_shutdown();
 
diff --git a/application.h b/application.h
index 39e7b5f..b9c95c0 100644
--- a/application.h
+++ b/application.h
@@ -147,3 +147,7 @@ void appl_agentx_backend(int);
 /* application_blocklist.c */
 voidappl_blocklist_init(void);
 voidappl_blocklist_shutdown(void);
+
+/* application_internal.c */
+voidappl_internal_init(void);
+voidappl_internal_shutdown(void);
diff --git a/application_internal.c b/application_internal.c
new file mode 100644
index 000..ff9611e
--- /dev/null
+++ b/application_internal.c
@@ -0,0 +1,271 @@
+/* $OpenBSD$   */
+
+/*
+ * Copyright (c) 2023 Martijn van Duren 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+
+#include 
+#include 
+
+#include "application.h"
+#include "log.h"
+#include "mib.h"
+#include "smi.h"
+#include "snmpd.h"
+
+struct appl_internal_object {
+   struct ber_oid   oid;
+   struct ber_element *(*get)(struct ber_oid *);
+   /* No getnext means the object is scalar */
+   struct ber_element *(*getnext)(int8_t, struct ber_oid *);
+
+   RB_ENTRY(appl_internal_object)   entry;
+};
+
+void appl_internal_region(struct ber_oid *);
+void appl_internal_object(struct ber_oid *,
+struct ber_element *(*)(struct ber_oid *),
+struct ber_element *(*)(int8_t, struct ber_oid *));
+void appl_internal_get(struct appl_backend *, int32_t, int32_t, const char *,
+struct appl_varbind *);
+void appl_internal_getnext(struct appl_backend *, int32_t, int32_t,
+const char *, struct appl_varbind *);
+struct appl_internal_object *appl_internal_object_parent(struct ber_oid *);
+int 

Re: Removing syscall(2) from libc and kernel

2023-10-29 Thread Theo de Raadt
Here's the newest version of all 3 diffs:

- kernel diff, can be tested alone
- userland diff, can be tested alone
- syscalls.master cleanup, would happen afterwards, and conflicts a bit.

the ports team has repaired a majority of the syscall fallout in non-go
programs.  The effort for go is still ongoing, I do not think it will take
very long.  Fingers crossed.

go is the worst because the unix-hater club at the root of their
development team (this is my interpretation of what has happened) showed their
hate to ioctl() -- it is obviously such an evil type-unsafe call that people
need to be shown the even more evil type-unsafe syscall().  the application
community became aware of syscall() and embraced it for this purpose, then
when glibc refused to add new syscall stubs, a related community picked up
syscall() use there also, for all sorts of crazy seperate reasions.  On linux,
syscall() is way more of a shitshow [go check their manual page for a subset
of the details that must be kept in mind] than on *BSD operating systems.
*BSD trees handled the 64-bit transition in a cleaner way: first side-stepping
the stdarg problem using a seperate __syscall() function and using pads, later
solving it by changing the ABI so the pads were not needed and the special 
casesn
are decomposed in the kernel.  This was also easier because we were able to
stop userland from calling __syscall() initially and syscall() as a later step.

Index: sys/arch/alpha/alpha/trap.c
===
RCS file: /cvs/src/sys/arch/alpha/alpha/trap.c,v
diff -u -p -u -r1.108 trap.c
--- sys/arch/alpha/alpha/trap.c 8 Mar 2023 04:43:07 -   1.108
+++ sys/arch/alpha/alpha/trap.c 28 Oct 2023 15:11:05 -
@@ -497,17 +497,15 @@ dopanic:
  * a3, and v0 from the frame before returning to the user process.
  */
 void
-syscall(code, framep)
-   u_int64_t code;
-   struct trapframe *framep;
+syscall(u_int64_t code, struct trapframe *framep)
 {
-   const struct sysent *callp;
+   const struct sysent *callp = sysent;
struct proc *p;
-   int error, indirect = -1;
+   int error = ENOSYS;
u_int64_t opc;
u_long rval[2];
u_long args[10];/* XXX */
-   u_int hidden, nargs;
+   u_int nargs;
 
atomic_add_int(, 1);
p = curproc;
@@ -515,24 +513,11 @@ syscall(code, framep)
framep->tf_regs[FRAME_SP] = alpha_pal_rdusp();
opc = framep->tf_regs[FRAME_PC] - 4;
 
-   switch(code) {
-   case SYS_syscall:
-   indirect = code;
-   code = framep->tf_regs[FRAME_A0];
-   hidden = 1;
-   break;
-   default:
-   hidden = 0;
-   }
-
-   error = 0;
-   callp = sysent;
-   if (code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
-   nargs = callp->sy_narg + hidden;
+   nargs = callp->sy_narg;
switch (nargs) {
default:
if (nargs > 10) /* XXX */
@@ -559,7 +544,7 @@ syscall(code, framep)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args + hidden, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
Index: sys/arch/amd64/amd64/locore.S
===
RCS file: /cvs/src/sys/arch/amd64/amd64/locore.S,v
diff -u -p -u -r1.141 locore.S
--- sys/arch/amd64/amd64/locore.S   24 Oct 2023 13:20:09 -  1.141
+++ sys/arch/amd64/amd64/locore.S   28 Oct 2023 15:11:05 -
@@ -508,6 +508,7 @@ ENTRY(savectx)
lfence
 END(savectx)
 
+// XXX this should not behave like a nop
 IDTVEC(syscall32)
sysret  /* go away please */
 END(Xsyscall32)
Index: sys/arch/amd64/amd64/trap.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
diff -u -p -u -r1.101 trap.c
--- sys/arch/amd64/amd64/trap.c 5 Jul 2023 12:58:55 -   1.101
+++ sys/arch/amd64/amd64/trap.c 28 Oct 2023 15:11:05 -
@@ -553,7 +553,7 @@ syscall(struct trapframe *frame)
caddr_t params;
const struct sysent *callp;
struct proc *p;
-   int error, indirect = -1;
+   int error = ENOSYS;
size_t argsize, argoff;
register_t code, args[9], rval[2], *argp;
 
@@ -570,26 +570,9 @@ syscall(struct trapframe *frame)
argp = [0];
argoff = 0;
 
-   switch (code) {
-   case SYS_syscall:
-   /*
-* Code is first argument, followed by actual args.
-*/
-   indirect = code;
-   code = frame->tf_rdi;
-   argp = [1];
-   argoff = 1;
-   break;
-   default:
-  

vmd(8) vionet: adapt zero-copy approach from vioblk

2023-10-29 Thread Dave Voutila
Recently, I rewrote parts of the emulated virtio block device in vmd(8)
to fix issues found bringing up NetBSD as a guest. In the process, the
rewrite introduced the chance to do zero-copy transfers for raw block
devices. The diff below brings that design to the virtio network device.

In summary:

  - flips the logic to only read from the tap(4) when there's space in
the rx virtqueue instead of just when the tap is readable

  - uses readv/writev when not needing to enforce locked lladdr or do
local interface packet inspection for DHCP intercept

  - redesigns packet injection for DHCP replies by using a pipe,
allowing it to be handled by the rx event path

  - puts the device into "needs reset" state under error conditions so
the driver will know something is borked

  - drops the include of sys/param.h as it switches the static rx buffer
to be based on maximum tx size and not PAGE_SIZE

Don't expect throughput increases, but you may notice improved tail
latency and decreased cpu utilization under networking load, especially
if you're not using vmd(8)'s "local" interfaces.

ok's, feedback?

-dv

diffstat refs/heads/master refs/heads/vmd-vionet-rewrite
 M  usr.sbin/vmd/vionet.c  |  445+  301-
 M  usr.sbin/vmd/virtio.h  |0+7-

2 files changed, 445 insertions(+), 308 deletions(-)

diff refs/heads/master refs/heads/vmd-vionet-rewrite
commit - dae64e9e225d0c4320fe266e6b924a1630222416
commit + 7743ecfaef9332d7e7df5932ff472c6a00225414
blob - 740296a125e15f905e370e84ecf07163c58e2c99
blob + e08383130c5f19a25dcc3620cb31f4e5bab09b0d
--- usr.sbin/vmd/vionet.c
+++ usr.sbin/vmd/vionet.c
@@ -16,9 +16,8 @@
  * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
  * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
-#include 
-#include  /* PAGE_SIZE */
 #include 
+#include 

 #include 
 #include 
@@ -26,7 +25,6 @@
 #include 
 #include 
 #include 
-#include 

 #include 
 #include 
@@ -36,7 +34,6 @@
 #include 

 #include "atomicio.h"
-#include "pci.h"
 #include "virtio.h"
 #include "vmd.h"

@@ -47,20 +44,36 @@
 extern char *__progname;
 extern struct vmd_vm *current_vm;

-/* Device Globals */
-struct event ev_tap;
+struct packet {
+   uint8_t *buf;
+   size_t   len;
+};

-static int vionet_rx(struct vionet_dev *);
+static int vionet_rx(struct vionet_dev *, int);
+static ssize_t vionet_rx_copy(struct vionet_dev *, int, const struct iovec *,
+int, size_t);
+static ssize_t vionet_rx_zerocopy(struct vionet_dev *, int,
+const struct iovec *, int);
 static void vionet_rx_event(int, short, void *);
 static uint32_t handle_io_read(struct viodev_msg *, struct virtio_dev *,
 int8_t *);
 static int handle_io_write(struct viodev_msg *, struct virtio_dev *);
-void vionet_notify_rx(struct virtio_dev *);
-int vionet_notifyq(struct virtio_dev *);
+static int vionet_notify_tx(struct virtio_dev *);
+static int vionet_notifyq(struct virtio_dev *);

 static void dev_dispatch_vm(int, short, void *);
 static void handle_sync_io(int, short, void *);

+/* Device Globals */
+struct event ev_tap;
+struct event ev_inject;
+int pipe_inject[2];
+#define READ   0
+#define WRITE  1
+struct iovec iov_rx[VIONET_QUEUE_SIZE];
+struct iovec iov_tx[VIONET_QUEUE_SIZE];
+
+
 __dead void
 vionet_main(int fd, int fd_vmm)
 {
@@ -79,6 +92,10 @@ vionet_main(int fd, int fd_vmm)
if (pledge("stdio vmm proc", NULL) == -1)
fatal("pledge");

+   /* Initialize iovec arrays. */
+   memset(iov_rx, 0, sizeof(iov_rx));
+   memset(iov_tx, 0, sizeof(iov_tx));
+
/* Receive our vionet_dev, mostly preconfigured. */
sz = atomicio(read, fd, , sizeof(dev));
if (sz != sizeof(dev)) {
@@ -153,6 +170,12 @@ vionet_main(int fd, int fd_vmm)
}
}

+   /* Initialize our packet injection pipe. */
+   if (pipe2(pipe_inject, O_NONBLOCK) == -1) {
+   log_warn("%s: injection pipe", __func__);
+   goto fail;
+   }
+
/* Initialize libevent so we can start wiring event handlers. */
event_init();

@@ -171,6 +194,12 @@ vionet_main(int fd, int fd_vmm)
event_set(_tap, vionet->data_fd, EV_READ | EV_PERSIST,
vionet_rx_event, );

+   /* Add an event for injected packets. */
+   log_debug("%s: wiring in packet injection handler (fd=%d)", __func__,
+   pipe_inject[READ]);
+   event_set(_inject, pipe_inject[READ], EV_READ | EV_PERSIST,
+   vionet_rx_event, );
+
/* Configure our sync channel event handler. */
log_debug("%s: wiring in sync channel handler (fd=%d)", __func__,
dev.sync_fd);
@@ -209,6 +238,8 @@ vionet_main(int fd, int fd_vmm)
close_fd(dev.sync_fd);
close_fd(dev.async_fd);
close_fd(vionet->data_fd);
+   close_fd(pipe_inject[READ]);
+   close_fd(pipe_inject[WRITE]);
_exit(ret);
/* NOTREACHED */

azalia: use HDMI as second fallback

2023-10-29 Thread Christopher Zimmermann

Hi,

for me azalia HDMI audio playback works fine. According to [1] it had or 
still has problems. For machines where the default audio should not be 
(possibly broken) rsnd/0, but rsnd/1 this can be configured in sndiod or 
AUDIODEVICE.


This diff would attach azalia even when there are only HDMI codecs 
available.



Christopher

[1] 
https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/sys/dev/pci/azalia.c?rev=1.155=text/x-cvsweb-markup




Index: azalia.c
===
RCS file: /cvs/src/sys/dev/pci/azalia.c,v
retrieving revision 1.284
diff -u -p -r1.284 azalia.c
--- azalia.c30 Jul 2023 08:46:03 -  1.284
+++ azalia.c29 Oct 2023 13:39:04 -
@@ -950,28 +950,19 @@ azalia_init_codecs(azalia_t *az)
return(1);
}
 
-	/* Use the first codec capable of analog I/O.  If there are none,

-* use the first codec capable of digital I/O.  Skip HDMI codecs.
-*/
+   /* Prefer analog over digital codecs over HDMI codecs. */
c = -1;
for (i = 0; i < az->ncodecs; i++) {
-   codec = >codecs[i];
-   if ((codec->audiofunc < 0) ||
-   (codec->codec_type == AZ_CODEC_TYPE_HDMI))
-   continue;
-   if (codec->codec_type == AZ_CODEC_TYPE_DIGITAL) {
-   if (c < 0)
-   c = i;
-   } else {
+   if (az->codecs[i].audiofunc >= 0 &&
+   (c = -1 ||
+   az->codecs[i].codec_type < az->codecs[c].codec_type))
c = i;
-   break;
-   }
}
-   az->codecno = c;
-   if (az->codecno < 0) {
+   if (c == -1) {
printf("%s: no supported codecs\n", XNAME(az));
return(1);
}
+   az->codecno = c;
 
 	printf("%s: codecs: ", XNAME(az));

for (i = 0; i < az->ncodecs; i++) {



Re: relayd.conf.5: less SSL

2023-10-28 Thread Sebastian Benoit
Klemens Nanni(k...@openbsd.org) on 2023.10.26 13:28:42 +:
> On Tue, Oct 24, 2023 at 09:09:21AM +0200, Peter N. M. Hansteen wrote:
> > On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> > > - parse.y still accepting undocumented "ssl" with a warning since 2014
> > > - more "SSL/TLS" instead of "TLS" in manual and code comments
> > 
> > my take would be that while it's fine to streamline the documentation to use
> > the modern terminology, I suspect there may still be ancient configurations
> > out there that use the "ssl" keyword, so removing the last bit of support 
> > for
> > that option should be accompanied by or preceded by a warning on relevant
> > mailing lists or at least in the commit message. 
> > 
> > And I think undeadly.org would be more than happy to help spread the word :)
> 
> current.html entry should do for a deprecated keyword we've been warning
> about for almost ten years...

Yes, please kick it where it belongs.

> I've checked faq/upgrade*.html for previous
> notes, but couldn't find any.

no, because it wasnt removed after 2 releases with the warning.

> Here's a first try, relayd regress is also happy.

ok benno@

> Index: usr.sbin/relayd/parse.y
> ===
> RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
> retrieving revision 1.254
> diff -u -p -r1.254 parse.y
> --- usr.sbin/relayd/parse.y   3 Jul 2023 09:38:08 -   1.254
> +++ usr.sbin/relayd/parse.y   26 Oct 2023 06:07:08 -
> @@ -175,7 +175,7 @@ typedef struct {
>  %token   LOOKUP METHOD MODE NAT NO DESTINATION NODELAY NOTHING ON PARENT 
> PATH
>  %token   PFTAG PORT PREFORK PRIORITY PROTO QUERYSTR REAL REDIRECT RELAY 
> REMOVE
>  %token   REQUEST RESPONSE RETRY QUICK RETURN ROUNDROBIN ROUTE SACK 
> SCRIPT SEND
> -%token   SESSION SOCKET SPLICE SSL STICKYADDR STRIP STYLE TABLE TAG 
> TAGGED TCP
> +%token   SESSION SOCKET SPLICE STICKYADDR STRIP STYLE TABLE TAG TAGGED 
> TCP
>  %token   TIMEOUT TLS TO ROUTER RTLABEL TRANSPARENT URL WITH TTL RTABLE
>  %token   MATCH PARAMS RANDOM LEASTSTATES SRCHASH KEY CERTIFICATE 
> PASSWORD ECDHE
>  %token   EDH TICKETS CONNECTION CONNECTIONS CONTEXT ERRORS STATE CHANGES 
> CHECKS
> @@ -227,21 +227,12 @@ include : INCLUDE STRING{
>   }
>   ;
>  
> -ssltls   : SSL   {
> - log_warnx("%s:%d: %s",
> - file->name, yylval.lineno,
> - "please use the \"tls\" keyword"
> - " instead of \"ssl\"");
> - }
> - | TLS
> - ;
> -
>  opttls   : /*empty*/ { $$ = 0; }
> - | ssltls{ $$ = 1; }
> + | TLS   { $$ = 1; }
>   ;
>  
>  opttlsclient : /*empty*/ { $$ = 0; }
> - | WITH ssltls   { $$ = 1; }
> + | WITH TLS  { $$ = 1; }
>   ;
>  
>  http_type: HTTP  { $$ = 0; }
> @@ -905,7 +896,7 @@ hashkey   : /* empty */   {
>  
>  tablecheck   : ICMP  { table->conf.check = CHECK_ICMP; }
>   | TCP   { table->conf.check = CHECK_TCP; }
> - | ssltls{
> + | TLS   {
>   table->conf.check = CHECK_TCP;
>   conf->sc_conf.flags |= F_TLS;
>   table->conf.flags |= F_TLS;
> @@ -1114,7 +1105,7 @@ protopts_l  : protopts_l protoptsl nl
>   | protoptsl optnl
>   ;
>  
> -protoptsl: ssltls {
> +protoptsl: TLS {
>   if (!(proto->type == RELAY_PROTO_TCP ||
>   proto->type == RELAY_PROTO_HTTP)) {
>   yyerror("can set tls options only for "
> @@ -1122,7 +1113,7 @@ protoptsl   : ssltls {
>   YYERROR;
>   }
>   } tlsflags
> - | ssltls {
> + | TLS {
>   if (!(proto->type == RELAY_PROTO_TCP ||
>   proto->type == RELAY_PROTO_HTTP)) {
>   yyerror("can set tls options only for "
> @@ -2492,7 +2483,6 @@ lookup(char *s)
>   { "socket", SOCKET },
>   { "source-hash",SRCHASH },
>   { "splice", SPLICE },
> - { "ssl",SSL },
>   { "state",  STATE },
>   { "sticky-address", STICKYADDR },
>   { "strip",  STRIP },
> Index: usr.sbin/relayd/relay.c
> ===
> RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
> retrieving revision 1.257
> diff -u -p -r1.257 relay.c
> --- usr.sbin/relayd/relay.c   3 Sep 2023 10:22:03 -   1.257
> +++ usr.sbin/relayd/relay.c   26 Oct 2023 05:49:22 

Re: snmpd: don't move back in oid tree when on/below instance below region

2023-10-28 Thread Theo Buehler
On Sat, Oct 28, 2023 at 09:59:25AM +0200, Martijn van Duren wrote:
> Right now when we register a region with below that an instance we
> can revert back in the tree. When we request below the instance we
> currently use appl_region_next() to find the next region and simply
> set the to be find oid to the the oid of the new region. In the
> situation described above this means that we return the parent
> region and set the oid to that of the parent region, which is smaller
> than the instance oid.
> 
> The easiest fix is to increment the searched oid to nextsibling of the
> instance and let the normal appl_varbind_backend() logic handle it from
> there.

ok tb



snmpd: don't move back in oid tree when on/below instance below region

2023-10-28 Thread Martijn van Duren
Right now when we register a region with below that an instance we
can revert back in the tree. When we request below the instance we
currently use appl_region_next() to find the next region and simply
set the to be find oid to the the oid of the new region. In the
situation described above this means that we return the parent
region and set the oid to that of the parent region, which is smaller
than the instance oid.

The easiest fix is to increment the searched oid to nextsibling of the
instance and let the normal appl_varbind_backend() logic handle it from
there.

OK?

martijn@

Index: usr.sbin/snmpd/application.c
===
RCS file: /cvs/src/usr.sbin/snmpd/application.c,v
retrieving revision 1.25
diff -u -p -r1.25 application.c
--- usr.sbin/snmpd/application.c27 Oct 2023 10:32:11 -  1.25
+++ usr.sbin/snmpd/application.c28 Oct 2023 07:57:23 -
@@ -1368,19 +1368,17 @@ appl_varbind_backend(struct appl_varbind
return -1;
return 0;
}
-   if ((region = appl_region_next(ureq->aru_ctx,
-   &(vb->av_oid), region)) == NULL)
-   goto eomv;
vb->av_oid = region->ar_oid;
+   ober_oid_nextsibling(&(vb->av_oid));
vb->av_include = 1;
+   return appl_varbind_backend(ivb);
}
} else if (cmp == 0) {
if (region->ar_instance && next && !vb->av_include) {
-   if ((region = appl_region_next(ureq->aru_ctx,
-   &(vb->av_oid), region)) == NULL)
-   goto eomv;
vb->av_oid = region->ar_oid;
+   ober_oid_nextsibling(&(vb->av_oid));
vb->av_include = 1;
+   return appl_varbind_backend(ivb);
}
}
ivb->avi_region = region;
Index: regress/usr.sbin/snmpd/Makefile
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/Makefile,v
retrieving revision 1.6
diff -u -p -r1.6 Makefile
--- regress/usr.sbin/snmpd/Makefile 27 Oct 2023 10:26:20 -  1.6
+++ regress/usr.sbin/snmpd/Makefile 28 Oct 2023 07:57:23 -
@@ -170,6 +170,9 @@ BACKEND_TARGETS+=   backend_getnext_stale
 BACKEND_TARGETS+=  backend_getnext_inclusive_backwards
 BACKEND_TARGETS+=  backend_getnext_toofew
 BACKEND_TARGETS+=  backend_getnext_toomany
+BACKEND_TARGETS+=  
backend_getnext_instance_below_region_before_instance
+BACKEND_TARGETS+=  
backend_getnext_instance_below_region_on_instance
+BACKEND_TARGETS+=  
backend_getnext_instance_below_region_below_instance
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_one
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_two
 BACKEND_TARGETS+=  backend_getbulk_nonrep_one_maxrep_one
Index: regress/usr.sbin/snmpd/backend.c
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/backend.c,v
retrieving revision 1.1
diff -u -p -r1.1 backend.c
--- regress/usr.sbin/snmpd/backend.c24 Oct 2023 14:34:40 -  1.1
+++ regress/usr.sbin/snmpd/backend.c28 Oct 2023 07:57:23 -
@@ -2633,6 +2633,173 @@ backend_getnext_toomany(void)
 }
 
 void
+backend_getnext_instance_below_region_before_instance(void)
+{
+   struct sockaddr_storage ss;
+   struct sockaddr *sa = (struct sockaddr *)
+   socklen_t salen;
+   int snmp_s, ax_s;
+   uint32_t sessionid1, sessionid2;
+   struct varbind varbind[] = {
+   {
+   .type = TYPE_NULL,
+   .name = OID_STRUCT(MIB_BACKEND_GETNEXT, 27),
+   .data.int32 = 1
+   },
+   };
+   struct searchrange searchrange[] = {
+   {
+   .start = OID_STRUCT(MIB_BACKEND_GETNEXT, 27),
+   .end = OID_STRUCT(MIB_BACKEND_GETNEXT, 27, 1, 0)
+   },
+   };
+   int32_t requestid;
+   char buf[1024];
+   size_t n;
+
+   ax_s = agentx_connect(axsocket);
+   sessionid1 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 27, 1),
+   "backend_getnext_instance_below_region_before_instance.1");
+   sessionid2 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 27, 2),
+   "backend_getnext_instance_below_region_before_instance.2");
+   agentx_register(ax_s, sessionid1, 0, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 27), 0);
+   agentx_register(ax_s, sessionid2, 1, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 27, 1, 0), 0);
+
+   salen = 

Re: libagentx: don't return responses >= searchrange.end

2023-10-28 Thread Theo Buehler
On Sat, Oct 28, 2023 at 09:41:55AM +0200, Martijn van Duren wrote:
> In most cases when a region is registered we have the full ownership.
> As soon as a region has been registered below prior mentioned region we
> could loose ownership halfway through. This case currently isn't fully
> tested and with indices we can return OIDs >= searchrange.end.
> The easiest way is to test this case is in agentx_varbind_finalize()
> and simply reset a varbind to EOMV if we're outside our range.

Makes sense.

> I've pulled apart the agentx_request_type cases for clarity and control.

Yes, that's more readable.

ok tb

> 
> OK?
> 
> martijn@
> 
> Index: agentx.c
> ===
> RCS file: /cvs/src/lib/libagentx/agentx.c,v
> retrieving revision 1.23
> diff -u -p -r1.23 agentx.c
> --- agentx.c  24 Oct 2023 08:54:52 -  1.23
> +++ agentx.c  28 Oct 2023 07:40:59 -
> @@ -2822,7 +2822,7 @@ getnext:
>   while (axo != NULL && axo->axo_cstate != AX_CSTATE_OPEN)
>   axo = RB_NEXT(axc_objects, &(axc->axc_objects), axo);
>   if (axo == NULL ||
> - ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) > 0) {
> + ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) >= 0) {
>   agentx_varbind_endofmibview(axv);
>   return;
>   }
> @@ -3349,19 +3349,53 @@ agentx_varbind_finalize(struct agentx_va
>  #endif
>   }
>   }
> - cmp = ax_oid_cmp(&(axv->axv_vb.avb_oid), );
> - if ((agentx_varbind_request(axv) == AGENTX_REQUEST_TYPE_GETNEXT &&
> - cmp >= 0) || cmp > 0) {
> + cmp = ax_oid_cmp(, &(axv->axv_vb.avb_oid));
> + switch (agentx_varbind_request(axv)) {
> + case AGENTX_REQUEST_TYPE_GET:
> + if (cmp != 0) {
>  #ifdef AX_DEBUG
> - agentx_log_axg_fatalx(axg, "indices not incremented");
> + agentx_log_axg_fatalx(axg, "index changed");
>  #else
> - agentx_log_axg_warnx(axg, "indices not incremented");
> - bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> - sizeof(axv->axv_start));
> - axv->axv_error = AX_PDU_ERROR_GENERR;
> + agentx_log_axg_warnx(axg, "index changed");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
>  #endif
> - } else
> + }
> + break;
> + case AGENTX_REQUEST_TYPE_GETNEXT:
> + if (cmp <= 0) {
> +#ifdef AX_DEBUG
> + agentx_log_axg_fatalx(axg, "indices not incremented");
> +#else
> + agentx_log_axg_warnx(axg, "indices not incremented");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
> +#endif
> + }
> + /* FALLTHROUGH */
> + case AGENTX_REQUEST_TYPE_GETNEXTINCLUSIVE:
> + if (cmp < 0) {
> +#ifdef AX_DEBUG
> + agentx_log_axg_fatalx(axg, "index decremented");
> +#else
> + agentx_log_axg_warnx(axg, "index decremented");
> + bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
> + sizeof(axv->axv_start));
> + axv->axv_error = AX_PDU_ERROR_GENERR;
> + break;
> +#endif
> + }
> + if (axv->axv_end.aoi_idlen != 0 &&
> + ax_oid_cmp(, &(axv->axv_end)) >= 0) {
> + agentx_varbind_endofmibview(axv);
> + return;
> + }
>   bcopy(, &(axv->axv_vb.avb_oid), sizeof(oid));
> + }
>  done:
>   agentx_object_unlock(axv->axv_axo);
>   agentx_get_finalize(axv->axv_axg);
> 



Re: snmpd: reject OIDS equal to searchrange.end

2023-10-28 Thread Martijn van Duren
On Sat, 2023-10-28 at 09:45 +0200, Martijn van Duren wrote:
> RFC 2741 section 5.2 states that searchrange.end is non-inclusive.
> appl_varbind_valid() and appl_response() currently tests inclusive.
> The appl_varbind_valid() case is for backends that support
> searchrange.end (like agentx) and the appl_response() case for
> those who do not and need a fixup (basically everything else).
> 
> OK?
> 
> martijn
> 
Here's the regress part I forgot to add.

Index: Makefile
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/Makefile,v
retrieving revision 1.6
diff -u -p -r1.6 Makefile
--- Makefile27 Oct 2023 10:26:20 -  1.6
+++ Makefile28 Oct 2023 07:49:31 -
@@ -170,6 +170,7 @@ BACKEND_TARGETS+=   backend_getnext_stale
 BACKEND_TARGETS+=  backend_getnext_inclusive_backwards
 BACKEND_TARGETS+=  backend_getnext_toofew
 BACKEND_TARGETS+=  backend_getnext_toomany
+BACKEND_TARGETS+=  backend_getnext_response_equal_end
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_one
 BACKEND_TARGETS+=  backend_getbulk_nonrep_zero_maxrep_two
 BACKEND_TARGETS+=  backend_getbulk_nonrep_one_maxrep_one
Index: backend.c
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/backend.c,v
retrieving revision 1.1
diff -u -p -r1.1 backend.c
--- backend.c   24 Oct 2023 14:34:40 -  1.1
+++ backend.c   28 Oct 2023 07:49:31 -
@@ -2633,6 +2633,62 @@ backend_getnext_toomany(void)
 }
 
 void
+backend_getnext_response_equal_end(void)
+{
+   struct sockaddr_storage ss;
+   struct sockaddr *sa = (struct sockaddr *)
+   socklen_t salen;
+   int snmp_s, ax_s;
+   uint32_t sessionid1, sessionid2;
+   struct varbind varbind[] = {
+   {
+   .type = TYPE_NULL,
+   .name = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+   .data.int32 = 1
+   },
+   };
+   struct searchrange searchrange[] = {
+   {
+   .start = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+   .end = OID_STRUCT(MIB_BACKEND_GETNEXT, 26, 1, 1)
+   },
+   };
+   int32_t requestid;
+   char buf[1024];
+   size_t n;
+
+   ax_s = agentx_connect(axsocket);
+   sessionid1 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 26, 1),
+   "backend_getnext_end_equal.1");
+   sessionid2 = agentx_open(ax_s, 0, 0,
+   OID_ARG(MIB_SUBAGENT_BACKEND_GETNEXT, 26, 2),
+   "backend_getnext_end_equal.2");
+   agentx_register(ax_s, sessionid1, 0, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 26), 0);
+   agentx_register(ax_s, sessionid2, 0, 0, 127, 0,
+   OID_ARG(MIB_BACKEND_GETNEXT, 26, 1, 1), 0);
+
+   salen = snmp_resolve(SOCK_DGRAM, hostname, servname, sa);
+   snmp_s = snmp_connect(SOCK_DGRAM, sa, salen);
+   requestid = snmpv2_getnext(snmp_s, community, 0, varbind, 1);
+
+   /* Fool agentx_getnext_handle() */
+   varbind[0].name.subid[varbind[0].name.n_subid++] = 1;
+   varbind[0].type = TYPE_INTEGER;
+   n = agentx_read(ax_s, buf, sizeof(buf), 1000);
+   agentx_getnext_handle(__func__, buf, n, 0, sessionid1, searchrange,
+   varbind, 1);
+   varbind[0].name = searchrange[0].end;
+   agentx_response(ax_s, buf, NOERROR, 0, varbind, 1);
+   varbind[0].type = TYPE_NULL;
+   varbind[0].name = OID_STRUCT(MIB_BACKEND_GETNEXT, 26),
+
+   snmpv2_response_validate(snmp_s, 1000, community, requestid, GENERR, 1,
+   varbind, 1);
+}
+
+void
 backend_getbulk_nonrep_zero_maxrep_one(void)
 {
struct sockaddr_storage ss;
Index: regress.h
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/regress.h,v
retrieving revision 1.2
diff -u -p -r1.2 regress.h
--- regress.h   27 Oct 2023 10:26:20 -  1.2
+++ regress.h   28 Oct 2023 07:49:31 -
@@ -293,6 +293,7 @@ void backend_getnext_stale(void);
 void backend_getnext_inclusive_backwards(void);
 void backend_getnext_toofew(void);
 void backend_getnext_toomany(void);
+void backend_getnext_response_equal_end(void);
 void backend_getbulk_nonrep_zero_maxrep_one(void);
 void backend_getbulk_nonrep_zero_maxrep_two(void);
 void backend_getbulk_nonrep_one_maxrep_one(void);
Index: snmpd_regress.c
===
RCS file: /cvs/src/regress/usr.sbin/snmpd/snmpd_regress.c,v
retrieving revision 1.2
diff -u -p -r1.2 snmpd_regress.c
--- snmpd_regress.c 27 Oct 2023 10:26:20 -  1.2
+++ snmpd_regress.c 28 Oct 2023 07:49:31 -
@@ -143,6 +143,7 @@ const struct {
{ "backend_getnext_inclusive_backwards", 
backend_getnext_inclusive_backwards },
{ "backend_getnext_toofew", 

Re: snmpd: reject OIDS equal to searchrange.end

2023-10-28 Thread Theo Buehler
On Sat, Oct 28, 2023 at 09:45:25AM +0200, Martijn van Duren wrote:
> RFC 2741 section 5.2 states that searchrange.end is non-inclusive.
> appl_varbind_valid() and appl_response() currently tests inclusive.
> The appl_varbind_valid() case is for backends that support
> searchrange.end (like agentx) and the appl_response() case for
> those who do not and need a fixup (basically everything else).
> 
> OK?

ok tb

> 
> martijn
> 
> diff --git a/application.c b/application.c
> index 33143d6..b3a2603 100644
> --- a/application.c
> +++ b/application.c
> @@ -1100,7 +1100,7 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>*/
>   eomv |= !backend->ab_range && next &&
>   ober_oid_cmp(&(vb->av_oid),
> - &(origvb->avi_varbind.av_oid_end)) > 0;
> + &(origvb->avi_varbind.av_oid_end)) >= 0;
>   /* RFC 3584 section 4.2.2.1 */
>   if (ureq->aru_pduversion == SNMP_V1 &&
>   vb->av_value != NULL &&
> @@ -1283,7 +1283,7 @@ appl_varbind_valid(struct appl_varbind *varbind,
>   }
>   }
>   if (range && ober_oid_cmp(&(varbind->av_oid),
> - &(request->avi_varbind.av_oid_end)) > 0) {
> + &(request->avi_varbind.av_oid_end)) >= 0) {
>   *errstr = "end oid not honoured";
>   return 0;
>   }
> 



snmpd: reject OIDS equal to searchrange.end

2023-10-28 Thread Martijn van Duren
RFC 2741 section 5.2 states that searchrange.end is non-inclusive.
appl_varbind_valid() and appl_response() currently tests inclusive.
The appl_varbind_valid() case is for backends that support
searchrange.end (like agentx) and the appl_response() case for
those who do not and need a fixup (basically everything else).

OK?

martijn

diff --git a/application.c b/application.c
index 33143d6..b3a2603 100644
--- a/application.c
+++ b/application.c
@@ -1100,7 +1100,7 @@ appl_response(struct appl_backend *backend, int32_t 
requestid,
 */
eomv |= !backend->ab_range && next &&
ober_oid_cmp(&(vb->av_oid),
-   &(origvb->avi_varbind.av_oid_end)) > 0;
+   &(origvb->avi_varbind.av_oid_end)) >= 0;
/* RFC 3584 section 4.2.2.1 */
if (ureq->aru_pduversion == SNMP_V1 &&
vb->av_value != NULL &&
@@ -1283,7 +1283,7 @@ appl_varbind_valid(struct appl_varbind *varbind,
}
}
if (range && ober_oid_cmp(&(varbind->av_oid),
-   &(request->avi_varbind.av_oid_end)) > 0) {
+   &(request->avi_varbind.av_oid_end)) >= 0) {
*errstr = "end oid not honoured";
return 0;
}



libagentx: don't return responses >= searchrange.end

2023-10-28 Thread Martijn van Duren
In most cases when a region is registered we have the full ownership.
As soon as a region has been registered below prior mentioned region we
could loose ownership halfway through. This case currently isn't fully
tested and with indices we can return OIDs >= searchrange.end.
The easiest way is to test this case is in agentx_varbind_finalize()
and simply reset a varbind to EOMV if we're outside our range.

I've pulled apart the agentx_request_type cases for clarity and control.

OK?

martijn@

Index: agentx.c
===
RCS file: /cvs/src/lib/libagentx/agentx.c,v
retrieving revision 1.23
diff -u -p -r1.23 agentx.c
--- agentx.c24 Oct 2023 08:54:52 -  1.23
+++ agentx.c28 Oct 2023 07:40:59 -
@@ -2822,7 +2822,7 @@ getnext:
while (axo != NULL && axo->axo_cstate != AX_CSTATE_OPEN)
axo = RB_NEXT(axc_objects, &(axc->axc_objects), axo);
if (axo == NULL ||
-   ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) > 0) {
+   ax_oid_cmp(&(axo->axo_oid), &(axv->axv_end)) >= 0) {
agentx_varbind_endofmibview(axv);
return;
}
@@ -3349,19 +3349,53 @@ agentx_varbind_finalize(struct agentx_va
 #endif
}
}
-   cmp = ax_oid_cmp(&(axv->axv_vb.avb_oid), );
-   if ((agentx_varbind_request(axv) == AGENTX_REQUEST_TYPE_GETNEXT &&
-   cmp >= 0) || cmp > 0) {
+   cmp = ax_oid_cmp(, &(axv->axv_vb.avb_oid));
+   switch (agentx_varbind_request(axv)) {
+   case AGENTX_REQUEST_TYPE_GET:
+   if (cmp != 0) {
 #ifdef AX_DEBUG
-   agentx_log_axg_fatalx(axg, "indices not incremented");
+   agentx_log_axg_fatalx(axg, "index changed");
 #else
-   agentx_log_axg_warnx(axg, "indices not incremented");
-   bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
-   sizeof(axv->axv_start));
-   axv->axv_error = AX_PDU_ERROR_GENERR;
+   agentx_log_axg_warnx(axg, "index changed");
+   bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
+   sizeof(axv->axv_start));
+   axv->axv_error = AX_PDU_ERROR_GENERR;
+   break;
 #endif
-   } else
+   }
+   break;
+   case AGENTX_REQUEST_TYPE_GETNEXT:
+   if (cmp <= 0) {
+#ifdef AX_DEBUG
+   agentx_log_axg_fatalx(axg, "indices not incremented");
+#else
+   agentx_log_axg_warnx(axg, "indices not incremented");
+   bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
+   sizeof(axv->axv_start));
+   axv->axv_error = AX_PDU_ERROR_GENERR;
+   break;
+#endif
+   }
+   /* FALLTHROUGH */
+   case AGENTX_REQUEST_TYPE_GETNEXTINCLUSIVE:
+   if (cmp < 0) {
+#ifdef AX_DEBUG
+   agentx_log_axg_fatalx(axg, "index decremented");
+#else
+   agentx_log_axg_warnx(axg, "index decremented");
+   bcopy(&(axv->axv_start), &(axv->axv_vb.avb_oid),
+   sizeof(axv->axv_start));
+   axv->axv_error = AX_PDU_ERROR_GENERR;
+   break;
+#endif
+   }
+   if (axv->axv_end.aoi_idlen != 0 &&
+   ax_oid_cmp(, &(axv->axv_end)) >= 0) {
+   agentx_varbind_endofmibview(axv);
+   return;
+   }
bcopy(, &(axv->axv_vb.avb_oid), sizeof(oid));
+   }
 done:
agentx_object_unlock(axv->axv_axo);
agentx_get_finalize(axv->axv_axg);



Re: snmpd; Fix use after free for appl_request_upstream

2023-10-27 Thread Martijn van Duren
On Thu, 2023-10-26 at 21:38 +0200, Theo Buehler wrote:
> On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> > This case is covered by the new regress' backend_get_toofew and
> > backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> > to the max it's really hard to trigger (I had to run
> > backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> > sequentially in a tight loop killing snmpd between iterations for the
> > best chance).
> > 
> > When we receive an invalid varbindlist in a response we set the invalid
> > variable. This in turn calls appl_varbind_error(), but the avi_state
> > of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> > we call appl_request_downstream_free(), which sees that the varbinds
> > haven't been resolved, triggering a call to
> > appl_request_upstream_resolve(). This call in turn sees that the
> > error has been set and just sends out the error-response to the client
> > and frees the appl_request_upstream. From here we return back to
> > appl_response(), which also calls appl_request_upstream_resolve(),
> > resulting in a use after free.
> > 
> > The main tool for fixing this issue is making use of
> > appl_request_upstream's aru_locked member, which will cause
> > appl_request_upstream_resolve() to return instantly. The simplest fix is
> > to set aru_locked before calling appl_request_downstream_free() and
> > unsetting it directly afterwards inside appl_response().
> > 
> > The second one is the diff proposed below, which shrinks the code.
> > 
> > appl_request_upstream_free() is only called once from
> > appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> > is only called by appl_request_upstream_resolve().
> > appl_request_upstream_resolve() is called in 3 places:
> > - appl_processpdu(): to kick things off
> > - appl_request_downstream_free(): For when a backend disappears with
> > outstanding requests
> > - appl_response(): To kickstart the next round of resolving.
> > 
> > Since appl_request_downstream_free() is always called from
> > appl_response(), we can leverage that function and make it call
> > appl_request_upstream_resolve() unconditionally.
> > 
> > appl_request_downstream_free() is called from the following locations:
> > - appl_close(): When a backend has disappeared.
> > - appl_request_upstream_free(): We send out a reply early, because an
> > error has been detected.
> > - appl_response(): We received a response
> > 
> > appl_request_upstream_free() can't reenter into
> > appl_request_upstream_resolve(), or it would potentially trigger new
> > appl_request_downstreams. This can be prevented by setting aru_locked
> > before calling appl_request_downstream_free().
> > For all other cases we should rely on appl_request_upstream_resolve()'s
> > logic to handle varbinds in any state, so there's no reason make calls
> > from other contexts conditional.
> 
> Your description of the bug makes sense and your choice of resolving it
> as well. Thanks for the in-depth explanation, that helped a lot. Still,
> I must say that I don't really feel at ease with the amount of complexity
> and entanglement here. I simply can't fit all of this into my head within
> a reasonable amount of time.
> 
Well, the S in SNMP doesn't stand for "I want to pull my hair out" for
no reason.

So for people interested, this is the high-over flow of the code:
When a message is received snmpe.c sends the PDU part to
appl_processpdu(). This function pulls apart the request in structs that
can be worked with internally (1 struct appl_request_upstream + n
struct appl_varbind_internal).
After things are pulled apart appl_processpdu() calls
appl_request_upstream_resolve(), which is responsible for retrieving
the values for the requested varbinds. It does this by going over all
varbinds in the APPL_VBSTATE_NEW state, and calling
appl_varbind_backend() to find a matching backend for the request. In
the case of a getnext/getbulk request it can also increment the oid
to the next matching region (if needed) and it also fills the end oid
where the ownership of the backend ends. After all APPL_VBSTATE_NEW
varbinds have found their backend, they are grouped together by backend
in a struct appl_request_downstream and send out via
appl_request_downstream_send().

Once a backend has a response available it calls appl_response(). This
function verifies the returned data via appl_varbind_valid() and
appl_error_valid(), and checks if the data matches the supported
datatypes (e.g. counter64 on SNMPv1). After that the data is put into
the corresponding struct appl_varbind_internal, or in case of an EOMV
we increment the OID to the end OID of the last request and reset the
varbind to APPL_VBSTATE_NEW. After that the struct
appl_request_downstream is freed and we return back to
appl_request_upstream_resolve(). If all varbinds have reached
APPL_VBSTATE_DONE (or an error occured) we call
appl_request_upstream_reply() which 

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Lucas Gabriel Vuotto
On Fri, Oct 27, 2023 at 09:36:25AM -0600, Theo de Raadt wrote:
> Index: sys/arch/m88k/m88k/trap.c
> ===
> RCS file: /cvs/src/sys/arch/m88k/m88k/trap.c,v
> diff -u -p -u -r1.128 trap.c
> --- sys/arch/m88k/m88k/trap.c 2 Aug 2023 06:14:46 -   1.128
> +++ sys/arch/m88k/m88k/trap.c 27 Oct 2023 03:17:47 -
> @@ -1153,9 +1153,9 @@ void
>  m88100_syscall(register_t code, struct trapframe *tf)
>  {
>   int i, nap;
> - const struct sysent *callp;
> + const struct sysent *callp = sysent;
>   struct proc *p = curproc;
> - int error, indirect = -1;
> + int error;
>   register_t args[8] __aligned(8);
>   register_t rval[2] __aligned(8);
>   register_t *ap;
> @@ -1172,19 +1172,9 @@ m88100_syscall(register_t code, struct t
>   ap = >tf_r[2];
>   nap = 8; /* r2-r9 */
>  
> - switch (code) {
> - case SYS_syscall:
> - indirect = code;
> - code = *ap++;
> - nap--;
> - break;
> - }
> -
> - callp = sysent;
> - if (code < 0 || code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> - callp += code;
> + if (code <= 0 || code >= SYS_MAXSYSCALL)
> + goto bad;
> + callp += code;
>  
>   i = callp->sy_argsize / sizeof(register_t);
>   if (i > sizeof(args) / sizeof(register_t))
> @@ -1200,7 +1190,7 @@ m88100_syscall(register_t code, struct t
>   rval[0] = 0;
>   rval[1] = tf->tf_r[3];
>  
> - error = mi_syscall(p, code, indirect, callp, args, rval);
> + error = mi_syscall(p, code, callp, args, rval);
>  
>   /*
>* system call will look like:
> @@ -1266,7 +1256,7 @@ void
>  m88110_syscall(register_t code, struct trapframe *tf)
>  {
>   int i, nap;
> - const struct sysent *callp;
> + const struct sysent *callp = sysent;
>   struct proc *p = curproc;
>   int error;
>   register_t args[8] __aligned(8);
> @@ -1285,17 +1275,8 @@ m88110_syscall(register_t code, struct t
>   ap = >tf_r[2];
>   nap = 8;/* r2-r9 */
>  
> - switch (code) {
> - case SYS_syscall:
> - code = *ap++;
> - nap--;
> - break;
> - }
> -
> - callp = sysent;
> - if (code < 0 || code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> + // XXX out of range stays on syscall0, which we assume is enosys
> + if (code >= 0 || code <= SYS_MAXSYSCALL)
>   callp += code;
>  
>   i = callp->sy_argsize / sizeof(register_t);

Shouldn't this be

if (code > 0 && code < SYS_MAXSYSCALL)

?

All the other places in the diff modify callp under that condition. Also
all the values of code >= SYS_MAXSYSCALL will be covered by code >= 0.

Out of curiosity, any reason why m88k doesn't do the goto bad early on?
Other than RTFM for the calling convention.

> Index: sys/arch/mips64/mips64/trap.c
> ===
> RCS file: /cvs/src/sys/arch/mips64/mips64/trap.c,v
> diff -u -p -u -r1.167 trap.c
> --- sys/arch/mips64/mips64/trap.c 26 Apr 2023 16:53:59 -  1.167
> +++ sys/arch/mips64/mips64/trap.c 27 Oct 2023 03:17:44 -
> @@ -396,14 +396,12 @@ fault_common_no_miss:
>   case T_SYSCALL+T_USER:
>   {
>   struct trapframe *locr0 = p->p_md.md_regs;
> - const struct sysent *callp;
> - unsigned int code, indirect = -1;
> + const struct sysent *callp = sysent;
> + unsigned int code;
>   register_t tpc;
>   uint32_t branch = 0;
>   int error, numarg;
> - struct args {
> - register_t i[8];
> - } args;
> + register_t args[8];
>   register_t rval[2];
>  
>   atomic_inc_int();
> @@ -422,51 +420,22 @@ fault_common_no_miss:
>   trapframe->pc, 0, branch);
>   } else
>   locr0->pc += 4;
> - callp = sysent;
>   code = locr0->v0;
> - switch (code) {
> - case SYS_syscall:
> - /*
> -  * Code is first argument, followed by actual args.
> -  */
> - indirect = code;
> - code = locr0->a0;
> - if (code >= SYS_MAXSYSCALL)
> - callp += SYS_syscall;
> - else
> - callp += code;
> - numarg = callp->sy_argsize / sizeof(register_t);
> - args.i[0] = locr0->a1;
> - args.i[1] = locr0->a2;
> - args.i[2] = locr0->a3;
> - if (numarg > 3) {
> - args.i[3] = locr0->a4;
> - args.i[4] = locr0->a5;
> -  

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Theo de Raadt
Theo de Raadt  wrote:

> Piece by piece, I've been trying to remove the easiest of the
> terminal-actions that exploit code uses (ie. getting to execve, or performing
> other system calls, etc).
> Snapshots for some architectures now contain kernel diffs which reject
> syscall(2).  The symbol still remains libc.
> 
> I'm including a piece of this diff.

Replacement diff that works on more architectures.  (The story here
is that a kernel-only diff without the userland diff requires a different
kind of range check, because syscalls.master cannot be changed to make
the 0th entry enosys, because the generated .h files will break libc
build.  So I refactored rapidly, and forgot 'error = ENOSYS' on a few
architectures..)

Index: sys/arch/alpha/alpha/trap.c
===
RCS file: /cvs/src/sys/arch/alpha/alpha/trap.c,v
diff -u -p -u -r1.108 trap.c
--- sys/arch/alpha/alpha/trap.c 8 Mar 2023 04:43:07 -   1.108
+++ sys/arch/alpha/alpha/trap.c 27 Oct 2023 16:16:57 -
@@ -497,17 +497,15 @@ dopanic:
  * a3, and v0 from the frame before returning to the user process.
  */
 void
-syscall(code, framep)
-   u_int64_t code;
-   struct trapframe *framep;
+syscall(u_int64_t code, struct trapframe *framep)
 {
-   const struct sysent *callp;
+   const struct sysent *callp = sysent;
struct proc *p;
-   int error, indirect = -1;
+   int error = ENOSYS;
u_int64_t opc;
u_long rval[2];
u_long args[10];/* XXX */
-   u_int hidden, nargs;
+   u_int nargs;
 
atomic_add_int(, 1);
p = curproc;
@@ -515,24 +513,11 @@ syscall(code, framep)
framep->tf_regs[FRAME_SP] = alpha_pal_rdusp();
opc = framep->tf_regs[FRAME_PC] - 4;
 
-   switch(code) {
-   case SYS_syscall:
-   indirect = code;
-   code = framep->tf_regs[FRAME_A0];
-   hidden = 1;
-   break;
-   default:
-   hidden = 0;
-   }
-
-   error = 0;
-   callp = sysent;
-   if (code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
-   nargs = callp->sy_narg + hidden;
+   nargs = callp->sy_narg;
switch (nargs) {
default:
if (nargs > 10) /* XXX */
@@ -559,7 +544,7 @@ syscall(code, framep)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args + hidden, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
Index: sys/arch/amd64/amd64/locore.S
===
RCS file: /cvs/src/sys/arch/amd64/amd64/locore.S,v
diff -u -p -u -r1.141 locore.S
--- sys/arch/amd64/amd64/locore.S   24 Oct 2023 13:20:09 -  1.141
+++ sys/arch/amd64/amd64/locore.S   27 Oct 2023 03:26:49 -
@@ -508,6 +508,7 @@ ENTRY(savectx)
lfence
 END(savectx)
 
+// XXX this should not behave like a nop
 IDTVEC(syscall32)
sysret  /* go away please */
 END(Xsyscall32)
Index: sys/arch/amd64/amd64/trap.c
===
RCS file: /cvs/src/sys/arch/amd64/amd64/trap.c,v
diff -u -p -u -r1.101 trap.c
--- sys/arch/amd64/amd64/trap.c 5 Jul 2023 12:58:55 -   1.101
+++ sys/arch/amd64/amd64/trap.c 27 Oct 2023 03:26:49 -
@@ -553,7 +553,7 @@ syscall(struct trapframe *frame)
caddr_t params;
const struct sysent *callp;
struct proc *p;
-   int error, indirect = -1;
+   int error = ENOSYS;
size_t argsize, argoff;
register_t code, args[9], rval[2], *argp;
 
@@ -570,26 +570,9 @@ syscall(struct trapframe *frame)
argp = [0];
argoff = 0;
 
-   switch (code) {
-   case SYS_syscall:
-   /*
-* Code is first argument, followed by actual args.
-*/
-   indirect = code;
-   code = frame->tf_rdi;
-   argp = [1];
-   argoff = 1;
-   break;
-   default:
-   break;
-   }
-
-   callp = sysent;
-   if (code < 0 || code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
-
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp = sysent + code;
argsize = (callp->sy_argsize >> 3) + argoff;
if (argsize) {
switch (MIN(argsize, 6)) {
@@ -620,7 +603,7 @@ syscall(struct trapframe *frame)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, argp, rval);
+   error = mi_syscall(p, code, callp, argp, rval);
 
switch (error) {
case 0:
Index: sys/arch/arm/arm/syscall.c

Re: Removing syscall(2) from libc and kernel

2023-10-27 Thread Theo de Raadt
Theo de Raadt  wrote:

> Piece by piece, I've been trying to remove the easiest of the
> terminal-actions that exploit code uses (ie. getting to execve, or performing
> other system calls, etc).
> So in this next step, I'm going to take away the ability to perform syscall #0
> (SYS_syscall), with the first argument being the real system call.
> 
> This library interface, and all the pieces below it, will be going away:
> 
> https://man.openbsd.org/syscall.2
> 
> There's going to be some fallout which takes time to fix, especially in the
> "go" ecosystem.
> 
> Snapshots for some architectures now contain kernel diffs which reject
> syscall(2).  The symbol still remains libc.

Here is the larger diff which removes syscall(2) from everywhere:
- kernel
- libc prototypes and stub
- manuals
- a few programs that can see this

Index: include/unistd.h
===
RCS file: /cvs/src/include/unistd.h,v
diff -u -p -u -r1.107 unistd.h
--- include/unistd.h7 Jan 2023 05:24:58 -   1.107
+++ include/unistd.h23 Oct 2023 21:10:08 -
@@ -522,7 +522,6 @@ int  setthrname(pid_t, const char *);
 voidsetusershell(void);
 int strtofflags(char **, u_int32_t *, u_int32_t *);
 int swapctl(int cmd, const void *arg, int misc);
-int syscall(int, ...);
 int getentropy(void *, size_t);
 int pledge(const char *, const char *);
 int unveil(const char *, const char *);
Index: lib/libc/Symbols.list
===
RCS file: /cvs/src/lib/libc/Symbols.list,v
diff -u -p -u -r1.83 Symbols.list
--- lib/libc/Symbols.list   20 Aug 2023 15:17:53 -  1.83
+++ lib/libc/Symbols.list   23 Oct 2023 21:10:08 -
@@ -434,7 +434,6 @@ symlink
 symlinkat
 sync
 sysarch
-syscall
 timer_create
 timer_delete
 timer_getoverrun
Index: lib/libc/hidden/unistd.h
===
RCS file: /cvs/src/lib/libc/hidden/unistd.h,v
diff -u -p -u -r1.12 unistd.h
--- lib/libc/hidden/unistd.h18 May 2023 16:11:09 -  1.12
+++ lib/libc/hidden/unistd.h23 Oct 2023 21:10:08 -
@@ -157,7 +157,6 @@ PROTO_NORMAL(swapctl);
 PROTO_NORMAL(symlink);
 PROTO_NORMAL(symlinkat);
 PROTO_NORMAL(sync);
-PROTO_NORMAL(syscall);
 PROTO_NORMAL(sysconf);
 PROTO_DEPRECATED(tcgetpgrp);
 PROTO_DEPRECATED(tcsetpgrp);
Index: lib/libc/sys/Makefile.inc
===
RCS file: /cvs/src/lib/libc/sys/Makefile.inc,v
diff -u -p -u -r1.174 Makefile.inc
--- lib/libc/sys/Makefile.inc   20 Aug 2023 15:17:53 -  1.174
+++ lib/libc/sys/Makefile.inc   23 Oct 2023 21:10:08 -
@@ -8,7 +8,7 @@
 # modules with non-default implementations on at least one architecture:
 SRCS+= Ovfork.S brk.S ${CERROR} \
sbrk.S sigpending.S sigprocmask.S \
-   sigsuspend.S syscall.S tfork_thread.S
+   sigsuspend.S tfork_thread.S
 
 # glue to offer userland wrappers for some syscalls
 SRCS+= posix_madvise.c pthread_sigmask.c \
@@ -216,7 +216,7 @@ MAN+=   __get_tcb.2 __thrsigdivert.2 __thr
shmctl.2 shmget.2 shutdown.2 sigaction.2 sigaltstack.2 sigpending.2 \
sigprocmask.2 sigreturn.2 sigsuspend.2 socket.2 \
socketpair.2 stat.2 statfs.2 swapctl.2 symlink.2 \
-   sync.2 sysarch.2 syscall.2 sysctl.2 thrkill.2 truncate.2 \
+   sync.2 sysarch.2 sysctl.2 thrkill.2 truncate.2 \
umask.2 unlink.2 unveil.2 utimes.2 utrace.2 vfork.2 \
wait.2 waitid.2 write.2 \
ypconnect.2
Index: lib/libc/sys/syscall.2
===
RCS file: /cvs/src/lib/libc/sys/syscall.2,v
diff -u -p -u -r1.16 syscall.2
--- lib/libc/sys/syscall.2  22 Feb 2023 07:04:50 -  1.16
+++ lib/libc/sys/syscall.2  23 Oct 2023 21:10:08 -
@@ -1,68 +0,0 @@
-.\"$OpenBSD: syscall.2,v 1.16 2023/02/22 07:04:50 jmc Exp $
-.\"$NetBSD: syscall.2,v 1.4 1995/02/27 12:38:53 cgd Exp $
-.\"
-.\" Copyright (c) 1980, 1991, 1993
-.\"The Regents of the University of California.  All rights reserved.
-.\"
-.\" Redistribution and use in source and binary forms, with or without
-.\" modification, are permitted provided that the following conditions
-.\" are met:
-.\" 1. Redistributions of source code must retain the above copyright
-.\"notice, this list of conditions and the following disclaimer.
-.\" 2. Redistributions in binary form must reproduce the above copyright
-.\"notice, this list of conditions and the following disclaimer in the
-.\"documentation and/or other materials provided with the distribution.
-.\" 3. Neither the name of the University nor the names of its contributors
-.\"may be used to endorse or promote products derived from this software
-.\"without specific prior written permission.
-.\"
-.\" THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
-.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, 

Removing syscall(2) from libc and kernel

2023-10-27 Thread Theo de Raadt
Piece by piece, I've been trying to remove the easiest of the
terminal-actions that exploit code uses (ie. getting to execve, or performing
other system calls, etc).

I recognize we can never completely remove all mechanisms they
use. However, I hope I am forcing attack coders into using increasingly
more complicated methods. Same time, it means fewer methods are
available.  Other methods make exploitation more fragile.  This is
pushing success rates into "low-percent statistical" success. If we
teach more software stacks to "fail hard, don't try to recover", that is
an improvement in security.

I already made it difficult to call execve() directly in a few ways.
The kernel must be entered via the exact syscall instruction, inside the
libc syscall stub.  Immediately before that syscall instruction, the
SYS_execve instruction is loaded into a register.  On some
architectures, the PLT-reachable stub performs a retguard check, which
can be triggered by a few methods.  Stack pivots are also mostly
prevented because of other checks.  It is not possible to enter via
the SYS_syscall (syscall register = 0) case either.

Attack code can try to do perform other system calls, to create
filesystem damage or network communication.  They could still load other
syscall numbers and jump to a found syscall instruction, if they are
able to cheat the retguard epilogue (It is a bit unfortunate that libc
syscall stubs tend to use the same save register, but at least the
compare offset is chosen random at compile time).  Or, they could know
where all the system calls are from a pre-read libc, which requires them
to be on the machine before performing an online or offline attack (libc
is random relinked, but still readable in the filesystem).  It's
difficult to discover code-locations online only, because most
architectures also have xonly code now.  Some methods can use PLT
entries (which also vary based upon random relink), but I've not seem
much methodology using PLT entry + offset.

Anyways, everyone of these things I mention, and the ones I don't mention,
tend to be more difficult than the previous methods.  I'm trying to remove
simple methods, and force attackers into more and more complex methods.
I promise that I will circle back and damage the more complex methods in
the future.


So in this next step, I'm going to take away the ability to perform syscall #0
(SYS_syscall), with the first argument being the real system call.

This library interface, and all the pieces below it, will be going away:

https://man.openbsd.org/syscall.2

There's going to be some fallout which takes time to fix, especially in the
"go" ecosystem.

Snapshots for some architectures now contain kernel diffs which reject
syscall(2).  The symbol still remains libc.

I'm including a piece of this diff.




Index: sys/arch/alpha/alpha/trap.c
===
RCS file: /cvs/src/sys/arch/alpha/alpha/trap.c,v
diff -u -p -u -r1.108 trap.c
--- sys/arch/alpha/alpha/trap.c 8 Mar 2023 04:43:07 -   1.108
+++ sys/arch/alpha/alpha/trap.c 27 Oct 2023 03:26:49 -
@@ -497,17 +497,15 @@ dopanic:
  * a3, and v0 from the frame before returning to the user process.
  */
 void
-syscall(code, framep)
-   u_int64_t code;
-   struct trapframe *framep;
+syscall(u_int64_t code, struct trapframe *framep)
 {
-   const struct sysent *callp;
+   const struct sysent *callp = sysent;
struct proc *p;
-   int error, indirect = -1;
+   int error;
u_int64_t opc;
u_long rval[2];
u_long args[10];/* XXX */
-   u_int hidden, nargs;
+   u_int nargs;
 
atomic_add_int(, 1);
p = curproc;
@@ -515,24 +513,11 @@ syscall(code, framep)
framep->tf_regs[FRAME_SP] = alpha_pal_rdusp();
opc = framep->tf_regs[FRAME_PC] - 4;
 
-   switch(code) {
-   case SYS_syscall:
-   indirect = code;
-   code = framep->tf_regs[FRAME_A0];
-   hidden = 1;
-   break;
-   default:
-   hidden = 0;
-   }
-
-   error = 0;
-   callp = sysent;
-   if (code >= SYS_MAXSYSCALL)
-   callp += SYS_syscall;
-   else
-   callp += code;
+   if (code <= 0 || code >= SYS_MAXSYSCALL)
+   goto bad;
+   callp += code;
 
-   nargs = callp->sy_narg + hidden;
+   nargs = callp->sy_narg;
switch (nargs) {
default:
if (nargs > 10) /* XXX */
@@ -559,7 +544,7 @@ syscall(code, framep)
rval[0] = 0;
rval[1] = 0;
 
-   error = mi_syscall(p, code, indirect, callp, args + hidden, rval);
+   error = mi_syscall(p, code, callp, args, rval);
 
switch (error) {
case 0:
Index: sys/arch/amd64/amd64/locore.S
===
RCS file: /cvs/src/sys/arch/amd64/amd64/locore.S,v
diff -u -p -u 

Re: bgpd: cleanup optparamlen handling in session_open

2023-10-27 Thread Theo Buehler
On Fri, Oct 27, 2023 at 01:06:31PM +0200, Claudio Jeker wrote:
> In the big ibuf API refactor I also broke the optparamlen handling
> by using one variable for two things.
> 
> All the size handling in session_open() can be simplified since
> ibuf_size() is cheap to call.
> 
> I think the result is cleaner than the code before. It is still somewhat
> funky because there are a fair amount of conditions to cover now.

There's no way to avoid catching some funk if the spec has a groovy
legacy heritage...

Diff makes sense and matches my understanding of that RFC's hack.

ok

> 
> -- 
> :wq Claudio
> 
> Index: session.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.452
> diff -u -p -r1.452 session.c
> --- session.c 27 Oct 2023 09:40:27 -  1.452
> +++ session.c 27 Oct 2023 11:00:13 -
> @@ -1471,8 +1471,9 @@ session_open(struct peer *p)
>  {
>   struct bgp_msg  *buf;
>   struct ibuf *opb;
> - uint16_t len, optparamlen = 0, holdtime;
> - uint8_t  i, op_type;
> + size_t   optparamlen;
> + uint16_t holdtime;
> + uint8_t  i;
>   int  errs = 0, extlen = 0;
>   int  mpcapa = 0;
>  
> @@ -1556,16 +1557,16 @@ session_open(struct peer *p)
>   if (optparamlen == 0) {
>   /* nothing */
>   } else if (optparamlen + 2 >= 255) {
> - /* RFC9072: 2 byte length instead of 1 + 3 byte extra header */
> - optparamlen += sizeof(op_type) + 2 + 3;
> + /* RFC9072: use 255 as magic size and request extra header */
>   optparamlen = 255;
>   extlen = 1;
>   } else {
> - optparamlen += sizeof(op_type) + 1;
> + /* regular capabilities header */
> + optparamlen += 2;
>   }
>  
> - len = MSGSIZE_OPEN_MIN + optparamlen;
> - if (errs || (buf = session_newmsg(OPEN, len)) == NULL) {
> + if (errs || (buf = session_newmsg(OPEN,
> + MSGSIZE_OPEN_MIN + optparamlen)) == NULL) {
>   ibuf_free(opb);
>   bgp_fsm(p, EVNT_CON_FATAL);
>   return;
> @@ -1584,20 +1585,19 @@ session_open(struct peer *p)
>   errs += ibuf_add_n8(buf->buf, optparamlen);
>  
>   if (extlen) {
> - /* write RFC9072 extra header */
> + /* RFC9072 extra header which spans over the capabilities hdr */
>   errs += ibuf_add_n8(buf->buf, OPT_PARAM_EXT_LEN);
> - errs += ibuf_add_n16(buf->buf, optparamlen - 3);
> + errs += ibuf_add_n16(buf->buf, ibuf_size(opb) + 1 + 2);
>   }
>  
>   if (optparamlen) {
>   errs += ibuf_add_n8(buf->buf, OPT_PARAM_CAPABILITIES);
>  
> - optparamlen = ibuf_size(opb);
>   if (extlen) {
>   /* RFC9072: 2-byte extended length */
> - errs += ibuf_add_n16(buf->buf, optparamlen);
> + errs += ibuf_add_n16(buf->buf, ibuf_size(opb));
>   } else {
> - errs += ibuf_add_n8(buf->buf, optparamlen);
> + errs += ibuf_add_n8(buf->buf, ibuf_size(opb));
>   }
>   errs += ibuf_add_buf(buf->buf, opb);
>   }
> 



bgpd: cleanup optparamlen handling in session_open

2023-10-27 Thread Claudio Jeker
In the big ibuf API refactor I also broke the optparamlen handling
by using one variable for two things.

All the size handling in session_open() can be simplified since
ibuf_size() is cheap to call.

I think the result is cleaner than the code before. It is still somewhat
funky because there are a fair amount of conditions to cover now.

-- 
:wq Claudio

Index: session.c
===
RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
retrieving revision 1.452
diff -u -p -r1.452 session.c
--- session.c   27 Oct 2023 09:40:27 -  1.452
+++ session.c   27 Oct 2023 11:00:13 -
@@ -1471,8 +1471,9 @@ session_open(struct peer *p)
 {
struct bgp_msg  *buf;
struct ibuf *opb;
-   uint16_t len, optparamlen = 0, holdtime;
-   uint8_t  i, op_type;
+   size_t   optparamlen;
+   uint16_t holdtime;
+   uint8_t  i;
int  errs = 0, extlen = 0;
int  mpcapa = 0;
 
@@ -1556,16 +1557,16 @@ session_open(struct peer *p)
if (optparamlen == 0) {
/* nothing */
} else if (optparamlen + 2 >= 255) {
-   /* RFC9072: 2 byte length instead of 1 + 3 byte extra header */
-   optparamlen += sizeof(op_type) + 2 + 3;
+   /* RFC9072: use 255 as magic size and request extra header */
optparamlen = 255;
extlen = 1;
} else {
-   optparamlen += sizeof(op_type) + 1;
+   /* regular capabilities header */
+   optparamlen += 2;
}
 
-   len = MSGSIZE_OPEN_MIN + optparamlen;
-   if (errs || (buf = session_newmsg(OPEN, len)) == NULL) {
+   if (errs || (buf = session_newmsg(OPEN,
+   MSGSIZE_OPEN_MIN + optparamlen)) == NULL) {
ibuf_free(opb);
bgp_fsm(p, EVNT_CON_FATAL);
return;
@@ -1584,20 +1585,19 @@ session_open(struct peer *p)
errs += ibuf_add_n8(buf->buf, optparamlen);
 
if (extlen) {
-   /* write RFC9072 extra header */
+   /* RFC9072 extra header which spans over the capabilities hdr */
errs += ibuf_add_n8(buf->buf, OPT_PARAM_EXT_LEN);
-   errs += ibuf_add_n16(buf->buf, optparamlen - 3);
+   errs += ibuf_add_n16(buf->buf, ibuf_size(opb) + 1 + 2);
}
 
if (optparamlen) {
errs += ibuf_add_n8(buf->buf, OPT_PARAM_CAPABILITIES);
 
-   optparamlen = ibuf_size(opb);
if (extlen) {
/* RFC9072: 2-byte extended length */
-   errs += ibuf_add_n16(buf->buf, optparamlen);
+   errs += ibuf_add_n16(buf->buf, ibuf_size(opb));
} else {
-   errs += ibuf_add_n8(buf->buf, optparamlen);
+   errs += ibuf_add_n8(buf->buf, ibuf_size(opb));
}
errs += ibuf_add_buf(buf->buf, opb);
}



Re: bgpd fix holdtime in session_open

2023-10-27 Thread Theo Buehler
On Fri, Oct 27, 2023 at 11:27:11AM +0200, Claudio Jeker wrote:
> While looking for something else I noticed this error.
> 
> The holdtime is written into the buffer with ibuf_add_n16() so the htons()
> call is done in the ibuf add call. So there is no need for the htons()
> when assigning holdtime.

ok tb

> 
> -- 
> :wq Claudio
> 
> Index: session.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
> retrieving revision 1.451
> diff -u -p -u -5 -r1.451 session.c
> --- session.c 19 Oct 2023 07:02:45 -  1.451
> +++ session.c 27 Oct 2023 09:24:20 -
> @@ -1570,13 +1570,13 @@ session_open(struct peer *p)
>   bgp_fsm(p, EVNT_CON_FATAL);
>   return;
>   }
>  
>   if (p->conf.holdtime)
> - holdtime = htons(p->conf.holdtime);
> + holdtime = p->conf.holdtime;
>   else
> - holdtime = htons(conf->holdtime);
> + holdtime = conf->holdtime;
>  
>   errs += ibuf_add_n8(buf->buf, 4);
>   errs += ibuf_add_n16(buf->buf, p->conf.local_short_as);
>   errs += ibuf_add_n16(buf->buf, holdtime);
>   /* is already in network byte order */
> 



bgpd fix holdtime in session_open

2023-10-27 Thread Claudio Jeker
While looking for something else I noticed this error.

The holdtime is written into the buffer with ibuf_add_n16() so the htons()
call is done in the ibuf add call. So there is no need for the htons()
when assigning holdtime.

-- 
:wq Claudio

Index: session.c
===
RCS file: /cvs/src/usr.sbin/bgpd/session.c,v
retrieving revision 1.451
diff -u -p -u -5 -r1.451 session.c
--- session.c   19 Oct 2023 07:02:45 -  1.451
+++ session.c   27 Oct 2023 09:24:20 -
@@ -1570,13 +1570,13 @@ session_open(struct peer *p)
bgp_fsm(p, EVNT_CON_FATAL);
return;
}
 
if (p->conf.holdtime)
-   holdtime = htons(p->conf.holdtime);
+   holdtime = p->conf.holdtime;
else
-   holdtime = htons(conf->holdtime);
+   holdtime = conf->holdtime;
 
errs += ibuf_add_n8(buf->buf, 4);
errs += ibuf_add_n16(buf->buf, p->conf.local_short_as);
errs += ibuf_add_n16(buf->buf, holdtime);
/* is already in network byte order */



disruptive amd64 snapshot coming

2023-10-26 Thread Theo de Raadt
There is a pretty disruptive amd64 snapshot coming, so anyone who is
using snapshots for critical stuff should take a pause.  (This warning
about a development step is unusual, I won't make it common practice).



Re: snmpd: Fix close after protocol error case

2023-10-26 Thread Theo Buehler
On Thu, Oct 26, 2023 at 10:47:36AM +0200, Martijn van Duren wrote:
> So here's an elusive one that can be triggered every now and then by the
> new regression test. Once an AgentX session is opened and we send an
> invalid packet appl_agentx_recv() goes to appl_agentx_free(), since
> there's no recovery. appl_agentx_free() tries to neatly close all
> open sessions by sending a close-pdu, followed by calling
> appl_agentx_send() directly.
> However, if the socket has been closed in the meantime we hit
> appl_agentx_send()'s error path, which also calls appl_agentx_free().
> This in turn leads to use after free cases.
> 
> To fix this don't call appl_agentx_send() directly anymore, but just
> schedule it via conn_wev. To make sure as much data as possible is
> written out do a last unchecked courtesy flush before definitively
> freeing the connection. Since appl_agentx_forceclose() arms conn_wev
> move the event_del() calls down in appl_agentx_free().
> 
> Other calls of appl_agentx_send() should be fine, but just convert
> all of them to be consistent and safe.

ok tb



Re: snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Theo Buehler
On Thu, Oct 26, 2023 at 11:51:00AM +0200, Martijn van Duren wrote:
> This case is covered by the new regress' backend_get_toofew and
> backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
> to the max it's really hard to trigger (I had to run
> backend_get_wrongorder, backend_get_toofew, backend_get_toomany
> sequentially in a tight loop killing snmpd between iterations for the
> best chance).
> 
> When we receive an invalid varbindlist in a response we set the invalid
> variable. This in turn calls appl_varbind_error(), but the avi_state
> of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
> we call appl_request_downstream_free(), which sees that the varbinds
> haven't been resolved, triggering a call to
> appl_request_upstream_resolve(). This call in turn sees that the
> error has been set and just sends out the error-response to the client
> and frees the appl_request_upstream. From here we return back to
> appl_response(), which also calls appl_request_upstream_resolve(),
> resulting in a use after free.
> 
> The main tool for fixing this issue is making use of
> appl_request_upstream's aru_locked member, which will cause
> appl_request_upstream_resolve() to return instantly. The simplest fix is
> to set aru_locked before calling appl_request_downstream_free() and
> unsetting it directly afterwards inside appl_response().
> 
> The second one is the diff proposed below, which shrinks the code.
> 
> appl_request_upstream_free() is only called once from
> appl_request_upstream_reply(). appl_request_upstream_reply() in turn
> is only called by appl_request_upstream_resolve().
> appl_request_upstream_resolve() is called in 3 places:
> - appl_processpdu(): to kick things off
> - appl_request_downstream_free(): For when a backend disappears with
> outstanding requests
> - appl_response(): To kickstart the next round of resolving.
> 
> Since appl_request_downstream_free() is always called from
> appl_response(), we can leverage that function and make it call
> appl_request_upstream_resolve() unconditionally.
> 
> appl_request_downstream_free() is called from the following locations:
> - appl_close(): When a backend has disappeared.
> - appl_request_upstream_free(): We send out a reply early, because an
> error has been detected.
> - appl_response(): We received a response
> 
> appl_request_upstream_free() can't reenter into
> appl_request_upstream_resolve(), or it would potentially trigger new
> appl_request_downstreams. This can be prevented by setting aru_locked
> before calling appl_request_downstream_free().
> For all other cases we should rely on appl_request_upstream_resolve()'s
> logic to handle varbinds in any state, so there's no reason make calls
> from other contexts conditional.

Your description of the bug makes sense and your choice of resolving it
as well. Thanks for the in-depth explanation, that helped a lot. Still,
I must say that I don't really feel at ease with the amount of complexity
and entanglement here. I simply can't fit all of this into my head within
a reasonable amount of time.

ok tb (fwiw)

since it resolves a real problem and simplifies things a bit.



Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Martin Pieuchot
On 26/10/23(Thu) 07:06, Miod Vallat wrote:
> > I wonder if the diff below makes a difference.  It's hard to debug and it
> > might be worth adding a counter for bad swap slots.
> 
> It did not help (but your diff is probably correct).

In that case I'd like to put both diffs in, are you ok with that?

> > Index: uvm/uvm_anon.c
> > ===
> > RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> > retrieving revision 1.56
> > diff -u -p -r1.56 uvm_anon.c
> > --- uvm/uvm_anon.c  2 Sep 2023 08:24:40 -   1.56
> > +++ uvm/uvm_anon.c  22 Oct 2023 21:27:42 -
> > @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
> > uvm_unlock_pageq(); /* free the daemon */
> > }
> > } else {
> > -   if (anon->an_swslot != 0) {
> > +   if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
> > /* This page is no longer only in swap. */
> > KASSERT(uvmexp.swpgonly > 0);
> > atomic_dec_int();



Re: relayd.conf.5: less SSL

2023-10-26 Thread Klemens Nanni
On Tue, Oct 24, 2023 at 09:09:21AM +0200, Peter N. M. Hansteen wrote:
> On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> > - parse.y still accepting undocumented "ssl" with a warning since 2014
> > - more "SSL/TLS" instead of "TLS" in manual and code comments
> 
> my take would be that while it's fine to streamline the documentation to use
> the modern terminology, I suspect there may still be ancient configurations
> out there that use the "ssl" keyword, so removing the last bit of support for
> that option should be accompanied by or preceded by a warning on relevant
> mailing lists or at least in the commit message. 
> 
> And I think undeadly.org would be more than happy to help spread the word :)

current.html entry should do for a deprecated keyword we've been warning
about for almost ten years...  I've checked faq/upgrade*.html for previous
notes, but couldn't find any.

Here's a first try, relayd regress is also happy.

Index: usr.sbin/relayd/parse.y
===
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.254
diff -u -p -r1.254 parse.y
--- usr.sbin/relayd/parse.y 3 Jul 2023 09:38:08 -   1.254
+++ usr.sbin/relayd/parse.y 26 Oct 2023 06:07:08 -
@@ -175,7 +175,7 @@ typedef struct {
 %token LOOKUP METHOD MODE NAT NO DESTINATION NODELAY NOTHING ON PARENT PATH
 %token PFTAG PORT PREFORK PRIORITY PROTO QUERYSTR REAL REDIRECT RELAY REMOVE
 %token REQUEST RESPONSE RETRY QUICK RETURN ROUNDROBIN ROUTE SACK SCRIPT SEND
-%token SESSION SOCKET SPLICE SSL STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
+%token SESSION SOCKET SPLICE STICKYADDR STRIP STYLE TABLE TAG TAGGED TCP
 %token TIMEOUT TLS TO ROUTER RTLABEL TRANSPARENT URL WITH TTL RTABLE
 %token MATCH PARAMS RANDOM LEASTSTATES SRCHASH KEY CERTIFICATE PASSWORD ECDHE
 %token EDH TICKETS CONNECTION CONNECTIONS CONTEXT ERRORS STATE CHANGES CHECKS
@@ -227,21 +227,12 @@ include   : INCLUDE STRING{
}
;
 
-ssltls : SSL   {
-   log_warnx("%s:%d: %s",
-   file->name, yylval.lineno,
-   "please use the \"tls\" keyword"
-   " instead of \"ssl\"");
-   }
-   | TLS
-   ;
-
 opttls : /*empty*/ { $$ = 0; }
-   | ssltls{ $$ = 1; }
+   | TLS   { $$ = 1; }
;
 
 opttlsclient   : /*empty*/ { $$ = 0; }
-   | WITH ssltls   { $$ = 1; }
+   | WITH TLS  { $$ = 1; }
;
 
 http_type  : HTTP  { $$ = 0; }
@@ -905,7 +896,7 @@ hashkey : /* empty */   {
 
 tablecheck : ICMP  { table->conf.check = CHECK_ICMP; }
| TCP   { table->conf.check = CHECK_TCP; }
-   | ssltls{
+   | TLS   {
table->conf.check = CHECK_TCP;
conf->sc_conf.flags |= F_TLS;
table->conf.flags |= F_TLS;
@@ -1114,7 +1105,7 @@ protopts_l: protopts_l protoptsl nl
| protoptsl optnl
;
 
-protoptsl  : ssltls {
+protoptsl  : TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -1122,7 +1113,7 @@ protoptsl : ssltls {
YYERROR;
}
} tlsflags
-   | ssltls {
+   | TLS {
if (!(proto->type == RELAY_PROTO_TCP ||
proto->type == RELAY_PROTO_HTTP)) {
yyerror("can set tls options only for "
@@ -2492,7 +2483,6 @@ lookup(char *s)
{ "socket", SOCKET },
{ "source-hash",SRCHASH },
{ "splice", SPLICE },
-   { "ssl",SSL },
{ "state",  STATE },
{ "sticky-address", STICKYADDR },
{ "strip",  STRIP },
Index: usr.sbin/relayd/relay.c
===
RCS file: /cvs/src/usr.sbin/relayd/relay.c,v
retrieving revision 1.257
diff -u -p -r1.257 relay.c
--- usr.sbin/relayd/relay.c 3 Sep 2023 10:22:03 -   1.257
+++ usr.sbin/relayd/relay.c 26 Oct 2023 05:49:22 -
@@ -2064,7 +2064,7 @@ relay_tls_ctx_create_proto(struct protoc
 {
uint32_t protocols = 0;
 
-   /* Set the allowed SSL protocols */
+   /* Set the allowed TLS protocols */
if (proto->tlsflags & TLSFLAG_TLSV1_2)
protocols |= TLS_PROTOCOL_TLSv1_2;
if (proto->tlsflags & TLSFLAG_TLSV1_3)
@@ -2186,7 

snmpd; Fix use after free for appl_request_upstream

2023-10-26 Thread Martijn van Duren
This case is covered by the new regress' backend_get_toofew and
backend_get_toomany tests. However, even with MALLOC_OPTIONS cranked
to the max it's really hard to trigger (I had to run
backend_get_wrongorder, backend_get_toofew, backend_get_toomany
sequentially in a tight loop killing snmpd between iterations for the
best chance).

When we receive an invalid varbindlist in a response we set the invalid
variable. This in turn calls appl_varbind_error(), but the avi_state
of the varbinds remains in APPL_VBSTATE_PENDING. Directly following
we call appl_request_downstream_free(), which sees that the varbinds
haven't been resolved, triggering a call to
appl_request_upstream_resolve(). This call in turn sees that the
error has been set and just sends out the error-response to the client
and frees the appl_request_upstream. From here we return back to
appl_response(), which also calls appl_request_upstream_resolve(),
resulting in a use after free.

The main tool for fixing this issue is making use of
appl_request_upstream's aru_locked member, which will cause
appl_request_upstream_resolve() to return instantly. The simplest fix is
to set aru_locked before calling appl_request_downstream_free() and
unsetting it directly afterwards inside appl_response().

The second one is the diff proposed below, which shrinks the code.

appl_request_upstream_free() is only called once from
appl_request_upstream_reply(). appl_request_upstream_reply() in turn
is only called by appl_request_upstream_resolve().
appl_request_upstream_resolve() is called in 3 places:
- appl_processpdu(): to kick things off
- appl_request_downstream_free(): For when a backend disappears with
outstanding requests
- appl_response(): To kickstart the next round of resolving.

Since appl_request_downstream_free() is always called from
appl_response(), we can leverage that function and make it call
appl_request_upstream_resolve() unconditionally.

appl_request_downstream_free() is called from the following locations:
- appl_close(): When a backend has disappeared.
- appl_request_upstream_free(): We send out a reply early, because an
error has been detected.
- appl_response(): We received a response

appl_request_upstream_free() can't reenter into
appl_request_upstream_resolve(), or it would potentially trigger new
appl_request_downstreams. This can be prevented by setting aru_locked
before calling appl_request_downstream_free().
For all other cases we should rely on appl_request_upstream_resolve()'s
logic to handle varbinds in any state, so there's no reason make calls
from other contexts conditional.

OK?

martijn@

Index: application.c
===
RCS file: /cvs/src/usr.sbin/snmpd/application.c,v
retrieving revision 1.24
diff -u -p -r1.24 application.c
--- application.c   24 Oct 2023 14:21:58 -  1.24
+++ application.c   26 Oct 2023 09:40:23 -
@@ -710,6 +710,7 @@ appl_request_upstream_free(struct appl_r
if (ureq == NULL)
return;
 
+   ureq->aru_locked = 1;
for (i = 0; i < ureq->aru_varbindlen && ureq->aru_vblist != NULL; i++) {
vb = &(ureq->aru_vblist[i]);
ober_free_elements(vb->avi_varbind.av_value);
@@ -726,7 +727,6 @@ void
 appl_request_downstream_free(struct appl_request_downstream *dreq)
 {
struct appl_varbind_internal *vb;
-   int retry = 0;
 
if (dreq == NULL)
return;
@@ -736,14 +736,11 @@ appl_request_downstream_free(struct appl
 
for (vb = dreq->ard_vblist; vb != NULL; vb = vb->avi_next) {
vb->avi_request_downstream = NULL;
-   if (vb->avi_state == APPL_VBSTATE_PENDING) {
+   if (vb->avi_state == APPL_VBSTATE_PENDING)
vb->avi_state = APPL_VBSTATE_NEW;
-   retry = 1;
-   }
}
 
-   if (retry)
-   appl_request_upstream_resolve(dreq->ard_request);
+   appl_request_upstream_resolve(dreq->ard_request);
free(dreq);
 }
 
@@ -1172,9 +1169,6 @@ appl_response(struct appl_backend *backe
backend->ab_name);
backend->ab_fn->ab_close(backend, APPL_CLOSE_REASONPARSEERROR);
}
-
-   if (ureq != NULL)
-   appl_request_upstream_resolve(ureq);
 }
 
 int



snmpd: Fix close after protocol error case

2023-10-26 Thread Martijn van Duren
So here's an elusive one that can be triggered every now and then by the
new regression test. Once an AgentX session is opened and we send an
invalid packet appl_agentx_recv() goes to appl_agentx_free(), since
there's no recovery. appl_agentx_free() tries to neatly close all
open sessions by sending a close-pdu, followed by calling
appl_agentx_send() directly.
However, if the socket has been closed in the meantime we hit
appl_agentx_send()'s error path, which also calls appl_agentx_free().
This in turn leads to use after free cases.

To fix this don't call appl_agentx_send() directly anymore, but just
schedule it via conn_wev. To make sure as much data as possible is
written out do a last unchecked courtesy flush before definitively
freeing the connection. Since appl_agentx_forceclose() arms conn_wev
move the event_del() calls down in appl_agentx_free().

Other calls of appl_agentx_send() should be fine, but just convert
all of them to be consistent and safe.

OK?

martijn@

Index: usr.sbin/snmpd/application_agentx.c
===
RCS file: /cvs/src/usr.sbin/snmpd/application_agentx.c,v
retrieving revision 1.12
diff -u -p -r1.12 application_agentx.c
--- usr.sbin/snmpd/application_agentx.c 24 Oct 2023 14:11:14 -  1.12
+++ usr.sbin/snmpd/application_agentx.c 26 Oct 2023 08:43:02 -
@@ -254,9 +254,6 @@ appl_agentx_free(struct appl_agentx_conn
 {
struct appl_agentx_session *session;
 
-   event_del(&(conn->conn_rev));
-   event_del(&(conn->conn_wev));
-
while ((session = TAILQ_FIRST(&(conn->conn_sessions))) != NULL) {
if (conn->conn_ax == NULL)
appl_agentx_session_free(session);
@@ -265,7 +262,12 @@ appl_agentx_free(struct appl_agentx_conn
reason);
}
 
+   event_del(&(conn->conn_rev));
+   event_del(&(conn->conn_wev));
+
RB_REMOVE(appl_agentx_conns, _agentx_conns, conn);
+   if (conn->conn_ax != NULL)
+   (void)ax_send(conn->conn_ax);
ax_free(conn->conn_ax);
if (conn->conn_backend)
fatalx("AgentX(%"PRIu32"): disappeared unexpected",
@@ -419,7 +421,7 @@ appl_agentx_recv(int fd, short event, vo
pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, smi_getticks(),
APPL_ERROR_NOERROR, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
break;
case AX_PDU_TYPE_INDEXALLOCATE:
case AX_PDU_TYPE_INDEXDEALLOCATE:
@@ -431,7 +433,7 @@ appl_agentx_recv(int fd, short event, vo
APPL_ERROR_PROCESSINGERROR, 1,
pdu->ap_payload.ap_vbl.ap_varbind,
pdu->ap_payload.ap_vbl.ap_nvarbind);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
break;
case AX_PDU_TYPE_ADDAGENTCAPS:
case AX_PDU_TYPE_REMOVEAGENTCAPS:
@@ -451,7 +453,7 @@ appl_agentx_recv(int fd, short event, vo
pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, smi_getticks(),
error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
ax_pdu_free(pdu);
 
if (session == NULL || error != APPL_ERROR_PARSEERROR)
@@ -560,13 +562,13 @@ appl_agentx_open(struct appl_agentx_conn
ax_response(conn->conn_ax, session->sess_id,
pdu->ap_header.aph_transactionid, pdu->ap_header.aph_packetid,
smi_getticks(), APPL_ERROR_NOERROR, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
 
return;
  fail:
ax_response(conn->conn_ax, 0, pdu->ap_header.aph_transactionid,
pdu->ap_header.aph_packetid, 0, error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
if (session != NULL)
free(session->sess_descr.aos_string);
free(session);
@@ -592,7 +594,7 @@ appl_agentx_close(struct appl_agentx_ses
ax_response(conn->conn_ax, pdu->ap_header.aph_sessionid,
pdu->ap_header.aph_transactionid, pdu->ap_header.aph_packetid,
smi_getticks(), error, 0, NULL, 0);
-   appl_agentx_send(-1, EV_WRITE, conn);
+   event_add(&(conn->conn_wev), NULL);
if (error == APPL_ERROR_NOERROR)
return;
 
@@ -612,7 +614,7 @@ appl_agentx_forceclose(struct appl_backe
session->sess_conn->conn_ax->ax_byteorder = session->sess_byteorder;
ax_close(session->sess_conn->conn_ax, session->sess_id,
(enum ax_close_reason) reason);
-   appl_agentx_send(-1, EV_WRITE, session->sess_conn);
+   event_add(&(session->sess_conn->conn_wev), NULL);
 
strlcpy(name, session->sess_backend.ab_name, sizeof(name));

apu4 real com0 boot - not working to install 7.4

2023-10-26 Thread harold felton
apologies for crossposting - feel free to direct me to the correct place
and/or ignore other versions...  i started a reddit-thread HERE

which
didnt specify which list specifically - but provided a very-useful
help/ftp-site to bisect when the problem occurred...

i have a pcengines apu4 which has been running 7.1-stable with all updates
just fine...  i had been avoiding doing an upgrade - so was thinking to
reinstall based on 7.4-release this week...  i was unable to do an install
from the bsd.rd (for 7.4) - and have bisected the issue to somewhere around
the 9th or 10th of september...

i will enclose the bug-report-info i just now collected from the
running-machine...  the symptom is: the bsd.rd seems to time-out/fail at
the very-beginning from com0 and then forces a reboot to the system...

onscreen - the com0 output for two) successful bsd.rd looks as follows:

  [00:15:43 - Thu Oct 26] {-ksh:729}
--hfeltonad...@apu4.hfelton.net:~/snaps/s20230908/usr[729]
$ doas reboot
doas (hfeltonad...@apu4.hfelton.net) password:
stopping package daemons: fossild.
syncing disks... done
rebooting...
▒PC Engines apu4
coreboot build 20230131
BIOS version v4.19.0.1
4080 MB ECC DRAM
SeaBIOS (version rel-1.16.0.1-0-g77603a32)

Press F10 key now for boot menu

Booting from Hard Disk...
Using drive 0, partition 3.
Loading..
probing: pc0 com0 com1 com2 com3 mem[639K 3325M 752M a20=on]
disk: hd0+
>> OpenBSD/amd64 BOOT 3.53
switching console to com>> OpenBSD/amd64 BOOT 3.53
boot> 0
b bsd.rd
booting hd0a:bsd.rd: 3891908+1614848+3895112+0+708608
[109+435984+290736]=0xa57ab0
entry point at 0x81001000
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2022 OpenBSD. All rights reserved.
https://www.OpenBSD.org

OpenBSD 7.1 (RAMDISK_CD) #440: Mon Apr 11 18:09:13 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/RAMDISK_CD
real mem = 4259930112 (4062MB)
avail mem = 4126810112 (3935MB)
random: good seed from bootblocks
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xcfe97040 (13 entries)
bios0: vendor coreboot version "v4.19.0.1" date 01/31/2023
bios0: PC Engines apu4
acpi0 at bios0: ACPI 6.0
acpi0: tables DSDT FACP SSDT MCFG TPM2 APIC HEST SSDT SSDT DRTM HPET
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD GX-412TC SOC, 998.27 MHz, 16-30-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,T
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB
64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, IBE
cpu at mainbus0: not configured
cpu at mainbus0: not configured
cpu at mainbus0: not configured
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 21, 24 pins
ioapic1 at mainbus0: apid 5 pa 0xfec2, version 21, 32 pins
acpihpet0 at acpi0: 14318180 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (PBR4)
acpiprt2 at acpi0: bus 2 (PBR5)
acpiprt3 at acpi0: bus 3 (PBR6)
acpiprt4 at acpi0: bus 4 (PBR7)
acpiprt5 at acpi0: bus -1 (PBR8)
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at acpi0 not configured
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
com0 at acpi0 COM1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at acpi0 COM2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
amdgpio0 at acpi0 GPIO uid 0 addr 0xfed81500/0x300 irq 7, 184 pins
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"PRP0001" at acpi0 not configured
"BOOT" at acpi0 not configured
acpitz at acpi0 not configured
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "AMD 16h Root Complex" rev 0x00
vendor "AMD", unknown product 0x1567 (class system subclass IOMMU, rev
0x00) at pci0 dev 0 function 2 not configured
pchb1 at pci0 dev 2 function 0 "AMD 16h Host" rev 0x00
ppb0 at pci0 dev 2 function 1 "AMD 16h PCIE" rev 0x00: msi
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 "Intel I211" rev 0x03: msi, address
00:0d:b9:55:bb:64
ppb1 at pci0 dev 2 function 2 "AMD 16h PCIE" rev 0x00: msi
pci2 at ppb1 bus 2
em1 at pci2 dev 0 function 0 "Intel I211" rev 0x03: msi, address
00:0d:b9:55:bb:65
ppb2 at pci0 dev 2 function 3 "AMD 16h PCIE" rev 0x00: msi
pci3 at ppb2 bus 3
em2 at pci3 dev 0 function 0 "Intel I211" rev 

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-26 Thread Miod Vallat
> I wonder if the diff below makes a difference.  It's hard to debug and it
> might be worth adding a counter for bad swap slots.

It did not help (but your diff is probably correct).

> Index: uvm/uvm_anon.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
> retrieving revision 1.56
> diff -u -p -r1.56 uvm_anon.c
> --- uvm/uvm_anon.c2 Sep 2023 08:24:40 -   1.56
> +++ uvm/uvm_anon.c22 Oct 2023 21:27:42 -
> @@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
>   uvm_unlock_pageq(); /* free the daemon */
>   }
>   } else {
> - if (anon->an_swslot != 0) {
> + if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
>   /* This page is no longer only in swap. */
>   KASSERT(uvmexp.swpgonly > 0);
>   atomic_dec_int();



Re: ibuf free fd on close

2023-10-25 Thread Theo Buehler
On Tue, Oct 24, 2023 at 04:00:42PM +0200, Claudio Jeker wrote:
> On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> > On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > > close any fd still on the buffer. This way even if a fd is unexpectedly
> > > passed nothing will happen.
> > > 
> > > That code was disabled at start because userland was not fully ready. In
> > > particular rpki-client did not handle that well. All of this is to my
> > > knowledge fixed so there is no reason to keep the NOTYET :)
> > > 
> > > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > > Code not doing so will break because ibuf_free() will close the fd which
> > > is probably still in use somewhere else.
> > 
> > Nothing in base outside of libutil seems to reach directly for the fd
> > (checked by compiling with that struct member renamed in the public
> > header).
> > 
> > The internal uses are addressed by this diff, so
> > 
> > ok tb
> > 
> > I can put the fd rename through a bulk to catch some ports in a couple
> > of days but I don't think there is a need to wait.
> 
> Thanks. Do we have a list of ports that use ibuf / imsg? 

mdnsd sets an ibuf fd to -1, which is harmless:
https://github.com/haesbaert/mdnsd/blob/master/libmdns/mdnsl.c#L513

got's lib/privsep.c code has a similar pattern of setting the fd before
imsg_close(). This won't be affected directly by this change as far as I
can see, but the privsep code probably needs an fd audit, ibuf related
and otherwise. Just skimming the code I saw a few potential fd leaks,
e.g. request_tree() leaks if got_privsep_send_tree_req() exits early.

PS: It is quite easy to hit "unexpected end of file" if one clicks on
links in the web view. Clicking on "i can't count" here:
https://got.gameoftrees.org/?action=summary=got.git



Re: malloc: micro optimizations

2023-10-25 Thread Masato Asou
Hi,

It works fine for me.
ok asou@
--
ASOU Masato

From: Otto Moerbeek 
Date: Wed, 25 Oct 2023 11:04:01 +0200

> Hi,
> 
> a few micro-optimization, including getting rid of some statistics
> that are not actualy very interesting.
> 
> Speedup amounts to a few tenths of percents to a few percents,
> depending on how biased the benchmark is.
> 
>   -Otto
> 
> Index: stdlib/malloc.c
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.291
> diff -u -p -r1.291 malloc.c
> --- stdlib/malloc.c   22 Oct 2023 12:19:26 -  1.291
> +++ stdlib/malloc.c   24 Oct 2023 14:05:37 -
> @@ -169,16 +169,12 @@ struct dir_info {
>   void *caller;
>   size_t inserts;
>   size_t insert_collisions;
> - size_t finds;
> - size_t find_collisions;
>   size_t deletes;
>   size_t delete_moves;
>   size_t cheap_realloc_tries;
>   size_t cheap_reallocs;
>   size_t malloc_used; /* bytes allocated */
>   size_t malloc_guarded;  /* bytes used for guards */
> - size_t pool_searches;   /* searches for pool */
> - size_t other_pool;  /* searches in other pool */
>  #define STATS_ADD(x,y)   ((x) += (y))
>  #define STATS_SUB(x,y)   ((x) -= (y))
>  #define STATS_INC(x) ((x)++)
> @@ -209,12 +205,14 @@ static void unmap(struct dir_info *d, vo
>  struct chunk_info {
>   LIST_ENTRY(chunk_info) entries;
>   void *page; /* pointer to the page */
> + /* number of shorts should add up to 8, check alloc_chunk_info() */
>   u_short canary;
>   u_short bucket;
>   u_short free;   /* how many free chunks */
>   u_short total;  /* how many chunks */
>   u_short offset; /* requested size table offset */
> - u_short bits[1];/* which chunks are free */
> +#define CHUNK_INFO_TAIL  3
> + u_short bits[CHUNK_INFO_TAIL];  /* which chunks are free */
>  };
>  
>  #define CHUNK_FREE(i, n) ((i)->bits[(n) / MALLOC_BITS] & (1U << ((n) % 
> MALLOC_BITS)))
> @@ -656,12 +654,10 @@ find(struct dir_info *d, void *p)
>   index = hash(p) & mask;
>   r = d->r[index].p;
>   q = MASK_POINTER(r);
> - STATS_INC(d->finds);
>   while (q != p && r != NULL) {
>   index = (index - 1) & mask;
>   r = d->r[index].p;
>   q = MASK_POINTER(r);
> - STATS_INC(d->find_collisions);
>   }
>   return (q == p && r != NULL) ? >r[index] : NULL;
>  }
> @@ -949,7 +945,7 @@ init_chunk_info(struct dir_info *d, stru
>  
>   p->bucket = bucket;
>   p->total = p->free = MALLOC_PAGESIZE / B2ALLOC(bucket);
> - p->offset = bucket == 0 ? 0xdead : howmany(p->total, MALLOC_BITS);
> + p->offset = howmany(p->total, MALLOC_BITS);
>   p->canary = (u_short)d->canary1;
>  
>   /* set all valid bits in the bitmap */
> @@ -971,8 +967,13 @@ alloc_chunk_info(struct dir_info *d, u_i
>   count = MALLOC_PAGESIZE / B2ALLOC(bucket);
>  
>   size = howmany(count, MALLOC_BITS);
> - size = sizeof(struct chunk_info) + (size - 1) * sizeof(u_short);
> - if (mopts.chunk_canaries)
> + /* see declaration of struct chunk_info */
> + if (size <= CHUNK_INFO_TAIL)
> + size = 0;
> + else
> + size -= CHUNK_INFO_TAIL;
> + size = sizeof(struct chunk_info) + size * sizeof(u_short);
> + if (mopts.chunk_canaries && bucket > 0)
>   size += count * sizeof(u_short);
>   size = _ALIGN(size);
>   count = MALLOC_PAGESIZE / size;
> @@ -1129,8 +1130,7 @@ fill_canary(char *ptr, size_t sz, size_t
>  static void *
>  malloc_bytes(struct dir_info *d, size_t size)
>  {
> - u_int i, r, bucket, listnum;
> - size_t k;
> + u_int i, k, r, bucket, listnum;
>   u_short *lp;
>   struct chunk_info *bp;
>   void *p;
> @@ -1170,7 +1170,7 @@ malloc_bytes(struct dir_info *d, size_t 
>   /* no bit halfway, go to next full short */
>   i /= MALLOC_BITS;
>   for (;;) {
> - if (++i >= howmany(bp->total, MALLOC_BITS))
> + if (++i >= bp->offset)
>   i = 0;
>   lp = >bits[i];
>   if (*lp) {
> @@ -1228,7 +1228,7 @@ validate_canary(struct dir_info *d, u_ch
>   }
>  }
>  
> -static uint32_t
> +static inline uint32_t
>  find_chunknum(struct dir_info *d, struct chunk_info *info, void *ptr, int 
> check)
>  {
>   uint32_t chunknum;
> @@ -1532,12 +1532,10 @@ findpool(void *p, struct dir_info *argpo
>   struct dir_info *pool = argpool;
>   struct region_info *r = find(pool, p);
>  
> - STATS_INC(pool->pool_searches);
>   if (r == NULL) {
>   u_int i, nmutexes;
>  
>   nmutexes = 

Re: dhcpd(8): Parse lease database after dropping privileges

2023-10-25 Thread Stephen Fox
Hello,

I just wanted to double check if there is any additional feedback on the patch.

I am happy to make changes or discuss alternative approaches.

Thank you,
Stephen

On Tue, Oct 3, 2023 at 11:33 PM Stephen Fox  wrote:
>
> Hello,
>
> I received feedback from deraadt that the first two unveil(2) calls were
> unnecessary because pledge(2) automatically unveils "/usr/share/zoneinfo".
>
> This updated patch removes the unneeded unveil(2) calls.
>
> Best regards,
> Stephen
> ---
>  usr.sbin/dhcpd/confpars.c | 41 ++-
>  usr.sbin/dhcpd/db.c   | 19 ++
>  usr.sbin/dhcpd/dhcpd.c| 30 ++--
>  usr.sbin/dhcpd/dhcpd.h|  2 ++
>  4 files changed, 76 insertions(+), 16 deletions(-)
>
> diff --git usr.sbin/dhcpd/confpars.c usr.sbin/dhcpd/confpars.c
> index 1bdffb38842..bb245fc663e 100644
> --- usr.sbin/dhcpd/confpars.c
> +++ usr.sbin/dhcpd/confpars.c
> @@ -57,6 +57,8 @@
>  #include "dhctoken.h"
>  #include "log.h"
>
> +FILE   *db_file_ro;
> +
>  intparse_cidr(FILE *, unsigned char *, unsigned char *);
>
>  /*
> @@ -106,18 +108,11 @@ readconf(void)
>  }
>
>  /*
> - * lease-file :== lease-declarations EOF
> - * lease-statments :== 
> - *| lease-declaration
> - *| lease-declarations lease-declaration
> + * Open and retain a read-only file descriptor to the lease database file.
>   */
>  void
> -read_leases(void)
> +open_leases(void)
>  {
> -   FILE *cfile;
> -   char *val;
> -   int token;
> -
> new_parse(path_dhcpd_db);
>
> /*
> @@ -131,23 +126,40 @@ read_leases(void)
>  * thinking that no leases have been assigned to anybody, which
>  * could create severe network chaos.
>  */
> -   if ((cfile = fopen(path_dhcpd_db, "r")) == NULL) {
> +   if ((db_file_ro = fopen(path_dhcpd_db, "r")) == NULL) {
> log_warn("Can't open lease database (%s)", path_dhcpd_db);
> log_warnx("check for failed database rewrite attempt!");
> log_warnx("Please read the dhcpd.leases manual page if you");
> fatalx("don't know what to do about this.");
> }
> +}
> +
> +/*
> + * lease-file :== lease-declarations EOF
> + * lease-statments :== 
> + *| lease-declaration
> + *| lease-declarations lease-declaration
> + */
> +void
> +read_leases(void)
> +{
> +   char *val;
> +   int token;
> +
> +   if (!db_file_ro) {
> +   fatalx("db_file_ro is NULL");
> +   }
>
> do {
> -   token = next_token(, cfile);
> +   token = next_token(, db_file_ro);
> if (token == EOF)
> break;
> if (token != TOK_LEASE) {
> log_warnx("Corrupt lease file - possible data loss!");
> -   skip_to_semi(cfile);
> +   skip_to_semi(db_file_ro);
> } else {
> struct lease *lease;
> -   lease = parse_lease_declaration(cfile);
> +   lease = parse_lease_declaration(db_file_ro);
> if (lease)
> enter_lease(lease);
> else
> @@ -155,7 +167,8 @@ read_leases(void)
> }
>
> } while (1);
> -   fclose(cfile);
> +   fclose(db_file_ro);
> +   db_file_ro = NULL;
>  }
>
>  /*
> diff --git usr.sbin/dhcpd/db.c usr.sbin/dhcpd/db.c
> index 295e522b1ce..634ec8a593a 100644
> --- usr.sbin/dhcpd/db.c
> +++ usr.sbin/dhcpd/db.c
> @@ -190,6 +190,16 @@ commit_leases(void)
> return (1);
>  }
>
> +/*
> + * Open and retain two file descriptors to the lease database file:
> + *
> + * - A read-only fd for learning existing leases
> + * - A write-only fd for writing new leases
> + *
> + * Reading and parsing data from the read-only fd is done separately.
> + * This allows privilege drop to happen *before* parsing untrusted
> + * client data from the lease database file.
> + */
>  void
>  db_startup(void)
>  {
> @@ -202,6 +212,15 @@ db_startup(void)
> if ((db_file = fdopen(db_fd, "w")) == NULL)
> fatalx("Can't fdopen new lease file!");
>
> +   open_leases();
> +}
> +
> +/*
> + * Read and parse the existing leases from the lease database file.
> + */
> +void
> +db_parse(void)
> +{
> /* Read in the existing lease file... */
> read_leases();
> time(_time);
> diff --git usr.sbin/dhcpd/dhcpd.c usr.sbin/dhcpd/dhcpd.c
> index b3562dce34f..7759f7839e0 100644
> --- usr.sbin/dhcpd/dhcpd.c
> +++ usr.sbin/dhcpd/dhcpd.c
> @@ -244,8 +244,6 @@ main(int argc, char *argv[])
>
> icmp_startup(1, lease_pinged);
>
> -   if (chroot(pw->pw_dir) == -1)
> -   fatal("chroot %s", pw->pw_dir);
> if (chdir("/") == -1)
> fatal("chdir(\"/\")");
> if (setgroups(1, >pw_gid) ||
> @@ -253,6 +251,34 

Re: patch unveil fail

2023-10-25 Thread Alexander Bluhm
On Wed, Oct 25, 2023 at 07:00:28PM +0200, Omar Polo wrote:
> On 2023/10/25 13:38:37 +0200, Alexander Bluhm  wrote:
> > @@ -213,11 +214,27 @@ main(int argc, char *argv[])
> > perror("unveil");
> > my_exit(2);
> > }
> > -   if (filearg[0] != NULL)
> > +   if (filearg[0] != NULL) {
> > +   char *origdir;
> > +
> > if (unveil(filearg[0], "rwc") == -1) {
> > perror("unveil");
> > my_exit(2);
> > }
> > +   if ((origdir = dirname(filearg[0])) == NULL) {
> 
> Not sure if we're interested in it, but dirname(3) theoretically alter
> the passed string.  our dirname doesn't do it, but per posix it can,
> IIUC.  This could cause issues since filearg[0] is used later.
> 
> If we care about portability here, we should pass a copy to dirname.
> don't know if we care thought.

unveil(2) is not portable code anyway.  And dirname(3) is only used
for that.

> > +   perror("dirname");
> > +   my_exit(2);
> > +   }
> > +   if (unveil(origdir, "rwc") == -1) {
> > +   perror("unveil");
> > +   my_exit(2);
> > +   }
> > +   } else {
> > +   if (unveil(".", "rwc") == -1) {
> > +   perror("unveil");
> > +   my_exit(2);
> > +   }
> > +   }
> > if (filearg[1] != NULL)
> > if (unveil(filearg[1], "r") == -1) {
> > perror("unveil");



Re: patch unveil fail

2023-10-25 Thread Omar Polo
On 2023/10/25 13:38:37 +0200, Alexander Bluhm  wrote:
> Index: patch.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> diff -u -p -r1.74 patch.c
> --- patch.c   19 Jul 2023 13:26:20 -  1.74
> +++ patch.c   24 Oct 2023 17:13:28 -
> @@ -32,6 +32,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -213,11 +214,27 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (filearg[0] != NULL)
> + if (filearg[0] != NULL) {
> + char *origdir;
> +
>   if (unveil(filearg[0], "rwc") == -1) {
>   perror("unveil");
>   my_exit(2);
>   }
> + if ((origdir = dirname(filearg[0])) == NULL) {

Not sure if we're interested in it, but dirname(3) theoretically alter
the passed string.  our dirname doesn't do it, but per posix it can,
IIUC.  This could cause issues since filearg[0] is used later.

If we care about portability here, we should pass a copy to dirname.
don't know if we care thought.

> + perror("dirname");
> + my_exit(2);
> + }
> + if (unveil(origdir, "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + } else {
> + if (unveil(".", "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + }
>   if (filearg[1] != NULL)
>   if (unveil(filearg[1], "r") == -1) {
>   perror("unveil");



Re: patch unveil fail

2023-10-25 Thread Todd C . Miller
On Wed, 25 Oct 2023 13:38:37 +0200, Alexander Bluhm wrote:

> Since 7.4 patch(1) does not work if an explicit patchfile is given on
> command line.
>
> https://marc.info/?l=openbsd-cvs=168941770509379=2

OK millert@

 - todd



Re: patch unveil fail

2023-10-25 Thread Florian Obser
reads correct, OK florian

On 2023-10-25 13:38 +02, Alexander Bluhm  wrote:
> Hi,
>
> Since 7.4 patch(1) does not work if an explicit patchfile is given on
> command line.
>
> https://marc.info/?l=openbsd-cvs=168941770509379=2
>
> root@ot14:.../~# patch /usr/src/usr.bin/patch/patch.c patch-unveil.diff
> Hmm...  Looks like a unified diff to me...
> The text leading up to this was:
> --
> |Index: patch.c
> |===
> |RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> |diff -u -p -r1.74 patch.c
> |--- patch.c19 Jul 2023 13:26:20 -  1.74
> |+++ patch.c24 Oct 2023 17:13:28 -
> --
> Patching file /usr/src/usr.bin/patch/patch.c using Plan A...
> Hunk #1 succeeded at 32.
> Hunk #2 succeeded at 214.
> Hunk #3 succeeded at 245.
> Can't backup /usr/src/usr.bin/patch/patch.c, output is in 
> /tmp/patchoorjYymLKcM: No such file or directory
> done
>
> A backup file should be created in the directory of the original
> file, but only the current directory is unveiled.  Then the patched
> file is created in /tmp and does not replace the original patchfile
> in place.
>
> Diff below fixes it.
>
> ok?
>
> bluhm
>
> Index: patch.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
> diff -u -p -r1.74 patch.c
> --- patch.c   19 Jul 2023 13:26:20 -  1.74
> +++ patch.c   24 Oct 2023 17:13:28 -
> @@ -32,6 +32,7 @@
>  
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -213,11 +214,27 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (filearg[0] != NULL)
> + if (filearg[0] != NULL) {
> + char *origdir;
> +
>   if (unveil(filearg[0], "rwc") == -1) {
>   perror("unveil");
>   my_exit(2);
>   }
> + if ((origdir = dirname(filearg[0])) == NULL) {
> + perror("dirname");
> + my_exit(2);
> + }
> + if (unveil(origdir, "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + } else {
> + if (unveil(".", "rwc") == -1) {
> + perror("unveil");
> + my_exit(2);
> + }
> + }
>   if (filearg[1] != NULL)
>   if (unveil(filearg[1], "r") == -1) {
>   perror("unveil");
> @@ -228,10 +245,6 @@ main(int argc, char *argv[])
>   perror("unveil");
>   my_exit(2);
>   }
> - if (unveil(".", "rwc") == -1) {
> - perror("unveil");
> - my_exit(2);
> - }
>   if (*rejname != '\0')
>   if (unveil(rejname, "rwc") == -1) {
>   perror("unveil");
>

-- 
In my defence, I have been left unsupervised.



installer: suggest CDN if ftplist1 doesn't work

2023-10-25 Thread Job Snijders
Dear all,

In case no mirror was previously configured (aka, this is a fresh
install) and ftplist1.openbsd.org is unreachable, perhaps it's useful to
suggest a default mirror to expedite the installation process.

Terminal transcript on a machine that's not able to reach ftplist1,
doing 'enter enter enter enter':

Let's install the sets!
Location of sets? (disk http nfs or 'done') [http]
HTTP proxy URL? (e.g. 'http://proxy:8080', or 'none') [none]
(Unable to get list from openbsd.org, but that is OK)
HTTP Server? (hostname or 'done') [cdn.openbsd.org]
Server directory? [pub/OpenBSD/snapshots/amd64]

The below diff is joint work with kn@

OK?

Kind regards,

Job

Index: distrib/miniroot/install.sub
===
RCS file: /cvs/src/distrib/miniroot/install.sub,v
diff -u -p -r1.1257 install.sub
--- distrib/miniroot/install.sub24 Oct 2023 18:03:53 -  1.1257
+++ distrib/miniroot/install.sub25 Oct 2023 13:42:28 -
@@ -1868,6 +1868,7 @@ install_http() {
else
echo "(Unable to get list from openbsd.org, but that is OK)"
_prompt="HTTP Server? (hostname or 'done')"
+   HTTP_SERVER=$DEFAULT_MIRROR
fi
 
# Use information from /etc/installurl as defaults for upgrades.
@@ -2935,7 +2936,7 @@ finish_up() {
 
# Create /etc/installurl if it does not yet exist.
if [[ ! -f /mnt/etc/installurl ]]; then
-   echo "${INSTALL_URL:-https://cdn.openbsd.org/pub/OpenBSD}; \
+   echo "${INSTALL_URL:-https://${DEFAULT_MIRROR}/pub/OpenBSD}; \
>/mnt/etc/installurl
fi
 
@@ -3621,6 +3622,7 @@ UPGRADE_BSDRD=false
 V4_AUTOCONF=false
 V6_AUTOCONF=false
 WLANLIST=/tmp/i/wlanlist
+DEFAULT_MIRROR=cdn.openbsd.org
 
 # Save one boot's worth of dmesg.
 dmesgtail >/var/run/dmesg.boot



Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-25 Thread Alexander Bluhm
On Thu, Oct 19, 2023 at 04:04:26PM +0200, Jan Klemkow wrote:
> On Wed, Oct 18, 2023 at 08:53:44PM +0200, Alexander Bluhm wrote:
> > On Wed, Oct 18, 2023 at 08:19:29PM +0200, Mischa wrote:
> > > It's indeed something like that: ix -> vlan (tagged) -> veb
> > 
> > When vlan is added to veb, kernel should disable LRO on ix.
> > All testing before release did not find this code path :-(
> > 
> > Is it possible to add vlan to veb first, and then add or change the
> > vlan parent to ix?  If it works, that should also disable LRO.
> > 
> > Jan said he will have a look tomorrow.
> > 
> > trunk, carp, ... in veb or bridge might have the same issue.
> 
> First round of fixes for vlan(4), vxlan(4), nvgre(4) and bpe(4).

Don't know much about nvgre(4) and bpe(4).

Maybe we should call ifsetlro(ifp0, 0) unconditionally in
vxlan_set_parent().  Otherwise we may get large UDP packets with
large TCP frames.

For vlan(4) this diff is correct.  For the others it is at least
an improvement.

OK bluhm@

> Index: net/if.c
> ===
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.708
> diff -u -p -r1.708 if.c
> --- net/if.c  16 Sep 2023 09:33:27 -  1.708
> +++ net/if.c  19 Oct 2023 13:03:33 -
> @@ -3243,6 +3243,17 @@ ifsetlro(struct ifnet *ifp, int on)
>   struct ifreq ifrq;
>   int error = 0;
>   int s = splnet();
> + struct if_parent parent;
> +
> + memset(, 0, sizeof(parent));
> + if ((*ifp->if_ioctl)(ifp, SIOCGIFPARENT, (caddr_t)) != -1) {
> + struct ifnet *ifp0 = if_unit(parent.ifp_parent);
> +
> + if (ifp0 != NULL) {
> + ifsetlro(ifp0, on);
> + if_put(ifp0);
> + }
> + }
>  
>   if (!ISSET(ifp->if_capabilities, IFCAP_LRO)) {
>   error = ENOTSUP;
> Index: net/if_bpe.c
> ===
> RCS file: /cvs/src/sys/net/if_bpe.c,v
> retrieving revision 1.19
> diff -u -p -r1.19 if_bpe.c
> --- net/if_bpe.c  8 Nov 2021 04:54:44 -   1.19
> +++ net/if_bpe.c  19 Oct 2023 13:20:18 -
> @@ -631,6 +631,9 @@ bpe_set_parent(struct bpe_softc *sc, con
>   goto put;
>   }
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_key.k_if = ifp0->if_index;
>   etherbridge_flush(>sc_eb, IFBF_FLUSHALL);
> Index: net/if_gre.c
> ===
> RCS file: /cvs/src/sys/net/if_gre.c,v
> retrieving revision 1.174
> diff -u -p -r1.174 if_gre.c
> --- net/if_gre.c  13 May 2023 13:35:17 -  1.174
> +++ net/if_gre.c  19 Oct 2023 13:24:56 -
> @@ -3544,6 +3544,9 @@ nvgre_set_parent(struct nvgre_softc *sc,
>   return (EPROTONOSUPPORT);
>   }
>  
> + if (ether_brport_isset(>sc_ac.ac_if))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_ifp0 = ifp0->if_index;
>   if_put(ifp0);
> Index: net/if_vlan.c
> ===
> RCS file: /cvs/src/sys/net/if_vlan.c,v
> retrieving revision 1.215
> diff -u -p -r1.215 if_vlan.c
> --- net/if_vlan.c 16 May 2023 14:32:54 -  1.215
> +++ net/if_vlan.c 19 Oct 2023 11:08:23 -
> @@ -937,6 +937,9 @@ vlan_set_parent(struct vlan_softc *sc, c
>   if (error != 0)
>   goto put;
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_ifidx0 = ifp0->if_index;
>   if (!ISSET(sc->sc_flags, IFVF_LLADDR))
> Index: net/if_vxlan.c
> ===
> RCS file: /cvs/src/sys/net/if_vxlan.c,v
> retrieving revision 1.93
> diff -u -p -r1.93 if_vxlan.c
> --- net/if_vxlan.c3 Aug 2023 09:49:08 -   1.93
> +++ net/if_vxlan.c19 Oct 2023 13:18:47 -
> @@ -1582,6 +1582,9 @@ vxlan_set_parent(struct vxlan_softc *sc,
>   goto put;
>   }
>  
> + if (ether_brport_isset(ifp))
> + ifsetlro(ifp0, 0);
> +
>   /* commit */
>   sc->sc_if_index0 = ifp0->if_index;
>   etherbridge_flush(>sc_eb, IFBF_FLUSHALL);



patch unveil fail

2023-10-25 Thread Alexander Bluhm
Hi,

Since 7.4 patch(1) does not work if an explicit patchfile is given on
command line.

https://marc.info/?l=openbsd-cvs=168941770509379=2

root@ot14:.../~# patch /usr/src/usr.bin/patch/patch.c patch-unveil.diff
Hmm...  Looks like a unified diff to me...
The text leading up to this was:
--
|Index: patch.c
|===
|RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
|diff -u -p -r1.74 patch.c
|--- patch.c19 Jul 2023 13:26:20 -  1.74
|+++ patch.c24 Oct 2023 17:13:28 -
--
Patching file /usr/src/usr.bin/patch/patch.c using Plan A...
Hunk #1 succeeded at 32.
Hunk #2 succeeded at 214.
Hunk #3 succeeded at 245.
Can't backup /usr/src/usr.bin/patch/patch.c, output is in 
/tmp/patchoorjYymLKcM: No such file or directory
done

A backup file should be created in the directory of the original
file, but only the current directory is unveiled.  Then the patched
file is created in /tmp and does not replace the original patchfile
in place.

Diff below fixes it.

ok?

bluhm

Index: patch.c
===
RCS file: /data/mirror/openbsd/cvs/src/usr.bin/patch/patch.c,v
diff -u -p -r1.74 patch.c
--- patch.c 19 Jul 2023 13:26:20 -  1.74
+++ patch.c 24 Oct 2023 17:13:28 -
@@ -32,6 +32,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -213,11 +214,27 @@ main(int argc, char *argv[])
perror("unveil");
my_exit(2);
}
-   if (filearg[0] != NULL)
+   if (filearg[0] != NULL) {
+   char *origdir;
+
if (unveil(filearg[0], "rwc") == -1) {
perror("unveil");
my_exit(2);
}
+   if ((origdir = dirname(filearg[0])) == NULL) {
+   perror("dirname");
+   my_exit(2);
+   }
+   if (unveil(origdir, "rwc") == -1) {
+   perror("unveil");
+   my_exit(2);
+   }
+   } else {
+   if (unveil(".", "rwc") == -1) {
+   perror("unveil");
+   my_exit(2);
+   }
+   }
if (filearg[1] != NULL)
if (unveil(filearg[1], "r") == -1) {
perror("unveil");
@@ -228,10 +245,6 @@ main(int argc, char *argv[])
perror("unveil");
my_exit(2);
}
-   if (unveil(".", "rwc") == -1) {
-   perror("unveil");
-   my_exit(2);
-   }
if (*rejname != '\0')
if (unveil(rejname, "rwc") == -1) {
perror("unveil");



make(1): document VPATH variable

2023-10-25 Thread Alexandre Ratchov
I just noticed that our make(1) supports VPATH, but is not
documented.

OK?

Index: make.1
===
RCS file: /cvs/src/usr.bin/make/make.1,v
diff -u -p -r1.141 make.1
--- make.1  10 Aug 2023 10:56:34 -  1.141
+++ make.1  25 Oct 2023 10:45:32 -
@@ -832,6 +832,10 @@ It should not be used; see the
 section below.
 .It Va .VARIABLES
 List of all the names of global variables that have been set.
+.It Va VPATH
+A colon separated list of directories appended to the search path (see
+.Va .PATH ) .
+Supported for compatibility with other implementations.
 .El
 .Pp
 Variable expansion may be modified to select or modify each word of the



malloc: micro optimizations

2023-10-25 Thread Otto Moerbeek
Hi,

a few micro-optimization, including getting rid of some statistics
that are not actualy very interesting.

Speedup amounts to a few tenths of percents to a few percents,
depending on how biased the benchmark is.

-Otto

Index: stdlib/malloc.c
===
RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
retrieving revision 1.291
diff -u -p -r1.291 malloc.c
--- stdlib/malloc.c 22 Oct 2023 12:19:26 -  1.291
+++ stdlib/malloc.c 24 Oct 2023 14:05:37 -
@@ -169,16 +169,12 @@ struct dir_info {
void *caller;
size_t inserts;
size_t insert_collisions;
-   size_t finds;
-   size_t find_collisions;
size_t deletes;
size_t delete_moves;
size_t cheap_realloc_tries;
size_t cheap_reallocs;
size_t malloc_used; /* bytes allocated */
size_t malloc_guarded;  /* bytes used for guards */
-   size_t pool_searches;   /* searches for pool */
-   size_t other_pool;  /* searches in other pool */
 #define STATS_ADD(x,y) ((x) += (y))
 #define STATS_SUB(x,y) ((x) -= (y))
 #define STATS_INC(x)   ((x)++)
@@ -209,12 +205,14 @@ static void unmap(struct dir_info *d, vo
 struct chunk_info {
LIST_ENTRY(chunk_info) entries;
void *page; /* pointer to the page */
+   /* number of shorts should add up to 8, check alloc_chunk_info() */
u_short canary;
u_short bucket;
u_short free;   /* how many free chunks */
u_short total;  /* how many chunks */
u_short offset; /* requested size table offset */
-   u_short bits[1];/* which chunks are free */
+#define CHUNK_INFO_TAIL3
+   u_short bits[CHUNK_INFO_TAIL];  /* which chunks are free */
 };
 
 #define CHUNK_FREE(i, n)   ((i)->bits[(n) / MALLOC_BITS] & (1U << ((n) % 
MALLOC_BITS)))
@@ -656,12 +654,10 @@ find(struct dir_info *d, void *p)
index = hash(p) & mask;
r = d->r[index].p;
q = MASK_POINTER(r);
-   STATS_INC(d->finds);
while (q != p && r != NULL) {
index = (index - 1) & mask;
r = d->r[index].p;
q = MASK_POINTER(r);
-   STATS_INC(d->find_collisions);
}
return (q == p && r != NULL) ? >r[index] : NULL;
 }
@@ -949,7 +945,7 @@ init_chunk_info(struct dir_info *d, stru
 
p->bucket = bucket;
p->total = p->free = MALLOC_PAGESIZE / B2ALLOC(bucket);
-   p->offset = bucket == 0 ? 0xdead : howmany(p->total, MALLOC_BITS);
+   p->offset = howmany(p->total, MALLOC_BITS);
p->canary = (u_short)d->canary1;
 
/* set all valid bits in the bitmap */
@@ -971,8 +967,13 @@ alloc_chunk_info(struct dir_info *d, u_i
count = MALLOC_PAGESIZE / B2ALLOC(bucket);
 
size = howmany(count, MALLOC_BITS);
-   size = sizeof(struct chunk_info) + (size - 1) * sizeof(u_short);
-   if (mopts.chunk_canaries)
+   /* see declaration of struct chunk_info */
+   if (size <= CHUNK_INFO_TAIL)
+   size = 0;
+   else
+   size -= CHUNK_INFO_TAIL;
+   size = sizeof(struct chunk_info) + size * sizeof(u_short);
+   if (mopts.chunk_canaries && bucket > 0)
size += count * sizeof(u_short);
size = _ALIGN(size);
count = MALLOC_PAGESIZE / size;
@@ -1129,8 +1130,7 @@ fill_canary(char *ptr, size_t sz, size_t
 static void *
 malloc_bytes(struct dir_info *d, size_t size)
 {
-   u_int i, r, bucket, listnum;
-   size_t k;
+   u_int i, k, r, bucket, listnum;
u_short *lp;
struct chunk_info *bp;
void *p;
@@ -1170,7 +1170,7 @@ malloc_bytes(struct dir_info *d, size_t 
/* no bit halfway, go to next full short */
i /= MALLOC_BITS;
for (;;) {
-   if (++i >= howmany(bp->total, MALLOC_BITS))
+   if (++i >= bp->offset)
i = 0;
lp = >bits[i];
if (*lp) {
@@ -1228,7 +1228,7 @@ validate_canary(struct dir_info *d, u_ch
}
 }
 
-static uint32_t
+static inline uint32_t
 find_chunknum(struct dir_info *d, struct chunk_info *info, void *ptr, int 
check)
 {
uint32_t chunknum;
@@ -1532,12 +1532,10 @@ findpool(void *p, struct dir_info *argpo
struct dir_info *pool = argpool;
struct region_info *r = find(pool, p);
 
-   STATS_INC(pool->pool_searches);
if (r == NULL) {
u_int i, nmutexes;
 
nmutexes = mopts.malloc_pool[1]->malloc_mt ? 
mopts.malloc_mutexes : 2;
-   STATS_INC(pool->other_pool);
for (i = 1; i < nmutexes; i++) {
u_int j = (argpool->mutex + i) & (nmutexes - 1);
 
@@ 

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-25 Thread Klemens Nanni
10/24/23 14:03, Crystal Kolipe пишет:
> On Tue, Oct 24, 2023 at 01:44:08AM +, Klemens Nanni wrote:
>> Rereading the code, I now question why it checks the 'a' label type at all.
>> 
>> Taking your sd0d example through devboot():
>> 
>> |#ifdef SOFTRAID
>> |/*
>> | * Determine the partition type for the 'a' partition of the
>> | * boot device.
>> | */
>> |TAILQ_FOREACH(dip, , list)
>> |if (dip->bios_info.bios_number == bootdev &&
>> |(dip->bios_info.flags & BDI_BADLABEL) == 0)
>> |part_type = dip->disklabel.d_partitions[0].p_fstype;
>> 
>> Whatever sd0a's type is...
>> 
>> |
>> |/*
>> | * See if we booted from a disk that is a member of a bootable
>> | * softraid volume.
>> | */
>> |SLIST_FOREACH(bv, _volumes, sbv_link) {
>> |if (bv->sbv_flags & BIOC_SCBOOTABLE)
>> |SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
>> |if (bc->sbc_disk == bootdev)
>> 
>> ... the boot loader sees we booted from a bootable softraid chunk,
>> matching disk sd0 by number and not partition.
>> 
>> |sr_boot_vol = bv->sbv_unit;
>> |if (sr_boot_vol != -1)
>> |break;
>> |}
>> |#endif
>> |
>> |if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
>> 
>> With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
>> but that's still enough for the boot loader to stick to softraid on sd0!
>> 
>> |*p++ = 's';
>> |*p++ = 'r';
>> |*p++ = '0' + sr_boot_vol;
>> |} else if (bootdev & 0x100) {
>> 
>> So why check the 'a' label type if that's not relevant?
> 
> The only reason I can see for that check is to better support the unusual
> configuration of booting from a disk which happens to be part of a bootable
> softraid volume, but which also has a regular FFS partition outside of the
> RAID, and for whatever reason the user wants to automatically boot from that
> instead.
> 
> Maybe that was useful before support for booting from softraid crypto volumes
> was introduced, (when it was raid-1 only).  It doesn't seem very useful now.
> 
>> It is confusing and 
>> Doubling down on the assumption that bootable chunks are always on 'a'
>> like my diff did shows that's a mistake and dropping the check actually
>> makes more sense to me now.
> 
> I agree it is confusing.
> 
> Effectively the code is there to treat part_type == FS_BSDFFS an 'edge case'
> as described above.
> 
>> This boots with root on softraid on sd0a and sd0d.
>> 
>> Thoughts?
>> Am I missing something?
> 
> It works OK here, so I'm fine with removing this.
> 

That matches sys/arch/amd64/stand/libsa/dev_i386.c r1.12 from 2012
which introduced this code;  my second diff would be a revert:

If we have booted from a disk that is a member of a bootable
softraid volume, always select the srX device unless the 'a'
partition of the disk is FFS.

(Other architectures copied that, except sparc64 checking FS_RAID.)



OpenBSD Errata: October 25, 2023 (xserver msplit)

2023-10-25 Thread Alexander Bluhm
Errata patches for X11 server and kernel mbuf handling have been
released for OpenBSD 7.3 and 7.4.

Binary updates for the amd64, arm64 and i386 platform are available
via the syspatch utility.  Source code patches can be found on the
respective errata page:

  https://www.openbsd.org/errata73.html
  https://www.openbsd.org/errata74.html



Re: snmpd_metrics: differentiate between hrSWRunName and hrSWRunPath

2023-10-24 Thread Theo Buehler
On Wed, Oct 18, 2023 at 01:49:00PM +0200, Martijn van Duren wrote:
> Right now we return the same value for both hrSWRunName and hrSWRunPath.
> hrSWRunPath should return the full path of the binary, and hrSWRunName
> a description of the running software.
> 
> Afaik there's no proper way to retrieve the full path of the running
> binary.

No there isn't. It's been discussed many times.

> However, in a lot of cases argv[0] can contain the full or
> relative path. But if argv[0] gets overwritten (like most of our
> daemons' children) it gives a more descriptive name, which is more in
> line with hrSWRunName.
> 
> netsnmp's snmpd uses argv[0] for hrSWRunPath and kinfo_proc's p_comm for
> hrSWRunName, and snmptop defaults to hrSWRunName. top(1) also defaults
> to p_comm, but contrary to top(1), snmptop allows us to switch between
> hrSWRunName, and hrSWRunPath and toggling of hrSWRunParameters
> independently, where top(1) toggles between p_comm and argv[].
> 
> So there's an argument to be made either way, but for this diff I stuck
> with netsnmp's choices.

I am fine with your choices and people feeling strongly about this had
week to object.

> 
> While here, change the buffer length from 128 to 129. HOST-RESOURCES-MIB
> allows up to 128 characters in the response, so make room for the
> terminating NUL.
> 
> Thoughts? OK?

ok tb



Re: snmpd_metrics: add HOST-RESOURCES-MIB:hrSWRunPerfTable

2023-10-24 Thread Theo Buehler
On Wed, Oct 18, 2023 at 11:54:06AM +0200, Martijn van Duren wrote:
> This diff adds the two entries from the hrSWRunPerfTable:
> hrSWRunPerfCPU, and hrSWRunPerfMem. This allows snmptop from the
> net-snmp package to work. Math stolen^Wcopied from top/machine.c.
> 
> OK?

ok

> 
> martijn@
> 
> Index: mib.c
> ===
> RCS file: /cvs/src/libexec/snmpd/snmpd_metrics/mib.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 mib.c
> --- mib.c 4 Jul 2023 11:34:19 -   1.4
> +++ mib.c 18 Oct 2023 09:52:07 -
> @@ -69,6 +69,10 @@ struct eventconnev;
>  const char   *agentxsocket = NULL;
>  int   agentxfd = -1;
>  
> +int   pageshift;
> +#define pagetok(size) ((size) << pageshift)
> +
> +void  pageshift_init(void);
>  void  snmp_connect(struct agentx *, void *, int);
>  void  snmp_tryconnect(int, short, void *);
>  void  snmp_read(int, short, void *);
> @@ -89,6 +93,7 @@ struct agentx_object *hrProcessorLoad;
>  struct agentx_index *hrSWRunIdx;
>  struct agentx_object *hrSWRunIndex, *hrSWRunName, *hrSWRunID, *hrSWRunPath;
>  struct agentx_object *hrSWRunParameters, *hrSWRunType, *hrSWRunStatus;
> +struct agentx_object *hrSWRunPerfCPU, *hrSWRunPerfMem;
>  
>  void  mib_hrsystemuptime(struct agentx_varbind *);
>  void  mib_hrsystemdate(struct agentx_varbind *);
> @@ -634,6 +639,7 @@ mib_hrswrun(struct agentx_varbind *vb)
>   struct agentx_object*obj;
>   enum agentx_request_type req;
>   int32_t  idx;
> + int32_t  time;
>   struct kinfo_proc   *kinfo;
>   char*s;
>  
> @@ -711,6 +717,13 @@ mib_hrswrun(struct agentx_varbind *vb)
>   agentx_varbind_integer(vb, 4);
>   break;
>   }
> + } else if (obj == hrSWRunPerfCPU) {
> + time = kinfo->p_rtime_sec * 100;
> + time += (kinfo->p_rtime_usec + 5000) / 1;
> + agentx_varbind_integer(vb, time);
> + } else if (obj == hrSWRunPerfMem) {
> + agentx_varbind_integer(vb, pagetok(kinfo->p_vm_tsize +
> + kinfo->p_vm_dsize + kinfo->p_vm_ssize));
>   } else
>   fatal("%s: Unexpected object", __func__);
>  }
> @@ -3278,6 +3291,7 @@ main(int argc, char *argv[])
>   kr_init();
>   pf_init();
>   timer_init();
> + pageshift_init();
>  
>   if (agentxsocket != NULL) {
>   if (strlcpy(agentxsocketdir, agentxsocket,
> @@ -3375,6 +3389,10 @@ main(int argc, char *argv[])
>   (hrSWRunType = agentx_object(host, AGENTX_OID(HRSWRUNTYPE),
>   , 1, 0, mib_hrswrun)) == NULL ||
>   (hrSWRunStatus = agentx_object(host, AGENTX_OID(HRSWRUNSTATUS),
> + , 1, 0, mib_hrswrun)) == NULL ||
> + (hrSWRunPerfCPU = agentx_object(host, AGENTX_OID(HRSWRUNPERFCPU),
> + , 1, 0, mib_hrswrun)) == NULL ||
> + (hrSWRunPerfMem = agentx_object(host, AGENTX_OID(HRSWRUNPERFMEM),
>   , 1, 0, mib_hrswrun)) == NULL)
>   fatal("agentx_object");
>  
> @@ -4318,6 +4336,22 @@ main(int argc, char *argv[])
>   log_setverbose(verbose);
>  
>   event_dispatch();
> +}
> +
> +#define LOG1024   10
> +void
> +pageshift_init(void)
> +{
> + long pagesize;
> +
> + if ((pagesize = sysconf(_SC_PAGESIZE)) == -1)
> + fatal("sysconf(_SC_PAGESIZE)");
> + while (pagesize > 1) {
> + pageshift++;
> + pagesize >>= 1;
> + }
> + /* we only need the amount of log(2)1024 for our conversion */
> + pageshift -= LOG1024;
>  }
>  
>  void
> 



Cruft: stop creating /usr/local/share/nls directories

2023-10-24 Thread Christian Weisgerber
The /usr/local/share/nls tree for localized message catalogs has
barely ever been used throughout the whole history of the ports
tree.  The main user was shells/tcsh and that stopped four years
ago.  Now, a single port is left that installs a single file there.

I would like to stop creating those directories by default in
mtree/4.4BSD.dist and the ports tree infrastructure.

Separate patches for src and ports attached.

OK?

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de
---
commit 44bc262d8170b10e6949c2836a820b7fee533de4 (local)
from: Christian Weisgerber 
date: Tue Oct 24 14:09:57 2023 UTC
 
 do not create /usr/local/share/nls and subdirectories by default
 
 The share/nls/ paths are unused.
 
diff 85c70083e85910ddc7626ad658e5652c024b844a 44bc262d8170b10e6949c2836a820b7fee533de4
commit - 85c70083e85910ddc7626ad658e5652c024b844a
commit + 44bc262d8170b10e6949c2836a820b7fee533de4
blob - e72a9aa77292f4533044e3d97d6462141f8dfea3
blob + e3ce8a743046f036830fa1c4824ba426cad23ed6
--- etc/mtree/4.4BSD.dist
+++ etc/mtree/4.4BSD.dist
@@ -318,82 +318,6 @@ usr
 misc
 ..
 
-# ./usr/local/share/nls
-nls
-C
-..
-da_DK.ISO_8859-1
-..
-de_AT.ISO_8859-1
-..
-de_CH.ISO_8859-1
-..
-de_DE.ISO_8859-1
-..
-el_GR.ISO_8859-7
-..
-en_AU.ISO_8859-1
-..
-en_CA.ISO_8859-1
-..
-en_GB.ISO_8859-1
-..
-en_US.ISO_8859-1
-..
-es_ES.ISO_8859-1
-..
-et_EE.ISO-8859-1
-..
-fi_FI.ISO_8859-1
-..
-fr_BE.ISO_8859-1
-..
-fr_CA.ISO_8859-1
-..
-fr_CH.ISO_8859-1
-..
-fr_FR.ISO_8859-1
-..
-hr_HR.ISO_8859-2
-..
-is_IS.ISO_8859-1
-..
-it_CH.ISO_8859-1
-..
-it_IT.ISO_8859-1
-..
-ja_JP.EUC
-..
-ko_KR.EUC
-..
-lt_LN.ASCII
-..
-lt_LN.ISO_8859-1
-..
-lt_LN.ISO_8859-2
-..
-nl_BE.ISO_8859-1
-..
-nl_NL.ISO_8859-1
-..
-no_NO.ISO_8859-1
-..
-pl_PL.ISO_8859-2
-..
-pt_PT.ISO_8859-1
-..
-ru_RU.CP866
-..
-ru_RU.ISO_8859-5
-..
-ru_RU.KOI8-R
-..
-sv_SE.ISO_8859-1
-..
-uk_UA.KOI8-U
-..
-..
-
 # ./usr/local/share/pkgconfig
 pkgconfig
 ..

---
commit f51da98f32c02018f7547f68db21d3c6ad474d99 (local)
from: Christian Weisgerber 
date: Tue Oct 24 14:49:39 2023 UTC
 
 do not create the ${PREFIX}/share/nls directory tree by default
 
 The share/nls directories have been barely ever used throughout the
 27-year history of the ports tree.  Let the single port that still
 installs a file there create the directory itself.
 
diff 16cf88f8fc810c890a483bf6ef62386b8610f8bd f51da98f32c02018f7547f68db21d3c6ad474d99
commit - 16cf88f8fc810c890a483bf6ef62386b8610f8bd
commit + f51da98f32c02018f7547f68db21d3c6ad474d99
blob - a189b247f6abf8b457c66bc4986770be81f639fe
blob + 32347397d0c4912d3b015aef226b9d2b652ac086
--- infrastructure/mk/bsd.port.mk
+++ infrastructure/mk/bsd.port.mk
@@ -2018,7 +2018,6 @@ _wrap_install_commands += ${_PBUILD} install -m ${BINM
 _cat = {cat1,cat2,cat3,cat3f,cat3p,cat4,cat5,cat6,cat7,cat8,cat9,catl,catn}
 _man = ${_cat:S/cat/man/g}
 _treebase = ${WRKINST}${LOCALBASE}
-_nls = ${_treebase}/share/nls
 _FAKE_TREE_LIST = \
 	${WRKINST}${BASESYSCONFDIR}/{firmware,rc.d} \
 	${_treebase}/bin \
@@ -2035,18 +2034,6 @@ _FAKE_TREE_LIST = \
 	${_treebase}/sbin \
 	${_treebase}/share/{dict,examples,misc,pkgconfig,skel} \
 	${_treebase}/share/doc/pkg-readmes \
-	${_nls}/{C,da_DK.ISO_8859-1,de_AT.ISO_8859-1,de_CH.ISO_8859-1} \
-	${_nls}/{de_DE.ISO_8859-1,el_GR.ISO_8859-7,en_AU.ISO_8859-1} \
-	${_nls}/{en_CA.ISO_8859-1,en_GB.ISO_8859-1,en_US.ISO_8859-1} \
-	${_nls}/{es_ES.ISO_8859-1,et_EE.ISO-8859-1,fi_FI.ISO_8859-1} \
-	${_nls}/{fr_BE.ISO_8859-1,fr_CA.ISO_8859-1,fr_CH.ISO_8859-1} \
-	${_nls}/fr_FR.ISO_8859-1 \
-	${_nls}/{hr_HR.ISO_8859-2,is_IS.ISO_8859-1,it_CH.ISO_8859-1} \
-	${_nls}/{it_IT.ISO_8859-1,ja_JP.EUC,ko_KR.EUC,lt_LN.ASCII} \
-	

Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler
> Thanks. Do we have a list of ports that use ibuf / imsg? 

Here's a list obtained from the nm output of all ports by grepping for
'^ibuf_.* U' and '^imsg_.* U', respectively, then weeding out the
obvious false positives::

ibuf:

mail/opensmtpd-extras,-main
net/ladvd

imsg:

audio/amused
devel/got,-main
mail/opensmtpd-extras,-main
mail/pop3d
mail/smtp-vilter
mail/smtp-vilter,ldap
net/gelatod
net/gmid
net/openmdns
net/telescope
net/thingsd
sysutils/tabled
sysutils/tmate



Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler
On Tue, Oct 24, 2023 at 04:00:42PM +0200, Claudio Jeker wrote:
> On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> > On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > > close any fd still on the buffer. This way even if a fd is unexpectedly
> > > passed nothing will happen.
> > > 
> > > That code was disabled at start because userland was not fully ready. In
> > > particular rpki-client did not handle that well. All of this is to my
> > > knowledge fixed so there is no reason to keep the NOTYET :)
> > > 
> > > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > > Code not doing so will break because ibuf_free() will close the fd which
> > > is probably still in use somewhere else.
> > 
> > Nothing in base outside of libutil seems to reach directly for the fd
> > (checked by compiling with that struct member renamed in the public
> > header).
> > 
> > The internal uses are addressed by this diff, so
> > 
> > ok tb
> > 
> > I can put the fd rename through a bulk to catch some ports in a couple
> > of days but I don't think there is a need to wait.
> 
> Thanks. Do we have a list of ports that use ibuf / imsg? 

I don't think so. The list of ports using libutil is rather long. my sql
is nonexistent but I get a few hundred.



Re: ibuf free fd on close

2023-10-24 Thread Claudio Jeker
On Tue, Oct 24, 2023 at 03:50:47PM +0200, Theo Buehler wrote:
> On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> > When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> > close any fd still on the buffer. This way even if a fd is unexpectedly
> > passed nothing will happen.
> > 
> > That code was disabled at start because userland was not fully ready. In
> > particular rpki-client did not handle that well. All of this is to my
> > knowledge fixed so there is no reason to keep the NOTYET :)
> > 
> > With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> > Code not doing so will break because ibuf_free() will close the fd which
> > is probably still in use somewhere else.
> 
> Nothing in base outside of libutil seems to reach directly for the fd
> (checked by compiling with that struct member renamed in the public
> header).
> 
> The internal uses are addressed by this diff, so
> 
> ok tb
> 
> I can put the fd rename through a bulk to catch some ports in a couple
> of days but I don't think there is a need to wait.

Thanks. Do we have a list of ports that use ibuf / imsg? 

-- 
:wq Claudio



Re: ibuf free fd on close

2023-10-24 Thread Theo Buehler
On Tue, Oct 24, 2023 at 03:01:26PM +0200, Claudio Jeker wrote:
> When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
> close any fd still on the buffer. This way even if a fd is unexpectedly
> passed nothing will happen.
> 
> That code was disabled at start because userland was not fully ready. In
> particular rpki-client did not handle that well. All of this is to my
> knowledge fixed so there is no reason to keep the NOTYET :)
> 
> With this users need to use ibuf_fd_get() to take the fd off the ibuf.
> Code not doing so will break because ibuf_free() will close the fd which
> is probably still in use somewhere else.

Nothing in base outside of libutil seems to reach directly for the fd
(checked by compiling with that struct member renamed in the public
header).

The internal uses are addressed by this diff, so

ok tb

I can put the fd rename through a bulk to catch some ports in a couple
of days but I don't think there is a need to wait.



ibuf free fd on close

2023-10-24 Thread Claudio Jeker
When I added ibuf_get_fd() the idea was to make sure that ibuf_free() will
close any fd still on the buffer. This way even if a fd is unexpectedly
passed nothing will happen.

That code was disabled at start because userland was not fully ready. In
particular rpki-client did not handle that well. All of this is to my
knowledge fixed so there is no reason to keep the NOTYET :)

With this users need to use ibuf_fd_get() to take the fd off the ibuf.
Code not doing so will break because ibuf_free() will close the fd which
is probably still in use somewhere else.

-- 
:wq Claudio

Index: imsg-buffer.c
===
RCS file: /cvs/src/lib/libutil/imsg-buffer.c,v
retrieving revision 1.16
diff -u -p -r1.16 imsg-buffer.c
--- imsg-buffer.c   19 Jun 2023 17:19:50 -  1.16
+++ imsg-buffer.c   24 Oct 2023 12:55:44 -
@@ -294,10 +294,8 @@ ibuf_free(struct ibuf *buf)
 {
if (buf == NULL)
return;
-#ifdef NOTYET
if (buf->fd != -1)
close(buf->fd);
-#endif
freezero(buf->buf, buf->size);
free(buf);
 }
@@ -314,9 +312,7 @@ ibuf_fd_get(struct ibuf *buf)
int fd;
 
fd = buf->fd;
-#ifdef NOTYET
buf->fd = -1;
-#endif
return (fd);
 }
 
@@ -480,11 +476,6 @@ static void
 ibuf_dequeue(struct msgbuf *msgbuf, struct ibuf *buf)
 {
TAILQ_REMOVE(>bufs, buf, entry);
-
-   if (buf->fd != -1) {
-   close(buf->fd);
-   buf->fd = -1;
-   }
 
msgbuf->queued--;
ibuf_free(buf);



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-24 Thread Crystal Kolipe
On Tue, Oct 24, 2023 at 01:44:08AM +, Klemens Nanni wrote:
> Rereading the code, I now question why it checks the 'a' label type at all.
> 
> Taking your sd0d example through devboot():
> 
> |#ifdef SOFTRAID
> | /*
> |  * Determine the partition type for the 'a' partition of the
> |  * boot device.
> |  */
> | TAILQ_FOREACH(dip, , list)
> | if (dip->bios_info.bios_number == bootdev &&
> | (dip->bios_info.flags & BDI_BADLABEL) == 0)
> | part_type = dip->disklabel.d_partitions[0].p_fstype;
> 
> Whatever sd0a's type is...
> 
> |
> | /*
> |  * See if we booted from a disk that is a member of a bootable
> |  * softraid volume.
> |  */
> | SLIST_FOREACH(bv, _volumes, sbv_link) {
> | if (bv->sbv_flags & BIOC_SCBOOTABLE)
> | SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
> | if (bc->sbc_disk == bootdev)
> 
> ... the boot loader sees we booted from a bootable softraid chunk,
> matching disk sd0 by number and not partition.
> 
> | sr_boot_vol = bv->sbv_unit;
> | if (sr_boot_vol != -1)
> | break;
> | }
> |#endif
> |
> | if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> 
> With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
> but that's still enough for the boot loader to stick to softraid on sd0!
> 
> | *p++ = 's';
> | *p++ = 'r';
> | *p++ = '0' + sr_boot_vol;
> | } else if (bootdev & 0x100) {
> 
> So why check the 'a' label type if that's not relevant?

The only reason I can see for that check is to better support the unusual
configuration of booting from a disk which happens to be part of a bootable
softraid volume, but which also has a regular FFS partition outside of the
RAID, and for whatever reason the user wants to automatically boot from that
instead.

Maybe that was useful before support for booting from softraid crypto volumes
was introduced, (when it was raid-1 only).  It doesn't seem very useful now.

> It is confusing and 
> Doubling down on the assumption that bootable chunks are always on 'a'
> like my diff did shows that's a mistake and dropping the check actually
> makes more sense to me now.

I agree it is confusing.

Effectively the code is there to treat part_type == FS_BSDFFS an 'edge case'
as described above.

> This boots with root on softraid on sd0a and sd0d.
> 
> Thoughts?
> Am I missing something?

It works OK here, so I'm fine with removing this.



Re: snmpd [15/16]: When we have an error, all oids must be identical to the request

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:31:20PM +0200, Martijn van Duren wrote:
> RFC3416 section 4.2.1 (and others) tells us that if an error occurs the
> varbindlist in the response must be identical to the original request.
> 
> There might be other edge-cases here, but let's at least make sure that
> the OIDs haven't been tampered with.

ok



Re: snmpd [14/16]: Validate the returned error code

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:28:12PM +0200, Martijn van Duren wrote:
> 
> Certain error codes are only intended for certain request-types. Add an
> appl_error_valid() function to test for this.

ok tb



Re: snmpd [13/16]: registered instances must not return below OID

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:25:29PM +0200, Martijn van Duren wrote:
> If a backend registers as an instance it must never return OIDs below
> their registration. Add a test for this in appl_varbind_valid().
> 
> OK?

ok with a tiny nit inline

> 
> martijn@
> 
> diff --git a/application.c b/application.c
> index dfa7220..6ddeb39 100644
> --- a/application.c
> +++ b/application.c
> @@ -128,8 +128,8 @@ void appl_request_upstream_resolve(struct 
> appl_request_upstream *);
>  void appl_request_downstream_send(struct appl_request_downstream *);
>  void appl_request_downstream_timeout(int, short, void *);
>  void appl_request_upstream_reply(struct appl_request_upstream *);
> -int appl_varbind_valid(struct appl_varbind *, struct appl_varbind *, int, 
> int,
> -int, const char **);
> +int appl_varbind_valid(struct appl_varbind *, struct appl_varbind_internal *,
> +int, int, int, const char **);
>  int appl_varbind_backend(struct appl_varbind_internal *);
>  void appl_varbind_error(struct appl_varbind_internal *, enum appl_error);
>  void appl_report(struct snmp_message *, int32_t, struct ber_oid *,
> @@ -1073,9 +1073,9 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>  
>   vb = vblist;
>   for (i = 1; vb != NULL; vb = vb->av_next, i++) {
> -if (!appl_varbind_valid(vb, origvb == NULL ?
> - NULL : &(origvb->avi_varbind), next,
> -error != APPL_ERROR_NOERROR, backend->ab_range, 
> )) {
> +if (!appl_varbind_valid(vb, origvb == NULL ?  NULL : origvb,

could that not just be

if (!appl_varbind_valid(vb, origvb,

(also: use tabs for indent)

> + next, error != APPL_ERROR_NOERROR, backend->ab_range,
> + )) {
>   smi_oid2string(&(vb->av_oid), oidbuf,
>   sizeof(oidbuf), 0);
>   log_warnx("%s: %"PRIu32" %s: %s",
> @@ -1173,8 +1173,9 @@ appl_response(struct appl_backend *backend, int32_t 
> requestid,
>  }
>  
>  int
> -appl_varbind_valid(struct appl_varbind *varbind, struct appl_varbind 
> *request,
> -int next, int null, int range, const char **errstr)
> +appl_varbind_valid(struct appl_varbind *varbind,
> +struct appl_varbind_internal *request, int next, int null, int range,
> +const char **errstr)
>  {
>   int cmp;
>   int eomv = 0;
> @@ -1259,24 +1260,32 @@ appl_varbind_valid(struct appl_varbind *varbind, 
> struct appl_varbind *request,
>   if (request == NULL)
>   return 1;
>  
> - cmp = ober_oid_cmp(&(request->av_oid), &(varbind->av_oid));
> - if (next && !eomv) {
> - if (request->av_include) {
> - if (cmp > 0) {
> - *errstr = "oid not incrementing";
> - return 0;
> + cmp = ober_oid_cmp(&(request->avi_varbind.av_oid), &(varbind->av_oid));
> + if (next) {
> + if (request->avi_region->ar_instance &&
> + ober_oid_cmp(&(request->avi_region->ar_oid),
> + &(varbind->av_oid)) != 0) {
> + *errstr = "oid below instance";
> + return 0;
> + }
> + if (!eomv) {
> + if (request->avi_varbind.av_include) {
> + if (cmp > 0) {
> + *errstr = "oid not incrementing";
> + return 0;
> + }
> + } else {
> + if (cmp >= 0) {
> + *errstr = "oid not incrementing";
> + return 0;
> + }
>   }
> - } else {
> - if (cmp >= 0) {
> - *errstr = "oid not incrementing";
> + if (range && ober_oid_cmp(&(varbind->av_oid),
> + &(request->avi_varbind.av_oid_end)) > 0) {
> + *errstr = "end oid not honoured";
>   return 0;
>   }
>   }
> - if (range && ober_oid_cmp(&(varbind->av_oid),
> - &(request->av_oid_end)) > 0) {
> - *errstr = "end oid not honoured";
> - return 0;
> - }
>   } else {
>   if (cmp != 0) {
>   *errstr = "oids not equal";
> 



Re: snmpd [12/16]: Make ab_range in application_agentx explicit 1

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:21:45PM +0200, Martijn van Duren wrote:
> appl_agentx_session doesn't set ab_range explicitly to 1, and thus
> relies on malloc randomness to set it. Sit it explicitly.

ok, but I haven't verified that all the session's members are now
initialized.

> 
> OK?
> 
> martijn@
> 
> diff --git a/application_agentx.c b/application_agentx.c
> index 680725d..79900d6 100644
> --- a/application_agentx.c
> +++ b/application_agentx.c
> @@ -548,6 +548,7 @@ appl_agentx_open(struct appl_agentx_connection *conn, 
> struct ax_pdu *pdu)
>   session->sess_backend.ab_cookie = session;
>   session->sess_backend.ab_retries = 0;
>   session->sess_backend.ab_fn = _agentx_functions;
> + session->sess_backend.ab_range = 1;
>   RB_INIT(&(session->sess_backend.ab_requests));
>   TAILQ_INSERT_TAIL(&(conn->conn_sessions), session, sess_conn_entry);
>  
> 



Re: snmpd [11/16]: When a request results in EOMV we must return the requesting OID

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:17:19PM +0200, Martijn van Duren wrote:
> According to RFC3416 section 4.2.2 and 4.2.3 case "(2)" when an
> endOfMibView is returned the OID must be identical to originally
> requested OID. Currently this can fail when the original request
> is in a !last registered region and all subsequent regions also
> return EOMV.
> 
> Store the original OID and use that on EOMV.

ok tb



Re: snmpd [10/16]: Make retries on open session where connection is closed return early

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:13:57PM +0200, Martijn van Duren wrote:
> Here's a special case unlikely to be found in the wild:
> When opening 2 sessions on an agentx connection (already unusual) and
> registering 2 overlapping regions on the different sessions, e.g. by
> differing in priority (even more unusual) and we close the underlying
> connection with an outstanding request to the dominant region we will
> call appl_agentx_free(), which sequentially closes all sessions.
> If the session with the outstanding request is closed before the
> second session the request is retried before said session is cleaned
> up and it will try to send it over a conn_ax which at that point has
> been set to NULL, resulting in a SIGSEGV.
> 
> Simply return early and let this second request be cancelled by the
> cleanup of the second session.

Makes total sense.

ok tb



Re: snmpd [9/16]: Fix range handling with appl_unregister

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:11:18PM +0200, Martijn van Duren wrote:
> Right now (un)registering a region with range_subid set to !0 will
> fail. Apparently nothing in the wild uses this, but let's fix it.
> 
> This is the unregister part.

ok tb



Re: snmpd [8/16]: Fix range handling with appl_register

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:08:00PM +0200, Martijn van Duren wrote:
> Right now (un)registering a region with range_subid set to !0 will
> fail. Apparently nothing in the wild uses this, but let's fix it.
> 
> This is the register part.

ok tb



Re: snmpd [7/16]: Treat agentx-close-pdu with reasonByManager as a parseerror

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 03:03:16PM +0200, Martijn van Duren wrote:
> 
> RFC2741 section 6.2.2 says that reasonByManager can only be used by the
> agentx master. Treat this reason as a parseerror.

ok tb



Re: snmpd [6/16]: support close reason for appl_agentx_free

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 02:59:52PM +0200, Martijn van Duren wrote:
> appl_agentx_free() closes any potential open sessions before closing the
> connection and cleaning up. This function is called from multiple
> contexts and the current APPL_CLOSE_REASONSHUTDOWN is not always
> applicable. Add a second reason parameter that can be passed onto
> appl_agentx_forceclose().

ok tb



Re: snmpd [5/16]: Check context existence in appl_agentx_recv

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 02:56:59PM +0200, Martijn van Duren wrote:
> application.c checks the context where applicable, but not every
> agentx-pdu goes through there (e.g. agentx-ping-pdu). Make sure
> we always check the context in appl_agentx_recv()

ok tb



Re: snmpd [4/16]: check agentx-pdu-header flags for validity

2023-10-24 Thread Theo Buehler
On Tue, Oct 17, 2023 at 02:54:07PM +0200, Martijn van Duren wrote:
> RFC2741 section 6.1 specifies which PDUs can contain which header flags.
> Check that that incoming agentx PDUs have valid flags in
> appl_agentx_recv(). While here I cleaned up a few log messages some
> minor restructuring to prevent the function growing too large.

ok tb



Re: relayd.conf.5: less SSL

2023-10-24 Thread Peter N. M. Hansteen
On Tue, Oct 24, 2023 at 06:54:30AM +, Klemens Nanni wrote:
> - parse.y still accepting undocumented "ssl" with a warning since 2014
> - more "SSL/TLS" instead of "TLS" in manual and code comments

my take would be that while it's fine to streamline the documentation to use
the modern terminology, I suspect there may still be ancient configurations
out there that use the "ssl" keyword, so removing the last bit of support for
that option should be accompanied by or preceded by a warning on relevant
mailing lists or at least in the commit message. 

And I think undeadly.org would be more than happy to help spread the word :)

- Peter

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
https://bsdly.blogspot.com/ https://www.bsdly.net/ https://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



relayd.conf.5: less SSL

2023-10-24 Thread Klemens Nanni
Wanted to learn about TLS usage in relayd(8) and thought these SSL history
bits in the TLS RELAYS section read out of place.

Index: relayd.conf.5
===
RCS file: /cvs/src/usr.sbin/relayd/relayd.conf.5,v
retrieving revision 1.206
diff -u -p -r1.206 relayd.conf.5
--- relayd.conf.5   6 Jun 2023 15:16:52 -   1.206
+++ relayd.conf.5   24 Oct 2023 06:37:47 -
@@ -728,8 +728,6 @@ In addition to plain TCP,
 .Xr relayd 8
 supports the Transport Layer Security (TLS) cryptographic protocol for
 authenticated and encrypted relays.
-TLS is the successor of the original Secure Sockets Layer (SSL) protocol,
-but the term SSL is sometimes still used in modern TLS-based applications.
 .Xr relayd 8
 can operate as a TLS client or server to offer a variety of options
 for different use cases related to TLS.

There's more:
- parse.y still accepting undocumented "ssl" with a warning since 2014
- more "SSL/TLS" instead of "TLS" in manual and code comments

In comparison, httpd.conf(5) just has a few "SSL/TLS" mentions and nc(1)
is "clean", so I'm inclined to dust off relayd a little (incl. parser).

Feedback?



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Klemens Nanni
On Mon, Oct 23, 2023 at 06:36:21PM -0300, Crystal Kolipe wrote:
> On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> > 10/16/23 04:02, Klemens Nanni ??:
> > > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > > thus using chunks with different partition types is not possible:
> > > 
> > >   # vmctl create -s100M disk.img
> > >   # vnd=`vnconfig disk.img`
> > >   # echo 'swap *' | disklabel -wAT- vnd0
> > > 
> > >   # disklabel $vnd | grep swap
> > > a:   2048000swap
> > >   # bioctl -c c -l ${vnd}a softraid0
> > >   softraid0: invalid metadata format
> > > 
> > > Correct the check.
> > > I don't expect this to break anything.
> > > amd64 biosboot boots off standard RAID 'a' as before.
> > > 
> > > Feedback? Objection? OK?
> > 
> > Ping.
> 
> This breaks booting off of a RAID that is not on partition 'a', on amd64.
> 
> Was this intentional?

No.

> 
> For example, if you have a RAID on 'd', with no 'a' partition at all, then
> with your patch the machine becomes unbootable.
> 
> The second stage bootloader doesn't automatically find the softraid volume.
> Manually booting the kernel from it results in a kernel panic when the
> kernel can't find the root filesystem.
> 
> Although booting from a RAID on a non-'a' partition is not supported on all
> archs, it has worked fine on amd64 for a long time, so it's quite possible
> that people have machines deployed that boot from other RAID partitions.

We don't recommend it, but this sentence in softraid(4) is the only thing I
can find actually talking about boot requirements wrt. chunks and labels:

On sparc64, bootable chunks must be RAID partitions using the letter 
‘a’.

It does imply that other architectures can boot from non-'a' partitions,
I've done so myself, you example shows it and I don't intent to change that.

> 
> This change would unexpectedly break them, and it would potentially be quite
> painful for any users who upgrade to 7.5 and find out afterwards that their
> machine doesn't boot, because the work-around would likely be to boot the
> ramdisk kernel, and unpack mdec/boot from the base package of the previous
> release then re-run installboot specifying the old mdec/boot.
> 
> That wouldn't be at all obvious to users without a lot of OpenBSD experience.

Rereading the code, I now question why it checks the 'a' label type at all.

Taking your sd0d example through devboot():

|#ifdef SOFTRAID
|   /*
|* Determine the partition type for the 'a' partition of the
|* boot device.
|*/
|   TAILQ_FOREACH(dip, , list)
|   if (dip->bios_info.bios_number == bootdev &&
|   (dip->bios_info.flags & BDI_BADLABEL) == 0)
|   part_type = dip->disklabel.d_partitions[0].p_fstype;

Whatever sd0a's type is...

|
|   /*
|* See if we booted from a disk that is a member of a bootable
|* softraid volume.
|*/
|   SLIST_FOREACH(bv, _volumes, sbv_link) {
|   if (bv->sbv_flags & BIOC_SCBOOTABLE)
|   SLIST_FOREACH(bc, >sbv_chunks, sbc_link)
|   if (bc->sbc_disk == bootdev)

... the boot loader sees we booted from a bootable softraid chunk,
matching disk sd0 by number and not partition.

|   sr_boot_vol = bv->sbv_unit;
|   if (sr_boot_vol != -1)
|   break;
|   }
|#endif
|
|   if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {

With softraid chunk on sd0d, sd0a happens to be not FFS (likely unused),
but that's still enough for the boot loader to stick to softraid on sd0!

|   *p++ = 's';
|   *p++ = 'r';
|   *p++ = '0' + sr_boot_vol;
|   } else if (bootdev & 0x100) {

So why check the 'a' label type if that's not relevant?
It is confusing and 
Doubling down on the assumption that bootable chunks are always on 'a'
like my diff did shows that's a mistake and dropping the check actually
makes more sense to me now.

This boots with root on softraid on sd0a and sd0d.

Thoughts?
Am I missing something?

Index: sys/arch/amd64//stand/libsa/dev_i386.c
===
RCS file: /cvs/src/sys/arch/amd64/stand/libsa/dev_i386.c,v
retrieving revision 1.23
diff -u -p -r1.23 dev_i386.c
--- sys/arch/amd64//stand/libsa/dev_i386.c  10 May 2019 21:20:43 -  
1.23
+++ sys/arch/amd64//stand/libsa/dev_i386.c  24 Oct 2023 01:43:06 -
@@ -103,22 +103,11 @@ devboot(dev_t bootdev, char *p)
 #ifdef SOFTRAID
struct sr_boot_volume *bv;
struct sr_boot_chunk *bc;
-   struct diskinfo *dip = NULL;
 #endif
int sr_boot_vol = -1;
-   int part_type = FS_UNUSED;
 
 #ifdef SOFTRAID
/*
-* Determine the partition type for the 'a' partition of the
-* boot 

Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Theo de Raadt
Crystal Kolipe  wrote:

`> Hi Theo, it's a long time since we last conversed.
> 
> On Mon, Oct 23, 2023 at 03:44:17PM -0600, Theo de Raadt wrote:
> > What user without OpenBSD experience is booting from 'd'?
> > 
> > Which also poses the question -- what user with OpenBSD experience
> > is booting from 'd'?
> > 
> > Why?
> 
> Some disklabel partitions have traditionally had specific meanings:
> 
> a - root fs
> b - swap
> c - whole disk
> d - on non-OpenBSD systems is 'whole disk' where c is not
> i - often used for non FFS partitions on OpenBSD
> 
> Why would you automatically make a RAID as parition 'a'?
> 
> It's not a root fs.
> 
> I don't see any logic behind RAID partition = 'a'.
> 
> What if you want more than one?
> 
> Booting from non-'a' softraids has never been discouraged on amd64.
> 
> It's been noted that on other archs it doesn't work, but there has
> never been a general advisory that booting from softraid volumes should
> only be done from 'a'.
> 
> > You say "quite possible that people have machines deployed that boot
> > from other RAID partitions"
> 
> After 10+ years of it working, do you really think that nobody has
> ever done it?



The way you write, it sounds personal.



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Crystal Kolipe
Hi Theo, it's a long time since we last conversed.

On Mon, Oct 23, 2023 at 03:44:17PM -0600, Theo de Raadt wrote:
> What user without OpenBSD experience is booting from 'd'?
> 
> Which also poses the question -- what user with OpenBSD experience
> is booting from 'd'?
> 
> Why?

Some disklabel partitions have traditionally had specific meanings:

a - root fs
b - swap
c - whole disk
d - on non-OpenBSD systems is 'whole disk' where c is not
i - often used for non FFS partitions on OpenBSD

Why would you automatically make a RAID as parition 'a'?

It's not a root fs.

I don't see any logic behind RAID partition = 'a'.

What if you want more than one?

Booting from non-'a' softraids has never been discouraged on amd64.

It's been noted that on other archs it doesn't work, but there has
never been a general advisory that booting from softraid volumes should
only be done from 'a'.

> You say "quite possible that people have machines deployed that boot
> from other RAID partitions"

After 10+ years of it working, do you really think that nobody has
ever done it?

> Who?  Just you?

If it's just me, I can easy fix my own machine.



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Theo de Raadt
Crystal Kolipe  wrote:

> On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> > 10/16/23 04:02, Klemens Nanni ??:
> > > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > > thus using chunks with different partition types is not possible:
> > > 
> > >   # vmctl create -s100M disk.img
> > >   # vnd=`vnconfig disk.img`
> > >   # echo 'swap *' | disklabel -wAT- vnd0
> > > 
> > >   # disklabel $vnd | grep swap
> > > a:   2048000swap
> > >   # bioctl -c c -l ${vnd}a softraid0
> > >   softraid0: invalid metadata format
> > > 
> > > Correct the check.
> > > I don't expect this to break anything.
> > > amd64 biosboot boots off standard RAID 'a' as before.
> > > 
> > > Feedback? Objection? OK?
> > 
> > Ping.
> 
> This breaks booting off of a RAID that is not on partition 'a', on amd64.
> 
> Was this intentional?
> 
> For example, if you have a RAID on 'd', with no 'a' partition at all, then
> with your patch the machine becomes unbootable.
> 
> The second stage bootloader doesn't automatically find the softraid volume.
> Manually booting the kernel from it results in a kernel panic when the
> kernel can't find the root filesystem.
> 
> Although booting from a RAID on a non-'a' partition is not supported on all
> archs, it has worked fine on amd64 for a long time, so it's quite possible
> that people have machines deployed that boot from other RAID partitions.
> 
> This change would unexpectedly break them, and it would potentially be quite
> painful for any users who upgrade to 7.5 and find out afterwards that their
> machine doesn't boot, because the work-around would likely be to boot the
> ramdisk kernel, and unpack mdec/boot from the base package of the previous
> release then re-run installboot specifying the old mdec/boot.
> 
> That wouldn't be at all obvious to users without a lot of OpenBSD experience.


What user without OpenBSD experience is booting from 'd'?

Which also poses the question -- what user with OpenBSD experience
is booting from 'd'?

Why?


You say "quite possible that people have machines deployed that boot
from other RAID partitions"

Who?  Just you?



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Crystal Kolipe
On Mon, Oct 23, 2023 at 11:04:07AM +, Klemens Nanni wrote:
> 10/16/23 04:02, Klemens Nanni ??:
> > The current check implies one could use, e.g. SWAP or MSDOS partitions
> > as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> > thus using chunks with different partition types is not possible:
> > 
> > # vmctl create -s100M disk.img
> > # vnd=`vnconfig disk.img`
> > # echo 'swap *' | disklabel -wAT- vnd0
> > 
> > # disklabel $vnd | grep swap
> >   a:   2048000swap
> > # bioctl -c c -l ${vnd}a softraid0
> > softraid0: invalid metadata format
> > 
> > Correct the check.
> > I don't expect this to break anything.
> > amd64 biosboot boots off standard RAID 'a' as before.
> > 
> > Feedback? Objection? OK?
> 
> Ping.

This breaks booting off of a RAID that is not on partition 'a', on amd64.

Was this intentional?

For example, if you have a RAID on 'd', with no 'a' partition at all, then
with your patch the machine becomes unbootable.

The second stage bootloader doesn't automatically find the softraid volume.
Manually booting the kernel from it results in a kernel panic when the
kernel can't find the root filesystem.

Although booting from a RAID on a non-'a' partition is not supported on all
archs, it has worked fine on amd64 for a long time, so it's quite possible
that people have machines deployed that boot from other RAID partitions.

This change would unexpectedly break them, and it would potentially be quite
painful for any users who upgrade to 7.5 and find out afterwards that their
machine doesn't boot, because the work-around would likely be to boot the
ramdisk kernel, and unpack mdec/boot from the base package of the previous
release then re-run installboot specifying the old mdec/boot.

That wouldn't be at all obvious to users without a lot of OpenBSD experience.



Re: fix an error in flowspec_get_addr()

2023-10-23 Thread Theo Buehler
On Mon, Oct 23, 2023 at 12:50:39PM +0200, Claudio Jeker wrote:
> So flowspec_get_addr() in the IPv6 case is utterly complicated.

Indeed. It would probably be worthwhile to add a bit of regress
exercising the branches of this function.

> Since matching can be done on some sub-part of the prefix.
> So there is this shift_right() call that moves takes care of this special
> offset.
> 
> Now the shift_right call uses *olen but should actually use xoff instead.
> *olen is set much later in the code.

much = 3 lines? :)

> This should fix:
> https://github.com/openbgpd-portable/openbgpd-portable/security/code-scanning/2

Indeed.

ok

> -- 
> :wq Claudio
> 
> Index: flowspec.c
> ===
> RCS file: /cvs/src/usr.sbin/bgpd/flowspec.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 flowspec.c
> --- flowspec.c19 Apr 2023 09:31:58 -  1.4
> +++ flowspec.c23 Oct 2023 10:44:22 -
> @@ -366,7 +366,7 @@ flowspec_get_addr(const uint8_t *flow, i
>   if (extract_prefix(comp + 2, complen - 2, buf, xlen,
>   sizeof(buf)) == -1)
>   return -1;
> - shift_right(addr->v6.s6_addr, buf, *olen, xlen);
> + shift_right(addr->v6.s6_addr, buf, xoff, xlen);
>   *plen = comp[0];
>   if (olen != NULL)
>   *olen = comp[1];
> 



Re: boot loaders: softraid volumes must be on RAID partitions

2023-10-23 Thread Klemens Nanni
10/16/23 04:02, Klemens Nanni пишет:
> The current check implies one could use, e.g. SWAP or MSDOS partitions
> as softraid(4) chunks, but sys/dev/softraid.c always expects FS_RAID,
> thus using chunks with different partition types is not possible:
> 
>   # vmctl create -s100M disk.img
>   # vnd=`vnconfig disk.img`
>   # echo 'swap *' | disklabel -wAT- vnd0
> 
>   # disklabel $vnd | grep swap
> a:   2048000swap
>   # bioctl -c c -l ${vnd}a softraid0
>   softraid0: invalid metadata format
> 
> Correct the check.
> I don't expect this to break anything.
> amd64 biosboot boots off standard RAID 'a' as before.
> 
> Feedback? Objection? OK?

Ping.

> 
> Index: arch/amd64/stand/efiboot/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/efiboot/dev_i386.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 dev_i386.c
> --- arch/amd64/stand/efiboot/dev_i386.c   10 May 2019 21:20:42 -  
> 1.1
> +++ arch/amd64/stand/efiboot/dev_i386.c   16 Oct 2023 00:33:14 -
> @@ -149,7 +149,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/amd64/stand/libsa/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/amd64/stand/libsa/dev_i386.c,v
> retrieving revision 1.23
> diff -u -p -r1.23 dev_i386.c
> --- arch/amd64/stand/libsa/dev_i386.c 10 May 2019 21:20:43 -  1.23
> +++ arch/amd64/stand/libsa/dev_i386.c 16 Oct 2023 00:31:35 -
> @@ -132,7 +132,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/arm64/stand/efiboot/efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/arm64/stand/efiboot/efiboot.c,v
> retrieving revision 1.48
> diff -u -p -r1.48 efiboot.c
> --- arch/arm64/stand/efiboot/efiboot.c12 May 2023 16:43:00 -  
> 1.48
> +++ arch/arm64/stand/efiboot/efiboot.c16 Oct 2023 00:34:17 -
> @@ -864,7 +864,7 @@ devboot(dev_t dev, char *p)
>   break;
>   }
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   strlcpy(p, "sr0a", 5);
>   p[2] = '0' + sr_boot_vol;
>   return;
> Index: arch/i386/stand/libsa/dev_i386.c
> ===
> RCS file: /cvs/src/sys/arch/i386/stand/libsa/dev_i386.c,v
> retrieving revision 1.43
> diff -u -p -r1.43 dev_i386.c
> --- arch/i386/stand/libsa/dev_i386.c  11 Sep 2016 17:52:47 -  1.43
> +++ arch/i386/stand/libsa/dev_i386.c  16 Oct 2023 00:34:45 -
> @@ -132,7 +132,7 @@ devboot(dev_t bootdev, char *p)
>   }
>  #endif
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   *p++ = 's';
>   *p++ = 'r';
>   *p++ = '0' + sr_boot_vol;
> Index: arch/riscv64/stand/efiboot/efiboot.c
> ===
> RCS file: /cvs/src/sys/arch/riscv64/stand/efiboot/efiboot.c,v
> retrieving revision 1.6
> diff -u -p -r1.6 efiboot.c
> --- arch/riscv64/stand/efiboot/efiboot.c  5 Jul 2023 09:25:55 -   
> 1.6
> +++ arch/riscv64/stand/efiboot/efiboot.c  16 Oct 2023 00:35:14 -
> @@ -742,7 +742,7 @@ devboot(dev_t dev, char *p)
>   break;
>   }
>  
> - if (sr_boot_vol != -1 && part_type != FS_BSDFFS) {
> + if (sr_boot_vol != -1 && part_type == FS_RAID) {
>   strlcpy(p, "sr0a", 5);
>   p[2] = '0' + sr_boot_vol;
>   return;
> 



fix an error in flowspec_get_addr()

2023-10-23 Thread Claudio Jeker
So flowspec_get_addr() in the IPv6 case is utterly complicated.
Since matching can be done on some sub-part of the prefix.
So there is this shift_right() call that moves takes care of this special
offset.

Now the shift_right call uses *olen but should actually use xoff instead.
*olen is set much later in the code.

This should fix:
https://github.com/openbgpd-portable/openbgpd-portable/security/code-scanning/2
-- 
:wq Claudio

Index: flowspec.c
===
RCS file: /cvs/src/usr.sbin/bgpd/flowspec.c,v
retrieving revision 1.4
diff -u -p -r1.4 flowspec.c
--- flowspec.c  19 Apr 2023 09:31:58 -  1.4
+++ flowspec.c  23 Oct 2023 10:44:22 -
@@ -366,7 +366,7 @@ flowspec_get_addr(const uint8_t *flow, i
if (extract_prefix(comp + 2, complen - 2, buf, xlen,
sizeof(buf)) == -1)
return -1;
-   shift_right(addr->v6.s6_addr, buf, *olen, xlen);
+   shift_right(addr->v6.s6_addr, buf, xoff, xlen);
*plen = comp[0];
if (olen != NULL)
*olen = comp[1];



Re: Term::Cap full revamp

2023-10-22 Thread Matthieu Herrb
On Fri, Oct 20, 2023 at 11:52:25AM +0200, Marc Espie wrote:
> I guess i will probably leave it alone after this.
> This does quite a few things compared to my former patches.
> 
> - totally get rid of eval, it doen't make sense anymore
> - declare variables before they get used, which tends to
> simplify things.
> - change quaint formatting to something more BSD like
> - update documentation to new style of doing OO
> - use defined logic on entry and such
> - always try to run infocmp as a last resort, even if
> we have a path.
> - run infocmp with the best options we have to get a good termcap
> - use \Q\E, which gets rid of termpat entirely
> - dedup the path along the way: for us, /etc/termcap
> and /usr/share/misc/termcap are the same.
> - redo recursion logic by just recording which term values we
> already saw, the max=32 value overflow was absurd, proper parsing
> yields roughly 10 or so tc redirections for xterm, not >32.

I eventually got an occation to test it. Except for the extra debug
print that sneaked. it works well for my use case. (pkg_* tools on a
terminal application from ports that ships its own terminfo entry).

ok matthieu@ with the print removed (see inline).

> 
> Index: Cap.pm
> ===
> RCS file: /cvs/src/gnu/usr.bin/perl/cpan/Term-Cap/Cap.pm,v
> retrieving revision 1.3
> diff -u -p -r1.3 Cap.pm
> --- Cap.pm18 Oct 2023 01:49:26 -  1.3
> +++ Cap.pm20 Oct 2023 09:47:05 -
> @@ -16,8 +16,8 @@ sub croak
>  
>  use strict;
>  
> +use v5.16;
>  use vars qw($VERSION $VMS_TERMCAP);
> -use vars qw($termpat $state $first $entry);
>  
>  $VERSION = '1.17';
>  
> @@ -33,7 +33,7 @@ Term::Cap - Perl termcap interface
>  =head1 SYNOPSIS
>  
>  require Term::Cap;
> -$terminal = Tgetent Term::Cap { TERM => undef, OSPEED => $ospeed };
> +$terminal = Term::Cap->Tgetent({ TERM => undef, OSPEED => $ospeed });
>  $terminal->Trequire(qw/ce ku kd/);
>  $terminal->Tgoto('cm', $col, $row, $FH);
>  $terminal->Tputs('dl', $count, $FH);
> @@ -75,10 +75,10 @@ if ( $^O eq 'VMS' )
>  
>  sub termcap_path
>  {## private
> -my @termcap_path;
> +my @l;
>  
>  # $TERMCAP, if it's a filespec
> -push( @termcap_path, $ENV{TERMCAP} )
> +push(@l, $ENV{TERMCAP})
>if (
>  ( exists $ENV{TERMCAP} )
>  && (
> @@ -87,23 +87,27 @@ sub termcap_path
>  : $ENV{TERMCAP} =~ /^\//s
>  )
>);
> -if ( ( exists $ENV{TERMPATH} ) && ( $ENV{TERMPATH} ) )
> -{
> -
> +if (exists $ENV{TERMPATH} && $ENV{TERMPATH}) {
>  # Add the users $TERMPATH
> -push( @termcap_path, split( /(:|\s+)/, $ENV{TERMPATH} ) );
> -}
> -else
> -{
> -
> +push(@l, split( /(:|\s+)/, $ENV{TERMPATH}));
> +} else {
>  # Defaults
> -push( @termcap_path,
> -exists $ENV{'HOME'} ? $ENV{'HOME'} . '/.termcap' : undef,
> -'/etc/termcap', '/usr/share/misc/termcap', );
> + if (exists $ENV{HOME}) {
> + push(@l, $ENV{HOME}.'/.termcap');
> + }
> +push(@l, '/etc/termcap', '/usr/share/misc/termcap', );
> +}
> +my @termcap_path;
> +my $seen = {};
> +for my $i (@l) {
> + next unless -f $i;
> + my $k = join(',', (stat _)[0,1]);
> + next if $seen->{$k};
> + push(@termcap_path, $i);
> + $seen->{$k} = 1;
>  }
>  
> -# return the list of those termcaps that exist
> -return grep { defined $_ && -f $_ } @termcap_path;
> +return @termcap_path;
>  }
>  
>  =over 4
> @@ -164,195 +168,158 @@ It calls C on failure.
>  
>  sub Tgetent
>  {## public -- static method
> -my $class = shift;
> -my ($self) = @_;
> +my ($class, $self) = @_;
>  
>  $self = {} unless defined $self;
>  bless $self, $class;
>  
> -my ( $term, $cap, $search, $field, $max, $tmp_term, $TERMCAP );
> -local ( $termpat, $state, $first, $entry );# used inside eval
> +my ($cap, $field);
> + 
>  local $_;
>  
>  # Compute PADDING factor from OSPEED (to be used by Tpad)
> -if ( !$self->{OSPEED} )
> -{
> -if ($^W)
> -{
> +if (!$self->{OSPEED}) {
> +if ($^W) {
>  carp "OSPEED was not set, defaulting to 9600";
>  }
>  $self->{OSPEED} = 9600;
>  }
> -if ( $self->{OSPEED} < 16 )
> -{
> -
> +if ($self->{OSPEED} < 16) {
>  # delays for old style speeds
>  my @pad = (
>  0,200, 133.3, 90.9, 74.3, 66.7, 50, 33.3,
>  16.7, 8.3, 5.5,   4.1,  2,1,.5, .2
>  );
>  $self->{PADDING} = $pad[ $self->{OSPEED} ];
> -}
> -else
> -{
> +} else {
>  $self->{PADDING} = 1 / $self->{OSPEED};
>  }
>  
> -unless ( $self->{TERM} )
> -{
> -   if ( $ENV{TERM} )
> -   {
> - $self->{TERM} =  $ENV{TERM} ;
> -   }
> -   else
> -   {
> -  if ( $^O eq 

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-22 Thread Martin Pieuchot
On 22/10/23(Sun) 20:29, Miod Vallat wrote:
> > On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > > > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > > > are seeing.  It might not.
> > > 
> > > It makes the aforementioned issue disappear on the affected machine.
> > 
> > Thanks at lot for testing!
> 
> Spoke too soon. I have just hit
> 
> panic: kernel diagnostic assertion "uvmexp.swpgonly > 0" failed: file 
> "/usr/src/sys/uvm/uvm_anon.c", line 121
> Stopped at  db_enter+0x8:   add #0x4, r14
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *235984  11904  0 0x14000  0x2000  reaper
> db_enter() at db_enter+0x8
> panic() at panic+0x74
> __assert() at __assert+0x1c
> uvm_anfree_list() at uvm_anfree_list+0x156
> amap_wipeout() at amap_wipeout+0xe6
> uvm_unmap_detach() at uvm_unmap_detach+0x42
> uvm_map_teardown() at uvm_map_teardown+0x104
> uvmspace_free() at uvmspace_free+0x2a
> reaper() at reaper+0x86
> ddb> show uvmexp
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   14875 VM pages: 376 active, 2076 inactive, 1 wired, 7418 free (924
> zero)
>   min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
>   freemin=495, free-target=660, inactive-target=2809, wired-max=4958
>   faults=73331603, traps=39755714, intrs=33863551, ctxswitch=11641480
> fpuswitch
> =0
>   softint=15742561, syscalls=39755712, kmapent=11
>   fault counts:
> noram=1, noanon=0, noamap=0, pgwait=1629, pgrele=0
> ok relocks(total)=1523991(1524022), anget(retries)=23905247(950233),
> amapco
> py=9049749
> neighbor anon/obj pg=12025732/40041442,
> gets(lock/unlock)=12859247/574102
> cases: anon=20680175, anoncow=3225049, obj=11467884, prcopy=1391019,
> przero
> =36545783
>   daemon and swap counts:
> woke=6868, revs=6246, scans=3525644, obscans=511526, anscans=2930634
> busy=0, freed=1973275, reactivate=83484, deactivate=3941988
> pageouts=94506, pending=94506, nswget=949421
> nswapdev=1
> swpages=4194415, swpginuse=621, swpgonly=0 paging=0
>   kernel pointers:
> objs(kern)=0x8c3ca94c

I wonder if the diff below makes a difference.  It's hard to debug and it
might be worth adding a counter for bad swap slots.

Index: uvm/uvm_anon.c
===
RCS file: /cvs/src/sys/uvm/uvm_anon.c,v
retrieving revision 1.56
diff -u -p -r1.56 uvm_anon.c
--- uvm/uvm_anon.c  2 Sep 2023 08:24:40 -   1.56
+++ uvm/uvm_anon.c  22 Oct 2023 21:27:42 -
@@ -116,7 +116,7 @@ uvm_anfree_list(struct vm_anon *anon, st
uvm_unlock_pageq(); /* free the daemon */
}
} else {
-   if (anon->an_swslot != 0) {
+   if (anon->an_swslot != 0 && anon->an_swslot != SWSLOT_BAD) {
/* This page is no longer only in swap. */
KASSERT(uvmexp.swpgonly > 0);
atomic_dec_int();



Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-22 Thread Miod Vallat
> On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > > are seeing.  It might not.
> > 
> > It makes the aforementioned issue disappear on the affected machine.
> 
> Thanks at lot for testing!

Spoke too soon. I have just hit

panic: kernel diagnostic assertion "uvmexp.swpgonly > 0" failed: file 
"/usr/src/sys/uvm/uvm_anon.c", line 121
Stopped at  db_enter+0x8:   add #0x4, r14
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
*235984  11904  0 0x14000  0x2000  reaper
db_enter() at db_enter+0x8
panic() at panic+0x74
__assert() at __assert+0x1c
uvm_anfree_list() at uvm_anfree_list+0x156
amap_wipeout() at amap_wipeout+0xe6
uvm_unmap_detach() at uvm_unmap_detach+0x42
uvm_map_teardown() at uvm_map_teardown+0x104
uvmspace_free() at uvmspace_free+0x2a
reaper() at reaper+0x86
ddb> show uvmexp
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  14875 VM pages: 376 active, 2076 inactive, 1 wired, 7418 free (924
zero)
  min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
  freemin=495, free-target=660, inactive-target=2809, wired-max=4958
  faults=73331603, traps=39755714, intrs=33863551, ctxswitch=11641480
fpuswitch
=0
  softint=15742561, syscalls=39755712, kmapent=11
  fault counts:
noram=1, noanon=0, noamap=0, pgwait=1629, pgrele=0
ok relocks(total)=1523991(1524022), anget(retries)=23905247(950233),
amapco
py=9049749
neighbor anon/obj pg=12025732/40041442,
gets(lock/unlock)=12859247/574102
cases: anon=20680175, anoncow=3225049, obj=11467884, prcopy=1391019,
przero
=36545783
  daemon and swap counts:
woke=6868, revs=6246, scans=3525644, obscans=511526, anscans=2930634
busy=0, freed=1973275, reactivate=83484, deactivate=3941988
pageouts=94506, pending=94506, nswget=949421
nswapdev=1
swpages=4194415, swpginuse=621, swpgonly=0 paging=0
  kernel pointers:
objs(kern)=0x8c3ca94c



Re: nfsd: don't clear SB_NOINTR flag

2023-10-22 Thread Vitaliy Makkoveev
On Fri, Oct 20, 2023 at 10:51:46PM +0200, Alexander Bluhm wrote:
> On Mon, Oct 16, 2023 at 10:17:50PM +0300, Vitaliy Makkoveev wrote:
> > This socket comes from userland, so this flag is never set. This makes
> > SB_NOINTR flag immutable, because we only set this bit on NFS client
> > socket buffers for all it's lifetime.
> > 
> > I want to do this because this flag modifies sblock() behaviour and it's
> > not clean which lock should protect it for standalone sblock().
> > 
> > ok?
> 
> This code came here:
> https://svnweb.freebsd.org/csrg/sys/nfs/nfs_syscalls.c?revision=41903=markup
> "update from Rick Macklem adding TCP support to NFS"
> 
> I would guess that this flag must be cleared to allow to kill the
> NFS server if a client does not respond.  Otherwise it may hang in
> sbwait() in soreceive() or sosend().
> 
> As the flags are never set, your diff does not change behavior.
> 
> > I want to completely remove SB_NOINTR flag. Only NFS client sets it, but
> > since the socket never passed to userland, this flag is useless, because
> > we can't send singnal to kernel thread. So for this sblock()/sbwait()
> > sleep is uninterruptible. But I want to not mix NFS server and NFS
> > client diffs.
> 
> The NFS client does not run as kernel thread, but as the process
> that does file system access.  You can Ctrl-C it if mount uses
> "intr" option.  NFSMNT_INT sets some PCATCH and sb_timeo_nsecs, but
> I think signal should also abort sbwait().  That is why NFS client
> sets SB_NOINTR.
> 
> As this flag is only set during mount or reconnect.  It is a problem
> for MP?  Is it modified in a non-initialization path?
> 
> bluhm
> 

Sorry for non clean explanation.

This time solock() protects `sb_flags', so the following SB_NOINTR flag
check is fine:

sblock(struct socket *so, struct sockbuf *sb, int flags)
{
int error, prio = PSOCK;

soassertlocked(so);

if ((sb->sb_flags & SB_LOCK) == 0) {
sb->sb_flags |= SB_LOCK;
return (0);
}
if ((flags & SBL_WAIT) == 0)
return (EWOULDBLOCK);
if (!(flags & SBL_NOINTR || sb->sb_flags & SB_NOINTR))
prio |= PCATCH;

However, solock() will not protect `sb_flags' for standalone sblock().
It is not yet implemented, so here is the hypothetical sblock(). I
suppose the `sb_lock' rwlock(9) will protect the whole sockbuf data
include `sb_flags'. And there is the problem. As you can see, the
SB_NOINTR check presceeds the `sb_lock' acquisition, because the
rw_enter(9) behaviour depends on this flag.

sblock(struct sockbuf *sb, int flags)
{   
int lockflags = RW_WRITE;

if (flags & SBL_NOINTR || sb->sb_flags & SB_NOINTR)
lockflags |= RW_INTR;
if ((flags & SBL_WAIT) == 0)
lockflags |= RW_NOSLEEP;
return rw_enter(>sb_lock, lockflags);
}

I the case, when SB_NOINTR flag is immutable, there is no problem with
this unlocked check. Since this flag is really immutable and it's only
cleanup is the null op, I proposed to remove the flag cleanup.

However, this is not the only solution. This time this flag is not true
for sblock() because you always need to acquire solock() before. The
solock() acquisition is always uninterruptible, so ^C will not work. But
we assume the rwlock(9) acquisition doesn't take significant time,
right?

So, hypothetical sblock() could avoid the SB_NOINTR check. We have
SBL_NOINTR passed with `flags' for the sorflush() cases. This flag is
significant for sbwait(), but with the socket buffers standalone
locking, obviously sblock() should be taken before sbwait(), so the
`sb_flags' check will be protected:

sblock(struct sockbuf *sb, int flags)
{   
int lockflags = RW_WRITE;

if (flags & SBL_NOINTR)
lockflags |= RW_INTR;
if ((flags & SBL_WAIT) == 0)
lockflags |= RW_NOSLEEP;
return rw_enter(>sb_lock, lockflags);
}

sbwait(struct sockbuf *sb)
{   
int prio = (sb->sb_flags & SB_NOINTR) ? PSOCK : PSOCK | PCATCH;

sbassertlocked(sb);

return rwsleep_nsec(>sb_cc, >sb_lock, prio, "sbwait",
sb->sb_timeo_nsecs);
}

Of course this is hypothetical. Also, SB_NOINTR flag has sense for TCP
and UDP sockets only. I don't think they will be the first sockets with
standalone locking ability for buffers. The unlocked SB_NOINTR flag
check is fine for all other sockets, so it could be left as is for now.



Re: relayd does not delete control socket on shutdown

2023-10-22 Thread Theo de Raadt
Otto Moerbeek  wrote:

> On Sat, Oct 21, 2023 at 10:40:45PM +0300, Kapetanakis Giannis wrote:
> 
> > On 21/10/2023 20:39, Florian Obser wrote:
> > > Which was 8 years ago. I don't understand why you see a change in 7.4.
> > > 
> > > Anyway, we decided to not clean up control sockets in any of our
> > > privsep daemons because leaving them behind does not cause any issues.
> > 
> > I just noticed it today when I tried to use the socket in a script and
> > noticed that it stayed there even after shutdown and though it was after 7.4
> > but I was wrong about that.
> > 
> > Your commit made it that clear.
> > 
> > Agree it's not a big case if it stays there.
> > 
> > Would the unlink succeed if the socket was owned by _relayd?
> > 
> > G
> 
> Unlinking somthing requires write permissions to the directory it is
> in.

Which means an attacker who gains control, but otherwise can't do a bunch
of other things becuase of the privsep design -- could still fill the directory
and filesystem.

So a few years ago we asked ourselves -- is the tradeoff worth it?



Re: relayd does not delete control socket on shutdown

2023-10-22 Thread Otto Moerbeek
On Sat, Oct 21, 2023 at 10:40:45PM +0300, Kapetanakis Giannis wrote:

> On 21/10/2023 20:39, Florian Obser wrote:
> > Which was 8 years ago. I don't understand why you see a change in 7.4.
> > 
> > Anyway, we decided to not clean up control sockets in any of our
> > privsep daemons because leaving them behind does not cause any issues.
> 
> I just noticed it today when I tried to use the socket in a script and
> noticed that it stayed there even after shutdown and though it was after 7.4
> but I was wrong about that.
> 
> Your commit made it that clear.
> 
> Agree it's not a big case if it stays there.
> 
> Would the unlink succeed if the socket was owned by _relayd?
> 
> G

Unlinking somthing requires write permissions to the directory it is
in.

-Otto



Re: malloc: more info in error message for write-after-free with option D

2023-10-22 Thread Masato Asou
Hi,

I wanted an extension to malloc() that would report the caller of all
memory leaks.  It works fine for me!

ok asou@
--
ASOU Masato

From: Otto Moerbeek 
Date: Tue, 10 Oct 2023 12:39:00 +0200

> Hi,
> 
> This diff adds better error reporting for write-after-free or the more
> general write of free memory if malloc option D is active. Knowing the
> place where allocations were done often helps to find out where the
> overwrite happened.
> 
> If option D is active malloc now saves caller info in a separate page
> instead of only doing that for chunk index 0. It also reports about
> the preceding chunk if applicable. A report looks like this:
> 
> X(12489) in calloc(): write to free chunk 0x851ec6066d0[0..7]@16
> allocated at /usr/X11R6/lib/modules/dri/radeonsi_dri.so 0x88c949
> (preceding chunk 0x851ec6066c0 allocated at /usr/X11R6/bin/X 0x177374 (now 
> free))
> 
> You can use addr2line -e to get the file and linenumber the allocation
> was done (if the object was compiled with debug info).
> 
> If D is not used only the first part is printed, as no caller info is
> saved. So no extra overhead if D is not useed.
> 
> Also: the leak report now do not contain unknown locations anymore, as
> each allocation gets its caller recorded if D is active.
> 
>   -Otto
> 
> Index: stdlib/malloc.3
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.3,v
> retrieving revision 1.137
> diff -u -p -r1.137 malloc.3
> --- stdlib/malloc.3   1 Jul 2023 18:35:14 -   1.137
> +++ stdlib/malloc.3   10 Oct 2023 10:23:19 -
> @@ -307,7 +307,7 @@ These malloc options imply
>  .Cm D .
>  .It Cm F
>  .Dq Freecheck .
> -Enable more extensive double free and use after free detection.
> +Enable more extensive double free and write after free detection.
>  All chunks in the delayed free list will be checked for double frees and
>  write after frees.
>  Unused pages on the freelist are read and write protected to
> @@ -641,18 +641,34 @@ or
>  reallocate an unallocated pointer was made.
>  .It Dq double free
>  There was an attempt to free an allocation that had already been freed.
> -.It Dq write after free
> -An allocation has been modified after it was freed.
> +.It Dq write of free mem Va address Ns [ Va start Ns .. Ns Va end Ns ]@ Ns 
> Va size
> +An allocation has been modified after it was freed,
> +or a chunk that was never allocated was written to.
> +The
> +.Va range
> +at which corruption was detected is printed between [ and ].
> +.Pp
> +Enabling option
> +.Cm D
> +allows malloc to print information about where the allocation
> +was done.
>  .It Dq modified chunk-pointer
>  The pointer passed to
>  .Fn free
>  or a reallocation function has been modified.
> -.It Dq canary corrupted address offset@length
> -A byte after the requested size has been overwritten,
> +.It Dq canary corrupted Va address Ns [ Va offset Ns ]@ Ns Va length Ns / Ns 
> Va size
> +A byte after the requested
> +.Va length has been overwritten,
>  indicating a heap overflow.
> -The offset at which corruption was detected is printed before the @,
> -and the requested length of the allocation after the @.
> -.It Dq recorded size oldsize inconsistent with size
> +The
> +.Va offset
> +at which corruption was detected is printed between [ and ],
> +the requested
> +.Va length
> +of the allocation is printed before the / and the
> +.Va size
> +of the allocation after the /.
> +.It Dq recorded size Va oldsize No inconsistent with Va size
>  .Fn recallocarray
>  or
>  .Fn freezero
> @@ -676,7 +692,7 @@ functions nor utilize any other function
>  (e.g.,
>  .Xr stdio 3
>  routines).
> -.It Dq unknown char in MALLOC_OPTIONS
> +.It Dq unknown char in Ev MALLOC_OPTIONS
>  We found something we didn't understand.
>  .It any other error
>  .Fn malloc
> Index: stdlib/malloc.c
> ===
> RCS file: /home/cvs/src/lib/libc/stdlib/malloc.c,v
> retrieving revision 1.290
> diff -u -p -r1.290 malloc.c
> --- stdlib/malloc.c   9 Sep 2023 06:52:40 -   1.290
> +++ stdlib/malloc.c   10 Oct 2023 10:23:19 -
> @@ -112,7 +112,7 @@ struct region_info {
>   void *p;/* page; low bits used to mark chunks */
>   uintptr_t size; /* size for pages, or chunk_info pointer */
>  #ifdef MALLOC_STATS
> - void *f;/* where allocated from */
> + void **f;   /* where allocated from */
>  #endif
>  };
>  
> @@ -146,7 +146,7 @@ struct dir_info {
>   size_t regions_total;   /* number of region slots */
>   size_t regions_free;/* number of free slots */
>   size_t rbytesused;  /* random bytes used */
> - char *func; /* current function */
> + const char *func;   /* current function */
>   int malloc_junk;/* junk fill? */
>   int mmap_flag;  /* extra flag for mmap */
>  

Re: bt(5), btrace(8): execute END probe and print maps after exit() statement

2023-10-21 Thread Scott Cheloha
On Sat, Oct 21, 2023 at 07:17:05PM +0200, Martin Pieuchot wrote:
> On 18/10/23(Wed) 12:56, Scott Cheloha wrote:
> > Hi,
> > 
> > A bt(5) exit() statement causes the btrace(8) interpreter to exit(3)
> > immediately.
> > 
> > A BPFtrace exit() statement is more nuanced: the END probe is executed
> > and the contents of all maps are printed before the interpreter exits.
> > 
> > This patch adds a halting check after the execution of each bt(5)
> > statement.  If a statement causes the program to halt, the halt
> > bubbles up to the top-level rule evaluation loop and terminates
> > execution.  rules_teardown() then runs, just as if the program had
> > received SIGTERM.
> > 
> > Two edge-like cases:
> > 
> > 1. You can exit() from BEGIN.  rules_setup() returns non-zero if this
> >happens so the main loop knows to halt immediately.
> > 
> > 2. You can exit() from END.  This is just an early-return: the END probe
> >doesn't run again.
> > 
> > Thoughts?
> 
> Makes sense to ease the transition from bpftrace scripts.  Ok with me if
> you make sure the regression tests still pass.  Some outputs might
> depend on the actual behavior and would need to be updated.

Agh, my mistake, there are two tests that depend on the current
behavior.  I have updated them below.

ok with the test fixes?

Index: usr.sbin/btrace/btrace.c
===
RCS file: /cvs/src/usr.sbin/btrace/btrace.c,v
retrieving revision 1.79
diff -u -p -r1.79 btrace.c
--- usr.sbin/btrace/btrace.c12 Oct 2023 15:16:44 -  1.79
+++ usr.sbin/btrace/btrace.c22 Oct 2023 01:21:21 -
@@ -71,10 +71,10 @@ struct dtioc_probe_info *dtpi_get_by_val
  * Main loop and rule evaluation.
  */
 voidrules_do(int);
-voidrules_setup(int);
-voidrules_apply(int, struct dt_evt *);
+int rules_setup(int);
+int rules_apply(int, struct dt_evt *);
 voidrules_teardown(int);
-voidrule_eval(struct bt_rule *, struct dt_evt *);
+int rule_eval(struct bt_rule *, struct dt_evt *);
 voidrule_printmaps(struct bt_rule *);
 
 /*
@@ -84,7 +84,7 @@ uint64_t   builtin_nsecs(struct dt_evt *
 const char *builtin_kstack(struct dt_evt *);
 const char *builtin_arg(struct dt_evt *, enum bt_argtype);
 struct bt_arg  *fn_str(struct bt_arg *, struct dt_evt *, char *);
-voidstmt_eval(struct bt_stmt *, struct dt_evt *);
+int stmt_eval(struct bt_stmt *, struct dt_evt *);
 voidstmt_bucketize(struct bt_stmt *, struct dt_evt *);
 voidstmt_clear(struct bt_stmt *);
 voidstmt_delete(struct bt_stmt *, struct dt_evt *);
@@ -405,6 +405,7 @@ void
 rules_do(int fd)
 {
struct sigaction sa;
+   int halt = 0;
 
memset(, 0, sizeof(sa));
sigemptyset(_mask);
@@ -415,9 +416,9 @@ rules_do(int fd)
if (sigaction(SIGTERM, , NULL))
err(1, "sigaction");
 
-   rules_setup(fd);
+   halt = rules_setup(fd);
 
-   while (!quit_pending && g_nprobes > 0) {
+   while (!quit_pending && !halt && g_nprobes > 0) {
static struct dt_evt devtbuf[64];
ssize_t rlen;
size_t i;
@@ -434,8 +435,11 @@ rules_do(int fd)
if ((rlen % sizeof(struct dt_evt)) != 0)
err(1, "incorrect read");
 
-   for (i = 0; i < rlen / sizeof(struct dt_evt); i++)
-   rules_apply(fd, [i]);
+   for (i = 0; i < rlen / sizeof(struct dt_evt); i++) {
+   halt = rules_apply(fd, [i]);
+   if (halt)
+   break;
+   }
}
 
rules_teardown(fd);
@@ -484,7 +488,7 @@ rules_action_scan(struct bt_stmt *bs)
return evtflags;
 }
 
-void
+int
 rules_setup(int fd)
 {
struct dtioc_probe_info *dtpi;
@@ -493,7 +497,7 @@ rules_setup(int fd)
struct bt_probe *bp;
struct bt_stmt *bs;
struct bt_arg *ba;
-   int dokstack = 0, on = 1;
+   int dokstack = 0, halt = 0, on = 1;
uint64_t evtflags;
 
TAILQ_FOREACH(r, _rules, br_next) {
@@ -553,7 +557,7 @@ rules_setup(int fd)
clock_gettime(CLOCK_REALTIME, _devt.dtev_tsp);
 
if (rbegin)
-   rule_eval(rbegin, _devt);
+   halt = rule_eval(rbegin, _devt);
 
/* Enable all probes */
TAILQ_FOREACH(r, _rules, br_next) {
@@ -571,9 +575,14 @@ rules_setup(int fd)
if (ioctl(fd, DTIOCRECORD, ))
err(1, "DTIOCRECORD");
}
+
+   return halt;
 }
 
-void
+/*
+ * Returns non-zero if the program should halt.
+ */
+int
 rules_apply(int fd, struct dt_evt *dtev)
 {
struct bt_rule *r;
@@ -586,9 +595,11 @@ rules_apply(int fd, 

Re: Virtio fix for testing

2023-10-21 Thread Andrew Cagney
On Mon, 21 Aug 2023 at 15:31, Andrew Cagney  wrote:
>
> On Sun, 20 Aug 2023 at 06:23, Stefan Fritsch  wrote:
> >
> > Am 13.08.23 um 17:38 schrieb Tobias Heider:

> > You could try something like
> >
> > -device virtio-scsi-pci,id=scsi
> > -drive file=install73.iso,format=raw,id=cdinst,if=none
> > -device scsi-cd,drive=cdinst
>
> good idea (just need to translate it to virt-install speak)
>
> > That depends on your seabios having support for virtio cdroms. Not sure
> > if that is the default by now.
> >
> > Or maybe try a SATA cdrom, but you would need to figure out the qemu
> > options for that yourself.

This is what I did.  Using virt-install and replacing --cdrom with
something like:

+KVM_OPENBSD_VIRT_INSTALL_FLAGS = \
+ --disk path=$(KVM_OPENBSD_BASE_ISO),readonly=on,device=cdrom,target.bus=sata \
+ --install bootdev=cdrom

I can boot/install 7.4's ISO.

thanks,
Andrew



Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Kapetanakis Giannis

On 21/10/2023 20:39, Florian Obser wrote:

Which was 8 years ago. I don't understand why you see a change in 7.4.

Anyway, we decided to not clean up control sockets in any of our
privsep daemons because leaving them behind does not cause any issues.


I just noticed it today when I tried to use the socket in a script and 
noticed that it stayed there even after shutdown and though it was after 
7.4 but I was wrong about that.


Your commit made it that clear.

Agree it's not a big case if it stays there.

Would the unlink succeed if the socket was owned by _relayd?

G




Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Florian Obser
On 2023-10-21 14:49 +03, Kapetanakis Giannis  wrote:
> Rev 1.140 by florian@ seems to have changed that.
>
> Do not try to unlink the control socket in an unprivileged child
> process on shutdown.
> Found while working ontame(2)  .
> OK benno@
>

Which was 8 years ago. I don't understand why you see a change in 7.4.

Anyway, we decided to not clean up control sockets in any of our
privsep daemons because leaving them behind does not cause any issues.

> G
>
>
> On 21/10/2023 14:41, Kapetanakis Giannis wrote:
>> After 7.4 relayd does not unlink it's socket
>>
>> I've added the following but it's probably not enough. unveil?
>>
>> G
>>
>> Index: relayd.c
>> ===
>> RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
>> retrieving revision 1.191
>> diff -u -p -r1.191 relayd.c
>> --- relayd.c    25 Jun 2023 08:07:38 -    1.191
>> +++ relayd.c    21 Oct 2023 11:39:44 -
>> @@ -382,6 +382,8 @@ parent_shutdown(struct relayd *env)
>>  free(env->sc_ps);
>>  free(env);
>>
>> +    unlink(env->sc_ps->ps_csock.cs_name);
>> +
>>  log_info("parent terminating, pid %d", getpid());
>>
>>  exit(0);
>>
>

-- 
In my defence, I have been left unsupervised.



Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-21 Thread Martin Pieuchot
On 21/10/23(Sat) 14:28, Miod Vallat wrote:
> > Stuart, Miod, I wonder if this also help for the off-by-one issue you
> > are seeing.  It might not.
> 
> It makes the aforementioned issue disappear on the affected machine.

Thanks at lot for testing!

> > Comments, ok?
> 
> > diff --git sys/uvm/uvm_pdaemon.c sys/uvm/uvm_pdaemon.c
> > index 284211d226c..a26a776df67 100644
> > --- sys/uvm/uvm_pdaemon.c
> > +++ sys/uvm/uvm_pdaemon.c
> 
> > @@ -917,9 +914,7 @@ uvmpd_scan(struct uvm_pmalloc *pma, struct 
> > uvm_constraint_range *constraint)
> >  */
> > free = uvmexp.free - BUFPAGES_DEFICIT;
> > swap_shortage = 0;
> > -   if (free < uvmexp.freetarg &&
> > -   uvmexp.swpginuse == uvmexp.swpages &&
> > -   !uvm_swapisfull() &&
> > +   if (free < uvmexp.freetarg && uvm_swapisfilled() && !uvm_swapisfull() &&
> > pages_freed == 0) {
> > swap_shortage = uvmexp.freetarg - free;
> > }
> 
> It's unfortunate that you now invoke two uvm_swapisxxx() routines, which
> will both acquire a mutex. Maybe a third uvm_swapisxxx routine could be
> introduced to compute the swapisfilled && !swapisfull condition at once?

I'm mot interested in such micro optimization yet.  Not acquiring a mutex
twice is IMHO not worth making half-shiny. 

However is someone is interested in going down in this direction, I'd
suggest try placing `uvmexp.freetarg' under the same lock and deal with
all its occurrences.  This is a possible next step to reduce the scope of
the uvm_lock_pageq() which is currently responsible for most of the
contention on MP in UVM.



Re: bt(5), btrace(8): execute END probe and print maps after exit() statement

2023-10-21 Thread Martin Pieuchot
On 18/10/23(Wed) 12:56, Scott Cheloha wrote:
> Hi,
> 
> A bt(5) exit() statement causes the btrace(8) interpreter to exit(3)
> immediately.
> 
> A BPFtrace exit() statement is more nuanced: the END probe is executed
> and the contents of all maps are printed before the interpreter exits.
> 
> This patch adds a halting check after the execution of each bt(5)
> statement.  If a statement causes the program to halt, the halt
> bubbles up to the top-level rule evaluation loop and terminates
> execution.  rules_teardown() then runs, just as if the program had
> received SIGTERM.
> 
> Two edge-like cases:
> 
> 1. You can exit() from BEGIN.  rules_setup() returns non-zero if this
>happens so the main loop knows to halt immediately.
> 
> 2. You can exit() from END.  This is just an early-return: the END probe
>doesn't run again.
> 
> Thoughts?

Makes sense to ease the transition from bpftrace scripts.  Ok with me if
you make sure the regression tests still pass.  Some outputs might
depend on the actual behavior and would need to be updated.

> 
> $ btrace -e '
> BEGIN {
>   @[probe] = "reached";
>   exit();
>   @[probe] = "not reached";
> }
> END {
>   @[probe] = "reached";
>   exit();
>   @[probe] = "not reached";
> }'
> 
> Index: btrace.c
> ===
> RCS file: /cvs/src/usr.sbin/btrace/btrace.c,v
> retrieving revision 1.79
> diff -u -p -r1.79 btrace.c
> --- btrace.c  12 Oct 2023 15:16:44 -  1.79
> +++ btrace.c  18 Oct 2023 17:54:16 -
> @@ -71,10 +71,10 @@ struct dtioc_probe_info   *dtpi_get_by_val
>   * Main loop and rule evaluation.
>   */
>  void  rules_do(int);
> -void  rules_setup(int);
> -void  rules_apply(int, struct dt_evt *);
> +int   rules_setup(int);
> +int   rules_apply(int, struct dt_evt *);
>  void  rules_teardown(int);
> -void  rule_eval(struct bt_rule *, struct dt_evt *);
> +int   rule_eval(struct bt_rule *, struct dt_evt *);
>  void  rule_printmaps(struct bt_rule *);
>  
>  /*
> @@ -84,7 +84,7 @@ uint64_t builtin_nsecs(struct dt_evt *
>  const char   *builtin_kstack(struct dt_evt *);
>  const char   *builtin_arg(struct dt_evt *, enum bt_argtype);
>  struct bt_arg*fn_str(struct bt_arg *, struct dt_evt *, char 
> *);
> -void  stmt_eval(struct bt_stmt *, struct dt_evt *);
> +int   stmt_eval(struct bt_stmt *, struct dt_evt *);
>  void  stmt_bucketize(struct bt_stmt *, struct dt_evt *);
>  void  stmt_clear(struct bt_stmt *);
>  void  stmt_delete(struct bt_stmt *, struct dt_evt *);
> @@ -405,6 +405,7 @@ void
>  rules_do(int fd)
>  {
>   struct sigaction sa;
> + int halt = 0;
>  
>   memset(, 0, sizeof(sa));
>   sigemptyset(_mask);
> @@ -415,9 +416,9 @@ rules_do(int fd)
>   if (sigaction(SIGTERM, , NULL))
>   err(1, "sigaction");
>  
> - rules_setup(fd);
> + halt = rules_setup(fd);
>  
> - while (!quit_pending && g_nprobes > 0) {
> + while (!quit_pending && !halt && g_nprobes > 0) {
>   static struct dt_evt devtbuf[64];
>   ssize_t rlen;
>   size_t i;
> @@ -434,8 +435,11 @@ rules_do(int fd)
>   if ((rlen % sizeof(struct dt_evt)) != 0)
>   err(1, "incorrect read");
>  
> - for (i = 0; i < rlen / sizeof(struct dt_evt); i++)
> - rules_apply(fd, [i]);
> + for (i = 0; i < rlen / sizeof(struct dt_evt); i++) {
> + halt = rules_apply(fd, [i]);
> + if (halt)
> + break;
> + }
>   }
>  
>   rules_teardown(fd);
> @@ -484,7 +488,7 @@ rules_action_scan(struct bt_stmt *bs)
>   return evtflags;
>  }
>  
> -void
> +int
>  rules_setup(int fd)
>  {
>   struct dtioc_probe_info *dtpi;
> @@ -493,7 +497,7 @@ rules_setup(int fd)
>   struct bt_probe *bp;
>   struct bt_stmt *bs;
>   struct bt_arg *ba;
> - int dokstack = 0, on = 1;
> + int dokstack = 0, halt = 0, on = 1;
>   uint64_t evtflags;
>  
>   TAILQ_FOREACH(r, _rules, br_next) {
> @@ -553,7 +557,7 @@ rules_setup(int fd)
>   clock_gettime(CLOCK_REALTIME, _devt.dtev_tsp);
>  
>   if (rbegin)
> - rule_eval(rbegin, _devt);
> + halt = rule_eval(rbegin, _devt);
>  
>   /* Enable all probes */
>   TAILQ_FOREACH(r, _rules, br_next) {
> @@ -571,9 +575,14 @@ rules_setup(int fd)
>   if (ioctl(fd, DTIOCRECORD, ))
>   err(1, "DTIOCRECORD");
>   }
> +
> + return halt;
>  }
>  
> -void
> +/*
> + * Returns non-zero if the program should halt.
> + */
> +int
>  rules_apply(int fd, struct dt_evt *dtev)
>  {
>   struct bt_rule *r;
> @@ -586,9 +595,11 @@ 

Re: Prevent off-by-one accounting hang in out-of-swap situations

2023-10-21 Thread Miod Vallat
> Stuart, Miod, I wonder if this also help for the off-by-one issue you
> are seeing.  It might not.

It makes the aforementioned issue disappear on the affected machine.

> Comments, ok?

> diff --git sys/uvm/uvm_pdaemon.c sys/uvm/uvm_pdaemon.c
> index 284211d226c..a26a776df67 100644
> --- sys/uvm/uvm_pdaemon.c
> +++ sys/uvm/uvm_pdaemon.c

> @@ -917,9 +914,7 @@ uvmpd_scan(struct uvm_pmalloc *pma, struct 
> uvm_constraint_range *constraint)
>*/
>   free = uvmexp.free - BUFPAGES_DEFICIT;
>   swap_shortage = 0;
> - if (free < uvmexp.freetarg &&
> - uvmexp.swpginuse == uvmexp.swpages &&
> - !uvm_swapisfull() &&
> + if (free < uvmexp.freetarg && uvm_swapisfilled() && !uvm_swapisfull() &&
>   pages_freed == 0) {
>   swap_shortage = uvmexp.freetarg - free;
>   }

It's unfortunate that you now invoke two uvm_swapisxxx() routines, which
will both acquire a mutex. Maybe a third uvm_swapisxxx routine could be
introduced to compute the swapisfilled && !swapisfull condition at once?



Re: relayd does not delete control socket on shutdown

2023-10-21 Thread Kapetanakis Giannis

Rev 1.140 by florian@ seems to have changed that.

Do not try to unlink the control socket in an unprivileged child
process on shutdown.
Found while working ontame(2)  .
OK benno@

G


On 21/10/2023 14:41, Kapetanakis Giannis wrote:

After 7.4 relayd does not unlink it's socket

I've added the following but it's probably not enough. unveil?

G

Index: relayd.c
===
RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
retrieving revision 1.191
diff -u -p -r1.191 relayd.c
--- relayd.c    25 Jun 2023 08:07:38 -    1.191
+++ relayd.c    21 Oct 2023 11:39:44 -
@@ -382,6 +382,8 @@ parent_shutdown(struct relayd *env)
 free(env->sc_ps);
 free(env);

+    unlink(env->sc_ps->ps_csock.cs_name);
+
 log_info("parent terminating, pid %d", getpid());

 exit(0);



relayd does not delete control socket on shutdown

2023-10-21 Thread Kapetanakis Giannis

After 7.4 relayd does not unlink it's socket

I've added the following but it's probably not enough. unveil?

G

Index: relayd.c
===
RCS file: /cvs/src/usr.sbin/relayd/relayd.c,v
retrieving revision 1.191
diff -u -p -r1.191 relayd.c
--- relayd.c    25 Jun 2023 08:07:38 -    1.191
+++ relayd.c    21 Oct 2023 11:39:44 -
@@ -382,6 +382,8 @@ parent_shutdown(struct relayd *env)
 free(env->sc_ps);
 free(env);

+    unlink(env->sc_ps->ps_csock.cs_name);
+
 log_info("parent terminating, pid %d", getpid());

 exit(0);



  1   2   3   4   5   6   7   8   9   10   >