Re: Small change in sysupgrade for custom release and test

2023-05-16 Thread Theo de Raadt
No.

First of all, because there is no justification.

Secondly, because it is not documented.

But thirdly, because we keep shit simple so that people don't build
their own stuff on top of our infrastructure, so that if we feel the
need to change/break our own infrastructure we don't need to give a shit
about anyone doing weird stuff on their own.

So the short answer would be no, because what you propose does not
serve the greater userbase in any way.  It only serves you, apparently.


Sven F.  wrote:

> Bienvenue,
> 
> --- /usr/sbin/sysupgrade.oldTue May 16 18:53:13 2023
> +++ /usr/sbin/sysupgradeTue May 16 19:04:46 2023
> @@ -143,6 +143,7 @@
>  case ${_LINE} in
>  *\ ${_KEY})SIGNIFY_KEY=/etc/signify/${_KEY} ;;
>  *\ ${_NEXTKEY})SIGNIFY_KEY=/etc/signify/${_NEXTKEY} ;;
> +*\ *.pub)  SIGNIFY_KEY=/etc/signify/${_LINE##* } && echo Using custom
> key $SIGNIFY_KEY ;;
>  *) err "invalid signing key" ;;
>  esac
> 
> Read the signing key in the file so you can use a custom key when
> testing release,
> 
> Have a good one.
> 



Small change in sysupgrade for custom release and test

2023-05-16 Thread Sven F.
Bienvenue,

--- /usr/sbin/sysupgrade.oldTue May 16 18:53:13 2023
+++ /usr/sbin/sysupgradeTue May 16 19:04:46 2023
@@ -143,6 +143,7 @@
 case ${_LINE} in
 *\ ${_KEY})SIGNIFY_KEY=/etc/signify/${_KEY} ;;
 *\ ${_NEXTKEY})SIGNIFY_KEY=/etc/signify/${_NEXTKEY} ;;
+*\ *.pub)  SIGNIFY_KEY=/etc/signify/${_LINE##* } && echo Using custom
key $SIGNIFY_KEY ;;
 *) err "invalid signing key" ;;
 esac

Read the signing key in the file so you can use a custom key when
testing release,

Have a good one.



Re: installer: amd64 EFI: default to GPT

2023-05-16 Thread Mark Kettenis
> Date: Tue, 16 May 2023 20:25:36 +
> From: Klemens Nanni 
> 
> On Tue, May 16, 2023 at 10:07:20AM -0700, Chris Cappuccio wrote:
> > I don't quite understand the case this patch solves, because my installs to
> > fresh media always get EFI/GPT. It doesn't default to MBR. However, if
> > there is a case where it tries to use MBR, that isn't going to work so well.
> 
> If efi(4) attaches, the installer defaults to EFI/GPT:
>   Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt]
> 
> That is what you see and why you chose "gpt", right?
> It will lead to a disk containing an EFI System Partition.
> 
> My diff is only concerns chosing "edit" at this question, for example
> when you want to leave free space on the disk.
> 
> "edit" drops into manual mode aka. 'fdisk -e', after instructing users
> how to create a bootable disk.  If the chosen disk has a valid GPT:
> 
>   Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt] edit
> 
>   You will now create two GPT partitions. The first must have an id
>   of 'EF' and be large enough to contain the OpenBSD boot programs,
>   at least 532480 blocks. The second must have an id of 'A6' and will
>   contain your OpenBSD data. Neither may overlap other partitions.
>   Inside the fdisk command, the 'manual' command describes the fdisk
>   commands in detail.
> 
>   Enter 'help' for information
>   sd0: 1>
> 
> 
> In all other cases, incl. a zeroed disk on an EFI system:
> 
>   Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt] edit
> 
>   You will now create a single MBR partition to contain your OpenBSD 
> data. This
>   partition must have an id of 'A6'; must *NOT* overlap other partitions; 
> and
>   must be marked as the only active partition.  Inside the fdisk command, 
> the
>   'manual' command describes all the fdisk commands in detail.
> 
>   Enter 'help' for information
>   sd0: 1>
> 
> 
> So it tells users to do MBR, even when they boot using EFI, possibly on
> a new machine that only support UEFI and not BIOS/compat mode/whatever.
> 
> Users follow instructions, one scenario could be entering 'help',
> learning about and using 'reinit' to initialise the partition table
> (it will do MBR), 'print' to confirm that it indeed create a single
> partition of type A6, then 'write' and 'quit' to continue.
> 
> The installer happily finishes a default install, sd0i may be be some FFS,
> but not ax MSDOS EFI System Partition.
> 
> > MBR boot is totally unsupported on modern Intel. Starting with 10th gen
> > Intel processors, Intel only supplies graphics code for EFI mode.
> > 
> > With some 10th gen chipsets, like W480, the BIOS still gives you can option 
> > to
> > boot MBR with zero graphics. For the BIOS on 11th gen chipsets like the 
> > W580,
> > there is no MBR boot capability at all, for whatever reason.
> > 
> 
> I noticed this when installing on a 12th gen Intel T14 gen 3, where I
> don't see any BIOS mode whatsoever, hence my surprise being instructed
> to go with MBR instead of GPT when I chose "edit" to create an OpenBSD
> parition that does not span the whole disk (nothing else done manually).
> 
> Thus I propose switching the installer's decision about which instructions
> to print from looking at on-disk data to checking whether efi(4) attached;
> the very same check that is used to default "Use (W)hole ...?" to "gpt"
> instead of "whole".
> 
> Does that make any more sense?

It doesn't really.  You're just swapping one bit of incomplete advice
with another bit.

> Am I missing something gravely here?

Well, the problem is that we're mixing up two issues here:

* Booting in EFI mode vs. booting in legacy mode.

* What type of partition table to use: GPT vs. MBR.

You can boot in EFI mode with only an MBR partition table.  That is
what our installer does!  And you can actually boot in legacy mode
with a GPT since there still is an MBR too!

To boot in EFI mode you have to create an EFI System Partition.  Last
time I looked the fdisk(8) interface allows you to create one by
choosing the 'EF' partition type regardless of whether you start with
an MBR partition table or a GPT.  So the part of the advice that
suggests creating an EFI System Partition could be phrased in a
partition type agnostic fashion.

The current policy is that if there already is a valid partition table
on the disk, we go with that.  This is sensible since it means the
user gets an opportunity to review what's already on the disk and
tweak it appropriately.

But you've created a situation where there is no valid partition table
on the disk.  In that case you actually have to put a new partition
table on the disk, and I agree that on a modern EFI machine it makes
most sense for that to be a GPT.  Maybe fdisk(8) should do something
more sensible by default in this case?

To really improve the user experience the installer should only
suggest creating an EFI System Partition if there isn't one already.

Re: installer: amd64 EFI: default to GPT

2023-05-16 Thread Klemens Nanni
On Tue, May 16, 2023 at 10:07:20AM -0700, Chris Cappuccio wrote:
> I don't quite understand the case this patch solves, because my installs to
> fresh media always get EFI/GPT. It doesn't default to MBR. However, if
> there is a case where it tries to use MBR, that isn't going to work so well.

If efi(4) attaches, the installer defaults to EFI/GPT:
Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt]

That is what you see and why you chose "gpt", right?
It will lead to a disk containing an EFI System Partition.

My diff is only concerns chosing "edit" at this question, for example
when you want to leave free space on the disk.

"edit" drops into manual mode aka. 'fdisk -e', after instructing users
how to create a bootable disk.  If the chosen disk has a valid GPT:

Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt] edit

You will now create two GPT partitions. The first must have an id
of 'EF' and be large enough to contain the OpenBSD boot programs,
at least 532480 blocks. The second must have an id of 'A6' and will
contain your OpenBSD data. Neither may overlap other partitions.
Inside the fdisk command, the 'manual' command describes the fdisk
commands in detail.

Enter 'help' for information
sd0: 1>


In all other cases, incl. a zeroed disk on an EFI system:

Use (W)hole disk MBR, whole disk (G)PT, or (E)dit? [gpt] edit

You will now create a single MBR partition to contain your OpenBSD 
data. This
partition must have an id of 'A6'; must *NOT* overlap other partitions; 
and
must be marked as the only active partition.  Inside the fdisk command, 
the
'manual' command describes all the fdisk commands in detail.

Enter 'help' for information
sd0: 1>


So it tells users to do MBR, even when they boot using EFI, possibly on
a new machine that only support UEFI and not BIOS/compat mode/whatever.

Users follow instructions, one scenario could be entering 'help',
learning about and using 'reinit' to initialise the partition table
(it will do MBR), 'print' to confirm that it indeed create a single
partition of type A6, then 'write' and 'quit' to continue.

The installer happily finishes a default install, sd0i may be be some FFS,
but not ax MSDOS EFI System Partition.

> MBR boot is totally unsupported on modern Intel. Starting with 10th gen
> Intel processors, Intel only supplies graphics code for EFI mode.
> 
> With some 10th gen chipsets, like W480, the BIOS still gives you can option to
> boot MBR with zero graphics. For the BIOS on 11th gen chipsets like the W580,
> there is no MBR boot capability at all, for whatever reason.
> 

I noticed this when installing on a 12th gen Intel T14 gen 3, where I
don't see any BIOS mode whatsoever, hence my surprise being instructed
to go with MBR instead of GPT when I chose "edit" to create an OpenBSD
parition that does not span the whole disk (nothing else done manually).

Thus I propose switching the installer's decision about which instructions
to print from looking at on-disk data to checking whether efi(4) attached;
the very same check that is used to default "Use (W)hole ...?" to "gpt"
instead of "whole".

Does that make any more sense?

Am I missing something gravely here?


Index: distrib/amd64/common/install.md
===
RCS file: /cvs/src/distrib/amd64/common/install.md,v
retrieving revision 1.60
diff -u -p -r1.60 install.md
--- distrib/amd64/common/install.md 26 Apr 2023 22:45:32 -  1.60
+++ distrib/amd64/common/install.md 16 May 2023 20:24:25 -
@@ -86,7 +86,7 @@ md_prep_fdisk() {
echo "done."
return ;;
[eE]*)
-   if disk_has $_disk gpt; then
+   if [[ $MDEFI == y ]]; then
# Manually configure the GPT.
cat <<__EOT
 



Re: missing malloc failure check at /src/lib/libcrypto/asn1/bio_ndef.c

2023-05-16 Thread Илья Шипицин
вт, 16 мая 2023 г. в 21:18, Theo Buehler :

> > I tried to find "missing malloc null check" using the following
> coccinelle
> > script (easy to run from within CI)
>
> Cool, that's nice. We tend to be strict with error checking in new code,
> but having such a sanity check certainly won't hurt. If we only need to
> fix half a dozen functions, it might actually be worth it. Thanks


> > diff -u -p a/apps/openssl/apps.c b/apps/openssl/apps.c
> > --- a/apps/openssl/apps.c
> > +++ b/apps/openssl/apps.c
> > @@ -379,6 +379,8 @@ password_callback(char *buf, int bufsiz,
> >  PW_MIN_LENGTH, bufsiz - 1);
> >  if (ok >= 0 && verify) {
> >  buff = malloc(bufsiz);
> > +if (buff == NULL)
> > +return;
> >  ok = UI_add_verify_string(ui, prompt, ui_flags, buff,
> >  PW_MIN_LENGTH, bufsiz - 1, buf);
>
> UI_add_verify_string() handles a NULL buff correctly, although it is
> far from obvious. We will end up in general_allocate_prompt() with
> type == UIT_VERIFY and result_buf == NULL and will return -1 eventually.
>
> If we wanted to NULL check the allocation here, we'd need to rework this
> entire function (which we should probably do anyway).
>
> > diff -u -p a/crypto/asn1/bio_ndef.c b/crypto/asn1/bio_ndef.c
> > --- a/crypto/asn1/bio_ndef.c
> > +++ b/crypto/asn1/bio_ndef.c
>
> Already discussed.
>
> > diff -u -p a/tests/ecdhtest.c b/tests/ecdhtest.c
> > --- a/tests/ecdhtest.c
> > +++ b/tests/ecdhtest.c
>
> Thanks, I committed an appropriate version of this.
>
> > diff -u -p a/ssl/d1_pkt.c b/ssl/d1_pkt.c
> > --- a/ssl/d1_pkt.c
> > +++ b/ssl/d1_pkt.c
> > @@ -214,6 +214,8 @@ dtls1_buffer_record(SSL *s, record_pqueu
> >  return 0;
> >
> >  rdata = malloc(sizeof(DTLS1_RECORD_DATA_INTERNAL));
> > +if (rdata == NULL)
> > +return;
> >  item = pitem_new(priority, rdata);
> >  if (rdata == NULL || item == NULL)
> >  goto init_err;
>
> The checks actually happen in 'if (rdata == NULL || item == NULL)' above.
> pitem_new() does not dereference rdata, so the error checks are fine as
> they are.
>

"rdata" is passed to pitem_new, it is hard to check if it is dereferenced
there or not.


>
> We could rewrite this to do the following, with a corresponding
> NULL-initialization of item at the top of the function. I would actually
> prefer this.
>
> if ((rdata = malloc(sizeof(*rdata))) == NULL)
> goto init_err;
> if ((item = pitem_new(priority, rdata)) == NULL)
> goto init_err;
>

it would be nice


>
> > @@ -260,6 +262,8 @@ dtls1_buffer_rcontent(SSL *s, rcontent_p
> >  return 0;
> >
> >  rdata = malloc(sizeof(DTLS1_RCONTENT_DATA_INTERNAL));
> > +if (rdata == NULL)
> > +return;
> >  item = pitem_new(priority, rdata);
> >  if (rdata == NULL || item == NULL)
> >  goto init_err;
> >
>
> Exactly the same here.
>


Re: Unlock ip_sysctl()

2023-05-16 Thread Alexander Bluhm
On Tue, May 16, 2023 at 10:40:12PM +0300, Vitaliy Makkoveev wrote:
> We have "error == 0" in assertion, so I used this idiom instead of
> "!error". This is not the fast path, so dropping "maxlen !=
> mq->mq_maxlen" doesn't provide any performance impact.
> 
> ok?

OK bluhm@

> Index: sys/kern/uipc_mbuf.c
> ===
> RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v
> retrieving revision 1.285
> diff -u -p -r1.285 uipc_mbuf.c
> --- sys/kern/uipc_mbuf.c  5 May 2023 01:19:51 -   1.285
> +++ sys/kern/uipc_mbuf.c  16 May 2023 19:34:28 -
> @@ -1801,7 +1801,7 @@ sysctl_mq(int *name, u_int namelen, void
>   case IFQCTL_MAXLEN:
>   maxlen = mq->mq_maxlen;
>   error = sysctl_int(oldp, oldlenp, newp, newlen, );
> - if (!error && maxlen != mq->mq_maxlen)
> + if (error == 0)
>   mq_set_maxlen(mq, maxlen);
>   return (error);
>   case IFQCTL_DROPS:



Re: cwm: add fvwm and tvm as default wm entries

2023-05-16 Thread Marc Espie
As another rant: we old farts know which window manager we want to use.
But for newer users, there might be a chance to find something cool before
they get totally fossilized.

And secondary rant: X is a failure, in that there is a *choice* of window
managers, but so many of them haven't been really upgraded and thus are
"useless" because they don't support modern hints.

It looks like a very interesting line to die on.



Re: Unlock ip_sysctl()

2023-05-16 Thread Vitaliy Makkoveev
On Tue, May 16, 2023 at 08:26:37PM +0300, Vitaliy Makkoveev wrote:
> > On 16 May 2023, at 18:35, Alexander Bluhm  wrote:
> > 
> > I saw one issue in sysctl_niq().  Another CPU could write mq_maxlen
> > and our logic is inconsistent.  Below is a fix with read once.  Each
> > CPU detects its own change, last write wins.  Or should we protect
> > everything with mq_mtx?  Then sysctl 123 -> 345 would have the
> > correct old value on both CPUs.
> 
> I don’t assume this case as issue, since if we have two concurrent
> sysctl calls modifying the same mib we can’t predict which value will
> be set last. To me, the "maxlen != mq->mq_maxlen” check could be
> avoided and the "!error” or "error == 0” check is pretty enough:
> 
>   error = sysctl_int(oldp, oldlenp, newp, newlen, );
>   if (!error)
>   mq_set_maxlen(mq, maxlen);
> 
> If you want to keep the check, I prefer to protect everything with
> mq_mtx, and push the check within mq_set_maxlen(). At least this is
> clean and strictly predictable comparing with
> oldmax = newmax = READ_ONCE().
>

We have "error == 0" in assertion, so I used this idiom instead of
"!error". This is not the fast path, so dropping "maxlen !=
mq->mq_maxlen" doesn't provide any performance impact.

ok?

Index: sys/kern/uipc_mbuf.c
===
RCS file: /cvs/src/sys/kern/uipc_mbuf.c,v
retrieving revision 1.285
diff -u -p -r1.285 uipc_mbuf.c
--- sys/kern/uipc_mbuf.c5 May 2023 01:19:51 -   1.285
+++ sys/kern/uipc_mbuf.c16 May 2023 19:34:28 -
@@ -1801,7 +1801,7 @@ sysctl_mq(int *name, u_int namelen, void
case IFQCTL_MAXLEN:
maxlen = mq->mq_maxlen;
error = sysctl_int(oldp, oldlenp, newp, newlen, );
-   if (!error && maxlen != mq->mq_maxlen)
+   if (error == 0)
mq_set_maxlen(mq, maxlen);
return (error);
case IFQCTL_DROPS:



Re: cwm: add fvwm and tvm as default wm entries

2023-05-16 Thread Marc Espie
On Tue, May 16, 2023 at 02:33:34AM +, Klemens Nanni wrote:
> On Mon, May 15, 2023 at 09:42:47AM -0400, Bryan Steele wrote:
> > On Mon, May 15, 2023 at 09:17:00AM -0400, Okan Demirmen wrote:
> > > On Mon 2023.05.15 at 10:41 +0200, Matthieu Herrb wrote:
> > > > On Mon, May 15, 2023 at 06:26:41AM +, Klemens Nanni wrote:
> > > > > Both fvwm(1) and twm(1) have a restart menu that contains other window
> > > > > managers by default, which is useful if you want to switch around
> > > > > without restarting X and/or custom window manager config.
> > > > > 
> > > > > cwm(1) only offers to restart into itself by deafult.
> > > > > Add the other two we ship by default so users can round trip between
> > > > > them.
> > > > > 
> > > > > Feedback? OK?
> > > > 
> > > > Last year I mentionned that I think we should retire twm. It's really
> > > > too old and missing support for the modern window managers hints.
> > > > 
> > > > People still using it should switch to cwm or maybe ctwm from ports
> > > > (to keep the same configurarion system), or someone should step up to
> > > > maintain it and enhance it with exwmh support. (but this is somehow
> > > > just wasting time imho).
> 
> I don't know anything about twm, so I'll leave that to others.
> 
> > > > 
> > > > Otherwise ok to add this and fix the other WM menus for other window
> > > > managers (those parts of the configs are already local changes in
> > > > Xenocara)
> > > 
> > > I might argue the opposite, to remove cwm from fvwm and twm restart 
> > > menus, if
> > > this inconsistency is a real concern. The entries in fvwm/twm are in the
> > > (shipped) example config files, where-as below it is, well, there for 
> > > good with
> > > no user choice. Heck, how often to do people even use this restart wm to
> > > another WM outside of playing around? Most window managers handle restarts
> > > differently, regardless of what ICCCM/EWMH says) and even then, crossing 
> > > window
> > > managers like this introduces inconsistencies. It's fine for playing 
> > > around I
> > > suppose, but is it really a demanded "workflow"?
> 
> It is convenient for testing because you keep all your windows and don't
> have to login out and in again;  that's about it for me.

I have to side with kn on this.

Actually, it would be cool if ywe could register more wm and switch on runtimes
(a bit like @shell ?)



Re: missing malloc failure check at /src/lib/libcrypto/asn1/bio_ndef.c

2023-05-16 Thread Theo Buehler
> I tried to find "missing malloc null check" using the following coccinelle
> script (easy to run from within CI)

Cool, that's nice. We tend to be strict with error checking in new code,
but having such a sanity check certainly won't hurt. If we only need to
fix half a dozen functions, it might actually be worth it. Thanks

> diff -u -p a/apps/openssl/apps.c b/apps/openssl/apps.c
> --- a/apps/openssl/apps.c
> +++ b/apps/openssl/apps.c
> @@ -379,6 +379,8 @@ password_callback(char *buf, int bufsiz,
>  PW_MIN_LENGTH, bufsiz - 1);
>  if (ok >= 0 && verify) {
>  buff = malloc(bufsiz);
> +if (buff == NULL)
> +return;
>  ok = UI_add_verify_string(ui, prompt, ui_flags, buff,
>  PW_MIN_LENGTH, bufsiz - 1, buf);

UI_add_verify_string() handles a NULL buff correctly, although it is
far from obvious. We will end up in general_allocate_prompt() with
type == UIT_VERIFY and result_buf == NULL and will return -1 eventually.

If we wanted to NULL check the allocation here, we'd need to rework this
entire function (which we should probably do anyway).

> diff -u -p a/crypto/asn1/bio_ndef.c b/crypto/asn1/bio_ndef.c
> --- a/crypto/asn1/bio_ndef.c
> +++ b/crypto/asn1/bio_ndef.c

Already discussed.

> diff -u -p a/tests/ecdhtest.c b/tests/ecdhtest.c
> --- a/tests/ecdhtest.c
> +++ b/tests/ecdhtest.c

Thanks, I committed an appropriate version of this.

> diff -u -p a/ssl/d1_pkt.c b/ssl/d1_pkt.c
> --- a/ssl/d1_pkt.c
> +++ b/ssl/d1_pkt.c
> @@ -214,6 +214,8 @@ dtls1_buffer_record(SSL *s, record_pqueu
>  return 0;
> 
>  rdata = malloc(sizeof(DTLS1_RECORD_DATA_INTERNAL));
> +if (rdata == NULL)
> +return;
>  item = pitem_new(priority, rdata);
>  if (rdata == NULL || item == NULL)
>  goto init_err;

The checks actually happen in 'if (rdata == NULL || item == NULL)' above.
pitem_new() does not dereference rdata, so the error checks are fine as
they are.

We could rewrite this to do the following, with a corresponding
NULL-initialization of item at the top of the function. I would actually
prefer this.

if ((rdata = malloc(sizeof(*rdata))) == NULL)
goto init_err;
if ((item = pitem_new(priority, rdata)) == NULL)
goto init_err;

> @@ -260,6 +262,8 @@ dtls1_buffer_rcontent(SSL *s, rcontent_p
>  return 0;
> 
>  rdata = malloc(sizeof(DTLS1_RCONTENT_DATA_INTERNAL));
> +if (rdata == NULL)
> +return;
>  item = pitem_new(priority, rdata);
>  if (rdata == NULL || item == NULL)
>  goto init_err;
> 

Exactly the same here.



Add LRO counter in ix(4)

2023-05-16 Thread Jan Klemkow
Hi,

This diff introduces new counters for LRO packets, we get from the
network interface.  It shows, how many packets the network interface has
coalesced into LRO packets.

In followup diff, this packet counter will also be used to set the
ph_mss variable to valid value.  So, the stack is able to forward or
redirect this kind of packets.

ok?

bye,
Jan

Index: usr.bin/netstat/inet.c
===
RCS file: /cvs/src/usr.bin/netstat/inet.c,v
retrieving revision 1.175
diff -u -p -r1.175 inet.c
--- usr.bin/netstat/inet.c  10 May 2023 12:07:17 -  1.175
+++ usr.bin/netstat/inet.c  16 May 2023 17:55:20 -
@@ -412,6 +412,10 @@ tcp_stats(char *name)
p(tcps_outhwtso, "\t\t%u output TSO packet%s hardware processed\n");
p(tcps_outpkttso, "\t\t%u output TSO packet%s generated\n");
p(tcps_outbadtso, "\t\t%u output TSO packet%s dropped\n");
+   p(tcps_inhwlro, "\t\t%u input LRO generated packet%s from hardware\n");
+   p(tcps_inpktlro, "\t\t%u input LRO coalesced packet%s from hardware\n");
+   p(tcps_inbadlro, "\t\t%u input bad LRO packet%s from hardware\n");
+
p(tcps_rcvtotal, "\t%u packet%s received\n");
p2(tcps_rcvackpack, tcps_rcvackbyte, "\t\t%u ack%s (for %llu 
byte%s)\n");
p(tcps_rcvdupack, "\t\t%u duplicate ack%s\n");
Index: sys/dev/pci/if_ix.c
===
RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.194
diff -u -p -r1.194 if_ix.c
--- sys/dev/pci/if_ix.c 16 May 2023 14:32:54 -  1.194
+++ sys/dev/pci/if_ix.c 16 May 2023 18:49:33 -
@@ -3175,12 +3175,23 @@ ixgbe_rxeof(struct rx_ring *rxr)
sendmp = rxbuf->fmp;
rxbuf->buf = rxbuf->fmp = NULL;
 
-   if (sendmp != NULL) /* secondary frag */
+   if (sendmp != NULL) { /* secondary frag */
sendmp->m_pkthdr.len += mp->m_len;
-   else {
+
+   /*
+* This function iterates over interleaved descriptors.
+* Thus, we reuse ph_mss as global segment counter per
+* TCP connection, insteat of introducing a new variable
+* in m_pkthdr.
+*/
+   if (rsccnt)
+   sendmp->m_pkthdr.ph_mss += rsccnt - 1;
+   } else {
/* first desc of a non-ps chain */
sendmp = mp;
sendmp->m_pkthdr.len = mp->m_len;
+   if (rsccnt)
+   sendmp->m_pkthdr.ph_mss = rsccnt - 1;
 #if NVLAN > 0
if (sc->vlan_stripping && staterr & IXGBE_RXD_STAT_VP) {
sendmp->m_pkthdr.ether_vtag = vtag;
@@ -3200,6 +3211,21 @@ ixgbe_rxeof(struct rx_ring *rxr)
if (hashtype != IXGBE_RXDADV_RSSTYPE_NONE) {
sendmp->m_pkthdr.ph_flowid = hash;
SET(sendmp->m_pkthdr.csum_flags, M_FLOWID);
+   }
+
+   if (sendmp->m_pkthdr.ph_mss == 1)
+   sendmp->m_pkthdr.ph_mss = 0;
+
+   if (sendmp->m_pkthdr.ph_mss > 0) {
+   struct ether_extracted ext;
+   uint16_t pkts = sendmp->m_pkthdr.ph_mss;
+
+   ether_extract_headers(sendmp, );
+   if (ext.tcp)
+   tcpstat_inc(tcps_inhwlro);
+   else
+   tcpstat_inc(tcps_inbadlro);
+   tcpstat_add(tcps_inpktlro, pkts);
}
 
ml_enqueue(, sendmp);
Index: sys/dev/pci/ixgbe.h
===
RCS file: /cvs/src/sys/dev/pci/ixgbe.h,v
retrieving revision 1.33
diff -u -p -r1.33 ixgbe.h
--- sys/dev/pci/ixgbe.h 8 Feb 2022 03:38:00 -   1.33
+++ sys/dev/pci/ixgbe.h 16 May 2023 17:55:20 -
@@ -60,12 +60,18 @@
 
 #include 
 #include 
+#include 
 #include 
 
+struct tdb;
+
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
 
 #if NBPFILTER > 0
 #include 
Index: sys/netinet/tcp_usrreq.c
===
RCS file: /cvs/src/sys/netinet/tcp_usrreq.c,v
retrieving revision 1.218
diff -u -p -r1.218 tcp_usrreq.c
--- sys/netinet/tcp_usrreq.c10 May 2023 12:07:16 -  1.218
+++ sys/netinet/tcp_usrreq.c16 May 2023 17:55:20 -
@@ -1340,6 +1340,9 @@ tcp_sysctl_tcpstat(void *oldp, size_t *o
ASSIGN(tcps_outhwtso);
ASSIGN(tcps_outpkttso);
ASSIGN(tcps_outbadtso);
+   ASSIGN(tcps_inhwlro);
+   ASSIGN(tcps_inpktlro);
+   

Re: useradd: use "cp" instead of "pax" to copy dot files

2023-05-16 Thread Omar Polo
On 2023/05/16 11:39:17 -0600, Todd C. Miller  wrote:
> We can just use "cp -a skeldir/. homedir" to copy the skeleton dot
> files to the new user's homedir.  There's no good reason to use pax
> when cp will do and this will simplify a future commit of mine.

hard links are handled differently, but I doubt it's an issue here.
The -v output also changes slightly: before was just the list of
copied files, now a series of "source -> destination" lines.  don't
think anyone is relying on this however, so don't matter.

fwiw OK for me and +1 for the style(9) nit :)

(i'm curious about the future diff)

>  - todd
> 
> Index: usr.sbin/user/user.c
> ===
> RCS file: /cvs/src/usr.sbin/user/user.c,v
> retrieving revision 1.129
> diff -u -p -u -r1.129 user.c
> --- usr.sbin/user/user.c  15 May 2023 17:00:24 -  1.129
> +++ usr.sbin/user/user.c  16 May 2023 17:30:18 -
> @@ -171,7 +171,7 @@ enum {
>  #define MKDIR"/bin/mkdir"
>  #define MV   "/bin/mv"
>  #define NOLOGIN  "/sbin/nologin"
> -#define PAX  "/bin/pax"
> +#define CP   "/bin/cp"
>  #define RM   "/bin/rm"
>  
>  #define UNSET_INACTIVE   "Null (unset)"
> @@ -311,8 +311,8 @@ copydotfiles(char *skeldir, char *dir)
>   if (n == 0) {
>   warnx("No \"dot\" initialisation files found");
>   } else {
> - (void) asystem("cd %s && %s -rw -pe %s . %s",
> - skeldir, PAX, (verbose) ? "-v" : "", dir);
> + (void) asystem("%s -a %s %s/. %s",
> + CP, (verbose) ? "-v" : "", skeldir, dir);
>   }
>   return n;
>  }



Re: ix hardware tso

2023-05-16 Thread Todd C . Miller
On Tue, 16 May 2023 19:26:07 +0200, Alexander Bluhm wrote:

> On Tue, May 16, 2023 at 11:15:31AM -0600, Todd C. Miller wrote:
> > Would it be possible to move the forward declaration of struct tdb
> > to netinet/tcp_var.h so it is not required in every driver?
>
> sure

Thanks, that looks better to me.

 - todd



useradd: use "cp" instead of "pax" to copy dot files

2023-05-16 Thread Todd C . Miller
We can just use "cp -a skeldir/. homedir" to copy the skeleton dot
files to the new user's homedir.  There's no good reason to use pax
when cp will do and this will simplify a future commit of mine.

 - todd

Index: usr.sbin/user/user.c
===
RCS file: /cvs/src/usr.sbin/user/user.c,v
retrieving revision 1.129
diff -u -p -u -r1.129 user.c
--- usr.sbin/user/user.c15 May 2023 17:00:24 -  1.129
+++ usr.sbin/user/user.c16 May 2023 17:30:18 -
@@ -171,7 +171,7 @@ enum {
 #define MKDIR  "/bin/mkdir"
 #define MV "/bin/mv"
 #define NOLOGIN"/sbin/nologin"
-#define PAX"/bin/pax"
+#define CP "/bin/cp"
 #define RM "/bin/rm"
 
 #define UNSET_INACTIVE "Null (unset)"
@@ -311,8 +311,8 @@ copydotfiles(char *skeldir, char *dir)
if (n == 0) {
warnx("No \"dot\" initialisation files found");
} else {
-   (void) asystem("cd %s && %s -rw -pe %s . %s",
-   skeldir, PAX, (verbose) ? "-v" : "", dir);
+   (void) asystem("%s -a %s %s/. %s",
+   CP, (verbose) ? "-v" : "", skeldir, dir);
}
return n;
 }



Re: Unlock ip_sysctl()

2023-05-16 Thread Vitaliy Makkoveev
> On 16 May 2023, at 18:35, Alexander Bluhm  wrote:
> 
> I saw one issue in sysctl_niq().  Another CPU could write mq_maxlen
> and our logic is inconsistent.  Below is a fix with read once.  Each
> CPU detects its own change, last write wins.  Or should we protect
> everything with mq_mtx?  Then sysctl 123 -> 345 would have the
> correct old value on both CPUs.

I don’t assume this case as issue, since if we have two concurrent
sysctl calls modifying the same mib we can’t predict which value will
be set last. To me, the "maxlen != mq->mq_maxlen” check could be
avoided and the "!error” or "error == 0” check is pretty enough:

error = sysctl_int(oldp, oldlenp, newp, newlen, );
if (!error)
mq_set_maxlen(mq, maxlen);

If you want to keep the check, I prefer to protect everything with
mq_mtx, and push the check within mq_set_maxlen(). At least this is
clean and strictly predictable comparing with
oldmax = newmax = READ_ONCE().

> 
> bluhm
> 
> Index: kern/uipc_mbuf.c
> ===
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_mbuf.c,v
> retrieving revision 1.285
> diff -u -p -r1.285 uipc_mbuf.c
> --- kern/uipc_mbuf.c  5 May 2023 01:19:51 -   1.285
> +++ kern/uipc_mbuf.c  16 May 2023 15:05:52 -
> @@ -1788,7 +1788,7 @@ int
> sysctl_mq(int *name, u_int namelen, void *oldp, size_t *oldlenp,
> void *newp, size_t newlen, struct mbuf_queue *mq)
> {
> - unsigned int maxlen;
> + unsigned int oldmax, newmax;
>   int error;
> 
>   /* All sysctl names at this level are terminal. */
> @@ -1799,10 +1799,10 @@ sysctl_mq(int *name, u_int namelen, void
>   case IFQCTL_LEN:
>   return (sysctl_rdint(oldp, oldlenp, newp, mq_len(mq)));
>   case IFQCTL_MAXLEN:
> - maxlen = mq->mq_maxlen;
> - error = sysctl_int(oldp, oldlenp, newp, newlen, );
> - if (!error && maxlen != mq->mq_maxlen)
> - mq_set_maxlen(mq, maxlen);
> + oldmax = newmax = READ_ONCE(mq->mq_maxlen);
> + error = sysctl_int(oldp, oldlenp, newp, newlen, );
> + if (!error && oldmax != newmax)
> + mq_set_maxlen(mq, newmax);
>   return (error);
>   case IFQCTL_DROPS:
>   return (sysctl_rdint(oldp, oldlenp, newp, mq_drops(mq)));



Re: ix hardware tso

2023-05-16 Thread Alexander Bluhm
On Tue, May 16, 2023 at 11:15:31AM -0600, Todd C. Miller wrote:
> Would it be possible to move the forward declaration of struct tdb
> to netinet/tcp_var.h so it is not required in every driver?

sure

Index: dev/pci/if_ix.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.194
diff -u -p -r1.194 if_ix.c
--- dev/pci/if_ix.c 16 May 2023 14:32:54 -  1.194
+++ dev/pci/if_ix.c 16 May 2023 16:07:53 -
@@ -1924,6 +1924,7 @@ ixgbe_setup_interface(struct ix_softc *s
ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
ifp->if_capabilities |= IFCAP_CSUM_IPv4;
 
+   ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
if (sc->hw.mac.type != ixgbe_mac_82598EB)
ifp->if_capabilities |= IFCAP_LRO;
 
@@ -2344,6 +2345,7 @@ ixgbe_initialize_transmit_units(struct i
int  i;
uint64_t tdba;
uint32_t txctrl;
+   uint32_t hlreg;
 
/* Setup the Base and Length of the Tx Descriptor Ring */
 
@@ -2405,6 +2407,11 @@ ixgbe_initialize_transmit_units(struct i
rttdcs &= ~IXGBE_RTTDCS_ARBDIS;
IXGBE_WRITE_REG(hw, IXGBE_RTTDCS, rttdcs);
}
+
+   /* Enable TCP/UDP padding when using TSO */
+   hlreg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+   hlreg |= IXGBE_HLREG0_TXPADEN;
+   IXGBE_WRITE_REG(hw, IXGBE_HLREG0, hlreg);
 }
 
 /*
@@ -2473,16 +2480,18 @@ ixgbe_free_transmit_buffers(struct tx_ri
  **/
 
 static inline int
-ixgbe_csum_offload(struct mbuf *mp, uint32_t *vlan_macip_lens,
-uint32_t *type_tucmd_mlhl, uint32_t *olinfo_status)
+ixgbe_tx_offload(struct mbuf *mp, uint32_t *vlan_macip_lens,
+uint32_t *type_tucmd_mlhl, uint32_t *olinfo_status, uint32_t *cmd_type_len,
+uint32_t *mss_l4len_idx)
 {
struct ether_extracted ext;
int offload = 0;
-   uint32_t iphlen;
+   uint32_t ethlen, iphlen;
 
ether_extract_headers(mp, );
+   ethlen = sizeof(*ext.eh);
 
-   *vlan_macip_lens |= (sizeof(*ext.eh) << IXGBE_ADVTXD_MACLEN_SHIFT);
+   *vlan_macip_lens |= (ethlen << IXGBE_ADVTXD_MACLEN_SHIFT);
 
if (ext.ip4) {
iphlen = ext.ip4->ip_hl << 2;
@@ -2500,6 +2509,8 @@ ixgbe_csum_offload(struct mbuf *mp, uint
*type_tucmd_mlhl |= IXGBE_ADVTXD_TUCMD_IPV6;
 #endif
} else {
+   if (mp->m_pkthdr.csum_flags & M_TCP_TSO)
+   tcpstat_inc(tcps_outbadtso);
return offload;
}
 
@@ -2519,6 +2530,31 @@ ixgbe_csum_offload(struct mbuf *mp, uint
}
}
 
+   if (mp->m_pkthdr.csum_flags & M_TCP_TSO) {
+   if (ext.tcp) {
+   uint32_t hdrlen, thlen, paylen, outlen;
+
+   thlen = ext.tcp->th_off << 2;
+
+   outlen = mp->m_pkthdr.ph_mss;
+   *mss_l4len_idx |= outlen << IXGBE_ADVTXD_MSS_SHIFT;
+   *mss_l4len_idx |= thlen << IXGBE_ADVTXD_L4LEN_SHIFT;
+
+   hdrlen = ethlen + iphlen + thlen;
+   paylen = mp->m_pkthdr.len - hdrlen;
+   CLR(*olinfo_status, IXGBE_ADVTXD_PAYLEN_MASK
+   << IXGBE_ADVTXD_PAYLEN_SHIFT);
+   *olinfo_status |= paylen << IXGBE_ADVTXD_PAYLEN_SHIFT;
+
+   *cmd_type_len |= IXGBE_ADVTXD_DCMD_TSE;
+   offload = 1;
+
+   tcpstat_add(tcps_outpkttso,
+   (paylen + outlen - 1) / outlen);
+   } else
+   tcpstat_inc(tcps_outbadtso);
+   }
+
return offload;
 }
 
@@ -2529,6 +2565,7 @@ ixgbe_tx_ctx_setup(struct tx_ring *txr, 
struct ixgbe_adv_tx_context_desc *TXD;
struct ixgbe_tx_buf *tx_buffer;
uint32_t vlan_macip_lens = 0, type_tucmd_mlhl = 0;
+   uint32_t mss_l4len_idx = 0;
int ctxd = txr->next_avail_desc;
int offload = 0;
 
@@ -2544,8 +2581,8 @@ ixgbe_tx_ctx_setup(struct tx_ring *txr, 
}
 #endif
 
-   offload |= ixgbe_csum_offload(mp, _macip_lens, _tucmd_mlhl,
-   olinfo_status);
+   offload |= ixgbe_tx_offload(mp, _macip_lens, _tucmd_mlhl,
+   olinfo_status, cmd_type_len, _l4len_idx);
 
if (!offload)
return (0);
@@ -2559,7 +2596,7 @@ ixgbe_tx_ctx_setup(struct tx_ring *txr, 
TXD->vlan_macip_lens = htole32(vlan_macip_lens);
TXD->type_tucmd_mlhl = htole32(type_tucmd_mlhl);
TXD->seqnum_seed = htole32(0);
-   TXD->mss_l4len_idx = htole32(0);
+   TXD->mss_l4len_idx = htole32(mss_l4len_idx);
 
tx_buffer->m_head = NULL;
tx_buffer->eop_index = -1;
@@ -2868,9 +2905,11 @@ 

Re: installer: amd64 EFI: default to GPT

2023-05-16 Thread Chris Cappuccio
Klemens Nanni [k...@openbsd.org] wrote:
> On Sun, May 07, 2023 at 06:22:55PM +0200, Mark Kettenis wrote:
> > > Date: Sat,  6 May 2023 22:47:55 +
> > > From: Klemens Nanni 
> > > 
> > > On Sat, Apr 29, 2023 at 06:47:48PM +, Klemens Nanni wrote:
> > > > Installing to a wiped disk on EFI machines suggests MBR not GPT when 
> > > > chosing
> > > > (E)dit because MBR vs. GPT in this manual case is picked based on 
> > > > existing
> > > > data on the disk, not whether it has EFI.
> > > > 
> > > > Fix that so users get correct instructions and don't end up with legacy
> > > > partitioning in fresh installs on new machines.
> > > > 
> > > > Feedback? OK?
> > > 
> > > Anyone?
> > > 
> > > Put differently, in the manual (E)dit case, the guidance message should
> > > be oriented towards the actual system (this diff) and not whatever is on
> > > the disk that's about to be set up by hand (-current).
> > 
> > Makes no sense to me.  If you choose (E)dit, you want to make changes
> > to the partition table that is already on the disk.
> 
> If the disk has random garbage or you zero it, the (E)dit case looks for GPT
> which is not there and then suggests to to MBR instead.
> 
> > EFI firmware doesn't really care whether you have a GPT partition
> > table or a traditional MBR partition table.
> 
> That might work, but shouldn't you go for GPT with an ESP on UEFI systems?
> 
> In the case I describe and hit, our installer advises to just create an MBR
> with one OpenBSD partition, whereas I think we should rather hint users at GPT
> with ESP and an OpenBDS partition on UEFI systems.
> 

I don't quite understand the case this patch solves, because my installs to
fresh media always get EFI/GPT. It doesn't default to MBR. However, if
there is a case where it tries to use MBR, that isn't going to work so well.

MBR boot is totally unsupported on modern Intel. Starting with 10th gen
Intel processors, Intel only supplies graphics code for EFI mode.

With some 10th gen chipsets, like W480, the BIOS still gives you can option to
boot MBR with zero graphics. For the BIOS on 11th gen chipsets like the W580,
there is no MBR boot capability at all, for whatever reason.



Re: ix hardware tso

2023-05-16 Thread Alexander Bluhm
On Tue, May 16, 2023 at 10:48:24AM +0200, Hrvoje Popovski wrote:
> I've tested this diff with x552 and it's working as expected.
> 
> ix0 at pci5 dev 0 function 0 "Intel X552 SFP+" rev 0x00, msix, 4 queues,
> ix1 at pci5 dev 0 function 1 "Intel X552 SFP+" rev 0x00, msix, 4 queues,

My test setup has these.  Automatic test do not cover X540T, will
do manual test.

ix0 at pci3 dev 0 function 0 "Intel 82598AF" rev 0x01, msix, 4 queues,
ix1 at pci4 dev 0 function 0 "Intel 82599" rev 0x01, msix, 4 queues,
ix2 at pci4 dev 0 function 1 "Intel 82599" rev 0x01, msix, 4 queues,
ix3 at pci10 dev 0 function 0 "Intel X540T" rev 0x01, msix, 4 queues,
ix4 at pci10 dev 0 function 1 "Intel X540T" rev 0x01, msix, 4 queues,
ix5 at pci13 dev 0 function 0 "Intel 82599" rev 0x01, msix, 4 queues,
ix6 at pci13 dev 0 function 1 "Intel 82599" rev 0x01, msix, 4 queues,

> netstat -sp tcp | grep TSO
>   46 output TSO packets software chopped
>   772398947 output TSO packets hardware processed
>   4005484521 output TSO packets generated
>   0 output TSO packets dropped

I rebased to current after jan@'s commit and I fixed the "output
TSO packets generated" counter.  There was a small error.

Basically it is jan@'s diff with some counters and cleanup from me.

The additional includes are necessary as TSO is a layer violation.
We need TCP counter in a driver.

ok?

bluhm

Index: dev/pci/if_ix.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/dev/pci/if_ix.c,v
retrieving revision 1.194
diff -u -p -r1.194 if_ix.c
--- dev/pci/if_ix.c 16 May 2023 14:32:54 -  1.194
+++ dev/pci/if_ix.c 16 May 2023 16:07:53 -
@@ -1924,6 +1924,7 @@ ixgbe_setup_interface(struct ix_softc *s
ifp->if_capabilities |= IFCAP_CSUM_TCPv6 | IFCAP_CSUM_UDPv6;
ifp->if_capabilities |= IFCAP_CSUM_IPv4;
 
+   ifp->if_capabilities |= IFCAP_TSOv4 | IFCAP_TSOv6;
if (sc->hw.mac.type != ixgbe_mac_82598EB)
ifp->if_capabilities |= IFCAP_LRO;
 
@@ -2344,6 +2345,7 @@ ixgbe_initialize_transmit_units(struct i
int  i;
uint64_t tdba;
uint32_t txctrl;
+   uint32_t hlreg;
 
/* Setup the Base and Length of the Tx Descriptor Ring */
 
@@ -2405,6 +2407,11 @@ ixgbe_initialize_transmit_units(struct i
rttdcs &= ~IXGBE_RTTDCS_ARBDIS;
IXGBE_WRITE_REG(hw, IXGBE_RTTDCS, rttdcs);
}
+
+   /* Enable TCP/UDP padding when using TSO */
+   hlreg = IXGBE_READ_REG(hw, IXGBE_HLREG0);
+   hlreg |= IXGBE_HLREG0_TXPADEN;
+   IXGBE_WRITE_REG(hw, IXGBE_HLREG0, hlreg);
 }
 
 /*
@@ -2473,16 +2480,18 @@ ixgbe_free_transmit_buffers(struct tx_ri
  **/
 
 static inline int
-ixgbe_csum_offload(struct mbuf *mp, uint32_t *vlan_macip_lens,
-uint32_t *type_tucmd_mlhl, uint32_t *olinfo_status)
+ixgbe_tx_offload(struct mbuf *mp, uint32_t *vlan_macip_lens,
+uint32_t *type_tucmd_mlhl, uint32_t *olinfo_status, uint32_t *cmd_type_len,
+uint32_t *mss_l4len_idx)
 {
struct ether_extracted ext;
int offload = 0;
-   uint32_t iphlen;
+   uint32_t ethlen, iphlen;
 
ether_extract_headers(mp, );
+   ethlen = sizeof(*ext.eh);
 
-   *vlan_macip_lens |= (sizeof(*ext.eh) << IXGBE_ADVTXD_MACLEN_SHIFT);
+   *vlan_macip_lens |= (ethlen << IXGBE_ADVTXD_MACLEN_SHIFT);
 
if (ext.ip4) {
iphlen = ext.ip4->ip_hl << 2;
@@ -2500,6 +2509,8 @@ ixgbe_csum_offload(struct mbuf *mp, uint
*type_tucmd_mlhl |= IXGBE_ADVTXD_TUCMD_IPV6;
 #endif
} else {
+   if (mp->m_pkthdr.csum_flags & M_TCP_TSO)
+   tcpstat_inc(tcps_outbadtso);
return offload;
}
 
@@ -2519,6 +2530,31 @@ ixgbe_csum_offload(struct mbuf *mp, uint
}
}
 
+   if (mp->m_pkthdr.csum_flags & M_TCP_TSO) {
+   if (ext.tcp) {
+   uint32_t hdrlen, thlen, paylen, outlen;
+
+   thlen = ext.tcp->th_off << 2;
+
+   outlen = mp->m_pkthdr.ph_mss;
+   *mss_l4len_idx |= outlen << IXGBE_ADVTXD_MSS_SHIFT;
+   *mss_l4len_idx |= thlen << IXGBE_ADVTXD_L4LEN_SHIFT;
+
+   hdrlen = ethlen + iphlen + thlen;
+   paylen = mp->m_pkthdr.len - hdrlen;
+   CLR(*olinfo_status, IXGBE_ADVTXD_PAYLEN_MASK
+   << IXGBE_ADVTXD_PAYLEN_SHIFT);
+   *olinfo_status |= paylen << IXGBE_ADVTXD_PAYLEN_SHIFT;
+
+   *cmd_type_len |= IXGBE_ADVTXD_DCMD_TSE;
+   offload = 1;
+
+   tcpstat_add(tcps_outpkttso,
+   (paylen + outlen - 1) / outlen);
+   } else
+  

Re: smtpd: some fatal -> fatalx

2023-05-16 Thread Todd C . Miller
On Tue, 16 May 2023 14:51:44 +0200, Omar Polo wrote:

> while debugging a pebkac in -portable, I noticed that in various
> places we use fatal() for libtls failures.  errno doesn't generally
> contains anything useful after libtls functions, and in most it's
> explicitly cleared to avoid misuse.
>
> just to provide a quick example, with `listen on ... ciphers foobar':
>
> % doas smtpd -d
> info: OpenSMTPD 7.0.0 starting
> dispatcher: no ciphers for 'foobar': No such file or directory
> smtpd: process dispatcher socket closed
>
> So change most of them to fatalx which doesn't append errno.  While
> here I'm also logging the actual error, via tls_config_error() or
> tls_error(), that before was missing.
>
> tls_config_new(), tls_server() and tls_client() failures are still
> logged with fatal(), which I believe it's correct.

OK millert@

 - todd



Re: Unlock ip_sysctl()

2023-05-16 Thread Alexander Bluhm
On Tue, May 16, 2023 at 01:55:32PM +0300, Vitaliy Makkoveev wrote:
> Let's start to unlock (*pr_sysctl)() handlers. We have many of them, so
> introduce temporary PR_MPSAFE flag to mark MP safe instead of pushing
> kernel lock within handlers.

I had the same idea and flag name for pr_input functions in a pending
diff.  Could you name your flag PR_MPSYSCTL?  Then it is clear what
it is used for.

> Unlock ip_sysctl(). Still take kernel lock within IPCTL_MRTSTATS case.

> ok?

OK bluhm@

> This is not related to (*pr_sysctl)() unlocking, but if there is no
> objections, I want to replace tabs by space after #define in PR_* flags
> definitions.

Yes please.  I also prefer #defineSPACE over #defineTAB as diff
formating does not jump around.  Could you also use 0x0080 when
doing this.  Flags is a short and we could need more flags for fine
grained unlocking.

I saw one issue in sysctl_niq().  Another CPU could write mq_maxlen
and our logic is inconsistent.  Below is a fix with read once.  Each
CPU detects its own change, last write wins.  Or should we protect
everything with mq_mtx?  Then sysctl 123 -> 345 would have the
correct old value on both CPUs.

bluhm

Index: kern/uipc_mbuf.c
===
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_mbuf.c,v
retrieving revision 1.285
diff -u -p -r1.285 uipc_mbuf.c
--- kern/uipc_mbuf.c5 May 2023 01:19:51 -   1.285
+++ kern/uipc_mbuf.c16 May 2023 15:05:52 -
@@ -1788,7 +1788,7 @@ int
 sysctl_mq(int *name, u_int namelen, void *oldp, size_t *oldlenp,
 void *newp, size_t newlen, struct mbuf_queue *mq)
 {
-   unsigned int maxlen;
+   unsigned int oldmax, newmax;
int error;
 
/* All sysctl names at this level are terminal. */
@@ -1799,10 +1799,10 @@ sysctl_mq(int *name, u_int namelen, void
case IFQCTL_LEN:
return (sysctl_rdint(oldp, oldlenp, newp, mq_len(mq)));
case IFQCTL_MAXLEN:
-   maxlen = mq->mq_maxlen;
-   error = sysctl_int(oldp, oldlenp, newp, newlen, );
-   if (!error && maxlen != mq->mq_maxlen)
-   mq_set_maxlen(mq, maxlen);
+   oldmax = newmax = READ_ONCE(mq->mq_maxlen);
+   error = sysctl_int(oldp, oldlenp, newp, newlen, );
+   if (!error && oldmax != newmax)
+   mq_set_maxlen(mq, newmax);
return (error);
case IFQCTL_DROPS:
return (sysctl_rdint(oldp, oldlenp, newp, mq_drops(mq)));



Re: missing malloc failure check at /src/lib/libcrypto/asn1/bio_ndef.c

2023-05-16 Thread Илья Шипицин
I'm totally fine with your approach.

I tried to find "missing malloc null check" using the following coccinelle
script (easy to run from within CI)

malloc.cocci:

// find calls to malloc
@call@
expression ptr;
position p;
@@

ptr@p = malloc(...);

// find ok calls to malloc
@ok@
expression ptr;
position call.p;
@@

ptr@p = malloc(...);
... when != ptr
(
 (ptr == NULL || ...)
|
 (ptr != NULL || ...)
)

// fix bad calls to malloc
@depends on !ok@
expression ptr;
position call.p;
@@

ptr@p = malloc(...);
+ if (ptr == NULL) return;

~/portable/libressl-3.8.0$ spatch --sp-file malloc.cocci --dir .
...
HANDLING: ./apps/openssl/apps.c
diff =
diff -u -p a/apps/openssl/apps.c b/apps/openssl/apps.c
--- a/apps/openssl/apps.c
+++ b/apps/openssl/apps.c
@@ -379,6 +379,8 @@ password_callback(char *buf, int bufsiz,
 PW_MIN_LENGTH, bufsiz - 1);
 if (ok >= 0 && verify) {
 buff = malloc(bufsiz);
+if (buff == NULL)
+return;
 ok = UI_add_verify_string(ui, prompt, ui_flags, buff,
 PW_MIN_LENGTH, bufsiz - 1, buf);
 }
...
HANDLING: ./crypto/asn1/bio_ndef.c
diff =
diff -u -p a/crypto/asn1/bio_ndef.c b/crypto/asn1/bio_ndef.c
--- a/crypto/asn1/bio_ndef.c
+++ b/crypto/asn1/bio_ndef.c
@@ -181,6 +181,8 @@ ndef_prefix(BIO *b, unsigned char **pbuf

 derlen = ASN1_item_ndef_i2d(ndef_aux->val, NULL, ndef_aux->it);
 p = malloc(derlen);
+if (p == NULL)
+return;
 ndef_aux->derbuf = p;
 *pbuf = p;
 derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it);
@@ -253,6 +255,8 @@ ndef_suffix(BIO *b, unsigned char **pbuf

 derlen = ASN1_item_ndef_i2d(ndef_aux->val, NULL, ndef_aux->it);
 p = malloc(derlen);
+if (p == NULL)
+return;
 ndef_aux->derbuf = p;
 *pbuf = p;
 derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it);
...
HANDLING: ./tests/ecdhtest.c
diff =
diff -u -p a/tests/ecdhtest.c b/tests/ecdhtest.c
--- a/tests/ecdhtest.c
+++ b/tests/ecdhtest.c
@@ -147,6 +147,8 @@ test_ecdh_curve(int nid, const char *tex

 alen = KDF1_SHA1_len;
 abuf = malloc(alen);
+if (abuf == NULL)
+return;
 aout = ECDH_compute_key(abuf, alen, EC_KEY_get0_public_key(b),
 a, KDF1_SHA1);

@@ -155,6 +157,8 @@ test_ecdh_curve(int nid, const char *tex

 blen = KDF1_SHA1_len;
 bbuf = malloc(blen);
+if (bbuf == NULL)
+return;
 bout = ECDH_compute_key(bbuf, blen, EC_KEY_get0_public_key(a),
 b, KDF1_SHA1);

@@ -345,6 +349,8 @@ ecdh_kat(BIO *out, const char *cname, in
 if (Ztmplen != Zlen)
 goto err;
 Ztmp = malloc(Ztmplen);
+if (Ztmp == NULL)
+return;
 if (!ECDH_compute_key(Ztmp, Ztmplen,
 EC_KEY_get0_public_key(key2), key1, 0))
 goto err;
.
HANDLING: ./ssl/d1_pkt.c
diff =
diff -u -p a/ssl/d1_pkt.c b/ssl/d1_pkt.c
--- a/ssl/d1_pkt.c
+++ b/ssl/d1_pkt.c
@@ -214,6 +214,8 @@ dtls1_buffer_record(SSL *s, record_pqueu
 return 0;

 rdata = malloc(sizeof(DTLS1_RECORD_DATA_INTERNAL));
+if (rdata == NULL)
+return;
 item = pitem_new(priority, rdata);
 if (rdata == NULL || item == NULL)
 goto init_err;
@@ -260,6 +262,8 @@ dtls1_buffer_rcontent(SSL *s, rcontent_p
 return 0;

 rdata = malloc(sizeof(DTLS1_RCONTENT_DATA_INTERNAL));
+if (rdata == NULL)
+return;
 item = pitem_new(priority, rdata);
 if (rdata == NULL || item == NULL)
 goto init_err;



вт, 16 мая 2023 г. в 15:18, Theo Buehler :

> On Sun, May 14, 2023 at 05:51:16PM +0200, Илья Шипицин wrote:
> > patch attached.
>
> Thank you. While we could add these malloc checks, I do not think it is
> enough. For example, derlen could be <= 0 after the first call and the
> second call to ASN1_item_ndef_i2d() is not guaranteed to succeed and to
> return the same derlen. All this should be checked.
>
> We have started cleaning up the NDEF/SMIME code which will be helped by
> the fact that we hid the general code from the public API surface. There
> is a lot of work still to do. One of the very next steps should be to
> add decent test coverage.
>
> I think the below is closer to a step in the right direction. The idea
> is that if p = NULL, ASN1_item_ndef_i2d(..., , ...) will do the
> allocation internally and check that the derlen is consistent in the two
> calls.
>
> However, I would rather have better test coverage before changing this
> code. Ownership in particular is extremely tricky here.
>
> Index: asn1/bio_ndef.c
> ===
> RCS file: /cvs/src/lib/libcrypto/asn1/bio_ndef.c,v
> retrieving revision 1.22
> diff -u -p -r1.22 bio_ndef.c
> --- asn1/bio_ndef.c 25 Apr 2023 19:08:30 -  1.22
> +++ asn1/bio_ndef.c 16 May 2023 12:39:13 -
> @@ -171,7 +171,7 @@ static int
>  ndef_prefix(BIO *b, unsigned char **pbuf, int *plen, void *parg)
>  {
> NDEF_SUPPORT *ndef_aux;
> -   unsigned char *p;
> +   unsigned char *p = NULL;
> int derlen;
>
> if (!parg)
> @@ -179,13 +179,13 @@ ndef_prefix(BIO *b, unsigned char **pbuf
>
> ndef_aux = *(NDEF_SUPPORT **)parg;
>
> -   derlen = ASN1_item_ndef_i2d(ndef_aux->val, NULL, ndef_aux->it);
> -   p = malloc(derlen);
> +   if ((derlen = 

Re: missing malloc failure check at /src/lib/libcrypto/asn1/bio_ndef.c

2023-05-16 Thread Theo Buehler
On Sun, May 14, 2023 at 05:51:16PM +0200, Илья Шипицин wrote:
> patch attached.

Thank you. While we could add these malloc checks, I do not think it is
enough. For example, derlen could be <= 0 after the first call and the
second call to ASN1_item_ndef_i2d() is not guaranteed to succeed and to
return the same derlen. All this should be checked.

We have started cleaning up the NDEF/SMIME code which will be helped by
the fact that we hid the general code from the public API surface. There
is a lot of work still to do. One of the very next steps should be to
add decent test coverage.

I think the below is closer to a step in the right direction. The idea
is that if p = NULL, ASN1_item_ndef_i2d(..., , ...) will do the
allocation internally and check that the derlen is consistent in the two
calls.

However, I would rather have better test coverage before changing this
code. Ownership in particular is extremely tricky here.

Index: asn1/bio_ndef.c
===
RCS file: /cvs/src/lib/libcrypto/asn1/bio_ndef.c,v
retrieving revision 1.22
diff -u -p -r1.22 bio_ndef.c
--- asn1/bio_ndef.c 25 Apr 2023 19:08:30 -  1.22
+++ asn1/bio_ndef.c 16 May 2023 12:39:13 -
@@ -171,7 +171,7 @@ static int
 ndef_prefix(BIO *b, unsigned char **pbuf, int *plen, void *parg)
 {
NDEF_SUPPORT *ndef_aux;
-   unsigned char *p;
+   unsigned char *p = NULL;
int derlen;
 
if (!parg)
@@ -179,13 +179,13 @@ ndef_prefix(BIO *b, unsigned char **pbuf
 
ndef_aux = *(NDEF_SUPPORT **)parg;
 
-   derlen = ASN1_item_ndef_i2d(ndef_aux->val, NULL, ndef_aux->it);
-   p = malloc(derlen);
+   if ((derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it)) <= 0)
+   return 0;
+
ndef_aux->derbuf = p;
*pbuf = p;
-   derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it);
 
-   if (!*ndef_aux->boundary)
+   if (*ndef_aux->boundary == NULL)
return 0;
 
*plen = *ndef_aux->boundary - *pbuf;
@@ -231,7 +231,7 @@ static int
 ndef_suffix(BIO *b, unsigned char **pbuf, int *plen, void *parg)
 {
NDEF_SUPPORT *ndef_aux;
-   unsigned char *p;
+   unsigned char *p = NULL;
int derlen;
const ASN1_AUX *aux;
ASN1_STREAM_ARG sarg;
@@ -251,14 +251,15 @@ ndef_suffix(BIO *b, unsigned char **pbuf
_aux->val, ndef_aux->it, ) <= 0)
return 0;
 
-   derlen = ASN1_item_ndef_i2d(ndef_aux->val, NULL, ndef_aux->it);
-   p = malloc(derlen);
+   if ((derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it)) <= 0)
+   return 0;
+
ndef_aux->derbuf = p;
*pbuf = p;
-   derlen = ASN1_item_ndef_i2d(ndef_aux->val, , ndef_aux->it);
 
-   if (!*ndef_aux->boundary)
+   if (*ndef_aux->boundary == NULL)
return 0;
+
*pbuf = *ndef_aux->boundary;
*plen = derlen - (*ndef_aux->boundary - ndef_aux->derbuf);
 



Re: Status of Virtual Function driver for Intel 82599 series port?

2023-05-16 Thread Theo de Raadt
Yuichiro NAITO  wrote:

>   2. MTU 9000 is required for 10Gbps performance.
> 
>  The default MTU size 1500 is too small for 10Gbps link for now.

It is dangerous to give this suggestion without caveats.  MTU over 1500
does not work on even small parts of the internet, and many people will
have a difficult time diagnosing the failure modes.

Sadly, it can only be used on point to point configurations.



smtpd: some fatal -> fatalx

2023-05-16 Thread Omar Polo
while debugging a pebkac in -portable, I noticed that in various
places we use fatal() for libtls failures.  errno doesn't generally
contains anything useful after libtls functions, and in most it's
explicitly cleared to avoid misuse.

just to provide a quick example, with `listen on ... ciphers foobar':

% doas smtpd -d
info: OpenSMTPD 7.0.0 starting
dispatcher: no ciphers for 'foobar': No such file or directory
smtpd: process dispatcher socket closed

So change most of them to fatalx which doesn't append errno.  While
here I'm also logging the actual error, via tls_config_error() or
tls_error(), that before was missing.

tls_config_new(), tls_server() and tls_client() failures are still
logged with fatal(), which I believe it's correct.

ok?

diff /usr/src
commit - 27d3d13aceb86ba91128d3f12ca9fc893d83fa86
path + /usr/src
blob - dbcf2c0158186f1a3077c25d746c9a8d7a8ad9d0
file + usr.sbin/smtpd/mta.c
--- usr.sbin/smtpd/mta.c
+++ usr.sbin/smtpd/mta.c
@@ -489,38 +489,41 @@ mta_setup_dispatcher(struct dispatcher *dispatcher)
if (remote->tls_ciphers)
ciphers = remote->tls_ciphers;
if (ciphers && tls_config_set_ciphers(config, ciphers) == -1)
-   fatal("%s", tls_config_error(config));
+   fatalx("%s", tls_config_error(config));
 
if (remote->tls_protocols) {
if (tls_config_parse_protocols(,
remote->tls_protocols) == -1)
-   fatal("failed to parse protocols \"%s\"",
+   fatalx("failed to parse protocols \"%s\"",
remote->tls_protocols);
if (tls_config_set_protocols(config, protos) == -1)
-   fatal("%s", tls_config_error(config));
+   fatalx("%s", tls_config_error(config));
}
 
if (remote->pki) {
pki = dict_get(env->sc_pki_dict, remote->pki);
if (pki == NULL)
-   fatal("client pki \"%s\" not found ", remote->pki);
+   fatalx("client pki \"%s\" not found", remote->pki);
 
tls_config_set_dheparams(config, dheparams[pki->pki_dhe]);
tls_config_use_fake_private_key(config);
if (tls_config_set_keypair_mem(config, pki->pki_cert,
pki->pki_cert_len, NULL, 0) == -1)
-   fatal("tls_config_set_keypair_mem");
+   fatalx("tls_config_set_keypair_mem: %s",
+   tls_config_error(config));
}
 
if (remote->ca) {
ca = dict_get(env->sc_ca_dict, remote->ca);
if (tls_config_set_ca_mem(config, ca->ca_cert, ca->ca_cert_len)
== -1)
-   fatal("tls_config_set_ca_mem");
+   fatalx("tls_config_set_ca_mem: %s",
+   tls_config_error(config));
}
else if (tls_config_set_ca_file(config, tls_default_ca_cert_file())
== -1)
-   fatal("tls_config_set_ca_file");
+   fatalx("tls_config_set_ca_file: %s",
+   tls_config_error(config));
 
if (remote->tls_verify) {
tls_config_verify(config);
blob - a9b7d48c8a5fe3e1af6d32429556f09b7579583b
file + usr.sbin/smtpd/smtp.c
--- usr.sbin/smtpd/smtp.c
+++ usr.sbin/smtpd/smtp.c
@@ -166,14 +166,14 @@ smtp_setup_listener_tls(struct listener *l)
if (l->tls_ciphers)
ciphers = l->tls_ciphers;
if (ciphers && tls_config_set_ciphers(config, ciphers) == -1)
-   fatal("%s", tls_config_error(config));
+   fatalx("%s", tls_config_error(config));
 
if (l->tls_protocols) {
if (tls_config_parse_protocols(, l->tls_protocols) == -1)
-   fatal("failed to parse protocols \"%s\"",
+   fatalx("failed to parse protocols \"%s\"",
l->tls_protocols);
if (tls_config_set_protocols(config, protos) == -1)
-   fatal("%s", tls_config_error(config));
+   fatalx("%s", tls_config_error(config));
}
 
pki = l->pki[0];
@@ -181,7 +181,8 @@ smtp_setup_listener_tls(struct listener *l)
fatal("no pki defined");
 
if (tls_config_set_dheparams(config, dheparams[pki->pki_dhe]) == -1)
-   fatal("tls_config_set_dheparams");
+   fatalx("tls_config_set_dheparams: %s",
+   tls_config_error(config));
 
tls_config_use_fake_private_key(config);
for (i = 0; i < l->pkicount; i++) {
@@ -189,11 +190,13 @@ smtp_setup_listener_tls(struct listener *l)
if (i == 0) {
if (tls_config_set_keypair_mem(config, pki->pki_cert,
pki->pki_cert_len, NULL, 0) == -1)
-   fatal("tls_config_set_keypair_mem");
+   

Re: fill_file(): use solock_shared() to protect socket data

2023-05-16 Thread Vitaliy Makkoveev
On Thu, Apr 27, 2023 at 02:54:38PM +0200, Claudio Jeker wrote:
> On Thu, Apr 27, 2023 at 01:55:33PM +0300, Vitaliy Makkoveev wrote:
> > Now only direct netlock used for inet sockets protection. The unlocked
> > access to all other sockets is safe, but we could lost consistency for a
> > little. Since the solock() used for sockets protection, make locking
> > path common and use it. Make it shared because this is read-only access
> > to sockets data. For same reason use shared netlock while performing
> > inpcb tables foreach walkthrough.
> 
> This is still wrong. fill_file() is not allowed to sleep. So you can not
> use a rwlock in here. fill_file is called inside a allprocess
> LIST_FOREACH loop and sleeping there will blow up.
>  
> Maybe it is enough to grab a shared NETLOCK around the LIST_FOREACH() to
> ensure that we don't sleep inside.
> 
> Please fix this properly. Right now running fstat is like playing russian
> roulette.
> 

Except __pmap_asid_alloc() from arch/sh/sh/pmap.c of uniprocessor
machine, proc_stop_sweep() is the only place where `ps_list' can't be
protected by `allprocess_lock' rwlock(9). Since the refactoring is not
obvious, for now  I propose to use two locks for `ps_list' as we do for
some other data structures like `ps_threads' or `if_list'. For the code
paths where context switch is unwanted, the read only acces to `ps_list'
will be protected by kernel lock, and by `allprocess_lock' rwlock(9) in
all the rest. The `ps_list' list modification will require both locks to
be held. Not the best solution, but it fixes sysctl_file().

Index: sys/kern/kern_exit.c
===
RCS file: /cvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.211
diff -u -p -r1.211 kern_exit.c
--- sys/kern/kern_exit.c25 Apr 2023 18:14:06 -  1.211
+++ sys/kern/kern_exit.c16 May 2023 11:14:49 -
@@ -223,6 +223,8 @@ exit1(struct proc *p, int xexit, int xsi
 
p->p_fd = NULL; /* zap the thread's copy */
 
+   rw_enter_write(_lock);
+
 /*
 * Remove proc from pidhash chain and allproc so looking
 * it up won't work.  We will put the proc on the
@@ -295,6 +297,8 @@ exit1(struct proc *p, int xexit, int xsi
process_clear_orphan(qr);
}
}
+
+   rw_exit_write(_lock);
 
/* add thread's accumulated rusage into the process's total */
ruadd(rup, >p_ru);
Index: sys/kern/kern_fork.c
===
RCS file: /cvs/src/sys/kern/kern_fork.c,v
retrieving revision 1.247
diff -u -p -r1.247 kern_fork.c
--- sys/kern/kern_fork.c25 Apr 2023 18:14:06 -  1.247
+++ sys/kern/kern_fork.c16 May 2023 11:14:49 -
@@ -62,6 +62,11 @@
 #include 
 #include 
 
+/*
+ * Locks used to protect struct members in this file:
+ * P   allprocess_lock
+ */
+
 intnprocesses = 1; /* process 0 */
 intnthreads = 1;   /* proc 0 */
 struct forkstat forkstat;
@@ -372,6 +377,8 @@ fork1(struct proc *curp, int flags, void
return (ENOMEM);
}
 
+   rw_enter_write(_lock);
+
/*
 * From now on, we're committed to the fork and cannot fail.
 */
@@ -468,19 +475,28 @@ fork1(struct proc *curp, int flags, void
knote_locked(>ps_klist, NOTE_FORK | pr->ps_pid);
 
/*
-* Update stats now that we know the fork was successful.
+* Return child pid to parent process
 */
-   uvmexp.forks++;
-   if (flags & FORK_PPWAIT)
-   uvmexp.forks_ppwait++;
-   if (flags & FORK_SHAREVM)
-   uvmexp.forks_sharevm++;
+   if (retval != NULL)
+   *retval = pr->ps_pid;
 
/*
 * Pass a pointer to the new process to the caller.
+* XXX but we don't return `rnewprocp' to sys_fork()
+* or sys_vfork().
 */
if (rnewprocp != NULL)
*rnewprocp = p;
+   rw_exit_write(_lock);
+
+   /*
+* Update stats now that we know the fork was successful.
+*/
+   uvmexp.forks++;
+   if (flags & FORK_PPWAIT)
+   uvmexp.forks_ppwait++;
+   if (flags & FORK_SHAREVM)
+   uvmexp.forks_sharevm++;
 
/*
 * Preserve synchronization semantics of vfork.  If waiting for
@@ -499,11 +515,6 @@ fork1(struct proc *curp, int flags, void
if ((flags & FORK_PTRACE) && (curpr->ps_flags & PS_TRACED))
psignal(curp, SIGTRAP);
 
-   /*
-* Return child pid to parent process
-*/
-   if (retval != NULL)
-   *retval = pr->ps_pid;
return (0);
 }
 
@@ -610,7 +621,7 @@ alloctid(void)
 /*
  * Checks for current use of a pid, either as a pid or pgid.
  */
-pid_t oldpids[128];
+pid_t oldpids[128];/* [P] */
 int
 ispidtaken(pid_t pid)
 {
Index: sys/kern/kern_ktrace.c

Re: seperate LRO/TSO flags

2023-05-16 Thread Alexander Bluhm
This diff passed a make release.  I think it should be commited
now, so that we can proceed with TSO in the driver layer.

bluhm

On Mon, May 15, 2023 at 11:16:59PM +0200, Jan Klemkow wrote:
> Index: sbin/ifconfig/ifconfig.8
> ===
> RCS file: /cvs/src/sbin/ifconfig/ifconfig.8,v
> retrieving revision 1.394
> diff -u -p -r1.394 ifconfig.8
> --- sbin/ifconfig/ifconfig.8  26 Apr 2023 02:38:08 -  1.394
> +++ sbin/ifconfig/ifconfig.8  15 May 2023 18:46:48 -
> @@ -282,8 +282,18 @@ tag.
>  As CSUM_TCPv4, but supports IPv6 datagrams.
>  .It Sy CSUM_UDPv6
>  As above, for UDP.
> -.It Sy TSO
> -The device supports TCP segment offloading (TSO).
> +.It Sy LRO
> +The device supports TCP large receive offload (LRO).
> +.It Sy TSOv4
> +The device supports IPv4 TCP segmentation offload (TSO).
> +TSO is used by default.
> +Use the
> +.Xr sysctl 8
> +variable
> +.Va net.inet.tcp.tso
> +to disable this feature.
> +.It Sy TSOv6
> +As above, for IPv6.
>  .It Sy WOL
>  The device supports Wake on LAN (WoL).
>  .It Sy hardmtu
> @@ -491,25 +501,25 @@ Query and display information and diagno
>  modules installed in an interface.
>  It is only supported by drivers implementing the necessary functionality
>  on hardware which supports it.
> -.It Cm tso
> -Enable TCP segmentation offloading (TSO) if it's supported by the hardware; 
> see
> +.It Cm tcprecvoffload
> +Enable TCP large receive offload (LRO) if it's supported by the hardware; see
>  .Cm hwfeatures .
> -TSO enabled NICs modify received TCP/IP packets.
> +LRO enabled network interfaces modify received TCP/IP packets.
>  This will also affect traffic of upper layer interfaces,
>  such as
>  .Xr vlan 4 ,
>  .Xr aggr 4 ,
>  and
>  .Xr carp 4 .
> -It is not possible to use TSO with interfaces attached to a
> +It is not possible to use LRO with interfaces attached to a
>  .Xr bridge 4 ,
>  .Xr veb 4 ,
>  or
>  .Xr tpmr 4 .
>  Changing this option will re-initialize the network interface.
> -.It Cm -tso
> -Disable TSO.
> -TSO is disabled by default.
> +.It Cm -tcprecvoffload
> +Disable LRO.
> +LRO is disabled by default.
>  .It Cm up
>  Mark an interface
>  .Dq up .
> Index: sbin/ifconfig/ifconfig.c
> ===
> RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
> retrieving revision 1.463
> diff -u -p -r1.463 ifconfig.c
> --- sbin/ifconfig/ifconfig.c  12 May 2023 18:24:13 -  1.463
> +++ sbin/ifconfig/ifconfig.c  15 May 2023 20:27:51 -
> @@ -126,7 +126,7 @@
>  #define HWFEATURESBITS   
> \
>   "\024\1CSUM_IPv4\2CSUM_TCPv4\3CSUM_UDPv4"   \
>   "\5VLAN_MTU\6VLAN_HWTAGGING\10CSUM_TCPv6"   \
> - "\11CSUM_UDPv6\17TSO\20WOL"
> + "\11CSUM_UDPv6\15TSOv4\16TSOv6\17LSO\20WOL"
>  
>  struct ifencap {
>   unsigned int ife_flags;
> @@ -469,8 +469,8 @@ const struct  cmd {
>   { "-soii",  IFXF_INET6_NOSOII,  0,  setifxflags },
>   { "monitor",IFXF_MONITOR,   0,  setifxflags },
>   { "-monitor",   -IFXF_MONITOR,  0,  setifxflags },
> - { "tso",IFXF_TSO,   0,  setifxflags },
> - { "-tso",   -IFXF_TSO,  0,  setifxflags },
> + { "tcprecvoffload", IFXF_LRO,   0,  setifxflags },
> + { "-tcprecvoffload", -IFXF_LRO, 0,  setifxflags },
>  #ifndef SMALL
>   { "hwfeatures", NEXTARG0,   0,  printifhwfeatures },
>   { "metric", NEXTARG,0,  setifmetric },
> @@ -674,7 +674,7 @@ const struct  cmd {
>   "\7RUNNING\10NOARP\11PROMISC\12ALLMULTI\13OACTIVE\14SIMPLEX"\
>   "\15LINK0\16LINK1\17LINK2\20MULTICAST"  \
>   "\23AUTOCONF6TEMP\24MPLS\25WOL\26AUTOCONF6\27INET6_NOSOII"  \
> - "\30AUTOCONF4" "\31MONITOR" "\32TSO"
> + "\30AUTOCONF4" "\31MONITOR" "\32LRO"
>  
>  int  getinfo(struct ifreq *, int);
>  void getsock(int);
> Index: sys/dev/pci/if_ix.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> retrieving revision 1.193
> diff -u -p -r1.193 if_ix.c
> --- sys/dev/pci/if_ix.c   28 Apr 2023 10:18:57 -  1.193
> +++ sys/dev/pci/if_ix.c   15 May 2023 21:09:13 -
> @@ -1925,7 +1925,7 @@ ixgbe_setup_interface(struct ix_softc *s
>   ifp->if_capabilities |= IFCAP_CSUM_IPv4;
>  
>   if (sc->hw.mac.type != ixgbe_mac_82598EB)
> - ifp->if_capabilities |= IFCAP_TSO;
> + ifp->if_capabilities |= IFCAP_LRO;
>  
>   /*
>* Specify the media types supported by this sc and register
> @@ -2873,13 +2873,13 @@ ixgbe_initialize_receive_units(struct ix
>   hlreg |= IXGBE_HLREG0_JUMBOEN;
>   IXGBE_WRITE_REG(hw, IXGBE_HLREG0, hlreg);
>  
> - if (ISSET(ifp->if_xflags, IFXF_TSO)) {
> + if 

Unlock ip_sysctl()

2023-05-16 Thread Vitaliy Makkoveev
Let's start to unlock (*pr_sysctl)() handlers. We have many of them, so
introduce temporary PR_MPSAFE flag to mark MP safe instead of pushing
kernel lock within handlers.

Unlock ip_sysctl(). Still take kernel lock within IPCTL_MRTSTATS case.
It looks like `mrtstat' protection is inconsistent, so keep locking as
it was. Since `mrtstat' are counters, it make sense to rework them into
per CPU counters with separate diffs.

mrt_sysctl_mfc() and mrt_sysctl_vif() do read-only access so, netlock
could be replaced by shared netlock and pushed within, but also with
separate diffs.

ok?

This is not related to (*pr_sysctl)() unlocking, but if there is no
objections, I want to replace tabs by space after #define in PR_* flags
definitions.

Index: sys/kern/uipc_domain.c
===
RCS file: /cvs/src/sys/kern/uipc_domain.c,v
retrieving revision 1.61
diff -u -p -r1.61 uipc_domain.c
--- sys/kern/uipc_domain.c  4 May 2023 09:40:36 -   1.61
+++ sys/kern/uipc_domain.c  16 May 2023 10:50:53 -
@@ -244,10 +244,12 @@ net_sysctl(int *name, u_int namelen, voi
protocol = name[1];
for (pr = dp->dom_protosw; pr < dp->dom_protoswNPROTOSW; pr++)
if (pr->pr_protocol == protocol && pr->pr_sysctl) {
-   KERNEL_LOCK();
+   if ((pr->pr_flags & PR_MPSAFE) == 0)
+   KERNEL_LOCK();
error = (*pr->pr_sysctl)(name + 2, namelen - 2,
oldp, oldlenp, newp, newlen);
-   KERNEL_UNLOCK();
+   if ((pr->pr_flags & PR_MPSAFE) == 0)
+   KERNEL_UNLOCK();
return (error);
}
return (ENOPROTOOPT);
Index: sys/netinet/in_proto.c
===
RCS file: /cvs/src/sys/netinet/in_proto.c,v
retrieving revision 1.99
diff -u -p -r1.99 in_proto.c
--- sys/netinet/in_proto.c  15 Aug 2022 09:11:38 -  1.99
+++ sys/netinet/in_proto.c  16 May 2023 10:50:53 -
@@ -177,6 +177,7 @@ u_char ip_protox[IPPROTO_MAX];
 const struct protosw inetsw[] = {
 {
   .pr_domain   = ,
+  .pr_flags= PR_MPSAFE,
   .pr_init = ip_init,
   .pr_slowtimo = ip_slowtimo,
   .pr_sysctl   = ip_sysctl
Index: sys/netinet/ip_input.c
===
RCS file: /cvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.383
diff -u -p -r1.383 ip_input.c
--- sys/netinet/ip_input.c  5 Apr 2023 21:51:47 -   1.383
+++ sys/netinet/ip_input.c  16 May 2023 10:50:53 -
@@ -1704,8 +1704,11 @@ ip_sysctl(int *name, u_int namelen, void
return (ip_sysctl_ipstat(oldp, oldlenp, newp));
 #ifdef MROUTING
case IPCTL_MRTSTATS:
-   return (sysctl_rdstruct(oldp, oldlenp, newp,
-   , sizeof(mrtstat)));
+   KERNEL_LOCK();
+   error = sysctl_rdstruct(oldp, oldlenp, newp,
+   , sizeof(mrtstat));
+   KERNEL_UNLOCK();
+   return (error);
case IPCTL_MRTMFC:
if (newp)
return (EPERM);
Index: sys/sys/protosw.h
===
RCS file: /cvs/src/sys/sys/protosw.h,v
retrieving revision 1.59
diff -u -p -r1.59 protosw.h
--- sys/sys/protosw.h   26 Nov 2022 17:52:35 -  1.59
+++ sys/sys/protosw.h   16 May 2023 10:50:53 -
@@ -131,6 +131,8 @@ struct protosw {
 #definePR_ABRTACPTDIS  0x20/* abort on accept(2) to 
disconnected
   socket */
 #definePR_SPLICE   0x40/* socket splicing is possible 
*/
+#definePR_MPSAFE   0x80/* (*pr_sysctl)() doesn't 
require
+  kernel lock */
 
 /*
  * The arguments to usrreq are:



Re: ix hardware tso

2023-05-16 Thread Hrvoje Popovski
On 15.5.2023. 19:39, Alexander Bluhm wrote:
> On Sun, May 14, 2023 at 11:39:01PM +0200, Hrvoje Popovski wrote:
>> I've tested this on openbsd box with 4 iperf3's. 2 for ip4 and 2 for ip6
>> and with 16 tcp streams per iperf.  When testing over ix(4) there is big
>> differences in output performance. When testing ix/veb/vport there is
>> differences in output performance but not that big.
> Thanks a lot for testing.  I have also created some numbers which
> can be seen here.
> 
> http://bluhm.genua.de/perform/results/2023-05-14T09:14:59Z/perform.html
> 
> Sending TCP to Linux host and socket splicing gets faster.
> 
>> When testing over vport I'm getting "software chopped" which should be
>> expected.
> Yes, we cannot do hardware TSO in a bridge.  Maybe we could if all
> bridge members support it.
> 
> Next diff that should go in is where jan@ renames flags, cleans up
> ifconfig(8),. and fixes pseudo interface devices
> 
> Updated ix(4) driver diff after TCP/IP commit is below.

Hi,

I've tested this diff with x552 and it's working as expected.

ix0 at pci5 dev 0 function 0 "Intel X552 SFP+" rev 0x00, msix, 4 queues,
ix1 at pci5 dev 0 function 1 "Intel X552 SFP+" rev 0x00, msix, 4 queues,


netstat -sp tcp | grep TSO
46 output TSO packets software chopped
772398947 output TSO packets hardware processed
4005484521 output TSO packets generated
0 output TSO packets dropped



Re: cwm: add fvwm and tvm as default wm entries

2023-05-16 Thread Walter Alejandro Iglesias
I'm not an OpenBSD developer but, allow me to share my opinion about
this, please.

On May 15 2023, Okan Demirmen wrote:
> On Mon 2023.05.15 at 10:41 +0200, Matthieu Herrb wrote:
> > Last year I mentionned that I think we should retire twm. It's really
> > too old and missing support for the modern window managers hints.

For full support for ICCCM/EWMH you already have that modern, updated
and maintained version of fvwm you have in base :-).

> > 
> > People still using it ...

I started using unix-like systems in 2006, I guess that *decades* before
twm was already short for daily desktop use.  But twm has been always
there as part of the default X installation on any unix-like system.
Perhaps because its simplicity is precisely what makes twm useful as a
rescue option?  This is, at least, the use I've made of twm along all
these years.

As a side note.  Even being so simple, unlike the other two WMs in base,
twm supports utf-8.  This is now in part dysfunctional since you removed
half of the bitmap fixed "miscellaneous" fonts, which also affects fvwm2
and fvwm3 ports.

> > 
> > Otherwise ok to add this and fix the other WM menus for other window
> > managers (those parts of the configs are already local changes in
> > Xenocara)
> 
> I might argue the opposite, to remove cwm from fvwm and twm restart menus, if
> this inconsistency is a real concern. The entries in fvwm/twm are in the
> (shipped) example config files, where-as below it is, well, there for good 
> with
> no user choice. Heck, how often to do people even use this restart wm to
> another WM outside of playing around? Most window managers handle restarts
> differently, regardless of what ICCCM/EWMH says) and even then, crossing 
> window
> managers like this introduces inconsistencies. It's fine for playing around I
> suppose, but is it really a demanded "workflow"?
> 
> > > 
> > > PS:  fwvm and twm menus more programs we don't ship, e.g. "wm2", and
> > >   twm dies when failing to execute them (fvwm and cwm keeps running);
> > >   do we want to keep those default-broken entries around?
> 
> I'd support removing them.

In my experience, restarting from one window manager to another has *never*
worked fine.


-- 
Walter