Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Landry Breuil
On Fri, Jun 26, 2020 at 06:14:48PM +0200, Landry Breuil wrote:
> On Fri, Jun 26, 2020 at 02:45:53PM +0200, Stefan Sperling wrote:
> > This patch adds support for 11n Tx aggregation to iwm(4).
> > 
> > Please help with testing if you can by running the patch and using wifi
> > as usual. Nothing should change, except that Tx speed may potentially
> > improve. If you have time to run before/after performance measurements with
> > tcpbench or such, that would be nice. But it's not required for testing.
> > 
> > If Tx aggregation is active then netstat will show a non-zero output block 
> > ack
> > agreement counter:
> > 
> > $ netstat -W iwm0 | grep 'output block'
> > 3 new output block ack agreements
> > 0 output block ack agreements timed out
> > 
> > It would be great to get at least one test for all the chipsets the driver
> > supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
> > The behaviour of the access point also matters a great deal. It won't
> > hurt to test the same chipset against several different access points.
> > 
> > I have tested this version on 8265 only so far. I've run older revisions
> > of this patch on 7265 so I'm confident that this chip will work, too.
> > So far, the APs I have tested against are athn(4) in 11a mode and in 11n
> > mode with the 'nomimo' nwflag, and a Sagemcom 11ac AP. All on 5Ghz channels.
> 
> no difference on X1c3 w/
> iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 7265" rev 0x59, msi
> iwm0: hw rev 0x210, fw ver 17.3216344376.0,
> 
> using a crappy old fonera as AP, serving as a bridge to gw w/ tcpbench.
> 
> bandwidth min/avg/max/std-dev = 22.519/22.704/22.995/0.162 Mbps
> 
> same bw both ways it seems.

so no change against this old AP, which selects:
media: IEEE802.11 autoselect (OFDM48 mode 11g)
or sometimes
media: IEEE802.11 autoselect (OFDM12 mode 11g)
or
media: IEEE802.11 autoselect (OFDM6 mode 11g)

but if i connect to the ISP's box wifi, which selects:
media: IEEE802.11 autoselect (HT-MCS8 mode 11n)

the performance is horrible, i have a lot of lag, and tcpbench says:
bandwidth min/avg/max/std-dev = 0.000/1.576/10.069/2.781 Mbps

i have some iwm firmware errors in dmesg.

without the patch, its a bit the same:
bandwidth min/avg/max/std-dev = 0.000/1.836/9.846/2.292 Mbps

but no firmware errors afaict.
so dunno if the patch itself changes something, but the perf with the
ISP AP is awful. Cant remember if it was the case before as i seldomly
use it with OpenBSD as a client..

Landry



Re: awk FS behaviour change

2020-06-26 Thread Jason McIntyre
On Fri, Jun 26, 2020 at 09:28:00PM -0600, Todd C. Miller wrote:
> On Fri, 26 Jun 2020 23:56:23 +0200, Klemens Nanni wrote:
> 
> > How about adding something like "Therefore, FS should be set with -F or
> > in a BEGIN block before input is read." as second sentence in this
> > paragraph?
> 
> That whole section is missing important details.  I've tried to add
> the missing info without being too repetitive.
> 
>  - todd
> 
> Index: usr.bin/awk/awk.1
> ===
> RCS file: /cvs/src/usr.bin/awk/awk.1,v
> retrieving revision 1.54
> diff -u -p -u -r1.54 awk.1
> --- usr.bin/awk/awk.1 26 Jun 2020 21:50:06 -  1.54
> +++ usr.bin/awk/awk.1 27 Jun 2020 03:25:48 -
> @@ -129,27 +129,25 @@ and newlines are used as field separator
>  .Va FS ) .
>  This is convenient when working with multi-line records.
>  .Pp
> -An input line is normally made up of fields separated by whitespace,
> -or by the regular expression
> -.Va FS .
> +An input line is normally made up of fields split based on the value
> +of the field separator
> +.Va FS
> +at the time the line is read.

i'm not sure it reads better when we switch the emphasis from whitespace
to FS. i think it's better that people see how it normally works, then
the gories about FS. so i'd have kept the first part of the sentence,
but maybe reworked the FS bit.

>  The fields are denoted
>  .Va $1 , $2 , ... ,
>  while
>  .Va $0
>  refers to the entire line.
> -If
>  .Va FS
> -is null, the input line is split into one field per character.
> -Lines are split into fields using the value of
> +may be set to either a single character or a regular expression.
> +As as special case, if
>  .Va FS
> -at the time the line is read.
> -Because of this,
> +is a single space
> +.Pq the default ,
> +fields will be split by one or more whitespace characters.
> +If
>  .Va FS
> -is usually set via the
> -.Fl F
> -option or inside of a
> -.Ic BEGIN
> -block.
> +is null, the input line is split into one field per character.
>  .Pp
>  Normally, any number of blanks separate fields.
>  In order to set the field separator to a single blank, use the
> @@ -171,6 +169,11 @@ as the field separator, use the
>  .Fl F
>  option with a value of
>  .Sq [t] .
> +The field separator is usually set via the
> +.Fl F
> +option or from inside of a

that sounds odd, but it may be a US/UK thing: i would say either "from
inside a block" or "from the inside of a block".

jmc

> +.Ic BEGIN
> +block so that it takes effect before the input is read.
>  .Pp
>  A pattern-action statement has the form:
>  .Pp
> @@ -407,9 +410,9 @@ The name of the current input file.
>  .It Va FNR
>  Ordinal number of the current record in the current file.
>  .It Va FS
> -Regular expression used to separate fields; also settable
> -by option
> -.Fl F Ar fs .
> +Regular expression used to separate fields (default whitespace);
> +also settable by option
> +.Fl F Ar fs
>  .It Va NF
>  Number of fields in the current record.
>  .Va $NF
> 



Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Mike Larkin
On Fri, Jun 26, 2020 at 09:01:03PM -0700, Mike Larkin wrote:
> On Fri, Jun 26, 2020 at 02:45:53PM +0200, Stefan Sperling wrote:
> > This patch adds support for 11n Tx aggregation to iwm(4).
> >
> > Please help with testing if you can by running the patch and using wifi
> > as usual. Nothing should change, except that Tx speed may potentially
> > improve. If you have time to run before/after performance measurements with
> > tcpbench or such, that would be nice. But it's not required for testing.
> >
> > If Tx aggregation is active then netstat will show a non-zero output block 
> > ack
> > agreement counter:
> >
> > $ netstat -W iwm0 | grep 'output block'
> > 3 new output block ack agreements
> > 0 output block ack agreements timed out
> >
> > It would be great to get at least one test for all the chipsets the driver
> > supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
> > The behaviour of the access point also matters a great deal. It won't
> > hurt to test the same chipset against several different access points.
> >
> > I have tested this version on 8265 only so far. I've run older revisions
> > of this patch on 7265 so I'm confident that this chip will work, too.
> > So far, the APs I have tested against are athn(4) in 11a mode and in 11n
> > mode with the 'nomimo' nwflag, and a Sagemcom 11ac AP. All on 5Ghz channels.
>
> I tested this on my T490 Thinkpad:
>
> iwm0 at pci0 dev 20 function 3 "Intel Dual Band Wireless AC 9560" rev 0x30, 
> msix
> iwm0: hw rev 0x310, fw ver 34.3125811985.0
>
> It ended up having a heck of a time connecting to anything, most/all
> connections ended up timing out or just taking a really long time to complete.
>
> I looked in dmesg, and found a stream of fatal firmware errors and other
> errors (see end of this email).
>
> My iwm-firmware was updated before I tried the new kernel:
>
> -innsmouth- ~> pkg_info iwm-firmware
> Information for inst:iwm-firmware-20191022p1
>
> Comment:
> firmware binary images for iwm(4) driver
>
> Description:
> Firmware binary images for use with the iwm(4) driver.
>
> Maintainer: The OpenBSD ports mailing-list 
>
> WWW: https://wireless.wiki.kernel.org/en/users/Drivers/iwlwifi
>

PS, I did see 5 new output block ack agreements when I was running the diff,
so apparently at least it is doing ... something?

-ml

>
>
> I still have the kernel around if you want me to test something else. There
> is nothing in this tree except this Txagg diff. LMK if you need any more
> info.
>
> OpenBSD 6.7-current (GENERIC.MP) #1: Fri Jun 26 14:01:06 PDT 2020
> 
> mlar...@innsmouth.int.azathoth.net:/u/bin/src/OpenBSD/openbsd/sys/arch/amd64/compile/GENERIC.MP
> real mem = 51260506112 (48885MB)
> avail mem = 49691906048 (47389MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.1 @ 0x604f5000 (67 entries)
> bios0: vendor LENOVO version "N2IET61W (1.39 )" date 05/16/2019
> bios0: LENOVO 20N20046US
> acpi0 at bios0: ACPI 6.1
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT UEFI SSDT HPET APIC MCFG ECDT 
> SSDT SSDT BOOT SLIC SSDT LPIT WSMT SSDT DBGP DBG2 MSDM BATB DMAR NHLT ASF! 
> FPDT UEFI
> acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
> RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) 
> RP06(S4) PXSX(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 2399 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 1586.72 MHz, 06-8e-0c
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 24MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 1333.05 MHz, 06-8e-0c
> cpu1: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAV

wg(4): encapsulated transport checksums

2020-06-26 Thread richard . n . procter
Hi, 

On its receive path, wg(4) uses the same mbuf for both the encrypted 
capsule and its encapsulated packet, which it passes up to the stack. We 
must therefore clear this mbuf's checksum status flags, as although the 
capsule may have been subject to hardware offload, its encapsulated packet 
was not.

This ensures that the transport checksums of packets bound for local 
delivery are verified. That is necessary because, although the tunnel 
provides stronger integrity checks, the tunnel endpoints and the 
transport endpoints needn't coincide.

However, as the network and tunnel endpoints _do_ conincide, it remains 
unncessary to check the per-hop IPv4 checksum.

ok? 

Index: net/if_wg.c
===
RCS file: /cvs/src/sys/net/if_wg.c,v
retrieving revision 1.7
diff -u -p -u -p -r1.7 if_wg.c
--- net/if_wg.c 23 Jun 2020 10:03:49 -  1.7
+++ net/if_wg.c 27 Jun 2020 02:48:37 -
@@ -1660,14 +1660,10 @@ wg_decap(struct wg_softc *sc, struct mbu
goto error;
}
 
-   /*
-* We can mark incoming packet csum OK. We mark all flags OK
-* irrespective to the packet type.
-*/
-   m->m_pkthdr.csum_flags |= (M_IPV4_CSUM_IN_OK | M_TCP_CSUM_IN_OK |
-   M_UDP_CSUM_IN_OK | M_ICMP_CSUM_IN_OK);
-   m->m_pkthdr.csum_flags &= ~(M_IPV4_CSUM_IN_BAD | M_TCP_CSUM_IN_BAD |
-   M_UDP_CSUM_IN_BAD | M_ICMP_CSUM_IN_BAD);
+   /* tunneled packet was not offloaded */
+   m->m_pkthdr.csum_flags = 0;
+   /* optimise: the tunnel provided a stronger integrity check */
+   m->m_pkthdr.csum_flags |= M_IPV4_CSUM_IN_OK;
 
m->m_pkthdr.ph_ifidx = sc->sc_if.if_index;
m->m_pkthdr.ph_rtableid = sc->sc_if.if_rdomain;



Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Mike Larkin
On Fri, Jun 26, 2020 at 02:45:53PM +0200, Stefan Sperling wrote:
> This patch adds support for 11n Tx aggregation to iwm(4).
>
> Please help with testing if you can by running the patch and using wifi
> as usual. Nothing should change, except that Tx speed may potentially
> improve. If you have time to run before/after performance measurements with
> tcpbench or such, that would be nice. But it's not required for testing.
>
> If Tx aggregation is active then netstat will show a non-zero output block ack
> agreement counter:
>
> $ netstat -W iwm0 | grep 'output block'
> 3 new output block ack agreements
>   0 output block ack agreements timed out
>
> It would be great to get at least one test for all the chipsets the driver
> supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
> The behaviour of the access point also matters a great deal. It won't
> hurt to test the same chipset against several different access points.
>
> I have tested this version on 8265 only so far. I've run older revisions
> of this patch on 7265 so I'm confident that this chip will work, too.
> So far, the APs I have tested against are athn(4) in 11a mode and in 11n
> mode with the 'nomimo' nwflag, and a Sagemcom 11ac AP. All on 5Ghz channels.

I tested this on my T490 Thinkpad:

iwm0 at pci0 dev 20 function 3 "Intel Dual Band Wireless AC 9560" rev 0x30, msix
iwm0: hw rev 0x310, fw ver 34.3125811985.0

It ended up having a heck of a time connecting to anything, most/all
connections ended up timing out or just taking a really long time to complete.

I looked in dmesg, and found a stream of fatal firmware errors and other
errors (see end of this email).

My iwm-firmware was updated before I tried the new kernel:

-innsmouth- ~> pkg_info iwm-firmware
Information for inst:iwm-firmware-20191022p1

Comment:
firmware binary images for iwm(4) driver

Description:
Firmware binary images for use with the iwm(4) driver.

Maintainer: The OpenBSD ports mailing-list 

WWW: https://wireless.wiki.kernel.org/en/users/Drivers/iwlwifi



I still have the kernel around if you want me to test something else. There
is nothing in this tree except this Txagg diff. LMK if you need any more
info.

OpenBSD 6.7-current (GENERIC.MP) #1: Fri Jun 26 14:01:06 PDT 2020

mlar...@innsmouth.int.azathoth.net:/u/bin/src/OpenBSD/openbsd/sys/arch/amd64/compile/GENERIC.MP
real mem = 51260506112 (48885MB)
avail mem = 49691906048 (47389MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.1 @ 0x604f5000 (67 entries)
bios0: vendor LENOVO version "N2IET61W (1.39 )" date 05/16/2019
bios0: LENOVO 20N20046US
acpi0 at bios0: ACPI 6.1
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT SSDT SSDT SSDT UEFI SSDT HPET APIC MCFG ECDT SSDT 
SSDT BOOT SLIC SSDT LPIT WSMT SSDT DBGP DBG2 MSDM BATB DMAR NHLT ASF! FPDT UEFI
acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) 
RP06(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 2399 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 1586.72 MHz, 06-8e-0c
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 1333.05 MHz, 06-8e-0c
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-8665U CPU @ 1.90GHz, 1125.81 MHz, 06-8e-0c
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX

Re: awk FS behaviour change

2020-06-26 Thread Todd C . Miller
On Fri, 26 Jun 2020 23:56:23 +0200, Klemens Nanni wrote:

> How about adding something like "Therefore, FS should be set with -F or
> in a BEGIN block before input is read." as second sentence in this
> paragraph?

That whole section is missing important details.  I've tried to add
the missing info without being too repetitive.

 - todd

Index: usr.bin/awk/awk.1
===
RCS file: /cvs/src/usr.bin/awk/awk.1,v
retrieving revision 1.54
diff -u -p -u -r1.54 awk.1
--- usr.bin/awk/awk.1   26 Jun 2020 21:50:06 -  1.54
+++ usr.bin/awk/awk.1   27 Jun 2020 03:25:48 -
@@ -129,27 +129,25 @@ and newlines are used as field separator
 .Va FS ) .
 This is convenient when working with multi-line records.
 .Pp
-An input line is normally made up of fields separated by whitespace,
-or by the regular expression
-.Va FS .
+An input line is normally made up of fields split based on the value
+of the field separator
+.Va FS
+at the time the line is read.
 The fields are denoted
 .Va $1 , $2 , ... ,
 while
 .Va $0
 refers to the entire line.
-If
 .Va FS
-is null, the input line is split into one field per character.
-Lines are split into fields using the value of
+may be set to either a single character or a regular expression.
+As as special case, if
 .Va FS
-at the time the line is read.
-Because of this,
+is a single space
+.Pq the default ,
+fields will be split by one or more whitespace characters.
+If
 .Va FS
-is usually set via the
-.Fl F
-option or inside of a
-.Ic BEGIN
-block.
+is null, the input line is split into one field per character.
 .Pp
 Normally, any number of blanks separate fields.
 In order to set the field separator to a single blank, use the
@@ -171,6 +169,11 @@ as the field separator, use the
 .Fl F
 option with a value of
 .Sq [t] .
+The field separator is usually set via the
+.Fl F
+option or from inside of a
+.Ic BEGIN
+block so that it takes effect before the input is read.
 .Pp
 A pattern-action statement has the form:
 .Pp
@@ -407,9 +410,9 @@ The name of the current input file.
 .It Va FNR
 Ordinal number of the current record in the current file.
 .It Va FS
-Regular expression used to separate fields; also settable
-by option
-.Fl F Ar fs .
+Regular expression used to separate fields (default whitespace);
+also settable by option
+.Fl F Ar fs
 .It Va NF
 Number of fields in the current record.
 .Va $NF



Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 09:15:38PM +0200, Martin Pieuchot wrote:
> On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > > Updated diff. 
> > > > > > 
> > > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we 
> > > > > > can't
> > > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > > original `ifnet'.
> > > > > 
> > > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > > for whatever reason then the if_get(9) interface has to be fixed.  
> > > > > Which
> > > > > case doesn't work for you?  Do you have a reproducer?  
> > > > > 
> > > > > How does sessions stay around if their corresponding interface is
> > > > > destroyed?
> > > > 
> > > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > > pipexintr() process these queues. pipexintr() and
> > > > pipex_destroy_session() are *always* different context. This mean we
> > > > *can't* free pipex(4) session without be sure there is no reference to
> > > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > > and `pipexoutq' state. I'am said it again and again.
> > > 
> > > I understand.  Why is it a problem?  Using reference counting the way
> > > you're suggesting is *one* possible solution to a problem we don't fully
> > > understand.  What are we trying to achieve?  Which problem are we trying
> > > to solve?
> > 
> > Sorry, may be I misunderstand something.
> > 
> > `pipexoutq' case:
> > 
> > 1. pppac_start() calls pipex_output()
> > 2. pipex_output() calls pipex_ip_output()
> > 3. pipex_ip_output() calls pipex_ppp_enqueue()
> > 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
> > 
> > `pipexinq' cases:
> > 
> > 1.1. ether_input() calls pipex_pppoe_input()
> > 1.2. gre_input() calls gre_input_1()
> >  gre_input_1() calls pipex_pptp_input()
> > 1.3. udp_input() calls pipex_l2tp_input()
> > 
> > 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> > 3. pipex_common_input() calls schednetisr() which is task_add()
> > 
> > task_add(9) just schedules the execution of the work specified by `tq'.
> > So we can do pipex_destroy_session() * between * schednetisr() and
> > pipexintr(). And we can do this right * now *, with our current locking.
> > And this is the problem I'am trying to solve.
> > 
> > My apologies if I'am wrong above. Please point me where I'am wrong.
> > 
> > Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> > pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> > session. We should be shure `session' is still walid between
> > pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> > required in future.
> 
> Why not iterate over the queues and garbage collect the sessions that
> are about to be removed?  That's what the network stack was doing with
> mbuf queues prior to if_get(9) when interfaces where destroyed.
> 

Do you mean net/if.c:1185 and below? This is the queues associated with
this `ifp'. But for pipex(4) we should go through all mbufs associated
with pipex(4). This can be heavy if we have hundreds of sessions. Also
this would work until session destruction and `pr_input' are serialized.

Point me please the line in source to see if I'am wrong about `ifnet's
mbuf queues claninig.



Re: fix races in if_clone_create()

2020-06-26 Thread Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 09:12:16PM +0200, Martin Pieuchot wrote:
> On 26/06/20(Fri) 16:56, Vitaliy Makkoveev wrote:
> > if_clone_create() has the races caused by context switch.
> 
> Can you share a backtrace of such race?  Where does the kernel panic?
>

This diff was inspired by thread [1]. As I explained [2] here is 3
issues that cause panics produced by command below:

 cut begin 
for i in 1 2 3; do while true; do ifconfig bridge0 create& \
ifconfig bridge0 destroy& done& done
 cut end 

My system was stable with the last diff I did for thread [1]. But since
this final diff [3] which include fixes for tun(4) is quick and dirty
and not for commit I decided to make the diff to fix the races caused by
if_clone_create() at first.

I included screenshot with panic.
Also the code to reproduce below:

 cut begin 
#!/bin/sh
while true; do ifconfig bridge0 create& done&
while true; do ifconfig bridge0 destroy done
 cut end 

1. https://marc.info/?t=15928959011&r=1&w=2
2. https://marc.info/?l=openbsd-tech&m=159307900124245&w=2
3. https://marc.info/?l=openbsd-tech&m=159308633126243&w=2


Re: powerpc64: 64-bit-ize memmove.S

2020-06-26 Thread Theo de Raadt
George Koehler  wrote:

> On Sat, 27 Jun 2020 01:27:14 +0200
> Christian Weisgerber  wrote:
> 
> > I'm also intrigued by this aside in the PowerPC ISA documentation:
> > | Moreover, Load with Update instructions may take longer to execute
> > | in some implementations than the corresponding pair of a non-update
> > | Load instruction and an Add instruction.
> > What does clang generate?
> 
> clang likes load/store with update instructions.  For example, the
> powerpc64 kernel has /sys/lib/libkern/memcpy.c, which copies bytes:
> 
>   while (n-- > 0)
>   *t++ = *f++;

Keep in mind in userland we use memcpy.c, which does backwards detection.
It's probably slower.

The original idea was to use the C version with backwards detection until
it stopped finding bugs, but... it keeps finding bugs



Re: powerpc64: 64-bit-ize memmove.S

2020-06-26 Thread George Koehler
On Sat, 27 Jun 2020 01:27:14 +0200
Christian Weisgerber  wrote:

> I'm also intrigued by this aside in the PowerPC ISA documentation:
> | Moreover, Load with Update instructions may take longer to execute
> | in some implementations than the corresponding pair of a non-update
> | Load instruction and an Add instruction.
> What does clang generate?

clang likes load/store with update instructions.  For example, the
powerpc64 kernel has /sys/lib/libkern/memcpy.c, which copies bytes:

while (n-- > 0)
*t++ = *f++;

clang uses lbzu and stbu:

memcpy: cmpldi r5,0x0
memcpy+0x4: beqlr
memcpy+0x8: addi r4,r4,-1
memcpy+0xc: addi r6,r3,-1
memcpy+0x10:mtspr ctr,r5
memcpy+0x14:lbzu r5,1(r4)
memcpy+0x18:stbu r5,1(r6)
memcpy+0x1c:bdnz 0x26cd0d4 (memcpy+0x14)
memcpy+0x20:blr

> I think we should consider dropping this "optimized" memmove.S on
> both powerpc and powerpc64.

I might want to benchmark memmove.S against memmove.c to check if
those unaligned accesses are too slow.  First I would have to write
a benchmark.



Re: top: add missing scroll keys to help page, name default signal

2020-06-26 Thread Todd C . Miller
On Fri, 26 Jun 2020 23:39:25 +0200, Klemens Nanni wrote:

> The order of commands is not in sync between help page and manual, but
> I refrained from reordering to avoid churn.

OK millert@

 - todd



Re: top: remove duplicate initialisation

2020-06-26 Thread Vitaliy Makkoveev



> On 26 Jun 2020, at 23:12, Klemens Nanni  wrote:
> 
> On Fri, Jun 26, 2020 at 11:07:54PM +0300, Vitaliy Makkoveev wrote:
>> What about “pageshift = 0;” at usr.bin/top/machine.c:216 ?
> Could be removed as well but I left it in there intentionally to keep
> this block of code readable since pageshift is not only set but used
> immediately afterwards as well.
> 
I don’t think someone changed value to fill .data section :)
ok mvs@


Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Johan Huldtgren
hello,

On 2020-06-26 14:45, Stefan Sperling wrote:
> It would be great to get at least one test for all the chipsets the driver
> supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
> The behaviour of the access point also matters a great deal. It won't
> hurt to test the same chipset against several different access points.

tested on:

iwm0 at pci1 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi

AP is a Ruckus 7363.

$ netstat -W iwm0 | grep "output block" 


6 new output block ack agreements
0 output block ack agreements timed out

Before:

bandwidth min/avg/max/std-dev = 16.780/18.325/19.939/1.235 Mbps

After:

bandwidth min/avg/max/std-dev = 0.000/15.559/51.631/19.548 Mbps

.jh



Re: top: remove duplicate initialisation

2020-06-26 Thread Vitaliy Makkoveev



> On 26 Jun 2020, at 22:00, Klemens Nanni  wrote:
> 
> Those are global variables are (zero) initialised as such already and
> machine_init() is called only once upon startup.
> 
> Feedback? OK?
> 
> 
> Index: machine.c
> ===
> RCS file: /cvs/src/usr.bin/top/machine.c,v
> retrieving revision 1.105
> diff -u -p -r1.105 machine.c
> --- machine.c 25 Jun 2020 20:38:41 -  1.105
> +++ machine.c 26 Jun 2020 17:21:18 -
> @@ -203,11 +203,6 @@ machine_init(struct statics *statics)
>   if (cpu_online == NULL)
>   err(1, NULL);
> 
> - pbase = NULL;
> - pref = NULL;
> - onproc = -1;
> - nproc = 0;
> -
>   /*
>* get the page size with "getpagesize" and calculate pageshift from
>* it
> 

What about “pageshift = 0;” at usr.bin/top/machine.c:216 ?


Re: Stuck in Needbuf state, trying to understand (6.7)

2020-06-26 Thread sven falempin
On Fri, Jun 26, 2020 at 5:22 PM Stuart Henderson 
wrote:

> On 2020/06/26 15:30, sven falempin wrote:
> > behavior confirmed on current.
> >
> > Once the process stalls,  ( could be anything writing to the vnconfig
> disk,
> > cp , umount )
> > a few other calls like df , or ps, etc may hang, never the same
> > sp or mp kernel, reproduced on today's snapshots.
>
> vnconfig is used as part of "make release", many builds are done every
> week using this so it's not a general problem with vnconfig.
>
> Can you show some commands or a script to trigger the behaviour?
>

the perl script use the system to call :

vnconfig.
mount.
umount. <- saw hanged
cp.<- saw hanged
tar.<- saw hanged
svn up.<- saw hanged
and dd.
newfs.

really nothing fancy, only stuff writing to disk got stuck.

At one point it does a chroot but it never hangs near that , most of the
time it hangs before.

The script has been used like 1000 times on 6.0 and maybe twice more on 6.4.

I have absolutely no idea what the 'needbuf' of top is .

the script hangs at random position , always writing into vnconfig.

I have no idea how to reproduce outside the perl script , so maybe it is
related
to some devious perl stdin/stdout buffer .

Nevertheless there's like a 5% chance that's the script will work( slowly )

Most of the system call are inside a routine to log

sub debug_system {
  $logger->debug('running: '.join(' ', @_));
  return system(@_);
}

so i can easily put things inside to try to understand the issue.

It is really a strange behavior, and the device must be shut down
electrically.
Something really odd, i run syslogd on a buffer, and syslogc buffer is
stuck too
when the device stuck (but it supposed to be mostly already allocated
memory ).

It's really like the vm does not want to give anymore bucket (<- i
don't know what i m talking about here,
but i looks like that anything that doesn't malloc is ok , computer reply
to ping , can do a few things for a while , and then complete
hang )

I ran the 6.7 release on a VM somewhere and another device with many perl
script and they work.

Only this fails 95% of the time and is VERY VERY slow when ok.
compared to what i saw in /usr/src the vnconfig is big ,  ( forgot to copy
df -h  ),
like 2GB

-- 
--
-
Knowing is not enough; we must apply. Willing is not enough; we must do


Re: powerpc64: 64-bit-ize memmove.S

2020-06-26 Thread Christian Weisgerber
Christian Weisgerber:

> Well, I suggested it, so here's my attempt to switch powerpc64's
> libc memmove.S over to 64 bits:

Actually, on second thought:

That function simply copies as many (double)words plus a tail of
bytes as the length argument specifies.  Neither source nor destination
are checked for alignment, so this will happily run a loop of
unaligned accesses, which doesn't sound very optimal.

I'm also intrigued by this aside in the PowerPC ISA documentation:
| Moreover, Load with Update instructions may take longer to execute
| in some implementations than the corresponding pair of a non-update
| Load instruction and an Add instruction.
What does clang generate?

I think we should consider dropping this "optimized" memmove.S on
both powerpc and powerpc64.

-- 
Christian "naddy" Weisgerber  na...@mips.inka.de



powerpc64: 64-bit-ize memmove.S

2020-06-26 Thread Christian Weisgerber
Well, I suggested it, so here's my attempt to switch powerpc64's
libc memmove.S over to 64 bits:
* Treat length parameter as 64 bits (size_t) instead of 32 (u_int).
* Set up the main loop to copy 64-bit doublewords instead of 32-bit
  words.

This definitely needs double and triple checking.

Things I wonder about, but didn't touch:
* Why is the memcpy entry point commented out?
* END... STRONG? WEAK? BUILTIN?

Index: memmove.S
===
RCS file: /cvs/src/lib/libc/arch/powerpc64/string/memmove.S,v
retrieving revision 1.1
diff -u -p -r1.1 memmove.S
--- memmove.S   25 Jun 2020 02:34:22 -  1.1
+++ memmove.S   26 Jun 2020 22:22:51 -
@@ -39,7 +39,7 @@
  * ==
  */
 
-#include "SYS.h"
+#include "DEFS.h"
 
 .text
 
@@ -64,45 +64,45 @@ ENTRY(memmove)
/* start of dest*/
 
 fwd:
-   addi%r4, %r4, -4/* Back up src and dst pointers */
-   addi%r8, %r8, -4/* due to auto-update of 'load' */
+   addi%r4, %r4, -8/* Back up src and dst pointers */
+   addi%r8, %r8, -8/* due to auto-update of 'load' */
 
-   srwi.   %r9,%r5,2   /* How many words in total cnt  */
-   beq-last1   /* Handle byte by byte if < 4   */
+   srdi.   %r9,%r5,3   /* Doublewords in total count   */
+   beq-last1   /* Handle byte by byte if < 8   */
/* bytes total  */
-   mtctr   %r9 /* Count of words for loop  */
-   lwzu%r7, 4(%r4) /* Preload first word   */
+   mtctr   %r9 /* Count of dwords for loop */
+   ldu %r7, 8(%r4) /* Preload first doubleword */
 
b   g1
 
 g0:/* Main loop*/
 
-   lwzu%r7, 4(%r4) /* Load a new word  */
-   stwu%r6, 4(%r8) /* Store previous word  */
+   ldu %r7, 8(%r4) /* Load a new doubleword*/
+   stdu%r6, 8(%r8) /* Store previous doubleword*/
 
 g1:
 
bdz-last/* Dec cnt, and branch if just  */
-   /* one word to store*/
-   lwzu%r6, 4(%r4) /* Load another word*/
-   stwu%r7, 4(%r8) /* Store previous word  */
+   /* one doubleword to store  */
+   ldu %r6, 8(%r4) /* Load another doubleword  */
+   stdu%r7, 8(%r8) /* Store previous doubleword*/
bdnz+   g0  /* Dec cnt, and loop again if   */
-   /* more words   */
-   mr  %r7, %r6/* If word count -> 0, then...  */
+   /* more doublewords */
+   mr  %r7, %r6/* If dword count -> 0, then... */
 
 last:
 
-   stwu%r7, 4(%r8) /* ... store last word  */
+   stdu%r7, 8(%r8) /* ... store last doubleword*/
 
 last1: /* Byte-by-byte copy*/
 
-   clrlwi. %r5,%r5,30  /* If count -> 0, then ...  */
+   clrldi. %r5,%r5,61  /* If count -> 0, then ...  */
beqlr   /* we're done   */
 
mtctr   %r5 /* else load count for loop */
 
-   lbzu%r6, 4(%r4) /* 1st byte: update addr by 4   */
-   stbu%r6, 4(%r8) /* since we pre-adjusted by 4   */
+   lbzu%r6, 8(%r4) /* 1st byte: update addr by 8   */
+   stbu%r6, 8(%r8) /* since we pre-adjusted by 8   */
bdzlr-  /* in anticipation of main loop */
 
 last2:
@@ -120,40 +120,40 @@ reverse:
 
add %r4, %r4, %r5   /* Work from end to beginning   */
add %r8, %r8, %r5   /* so add count to string ptrs  */
-   srwi.   %r9,%r5,2   /* Words in total count */
-   beq-rlast1  /* Handle byte by byte if < 4   */
+   srdi.   %r9,%r5,3   /* Doublewords in total count   */
+   beq-rlast1  /* Handle byte by byte if < 8   */
/* bytes total  */
 
-   mtctr   %r9 /* Count of words for loop  */
+   mtctr   %r9 /* Count of dwords for loop */
 
-   lwzu%r7, -4(%r4)/* Preload first word   */
+   ldu

Re: awk FS behaviour change

2020-06-26 Thread Klemens Nanni
On Fri, Jun 26, 2020 at 03:41:21PM -0600, Todd C. Miller wrote:
> The awk manual leaves a lot of things unspecified (buy the book ;-).
> Does this addition help clear things up?
Yes.
 
> Index: awk.1
> ===
> RCS file: /cvs/src/usr.bin/awk/awk.1,v
> retrieving revision 1.53
> diff -u -p -u -r1.53 awk.1
> --- awk.1 17 Jun 2020 15:34:11 -  1.53
> +++ awk.1 26 Jun 2020 21:39:19 -
> @@ -140,6 +140,16 @@ refers to the entire line.
>  If
>  .Va FS
>  is null, the input line is split into one field per character.
> +Lines are split into fields using the value of
> +.Va FS
> +at the time the line is read.
> +Because of this,
> +.Va FS
> +is usually set via the
> +.Fl F
> +option or inside of a
> +.Ic BEGIN
> +block.
>  .Pp
>  Normally, any number of blanks separate fields.
>  In order to set the field separator to a single blank, use the

Given that this amends the following paragraph, your first sentence
seems repetitive:

An input line is normally made up of fields separated by whitespace, or
by the regular expression FS.  The fields are denoted $1, $2, ..., while
$0 refers to the entire line.  If FS is null, the input line is split
into one field per character.

How about adding something like "Therefore, FS should be set with -F or
in a BEGIN block before input is read." as second sentence in this
paragraph?



Re: awk FS behaviour change

2020-06-26 Thread Todd C . Miller
On Fri, 26 Jun 2020 21:41:57 +0100, Stuart Henderson wrote:

> I don't know which is "correct" (the manpage isn't enlightening) but
> it was a bit unexpected so I wanted to at least draw attention to it
> in case it breaks somebody else's script.

The awk manual leaves a lot of things unspecified (buy the book ;-).
Does this addition help clear things up?

 - todd

Index: awk.1
===
RCS file: /cvs/src/usr.bin/awk/awk.1,v
retrieving revision 1.53
diff -u -p -u -r1.53 awk.1
--- awk.1   17 Jun 2020 15:34:11 -  1.53
+++ awk.1   26 Jun 2020 21:39:19 -
@@ -140,6 +140,16 @@ refers to the entire line.
 If
 .Va FS
 is null, the input line is split into one field per character.
+Lines are split into fields using the value of
+.Va FS
+at the time the line is read.
+Because of this,
+.Va FS
+is usually set via the
+.Fl F
+option or inside of a
+.Ic BEGIN
+block.
 .Pp
 Normally, any number of blanks separate fields.
 In order to set the field separator to a single blank, use the



top: add missing scroll keys to help page, name default signal

2020-06-26 Thread Klemens Nanni
The order of commands is not in sync between help page and manual, but
I refrained from reordering to avoid churn.

OK?

NB: On a 80x24 xterm, the help page is already using the last line,
hiding all of the "Hit any key to continue: " prompt except its first
five characters which land on the last `u [-user]' line.

With those two lines added, the `S' line becomes the last one but and
on a 80x24 terminal the continue prompt gets can be seen completely.

I guess this can be fixed/improved regardless of the last line, but
that's stuff for another diff.


Index: display.c
===
RCS file: /cvs/src/usr.bin/top/display.c,v
retrieving revision 1.61
diff -u -p -r1.61 display.c
--- display.c   27 Oct 2019 13:52:26 -  1.61
+++ display.c   26 Jun 2020 21:32:37 -
@@ -805,6 +805,8 @@ show_help(void)
"  - update screen\n"
"+- reset any P highlight, g, p, or u filters\n"
"1- display CPU statistics on a single line\n"
+   "9 | 0- scroll up/down the process list by one line\n"
+   "( | )- scroll up/down the process list by screen half\n"
"C- toggle the display of command line arguments\n"
"d count  - show `count' displays, then exit\n"
"e- list errors generated by last \"kill\" or 
\"renice\" command\n"
@@ -812,7 +814,7 @@ show_help(void)
"h | ?- help; show this text\n"
"H- toggle the display of threads\n"
"I | i- toggle the display of idle processes\n"
-   "k [-sig] pid - send signal `-sig' to process `pid'\n"
+   "k [-sig] pid - send signal `-sig' (TERM by default) to process 
`pid'\n"
"n|# count- show `count' processes\n"
"o [-]field   - specify sort order (size, res, cpu, time, pri, pid, 
command)\n"
"   (o -field sorts in reverse)\n"



Re: fix races in if_clone_create()

2020-06-26 Thread Vitaliy Makkoveev
Sorry, I found a memory leak. Updated diff below.

Index: sys/net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.610
diff -u -p -r1.610 if.c
--- sys/net/if.c22 Jun 2020 09:45:13 -  1.610
+++ sys/net/if.c26 Jun 2020 17:19:41 -
@@ -157,6 +157,8 @@ voidif_linkstate_task(void *);
 
 intif_clone_list(struct if_clonereq *);
 struct if_clone*if_clone_lookup(const char *, int *);
+intif_clone_alloc_unit(struct if_clone *, int);
+void   if_clone_rele_unit(struct if_clone *, int);
 
 intif_group_egress_build(void);
 
@@ -1244,19 +1246,23 @@ if_clone_create(const char *name, int rd
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int unit, ret;
+   int unit, error;
 
ifc = if_clone_lookup(name, &unit);
if (ifc == NULL)
return (EINVAL);
+   error = if_clone_alloc_unit(ifc, unit);
+   if (error != 0)
+   return (error);
 
-   if (ifunit(name) != NULL)
-   return (EEXIST);
-
-   ret = (*ifc->ifc_create)(ifc, unit);
+   if (ifunit(name) != NULL) {
+   error = (EEXIST);
+   goto rele;
+   }
 
-   if (ret != 0 || (ifp = ifunit(name)) == NULL)
-   return (ret);
+   error = (*ifc->ifc_create)(ifc, unit);
+   if (error != 0 || (ifp = ifunit(name)) == NULL)
+   return (0);
 
NET_LOCK();
if_addgroup(ifp, ifc->ifc_name);
@@ -1264,7 +1270,10 @@ if_clone_create(const char *name, int rd
if_setrdomain(ifp, rdomain);
NET_UNLOCK();
 
-   return (ret);
+   return (0);
+rele:
+   if_clone_rele_unit(ifc, unit);
+   return (error);
 }
 
 /*
@@ -1275,9 +1284,9 @@ if_clone_destroy(const char *name)
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int ret;
+   int ret, unit;
 
-   ifc = if_clone_lookup(name, NULL);
+   ifc = if_clone_lookup(name, &unit);
if (ifc == NULL)
return (EINVAL);
 
@@ -1297,6 +1306,7 @@ if_clone_destroy(const char *name)
}
NET_UNLOCK();
ret = (*ifc->ifc_destroy)(ifp);
+   if_clone_rele_unit(ifc, unit);
 
return (ret);
 }
@@ -1342,12 +1352,95 @@ if_clone_lookup(const char *name, int *u
unit = (unit * 10) + (*cp++ - '0');
}
 
-   if (unitp != NULL)
-   *unitp = unit;
+   *unitp = unit;
return (ifc);
 }
 
 /*
+ * Allocate unit for cloned network interface.
+ */
+int if_clone_alloc_unit(struct if_clone *ifc, int unit)
+{
+   int word, bit, ret;
+
+   word = unit / (sizeof(*ifc->ifc_map) * 8);
+   bit = unit % (sizeof(*ifc->ifc_map) * 8);
+
+   rw_enter_write(&ifc->ifc_lock);
+
+   if(word >= ifc->ifc_map_size) {
+   u_long *map;
+   int size;
+
+   size = word + 1;
+   map = mallocarray(size, sizeof(*map), M_TEMP, M_WAITOK |
+   M_ZERO);
+
+   if (ifc->ifc_map != NULL) {
+   memcpy(map, ifc->ifc_map, ifc->ifc_map_size);
+   free(ifc->ifc_map, M_TEMP,
+   ifc->ifc_map_size * sizeof(*map));
+   }
+
+   ifc->ifc_map = map;
+   ifc->ifc_map_size = size;
+   }
+
+   if (ifc->ifc_map[word] & (1UL << bit))
+   ret = EEXIST;
+   else {
+   ifc->ifc_map[word] |= (1UL << bit);
+   ret = 0;
+   }
+
+   rw_exit_write(&ifc->ifc_lock);
+
+   return ret;
+}
+
+/*
+ * Release allocated unit for cloned network interface.
+ */
+void if_clone_rele_unit(struct if_clone *ifc, int unit)
+{
+   int word, bit;
+
+   word = unit / (sizeof(*ifc->ifc_map) * 8);
+   bit = unit % (sizeof(*ifc->ifc_map) * 8);
+
+   rw_enter_write(&ifc->ifc_lock);
+   KASSERT(word < ifc->ifc_map_size);
+
+   ifc->ifc_map[word] &= ~(1UL << bit);
+
+   if (ifc->ifc_map[word] == 0) {
+   u_long *map;
+   int size;
+
+   size = ifc->ifc_map_size - 2;
+   while (size>=0) {
+   if (ifc->ifc_map[size] != 0)
+   break;
+   --size;
+   }
+   if (size<0) {
+   size = 0;
+   map = NULL;
+   } else {
+   size += 1;
+   map = mallocarray(size, sizeof(*map), M_TEMP,
+   M_WAITOK);
+   memcpy(map, ifc->ifc_map, size);
+   }
+
+   free(ifc->ifc_map, M_TEMP, ifc->ifc_map_size * sizeof(*map));
+   ifc->ifc_map = map;
+   ifc->ifc_map_size = size;
+   }
+   rw_exit_write(&ifc->ifc_lock);
+}
+
+/*
  * Register a network interface cloner.
  */
 void
@@ -1360,6 +1453,7 @@ if_clone_attach(struct 

Re: Stuck in Needbuf state, trying to understand (6.7)

2020-06-26 Thread Stuart Henderson
On 2020/06/26 15:30, sven falempin wrote:
> behavior confirmed on current.
> 
> Once the process stalls,  ( could be anything writing to the vnconfig disk,
> cp , umount )
> a few other calls like df , or ps, etc may hang, never the same
> sp or mp kernel, reproduced on today's snapshots.

vnconfig is used as part of "make release", many builds are done every
week using this so it's not a general problem with vnconfig.

Can you show some commands or a script to trigger the behaviour?



Re: awk FS behaviour change

2020-06-26 Thread Todd C . Miller
On Fri, 26 Jun 2020 21:41:57 +0100, Stuart Henderson wrote:

> The Sep 10, 2019 version of awk introduced a change in handling this:
>
>   ifconfig egress | awk '/inet / {FS="[ .]"; print "host-"$4"-"$5"}'
>
> Given a line like
>
> inet 10.20.30.40 netmask 0xff00 broadcast 10.20.30.255
>
> it used to return host-30-40, now it returns host-0xfff0-broadcast.
> The new behaviour is the same as gawk and old behaviour can be obtained
> by doing this instead
>
>   ifconfig egress | awk -F '[ .]' '/inet / {print "host-"$4"-"$5}'
>
> I don't know which is "correct" (the manpage isn't enlightening) but
> it was a bit unexpected so I wanted to at least draw attention to it
> in case it breaks somebody else's script.

The current behavior is correct.  The old behavior was a bug because
field splitting is supported use the value of FS at the time the
record was read.  Setting FS after the line has been read is too
late.

You need to either set FS via the -F flag or inside a BEGIN block.

 - todd



awk FS behaviour change

2020-06-26 Thread Stuart Henderson
The Sep 10, 2019 version of awk introduced a change in handling this:

  ifconfig egress | awk '/inet / {FS="[ .]"; print "host-"$4"-"$5"}'

Given a line like

inet 10.20.30.40 netmask 0xff00 broadcast 10.20.30.255

it used to return host-30-40, now it returns host-0xfff0-broadcast.
The new behaviour is the same as gawk and old behaviour can be obtained
by doing this instead

  ifconfig egress | awk -F '[ .]' '/inet / {print "host-"$4"-"$5}'

I don't know which is "correct" (the manpage isn't enlightening) but
it was a bit unexpected so I wanted to at least draw attention to it
in case it breaks somebody else's script.



Re: pkg_add.1

2020-06-26 Thread Marc Espie
On Wed, Jun 24, 2020 at 05:32:57PM +0100, Jason McIntyre wrote:
> On Tue, Jun 23, 2020 at 07:28:09PM -0400, sven falempin wrote:
> > Dear readers,
> > 
> > It may not be very obvious that 'dry run' mode of pkg_add
> > actually downloads packages.
> > It is a good feature and maybe the pkg_add man could use an EXAMPLES
> > section.
> > 
> 
> hi.
> 
> it might not be obvious, but it is documented:
> 
>-n   Don't actually install a package, just report the
> steps that would be taken if it was.  Will still
> copy packages to PKG_CACHE if applicable.
> 
> if you look through the page for how the PKG_CACHE stuff works, it
> seems clear.
> 
> anyway, i'll defer to espie on whether the diff is wanted or not.
> comments on your text inline:

I don't think it adds much, and it's indeed already documented, and used
for instance by bsd.port.mk



Re: top: remove duplicate initialisation

2020-06-26 Thread Klemens Nanni
On Fri, Jun 26, 2020 at 11:07:54PM +0300, Vitaliy Makkoveev wrote:
> What about “pageshift = 0;” at usr.bin/top/machine.c:216 ?
Could be removed as well but I left it in there intentionally to keep
this block of code readable since pageshift is not only set but used
immediately afterwards as well.



Re: paste: treat '\0' delim as empty string

2020-06-26 Thread Richard Ipsum
On Mon, May 11, 2020 at 07:35:43AM +0200, Richard Ipsum wrote:
> Hi,
> 
> This patch makes paste treat '\0' in the delim list as an empty string
> instead of a null character as per POSIX[1].
> 
> before:
> 
> % echo -e 'hello\nworld' | ./paste -s -d '\0' - | od -b
> 000  150 145 154 154 157 000 167 157 162 154 144 012
> 014
> 
> after:
> 
> % echo -e 'hello\nworld' | ./paste -s -d '\0' - | od -b
> 000  150 145 154 154 157 167 157 162 154 144 012
> 013
> 
> Thanks,
> Richard
> 
> [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/paste.html
> 
> diff --git usr.bin/paste/paste.c usr.bin/paste/paste.c
> index 77c9f328755..68ed12898a4 100644
> --- usr.bin/paste/paste.c
> +++ usr.bin/paste/paste.c
> @@ -193,7 +193,7 @@ sequential(char **argv)
>   while ((len = getline(&line, &linesize, fp)) != -1) {
>   if (line[len - 1] == '\n')
>   line[len - 1] = '\0';
> - if (cnt >= 0)
> + if (cnt >= 0 && delim[cnt] != '\0')
>   putchar(delim[cnt]);
>   if (++cnt == delimcnt)
>   cnt = 0;

Any reason this didn't get merged?

Thanks,
Richard



Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Martin Pieuchot
On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:
> On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > Updated diff. 
> > > > > 
> > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we 
> > > > > can't
> > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > original `ifnet'.
> > > > 
> > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > > case doesn't work for you?  Do you have a reproducer?  
> > > > 
> > > > How does sessions stay around if their corresponding interface is
> > > > destroyed?
> > > 
> > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > pipexintr() process these queues. pipexintr() and
> > > pipex_destroy_session() are *always* different context. This mean we
> > > *can't* free pipex(4) session without be sure there is no reference to
> > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > and `pipexoutq' state. I'am said it again and again.
> > 
> > I understand.  Why is it a problem?  Using reference counting the way
> > you're suggesting is *one* possible solution to a problem we don't fully
> > understand.  What are we trying to achieve?  Which problem are we trying
> > to solve?
> 
> Sorry, may be I misunderstand something.
> 
> `pipexoutq' case:
> 
> 1. pppac_start() calls pipex_output()
> 2. pipex_output() calls pipex_ip_output()
> 3. pipex_ip_output() calls pipex_ppp_enqueue()
> 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
> 
> `pipexinq' cases:
> 
> 1.1. ether_input() calls pipex_pppoe_input()
> 1.2. gre_input() calls gre_input_1()
>  gre_input_1() calls pipex_pptp_input()
> 1.3. udp_input() calls pipex_l2tp_input()
> 
> 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> 3. pipex_common_input() calls schednetisr() which is task_add()
> 
> task_add(9) just schedules the execution of the work specified by `tq'.
> So we can do pipex_destroy_session() * between * schednetisr() and
> pipexintr(). And we can do this right * now *, with our current locking.
> And this is the problem I'am trying to solve.
> 
> My apologies if I'am wrong above. Please point me where I'am wrong.
> 
> Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> session. We should be shure `session' is still walid between
> pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> required in future.

Why not iterate over the queues and garbage collect the sessions that
are about to be removed?  That's what the network stack was doing with
mbuf queues prior to if_get(9) when interfaces where destroyed.



Re: fix races in if_clone_create()

2020-06-26 Thread Martin Pieuchot
On 26/06/20(Fri) 16:56, Vitaliy Makkoveev wrote:
> if_clone_create() has the races caused by context switch.

Can you share a backtrace of such race?  Where does the kernel panic?



top: remove duplicate initialisation

2020-06-26 Thread Klemens Nanni
Those are global variables are (zero) initialised as such already and
machine_init() is called only once upon startup.

Feedback? OK?


Index: machine.c
===
RCS file: /cvs/src/usr.bin/top/machine.c,v
retrieving revision 1.105
diff -u -p -r1.105 machine.c
--- machine.c   25 Jun 2020 20:38:41 -  1.105
+++ machine.c   26 Jun 2020 17:21:18 -
@@ -203,11 +203,6 @@ machine_init(struct statics *statics)
if (cpu_online == NULL)
err(1, NULL);
 
-   pbase = NULL;
-   pref = NULL;
-   onproc = -1;
-   nproc = 0;
-
/*
 * get the page size with "getpagesize" and calculate pageshift from
 * it



Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > Updated diff. 
> > > > 
> > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > original `ifnet'.
> > > 
> > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > case doesn't work for you?  Do you have a reproducer?  
> > > 
> > > How does sessions stay around if their corresponding interface is
> > > destroyed?
> > 
> > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > pipexintr() process these queues. pipexintr() and
> > pipex_destroy_session() are *always* different context. This mean we
> > *can't* free pipex(4) session without be sure there is no reference to
> > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > destroy sessions is wrong. While pppac(4) destroys sessions by
> > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > and `pipexoutq' state. I'am said it again and again.
> 
> I understand.  Why is it a problem?  Using reference counting the way
> you're suggesting is *one* possible solution to a problem we don't fully
> understand.  What are we trying to achieve?  Which problem are we trying
> to solve?

Sorry, may be I misunderstand something.

`pipexoutq' case:

1. pppac_start() calls pipex_output()
2. pipex_output() calls pipex_ip_output()
3. pipex_ip_output() calls pipex_ppp_enqueue()
4. pipex_ppp_enqueue() calls schednetisr() which is task_add()

`pipexinq' cases:

1.1. ether_input() calls pipex_pppoe_input()
1.2. gre_input() calls gre_input_1()
 gre_input_1() calls pipex_pptp_input()
1.3. udp_input() calls pipex_l2tp_input()

2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
3. pipex_common_input() calls schednetisr() which is task_add()

task_add(9) just schedules the execution of the work specified by `tq'.
So we can do pipex_destroy_session() * between * schednetisr() and
pipexintr(). And we can do this right * now *, with our current locking.
And this is the problem I'am trying to solve.

My apologies if I'am wrong above. Please point me where I'am wrong.

Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
session. We should be shure `session' is still walid between
pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
required in future.

> Are we trying to move pipexintr() out of KERNEL_LOCK()?  If so are we
> trying to guarantee the memory pointed by `ph_cookie' representing a
> session pointer is always valid without the KERNEL_LOCK()? 
> 
> If that's what we're trying to do, shouldn't we start with a baby-step
> and trade the KERNEL_LOCK() for another lock first?  I understand it is
> not some sexy-mp-rcu-atomic-crazy-refcount-clever solution, but it is
> simpler to do.
> 
> Since all the packets (mbuf) are being processed with the NET_LOCK(). we
> should be able to start trading the KERNEL_LOCK() by the NET_LOCK(), no?
> Once that works and we've built knowledge about the existing architecture
> we might consider some finer grain locking.
> 
> Is there any downside to this suggestion?
> 



fix races in if_clone_create()

2020-06-26 Thread Vitaliy Makkoveev
if_clone_create() has the races caused by context switch.

 cut begin 
if_clone_create(const char *name, int rdomain)
{
struct if_clone *ifc;
struct ifnet *ifp;
int unit, ret;

ifc = if_clone_lookup(name, &unit); /* [1] */
if (ifc == NULL)
return (EINVAL);

if (ifunit(name) != NULL) /* [2] */
return (EEXIST);

ret = (*ifc->ifc_create)(ifc, unit); /* [3] */

if (ret != 0 || (ifp = ifunit(name)) == NULL) /* [4] */
return (ret);

NET_LOCK();
if_addgroup(ifp, ifc->ifc_name);
if (rdomain != 0)
if_setrdomain(ifp, rdomain);
NET_UNLOCK();

return (ret);
}

 cut end 


The race is:

Thread 1:
1. We pass the string "ifacename0" with contains interface name and
it's unit to if_clone_lookup(). if_clone_lookup() extracts the unit '0'
and return it to us as integer `unit'. this `unit' is our local
variable, nobody knows what we are going to create interface with this
unit.

2. There is no "ifacename0" yet, so ifunit() will return NULL. 

3. We did some checks add call `ifc_create' which will call
pseudo-driver's callback to create instance with passed `unit'. We
initialize ifnet with our softwere context and the `if_xname' set to
"ifacename0".

3.1. We do if_attach() which call NET_LOCK() before we link `ifnet' to
`if_list'. Also if_attach() doesn't check anything so we always attach
passed `ifnet'.

Context was switched to Thread 2:

1. We also passed string "ifacename0" to if_clone_lookup() and received
`unit' with value `0'

2. There is no "ifacename0" yet, so ifunit() will return NULL.

3. All checks done, we call `ifc_create'. We initialize out softwere
context with passed `unit', Who knows do we chech is software context
for `unit' already exists before? In fact we don't :)
3.1 We initialized `ifnet' with the `if_xname' set to "ifacename0" and
we link it `if_list'.

4. This check is passed, `ifc_create' returned `0' and we have
"ifacename0" linked so ifunit() will return `ifp'.

We returned to userland with success and switched to Thread 1:

We continue 3.1. if_attach() continue his work and we have another
`ifnet' with `if_xname' set to "ifacename0" in list.

Now we have inconsistent `if_list'.

4. We do these checks. ifunit() will return pointer to `ifp' with
requested "ifacename0". All ok we return to userland with success.

Diff below fixes this reces.

Each `struct if_clone' has the bitmap where requested `unit' marked as
busy if it was not allocated before. We have new function
if_clone_alloc_unit() for. If `unit' already was allocated
if_clone_alloc_unit() will return EEXIST. Also we have
if_clone_rele_unit() to release `unit'. We do unit allocation before we
do context switch so now we can't allocate the same unit twice or more.

Index: net/if.c
===
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.610
diff -u -p -r1.610 if.c
--- net/if.c22 Jun 2020 09:45:13 -  1.610
+++ net/if.c26 Jun 2020 13:49:08 -
@@ -157,6 +157,8 @@ voidif_linkstate_task(void *);
 
 intif_clone_list(struct if_clonereq *);
 struct if_clone*if_clone_lookup(const char *, int *);
+intif_clone_alloc_unit(struct if_clone *, int);
+void   if_clone_rele_unit(struct if_clone *, int);
 
 intif_group_egress_build(void);
 
@@ -1244,19 +1246,23 @@ if_clone_create(const char *name, int rd
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int unit, ret;
+   int unit, error;
 
ifc = if_clone_lookup(name, &unit);
if (ifc == NULL)
return (EINVAL);
+   error = if_clone_alloc_unit(ifc, unit);
+   if (error != 0)
+   return (error);
 
-   if (ifunit(name) != NULL)
-   return (EEXIST);
-
-   ret = (*ifc->ifc_create)(ifc, unit);
+   if (ifunit(name) != NULL) {
+   error = (EEXIST);
+   goto rele;
+   }
 
-   if (ret != 0 || (ifp = ifunit(name)) == NULL)
-   return (ret);
+   error = (*ifc->ifc_create)(ifc, unit);
+   if (error != 0 || (ifp = ifunit(name)) == NULL)
+   return (0);
 
NET_LOCK();
if_addgroup(ifp, ifc->ifc_name);
@@ -1264,7 +1270,10 @@ if_clone_create(const char *name, int rd
if_setrdomain(ifp, rdomain);
NET_UNLOCK();
 
-   return (ret);
+   return (0);
+rele:
+   if_clone_rele_unit(ifc, unit);
+   return (error);
 }
 
 /*
@@ -1275,9 +1284,9 @@ if_clone_destroy(const char *name)
 {
struct if_clone *ifc;
struct ifnet *ifp;
-   int ret;
+   int ret, unit;
 
-   ifc = if_clone_lookup(name, NULL);
+   ifc = if_clone_lookup(name, &unit);
if (ifc == NULL)
return (EINVAL);
 
@@ -1297,6 +1306,7 @@ if_clone_destroy(const char *name)
}
NET_UNLOCK();
ret = (*ifc->ifc

Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Todd Carson


This is from an 8265:

iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x88, msi
iwm0: hw rev 0x230, fw ver 34.0.1,

Associated to an Apple AirPort AP on a 5 GHz channel.

Before:
bandwidth min/avg/max/std-dev = 11.402/17.410/36.190/4.079 Mbps

After:
bandwidth min/avg/max/std-dev = 5.147/25.039/54.066/8.489 Mbps

$ netstat -W iwm0 | grep "output block"
1 new output block ack agreement
0 output block ack agreements timed out




Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Landry Breuil
On Fri, Jun 26, 2020 at 02:45:53PM +0200, Stefan Sperling wrote:
> This patch adds support for 11n Tx aggregation to iwm(4).
> 
> Please help with testing if you can by running the patch and using wifi
> as usual. Nothing should change, except that Tx speed may potentially
> improve. If you have time to run before/after performance measurements with
> tcpbench or such, that would be nice. But it's not required for testing.
> 
> If Tx aggregation is active then netstat will show a non-zero output block ack
> agreement counter:
> 
> $ netstat -W iwm0 | grep 'output block'
> 3 new output block ack agreements
>   0 output block ack agreements timed out
> 
> It would be great to get at least one test for all the chipsets the driver
> supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
> The behaviour of the access point also matters a great deal. It won't
> hurt to test the same chipset against several different access points.
> 
> I have tested this version on 8265 only so far. I've run older revisions
> of this patch on 7265 so I'm confident that this chip will work, too.
> So far, the APs I have tested against are athn(4) in 11a mode and in 11n
> mode with the 'nomimo' nwflag, and a Sagemcom 11ac AP. All on 5Ghz channels.

no difference on X1c3 w/
iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 7265" rev 0x59, msi
iwm0: hw rev 0x210, fw ver 17.3216344376.0,

using a crappy old fonera as AP, serving as a bridge to gw w/ tcpbench.

bandwidth min/avg/max/std-dev = 22.519/22.704/22.995/0.162 Mbps

same bw both ways it seems.

Landry



Re: ffs(3): libc arch versions, regress

2020-06-26 Thread Mark Kettenis
> Date: Thu, 25 Jun 2020 23:05:36 +0200
> From: Christian Weisgerber 
> 
> Trying again, this time with powerpc64 added:
> 
> This adds the optimized ffs(3) versions on aarch64, powerpc, and
> powerpc64 to libc.  Also add a brief regression test.

Not yet in the position to test the powerpc64 implementation.  But it
should work.

ok kettenis@

> Index: lib/libc/arch/aarch64/string/Makefile.inc
> ===
> RCS file: /cvs/src/lib/libc/arch/aarch64/string/Makefile.inc,v
> retrieving revision 1.1
> diff -u -p -r1.1 Makefile.inc
> --- lib/libc/arch/aarch64/string/Makefile.inc 11 Jan 2017 18:09:24 -  
> 1.1
> +++ lib/libc/arch/aarch64/string/Makefile.inc 11 Jun 2020 20:30:34 -
> @@ -2,7 +2,7 @@
>  
>  SRCS+=   bcopy.c memcpy.c memmove.c \
>   strchr.c strrchr.c \
> - bcmp.c bzero.c ffs.c memchr.c memcmp.c memset.c \
> + bcmp.c bzero.c ffs.S memchr.c memcmp.c memset.c \
>   strcmp.c strncmp.c \
>   strcat.c strcpy.c strcspn.c strlen.c strlcat.c strlcpy.c \
>   strncat.c  strncpy.c strpbrk.c strsep.c \
> Index: lib/libc/arch/aarch64/string/ffs.S
> ===
> RCS file: lib/libc/arch/aarch64/string/ffs.S
> diff -N lib/libc/arch/aarch64/string/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libc/arch/aarch64/string/ffs.S11 Jun 2020 20:31:19 -
> @@ -0,0 +1,18 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include "DEFS.h"
> +
> +ENTRY(ffs)
> + RETGUARD_SETUP(ffs, x15)
> + rbitw1, w0
> + clz w1, w1
> + cmp w0, wzr
> + csinc   w0, wzr, w1, eq
> + RETGUARD_CHECK(ffs, x15)
> + ret
> +END(ffs)
> +.protected
> Index: lib/libc/arch/powerpc/string/Makefile.inc
> ===
> RCS file: /cvs/src/lib/libc/arch/powerpc/string/Makefile.inc,v
> retrieving revision 1.6
> diff -u -p -r1.6 Makefile.inc
> --- lib/libc/arch/powerpc/string/Makefile.inc 15 May 2015 22:29:37 -  
> 1.6
> +++ lib/libc/arch/powerpc/string/Makefile.inc 11 Jun 2020 20:33:04 -
> @@ -2,7 +2,7 @@
>  
>  SRCS+=   memcpy.c memmove.S \
>   strchr.c strrchr.c \
> - bcmp.c bzero.c ffs.c memchr.c memcmp.c memset.c strcat.c \
> + bcmp.c bzero.c ffs.S memchr.c memcmp.c memset.c strcat.c \
>   strcmp.c strcpy.c strcspn.c strlen.c strlcat.c strlcpy.c \
>   strncat.c strncmp.c strncpy.c strpbrk.c strsep.c \
>   strspn.c strstr.c swab.c
> Index: lib/libc/arch/powerpc/string/ffs.S
> ===
> RCS file: lib/libc/arch/powerpc/string/ffs.S
> diff -N lib/libc/arch/powerpc/string/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libc/arch/powerpc/string/ffs.S11 Jun 2020 20:33:19 -
> @@ -0,0 +1,16 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include "SYS.h"
> +
> +ENTRY(ffs)
> + neg %r4, %r3
> + and %r3, %r3, %r4
> + cntlzw  %r3, %r3
> + subfic  %r3, %r3, 32
> + blr
> +END(ffs)
> +.protected
> Index: lib/libc/arch/powerpc64/string/Makefile.inc
> ===
> RCS file: /cvs/src/lib/libc/arch/powerpc64/string/Makefile.inc,v
> retrieving revision 1.1
> diff -u -p -r1.1 Makefile.inc
> --- lib/libc/arch/powerpc64/string/Makefile.inc   25 Jun 2020 02:34:22 
> -  1.1
> +++ lib/libc/arch/powerpc64/string/Makefile.inc   25 Jun 2020 20:53:42 
> -
> @@ -2,7 +2,7 @@
>  
>  SRCS+=   memcpy.c memmove.S \
>   strchr.c strrchr.c \
> - bcmp.c bzero.c ffs.c memchr.c memcmp.c memset.c strcat.c \
> + bcmp.c bzero.c ffs.S memchr.c memcmp.c memset.c strcat.c \
>   strcmp.c strcpy.c strcspn.c strlen.c strlcat.c strlcpy.c \
>   strncat.c strncmp.c strncpy.c strpbrk.c strsep.c \
>   strspn.c strstr.c swab.c
> Index: lib/libc/arch/powerpc64/string/ffs.S
> ===
> RCS file: lib/libc/arch/powerpc64/string/ffs.S
> diff -N lib/libc/arch/powerpc64/string/ffs.S
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ lib/libc/arch/powerpc64/string/ffs.S  25 Jun 2020 20:57:16 -
> @@ -0,0 +1,15 @@
> +/*   $OpenBSD$ */
> +/*
> + * Written by Christian Weisgerber .
> + * Public domain.
> + */
> + 
> +#include "DEFS.h"
> +
> +ENTRY(ffs)
> + neg %r4, %r3
> + and %r3, %r3, %r4
> + cntlzw  %r3, %r3
> + subfic  %r3, %r3, 32
> + blr
> +END_BUILTIN(ffs)
> Index: regress/lib/libc/ffs/Makefile
> ===
> RCS file: regress/lib/libc/ffs/Makefile
> diff -N regress/lib/libc/ffs/Makefile
> --- /dev/null 1 Jan 1970 00:00:00 -
> +++ regress/lib/libc/ffs/Makefile 20 Jun 2020 15:26:51 -
> @@ -0,0 +1,6 @@
> +PROG=ffs_test

Re: 11n Tx aggregation for iwm(4)

2020-06-26 Thread Florian Obser
Seems to be working on a X1 gen2 using
iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 7260" rev 0x83, msi
against a Unifi AP-SHD.

Before:

bandwidth min/avg/max/std-dev = 7.344/9.077/11.514/0.803 Mbps

after:

bandwidth min/avg/max/std-dev = 12.551/65.407/82.835/14.169 Mbps

-- 
I'm not entirely sure you are real.



Re: [PATCH] fast conditional console scrolling

2020-06-26 Thread johnc
I should have been more rigorous -- I had two different changes running
on my system, as well as forcing it to use the 12x24 font for a 160x45
console.

If you apply the "Optimized rasops32 putchar" patch I just posted, you
should see another significant speedup.


 Original Message 
Subject: Re: [PATCH] fast conditional console scrolling
From: Paul de Weerd 
Date: Fri, June 26, 2020 1:23 am
To: jo...@armadilloaerospace.com
Cc: "tech@openbsd.org" 

Hi John,

I tried your diff. I don't quite see the same 3x improvement that you
report, more like 2x. I timed 7 runs of ls -R /usr/ports:

Before diff, time ls -R /usr/ports | wc -l 2.897s on average
After diff, time ls -R /usr/ports | wc -l 2.707s on average

Before diff, time ls -R /usr/ports 2m53.067 on average
After diff, time ls -R /usr/ports 1m30.387 on average

Note that the 'before diff' runs were with a snapshot kernel. There
may be diffs in there that account for the difference between before
and after of the no-output runs. See dmesg and full stats below.

So, on average, a speed-up of ~48%.

Thanks!

Paul 'WEiRD' de Weerd




top: remove handle abstraction, use simpler process list

2020-06-26 Thread Klemens Nanni
The internal handle used to pass process information is a needless
abstraction, after previously removing an unused member, it now only has
one member pointing to a pointer to a process struct, i.e. a simple list
of processes.

Remove the abstraction layer and (re)use the existing list of
(pointers to) kinfo_proc structs.  The diff is straight forward and I
tested it on amd64 and sparc64.

With this, scrolling does not even deserve its own helper, we can merely
point at the process to format with an offset in the list now.

There's more to clean up, bu this minimal diff already removes 21 LOC.

Feedback? OK?


Index: commands.c
===
RCS file: /cvs/src/usr.bin/top/commands.c,v
retrieving revision 1.33
diff -u -p -r1.33 commands.c
--- commands.c  8 Oct 2019 07:26:59 -   1.33
+++ commands.c  26 Jun 2020 14:35:20 -
@@ -36,6 +36,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
Index: display.c
===
RCS file: /cvs/src/usr.bin/top/display.c,v
retrieving revision 1.61
diff -u -p -r1.61 display.c
--- display.c   27 Oct 2019 13:52:26 -  1.61
+++ display.c   26 Jun 2020 14:34:59 -
@@ -46,6 +46,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
Index: machine.c
===
RCS file: /cvs/src/usr.bin/top/machine.c,v
retrieving revision 1.105
diff -u -p -r1.105 machine.c
--- machine.c   25 Jun 2020 20:38:41 -  1.105
+++ machine.c   26 Jun 2020 14:36:25 -
@@ -60,12 +60,6 @@ static char  *format_comm(struct kinfo_pr
 static int cmd_matches(struct kinfo_proc *, char *);
 static char**get_proc_args(struct kinfo_proc *);
 
-/* get_process_info passes back a handle.  This is what it looks like: */
-
-struct handle {
-   struct kinfo_proc **next_proc;  /* points to next valid proc pointer */
-};
-
 /* what we consider to be process size: */
 #define PROCSIZE(pp) ((pp)->p_vm_tsize + (pp)->p_vm_dsize + (pp)->p_vm_ssize)
 
@@ -127,7 +121,7 @@ static int  nproc;
 static int onproc = -1;
 static int pref_len;
 static struct kinfo_proc *pbase;
-static struct kinfo_proc **pref;
+struct kinfo_proc **pref;
 
 /* these are for getting the memory statistics */
 static int pageshift;  /* log base 2 of the pagesize */
@@ -313,8 +307,6 @@ get_system_info(struct system_info *si)
si->last_pid = -1;
 }
 
-static struct handle handle;
-
 struct kinfo_proc *
 getprocs(int op, int arg, int *cnt)
 {
@@ -413,7 +405,7 @@ cmd_matches(struct kinfo_proc *proc, cha
return 0;
 }
 
-struct handle *
+void
 get_process_info(struct system_info *si, struct process_select *sel,
 int (*compare) (const void *, const void *))
 {
@@ -489,10 +481,6 @@ get_process_info(struct system_info *si,
/* remember active and total counts */
si->p_total = total_procs;
si->p_active = pref_len = active_procs;
-
-   /* pass back a handle */
-   handle.next_proc = pref;
-   return &handle;
 }
 
 char fmt[MAX_COLS];/* static area where result is built */
@@ -536,24 +524,14 @@ format_comm(struct kinfo_proc *kp)
return (buf);
 }
 
-void
-skip_processes(struct handle *hndl, int n)
-{
-   hndl->next_proc += n;
-}
-
 char *
-format_next_process(struct handle *hndl, const char *(*get_userid)(uid_t, int),
+format_process(struct kinfo_proc *pp, const char *(*get_userid)(uid_t, int),
 pid_t *pid)
 {
char *p_wait;
-   struct kinfo_proc *pp;
int cputime;
double pct;
char buf[16];
-
-   /* find and remember the next proc structure */
-   pp = *(hndl->next_proc++);
 
cputime = pp->p_rtime_sec + ((pp->p_rtime_usec + 50) / 100);
 
Index: machine.h
===
RCS file: /cvs/src/usr.bin/top/machine.h,v
retrieving revision 1.29
diff -u -p -r1.29 machine.h
--- machine.h   25 Jun 2020 20:38:41 -  1.29
+++ machine.h   26 Jun 2020 14:35:56 -
@@ -87,11 +87,10 @@ extern int  display_init(struct stat
 extern int  machine_init(struct statics *);
 extern char*format_header(char *);
 extern void get_system_info(struct system_info *);
-extern struct handle
-*get_process_info(struct system_info *, struct process_select *,
+extern void
+get_process_info(struct system_info *, struct process_select *,
 int (*) (const void *, const void *));
-extern void skip_processes(struct handle *, int);
-extern char*format_next_process(struct handle *,
+extern char*format_process(struct kinfo_proc *,
 const char *(*)(uid_t, int), pid_t *);
 extern uid_tproc_owner(pid_t);
 
Index: top.c
===
RCS file: /cvs/src/usr.bin/top/top.c,v
retrieving revision 1.103
diff -u -p -r1.103 top.c
--- top.c   25 Jun 2020 20:38:41 -  

[PATCH} Optimized rasops32 putchar

2020-06-26 Thread johnc
Optimized 32 bit character rendering with unrolled rows and pairwise
foreground / background pixel rendering.

If it weren't for the 5x8 font, I would have just assumed everything
was an even width and made the fallback path also pairwise.

In isolation, the 16x32 character case got 2x faster, but that wasn't
a huge real world speedup where the space rendering that was already
at memory bandwidth limits accounted for most of the character
rendering time.  However, in combination with the previous fast
conditional console scrolling that removes most of the space rendering,
it becomes significant.

I also found that at least the efi and intel framebuffers are not
currently mapped write combining, which makes this much slower than
it should be.


Index: rasops32.c
===
RCS file: /cvs/src/sys/dev/rasops/rasops32.c,v
retrieving revision 1.10
diff -u -p -r1.10 rasops32.c
--- rasops32.c  25 May 2020 09:55:49 -  1.10
+++ rasops32.c  26 Jun 2020 14:34:06 -
@@ -65,9 +65,14 @@ rasops32_init(struct rasops_info *ri)
 int
 rasops32_putchar(void *cookie, int row, int col, u_int uc, uint32_t
attr)
 {
-   int width, height, cnt, fs, fb, clr[2];
+   int width, height, step, cnt, fs, b, f;
+   uint32_t fb, clr[2];
struct rasops_info *ri;
-   int32_t *dp, *rp;
+   int64_t *rp, q;
+   union {
+   int64_t q[4];
+   int32_t d[4][2];
+   } u;
u_char *fr;
 
ri = (struct rasops_info *)cookie;
@@ -81,48 +86,128 @@ rasops32_putchar(void *cookie, int row, 
return 0;
 #endif
 
-   rp = (int32_t *)(ri->ri_bits + row*ri->ri_yscale + col*ri->ri_xscale);
+   rp = (int64_t *)(ri->ri_bits + row*ri->ri_yscale + col*ri->ri_xscale);
 
height = ri->ri_font->fontheight;
width = ri->ri_font->fontwidth;
+   step = ri->ri_stride >> 3;
 
-   clr[0] = ri->ri_devcmap[(attr >> 16) & 0xf];
-   clr[1] = ri->ri_devcmap[(attr >> 24) & 0xf];
+   b = ri->ri_devcmap[(attr >> 16) & 0xf];
+   f = ri->ri_devcmap[(attr >> 24) & 0xf];
+   u.d[0][0] = b; u.d[0][1] = b;
+   u.d[1][0] = b; u.d[1][1] = f;
+   u.d[2][0] = f; u.d[2][1] = b;
+   u.d[3][0] = f; u.d[3][1] = f;
 
if (uc == ' ') {
+   q = u.q[0];
while (height--) {
-   dp = rp;
-   DELTA(rp, ri->ri_stride, int32_t *);
-
-   for (cnt = width; cnt; cnt--)
-   *dp++ = clr[0];
+   /* the general, pixel-at-a-time case is fast enough */
+   for (cnt = 0; cnt < width; cnt++)
+   ((int *)rp)[cnt] = b;
+   rp += step;
}
} else {
uc -= ri->ri_font->firstchar;
fr = (u_char *)ri->ri_font->data + uc * ri->ri_fontscale;
fs = ri->ri_font->stride;
-
-   while (height--) {
-   dp = rp;
-   fb = fr[3] | (fr[2] << 8) | (fr[1] << 16) |
-   (fr[0] << 24);
-   fr += fs;
-   DELTA(rp, ri->ri_stride, int32_t *);
-
-   for (cnt = width; cnt; cnt--) {
-   *dp++ = clr[(fb >> 31) & 1];
-   fb <<= 1;
-   }
+   /* double-pixel special cases for the common widths */
+   switch (width) {
+   case 8:
+   while (height--) {
+   fb = fr[0];
+   rp[0] = u.q[fb >> 6];
+   rp[1] = u.q[(fb >> 4) & 3];
+   rp[2] = u.q[(fb >> 2) & 3];
+   rp[3] = u.q[fb & 3];
+   rp += step;
+   fr += 1;
+   }
+   break;
+   
+   case 12:
+   while (height--) {
+   fb = fr[0];
+   rp[0] = u.q[fb >> 6];
+   rp[1] = u.q[(fb >> 4) & 3];
+   rp[2] = u.q[(fb >> 2) & 3];
+   rp[3] = u.q[fb & 3];
+   fb = fr[1];
+   rp[4] = u.q[fb >> 6];
+   rp[5] = u.q[(fb >> 4) & 3];
+   rp += step;
+   fr += 2;
+   }
+   break;
+   
+   case 16:
+   while (height--) {
+  

Re: userland clock_gettime proof of concept

2020-06-26 Thread Paul Irofti
On Wed, Jun 24, 2020 at 11:53:23AM +0200, Robert Nagy wrote:
> On 22/06/20 19:12 +0300, Paul Irofti wrote:
> > New iteration:
> > 
> >   - ps_timekeep should not coredump, pointed by deraadt@
> >   - set ps_timekeep to 0 before user uvm_map for randomization
> >   - map timekeep before fixup. confirmed by naddy@ that it fixes NULL init
> >   - initialize va. clarified by kettenis@
> > 
> > How's the magical max skew value research going? Do we have a value yet?
> > 
> > Paul
> 
> I think we should pick 100 for now and then we can adjust it later if needed.
> 
> Of course this depends on kettenis' lfence diff so that amd ryzen tsc is sane.

I looked at dmesglog and the reported values are indeed small. 99 was
the highest on an Atom. I updated the diff to 100. I think we can adapt
this as we get more reports (if ever).

OK?

diff --git lib/libc/arch/aarch64/gen/Makefile.inc 
lib/libc/arch/aarch64/gen/Makefile.inc
index a7b1b73f3ef..ee198f5d611 100644
--- lib/libc/arch/aarch64/gen/Makefile.inc
+++ lib/libc/arch/aarch64/gen/Makefile.inc
@@ -9,4 +9,4 @@ SRCS+=  fpgetmask.c fpgetround.c fpgetsticky.c
 SRCS+= fpsetmask.c fpsetround.c fpsetsticky.c
 SRCS+= fpclassifyl.c
 SRCS+= isfinitel.c isinfl.c isnanl.c isnormall.c
-SRCS+= signbitl.c
+SRCS+= signbitl.c usertc.c
diff --git lib/libc/arch/aarch64/gen/usertc.c lib/libc/arch/aarch64/gen/usertc.c
new file mode 100644
index 000..6551854a010
--- /dev/null
+++ lib/libc/arch/aarch64/gen/usertc.c
@@ -0,0 +1,21 @@
+/* $OpenBSD$   */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+
+int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
diff --git lib/libc/arch/alpha/gen/Makefile.inc 
lib/libc/arch/alpha/gen/Makefile.inc
index a44599d2cab..2a8abd32b61 100644
--- lib/libc/arch/alpha/gen/Makefile.inc
+++ lib/libc/arch/alpha/gen/Makefile.inc
@@ -3,5 +3,5 @@
 
 SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.c nan.c setjmp.S
 SRCS+= flt_rounds.c fpgetmask.c fpgetround.c fpgetsticky.c fpsetmask.c \
-   fpsetround.c fpsetsticky.c
+   fpsetround.c fpsetsticky.c usertc.c
 SRCS+= sigsetjmp.S
diff --git lib/libc/arch/alpha/gen/usertc.c lib/libc/arch/alpha/gen/usertc.c
new file mode 100644
index 000..6551854a010
--- /dev/null
+++ lib/libc/arch/alpha/gen/usertc.c
@@ -0,0 +1,21 @@
+/* $OpenBSD$   */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+ */
+
+#include 
+#include 
+
+int (*const _tc_get_timecount)(struct timekeep *, u_int *) = NULL;
diff --git lib/libc/arch/amd64/gen/Makefile.inc 
lib/libc/arch/amd64/gen/Makefile.inc
index e995309ed71..f6349e2b974 100644
--- lib/libc/arch/amd64/gen/Makefile.inc
+++ lib/libc/arch/amd64/gen/Makefile.inc
@@ -2,6 +2,7 @@
 
 SRCS+= _setjmp.S fabs.S infinity.c ldexp.c modf.S nan.c setjmp.S \
sigsetjmp.S
-SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c
+SRCS+= fpclassifyl.c isfinitel.c isinfl.c isnanl.c isnormall.c signbitl.c \
+   usertc.c
 SRCS+= flt_rounds.S fpgetmask.S fpgetround.S fpgetsticky.S fpsetmask.S \
fpsetround.S fpsetsticky.S
diff --git lib/libc/arch/amd64/gen/usertc.c lib/libc/arch/amd64/gen/usertc.c
new file mode 100644
index 000..56016c8eca1
--- /dev/null
+++ lib/libc/arch/amd64/gen/usertc.c
@@ -0,0 +1,41 @@
+/* $OpenBSD$ */
+/*
+ * Copyright (c) 2020 Paul Irofti 
+ *
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission no

Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > Updated diff. 
> > 
> > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > store index in session and be sure if_get(9) returned `ifnet' is our
> > original `ifnet'.
> 
> Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> for whatever reason then the if_get(9) interface has to be fixed.  Which
> case doesn't work for you?  Do you have a reproducer?  
> 
> How does sessions stay around if their corresponding interface is
> destroyed?

We have `pipexinq' and `pipexoutq' which can store pointers to session.
pipexintr() process these queues. pipexintr() and
pipex_destroy_session() are *always* different context. This mean we
*can't* free pipex(4) session without be sure there is no reference to
this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
afret free issue. Look please at net/pipex.c:846. The way pppx(4)
destroy sessions is wrong. While pppac(4) destroys sessions by
pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
and `pipexoutq' state. I'am said it again and again.

Look at net/pipex.c:777

Static int
pipex_ppp_enqueue(struct mbuf *m0, struct pipex_session *session,
struct mbuf_queue *mq)
{
m0->m_pkthdr.ph_cookie = session; /* [1] */
/* XXX need to support other protocols */
m0->m_pkthdr.ph_ppp_proto = PPP_IP;

if (mq_enqueue(mq, m0) != 0) /* [2] */
return (1);

schednetisr(NETISR_PIPEX); /* [3] */

return (0);
}

and net/pipex.c:643

Static int
pipex_destroy_session(struct pipex_session *session)
{
struct radix_node *rn;

/* remove from radix tree and hash chain */
NET_ASSERT_LOCKED();

if (!in_nullhost(session->ip_address.sin_addr)) {
rn = rn_delete(&session->ip_address, &session->ip_netmask,
pipex_rd_head4, (struct radix_node *)session);
KASSERT(rn != NULL);
}

pipex_unlink_session(session);
pool_put(&pipex_session_pool, session); /* [4] */

return (0);
}

and net/pipex.c:720

void
pipexintr(void)
{
struct pipex_session *pkt_session;
u_int16_t proto;
struct mbuf *m;
struct mbuf_list ml;

/* ppp output */
mq_delist(&pipexoutq, &ml);
while ((m = ml_dequeue(&ml)) != NULL) {
pkt_session = m->m_pkthdr.ph_cookie; /* [5] */
/* ... */
pipex_ppp_output(m, pkt_session, proto);
/* [7] */
}
/* ... */
mq_delist(&pipexinq, &ml);
while ((m = ml_dequeue(&ml)) != NULL) {
pkt_session = m->m_pkthdr.ph_cookie; /* [6] */
/* ... */
pipex_ppp_input(m, pkt_session, 0);
/* [8] */
}
}

and net/pipex.c:808

Static void
pipex_timer(void *ignored_arg)
{
struct pipex_session *session, *session_tmp;

timeout_add_sec(&pipex_timer_ch, pipex_prune);

NET_LOCK();
/* walk through */
LIST_FOREACH_SAFE(session, &pipex_session_list, session_list,
session_tmp) {
switch (session->state) {
/* ... */
case PIPEX_STATE_CLOSED:
/*
 * mbuf queued in pipexinq or pipexoutq may have a
 * refererce to this session.
 */
/* [9] */
if (!mq_empty(&pipexinq) || !mq_empty(&pipexoutq))
continue;

pipex_destroy_session(session);
break;
/* ... */
}

1. You pass pointer to mbuf
2. You enqueue this mbuf to `pipexinq' or `pipexoutq'
3. You call task_add();

Is it * always * garanteed that netisr() will be called here? What
will be happened if netisr will be called after [4]?

4. You destroy session.

5. You use memory pointed by `pkt_session' It's netisr context. Are you
shure `ph_cookie' is still valid?
6. You use memory pointed by `pkt_session' It's netisr context. Are you
shure `ph_cookie' is still valid?

Also we are going to move pipexintr() out of KERNEL_LOCK() and
NET_LOCK(). This means pipex_destroy_session() and pipexintr() will be
executed simultaneously. How to protect `ph_cookie'?

I guess to use reference counters for pipex(4) sessions. We will grab
extra reference before [1] and release this reference at [7] or [8].

I see absolutele *no* reason to sleep at [9]. So refcnt_finalize() is
not applicable here. I want to use reference counters in the way of
file(9). This means session can live after userland destroy it because
it has reference in `pipexinq' or `pipexoutq'.

But userland will not only release it's reference to session. It will
destroy `ifnet' too. And you will have session referenced by `pipe

Re: make btrace interval event to reciprocal of ticks

2020-06-26 Thread Martin Pieuchot
On 25/06/20(Thu) 23:45, Yuichiro NAITO wrote:
> From: Martin Pieuchot 
> Subject: Re: make btrace interval event to reciprocal of ticks
> Date: Thu, 25 Jun 2020 11:36:48 +0200
> 
> > On 23/06/20(Tue) 12:04, Yuichiro NAITO wrote:
> >> Current btrace has `interval:hz:1` probe that makes periodically events.
> >> `interval:hz:1` looks to make events once per second (= 1Hz),
> >> but current implementation makes once per tick (= 100Hz on amd64).
> >> I think the interval should be counted by reciprocal of ticks so that fit 
> >> to Hz.
> > 
> > Indeed the current implementations assumes it's a number of ticks.  How
> > does other tracing tool behave?  Is the behavior you suggest similar to
> > bpftrace(8) or dtrace(1)?
> 
> I don't know about bpftrace(8).
> Dtrace has interval timer in Hz as Lauri says.

Seems to be the same, so I'll commit your diff.

> > Would it be complicated to add support for other units, like 'ms' or
> > 'sec'?  In that case 'profile:s:10' would mean every 10sec while
> > 'profile:hz:10' would mean every ~6sec, right?
> 
> I feel it's ok to just rename 'interval:hz' to 'interval:ticks' with
> no implementation change. (but I hope that 100 is allowed.)
> We can know the tick interval from 'hz' in `sysctl -n kern.clockrate`.
> So it's easy to understand that 'interval:ticks:N' fires on every N ticks.
> 
> And tick resolution is not so high.
> In my patch, 51 Hz ~ 100 Hz is calculated to 10 ms (=1 tick) interval.
> It might confuse some users.
> 
> In my usecase, I want 10ms ~ 1sec intervals.
> If N is allowed over 99, 'interval:ticks:N' can be useful to me.

Well ideally we should be able to specify an interval in second or
millisecond.  If you or somebody else wants to implement that, I
suppose it makes sense now to change `dtrq_rate' to be a uint64_t
and encode the interval in nanosecond.



11n Tx aggregation for iwm(4)

2020-06-26 Thread Stefan Sperling
This patch adds support for 11n Tx aggregation to iwm(4).

Please help with testing if you can by running the patch and using wifi
as usual. Nothing should change, except that Tx speed may potentially
improve. If you have time to run before/after performance measurements with
tcpbench or such, that would be nice. But it's not required for testing.

If Tx aggregation is active then netstat will show a non-zero output block ack
agreement counter:

$ netstat -W iwm0 | grep 'output block'
3 new output block ack agreements
0 output block ack agreements timed out

It would be great to get at least one test for all the chipsets the driver
supports: 7260, 7265, 3160, 3165, 3168, 8260, 8265, 9260, 9560
The behaviour of the access point also matters a great deal. It won't
hurt to test the same chipset against several different access points.

I have tested this version on 8265 only so far. I've run older revisions
of this patch on 7265 so I'm confident that this chip will work, too.
So far, the APs I have tested against are athn(4) in 11a mode and in 11n
mode with the 'nomimo' nwflag, and a Sagemcom 11ac AP. All on 5Ghz channels.

diff refs/heads/master refs/heads/txagg
blob - 3a75d07a60a7eb4c66540474e47aeffd7a85250a
blob + 853bdd1290ad509f5fce7b5bf20550f458a2b460
--- sys/dev/pci/if_iwm.c
+++ sys/dev/pci/if_iwm.c
@@ -144,6 +144,8 @@
 #include 
 #include 
 #include 
+#include  /* for SEQ_LT */
+#undef DPRINTF /* defined in ieee80211_priv.h */
 
 #define DEVNAME(_s)((_s)->sc_dev.dv_xname)
 
@@ -299,7 +301,8 @@ int iwm_nic_rx_mq_init(struct iwm_softc *);
 intiwm_nic_tx_init(struct iwm_softc *);
 intiwm_nic_init(struct iwm_softc *);
 intiwm_enable_ac_txq(struct iwm_softc *, int, int);
-intiwm_enable_txq(struct iwm_softc *, int, int, int);
+intiwm_enable_txq(struct iwm_softc *, int, int, int, int, uint8_t,
+   uint16_t);
 intiwm_post_alive(struct iwm_softc *);
 struct iwm_phy_db_entry *iwm_phy_db_get_section(struct iwm_softc *, uint16_t,
uint16_t);
@@ -334,12 +337,12 @@ void  iwm_ampdu_rx_stop(struct ieee80211com *, struct 
i
uint8_t);
 void   iwm_sta_rx_agg(struct iwm_softc *, struct ieee80211_node *, uint8_t,
uint16_t, uint16_t, int);
-#ifdef notyet
+void   iwm_sta_tx_agg(struct iwm_softc *, struct ieee80211_node *, uint8_t,
+   uint16_t, uint16_t, int);
 intiwm_ampdu_tx_start(struct ieee80211com *, struct ieee80211_node *,
uint8_t);
 void   iwm_ampdu_tx_stop(struct ieee80211com *, struct ieee80211_node *,
uint8_t);
-#endif
 void   iwm_ba_task(void *);
 
 intiwm_parse_nvm_data(struct iwm_softc *, const uint16_t *,
@@ -372,14 +375,25 @@ int   iwm_rxmq_get_signal_strength(struct iwm_softc 
*, s
 void   iwm_rx_rx_phy_cmd(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
 intiwm_get_noise(const struct iwm_statistics_rx_non_phy *);
+void   iwm_txq_advance(struct iwm_softc *, struct iwm_tx_ring *, int);
+void   iwm_ampdu_tx_done(struct iwm_softc *, struct iwm_cmd_header *,
+   struct iwm_node *, struct iwm_tx_ring *, uint32_t, uint8_t,
+   uint8_t, uint16_t, int, struct iwm_agg_tx_status *);
 intiwm_ccmp_decap(struct iwm_softc *, struct mbuf *,
struct ieee80211_node *);
 void   iwm_rx_frame(struct iwm_softc *, struct mbuf *, int, uint32_t, int, int,
uint32_t, struct ieee80211_rxinfo *, struct mbuf_list *);
-void   iwm_rx_tx_cmd_single(struct iwm_softc *, struct iwm_rx_packet *,
-   struct iwm_node *, int, int);
+void   iwm_rx_tx_cmd_single(struct iwm_softc *, struct iwm_tx_resp *,
+   struct iwm_node *, int, int, int);
+void   iwm_txd_done(struct iwm_softc *, struct iwm_tx_data *);
 void   iwm_rx_tx_cmd(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
+void   iwm_clear_oactive(struct iwm_softc *, struct iwm_tx_ring *);
+void   iwm_mira_choose(struct iwm_softc *, struct ieee80211_node *);
+void   iwm_ampdu_rate_control(struct iwm_softc *, struct ieee80211_node *,
+   struct iwm_tx_ring *, int, uint16_t, uint16_t);
+void   iwm_rx_ba(struct iwm_softc *, struct iwm_rx_packet *,
+   struct iwm_rx_data *);
 void   iwm_rx_bmiss(struct iwm_softc *, struct iwm_rx_packet *,
struct iwm_rx_data *);
 intiwm_binding_cmd(struct iwm_softc *, struct iwm_node *, uint32_t);
@@ -399,6 +413,7 @@ int iwm_send_cmd_pdu_status(struct iwm_softc *, uint32
 void   iwm_free_resp(struct iwm_softc *, struct iwm_host_cmd *);
 void   iwm_cmd_done(struct iwm_softc *, int, int, int);
 void   iwm_update_sched(struct iwm_softc *, int, int, uint8_t, uint16_t);
+void   iwm_reset_sched(struct iwm_softc *, int, int, uint8_t);
 const struct iwm_rate *iwm_tx_fill_cmd(struct iwm_softc *, struct iwm_node *,
struct ieee80211_frame *, struct iwm_tx_cmd *);
 intiwm_tx(struct iwm_softc *, struct mbuf *, struct ieee80211_node *, int);
@@ -1306,17 +1321,17 @@ iwm_alloc_tx_r

Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Martin Pieuchot
On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > Updated diff. 
> > > 
> > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > original `ifnet'.
> > 
> > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > case doesn't work for you?  Do you have a reproducer?  
> > 
> > How does sessions stay around if their corresponding interface is
> > destroyed?
> 
> We have `pipexinq' and `pipexoutq' which can store pointers to session.
> pipexintr() process these queues. pipexintr() and
> pipex_destroy_session() are *always* different context. This mean we
> *can't* free pipex(4) session without be sure there is no reference to
> this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> destroy sessions is wrong. While pppac(4) destroys sessions by
> pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> and `pipexoutq' state. I'am said it again and again.

I understand.  Why is it a problem?  Using reference counting the way
you're suggesting is *one* possible solution to a problem we don't fully
understand.  What are we trying to achieve?  Which problem are we trying
to solve?

Are we trying to move pipexintr() out of KERNEL_LOCK()?  If so are we
trying to guarantee the memory pointed by `ph_cookie' representing a
session pointer is always valid without the KERNEL_LOCK()? 

If that's what we're trying to do, shouldn't we start with a baby-step
and trade the KERNEL_LOCK() for another lock first?  I understand it is
not some sexy-mp-rcu-atomic-crazy-refcount-clever solution, but it is
simpler to do.

Since all the packets (mbuf) are being processed with the NET_LOCK(). we
should be able to start trading the KERNEL_LOCK() by the NET_LOCK(), no?
Once that works and we've built knowledge about the existing architecture
we might consider some finer grain locking.

Is there any downside to this suggestion?



update vxlan(4) man

2020-06-26 Thread Denis Fondras
vxlan(4) actually supports IPv6 tunnel endpoints.
Only multicast endpoints are unsupported.

Index: vxlan.4
===
RCS file: /cvs/src/share/man/man4/vxlan.4,v
retrieving revision 1.8
diff -u -p -r1.8 vxlan.4
--- vxlan.4 22 Nov 2018 17:31:11 -  1.8
+++ vxlan.4 26 Jun 2020 11:52:32 -
@@ -176,4 +176,5 @@ decreased MTU of 1450 bytes.
 In any other case, it is commonly recommended to set the MTU of the
 transport interfaces to at least 1600 bytes.
 .Pp
-The implementation does not support IPv6 tunnel endpoints at present.
+The implementation does not support IPv6 multicast tunnel endpoints at
+present.



Re: Destructive Install Process (fdisk)

2020-06-26 Thread David Blevins
OpenBSD Community,

Thanks for the feedback, some people that write the multiboot
instructions reached out to me and I'm working through them.

>In order to use GPT you need a BIOS that supports it. In most of the
>cases, this means an UEFI BIOS.
I'm aware of the issues related to UEFI (particularly on older
hardware like mine) and the obnoxious solutions to the problem
at a systems programming level.

The problem looks like it resides in Grub and Linux (perhaps
just Debian systems) and I'm examining if there's a simple
way to reuse existing systems to work my solution. Of course
I don't expect the solutions to be integrated due to a variety
of factors.  Regardless, the multi-boot scenario is still sufficiently
compelling for me to keep examining the issue.

Initially I was surprised to find how difficult it has been to get the
OS to multiboot via grub, but I have been finding this less surprising
as I've looked into the details (particularly from the GNU end).

>...far easier to install Windows after OpenBSD and get both working
>again than installing Windows after installing Grub/Linux.
I install windows on a separate disk to avoid these issues.  It is
indeed easier to put Debian on the drive *after* installing OpenBSD.



Re: Destructive Install Process (fdisk)

2020-06-26 Thread Damien Couderc

Le 25/06/2020 à 22:41, David Blevins a écrit :

List,

Apologies, I left HTML mode enabled on my web client.
I understand that a Linux dmesg is probably not very useful.
I can provide a different set of system information (or an
OpenBSD dmesg from an MBR boot) but I'm not sure what
set of information would be most useful in determining my
issue regarding GPT boot on this particular system.


Hi David,

In order to use GPT you need a BIOS that supports it. In most of the 
cases, this means an UEFI BIOS.


Concerning multiboot:

- Using MBR, you have a plethora of boot managers that will work. You 
just need to select "edit" in the installer instead of "whole" to keep 
existing partitions.
- Using GPT/EFI, there is less options but I'd recommend rEFInd. You 
also have to edit partition table if you want to keep the existing 
partitions.


In both cases, a system upgrade of one of the OS present on the disk 
could result in a modified boot sequence (OpenBSD will overwrite UEFI 
boot loaders at each upgrade, Windows is well known to set itself as 
default boot at install, ..). If you setup a multiboot don't expect any 
help from OpenBSD, you are on your own.



Regards,

Damien



Re: Destructive Install Process (fdisk)

2020-06-26 Thread Kevin Chadwick
On 2020-06-25 20:16, Theo de Raadt wrote:
> I'd say that I simply don't see why the installer destructively 
> re-arranges the disk's scheme prior to officially choosing to write
> the new partitioning scheme to the disk.

I'm not sure that I believe that and it shows you what YOU are about to commit!

AFAIK grub is called grub for a reason. In that it needed more disk space than
the MBR offered. IOW grub is far more destructive than OpenBSD or the Windows
installer. The fact that Linux distros detect and fix that destruction doesn't
change that. In my experience (less GPT), it is far easier to install Windows
after OpenBSD and get both working again than installing Windows after
installing Grub/Linux.



Re: [PATCH] fast conditional console scrolling

2020-06-26 Thread Paul de Weerd
Hi John,

I tried your diff.  I don't quite see the same 3x improvement that you
report, more like 2x.  I timed 7 runs of ls -R /usr/ports:

Before diff, time ls -R /usr/ports | wc -l  2.897s on average
After diff,  time ls -R /usr/ports | wc -l  2.707s on average

Before diff, time ls -R /usr/ports  2m53.067 on average
After diff, time ls -R /usr/ports   1m30.387 on average

Note that the 'before diff' runs were with a snapshot kernel.  There
may be diffs in there that account for the difference between before
and after of the no-output runs.  See dmesg and full stats below.

So, on average, a speed-up of ~48%.

Thanks!

Paul 'WEiRD' de Weerd

--- full stats ---
pre-diff, no output post-diff, no output
realusersystem  realusersystem
02.94   00.58   02.40   02.70   00.58   02.12
02.88   00.56   02.37   02.71   00.39   02.32
03.03   00.46   02.60   02.70   00.43   02.26
02.85   00.52   02.36   02.69   00.54   02.18
02.88   00.45   02.43   02.62   00.53   02.10
02.87   00.50   02.38   02.72   00.62   02.11
02.83   00.57   02.29   02.81   00.45   02.36

pre-diff, with output   post-diff, with output
realusersystem  realusersystem
2m53.17 00.90   2m52.27 1m30.81 01.23   1m29.50
2m53.12 00.81   2m52.31 1m30.58 01.33   1m29.30
2m53.01 00.88   2m52.11 1m30.49 01.11   1m29.40
2m53.06 01.03   2m52.00 1m30.53 01.29   1m29.26
2m52.99 00.80   2m52.24 1m30.27 01.08   1m29.19
2m53.11 00.96   2m52.16 1m30.40 01.14   1m29.27
2m53.01 00.79   2m52,28 1m30.33 01.11   1m29.24
--

--- dmesg 
OpenBSD 6.7-current (GENERIC.MP) #296: Wed Jun 24 11:34:44 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 34243903488 (32657MB)
avail mem = 33191059456 (31653MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xec410 (88 entries)
bios0: vendor Dell Inc. version "A22" date 02/01/2018
bios0: Dell Inc. OptiPlex 9020
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT SLIC LPIT SSDT SSDT SSDT HPET SSDT MCFG SSDT 
ASF! DMAR
acpi0: wakeup devices UAR1(S3) RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP05(S4) 
PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S3) EHC2(S3) XHC_(S4) 
HDEF(S4) PEG0(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3692.06 MHz, 06-3c-03
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: TSC skew=0 observed drift=0
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: TSC skew=1 observed drift=0
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: TSC skew=12 observed drift=0
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Core(TM) i7-4770 CPU @ 3.40GHz, 3691.46 MHz, 06-3c-03
cpu3: 
FPU

Re: pipex(4): use reference counters for `ifnet'

2020-06-26 Thread Martin Pieuchot
On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> Updated diff. 
> 
> OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> store index in session and be sure if_get(9) returned `ifnet' is our
> original `ifnet'.

Why not?  The point of if_get(9) is to be sure.  If that doesn't work
for whatever reason then the if_get(9) interface has to be fixed.  Which
case doesn't work for you?  Do you have a reproducer?  

How does sessions stay around if their corresponding interface is
destroyed?

> Now each pipex(4) session has it's own reference to `ifnet'. Also pppoe
> related sessions has the reference to it's ethernet interface.
> 
> All `ifnet's are obtained by if_get(9) and we use `if_index' from stored
> reference to `ifnet'. 

That's not how the API is meant to be used.  All if_get(9) must have a
corresponding if_put(9) in the same context.  We don't store pointers
put interface indexes, meaning:

> Index: net/pipex.c
> ===
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 pipex.c
> --- net/pipex.c   22 Jun 2020 09:38:15 -  1.116
> +++ net/pipex.c   25 Jun 2020 16:41:25 -
> @@ -148,7 +148,7 @@ pipex_iface_init(struct pipex_iface_cont
>   struct pipex_session *session;
>  
>   pipex_iface->pipexmode = 0;
> - pipex_iface->ifnet_this = ifp;
> + pipex_iface->ifnet_this = if_get(ifp->if_index);

`ifnet_this' should be replaced by `ifidx' so this becomes:

pipex_iface->ifidx = ifp->if_index.