Re: IPv4 on ix(4) slow/nothing - 7.4

2023-10-18 Thread Hrvoje Popovski
On 18.10.2023. 15:35, Mischa wrote:
> Hi All,
> 
> Just upgraded a couple of machines to 7.4. smooth as always!!
> 
> I am however seeing issues with IPv4, slowness or no throughput at all.
> The machines I have upgraded are using an Intel X540-T network card and
> are connected at 10G.
> 
> ix0 at pci5 dev 0 function 0 "Intel X540T" rev 0x01, msix, 16 queues,
> address b8:ca:3a:62:ee:40
> ix1 at pci5 dev 0 function 1 "Intel X540T" rev 0x01, msix, 16 queues,
> address b8:ca:3a:62:ee:42
> 
> root@n2:~ # sysctl kern.version
> kern.version=OpenBSD 7.4 (GENERIC.MP) #1397: Tue Oct 10 09:02:37 MDT 2023
>     dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> There are a bunch of VMs running on top of it. As soon as I want to
> fetch something with ftp, for example, I don't get anything over IPv4,
> while with IPv6 everything is normal.
> 
> mischa@www2:~ $ ftp -4
> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
> Trying 46.23.88.18...
> Requesting
> https://mirror.openbsd.amsterdam/pub/OpenBSD/7.4/amd64/install74.iso
>   0% | |   512 KB  - stalled
> -^C
> 
> A trace on mirror / n2:
> 
> n2:~ # tcpdump -i vport880 host 46.23.88.32
> tcpdump: listening on vport880, link-type EN10MB
> 15:16:08.730274 www2.high5.nl.1828 > n2.high5.nl.https: S
> 2182224746:2182224746(0) win 16384  6,nop,nop,timestamp 2899683458 0> (DF)
> 15:16:08.730297 arp who-has www2.high5.nl tell n2.high5.nl
> 15:16:08.731535 arp reply www2.high5.nl is-at fe:51:bb:1e:12:11
> 15:16:08.731540 n2.high5.nl.https > www2.high5.nl.1828: S
> 633749938:633749938(0) ack 2182224747 win 16384  1460,nop,nop,sackOK,nop,wscale 6,nop,nop,timestamp 3129955106
> 2899683458> (DF)
> 15:16:08.732017 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1 win 256
>  (DF)
> 15:16:08.785752 www2.high5.nl.1828 > n2.high5.nl.https: P 1:312(311) ack
> 1 win 256  (DF)
> 15:16:08.786092 n2.high5.nl.https > www2.high5.nl.1828: P 1:128(127) ack
> 312 win 271  (DF)
> 15:16:08.786376 n2.high5.nl.https > www2.high5.nl.1828: P 128:134(6) ack
> 312 win 271  (DF)
> 15:16:08.786396 n2.high5.nl.https > www2.high5.nl.1828: P 134:166(32)
> ack 312 win 271  (DF)
> 15:16:08.786455 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
> ack 312 win 271  (DF)
> 15:16:08.786457 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899683510> (DF)
> 15:16:08.786460 n2.high5.nl.https > www2.high5.nl.1828: P 3062:3803(741)
> ack 312 win 271  (DF)
> 15:16:08.786943 www2.high5.nl.1828 > n2.high5.nl.https: . ack 134 win
> 255  (DF)
> 15:16:08.796534 n2.high5.nl.https > www2.high5.nl.1828: P 3803:4345(542)
> ack 312 win 271  (DF)
> 15:16:08.796577 n2.high5.nl.https > www2.high5.nl.1828: P 4345:4403(58)
> ack 312 win 271  (DF)
> 15:16:08.797518 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
> 256 > (DF)
> 15:16:08.797522 www2.high5.nl.1828 > n2.high5.nl.https: . ack 166 win
> 256 > (DF)
> 15:16:09.790297 n2.high5.nl.https > www2.high5.nl.1828: . 166:1614(1448)
> ack 312 win 271  (DF)
> 15:16:09.790902 www2.high5.nl.1828 > n2.high5.nl.https: . ack 1614 win
> 233 > (DF)
> 15:16:09.790917 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899684519> (DF)
> 15:16:09.790923 n2.high5.nl.https > www2.high5.nl.1828: P
> 3062:4403(1341) ack 312 win 271  2899684519> (DF)
> 15:16:10.790299 n2.high5.nl.https > www2.high5.nl.1828: .
> 1614:3062(1448) ack 312 win 271  2899684519> (DF)
> 15:16:10.791204 www2.high5.nl.1828 > n2.high5.nl.https: . ack 3062 win
> 233 > (DF)
> 15:16:10.791223 n2.high5.nl.https > www2.high5.nl.1828: P
> 3062:4403(1341) ack 312 win 271  2899685520> (DF)
> 15:16:10.791692 www2.high5.nl.1828 > n2.high5.nl.https: . ack 4403 win
> 235  (DF)
> 15:16:10.802647 www2.high5.nl.1828 > n2.high5.nl.https: P 312:318(6) ack
> 4403 win 256  (DF)
> 15:16:11.000297 n2.high5.nl.https > www2.high5.nl.1828: . ack 318 win
> 271  (DF)
> 15:16:11.001162 www2.high5.nl.1828 > n2.high5.nl.https: P 318:527(209)
> ack 4403 win 256  (DF)
> 15:16:11.001860 n2.high5.nl.https > www2.high5.nl.1828: P 4403:5059(656)
> ack 527 win 271  (DF)
> 15:16:11.001989 n2.high5.nl.https > www2.high5.nl.1828: .
> 5059:6507(1448) ack 527 win 271  2899685730> (DF)
> 15:16:11.001992 n2.high5.nl.https > www2.high5.nl.1828: .
> 6507:7955(1448) ack 527 win 271  2899685730> (DF)
> 15:16:11.195431 www2.high5.nl.1828 > n2.high5.nl.https: . ack 5059 win
> 256  (DF)
> 15:16:11.195447 n2.high5.nl.https > www2.high5.nl.1828: .
> 7955:9403(1448) ack 527 win 271  2899685924> (DF)
> 
> 
> Running a trace on www2 I am seeing:
> 
> www2:~ # tcpdump -i vio0 host 46.23.88.18
> tcpdump: listening on vio0, link-type EN10MB
> 15:16:08.729974 www2.high5.nl.1828 > n2.high5.nl.https: S
> 2182224746:2182224746(0) win 16384  6,nop,nop,timestamp 2899683458 0> (DF)
> 15:16:08.731114 arp who-has www2.high5.nl tell n2.high5.nl
> 15:16:08.731229 arp reply www2.high5.nl is-at fe:51:bb:1e:12:11
> 15:16:08.731631 n2.high5.nl.https > 

Re: Please test: make ipsec(4) timeouts mpsafe

2023-10-17 Thread Hrvoje Popovski
On 17.10.2023. 1:07, Vitaliy Makkoveev wrote:
>> On 13 Oct 2023, at 18:40, Hrvoje Popovski  wrote:
>>
>> On 12.10.2023. 20:10, Vitaliy Makkoveev wrote:
>>> Hi, MP safe process timeouts were landed to the tree, so it's time to
>>> test them with the network stack :) The diff below makes the tdb and ids
>>> garbage collector timeout handlers run without the kernel lock. Not for
>>> commit, just sharing this for tests if someone is interested.
>>
>> Hi,
>>
>> with this diff it seems a little slower than without it:
>> 165Kpps with diff
>> 200Kpps without diff
>>
> 
> Hi,
> 
> Thanks for testing. I'm interested in the slower results. I suspect
> an enabled/disabled POOL_DEBUG effect. The patched and unpatched builds
> were made from the same sources?

Hi,

I'm running the same source with and without the diff, and with
kern.pool_debug=0.
I've tried a few times; sometimes I get the same results, but mostly
it's a little slower ...
I have 2 identical servers directly connected with 10G X540-T and one
ipsec tunnel through that 10G interface.
I'm testing like this:
compile kernel from source - test tunnel
apply your diff to the same source - test tunnel



Re: l2vpn pseudowire and bridge type interface

2023-10-14 Thread Hrvoje Popovski
On 14.10.2023. 11:07, Wouter Prins wrote:
> hello list,
> 
> Was wondering if the veb interface is supported as a bridge for pseudowires?
> The manpage doesn't mention anything about the type of
> bridge interface required (bridge/veb)?
> 
> /Wouter

Hi,

you could try tpmr(4) ...
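An untested sketch of what that could look like, assuming tpmr(4) takes
members via "add" like veb(4) and that the pseudowire appears as an
mpw(4) interface; the interface names are placeholders, check the
manpages before use:

```
# /etc/hostname.tpmr0 (hypothetical)
add mpw0
add em1
up
```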

This question is more for misc@



Re: Please test: make ipsec(4) timeouts mpsafe

2023-10-13 Thread Hrvoje Popovski
On 12.10.2023. 20:10, Vitaliy Makkoveev wrote:
> Hi, MP safe process timeouts were landed to the tree, so it's time to
> test them with the network stack :) The diff below makes the tdb and ids
> garbage collector timeout handlers run without the kernel lock. Not for
> commit, just sharing this for tests if someone is interested.

Hi,

with this diff it seems a little slower than without it:
165Kpps with diff
200Kpps without diff


test1
ike esp from 10.221.0.0/16 to 10.222.0.0/16 \
local 192.168.1.1 peer 192.168.1.2 \
main auth hmac-sha1 enc aes group modp1024 lifetime 3m \
quick enc aes-128-gcm group modp1024 lifetime 1m \
psk "123"

test2
ike esp from 10.222.0.0/16 to 10.221.0.0/16 \
local 192.168.1.2 peer 192.168.1.1 \
main auth hmac-sha1 enc aes group modp1024 lifetime 3m \
quick enc aes-128-gcm group modp1024 lifetime 1m \
psk "123"

I'm sending random /24 UDP traffic from a host connected to the test1 box
through the tunnel to a host connected to the test2 box ...
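Not part of the original test setup, but a rough Python sketch of such a
random-destination UDP generator: the 10.222.0.0/16 prefix comes from the
ipsec.conf above, while the port, payload size and packet count are
made-up placeholders.

```python
import random
import socket

def random_dst(prefix="10.222", port=9):
    """Pick a random host address inside the /16 behind the tunnel."""
    return ("%s.%d.%d" % (prefix,
                          random.randint(0, 255),
                          random.randint(1, 254)), port)

def blast(count=1000, payload=b"x" * 64):
    """Send `count` UDP datagrams, each to a freshly picked address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dsts = [random_dst() for _ in range(count)]
    for dst in dsts:
        s.sendto(payload, dst)
    s.close()
    return dsts
```

In practice a packet generator (or iperf in UDP mode) gives better rates;
this only illustrates the traffic pattern being described.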


test1 - top -SHs1
  PID  TID PRI NICE  SIZE   RES STATE WAIT  TIMECPU COMMAND
20980   359894  1400K 1004K sleep/3   netlock   2:26 46.58% softnet3
54870   346439  1400K 1004K sleep/3   netlock   2:24 42.33% softnet4
65020   320085  4200K 1004K onproc/1  - 2:22 41.60% softnet5
 3723   371456  4500K 1004K onproc/5  - 2:22 40.67% softnet1
16879   500721  4300K 1004K onproc/4  - 2:26 39.06% softnet2
 1371   446835  1400K 1004K sleep/2   netlock   0:13  5.37% softnet0



test2 - top -SHs1
  PID  TID PRI NICE  SIZE   RES STATE WAIT  TIMECPU COMMAND
61821   455808  1000K 1004K sleep/4   bored 3:02 86.96% softnet0
77299   594039  1000K 1004K sleep/1   bored 0:33 21.63% softnet4






Re: Dell R7615 kernel protection fault

2023-09-11 Thread Hrvoje Popovski
On 11.9.2023. 6:27, Hrvoje Popovski wrote:
> On 11.9.2023. 2:48, Mike Larkin wrote:
>> On Sun, Sep 10, 2023 at 01:36:33AM +0200, Hrvoje Popovski wrote:
>>> Hi all,
>>>
>>> I've installed latest snapshot with uefi on Dell R7615 with AMD EPYC
>>> 9554P, with some NVMe disks on BOSS-N1 adapter and with Samsung NVMe
>>> disks directly connected to backplane and installation was fast and
>>> without any problems.
>>> But after that machine panics with this message
>>> https://kosjenka.srce.hr/~hrvoje/openbsd/r7615-ddb1.jpg
>>>
>>
>> did it work before on an older snapshot?
>>
> 
> this is a brand new machine and I installed the latest snapshot.
> will try an older snapshot now ...
> 
> 

Hi,

I've tried snapshots from 2023-06-30 and 2023-06-07 and I'm getting the
same kernel protection fault.





Re: Dell R7615 kernel protection fault

2023-09-10 Thread Hrvoje Popovski
On 11.9.2023. 2:48, Mike Larkin wrote:
> On Sun, Sep 10, 2023 at 01:36:33AM +0200, Hrvoje Popovski wrote:
>> Hi all,
>>
>> I've installed latest snapshot with uefi on Dell R7615 with AMD EPYC
>> 9554P, with some NVMe disks on BOSS-N1 adapter and with Samsung NVMe
>> disks directly connected to backplane and installation was fast and
>> without any problems.
>> But after that machine panics with this message
>> https://kosjenka.srce.hr/~hrvoje/openbsd/r7615-ddb1.jpg
>>
> 
> did it work before on an older snapshot?
> 

this is a brand new machine and I installed the latest snapshot.
will try an older snapshot now ...




Dell R7615 kernel protection fault

2023-09-09 Thread Hrvoje Popovski
Hi all,

I've installed the latest snapshot with UEFI on a Dell R7615 with an AMD
EPYC 9554P, with some NVMe disks on a BOSS-N1 adapter and Samsung NVMe
disks directly connected to the backplane, and the installation was fast
and without any problems.
But after that the machine panics with this message
https://kosjenka.srce.hr/~hrvoje/openbsd/r7615-ddb1.jpg

I can't do anything with the keyboard, and I've tried the IPMI console
but I can't get it to work.


BOSS-N1 is in raid1
https://kosjenka.srce.hr/~hrvoje/openbsd/r7615-ramdisk1.jpg

Samsung NVMe connected to backplane
https://kosjenka.srce.hr/~hrvoje/openbsd/r7615-ramdisk2.jpg


I will try to get console output somehow



SK-Hynix NVMe PE8000

2023-09-09 Thread Hrvoje Popovski
Hi all,

in the attachment you can find a diff to add the SK hynix NVMe to pcidevs


before diff
Sep  9 08:44:28 alt-vpn1 /bsd: nvme0 at pci13 dev 0 function 0 vendor
"SK hynix", unknown product 0x2839 rev 0x21: msix, NVMe 1.3
Sep  9 08:44:28 alt-vpn1 /bsd: nvme0: Dell DC NVMe PE8010 RI U.2 960GB,
firmware 1.2.0, serial SJC2N4257I34R2Q19
Sep  9 08:44:28 alt-vpn1 /bsd: scsibus2 at nvme0: 17 targets, initiator 0


after diff
Sep  9 08:51:20 alt-vpn1 /bsd: nvme0 at pci13 dev 0 function 0 "SK hynix
PE8000 NVMe" rev 0x21: msix, NVMe 1.3
Sep  9 08:51:20 alt-vpn1 /bsd: nvme0: Dell DC NVMe PE8010 RI U.2 960GB,
firmware 1.2.0, serial SJC2N4257I34R2Q19
Sep  9 08:51:20 alt-vpn1 /bsd: scsibus2 at nvme0: 17 targets, initiator 0


 129:0:0: SK hynix unknown
0x: Vendor ID: 1c5c, Product ID: 2839
0x0004: Command: 0147, Status: 0010
0x0008: Class: 01 Mass Storage, Subclass: 08 NVM,
Interface: 02, Revision: 21
0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
Cache Line Size: 00
0x0010: BAR mem 64bit addr: 0xbf70/0x8000
0x0018: BAR empty ()
0x001c: BAR empty ()
0x0020: BAR empty ()
0x0024: BAR empty ()
0x0028: Cardbus CIS: 
0x002c: Subsystem Vendor ID: 1028 Product ID: 2144
0x0030: Expansion ROM Base Address: 
0x0038: 
0x003c: Interrupt Pin: 01 Line: ff Min Gnt: 00 Max Lat: 00
0x0080: Capability 0x01: Power Management
State: D0
0x00b0: Capability 0x11: Extended Message Signalled Interrupts
(MSI-X)
Enabled: yes; table size 257 (BAR 0:12288)
0x00c0: Capability 0x10: PCI Express
Max Payload Size: 256 / 512 bytes
Max Read Request Size: 512 bytes
Link Speed: 8.0 / 16.0 GT/s
Link Width: x4 / x4
0x0100: Enhanced Capability 0x01: Advanced Error Reporting
0x0150: Enhanced Capability 0x03: Device Serial Number
Serial Number: 5ddc2935ff2ee4ac
0x0160: Enhanced Capability 0x04: Power Budgeting
0x0300: Enhanced Capability 0x19: Secondary PCIe Capability
0x0400: Enhanced Capability 0x0b: Vendor-Specific
0x0910: Enhanced Capability 0x25: Data Link Feature
0x0920: Enhanced Capability 0x27: Lane Margining at the Receiver
0x09c0: Enhanced Capability 0x26: Physical Layer 16.0 GT/s


from pci.ids

1c5c  SK hynix
2839  PE8000 Series NVMe Solid State Drive
1028 2144  DC NVMe PE8010 RI U.2 960GB

Should the diff somehow also match
Subsystem Vendor ID: 1028, Product ID: 2144?

Index: pcidevs
===
RCS file: /home/cvs/src/sys/dev/pci/pcidevs,v
retrieving revision 1.2050
diff -u -p -u -p -r1.2050 pcidevs
--- pcidevs 7 Sep 2023 02:11:26 -   1.2050
+++ pcidevs 9 Sep 2023 06:40:44 -
@@ -8875,6 +8875,7 @@ product SIS 966_HDA   0x7502  966 HD Audio
 
 /* SK hynix products */
 product SKHYNIX SSD	0x1327	BC501 NVMe
+product SKHYNIX NVMe	0x2839	PE8000 NVMe
 
 /* SMC products */
 product SMC 83C170 0x0005  83C170
Index: pcidevs.h
===
RCS file: /home/cvs/src/sys/dev/pci/pcidevs.h,v
retrieving revision 1.2044
diff -u -p -u -p -r1.2044 pcidevs.h
--- pcidevs.h   7 Sep 2023 02:12:07 -   1.2044
+++ pcidevs.h   9 Sep 2023 06:41:03 -
@@ -2,7 +2,7 @@
  * THIS FILE AUTOMATICALLY GENERATED.  DO NOT EDIT.
  *
  * generated from:
- * OpenBSD: pcidevs,v 1.2049 2023/09/07 01:41:09 jsg Exp 
+ * OpenBSD: pcidevs,v 1.2050 2023/09/07 02:11:26 daniel Exp 
  */
 /* $NetBSD: pcidevs,v 1.30 1997/06/24 06:20:24 thorpej Exp $   */
 
@@ -8880,6 +8880,7 @@
 
 /* SK hynix products */
 #define	PCI_PRODUCT_SKHYNIX_SSD	0x1327		/* BC501 NVMe */
+#define	PCI_PRODUCT_SKHYNIX_NVMe	0x2839		/* PE8000 NVMe */
 
 /* SMC products */
 #define	PCI_PRODUCT_SMC_83C170	0x0005		/* 83C170 */
Index: pcidevs_data.h
===
RCS file: /home/cvs/src/sys/dev/pci/pcidevs_data.h,v
retrieving revision 1.2039
diff -u -p -u -p -r1.2039 pcidevs_data.h
--- pcidevs_data.h  7 Sep 2023 02:12:07 -   1.2039
+++ pcidevs_data.h  9 Sep 2023 06:41:03 -
@@ -2,7 +2,7 @@
  * THIS FILE AUTOMATICALLY GENERATED.  DO NOT EDIT.
  *
  * generated from:
- * OpenBSD: pcidevs,v 1.2049 2023/09/07 01:41:09 jsg Exp 
+ * OpenBSD: pcidevs,v 1.2050 2023/09/07 02:11:26 daniel Exp 
  */
 
 /* $NetBSD: pcidevs,v 1.30 1997/06/24 06:20:24 thorpej Exp $   */
@@ -31906,6 +31906,10 @@ static const struct pci_known_product pc
{
PCI_VENDOR_SKHYNIX, PCI_PRODUCT_SKHYNIX_SSD,
"BC501 NVMe",
+   },
+   {
+   

Re: ix(4) shouldn't crash on memory allocation failure

2023-07-08 Thread Hrvoje Popovski
On 7.7.2023. 11:24, Jonathan Matthew wrote:
> One of the problems described here:
> https://www.mail-archive.com/tech@openbsd.org/msg71790.html
> amounts to ix(4) not checking that it allocated a dma map before trying to 
> free it.
> 
> ok?
> 
> 

Hi,

with this diff the box won't panic when I have the em multiqueue diff
applied and bring ix(4) up ...

without diff
x3550m4# ifconfig ix0 up
ix0: Unable to create Pack DMA map
uvm_fault(0xfd877edf6180, 0xc, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  _bus_dmamap_destroy+0xd:movl0xc(%rsi),%eax
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
* 11781  20784  0 0x3  01K ifconfig
_bus_dmamap_destroy(824a5a40,0) at _bus_dmamap_destroy+0xd
ixgbe_free_receive_buffers(800e8910) at
ixgbe_free_receive_buffers+0xb2
ixgbe_init(800e6000) at ixgbe_init+0x788
ixgbe_ioctl(800e6048,80206910,800021bb6d80) at ixgbe_ioctl+0x327
ifioctl(fd87809651f8,80206910,800021bb6d80,800021bed8a0) at
ifioctl+0x7cc
sys_ioctl(800021bed8a0,800021bb6e90,800021bb6ef0) at
sys_ioctl+0x2c4
syscall(800021bb6f60) at syscall+0x3d4
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7931b1531760, count: 7
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}>


with diff
x3550m4# ifconfig ix0 up
ix0: Unable to create Pack DMA map
ix0: Could not setup receive structures
x3550m4# ifconfig ix1 up
ix1: Unable to create Pack DMA map
ix1: Could not setup receive structures
x3550m4#


x3550m4# vmstat -iz
interrupt   total rate
irq144/com0  3537   11
irq145/com1 00
irq96/acpi0 00
irq97/ppb0  00
irq98/ppb1  00
irq99/ppb2  00
irq114/ix0:000
irq115/ix0:100
irq116/ix0:200
irq117/ix0:300
irq118/ix0:400
irq119/ix0:500
irq120/ix0:600
irq121/ix0:700
irq122/ix0:800
irq123/ix0:900
irq124/ix0:10   00
irq125/ix0:11   00
irq126/ix0  00
irq127/ix1:000
irq128/ix1:100
irq129/ix1:200
irq130/ix1:300
irq131/ix1:400
irq132/ix1:500
irq133/ix1:600
irq134/ix1:700
irq135/ix1:800
irq136/ix1:900
irq137/ix1:10   00
irq138/ix1:11   00
irq139/ix1  00
irq100/ppb3 00
irq101/mfii042327  135
irq102/ehci0   230
irq103/ppb5 00
irq140/em0:0  8732
irq141/em0:1  3261
irq142/em0:210
irq143/em0:3   230
irq146/em0:4   550
irq147/em0:500
irq148/em0:6   240
irq149/em0:760
irq150/em0  20
irq151/em1:020
irq152/em1:100
irq153/em1:200
irq154/em1:300
irq155/em1:400
irq156/em1:500
irq157/em1:600
irq158/em1:700
irq159/em1  20
irq160/em2:020
irq161/em2:100
irq162/em2:200
irq163/em2:300
irq164/em2:400
irq165/em2:500
irq166/em2:600
irq167/em2:700
irq168/em2  20
irq169/em3:020
irq170/em3:100
irq171/em3:200
irq172/em3:300
irq173/em3:400
irq174/em3:500
irq175/em3:600
irq176/em3:7

Re: ifconfig description for wireguard peers

2023-05-24 Thread Hrvoje Popovski
On 23.5.2023. 21:13, Klemens Nanni wrote:
> On Sat, Jan 14, 2023 at 02:28:27PM +, Stuart Henderson wrote:
>> On 2023/01/12 04:49, Mikolaj Kucharski wrote:
>>> Hi,
>>>
>>> Is there anything else which I can do, to help this diff reviwed and
>>> increase the chance of getting in?
>>>
>>> Thread at https://marc.info/?t=16347829861=1=2
>>>
>>> Last version of the diff at
>>> https://marc.info/?l=openbsd-tech=167185582521873=mbox
>> Inlining that for a few comments, otherwise it's ok sthen
> wgdescr[iption] would be consistent with the existing descr[iption].
> At least I keep typing the trailing "r"...
> 
> Then '-wgdescr' and 'wgdescr ""' work and are implemented exactly like
> the interface description equivalents.
> 
> I could use this now in a new VPN setup, so here's a polished diff,
> with the above, missing ifconfig.8 bits written and other nits inline.
> 
> As Theo suggested, I'd drop the wg.4 and leave it to ifconfig.8 proper.
> 
> Feedback?
> 
> Either way, net/wireguard-tools needs a bump/rebuild.
> 

Hi,

this would be a nice feature when running a wg server with lots of
wgpeers.

I've tried this diff and it's working as expected.


Thank you Mikolaj and kn@



Re: syn cache tcp checksum

2023-05-22 Thread Hrvoje Popovski
On 22.5.2023. 22:17, Alexander Bluhm wrote:
> Hi,
> 
> I noticed a missing checksum count in netstat tcp packets sent
> software-checksummed.  Turns out that our syn cache does the checksum
> calculation by hand, instead of the established mechanism in ip
> output.
> 
> Just set the flag M_TCP_CSUM_OUT and let in_proto_cksum_out() do
> the work later.
> 
> While there remove redundant code.  The unhandled af case is handled
> in the first switch statement of the function.

Hi,

from time to time when doing iperf from OpenBSD to Linux, the
"output TSO packets software chopped" counter increases when starting
iperf tests, but not while the tests are running.

smc4# netstat -sp tcp | grep TSO
19 output TSO packets software chopped
66504137 output TSO packets hardware processed
752185165 output TSO packets generated
0 output TSO packets dropped


Because I played with the MTU, I thought it could be some quirk with an
MTU mismatch, TCP fragments or something like that, but that's not the
case. The software chopped counter increases even with aligned MTUs.

It's not that there are any problems with or without this diff; I'm
always getting 10Gbps per TCP flow with HW TSO. I just thought that this
diff would fix that counter, but it didn't.
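When watching whether a counter like this moves between runs, it helps to
diff two samples of the netstat output. A minimal parsing sketch; the
counter names are taken from the output quoted in these mails, while the
sampling command itself is left out:

```python
import re

def parse_tso_lro(text):
    """Return {counter description: count} for TSO/LRO lines
    from `netstat -sp tcp` style output."""
    counters = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+(.*(?:TSO|LRO).*)", line)
        if m:
            counters[m.group(2).strip()] = int(m.group(1))
    return counters

# Example input shaped like the netstat output above.
sample = """\
        19 output TSO packets software chopped
        66504137 output TSO packets hardware processed
"""
print(parse_tso_lro(sample))
```

Subtracting two such dicts taken before and after a test run shows which
counters actually moved during the test.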





Re: Add LRO counter in ix(4)

2023-05-22 Thread Hrvoje Popovski
On 18.5.2023. 10:46, Jan Klemkow wrote:
> On Thu, May 18, 2023 at 12:01:44AM +0200, Alexander Bluhm wrote:
>> On Tue, May 16, 2023 at 09:11:48PM +0200, Jan Klemkow wrote:
>>> @@ -412,6 +412,10 @@ tcp_stats(char *name)
>>> p(tcps_outhwtso, "\t\t%u output TSO packet%s hardware processed\n");
>>> p(tcps_outpkttso, "\t\t%u output TSO packet%s generated\n");
>>> p(tcps_outbadtso, "\t\t%u output TSO packet%s dropped\n");
>>> +   p(tcps_inhwlro, "\t\t%u input LRO generated packet%s from hardware\n");
>>> +   p(tcps_inpktlro, "\t\t%u input LRO coalesced packet%s from hardware\n");
>> ... coalesced packet%s by hardware
> done
> 
>>> +   p(tcps_inbadlro, "\t\t%u input bad LRO packet%s from hardware\n");
>>> +
>> Move this down to the "packets received" section.  You included it
>> in "packets sent".
> done
> 
>>> +   /*
>>> +* This function iterates over interleaved descriptors.
>>> +* Thus, we reuse ph_mss as global segment counter per
>>> +* TCP connection, insteat of introducing a new variable
>> s/insteat/instead/
> done
> 
> ok?
> 
> Thanks,
> Jan

Hi,

with this diff:

smc4# netstat -sp tcp | egrep "TSO|LRO"
48 output TSO packets software chopped
776036704 output TSO packets hardware processed
3907800204 output TSO packets generated
0 output TSO packets dropped
152169317 input LRO generated packets by hardware
981432700 input LRO coalesced packets by hardware
0 input bad LRO packets from hardware



vs the old diff:

smc4# netstat -sp tcp | egrep "TSO|LRO"
10843097 output TSO packets software chopped
0 output TSO packets hardware processed
136103193 output TSO packets generated
0 output TSO packets dropped
4940412 input LRO generated packets from hardware
74984192 input LRO coalesced packets from hardware
0 input bad LRO packets from hardware



Re: tcp tso loopback checksum

2023-05-21 Thread Hrvoje Popovski
On 21.5.2023. 22:41, Alexander Bluhm wrote:
> Hi,
> 
> When sending TCP packets with software TSO to the local address of
> a physical interface, the TCP checksum was miscalculated.
> 
> This bug was triggered on loopback interface, when sending to the
> local interface address of a physical interface.  Due to another
> bug, the smaller MTU of the physical interface was used.  Then the
> packet was sent without TSO chopping as it did fit the larger MTU
> of the loopback interface.  Although the loopback interface does
> not support hardware TSO, the modified TCP pseudo header checksum
> for hardware TSO checksum offloading was calculated.
> 
> Please test with and without hardware TSO.

Hi,

in lab i have:

ix0: flags=2008843 mtu 1500
inet 192.168.100.14 netmask 0xff00 broadcast 192.168.100.255
inet6 fe80::225:90ff:fe5d:ca38%ix0 prefixlen 64 scopeid 0x1
inet6 192:168:1000:1000::114 prefixlen 64

lo123: flags=8049 mtu 32768
inet 10.156.156.1 netmask 0xff00
inet6 fe80::1%lo123 prefixlen 64 scopeid 0xd
inet6 192:168:1560:1560::114 prefixlen 64


iperf3 -s6 -B 192:168:1000:1000::114 <- server
iperf3 -6 -B 192:168:1560:1560::114 -c 192:168:1000:1000::114 <- client

iperf3 -s -B 192.168.100.14 -p 5202 <- server
iperf3 -B 10.156.156.1 -c 192.168.100.14 -p 5202 <- client


Without this diff and with net.inet.tcp.tso=1 I'm getting a few Kbps
over IPv4 and IPv6. With net.inet.tcp.tso=0 I'm getting circa 650Mbps.

smc4# netstat -sp tcp | grep TSO
0 output TSO packets software chopped
0 output TSO packets hardware processed
0 output TSO packets generated
0 output TSO packets dropped



With this diff and with net.inet.tcp.tso=1 I'm getting 2Gbps over IPv4
and 2Gbps over IPv6. With net.inet.tcp.tso=0, 650Mbps.

smc4# netstat -sp tcp | grep TSO
157397 output TSO packets software chopped
0 output TSO packets hardware processed
4076665 output TSO packets generated
0 output TSO packets dropped



Re: Add LRO counter in ix(4)

2023-05-17 Thread Hrvoje Popovski
On 16.5.2023. 21:11, Jan Klemkow wrote:
> Hi,
> 
> This diff introduces new counters for LRO packets, we get from the
> network interface.  It shows, how many packets the network interface has
> coalesced into LRO packets.
> 
> In followup diff, this packet counter will also be used to set the
> ph_mss variable to valid value.  So, the stack is able to forward or
> redirect this kind of packets.

Hi,

when LRO is enabled with "ifconfig ix0 tcprecvoffload", the LRO
counters increase whether I'm sending or receiving TCP traffic with
iperf. Is this ok?
The TSO counter increases only when I'm sending traffic.



ix0 at pci5 dev 0 function 0 "Intel X552 SFP+" rev 0x00, msix, 4 queues

smc4# ifconfig ix0 tcprecvoffload
smc4# ifconfig ix0
ix0: flags=2008843 mtu 1500

smc4# ifconfig ix0 -tcprecvoffload
smc4# ifconfig ix0
ix0: flags=8843 mtu 1500

smc4# netstat -sp tcp | egrep "TSO|LRO"
10843097 output TSO packets software chopped
0 output TSO packets hardware processed
136103193 output TSO packets generated
0 output TSO packets dropped
4940412 input LRO generated packets from hardware
74984192 input LRO coalesced packets from hardware
0 input bad LRO packets from hardware

It is 0 on "TSO packets hardware processed" because I didn't apply
bluhm@'s latest "ix hardware tso" diff.




Re: ix hardware tso

2023-05-16 Thread Hrvoje Popovski
On 15.5.2023. 19:39, Alexander Bluhm wrote:
> On Sun, May 14, 2023 at 11:39:01PM +0200, Hrvoje Popovski wrote:
>> I've tested this on openbsd box with 4 iperf3's. 2 for ip4 and 2 for ip6
>> and with 16 tcp streams per iperf.  When testing over ix(4) there is big
>> differences in output performance. When testing ix/veb/vport there is
>> differences in output performance but not that big.
> Thanks a lot for testing.  I have also created some numbers which
> can be seen here.
> 
> http://bluhm.genua.de/perform/results/2023-05-14T09:14:59Z/perform.html
> 
> Sending TCP to Linux host and socket splicing gets faster.
> 
>> When testing over vport I'm getting "software chopped" which should be
>> expected.
> Yes, we cannot do hardware TSO in a bridge.  Maybe we could if all
> bridge members support it.
> 
> Next diff that should go in is where jan@ renames flags, cleans up
> ifconfig(8),. and fixes pseudo interface devices
> 
> Updated ix(4) driver diff after TCP/IP commit is below.

Hi,

I've tested this diff with x552 and it's working as expected.

ix0 at pci5 dev 0 function 0 "Intel X552 SFP+" rev 0x00, msix, 4 queues,
ix1 at pci5 dev 0 function 1 "Intel X552 SFP+" rev 0x00, msix, 4 queues,


netstat -sp tcp | grep TSO
46 output TSO packets software chopped
772398947 output TSO packets hardware processed
4005484521 output TSO packets generated
0 output TSO packets dropped



Re: ifconfig description for wireguard peers

2023-05-15 Thread Hrvoje Popovski
On 15.5.2023. 9:47, Hrvoje Popovski wrote:
> On 12.1.2023. 5:49, Mikolaj Kucharski wrote:
>> Hi,
>>
>> Is there anything else which I can do, to help this diff reviwed and
>> increase the chance of getting in?
>>
>> Thread at https://marc.info/?t=16347829861=1=2
>>
>> Last version of the diff at
>> https://marc.info/?l=openbsd-tech=167185582521873=mbox
> 
> Hi,
> 
> I've applied this diff and it's works as expected. wgdesc would be nice
> features to have when having remote access server with wireguard.

Hi,

With this diff, when executing wg in the shell or "wg showconf", I'm
getting a core dump. Maybe I did something wrong? Mikolaj, did you get a
core dump with the diff?



"wg show" without this diff

r620-1# wg show
interface: wg0
  public key: 123=
  private key: (hidden)
  listening port: 12345

peer: 111=
  preshared key: (hidden)
  allowed ips: 10.123.123.2/32





"wg show" with this diff

smc4# wg show
Segmentation fault (core dumped)


smc4# gdb wg wg.core
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-unknown-openbsd7.3"...(no debugging
symbols found)

Core was generated by `wg'.
Program terminated with signal 11, Segmentation fault.
(no debugging symbols found)
Loaded symbols for /usr/local/bin/wg
Reading symbols from /usr/lib/libc.so.97.0...done.
Loaded symbols for /usr/lib/libc.so.97.0
Reading symbols from /usr/libexec/ld.so...Error while reading shared
library symbols:
Dwarf Error: wrong version in compilation unit header (is 4, should be
2) [in module /usr/libexec/ld.so]
#0  0x0ec9a681649d in ipc_get_device () from /usr/local/bin/wg


(gdb) bt
#0  0x0ec9a681649d in ipc_get_device () from /usr/local/bin/wg
#1  0x0ec9a6817ac2 in show_main () from /usr/local/bin/wg
#2  0x0ec9a6808822 in _start () from /usr/local/bin/wg
#3  0x in ?? ()
(gdb)



Re: ifconfig description for wireguard peers

2023-05-15 Thread Hrvoje Popovski
On 12.1.2023. 5:49, Mikolaj Kucharski wrote:
> Hi,
> 
> Is there anything else which I can do, to help this diff reviwed and
> increase the chance of getting in?
> 
> Thread at https://marc.info/?t=16347829861=1=2
> 
> Last version of the diff at
> https://marc.info/?l=openbsd-tech=167185582521873=mbox


Hi,

I've applied this diff and it works as expected. wgdesc would be a nice
feature to have when running a remote access server with wireguard.



old ifconfig wg0
wg0: flags=80c3 mtu 1420
index 15 priority 0 llprio 3
wgport 12345
wgpubkey 123=
wgpeer 111=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.2/32
wgpeer 222=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.3/32
wgpeer 33=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.4/32
wgpeer 44=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.5/32
wgpeer 55=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.6/32
wgpeer 66=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.7/32
wgpeer 777=
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.8/32
groups: wg
inet 10.123.123.1 netmask 0xff00 broadcast 10.123.123.255




new ifconfig wg0
wg0: flags=80c3 mtu 1420
index 15 priority 0 llprio 3
wgport 12345
wgpubkey 123=
wgpeer 111=
    wgdesc Hrvoje Popovski 1
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.2/32
wgpeer 222=
    wgdesc Hrvoje Popovski 2
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.3/32
wgpeer 333=
    wgdesc Hrvoje Popovski 3
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.4/32
wgpeer 444=
    wgdesc Hrvoje Popovski 4
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.5/32
wgpeer 555=
    wgdesc Hrvoje Popovski 5
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.6/32
wgpeer 666=
    wgdesc Hrvoje Popovski 6
wgpsk (present)
tx: 0, rx: 0
wgaip 10.123.123.7/32
groups: wg
inet 10.123.123.1 netmask 0xff00 broadcast 10.123.123.255




cat /etc/hostname.wg0
inet 10.123.123.1 255.255.255.0
wgport 12345
wgkey 456=

# old desc - Hrvoje Popovski 1
wgpeer 1111111= wgdesc "Hrvoje
Popovski 1" \
wgpsk 111= \
wgaip 10.123.123.2/32

# old desc - Hrvoje Popovski 2
wgpeer 222= wgdesc "Hrvoje
Popovski 2" \
wgpsk 222= \
wgaip 10.123.123.3/32

# old desc - Hrvoje Popovski 3
wgpeer 33333= wgdesc "Hrvoje
Popovski 3" \
wgpsk 33333= \
wgaip 10.123.123.4/32

# old desc - Hrvoje Popovski 4
wgpeer 44= wgdesc "Hrvoje
Popovski 4" \
wgpsk 444444= \
wgaip 10.123.123.5/32

# old desc .- Hrvoje Popovski 5
wgpeer 555= wgdesc "Hrvoje
Popovski 5" \
wgpsk 5555555= \
wgaip 10.123.123.6/32

# old desc - Hrvoje Popovski 6
wgpeer = wgdesc "Hrvoje
Popovski 6" \
wgpsk = \
wgaip 10.123.123.7/32



Re: ix hardware tso

2023-05-14 Thread Hrvoje Popovski
On 14.5.2023. 11:24, Alexander Bluhm wrote:
> On Sat, May 13, 2023 at 01:32:07AM +0200, Alexander Bluhm wrote:
>> I have not yet investigated where the dropped counter 83 comes from.
>> If you see that also, please report what you did.
> This is an ENOBUFS error in this chunk.
> 
> 	/* network interface hardware will do TSO */
> 	if (in_ifcap_cksum(*mp, ifp, ifcap)) {
> 		if (ISSET(ifcap, IFCAP_TSOv4)) {
> 			in_hdr_cksum_out(*mp, ifp);
> 			in_proto_cksum_out(*mp, ifp);
> 		}
> 		if (ISSET(ifcap, IFCAP_TSOv6))
> 			in6_proto_cksum_out(*mp, ifp);
> 		if ((error = ifp->if_output(ifp, *mp, dst, rt))) {
> 			tcpstat_inc(tcps_outbadtso);
> 			goto done;
> 		}
> 		tcpstat_inc(tcps_outhwtso);
> 		goto done;
> 	}
> 
> As the error from ifp->if_output() has nothing to do with TSO, I
> remove the counting there.
> 
> Updated diff, please test if you have ix(4) interfaces doing TCP
> output.


Hi,

I've tested this on an OpenBSD box with 4 iperf3 instances, 2 for IPv4
and 2 for IPv6, with 16 TCP streams per iperf.  When testing over ix(4)
there is a big difference in output performance.  When testing
ix/veb/vport there is a difference in output performance, but not as big.

When testing over vport I'm getting "software chopped" packets, which is
expected.

r620-1# netstat -sp tcp | grep TSO
7921175 output TSO packets software chopped
3739630121 output TSO packets hardware processed
915829954 output TSO packets generated
0 output TSO packets dropped

With the previous diff I could easily trigger "TSO packets dropped", but
with this one I couldn't.




Re: ifconfig: SIOCSIFFLAGS: device not configured

2023-05-12 Thread Hrvoje Popovski
On 12.5.2023. 16:21, Jan Klemkow wrote:
> On Thu, May 11, 2023 at 09:17:37PM +0200, Hrvoje Popovski wrote:
>> is it possible to change the "ifconfig: SIOCSIFFLAGS: device not configured"
>> message so that it has the interface name in it, something like:
>> ifconfig pfsync0: SIOCSIFFLAGS: device not configured <- in my case.
>>
>> I have many vlans and static routes in my setup and while testing some
>> diffs, it took me a long time to figure out which interface the message
>> was coming from.
>>
>> starting network
>> add host 10.11.2.69: gateway 10.12.253.225
>> add host 10.250.184.36: gateway 10.12.253.225
>> add host 9.9.9.9: gateway 10.12.253.225
>> add host 10.11.1.234: gateway 10.12.253.225
>> add host 10.11.1.235: gateway 10.12.253.225
>> add host 10.11.255.123: gateway 10.12.253.225
>> add net 10.101/16: gateway 10.12.253.225
>> ifconfig: SIOCSIFFLAGS: Device not configured
>> add net 16/8: gateway 192.168.100.112
>> add net a192:a168:a100:a100::/64: gateway 192:168:1000:1000::112
>> add net 48/8: gateway 192.168.111.112
>> add net a192:a168:a111:a111::/64: gateway 192:168::::112
>> reordering: ld.so libc libcrypto sshd.
>>
>> or when I'm doing sh /etc/netstart and have aggr interface
>>
>> ifconfig: SIOCSTRUNKPORT: Device busy
>> ifconfig: SIOCSTRUNKPORT: Device busy
>>
>> to be changed to:
>> ifconfig ix0: SIOCSTRUNKPORT: Device busy
>> ifconfig ix1: SIOCSTRUNKPORT: Device busy
> I also run into this issue sometimes.  So, here is a diff that prints the
> interface name in front of most of these anonymous error messages.


That's it, thank you.
This will help me a lot when doing some network testing...


old
ifconfig: SIOCSIFFLAGS: Device not configured
ifconfig: SIOCSIFFLAGS: Inappropriate ioctl for device
reordering: ld.so libc libcrypto sshd.

new
ifconfig: pfsync0: SIOCSIFFLAGS: Device not configured
ifconfig: pfsync0: SIOCSIFFLAGS: Inappropriate ioctl for device
reordering: ld.so libc libcrypto sshd.




when doing sh /etc/netstart with aggr0

old
ifconfig: SIOCSTRUNKPORT: Device busy
ifconfig: SIOCSTRUNKPORT: Device busy

new
ifconfig: aggr0 ix0: SIOCSTRUNKPORT: Device busy
ifconfig: aggr0 ix1: SIOCSTRUNKPORT: Device busy



ifconfig: SIOCSIFFLAGS: device not configured

2023-05-11 Thread Hrvoje Popovski
Hi everyone,

is it possible to change the "ifconfig: SIOCSIFFLAGS: device not configured"
message so that it has the interface name in it, something like:
ifconfig pfsync0: SIOCSIFFLAGS: device not configured <- in my case.

I have many vlans and static routes in my setup and while testing some
diffs, it took me a long time to figure out which interface the message
was coming from.


starting network
add host 10.11.2.69: gateway 10.12.253.225
add host 10.250.184.36: gateway 10.12.253.225
add host 9.9.9.9: gateway 10.12.253.225
add host 10.11.1.234: gateway 10.12.253.225
add host 10.11.1.235: gateway 10.12.253.225
add host 10.11.255.123: gateway 10.12.253.225
add net 10.101/16: gateway 10.12.253.225
ifconfig: SIOCSIFFLAGS: Device not configured
add net 16/8: gateway 192.168.100.112
add net a192:a168:a100:a100::/64: gateway 192:168:1000:1000::112
add net 48/8: gateway 192.168.111.112
add net a192:a168:a111:a111::/64: gateway 192:168::::112
reordering: ld.so libc libcrypto sshd.


or when I'm doing sh /etc/netstart and have aggr interface

ifconfig: SIOCSTRUNKPORT: Device busy
ifconfig: SIOCSTRUNKPORT: Device busy

to be changed to:
ifconfig ix0: SIOCSTRUNKPORT: Device busy
ifconfig ix1: SIOCSTRUNKPORT: Device busy



Re: software tcp send offloading

2023-05-10 Thread Hrvoje Popovski
On 9.5.2023. 9:56, Alexander Bluhm wrote:
> On Sun, May 07, 2023 at 09:00:31PM +0200, Alexander Bluhm wrote:
>> Not sure if I addressed all corner cases already.  I think IPsec
>> is missing.
> Updated diff:
> - parts have been commited
> - works with IPsec now
> - some bugs fixed
> - sysctl net.inet.tcp.tso
> - netstat TSO counter
> 
> If you test this, recompile sysctl and netstat with new kernel
> headers.  Then you can see, whether the diff has an effect on your
> setup.
> 
> # netstat -s -p tcp | grep TSO
> 79 output TSO packets software chopped
> 0 output TSO packets hardware processed
> 840 output TSO packets generated
> 0 output TSO packets dropped
> 
> If you run into problems, disable the feature, and report if the
> problem goes away.  This helps to locate the bug.
> 
> # sysctl net.inet.tcp.tso=0
> net.inet.tcp.tso: 1 -> 0
> 
> I would like to keep the sysctl for now.  It makes performance
> comparison easier.  When we add hardware TSO it can be a quick
> workaround for driver problems.
> 
> When this has been tested a bit, I think it is ready for commit.
> Remaining issues can be handled in tree.  My tests pass, I am not
> aware of TCP problems.

Hi,

I've tested this with a few iperf3/tcpbench clients, disabling and
enabling net.inet.tcp.tso while generating traffic.  The difference in
performance compared to hardware TSO is small, but visible ...

netstat -s -p tcp | grep TSO
61559792 output TSO packets software chopped
0 output TSO packets hardware processed
685352918 output TSO packets generated
0 output TSO packets dropped



  PID  TID PRI NICE  SIZE   RES STATE WAIT  TIMECPU COMMAND
79516   543359  5900K 1384K onproc/0  -11:35 42.68% softnet
50096   165854  100 5432K 7400K sleep/1   netlock   7:16 38.92% iperf3
60190   204914  100 5440K 7316K sleep/2   netlock   7:03 37.89% iperf3
72560   588715  100 5428K 7408K sleep/3   netlock   7:17 37.16% iperf3
30689   450329  100 5436K 7424K sleep/1   netlock   7:16 37.06% iperf3
66203   324544  100 5440K 7352K sleep/3   netlock   4:30 29.83% iperf3
88977   447186  100 5432K 7376K sleep/5   netlock   4:34 29.74% iperf3
 6101   488471  100 5444K 7412K sleep/5   netlock   4:34 28.76% iperf3
 2186   575957  100 5436K 7376K sleep/2   netlock   4:32 28.12% iperf3
77920   442144  1000K 1384K sleep/1   bored 7:00 22.27% softnet
48409   243224  1400K 1384K sleep/2   netlock   6:29 21.14% softnet
91863   516055  1000K 1384K sleep/4   bored 6:09 21.09% softnet
67265   156637 -2200K 1384K run/0 - 0:17  0.10% softclock




Re: arpresolve reduce kernel lock

2023-04-26 Thread Hrvoje Popovski
On 26.4.2023. 12:15, Alexander Bluhm wrote:
> On Wed, Apr 26, 2023 at 11:17:32AM +0200, Alexander Bluhm wrote:
>> On Tue, Apr 25, 2023 at 11:57:09PM +0300, Vitaliy Makkoveev wrote:
>>> On Tue, Apr 25, 2023 at 11:44:34AM +0200, Alexander Bluhm wrote:
 Hi,

 Mutex arp_mtx protects the llinfo_arp la_...  fields.  So kernel
 lock is only needed for changing the route rt_flags.

 Of course there is a race between checking and setting rt_flags.
 But the other checks of the RTF_REJECT flags were without kernel
 lock before.  This does not cause trouble, the worst thing that may
 happen is to wait another expire time for ARP retry.  My diff does
 not make it worse, reading rt_flags and rt_expire is done without
 lock anyway.

 The kernel lock is needed to change rt_flags.  Testing without
 KERNEL_LOCK() caused crashes.

>>> Hi,
>>>
>>> I'm interested whether the system is stable with the diff below. If so, we
>>> could avoid kernel lock in the arpresolve().
>> I could not crash it.
> I was too fast.  Just after writing this mail I restarted the test.

Hi,

my boxes are still OK with mvs@'s diff, even when I'm running arp -ad in a loop.



Re: arpresolve reduce kernel lock

2023-04-26 Thread Hrvoje Popovski
On 25.4.2023. 22:57, Vitaliy Makkoveev wrote:
> On Tue, Apr 25, 2023 at 11:44:34AM +0200, Alexander Bluhm wrote:
>> Hi,
>>
>> Mutex arp_mtx protects the llinfo_arp la_...  fields.  So kernel
>> lock is only needed for changing the route rt_flags.
>>
>> Of course there is a race between checking and setting rt_flags.
>> But the other checks of the RTF_REJECT flags were without kernel
>> lock before.  This does not cause trouble, the worst thing that may
>> happen is to wait another expire time for ARP retry.  My diff does
>> not make it worse, reading rt_flags and rt_expire is done without
>> lock anyway.
>>
>> The kernel lock is needed to change rt_flags.  Testing without
>> KERNEL_LOCK() caused crashes.
>>
> Hi,
> 
> I'm interested whether the system is stable with the diff below. If so, we
> could avoid kernel lock in the arpresolve().

Hi,

I've put that diff on production boxes and in the lab, and for now the
firewalls are stable. Let's see after a few more hours.




Re: arp input remove kernel lock

2023-04-07 Thread Hrvoje Popovski
On 6.4.2023. 22:46, Alexander Bluhm wrote:
> Hi,
> 
> When removing these kernel locks from the ARP input path, the machine
> runs stable in my tests.  Caller if_netisr() grabs the exclusive
> netlock and that should be sufficient for in_arpinput() and arpcache().
> 
> To stress the ARP resolver I run arp -nd ... in a loop.
> 
> Hrvoje: Could you run this diff on your testsetup?
> 
> bluhm


Hi,

I'm running this diff in the lab and on production firewalls, and the
boxes seem happy and a little faster. In the lab, whatever I do I
couldn't panic the boxes: generating incomplete ARP entries, arp -ad,
destroying vlan, destroying carp, bringing physical interfaces up/down
and things like that while sending traffic.





Re: running UDP input in parallel

2023-02-27 Thread Hrvoje Popovski
On 22.8.2022. 15:07, Alexander Bluhm wrote:
> On Sun, Aug 21, 2022 at 07:07:29PM +0200, Alexander Bluhm wrote:
>> On Fri, Aug 19, 2022 at 10:54:42PM +0200, Alexander Bluhm wrote:
>>> This diff allows to run udp_input() in parallel.
> 
> Diff rebased to -current.


Hi,

is this diff still active? I've been running it in production with
WireGuard, remote syslog, a DHCP server and an NTP client for 2 months
and all seems good.



> 
> Index: kern/uipc_socket.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket.c,v
> retrieving revision 1.284
> diff -u -p -r1.284 uipc_socket.c
> --- kern/uipc_socket.c21 Aug 2022 16:22:17 -  1.284
> +++ kern/uipc_socket.c22 Aug 2022 12:01:58 -
> @@ -822,10 +822,10 @@ bad:
>   if (mp)
>   *mp = NULL;
>  
> - solock(so);
> + solock_shared(so);
>  restart:
> 	if ((error = sblock(so, &so->so_rcv, SBLOCKWAIT(flags))) != 0) {
> - sounlock(so);
> + sounlock_shared(so);
>   return (error);
>   }
>  
> @@ -893,7 +893,7 @@ restart:
>   sbunlock(so, &so->so_rcv);
>   error = sbwait(so, &so->so_rcv);
>   if (error) {
> - sounlock(so);
> + sounlock_shared(so);
>   return (error);
>   }
>   goto restart;
> @@ -962,11 +962,11 @@ dontblock:
>   sbsync(&so->so_rcv, nextrecord);
>   if (controlp) {
>   if (pr->pr_domain->dom_externalize) {
> - sounlock(so);
> + sounlock_shared(so);
>   error =
>   (*pr->pr_domain->dom_externalize)
>   (cm, controllen, flags);
> - solock(so);
> + solock_shared(so);
>   }
>   *controlp = cm;
>   } else {
> @@ -1040,9 +1040,9 @@ dontblock:
>   SBLASTRECORDCHK(&so->so_rcv, "soreceive uiomove");
>   SBLASTMBUFCHK(&so->so_rcv, "soreceive uiomove");
>   resid = uio->uio_resid;
> - sounlock(so);
> + sounlock_shared(so);
>   uio_error = uiomove(mtod(m, caddr_t) + moff, len, uio);
> - solock(so);
> + solock_shared(so);
>   if (uio_error)
>   uio->uio_resid = resid - len;
>   } else
> @@ -1126,7 +1126,7 @@ dontblock:
>   error = sbwait(so, &so->so_rcv);
>   if (error) {
>   sbunlock(so, &so->so_rcv);
> - sounlock(so);
> + sounlock_shared(so);
>   return (0);
>   }
>   if ((m = so->so_rcv.sb_mb) != NULL)
> @@ -1171,7 +1171,7 @@ dontblock:
>   *flagsp |= flags;
>  release:
>   sbunlock(so, &so->so_rcv);
> - sounlock(so);
> + sounlock_shared(so);
>   return (error);
>  }
>  
> Index: kern/uipc_socket2.c
> ===================================================================
> RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket2.c,v
> retrieving revision 1.127
> diff -u -p -r1.127 uipc_socket2.c
> --- kern/uipc_socket2.c   13 Aug 2022 21:01:46 -  1.127
> +++ kern/uipc_socket2.c   22 Aug 2022 12:01:58 -
> @@ -360,6 +360,24 @@ solock(struct socket *so)
>   }
>  }
>  
> +void
> +solock_shared(struct socket *so)
> +{
> + switch (so->so_proto->pr_domain->dom_family) {
> + case PF_INET:
> + case PF_INET6:
> + if (so->so_proto->pr_usrreqs->pru_lock != NULL) {
> + NET_LOCK_SHARED();
> + pru_lock(so);
> + } else
> + NET_LOCK();
> + break;
> + default:
> + rw_enter_write(&so->so_lock);
> + break;
> + }
> +}
> +
>  int
>  solock_persocket(struct socket *so)
>  {
> @@ -403,6 +421,24 @@ sounlock(struct socket *so)
>  }
>  
>  void
> +sounlock_shared(struct socket *so)
> +{
> + switch (so->so_proto->pr_domain->dom_family) {
> + case PF_INET:
> + case PF_INET6:
> + if (so->so_proto->pr_usrreqs->pru_unlock != NULL) {
> + pru_unlock(so);
> + NET_UNLOCK_SHARED();
> + } else
> + NET_UNLOCK();
> + break;
> + default:
> + rw_exit_write(&so->so_lock);
> + break;
> + }
> +}
> +
> +void
>  soassertlocked(struct socket *so)
>  {
>   switch (so->so_proto->pr_domain->dom_family) {
> @@ -425,7 +461,15 @@ sosleep_nsec(struct 

Re: ifconfig description for wireguard peers

2023-01-05 Thread Hrvoje Popovski
On 2.1.2023. 22:01, Mikolaj Kucharski wrote:
> This seems to work fine for me.
> 
> Patch also available at:
> 
> https://marc.info/?l=openbsd-tech=167185582521873=mbox
> 

I've had some problems with 20+ wgpeers a few days ago, and at that time
it would have been good to have wgdesc in the ifconfig wg output ...

> 
> On Sat, Dec 24, 2022 at 03:29:35AM +, Mikolaj Kucharski wrote:
>> On Sat, Nov 19, 2022 at 12:03:59PM +, Mikolaj Kucharski wrote:
>>> Kind reminder.
>>>
>>> Below diff also at:
>>>
>>> https://marc.info/?l=openbsd-tech=166806412910623=2
>>>
>>> This is diff by Noah Meier with small changes by me.
>>>
>>>
>>> On Thu, Nov 10, 2022 at 07:14:11AM +, Mikolaj Kucharski wrote:
 On Thu, Nov 10, 2022 at 12:53:07AM +, Mikolaj Kucharski wrote:
> On Wed, Oct 20, 2021 at 10:20:09PM -0400, Noah Meier wrote:
>> Hi,
>>
>> While wireguard interfaces can have a description set by ifconfig, 
>> wireguard peers currently cannot. I now have a lot of peers and 
>> descriptions of them in ifconfig would be helpful.
>>
>> This diff adds a 'wgdesc' option to a 'wgpeer' in ifconfig (and a 
>> corresponding '-wgdesc' option). Man page also updated.
>>
>> NM
>
> Now that my `ifconfig, wireguard output less verbose, unless -A or `
> diff is committed ( see https://marc.info/?t=16577915002=1=2 ),
> bump of an old thread.
>
> Below is rebased on -current and tiny modified by me, Noah's diff.
>
> You need both a kernel and ifconfig with the code below, otherwise you may
> see issues bringing up the wg(4) interface. If you might lose access to a
> machine behind a wg(4) VPN, make sure you update both the kernel and
> ifconfig(8) on that machine at the same time.
>
>>
>> Rebased again, just a moment ago. Will test runtime again over the weekend,
>> to make sure there are no surprises.
>>
>> - ifconfig compiles
>> - GENERIC.MP/amd64 kernel compiles too
>>
>>
>> Index: sbin/ifconfig/ifconfig.c
>> ===
>> RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
>> retrieving revision 1.460
>> diff -u -p -u -r1.460 ifconfig.c
>> --- sbin/ifconfig/ifconfig.c 18 Dec 2022 18:56:38 -  1.460
>> +++ sbin/ifconfig/ifconfig.c 24 Dec 2022 00:49:05 -
>> @@ -355,12 +355,14 @@ void   setwgpeerep(const char *, const cha
>>  voidsetwgpeeraip(const char *, int);
>>  voidsetwgpeerpsk(const char *, int);
>>  voidsetwgpeerpka(const char *, int);
>> +voidsetwgpeerdesc(const char *, int);
>>  voidsetwgport(const char *, int);
>>  voidsetwgkey(const char *, int);
>>  voidsetwgrtable(const char *, int);
>>  
>>  voidunsetwgpeer(const char *, int);
>>  voidunsetwgpeerpsk(const char *, int);
>> +voidunsetwgpeerdesc(const char *, int);
>>  voidunsetwgpeerall(const char *, int);
>>  
>>  voidwg_status(int);
>> @@ -623,11 +625,13 @@ const struct   cmd {
>>  { "wgaip",  NEXTARG,A_WIREGUARD,setwgpeeraip},
>>  { "wgpsk",  NEXTARG,A_WIREGUARD,setwgpeerpsk},
>>  { "wgpka",  NEXTARG,A_WIREGUARD,setwgpeerpka},
>> +{ "wgdesc", NEXTARG,A_WIREGUARD,setwgpeerdesc},
>>  { "wgport", NEXTARG,A_WIREGUARD,setwgport},
>>  { "wgkey",  NEXTARG,A_WIREGUARD,setwgkey},
>>  { "wgrtable",   NEXTARG,A_WIREGUARD,setwgrtable},
>>  { "-wgpeer",NEXTARG,A_WIREGUARD,unsetwgpeer},
>>  { "-wgpsk", 0,  A_WIREGUARD,unsetwgpeerpsk},
>> +{ "-wgdesc",0,  A_WIREGUARD,unsetwgpeerdesc},
>>  { "-wgpeerall", 0,  A_WIREGUARD,unsetwgpeerall},
>>  
>>  #else /* SMALL */
>> @@ -5856,6 +5860,16 @@ setwgpeerpka(const char *pka, int param)
>>  }
>>  
>>  void
>> +setwgpeerdesc(const char *wgdesc, int param)
>> +{
>> +if (wg_peer == NULL)
>> +errx(1, "wgdesc: wgpeer not set");
>> +if (strlen(wgdesc))
>> +strlcpy(wg_peer->p_description, wgdesc, IFDESCRSIZE);
>> +wg_peer->p_flags |= WG_PEER_SET_DESCRIPTION;
>> +}
>> +
>> +void
>>  setwgport(const char *port, int param)
>>  {
>>  const char *errmsg = NULL;
>> @@ -5902,6 +5916,15 @@ unsetwgpeerpsk(const char *value, int pa
>>  }
>>  
>>  void
>> +unsetwgpeerdesc(const char *value, int param)
>> +{
>> +if (wg_peer == NULL)
>> +errx(1, "wgdesc: wgpeer not set");
>> +strlcpy(wg_peer->p_description, "", IFDESCRSIZE);
>> +wg_peer->p_flags |= WG_PEER_SET_DESCRIPTION;
>> +}
>> +
>> +void
>>  unsetwgpeerall(const char *value, int param)
>>  {
>>  ensurewginterface();
>> @@ -5961,6 +5984,9 @@ wg_status(int ifaliases)
>>  b64_ntop(wg_peer->p_public, WG_KEY_LEN,
>>  key, sizeof(key));
>>  printf("\twgpeer %s\n", key);
>> +
>> +if (strlen(wg_peer->p_description))
>> +  

Re: em(4) multiqueue

2022-12-25 Thread Hrvoje Popovski
On 15.8.2022. 20:51, Hrvoje Popovski wrote:
> On 12.8.2022. 22:15, Hrvoje Popovski wrote:
>> Hi,
>>
>> I'm testing forwarding over
>>
>> em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
>> em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
>> em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
>> em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
>> em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
>> em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,
> I've managed to get Linux pktgen to send traffic on all 6 em interfaces
> at the same time, and the box seems to work just fine. Some systat, vmstat
> and kstat details in the attachment while traffic is flowing over that box.

Hi,

after 95 days in production with this diff and the I350, everything
works as expected. I'm sending this because it's time to upgrade :)
Maybe it's time to put this diff in?


ix0 at pci5 dev 0 function 0 "Intel X540T" rev 0x01, msix, 8 queues,
address a0:36:9f:29:f3:28
ix1 at pci5 dev 0 function 1 "Intel X540T" rev 0x01, msix, 8 queues,
address a0:36:9f:29:f3:2a
em0 at pci6 dev 0 function 0 "Intel I350" rev 0x01, msix, 8 queues,
address ac:1f:6b:14:bd:b2
em1 at pci6 dev 0 function 1 "Intel I350" rev 0x01, msix, 8 queues,
address ac:1f:6b:14:bd:b3


fw2# uptime
 6:34PM  up 95 days, 19:26, 1 user, load averages: 0.00, 0.00, 0.00


fw2# vmstat -i
interrupt                       total     rate
irq0/clock                 6622294171      799
irq0/ipi                   8263089839      998
irq96/acpi0                         1        0
irq114/ix0:0                514761687       62
irq115/ix0:1                510189468       61
irq116/ix0:2                522691117       63
irq117/ix0:3                531638415       64
irq118/ix0:4                534116996       64
irq119/ix0:5                511162669       61
irq120/ix0:6                535267806       64
irq121/ix0:7                519707637       62
irq122/ix0                          2        0
irq99/xhci0                        68        0
irq100/ehci0                       19        0
irq132/em0:0                498689640       60
irq133/em0:1                516744073       62
irq134/em0:2                520784714       62
irq135/em0:3                512596405       61
irq136/em0:4                521988376       63
irq137/em0:5                513939246       62
irq138/em0:6                517184525       62
irq139/em0:7                509781661       61
irq140/em0                          2        0
irq141/em1:0                216273893       26
irq143/em1:2                283094667       34
irq148/em1:5                        2        0
irq151/em1                         18        0
irq100/ehci1                       19        0
irq103/ahci0                  5049068        0
Total                     23681046204     2860




Re: pfsync panic after pf_purge backout

2022-11-29 Thread Hrvoje Popovski
On 28.11.2022. 17:07, Alexandr Nedvedicky wrote:
> The diff below should avoid the panic above (and similar panics in
> pfsync_q_del()).  It also prints some errors to the system message
> buffer (a.k.a. dmesg).
> 
> We panic because we attempt to remove a state from a pfsync queue which
> is already empty.  pfsync_q_del() must be acting on some stale
> information kept in the `st` argument (state).
> 
> The information we print into the dmesg buffer should help us understand
> what's going on.  At the moment I can not explain how it comes about
> that there is a state which claims its presence on a state queue, while
> the queue in question is empty.
> 
> I'd like to ask you to give the diff below a try and repeat your test.
> Let it run for some time and collect 'dmesg' output for me after the
> usual uptime-to-panic elapses during a test run.


Hi,

here's a panic with WITNESS, this diff, and this one:
https://www.mail-archive.com/tech@openbsd.org/msg72582.html

I will leave the box in ddb ...


wsmouse1 at ums1 mux 0
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
107228145326
32609891)nosync: no unlinked: no timeout: PFTM_TCP_OPENING
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
132937190089
77519715)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
124393720902
11468387)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
182347109632
42042467)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<4>com1: 1 silo overflow, 0 ibuf overflows
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
142245292777
58899299)nosync: yes unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
475207607577
1004003)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
159470263971
51003747)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
134712868555
08329571)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
790090587538
2371427)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
543286072470
0808291)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
924950197971
1276131)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
139134763774
04146787)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
930062379002
4590435)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
125019156702
63792739)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
134887042707
43536739)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
155369141176
78236771)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
100393240929
61031267)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
909085788793
5661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
909085788793
5661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
909085788793
5661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
909085788793
5661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
909085788793
5661155)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
182376159588
61972579)nosync: no unlinked: no timeout: PFTM_UDP_MULTIPLE
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
159637401507
14238051)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
242206588258
3172195)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
105537141411
11420003)nosync: no unlinked: no timeout: PFTM_TCP_OPENING
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
152424297451
57538915)nosync: no unlinked: no timeout: PFTM_TCP_CLOSED
<3>pfsync: pfsync_q_del: stale queue (PFSYNC_S_UPD_C) in state (id
498918805181
3434467)nosync: no unlinked: no timeout: PFTM_TCP_FIN_WAIT
<3>pfsync: pfsync_q_del: stale queue 

Re: pfsync panic after pf_purge backout

2022-11-27 Thread Hrvoje Popovski
On 27.11.2022. 9:28, Hrvoje Popovski wrote:
> On 27.11.2022. 1:51, Alexandr Nedvedicky wrote:
>> Hello,
>>
>> On Sat, Nov 26, 2022 at 08:33:28PM +0100, Hrvoje Popovski wrote:
>> 
>>> I just need to say that with all pf, pfsync and with pf_purge diffs
>>> after hackaton + this diff on tech@
>>> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
>>> my production firewall seems stable and it wasn't without that diff
>> this diff still waits for OK. It makes pfsync use the
>> state mutex to safely dereference keys.
>>
>>> I'm not sure if we have same diffs but even Josmar Pierri on bugs@
>>> https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
>>> who had panics quite regularly with that diff on tech@ seems to have
>>> stable firewall now.
>>>
>>>
>>>
>>> r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
>>> kernel: page fault trap, code=0
>>> Stopped at  pfsync_q_del+0x96:  movq%rdx,0x8(%rax)
>>> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
>>> *192892  19920  0 0x14000  0x2005K softnet
>>> pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
>>> pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
>>> pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
>>> pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
>>> ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
>>> ipintr() at ipintr+0x69
>>> if_netisr(0) at if_netisr+0xea
>>> taskq_thread(8003) at taskq_thread+0x100
>>> end trace frame: 0x0, count: 7
>>> https://www.openbsd.org/ddb.html describes the minimum info required in
>>> bug reports.  Insufficient info makes it difficult to find and fix bugs.
>>> ddb{5}>
>>>
>> those panics are causing me headaches. This got most likely uncovered
>> by the diff which adds a mutex. The mutex makes pfsync stable enough
>> that you can trigger unknown bugs.
> 
> Hi,
> 
> here's panic with WITNESS. Now I will try to trigger panic with that
> mutex diff on tech@


Hi,

here's a panic with WITNESS and this diff on tech@:
https://www.mail-archive.com/tech@openbsd.org/msg72582.html

I will stop now because I'm not sure what I'm doing and which diffs I'm
testing...


r620-1# uvm_fault(0x8248ea28, 0x17, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at  pfsync_q_del+0x96:  movq%rdx,0x8(%rax)
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
*300703  35643  0 0x14000  0x2001K systq
 237790  10061  0 0x14000 0x42000  softclock
pfsync_q_del(fd8323dc3900) at pfsync_q_del+0x96
pfsync_delete_state(fd8323dc3900) at pfsync_delete_state+0x118
pf_remove_state(fd8323dc3900) at pf_remove_state+0x14e
pf_purge_expired_states(c3501) at pf_purge_expired_states+0x1b3
pf_purge(823ae080) at pf_purge+0x28
taskq_thread(822cbe30) at taskq_thread+0x11a
end trace frame: 0x0, count: 9
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{1}>


ddb{1}> show panic
*cpu1: uvm_fault(0x8248ea28, 0x17, 0, 2) -> e
ddb{1}>


ddb{1}> show reg
rdi  0x9
rsi  0xf
rbp   0x800022d593c0
rbx   0xfd83347714a8
rdx   0x
rcx 0x10
rax  0xf
r80x7fff
r90x800022d59570
r10   0xcc7e29c6fd100f64
r11   0x1e575244acf63fd3
r12   0x808c4000
r13   0xfd8318aac200
r14   0xfd8323dc3900
r15   0x808c47e0
rip   0x817d3ec6pfsync_q_del+0x96
cs   0x8
rflags   0x10286__ALIGN_SIZE+0xf286
rsp   0x800022d59390
ss  0x10
pfsync_q_del+0x96:  movq%rdx,0x8(%rax)
ddb{1}>



ddb{1}>  show locks
exclusive rwlock pf_state_lock r = 0 (0x822b03a0)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x17f
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pf_lock r = 0 (0x822b0370)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x173
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pfstates r = 0 (0x822c57d0)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x167
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock netlock r = 0 (0x822b

Re: pfsync panic after pf_purge backout

2022-11-27 Thread Hrvoje Popovski
On 27.11.2022. 1:51, Alexandr Nedvedicky wrote:
> Hello,
> 
> On Sat, Nov 26, 2022 at 08:33:28PM +0100, Hrvoje Popovski wrote:
> 
>> I just need to say that with all pf, pfsync and with pf_purge diffs
>> after hackaton + this diff on tech@
>> https://www.mail-archive.com/tech@openbsd.org/msg72582.html
>> my production firewall seems stable and it wasn't without that diff
> 
> this diff still waits for OK. It makes pfsync use the
> state mutex to safely dereference keys.
> 
>>
>> I'm not sure if we have same diffs but even Josmar Pierri on bugs@
>> https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
>> who had panics quite regularly with that diff on tech@ seems to have
>> stable firewall now.
>>
>>
>>
>> r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
>> kernel: page fault trap, code=0
>> Stopped at  pfsync_q_del+0x96:  movq%rdx,0x8(%rax)
>> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
>> *192892  19920  0 0x14000  0x2005K softnet
>> pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
>> pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
>> pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
>> pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
>> ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
>> ipintr() at ipintr+0x69
>> if_netisr(0) at if_netisr+0xea
>> taskq_thread(8003) at taskq_thread+0x100
>> end trace frame: 0x0, count: 7
>> https://www.openbsd.org/ddb.html describes the minimum info required in
>> bug reports.  Insufficient info makes it difficult to find and fix bugs.
>> ddb{5}>
>>
> 
> those panics are causing me headaches. This got most likely uncovered
> by the diff which adds a mutex. The mutex makes pfsync stable enough
> that you can trigger unknown bugs.


Hi,

here's a panic with WITNESS. Now I will try to trigger the panic with
that mutex diff on tech@

r620-1# uvm_fault(0x824be2f8, 0x17, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at  pfsync_q_del+0x96:  movq%rdx,0x8(%rax)
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 267891  77512 830x100012  02  ntpd
*250028  94877  0 0x14000  0x2003K systq
  35323  58401  0 0x14000 0x42000  softclock
pfsync_q_del(fd82c6826da0) at pfsync_q_del+0x96
pfsync_delete_state(fd82c6826da0) at pfsync_delete_state+0x118
pf_remove_state(fd82c6826da0) at pf_remove_state+0x14b
pf_purge_expired_states(b3368) at pf_purge_expired_states+0x1b3
pf_purge(823d8c80) at pf_purge+0x28
taskq_thread(822c73e8) at taskq_thread+0x11a
end trace frame: 0x0, count: 9
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports.  Insufficient info makes it difficult to find and fix bugs.


ddb{3}> show panic
*cpu3: uvm_fault(0x824be2f8, 0x17, 0, 2) -> e
ddb{3}>


ddb{3}> show reg
rdi  0x9
rsi  0xf
rbp   0x800022d59b80
rbx   0xfd842e15b7d8
rdx   0x
rcx 0x10
rax  0xf
r8    0x7fff
r9    0x800022d59d20
r10   0x362267f9c796ed3a
r11   0xcadda0efd2fc372f
r12   0x808c3000
r13   0xfd8310c597b0
r14   0xfd82c6826da0
r15   0x808c37e0
rip   0x81398f56  pfsync_q_del+0x96
cs   0x8
rflags   0x10286  __ALIGN_SIZE+0xf286
rsp   0x800022d59b50
ss  0x10
pfsync_q_del+0x96:  movq %rdx,0x8(%rax)


ddb{3}> show locks
exclusive rwlock pf_state_lock r = 0 (0x822e2d58)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x17f
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pf_lock r = 0 (0x822e2d28)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x173
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pfstates r = 0 (0x822bf040)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x167
#2  pf_purge+0x28
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock netlock r = 0 (0x822bb6a0)
#0  witness_lock+0x311
#1  rw_enter+0x292
#2  pf_purge_expired_states+0x15b
#3  pf_purge+0x28
#4  taskq_thread+0x11a
#5  proc_trampoline+0x1c
exclusive kernel_lock _lock r = 1 (0x824bd7e8)
#0  witness_lock+0x311
#1  __mp_acquire_count+0x38
#2  mi_switch+0x28b
#3  sleep_finish+0xfe
#4  rw_enter+0x232
#5  pf_purge_expired_states+0x15b
#6  pf_purge+0x28
#7  t

pfsync panic after pf_purge backout

2022-11-26 Thread Hrvoje Popovski
Hi all,

sashan@ and dlg@, I'm sorry if your eyes bleed from pfsync panics. I
just wanted to send a panic log here after the pf_purge backout.
I'm testing pfsync with NET_TASKQ=6 because I have 6-core boxes, and that
way panics are faster to trigger.
In this test I made sure pf hits its state limit quite hard, and I'm
only generating traffic, nothing more.
I'm generating traffic with Cisco T-Rex, so the traffic is as real as
you can get in a lab environment.
Now I'm trying to trigger the panic with WITNESS, but it seems to be hard.


I just need to say that with all the pf, pfsync and pf_purge diffs
after the hackathon, plus this diff on tech@
https://www.mail-archive.com/tech@openbsd.org/msg72582.html
my production firewall seems stable, and it wasn't without that diff.

I'm not sure if we have the same diffs, but even Josmar Pierri on bugs@
https://www.mail-archive.com/bugs@openbsd.org/msg18994.html
who had panics quite regularly, seems to have a stable firewall now
with that diff from tech@.



r620-1# uvm_fault(0x82374288, 0x17, 0, 2) -> e
kernel: page fault trap, code=0
Stopped at  pfsync_q_del+0x96:  movq %rdx,0x8(%rax)
TID     PID     UID  PRFLAGS  PFLAGS  CPU  COMMAND
*192892   19920    0  0x14000   0x200   5K  softnet
pfsync_q_del(fd82e8a4ce20) at pfsync_q_del+0x96
pf_remove_state(fd82e8a4ce20) at pf_remove_state+0x14b
pfsync_in_del_c(fd8006d843b8,c,79,0) at pfsync_in_del_c+0x6f
pfsync_input(800022d60ad8,800022d60ae4,f0,2) at pfsync_input+0x33c
ip_deliver(800022d60ad8,800022d60ae4,f0,2) at ip_deliver+0x113
ipintr() at ipintr+0x69
if_netisr(0) at if_netisr+0xea
taskq_thread(8003) at taskq_thread+0x100
end trace frame: 0x0, count: 7
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{5}>


ddb{5}> show panic
*cpu5: uvm_fault(0x82374288, 0x17, 0, 2) -> e
ddb{5}>


ddb{5}> show reg
rdi0
rsi  0xf
rbp   0x800022d60920
rbx0x3cc
rdx   0x
rcx 0x10
rax  0xf
r8    0xfd82b9c8a8d0
r9    0xfd82cd8efe38
r10   0xfd82cd8efe38
r11   0x56c3d350aa4ea3d2
r12   0x808c4000
r13 0x28
r14   0xfd82e8a4ce20
r15   0x808c4720
rip   0x81c78626  pfsync_q_del+0x96
cs   0x8
rflags   0x10286  __ALIGN_SIZE+0xf286
rsp   0x800022d608f0
ss  0x10
pfsync_q_del+0x96:  movq %rdx,0x8(%rax)
ddb{5}>


ddb{5}> ps /o
TID     PID     UID  PRFLAGS  PFLAGS  CPU  COMMAND
*192892   19920    0  0x14000   0x200   5K  softnet

ddb{5}> trace /t 0t 192892
sleep_finish(8272dea0,1) at sleep_finish+0xfe
tsleep(823738f0,4,81f880db,0) at tsleep+0xb2
main(0) at main+0x775
end trace frame: 0x0, count: -3


ddb{5}> ps
   PID     TID   PPID   UID  S       FLAGS  WAIT      COMMAND
  4058  502031      1     0  3    0x100083  ttyin     ksh
 48707  291695      1     0  3    0x100098  kqread    cron
 37722  219650  96592    95  3   0x1100092  kqread    smtpd
 38110  501136  96592   103  3   0x1100092  kqread    smtpd
  6191  274846  96592    95  3   0x1100092  kqread    smtpd
 93001  113813  96592    95  3    0x100092  kqread    smtpd
 48938  178445  96592    95  3   0x1100092  kqread    smtpd
 60938  255136  96592    95  3   0x1100092  kqread    smtpd
 96592  417634      1     0  3    0x100080  kqread    smtpd
 37963  127625      1     0  3        0x88  kqread    sshd
 32077  367571      1     0  3    0x100080  kqread    ntpd
 47614  473827   5112    83  3    0x100092  kqread    ntpd
  5112   74565      1    83  3   0x1100012  netlock   ntpd
 88022  357328  31934    74  3   0x1100092  bpf       pflogd
 31934  430624      1     0  3        0x80  netio     pflogd
 88174   57576  57658    73  3   0x1100090  kqread    syslogd
 57658    5426      1     0  3    0x100082  netio     syslogd
 93915  271655      0     0  3     0x14200  bored     smr
 60765  222732      0     0  3     0x14200  pgzero    zerothread
 75107   75302      0     0  3     0x14200  aiodoned  aiodoned
 26715  517414      0     0  3     0x14200  syncer    update
 30007  265724      0     0  3     0x14200  cleaner   cleaner
 53800  389208      0     0  3     0x14200  reaper    reaper
 19611   77488      0     0  3     0x14200  pgdaemon  pagedaemon
 38998  273899      0     0  3     0x14200  usbtsk    usbtask
 92490   66823      0     0  3     0x14200  usbatsk   usbatsk
 37775   10602      0     0  3  0x40014200  acpi0     acpi0
 70481  

Re: pfsync slpassert on boot and panic

2022-11-25 Thread Hrvoje Popovski
On 25.11.2022. 10:12, Alexandr Nedvedicky wrote:
> Looks like we need to synchronize pfsync destroy with timer thread.
> 
> thanks for great testing.
> 
> regards
> sashan
> 
> 8<---8<---8<--8<
> diff --git a/sys/net/if_pfsync.c b/sys/net/if_pfsync.c
> index f69790ee98d..24963a546de 100644
> --- a/sys/net/if_pfsync.c
> +++ b/sys/net/if_pfsync.c
> @@ -1865,8 +1865,6 @@ pfsync_undefer(struct pfsync_deferral *pd, int drop)
>  {
>   struct pfsync_softc *sc = pfsyncif;
>  
> - NET_ASSERT_LOCKED();
> -
>   if (sc == NULL)
>   return;
>  
> @@ -2128,8 +2126,6 @@ pfsync_delete_state(struct pf_state *st)
>  {
>   struct pfsync_softc *sc = pfsyncif;
>  
> - NET_ASSERT_LOCKED();
> -
>   if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
>   return;
>  
> @@ -2188,8 +2184,6 @@ pfsync_clear_states(u_int32_t creatorid, const char 
> *ifname)
>   struct pfsync_clr clr;
>   } __packed r;
>  
> - NET_ASSERT_LOCKED();
> -
>   if (sc == NULL || !ISSET(sc->sc_if.if_flags, IFF_RUNNING))
>   return;
>  


Hi,

yes, this diff helps and I can't panic the box the way I could without it.
I will keep playing with this diff and try to trigger more panics ...

Thank you guys for the work on unlocking pf and pfsync ...





pfsync slpassert on boot and panic

2022-11-25 Thread Hrvoje Popovski
Hi,

I think this is a similar problem to the one David Hill sent to tech@
with the subject "splassert on boot".

I checked out the tree a few minutes ago, so it should contain
mvs@'s "Remove netlock assertion within PF_LOCK()" and
dlg@'s "get rid of NET_LOCK in the pf purge work" diffs.

on boot I'm getting this splassert

splassert: pfsync_delete_state: want 2 have 256
Starting stack trace...
pfsync_delete_state(fd83a66644d8) at pfsync_delete_state+0x58
pf_remove_state(fd83a66644d8) at pf_remove_state+0x14b
pf_purge_expired_states(1fdb,40) at pf_purge_expired_states+0x202
pf_purge_states(0) at pf_purge_states+0x1c
taskq_thread(822f69c8) at taskq_thread+0x11a
end trace frame: 0x0, count: 252
End of stack trace.

splassert: pfsync_delete_state: want 2 have 0
Starting stack trace...
pfsync_delete_state(fd83a6676628) at pfsync_delete_state+0x58
pf_remove_state(fd83a6676628) at pf_remove_state+0x14b
pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
pf_purge_states(0) at pf_purge_states+0x1c
taskq_thread(822f69c8) at taskq_thread+0x11a
end trace frame: 0x0, count: 252
End of stack trace.


and if I destroy the pfsync interface and then run sh /etc/netstart, the box panics

uvm_fault(0x823d3250, 0x810, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at  pfsync_q_ins+0x1a:  movq 0x810(%r13),%rsi
TID     PID     UID  PRFLAGS  PFLAGS  CPU  COMMAND
* 68977   95532    0  0x14000   0x200   3K  systqmp
pfsync_q_ins(fd83a6676628,2) at pfsync_q_ins+0x1a
pf_remove_state(fd83a6676628) at pf_remove_state+0x14b
pf_purge_expired_states(1f9c,40) at pf_purge_expired_states+0x202
pf_purge_states(0) at pf_purge_states+0x1c
taskq_thread(822f69c8) at taskq_thread+0x11a
end trace frame: 0x0, count: 10
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{3}>


ddb{3}> show panic
*cpu3: uvm_fault(0x823d3250, 0x810, 0, 1) -> e


ddb{3}> show reg
rdi   0xfd83a6676628
rsi  0x2
rbp   0x800022d5ef90
rbx   0xfd83a6676628
rdx   0xfe0f
rcx0x282
rax 0xff
r8    0x8233fa38  w_locklistdata+0x43e68
r9    0x800022d5f100
r10   0x3925934c5d55f628
r11   0xba2a637b8a7a5b53
r12 0x40
r130
r14  0x2
r15   0xfd83a6676628
rip   0x81b88c8a  pfsync_q_ins+0x1a
cs   0x8
rflags   0x10282  __ALIGN_SIZE+0xf282
rsp   0x800022d5ef50
ss  0x10
pfsync_q_ins+0x1a:  movq 0x810(%r13),%rsi



ddb{3}> show locks
exclusive kernel_lock _lock r = 0 (0x82438590)
#0  witness_lock+0x311
#1  kpageflttrap+0x1b2
#2  kerntrap+0x91
#3  alltraps_kern_meltdown+0x7b
#4  pfsync_q_ins+0x1a
#5  pf_remove_state+0x14b
#6  pf_purge_expired_states+0x202
#7  pf_purge_states+0x1c
#8  taskq_thread+0x11a
#9  proc_trampoline+0x1c
exclusive rwlock pf_state_lock r = 0 (0x822c05a0)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x1d5
#2  pf_purge_states+0x1c
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pf_lock r = 0 (0x822c0570)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x1c9
#2  pf_purge_states+0x1c
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
exclusive rwlock pfstates r = 0 (0x822b4210)
#0  witness_lock+0x311
#1  pf_purge_expired_states+0x1bd
#2  pf_purge_states+0x1c
#3  taskq_thread+0x11a
#4  proc_trampoline+0x1c
shared rwlock systqmp r = 0 (0x822f6a38)
#0  witness_lock+0x311
#1  taskq_thread+0x10d
#2  proc_trampoline+0x1c
ddb{3}>


ddb{3}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 83757   73556      1      0  3    0x100083  ttyin         ksh
   230  204977  73028      0  3    0x100083  ttyin         ksh
 73028  319173  85298   1000  3    0x10008b  sigsusp       ksh
 85298  521998  43878   1000  3        0x10  pf_state_loc  sshd
 43878  290809  21538      0  3        0x82  kqread        sshd
 50517  491548      1      0  3    0x100098  kqread        cron
 91327  121436  26507     95  3   0x1100092  kqread        smtpd
 17435  384685  26507    103  3   0x1100092  kqread        smtpd
 67349  520840  26507     95  3   0x1100092  kqread        smtpd
 97353  483313  26507     95  3    0x100092  kqread        smtpd
 44262  496860  26507     95  3   0x1100092  kqread        smtpd
    52  255319  26507     95  3   0x1100092  kqread        smtpd
 26507   41672      1      0  3    0x100080  kqread        smtpd
 21538  173086      1      0  3        0x88  kqread        sshd
 44849  364511      1      0  3    0x100080  kqread        ntpd
 33978  331427  77612     83  3    0x100092  kqread

Re: replace SRP with SMR in the if_idxmap commit

2022-11-10 Thread Hrvoje Popovski
On 10.11.2022. 14:59, David Gwynne wrote:
> On Thu, Nov 10, 2022 at 09:04:22PM +1000, David Gwynne wrote:
>> On Thu, Nov 10, 2022 at 08:10:35AM +1000, David Gwynne wrote:
>>> I know what this is. The barrier at the end of if_idxmap_alloc is sleeping 
>>> waiting for cpus to run that aren't running cos we haven't finished booting 
>>> yet.
>>>
>>> I'll back it out and fix it up when I'm actually awake.
>> i woke up, so here's a diff.
>>
>> this uses the usedidx as an smr_entry so we can use smr_call instead of
>> smr_barrier during autoconf.
>>
>> this works for me on a box with a lot of hardware interfaces, which
>> forces allocation of a new interface map and therefore destruction of
>> the initial one.
>>
>> there is still an smr_barrier in if_idxmap_remove, but remove only
>> happens when an interface goes away. we could use part of struct ifnet
>> (eg, if_description) as an smr_entry if needed.
>>
> this one is even better.


Hi,

with this diff my machines boot as they should.

Thank you



replace SRP with SMR in the if_idxmap commit

2022-11-09 Thread Hrvoje Popovski
Hi all,

I checked out cvs half an hour ago on two boxes, and neither box boots
properly.

The first one stops here:

ppb10 at pci1 dev 28 function 4 "Intel 8 Series PCIE" rev 0xd5: msi
pci12 at ppb10 bus 13
em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01: msi, address
00:25:90:5d:c9:9a
em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01: msi


The second one stops here:
vmm0 at mainbus0: VMX/EPT

So I changed the if.c revision to r1.672, and with that the box boots,
but with r1.673 it won't boot...
I compiled the kernel with WITNESS and WITNESS_LOCKTRACE, but that
doesn't give any more information.

Did anyone else experience this, or is it just me?







Re: ifconfig, wireguard output less verbose, unless -A or

2022-10-24 Thread Hrvoje Popovski
On 14.10.2022. 23:57, Mikolaj Kucharski wrote:
> Kind reminder. Below there is a comment with an OK from sthen@
> 
> Diff at the end of this email.
> 
> 

Hi all,

can this diff be committed? The less verbose output of ifconfig for wg
interfaces is quite nice when managing a wg VPN server.

Thank you



> On Wed, Sep 07, 2022 at 05:29:38PM +0100, Stuart Henderson wrote:
>> On 2022/09/07 15:25, Mikolaj Kucharski wrote:
>>> Hi.
>>>
>>> I didn't get a lof of feedback on this on the code level, however
>>> got some intput on manual page changes. At the end of the email is
>>> ifconfig.8 change from jmc@ and ifconfig.c from me.
>>>
>>>
>>> On Sat, Sep 03, 2022 at 04:51:03PM +0100, Jason McIntyre wrote:
 On Sat, Sep 03, 2022 at 08:55:51AM +, Mikolaj Kucharski wrote:
> Hi,
>
> I tried to address what jmc@ mentioned below. I don't really know
> mdoc(7) and English is not my native language, so I imagine there is
> place for improvement in the wg(4) diff.
>

 hi.

 after looking again, i think maybe ifconfig.8 is the better place, but
 just not where it was originally proposed. by way of a peace offering,
 how about the diff below?

 jmc

>>> [...]
>>
>> It's all in ifndef SMALL so there are no ramdisk space concerns.
>> Works as expected, I think it's a good idea. It's OK with me.
>>
>>
>>>
>>> Index: ifconfig.c
>>> ===
>>> RCS file: /cvs/src/sbin/ifconfig/ifconfig.c,v
>>> retrieving revision 1.456
>>> diff -u -p -u -r1.456 ifconfig.c
>>> --- ifconfig.c  8 Jul 2022 07:04:54 -   1.456
>>> +++ ifconfig.c  7 Sep 2022 15:18:50 -
>>> @@ -363,7 +363,7 @@ voidunsetwgpeer(const char *, int);
>>>  void   unsetwgpeerpsk(const char *, int);
>>>  void   unsetwgpeerall(const char *, int);
>>>  
>>> -void   wg_status();
>>> +void   wg_status(int);
>>>  #else
>>>  void   setignore(const char *, int);
>>>  #endif
>>> @@ -679,7 +679,7 @@ voidprintgroupattribs(char *);
>>>  void   printif(char *, int);
>>>  void   printb_status(unsigned short, unsigned char *);
>>>  const char *get_linkstate(int, int);
>>> -void   status(int, struct sockaddr_dl *, int);
>>> +void   status(int, struct sockaddr_dl *, int, int);
>>>  __dead voidusage(void);
>>>  const char *get_string(const char *, const char *, u_int8_t *, int *);
>>>  intlen_string(const u_int8_t *, int);
>>> @@ -1195,7 +1195,7 @@ printif(char *name, int ifaliases)
>>> continue;
>>> ifdata = ifa->ifa_data;
>>> status(1, (struct sockaddr_dl *)ifa->ifa_addr,
>>> -   ifdata->ifi_link_state);
>>> +   ifdata->ifi_link_state, ifaliases);
>>> count++;
>>> noinet = 1;
>>> continue;
>>> @@ -3316,7 +3316,7 @@ get_linkstate(int mt, int link_state)
>>>   * specified, show it and it only; otherwise, show them all.
>>>   */
>>>  void
>>> -status(int link, struct sockaddr_dl *sdl, int ls)
>>> +status(int link, struct sockaddr_dl *sdl, int ls, int ifaliases)
>>>  {
>>> const struct afswtch *p = afp;
>>> struct ifmediareq ifmr;
>>> @@ -3391,7 +3391,7 @@ status(int link, struct sockaddr_dl *sdl
>>> mpls_status();
>>> pflow_status();
>>> umb_status();
>>> -   wg_status();
>>> +   wg_status(ifaliases);
>>>  #endif
>>> trunk_status();
>>> getifgroups();
>>> @@ -5907,7 +5907,7 @@ process_wg_commands(void)
>>>  }
>>>  
>>>  void
>>> -wg_status(void)
>>> +wg_status(int ifaliases)
>>>  {
>>> size_t   i, j, last_size;
>>> struct timespec  now;
>>> @@ -5942,45 +5942,47 @@ wg_status(void)
>>> printf("\twgpubkey %s\n", key);
>>> }
>>>  
>>> -   wg_peer = _interface->i_peers[0];
>>> -   for (i = 0; i < wg_interface->i_peers_count; i++) {
>>> -   b64_ntop(wg_peer->p_public, WG_KEY_LEN,
>>> -   key, sizeof(key));
>>> -   printf("\twgpeer %s\n", key);
>>> -
>>> -   if (wg_peer->p_flags & WG_PEER_HAS_PSK)
>>> -   printf("\t\twgpsk (present)\n");
>>> -
>>> -   if (wg_peer->p_flags & WG_PEER_HAS_PKA && wg_peer->p_pka)
>>> -   printf("\t\twgpka %u (sec)\n", wg_peer->p_pka);
>>> -
>>> -   if (wg_peer->p_flags & WG_PEER_HAS_ENDPOINT) {
>>> -   if (getnameinfo(_peer->p_sa, wg_peer->p_sa.sa_len,
>>> -   hbuf, sizeof(hbuf), sbuf, sizeof(sbuf),
>>> -   NI_NUMERICHOST | NI_NUMERICSERV) == 0)
>>> -   printf("\t\twgendpoint %s %s\n", hbuf, sbuf);
>>> -   else
>>> -   printf("\t\twgendpoint unable to print\n");
>>> -   }
>>> +   if (ifaliases) {
>>> +   wg_peer = _interface->i_peers[0];
>>> +   for (i = 0; i < wg_interface->i_peers_count; i++) {
>>> +   

Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-18 Thread Hrvoje Popovski
On 15.10.2022. 22:01, Moritz Buhl wrote:
> With the previous diffs I am seeing sporadic connection problems in tcpbench
> via IPv6 on Intel 82546GB.
> The diff was too big anyways. Here is a smaller diff that introduces
> checksum offloading for the controllers that use the advanced descriptors.
> 
> I tested this diff on i350 and 82571EB, receive, send, forwarding,
> icmp4, icmp6, tcp4, tcp6, udp4, udp6, also over vlan and veb.

Hi,

I've tried the same thing on I210 and 82576.



Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-11 Thread Hrvoje Popovski
On 11.10.2022. 17:16, Stuart Henderson wrote:

Hi,

> I tried this on my laptop which has I219-V em (I run it in a trunk
> with iwm). It breaks tx (packets don't show up on the other side).
> rx seems ok.
> 
> There is also a "em0: watchdog: head 111 tail 20 TDH 20 TDT 111"

this em log can be triggered with or without the em offload diff, with sh
/etc/netstart, but it seems that with this diff it's a little easier to
trigger it ...


> but that was after ~10 mins uptime so may have occurred after
> I switched the active port on the trunk(4) over to the iwm, or
> back again.



Re: em(4) IPv4, TCP, UDP checksum offloading

2022-10-11 Thread Hrvoje Popovski
On 11.10.2022. 15:03, Moritz Buhl wrote:
> Here is a new diff for checksum offloading (ipv4, udp, tcp) for em(4).
> 
> The previous diff didn't implement hardware vlan tagging for >em82578
> which should result in variable ethernet header lengths and thus
> wrong checksums inserted at wrong places.
> 
> The diff below addresses this.
> I would appreciate further testing reports with different controllers.

Hi,

what would be the best way to test this diff?

I have a box with

em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01: msi, address
00:1b:21:61:8a:94
em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01: msi, address
00:1b:21:61:8a:95
em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03: msi, address
00:25:90:5d:c9:98
em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03: msi, address
00:25:90:5d:c9:99
em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01: msi, address
00:25:90:5d:c9:9a
em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01: msi, address
00:25:90:5d:c9:9b
em6 at pci12 dev 0 function 2 "Intel I350" rev 0x01: msi, address
00:25:90:5d:c9:9c
em7 at pci12 dev 0 function 3 "Intel I350" rev 0x01: msi, address
00:25:90:5d:c9:9d

after this diff I'm seeing

em0: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:1b:21:61:8a:94
em1: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:1b:21:61:8a:95
em2: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:98
em3: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:99
em4: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:9a
em5: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:9b
em6: flags=8843 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:9c
em7: flags=8802 mtu 1500

hwfeatures=1b7
hardmtu 9216
lladdr 00:25:90:5d:c9:9d


and basic tcp/udp iperf is working as expected over plain interface.




Re: problem with interrupts on machines with many cores and multiqueue nics

2022-10-02 Thread Hrvoje Popovski
On 1.10.2022. 23:28, Mark Kettenis wrote:
> At least on some of these machines, you're simply running out of
> kernel malloc space.  The machines "hang" because the M_WAITOK flag is
> used for the allocations, and malloc(9) waits in the hope someone else
> gives up the memory.  Maybe we need to allow for more malloc space.
> But to be sure this isn't the drivers being completely stupid, vmstat
> -m output would be helpful.

Do you mean vmstat -m when the machines are fine, or "show all pools" or
some other command in the ddb console?




Re: sysupgrade - Reading from socket: Undefined error: 0

2022-09-20 Thread Hrvoje Popovski
On 20.9.2022. 8:50, Florian Obser wrote:
> On 2022-09-19 22:27 +02, Hrvoje Popovski  wrote:
>> Hi all,
>>
>> when doing sysupgrade few minutes ago on multiple machines i'm getting
>> error in subject
>>
>> smc24# sysupgrade -s
>> Fetching from https://cdn.openbsd.org/pub/OpenBSD/snapshots/amd64/
>> SHA256.sig   100% |*|
>> 2144   00:00
>> Signature Verified
>> INSTALL.amd64 100% ||
>> 43554   00:00
>> base72.tgz   100% |*|
>> 331 MB00:16
>> bsd  100% |*|
>> 22449 KB00:05
>> bsd.mp   100% |*|
>> 22562 KB00:04
>> bsd.rd   100% |*|
>> 4533 KB00:01
>> comp72.tgz   100% |*|
>> 74598 KB00:09
>> game72.tgz   100% |*|
>> 2745 KB00:01
>> man72.tgz100% |*|
>> 7610 KB00:02
>> xbase72.tgz   29% |**   |
>> 15744 KB00:14 ETAsysupgrade: Reading from socket: Undefined error: 0
>> smc24#
>>
> 
> Is this somehow coming from the non-blocking connect diff? I can't spot
> it though...
> 

I've managed to reproduce it only once. After that, sysupgrade works as
expected on all machines in my lab.



Re: sysupgrade - timezone

2022-09-19 Thread Hrvoje Popovski
On 19.9.2022. 23:22, Todd C. Miller wrote:
> There was a bad diff in that snapshot that caused tzset() to ignore
> /etc/localtime.
> 
>  - todd
> 


Thank you ...



Re: sysupgrade - timezone

2022-09-19 Thread Hrvoje Popovski
On 19.9.2022. 23:09, Todd C. Miller wrote:
> On Mon, 19 Sep 2022 23:06:13 +0200, Hrvoje Popovski wrote:
> 
>> after sysupgrade I'm having GMT
>>
>> OpenBSD 7.2 (GENERIC.MP) #736: Mon Sep 19 17:56:55 GMT 2022
>> r620-1# date
>> Mon Sep 19 21:01:14 GMT 2022
>>
>> r620-1# ls -apl /etc/localtime
>> lrwxr-xr-x  1 root  wheel  33 Feb 10  2022 /etc/localtime ->
>> /usr/share/zoneinfo/Europe/Zagreb
>>
>> I didn't upgrade r420-1 yet, but other boxes are upgraded to latest snapshot
> 
> Can you check the permissions on /usr/share/zoneinfo/Europe/Zagreb?
> If that file is not readable you will end up with GMT.
> 

r620-1# ls -apl /usr/share/zoneinfo/Europe/Zagreb
-r--r--r--  1 root  bin  1917 Sep 19 17:39 /usr/share/zoneinfo/Europe/Zagreb



> If it looks OK, what does:
> 
> TZ=Europe/Zagreb date
> 
> produce?
> 

r620-1# TZ=Europe/Zagreb date
Mon Sep 19 23:13:26 CEST 2022





>  - todd
> 



sysupgrade - timezone

2022-09-19 Thread Hrvoje Popovski
Hi all,

before sysupgrade I had CEST timezone

OpenBSD 7.2 (GENERIC.MP) #733: Sun Sep 18 06:39:56 MDT 2022
r420-1# date
Mon Sep 19 22:59:37 CEST 2022

r420-1# ls -apl /etc/localtime
lrwxr-xr-x  1 root  wheel  33 Nov 12  2019 /etc/localtime ->
/usr/share/zoneinfo/Europe/Zagreb



after sysupgrade I'm having GMT

OpenBSD 7.2 (GENERIC.MP) #736: Mon Sep 19 17:56:55 GMT 2022
r620-1# date
Mon Sep 19 21:01:14 GMT 2022

r620-1# ls -apl /etc/localtime
lrwxr-xr-x  1 root  wheel  33 Feb 10  2022 /etc/localtime ->
/usr/share/zoneinfo/Europe/Zagreb



I didn't upgrade r420-1 yet, but other boxes are upgraded to latest snapshot



Re: sysupgrade - Reading from socket: Undefined error: 0

2022-09-19 Thread Hrvoje Popovski
On 19.9.2022. 22:27, Hrvoje Popovski wrote:
> Hi all,
> 
> when doing sysupgrade few minutes ago on multiple machines i'm getting
> error in subject
> 
> smc24# sysupgrade -s
> Fetching from https://cdn.openbsd.org/pub/OpenBSD/snapshots/amd64/
> SHA256.sig   100% |*|
> 2144   00:00
> Signature Verified
> INSTALL.amd64 100% ||
> 43554   00:00
> base72.tgz   100% |*|
> 331 MB00:16
> bsd  100% |*|
> 22449 KB00:05
> bsd.mp   100% |*|
> 22562 KB00:04
> bsd.rd   100% |*|
> 4533 KB00:01
> comp72.tgz   100% |*|
> 74598 KB00:09
> game72.tgz   100% |*|
> 2745 KB00:01
> man72.tgz100% |*|
> 7610 KB00:02
> xbase72.tgz   29% |**   |
> 15744 KB00:14 ETAsysupgrade: Reading from socket: Undefined error: 0
> smc24#
> 

If I run it again sysupgrade seems fine

smc24# sysupgrade -s
Fetching from https://cdn.openbsd.org/pub/OpenBSD/snapshots/amd64/
SHA256.sig   100% |*|
2144   00:00
Signature Verified
Verifying old sets.
xbase72.tgz  100% |*|
52806 KB00:07
xfont72.tgz  100% |*|
22967 KB00:05
xserv72.tgz  100% |*|
14815 KB00:03
xshare72.tgz 100% |*|
4559 KB00:01
Verifying sets.
Fetching updated firmware.
fw_update: added none; updated none; kept vmm
Upgrading.



sysupgrade - Reading from socket: Undefined error: 0

2022-09-19 Thread Hrvoje Popovski
Hi all,

when doing sysupgrade few minutes ago on multiple machines i'm getting
error in subject

smc24# sysupgrade -s
Fetching from https://cdn.openbsd.org/pub/OpenBSD/snapshots/amd64/
SHA256.sig   100% |*|
2144   00:00
Signature Verified
INSTALL.amd64 100% ||
43554   00:00
base72.tgz   100% |*|
331 MB00:16
bsd  100% |*|
22449 KB00:05
bsd.mp   100% |*|
22562 KB00:04
bsd.rd   100% |*|
4533 KB00:01
comp72.tgz   100% |*|
74598 KB00:09
game72.tgz   100% |*|
2745 KB00:01
man72.tgz100% |*|
7610 KB00:02
xbase72.tgz   29% |**   |
15744 KB00:14 ETAsysupgrade: Reading from socket: Undefined error: 0
smc24#



Re: soreceive with shared netlock

2022-09-03 Thread Hrvoje Popovski
On 3.9.2022. 22:47, Alexander Bluhm wrote:
> Hi,
> 
> The next small step towards parallel network stack is to use shared
> netlock in soreceive().  The UDP and IP divert layer provide locking
> of the PCB.  If that is possible, use shared instead of exclusive
> netlock in soreceive().  The PCB mutex provides a per socket lock
> against multiple soreceive() running in parallel.
> 
> The solock_shared() is a bit hacky.  Especially where we have to
> release and regrab both locks in sosleep_nsec().  But it does what
> we need.
> 
> The udp_input() keeps exclusive lock for now.  Socket splicing and
> IP header chains are not completely MP safe yet.  Also udp_input()
> has some tentacles that need more testing.
> 
> Ok to procceed with soreceive() unlocking before release?  I think
> udp_input() has to wait post 7.2.

Hi,

with this diff while booting I'm getting this witness trace

iic0 at ichiic0
isa0 at pcib0
isadma0 at isa0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT
witness: lock_object uninitialized: 0x82379880
Starting stack trace...
witness_checkorder(82379880,9,0,82379880,8a214bc08a1c5075,822b0ff0)
at witness_checkorder+0xad
mtx_enter(82379870,82379870,6e3a311a0ad2ae1e,82097020,824c12c0,9)
at mtx_enter+0x34
tcp_slowtimo(812dd160,800020bf3110,82379870,82379870,6e3a311a0ad2ae1e,82097020)
at tcp_slowtimo+0x10
pfslowtimo(824c12c0,824c12c0,824c12c0,81ed1820,822b3148,d)
at pfslowtimo+0xc7
timeout_run(824c12c0,824c12c0,82301e90,800020bf31a0,81277b90,82301e90)
at timeout_run+0x93
softclock_thread(800020be2fc0,800020be2fc0,0,0,0,4) at
softclock_thread+0x11d
end trace frame: 0x0, count: 251
End of stack trace.





Re: ifconfig, wireguard output less verbose, unless -A or

2022-08-18 Thread Hrvoje Popovski
On 14.7.2022. 11:37, Mikolaj Kucharski wrote:
> Hi,
> 
> Per other thread, Theo expressed dissatisfaction with long ifconfig(8)
> for wg(4) interface. Stuart Henderson pointed me at direction, which
> below diff makes it work.
> 
> I guess to questions are:
> 
> - Does the behaviour of ifconfig(8) make sense?
> - Does the code which makes above, make sense?
> 
> This is minimal diff, I would appreciate feedback, I did least
> resistance approach. Looking at diff -wu shows even less changes
> as wg_status() is mainly identation with if-statement.
> 
> 
> Short output by default, only 6 lines, no wgpeers section:
> 
> pce-0067# ifconfig.ifaliases -a | tail -n6
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> 
> Long output with  as an argument, wgpeers section present:
> 
> pce-0067# ifconfig.ifaliases wg0
> wg0: flags=80c3 mtu 1420
> index 8 priority 0 llprio 3
> wgport 51820
> wgpubkey qcb...
> wgpeer klM...
> wgpsk (present)
> wgpka 25 (sec)
> wgendpoint xxx.xxx.xxx.xxx 51820
> tx: 178764, rx: 65100
> last handshake: 7 seconds ago
> wgaip fde4:f456:48c2:13c0::/64
> groups: wg
> inet6 fde4:f456:48c2:13c0::cc67 prefixlen 64
> 
> 
> Above long output works with group as an argument (ifconfig wg) and with
> option -A (ifconfig -A), so I think from user experience perspective,
> works as expected.
> 
> Manual page changes not provided, as I'm not sure are they needed with
> this diff.
> 
> Comments?



Hi,

from my user perspective this is wonderful.
I have 16 wgpeers, 5 vlans, 5 carps and 4 physical interfaces, and mostly
when running ifconfig I want to know whether the physical interfaces are up
and what the carp status is. I don't need 100 lines of wg output in plain
ifconfig.



Re: em(4) multiqueue

2022-08-15 Thread Hrvoje Popovski
On 12.8.2022. 22:15, Hrvoje Popovski wrote:
> Hi,
> 
> I'm testing forwarding over
> 
> em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
> em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
> em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
> em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
> em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
> em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,

I've managed to get Linux pktgen to send traffic on all 6 em interfaces
at the same time, and the box seems to work just fine. Some systat, vmstat
and kstat details are attached, captured while traffic is flowing over the box.

irq124/em0:0  701061017  7450
irq125/em0:1  700477475  7444
irq126/em0:2  700518530  7445
irq127/em0:3  700477219  7444
irq128/em0 120
irq129/em1:0  702693602  7468
irq130/em1:1  702621154  7467
irq131/em1:2  702638755  7467
irq132/em1:3  702619278  7467
irq133/em1  80
irq134/em2:0  700792107  7448
irq135/em2:1  685857158  7289
irq136/em2:2  685987301  7290
irq137/em2:3  685853293  7289
irq138/em2 120
irq139/em3:0  702784432  7469
irq140/em3:1  702673600  7468
irq141/em3:2  702692900  7468
irq142/em3:3  702670362  7468
irq143/em3  80
irq146/em4:0  691767956  7352
irq147/em4:1  687629590  7308
irq148/em4:2  687675100  7308
irq149/em4:3  687627987  7308
irq150/em4 120
irq151/em5:0  702655585  7467
irq152/em5:1  702482994  7466
irq153/em5:2  702502382  7466
irq154/em5:3  702481315  7466
irq155/em5  80

NAME   LEN  IDLE  NGC  CPU  REQ  REL  LREQ  LREL
knotepl  80   140 8680 87074
6
1 4181 42310
6
2 3659 36880
3
3 3409 34430
3
mbufpl 628* 64230  31819767309  36433364464 19452409
 46897323
1  32515621331  29647742759 38355243
 15286334
2  32477451749  32774718213 25059389
 35077344
3  32535508383  30493295832 36129475
 21741989
mcl12k   8000000
0
1000
0
2000
0
3000
0
mcl16k   8000000
0
1000
0
2000
0
3000
0
mcl2k8 5200 21370105468738 74153346  3915988
 1562
1650734300439537696 26402919
 3343
2617420319419639380 24727419
 4801
3665185877   1105501145 26117590
 81156997
mcl2k2 601* 61400  31683334392  36327982542 19966506
 48295824
1  31678859095  29022314473 38321812
 16928852
2  31682996160  32178017835 24908639
 37514119
3  31678642505  29196245479 37458168
 17922524
mcl4k8000000
0
1000
0
2000
0
3000
0
mcl64k   8000000
0
1 

Re: em(4) multiqueue

2022-08-12 Thread Hrvoje Popovski
On 28.6.2022. 15:11, Jonathan Matthew wrote:
> This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
> Note that desktop/laptop models (I218, I219 etc.) do not support multiple 
> queues,
> so this only really applies to servers and network appliances (including 
> APU2).
> 
> It also removes the 'em_enable_msix' variable, in favour of using MSI-X on 
> devices
> that support multiple queues and MSI or INTX everywhere else.
> 
> I've tested this with an I350 on amd64 and arm64, where it works as expected, 
> and
> with the I218-LM in my laptop where it does nothing (as expected).
> More testing is welcome, especially in forwarding environments.


Hi,

I'm testing forwarding over

em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
em1 at pci7 dev 0 function 1 "Intel 82576" rev 0x01, msix, 4 queues,
em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
em3 at pci9 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
em4 at pci12 dev 0 function 0 "Intel I350" rev 0x01, msix, 4 queues,
em5 at pci12 dev 0 function 1 "Intel I350" rev 0x01, msix, 4 queues,

and plain forwarding seems to work as expected.
I'm sending traffic from em0 to em1, from em2 to em3, and from em4 to
em5; em6 is for ssh ...
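Whether a given em(4) port actually got multiple queues can be read straight off its dmesg attach line. A throwaway Python sketch for that (the sample dmesg lines are assumptions modeled on the ones quoted above; single-queue devices such as the I218/I219 attach without a "N queues" note):

```python
import re

# Hypothetical dmesg excerpt; real lines look like the ones quoted above.
DMESG = """\
em0 at pci7 dev 0 function 0 "Intel 82576" rev 0x01, msix, 4 queues,
em2 at pci8 dev 0 function 0 "Intel I210" rev 0x03, msix, 4 queues,
em6 at pci13 dev 0 function 0 "Intel I219-LM" rev 0x00, msi
"""

def queue_counts(dmesg):
    """Map interface name -> number of rx/tx queues (1 if single-queue)."""
    counts = {}
    for line in dmesg.splitlines():
        m = re.match(r"(em\d+) at pci", line)
        if not m:
            continue
        q = re.search(r"(\d+) queues", line)
        counts[m.group(1)] = int(q.group(1)) if q else 1
    return counts

print(queue_counts(DMESG))
```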


irq124/em0:0  1233974  1316
irq125/em0:1  1233943  1316
irq126/em0:2  1233942  1316
irq127/em0:3  1233944  1316
irq128/em0          2     0
irq129/em1:0  1021586  1090
irq132/em1:3        2     0
irq133/em1          4     0
irq98/xhci0        94     0
irq99/ehci0        19     0
irq134/em2:0   466894   498
irq135/em2:1   466846   498
irq136/em2:2   466846   498
irq137/em2:3   466846   498
irq138/em2          2     0
irq139/em3:0   467019   498
irq143/em3          2     0
irq146/em4:0  1192252  1272
irq147/em4:1  1192213  1272
irq148/em4:2  1192211  1272
irq149/em4:3  1192212  1272
irq150/em4          2     0
irq151/em5:0  1192354  1272
irq155/em5          2     0
irq156/em6:0     2936     3
irq157/em6:1       84     0
irq158/em6:2       32     0
irq159/em6:3       30     0
irq160/em6          2     0


OpenBSD 7.2-beta (GENERIC.MP) #0: Fri Aug 12 12:50:45 CEST 2022
r...@smc4.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 17052663808 (16262MB)
avail mem = 16518463488 (15753MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xed9b0 (48 entries)
bios0: vendor American Megatrends Inc. version "2.3" date 05/07/2021
bios0: Supermicro Super Server
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG UEFI DBG2 HPET WDDT
SSDT SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices IP2P(S4) EHC1(S4) EHC2(S4) RP07(S4) RP08(S4)
BR1A(S4) BR1B(S4) BR2A(S4) BR2B(S4) BR2C(S4) BR2D(S4) BR3A(S4) BR3B(S4)
BR3C(S4) BR3D(S4) RP01(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.34 MHz, 06-56-03
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB
64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.01 MHz, 06-56-03
cpu1:

Re: ix(4): Add support for TCP Large Receive Offloading

2022-08-12 Thread Hrvoje Popovski
On 25.6.2022. 22:29, Jan Klemkow wrote:
> There is a bug in the firmware of ix(4) NICs, at least with chip 82599 and
> x540.  I could not test it with x550 chips, but i guess they have the same 
> bug.


Hi,

I've tested TSO with the latest snapshot on an X552, with and without
vlan, and everything seems to work just as it does with the 82599...

OpenBSD 7.2-beta (GENERIC.MP) #0: Fri Aug 12 12:50:45 CEST 2022
r...@smc4.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 17052663808 (16262MB)
avail mem = 16518451200 (15753MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xed9b0 (48 entries)
bios0: vendor American Megatrends Inc. version "2.3" date 05/07/2021
bios0: Supermicro Super Server
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG UEFI DBG2 HPET WDDT
SSDT SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices IP2P(S4) EHC1(S4) EHC2(S4) RP07(S4) RP08(S4)
BR1A(S4) BR1B(S4) BR2A(S4) BR2B(S4) BR2C(S4) BR2D(S4) BR3A(S4) BR3B(S4)
BR3C(S4) BR3D(S4) RP01(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.28 MHz, 06-56-03
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB
64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.01 MHz, 06-56-03
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB
64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.01 MHz, 06-56-03
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB
64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz, 2200.01 MHz, 06-56-03
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB
64b/line 8-way L2 cache, 6MB 64b/line 12-way L3 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0x8000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 255 (UNC0)
acpiprt1 at acpi0: bus 0 (PCI0)
acpiprt2 at acpi0: bus 1 (BR1A)
acpiprt3 at acpi0: bus 2 (BR1B)
acpiprt4 at acpi0: bus 4 (BR2C)
acpiprt5 at acpi0: bus 5 (BR3A)
acpiprt6 at acpi0: bus 7 (RP01)
acpiprt7 at acpi0: bus 8 (RP02)
acpiprt8 at acpi0: bus 9 (RP04)
acpiprt9 at acpi0: bus 10 (BR3A)
acpiprt10 at acpi0: bus 11 (RP05)
acpipci0 at acpi0 UNC0: 0x0010 0x0011 0x
"ACPI0004" at acpi0 not configured
"ACPI0007" at acpi0 not configured
"ACPI0007" at 

splassert: vlan_ioctl: want 2 have 0

2022-08-09 Thread Hrvoje Popovski
Hi all,

I've sysupgraded a firewall from
OpenBSD 7.2-beta (GENERIC.MP) #651: Tue Jul 26 23:11:26 MDT 2022
to
OpenBSD 7.2-beta (GENERIC.MP) #677: Mon Aug  8 18:58:49 MDT 2022

and on the console there were lots of splasserts like the one below. I'm
running ix, aggr and vlan on that firewall.


splassert: vlan_ioctl: want 2 have 0
Starting stack trace...
vlan_ioctl(80d55800,c0406938,800020ccfe20) at vlan_ioctl+0x64
ifioctl(fd887d7fdc78,c0406938,800020ccfe20,800020cac550) at
ifioctl+0xc40
soo_ioctl(fd887c7dd7f8,c0406938,800020ccfe20,800020cac550)
at soo_ioctl+0x161
sys_ioctl(800020cac550,800020ccff30,800020ccff90) at
sys_ioctl+0x2c4
syscall(800020cd) at syscall+0x384
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7c3da0, count: 251
End of stack trace.



Re: [v5] amd64: simplify TSC sync testing

2022-08-02 Thread Hrvoje Popovski
On 2.8.2022. 23:40, Stuart Henderson wrote:
> On 2022/08/02 22:28, Hrvoje Popovski wrote:
>>
>> this is report from Dell R7515 with AMD EPYC 7702P 64-Core Processor
>>
>>
>> r7515$ sysctl | grep tsc
>> kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(-1000)
>> acpihpet0(1000) acpitimer0(1000)
>> machdep.tscfreq=1996246800
>> machdep.invarianttsc=1
> 
> Just to be sure, are you able to check kern.timecounter.hardware
> on this one please? With the fix in v5 this should now use
> acpihpet or acpitimer (which are reasonable choices given the
> sync issues with tsc on this hw).
> 
> (The other machines you sent details for have enough to tell that
> they're working as expected).
> 
> Thanks for the tests on these various interesting CPUs.
> 

Yes, I started with "| grep tsc", and with the laptop I thought I was
missing some information, so here is everything from the start :)





Re: [v5] amd64: simplify TSC sync testing

2022-08-02 Thread Hrvoje Popovski
On 2.8.2022. 22:28, Hrvoje Popovski wrote:
> Hi,
> 
> this is report from Dell R7515 with AMD EPYC 7702P 64-Core Processor
> 
> 
> r7515$ sysctl | grep tsc
> kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(-1000)
> acpihpet0(1000) acpitimer0(1000)
> machdep.tscfreq=1996246800
> machdep.invarianttsc=1


snapshot
OpenBSD 7.2-beta (GENERIC.MP) #662: Tue Aug  2 11:58:47 MDT 2022

r7515# sysctl kern.timecounter
kern.timecounter.tick=1
kern.timecounter.timestepwarnings=0
kern.timecounter.hardware=i8254
kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(-1000)
acpihpet0(1000) acpitimer0(1000)


v5 diff
r7515# sysctl kern.timecounter
kern.timecounter.tick=1
kern.timecounter.timestepwarnings=0
kern.timecounter.hardware=acpihpet0
kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(-1000)
acpihpet0(1000) acpitimer0(1000)
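The switch from i8254 to acpihpet0 above follows from the quality numbers shown in kern.timecounter.choice: the kernel prefers the counter with the highest quality, and the v5 diff demotes a TSC that fails the sync test to a negative quality. A simplified Python sketch of that selection logic (not the actual kern_tc.c code; the quality values are taken from the sysctl outputs in this thread):

```python
def best_timecounter(choice):
    """choice: {name: quality}. The highest-quality counter wins;
    negative quality roughly means 'use only as a last resort'."""
    return max(choice, key=choice.get)

# Qualities as shown in kern.timecounter.choice when the TSC passes.
before = {"i8254": 0, "mcx1": -100, "mcx0": -100,
          "tsc": 2000, "acpihpet0": 1000, "acpitimer0": 1000}
after = dict(before, tsc=-1000)   # TSC demoted after a failed sync test

print(best_timecounter(before))   # tsc wins at quality 2000
print(best_timecounter(after))    # falls back to acpihpet0
```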




Re: [v5] amd64: simplify TSC sync testing

2022-08-02 Thread Hrvoje Popovski
On 31.7.2022. 5:13, Scott Cheloha wrote:
> Hi,
> 
> At the urging of sthen@ and dv@, here is v5.
> 
> Two major changes from v4:
> 
> - Add the function tc_reset_quality() to kern_tc.c and use it
>   to lower the quality of the TSC timecounter if we fail the
>   sync test.
> 
>   tc_reset_quality() will choose a new active timecounter if,
>   after the quality change, the given timecounter is no longer
>   the best timecounter.
> 
>   The upshot is: if you fail the TSC sync test you should boot
>   with the HPET as your active timecounter.  If you don't have
>   an HPET you'll be using something else.
> 
> - Drop the SMT accommodation from the hot loop.  It hasn't been
>   necessary since last year when I rewrote the test to run without
>   a mutex.  In the rewritten test, the two CPUs in the hot loop
>   are not competing for any resources so they should not be able
>   to starve one another.
> 
> dv: Could you double-check that this still chooses the right
> timecounter on your machine?  If so, I will ask deraadt@ to
> put this into snaps to replace v4.
> 
> Additional test reports are welcome.  Include your dmesg.

Hi,

this is a report from a Lenovo ThinkPad E14 Gen 2 with an AMD Ryzen 5 4500U (6 cores)

e14gen2# sysctl | grep kern.timecounter
kern.timecounter.tick=1
kern.timecounter.timestepwarnings=0
kern.timecounter.hardware=acpihpet0
kern.timecounter.choice=i8254(0) tsc(-1000) acpihpet0(1000) acpitimer0(1000)

OpenBSD 7.2-beta (GENERIC.MP) #20: Tue Aug  2 23:13:28 CEST 2022
hrv...@e14gen2.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 7713394688 (7356MB)
avail mem = 7462240256 (7116MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.2 @ 0xbf912000 (63 entries)
bios0: vendor LENOVO version "R1AET42W (1.18 )" date 03/03/2022
bios0: LENOVO 20T6000TSC
acpi0 at bios0: ACPI 6.3
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT SSDT IVRS SSDT SSDT TPM2 SSDT MSDM BATB
HPET APIC MCFG SBST WSMT VFCT SSDT CRAT CDIT FPDT SSDT SSDT SSDT BGRT
UEFI SSDT SSDT
acpi0: wakeup devices GPP3(S3) GPP4(S4) GPP5(S3) XHC0(S3) XHC1(S3)
GP19(S3) LID_(S4) SLPB(S3)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD Ryzen 5 4500U with Radeon Graphics, 2370.90 MHz, 17-60-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Ryzen 5 4500U with Radeon Graphics, 2370.55 MHz, 17-60-01
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD Ryzen 5 4500U with Radeon Graphics, 2370.55 MHz, 17-60-01
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 4 (application processor)
tsc: cpu0/cpu3: sync test round 1/2 failed
tsc: cpu0/cpu3: cpu3: 51485 lags 5130 cycles
cpu3: AMD Ryzen 5 4500U with Radeon Graphics, 2370.55 MHz, 17-60-01
cpu3:

Re: [v5] amd64: simplify TSC sync testing

2022-08-02 Thread Hrvoje Popovski
On 31.7.2022. 5:13, Scott Cheloha wrote:
> Hi,
> 
> At the urging of sthen@ and dv@, here is v5.
> 
> Two major changes from v4:
> 
> - Add the function tc_reset_quality() to kern_tc.c and use it
>   to lower the quality of the TSC timecounter if we fail the
>   sync test.
> 
>   tc_reset_quality() will choose a new active timecounter if,
>   after the quality change, the given timecounter is no longer
>   the best timecounter.
> 
>   The upshot is: if you fail the TSC sync test you should boot
>   with the HPET as your active timecounter.  If you don't have
>   an HPET you'll be using something else.
> 
> - Drop the SMT accommodation from the hot loop.  It hasn't been
>   necessary since last year when I rewrote the test to run without
>   a mutex.  In the rewritten test, the two CPUs in the hot loop
>   are not competing for any resources so they should not be able
>   to starve one another.
> 
> dv: Could you double-check that this still chooses the right
> timecounter on your machine?  If so, I will ask deraadt@ to
> put this into snaps to replace v4.
> 
> Additional test reports are welcome.  Include your dmesg.

Hi,

this is a report from a Supermicro AS-1114S-WTRT with an AMD EPYC 7413
24-Core Processor


smc24# sysctl | grep tsc
kern.timecounter.hardware=tsc
kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(2000)
acpihpet0(1000) acpitimer0(1000)
machdep.tscfreq=2650005829
machdep.invarianttsc=1
smc24#


smc24# dmesg
OpenBSD 7.2-beta (GENERIC.MP) #12: Tue Aug  2 22:41:12 CEST 2022
hrv...@smc24.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 68497051648 (65323MB)
avail mem = 66403737600 (63327MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.3 @ 0xa9d1c000 (71 entries)
bios0: vendor American Megatrends Inc. version "2.4" date 04/13/2022
bios0: Supermicro AS -1114S-WTRT
acpi0 at bios0: ACPI 6.0
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP SSDT SPMI SSDT FIDT MCFG SSDT SSDT BERT HPET
IVRS PCCT SSDT CRAT CDIT SSDT WSMT APIC ERST HEST
acpi0: wakeup devices B000(S3) C000(S3) B010(S3) C010(S3) B030(S3)
C030(S3) B020(S3) C020(S3) B100(S3) C100(S3) B110(S3) C110(S3) B130(S3)
C130(S3) B120(S3) C120(S3)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimcfg0 at acpi0
acpimcfg0: addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318180 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD EPYC 7413 24-Core Processor, 2650.38 MHz, 19-01-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,PKU,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 32MB 64b/line 16-way L3 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 100MHz
cpu0: mwait min=64, max=64, C-substates=1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD EPYC 7413 24-Core Processor, 2650.01 MHz, 19-01-01
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,PKU,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu1: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 32MB 64b/line 16-way L3 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD EPYC 7413 24-Core Processor, 2650.00 MHz, 19-01-01
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,PKU,IBPB,IBRS,STIBP,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
cpu2: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 32MB 64b/line 16-way L3 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: AMD EPYC 7413 24-Core Processor, 2650.00 MHz, 19-01-01
cpu3:

Re: [v5] amd64: simplify TSC sync testing

2022-08-02 Thread Hrvoje Popovski
On 31.7.2022. 5:13, Scott Cheloha wrote:
> Hi,
> 
> At the urging of sthen@ and dv@, here is v5.
> 
> Two major changes from v4:
> 
> - Add the function tc_reset_quality() to kern_tc.c and use it
>   to lower the quality of the TSC timecounter if we fail the
>   sync test.
> 
>   tc_reset_quality() will choose a new active timecounter if,
>   after the quality change, the given timecounter is no longer
>   the best timecounter.
> 
>   The upshot is: if you fail the TSC sync test you should boot
>   with the HPET as your active timecounter.  If you don't have
>   an HPET you'll be using something else.
> 
> - Drop the SMT accommodation from the hot loop.  It hasn't been
>   necessary since last year when I rewrote the test to run without
>   a mutex.  In the rewritten test, the two CPUs in the hot loop
>   are not competing for any resources so they should not be able
>   to starve one another.
> 
> dv: Could you double-check that this still chooses the right
> timecounter on your machine?  If so, I will ask deraadt@ to
> put this into snaps to replace v4.
> 
> Additional test reports are welcome.  Include your dmesg.


Hi,

this is a report from a Dell R7515 with an AMD EPYC 7702P 64-Core Processor


r7515$ sysctl | grep tsc
kern.timecounter.choice=i8254(0) mcx1(-100) mcx0(-100) tsc(-1000)
acpihpet0(1000) acpitimer0(1000)
machdep.tscfreq=1996246800
machdep.invarianttsc=1

r7515$ dmesg | grep tsc
tsc: cpu0/cpu4: sync test round 1/2 failed
tsc: cpu0/cpu4: cpu4: 3 lags 340 cycles
tsc: cpu0/cpu10: sync test round 1/2 failed
tsc: cpu0/cpu10: cpu10: 3 lags 260 cycles
tsc: cpu0/cpu12: sync test round 1/2 failed
tsc: cpu0/cpu12: cpu12: 6155 lags 280 cycles
tsc: cpu0/cpu13: sync test round 1/2 failed
tsc: cpu0/cpu13: cpu13: 15659 lags 220 cycles
tsc: cpu0/cpu14: sync test round 1/2 failed
tsc: cpu0/cpu14: cpu14: 42 lags 160 cycles
tsc: cpu0/cpu15: sync test round 1/2 failed
tsc: cpu0/cpu15: cpu15: 337 lags 160 cycles
tsc: cpu0/cpu16: sync test round 1/2 failed
tsc: cpu0/cpu16: cpu16: 973 lags 160 cycles
tsc: cpu0/cpu17: sync test round 1/2 failed
tsc: cpu0/cpu17: cpu17: 43 lags 180 cycles
tsc: cpu0/cpu18: sync test round 1/2 failed
tsc: cpu0/cpu18: cpu18: 30 lags 180 cycles
tsc: cpu0/cpu19: sync test round 1/2 failed
tsc: cpu0/cpu19: cpu19: 42 lags 160 cycles
tsc: cpu0/cpu20: sync test round 1/2 failed
tsc: cpu0/cpu20: cpu20: 1739 lags 180 cycles
tsc: cpu0/cpu21: sync test round 1/2 failed
tsc: cpu0/cpu21: cpu21: 6180 lags 260 cycles
tsc: cpu0/cpu22: sync test round 1/2 failed
tsc: cpu0/cpu22: cpu22: 696 lags 160 cycles
tsc: cpu0/cpu23: sync test round 1/2 failed
tsc: cpu0/cpu23: cpu23: 36 lags 160 cycles
tsc: cpu0/cpu24: sync test round 1/2 failed
tsc: cpu0/cpu24: cpu24: 64 lags 140 cycles
tsc: cpu0/cpu25: sync test round 1/2 failed
tsc: cpu0/cpu25: cpu25: 6086 lags 320 cycles
tsc: cpu0/cpu26: sync test round 1/2 failed
tsc: cpu0/cpu26: cpu26: 40 lags 160 cycles
tsc: cpu0/cpu27: sync test round 1/2 failed
tsc: cpu0/cpu27: cpu27: 4 lags 300 cycles
tsc: cpu0/cpu28: sync test round 1/2 failed
tsc: cpu0/cpu28: cpu28: 2 lags 160 cycles
tsc: cpu0/cpu29: sync test round 1/2 failed
tsc: cpu0/cpu29: cpu29: 2 lags 180 cycles
tsc: cpu0/cpu30: sync test round 1/2 failed
tsc: cpu0/cpu30: cpu30: 2 lags 140 cycles
tsc: cpu0/cpu31: sync test round 1/2 failed
tsc: cpu0/cpu31: cpu31: 32 lags 300 cycles
tsc: cpu0/cpu32: sync test round 1/2 failed
tsc: cpu0/cpu32: cpu32: 48 lags 200 cycles
tsc: cpu0/cpu33: sync test round 1/2 failed
tsc: cpu0/cpu33: cpu33: 2 lags 100 cycles
tsc: cpu0/cpu34: sync test round 1/2 failed
tsc: cpu0/cpu34: cpu34: 2 lags 120 cycles
tsc: cpu0/cpu35: sync test round 1/2 failed
tsc: cpu0/cpu35: cpu35: 2 lags 140 cycles
tsc: cpu0/cpu36: sync test round 1/2 failed
tsc: cpu0/cpu36: cpu36: 2 lags 120 cycles
tsc: cpu0/cpu37: sync test round 1/2 failed
tsc: cpu0/cpu37: cpu37: 13144 lags 200 cycles
tsc: cpu0/cpu38: sync test round 1/2 failed
tsc: cpu0/cpu38: cpu38: 28 lags 180 cycles
tsc: cpu0/cpu39: sync test round 1/2 failed
tsc: cpu0/cpu39: cpu39: 2 lags 120 cycles
tsc: cpu0/cpu40: sync test round 1/2 failed
tsc: cpu0/cpu40: cpu40: 13167 lags 240 cycles
tsc: cpu0/cpu41: sync test round 1/2 failed
tsc: cpu0/cpu41: cpu41: 2 lags 120 cycles
tsc: cpu0/cpu42: sync test round 1/2 failed
tsc: cpu0/cpu42: cpu42: 2 lags 120 cycles
tsc: cpu0/cpu43: sync test round 1/2 failed
tsc: cpu0/cpu43: cpu43: 12558 lags 180 cycles
tsc: cpu0/cpu44: sync test round 1/2 failed
tsc: cpu0/cpu44: cpu44: 13 lags 200 cycles
tsc: cpu0/cpu45: sync test round 1/2 failed
tsc: cpu0/cpu45: cpu45: 6 lags 180 cycles
tsc: cpu0/cpu46: sync test round 1/2 failed
tsc: cpu0/cpu46: cpu46: 189 lags 200 cycles
tsc: cpu0/cpu47: sync test round 1/2 failed
tsc: cpu0/cpu47: cpu47: 3 lags 140 cycles
tsc: cpu0/cpu48: sync test round 1/2 failed
tsc: cpu0/cpu48: cpu48: 2 lags 160 cycles
tsc: cpu0/cpu49: sync test round 1/2 failed
tsc: cpu0/cpu49: cpu49: 8 lags 180 cycles
tsc: cpu0/cpu50: sync test round 1/2 failed
tsc: cpu0/cpu50: cpu50: 2 lags 160 cycles
tsc: 
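The dmesg | grep tsc listing above is long; a throwaway parser for lines of the form `tsc: cpu0/cpuN: cpuN: X lags Y cycles` makes it easy to spot the worst offender. A hedged Python sketch (the field layout is assumed from the listing above):

```python
import re

SAMPLE = """\
tsc: cpu0/cpu4: sync test round 1/2 failed
tsc: cpu0/cpu4: cpu4: 3 lags 340 cycles
tsc: cpu0/cpu13: sync test round 1/2 failed
tsc: cpu0/cpu13: cpu13: 15659 lags 220 cycles
"""

def worst_lag(text):
    """Return (cpu, lag_count, cycles) for the CPU with the most lags."""
    worst = None
    for line in text.splitlines():
        m = re.match(r"tsc: cpu0/(cpu\d+): cpu\d+: (\d+) lags (\d+) cycles",
                     line)
        if m:
            cpu, count, cycles = m.group(1), int(m.group(2)), int(m.group(3))
            if worst is None or count > worst[1]:
                worst = (cpu, count, cycles)
    return worst

print(worst_lag(SAMPLE))
```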

Re: fix NAT round-robin and random

2022-08-01 Thread Hrvoje Popovski
On 20.7.2022. 22:27, Alexandr Nedvedicky wrote:
> Hello,
> 
> below is a final version of patch for NAT issue discussed at bugs@ [1].
> Patch below is updated according to feedback I got from Chris, claudio@
> and hrvoje@.
> 
> The summary of changes is as follows:
> 
> - prevent infinite loop when packet hits NAT rule as follows:
>   pass out on em0 from 172.16.0.0/16 to any nat-to { 49/27 }
> the issue has been introduced by my earlier commit [2]. The earlier
> change makes pf(4) to interpret 49/27 as single IP address (POOL_NONE)
> this is wrong, because pool 49/27 actually contains 32 addresses.
> 
> - while investigating the issue I've realized 'random' pool should
>   rather be using arc4_uniform() with upper limit derived from mask.
>   also the random number should be turned to netorder.
> 
> - also while I was debugging my change I've noticed we should be using
>   pf_poolmask() to obtain the address as a combination of the pool address
>   and the result of the generator (round-robin and random).
> 
> OK to commit?
> 
> thanks and
> regards
> sashan
> 
> 
> [1] https://marc.info/?t=16581336821=1=2
> https://marc.info/?t=16573254651=1=2
> https://marc.info/?l=openbsd-bugs=165817500514813=2
> 
> [2] https://marc.info/?l=openbsd-cvs=164500117319660=2


Hi all,

I've tested this diff and, from what I can see, NAT behaves as it should
and changes IP addresses quite nicely.
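The behaviour sashan@ describes above — a /27 pool contains 32 addresses, the random generator should be bounded by the mask, and pf_poolmask() combines the pool's network bits with the generated host bits — can be sketched in userland like this (plain Python standing in for the kernel's arc4random_uniform() and pf_poolmask(); the pool address is an arbitrary documentation prefix):

```python
import ipaddress
import random

def random_pool_addr(pool_cidr, rnd=None):
    """Pick an address from a NAT pool: keep the pool's network bits
    and draw the host bits uniformly, roughly what combining
    pf_poolmask() with a uniform random generator does."""
    rnd = rnd or random.Random()
    net = ipaddress.ip_network(pool_cidr)
    # randrange(num_addresses) is the userland analogue of bounding
    # the generator by the pool mask.
    return net.network_address + rnd.randrange(net.num_addresses)

pool = "198.51.100.64/27"            # example pool: 32 addresses
addr = random_pool_addr(pool, random.Random(7))
assert addr in ipaddress.ip_network(pool)
print(addr)
```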




ix(4) TSO and PPPoE problem

2022-07-10 Thread Hrvoje Popovski
Hi all,

while testing mvs@'s npppd diffs I've noticed that if TSO is enabled on
the pppoe server, forwarding between a pppoe client and a host outside
the pppoe server is very slow. Interestingly, iperf between pppoe
clients behaves fine with or without TSO on the pppoe server. Enabling
or disabling TSO on the pppoe clients doesn't have any impact on
network performance.

Setup:
pf is disabled
pppoe server - 192.168.100.205 (ix0) <- this is where I'm disabling or
enabling TSO
- serves 10.53.253/24 network to pppoe clients
pppoe client1 - r620-2 - ix0 without address - pppoe0 10.53.253.12/24
pppoe client2 - smc24 - ix0 without address - pppoe0 10.53.253.13/24
pppoe client3 - x3550m4 - ix0 without address - pppoe0 10.53.253.14/24
test2 box - 192.168.100.219
- route 10.53.253.0/24 via 192.168.100.205 dev p2p1


pppoe client iperf server - test2 box iperf client - TSO disabled

[root@test2 ~]# iperf3 -c 10.53.253.12
Connecting to host 10.53.253.12, port 5201
[  4] local 192.168.100.219 port 40488 connected to 10.53.253.12 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-1.00   sec  79.4 MBytes   666 Mbits/sec0   52.0 KBytes
[  4]   1.00-2.00   sec   134 MBytes  1.12 Gbits/sec0   87.2 KBytes
[  4]   2.00-3.00   sec   160 MBytes  1.34 Gbits/sec0121 KBytes
[  4]   3.00-4.00   sec   193 MBytes  1.62 Gbits/sec0155 KBytes
[  4]   4.00-5.00   sec   239 MBytes  2.01 Gbits/sec0187 KBytes
[  4]   5.00-6.00   sec   255 MBytes  2.14 Gbits/sec0226 KBytes
[  4]   6.00-7.00   sec   262 MBytes  2.20 Gbits/sec0262 KBytes
[  4]   7.00-8.00   sec   286 MBytes  2.39 Gbits/sec0300 KBytes
[  4]   8.00-9.00   sec   304 MBytes  2.55 Gbits/sec0360 KBytes
[  4]   9.00-10.00  sec   322 MBytes  2.70 Gbits/sec0360 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -


pppoe client iperf client - test2 box iperf server - TSO disabled

r620-2# iperf3 -c 192.168.100.219
Connecting to host 192.168.100.219, port 5201
[  5] local 10.53.253.12 port 20950 connected to 192.168.100.219 port 5201
[ ID] Interval   Transfer Bitrate
[  5]   0.00-1.00   sec   612 MBytes  5.13 Gbits/sec
[  5]   1.00-2.00   sec   608 MBytes  5.11 Gbits/sec
[  5]   2.00-3.00   sec   608 MBytes  5.10 Gbits/sec
[  5]   3.00-4.00   sec   609 MBytes  5.11 Gbits/sec
[  5]   4.00-5.00   sec   599 MBytes  5.02 Gbits/sec
[  5]   5.00-6.00   sec   595 MBytes  4.99 Gbits/sec
[  5]   6.00-7.00   sec   600 MBytes  5.03 Gbits/sec
[  5]   7.00-8.00   sec   552 MBytes  4.63 Gbits/sec
[  5]   8.00-9.00   sec   612 MBytes  5.13 Gbits/sec
[  5]   9.00-10.00  sec   615 MBytes  5.16 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -


pppoe client iperf server - test2 box iperf client - TSO enabled

[root@test2 ~]# iperf3 -c 10.53.253.12
Connecting to host 10.53.253.12, port 5201
[  4] local 192.168.100.219 port 40496 connected to 10.53.253.12 port 5201
[ ID] Interval   Transfer Bandwidth   Retr  Cwnd
[  4]   0.00-1.00   sec  40.8 KBytes   334 Kbits/sec   15   2.81 KBytes
[  4]   1.00-2.00   sec   136 KBytes  1.12 Mbits/sec   37   2.81 KBytes
[  4]   2.00-3.00   sec   498 KBytes  4.08 Mbits/sec   97   2.81 KBytes
[  4]   3.00-4.00   sec   658 KBytes  5.39 Mbits/sec  128   4.22 KBytes
[  4]   4.00-5.00   sec   776 KBytes  6.36 Mbits/sec  148   4.22 KBytes
[  4]   5.00-6.00   sec   523 KBytes  4.29 Mbits/sec  101   4.22 KBytes
[  4]   6.00-7.00   sec  1012 KBytes  8.29 Mbits/sec  194   2.81 KBytes
[  4]   7.00-8.00   sec   582 KBytes  4.77 Mbits/sec  116   4.22 KBytes
[  4]   8.00-9.00   sec   599 KBytes  4.91 Mbits/sec  121   2.81 KBytes
[  4]   9.00-10.00  sec  1.07 MBytes  8.99 Mbits/sec  209   4.22 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -


pppoe client iperf client - test2 box iperf server - TSO enabled

r620-2# iperf3 -c 192.168.100.219
Connecting to host 192.168.100.219, port 5201
[  5] local 10.53.253.12 port 36462 connected to 192.168.100.219 port 5201
[ ID] Interval   Transfer Bitrate
[  5]   0.00-1.01   sec  33.8 KBytes   274 Kbits/sec
[  5]   1.01-2.01   sec  14.1 KBytes   115 Kbits/sec
[  5]   2.01-3.01   sec  2.81 KBytes  23.0 Kbits/sec
[  5]   3.01-4.01   sec  9.84 KBytes  80.6 Kbits/sec
[  5]   4.01-5.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   5.01-6.00   sec  4.22 KBytes  34.9 Kbits/sec
[  5]   6.00-7.01   sec  7.03 KBytes  57.0 Kbits/sec
[  5]   7.01-8.01   sec  0.00 Bytes  0.00 bits/sec
[  5]   8.01-9.00   sec  4.22 KBytes  34.9 Kbits/sec
[  5]   9.00-10.01  sec  0.00 Bytes  0.00 bits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
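The interval listings above can be condensed with a small awk helper. This is a rough sketch, not part of iperf3; `iperf_summary` is a hypothetical name, and it assumes iperf3's default client column layout with interval lines only:

```shell
#!/bin/sh
# Average the per-interval bitrate and total the Retr column from iperf3
# client output.  Assumes the default layout:
#   [ ID] a.aa-b.bb sec  <size> <unit>  <rate> <unit>/sec  [Retr  Cwnd]
iperf_summary() {
    awk '$4 == "sec" && $NF != "sender" && $NF != "receiver" {
        rate = $7                        # numeric bitrate value
        if ($8 ~ /^Kbits/) rate /= 1000  # normalize everything to Mbit/s
        if ($8 ~ /^Gbits/) rate *= 1000
        sum += rate; n++
        retr += $9 + 0                   # Retr column absent on server side
    }
    END { printf "avg %.2f Mbit/s, %d retransmits\n", sum / n, retr }'
}
```

Feeding it the TSO-enabled client run above makes the collapse obvious next to the ~5 Gbit/s baseline.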



iperf between pppoe clients without TSO

x3550m4# iperf3 -c 10.53.253.13
Connecting to host 10.53.253.13, port 5201
[  5] local 10.53.253.14 port 23985 connected to 10.53.253.13 port 5201
[ ID] Interval   Transfer Bitrate
[  5]   0.00-1.00   sec  65.1 MBytes   546 Mbits/sec
[  5]   1.00-2.00   sec   114 MBytes   953 Mbits/sec
[  5]   

Re: amd64 serial console changes, part 2

2022-07-08 Thread Hrvoje Popovski
On 6.7.2022. 22:45, Mark Kettenis wrote:
> Now that the kernel supports the extended BOOTARG_CONSDEV struct and
> snaps with that change are out there, here is the diff that changes
> the amd64 bootloaders to switch to the extended struct and provide the
> parameters necessary for using the non-standard UART on the AMD Ryzen
> Embedded V1000 SoCs.
> 
> It would be good if someone can confirm this works on something like
> an APU.

Hi,

I've tested this on a Dell r620 with serial console and on a Supermicro box
with IPMI serial, and both work normally. This time with everything that
needed to be compiled :)


dell
com1 at acpi0 COMA addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console
com0 at acpi0 COMB addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo


supermicro
com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console



Re: em(4) multiqueue

2022-07-01 Thread Hrvoje Popovski
On 28.6.2022. 15:11, Jonathan Matthew wrote:
> This adds the (not quite) final bits to em(4) to enable multiple rx/tx queues.
> Note that desktop/laptop models (I218, I219 etc.) do not support multiple 
> queues,
> so this only really applies to servers and network appliances (including 
> APU2).
> 
> It also removes the 'em_enable_msix' variable, in favour of using MSI-X on 
> devices
> that support multiple queues and MSI or INTX everywhere else.
> 
> I've tested this with an I350 on amd64 and arm64, where it works as expected, 
> and
> with the I218-LM in my laptop where it does nothing (as expected).
> More testing is welcome, especially in forwarding environments.


Hi,

I'm testing this diff in a forwarding setup where the source network
10.113.0/24 is connected to em2 and the destination network 10.114.0/24 is
connected to em3. The generator randomizes source and destination IP addresses.

dmesg:
em2 at pci6 dev 0 function 2 "Intel I350" rev 0x01, msix, 8 queues
em3 at pci6 dev 0 function 3 "Intel I350" rev 0x01, msix, 8 queues

netstat:
10.113.0/24        192.168.113.11     UGS    0         0  -  8 em2
10.114.0/24        192.168.114.11     UGS    0 404056853  -  8 em3


ifconfig:
em2: flags=8843 mtu 1500
lladdr 40:f2:e9:ec:b4:14
index 5 priority 0 llprio 3
media: Ethernet autoselect (1000baseT full-duplex,master)
status: active
inet 192.168.113.1 netmask 0xff00 broadcast 192.168.113.255
em3: flags=8843 mtu 1500
lladdr 40:f2:e9:ec:b4:15
index 6 priority 0 llprio 3
media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
status: active
inet 192.168.114.1 netmask 0xff00 broadcast 192.168.114.255


with vmstat -i
irq160/em2:0  4740972 3538
irq161/em2:1  4740979 3538
irq162/em2:2  4740977 3538
irq163/em2:3  4740978 3538
irq164/em2:4  4740965 3538
irq165/em2:5  4740972 3538
irq166/em2:6  4740971 3538
irq167/em2:7  4740965 3538
irq168/em2  20
irq169/em3:0  4741258 3538
irq177/em3  20


Should I see interrupts on all 8 em3 queues, as on em2?

x3550m4# tcpdump -ni em3
tcpdump: listening on em3, link-type EN10MB
00:39:26.663617 10.113.0.230.9 > 10.114.0.154.9: udp 18
00:39:26.663618 10.113.0.176.9 > 10.114.0.3.9: udp 18
00:39:26.663619 10.113.0.37.9 > 10.114.0.7.9: udp 18
00:39:26.663620 10.113.0.200.9 > 10.114.0.197.9: udp 18
00:39:26.663620 10.113.0.37.9 > 10.114.0.230.9: udp 18
00:39:26.663621 10.113.0.95.9 > 10.114.0.216.9: udp 18
00:39:26.663622 10.113.0.8.9 > 10.114.0.187.9: udp 18
00:39:26.663623 10.113.0.56.9 > 10.114.0.107.9: udp 18
00:39:26.663624 10.113.0.4.9 > 10.114.0.39.9: udp 18
00:39:26.663624 10.113.0.244.9 > 10.114.0.188.9: udp 18
00:39:26.663625 10.113.0.166.9 > 10.114.0.15.9: udp 18
00:39:26.663626 10.113.0.7.9 > 10.114.0.78.9: udp 18
00:39:26.663627 10.113.0.147.9 > 10.114.0.202.9: udp 18
00:39:26.663628 10.113.0.144.9 > 10.114.0.184.9: udp 18
00:39:26.663628 10.113.0.221.9 > 10.114.0.100.9: udp 18
00:39:26.663630 10.113.0.69.9 > 10.114.0.231.9: udp 18
00:39:26.663648 10.113.0.71.9 > 10.114.0.64.9: udp 18
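A quick way to confirm the generator really varies flows (which is what lets RSS spread load across rx queues) is to count distinct src/dst pairs in the tcpdump output. This is a rough sketch, not an existing tool; `flow_count` is a made-up name, keyed on tcpdump's one-line `src > dst:` format:

```shell
#!/bin/sh
# Count distinct "src > dst" pairs in tcpdump one-line output.
# Multiqueue RSS hashes on addresses/ports, so a healthy random-traffic
# test should report many distinct flows here.
flow_count() {
    awk '$3 == ">" && !seen[$2 " " $4]++ { n++ } END { print n + 0 }'
}
```

Usage: `tcpdump -ni em3 -c 10000 | flow_count` should print a number close to the count of generated address pairs.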


vmstat -iz
irq160/em2:0  4740972 3501
irq161/em2:1  4740979 3501
irq162/em2:2  4740977 3501
irq163/em2:3  4740978 3501
irq164/em2:4  4740965 3501
irq165/em2:5  4740972 3501
irq166/em2:6  4740971 3501
irq167/em2:7  4740965 3501
irq168/em2  20
irq169/em3:0  4741258 3501
irq170/em3:1  0 0
irq171/em3:2  0 0
irq172/em3:3  0 0
irq173/em3:4  0 0
irq174/em3:5  0 0
irq175/em3:6  0 0
irq176/em3:7  0 0
irq177/em3  20



Re: amd64 serial console changes

2022-06-30 Thread Hrvoje Popovski
On 30.6.2022. 17:03, Stuart Henderson wrote:
> On 2022/06/30 16:55, Hrvoje Popovski wrote:
>> On 30.6.2022. 16:48, Hrvoje Popovski wrote:
>>> On 30.6.2022. 15:14, Anton Lindqvist wrote:
>>>> On Thu, Jun 30, 2022 at 01:07:46PM +0200, Mark Kettenis wrote:
>>>>> Ah right.  Please commit!
>>>> Here's the complete diff, ok?
>>>
>>>
>>> Hi,
>>>
>>> with this diff :
>>>
>>> dell r620 - serial console
>>> com1 at acpi0 COMA addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
>>> com1: console
>>> com0 at acpi0 COMB addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
>>>
>>> works fast as before with first boot but second boot is slow...
>>>
>>> supermicro - ipmi console
>>> com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
>>> com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
>>> com1: console
>>>
>>> is slow as without this diff ..
>>>
>>>
>>> i will try on few more machines this diff ...
>>>
>>
>> after applying diff i did
>> cd /sys/arch/amd64/compile/GENERIC.MP && make -j6 obj && make config &&
>> make clean && time make -j6 && make install && reboot
>>
>>
>> is this ok?
>>
> 
> This is in the bootloader not the kernel - "make obj/make/make install"
> in sys/arch/amd64/stand and "installboot"
> 

Thank you sthen@ and jca@ ...
After these steps everything works just fine ..

Thank you guys ..
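For reference, the steps sthen@ describes can be sketched roughly like this. It is only a sketch, assuming the source tree lives in /usr/src and the root disk is sd0; adjust both to your system:

```shell
#!/bin/sh
# Rebuild and reinstall only the amd64 bootloader after a bootloader-only
# diff, then write it to the root disk.  No kernel rebuild is needed for
# boot(8)-only changes.
rebuild_bootloader() {
    cd /usr/src/sys/arch/amd64/stand || return 1
    make obj && make && make install &&
    installboot -v sd0      # sd0 is an assumption: use your root disk
}
```

Calling `rebuild_bootloader` and rebooting picks up the new bootloader.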



Re: amd64 serial console changes

2022-06-30 Thread Hrvoje Popovski
On 30.6.2022. 16:48, Hrvoje Popovski wrote:
> On 30.6.2022. 15:14, Anton Lindqvist wrote:
>> On Thu, Jun 30, 2022 at 01:07:46PM +0200, Mark Kettenis wrote:
>>> Ah right.  Please commit!
>> Here's the complete diff, ok?
> 
> 
> Hi,
> 
> with this diff :
> 
> dell r620 - serial console
> com1 at acpi0 COMA addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
> com1: console
> com0 at acpi0 COMB addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
> 
> works fast as before with first boot but second boot is slow...
> 
> supermicro - ipmi console
> com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
> com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
> com1: console
> 
> is slow as without this diff ..
> 
> 
> i will try on few more machines this diff ...
> 

after applying diff i did
cd /sys/arch/amd64/compile/GENERIC.MP && make -j6 obj && make config &&
make clean && time make -j6 && make install && reboot


is this ok?



Re: amd64 serial console changes

2022-06-30 Thread Hrvoje Popovski
On 30.6.2022. 15:14, Anton Lindqvist wrote:
> On Thu, Jun 30, 2022 at 01:07:46PM +0200, Mark Kettenis wrote:
>> Ah right.  Please commit!
> Here's the complete diff, ok?


Hi,

with this diff :

dell r620 - serial console
com1 at acpi0 COMA addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console
com0 at acpi0 COMB addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo

works fast as before with first boot but second boot is slow...

supermicro - ipmi console
com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console

is slow as without this diff ..


i will try on few more machines this diff ...



Re: amd64 serial console changes

2022-06-30 Thread Hrvoje Popovski
On 30.6.2022. 10:26, Mark Kettenis wrote:
> Hi Hrvoje,
> 
> I assume it was faster before?  What hardware are you seeing this on?

Hi,

yes, it was faster before.

dell r620 - serial console
com1 at acpi0 COMA addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console
com0 at acpi0 COMB addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo

cat /etc/boot.conf
stty com1 115200
set tty com1

/etc/ttys
tty01   "/usr/libexec/getty std.115200" vt220   on  secure


dell r430 - ipmi console
com0 at acpi0 COMA addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com1 at acpi0 COMB addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console

cat /etc/boot.conf
stty com1 115200
set tty com1

/etc/ttys
tty01   "/usr/libexec/getty std.115200" vt200   on  secure


supermicro - ipmi console
com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
com1: console

cat /etc/boot.conf
stty com1 115200
set tty com1

/etc/ttys
tty01   "/usr/libexec/getty std.115200" vt220   on  secure



will try now anton@ last diff ..



Re: amd64 serial console changes

2022-06-30 Thread Hrvoje Popovski
On 27.6.2022. 23:44, Mark Kettenis wrote:
> The Ryzen Embedded V1000 processors have an arm64-style Synposys
> DesignWare UART instead if a PC-compatible NS16x50 UART.  To make this
> UART work as a serial console, we need to pass some more information
> from the bootloader to the kernel.  This diff adds the logic to handle
> that information to the kernel.  I'd like some folks that use a serial
> console on their amd64 machines to test this.  But testing this diff
> on amd64 machines with a glass console doesn't hurt.
> 
> Thanks,
> 
> Mark


Hi,

I sysupgraded a few boxes a few minutes ago and the boot output is quite
slow. Everything is working, but the console output needs roughly 2 or 3
minutes to finish, depending on how big the dmesg is.

I have a few consoles over IPMI and a few connected to serial, and all of
them are slow.

Once the boot output finishes, working over the console is back to the
normal, fast output.



Re: kernel lock in arp

2022-06-27 Thread Hrvoje Popovski
On 18.5.2022. 17:31, Alexander Bluhm wrote:
> Hi,
> 
> For parallel IP forwarding I had put kernel lock around arpresolve()
> as a quick workaround for crashes.  Moving the kernel lock inside
> the function makes the hot path lock free.  I see slight performance
> increase in my test and no lock contention in kstack flamegraph.
> 
> http://bluhm.genua.de/perform/results/latest/2022-05-16T00%3A00%3A00Z/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
> http://bluhm.genua.de/perform/results/latest/patch-sys-arpresolve-kernel-lock.1/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
> 
> Search for kernel_lock.  Matched goes from 0.6% to 0.2%
> 
> We are running such a diff in our genua code for a while.  I think
> route flags need more love.  I doubt that all flags and fields are
> consistent when run on multiple CPU.  But this diff does not make
> it worse and increases MP pressure.

Hi,

I'm running this diff on a messy network with lots of
"arp: attempt to add entry" and occasional
"arpresolve: XXX: route contains no arp information" log messages, and
everything seems fine ..




Re: rewrite amd64 cache printing

2022-06-24 Thread Hrvoje Popovski
On 24.6.2022. 11:19, Jonathan Gray wrote:
> Rewrite amd64 printing of cache details.
> Previously we looked at cpuid 0x8005 for L1/TLB details
> which Intel documents as reserved.
> And cpuid 0x8006 for L2 details.
> 
> Intel also encode cache details in cpuid 4.
> AMD have mostly the same encoding with cpuid 0x801d
> 0x8005/0x8006 is used as a fallback in this diff
> 
> The amount of cache visible to the thread is shown
> and not which groups of cpus share a particular cache.
> In the case of Alder Lake P, P cores have 1.25MB L2, each group of
> 4 E cores shares a 2MB L2.

cpu0: AMD EPYC 7413 24-Core Processor, 2650.35 MHz, 19-01-01

before:
cpu0: 32KB 64b/line 8-way I-cache, 32KB 64b/line 8-way D-cache, 512KB
64b/line 8-way L2 cache
cpu0: ITLB 64 4KB entries fully associative, 64 4MB entries fully
associative
cpu0: DTLB 64 4KB entries fully associative, 64 4MB entries fully
associative

after:
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB
64b/line 8-way L2 cache, 32MB 64b/line 16-way L3 cache



Re: pipex(4): protect global lists with mutex(9)

2022-06-20 Thread Hrvoje Popovski
On 20.6.2022. 16:46, Vitaliy Makkoveev wrote:
> And use this mutex(9) to make the `session' dereference safe.
> 
> Hrvoje Popovski pointed me, he triggered netlock assertion with pipex(4)
> pppoe sessions:
> 

Hi,

I can trigger this splassert with a plain snapshot and only pppoe
clients. The npppd setup runs without pf.

r420-1# ifconfig ix0
ix0: flags=8843 mtu 1500
lladdr 90:e2:ba:19:29:a8
index 1 priority 0 llprio 3
media: Ethernet autoselect (10GSFP+Cu full-duplex,rxpause,txpause)
status: active
inet 192.168.100.205 netmask 0xff00 broadcast 192.168.100.255



r420-1# cat /etc/npppd/npppd.conf
authentication LOCAL type local {
users-file "/etc/npppd/npppd-users"
}
tunnel PPTP protocol pptp {
}
tunnel L2TP protocol l2tp {
authentication-method pap chap mschapv2
}
tunnel PPPOE protocol pppoe {
listen on interface ix0
authentication-method pap chap
}

ipcp IPCP-L2TP {
pool-address 10.53.251.10-10.53.251.100
dns-servers 161.53.2.69
}
ipcp IPCP-PPTP {
pool-address 10.53.252.10-10.53.252.100
dns-servers 161.53.2.69
}
ipcp IPCP-PPPOE {
pool-address 10.53.253.10-10.53.253.100
}

interface pppac0 address 10.53.251.1 ipcp IPCP-L2TP
interface pppac1 address 10.53.252.1 ipcp IPCP-PPTP
interface pppac2 address 10.53.253.1 ipcp IPCP-PPPOE

bind tunnel from L2TP authenticated by LOCAL to pppac0
bind tunnel from PPTP authenticated by LOCAL to pppac1
bind tunnel from PPPOE authenticated by LOCAL to pppac2


r420-1# npppctl ses bri
Ppp Id Assigned IPv4   Username Proto Tunnel From
-- ---  -
-
 0 10.53.253.11hrvojepppoe11PPPoE 90:e2:ba:33:af:ec
 1 10.53.253.12hrvojepppoe12PPPoE ec:f4:bb:c8:e9:88


I'm generating traffic with iperf3 from the pppoe hosts to hosts in the
192.168.100.0/24 network, and at the same time generating traffic between
the pppoe hosts.
This triggers the splassert after half an hour or so ...

r420-1# splassert: pipex_ppp_output: want 2 have 0
Starting stack trace...
pipex_ppp_output(fd80a45eaa00,800022782400,21) at
pipex_ppp_output+0x5e
pppac_qstart(80ec0ab0) at pppac_qstart+0x96
ifq_serialize(80ec0ab0,80ec0b90) at ifq_serialize+0xfd
taskq_thread(80030080) at taskq_thread+0x100
end trace frame: 0x0, count: 253
End of stack trace.



>>
>> r420-1# splassert: pipex_ppp_output: want 2 have 0
>> Starting stack trace...
>> pipex_ppp_output(fd80cb7c9500,800022761200,21) at
>> pipex_ppp_output+0x5e
>> pppac_qstart(80e80ab0) at pppac_qstart+0x96
>> ifq_serialize(80e80ab0,80e80b90) at ifq_serialize+0xfd
>> taskq_thread(80030080) at taskq_thread+0x100
>> end trace frame: 0x0, count: 253
>> End of stack trace.
>> splassert: pipex_ppp_output: want 2 have 0
>> Starting stack trace...
>> pipex_ppp_output(fd80c53b2300,800022761200,21) at
>> pipex_ppp_output+0x5e
>> pppac_qstart(80e80ab0) at pppac_qstart+0x96
>> ifq_serialize(80e80ab0,80e80b90) at ifq_serialize+0xfd
>> taskq_thread(80030080) at taskq_thread+0x100
>> end trace frame: 0x0, count: 253
>> End of stack trace.
>>
>>
>> r420-1# npppctl ses bri
>> Ppp Id Assigned IPv4   Username Proto Tunnel From
>> -- ---  -
>> -
>>  7 10.53.251.22r6202L2TP  192.168.100.12:1701
>>  1 10.53.253.11hrvojepppoe11PPPoE 90:e2:ba:33:af:ec
>>  0 10.53.253.12hrvojepppoe12PPPoE ec:f4:bb:da:f7:f8
> 
> This means the hack we use to enforce pppac_qstart() and
> pppx_if_qstart() be called with netlock held doesn't work anymore for
> pppoe sessions:
> 
>   /* XXXSMP: be sure pppac_qstart() called with NET_LOCK held */
>   ifq_set_maxlen(>if_snd, 1);
> 
> pptp and l2tp sessions are not affected, because the pcb layer is still
> accessed with exclusive netlock held, but the code is common for all
> session types. This mean we can't rely on netlock and should rework
> pipex(4) locking.
> 
> The diff below introduces `pipex_list_mtx' mutex(9) to protect global
> pipex(4) lists and made `session' dereference safe. It doesn't fix the
> assertion, but only makes us sure the pppoe session can't be killed
> concurrently while stack processes it.
> 
> I'll protect pipex(4) session data and rework pipex(4) output with next
> diffs.
> 
> ok?
> 
> Index: sys/net/if_ethersubr.c
> ===
> RCS file: /cvs/src/sys/net/if_ethersubr.c,v
> retrieving revision 1.279
> diff -u -p -r1

Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-10 Thread Hrvoje Popovski
On 10.6.2022. 0:24, Hrvoje Popovski wrote:
> Hi,
> 
> I tried with trunk lacp and tagged vlan700 over trunk0 and it's working
> as expected. Everything is the same as before, only aggr0 is now trunk0.
> 
> with tso
> [  4]  17.00-18.00  sec  1.09 GBytes  9.39 Gbits/sec   22622 KBytes
> [  4]  18.00-19.00  sec  1.09 GBytes  9.39 Gbits/sec   17662 KBytes
> [  4]  19.00-20.00  sec  1.09 GBytes  9.39 Gbits/sec0696 KBytes
> [  4]  20.00-21.00  sec  1.09 GBytes  9.39 Gbits/sec0735 KBytes
> 
> 
> without tso
> [  4]  17.00-18.00  sec   594 MBytes  4.98 Gbits/sec0612 KBytes
> [  4]  18.00-19.00  sec   594 MBytes  4.98 Gbits/sec0646 KBytes
> [  4]  19.00-20.00  sec   595 MBytes  4.99 Gbits/sec0665 KBytes
> [  4]  20.00-20.87  sec   515 MBytes  4.99 Gbits/sec0675 KBytes
> 
> 
> tcpdump is the same as with aggr...
> 
> 
> After this I've tried same aggr/vlan setup on two different boxes with
> ix interfaces and send traffic to them from linux and openbsd and
> results are not good ...
> Only differences between your setup and mine is that i have switch and
> you have directly connected hosts... but i don't think that should be
> the problem..
> 
> It would be great if someone else could test this because it's not that
> i don't know what I'm doing but I'm running out of ideas...
> 

Hi,

maybe this gives some clue .. or I'm doing something totally wrong.

With tso on ix0 and a vport/veb config, I'm getting the same bad results
with iperf and tcpbench as with vlan over aggr.


cat /etc/hostname.ix0
up

cat hostname.vlan700
parent ix0
vnetid 700
up

cat /etc/hostname.vport0
inet 192.168.100.222 255.255.255.0
up

cat /etc/hostname.veb0
add vlan700
add vport0
up


When veb0 connects ix0 (with tso) directly, instead of vlan700 and vport0,
the iperf3 and tcpbench results are as expected: 9.42 Gbit/sec




Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-10 Thread Hrvoje Popovski
On 10.6.2022. 10:37, Hrvoje Popovski wrote:
> while sending traffic from linux to openbsd ix0 with tso

netstat while sending traffic 

smc24# netstat -sp tcp
tcp:
1933441 packets sent
3923 data packets (684390 bytes)
0 data packets (0 bytes) retransmitted
0 fast retransmitted packets
1597673 ack-only packets (1735982 delayed)
0 URG only packets
0 window probe packets
331809 window update packets
36 control packets
0 packets software-checksummed
3335314 packets received
1509 acks (for 683936 bytes)
6 duplicate acks
0 acks for unsent data
0 acks for old data
3309028 packets (24469203295 bytes) received in-sequence
1 completely duplicate packet (274 bytes)
0 old duplicate packets
0 packets with some duplicate data (0 bytes duplicated)
24600 out-of-order packets (35619352 bytes)
0 packets (0 bytes) of data after window
0 window probes
0 window update packets
7 packets received after close
14 discarded for bad checksums
0 discarded for bad header offset fields
0 discarded because packet too short
0 discarded for missing IPsec protection
0 discarded due to memory shortage
14 packets software-checksummed
0 bad/missing md5 checksums
0 good md5 checksums
14 connection requests
10 connection accepts
24 connections established (including accepts)
135 connections closed (including 0 drops)
0 connections drained
0 embryonic connections dropped
1280 segments updated rtt (of 1283 attempts)
1 retransmit timeout
0 connections dropped by rexmit timeout
0 persist timeouts
0 keepalive timeouts
0 keepalive probes sent
0 connections dropped by keepalive
0 correct ACK header predictions
3308838 correct data packet header predictions
318 PCB cache misses
295 dropped due to no socket
0 ECN connections accepted
0 ECE packets received
0 CWR packets received
0 CE packets received
0 ECT packets sent
0 ECE packets sent
0 CWR packets sent
cwr by fastrecovery: 0
cwr by timeout: 1
cwr by ecn: 0
3 bad connection attempts
0 SYN packets dropped due to queue or memory full
10 SYN cache entries added
0 hash collisions
10 completed
0 aborted (no space to build PCB)
0 timed out
0 dropped due to overflow
0 dropped due to bucket overflow
0 dropped due to RST
0 dropped due to ICMP unreachable
0 SYN,ACKs retransmitted
0 duplicate SYNs received for entries already in the cache
0 SYNs dropped (no route or no space)
1 SYN cache seed with new random
293 hash bucket array size in current SYN cache
0 entries in current SYN cache, limit is 10255
0 longest bucket length in current SYN cache, limit is 105
0 uses of current SYN cache left
0 SACK recovery episodes
0 segment rexmits in SACK recovery episodes
0 byte rexmits in SACK recovery episodes
0 SACK options received
18763 SACK options sent
0 SACK options dropped
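Counters like these are easier to read as a delta across a test run. A hypothetical `tcp_delta` helper (not an existing tool) that diffs two saved `netstat -sp tcp` snapshots might look like:

```shell
#!/bin/sh
# Print only the "netstat -sp tcp" counters that changed between two
# snapshot files ($1 = before, $2 = after), keyed on the counter text.
# Limitation: counters with embedded byte totals in their text get a new
# key when the bytes change, so those show absolute values instead.
tcp_delta() {
    awk 'NF < 2 { next }
         { key = substr($0, index($0, $2)) }   # counter text after the number
         NR == FNR { base[key] = $1; next }    # first file: record baseline
         { d = $1 - base[key]; if (d != 0) printf "%+d %s\n", d, key }' "$1" "$2"
}
```

Usage: `netstat -sp tcp > before; iperf3 ...; netstat -sp tcp > after; tcp_delta before after`.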



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-10 Thread Hrvoje Popovski
On 10.6.2022. 10:20, David Gwynne wrote:
> 
> 
>> On 10 Jun 2022, at 08:24, Hrvoje Popovski  wrote:
>>
>> On 9.6.2022. 19:25, Hrvoje Popovski wrote:
>>> On 9.6.2022. 19:11, Jan Klemkow wrote:
>>>> On Thu, Jun 09, 2022 at 08:25:22AM +0200, Hrvoje Popovski wrote:
>>>>> On 8.6.2022. 22:01, Hrvoje Popovski wrote:
>>>>>> On 8.6.2022. 15:04, Jan Klemkow wrote:
>>>>>>> Could you show me, how your setup and your configuration looks like?
>>>>>> Yes, of course ..
>>>>>>
>>>>>> All my lab boxes are connected to switch (no flow-control). In this
>>>>>> setup ix0 and ix1 are in aggr and vlans 700 and 800 are tagged on aggr3.
>>>>>> I have tried tcpbench and iperf3 from openbsd or linux box and with tso
>>>>>> I'm getting few Kbps. In attachment you can find tcpdump for iperf3 on
>>>>>> openbsd with or without tso
>>>>>>
>>>>>> I will play more with this setup maybe I'm doing something terribly 
>>>>>> wrong...
>>>>>>
>>>>>> My OpenBSD conf:
>>>>>>
>>>>>> r620-1# cat /etc/hostname.ix0
>>>>>> up
>>>>>>
>>>>>> r620-1# cat /etc/hostname.ix1
>>>>>> up
>>>>>>
>>>>>> r620-1# cat /etc/hostname.aggr3
>>>>>> trunkport ix0
>>>>>> trunkport ix1
>>>>>> up
>>>>>>
>>>>>> r620-1# cat /etc/hostname.vlan700
>>>>>> parent aggr3
>>>>>> vnetid 700
>>>>>> lladdr ec:f4:bb:da:f7:f8
>>>>>> inet 192.168.100.11/24
>>>>>> !route add 16/8 192.168.100.111
>>>>>> up
>>>>>>
>>>>>> r620-1# cat /etc/hostname.vlan800
>>>>>> parent aggr3
>>>>>> vnetid 800
>>>>>> lladdr ec:f4:bb:da:f7:fa
>>>>>> inet 192.168.111.11/24
>>>>>> !route add 48/8 192.168.111.111
>>>>>> up
>>>>>
>>>>> now I've configured aggr like this
>>>>>
>>>>> ix0 - aggr0 - vlan700
>>>>> ix1 - aggr1 - vlan800
>>>>> ip configuration is the same
>>>>>
>>>>> still I have problem with tso
>>>>>
>>>>> tcpdump in attachment
>>>>
>>>> The tcpdump output on my machine looks similar to yours.
>>>> Do you see TCP retransmits on your sending Linux host?
>>>>
>>>> # netstat -s | grep retransmitted
>>>>0 segments retransmitted
>>>>
>>>
>>> [root@test2 pktgen]# netstat -s | grep -i segment
>>>8352173627 segments received
>>>14648977712 segments send out
>>>13336873 segments retransmited
>>>586 bad segments received.
>>>
>>> yes, segments retransmited is increasing
>>>
>>>
>>>> Which specific ix(4) NIC do you use?
>>>>
>>>
>>> r620-1# dmesg | grep ix
>>> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
>>> ix0 at pci1 dev 0 function 0 "Intel 82599" rev 0x01, msix, 6 queues,
>>> address ec:f4:bb:da:f7:f8
>>> ix1 at pci1 dev 0 function 1 "Intel 82599" rev 0x01, msix, 6 queues,
>>> address ec:f4:bb:da:f7:fa
>>> ix2 at pci4 dev 0 function 0 "Intel X540T" rev 0x01, msix, 6 queues,
>>> address a0:36:9f:29:f2:0c
>>> ix3 at pci4 dev 0 function 1 "Intel X540T" rev 0x01, msix, 6 queues,
>>> address a0:36:9f:29:f2:0e
>>>
>>>
>>> I'm using ix0 and ix1 for testing
>>>> I'm using these two:
>>>>
>>>> # dmesg | grep ^ix
>>>> ix0 at pci3 dev 0 function 0 "Intel 82599" rev 0x01, msix, 12 queues, 
>>>> address 00:1b:21:87:fb:2c
>>>> ix1 at pci3 dev 0 function 1 "Intel 82599" rev 0x01, msix, 12 queues, 
>>>> address 00:1b:21:87:fb:2d
>>>>
>>>> Thanks,
>>>> Jan
>>>>
>>>
>>
>> Hi,
>>
>> I tried with trunk lacp and tagged vlan700 over trunk0 and it's working
>> as expected. Everything is the same as before, only aggr0 is now trunk0.
>>
>> with tso
>> [  4]  17.00-18.00  sec  1.09 GBytes  9.39 Gbits/sec   22622 KBytes
>> [  4]  18.00-19.00  sec  1.09 GBytes  9.39 Gbits/se

Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-09 Thread Hrvoje Popovski
On 9.6.2022. 19:25, Hrvoje Popovski wrote:
> On 9.6.2022. 19:11, Jan Klemkow wrote:
>> On Thu, Jun 09, 2022 at 08:25:22AM +0200, Hrvoje Popovski wrote:
>>> On 8.6.2022. 22:01, Hrvoje Popovski wrote:
>>>> On 8.6.2022. 15:04, Jan Klemkow wrote:
>>>>> Could you show me, how your setup and your configuration looks like?
>>>> Yes, of course ..
>>>>
>>>> All my lab boxes are connected to switch (no flow-control). In this
>>>> setup ix0 and ix1 are in aggr and vlans 700 and 800 are tagged on aggr3.
>>>> I have tried tcpbench and iperf3 from openbsd or linux box and with tso
>>>> I'm getting few Kbps. In attachment you can find tcpdump for iperf3 on
>>>> openbsd with or without tso
>>>>
>>>> I will play more with this setup maybe I'm doing something terribly 
>>>> wrong...
>>>>
>>>> My OpenBSD conf:
>>>>
>>>> r620-1# cat /etc/hostname.ix0
>>>> up
>>>>
>>>> r620-1# cat /etc/hostname.ix1
>>>> up
>>>>
>>>> r620-1# cat /etc/hostname.aggr3
>>>> trunkport ix0
>>>> trunkport ix1
>>>> up
>>>>
>>>> r620-1# cat /etc/hostname.vlan700
>>>> parent aggr3
>>>> vnetid 700
>>>> lladdr ec:f4:bb:da:f7:f8
>>>> inet 192.168.100.11/24
>>>> !route add 16/8 192.168.100.111
>>>> up
>>>>
>>>> r620-1# cat /etc/hostname.vlan800
>>>> parent aggr3
>>>> vnetid 800
>>>> lladdr ec:f4:bb:da:f7:fa
>>>> inet 192.168.111.11/24
>>>> !route add 48/8 192.168.111.111
>>>> up
>>>
>>> now I've configured aggr like this
>>>
>>> ix0 - aggr0 - vlan700
>>> ix1 - aggr1 - vlan800
>>> ip configuration is the same
>>>
>>> still I have problem with tso
>>>
>>> tcpdump in attachment
>>
>> The tcpdump output on my machine looks similar to yours.
>> Do you see TCP retransmits on your sending Linux host?
>>
>> # netstat -s | grep retransmitted
>> 0 segments retransmitted
>>
> 
> [root@test2 pktgen]# netstat -s | grep -i segment
> 8352173627 segments received
> 14648977712 segments send out
> 13336873 segments retransmited
> 586 bad segments received.
> 
> yes, segments retransmited is increasing
> 
> 
>> Which specific ix(4) NIC do you use?
>>
> 
> r620-1# dmesg | grep ix
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> ix0 at pci1 dev 0 function 0 "Intel 82599" rev 0x01, msix, 6 queues,
> address ec:f4:bb:da:f7:f8
> ix1 at pci1 dev 0 function 1 "Intel 82599" rev 0x01, msix, 6 queues,
> address ec:f4:bb:da:f7:fa
> ix2 at pci4 dev 0 function 0 "Intel X540T" rev 0x01, msix, 6 queues,
> address a0:36:9f:29:f2:0c
> ix3 at pci4 dev 0 function 1 "Intel X540T" rev 0x01, msix, 6 queues,
> address a0:36:9f:29:f2:0e
> 
> 
> I'm using ix0 and ix1 for testing
>> I'm using these two:
>>
>> # dmesg | grep ^ix
>> ix0 at pci3 dev 0 function 0 "Intel 82599" rev 0x01, msix, 12 queues, 
>> address 00:1b:21:87:fb:2c
>> ix1 at pci3 dev 0 function 1 "Intel 82599" rev 0x01, msix, 12 queues, 
>> address 00:1b:21:87:fb:2d
>>
>> Thanks,
>> Jan
>>
> 

Hi,

I tried with trunk lacp and tagged vlan700 over trunk0 and it's working
as expected. Everything is the same as before, only aggr0 is now trunk0.

with tso
[  4]  17.00-18.00  sec  1.09 GBytes  9.39 Gbits/sec   22622 KBytes
[  4]  18.00-19.00  sec  1.09 GBytes  9.39 Gbits/sec   17662 KBytes
[  4]  19.00-20.00  sec  1.09 GBytes  9.39 Gbits/sec0696 KBytes
[  4]  20.00-21.00  sec  1.09 GBytes  9.39 Gbits/sec0735 KBytes


without tso
[  4]  17.00-18.00  sec   594 MBytes  4.98 Gbits/sec0612 KBytes
[  4]  18.00-19.00  sec   594 MBytes  4.98 Gbits/sec0646 KBytes
[  4]  19.00-20.00  sec   595 MBytes  4.99 Gbits/sec0665 KBytes
[  4]  20.00-20.87  sec   515 MBytes  4.99 Gbits/sec0675 KBytes


tcpdump is the same as with aggr...


After this I tried the same aggr/vlan setup on two different boxes with
ix interfaces and sent traffic to them from Linux and OpenBSD, and the
results are not good ...
The only difference between your setup and mine is that I have a switch
and you have directly connected hosts ... but I don't think that should
be the problem ..

It would be great if someone else could test this, because it's not that
I don't know what I'm doing, but I'm running out of ideas ...



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-09 Thread Hrvoje Popovski
On 9.6.2022. 19:11, Jan Klemkow wrote:
> On Thu, Jun 09, 2022 at 08:25:22AM +0200, Hrvoje Popovski wrote:
>> On 8.6.2022. 22:01, Hrvoje Popovski wrote:
>>> On 8.6.2022. 15:04, Jan Klemkow wrote:
>>>> Could you show me, how your setup and your configuration looks like?
>>> Yes, of course ..
>>>
>>> All my lab boxes are connected to switch (no flow-control). In this
>>> setup ix0 and ix1 are in aggr and vlans 700 and 800 are tagged on aggr3.
>>> I have tried tcpbench and iperf3 from openbsd or linux box and with tso
>>> I'm getting few Kbps. In attachment you can find tcpdump for iperf3 on
>>> openbsd with or without tso
>>>
>>> I will play more with this setup maybe I'm doing something terribly wrong...
>>>
>>> My OpenBSD conf:
>>>
>>> r620-1# cat /etc/hostname.ix0
>>> up
>>>
>>> r620-1# cat /etc/hostname.ix1
>>> up
>>>
>>> r620-1# cat /etc/hostname.aggr3
>>> trunkport ix0
>>> trunkport ix1
>>> up
>>>
>>> r620-1# cat /etc/hostname.vlan700
>>> parent aggr3
>>> vnetid 700
>>> lladdr ec:f4:bb:da:f7:f8
>>> inet 192.168.100.11/24
>>> !route add 16/8 192.168.100.111
>>> up
>>>
>>> r620-1# cat /etc/hostname.vlan800
>>> parent aggr3
>>> vnetid 800
>>> lladdr ec:f4:bb:da:f7:fa
>>> inet 192.168.111.11/24
>>> !route add 48/8 192.168.111.111
>>> up
>>
>> now I've configured aggr like this
>>
>> ix0 - aggr0 - vlan700
>> ix1 - aggr1 - vlan800
>> ip configuration is the same
>>
>> still I have problem with tso
>>
>> tcpdump in attachment
> 
> The tcpdump output on my machine looks similar to yours.
> Do you see TCP retransmits on your sending Linux host?
> 
> # netstat -s | grep retransmitted
> 0 segments retransmitted
> 

[root@test2 pktgen]# netstat -s | grep -i segment
8352173627 segments received
14648977712 segments send out
13336873 segments retransmited
586 bad segments received.

yes, the "segments retransmited" counter is increasing


> Which specific ix(4) NIC do you use?
> 

r620-1# dmesg | grep ix
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
ix0 at pci1 dev 0 function 0 "Intel 82599" rev 0x01, msix, 6 queues,
address ec:f4:bb:da:f7:f8
ix1 at pci1 dev 0 function 1 "Intel 82599" rev 0x01, msix, 6 queues,
address ec:f4:bb:da:f7:fa
ix2 at pci4 dev 0 function 0 "Intel X540T" rev 0x01, msix, 6 queues,
address a0:36:9f:29:f2:0c
ix3 at pci4 dev 0 function 1 "Intel X540T" rev 0x01, msix, 6 queues,
address a0:36:9f:29:f2:0e


I'm using ix0 and ix1 for testing
> I'm using these two:
> 
> # dmesg | grep ^ix
> ix0 at pci3 dev 0 function 0 "Intel 82599" rev 0x01, msix, 12 queues, address 
> 00:1b:21:87:fb:2c
> ix1 at pci3 dev 0 function 1 "Intel 82599" rev 0x01, msix, 12 queues, address 
> 00:1b:21:87:fb:2d
> 
> Thanks,
> Jan
> 



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-09 Thread Hrvoje Popovski
On 8.6.2022. 22:01, Hrvoje Popovski wrote:
> On 8.6.2022. 15:04, Jan Klemkow wrote:
>> Could you show me, how your setup and your configuration looks like?
> Yes, of course ..
> 
> All my lab boxes are connected to a switch (no flow control). In this
> setup ix0 and ix1 are in an aggr, and vlans 700 and 800 are tagged on
> aggr3. I have tried tcpbench and iperf3 from an OpenBSD or Linux box,
> and with tso I'm getting only a few Kbps. In the attachment you can find
> tcpdump output for iperf3 on OpenBSD with and without tso.
> 
> I will play more with this setup; maybe I'm doing something terribly wrong...
> 
> 
> 
> My OpenBSD conf:
> 
> r620-1# cat /etc/hostname.ix0
> up
> 
> r620-1# cat /etc/hostname.ix1
> up
> 
> r620-1# cat /etc/hostname.aggr3
> trunkport ix0
> trunkport ix1
> up
> 
> r620-1# cat /etc/hostname.vlan700
> parent aggr3
> vnetid 700
> lladdr ec:f4:bb:da:f7:f8
> inet 192.168.100.11/24
> !route add 16/8 192.168.100.111
> up
> 
> r620-1# cat /etc/hostname.vlan800
> parent aggr3
> vnetid 800
> lladdr ec:f4:bb:da:f7:fa
> inet 192.168.111.11/24
> !route add 48/8 192.168.111.111
> up
> 
> 

Hi,

now I've configured the aggr like this:

ix0 - aggr0 - vlan700
ix1 - aggr1 - vlan800
the ip configuration is the same

still I have a problem with tso

the tcpdump is in the attachment


config:

r620-1# cat /etc/hostname.ix0
tso
up

r620-1# cat /etc/hostname.ix1
tso
up

r620-1# cat /etc/hostname.aggr0
trunkport ix0
up

r620-1# cat /etc/hostname.aggr1
trunkport ix1
up

r620-1# cat /etc/hostname.vlan700
parent aggr0
vnetid 700
lladdr ec:f4:bb:da:f7:f8
inet 192.168.100.11/24
!route add 16/8 192.168.100.111
up

r620-1# cat /etc/hostname.vlan800
parent aggr1
vnetid 800
lladdr ec:f4:bb:da:f7:fa
inet 192.168.111.11/24
!route add 48/8 192.168.111.111
up


with tso

08:20:23.302445 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 267880:269328(1448) ack 1 win 229  (DF) (ttl 64, id 35805, len 1500)
08:20:23.302469 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49ff! -> 2c51] ack 269328 win 248 
 (DF) 
(ttl 64, id 34126, len 64, bad ip cksum 0! -> 6b9d)
08:20:23.302552 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 269328:270776(1448) ack 1 win 229  (DF) (ttl 64, id 35806, len 1500)
08:20:23.302552 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 272224:273672(1448) ack 1 win 229  (DF) (ttl 64, id 35807, len 1500)
08:20:23.302572 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49f3! -> a7ca] ack 272224 win 226 
 (DF) (ttl 64, id 15149, len 52, bad ip 
cksum 0! -> b5ca)
08:20:23.302585 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49f3! -> a1f5] ack 273672 win 271 
 (DF) (ttl 64, id 41542, len 52, bad ip 
cksum 0! -> 4eb1)
08:20:23.302679 192.168.100.112.5 > 192.168.100.11.5201: . 
273672:278016(4344) ack 1 win 229  (DF) 
(ttl 64, id 35810, len 4396)
08:20:23.312441 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 278016:279464(1448) ack 1 win 229  (DF) (ttl 64, id 35811, len 1500)
08:20:23.312463 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49ff! -> e2b1] ack 273672 win 271 
 (DF) 
(ttl 64, id 50301, len 64, bad ip cksum 0! -> 2c6e)
08:20:23.312551 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 273672:275120(1448) ack 1 win 229  (DF) (ttl 64, id 35812, len 1500)
08:20:23.312564 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49ff! -> dd16] ack 275120 win 248 
 (DF) 
(ttl 64, id 34273, len 64, bad ip cksum 0! -> 6b0a)
08:20:23.513498 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 275120:276568(1448) ack 1 win 229  (DF) (ttl 64, id 35813, len 1500)
08:20:23.513583 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49ff! -> d6a5] ack 276568 win 248 
 (DF) 
(ttl 64, id 37687, len 64, bad ip cksum 0! -> 5db4)
08:20:23.513611 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: . 276568:278016(1448) ack 1 win 229  (DF) (ttl 64, id 35814, len 1500)
08:20:23.513611 802.1Q vid 700 pri 1 192.168.100.112.5 > 
192.168.100.11.5201: P 279464:280912(1448) ack 1 win 229  (DF) (ttl 64, id 35815, len 1500)
08:20:23.513626 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49f3! -> 8aaf] ack 279464 win 226 
 (DF) (ttl 64, id 40828, len 52, bad ip 
cksum 0! -> 517b)
08:20:23.513636 802.1Q vid 700 pri 3 192.168.100.11.5201 > 
192.168.100.112.5: . [bad tcp cksum 49f3! -> 84da] ack 280912 win 271 
 (DF) (ttl 64, id 51599, len 52, bad ip 
cksum 0! -> 2768)
08:20:23.513738 192.168.100.112.5 > 192.168.100.11.5201: . 
280912:285256(4344) ack

Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-08 Thread Hrvoje Popovski
On 8.6.2022. 15:04, Jan Klemkow wrote:
> Could you show me, how your setup and your configuration looks like?

Yes, of course ..

All my lab boxes are connected to a switch (no flow control). In this
setup ix0 and ix1 are in an aggr, and vlans 700 and 800 are tagged on
aggr3. I have tried tcpbench and iperf3 from an OpenBSD or Linux box,
and with tso I'm getting only a few Kbps. In the attachment you can find
tcpdump output for iperf3 on OpenBSD with and without tso.

I will play more with this setup; maybe I'm doing something terribly wrong...



My OpenBSD conf:

r620-1# cat /etc/hostname.ix0
up

r620-1# cat /etc/hostname.ix1
up

r620-1# cat /etc/hostname.aggr3
trunkport ix0
trunkport ix1
up

r620-1# cat /etc/hostname.vlan700
parent aggr3
vnetid 700
lladdr ec:f4:bb:da:f7:f8
inet 192.168.100.11/24
!route add 16/8 192.168.100.111
up

r620-1# cat /etc/hostname.vlan800
parent aggr3
vnetid 800
lladdr ec:f4:bb:da:f7:fa
inet 192.168.111.11/24
!route add 48/8 192.168.111.111
up





ifconfig output

lo0: flags=8049 mtu 32768
index 8 priority 0 llprio 3
groups: lo
inet6 ::1 prefixlen 128
inet6 fe80::1%lo0 prefixlen 64 scopeid 0x8
inet 127.0.0.1 netmask 0xff00
ix0:
flags=8b43
mtu 1500
lladdr fe:e1:ba:d0:12:43
index 1 priority 0 llprio 3
trunk: trunkdev aggr3
media: Ethernet autoselect (10GSFP+Cu full-duplex,rxpause,txpause)
status: active
ix1:
flags=2008b43
mtu 1500
lladdr fe:e1:ba:d0:12:43
index 2 priority 0 llprio 3
trunk: trunkdev aggr3
media: Ethernet autoselect (10GSFP+Cu full-duplex,rxpause,txpause)
status: active
ix2: flags=8843 mtu 1500
lladdr a0:36:9f:29:f2:0c
index 3 priority 0 llprio 3
media: Ethernet autoselect (10GbaseT full-duplex,rxpause,txpause)
status: active
inet 192.168.1.1 netmask 0xfffc broadcast 192.168.1.3
ix3: flags=8802 mtu 1500
lladdr a0:36:9f:29:f2:0e
index 4 priority 0 llprio 3
media: Ethernet autoselect (10GbaseT full-duplex)
status: active
em0: flags=8843 mtu 1500
lladdr ec:f4:bb:da:f7:fc
index 5 priority 0 llprio 3
groups: egress
media: Ethernet autoselect (1000baseT full-duplex,master)
status: active
inet BLA.BLA.BLA.BLA netmask 0xffe0 broadcast BLA.BLA.BLA
em1: flags=8802 mtu 1500
lladdr ec:f4:bb:da:f7:fd
index 6 priority 0 llprio 3
media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
status: active
enc0: flags=0<>
index 7 priority 0 llprio 3
groups: enc
status: active
aggr3: flags=8943 mtu 1500
lladdr fe:e1:ba:d0:12:43
index 9 priority 0 llprio 7
trunk: trunkproto lacp
trunk id: [(8000,fe:e1:ba:d0:12:43,0009,,),
 (,02:04:96:a5:30:cc,03EB,,)]
ix0 lacp actor system pri 0x8000 mac fe:e1:ba:d0:12:43,
key 0x9, port pri 0x8000 number 0x1
ix0 lacp actor state
activity,aggregation,sync,collecting,distributing
ix0 lacp partner system pri 0x0 mac 02:04:96:a5:30:cc,
key 0x3eb, port pri 0x0 number 0x3eb
ix0 lacp partner state
activity,aggregation,sync,collecting,distributing
ix0 port active,collecting,distributing
ix1 lacp actor system pri 0x8000 mac fe:e1:ba:d0:12:43,
key 0x9, port pri 0x8000 number 0x2
ix1 lacp actor state
activity,aggregation,sync,collecting,distributing
ix1 lacp partner system pri 0x0 mac 02:04:96:a5:30:cc,
key 0x3eb, port pri 0x0 number 0x3ec
ix1 lacp partner state
activity,aggregation,sync,collecting,distributing
ix1 port active,collecting,distributing
groups: aggr
media: Ethernet autoselect
status: active
vlan700: flags=8843 mtu 1500
lladdr ec:f4:bb:da:f7:f8
index 10 priority 0 llprio 3
encap: vnetid 700 parent aggr3 txprio packet rxprio outer
groups: vlan
media: Ethernet autoselect
status: active
inet 192.168.100.11 netmask 0xff00 broadcast 192.168.100.255
vlan800: flags=8843 mtu 1500
lladdr ec:f4:bb:da:f7:fa
index 11 priority 0 llprio 3
encap: vnetid 800 parent aggr3 txprio packet rxprio outer
groups: vlan
media: Ethernet autoselect
status: active
inet 192.168.111.11 netmask 0xff00 broadcast 192.168.111.255

tcpdump without tso

Jun 08 21:54:21.393214 90:e2:ba:33:b4:a0 ec:f4:bb:da:f7:f8 8100 1518: 802.1Q 
vid 700 pri 1 192.168.100.112.52088 > 192.168.100.11.5201: . 
333030430:333031878(1448) ack 1 win 229  (DF) (ttl 64, id 26693, len 1500)
Jun 08 21:54:21.393214 90:e2:ba:33:b4:a0 ec:f4:bb:da:f7:f8 8100 1518: 802.1Q 
vid 700 pri 1 192.168.100.112.52088 > 192.168.100.11.5201: . 
333031878:333033326(1448) ack 1 win 229  (DF) (ttl 64, id 26694, len 1500)
Jun 08 21:54:21.393214 90:e2:ba:33:b4:a0 ec:f4:bb:da:f7:f8 8100 

Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-06 Thread Hrvoje Popovski
On 4.6.2022. 21:23, Hrvoje Popovski wrote:
> On 1.6.2022. 11:21, Jan Klemkow wrote:
>> I moved the switch to ifconfig(8) in the diff below.
>>
>> # ifconfig ix0 tso
>> # ifconfig ix0 -tso
>>
>> I named it tso (TCP segment offloading), so I can reuse this switch
>> also for the sending part.  TSO is the combination of LRO and LSO.
>>
>> LRO: Large Receive Offloading
>> LSO: Large Send Offloading
>>
>> RSC (Receive Side Coalescing) is an alternative term for LRO, which is
>> used by the spec of ix(4) NICs.
>>
>>>> Tests with other ix(4) NICs are welcome and needed!
>>> We'll try and kick it around at work in the next week or so.
> 
> Hi all,
> 
> I've put this diff into production on a clean source tree from this
> morning and got a panic. I'm not 100% sure it's because of TSO, because
> in the last months I had all kinds of diffs on production boxes.
> Now I will run a snapshot; maybe a clean snapshot will panic :))

And the box panicked with a snapshot, but with a different message. Will
send a report to bugs@...



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-04 Thread Hrvoje Popovski
On 4.6.2022. 21:23, Hrvoje Popovski wrote:
> Hi all,
> 
> I've put this diff into production on a clean source tree from this
> morning and got a panic. I'm not 100% sure it's because of TSO, because
> in the last months I had all kinds of diffs on production boxes.
> Now I will run a snapshot; maybe a clean snapshot will panic :))
> 
> I couldn't trigger the panic with the TSO diff in the lab ..

I think I've found something...

a setup like "ix-aggr-vlan-ip" doesn't work as expected with tso;
setups like "ix-aggr-ip" or "ix-vlan-ip" work as expected
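
For comparison, here is a minimal sketch of the working "ix-vlan-ip"
layout in hostname.if(5) style (address and vnetid reused from the
configs earlier in the thread; an illustration, not my exact config):

```shell
# "ix-vlan-ip": vlan700 sits directly on ix0, no aggr in between.
cat > /etc/hostname.ix0 <<'EOF'
tso
up
EOF
cat > /etc/hostname.vlan700 <<'EOF'
parent ix0
vnetid 700
inet 192.168.100.11/24
up
EOF
sh /etc/netstart ix0 vlan700
```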

sending a tcp stream with tcpbench from a box with a clean snapshot to a
host with "ix-aggr-vlan-ip" and tso:

0.000 Peak Mbps:0.000 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011
0.000 Peak Mbps:0.011 Avg Mbps:0.000
0.011 Peak Mbps:0.011 Avg Mbps:0.011


without tso

1075.227 Peak Mbps: 1075.227 Avg Mbps: 1075.227
2163.509 Peak Mbps: 2163.509 Avg Mbps: 2163.509
2147.523 Peak Mbps: 2163.509 Avg Mbps: 2147.523
2183.271 Peak Mbps: 2183.271 Avg Mbps: 2183.271
2690.801 Peak Mbps: 2690.801 Avg Mbps: 2690.801
2602.508 Peak Mbps: 2690.801 Avg Mbps: 2602.508
3036.190 Peak Mbps: 3036.190 Avg Mbps: 3036.190
2799.911 Peak Mbps: 3036.190 Avg Mbps: 2799.911
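
The two runs can be reproduced by flipping the flag on the receiving
interfaces between tcpbench runs; roughly (interface names and address
taken from the setup above):

```shell
# On the "ix-aggr-vlan-ip" receiver, with tcpbench -s listening:
ifconfig ix0 tso && ifconfig ix1 tso    # offload on: throughput collapses
# on the sender: tcpbench 192.168.100.11
ifconfig ix0 -tso && ifconfig ix1 -tso  # offload off: 2-3 Gbps again
```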



Re: ix(4): Add support for TCP Large Receive Offloading

2022-06-04 Thread Hrvoje Popovski
On 1.6.2022. 11:21, Jan Klemkow wrote:
> I moved the switch to ifconfig(8) in the diff below.
> 
> # ifconfig ix0 tso
> # ifconfig ix0 -tso
> 
> I named it tso (TCP segment offloading), so I can reuse this switch
> also for the sending part.  TSO is the combination of LRO and LSO.
> 
> LRO: Large Receive Offloading
> LSO: Large Send Offloading
> 
> RSC (Receive Side Coalescing) is an alternative term for LRO, which is
> used by the spec of ix(4) NICs.
> 
>>> Tests with other ix(4) NICs are welcome and needed!
>> We'll try and kick it around at work in the next week or so.

Hi all,

I've put this diff into production on a clean source tree from this
morning and got a panic. I'm not 100% sure it's because of TSO, because
in the last months I had all kinds of diffs on production boxes.
Now I will run a snapshot; maybe a clean snapshot will panic :))

I couldn't trigger the panic with the TSO diff in the lab ..


panic:

bcbnfw1# panic: kernel diagnostic assertion "m->m_len >= ETHER_HDR_LEN"
failed: file "/sys/net/bpf.c", line 1489
Stopped at  db_enter+0x10:  popq%rbp
TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
 444766  59724  0 0x14000  0x2001  softnet
db_enter() at db_enter+0x10
panic(81f25898) at panic+0xbf
__assert(81f99ec1,81fd1b33,5d1,81f465eb) at
__assert+0x25
bpf_mtap_ether(80489600,fd80610e0f00,1) at bpf_mtap_ether+0xef
ifiq_input(8048eb00,800020b59b10) at ifiq_input+0xf3
ixgbe_rxeof(8048d3a0) at ixgbe_rxeof+0x32b
ixgbe_queue_intr(8048ab00) at ixgbe_queue_intr+0x3c
intr_handler(800020b59c50,8045dd00) at intr_handler+0x6e
Xintr_ioapic_edge0_untramp() at Xintr_ioapic_edge0_untramp+0x18f
acpicpu_idle() at acpicpu_idle+0x203
sched_idle(800020892ff0) at sched_idle+0x280
end trace frame: 0x0, count: 4
https://www.openbsd.org/ddb.html describes the minimum info required in
bug reports.  Insufficient info makes it difficult to find and fix bugs.
ddb{2}>


ddb{2}> show reg
rdi0
rsi 0x14
rbp   0x800020b59930
rbx   0xfd80610e0f00
rdx   0xc800
rcx0x282
rax 0x68
r8 0x101010101010101
r9 0
r10   0xd410e1bbe041370a
r11   0xff1c7218c30edf0a
r12   0x800020893a60
r13   0x800020b59a90
r140
r15   0x81f25898cmd0646_9_tim_udma+0x29032
rip   0x81813a30db_enter+0x10
cs   0x8
rflags 0x286
rsp   0x800020b59930
ss  0x10
db_enter+0x10:  popq%rbp


ddb{2}> ps
   PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
 58540  192000  34045  0  30x100083  ttyin ksh
 34045  380261  43095   1000  30x10008b  sigsusp   ksh
 43095  382180  79096   1000  30x98  kqreadsshd
 79096  410005  83400  0  30x82  kqreadsshd
 67346  388175  1  0  30x100083  ttyin ksh
  1576  379710  1  0  30x100098  kqreadcron
 80308  475205  63710720  3   0x190  kqreadlldpd
 63710  264590  1  0  30x80  netio lldpd
 70690  148684  79519 95  3   0x1100092  kqreadsmtpd
 96229  368855  79519103  3   0x1100092  kqreadsmtpd
 57602  230927  79519 95  3   0x1100092  kqreadsmtpd
 27959  412218  79519 95  30x100092  kqreadsmtpd
 68321  253958  79519 95  3   0x1100092  kqreadsmtpd
 15048  386230  79519 95  3   0x1100092  kqreadsmtpd
 79519   40897  1  0  30x100080  kqreadsmtpd
 81998   89562  1 77  3   0x1100090  kqreaddhcpd
 30223   19018  1  0  30x100080  kqreadsnmpd
 16185  104619  1 91  3   0x192  kqreadsnmpd
 83400  393345  1  0  30x88  kqreadsshd
 59135   14654  1  0  30x100080  kqreadntpd
 93144  455149  50136 83  30x100092  kqreadntpd
 50136  231167  1 83  3   0x1100092  kqreadntpd
 26285  193609  59081 74  3   0x1100092  bpf   pflogd
 59081  514111  1  0  30x80  netio pflogd
 90237  222054  74741 73  3   0x1100090  kqreadsyslogd
 74741   91812  1  0  30x100082  netio syslogd
 50581  332657  0  0  3 0x14200  bored smr
 11773  152326  0  0  3 0x14200  pgzerozerothread
 45553   54153  0  0  3 0x14200  aiodoned  aiodoned
 31391  475879  0  0  3 0x14200  syncerupdate
 44830  301726  0  0  3 0x14200  cleaner   cleaner
 62337   61034  0  0  3 0x14200  reaperreaper
 97528  277547   

Re: ix(4): Add support for TCP Large Receive Offloading

2022-05-31 Thread Hrvoje Popovski
On 31.5.2022. 11:36, Theo Buehler wrote:
>> smc24# cd /usr/src && make includes
> 
> Do 'cd /usr/src && make obj' first.
> 

Yes, thank you ...
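
For anyone else following the test instructions, the full header-rebuild
sequence with that fix folded in would be roughly (assuming a standard
/usr/src checkout):

```shell
# Create the obj dirs first, then build headers and the updated sysctl(8).
cd /usr/src && make obj && make includes
cd /usr/src/sbin/sysctl && make && make install
```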



Re: ix(4): Add support for TCP Large Receive Offloading

2022-05-31 Thread Hrvoje Popovski
On 27.5.2022. 18:25, Jan Klemkow wrote:
> Hi,
> 
> The following diff enables the TCP Large Receive Offloading feature for
> ix(4) interfaces.  It also includes a default off sysctl(2) switch.
> 
> The TCP single stream receiving performance increased from 3.6 Gbit/s to
> 9.4 Gbit/s.  Measured from Linux to OpenBSD with tcpbench.
> 
> I tested the diff with:
> ix0 at pci3 dev 0 function 0 "Intel 82599" rev 0x01, msix, 12 queues, address 
> 00:1b:21:87:fb:2c
> 
> If you want to test the diff:
> 
>  1. Apply the diff
>  2. Rebuild the kernel
>  3. Rebuild header files
> # cd /usr/src && make includes

Hi,

I'm sorry, but I'm stuck here.

smc24# cd /usr/src && make includes
cd /usr/src/include &&  su build -c 'exec make prereq' &&  exec make
includes
preparing in /usr/src/include/../lib/libcrypto
cat /usr/src/lib/libcrypto/objects/obj_mac.num > obj_mac.num.tmp
/bin/sh: cannot create obj_mac.num.tmp: Permission denied
*** Error 1 in lib/libcrypto (Makefile:460 'obj_mac.h')
*** Error 2 in include (Makefile:81 'prereq': @for i in ../lib/libcrypto
../lib/librpcsvc; do  echo preparing in /usr/src/include/$i;  cd /u...)
*** Error 2 in /usr/src (Makefile:55 'includes')

I'm doing "make includes" as root ..



>  4. Rebuild sysctl(8)
> # cd /usr/src/sbin/sysctl && make && make install
>  5. Reboot
>  6. Enable the feature
> # sysctl net.inet.tcp.large_recv_offload=1
> # ifconfig ix0 down && ifconfig ix0 up
> 
> I tested this diff for a while in different scenarios (receiving,
> routing, relaying) without noticing any problems yet.
> 
> bluhm@ already suggested that I could change the feature switch from a
> global sysctl(2) to an per interface ifconfig(8) option.  This would
> give the user more control.
> 
> Tests with other ix(4) NICs are welcome and needed!
> 
> bye,
> Jan
> 
> Index: dev/pci/if_ix.c
> ===
> RCS file: /cvs/src/sys/dev/pci/if_ix.c,v
> retrieving revision 1.185
> diff -u -p -r1.185 if_ix.c
> --- dev/pci/if_ix.c   15 Mar 2022 11:22:10 -  1.185
> +++ dev/pci/if_ix.c   23 May 2022 14:39:45 -
> @@ -2870,7 +2870,7 @@ ixgbe_initialize_receive_units(struct ix
>  {
>   struct rx_ring  *rxr = sc->rx_rings;
>   struct ixgbe_hw *hw = >hw;
> - uint32_tbufsz, fctrl, srrctl, rxcsum;
> + uint32_tbufsz, fctrl, srrctl, rxcsum, rdrxctl;
>   uint32_thlreg;
>   int i;
>  
> @@ -2894,6 +2894,14 @@ ixgbe_initialize_receive_units(struct ix
>   hlreg |= IXGBE_HLREG0_JUMBOEN;
>   IXGBE_WRITE_REG(hw, IXGBE_HLREG0, hlreg);
>  
> + if (tcp_lro) {
> + /* enable RSCACKC for RSC */
> + rdrxctl = IXGBE_READ_REG(hw, IXGBE_RDRXCTL);
> + rdrxctl |= IXGBE_RDRXCTL_RSCACKC;
> + rdrxctl |= IXGBE_RDRXCTL_FCOE_WRFIX;
> + IXGBE_WRITE_REG(hw, IXGBE_RDRXCTL, rdrxctl);
> + }
> +
>   bufsz = (sc->rx_mbuf_sz - ETHER_ALIGN) >> IXGBE_SRRCTL_BSIZEPKT_SHIFT;
>  
>   for (i = 0; i < sc->num_queues; i++, rxr++) {
> @@ -2909,6 +2917,12 @@ ixgbe_initialize_receive_units(struct ix
>   /* Set up the SRRCTL register */
>   srrctl = bufsz | IXGBE_SRRCTL_DESCTYPE_ADV_ONEBUF;
>   IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(i), srrctl);
> +
> + if (tcp_lro) {
> + /* Enable Receive Side Coalescing */
> + IXGBE_WRITE_REG(hw, IXGBE_RSCCTL(i),
> + IXGBE_RSCCTL_RSCEN|IXGBE_RSCCTL_MAXDESC_16);
> + }
>  
>   /* Setup the HW Rx Head and Tail Descriptor Pointers */
>   IXGBE_WRITE_REG(hw, IXGBE_RDH(i), 0);
> Index: dev/pci/ixgbe.h
> ===
> RCS file: /cvs/src/sys/dev/pci/ixgbe.h,v
> retrieving revision 1.33
> diff -u -p -r1.33 ixgbe.h
> --- dev/pci/ixgbe.h   8 Feb 2022 03:38:00 -   1.33
> +++ dev/pci/ixgbe.h   23 May 2022 14:53:59 -
> @@ -61,11 +61,16 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
> +#include 
> +#include 
> +#include 
>  
>  #if NBPFILTER > 0
>  #include 
> Index: netinet/tcp_input.c
> ===
> RCS file: /cvs/src/sys/netinet/tcp_input.c,v
> retrieving revision 1.375
> diff -u -p -r1.375 tcp_input.c
> --- netinet/tcp_input.c   4 Jan 2022 06:32:39 -   1.375
> +++ netinet/tcp_input.c   23 May 2022 14:41:59 -
> @@ -126,6 +126,7 @@ struct timeval tcp_rst_ppslim_last;
>  int tcp_ackdrop_ppslim = 100;/* 100pps */
>  int tcp_ackdrop_ppslim_count = 0;
>  struct timeval tcp_ackdrop_ppslim_last;
> +int tcp_lro = 0; /* TCP Large Receive Offload */
>  
>  #define TCP_PAWS_IDLE(24 * 24 * 60 * 60 * PR_SLOWHZ)
>  
> Index: netinet/tcp_usrreq.c
> ===
> RCS file: 

pf_state_export panic with NET_TASKQ=6 and stuff ....

2022-05-27 Thread Hrvoje Popovski
Hi all,

I'm running a firewall in production with NET_TASKQ=6, with claudio@'s
"use timeout for rttimer" and bluhm@'s "kernel lock in arp" diffs.
After a week or so of running smoothly I got a panic.

I'm aware that it's not a plain snapshot, but having two firewalls with
carp and pfsync gives me room to play around and report back ...


Panic log in attachment

dmesg:
OpenBSD 7.1-current (GENERIC.MP) #24: Sun May 22 19:35:12 CEST 2022
hrv...@bcbnfw1.lan:/sys/arch/amd64/compile/GENERIC.MP
real mem = 34224844800 (32639MB)
avail mem = 32913821696 (31389MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xec9b0 (62 entries)
bios0: vendor American Megatrends Inc. version "3.1c" date 05/02/2019
bios0: Supermicro X10SRW-F
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SPMI MCFG UEFI HPET NFIT WDDT
SSDT NITR SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices IP2P(S4) EHC1(S4) EHC2(S4) RP01(S4) RP02(S4)
RP03(S4) RP04(S4) RP05(S4) RP06(S4) RP07(S4) RP08(S4) BR1A(S4) BR1B(S4)
BR2A(S4) BR2B(S4) BR2C(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.55 MHz, 06-4f-01
cpu0:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.01 MHz, 06-4f-01
cpu1:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.01 MHz, 06-4f-01
cpu2:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu3:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
cpu4 at mainbus0: apid 8 (application processor)
cpu4: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu4:
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,DCA,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,PQM,RDSEED,ADX,SMAP,PT,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu4: 256KB 64b/line 8-way L2 cache
cpu4: smt 0, core 4, package 0
cpu5 at mainbus0: apid 10 (application processor)
cpu5: Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz, 3600.00 MHz, 06-4f-01
cpu5:

Re: kernel lock in arp

2022-05-21 Thread Hrvoje Popovski
On 18.5.2022. 20:52, Hrvoje Popovski wrote:
> On 18.5.2022. 17:31, Alexander Bluhm wrote:
>> Hi,
>>
>> For parallel IP forwarding I had put kernel lock around arpresolve()
>> as a quick workaround for crashes.  Moving the kernel lock inside
>> the function makes the hot path lock free.  I see a slight performance
>> increase in my test and no lock contention in the kstack flamegraph.
>>
>> http://bluhm.genua.de/perform/results/latest/2022-05-16T00%3A00%3A00Z/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
>> http://bluhm.genua.de/perform/results/latest/patch-sys-arpresolve-kernel-lock.1/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
>>
>> Search for kernel_lock.  Matched goes from 0.6% to 0.2%
>>
>> We are running such a diff in our genua code for a while.  I think
>> route flags need more love.  I doubt that all flags and fields are
>> consistent when run on multiple CPU.  But this diff does not make
>> it worse and increases MP pressure.
> 
> Hi,
> 
> I'm seeing an increase in forwarding performance from 2Mpps to 2.4Mpps
> with this diff and NET_TASKQ=4 on 6 x E5-2643 v2 @ 3.50GHz, 3600.01 MHz
> 
> Thank you ..
> 
> 

Hi,

I'm not seeing any fallout with this diff. I'm running it on
production and test firewalls.



Re: kernel lock in arp

2022-05-18 Thread Hrvoje Popovski
On 18.5.2022. 17:31, Alexander Bluhm wrote:
> Hi,
> 
> For parallel IP forwarding I had put kernel lock around arpresolve()
> as a quick workaround for crashes.  Moving the kernel lock inside
> the function makes the hot path lock free.  I see a slight performance
> increase in my test and no lock contention in the kstack flamegraph.
> 
> http://bluhm.genua.de/perform/results/latest/2022-05-16T00%3A00%3A00Z/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
> http://bluhm.genua.de/perform/results/latest/patch-sys-arpresolve-kernel-lock.1/btrace/tcpbench_-S100_-t10_-n100_10.3.45.35-btrace-kstack.0.svg
> 
> Search for kernel_lock.  Matched goes from 0.6% to 0.2%
> 
> We are running such a diff in our genua code for a while.  I think
> route flags need more love.  I doubt that all flags and fields are
> consistent when run on multiple CPU.  But this diff does not make
> it worse and increases MP pressure.

Hi,

I'm seeing an increase in forwarding performance from 2Mpps to 2.4Mpps
with this diff and NET_TASKQ=4 on 6 x E5-2643 v2 @ 3.50GHz, 3600.01 MHz

Thank you ..




arpresolve: XXXXX route contains no arp information

2022-05-17 Thread Hrvoje Popovski
Hi all,

here we have a carp/pfsync setup, mostly for wireless and mobile users.
Besides pure forwarding and firewalling, the boxes are also dhcp servers
with dhcpsync.


From time to time in the logs I can see:

May 17 11:44:18 bcbnfw1 /bsd: arpresolve: 10.30.0.0: route contains no
arp information


I've run route monitor and I think that this is the one:

got message of size 272 on Tue May 17 11:44:18 2022
RTM_MISS: Lookup failed on this address: len 272, priority 0, table 0,
if# 0, pid: 0, seq 0, errno 17
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 (53)
Qb2.20.00.80.ff.ff.7c.dd.7e.81.ff.ff.ff.ff.00.00.00.00.00.00.00.00.07.00.00.00.00.00.00.00.70.0d.26.66.80.fd.ff.ff.80.65.01.00.60.00.00.00.00.00.00.00.00.00.00.00.38.36.b2.20.00.80.ff.ff.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.11


at that time (11:44:18) there are a few logs for ip 10.30.15.39 (a
mobile phone); please see the log in the attachment.


route 10.30.0/20 is a connected route to interface vlan1130

DestinationGatewayFlags   Refs  Use   Mtu  Prio
Iface
10.30.0/20 10.30.0.2  UCn  14122619 - 4
vlan1130

arp -an | grep 10.30.15.39
10.30.15.39  76:b9:98:72:7c:9c vlan1130 15m30s

Besides that, in the logs there are lots of:
arp: attempt to add entry for 10.33.0.116 on vlan1133 by
a4:45:19:7d:bb:28 on vlan1132

I'm explaining this log to myself like this: because there is one SSID
(eduroam) and lots of vlans, I will see this log no matter what ..


is this "arpresolve" log a bug?

got message of size 192 on Tue May 17 11:44:18 2022
RTM_ADD: Add Route: len 192, priority 3, table 0, if# 37, name vlan1130, pid: 
0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 link#37 90:e2:ba:d7:1b:f4 10.30.0.2
got message of size 192 on Tue May 17 11:44:18 2022
RTM_DELETE: Delete Route: len 192, priority 3, table 0, if# 37, name vlan1130, 
pid: 0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 link#37 90:e2:ba:d7:1b:f4 10.30.0.2
got message of size 192 on Tue May 17 11:44:18 2022
RTM_ADD: Add Route: len 192, priority 3, table 0, if# 37, name vlan1130, pid: 
0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 link#37 90:e2:ba:d7:1b:f4 10.30.0.2
got message of size 192 on Tue May 17 11:44:18 2022
RTM_DELETE: Delete Route: len 192, priority 3, table 0, if# 37, name vlan1130, 
pid: 0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 link#37 90:e2:ba:d7:1b:f4 10.30.0.2
got message of size 192 on Tue May 17 11:44:18 2022
RTM_ADD: Add Route: len 192, priority 3, table 0, if# 37, name vlan1130, pid: 
0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 link#37 90:e2:ba:d7:1b:f4 10.30.0.2
got message of size 272 on Tue May 17 11:44:18 2022
RTM_MISS: Lookup failed on this address: len 272, priority 0, table 0, if# 0, 
pid: 0, seq 0, errno 17
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 (53) 
Qb2.20.00.80.ff.ff.7c.dd.7e.81.ff.ff.ff.ff.00.00.00.00.00.00.00.00.07.00.00.00.00.00.00.00.70.0d.26.66.80.fd.ff.ff.80.65.01.00.60.00.00.00.00.00.00.00.00.00.00.00.38.36.b2.20.00.80.ff.ff.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.11
got message of size 192 on Tue May 17 11:44:18 2022
RTM_RESOLVE: Route created by cloning: len 192, priority 3, table 0, if# 37, 
name vlan1130, pid: 0, seq 0, errno 0
flags:
fmask:
use:0   mtu:0expire:0
locks:  inits:
sockaddrs: 
 10.30.15.39 76:b9:98:72:7c:9c 90:e2:ba:d7:1b:f4 10.30.0.2
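
To catch only these events the next time they show up, the route monitor
output can be filtered; a rough sketch (address from the logs above):

```shell
# Show RTM_MISS/RTM_ADD/RTM_DELETE/RTM_RESOLVE messages plus a few lines
# of context, so the sockaddrs for the client address stay visible.
route monitor | grep -A 6 -E 'RTM_(MISS|ADD|DELETE|RESOLVE)'
```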


Re: more generic cpu freq reporting

2022-04-26 Thread Hrvoje Popovski
On 25.4.2022. 19:38, Hrvoje Popovski wrote:
> On 25.4.2022. 17:32, Claudio Jeker wrote:
>> You may need to play with hw.setperf and maybe run a single cpu load to
>> see boost behaviour. I noticed that my 7th gen Intel CPU behaves different
>> to the AMD Ryzen CPUs I own.
> This is like playing with a new desktop environment... Thank you :)


Just for fun :)

Dell R7515 with AMD EPYC 7702P 64-Core Processor

hw.sensors.cpu0.frequency0=255000.00 Hz
hw.sensors.cpu1.frequency0=27.00 Hz
hw.sensors.cpu2.frequency0=275000.00 Hz
hw.sensors.cpu3.frequency0=315000.00 Hz
hw.sensors.cpu4.frequency0=20.00 Hz
hw.sensors.cpu5.frequency0=335000.00 Hz
hw.sensors.cpu6.frequency0=305000.00 Hz
hw.sensors.cpu7.frequency0=305000.00 Hz
hw.sensors.cpu8.frequency0=29.00 Hz
hw.sensors.cpu9.frequency0=32.00 Hz
hw.sensors.cpu10.frequency0=325000.00 Hz
hw.sensors.cpu11.frequency0=22.00 Hz
hw.sensors.cpu12.frequency0=245000.00 Hz
hw.sensors.cpu13.frequency0=32.00 Hz
hw.sensors.cpu14.frequency0=30.00 Hz
hw.sensors.cpu15.frequency0=21.00 Hz
hw.sensors.cpu16.frequency0=32.00 Hz
hw.sensors.cpu17.frequency0=325000.00 Hz
hw.sensors.cpu18.frequency0=30.00 Hz
hw.sensors.cpu19.frequency0=25.00 Hz
hw.sensors.cpu20.frequency0=335000.00 Hz
hw.sensors.cpu21.frequency0=33.00 Hz
hw.sensors.cpu22.frequency0=26.00 Hz
hw.sensors.cpu23.frequency0=195000.00 Hz
hw.sensors.cpu24.frequency0=33.00 Hz
hw.sensors.cpu25.frequency0=28.00 Hz
hw.sensors.cpu26.frequency0=30.00 Hz
hw.sensors.cpu27.frequency0=325000.00 Hz
hw.sensors.cpu28.frequency0=32.00 Hz
hw.sensors.cpu29.frequency0=29.00 Hz
hw.sensors.cpu30.frequency0=275000.00 Hz
hw.sensors.cpu31.frequency0=195000.00 Hz
hw.sensors.cpu32.frequency0=315000.00 Hz
hw.sensors.cpu33.frequency0=325000.00 Hz
hw.sensors.cpu34.frequency0=295000.00 Hz
hw.sensors.cpu35.frequency0=30.00 Hz
hw.sensors.cpu36.frequency0=315000.00 Hz
hw.sensors.cpu37.frequency0=325000.00 Hz
hw.sensors.cpu38.frequency0=30.00 Hz
hw.sensors.cpu39.frequency0=29.00 Hz
hw.sensors.cpu40.frequency0=295000.00 Hz
hw.sensors.cpu41.frequency0=325000.00 Hz
hw.sensors.cpu42.frequency0=325000.00 Hz
hw.sensors.cpu43.frequency0=30.00 Hz
hw.sensors.cpu44.frequency0=325000.00 Hz
hw.sensors.cpu45.frequency0=32.00 Hz
hw.sensors.cpu46.frequency0=335000.00 Hz
hw.sensors.cpu47.frequency0=305000.00 Hz
hw.sensors.cpu48.frequency0=325000.00 Hz
hw.sensors.cpu49.frequency0=33.00 Hz
hw.sensors.cpu50.frequency0=23.00 Hz
hw.sensors.cpu51.frequency0=275000.00 Hz
hw.sensors.cpu52.frequency0=225000.00 Hz
hw.sensors.cpu53.frequency0=255000.00 Hz
hw.sensors.cpu54.frequency0=315000.00 Hz
hw.sensors.cpu55.frequency0=325000.00 Hz
hw.sensors.cpu56.frequency0=31.00 Hz
hw.sensors.cpu57.frequency0=285000.00 Hz
hw.sensors.cpu58.frequency0=21.00 Hz
hw.sensors.cpu59.frequency0=24.00 Hz
hw.sensors.cpu60.frequency0=275000.00 Hz
hw.sensors.cpu61.frequency0=31.00 Hz
hw.sensors.cpu62.frequency0=215000.00 Hz
hw.sensors.cpu63.frequency0=215000.00 Hz
hw.sensors.ksmn0.temp0=46.50 degC (Tctl)
hw.sensors.ksmn0.temp1=42.00 degC (Tccd0)
hw.sensors.ksmn0.temp2=46.25 degC (Tccd1)
hw.sensors.ksmn0.temp3=41.00 degC (Tccd2)
hw.sensors.ksmn0.temp4=41.50 degC (Tccd3)
hw.sensors.ksmn0.temp5=42.25 degC (Tccd4)
hw.sensors.ksmn0.temp6=44.50 degC (Tccd5)
hw.sensors.ksmn0.temp7=42.00 degC (Tccd6)
hw.sensors.ksmn0.temp8=44.00 degC (Tccd7)
hw.sensors.mfii0.temp0=28.00 degC (bbu)
hw.sensors.mfii0.volt0=3.94 VDC (bbu)
hw.sensors.mfii0.current0=0.00 A (bbu)
hw.sensors.mfii0.indicator0=On (bbu ok), OK
hw.sensors.mfii0.indicator1=Off (pack missing)
hw.sensors.mfii0.indicator2=Off (voltage low)
hw.sensors.mfii0.indicator3=Off (temp high)
hw.sensors.mfii0.indicator4=Off (charge active)
hw.sensors.mfii0.indicator5=Off (discharge active)
hw.sensors.mfii0.indicator6=Off (learn cycle req'd)
hw.sensors.mfii0.indicator7=Off (learn cycle active)
hw.sensors.mfii0.indicator8=Off (learn cycle failed)
hw.sensors.mfii0.indicator9=Off (learn cycle timeout)
hw.sensors.mfii0.indicator10=Off (I2C errors)
hw.sensors.mfii0.indicator11=Off (replace pack)
hw.sensors.mfii0.indicator12=Off (low capacity)
hw.sensors.mfii0.indicator13=Off (periodic learn req'd)
hw.sensors.mfii0.drive0=online (sd0), OK
hw.sensors.ksmn1.temp0=46.50 degC (Tctl)
hw.sensors.ksmn1.temp1=42.00 degC (Tccd0)
hw.sensors.ksmn1.temp2=46.25 degC (Tccd1)
hw.sensors.ksmn1.temp3=41.00 degC (Tccd2)
hw.sensors.ksmn1.temp4=41.50 degC (Tccd3)
hw.sensors.ksmn1.temp5=42.25 degC (Tccd4)
hw.sensors.ksmn1.temp6=44.50 degC (Tccd5)
hw.sensors.ksmn1.temp7=42.00 degC (Tccd6)
hw.sensors.ksmn1.temp8=44.00 degC (Tccd7)
hw.sensors.ksmn2.temp0=46.50 degC (Tctl)
hw.sensors.ksmn2.temp1=42.00 degC (Tccd0)
hw.sensor

Re: more generic cpu freq reporting

2022-04-25 Thread Hrvoje Popovski
On 25.4.2022. 17:32, Claudio Jeker wrote:
> You may need to play with hw.setperf and maybe run a single cpu load to
> see boost behaviour. I noticed that my 7th gen Intel CPU behaves differently
> from the AMD Ryzen CPUs I own.

This is like playing with a new desktop environment... Thank you :)


2 cores - E5-2650 v3 @ 2.30GHz, 2996.54 MHz, 06-3f-02
hw.cpuspeed=2996
hw.sensors.cpu0.temp0=54.00 degC
hw.sensors.cpu0.frequency0=30.00 Hz
hw.sensors.cpu1.frequency0=30.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK


4 cores - E5-2650 v3 @ 2.30GHz, 2996.54 MHz, 06-3f-02
hw.cpuspeed=2996
hw.sensors.cpu0.temp0=45.00 degC
hw.sensors.cpu0.frequency0=30.00 Hz
hw.sensors.cpu1.frequency0=30.00 Hz
hw.sensors.cpu2.frequency0=30.00 Hz
hw.sensors.cpu3.frequency0=30.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK


8 cores - E5-2650 v3 @ 2.30GHz, 2696.88 MHz, 06-3f-02
hw.cpuspeed=2697
hw.sensors.cpu0.temp0=50.00 degC
hw.sensors.cpu0.frequency0=27.00 Hz
hw.sensors.cpu1.frequency0=27.00 Hz
hw.sensors.cpu2.frequency0=27.00 Hz
hw.sensors.cpu3.frequency0=27.00 Hz
hw.sensors.cpu4.frequency0=27.00 Hz
hw.sensors.cpu5.frequency0=27.00 Hz
hw.sensors.cpu6.frequency0=27.00 Hz
hw.sensors.cpu7.frequency0=27.00 Hz


12 cores - E5-2650 v3 @ 2.30GHz, 2597.00 MHz, 06-3f-02
hw.cpuspeed=2597
hw.sensors.cpu0.temp0=50.00 degC
hw.sensors.cpu0.frequency0=26.00 Hz
hw.sensors.cpu1.frequency0=26.00 Hz
hw.sensors.cpu2.frequency0=26.00 Hz
hw.sensors.cpu3.frequency0=26.00 Hz
hw.sensors.cpu4.frequency0=26.00 Hz
hw.sensors.cpu5.frequency0=26.00 Hz
hw.sensors.cpu6.frequency0=26.00 Hz
hw.sensors.cpu7.frequency0=26.00 Hz
hw.sensors.cpu8.frequency0=26.00 Hz
hw.sensors.cpu9.frequency0=26.00 Hz
hw.sensors.cpu10.frequency0=26.00 Hz
hw.sensors.cpu11.frequency0=26.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK


16 cores - E5-2650 v3 @ 2.30GHz, 2597.01 MHz, 06-3f-02
hw.cpuspeed=2597
hw.sensors.cpu0.temp0=59.00 degC
hw.sensors.cpu0.frequency0=26.00 Hz
hw.sensors.cpu1.frequency0=26.00 Hz
hw.sensors.cpu2.frequency0=26.00 Hz
hw.sensors.cpu3.frequency0=26.00 Hz
hw.sensors.cpu4.frequency0=26.00 Hz
hw.sensors.cpu5.frequency0=26.00 Hz
hw.sensors.cpu6.frequency0=26.00 Hz
hw.sensors.cpu7.frequency0=26.00 Hz
hw.sensors.cpu8.frequency0=26.00 Hz
hw.sensors.cpu9.frequency0=26.00 Hz
hw.sensors.cpu10.frequency0=26.00 Hz
hw.sensors.cpu11.frequency0=26.00 Hz
hw.sensors.cpu12.frequency0=26.00 Hz
hw.sensors.cpu13.frequency0=26.00 Hz
hw.sensors.cpu14.frequency0=26.00 Hz
hw.sensors.cpu15.frequency0=26.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK



20 cores - E5-2650 v3 @ 2.30GHz, 2597.00 MHz, 06-3f-02
hw.cpuspeed=2597
hw.sensors.cpu0.temp0=61.00 degC
hw.sensors.cpu0.frequency0=26.00 Hz
hw.sensors.cpu1.frequency0=26.00 Hz
hw.sensors.cpu2.frequency0=26.00 Hz
hw.sensors.cpu3.frequency0=26.00 Hz
hw.sensors.cpu4.frequency0=26.00 Hz
hw.sensors.cpu5.frequency0=26.00 Hz
hw.sensors.cpu6.frequency0=26.00 Hz
hw.sensors.cpu7.frequency0=26.00 Hz
hw.sensors.cpu8.frequency0=26.00 Hz
hw.sensors.cpu9.frequency0=26.00 Hz
hw.sensors.cpu10.frequency0=26.00 Hz
hw.sensors.cpu11.frequency0=26.00 Hz
hw.sensors.cpu12.frequency0=26.00 Hz
hw.sensors.cpu13.frequency0=26.00 Hz
hw.sensors.cpu14.frequency0=26.00 Hz
hw.sensors.cpu15.frequency0=26.00 Hz
hw.sensors.cpu16.frequency0=26.00 Hz
hw.sensors.cpu17.frequency0=26.00 Hz
hw.sensors.cpu18.frequency0=26.00 Hz
hw.sensors.cpu19.frequency0=26.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK




OpenBSD 7.1-current (GENERIC.MP) #12: Mon Apr 25 19:30:26 CEST 2022
hrv...@r430.srce.hr:/sys/arch/amd64/compile/GENERIC.MP
real mem = 137189920768 (130834MB)
avail mem = 133014806528 (126852MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7af09000 (64 entries)
bios0: vendor Dell Inc. version "2.14.0" date 01/25/2022
bios0: Dell Inc. PowerEdge R430
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S5
acpi0: tables DSDT FACP MCEJ WD__ SLIC HPET APIC MCFG MSCT SLIT SRAT
SSDT SSDT SSDT PRAD DMAR HEST BERT ERST EINJ
acpi0: wakeup devices PCI0(S4) BR1A(S4) BR2A(S4) BR2B(S4) BR2C(S4)
BR2D(S4) BR3A(S4) BR3B(S4) BR3C(S4) BR3D(S4) XHC_(S0) RP01(S4) RP02(S4)
RP03(S4) RP04(S4) RP05(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2597.34 MHz, 06-3f-02
cpu0:

Re: more generic cpu freq reporting

2022-04-25 Thread Hrvoje Popovski
On 25.4.2022. 16:50, Hrvoje Popovski wrote:
> On 25.4.2022. 16:19, Claudio Jeker wrote:
>> After I sent out my ksmn(4) diff to include cpu frequency sensors dlg@
>> told me that this is a generic way to find the cpu frequency on modern x86
>> cpus (both intel and amd support it).
>>
>> So this diff cleans up the CPU frequency sensors and moves them to the
>> cpu(4). I had to split the sensor attachment up since sensordev_install()
>> calls into hotplug which does a selwakeup() and that call locks up (I
>> guess it is the KERNEL_LOCK()). Moving that part of the code to
>> cpu_attach() makes the problem go away.
>>
>> Tested on an AMD Ryzen Pro 5850U and an Intel Core i7-7500U.
> 
> Hi,
> 
> Supermicro AS-1114S-WTRT with AMD EPYC 7413
> 
> 
> smc24# sysctl hw.sensors
> hw.sensors.cpu0.frequency0=29.00 Hz
> hw.sensors.cpu1.frequency0=21.00 Hz
> hw.sensors.cpu2.frequency0=195000.00 Hz
> hw.sensors.cpu3.frequency0=235000.00 Hz
> hw.sensors.cpu4.frequency0=195000.00 Hz
> hw.sensors.cpu5.frequency0=195000.00 Hz
> hw.sensors.cpu6.frequency0=19.00 Hz
> hw.sensors.cpu7.frequency0=195000.00 Hz
> hw.sensors.cpu8.frequency0=195000.00 Hz
> hw.sensors.cpu9.frequency0=195000.00 Hz
> hw.sensors.cpu10.frequency0=195000.00 Hz
> hw.sensors.cpu11.frequency0=195000.00 Hz
> hw.sensors.cpu12.frequency0=205000.00 Hz
> hw.sensors.cpu13.frequency0=28.00 Hz
> hw.sensors.cpu14.frequency0=20.00 Hz
> hw.sensors.cpu15.frequency0=29.00 Hz
> hw.sensors.cpu16.frequency0=21.00 Hz
> hw.sensors.cpu17.frequency0=20.00 Hz
> hw.sensors.cpu18.frequency0=19.00 Hz
> hw.sensors.cpu19.frequency0=27.00 Hz
> hw.sensors.cpu20.frequency0=195000.00 Hz
> hw.sensors.cpu21.frequency0=215000.00 Hz
> hw.sensors.cpu22.frequency0=255000.00 Hz
> hw.sensors.cpu23.frequency0=20.00 Hz
> hw.sensors.ksmn0.temp0=47.12 degC
> hw.sensors.ksmn1.temp0=47.12 degC
> hw.sensors.ksmn2.temp0=47.12 degC
> hw.sensors.ksmn3.temp0=47.12 degC
> 
> 
> while doing make -j24
> hw.sensors.cpu0.frequency0=36.00 Hz
> hw.sensors.cpu1.frequency0=36.00 Hz
> hw.sensors.cpu2.frequency0=36.00 Hz
> hw.sensors.cpu3.frequency0=36.00 Hz
> hw.sensors.cpu4.frequency0=36.00 Hz
> hw.sensors.cpu5.frequency0=36.00 Hz
> hw.sensors.cpu6.frequency0=36.00 Hz
> hw.sensors.cpu7.frequency0=36.00 Hz
> hw.sensors.cpu8.frequency0=36.00 Hz
> hw.sensors.cpu9.frequency0=36.00 Hz
> hw.sensors.cpu10.frequency0=36.00 Hz
> hw.sensors.cpu11.frequency0=36.00 Hz
> hw.sensors.cpu12.frequency0=36.00 Hz
> hw.sensors.cpu13.frequency0=36.00 Hz
> hw.sensors.cpu14.frequency0=36.00 Hz
> hw.sensors.cpu15.frequency0=36.00 Hz
> hw.sensors.cpu16.frequency0=36.00 Hz
> hw.sensors.cpu17.frequency0=36.00 Hz
> hw.sensors.cpu18.frequency0=36.00 Hz
> hw.sensors.cpu19.frequency0=36.00 Hz
> hw.sensors.cpu20.frequency0=36.00 Hz
> hw.sensors.cpu21.frequency0=36.00 Hz
> hw.sensors.cpu22.frequency0=36.00 Hz
> hw.sensors.cpu23.frequency0=36.00 Hz
> hw.sensors.ksmn0.temp0=63.25 degC
> hw.sensors.ksmn1.temp0=63.25 degC
> hw.sensors.ksmn2.temp0=63.25 degC
> hw.sensors.ksmn3.temp0=63.25 degC
> 

Dell R430 with Intel E5-2650 v3

before
r430# sysctl hw.sensors
hw.sensors.cpu0.temp0=51.00 degC
hw.sensors.mfii0.drive0=online (sd0), OK


after
r430# sysctl hw.sensors
hw.sensors.cpu0.temp0=55.00 degC
hw.sensors.cpu0.frequency0=26.00 Hz
hw.sensors.cpu1.frequency0=26.00 Hz
hw.sensors.cpu2.frequency0=26.00 Hz
hw.sensors.cpu3.frequency0=26.00 Hz
hw.sensors.cpu4.frequency0=26.00 Hz
hw.sensors.cpu5.frequency0=26.00 Hz
hw.sensors.cpu6.frequency0=26.00 Hz
hw.sensors.cpu7.frequency0=26.00 Hz
hw.sensors.cpu8.frequency0=26.00 Hz
hw.sensors.cpu9.frequency0=26.00 Hz
hw.sensors.cpu10.frequency0=26.00 Hz
hw.sensors.cpu11.frequency0=26.00 Hz
hw.sensors.cpu12.frequency0=26.00 Hz
hw.sensors.cpu13.frequency0=26.00 Hz
hw.sensors.cpu14.frequency0=26.00 Hz
hw.sensors.cpu15.frequency0=26.00 Hz
hw.sensors.cpu16.frequency0=26.00 Hz
hw.sensors.cpu17.frequency0=26.00 Hz
hw.sensors.cpu18.frequency0=26.00 Hz
hw.sensors.cpu19.frequency0=26.00 Hz
hw.sensors.mfii0.drive0=online (sd0), OK



while doing make -j20
r430# sysctl hw.sensors
hw.sensors.cpu0.temp0=62.00 degC
hw.sensors.cpu0.frequency0=26.00 Hz
hw.sensors.cpu1.frequency0=26.00 Hz
hw.sensors.cpu2.frequency0=26.00 Hz
hw.sensors.cpu3.frequency0=26

Re: more generic cpu freq reporting

2022-04-25 Thread Hrvoje Popovski
On 25.4.2022. 16:19, Claudio Jeker wrote:
> After I sent out my ksmn(4) diff to include cpu frequency sensors dlg@
> told me that this is a generic way to find the cpu frequency on modern x86
> cpus (both intel and amd support it).
> 
> So this diff cleans up the CPU frequency sensors and moves them to the
> cpu(4). I had to split the sensor attachment up since sensordev_install()
> calls into hotplug which does a selwakeup() and that call locks up (I
> guess it is the KERNEL_LOCK()). Moving that part of the code to
> cpu_attach() makes the problem go away.
> 
> Tested on an AMD Ryzen Pro 5850U and an Intel Core i7-7500U.

Hi,

Supermicro AS-1114S-WTRT with AMD EPYC 7413


smc24# sysctl hw.sensors
hw.sensors.cpu0.frequency0=29.00 Hz
hw.sensors.cpu1.frequency0=21.00 Hz
hw.sensors.cpu2.frequency0=195000.00 Hz
hw.sensors.cpu3.frequency0=235000.00 Hz
hw.sensors.cpu4.frequency0=195000.00 Hz
hw.sensors.cpu5.frequency0=195000.00 Hz
hw.sensors.cpu6.frequency0=19.00 Hz
hw.sensors.cpu7.frequency0=195000.00 Hz
hw.sensors.cpu8.frequency0=195000.00 Hz
hw.sensors.cpu9.frequency0=195000.00 Hz
hw.sensors.cpu10.frequency0=195000.00 Hz
hw.sensors.cpu11.frequency0=195000.00 Hz
hw.sensors.cpu12.frequency0=205000.00 Hz
hw.sensors.cpu13.frequency0=28.00 Hz
hw.sensors.cpu14.frequency0=20.00 Hz
hw.sensors.cpu15.frequency0=29.00 Hz
hw.sensors.cpu16.frequency0=21.00 Hz
hw.sensors.cpu17.frequency0=20.00 Hz
hw.sensors.cpu18.frequency0=19.00 Hz
hw.sensors.cpu19.frequency0=27.00 Hz
hw.sensors.cpu20.frequency0=195000.00 Hz
hw.sensors.cpu21.frequency0=215000.00 Hz
hw.sensors.cpu22.frequency0=255000.00 Hz
hw.sensors.cpu23.frequency0=20.00 Hz
hw.sensors.ksmn0.temp0=47.12 degC
hw.sensors.ksmn1.temp0=47.12 degC
hw.sensors.ksmn2.temp0=47.12 degC
hw.sensors.ksmn3.temp0=47.12 degC


while doing make -j24
hw.sensors.cpu0.frequency0=36.00 Hz
hw.sensors.cpu1.frequency0=36.00 Hz
hw.sensors.cpu2.frequency0=36.00 Hz
hw.sensors.cpu3.frequency0=36.00 Hz
hw.sensors.cpu4.frequency0=36.00 Hz
hw.sensors.cpu5.frequency0=36.00 Hz
hw.sensors.cpu6.frequency0=36.00 Hz
hw.sensors.cpu7.frequency0=36.00 Hz
hw.sensors.cpu8.frequency0=36.00 Hz
hw.sensors.cpu9.frequency0=36.00 Hz
hw.sensors.cpu10.frequency0=36.00 Hz
hw.sensors.cpu11.frequency0=36.00 Hz
hw.sensors.cpu12.frequency0=36.00 Hz
hw.sensors.cpu13.frequency0=36.00 Hz
hw.sensors.cpu14.frequency0=36.00 Hz
hw.sensors.cpu15.frequency0=36.00 Hz
hw.sensors.cpu16.frequency0=36.00 Hz
hw.sensors.cpu17.frequency0=36.00 Hz
hw.sensors.cpu18.frequency0=36.00 Hz
hw.sensors.cpu19.frequency0=36.00 Hz
hw.sensors.cpu20.frequency0=36.00 Hz
hw.sensors.cpu21.frequency0=36.00 Hz
hw.sensors.cpu22.frequency0=36.00 Hz
hw.sensors.cpu23.frequency0=36.00 Hz
hw.sensors.ksmn0.temp0=63.25 degC
hw.sensors.ksmn1.temp0=63.25 degC
hw.sensors.ksmn2.temp0=63.25 degC
hw.sensors.ksmn3.temp0=63.25 degC



Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Hrvoje Popovski
On 24.4.2022. 20:24, Hrvoje Popovski wrote:
> after diff
> smc24# sysctl | grep ksmn
> hw.sensors.ksmn0.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn0.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn0.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn0.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn0.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn0.frequency0=1960043474.00 Hz (CPU0)
> hw.sensors.ksmn0.frequency1=2178969010.00 Hz (CPU1)
> hw.sensors.ksmn0.frequency2=2021703765.00 Hz (CPU2)
> hw.sensors.ksmn0.frequency3=2791496996.00 Hz (CPU3)
> hw.sensors.ksmn0.frequency4=1936332732.00 Hz (CPU4)
> hw.sensors.ksmn0.frequency5=1952819576.00 Hz (CPU5)
> hw.sensors.ksmn0.frequency6=1895289933.00 Hz (CPU6)
> hw.sensors.ksmn0.frequency7=1906813124.00 Hz (CPU7)
> hw.sensors.ksmn0.frequency8=1916662200.00 Hz (CPU8)
> hw.sensors.ksmn0.frequency9=1925515463.00 Hz (CPU9)
> hw.sensors.ksmn0.frequency10=2165544390.00 Hz (CPU10)
> hw.sensors.ksmn0.frequency11=1940854644.00 Hz (CPU11)
> hw.sensors.ksmn0.frequency12=1963695350.00 Hz (CPU12)
> hw.sensors.ksmn0.frequency13=2038281258.00 Hz (CPU13)
> hw.sensors.ksmn0.frequency14=1973428768.00 Hz (CPU14)
> hw.sensors.ksmn0.frequency15=2035124252.00 Hz (CPU15)
> hw.sensors.ksmn0.frequency16=1931312925.00 Hz (CPU16)
> hw.sensors.ksmn0.frequency17=191422.00 Hz (CPU17)
> hw.sensors.ksmn0.frequency18=1913169799.00 Hz (CPU18)
> hw.sensors.ksmn0.frequency19=2472108200.00 Hz (CPU19)
> hw.sensors.ksmn0.frequency20=1915108480.00 Hz (CPU20)
> hw.sensors.ksmn0.frequency21=2862980120.00 Hz (CPU21)
> hw.sensors.ksmn0.frequency22=2639124653.00 Hz (CPU22)
> hw.sensors.ksmn0.frequency23=1908989778.00 Hz (CPU23)
> hw.sensors.ksmn1.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn1.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn1.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn1.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn1.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn1.frequency0=1968382096.00 Hz (CPU0)
> hw.sensors.ksmn1.frequency1=2376295723.00 Hz (CPU1)
> hw.sensors.ksmn1.frequency2=2369074799.00 Hz (CPU2)
> hw.sensors.ksmn1.frequency3=2404712762.00 Hz (CPU3)
> hw.sensors.ksmn1.frequency4=2457581506.00 Hz (CPU4)
> hw.sensors.ksmn1.frequency5=2401611206.00 Hz (CPU5)
> hw.sensors.ksmn1.frequency6=2339088239.00 Hz (CPU6)
> hw.sensors.ksmn1.frequency7=2463824725.00 Hz (CPU7)
> hw.sensors.ksmn1.frequency8=2359482485.00 Hz (CPU8)
> hw.sensors.ksmn1.frequency9=2483767808.00 Hz (CPU9)
> hw.sensors.ksmn1.frequency10=2413048435.00 Hz (CPU10)
> hw.sensors.ksmn1.frequency11=2391201370.00 Hz (CPU11)
> hw.sensors.ksmn1.frequency12=1944466261.00 Hz (CPU12)
> hw.sensors.ksmn1.frequency13=1939033492.00 Hz (CPU13)
> hw.sensors.ksmn1.frequency14=1949862067.00 Hz (CPU14)
> hw.sensors.ksmn1.frequency15=1947783743.00 Hz (CPU15)
> hw.sensors.ksmn1.frequency16=1919198696.00 Hz (CPU16)
> hw.sensors.ksmn1.frequency17=1953120383.00 Hz (CPU17)
> hw.sensors.ksmn1.frequency18=2543332610.00 Hz (CPU18)
> hw.sensors.ksmn1.frequency19=2564893500.00 Hz (CPU19)
> hw.sensors.ksmn1.frequency20=2638202441.00 Hz (CPU20)
> hw.sensors.ksmn1.frequency21=2814783269.00 Hz (CPU21)
> hw.sensors.ksmn1.frequency22=2808046584.00 Hz (CPU22)
> hw.sensors.ksmn1.frequency23=2578708588.00 Hz (CPU23)
> hw.sensors.ksmn2.temp0=47.50 degC (Tctl)
> hw.sensors.ksmn2.temp1=45.25 degC (Tccd0)
> hw.sensors.ksmn2.temp2=46.00 degC (Tccd2)
> hw.sensors.ksmn2.temp3=45.75 degC (Tccd4)
> hw.sensors.ksmn2.temp4=47.50 degC (Tccd6)
> hw.sensors.ksmn2.frequency0=2001533749.00 Hz (CPU0)
> hw.sensors.ksmn2.frequency1=1948022864.00 Hz (CPU1)
> hw.sensors.ksmn2.frequency2=1949718978.00 Hz (CPU2)
> hw.sensors.ksmn2.frequency3=2093756889.00 Hz (CPU3)
> hw.sensors.ksmn2.frequency4=1948401172.00 Hz (CPU4)
> hw.sensors.ksmn2.frequency5=1990612716.00 Hz (CPU5)
> hw.sensors.ksmn2.frequency6=2112140214.00 Hz (CPU6)
> hw.sensors.ksmn2.frequency7=1962903090.00 Hz (CPU7)
> hw.sensors.ksmn2.frequency8=1985202582.00 Hz (CPU8)
> hw.sensors.ksmn2.frequency9=2190306365.00 Hz (CPU9)
> hw.sensors.ksmn2.frequency10=1991116471.00 Hz (CPU10)
> hw.sensors.ksmn2.frequency11=2002007440.00 Hz (CPU11)
> hw.sensors.ksmn2.frequency12=3126687467.00 Hz (CPU12)
> hw.sensors.ksmn2.frequency13=3360747003.00 Hz (CPU13)
> hw.sensors.ksmn2.frequency14=2544531280.00 Hz (CPU14)
> hw.sensors.ksmn2.frequency15=3270889025.00 Hz (CPU15)
> hw.sensors.ksmn2.frequency16=3112205978.00 Hz (CPU16)
> hw.sensors.ksmn2.frequency17=2553566819.00 Hz (CPU17)
> hw.sensors.ksmn2.frequency18=2106320461.00 Hz (CPU18)
> hw.sensors.ksmn2.frequency19=2580420523.00 Hz (CPU19)
> hw.sensors.ksmn2.frequency20=2046857758.00 Hz (CPU20)
> hw.sensors.ksmn2.frequency21=2440632976.00 Hz (CPU21)
> hw.sensors.ksmn2.frequency22=2398193682.00 Hz (CPU22)
> hw.sensors.ksmn2.frequency23=2242716702.00 Hz (CPU23)
>

Re: beef up ksmn(4) to show more temps and CPU frequency

2022-04-24 Thread Hrvoje Popovski
On 24.4.2022. 19:06, Claudio Jeker wrote:
> On Ryzen CPUs each CCD has a temp sensor. If the CPU has CCDs (which
> excludes Zen APU CPUs) this should show additional temp info. This is
> based on info from the Linux k10temp driver.
> 
> Additionally use the MSRs defined in "Open-Source Register Reference For
> AMD Family 17h Processors" to measure the CPU core frequency.
> That should be the actual speed of the CPU core during the measuring
> interval.
> 
> On my T14g2 the output is now for example:
> ksmn0.temp0   63.88 degC  Tctl
> ksmn0.frequency03553141515.00 Hz  CPU0
> ksmn0.frequency13549080315.00 Hz  CPU2
> ksmn0.frequency23552369937.00 Hz  CPU4
> ksmn0.frequency33546055048.00 Hz  CPU6
> ksmn0.frequency43546854449.00 Hz  CPU8
> ksmn0.frequency53543869698.00 Hz  CPU10
> ksmn0.frequency63542551127.00 Hz  CPU12
> ksmn0.frequency74441623647.00 Hz  CPU14
> 
> It is interesting to watch turbo kick in and how temp causes the CPU to
> throttle.
> 
> I only tested this on systems with APUs so I could not test the Tccd temp
> reporting.

Hi,

before diff

smc24# sysctl | grep ksmn
hw.sensors.ksmn0.temp0=48.00 degC
hw.sensors.ksmn1.temp0=48.00 degC
hw.sensors.ksmn2.temp0=48.00 degC
hw.sensors.ksmn3.temp0=48.00 degC
smc24#


after diff
smc24# sysctl | grep ksmn
hw.sensors.ksmn0.temp0=47.50 degC (Tctl)
hw.sensors.ksmn0.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn0.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn0.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn0.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn0.frequency0=1960043474.00 Hz (CPU0)
hw.sensors.ksmn0.frequency1=2178969010.00 Hz (CPU1)
hw.sensors.ksmn0.frequency2=2021703765.00 Hz (CPU2)
hw.sensors.ksmn0.frequency3=2791496996.00 Hz (CPU3)
hw.sensors.ksmn0.frequency4=1936332732.00 Hz (CPU4)
hw.sensors.ksmn0.frequency5=1952819576.00 Hz (CPU5)
hw.sensors.ksmn0.frequency6=1895289933.00 Hz (CPU6)
hw.sensors.ksmn0.frequency7=1906813124.00 Hz (CPU7)
hw.sensors.ksmn0.frequency8=1916662200.00 Hz (CPU8)
hw.sensors.ksmn0.frequency9=1925515463.00 Hz (CPU9)
hw.sensors.ksmn0.frequency10=2165544390.00 Hz (CPU10)
hw.sensors.ksmn0.frequency11=1940854644.00 Hz (CPU11)
hw.sensors.ksmn0.frequency12=1963695350.00 Hz (CPU12)
hw.sensors.ksmn0.frequency13=2038281258.00 Hz (CPU13)
hw.sensors.ksmn0.frequency14=1973428768.00 Hz (CPU14)
hw.sensors.ksmn0.frequency15=2035124252.00 Hz (CPU15)
hw.sensors.ksmn0.frequency16=1931312925.00 Hz (CPU16)
hw.sensors.ksmn0.frequency17=191422.00 Hz (CPU17)
hw.sensors.ksmn0.frequency18=1913169799.00 Hz (CPU18)
hw.sensors.ksmn0.frequency19=2472108200.00 Hz (CPU19)
hw.sensors.ksmn0.frequency20=1915108480.00 Hz (CPU20)
hw.sensors.ksmn0.frequency21=2862980120.00 Hz (CPU21)
hw.sensors.ksmn0.frequency22=2639124653.00 Hz (CPU22)
hw.sensors.ksmn0.frequency23=1908989778.00 Hz (CPU23)
hw.sensors.ksmn1.temp0=47.50 degC (Tctl)
hw.sensors.ksmn1.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn1.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn1.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn1.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn1.frequency0=1968382096.00 Hz (CPU0)
hw.sensors.ksmn1.frequency1=2376295723.00 Hz (CPU1)
hw.sensors.ksmn1.frequency2=2369074799.00 Hz (CPU2)
hw.sensors.ksmn1.frequency3=2404712762.00 Hz (CPU3)
hw.sensors.ksmn1.frequency4=2457581506.00 Hz (CPU4)
hw.sensors.ksmn1.frequency5=2401611206.00 Hz (CPU5)
hw.sensors.ksmn1.frequency6=2339088239.00 Hz (CPU6)
hw.sensors.ksmn1.frequency7=2463824725.00 Hz (CPU7)
hw.sensors.ksmn1.frequency8=2359482485.00 Hz (CPU8)
hw.sensors.ksmn1.frequency9=2483767808.00 Hz (CPU9)
hw.sensors.ksmn1.frequency10=2413048435.00 Hz (CPU10)
hw.sensors.ksmn1.frequency11=2391201370.00 Hz (CPU11)
hw.sensors.ksmn1.frequency12=1944466261.00 Hz (CPU12)
hw.sensors.ksmn1.frequency13=1939033492.00 Hz (CPU13)
hw.sensors.ksmn1.frequency14=1949862067.00 Hz (CPU14)
hw.sensors.ksmn1.frequency15=1947783743.00 Hz (CPU15)
hw.sensors.ksmn1.frequency16=1919198696.00 Hz (CPU16)
hw.sensors.ksmn1.frequency17=1953120383.00 Hz (CPU17)
hw.sensors.ksmn1.frequency18=2543332610.00 Hz (CPU18)
hw.sensors.ksmn1.frequency19=2564893500.00 Hz (CPU19)
hw.sensors.ksmn1.frequency20=2638202441.00 Hz (CPU20)
hw.sensors.ksmn1.frequency21=2814783269.00 Hz (CPU21)
hw.sensors.ksmn1.frequency22=2808046584.00 Hz (CPU22)
hw.sensors.ksmn1.frequency23=2578708588.00 Hz (CPU23)
hw.sensors.ksmn2.temp0=47.50 degC (Tctl)
hw.sensors.ksmn2.temp1=45.25 degC (Tccd0)
hw.sensors.ksmn2.temp2=46.00 degC (Tccd2)
hw.sensors.ksmn2.temp3=45.75 degC (Tccd4)
hw.sensors.ksmn2.temp4=47.50 degC (Tccd6)
hw.sensors.ksmn2.frequency0=2001533749.00 Hz (CPU0)
hw.sensors.ksmn2.frequency1=1948022864.00 Hz (CPU1)
hw.sensors.ksmn2.frequency2=1949718978.00 Hz (CPU2)
hw.sensors.ksmn2.frequency3=2093756889.00 Hz (CPU3)
hw.sensors.ksmn2.frequency4=1948401172.00 Hz (CPU4)
hw.sensors.ksmn2.frequency5=1990612716.00 Hz (CPU5)
hw.sensors.ksmn2.frequency6=2112140214.00 

Re: [External] : Re: pfsync(4) snapshot lists must have dedicated link element

2022-04-21 Thread Hrvoje Popovski
On 20.4.2022. 23:22, Alexandr Nedvedicky wrote:
> Hello,
> 
> On Wed, Apr 20, 2022 at 03:43:06PM +0200, Alexander Bluhm wrote:
>> On Sat, Apr 09, 2022 at 01:51:05AM +0200, Alexandr Nedvedicky wrote:
>>> updated diff is below.
>> I am not sure what Hrvoje actually did test and what not.  My
>> impression was that he got a panic with the previous version of
>> this diff, but the machine was stable with the code in current.
>>
>> But maybe I got it wrong and we need this code to run pfsync with
>> IPsec in parallel.
> that's correct. Hrvoje was testing several diffs stacked
> on each other:
>   diff which enables parallel forwarding
>   diff which fixes tunnel descriptor block handling (tdb) for ipsec
>   diff which fixes pfsync

Yes,

the panics that we exchanged privately were on production and I couldn't
reproduce them in the lab. In production I don't have ipsec, only a simple
pfsync setup. It seems to me that the pfsync mpfloor diff solved the panics
I had in production.

At the same time I was running a sasyncd setup with 5 site-to-site ipsec
tunnels with the same diffs, and for now it seems stable ..




Re: router timer mutex

2022-04-21 Thread Hrvoje Popovski
On 20.4.2022. 20:12, Alexander Bluhm wrote:
> Hi,
> 
> mvs@ reminded me of a crash I have seen in December.  Route timers
> are not MP safe, but I think this can be fixed with a mutex.  The
> idea is to protect the global lists with a mutex and move the rttimer
> into a temporary list.  Then the callback and pool put can be called
> later without mutex.
> 
> It survived a full regress with witness.
> 
> Hrvoje: Can you put this on your test machine together with parallel
> IP forwarding?
> 
> ok?


No problem,

I'm running this and the parallel forwarding diff in the lab and in production.



Re: parallel IP forwarding

2022-04-18 Thread Hrvoje Popovski
On 8.4.2022. 12:56, Alexander Bluhm wrote:
> Hi,
> 
> Is now the right time to commit the parallel forwarding diff?
> 
> Known limitations are:
> - Hrvoje has seen a crash with both pfsync and ipsec on his production
>   machine.  But he cannot reproduce it in his lab.

This is resolved. At least this panic doesn't happen any more.


> - TCP processing gets slower as we have an additional queue between
>   IP and protocol layer.
> - Protocol layer may starve as 1 exclusive lock is fighting with 4
>   shared locks.  This happens only when forwarding a lot.



Re: [External] : Re: pfsync(4) snapshot lists must have dedicated link element

2022-04-17 Thread Hrvoje Popovski
On 9.4.2022. 1:51, Alexandr Nedvedicky wrote:
> Hello,
> 
> thank you for taking a look at it.
> 
> 
>> I think there is a use after free in you diff.  After you return
>> from pfsync_delete_tdb() you must not access the TDB again.
>>
>> Comments inline.
>>
> 
>>> TAILQ_INIT(&sn->sn_upd_req_list);
>>> -   TAILQ_CONCAT(&sn->sn_upd_req_list, &sc->sc_upd_req_list, ur_entry);
>>> +   while (!TAILQ_EMPTY(&sc->sc_upd_req_list)) {
>>> +   ur = TAILQ_FIRST(&sc->sc_upd_req_list);
>> Other loops have this idiom
>> while ((ur = TAILQ_FIRST(&sc->sc_upd_req_list)) != NULL) {
>>
> new diff version uses 'TAILQ_FIRST()'
> 
> 
>>> @@ -1827,18 +1853,20 @@ pfsync_sendout(void)
>>> subh = (struct pfsync_subheader *)(m->m_data + offset);
>>> offset += sizeof(*subh);
>>>
>>> -   mtx_enter(&sc->sc_tdb_mtx);
>>> count = 0;
>>> while ((t = TAILQ_FIRST(&sn->sn_tdb_q)) != NULL) {
>>> -   TAILQ_REMOVE(&sn->sn_tdb_q, t, tdb_sync_entry);
>>> +   TAILQ_REMOVE(&sn->sn_tdb_q, t, tdb_sync_snap);
>> I think the TDB in tdb_sync_snap list may be freed.  Maybe
>> you should grab a refcount if you store them into a list.
>>
> I see. pfsync must grab the reference count in pfsync_update_tdb(),
> where the tdb entry is inserted into the queue. The new diff fixes that.
> 
>>> pfsync_out_tdb(t, m->m_data + offset);
>>> offset += sizeof(struct pfsync_tdb);
>>> +#ifdef PFSYNC_DEBUG
>>> +   KASSERT(t->tdb_snapped == 1);
>>> +#endif
>>> +   t->tdb_snapped = 0;
>> This may also be use after free.
> the new diff drops the reference as soon as pfsync(4) dispatches
> the tdb into a pfsync packet.
> 
> 
>>> @@ -2525,7 +2565,17 @@ pfsync_delete_tdb(struct tdb *t)
>>>
>>> mtx_enter(&sc->sc_tdb_mtx);
>>>
>>> +   /*
>>> +* if tdb entry is just being processed (found in snapshot),
>>> +* then it can not be deleted. we just came too late
>>> +*/
>>> +   if (t->tdb_snapped) {
>>> +   mtx_leave(&sc->sc_tdb_mtx);
>> You must not access the TDB after this point.  I think you cannot
>> guarantee that.  The memory will be freed after return.
> new diff fixes that
> 
> 
>>> diff --git a/sys/netinet/ip_ipsp.h b/sys/netinet/ip_ipsp.h
>>> index c697994047b..ebdb6ada1ae 100644
>>> --- a/sys/netinet/ip_ipsp.h
>>> +++ b/sys/netinet/ip_ipsp.h
>>> @@ -405,6 +405,7 @@ struct tdb {   /* tunnel descriptor block */
>>> u_int8_ttdb_wnd;/* Replay window */
>>> u_int8_ttdb_satype; /* SA type (RFC2367, PF_KEY) */
>>> u_int8_ttdb_updates;/* pfsync update counter */
>>> +   u_int8_ttdb_snapped;/* dispatched by pfsync(4) */
>> u_int8_t is not atomic.  If you want to change tdb_snapped, you need
>> a mutex that also protects the other fields in the same 32 bit
>> word everywhere.  I think a new flag TDBF_PFSYNC_SNAPPED in tdb_flags
>> would be easier.  The tdb_flags are protected by tdb_mtx.
>>
> I like your idea.
> 
> updated diff is below.
> 
> 
> thanks and
> regards
> sashan


Hi,

I'm running this diff with bluhm@ pfsync mpfloor diff for 4 days on
production firewalls where panics were found and for now everything
seems normal ..





Re: dell and supermicro servers won't boot

2022-04-05 Thread Hrvoje Popovski
please delete this mail ... my fault ..


On 5.4.2022. 18:51, Hrvoje Popovski wrote:
> Hi all,
> 
> I did sysupgrade -s and then fetched source 10-20 minutes ago. Did that
> on two boxes and both boxes won't boot. They stop on "entry point" and
> then reboot .. if I boot the snapshot kernel, boot works as expected
> 
> 
> 
> 
> dell R620 - real serial
> 
>>> OpenBSD/amd64 BOOT 3.53
> boot>
> booting hd0a:/bsd: 10431259+2475012+241684+0+1130496
> [97+586992+631014]=0xec9850
> entry point at 0xd0201000
> 
> 
> supermicro - serial over lan
> 
> Using drive 0, partition 3.
> Loading..
>/amd64 BOO
> probing: pc0 com0 com1 mem[615K 1934M 30720M a20=on]
> disk: hd0+ hd1+*
>>> OpenBSD/amd64 BOOT 3.53
> switching console to com1
> 
> booting hd0a:/bsd: 10431259+2479108+245780+0+1130496
> [97+586992+631014]=0xecb850
> entry point at 0xd0201000
> 



dell and supermicro servers won't boot

2022-04-05 Thread Hrvoje Popovski
Hi all,

I did sysupgrade -s and then fetched source 10-20 minutes ago. Did that
on two boxes and both boxes won't boot. They stop on "entry point" and
then reboot .. if I boot the snapshot kernel, boot works as expected




dell R620 - real serial

>> OpenBSD/amd64 BOOT 3.53
boot>
booting hd0a:/bsd: 10431259+2475012+241684+0+1130496
[97+586992+631014]=0xec9850
entry point at 0xd0201000


supermicro - serial over lan

Using drive 0, partition 3.
Loading..
   /amd64 BOO
probing: pc0 com0 com1 mem[615K 1934M 30720M a20=on]
disk: hd0+ hd1+*
>> OpenBSD/amd64 BOOT 3.53
switching console to com1

booting hd0a:/bsd: 10431259+2479108+245780+0+1130496
[97+586992+631014]=0xecb850
entry point at 0xd0201000



Re: ipsec acquire mutex refcount

2022-03-14 Thread Hrvoje Popovski
On 9.3.2022. 19:14, Alexander Bluhm wrote:
> Hi,
> 
> Hrvoje has hit a crash with IPsec acquire while testing the parallel
> ip forwarding diff.  Analysis with sashan@ concludes that this part
> is not MP safe.  Mutex and refcount should fix the memory management.
> 
> Hrvoje: Could you test this diff?


Hi,

now that this diff is in, is it time to test parallel forwarding on
wider audience?



Re: ipsec acquire mutex refcount

2022-03-11 Thread Hrvoje Popovski
On 9.3.2022. 21:02, Hrvoje Popovski wrote:
> On 9.3.2022. 19:14, Alexander Bluhm wrote:
>> Hi,
>>
>> Hrvoje has hit a crash with IPsec acquire while testing the parallel
>> ip forwarding diff.  Analysis with sashan@ concludes that this part
>> is not MP safe.  Mutex and refcount should fix the memory management.
>>
>> Hrvoje: Could you test this diff?
> 
> Hi,
> 
> no problem. Give me 2 or 3 days to hit it properly...

Hi,

after 2 days of hitting sasyncd setup I can't trigger panic as before.

Thank you ..




Re: ipsec acquire mutex refcount

2022-03-09 Thread Hrvoje Popovski
On 9.3.2022. 19:14, Alexander Bluhm wrote:
> Hi,
> 
> Hrvoje has hit a crash with IPsec acquire while testing the parallel
> ip forwarding diff.  Analysis with sashan@ concludes that this part
> is not MP safe.  Mutex and refcount should fix the memory management.
> 
> Hrvoje: Could you test this diff?

Hi,

no problem. Give me 2 or 3 days to hit it properly...


