Re: another go at bypass support for sparc64 iommu and BUS_DMA_64BIT

2019-06-19 Thread Kaashif Hymabaccus
This is great, thanks for the work. I have the new snapshot running on
a T5220. I also have an Ultra 45 with some GPUs I use mainly for
playing games and sending people bug reports about how their game
doesn't work on sparc64; I'll try it there also. If there are any
specific tests you want done, I am happy to do them. Here is the dmesg
of the T5220 if there is anything that interests you:

console is /virtual-devices@100/console@1
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2019 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.5-current (GENERIC.MP) #207: Tue Jun 18 13:15:53 MDT 2019
dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP
real mem = 17045651456 (16256MB)
avail mem = 16721928192 (15947MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: SPARC Enterprise T5220
cpu0 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu1 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu2 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu3 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu4 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu5 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu6 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu7 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu8 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu9 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu10 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu11 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu12 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu13 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu14 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu15 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu16 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu17 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu18 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu19 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu20 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu21 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu22 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu23 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu24 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu25 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu26 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu27 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu28 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu29 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu30 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu31 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu32 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu33 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu34 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu35 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu36 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu37 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu38 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu39 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu40 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu41 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu42 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu43 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu44 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu45 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu46 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu47 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu48 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu49 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu50 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu51 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu52 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu53 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu54 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu55 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu56 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu57 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu58 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu59 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu60 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu61 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu62 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
cpu63 at mainbus0: SUNW,UltraSPARC-T2 (rev 0.0) @ 1165.379 MHz
vbus0 at mainbus0
"flashprom" at vbus0 not 

Re: another go at bypass support for sparc64 iommu and BUS_DMA_64BIT

2019-06-19 Thread David Gwynne
I'm mostly concerned that nothing that currently works breaks. This changes a 
fairly fundamental chunk of code in the guts of the platform, so having 
machines still work afterward would be nice.

Glad your T5220 works, hopefully the u45 will be ok too.

dlg

> On 20 Jun 2019, at 08:52, Kaashif Hymabaccus  wrote:
> 
> This is great, thanks for the work. I have the new snapshot running on
> a T5220. I also have an Ultra 45 with some GPUs I use mainly for
> playing games and sending people bug reports about how their game
> doesn't work on sparc64; I'll try it there also. If there are any
> specific tests you want done, I am happy to do them. Here is the dmesg
> of the T5220 if there is anything that interests you:
> 

Re: another go at bypass support for sparc64 iommu and BUS_DMA_64BIT

2019-06-19 Thread Andrew Grillet
I have the snapshot running on a guest domain on a T1000 - see attached.

I will test on the primary in a few days, and may be able to test on a
T5220 primary towards the end of next week.

Let me know if you want any specific test procedure run.

On Tue, 18 Jun 2019 at 02:37, Theo de Raadt  wrote:

> David Gwynne  wrote:
>
> > this is a reposting of the diff i sent out a while back. it lets sparc64
> > enable iommu bypass, and then uses that bypass support for BUS_DMA_64BIT
> > dmamaps.
>
> BTW, this is in snapshots and I'd urge everyone running sparc64 to give it
> a try.
>
>
console is /virtual-devices@100/console@1
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.
Copyright (c) 1995-2019 OpenBSD. All rights reserved.  https://www.OpenBSD.org

OpenBSD 6.5-current (GENERIC.MP) #207: Tue Jun 18 13:15:53 MDT 2019
dera...@sparc64.openbsd.org:/usr/src/sys/arch/sparc64/compile/GENERIC.MP
real mem = 1073741824 (1024MB)
avail mem = 1033175040 (985MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: Sun Fire(TM) T1000
cpu0 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
cpu1 at mainbus0: SUNW,UltraSPARC-T1 (rev 0.0) @ 1000 MHz
vbus0 at mainbus0
"flashprom" at vbus0 not configured
cbus0 at vbus0
vdsk0 at cbus0 chan 0x2: ivec 0x4, 0x5
scsibus1 at vdsk0: 2 targets
sd0 at scsibus1 targ 0 lun 0:  SCSI3 0/direct fixed
sd0: 10240MB, 512 bytes/sector, 20971520 sectors
vdsk1 at cbus0 chan 0x3: ivec 0x6, 0x7
scsibus2 at vdsk1: 2 targets
sd1 at scsibus2 targ 0 lun 0:  SCSI3 0/direct fixed
sd1: 2MB, 512 bytes/sector, 4096 sectors
vnet0 at cbus0 chan 0x4: ivec 0x8, 0x9, address 00:14:4f:fa:5a:dd
vnet1 at cbus0 chan 0x5: ivec 0xa, 0xb, address 00:14:4f:fb:22:90
vcons0 at vbus0: ivec 0x111, console
vrtc0 at vbus0
vscsi0 at root
scsibus3 at vscsi0: 256 targets
softraid0 at root
scsibus4 at softraid0: 256 targets
bootpath: /virtual-devices@100,0/channel-devices@200,0/disk@0,0
root on sd0a (8aa1a32792af6458.a) swap on sd0b dump on sd0b


Re: sftp(1) manual diff

2019-06-19 Thread Jason McIntyre
On Mon, Jun 17, 2019 at 02:47:09PM +0200, Tim van der Molen wrote:
> sftp(1) has this:
> 
>  reput [-Ppr] [local-path] remote-path
>  Resume upload of [local-path].  Equivalent to put with the -a
>  flag set.
> 
> remote-path should be marked optional, not local-path. Probably a pasto
> from reget.
> 
> OK?
> 

a bigger version of this went in after a bit more feedback.
jmc

Index: sftp.1
===================================================================
RCS file: /cvs/src/usr.bin/ssh/sftp.1,v
retrieving revision 1.126
diff -u -r1.126 sftp.1
--- sftp.1  12 Jun 2019 11:31:50 -  1.126
+++ sftp.1  19 Jun 2019 20:04:50 -
@@ -404,7 +404,7 @@
 Quit
 .Nm sftp .
 .It Xo Ic get
-.Op Fl afPpr
+.Op Fl afpR
 .Ar remote-path
 .Op Ar local-path
 .Xc
@@ -439,15 +439,19 @@
 will be called after the file transfer has completed to flush the file
 to disk.
 .Pp
-If either the
-.Fl P
-or
+If the
 .Fl p
+.\" undocumented redundant alias
+.\" or
+.\" .Fl P
 flag is specified, then full file permissions and access times are
 copied too.
 .Pp
 If the
-.Fl r
+.Fl R
+.\" undocumented redundant alias
+.\" or
+.\" .Fl r
 flag is specified then directories will be copied recursively.
 Note that
 .Nm
@@ -545,7 +549,7 @@
 .It Ic progress
 Toggle display of progress meter.
 .It Xo Ic put
-.Op Fl afPpr
+.Op Fl afpR
 .Ar local-path
 .Op Ar remote-path
 .Xc
@@ -581,15 +585,19 @@
 Note that this is only supported by servers that implement
 the "fs...@openssh.com" extension.
 .Pp
-If either the
-.Fl P
-or
+If the
 .Fl p
+.\" undocumented redundant alias
+.\" or
+.\" .Fl P
 flag is specified, then full file permissions and access times are
 copied too.
 .Pp
 If the
-.Fl r
+.Fl R
+.\" undocumented redundant alias
+.\" or
+.\" .Fl r
 flag is specified then directories will be copied recursively.
 Note that
 .Nm
@@ -600,7 +608,7 @@
 Quit
 .Nm sftp .
 .It Xo Ic reget
-.Op Fl Ppr
+.Op Fl fpR
 .Ar remote-path
 .Op Ar local-path
 .Xc
@@ -612,12 +620,12 @@
 .Fl a
 flag set.
 .It Xo Ic reput
-.Op Fl Ppr
-.Op Ar local-path
-.Ar remote-path
+.Op Fl fpR
+.Ar local-path
+.Op Ar remote-path
 .Xc
 Resume upload of
-.Op Ar local-path .
+.Ar local-path .
 Equivalent to
 .Ic put
 with the
Index: sftp.c
===================================================================
RCS file: /cvs/src/usr.bin/ssh/sftp.c,v
retrieving revision 1.192
diff -u -r1.192 sftp.c
--- sftp.c  7 Jun 2019 03:47:12 -   1.192
+++ sftp.c  19 Jun 2019 20:05:01 -
@@ -262,9 +262,7 @@
    "df [-hi] [path]                    Display statistics for current directory or\n"
    "                                   filesystem containing 'path'\n"
    "exit                               Quit sftp\n"
-   "get [-afPpRr] remote [local]       Download file\n"
-   "reget [-fPpRr] remote [local]      Resume download file\n"
-   "reput [-fPpRr] [local] remote      Resume upload file\n"
+   "get [-afpR] remote [local]         Download file\n"
    "help                               Display this help text\n"
    "lcd path                           Change local directory to 'path'\n"
    "lls [ls-options [path]]            Display local directory listing\n"
@@ -275,10 +273,12 @@
    "lumask umask                       Set local umask to 'umask'\n"
    "mkdir path                         Create remote directory\n"
    "progress                           Toggle display of progress meter\n"
-   "put [-afPpRr] local [remote]       Upload file\n"
+   "put [-afpR] local [remote]         Upload file\n"
    "pwd                                Display remote working directory\n"
    "quit                               Quit sftp\n"
+   "reget [-fpR] remote [local]        Resume download file\n"
    "rename oldpath newpath             Rename remote file\n"
+   "reput [-fpR] local [remote]        Resume upload file\n"
    "rm path                            Delete remote file\n"
    "rmdir path                         Remove remote directory\n"
    "symlink oldpath newpath            Symlink remote file\n"



Re: bgpd async nexthop update loop

2019-06-19 Thread Sebastian Benoit
ok!

Claudio Jeker(cje...@diehard.n-r-g.com) on 2019.06.17 23:08:37 +0200:
> On Mon, Jun 17, 2019 at 10:20:57PM +0200, Sebastian Benoit wrote:
> > Claudio Jeker(cje...@diehard.n-r-g.com) on 2019.06.17 21:34:30 +0200:
> > > Hi,
> > > 
> > > Now that the community rewrite is in, here is another diff to make bgpd more
> > > awesome. There was one loop left that did not run asynchronously compared to
> > > the main event loop. This loop was in nexthop_update(). A route server
> > > or a big route reflector would be hit hard by this loop since it will send
> > > out many updates when a nexthop becomes valid. I have seen times where the
> > > RDE was busy for 15min in nexthop_update().
> > > 
> > > This diff changes nexthop_update() to do this work asynchronously and
> > > therefore returns to the main event loop much more often.
> > > Nexthops now have a prefix pointer next_prefix which is used for looping
> > > through all prefixes. The for loop is broken up in RDE_RUNNER_ROUNDS steps
> > > which should hopefully return quickly.
> > > 
> > > Additionally prefixes belonging to a RIB that is not evaluated (no
> > > decision process running) are no longer linked into this nexthop list
> > > which should also shorten the loop a bit (at least Adj-RIB-In and
> > > Adj-RIB-Out are now skipped). To make this work correctly the linking and
> > > unlinking of prefixes has been adjusted to always work in the right way
> > > (nexthop last on link and first on unlink).
> > 
> > reads ok, one comment.
> 
> Fixed.
>  
> > > When testing please watch out for this message:
> > > prefix_updateall: prefix with F_RIB_NOEVALUATE hit
> > > This should never happen.
> > > 
> > > I'm running this on a few routers with good success. Please check if you see
> > > a) worse convergence time (updates propagating much slower)
> > > b) better responsiveness (to e.g. bgpctl commands)
> > 
> > i see things much faster, actually too fast, i can't meaningfully test bgpctl
> > responsiveness.
> 
> You just need around 500 peers which are affected by a big nexthop update :)
>  
> > > There is some potential for tuning but I would prefer to do this at a
> > > later stage.
> > 
> > sure. one question about that: is the TAILQ_INSERT_TAIL/TAILQ_INSERT_HEAD
> > optimal?
> 
> The idea is to make this like a ring. Start with the latest arrival
> first and then process one after the other. This makes sure that every
> entry gets about the same amount of process time. The TAILQ_INSERT_HEAD
> in nexthop_update() will reduce initial latency for nexthops with only
> few prefixes since they will be done in one nexthop_runner() call.
> 
> There are also additional interactions with the other runners
> (imsg processing, rib_dump_runner and rde_update_queue_runner) which I
> need to profile to understand them better. It looks like
> rde_dispatch_imsg_session() is running ahead and produces a lot of work
> that can't be absorbed that quickly by the other runners. At least that is
> the case in the big route server setup that I'm working with.
> 
> > > -- 
> > > :wq Claudio
> > > 
> > > Index: rde.c
> > > ===================================================================
> > > RCS file: /cvs/src/usr.sbin/bgpd/rde.c,v
> > > retrieving revision 1.470
> > > diff -u -p -r1.470 rde.c
> > > --- rde.c 17 Jun 2019 13:35:43 -  1.470
> > > +++ rde.c 17 Jun 2019 13:56:16 -
> > > @@ -256,9 +256,6 @@ rde_main(int debug, int verbose)
> > >  		set_pollfd(&pfd[PFD_PIPE_SESSION], ibuf_se);
> > >  		set_pollfd(&pfd[PFD_PIPE_SESSION_CTL], ibuf_se_ctl);
> > >  
> > > -		if (rib_dump_pending() || rde_update_queue_pending())
> > > -			timeout = 0;
> > > -
> > >  		i = PFD_PIPE_COUNT;
> > >  		for (mctx = LIST_FIRST(&rde_mrts); mctx != 0; mctx = xmctx) {
> > >  			xmctx = LIST_NEXT(mctx, entry);
> > > @@ -275,6 +272,10 @@ rde_main(int debug, int verbose)
> > >  			}
> > >  		}
> > >  
> > > +		if (rib_dump_pending() || rde_update_queue_pending() ||
> > > +		    nexthop_pending())
> > > +			timeout = 0;
> > > +
> > >  		if (poll(pfd, i, timeout) == -1) {
> > >  			if (errno != EINTR)
> > >  				fatal("poll error");
> > > @@ -311,12 +312,13 @@ rde_main(int debug, int verbose)
> > >  			mctx = LIST_NEXT(mctx, entry);
> > >  		}
> > >  
> > > +		rib_dump_runner();
> > > +		nexthop_runner();
> > >  		if (ibuf_se && ibuf_se->w.queued < SESS_MSG_HIGH_MARK) {
> > >  			rde_update_queue_runner();
> > >  			for (aid = AID_INET6; aid < AID_MAX; aid++)
> > >  				rde_update6_queue_runner(aid);
> > >  		}
> > > -		rib_dump_runner();
> > >  	}
> > >  
> > >  	/* do not clean up on shutdown on production, it takes ages. */
> > > Index: rde.h
> > > ===================================================================
> > > RCS file: /cvs/src

Re: libcrypto: recognize HW acceleration support on arm64

2019-06-19 Thread Patrick Wildt
On Wed, Jun 19, 2019 at 09:25:27AM +0200, Mark Kettenis wrote:
> > Date: Wed, 19 Jun 2019 07:13:19 +0200
> > From: Patrick Wildt 
> > 
> > Hi,
> > 
> > this diff adds the necessary helpers to arm64 so that libcrypto knows
> > which of the hardware crypto features are available on the machine.
> > Those helpers are used by the existing and matching armv7 code.
> > 
> > ok?
> 
> No objections to the diff per se, but unless I'm missing something,
> there currently isn't any assembly code that takes advantage of this
> for arm64.  The armv7 code is 32-bit code so I don't think you can use
> any of it in 64-bit mode.

No, you're right, at the moment there's no acceleration support for
arm64 in our tree.  But OpenSSL has more of that, even before their
license change, so I can easily pull that in step by step.



Re: libcrypto: recognize HW acceleration support on arm64

2019-06-19 Thread Mark Kettenis
> Date: Wed, 19 Jun 2019 07:13:19 +0200
> From: Patrick Wildt 
> 
> Hi,
> 
> this diff adds the necessary helpers to arm64 so that libcrypto knows
> which of the hardware crypto features are available on the machine.
> Those helpers are used by the existing and matching armv7 code.
> 
> ok?

No objections to the diff per se, but unless I'm missing something,
there currently isn't any assembly code that takes advantage of this
for arm64.  The armv7 code is 32-bit code so I don't think you can use
any of it in 64-bit mode.

> diff --git a/lib/libcrypto/arch/aarch64/Makefile.inc b/lib/libcrypto/arch/aarch64/Makefile.inc
> index 8742504f2d4..972e5536b5e 100644
> --- a/lib/libcrypto/arch/aarch64/Makefile.inc
> +++ b/lib/libcrypto/arch/aarch64/Makefile.inc
> @@ -26,3 +26,6 @@ ${f}.S: ${LCRYPTO_SRC}/${dir}/asm/${f}.pl
>   /usr/bin/perl \
>   ${LCRYPTO_SRC}/${dir}/asm/${f}.pl void ${.TARGET} > ${.TARGET}
>  .endfor
> +
> +CFLAGS+= -DOPENSSL_CPUID_OBJ
> +SRCS+=   arm64cpuid.S armcap.c
> diff --git a/lib/libcrypto/arm64cpuid.S b/lib/libcrypto/arm64cpuid.S
> new file mode 100644
> index 000..5eeff91c6ea
> --- /dev/null
> +++ b/lib/libcrypto/arm64cpuid.S
> @@ -0,0 +1,47 @@
> +#include "arm_arch.h"
> +
> +.text
> +.arch	armv8-a+crypto+sha3
> +
> +.align	5
> +.globl	_armv7_neon_probe
> +.type	_armv7_neon_probe,%function
> +_armv7_neon_probe:
> +	orr	v15.16b, v15.16b, v15.16b
> +	ret
> +.size	_armv7_neon_probe,.-_armv7_neon_probe
> +
> +.globl	_armv8_aes_probe
> +.type	_armv8_aes_probe,%function
> +_armv8_aes_probe:
> +	aese	v0.16b, v0.16b
> +	ret
> +.size	_armv8_aes_probe,.-_armv8_aes_probe
> +
> +.globl	_armv8_sha1_probe
> +.type	_armv8_sha1_probe,%function
> +_armv8_sha1_probe:
> +	sha1h	s0, s0
> +	ret
> +.size	_armv8_sha1_probe,.-_armv8_sha1_probe
> +
> +.globl	_armv8_sha256_probe
> +.type	_armv8_sha256_probe,%function
> +_armv8_sha256_probe:
> +	sha256su0	v0.4s, v0.4s
> +	ret
> +.size	_armv8_sha256_probe,.-_armv8_sha256_probe
> +
> +.globl	_armv8_pmull_probe
> +.type	_armv8_pmull_probe,%function
> +_armv8_pmull_probe:
> +	pmull	v0.1q, v0.1d, v0.1d
> +	ret
> +.size	_armv8_pmull_probe,.-_armv8_pmull_probe
> +
> +.globl	_armv8_sha512_probe
> +.type	_armv8_sha512_probe,%function
> +_armv8_sha512_probe:
> +	sha512su0	v0.2d,v0.2d
> +	ret
> +.size	_armv8_sha512_probe,.-_armv8_sha512_probe
> diff --git a/lib/libcrypto/arm_arch.h b/lib/libcrypto/arm_arch.h
> index a64c6da46eb..bb137e6a48e 100644
> --- a/lib/libcrypto/arm_arch.h
> +++ b/lib/libcrypto/arm_arch.h
> @@ -17,7 +17,11 @@
> * gcc/config/arm/arm.c. On a side note it defines
> * __ARMEL__/__ARMEB__ for little-/big-endian.
> */
> -#  if defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
> +#  if defined(__ARM_ARCH)
> +#   define __ARM_ARCH__ __ARM_ARCH
> +#  elif defined(__ARM_ARCH_8A__)
> +#   define __ARM_ARCH__ 8
> +#  elif defined(__ARM_ARCH_7__) || defined(__ARM_ARCH_7A__) || \
>   defined(__ARM_ARCH_7R__)|| defined(__ARM_ARCH_7M__) || \
>   defined(__ARM_ARCH_7EM__)
>  #   define __ARM_ARCH__ 7
> 
>