Allow DDB to be disabled at boot time?

2016-09-02 Thread Adam McDougall
Would it be possible to add a method to the boot loader to disable
DDB support in the kernel, or perhaps even disable it by default?
I've encountered two different brands of servers (Dell and HP)
where the remote graphical console loses keystrokes in the generic
kernel. This is a severe impediment to troubleshooting and I'm not
the only one to encounter this. Strangely, compiling a kernel without
DDB solves this issue. If it was possible to either do a one-time
disable from the boot loader or make it permanent in the boot config
file, I wouldn't have to custom compile the kernel for installs on
bare metal. Holding down keys longer helps, but you have to time it
just right and it is very difficult when entering passwords. Thanks
for your consideration.



Fwd: Re: relayd's icmp check only works for a small number of hosts

2016-09-02 Thread Remi Locherer

forgot to add bugs@openbsd.org

Subject: Re: relayd's icmp check only works for a small number of hosts
Date: 2016-09-02 17:50
From: Remi Locherer 
To: Reyk Floeter 

On 2016-09-02 16:51, Reyk Floeter wrote:

On Fri, Aug 19, 2016 at 04:31:10PM +0200, Remi Locherer wrote:

>Synopsis:   relayd's icmp check only works for a small number of hosts
>Category:   relayd
>Environment:
System  : OpenBSD 5.9
	Details : OpenBSD 5.9 (GENERIC.MP) #10: Wed Aug  3 13:46:07 CEST 2016
			 r...@stable-59-amd64.mtier.org:/binpatchng/work-binpatch59-amd64/src/sys/arch/amd64/compile/GENERIC.MP


Architecture: OpenBSD.amd64
Machine : amd64

>Description:
	relayd says 70 out of 104 hosts are not reachable via icmp. But ping on
the same host where relayd runs can reach all hosts with a rtt below 1ms.

In the logs I see "210ms,icmp read timeout". But in relayd.conf a timeout
of 1000 is set.



All checks have to be completed before the next check interval.  With
that many tests, it can happen that relayd is not finished
sending/receiving all individual checks before the next interval;
missed hosts will be marked down.

You could try the following:

1. Increase the global interval.


With this in relayd.conf:
interval 300
timeout 6

relayd successfully checked 36 hosts and reported icmp response times between
4 and 6 ms. After 60s relayd reports "icmp read timeout" for the other 68 hosts.

Sep  2 17:29:13 lb2 relayd[31358]: host 192.168.63.48, check icmp (60008ms,icmp read timeout), state unknown -> down, availability 0.00%


While it's true that a few hosts are down, the majority of hosts answer
my manual pings within 0.600 ms.



2. Instead of testing the same hosts multiple times, you can use the
"parent" keyword to inherit the state from a tested host, e.g.

table  {
10.1.1.1
}

table  {
10.1.1.1 parent 1
}

table  {
10.1.1.1 parent 1
}


I'll try this one. It's a bit tricky since I can only reference the
parent table by index and not by name. My relayd.conf is generated
and deployed with Ansible.



Re: Boot fails on i386 -current

2016-09-02 Thread Eivind Eide
(sorry about this gmail webmail, it mangles stuff)

This is output from sendbug:


>Synopsis:   Boot fails on i386 -current
>Category:   i386 generic kernel
>Environment:
System  : OpenBSD 6.0
Details : OpenBSD 6.0 (GENERIC) #1917: Tue Jul 26 12:48:33 MDT 2016
 dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC

Architecture: OpenBSD.i386
Machine : i386
>Description:
See former mail.
>How-To-Repeat:

>Fix:



dmesg:
OpenBSD 6.0 (GENERIC) #1917: Tue Jul 26 12:48:33 MDT 2016
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz ("GenuineIntel"
686-class) 1.80 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PERF
real mem  = 2146852864 (2047MB)
avail mem = 2093072384 (1996MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 05/15/03, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev.
2.3 @ 0xf7690 (61 entries)
bios0: vendor Dell Computer Corporation version "A09" date 05/15/2003
bios0: Dell Computer Corporation Latitude C640
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP
acpi0: wakeup devices LID_(S3) PBTN(S4) PCI0(S3) UAR1(S3) USB0(S1)
USB1(S1) USB2(S1) MODM(S3) PCIE(S3) MPCI(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (AGP_)
acpiprt2 at acpi0: bus 2 (PCIE)
acpiprt3 at acpi0: bus -1 (MPCI)
acpicpu0 at acpi0acpicpu0: struck PSS entry, core frequency equals  last
acpicpu0: struck PSS entry, core frequency equals  last
acpicpu0: invalid _PSS length
: !C2(@50 io@0x8e4), C1(@1 halt!)
acpipwrres0 at acpi0: PADA, resource for ADPT
acpitz0 at acpi0: critical temperature is 99 degC
acpiac0 at acpi0: AC unit online
acpibat0 at acpi0: BAT0 model "LIP8120DLP" serial 5184 type LION oem
"Sony Corp."
acpibat1 at acpi0: BAT1 not present
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: PBTN
acpibtn2 at acpi0: SBTN
"PNP0F13" at acpi0 not configured
"PNP0303" at acpi0 not configured
"PNP0700" at acpi0 not configured
"PNP0501" at acpi0 not configured
"PNP0401" at acpi0 not configured
acpidock0 at acpi0: GDCK not docked (0)
acpivideo0 at acpi0: VID_
bios0: ROM list: 0xc/0xf000 0xcf000/0x800! 0xcf800/0x800!
cpu0 at mainbus0: (uniprocessor)
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82845 Host" rev 0x04
intelagp0 at pchb0
agp0 at intelagp0: aperture at 0xe800, size 0x400
ppb0 at pci0 dev 1 function 0 "Intel 82845 AGP" rev 0x04
pci1 at ppb0 bus 1
radeondrm0 at pci1 dev 0 function 0 "ATI Radeon Mobility M7" rev 0x00
drm0 at radeondrm0
radeondrm0: irq 11
uhci0 at pci0 dev 29 function 0 "Intel 82801CA/CAM USB" rev 0x02: irq 11
ppb1 at pci0 dev 30 function 0 "Intel 82801BAM Hub-to-PCI" rev 0x42
pci2 at ppb1 bus 2
xl0 at pci2 dev 0 function 0 "3Com 3c905C 100Base-TX" rev 0x78: irq
11, address 00:08:74:48:40:d6
exphy0 at xl0 phy 24: 3Com internal media interface
cbb0 at pci2 dev 1 function 0 "TI PCI1420 CardBus" rev 0x00: irq 11
cbb1 at pci2 dev 1 function 1 "TI PCI1420 CardBus" rev 0x00: irq 11
ath0 at pci2 dev 3 function 0 "Atheros AR2413" rev 0x01: irq 11
ath0: AR2413 7.8 phy 4.5 rf 5.6 eeprom 5.2, WOR3W, address 00:16:cf:53:07:71
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 4 device 0 cacheline 0x8, lattimer 0x20
pcmcia0 at cardslot0
cardslot1 at cbb1 slot 1 flags 0
cardbus1 at cardslot1: bus 5 device 0 cacheline 0x8, lattimer 0x20
pcmcia1 at cardslot1
ichpcib0 at pci0 dev 31 function 0 "Intel 82801CAM LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 82801CAM IDE" rev 0x02: DMA,
channel 0 configured to compatibility, channel 1 configured to
compatibility
wd0 at pciide0 channel 0 drive 0: 
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  ATAPI
5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
auich0 at pci0 dev 31 function 5 "Intel 82801CA/CAM AC97" rev 0x02:
irq 11, ICH3 AC97
ac97: codec id 0x4352595b (Cirrus Logic CS4205 rev 3)
ac97: codec features mic channel, tone, simulated stereo, bass boost,
20 bit DAC, 18 bit ADC, SRS 3D
audio0 at auich0
"Intel 82801CA/CAM Modem" rev 0x02 at pci0 dev 31 function 6 not configured
usb0 at uhci0: USB revision 1.0
uhub0 at usb0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at ichpcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
wsmouse1 at pms0 mux 0
pms0: Synaptics touchpad, firmware 5.9
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
lpt0 

Re: Boot fails on i386 -current

2016-09-02 Thread Mike Larkin
On Fri, Sep 02, 2016 at 04:14:27PM +0200, Eivind Eide wrote:
> Boot fails on i386 -current
> 
> With last two snapshots the i386 bsd kernel fails to boot on this old
> machine I got.
> This is how boot attempts end (written down by hand)
> 
> (...snip...)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 1 (AGP_)
> acpiprt2 at acpi0: bus 2 (PCIE)
> acpiprt3 at acpi0: bus -1 (MPCI)
> acpicpu0 at acpi0unable to find cpu 0
> uvm_fault(0xd0baa160, 0x0, 0, 1) -> e
> kernel: page fault trap, code=0
> Stopped at      acpicpu_getcst_from_fadt+0x37:  movl    clean_idt+0x2f4(%eax),%eax
> ddb>
> 

Please use sendbug from a working machine so we get the aml.

-ml

> This is dmesg from somewhat older backup kernel I can still boot from:
> 
> OpenBSD 6.0 (GENERIC) #1917: Tue Jul 26 12:48:33 MDT 2016
> dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
> cpu0: Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz ("GenuineIntel"
> 686-class) 1.80 GHz
> cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PERF
> real mem  = 2146852864 (2047MB)
> avail mem = 2093072384 (1996MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: date 05/15/03, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev.
> 2.3 @ 0xf7690 (61 entries)
> bios0: vendor Dell Computer Corporation version "A09" date 05/15/2003
> bios0: Dell Computer Corporation Latitude C640
> acpi0 at bios0: rev 0
> acpi0: sleep states S0 S1 S3 S4 S5
> acpi0: tables DSDT FACP
> acpi0: wakeup devices LID_(S3) PBTN(S4) PCI0(S3) UAR1(S3) USB0(S1)
> USB1(S1) USB2(S1) MODM(S3) PCIE(S3) MPCI(S3)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 1 (AGP_)
> acpiprt2 at acpi0: bus 2 (PCIE)
> acpiprt3 at acpi0: bus -1 (MPCI)
> acpicpu0 at acpi0acpicpu0: struck PSS entry, core frequency equals  last
> acpicpu0: struck PSS entry, core frequency equals  last
> acpicpu0: invalid _PSS length
> : !C2(@50 io@0x8e4), C1(@1 halt!)
> acpipwrres0 at acpi0: PADA, resource for ADPT
> acpitz0 at acpi0: critical temperature is 99 degC
> acpiac0 at acpi0: AC unit online
> acpibat0 at acpi0: BAT0 model "LIP8120DLP" serial 5184 type LION oem
> "Sony Corp."
> acpibat1 at acpi0: BAT1 not present
> acpibtn0 at acpi0: LID_
> acpibtn1 at acpi0: PBTN
> acpibtn2 at acpi0: SBTN
> "PNP0F13" at acpi0 not configured
> "PNP0303" at acpi0 not configured
> "PNP0700" at acpi0 not configured
> "PNP0501" at acpi0 not configured
> "PNP0401" at acpi0 not configured
> acpidock0 at acpi0: GDCK not docked (0)
> acpivideo0 at acpi0: VID_
> bios0: ROM list: 0xc/0xf000 0xcf000/0x800! 0xcf800/0x800!
> cpu0 at mainbus0: (uniprocessor)
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> pci0 at mainbus0 bus 0: configuration mode 1 (bios)
> pchb0 at pci0 dev 0 function 0 "Intel 82845 Host" rev 0x04
> intelagp0 at pchb0
> agp0 at intelagp0: aperture at 0xe800, size 0x400
> ppb0 at pci0 dev 1 function 0 "Intel 82845 AGP" rev 0x04
> pci1 at ppb0 bus 1
> radeondrm0 at pci1 dev 0 function 0 "ATI Radeon Mobility M7" rev 0x00
> drm0 at radeondrm0
> radeondrm0: irq 11
> uhci0 at pci0 dev 29 function 0 "Intel 82801CA/CAM USB" rev 0x02: irq 11
> ppb1 at pci0 dev 30 function 0 "Intel 82801BAM Hub-to-PCI" rev 0x42
> pci2 at ppb1 bus 2
> xl0 at pci2 dev 0 function 0 "3Com 3c905C 100Base-TX" rev 0x78: irq
> 11, address 00:08:74:48:40:d6
> exphy0 at xl0 phy 24: 3Com internal media interface
> cbb0 at pci2 dev 1 function 0 "TI PCI1420 CardBus" rev 0x00: irq 11
> cbb1 at pci2 dev 1 function 1 "TI PCI1420 CardBus" rev 0x00: irq 11
> ath0 at pci2 dev 3 function 0 "Atheros AR2413" rev 0x01: irq 11
> ath0: AR2413 7.8 phy 4.5 rf 5.6 eeprom 5.2, WOR3W, address 00:16:cf:53:07:71
> cardslot0 at cbb0 slot 0 flags 0
> cardbus0 at cardslot0: bus 4 device 0 cacheline 0x8, lattimer 0x20
> pcmcia0 at cardslot0
> cardslot1 at cbb1 slot 1 flags 0
> cardbus1 at cardslot1: bus 5 device 0 cacheline 0x8, lattimer 0x20
> pcmcia1 at cardslot1
> ichpcib0 at pci0 dev 31 function 0 "Intel 82801CAM LPC" rev 0x02
> pciide0 at pci0 dev 31 function 1 "Intel 82801CAM IDE" rev 0x02: DMA,
> channel 0 configured to compatibility, channel 1 configured to
> compatibility
> wd0 at pciide0 channel 0 drive 0: 
> wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
> wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
> atapiscsi0 at pciide0 channel 1 drive 0
> scsibus1 at atapiscsi0: 2 targets
> cd0 at scsibus1 targ 0 lun 0:  ATAPI
> 5/cdrom removable
> cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
> auich0 at pci0 dev 31 function 5 "Intel 82801CA/CAM AC97" rev 0x02:
> irq 11, ICH3 AC97
> ac97: codec id 0x4352595b (Cirrus Logic CS4205 rev 3)
> ac97: codec features mic channel, tone, simulated stereo, bass boost,
> 20 bit DAC, 18 bit ADC, SRS 3D
> audio0 at auich0
> "Intel 82801CA/CAM Modem" rev 0x02 at pci0 dev 31 

Re: relayd's icmp check only works for a small number of hosts

2016-09-02 Thread Remi Locherer

On 2016-09-01 00:27, Sebastian Benoit wrote:

Remi Locherer(remi.loche...@relo.ch) on 2016.08.19 16:31:10 +0200:

>Synopsis:   relayd's icmp check only works for a small number of hosts
>Category:   relayd
>Environment:
System  : OpenBSD 5.9
	Details : OpenBSD 5.9 (GENERIC.MP) #10: Wed Aug  3 13:46:07 CEST 2016
			 r...@stable-59-amd64.mtier.org:/binpatchng/work-binpatch59-amd64/src/sys/arch/amd64/compile/GENERIC.MP


Architecture: OpenBSD.amd64
Machine : amd64

>Description:
	relayd says 70 out of 104 hosts are not reachable via icmp. But ping on
the same host where relayd runs can reach all hosts with a rtt below 1ms.

In the logs I see "210ms,icmp read timeout". But in relayd.conf a timeout
of 1000 is set.

Could this be related to the problem mentioned in the commit message of
src/usr.sbin/relayd/check_icmp.c rev 1.41?


i think you mean 1.40?


yes


try to increase

usr.sbin/relayd/relayd.h:93:#define ICMP_RCVBUF_SIZE 262144

and see if you can have more checks then.


I tried the values 524288 and 393216 for ICMP_RCVBUF_SIZE. For both
values relayd tells me:


relayd_icmp_patch: icmp_setup: setsockopt: No buffer space available

And then it exits.
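
The failure above is setsockopt() rejecting the requested receive buffer size. A minimal sketch of probing which sizes a kernel accepts, without patching relayd, could look like the following. The function names `probe_rcvbuf` and `largest_accepted_rcvbuf` are hypothetical, and a UDP socket is used here as a stand-in for relayd's raw ICMP socket so that the probe runs without root; whether the limit is the same for both socket types is an assumption. One plausible explanation for the error is the `cc > sb_max` check in sbreserve() quoted elsewhere in this digest, where sb_max is given as 256*1024, but that is not confirmed in this thread.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Try to set SO_RCVBUF to `size` bytes on a throwaway socket.
 * Returns 0 on success, or the errno value on failure (ENOBUFS is the
 * "No buffer space available" error reported by relayd above).
 * Assumption: a UDP socket stands in for relayd's raw ICMP socket.
 */
int
probe_rcvbuf(int size)
{
	int s, rc = 0;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s == -1)
		return errno;
	if (setsockopt(s, SOL_SOCKET, SO_RCVBUF, &size, sizeof(size)) == -1)
		rc = errno;
	close(s);
	return rc;
}

/*
 * Return the largest candidate size the kernel accepts, or 0 if none.
 * Candidates are expected in ascending order.
 */
int
largest_accepted_rcvbuf(const int *candidates, int n)
{
	int i, best = 0;

	for (i = 0; i < n; i++)
		if (probe_rcvbuf(candidates[i]) == 0)
			best = candidates[i];
	return best;
}
```

Calling `largest_accepted_rcvbuf()` with the values 262144, 393216 and 524288 from this thread would show which of them the running kernel accepts. Note that some systems silently clamp oversized requests instead of returning an error, so a zero return does not always mean the full size was granted.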



Boot fails on i386 -current

2016-09-02 Thread Eivind Eide
Boot fails on i386 -current

With the last two snapshots the i386 bsd kernel fails to boot on this old
machine I have.
This is how the boot attempts end (written down by hand):

(...snip...)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (AGP_)
acpiprt2 at acpi0: bus 2 (PCIE)
acpiprt3 at acpi0: bus -1 (MPCI)
acpicpu0 at acpi0unable to find cpu 0
uvm_fault(0xd0baa160, 0x0, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      acpicpu_getcst_from_fadt+0x37:  movl    clean_idt+0x2f4(%eax),%eax
ddb>

This is dmesg from somewhat older backup kernel I can still boot from:

OpenBSD 6.0 (GENERIC) #1917: Tue Jul 26 12:48:33 MDT 2016
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz ("GenuineIntel"
686-class) 1.80 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PERF
real mem  = 2146852864 (2047MB)
avail mem = 2093072384 (1996MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 05/15/03, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev.
2.3 @ 0xf7690 (61 entries)
bios0: vendor Dell Computer Corporation version "A09" date 05/15/2003
bios0: Dell Computer Corporation Latitude C640
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP
acpi0: wakeup devices LID_(S3) PBTN(S4) PCI0(S3) UAR1(S3) USB0(S1)
USB1(S1) USB2(S1) MODM(S3) PCIE(S3) MPCI(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (AGP_)
acpiprt2 at acpi0: bus 2 (PCIE)
acpiprt3 at acpi0: bus -1 (MPCI)
acpicpu0 at acpi0acpicpu0: struck PSS entry, core frequency equals  last
acpicpu0: struck PSS entry, core frequency equals  last
acpicpu0: invalid _PSS length
: !C2(@50 io@0x8e4), C1(@1 halt!)
acpipwrres0 at acpi0: PADA, resource for ADPT
acpitz0 at acpi0: critical temperature is 99 degC
acpiac0 at acpi0: AC unit online
acpibat0 at acpi0: BAT0 model "LIP8120DLP" serial 5184 type LION oem
"Sony Corp."
acpibat1 at acpi0: BAT1 not present
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: PBTN
acpibtn2 at acpi0: SBTN
"PNP0F13" at acpi0 not configured
"PNP0303" at acpi0 not configured
"PNP0700" at acpi0 not configured
"PNP0501" at acpi0 not configured
"PNP0401" at acpi0 not configured
acpidock0 at acpi0: GDCK not docked (0)
acpivideo0 at acpi0: VID_
bios0: ROM list: 0xc/0xf000 0xcf000/0x800! 0xcf800/0x800!
cpu0 at mainbus0: (uniprocessor)
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82845 Host" rev 0x04
intelagp0 at pchb0
agp0 at intelagp0: aperture at 0xe800, size 0x400
ppb0 at pci0 dev 1 function 0 "Intel 82845 AGP" rev 0x04
pci1 at ppb0 bus 1
radeondrm0 at pci1 dev 0 function 0 "ATI Radeon Mobility M7" rev 0x00
drm0 at radeondrm0
radeondrm0: irq 11
uhci0 at pci0 dev 29 function 0 "Intel 82801CA/CAM USB" rev 0x02: irq 11
ppb1 at pci0 dev 30 function 0 "Intel 82801BAM Hub-to-PCI" rev 0x42
pci2 at ppb1 bus 2
xl0 at pci2 dev 0 function 0 "3Com 3c905C 100Base-TX" rev 0x78: irq
11, address 00:08:74:48:40:d6
exphy0 at xl0 phy 24: 3Com internal media interface
cbb0 at pci2 dev 1 function 0 "TI PCI1420 CardBus" rev 0x00: irq 11
cbb1 at pci2 dev 1 function 1 "TI PCI1420 CardBus" rev 0x00: irq 11
ath0 at pci2 dev 3 function 0 "Atheros AR2413" rev 0x01: irq 11
ath0: AR2413 7.8 phy 4.5 rf 5.6 eeprom 5.2, WOR3W, address 00:16:cf:53:07:71
cardslot0 at cbb0 slot 0 flags 0
cardbus0 at cardslot0: bus 4 device 0 cacheline 0x8, lattimer 0x20
pcmcia0 at cardslot0
cardslot1 at cbb1 slot 1 flags 0
cardbus1 at cardslot1: bus 5 device 0 cacheline 0x8, lattimer 0x20
pcmcia1 at cardslot1
ichpcib0 at pci0 dev 31 function 0 "Intel 82801CAM LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 82801CAM IDE" rev 0x02: DMA,
channel 0 configured to compatibility, channel 1 configured to
compatibility
wd0 at pciide0 channel 0 drive 0: 
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
atapiscsi0 at pciide0 channel 1 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0:  ATAPI
5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
auich0 at pci0 dev 31 function 5 "Intel 82801CA/CAM AC97" rev 0x02:
irq 11, ICH3 AC97
ac97: codec id 0x4352595b (Cirrus Logic CS4205 rev 3)
ac97: codec features mic channel, tone, simulated stereo, bass boost,
20 bit DAC, 18 bit ADC, SRS 3D
audio0 at auich0
"Intel 82801CA/CAM Modem" rev 0x02 at pci0 dev 31 function 6 not configured
usb0 at uhci0: USB revision 1.0
uhub0 at usb0 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at ichpcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard
pms0 at pckbc0 

Re: SGI Indy netboot not working since OpenBSD 5.6 and still in 6.0-current

2016-09-02 Thread Frank Scheiner

Hi all,

Small update from my side: I'm currently trying to determine the exact
patch or patch set that is responsible for the described problem.

Earlier I assumed a change in `[...]/sys/netinet/if_ether.c` could be
responsible, but from my testing this is not the case (see below).

I compiled a bsd.IP22 kernel for every change date for
`[...]/sys/netinet/if_ether.c` between 5.5 and 5.6 after using
`cvs update -d-R -D ` (I started with the content of [1]). As I'm
totally new to CVS, please let me know if I made a mistake with that
command. My intention was to update my sandbox with all changes up to a
specific date.


[1]: http://ftp.vim.org/OpenBSD/5.5/sys.tar.gz

I then booted all the resulting kernels on the Indy and could at least
narrow down the responsible change: it must have happened between:


"Mon Jun 16 19:47:21 2014 UTC"

...and:

"Sat Jul 12 14:26:00 2014 UTC"

...because the kernel from June still works, but the kernel from July
cannot successfully complete the reverse ARP.

I ruled out that a change in `[...]/sys/netinet/if_ether.c` alone could
be responsible, because when reverting the change to this file from
"Sat Jul 12 14:26:00 2014 UTC" and compiling the bsd.IP22 kernel with
this repo state it's still not working properly.


Do you have any suggestions on how to proceed? There are still a lot of
changes between those two dates. Even when removing changes to other
architectures, the remaining patches are still about 1 MiB.

If time allows I'll try to do some manual bisecting with the dates of
the remaining commits this weekend and try out the resulting IP22
kernels on my Indy. To be sure the issue still exists, I'll also try the
final OpenBSD 6.0 kernel for IP22.
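
The manual bisecting described above is a binary search over the sorted commit dates between the last working checkout (June) and the first failing one (July). The sketch below only illustrates the search order; `first_bad`, `works_fn` and `simulated_works` are hypothetical names, and in practice the callback would be a human checking out the tree at the given date, building bsd.IP22 and netbooting the Indy.

```c
#include <stddef.h>

/*
 * Hypothetical callback: returns nonzero if the kernel built from the
 * tree at checkout date number `i` still completes the reverse ARP.
 */
typedef int (*works_fn)(size_t i, void *ctx);

/*
 * Dates are numbered in ascending order.  Index `good` is known to work
 * and index `bad` is known to fail (good < bad).  Returns the index of
 * the first failing date, calling works() about log2(bad - good) times.
 */
size_t
first_bad(size_t good, size_t bad, works_fn works, void *ctx)
{
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (works(mid, ctx))
			good = mid;	/* regression is later than mid */
		else
			bad = mid;	/* regression is at or before mid */
	}
	return bad;
}

/*
 * Stand-in for "build and netboot": pretends the regression entered
 * at the date index stored in *ctx.
 */
int
simulated_works(size_t i, void *ctx)
{
	return i < *(size_t *)ctx;
}
```

With 100 candidate dates this needs about seven kernel builds instead of 100. The search assumes a single change introduced the failure and that every intermediate checkout builds; a date that does not compile would have to be skipped by hand.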


Bye
Frank

On 08/20/2016 05:11 PM, Frank Scheiner wrote:
>>Synopsis: SGI Indy netboot not working since OpenBSD 5.6 and still in
> 6.0-current (#686) (for GENERIC-IP22)
>
>>Category: kernel sgi
>
>>Environment (taken from OpenBSD 5.5):
> System  : OpenBSD 5.5
> Details : OpenBSD 5.5 (GENERIC-IP22) #127: Tue Mar  4 16:19:56 MST 2014
>
> dera...@sgi.openbsd.org:/usr/src/sys/arch/sgi/compile/GENERIC-IP22
>
> Architecture: OpenBSD.sgi
> Machine : sgi
>
>>Description:
> I'm trying to netboot a SGI Indy (w/R4400) using `bootecoff` and the
> GENERIC-IP22 kernel of the respective OpenBSD versions (5.9 and
> 6.0-current (builddate: 1471615628)). Everything works fine until the
> kernel tries to determine the IP address of the network interface (sq0
> is the only network interface in this machine!) used for netbooting via
> RARP. It states that its RARP requests are not answered, but when
> sniffing the network traffic between the Indy and the next RARP server
> (same as the NFS server that serves the root FS of the Indy) I can see
> that all RARP requests that originate from the Indy are answered with the
> designated IP address. But the Indy doesn't seem to recognize these
> answers or just ignores them. I then built a root FS using OpenBSD 5.2
> (the first version that had support for the Indy) and netbooted with
> this root FS and the respective loader and kernel, which worked fine.
> Some further root FS building and testing later I found out that the
> last OpenBSD version where netbooting works ok for the Indy is OpenBSD
> 5.5. The RARP problems described above start with OpenBSD 5.6. I sadly
> don't know if this is due to changes to the SGI specific part of the
> kernel or due to changes in the nfs ([...]/sys/nfs/nfs_boot.c) or RARP
> ([...]/sys/netinet/if_ether.c) related parts. I have an Octane2 ready
> for kernel compilation and would be glad to test out any proposed
> patches with my Indy.
>
>>How-To-Repeat:
> Netboot a SGI Indy with OpenBSD >= 5.6.
>
>>Fix:
> Unknown.
>
> These are the firmware messages, the boot messages for 6.0-current
> (builddate: 1471615628) until the kernel panics, the trace and the
> output of `ps` and `show registers`:
> ```
>>> version
> PROM Monitor SGI Version 5.0 Rev B6 IP24 Sep 28, 1994 (BE)
>>>  boot --s
> Setting $netaddr to 192.168.178.58 (from server )
> Obtaining /bootecoff from server
> 37920+192+2592 entry: 0x880020f0
>
> OpenBSD/sgi-IP22 ARCBios boot version 1.6
> arg 0: bootp()/bootecoff
> arg 1: -s
> arg 2: ConsoleIn=serial(0)
> arg 3: ConsoleOut=serial(0)
> arg 4: SystemPartition=bootp()
> arg 5: OSLoader=bootecoff
> arg 6: OSLoadPartition=bootp()
> arg 7: OSLoadFilename=indy
> arg 8: OSLoadOptions=-s
> Boot: bootp()indy
> Setting $netaddr to 192.168.178.58 (from server )
> Obtaining indy from server
> Setting $netaddr to 192.168.178.58 (from server )
> Obtaining indy from server
> 3538216+964696Setting $netaddr to 192.168.178.58 (from server )
> Obtaining indy from server
>  [78+240984+142551]=0x4a92c0
> ARCS32 Firmware Version 1.10
> Found SGI-IP22, setting up.
> Initial setup done, switching console.
> [ using 384320 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the 

Re: very slow ssh / sshd performance on current

2016-09-02 Thread Alexander Bluhm
On Fri, Sep 02, 2016 at 10:33:19AM +0200, Andreas Bartelt wrote:
> On 09/02/16 10:24, Alexander Bluhm wrote:
> > I see a performance drop to 10 Mbit/sec on some old i386 machines
> > with em(4).  Can you try this kernel diff to see whether it is the
> > same problem?
> > 
> 
> yes, >50 MB/s now. Thanks!

So I think this happens:

sosend() uses m_getuio() now to allocate a MAXMCLBYTES mbuf cluster,
that is 65536 bytes.  sbreserve() calculates the upper cluster limit
min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE).  In our case it
is min(1024*16 * 2, (256*1024) + ((256*1024) / (1 << 11)) * 256)
== min(32768, 294912) == 32768.

So after allocating a single mbuf cluster the sending socket buffer
has no space anymore.  As tcp_output() keeps the mbuf cluster for
retransmits, it will be freed only after all ACKs have been received.
That kills performance totally.

To allow cycling through the mbufs periodically, I think we need
space for at least 3 of them.  Note that this diff also affects the
mbuf size on the receiving side, but I think it does not matter
much as the data size is also limited.

Andreas, can you revert the diff I sent previously and try this one
instead?

ok?

bluhm

Index: kern/uipc_socket2.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.64
diff -u -p -r1.64 uipc_socket2.c
--- kern/uipc_socket2.c	28 Jun 2016 14:47:00 -0000	1.64
+++ kern/uipc_socket2.c	2 Sep 2016 10:29:10 -0000
@@ -397,7 +397,8 @@ sbreserve(struct sockbuf *sb, u_long cc)
 	if (cc == 0 || cc > sb_max)
 		return (1);
 	sb->sb_hiwat = cc;
-	sb->sb_mbmax = min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE);
+	sb->sb_mbmax = max(3 * MAXMCLBYTES,
+	    min(cc * 2, sb_max + (sb_max / MCLBYTES) * MSIZE));
 	if (sb->sb_lowat > sb->sb_hiwat)
 		sb->sb_lowat = sb->sb_hiwat;
 	return (0);
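
The sbreserve() arithmetic in this message can be checked outside the kernel. The sketch below is a user-space restatement, not kernel code: the constants are the values quoted in the message (sb_max of 256*1024, MCLBYTES of 1 << 11, MSIZE of 256, MAXMCLBYTES of 65536), and the function names are hypothetical.

```c
/* Values as quoted in the message; they may differ on other kernels. */
static const unsigned long sb_max = 256UL * 1024;
static const unsigned long mclbytes = 1UL << 11;
static const unsigned long msize = 256;
static const unsigned long maxmclbytes = 65536;

static unsigned long
ul_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long
ul_max(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* sb_mbmax as computed by sbreserve() before the diff. */
unsigned long
sb_mbmax_before(unsigned long cc)
{
	return ul_min(cc * 2, sb_max + (sb_max / mclbytes) * msize);
}

/* sb_mbmax with the diff applied: never below three maximum clusters. */
unsigned long
sb_mbmax_after(unsigned long cc)
{
	return ul_max(3 * maxmclbytes, sb_mbmax_before(cc));
}

/* Number of maximum-size clusters that fit under a given sb_mbmax. */
unsigned long
full_clusters(unsigned long mbmax)
{
	return mbmax / maxmclbytes;
}
```

For the 16 KB socket buffer in the example (cc of 1024*16), the old limit is 32768 bytes, which is smaller than one 65536-byte cluster, while the new limit is 196608 bytes, which is room for exactly three.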



Re: very slow ssh / sshd performance on current

2016-09-02 Thread Andreas Bartelt

On 09/02/16 10:24, Alexander Bluhm wrote:
> On Fri, Sep 02, 2016 at 09:43:13AM +0200, Andreas Bartelt wrote:
> > I'm observing very slow ssh / sshd performance on current (tested on amd64).
> > Throughput is less than 1/50th of what I'm typically seeing on my boxes.
> > This drop in performance seems to be independent of the used ciphers (tested
> > with aes-gcm-128 & chacha20-poly1305).
> >
> > All tested interfaces are em(4) which, however, seems to be completely
> > unrelated since I don't observe this huge drop in performance via nc(1) -
> > it's >70 MB/s via nc(1) vs. ~1-2 MB/s via ssh/scp.
>
> I see a performance drop to 10 Mbit/sec on some old i386 machines
> with em(4).  Can you try this kernel diff to see whether it is the
> same problem?
>

yes, >50 MB/s now. Thanks!



Re: very slow ssh / sshd performance on current

2016-09-02 Thread Alexander Bluhm
On Fri, Sep 02, 2016 at 09:43:13AM +0200, Andreas Bartelt wrote:
> I'm observing very slow ssh / sshd performance on current (tested on amd64).
> Throughput is less than 1/50th of what I'm typically seeing on my boxes.
> This drop in performance seems to be independent of the used ciphers (tested
> with aes-gcm-128 & chacha20-poly1305).
> 
> All tested interfaces are em(4) which, however, seems to be completely
> unrelated since I don't observe this huge drop in performance via nc(1) -
> it's >70 MB/s via nc(1) vs. ~1-2 MB/s via ssh/scp.

I see a performance drop to 10 Mbit/sec on some old i386 machines
with em(4).  Can you try this kernel diff to see whether it is the
same problem?

bluhm

Index: kern/uipc_socket.c
===================================================================
RCS file: /data/mirror/openbsd/cvs/src/sys/kern/uipc_socket.c,v
retrieving revision 1.155
diff -u -p -r1.155 uipc_socket.c
--- kern/uipc_socket.c	25 Aug 2016 14:13:19 -0000	1.155
+++ kern/uipc_socket.c	29 Aug 2016 18:02:24 -0000
@@ -544,7 +544,7 @@ m_getuio(struct mbuf **mp, int atomic, l
 
 		resid = ulmin(resid, space);
 		if (resid >= MINCLSIZE) {
-			MCLGETI(m, M_NOWAIT, NULL, ulmin(resid, MAXMCLBYTES));
+			MCLGETI(m, M_NOWAIT, NULL, ulmin(resid, PAGE_SIZE));
 			if ((m->m_flags & M_EXT) == 0)
 				goto nopages;
 			mlen = m->m_ext.ext_size;



very slow ssh / sshd performance on current

2016-09-02 Thread Andreas Bartelt
I'm observing very slow ssh / sshd performance on current (tested on
amd64). Throughput is less than 1/50th of what I'm typically seeing on
my boxes. This drop in performance seems to be independent of the used
ciphers (tested with aes-gcm-128 & chacha20-poly1305).

All tested interfaces are em(4) which, however, seems to be completely
unrelated since I don't observe this huge drop in performance via nc(1)
- it's >70 MB/s via nc(1) vs. ~1-2 MB/s via ssh/scp.

OpenBSD 6.0-current (GENERIC.MP) #0: Fri Sep  2 07:25:22 CEST 2016
a...@obsd.bartelt.name:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8277159936 (7893MB)
avail mem = 8021807104 (7650MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xccbfd000 (64 entries)
bios0: vendor LENOVO version "N10ET38W (1.17 )" date 08/20/2015
bios0: LENOVO 20CMCTO1WW
acpi0 at bios0: rev 2
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP ASF! HPET ECDT APIC MCFG SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT PCCT SSDT UEFI MSDM BATB FPDT UEFI DMAR
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 798.28 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SENSOR,ARAT
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 798.15 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SENSOR,ARAT
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 798.15 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SENSOR,ARAT
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i7-5600U CPU @ 2.60GHz, 798.16 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,BMI1,HLE,AVX2,SMEP,BMI2,ERMS,INVPCID,RTM,RDSEED,ADX,SMAP,PT,SENSOR,ARAT
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 40 pins
acpimcfg0 at acpi0 addr 0xf800, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP1)
acpiprt3 at acpi0: bus 3 (EXP2)
acpiprt4 at acpi0: bus -1 (EXP3)
acpicpu0 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: PUBS, resource for XHCI, EHC1
acpipwrres1 at acpi0: NVP3, resource for PEG_
acpipwrres2 at acpi0: NVP2, resource for PEG_
acpitz0 at acpi0: critical temperature is 128 degC
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
"LEN0071" at acpi0 not configured
"LEN0046" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "45N1113" serial   473 type LION oem "LGC"
acpibat1 at acpi0: BAT1 model "45N1738" serial  1842 type LION oem "LGC"
acpiac0 at acpi0: AC unit offline
acpithinkpad0 at acpi0
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not