Re: serious watchdog timeout issues with em driver

2015-12-21 Thread Gregor Best
On Mon, Dec 21, 2015 at 10:41:22AM +0200, Kapetanakis Giannis wrote:
> Hi,
> 
> Problem is still here with Dec 16 snapshot.
> 
> Dec 17 13:08:20 server /bsd: OpenBSD 5.8-current (GENERIC.MP) #1494: Wed Dec 16 12:13:03 MST 2015
> Dec 17 13:08:20 server /bsd: dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
> Dec 17 13:08:20 server /bsd: cpu0: Intel(R) Pentium(R) 4 CPU 3.00GHz ("GenuineIntel" 686-class) 3 GHz
> Dec 17 13:08:20 server /bsd: em0 at pci1 dev 10 function 0 "Intel 82541EI" rev 0x00: apic 2 int 22, address 00:30:48:72:28:58
> Dec 17 13:08:20 server /bsd: em1 at pci1 dev 11 function 0 "Intel 82541EI" rev 0x00: apic 2 int 23, address 00:30:48:72:28:59
> Dec 20 16:53:18 server /bsd: em0: watchdog timeout -- resetting
> Dec 21 01:54:12 server /bsd: em0: watchdog timeout -- resetting
> 
> G
> 

I'm also seeing this with a Dec 19 snapshot on i386. This is with

em0 at pci1 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address 00:03:2d:20:cf:84
em1 at pci2 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address 00:03:2d:20:cf:85
em2 at pci3 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address 00:03:2d:20:cf:86
em3 at pci4 dev 0 function 0 "Intel 82583V" rev 0x00: msi, address 00:03:2d:20:cf:87

The timeouts seem to be much less frequent, though, and it looks like
running iperf doesn't trigger them anymore. When running iperf, I see
top showing "system" time nicely distributed over cores #1 to #3 and
interrupts on core #0, with throughput at around 500Mbit/sec.

A dmesg is attached after my signature.

-- 
Gregor

OpenBSD 5.8-current (GENERIC.MP) #1499: Sat Dec 19 08:24:55 MST 2015
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Atom(TM) CPU D2550 @ 1.86GHz ("GenuineIntel" 686-class) 1.87 GHz
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,xTPR,PDCM,MOVBE,LAHF,PERF,ITSC,SENSOR,ARAT
real mem  = 2135064576 (2036MB)
avail mem = 2081611776 (1985MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 10/11/11, SMBIOS rev. 2.7 @ 0xe9380 (50 entries)
bios0: vendor American Megatrends Inc. version "4.6.5" date 06/21/2012
bios0: INTEL Corporation Tiger Hill
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP APIC MCFG HPET SSDT SSDT SSDT IFEU
acpi0: wakeup devices P0P8(S4) PS2K(S3) PS2M(S3) USB0(S3) USB1(S3) USB2(S3) 
USB3(S3) USB7(S3) PXSX(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) 
PXSX(S4) RP04(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
cpu0: apic clock running at 133MHz
cpu0: mwait min=64, max=64, C-substates=0.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Atom(TM) CPU D2550 @ 1.86GHz ("GenuineIntel" 686-class) 1.87 GHz
cpu1: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,xTPR,PDCM,MOVBE,LAHF,PERF,ITSC,SENSOR,ARAT
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Atom(TM) CPU D2550 @ 1.86GHz ("GenuineIntel" 686-class) 1.87 GHz
cpu2: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,xTPR,PDCM,MOVBE,LAHF,PERF,ITSC,SENSOR,ARAT
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Atom(TM) CPU D2550 @ 1.86GHz ("GenuineIntel" 686-class) 1.87 GHz
cpu3: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,NXE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,xTPR,PDCM,MOVBE,LAHF,PERF,ITSC,SENSOR,ARAT
ioapic0 at mainbus0: apid 4 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 5 (P0P8)
acpiprt2 at acpi0: bus 1 (RP01)
acpiprt3 at acpi0: bus 2 (RP02)
acpiprt4 at acpi0: bus 3 (RP03)
acpiprt5 at acpi0: bus 4 (RP04)
acpiec0 at acpi0: not present
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpicpu2 at acpi0: C1(@1 halt!)
acpicpu3 at acpi0: C1(@1 halt!)
acpitz0 at acpi0: critical temperature is 140 degC
acpipwrres0 at acpi0: FN00, resource for FAN0
acpitz1 at acpi0: critical temperature is 100 degC
acpibat0 at acpi0: BAT0 not present
acpibat1 at acpi0: BAT1 not present
acpibtn0 at acpi0: PWRB
acpiac0 at acpi0: AC unit offline
acpibtn1 at acpi0: SLPB
acpibtn2 at acpi0: LID0
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD02
bios0: ROM list: 0xc/0xf400! 0xcf800/0x1000 0xd0800/0x1000 0xd1800/0x1000 
0xd2800/0x1000
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 vendor 

Re: serious watchdog timeout issues with em driver

2015-12-21 Thread Kapetanakis Giannis

Hi,

Problem is still here with Dec 16 snapshot.

Dec 17 13:08:20 server /bsd: OpenBSD 5.8-current (GENERIC.MP) #1494: Wed Dec 16 12:13:03 MST 2015
Dec 17 13:08:20 server /bsd: dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
Dec 17 13:08:20 server /bsd: cpu0: Intel(R) Pentium(R) 4 CPU 3.00GHz ("GenuineIntel" 686-class) 3 GHz
Dec 17 13:08:20 server /bsd: em0 at pci1 dev 10 function 0 "Intel 82541EI" rev 0x00: apic 2 int 22, address 00:30:48:72:28:58
Dec 17 13:08:20 server /bsd: em1 at pci1 dev 11 function 0 "Intel 82541EI" rev 0x00: apic 2 int 23, address 00:30:48:72:28:59

Dec 20 16:53:18 server /bsd: em0: watchdog timeout -- resetting
Dec 21 01:54:12 server /bsd: em0: watchdog timeout -- resetting

G



Re: serious watchdog timeout issues with em driver

2015-12-14 Thread Kapetanakis Giannis

On 09/12/15 10:42, Kapetanakis Giannis wrote:
> On 08/12/15 21:47, Kapetanakis Giannis wrote:
>> The event happened only once and the network recovered after a few
>> seconds. No reboot.
>>
>> G
>
> Well that didn't last long.
> Today I found the server hung at ddb after a new watchdog timeout on
> em0.
>
> Keyboard was not working so I could not get all the info.
>
> I wrote on paper:
> uvm_fault(0xd0ba3660, 0xefffe000, 0, 1) -> d
> kernel: page fault trap, code=0
> Stopped at bpf_m_xhalt+0x6f: movzwl 0(%esi),%eax
>
> G


Hi,

Has something changed from the Dec 6 snapshot to Dec 9 -current that fixed this?
I've seen that the driver has not been updated.

I compiled a new -current kernel on Dec 9 and the system has NOT had any
problems since then.

No watchdog timeout and no crash.

Problem solved? Any link to the commit that solved it?

thanks

Giannis



Re: serious watchdog timeout issues with em driver

2015-12-09 Thread Kapetanakis Giannis

On 08/12/15 21:47, Kapetanakis Giannis wrote:
> The event happened only once and the network recovered after a few
> seconds. No reboot.
>
> G

Well that didn't last long.
Today I found the server hung at ddb after a new watchdog timeout on em0.
Keyboard was not working so I could not get all the info.

I wrote on paper:
uvm_fault(0xd0ba3660, 0xefffe000, 0, 1) -> d
kernel: page fault trap, code=0
Stopped at bpf_m_xhalt+0x6f: movzwl 0(%esi),%eax

G



Re: serious watchdog timeout issues with em driver

2015-12-08 Thread Kapetanakis Giannis

On 20/11/15 15:12, Martin Pieuchot wrote:
> I just committed a revert to 1.305 keeping the API changes needed for
> the driver to build.
>
> This should bring your stability back, please let us know if that's not
> the case.
>
> I'm sorry for your troubles.

Hi,

I've upgraded yesterday to Dec 6 snapshot and today I got my first
em0: watchdog timeout -- resetting

regards,

G

OpenBSD 5.8-current (GENERIC.MP) #1468: Sun Dec  6 11:27:59 MST 2015
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: Intel(R) Pentium(R) 4 CPU 3.00GHz ("GenuineIntel" 686-class) 3 GHz
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,CNXT-ID,xTPR,PERF
real mem  = 2146910208 (2047MB)
avail mem = 2093232128 (1996MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 04/13/04, BIOS32 rev. 0 @ 0xfb7f0, SMBIOS rev. 2.3 @ 
0xf0800 (42 entries)
bios0: vendor Phoenix Technologies, LTD version "6.00 PG" date 04/13/2004
bios0: Supermicro P4SCE
acpi0 at bios0: rev 0
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC
acpi0: wakeup devices HUB0(S5) UAR1(S5) UAR2(S5) USB0(S1) USB1(S1) USB2(S1) 
USB3(S1) USBE(S1) MODM(S5) PCI0(S5)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 199MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Pentium(R) 4 CPU 3.00GHz ("GenuineIntel" 686-class) 3 GHz
cpu1: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,CNXT-ID,xTPR,PERF
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 2
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (HUB0)
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpitz0 at acpi0: critical temperature is 100 degC
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc/0x8000 0xc8000/0x8000!
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel 82875P Host" rev 0x02
uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: apic 2 int 16
uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: apic 2 int 19
uhci2 at pci0 dev 29 function 2 "Intel 82801EB/ER USB" rev 0x02: apic 2 int 18
uhci3 at pci0 dev 29 function 3 "Intel 82801EB/ER USB" rev 0x02: apic 2 int 16
ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB2" rev 0x02: apic 2 int 23
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb0 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xc2
pci1 at ppb0 bus 1
vga1 at pci1 dev 9 function 0 "ATI Rage XL" rev 0x27
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
em0 at pci1 dev 10 function 0 "Intel 82541EI" rev 0x00: apic 2 int 22, address 
00:30:48:72:28:58
em1 at pci1 dev 11 function 0 "Intel 82541EI" rev 0x00: apic 2 int 23, address 
00:30:48:72:28:59
ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 82801EB/ER IDE" rev 0x02: DMA, channel 
0 configured to compatibility, channel 1 configured to compatibility
pciide0: channel 0 disabled (no drives)
pciide0: channel 1 disabled (no drives)
pciide1 at pci0 dev 31 function 2 "Intel 82801EB SATA" rev 0x02: DMA, channel 0 
configured to native-PCI, channel 1 configured to native-PCI
pciide1: using apic 2 int 18 for native-PCI interrupt
wd0 at pciide1 channel 0 drive 0: 
wd0: 16-sector PIO, LBA48, 78533MB, 160836480 sectors
wd0(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 6
ichiic0 at pci0 dev 31 function 3 "Intel 82801EB/ER SMBus" rev 0x02: apic 2 int 
17
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 512MB DDR SDRAM non-parity PC3200CL3.0
spdmem1 at iic0 addr 0x51: 512MB DDR SDRAM non-parity PC3200CL3.0
spdmem2 at iic0 addr 0x52: 512MB DDR SDRAM non-parity PC3200CL3.0
spdmem3 at iic0 addr 0x53: 512MB DDR SDRAM non-parity PC3200CL3.0
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at ichpcib0
isadma0 at isa0
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: W83627HF rev 0x17
lm1 at wbsio0 port 0x290/8: W83627HF
npx0 at isa0 port 0xf0/16: reported by CPUID; using 

Re: serious watchdog timeout issues with em driver

2015-12-08 Thread Chris Cappuccio
Kapetanakis Giannis [bil...@edu.physics.uoc.gr] wrote:
> On 20/11/15 15:12, Martin Pieuchot wrote:
> >I just committed a revert to 1.305 keeping the API changes needed for
> >the driver to build.
> >
> >This should bring your stability back, please let us know if that's not
> >the case.
> >
> >I'm sorry for your troubles.
> 
> Hi,
> 
> I've upgraded yesterday to Dec 6 snapshot and today I got my first
> em0: watchdog timeout -- resetting
> 
> em0 at pci1 dev 10 function 0 "Intel 82541EI" rev 0x00: apic 2 int 22, 
> address 00:30:48:72:28:58
> em1 at pci1 dev 11 function 0 "Intel 82541EI" rev 0x00: apic 2 int 23, 
> address 00:30:48:72:28:59

Can you try to pinpoint when it started?



Re: serious watchdog timeout issues with em driver

2015-12-08 Thread Kapetanakis Giannis

On 08/12/15 19:39, Chris Cappuccio wrote:
> Kapetanakis Giannis [bil...@edu.physics.uoc.gr] wrote:
>> On 20/11/15 15:12, Martin Pieuchot wrote:
>>> I just committed a revert to 1.305 keeping the API changes needed for
>>> the driver to build.
>>>
>>> This should bring your stability back, please let us know if that's not
>>> the case.
>>>
>>> I'm sorry for your troubles.
>>
>> Hi,
>>
>> I've upgraded yesterday to Dec 6 snapshot and today I got my first
>> em0: watchdog timeout -- resetting
>>
>> em0 at pci1 dev 10 function 0 "Intel 82541EI" rev 0x00: apic 2 int 22, address 00:30:48:72:28:58
>> em1 at pci1 dev 11 function 0 "Intel 82541EI" rev 0x00: apic 2 int 23, address 00:30:48:72:28:59
>
> Can you try to pinpoint when it started?


You mean what type of traffic caused it? I don't know.
The server is a ~ busy internal-only recursive DNS server (bind).
Other than that, I was playing in its shell when the event occurred,
nothing special.

If you mean time since boot, it was after ~22 hours:
Dec  7 15:53:20  /bsd: OpenBSD 5.8-current (GENERIC.MP) #1468: Sun Dec  6 11:27:59 MST 2015

Dec  8 16:06:59  /bsd: em0: watchdog timeout -- resetting
Dec  8 16:07:00  named[10537]: client: warning: client xx.xx.xx.xx#30399 (mail.expressionclones.com): error sending response: host unreachable
Dec  8 16:07:00  named[10537]: client: warning: client yy.yy.yy.yy#52263 (85.151.91.139.sa-accredit.habeas.com): error sending response: host unreachable


The event happened only once and the network recovered after a few
seconds. No reboot.


G



Re: serious watchdog timeout issues with em driver

2015-12-02 Thread Atanas Vladimirov

On 30.11.2015 14:08, Atanas Vladimirov wrote:
> Hi,
> I'm not sure if this is related to recent em(4) changes, but after
> upgrade from:

Hi,
Just ignore my previous assumptions. I think I found the real cause of
this upload speed problem. I'm using ifstated to inform me when
something goes wrong with my egress interface.

snip from ifstated.conf
...
state extif_online {
   init {
      run "echo External interface ON-line @ `date +%H:%M:%S` | mail -s 'External Interface ON-line' t...@example.com"
      run "/usr/sbin/arp -Ff /etc/ether.mac"
   }
   if $em2_up && ! $peer_up {
      set-state extif_up
   }
   if $em2_down {
      set-state extif_down
   }
}
...
/snip

[ns]~$ cat /etc/ether.mac
95.YY.XXX.225 64:87:88:58:b2:41 permanent
^^^^^^^^^^^^^
this is the IP of my default gateway

If I have a permanent ARP entry for my gateway, I observe 1-2mbps
upload speed.

After I clear the ARP entry I get 30-40mbps, as it should be.

Meanwhile I updated to a more recent snapshot (#1696: Wed Dec  2 10:13:03
MST 2015) and the problem persists.
If you need more info just ask.
Best regards,
Atanas

dmesg:

OpenBSD 5.8-current (GENERIC.MP) #1696: Wed Dec  2 10:13:03 MST 2015
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4269342720 (4071MB)
avail mem = 4135833600 (3944MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0x9f000 (70 entries)
bios0: vendor American Megatrends Inc. version "1.2a" date 06/27/2012
bios0: Supermicro X8SIL
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC MCFG OEMB HPET GSCI SSDT EINJ BERT ERST 
HEST
acpi0: wakeup devices P0P1(S4) P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4) 
BR1E(S4) PS2K(S4) PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) 
USB5(S4) USB6(S4) GBE_(S4) [...]

acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.38 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 133MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 7 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 1, remapped to apid 7
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (P0P1)
acpiprt2 at acpi0: bus 1 (P0P3)
acpiprt3 at acpi0: bus 2 (P0P5)
acpiprt4 at acpi0: bus -1 (P0P6)
acpiprt5 at acpi0: bus 6 (BR1E)
acpiprt6 at acpi0: bus 3 (BR20)
acpiprt7 at acpi0: bus 4 (BR24)
acpiprt8 at acpi0: bus 5 (BR25)
acpicpu0 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS

acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2401, 2400, 2267, 2133, 2000, 
1867, 1733, 1600, 1467, 1333, 1200 MHz

pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core DMI" rev 0x11
ppb0 at pci0 dev 3 function 0 "Intel Core PCIE" rev 0x11: msi
pci1 at ppb0 bus 1
ppb1 at pci0 dev 5 function 0 "Intel Core 

Re: serious watchdog timeout issues with em driver

2015-12-02 Thread Atanas Vladimirov

On 02.12.2015 22:25, Atanas Vladimirov wrote:
> On 30.11.2015 14:08, Atanas Vladimirov wrote:
>> Hi,
>> I'm not sure if this is related to recent em(4) changes, but after
>> upgrade from:
>
> Hi,
> Just ignore my previous assumptions.


Hi,
Sorry for the noise! Please ignore all of my previous emails.
It seems that my ISP changed a NIC port on the router which
served as my default gateway and I used a wrong MAC address.
I'm really sorry.
Best wishes,
Atanas
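
A pinned gateway MAC that no longer matches the router, as described above, can be caught by comparing the permanent entries against what is actually seen on the wire. The following is only a minimal sketch: it assumes the file uses the "IP MAC flags" layout of the /etc/ether.mac example in the previous message, treats "#" as starting a comment (an assumption), and takes the observed addresses as a hand-built dictionary. The function names, the 192.0.2.1 address, and the second MAC are hypothetical.

```python
def parse_ether_file(text):
    """Parse lines of the form 'IP MAC [flags...]' into {ip: mac}."""
    entries = {}
    for raw in text.splitlines():
        # Assumption: anything after '#' is a comment.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        if len(fields) >= 2:
            entries[fields[0]] = fields[1].lower()
    return entries


def stale_entries(pinned, observed):
    """Return {ip: (pinned_mac, observed_mac)} for entries that disagree."""
    return {
        ip: (mac, observed[ip])
        for ip, mac in pinned.items()
        if ip in observed and observed[ip] != mac
    }


# Hypothetical data: the pinned entry and the MAC the gateway uses now.
pinned = parse_ether_file("192.0.2.1 64:87:88:58:b2:41 permanent\n")
observed = {"192.0.2.1": "64:87:88:58:b2:99"}

for ip, (old, new) in stale_entries(pinned, observed).items():
    print(f"{ip}: pinned {old}, but gateway now answers as {new}")
```

How the observed addresses are collected (for example from the ARP table after removing the permanent entry) is left open here.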



Re: serious watchdog timeout issues with em driver

2015-11-30 Thread Atanas Vladimirov

On 20.11.2015 21:10, Sonic wrote:
> On Fri, Nov 20, 2015 at 12:37 PM, Mark Kettenis wrote:
>> Thanks Martin.
>
> All is fine now. System booted with no errors and no watchdog timeouts.
>
> Thanks to all.
>
> Chris

Hi,
I'm not sure if this is related to recent em(4) changes, but after
upgrade from:

-OpenBSD 5.8-current (GENERIC.MP) #1597: Thu Nov 12 07:33:59 MST 2015
+OpenBSD 5.8-current (GENERIC.MP) #1671: Thu Nov 26 20:36:24 MST 2015

my upload throughput can't reach more than 2mbps.
I also tried Nov 27 22:50:35 snapshot with same result.

-OpenBSD 5.8-current (GENERIC.MP) #1671: Thu Nov 26 20:36:24 MST 2015
+OpenBSD 5.8-current (GENERIC.MP) #1675: Fri Nov 27 22:50:35 MST 2015

I tried to build older em(4) revisions, but all except if_em.c -r 1.312
failed to build. A kernel with if_em.c -r 1.312, if_em.h -r 1.60 and
if_em_hw.c -r 1.88 boots normally but shows the same upload throughput.

Does anyone else observe this behavior? If you need more info just ask.
Thanks for your time.

P.S.: When I plug the cable from my ISP into a Windows 7 laptop I get
35-40 mbps.


full dmesg:
OpenBSD 5.8-current (GENERIC.MP) #1675: Fri Nov 27 22:50:35 MST 2015
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4269342720 (4071MB)
avail mem = 4135833600 (3944MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0x9f000 (70 entries)
bios0: vendor American Megatrends Inc. version "1.2a" date 06/27/2012
bios0: Supermicro X8SIL
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC MCFG OEMB HPET GSCI SSDT EINJ BERT ERST 
HEST
acpi0: wakeup devices P0P1(S4) P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4) 
BR1E(S4) PS2K(S4) PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) 
USB5(S4) USB6(S4) GBE_(S4) [...]

acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.32 MHz
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 133MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.01 MHz
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG,LAHF,PERF,ITSC,SENSOR

cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 0, core 3, package 0
ioapic0 at mainbus0: apid 7 pa 0xfec0, version 20, 24 pins
ioapic0: misconfigured as apic 1, remapped to apid 7
acpimcfg0 at acpi0 addr 0xe000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (P0P1)
acpiprt2 at acpi0: bus 1 (P0P3)
acpiprt3 at acpi0: bus 2 (P0P5)
acpiprt4 at acpi0: bus -1 (P0P6)
acpiprt5 at acpi0: bus 6 (BR1E)
acpiprt6 at acpi0: bus 3 (BR20)
acpiprt7 at acpi0: bus 4 (BR24)
acpiprt8 at acpi0: bus 5 (BR25)
acpicpu0 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: !C3(350@17 mwait.1@0x20), !C2(500@17 mwait.1@0x10), 
C1(1000@1 mwait.1), PSS

acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2401, 2400, 2267, 2133, 2000, 
1867, 1733, 1600, 1467, 1333, 1200 MHz

pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core DMI" rev 0x11
ppb0 at pci0 dev 3 function 0 "Intel Core PCIE" rev 0x11: msi
pci1 at ppb0 bus 1
ppb1 at pci0 dev 5 function 0 "Intel Core PCIE" rev 0x11: msi

Re: serious watchdog timeout issues with em driver

2015-11-20 Thread Martin Pieuchot
On 19/11/15(Thu) 17:54, Sonic wrote:
> Have serious problems for over 7 weeks now with em driver,
> specifically any rev of if_em.c >  1.305. Starting with rev 1.306,
> released on 2015/09/30 and continuing to -current, watchdog timeouts
> rule the day. Unfortunately rev 1.305 no longer builds with -current as
> it appears the patch in rev 1.309 would be necessary.

I just committed a revert to 1.305 keeping the API changes needed for
the driver to build.

This should bring your stability back, please let us know if that's not
the case.

I'm sorry for your troubles.



Re: serious watchdog timeout issues with em driver

2015-11-20 Thread Sonic
On Fri, Nov 20, 2015 at 8:12 AM, Martin Pieuchot wrote:
> I just committed a revert to 1.305 keeping the API changes needed for
> the driver to build.
>
> This should bring your stability back, please let us know if that's not
> the case.

The kernel/driver builds with those changes but crashes on startup
(hadn't rebuilt userland yet). Couldn't see much when it happened, I
believe it was just after starting the network, and there was some
Xsoft... error and then what I would describe as a core dump to the
console. No footprints seem to be left after a power reset and booting
into obsd.

Thanks,

Chris



Re: serious watchdog timeout issues with em driver

2015-11-20 Thread Mark Kettenis
> Date: Fri, 20 Nov 2015 14:12:52 +0100
> From: Martin Pieuchot 
> 
> On 19/11/15(Thu) 17:54, Sonic wrote:
> > Have serious problems for over 7 weeks now with em driver,
> > specifically any rev of if_em.c >  1.305. Starting with rev 1.306,
> > released on 2015/09/30 and continuing to -current, watchdog timeouts
> > rule the day. Unfortunately rev 1.305 no longer builds with -current as
> > it appears the patch in rev 1.309 would be necessary.
> 
> I just committed a revert to 1.305 keeping the API changes needed for
> the driver to build.

Thanks Martin.  I didn't have the time in the last few weeks to do the
backout.



Re: serious watchdog timeout issues with em driver

2015-11-20 Thread Sonic
On Fri, Nov 20, 2015 at 12:37 PM, Mark Kettenis wrote:
> Thanks Martin.

All is fine now. System booted with no errors and no watchdog timeouts.

Thanks to all.

Chris



serious watchdog timeout issues with em driver

2015-11-19 Thread Sonic
Have serious problems for over 7 weeks now with em driver,
specifically any rev of if_em.c >  1.305. Starting with rev 1.306,
released on 2015/09/30 and continuing to -current, watchdog timeouts
rule the day. Unfortunately rev 1.305 no longer builds with -current as
it appears the patch in rev 1.309 would be necessary.
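
When checking which if_em.c revision a tree has against the 1.305 boundary mentioned above, note that CVS revision numbers are dotted integers, not decimals. A minimal sketch of the comparison follows; the helper names are hypothetical, and the check reflects only the report in this message, not any later commits.

```python
def parse_revision(rev):
    """Turn a CVS/RCS revision such as '1.305' into a tuple of ints."""
    return tuple(int(part) for part in rev.split("."))


def is_newer_than(rev, baseline):
    """True if rev comes after baseline when compared field by field."""
    return parse_revision(rev) > parse_revision(baseline)


for rev in ("1.305", "1.306", "1.309", "1.312"):
    print(rev, is_newer_than(rev, "1.305"))

# Pitfall: read as decimals, 1.31 is larger than 1.305, but as revisions
# 1.31 is the 31st revision on the trunk and so much older than 1.305.
print(float("1.31") > float("1.305"), is_newer_than("1.31", "1.305"))
```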

System in question is a NAT firewall, also running Unbound and DHCPD.
Timeouts occur randomly and can affect both internal and external
interfaces. But use of a bittorrent app on an internal client system
will always trigger many such timeouts:

Nov 18 12:21:17 stargate /bsd: em0: watchdog timeout -- resetting
Nov 18 12:21:17 stargate /bsd: em1: watchdog timeout -- resetting
Nov 18 12:22:34 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:22:34 stargate unbound: [12687:1] notice: remote address is 172.27.12.11 port 55181
Nov 18 12:22:36 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:22:36 stargate unbound: [12687:1] notice: remote address is 172.27.12.253 port 54266
Nov 18 12:22:36 stargate unbound: [22477:0] notice: sendto failed: No buffer space available
Nov 18 12:22:36 stargate unbound: [22477:0] notice: remote address is 172.27.12.253 port 53257
Nov 18 12:22:37 stargate /bsd: em0: watchdog timeout -- resetting
Nov 18 12:23:42 stargate /bsd: em0: watchdog timeout -- resetting
Nov 18 12:28:11 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:11 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 56045
Nov 18 12:28:12 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:12 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 41975
Nov 18 12:28:12 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:12 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 48603
Nov 18 12:28:12 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:12 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 17834
Nov 18 12:28:13 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:13 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 1177
Nov 18 12:28:14 stargate unbound: [12687:1] notice: sendto failed: No buffer space available
Nov 18 12:28:14 stargate unbound: [12687:1] notice: remote address is 172.27.12.66 port 39013
Nov 18 12:28:15 stargate /bsd: em0: watchdog timeout -- resetting
Nov 18 12:29:42 stargate /bsd: em0: watchdog timeout -- resetting
Nov 18 14:00:01 stargate syslogd: restart
Nov 18 16:00:01 stargate syslogd: restart
Nov 19 12:00:01 stargate syslogd: restart
Nov 19 16:00:01 stargate syslogd: restart
Nov 19 16:08:36 stargate /bsd: em0: watchdog timeout -- resetting
Nov 19 16:10:34 stargate /bsd: em0: watchdog timeout -- resetting
Nov 19 16:15:04 stargate /bsd: em0: watchdog timeout -- resetting
Nov 19 16:19:55 stargate last message repeated 3 times

(one of the above is on the external interface em1)
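
For anyone wanting to compare how often each interface resets, a log excerpt like the one above can be tallied with a short script. This is only a sketch, assuming the message format shown in the log and that a "last message repeated N times" line refers to the watchdog line directly before it; the function name and sample list are made up for illustration.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   Nov 18 12:21:17 stargate /bsd: em0: watchdog timeout -- resetting
WATCHDOG = re.compile(r"/bsd: (\w+): watchdog timeout -- resetting")
# syslogd collapses identical consecutive lines into this form.
REPEATED = re.compile(r"last message repeated (\d+) times")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name to number of resets."""
    counts = Counter()
    last_iface = None  # set only if the previous line was a timeout
    for line in lines:
        match = WATCHDOG.search(line)
        if match:
            last_iface = match.group(1)
            counts[last_iface] += 1
            continue
        repeat = REPEATED.search(line)
        if repeat and last_iface is not None:
            # Credit the repeats to the timeout line just before.
            counts[last_iface] += int(repeat.group(1))
        last_iface = None
    return counts


sample = [
    "Nov 18 12:21:17 stargate /bsd: em0: watchdog timeout -- resetting",
    "Nov 18 12:21:17 stargate /bsd: em1: watchdog timeout -- resetting",
    "Nov 18 12:22:34 stargate unbound: [12687:1] notice: sendto failed: No buffer space available",
    "Nov 19 16:15:04 stargate /bsd: em0: watchdog timeout -- resetting",
    "Nov 19 16:19:55 stargate last message repeated 3 times",
]

for iface, n in sorted(count_watchdog_timeouts(sample).items()):
    print(iface, n)
```

The script expects one log entry per line, so entries wrapped by a mail client have to be rejoined first.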

The timeouts don't just shut down net access during the reset time;
other problems occur too. Many times the SSH server no longer accepts
connections, so shelling into the system is not an option:

$ ssh stargate
write: Connection reset by peer


I've also had a system crash that I suspect (no proof at all and
thankfully it hasn't re-occurred, but timing is everything) was caused
by the faulty em driver:

Nov  1 22:23:55 stargate /bsd: uvm_fault(0x818f9920,
0xfff7818adf60, 0, 1) -> e
Nov  1 22:23:55 stargate /bsd: fatal page fault in supervisor mode
Nov  1 22:23:55 stargate /bsd: trap type 6 code 0 rip 81329e69
cs 8 rflags 10286 cr2  fff7818adf60 cpl 7 rsp 8000221df76
0
Nov  1 22:23:55 stargate /bsd: panic: trap type 6, code=0, pc=81329e69
Nov  1 22:23:55 stargate /bsd: Starting stack trace...
Nov  1 22:23:55 stargate /bsd: panic() at panic+0x10b
Nov  1 22:23:55 stargate /bsd: trap() at trap+0x7b8
Nov  1 22:23:55 stargate /bsd: --- trap (number 6) ---
Nov  1 22:23:55 stargate /bsd: trap() at trap+0x709
Nov  1 22:23:55 stargate /bsd: --- trap (number 4) ---
Nov  1 22:23:55 stargate /bsd: trap() at trap+0x709
Nov  1 22:23:55 stargate /bsd: --- trap (number 4) ---
Nov  1 22:23:55 stargate /bsd: bpf_filter() at bpf_filter+0x19b
Nov  1 22:23:55 stargate /bsd: _bpf_mtap() at _bpf_mtap+0xf4
Nov  1 22:23:55 stargate /bsd: bpf_mtap_ether() at bpf_mtap_ether+0x39
Nov  1 22:23:55 stargate /bsd: em_start() at em_start+0xd6
Nov  1 22:23:55 stargate /bsd: nettxintr() at nettxintr+0x52
Nov  1 22:23:55 stargate /bsd: softintr_dispatch() at softintr_dispatch+0x8b
Nov  1 22:23:55 stargate /bsd: Xsoftnet() at