FWIW, I get the same behavior on GENERIC.MP, so I don't think the PPPOE_TERM_UNKNOWN_SESSIONS kernel option is causing this.
If I can provide any more information, please let me know.

-- Bryan

On 2017-09-03 05:59:51, Bryan Linton <[email protected]> wrote:
> >Synopsis: em0 loses connectivity due to low mbufs
> >Category: system
> >Environment:
> System      : OpenBSD 6.2
> Details     : OpenBSD 6.2-beta (GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS)
> #22: Wed Aug 30 19:23:17 JST 2017
>
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS
>
> Architecture: OpenBSD.amd64
> Machine     : amd64
> >Description:
>
> I've been seeing random drops in network connectivity that I've
> traced to what appears to be not enough mbufs being used. The
> issue is seen on the following em0 controller in a Thinkpad T440p:
>
> em0 at pci0 dev 25 function 0 "Intel I217-LM" rev 0x04: msi
>
> A freshly booted system will work fine, but heavily using the
> network (for example, by transferring large files over a LAN) will
> cause connectivity to drop sooner rather than later. But even
> only lightly using the network will eventually cause the issue to
> surface.
>
> Rebooting always fixes the issue. Issuing a "zzz" command will
> also fix it most of the time, but not always. Some of the time, a
> simple "ifconfig em0 down up" will fix it, but usually only once
> or twice. After that, the system must either be zzz'ed or
> rebooted.
>
> I've attached "systat mbuf" output for various states of "working"
> vs. "not working".
>
> Note how the ALIVE value drops below the LWM value when it's not
> working.
>
> FULL DISCLOSURE: I am running a kernel with the
> PPPOE_TERM_UNKNOWN_SESSIONS option set. I do not believe it
> would affect the em0 driver, but I suppose it's possible that it
> could.
>
> vvvvvvvvvvvvvvvvvvv WORKING vvvvvvvvvvvvvvvvvvvvv
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System          0   256   418            30
>                    2048    17             9
>                    2112    15             5
>                    4096   256            37
> lo0
> em0                2050    15    10   256    15
> iwm0
> enc0
> pppoe0
> pflog0
>
>
> vvvvvvv NON-WORKING (immediately after connectivity is lost) vvvvvvvvv
> (Note that ALIVE is lower than LWM and CWM is currently 145)
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System          0   256  1509           160
>                    2048    31            29
>                    2112  1036            72
>                    4096   256            41
> lo0
> em0                2050     2    10   256   145
> iwm0
> enc0
> pppoe0
> pflog0
>
>
> vvvvvvvvvvvvvvv NON-WORKING (after a minute or so) vvvvvvvvvvvvvvvvv
> (Note how the CWM has steadily increased to 256 over the last
> minute)
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System          0   256  1656           160
>                    2048    32            29
>                    2112  1179            81
>                    4096   256            41
> lo0
> em0                2050     2    10   256   256
> iwm0
> enc0
> pppoe0
> pflog0
>
>
> vvvvvvvvvvvvvvv NON-WORKING (after "ifconfig em0 down up") vvvvvvvvvvvv
> (In this instance, "ifconfig em0 down up" didn't work, but it did
> reset CWM back to LWM)
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System          0   256  1711           160
>                    2048    43            29
>                    2112  1191            82
>                    4096   256            41
> lo0
> em0                2050     1    10   256    10
> iwm0
> enc0
> pppoe0
> pflog0
>
>
> vvvvvvvvvvv NON-WORKING (after repeated "ifconfig em0 down up") vvvvvvvv
> (Now there is nothing reported for em0 at all. After getting into
> this state, dmesg showed the following lines:
>
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> em0: unable to fill any rx descriptors
> )
>
> IFACE   LIVELOCKS  SIZE ALIVE   LWM   HWM   CWM
> System          0   256  1742           160
>                    2048    54            29
>                    2112  1215            84
>                    4096   256            41
> lo0
> em0
> iwm0
> enc0
> pppoe0
> pflog0
> -----------------------------------------------
>
> I am willing to provide any additional needed information, as well
> as test any potential patches.
> Please let me know if I can
> provide any additional details.
>
>
> >How-To-Repeat:
> Saturate the network connection. Eventually, the system
> will stop receiving network data.
> >Fix:
> Temporary fix: Either issue "ifconfig em0 down up", "zzz",
> or reboot.
>
> Permanent fix: Unknown. I attempted to revert, in turn,
> the if_em* files all the way up to a Jan 23rd commit to
> see if it was due to any recent commits there, but the
> kernel panicked upon booting when the if_em* files were
> reverted to that point. I think there has been too much
> progress in the rest of the system to successfully revert
> to such a long time ago.
>
>
> dmesg:
> OpenBSD 6.2-beta (GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS) #22: Wed Aug 30
> 19:23:17 JST 2017
>
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP-PPPOE_TERM_UNKNOWN_SESSIONS
> real mem = 12539871232 (11958MB)
> avail mem = 12152803328 (11589MB)
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xbcc0d000 (67 entries)
> bios0: vendor LENOVO version "GLET85WW (2.39 )" date 09/29/2016
> bios0: LENOVO 20AWS27D00
> acpi0 at bios0: rev 2
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SLIC DBGP ECDT HPET APIC MCFG SSDT SSDT SSDT SSDT
> SSDT SSDT SSDT PCCT SSDT TCPA UEFI MSDM ASF!
> BATB FPDT UEFI DMAR
> acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) EXP3(S4) XHCI(S3)
> EHC1(S3) EHC2(S3)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiec0 at acpi0
> acpihpet0 at acpi0: 14318179 Hz
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2594.37 MHz
> cpu0:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
> cpu0: 256KB 64b/line 8-way L2 cache
> cpu0: TSC frequency 2594368320 Hz
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
> cpu0: apic clock running at 99MHz
> cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4, IBE
> cpu1 at mainbus0: apid 1 (application processor)
> cpu1: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
> cpu1:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
> cpu1: 256KB 64b/line 8-way L2 cache
> cpu1: smt 1, core 0, package 0
> cpu2 at mainbus0: apid 2 (application processor)
> cpu2: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
> cpu2:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
> cpu2: 256KB 64b/line 8-way L2 cache
> cpu2: smt 0, core 1, package 0
> cpu3 at mainbus0: apid 3 (application processor)
> cpu3: Intel(R) Core(TM) i5-4300M CPU @ 2.60GHz, 2593.99 MHz
> cpu3:
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SENSOR,ARAT
> cpu3: 256KB 64b/line 8-way L2 cache
> cpu3: smt 1, core 1, package 0
> ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 24 pins
> acpimcfg0 at acpi0 addr 0xf8000000, bus 0-63
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus -1 (PEG0)
> acpiprt2 at acpi0: bus -1 (PEG_)
> acpiprt3 at acpi0: bus 2 (EXP1)
> acpiprt4 at acpi0: bus 3 (EXP2)
> acpiprt5 at acpi0: bus -1 (EXP3)
> acpiprt6 at acpi0: bus -1 (EXP6)
> acpicpu0 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
> acpicpu1 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
> acpicpu2 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
> acpicpu3 at acpi0: C2(200@148 mwait.1@0x33), C1(1000@1 mwait.1), PSS
> acpipwrres0 at acpi0: PUBS, resource for XHCI, EHC1, EHC2
> acpipwrres1 at acpi0: NVP3, resource for PEG_
> acpipwrres2 at acpi0: NVP2, resource for PEG_
> acpitz0 at acpi0: critical temperature is 200 degC
> acpibtn0 at acpi0: LID_
> acpibtn1 at acpi0: SLPB
> "LEN0071" at acpi0 not configured
> "LEN0036" at acpi0 not configured
> "SMO1200" at acpi0 not
> configured
> acpibat0 at acpi0: BAT0 model "45N1161" serial 3584 type LION oem "LGC"
> acpiac0 at acpi0: AC unit online
> acpithinkpad0 at acpi0
> "PNP0C14" at acpi0 not configured
> "PNP0C14" at acpi0 not configured
> "PNP0C14" at acpi0 not configured
> "INT340F" at acpi0 not configured
> acpivideo0 at acpi0: VID_
> acpivout at acpivideo0 not configured
> acpivideo1 at acpi0: VID_
> cpu0: Enhanced SpeedStep 2594 MHz: speeds: 2601, 2600, 2500, 2300, 2200,
> 2100, 2000, 1800, 1700, 1600, 1400, 1300, 1200, 1100, 900, 800 MHz
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel Core 4G Host" rev 0x06
> inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4600" rev 0x06
> drm0 at inteldrm0
> inteldrm0: msi
> inteldrm0: 1920x1080, 32bpp
> wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation)
> wsdisplay0: screen 1-5 added (std, vt100 emulation)
> azalia0 at pci0 dev 3 function 0 "Intel Core 4G HD Audio" rev 0x06: msi
> xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x04: msi
> usb0 at xhci0: USB revision 3.0
> uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00
> addr 1
> "Intel 8 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
> em0 at pci0 dev 25 function 0 "Intel I217-LM" rev 0x04: msi, address
> xx:xx:xx:xx:xx:xx
> ehci0 at pci0 dev 26 function 0 "Intel 8 Series USB" rev 0x04: apic 2 int 16
> usb1 at ehci0: USB revision 2.0
> uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00
> addr 1
> azalia1 at pci0 dev 27 function 0 "Intel 8 Series HD Audio" rev 0x04: msi
> azalia1: codecs: Realtek ALC292
> audio0 at azalia1
> ppb0 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xd4: msi
> pci1 at ppb0 bus 2
> rtsx0 at pci1 dev 0 function 0 "Realtek RTS5227 Card Reader" rev 0x01: msi
> sdmmc0 at rtsx0: 4-bit
> ppb1 at pci0 dev 28 function 1 "Intel 8 Series PCIE" rev 0xd4: msi
> pci2 at ppb1 bus 3
> iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless AC 7260" rev 0x83, msi
> ehci1 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x04: apic 2 int 23
> usb2 at ehci1: USB revision 2.0
> uhub2 at usb2 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00
> addr 1
> pcib0 at pci0 dev 31 function 0 "Intel QM87 LPC" rev 0x04
> ahci0 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x04: msi, AHCI 1.3
> ahci0: port 0: 6.0Gb/s
> ahci0: port 5: 1.5Gb/s
> scsibus1 at ahci0: 32 targets
> sd0 at scsibus1 targ 0 lun 0: <ATA, Samsung SSD 850, EMT0> SCSI3 0/direct
> fixed naa.5002538d41895ee0
> sd0: 476940MB, 512 bytes/sector, 976773168 sectors, thin
> cd0 at scsibus1 targ 5 lun 0: <PLDS, DVD-RW DU8A5SH, BU51> ATAPI 5/cdrom
> removable
> ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x04: apic 2 int
> 18
> iic0 at ichiic0
> isa0 at pcib0
> isadma0 at isa0
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pms0 at pckbc0 (aux slot)
> wsmouse0 at pms0 mux 0
> wsmouse1 at pms0 mux 0
> pms0: Synaptics clickpad, firmware 8.2, 0x1e2b1 0x943300
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> vmm0 at mainbus0: VMX/EPT
> error: [drm:pid0:intel_uncore_check_errors] *ERROR* Unclaimed register before
> interrupt
> umass0 at uhub0 port 2 configuration 1 interface 0 "SHARP Corporation 305SH"
> rev 2.00/2.28 addr 2
> umass0: using SCSI over Bulk-Only
> scsibus2 at umass0: 2 targets, initiator 0
> sd1 at scsibus2 targ 1 lun 0: <SHARP, 305SH microSD, 3.14> SCSI3 0/direct
> removable serial.04dd97d5598055430410
> uhidev0 at uhub0 port 3 configuration 1 interface 0 "WiseGroup.,Ltd
> JC-PS101U" rev 1.00/2.88 addr 3
> uhidev0: iclass 3/0
> uhid0 at uhidev0: input=7, output=3, feature=0
> uhidev1 at uhub0 port 6 configuration 1 interface 0 "Logitech USB Laser
> Mouse" rev 2.00/56.01 addr 4
> uhidev1: iclass 3/1
> ums0 at uhidev1: 8 buttons, Z and W dir
> wsmouse2 at ums0 mux 0
> ugen0 at uhub0 port 7 "Validity Sensors VFS5011 Fingerprint Reader" rev
> 1.10/0.78 addr
> 5
> ugen1 at uhub0 port 11 "Intel product 0x07dc" rev 2.00/0.01 addr 6
> sdmmc0: can't enable card
> uvideo0 at uhub0 port 12 configuration 1 interface 0 "SunplusIT INC.
> Integrated Camera" rev 2.00/0.03 addr 7
> video0 at uvideo0
> umass1 at uhub0 port 16 configuration 1 interface 0 "Seagate Backup+ Desk"
> rev 3.00/3.42 addr 8
> umass1: using SCSI over Bulk-Only
> scsibus3 at umass1: 2 targets, initiator 0
> sd2 at scsibus3 targ 1 lun 0: <Seagate, Backup+ Desk, 0342> SCSI4 0/direct
> fixed
> sd2: 4769307MB, 4096 bytes/sector, 1220942645 sectors
> uhub3 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
> rev 2.00/0.04 addr 2
> uhub4 at uhub2 port 1 configuration 1 interface 0 "Intel Rate Matching Hub"
> rev 2.00/0.04 addr 2
> vscsi0 at root
> scsibus4 at vscsi0: 256 targets
> softraid0 at root
> scsibus5 at softraid0: 256 targets
>
> usbdevs:
> Controller /dev/usb0:
> addr 1: super speed, self powered, config 1, xHCI root hub(0x0000),
> Intel(0x8086), rev 1.00
>  port 1 disabled
>  port 2 disabled
>  port 3 addr 2: low speed, power 100 mA, config 1, JC-PS101U(0x8888),
> WiseGroup.,Ltd(0x0925), rev 2.88
>  port 4 disabled
>  port 5 disabled
>  port 6 addr 3: low speed, power 98 mA, config 1, USB Laser Mouse(0xc069),
> Logitech(0x046d), rev 56.01
>  port 7 addr 4: full speed, power 100 mA, config 1, VFS5011 Fingerprint
> Reader(0x0017), Validity Sensors(0x138a), rev 0.78, iSerialNumber 7f178585b00e
>  port 8 disabled
>  port 9 disabled
>  port 10 disabled
>  port 11 addr 5: full speed, self powered, config 1, product 0x07dc(0x07dc),
> Intel(0x8087), rev 0.01
>  port 12 addr 6: high speed, power 500 mA, config 1, Integrated
> Camera(0x0268), SunplusIT INC.(0x5986), rev 0.03
>  port 13 disabled
>  port 14 disabled
>  port 15 disabled
>  port 16 addr 7: super speed, self powered, config 1, Backup+ Desk(0xab31),
> Seagate(0x0bc2), rev 3.42, iSerialNumber NA7EA2SZ
> Controller /dev/usb1:
> addr 1: high speed, self powered, config 1, EHCI root hub(0x0000),
> Intel(0x8086), rev 1.00
>  port 1 addr 2: high speed, self powered, config 1, Rate Matching
> Hub(0x8008), Intel(0x8087), rev 0.04
>   port 1 powered
>   port 2 powered
>   port 3 powered
>   port 4 powered
>   port 5 powered
>   port 6 powered
>  port 2 powered
>  port 3 powered
> Controller /dev/usb2:
> addr 1: high speed, self powered, config 1, EHCI root hub(0x0000),
> Intel(0x8086), rev 1.00
>  port 1 addr 2: high speed, self powered, config 1, Rate Matching
> Hub(0x8000), Intel(0x8087), rev 0.04
>   port 1 powered
>   port 2 powered
>   port 3 powered
>   port 4 powered
>   port 5 powered
>   port 6 powered
>   port 7 powered
>   port 8 powered
>  port 2 powered
>  port 3 powered
>
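
For anyone who wants to watch for the condition described in the report (ALIVE dropping below LWM on em0), a rough Python sketch of the check follows. It is only an illustration, not something from the driver or from systat itself: parse_iface_row and rx_ring_starved are hypothetical helper names, and the parsing assumes that a populated interface row carries exactly five numbers (SIZE ALIVE LWM HWM CWM) with LIVELOCKS left blank, as in the em0 rows quoted above.

```python
# Illustrative sketch only: parse one interface row of "systat mbuf" output
# and flag the condition from the report (ALIVE below LWM).
# Assumption: LIVELOCKS is blank on interface rows, so a populated row is
# the interface name followed by SIZE ALIVE LWM HWM CWM.


def parse_iface_row(line):
    """Return a dict for a populated interface row, or None otherwise."""
    fields = line.split()
    if len(fields) != 6:
        # Header, System/pool row, or an interface with no rx ring data.
        return None
    name, numbers = fields[0], fields[1:]
    if not all(n.isdigit() for n in numbers):
        return None
    size, alive, lwm, hwm, cwm = (int(n) for n in numbers)
    return {"name": name, "size": size, "alive": alive,
            "lwm": lwm, "hwm": hwm, "cwm": cwm}


def rx_ring_starved(row):
    """True when fewer buffers are alive than the low watermark."""
    return row["alive"] < row["lwm"]


if __name__ == "__main__":
    samples = [
        "em0 2050 15 10 256 15",   # row from the WORKING capture
        "em0 2050 2 10 256 145",   # row captured right after connectivity was lost
    ]
    for line in samples:
        row = parse_iface_row(line)
        state = "starved" if rx_ring_starved(row) else "ok"
        print(row["name"], "alive", row["alive"], "lwm", row["lwm"], state)
```

Feeding it each line of captured systat output and logging a timestamp whenever rx_ring_starved returns true would show how long after boot the ring runs dry, which might help narrow down the trigger.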
