Re: [External] : 7.1-Current crash with NET_TASKQ 4 and veb interface
Hi,

I'm not a fan of this.

> +	if (span_port_pool.pr_size == 0) {
> +		pool_init(&span_port_pool, sizeof(struct veb_span_port),
> +		    0, IPL_SOFTNET, 0, "vebspl", NULL);
> +	}

Does an initialized pool consume significant resources? Why don't we do
this within vebattach()? The same applies to the `veb_rule_pool'
initialization.
Re: [External] : 7.1-Current crash with NET_TASKQ 4 and veb interface
Hello,

On Mon, May 09, 2022 at 06:01:07PM +0300, Barbaros Bilek wrote:
> Hello,
>
> I was using veb (veb+vlan+ixl) interfaces quite stable since 6.9.
> My system ran as a firewall under OpenBSD 6.9 and 7.0 quite stable.
> Also I've used 7.1 for a limited time and there were no crash.
> After OpenBSD' NET_TASKQ upgrade to 4 it crashed after 5 days.
> Here crash report and dmesg:
>
> ether_input(8520e000,fd8053616700) at ether_input+0x3ad
> vport_if_enqueue(8520e000,fd8053616700) at vport_if_enqueue+0x19
> veb_port_input(851c3800,fd806064c200,,82066600) at veb_port_input+0x4d2
> ether_input(851c3800,fd806064c200) at ether_input+0x100
> end trace frame: 0x800025575290, count: 0
> ddb{1}> show panic
> *cpu1: kernel diagnostic assertion "curcpu()->ci_schedstate.spc_smrdepth == 0" failed: file "/usr/src/sys/kern/subr_xxx.c", line 163
> ddb{1}> trace
> db_enter() at db_enter+0x10
> panic(81f22e39) at panic+0xbf
> __assert(81f96c9d,81f85ebc,a3,81fd252f) at __assert+0x25
>

The diff below attempts to fix this particular panic, which is triggered
by the veb_span() function. This is a fairly simple/straightforward change:

    we grab references to veb ports inside the SMR_READ_ section

    we keep those references in a singly linked list

    as soon as we leave the SMR_READ_ section we process the list:
        dispatch packets
        drop references to the ports

The change may uncover similar panics in other veb/bridge areas.
The diff applies to current.

thanks for testing and reporting back.
regards
sashan

8<---8<---8<--8<
diff --git a/sys/net/if_veb.c b/sys/net/if_veb.c
index 2976cc200f1..a02dbac887f 100644
--- a/sys/net/if_veb.c
+++ b/sys/net/if_veb.c
@@ -159,6 +159,11 @@ struct veb_softc {
 	struct veb_ports		 sc_spans;
 };
 
+struct veb_span_port {
+	SLIST_ENTRY(veb_span_port)	 sp_entry;
+	struct veb_port			*sp_port;
+};
+
 #define DPRINTF(_sc, fmt...)    do { \
 	if (ISSET((_sc)->sc_if.if_flags, IFF_DEBUG)) \
 		printf(fmt); \
@@ -225,6 +230,7 @@ static struct if_clone veb_cloner =
     IF_CLONE_INITIALIZER("veb", veb_clone_create, veb_clone_destroy);
 
 static struct pool	 veb_rule_pool;
+static struct pool	 span_port_pool;
 
 static int	vport_clone_create(struct if_clone *, int);
 static int	vport_clone_destroy(struct ifnet *);
@@ -266,6 +272,11 @@ veb_clone_create(struct if_clone *ifc, int unit)
 		    0, IPL_SOFTNET, 0, "vebrpl", NULL);
 	}
 
+	if (span_port_pool.pr_size == 0) {
+		pool_init(&span_port_pool, sizeof(struct veb_span_port),
+		    0, IPL_SOFTNET, 0, "vebspl", NULL);
+	}
+
 	sc = malloc(sizeof(*sc), M_DEVBUF, M_WAITOK|M_ZERO|M_CANFAIL);
 	if (sc == NULL)
 		return (ENOMEM);
@@ -352,22 +363,38 @@ veb_span(struct veb_softc *sc, struct mbuf *m0)
 	struct veb_port *p;
 	struct ifnet *ifp0;
 	struct mbuf *m;
+	struct veb_span_port *sp;
+	SLIST_HEAD(, veb_span_port) span_list;
 
+	SLIST_INIT(&span_list)
 	smr_read_enter();
 	SMR_TAILQ_FOREACH(p, &sc->sc_spans.l_list, p_entry) {
 		ifp0 = p->p_ifp0;
 		if (!ISSET(ifp0->if_flags, IFF_RUNNING))
 			continue;
 
-		m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT);
-		if (m == NULL) {
-			/* XXX count error */
-			continue;
-		}
+		sp = pool_get(&span_port_pool, PR_NOWAIT);
+		if (sp == NULL)
+			continue;	/* XXX count error */
 
-		if_enqueue(ifp0, m); /* XXX count error */
+		veb_eb_brport_take(p);
+		sp->sp_port = p;
+		SLIST_INSERT_HEAD(&span_list, sp, sp_entry);
 	}
 	smr_read_leave();
+
+	while (!SLIST_EMPTY(&span_list)) {
+		sp = SLIST_FIRST(&span_list);
+		SLIST_REMOVE_HEAD(&span_list, sp_entry);
+
+		m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT);
+		if (m != NULL)
+			if_enqueue(sp->sp_port->p_ifp0, m);
+			/* XXX count error */
+
+		veb_eb_brport_rele(sp->sp_port);
+		pool_put(&span_port_pool, sp);
+	}
 }
 
 static int
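
For readers who want to play with the idea outside the kernel, here is a
minimal userland sketch of the same "collect inside the read section,
dispatch after leaving it" pattern. It is not the veb(4) code: struct port,
struct span_ref, read_enter(), read_leave(), dispatch() and span_deferred()
are all hypothetical stand-ins, malloc() takes the place of the pool, and
the read section is modelled by a plain counter. Only the SLIST macros from
<sys/queue.h> are real.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for the kernel objects; not the real veb(4) types. */
struct port {
	int refs;	/* reference count held on the port */
	int running;	/* stand-in for IFF_RUNNING */
	int delivered;	/* packets handed to this port */
};

struct span_ref {
	SLIST_ENTRY(span_ref) entry;
	struct port *port;
};

SLIST_HEAD(span_list, span_ref);

static int read_depth;		/* models the per-CPU SMR read depth */
static int bad_dispatches;	/* dispatches made inside the read section */

static void read_enter(void) { read_depth++; }
static void read_leave(void) { read_depth--; }

/* Models if_enqueue(): it may sleep, so it must run with read_depth == 0. */
static void
dispatch(struct port *p)
{
	if (read_depth != 0)
		bad_dispatches++;
	p->delivered++;
}

/* Returns the number of ports a packet was dispatched to. */
int
span_deferred(struct port *ports, size_t nports)
{
	struct span_list list = SLIST_HEAD_INITIALIZER(list);
	struct span_ref *sr;
	size_t i;
	int sent = 0;

	/* Phase 1: inside the read section only take references. */
	read_enter();
	for (i = 0; i < nports; i++) {
		if (!ports[i].running)
			continue;
		sr = malloc(sizeof(*sr));
		if (sr == NULL)
			continue;
		ports[i].refs++;
		sr->port = &ports[i];
		SLIST_INSERT_HEAD(&list, sr, entry);
	}
	read_leave();

	/* Phase 2: outside the read section dispatch and drop references. */
	while (!SLIST_EMPTY(&list)) {
		sr = SLIST_FIRST(&list);
		SLIST_REMOVE_HEAD(&list, entry);
		dispatch(sr->port);
		sr->port->refs--;
		free(sr);
		sent++;
	}
	return sent;
}
```

The reference taken in phase 1 is what keeps a port alive between
read_leave() and the dispatch; in this toy model it is only a counter that
ends up back at its starting value.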
Re: uhid spam: uhidev_intr: bad repid 33
On 2022/05/09 20:39, Mark Kettenis wrote:
> > Date: Mon, 9 May 2022 17:44:29 +0100
> > From: Stuart Henderson
> >
> > I have a USB combi keyboard/trackpad thing which is triggering "bad
> > repid 33" frequently while attached (between a couple of times a minute,
> > and once every few minutes). It does work but it's annoying.
> >
> > Presumably this is because it has non-contiguous report IDs?
>
> That shouldn't be a problem.
>
> > Anyone have an idea how to handle it?
>
> No. But showing dmesg output might help.

Here's one (the machine I had it connected to previously had been up
for long enough that the live dmesg wasn't any help, and it wasn't
connected early enough for dmesg.boot).

OpenBSD 7.1 (GENERIC.MP) #0: Sun Apr 24 09:30:43 MDT 2022
    r...@syspatch-71-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4169531392 (3976MB)
avail mem = 4025880576 (3839MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xec240 (83 entries)
bios0: vendor Intel Corp. version "WYLPT10H.86A.0054.2019.0902.1752" date 09/02/2019
bios0: Intel Corporation D34010WYK
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT SSDT SSDT MCFG HPET SSDT SSDT DMAR CSRT
acpi0: wakeup devices RP01(S4) PXSX(S4) PXSX(S4) PXSX(S4) RP04(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) PXSX(S4) GLAN(S4) EHC1(S4) EHC2(S4) XHC_(S4) HDEF(S4) PEG0(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.37 MHz, 06-45-01
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.08 MHz, 06-45-01
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.08 MHz, 06-45-01
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i3-4010U CPU @ 1.70GHz, 1696.08 MHz, 06-45-01
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec0, version 20, 40 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus 2 (RP04)
acpiprt3 at acpi0: bus -1 (PEG0)
acpiec0 at acpi0: not present
acpipci0 at acpi0 PCI0: 0x0010 0x0011 0x
acpicmos0 at acpi0
"PNP0C14" at acpi0 not configured
acpibtn0 at acpi0: PWRB
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
acpicpu0 at acpi0: C2(500@67 mwait.1@0x10), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(500@67 mwait.1@0x10),
Re: 7.1-Current crash with NET_TASKQ 4 and veb interface
On Mon, May 09, 2022 at 06:01:07PM +0300, Barbaros Bilek wrote:
> I was using veb (veb+vlan+ixl) interfaces quite stable since 6.9.
> My system ran as a firewall under OpenBSD 6.9 and 7.0 quite stable.
> Also I've used 7.1 for a limited time and there were no crash.
> After OpenBSD' NET_TASKQ upgrade to 4 it crashed after 5 days.

To me this looks like a bug in veb(4).

> ddb{1}> trace
> db_enter() at db_enter+0x10
> panic(81f22e39) at panic+0xbf
> __assert(81f96c9d,81f85ebc,a3,81fd252f) at __assert+0x25
> assertwaitok() at assertwaitok+0xcc
> mi_switch() at mi_switch+0x40
> sleep_finish(800025574da0,1) at sleep_finish+0x10b
> rw_enter(822cfe50,1) at rw_enter+0x1cb
> pf_test(2,1,8520e000,800025575058) at pf_test+0x1088
> ip_input_if(800025575058,800025575064,4,0,8520e000) at ip_input_if+0xcd
> ipv4_input(8520e000,fd8053616700) at ipv4_input+0x39
> ether_input(8520e000,fd8053616700) at ether_input+0x3ad
> vport_if_enqueue(8520e000,fd8053616700) at vport_if_enqueue+0x19
> veb_port_input(851c3800,fd806064c200,,82066600) at veb_port_input+0x4d2
> ether_input(851c3800,fd806064c200) at ether_input+0x100
> vlan_input(8095a050,fd806064c200,8000255752bc) at vlan_input+0x23d
> ether_input(8095a050,fd806064c200) at ether_input+0x85
> if_input_process(8095a050,800025575358) at if_input_process+0x6f
> ifiq_process(8095a460) at ifiq_process+0x69
> taskq_thread(80035080) at taskq_thread+0x100

veb_port_input -> veb_broadcast -> smr_read_enter; tp->p_enqueue ->
vport_if_enqueue -> if_vinput -> ifp->if_input -> ether_input ->
ipv4_input -> ip_input_if -> pf_test -> PF_LOCK -> rw_enter_write()

After calling smr_read_enter(), sleeping is not allowed according to the
man page. pf sleeps because it uses a read-write lock. It looks like we
have some contention on the pf lock. With more forwarding threads,
sleeping in pf is more likely.

> __mp_lock(823d986c) at __mp_lock+0x72
> wakeup_n(822cfe50,) at wakeup_n+0x32
> pf_test(2,2,80948050,80002557b300) at pf_test+0x11f6
> pf_route(80002557b388,fd89fb379938) at pf_route+0x1f6
> pf_test(2,1,80924050,80002557b598) at pf_test+0xa1f
> ip_input_if(80002557b598,80002557b5a4,4,0,80924050) at ip_input_if+0xcd
> ipv4_input(80924050,fd8053540f00) at ipv4_input+0x39
> ether_input(80924050,fd8053540f00) at ether_input+0x3ad
> if_input_process(80924050,80002557b688) at if_input_process+0x6f
> ifiq_process(80926500) at ifiq_process+0x69
> taskq_thread(80035100) at taskq_thread+0x100

> __mp_lock(823d986c) at __mp_lock+0x72
> wakeup_n(822cfe50,) at wakeup_n+0x32
> pf_test(2,2,80948050,80002557b300) at pf_test+0x11f6
> pf_route(80002557b388,fd89fb379938) at pf_route+0x1f6
> pf_test(2,1,80924050,80002557b598) at pf_test+0xa1f
> ip_input_if(80002557b598,80002557b5a4,4,0,80924050) at ip_input_if+0xcd
> ipv4_input(80924050,fd8053540f00) at ipv4_input+0x39
> ether_input(80924050,fd8053540f00) at ether_input+0x3ad
> if_input_process(80924050,80002557b688) at if_input_process+0x6f
> ifiq_process(80926500) at ifiq_process+0x69
> taskq_thread(80035100) at taskq_thread+0x100

Can some veb or smr hacker explain how this is supposed to work?

Sleeping in pf is also not ideal as it is in the hot path and slows
down packets. But that is not easy to fix as we have to refactor the
memory allocations before converting the pf lock to a mutex. sashan@ is
working on that.

bluhm
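
To make the "no sleeping after smr_read_enter" rule concrete, here is a toy
userland model of the check that fired. Nothing in it is kernel code:
smr_depth, model_read_enter(), model_read_leave(), model_waitok() and
model_input() are hypothetical stand-ins for the per-CPU depth counter, the
SMR read section, assertwaitok() and a sleeping input path such as
pf_test() taking its lock. The model counts violations where the kernel
would panic.

```c
#include <assert.h>

static int smr_depth;	/* hypothetical stand-in for spc_smrdepth */
static int violations;	/* how often a sleep was attempted at depth != 0 */

static void model_read_enter(void) { smr_depth++; }
static void model_read_leave(void) { smr_depth--; }

/* Stand-in for assertwaitok(): records a violation instead of panicking. */
static int
model_waitok(void)
{
	if (smr_depth != 0) {
		violations++;
		return 0;
	}
	return 1;
}

/* Stand-in for an input path that may sleep on a lock. */
static int
model_input(void)
{
	return model_waitok();
}

/* Calls the input path from inside the read section: trips the check. */
int
broadcast_inside(void)
{
	int ok;

	model_read_enter();
	ok = model_input();
	model_read_leave();
	return ok;
}

/* Only walks the list inside the section and calls the input path after. */
int
broadcast_outside(void)
{
	model_read_enter();
	/* a real implementation would only collect what it needs here */
	model_read_leave();
	return model_input();
}
```

broadcast_inside() mirrors the shape of the call chain in the trace, while
broadcast_outside() mirrors the direction a fix has to take.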
Re: uhid spam: uhidev_intr: bad repid 33
> Date: Mon, 9 May 2022 17:44:29 +0100
> From: Stuart Henderson
>
> I have a USB combi keyboard/trackpad thing which is triggering "bad
> repid 33" frequently while attached (between a couple of times a minute,
> and once every few minutes). It does work but it's annoying.
>
> Presumably this is because it has non-contiguous report IDs?

That shouldn't be a problem.

> Anyone have an idea how to handle it?

No. But showing dmesg output might help.

> Bus 000 Device 002: ID 045e:0800 Microsoft Corp.
> Device Descriptor:
>   bLength 18
>   bDescriptorType 1
>   bcdUSB 2.00
>   bDeviceClass 0 (Defined at Interface level)
>   bDeviceSubClass 0
>   bDeviceProtocol 0
>   bMaxPacketSize0 64
>   idVendor 0x045e Microsoft Corp.
>   idProduct 0x0800
>   bcdDevice 9.44
>   iManufacturer 1 Microsoft
>   iProduct 2 Microsoft? Nano Transceiver v2.0
>   iSerial 0
>   bNumConfigurations 1
>   Configuration Descriptor:
>     bLength 9
>     bDescriptorType 2
>     wTotalLength 84
>     bNumInterfaces 3
>     bConfigurationValue 1
>     iConfiguration 0
>     bmAttributes 0xa0
>       (Bus Powered)
>       Remote Wakeup
>     MaxPower 100mA
>     Interface Descriptor:
>       bLength 9
>       bDescriptorType 4
>       bInterfaceNumber 0
>       bAlternateSetting 0
>       bNumEndpoints 1
>       bInterfaceClass 3 Human Interface Device
>       bInterfaceSubClass 1 Boot Interface Subclass
>       bInterfaceProtocol 1 Keyboard
>       iInterface 0
>         HID Device Descriptor:
>           bLength 9
>           bDescriptorType 33
>           bcdHID 1.11
>           bCountryCode 0 Not supported
>           bNumDescriptors 1
>           bDescriptorType 34 Report
>           wDescriptorLength 57
>           Report Descriptor: (length is 57)
>             Item(Global): Usage Page, data= [ 0x01 ] 1
>                     Generic Desktop Controls
>             Item(Local ): Usage, data= [ 0x06 ] 6
>                     Keyboard
>             Item(Main ): Collection, data= [ 0x01 ] 1
>                     Application
>             Item(Global): Usage Page, data= [ 0x08 ] 8
>                     LEDs
>             Item(Local ): Usage Minimum, data= [ 0x01 ] 1
>                     NumLock
>             Item(Local ): Usage Maximum, data= [ 0x03 ] 3
>                     Scroll Lock
>             Item(Global): Logical Minimum, data= [ 0x00 ] 0
>             Item(Global): Logical Maximum, data= [ 0x01 ] 1
>             Item(Global): Report Size, data= [ 0x01 ] 1
>             Item(Global): Report Count, data= [ 0x03 ] 3
>             Item(Main ): Output, data= [ 0x02 ] 2
>                     Data Variable Absolute No_Wrap Linear
>                     Preferred_State No_Null_Position Non_Volatile Bitfield
>             Item(Global): Report Count, data= [ 0x05 ] 5
>             Item(Main ): Output, data= [ 0x01 ] 1
>                     Constant Array Absolute No_Wrap Linear
>                     Preferred_State No_Null_Position Non_Volatile Bitfield
>             Item(Global): Usage Page, data= [ 0x07 ] 7
>                     Keyboard
>             Item(Local ): Usage Minimum, data= [ 0xe0 0x00 ] 224
>                     Control Left
>             Item(Local ): Usage Maximum, data= [ 0xe7 0x00 ] 231
>                     GUI Right
>             Item(Global): Report Count, data= [ 0x08 ] 8
>             Item(Main ): Input, data= [ 0x02 ] 2
>                     Data Variable Absolute No_Wrap Linear
>                     Preferred_State No_Null_Position Non_Volatile Bitfield
>             Item(Global): Report Size, data= [ 0x08 ] 8
>             Item(Global): Report Count, data= [ 0x01 ] 1
>             Item(Main ): Input, data= [ 0x01 ] 1
>                     Constant Array Absolute No_Wrap Linear
>                     Preferred_State No_Null_Position Non_Volatile Bitfield
>             Item(Local ): Usage Minimum, data= [ 0x00 ] 0
>                     No Event
>             Item(Local ): Usage Maximum, data= [ 0x91 0x00 ] 145
>                     LANG 2 (Hanja Conversion, Korea)
>             Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255
>             Item(Global): Report Count, data= [ 0x06 ] 6
>             Item(Main ): Input, data= [ 0x00 ] 0
>                     Data Array Absolute No_Wrap Linear
>                     Preferred_State No_Null_Position Non_Volatile Bitfield
>             Item(Main ): End Collection, data=none
>       Endpoint Descriptor:
Re: 'less -F' broken?
Oh! I completely missed -X. Adding that fixed it. I'm running with LESS=-aicFX and all is well under TERM=xterm now. Thanks! --lyndon
Re: [External] : 7.1-Current crash with NET_TASKQ 4 and veb interface
Hello Barbaros,

thank you for testing and the excellent report.

> ddb{1}> trace
> db_enter() at db_enter+0x10
> panic(81f22e39) at panic+0xbf
> __assert(81f96c9d,81f85ebc,a3,81fd252f) at __assert+0x25
> assertwaitok() at assertwaitok+0xcc
> mi_switch() at mi_switch+0x40

the assert indicates we attempt to sleep inside an SMR section,
which must be avoided.

> sleep_finish(800025574da0,1) at sleep_finish+0x10b
> rw_enter(822cfe50,1) at rw_enter+0x1cb
> pf_test(2,1,8520e000,800025575058) at pf_test+0x1088
> ip_input_if(800025575058,800025575064,4,0,8520e000) at ip_input_if+0xcd
> ipv4_input(8520e000,fd8053616700) at ipv4_input+0x39
> ether_input(8520e000,fd8053616700) at ether_input+0x3ad
> vport_if_enqueue(8520e000,fd8053616700) at vport_if_enqueue+0x19
> veb_port_input(851c3800,fd806064c200,,82066600) at veb_port_input+0x4d2
> ether_input(851c3800,fd806064c200) at ether_input+0x100
> vlan_input(8095a050,fd806064c200,8000255752bc) at vlan_input+0x23d
> ether_input(8095a050,fd806064c200) at ether_input+0x85
> if_input_process(8095a050,800025575358) at if_input_process+0x6f
> ifiq_process(8095a460) at ifiq_process+0x69
> taskq_thread(80035080) at taskq_thread+0x100

above is the call stack which has done the bad thing (sleeping inside an
SMR section). in my opinion the primary suspect is veb_port_input(),
whose code reads as follows:

   966	static struct mbuf *
   967	veb_port_input(struct ifnet *ifp0, struct mbuf *m, uint64_t dst, void *brport)
   968	{
   969		struct veb_port *p = brport;
   970		struct veb_softc *sc = p->p_veb;
   971		struct ifnet *ifp = &sc->sc_if;
   972		struct ether_header *eh;
	...
  1021		counters_pkt(ifp->if_counters, ifc_ipackets, ifc_ibytes,
  1022		    m->m_pkthdr.len);
  1023
  1024		/* force packets into the one routing domain for pf */
  1025		m->m_pkthdr.ph_rtableid = ifp->if_rdomain;
  1026
  1027	#if NBPFILTER > 0
  1028		if_bpf = READ_ONCE(ifp->if_bpf);
  1029		if (if_bpf != NULL) {
  1030			if (bpf_mtap_ether(if_bpf, m, 0) != 0)
  1031				goto drop;
  1032		}
  1033	#endif
  1034
  1035		veb_span(sc, m);
  1036
  1037		if (ISSET(p->p_bif_flags, IFBIF_BLOCKNONIP) &&
  1038		    veb_ip_filter(m))
  1039			goto drop;
  1040
  1041		if (!ISSET(ifp->if_flags, IFF_LINK0) &&
  1042		    veb_vlan_filter(m))
  1043			goto drop;
  1044
  1045		if (veb_rule_filter(p, VEB_RULE_LIST_IN, m, src, dst))
  1046			goto drop;

the call to veb_span() at line 1035 seems to be our guy/culprit (in my
opinion):

   356		smr_read_enter();
   357		SMR_TAILQ_FOREACH(p, &sc->sc_spans.l_list, p_entry) {
   358			ifp0 = p->p_ifp0;
   359			if (!ISSET(ifp0->if_flags, IFF_RUNNING))
   360				continue;
   361
   362			m = m_dup_pkt(m0, max_linkhdr + ETHER_ALIGN, M_NOWAIT);
   363			if (m == NULL) {
   364				/* XXX count error */
   365				continue;
   366			}
   367
   368			if_enqueue(ifp0, m); /* XXX count error */
   369		}
   370		smr_read_leave();

the loop above comes from veb_span(), which calls if_enqueue() from within
an SMR section. Line 368 calls here:

  2191	static int
  2192	vport_if_enqueue(struct ifnet *ifp, struct mbuf *m)
  2193	{
  2194		/*
  2195		 * switching an l2 packet toward a vport means pushing it
  2196		 * into the network stack. this function exists to make
  2197		 * if_vinput compat with veb calling if_enqueue.
  2198		 */
  2199
  2200		if_vinput(ifp, m);
  2201
  2202		return (0);
  2203	}

which in turn calls if_vinput(), which calls further down into the IP
stack, and the IP stack may sleep. We must change veb_span() so that calls
to if_vinput() happen outside of the SMR section.

I don't have such a complex setup with vlans and virtual ports. I'll try
to cook up a diff and pass it to you for testing.

thanks again for coming back to us with the report.

regards
sashan
Re: 'less -F' broken?
On Mon, 09 May 2022 00:37:58 -0700
"Lyndon Nerenberg (VE7TFX/VE6BBM)" wrote:

> It seems that the -F flag to less is broken. Instead of reverting to
> cat-like behaviour on short files, it prints nothing at all.
>
> : lyndon@orthanc:/home/lyndon; cat typescript
> Script started on Mon May 9 00:22:41 2022
> : lyndon@orthanc:/home/lyndon; seq 10 > numbers
> : lyndon@orthanc:/home/lyndon; cat numbers
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> : lyndon@orthanc:/home/lyndon; less -F numbers
> : lyndon@orthanc:/home/lyndon; ^D
>
> Script done on Mon May 9 00:22:58 2022
>
> OpenBSD orthanc.ca 7.1 GENERIC.MP#465 amd64
>
> This is with TERM=xterm. If I set TERM=vt100, it works. This seems
> to relate to some long-standing oddities I've noticed with the
> 'xterm' terminfo definition OpenBSD uses. In particular, programs
> like man always switch to the alternate screen buffer when displaying
> output, then switch back afterwards. I find this very annoying so
> I always compile my own terminfo definition for 'xterm', which works
> everyplace else but not on OpenBSD.
>
> I've tried poking into this a few times, but I just don't have the
> energy to dive into the guts of curses. Has anyone else run into
> this, and maybe have some suggestions on where to start digging?
>
> Here's the terminfo definition I use, FWIW:
>
> # An xterm without the internal screen memory buffer. This variant
> # does not save/restore the screen when running termcap based applications.
> # This means the man page you were reading doesn't disappear from the screen
> # when you quit the pager.
> xterm|vs100|xterm terminal emulator,
> 	am, xenl, km, mir, msgr,
> 	cols#80, it#8, lines#65,
> 	acsc=``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
> 	bel=^G, cr=^M, csr=\E[%i%p1%d;%p2%dr, tbc=\E[3g,
> 	clear=\E[H\E[2J, el1=\E[1K$<3>, el=\E[K, ed=\E[J,
> 	cup=\E[%i%p1%d;%p2%dH, cud1=^J, home=\E[H, cub1=^H,
> 	cuf1=\E[C, cuu1=\E[A, dch1=\E[P, dl1=\E[M, enacs=\E(B\E)0,
> 	smacs=^N, blink=\E[5m, bold=\E[1m, rev=\E[7m, smso=\E[7m,
> 	smul=\E[4m, rmacs=^O, sgr0=\E[m, rmso=\E[m, rmul=\E[m,
> 	ich1=\E[@, il1=\E[L, ka1=\EOq, ka3=\EOs, kb2=\EOr, kbs=^H,
> 	kc1=\EOp, kc3=\EOn, kcud1=\EOB, kent=\EOM, kf0=\E[21~,
> 	kf1=\E[11~, kf10=\EOx, kf2=\E[12~, kf3=\E[13~, kf4=\E[14~,
> 	kf5=\E[15~, kf6=\E[17~, kf7=\E[18~, kf8=\E[19~, kf9=\E[20~,
> 	kcub1=\EOD, kcuf1=\EOC, kcuu1=\EOA, rmkx=\E[?1l\E>,
> 	smkx=\E[?1h\E=, dch=\E[%p1%dP, dl=\E[%p1%dM,
> 	cud=\E[%p1%dB, ich=\E[%p1%d@, il=\E[%p1%dL,
> 	cub=\E[%p1%dD, cuf=\E[%p1%dC, cuu=\E[%p1%dA,
> 	rs1=\E>\E[1;3;4;5;6l\E[?7h\E[m\E[r\E[2J\E[H, rs2=@,
> 	rc=\E8, sc=\E7, ind=^J, ri=\EM,
> 	sgr=\E[0%?%p1%p6%|%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;m%?%p9%t\016%e\017%;,
> 	hts=\EH, ht=^I,
>
> --lyndon

Just replying to confirm this bug appears for me too.

Using tmux under xterm makes $TERM equal to "screen", and running
`less -F` on a file that fits on less than one screen just makes xterm
flicker, with no output... but if I do `less -F foo > bar`, then bar has
the contents of foo... so it's still outputting to stdout, presumably.

If I manually set $TERM to `xterm` I get no output or flickering, but
`less -F foo > bar` exhibits the same behaviour as above. TERM=vt100
works for me, likewise.

This is on a machine recently updated to OpenBSD 7.1, if it matters.
uhid spam: uhidev_intr: bad repid 33
I have a USB combi keyboard/trackpad thing which is triggering "bad
repid 33" frequently while attached (between a couple of times a minute,
and once every few minutes). It does work but it's annoying.

Presumably this is because it has non-contiguous report IDs?
Anyone have an idea how to handle it?

Bus 000 Device 002: ID 045e:0800 Microsoft Corp.
Device Descriptor:
  bLength 18
  bDescriptorType 1
  bcdUSB 2.00
  bDeviceClass 0 (Defined at Interface level)
  bDeviceSubClass 0
  bDeviceProtocol 0
  bMaxPacketSize0 64
  idVendor 0x045e Microsoft Corp.
  idProduct 0x0800
  bcdDevice 9.44
  iManufacturer 1 Microsoft
  iProduct 2 Microsoft? Nano Transceiver v2.0
  iSerial 0
  bNumConfigurations 1
  Configuration Descriptor:
    bLength 9
    bDescriptorType 2
    wTotalLength 84
    bNumInterfaces 3
    bConfigurationValue 1
    iConfiguration 0
    bmAttributes 0xa0
      (Bus Powered)
      Remote Wakeup
    MaxPower 100mA
    Interface Descriptor:
      bLength 9
      bDescriptorType 4
      bInterfaceNumber 0
      bAlternateSetting 0
      bNumEndpoints 1
      bInterfaceClass 3 Human Interface Device
      bInterfaceSubClass 1 Boot Interface Subclass
      bInterfaceProtocol 1 Keyboard
      iInterface 0
        HID Device Descriptor:
          bLength 9
          bDescriptorType 33
          bcdHID 1.11
          bCountryCode 0 Not supported
          bNumDescriptors 1
          bDescriptorType 34 Report
          wDescriptorLength 57
          Report Descriptor: (length is 57)
            Item(Global): Usage Page, data= [ 0x01 ] 1
                    Generic Desktop Controls
            Item(Local ): Usage, data= [ 0x06 ] 6
                    Keyboard
            Item(Main ): Collection, data= [ 0x01 ] 1
                    Application
            Item(Global): Usage Page, data= [ 0x08 ] 8
                    LEDs
            Item(Local ): Usage Minimum, data= [ 0x01 ] 1
                    NumLock
            Item(Local ): Usage Maximum, data= [ 0x03 ] 3
                    Scroll Lock
            Item(Global): Logical Minimum, data= [ 0x00 ] 0
            Item(Global): Logical Maximum, data= [ 0x01 ] 1
            Item(Global): Report Size, data= [ 0x01 ] 1
            Item(Global): Report Count, data= [ 0x03 ] 3
            Item(Main ): Output, data= [ 0x02 ] 2
                    Data Variable Absolute No_Wrap Linear
                    Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Global): Report Count, data= [ 0x05 ] 5
            Item(Main ): Output, data= [ 0x01 ] 1
                    Constant Array Absolute No_Wrap Linear
                    Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Global): Usage Page, data= [ 0x07 ] 7
                    Keyboard
            Item(Local ): Usage Minimum, data= [ 0xe0 0x00 ] 224
                    Control Left
            Item(Local ): Usage Maximum, data= [ 0xe7 0x00 ] 231
                    GUI Right
            Item(Global): Report Count, data= [ 0x08 ] 8
            Item(Main ): Input, data= [ 0x02 ] 2
                    Data Variable Absolute No_Wrap Linear
                    Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Global): Report Size, data= [ 0x08 ] 8
            Item(Global): Report Count, data= [ 0x01 ] 1
            Item(Main ): Input, data= [ 0x01 ] 1
                    Constant Array Absolute No_Wrap Linear
                    Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Local ): Usage Minimum, data= [ 0x00 ] 0
                    No Event
            Item(Local ): Usage Maximum, data= [ 0x91 0x00 ] 145
                    LANG 2 (Hanja Conversion, Korea)
            Item(Global): Logical Maximum, data= [ 0xff 0x00 ] 255
            Item(Global): Report Count, data= [ 0x06 ] 6
            Item(Main ): Input, data= [ 0x00 ] 0
                    Data Array Absolute No_Wrap Linear
                    Preferred_State No_Null_Position Non_Volatile Bitfield
            Item(Main ): End Collection, data=none
      Endpoint Descriptor:
        bLength 7
        bDescriptorType 5
        bEndpointAddress 0x81 EP 1 IN
        bmAttributes 3
          Transfer Type Interrupt
          Synch Type None
          Usage Type Data
        wMaxPacketSize 0x0008 1x 8 bytes
        bInterval 4
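
One way to check which report IDs a device declares is to walk the raw
report descriptor bytes and pick out the Report ID items. The following is
a small sketch of that, assuming the short/long item encoding from the USB
HID 1.11 specification (prefix byte: size in bits 0-1, type in bits 2-3,
tag in bits 4-7; Report ID is a Global item with prefix 0x85). The function
hid_collect_report_ids() and the sample_desc bytes are made up for
illustration; the sample is not the descriptor of the device above, whose
keyboard interface shows no Report ID items in the dump.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sample: two application collections declaring report
 * IDs 1 and 33 (0x21). Not taken from the device in this thread.
 */
static const uint8_t sample_desc[] = {
	0x05, 0x01,	/* Usage Page (Generic Desktop) */
	0x09, 0x06,	/* Usage (Keyboard) */
	0xa1, 0x01,	/* Collection (Application) */
	0x85, 0x01,	/*   Report ID (1) */
	0xc0,		/* End Collection */
	0x05, 0x0c,	/* Usage Page (Consumer) */
	0x09, 0x01,	/* Usage (Consumer Control) */
	0xa1, 0x01,	/* Collection (Application) */
	0x85, 0x21,	/*   Report ID (33) */
	0xc0		/* End Collection */
};

/*
 * Walk a HID report descriptor and store each distinct Report ID in ids[].
 * Returns the number of IDs stored (at most max_ids).
 */
size_t
hid_collect_report_ids(const uint8_t *desc, size_t len, uint8_t *ids,
    size_t max_ids)
{
	static const size_t sizes[4] = { 0, 1, 2, 4 };
	size_t i = 0, n = 0, k, dlen;
	uint8_t prefix, id;

	while (i < len) {
		prefix = desc[i++];
		if (prefix == 0xfe) {
			/* long item: data size byte, long tag byte, data */
			if (len - i < 2)
				break;
			dlen = desc[i];
			i += 2;
		} else {
			dlen = sizes[prefix & 0x03];
		}
		if (dlen > len - i)
			break;	/* truncated descriptor */

		/* Global item with tag 8 is Report ID; ignore the size bits */
		if (prefix != 0xfe && (prefix & 0xfc) == 0x84 && dlen >= 1) {
			id = desc[i];
			for (k = 0; k < n; k++)
				if (ids[k] == id)
					break;
			if (k == n && n < max_ids)
				ids[n++] = id;
		}
		i += dlen;
	}
	return n;
}
```

Feeding it the raw descriptor of each HID interface on the device would
show whether ID 33 is declared anywhere; whether that explains the message
is a separate question.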
7.1-Current crash with NET_TASKQ 4 and veb interface
Hello, I was using veb (veb+vlan+ixl) interfaces quite stable since 6.9. My system ran as a firewall under OpenBSD 6.9 and 7.0 quite stable. Also I've used 7.1 for a limited time and there were no crash. After OpenBSD' NET_TASKQ upgrade to 4 it crashed after 5 days. Here crash report and dmesg: ether_input(8520e000,fd8053616700) at ether_input+0x3ad vport_if_enqueue(8520e000,fd8053616700) at vport_if_enqueue+0x19 veb_port_input(851c3800,fd806064c200,,82066600) at veb_port_input+0x4d2 ether_input(851c3800,fd806064c200) at ether_input+0x100 end trace frame: 0x800025575290, count: 0 https://www.openbsd.org/ddb.html describes the minimum info required in bug reports. Insufficient info makes it difficult to find and fix bugs. ddb{1}> show panic *cpu1: kernel diagnostic assertion "curcpu()->ci_schedstate.spc_smrdepth == 0" f ailed: file "/usr/src/sys/kern/subr_xxx.c", line 163 ddb{1}> trace db_enter() at db_enter+0x10 panic(81f22e39) at panic+0xbf __assert(81f96c9d,81f85ebc,a3,81fd252f) at __assert+0x2 5 assertwaitok() at assertwaitok+0xcc mi_switch() at mi_switch+0x40 sleep_finish(800025574da0,1) at sleep_finish+0x10b rw_enter(822cfe50,1) at rw_enter+0x1cb pf_test(2,1,8520e000,800025575058) at pf_test+0x1088 ip_input_if(800025575058,800025575064,4,0,8520e000) at ip_input _if+0xcd ipv4_input(8520e000,fd8053616700) at ipv4_input+0x39 ether_input(8520e000,fd8053616700) at ether_input+0x3ad vport_if_enqueue(8520e000,fd8053616700) at vport_if_enqueue+0x19 veb_port_input(851c3800,fd806064c200,,82066600) at veb_port_input+0x4d2 ether_input(851c3800,fd806064c200) at ether_input+0x100 vlan_input(8095a050,fd806064c200,8000255752bc) at vlan_input+0x 23d ether_input(8095a050,fd806064c200) at ether_input+0x85 if_input_process(8095a050,800025575358) at if_input_process+0x6f ifiq_process(8095a460) at ifiq_process+0x69 taskq_thread(80035080) at taskq_thread+0x100 end trace frame: 0x0, count: -19 ddb{1}> ps /o TIDPIDUID PRFLAGS PFLAGS CPU COMMAND 422021 80579 0 0x2 07 ifconfig 292011 
89065020x12 0x4008 mariadbd 427181 89065020x12 0x4006K mariadbd 86788 89065020x12 0x4003 mariadbd 302453 98158 0 0x14000 0x2009 softnet 88346 66890 0 0x14000 0x2005 softnet ddb{1}> machine ddbcpu 2 Stopped at x86_ipi_db+0x12:leave x86_ipi_db(80001d1c3ff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 acpicpu_idle() at acpicpu_idle+0x203 sched_idle(80001d1c3ff0) at sched_idle+0x280 end trace frame: 0x0, count: 10 ddb{2}> trace x86_ipi_db(80001d1c3ff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 acpicpu_idle() at acpicpu_idle+0x203 sched_idle(80001d1c3ff0) at sched_idle+0x280 end trace frame: 0x0, count: -5 ddb{2}> machine ddbcpu 2 Invalid cpu 2 ddb{2}> t[A[A Bad character x86_ipi_db(80001d1c3ff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 acpicpu_idle() at acpicpu_idle+0x203 sched_idle(80001d1c3ff0) at sched_idle+0x280 end trace frame: 0x0, count: -5 ddb{2}> machine ddbcpu 3 Stopped at x86_ipi_db+0x12:leave x86_ipi_db(80001d1ccff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 __mp_lock(823d986c) at __mp_lock+0x72 wakeup_n(8000fffeca88,1) at wakeup_n+0x32 futex_requeue(c13fb4a32e0,1,0,0,2) at futex_requeue+0xe4 sys_futex(8000fffc2008,8000265ca780,8000265ca7d0) at sys_futex+0xe6 syscall(8000265ca840) at syscall+0x374 Xsyscall() at Xsyscall+0x128 end of kernel end trace frame: 0xc13f5b4b090, count: 6 ddb{3}> trace x86_ipi_db(80001d1ccff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 __mp_lock(823d986c) at __mp_lock+0x72 wakeup_n(8000fffeca88,1) at wakeup_n+0x32 futex_requeue(c13fb4a32e0,1,0,0,2) at futex_requeue+0xe4 sys_futex(8000fffc2008,8000265ca780,8000265ca7d0) at sys_futex+0xe6 syscall(8000265ca840) at syscall+0x374 Xsyscall() at Xsyscall+0x128 end of 
kernel end trace frame: 0xc13f5b4b090, count: -9 ddb{3}> machine ddbcpu 4 Stopped at x86_ipi_db+0x12:leave x86_ipi_db(80001d1d5ff0) at x86_ipi_db+0x12 x86_ipi_handler() at x86_ipi_handler+0x80 Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23 acpicpu_idle() at acpicpu_idle+0x203 sched_idle(80001d1d5ff0) at
Re: 'less -F' broken?
On 2022/05/09 00:37, Lyndon Nerenberg (VE7TFX/VE6BBM) wrote:
> It seems that the -F flag to less is broken. Instead of reverting to
> cat-like behaviour on short files, it prints nothing at all.

You'll probably be happier with at least LESS=Xcq in the environment.
(I use MXciq).

> This is with TERM=xterm. If I set TERM=vt100, it works. This seems
> to relate to some long-standing oddities I've noticed with the
> 'xterm' terminfo definition OpenBSD uses. In particular, programs
> like man always switch to the alternate screen buffer when displaying
> output, then switch back afterwards. I find this very annoying so

Me too.
'less -F' broken?
It seems that the -F flag to less is broken. Instead of reverting to
cat-like behaviour on short files, it prints nothing at all.

: lyndon@orthanc:/home/lyndon; cat typescript
Script started on Mon May 9 00:22:41 2022
: lyndon@orthanc:/home/lyndon; seq 10 > numbers
: lyndon@orthanc:/home/lyndon; cat numbers
1
2
3
4
5
6
7
8
9
10
: lyndon@orthanc:/home/lyndon; less -F numbers
: lyndon@orthanc:/home/lyndon; ^D

Script done on Mon May 9 00:22:58 2022

OpenBSD orthanc.ca 7.1 GENERIC.MP#465 amd64

This is with TERM=xterm. If I set TERM=vt100, it works. This seems
to relate to some long-standing oddities I've noticed with the
'xterm' terminfo definition OpenBSD uses. In particular, programs
like man always switch to the alternate screen buffer when displaying
output, then switch back afterwards. I find this very annoying so
I always compile my own terminfo definition for 'xterm', which works
everyplace else but not on OpenBSD.

I've tried poking into this a few times, but I just don't have the
energy to dive into the guts of curses. Has anyone else run into
this, and maybe have some suggestions on where to start digging?

Here's the terminfo definition I use, FWIW:

# An xterm without the internal screen memory buffer. This variant
# does not save/restore the screen when running termcap based applications.
# This means the man page you were reading doesn't disappear from the screen
# when you quit the pager.
xterm|vs100|xterm terminal emulator,
	am, xenl, km, mir, msgr,
	cols#80, it#8, lines#65,
	acsc=``aaffggjjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
	bel=^G, cr=^M, csr=\E[%i%p1%d;%p2%dr, tbc=\E[3g,
	clear=\E[H\E[2J, el1=\E[1K$<3>, el=\E[K, ed=\E[J,
	cup=\E[%i%p1%d;%p2%dH, cud1=^J, home=\E[H, cub1=^H,
	cuf1=\E[C, cuu1=\E[A, dch1=\E[P, dl1=\E[M, enacs=\E(B\E)0,
	smacs=^N, blink=\E[5m, bold=\E[1m, rev=\E[7m, smso=\E[7m,
	smul=\E[4m, rmacs=^O, sgr0=\E[m, rmso=\E[m, rmul=\E[m,
	ich1=\E[@, il1=\E[L, ka1=\EOq, ka3=\EOs, kb2=\EOr, kbs=^H,
	kc1=\EOp, kc3=\EOn, kcud1=\EOB, kent=\EOM, kf0=\E[21~,
	kf1=\E[11~, kf10=\EOx, kf2=\E[12~, kf3=\E[13~, kf4=\E[14~,
	kf5=\E[15~, kf6=\E[17~, kf7=\E[18~, kf8=\E[19~, kf9=\E[20~,
	kcub1=\EOD, kcuf1=\EOC, kcuu1=\EOA, rmkx=\E[?1l\E>,
	smkx=\E[?1h\E=, dch=\E[%p1%dP, dl=\E[%p1%dM,
	cud=\E[%p1%dB, ich=\E[%p1%d@, il=\E[%p1%dL,
	cub=\E[%p1%dD, cuf=\E[%p1%dC, cuu=\E[%p1%dA,
	rs1=\E>\E[1;3;4;5;6l\E[?7h\E[m\E[r\E[2J\E[H, rs2=@,
	rc=\E8, sc=\E7, ind=^J, ri=\EM,
	sgr=\E[0%?%p1%p6%|%t;1%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;m%?%p9%t\016%e\017%;,
	hts=\EH, ht=^I,

--lyndon