Re: Panic with TMPFS
Hi, Marko

>I have good experience with mfs, if you just need memory file system,
>use that instead.

MFS does not free memory when you delete a file, but TMPFS does.
Also, TMPFS can automatically set its capacity with the "-s=0" option,
while MFS cannot. This is why I used tmpfs until 7.0.

>If you need TMPFS fixed, well, I guess someone much more knowledgeable
>than me would need to look into it.

That's what I would do as well.

Thank you.

Yoshihiro Kawamata
http://fuguita.org/
Re: vte(4): restore MDC clock speed register value after MAC reset
On Tue, Apr 19, 2022 at 11:33:04AM +0300, Andrius V wrote:
>
> Hi Kevin,
>
> > Thanks for the analysis, I committed your diff.
>
> Thank you for applying the patch!
>
> > For some reason, ethernet and usb devices never get interrupts, I have to
> > disable both acpimadt and mpbios in the kernel to make it work.
> > Does your machine also have this problem? Thanks.
>
> Yes, there are some problems with level type interrupts on Vortex86DX3
> machines, which seems to be the same issue on OpenBSD and NetBSD (edge
> type interrupts work on the other hand like for storage or serial
> interface). FreeBSD 12.0 (the last one I managed to boot without
> modifications) and Linux doesn't seem to have this issue (unless it
> somehow ignored) and devices work without visible issues. I have
> registered it as https://gnats.netbsd.org/53894 on NetBSD for that and
> I did some investigation to find out the cause, unfortunately without
> success so far. Thus, neither USB, nor network (and probably audio)
> doesn't work if device interrupts established through APIC. In NetBSD
> it is possible to boot without acpi/smp mode in which it attaches
> devices in legacy way (pic), in this case USB and network works. I
> believe solving that would make system work properly.
>
> However, in OpenBSD case I also had crashes with the default kernel,
> where older versions (don't remember which) would work with DIAGNOSTIC
> option disabled, for current ones I was applying certain patches, like
> to comment out acpi_checksum in acpi_attach_common and "pretend" that
> hdr_revision is below 4 (however, I haven't updated my code to current
> yet).

Please try the latest snapshot and include the dmesg output of your system,
thanks.

> Regards,
> Andrius V

	Kevin
Re: riscv64 panic
On Sun, Apr 10 2022, Jeremie Courreges-Anglas wrote:
> On Fri, Oct 08 2021, Jeremie Courreges-Anglas wrote:
>> riscv64.ports was running dpb(1) with two other members in the build
>> cluster. A few minutes ago I found it in ddb(4). The report is short,
>> sadly, as the machine doesn't return from the 'bt' command.
>>
>> The machine is acting both as an NFS server and an NFS client.
>
> Here's a crash during 7.1 packages build, on riscv64-4.p, a ports build
> cluster member and NFS client.

riscv64-1.p with a corruption of pted entries, cpu3 seems to run
zerothread. I'm leaving this one in ddb.

OpenBSD/riscv64 (riscv64-1.ports.openbsd.org) (console)

login: panic: pool_do_ic: t: pol_d orget lipte pi ic: o pa e 0x f f fie ec o2 i= xef:4 dc 2b3 9 0c ; 2 00 0 d f t
Stopped at      panic+0x106:    addi    a0,zero,256
    TID    PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
*154438  21930   55      0x2       0    0  cc
 147782   2264   55      0x2       0    1  c++
 100287  92452   55      0x2       0    2  c++
 504980   2975    0  0x14000   0x200    3  zerothread
panic() at panic+0x106
pool_do_get() at pool_do_get+0x286
pool_get() at pool_get+0x78
pmap_enter() at pmap_enter+0x12c
uvm_fault_lower() at uvm_fault_lower+0x674
uvm_fault() at uvm_fault+0x150
do_trap_user() at do_trap_user+0x116
cpu_exception_handler_user() at cpu_exception_handler_user+0x7c
end of kernel
end trace frame: 0x42a7370, count: 7
https://www.openbsd.org/ddb.html describes the minimum info required in bug
reports. Insufficient info makes it difficult to find and fix bugs.
ddb{0}> show panic
*cpu0: pool_do_get: pted free list modified: page 0xffc22b142000; item addr 0xffc22b142490; offset 0x0=0x9940d7f3d400acb2 != 0x9940d7f3d490acb2
 cpu2: pool_do_get: pted free list modified: page 0xffc22b142000; item addr 0xffc22b142490; offset 0x0=0x9940d7f3d400acb2 != 0x9940d7f3d490acb2
 cpu1: pool_do_get: pted free list modified: page 0xffc22b142000; item addr 0xffc22b142490; offset 0x0=0x9940d7f3d400acb2 != 0x9940d7f3d490acb2
ddb{0}> trace
panic() at panic+0x106
pool_do_get() at pool_do_get+0x286
pool_get() at pool_get+0x78
pmap_enter() at pmap_enter+0x12c
uvm_fault_lower() at uvm_fault_lower+0x674
uvm_fault() at uvm_fault+0x150
do_trap_user() at do_trap_user+0x116
cpu_exception_handler_user() at cpu_exception_handler_user+0x7c
end of kernel
end trace frame: 0x42a7370, count: -8
ddb{0}> machine ddbcpu 1
Stopped at      ipi_intr+0x22:  c.li    a0,1
ipi_intr() at ipi_intr+0x22
riscv_cpu_intr() at riscv_cpu_intr+0x22
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x7a
sfuartcnputc() at sfuartcnputc+0x2c
db_putchar() at db_putchar+0x318
kprintf() at kprintf+0xb36
db_printf() at db_printf+0x52
panic() at panic+0x92
pool_do_get() at pool_do_get+0x286
pool_get() at pool_get+0x78
pmap_enter() at pmap_enter+0x12c
uvm_fault_lower() at uvm_fault_lower+0x674
uvm_fault() at uvm_fault+0x150
do_trap_user() at do_trap_user+0x116
end trace frame: 0xffc233516ec0, count: 0
ddb{1}> trace
ipi_intr() at ipi_intr+0x22
riscv_cpu_intr() at riscv_cpu_intr+0x22
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x7a
sfuartcnputc() at sfuartcnputc+0x2c
db_putchar() at db_putchar+0x318
kprintf() at kprintf+0xb36
db_printf() at db_printf+0x52
panic() at panic+0x92
pool_do_get() at pool_do_get+0x286
pool_get() at pool_get+0x78
pmap_enter() at pmap_enter+0x12c
uvm_fault_lower() at uvm_fault_lower+0x674
uvm_fault() at uvm_fault+0x150
do_trap_user() at do_trap_user+0x116
cpu_exception_handler_user() at cpu_exception_handler_user+0x7c
end of kernel
end trace frame: 0xcb932fe8, count: -15
ddb{1}> mach ddbcpu 2
Stopped at      ipi_intr+0x22:  c.li    a0,1
ipi_intr() at ipi_intr+0x22
riscv_cpu_intr() at riscv_cpu_intr+0x22
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x7a
sfuartcnputc() at sfuartcnputc+0x2c
db_putchar() at db_putchar+0x2d0
kprintf() at kprintf+0xb36
db_printf() at db_printf+0x52
panic() at panic+0x92
pool_do_get() at pool_do_get+0x286
pool_get() at pool_get+0x78
pmap_enter() at pmap_enter+0x12c
uvm_fault_lower() at uvm_fault_lower+0x674
uvm_fault() at uvm_fault+0x150
do_trap_user() at do_trap_user+0x116
end trace frame: 0xffc229e17ec0, count: 0
ddb{2}> mach ddbcpu 3
Stopped at      ipi_intr+0x22:  c.li    a0,1
ipi_intr() at ipi_intr+0x22
riscv_cpu_intr() at riscv_cpu_intr+0x22
cpu_exception_handler_supervisor() at cpu_exception_handler_supervisor+0x7a
pmap_kremove_pg() at pmap_kremove_pg+0xc6
uvm_pagezero_thread() at uvm_pagezero_thread+0xec
proc_trampoline() at proc_trampoline+0x1a
end trace frame: 0x0, count: 9

-- 
jca | PGP : 0x1524E7EE / 5135 92C1 AD36 5293 2BDF DDCC 0DFA 74AE 1524 E7EE
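
For readers comparing the two 64-bit values in the `show panic` output, here is a minimal userland sketch (not kernel code) that XORs the values on either side of the `!=` to show which bits differ. The helper names `diff_bits` and `count_bits` are hypothetical and exist only for this illustration; the sketch makes no claim about how pool_do_get() derives either value.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the kernel: bits that differ between
 * the two values printed on either side of "!=" in the panic message.
 */
static uint64_t
diff_bits(uint64_t left, uint64_t right)
{
	return left ^ right;
}

/* Hypothetical helper: number of set bits in v. */
static int
count_bits(uint64_t v)
{
	int n = 0;

	while (v != 0) {
		v &= v - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

With the values from the report, `diff_bits(0x9940d7f3d400acb2, 0x9940d7f3d490acb2)` yields `0x900000`, so the two values differ in exactly two bits (bits 20 and 23), all within a single byte.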
Re: bse: null dereference in genet_rxintr()
On Tue, Apr 19, 2022 at 07:32:36AM +0200, Anton Lindqvist wrote:
> On Thu, Mar 24, 2022 at 07:41:44AM +0100, Anton Lindqvist wrote:
> > >Synopsis:	bse: null dereference in genet_rxintr()
> > >Category:	arm64
> > >Environment:
> > 	System      : OpenBSD 7.1
> > 	Details     : OpenBSD 7.1-beta (GENERIC.MP) #1594: Mon Mar 21 06:55:12 MDT 2022
> > 			 dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> >
> > 	Architecture: OpenBSD.arm64
> > 	Machine     : arm64
> > >Description:
> >
> > Booting my rpi4 often but not always causes a panic while rc(8) tries to
> > start the bse network interface:
> >
> > panic: attempt to access user address 0x38 from EL1
> > Stopped at      panic+0x160:    cmp     w21, #0x0
> >     TID    PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
> > *     0      0      0      0x1   0x200   0K  swapper
> > db_enter() at panic+0x15c
> > panic() at do_el1h_sync+0x1f8
> > do_el1h_sync() at handle_el1h_sync+0x6c
> > handle_el1h_sync() at genet_rxintr+0x120
> > genet_rxintr() at genet_intr+0x74
> > genet_intr() at ampintc_irq_handler+0x14c
> > ampintc_irq_handler() at arm_cpu_irq+0x30
> > arm_cpu_irq() at handle_el1h_irq+0x6c
> > handle_el1h_irq() at ampintc_splx+0x80
> > ampintc_splx() at genet_ioctl+0x158
> > genet_ioctl() at ifioctl+0x308
> > ifioctl() at nfs_boot_init+0xc0
> > nfs_boot_init() at nfs_mountroot+0x3c
> > nfs_mountroot() at main+0x464
> > main() at virtdone+0x70
> >
> > >Fix:
> >
> > The mbuf associated with the current index is NULL. I noticed that the
> > NetBSD driver allocates mbufs for each ring entry in genet_setup_dma().
> > But even with that in place the same panic still occurs. Enabling
> > GENET_DEBUG shows that the total is quite high:
> >
> > RX pidx=ca07 total=51463
> >
> > Since it's greater than GENET_DMA_DESC_COUNT (=256) the null dereference
> > will still happen after doing more than 256 iterations in genet_rxintr()
> > since we will start accessing mbufs cleared by the previous iteration.
> >
> > Here's a diff with what I've tried so far.
> > The KASSERT() is just capturing the problem at an earlier stage. Any
> > pointers would be much appreciated.
>
> Further digging reveals that writes to GENET_RX_DMA_PROD_INDEX are
> ignored by the hardware. That's why I ended up with a large number of
> mbufs available in genet_rxintr() since the software and hardware state
> were out of sync. Honoring any existing value makes the problem go away
> and matches what u-boot[1] does as well.
>
> The current RX cidx/pidx defaults in genet_fill_rx_ring() were probably
> carefully selected as they ensure that the rx ring is filled with at
> least the configured low watermark number of mbufs. However, instead of
> being forced to ensure a pidx - cidx delta above 0 on the first
> invocations of genet_fill_rx_ring(), RX_DESC_COUNT could simply be
> passed as the max argument to if_rxr_get() which will clamp the value
> anyway.
>
> Also, I've seen up to 8 mbufs being available per rx interrupt which is
> odd as only a smaller number of rx ring entries are actually populated.
> Not sure if the driver is missing some interrupt threshold configuration.
> Increasing the rx ring low watermark to 8 "solved" it for now.
> Otherwise, the same null dereference occurs while trying to access empty
> mbuf ring entries.
>
> Worth mentioning is that the NetBSD driver does not suffer from the same
> problem as they keep all rx ring entries populated all the time.
>
> Looking for feedback and OKs at this point.
>
> [1] https://github.com/u-boot/u-boot/blob/a94ab561e2f49a80d8579930e840b810ab1a1330/drivers/net/bcmgenet.c#L404

While putting more pressure on the network I'm seeing up to 100 mbufs
being available per rx interrupt. Could it simply be explained by the
hardware operating under the assumption that all ring entries are
available? Even if that is possible, instructing the hardware about the
actual number of available ring entries would require the driver to keep
it in sync whenever the if_rxr_*() implementation decides to adjust the
ring.

Moving to if_rxr_init(RX_DESC_COUNT, RX_DESC_COUNT), essentially making
all 256 ring entries always available, makes the driver stable.
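
The index arithmetic behind the "total=51463" debug line can be sketched in plain userland C. This is only an illustration, assuming the producer and consumer indices are free-running 16-bit counters (the debug output prints pidx as four hex digits) and that the ring has 256 entries as quoted in the thread. The helper names `rx_pending`, `rx_slot` and `rx_out_of_sync` are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

#define RX_DESC_COUNT	256	/* ring size quoted in the thread */

/*
 * Hypothetical helper: number of descriptors between the software
 * consumer index and the hardware producer index, assuming both are
 * free-running 16-bit counters that wrap at 0x10000.
 */
static uint32_t
rx_pending(uint16_t pidx, uint16_t cidx)
{
	return (uint32_t)(pidx - cidx) & 0xffff;
}

/* Hypothetical helper: ring slot that a free-running index maps to. */
static uint32_t
rx_slot(uint16_t idx)
{
	return idx % RX_DESC_COUNT;
}

/*
 * Hypothetical check: a pending count larger than the ring itself means
 * software and hardware disagree about the indices.
 */
static int
rx_out_of_sync(uint16_t pidx, uint16_t cidx)
{
	return rx_pending(pidx, cidx) > RX_DESC_COUNT;
}
```

Under these assumptions, the reported pidx=ca07 with total=51463 corresponds to a software cidx of 256: `rx_pending(0xca07, 256)` is 51463, far above the 256 ring entries, so a loop that processes that many entries revisits slots whose mbufs were already cleared.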
Re: macppc panic: vref used where vget required
On 14/04/22(Thu) 18:29, Alexander Bluhm wrote:
> [...]
> vn_lock: v_usecount == 0: 0x23e6b910, type VREG, use 0, write 0, hold 0, flags (VBIOONFREELIST)
>         tag VT_UFS, ino 703119, on dev 0, 10 flags 0x100, effnlink 1, nlink 1
>         mode 0100660, owner 21, group 21, size 13647873
> ==> vnode_history_print 0x23e6b910, next=5
>  [3] c++[68540] usecount 2>1
>     #0  0x625703e0
>  [4] reaper[31684] usecount 1>0
>     #0  splx+0x30
>     #1  0xfffc
>     #2  vrele+0x5c
>     #3  uvn_detach+0x154

I wonder if the `uvn' has the UVM_VNODE_CANPERSIST set at this point.
This would explain why the associated pages are not freed and end up in
the inactive list.

Is there an easy way to check that?

>     #4  uvm_unmap_detach+0x1a4
>     #5  uvm_map_teardown+0x184
>     #6  uvmspace_free+0x60
>     #7  uvm_exit+0x30
>     #8  reaper+0x138
>     #9  fork_trampoline+0x14
>
> panic: vn_lock: v_usecount == 0
> Stopped at      db_enter+0x24:  lwz r11,12(r1)
>     TID    PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
>  433418  19000   21      0x2       0    0  c++
> *232867  36343    0  0x14000   0x200   1K  pagedaemon
> db_enter() at db_enter+0x20
> panic(9434f4) at panic+0x158
> vn_lock(49535400,202000) at vn_lock+0x1c4
> uvn_io(5db64f85,a994c4,a979f4,,e401) at uvn_io+0x254
> uvn_put(fec3fe2c,e7ebbdd4,23ca9af0,40e0880) at uvn_put+0x64
> uvm_pager_put(0,0,e7ebbd70,32c0c0,200,8000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x158
> uvm_pageout(5d4841c9) at uvm_pageout+0x398
> fork_trampoline() at fork_trampoline+0x14
> end trace frame: 0x0, count: 5
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports. Insufficient info makes it difficult to find and fix bugs.
>
> ddb{1}> show panic
> *cpu1: vn_lock: v_usecount == 0
>
> ddb{1}> trace
> db_enter() at db_enter+0x20
> panic(9434f4) at panic+0x158
> vn_lock(49535400,202000) at vn_lock+0x1c4
> uvn_io(5db64f85,a994c4,a979f4,,e401) at uvn_io+0x254
> uvn_put(fec3fe2c,e7ebbdd4,23ca9af0,40e0880) at uvn_put+0x64
> uvm_pager_put(0,0,e7ebbd70,32c0c0,200,8000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x158
> uvm_pageout(5d4841c9) at uvm_pageout+0x398
> fork_trampoline() at fork_trampoline+0x14
> end trace frame: 0x0, count: -10
>
> ddb{1}> show register
> r0          0x8552f8    panic+0x15c
> r1        0xe7ebbb60
> r2                 0
> r3          0xafcc00    cpu_info+0x4c0
> r4              0xb0    uvm_small_amap_pool+0x130
> r5               0x1
> r6                 0
> r7        0xe7bb9000
> r8                 0
> r9          0x91ae69    digits
> r10             0x14
> r11       0xb9c58763
> r12       0x972ab42e
> r13                0
> r14         0xaf51b8    bcstats
> r15         0xa979d0    uvmexp
> r16                0
> r17                0
> r18                0
> r19         0x202000    rtable_match+0x4c
> r20           0x1000    tlbdsmsize+0xf18
> r21           0x1000    tlbdsmsize+0xf18
> r22         0xa80294    netlock
> r23           0xe401
> r24       0x23ca9b08
> r25       0x23ca9b0c
> r26         0x925186    digits+0xa31d
> r27                0
> r28                0
> r29         0xafcec0    cpu_info+0x780
> r30         0x9434f4    cy_pio_rec+0x10c93
> r31         0xa97afc    uvmexp+0x12c
> lr          0x465b64    db_enter+0x24
> cr        0x48228204
> xer           0x2000
> ctr         0x84331c    openpic_splx
> iar         0x465b64    db_enter+0x24
> msr           0x9032    tlbdsmsize+0x8f4a
> dar                0
> dsisr              0
> db_enter+0x24:  lwz r11,12(r1)
>
> ddb{1}> ps
>    PID     TID   PPID  UID  S     FLAGS  WAIT     COMMAND
>  19000  433418  88497   21  7       0x2           c++
>  88497  403151  58565   21  3  0x10008a  sigsusp  sh
>  41513  242275   9837   21  2       0x2           c++
>   9837   71908  58565   21  3  0x10008a  sigsusp  sh
>  58565  163952  54207   21  3  0x10008a  sigsusp  make
>  54207   76575  55963   21  3  0x10008a  sigsusp  sh
>  55963  491542  28020   21  3  0x10008a  sigsusp  make
>  28020  11887215   21  3  0x10008a  sigsusp  make
>  17215  246222  89415   21  3  0x10008a  sigsusp  sh
>  89415  114664  17720   21  3  0x10008a  sigsusp  make
>  17720  164665  66695    0  3  0x10008a  sigsusp  sh
>  66695   16970  75998    0  3  0x10008a  sigsusp  make
>  75998  377562  93606    0  3  0x10008a  sigsusp  make
>  93606  189705  64097    0  3  0x10008a  sigsusp  sh
>  64097  134632  17259    0  3      0x82  piperd   perl
>  17259  145120  30986    0  3
Re: vte(4): restore MDC clock speed register value after MAC reset
Hi Kevin,

> Thanks for the analysis, I committed your diff.

Thank you for applying the patch!

> For some reason, ethernet and usb devices never get interrupts, I have to
> disable both acpimadt and mpbios in the kernel to make it work.
> Does your machine also have this problem? Thanks.

Yes, there are some problems with level type interrupts on Vortex86DX3
machines, which seems to be the same issue on OpenBSD and NetBSD (edge
type interrupts work on the other hand like for storage or serial
interface). FreeBSD 12.0 (the last one I managed to boot without
modifications) and Linux doesn't seem to have this issue (unless it
somehow ignored) and devices work without visible issues. I have
registered it as https://gnats.netbsd.org/53894 on NetBSD for that and
I did some investigation to find out the cause, unfortunately without
success so far. Thus, neither USB, nor network (and probably audio)
doesn't work if device interrupts established through APIC. In NetBSD
it is possible to boot without acpi/smp mode in which it attaches
devices in legacy way (pic), in this case USB and network works. I
believe solving that would make system work properly.

However, in OpenBSD case I also had crashes with the default kernel,
where older versions (don't remember which) would work with DIAGNOSTIC
option disabled, for current ones I was applying certain patches, like
to comment out acpi_checksum in acpi_attach_common and "pretend" that
hdr_revision is below 4 (however, I haven't updated my code to current
yet).

Regards,
Andrius V

On Tue, Apr 19, 2022 at 6:29 AM Kevin Lo wrote:
>
> Hi Andrius,
>
> I finally have hardware (DMP EB-3360-C2CF) to verify your diff and confirm
> that restoring VTE_MDCSC value to original after reset solves the link state
> flapping issue. Also tested on my ebox-3300mx to ensure it's not broken.
>
> Thanks for the analysis, I committed your diff.
>
> For some reason, ethernet and usb devices never get interrupts, I have to
> disable both acpimadt and mpbios in the kernel to make it work.
> Does your machine also have this problem? Thanks.
>
> Here's my dmesg output:
>
> OpenBSD 7.1-current (GENERIC.MP) #2: Mon Apr 18 16:19:48 CST 2022
>     kevlo@ebox-3360:/usr/src/sys/arch/i386/compile/GENERIC.MP
> real mem  = 1006059520 (959MB)
> avail mem = 970870784 (925MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: date 08/15/18, BIOS32 rev. 0 @ 0xf0010, SMBIOS rev. 2.7 @ 0xf9c80 (43 entries)
> bios0: vendor American Megatrends Inc. version "08.00.1" date 08/15/2018
> bios0: RDC Semiconductor Co., Ltd. A9126
> acpi0 at bios0: ACPI 3.0
> acpi0: sleep states S0
> acpi0: tables DSDT FACP APIC MSDM SLIC OEMB
> acpi0: wakeup devices USB1(S0)
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpiprt1 at acpi0: bus 1 (P0P1)
> acpiprt2 at acpi0: bus 2 (P0P2)
> "PNP0A03" at acpi0 not configured
> acpicmos0 at acpi0
> "PNP0501" at acpi0 not configured
> "PNP0501" at acpi0 not configured
> mpbios at bios0 function 0x0 not configured
> bios0: ROM list: 0xc/0x8000
> cpu0 at mainbus0: (uniprocessor)
> cpu0: Vortex86DX3 (686-class) 1.01 GHz, 06-01-01
> cpu0: FPU,PSE,TSC,MSR,CX8,APIC,SEP,PGE,CMOV,MMX,FXSR,SSE
> pci0 at mainbus0 bus 0: configuration mode 1 (bios)
> pchb0 at pci0 dev 0 function 0 "RDC R6023 Host" rev 0x02
> ppb0 at pci0 dev 1 function 0 "RDC R1031 PCIe" rev 0x01: irq 10
> pci1 at ppb0 bus 1
> ppb1 at pci0 dev 2 function 0 "RDC R1031 PCIe" rev 0x01: irq 11
> pci2 at ppb1 bus 2
> pcib0 at pci0 dev 7 function 0 "RDC R6035 ISA" rev 0x01
> pcib1 at pci0 dev 7 function 1 "RDC R6035 ISA" rev 0x01
> vte0 at pci0 dev 8 function 0 "RDC R6040 Ethernet" rev 0x00: irq 5, address 00:1b:eb:xx:xx:xx
> rdcphy0 at vte0 phy 1: R6040 10/100 PHY, rev. 0
> ohci0 at pci0 dev 10 function 0 "RDC R6060 USB" rev 0x14: irq 6, version 1.0, legacy support
> ehci0 at pci0 dev 10 function 1 "RDC R6061 USB2" rev 0x08: irq 10
> usb0 at ehci0: USB revision 2.0
> uhub0 at usb0 configuration 1 interface 0 "RDC EHCI root hub" rev 2.00/1.00 addr 1
> pciide0 at pci0 dev 12 function 0 "RDC R1012 IDE" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
> pciide0: channel 0 disabled (no drives)
> wd0 at pciide0 channel 1 drive 0:
> wd0: 1-sector PIO, LBA48, 122104MB, 250069680 sectors
> vga1 at pci0 dev 13 function 0 "RDC M2015 VGA" rev 0x00
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> azalia0 at pci0 dev 14 function 0 "RDC R3010 HDA" rev 0x01: irq 6
> azalia0: codecs: Realtek ALC262
> audio0 at azalia0
> isa0 at pcib0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> com0: console
> com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
> pckbc0 at isa0 port 0x60/5 irq 1 irq 12
> pckbd0 at pckbc0 (kbd slot)
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
>