I have stability problems with anything at or after this commit (b377ff8) on an amd64 laptop. While I see the following panic logged, no crash dump is preserved :-( It happens after ~5-6 minutes running in KDE (X).

Reverting to 36efc64 seems to work reliably (after ACPI changes but before the problematic PCI one)

kernel: Fatal trap 12: page fault while in kernel mode
kernel: cpuid = 2; apic id = 02
kernel: fault virtual address     = 0x48
kernel: fault code                = supervisor read data, page not present
kernel: instruction pointer       = 0x20:0xffffffff80acb962
kernel: stack pointer             = 0x28:0xfffffe00c4318d80
kernel: frame pointer             = 0x28:0xfffffe00c4318d80
kernel: code segment              = base 0x0, limit 0xfffff, type 0x1b
kernel:                   = DPL 0, pres 1, long 1, def32 0, gran 1
kernel: processor eflags  = interrupt enabled, resume, IOPL = 0
kernel: current process           = 2 (clock (0))
kernel: rdi: fffff802e460c000 rsi: 0000000000000000 rdx: 0000000000000002
kernel: rcx: 0000000000000000  r8: 000000000000001e  r9: fffffe00c4319000
kernel: rax: 0000000000000002 rbx: fffff802e460c000 rbp: fffffe00c4318d80
kernel: r10: 0000000000001388 r11: 000000007ffc765d r12: 000f000000000000
kernel: r13: 0002000000000000 r14: fffff8000193e740 r15: 0000000000000000
kernel: trap number               = 12
kernel: panic: page fault
kernel: cpuid = 2
kernel: time = 1707573802
kernel: Uptime: 6m19s
kernel: Dumping 942 out of 16242 MB:..2%..11%..21%..31%..41%..51%..62%..72%..82%..92%
kernel: Dump complete
kernel: Automatic reboot in 15 seconds - press a key on the console to abort

On 2/10/24 13:04, Mark Millard wrote:


On Feb 9, 2024, at 23:44, Bakul Shah <ba...@iitbombay.org> wrote:

$ git bisect good
b377ff8110e3489eb6e6b920b51a2384dfc4eb0b is the first bad commit

On Feb 9, 2024, at 8:13 PM, Mark Millard <mark...@yahoo.com> wrote:

Summary:

pcib0: <BCM2838-compatible PCI-express controller> mem 0x7d500000-0x7d50930f 
irq 80,81 on simplebus2
pcib0: parsing FDT for ECAM0:
pcib0:  PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: 0x40000000
. .
rman_manage_region: <pcib1 memory window> request: start 0x600000000, end 
0x6000fffff
panic: Failed to add resource to rman


Detail:


. .
pcib0: <BCM2838-compatible PCI-express controller> mem 0x7d500000-0x7d50930f 
irq 80,81 on simplebus2
pcib0: parsing FDT for ECAM0:
pcib0: PCI addr: 0xc0000000, CPU addr: 0x600000000, Size: 0x40000000
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: PCI addr: 0x0, CPU addr: 0x0, Size: 0x0
pcib0: Bus is not cache-coherent
rman_reserve_resource_bound: <I/O memory addresses> request: [0xfd500000, 
0xfd50930f], length 0x9310, flags 100, device pcib0
rman_reserve_resource_bound: trying 0x1fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x1fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x31bfffff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x33296fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39bf1fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c02fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c06fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c07fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c08fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c2afff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c36fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x39c37fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3b03ffff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3b04ffff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3b2fffff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3ee61fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3ee63fff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0x3fffffff <0xfd500000,0x930f>
rman_reserve_resource_bound: tried 0xfbffffff <0xfd500000,0x930f>
considering [0xfc000000, 0xfd5d1fff]
truncated region: [0xfd500000, 0xfd50930f]; size 0x9310 (requested 0x9310)
candidate region: [0xfd500000, 0xfd50930f], size 0x9310
splitting region in three parts: [0xfc000000, 0xfd4fffff]; [0xfd500000, 
0xfd50930f]; [0xfd509310, 0xfd5d1fff]
rman_manage_region: <PCIe Memory> request: start 0xc0000000, end 0xffffffff
pcib0: hardware identifies as revision 0x304.
pcib0: note: reported link speed is 5.0 GT/s.
rman_reserve_resource_bound: <Interrupts> request: [0x51, 0x51], length 0x1, 
flags 0, device pcib0
rman_reserve_resource_bound: trying 0 <0x51,0>
rman_reserve_resource_bound: tried 0 <0x51,0>
rman_reserve_resource_bound: tried 0x1 <0x51,0>
rman_reserve_resource_bound: tried 0x2 <0x51,0>
rman_reserve_resource_bound: tried 0x3 <0x51,0>
rman_reserve_resource_bound: tried 0x4 <0x51,0>
rman_reserve_resource_bound: tried 0x5 <0x51,0>
rman_reserve_resource_bound: tried 0x6 <0x51,0>
rman_reserve_resource_bound: tried 0x7 <0x51,0>
rman_reserve_resource_bound: tried 0xc <0x51,0>
rman_reserve_resource_bound: tried 0xd <0x51,0>
rman_reserve_resource_bound: tried 0xe <0x51,0>
rman_reserve_resource_bound: tried 0xf <0x51,0>
rman_reserve_resource_bound: tried 0x10 <0x51,0>
rman_reserve_resource_bound: tried 0x11 <0x51,0>
rman_reserve_resource_bound: tried 0x12 <0x51,0>
rman_reserve_resource_bound: tried 0x17 <0x51,0>
rman_reserve_resource_bound: tried 0x18 <0x51,0>
rman_reserve_resource_bound: tried 0x1a <0x51,0>
rman_reserve_resource_bound: tried 0x1b <0x51,0>
rman_reserve_resource_bound: tried 0x22 <0x51,0>
rman_reserve_resource_bound: tried 0x23 <0x51,0>
rman_reserve_resource_bound: tried 0x24 <0x51,0>
rman_reserve_resource_bound: tried 0x25 <0x51,0>
rman_reserve_resource_bound: tried 0x26 <0x51,0>
rman_reserve_resource_bound: tried 0x27 <0x51,0>
rman_reserve_resource_bound: tried 0x28 <0x51,0>
rman_reserve_resource_bound: tried 0x29 <0x51,0>
rman_reserve_resource_bound: tried 0x4e <0x51,0>
rman_reserve_resource_bound: tried 0x4f <0x51,0>
considering [0x50, 0xffffffffffffffff]
truncated region: [0x51, 0x51]; size 0x1 (requested 0x1)
candidate region: [0x51, 0x51], size 0x1
splitting region in three parts: [0x50, 0x50]; [0x51, 0x51]; [0x52, 
0xffffffffffffffff]
pci0: <OFW PCI bus> on pcib0
rman_manage_region: <PCI domain 0 bus numbers> request: start 0, end 0xff
rman_reserve_resource_bound: <PCI domain 0 bus numbers> request: [0, 0], length 
0x1, flags 0, device pci0
rman_reserve_resource_bound: trying 0xff <0,0>
considering [0, 0xff]
truncated region: [0, 0]; size 0x1 (requested 0x1)
candidate region: [0, 0], size 0x1
allocating from the beginning
pci0: domain=0, physical bus=0
found-> vendor=0x14e4, dev=0x2711, revid=0x00
domain=0, bus=0, slot=0, func=0
class=06-04-00, hdrtype=0x01, mfdev=0
cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
intpin=a, irq=0
powerspec 3  supports D0 D3  current D0
secbus=1, subbus=1
rman_reserve_resource_bound: <PCI domain 0 bus numbers> request: [0x1, 0x1], 
length 0x1, flags 0, device (null)
rman_reserve_resource_bound: trying 0 <0x1,0>
rman_reserve_resource_bound: tried 0 <0x1,0>
considering [0x1, 0xff]
truncated region: [0x1, 0x1]; size 0x1 (requested 0x1)
candidate region: [0x1, 0x1], size 0x1
allocating from the beginning
pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0

Looking at an old log shows that the next normal output was:

pcib0: rman_reserve_resource: start=0xc0000000, end=0xc00fffff, count=0x100000

and somewhat later there was:

pcib1: allocated memory range (0xc0000000-0xc0000fff) for rid 10 of bcm_xhci0

I'll give more text at the end of this message. Basically the
sequence is assocaited with xhci0. The count=0x100000 looks
to be from the reg for xhci@7e9c0000.

        scb {
                compatible = "simple-bus";
                #address-cells = <0x00000002>;
                #size-cells = <0x00000002>;
                ranges = <0x00000000 0x7c000000 0x00000000 0xfc000000 0x00000000 
0x03800000 0x00000000 0x40000000 0x00000000 0xff800000 0x00000000 0x00800000 
0x00000006 0x00000000 0x00000006 0x00000000 0x00000000 0x40000000 0x00000000 
0x00000000 0x00000000 0x00000000 0x00000000 0xfc000000>;
                dma-ranges = <0x00000000 0x00000000 0x00000000 0x00000000 
0x00000004 0x00000000>;
                phandle = <0x000000d6>;
                pcie@7d500000 {
                        compatible = "brcm,bcm2711-pcie", "brcm,bcm7445-pcie";
                        reg = <0x00000000 0x7d500000 0x00000000 0x00009310>;
                        device_type = "pci";
                        #address-cells = <0x00000003>;
                        #interrupt-cells = <0x00000001>;
                        #size-cells = <0x00000002>;
                        interrupts = <0x00000000 0x00000094 0x00000004 0x00000000 
0x00000094 0x00000004>;
                        interrupt-names = "pcie", "msi";
                        interrupt-map-mask = <0x00000000 0x00000000 0x00000000 
0x00000007>;
                        interrupt-map = <0x00000000 0x00000000 0x00000000 
0x00000001 0x00000001 0x00000000 0x0000008f 0x00000004>;
                        msi-controller;
                        msi-parent = <0x0000002c>;
                        ranges = <0x02000000 0x00000000 0xc0000000 0x00000006 
0x00000000 0x00000000 0x40000000>;
                        dma-ranges = <0x02000000 0x00000004 0x00000000 0x00000000 
0x00000000 0x00000002 0x00000000>;
                        brcm,enable-ssc;
                        phandle = <0x0000002c>;
                        pci@1,0 {
                                #address-cells = <0x00000003>;
                                #size-cells = <0x00000002>;
                                ranges;
                                reg = <0x00000000 0x00000000 0x00000000 0x00000000 
0x00000000>;
                                usb@1,0 {
                                        reg = <0x00010000 0x00000000 0x00000000 
0x00000000 0x00000000>;
                                        resets = <0x0000002d 0x00000000>;
                                };
                        };
                };
  . .
                xhci@7e9c0000 {
                        compatible = "generic-xhci";
                        status = "disabled";
                        reg = <0x00000000 0x7e9c0000 0x00000000 0x00100000>;
                        interrupts = <0x00000000 0x000000b0 0x00000004>;
                        power-domains = <0x00000013 0x00000006>;
                        phandle = <0x000000d9>;
                };

Note the lack of nesting of xhci@7e9c0000 relative to
pcie@7d50000 .

rman_manage_region: <pcib1 bus numbers> request: start 0x1, end 0x1
pcib0: rman_reserve_resource: start=0xc0000000, end=0xc00fffff, count=0x100000
rman_reserve_resource_bound: <PCIe Memory> request: [0xc0000000, 0xc00fffff], 
length 0x100000, flags 102, device pcib1
rman_reserve_resource_bound: trying 0xffffffff <0xc0000000,0xfffff>
considering [0xc0000000, 0xffffffff]
truncated region: [0xc0000000, 0xc00fffff]; size 0x100000 (requested 0x100000)
candidate region: [0xc0000000, 0xc00fffff], size 0x100000
allocating from the beginning
rman_manage_region: <pcib1 memory window> request: start 0x600000000, end 
0x6000fffff
panic: Failed to add resource to rman
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self
db_trace_self_wrapper() at db_trace_self_wrapper+0x38
vpanic() at vpanic+0x1a4
panic() at panic+0x48
pcib_add_window_resources() at pcib_add_window_resources+0xf4
pcib_alloc_window() at pcib_alloc_window+0xfc
pcib_attach_common() at pcib_attach_common+0xa18
pcib_attach() at pcib_attach+0x14
device_attach() at device_attach+0x3fc
device_probe_and_attach() at device_probe_and_attach+0x80
bus_generic_attach() at bus_generic_attach+0x1c
pci_attach() at pci_attach+0xec
device_attach() at device_attach+0x3fc
device_probe_and_attach() at device_probe_and_attach+0x80
bus_generic_attach() at bus_generic_attach+0x1c
device_attach() at device_attach+0x3fc
device_probe_and_attach() at device_probe_and_attach+0x80
bus_generic_new_pass() at bus_generic_new_pass+0x100
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_generic_new_pass() at bus_generic_new_pass+0xb0
bus_set_pass() at bus_set_pass+0x50
mi_startup() at mi_startup+0x1e0
virtdone() at virtdone+0x68
KDB: enter: panic
[ thread pid 0 tid 100000 ]
Stopped at      kdb_enter+0x4c: str     xzr, [x19, #1280]
db>



For reference from an old log:

pcib1: <PCI-PCI bridge> irq 91 at device 0.0 on pci0
pcib0: rman_reserve_resource: start=0xc0000000, end=0xc00fffff, count=0x100000
pcib1:   domain            0
pcib1:   secondary bus     1
pcib1:   subordinate bus   1
pcib1:   memory decode     0xc0000000-0xc00fffff
pci1: <OFW PCI bus> on pcib1
pcib1: allocated bus range (1-1) for rid 0 of pci1
pci1: domain=0, physical bus=1
found->      vendor=0x1106, dev=0x3483, revid=0x01
        domain=0, bus=1, slot=0, func=0
        class=0c-03-30, hdrtype=0x00, mfdev=0
        cmdreg=0x0000, statreg=0x0010, cachelnsz=0 (dwords)
        lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
        intpin=a, irq=0
        powerspec 3  supports D0 D3  current D0
        MSI supports 4 messages, 64 bit
        map[10]: type Memory, range 64, base 0, size 12, memory disabled
pcib1: slot 0 INTA is routed to irq 92
bcm_xhci0: <VL805 USB 3.0 controller (on the Raspberry Pi 4b)> irq 92 at device 
0.0 on pci1
bcm_xhci0: note: xhci firmware not found.
bcm_xhci0: note: installing xhci firmware.
bcm_xhci0: note: xhci firmware detected; firmware is revision 138a1.
pcib1: allocated memory range (0xc0000000-0xc0000fff) for rid 10 of bcm_xhci0
bcm_xhci0: Lazy allocation of 0x1000 bytes rid 0x10 type 3 at 0xc0000000
bcm_xhci0: 32 bytes context size, 64-bit DMA
bcm_xhci0: attempting to allocate 1 MSI vectors (4 supported)
bcm_xhci0: using IRQ 93 for MSI
bcm_xhci0: MSI enabled
bcm_xhci0: (New XHCI DeviceId=0x34831106)
usbus0 on bcm_xhci0
bcm_xhci0: usbpf: Attached


===
Mark Millard
marklmi at yahoo.com




Reply via email to