On Tue, Jul 31, 2018 at 02:04:53PM -0700, Philip Guenther wrote:
> On Tue, 31 Jul 2018, giova...@paclan.it wrote:
> > >Synopsis: Every now and then I hit ddb with double fault trap, code=0
> > >Category: acpi
> > >Environment:
> > System : OpenBSD 6.3
> > Details : OpenBSD 6.3-current (GENERIC) #143: Fri Jul 27 04:38:01
> > MDT 2018
> >
> > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> > >Description:
> > Every couple of days I hit ddb:
> > double fault trap, code=0
> > Stopped at __mtx_enter+0xf: pushq %r11
> > ddb{0}> bt
> > __mtx_enter(0) at __mtx_enter+0xf
> >
> > i915_get_crtc_scanoutpos(f69da68b441aff13,ffff800000169156,ffff80000015f800,ffff80000015f800,1,0)
> > at i915_get_crtc_scanoutpos+0xce
> >
> > drm_calc_vbltimestamp_from_scanoutpos(6551790fe8f1e00a,0,ffff80000015f800,0,ffff80000015f800,453d)
> > at drm_calc_vbltimestamp_from_scanoutpos+0x92
> > drm_update_vblank_count() at drm_update_vblank_count+0x9b
> > drm_handle_vblank() at drm_handle_vblank+0xd1
> > ironlake_irq_handler(575039e4693defc4,ffff80000015d700) at
> > ironlake_irq_handler+0x320
> > intr_handler(84d668fbeb1151573,0) at intr_handler+0x68
> > Xintr_ioapic_edge16_untramp(0,0,1,0,ffffffff81b329e8,ffffff012cdc33f0)
> > at Xintr_ioapic_edge16_untramp+0x19f
> > uvm_map_addr_RBT_AUGMENT(1aa311ba321d33a4) at uvm_map_addr_RBT_AUGMENT
> > uvm_mapent_addr_remove(ffffffff81c9ba58,ffff800032d20000) at
> > uvm_mapent_addr_remove+0x67
> >
> > uvm_mapent_mkfree(709092b46cd75947,ffff800032d20000,ffffff012cdc33f0,ffffffff81c9ba58,ffffff012cdc3000)
> > at uvm_mapent_mkfree+0xc9
> >
> > uvm_unmap_remove(da20d5b12b851735,ffff800032d2000,ffffffff81c9ba58,ffff800032cb2540,ffff800032d1f000,1)
> > at uvm_unmap_remove+0x2cf
> > uvm_unmap(709092b46c9e3347,ffff800032d1f000,ffff800032d20000) at
> > uvm_unmap+0x75
> > km_free(4033f06f727571cc,514,0,1000) at km_free+0x4f
> > _bus_space_unmap(ddf1a0675f9a0f6,1,0,ffffffff81b7aaf8) at
> > _bus_space_unmap+0xdd
> > acpi_gasio(ad65595461093493,0,0,ffff8000009293a0,ffff800032cb2768,1) at
> > acpi_gasio+0x242
> >
> > aml_opreg_sysmem_handler(14f676a0776defdb,ffff800032cb2748,ffffffff818347d0,ffff800032cb26d0,ad65595461093493)
> > at aml_opreg_sysmem_handler+0x30
> >
> > aml_rwgen(221071f83d28047,ffff800000929388,ffff800000062088,28a2,ffff80000048188,1)
> > at aml_rwgen+0x650
> > aml_rwfield(771ec935baef5db0,ffff80000075f308,69,69,ffff800000062088)
> > at aml_rwfield+0x3a5
> >
> > aml_eval(79dedd78a5dfabfb,ffff80000075f308,ffff80000035031,69,ffff800000062088)
> > at aml_eval+0x1f7
> > aml_parse(de6bf302f7589bb6,ffff80000075f308,ffff800000035021) at
> > aml_parse+0x54
> > ....three more pages of the last line....
> > aml_eval(e4d65b8caee09c80,0,ffff800000089408,2,0) at aml_eval+0x323
> > aml_evalnode(bcb11c8975c0ae9,ffff800000026400,ffff800000026400,2,0) at
> > aml_evalnode+0xae
> > acpi_gpe(2c80ceef08cd0301,ffff800000026400,ffff80000002bc40) at
> > acpi_gpe+0x35
> > acpi_thread(0) at acpi_thread+0x188
> > end trace frame: 0x0, count: -65
>
> And quoting a previous off-list email:
> > every now and then, starting from at least a month ago my laptop
> > enters ddb with "Double fault trap, code=0".
> > Most of the times it is in ieee80211 and at a first glance I
> > looked at iwm(4), but today it happened also with intel(4).
>
> So it's double-faulting because it's running off the end of the kernel
> stack for the ACPI thread due to a combination of deeply nested AML and
> stack usage by the DRM and/or 802.11 interrupt handlers.
>
> I don't see any recent changes in the ACPI stack which would cause a
> change in behavior on this box (it doesn't have GenericSerialBus, or _DSD
> properties, or an sdhc device), so either
> a) the change in stack consumption is from the DRM and 802.11 side, OR
> b) did you update the BIOS around the time this started?
>
> bios0: vendor TOSHIBA version "Version 5.10" date 04/18/2018
> Perhaps the new version uses more deeply nested AML.
>

I do not have enough dmesg log files, but it could be related to a BIOS
update that I had completely forgotten about.

>
> For those wondering about the iwm/802.11 case, the photo previously sent
> had the trace of the interrupt frame going, from bottom up:
>
> -> Xintr_ioapic_edge24_untramp
> -> intr_handler
> -> iwm_intr
> -> iwm_rx_pkt
> -> iwm_rx_mpdu
> -> iwm_rx_frame
> -> ieee80211_input
> -> ieee80211_recv_probe_resp
> -> ieee80211_find_node_for_beacon
>
> Are any of those using more stack-space than before?
>
>
> Not sure what we want to do here.
> - if this did start after updating the BIOS, see if there's a newer one
>   or maybe downgrade

There isn't an update available, and a downgrade does not seem possible.

> - if we can identify an increase in stack use in an interrupt path, we
>   should fix that
> - making aml_parse() iterative instead of recursive...by tracking frames
>   of AML state in an explicit stack...would be annoying, more complex to
>   maintain, and probably inefficient.  Maybe it's time to let kernel
>   threads request a larger than default stack size and have acpi_thread
>   request another page or so?
> - if all else fails, there's always increasing UPAGES... <barf>
>
>
> Philip Guenther
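
For anyone following along, here is a rough userland sketch of what "iterative
instead of recursive ... tracking frames in an explicit stack" means in
general. It is not the real aml_parse() and does not use any ACPI code: struct
node, eval_recursive() and eval_iterative() are hypothetical names made up for
this illustration, and realloc() stands in for whatever allocator kernel code
would actually use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a nested term: a value plus nested terms. */
struct node {
	int		 value;
	struct node	*child;		/* first nested term, or NULL */
	struct node	*sibling;	/* next term at same depth, or NULL */
};

/* Recursive form: one C stack frame per nesting level. */
int
eval_recursive(const struct node *n)
{
	int sum = 0;

	for (; n != NULL; n = n->sibling)
		sum += n->value + eval_recursive(n->child);
	return sum;
}

/*
 * Iterative form: pending nodes are kept in a heap-allocated array that
 * grows on demand, so C stack use stays constant however deep the
 * nesting goes.  Returns 0 on success, -1 if memory runs out.
 */
int
eval_iterative(const struct node *root, int *result)
{
	const struct node **stack = NULL, **tmp;
	const struct node *n = root;
	size_t cap = 0, top = 0;
	int sum = 0;

	assert(result != NULL);

	while (n != NULL || top > 0) {
		if (n == NULL) {
			/* Finished a nested run; resume a saved sibling. */
			n = stack[--top];
			continue;
		}
		sum += n->value;
		if (n->sibling != NULL) {
			/* Remember where to continue after the children. */
			if (top == cap) {
				cap = cap ? cap * 2 : 16;
				tmp = realloc(stack, cap * sizeof(*stack));
				if (tmp == NULL) {
					free(stack);
					return -1;
				}
				stack = tmp;
			}
			stack[top++] = n->sibling;
		}
		n = n->child;
	}
	free(stack);
	*result = sum;
	return 0;
}
```

Even in this toy case the iterative version needs its own bookkeeping and an
out-of-memory path that the recursive one gets for free, which is the
maintenance cost mentioned above.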