On Mon, Jan 31, 2022 at 11:17:01PM +1100, Jonathan Gray wrote:
> On Mon, Jan 31, 2022 at 12:54:53AM -0700, Thomas Frohwein wrote:
> > On Sat, 29 Jan 2022 12:15:10 -0300
> > Martin Pieuchot <[email protected]> wrote:
> >
> > > On 28/01/22(Fri) 23:03, Thomas Frohwein wrote:
> > > > On Sat, 29 Jan 2022 15:19:20 +1100
> > > > Jonathan Gray <[email protected]> wrote:
> > > >
> > > > > does this diff to revert uvm_fault.c rev 1.124 change anything?
> > > >
> > > > Unfortunately no. Same pmap error as in the original bug report occurs
> > > > with a kernel with this diff.
> > >
> > > Could you submit a new bug report? Could you manage to include ps and the
> > > trace of all the CPUs when the pmap corruption occurs?
> >
> > See below
> >
> > >
> > > Do you have some steps to reproduce the corruption? Which program is
> > > currently running? Is it multi-threaded? What is the simplest scenario
> > > to trigger the corruption?
> >
> > It's during boot of the MP kernel. The only scenario I can provide is
> > booting this machine with an MP kernel from January 18 or newer. If I
> > boot SP kernel, or build an MP kernel with jsg@'s diff that adds
> > `pool_debug = 2`, the panic does _not_ occur.
> >
> > Here some new (hand-typed from a picture) output when I boot a freshly
> > downloaded snapshot MP kernel from January 30th (note this is an 8 core/16
> > hyperthreads CPU; I have _not_ enabled hyperthreading). I attached dmesg
> > from booting bsd.sp, too.
> >
> > ... (boot, see dmesg in original bugs@ submission)
> > wsdisplay0: screen 1-5 added (std, vt100 emulation)
> > iwm0: hw rev 0x200, fw ver 36.ca7b901d.0, address [...]
> > va 7f7fffffb000 ppa ffffffffff000
> > panic: pmap_get_ptp: unmanaged user PTP
> > Stopped at db_enter+0x10: popq %rbp
> >     TID    PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
> > * 28644      1      0        0       0   2K  swapper
> > db_enter() at db_enter+0x10
> > panic(ffffffff81f3dd1f) at panic+0xbf
> > pmap_get_ptp(fffffd888e52ee58,7f7fffffb000) at pmap_get_ptp+0x303
> > pmap_enter(fffffd888e52ee58,7f7fffffb000,13d151000,3,22) at pmap_enter+0x188
> > uvm_fault_lower(ffff8000156852a0,ffff8000156852d8,ffff800015685220,0) at uvm_fault_lower+0x63d
> > uvm_fault(fffffd888e52fdd0,7f7fffffb000,0,2) at uvm_fault+0x1b3
> > kpageflttrap(ffff800015685420,7f7fffffbff5) at kpageflttrap+0x12c
> > kerntrap(ffff800015685420) at kerntrap+0x91
> > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > copyout() at copyout+0x53
> > end trace frame: 0x0, count: 5
>
> does this diff to provide stolen memory data help?

I have committed this minus the printf change
would still be interested to hear if it helps

>
> Index: sys/dev/pci/drm/i915/i915_drv.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/i915_drv.c,v
> retrieving revision 1.135
> diff -u -p -r1.135 i915_drv.c
> --- sys/dev/pci/drm/i915/i915_drv.c	19 Jan 2022 02:20:06 -0000	1.135
> +++ sys/dev/pci/drm/i915/i915_drv.c	31 Jan 2022 11:20:04 -0000
> @@ -2350,6 +2350,7 @@ inteldrm_match(struct device *parent, vo
>  }
>
>  int drm_gem_init(struct drm_device *);
> +void intel_init_stolen_res(struct inteldrm_softc *);
>
>  void
>  inteldrm_attach(struct device *parent, struct device *self, void *aux)
> @@ -2469,6 +2470,7 @@ inteldrm_attach(struct device *parent, s
>  		return;
>  	}
>  	dev->pdev->irq = -1;
> +	intel_init_stolen_res(dev_priv);
>
>  	config_mountroot(self, inteldrm_attachhook);
>  }
> Index: sys/dev/pci/drm/i915/intel_stolen.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/intel_stolen.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 intel_stolen.c
> --- sys/dev/pci/drm/i915/intel_stolen.c	14 Jan 2022 06:53:11 -0000	1.2
> +++ sys/dev/pci/drm/i915/intel_stolen.c	31 Jan 2022 11:25:37 -0000
> @@ -163,7 +163,7 @@ intel_init_stolen_res(struct inteldrm_so
>
>  	if (GRAPHICS_VER(dev_priv) >= 3 && GRAPHICS_VER(dev_priv) < 11)
>  		stolen_base = gen3_stolen_base(dev_priv);
> -	else if (GRAPHICS_VER(dev_priv) == 11)
> +	else if (GRAPHICS_VER(dev_priv) == 11 || GRAPHICS_VER(dev_priv) == 12)
>  		stolen_base = gen11_stolen_base(dev_priv);
>
>  	if (IS_I830(dev_priv) || IS_I845G(dev_priv))
> @@ -177,7 +177,7 @@ intel_init_stolen_res(struct inteldrm_so
>  		stolen_size = gen6_stolen_size(dev_priv);
>  	else if (GRAPHICS_VER(dev_priv) == 8)
>  		stolen_size = gen8_stolen_size(dev_priv);
> -	else if (GRAPHICS_VER(dev_priv) >= 9 && GRAPHICS_VER(dev_priv) < 12)
> +	else if (GRAPHICS_VER(dev_priv) >= 9 && GRAPHICS_VER(dev_priv) <= 12)
>  		stolen_size = gen9_stolen_size(dev_priv);
>
>  	if (stolen_base == 0 || stolen_size == 0)
> Index: sys/dev/pci/drm/i915/gt/intel_ggtt.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/drm/i915/gt/intel_ggtt.c,v
> retrieving revision 1.4
> diff -u -p -r1.4 intel_ggtt.c
> --- sys/dev/pci/drm/i915/gt/intel_ggtt.c	26 Jan 2022 01:46:12 -0000	1.4
> +++ sys/dev/pci/drm/i915/gt/intel_ggtt.c	31 Jan 2022 11:33:05 -0000
> @@ -1320,10 +1320,10 @@ static int ggtt_probe_hw(struct i915_ggt
>  	}
>
>  	/* GMADR is the PCI mmio aperture into the global GTT. */
> -	drm_dbg(&i915->drm, "GGTT size = %lluM\n", ggtt->vm.total >> 20);
> -	drm_dbg(&i915->drm, "GMADR size = %lluM\n",
> +	drm_warn(&i915->drm, "GGTT size = %lluM\n", ggtt->vm.total >> 20);
> +	drm_warn(&i915->drm, "GMADR size = %lluM\n",
>  		(u64)ggtt->mappable_end >> 20);
> -	drm_dbg(&i915->drm, "DSM size = %lluM\n",
> +	drm_warn(&i915->drm, "DSM size = %lluM\n",
>  		(u64)resource_size(&intel_graphics_stolen_res) >> 20);
>
>  	return 0;
>
>
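For anyone reading the intel_stolen.c hunks in isolation, the two changed
conditions can be modelled as plain range checks. This is only a sketch: the
function names below are hypothetical and are not in the tree, and the integer
argument stands in for GRAPHICS_VER(dev_priv). It shows that with the old
conditions graphics version 12 matched neither the gen11_stolen_base() branch
nor the gen9_stolen_size() branch, and with the new conditions it matches both.

```c
/*
 * Hypothetical model of the two conditions touched by the intel_stolen.c
 * hunks. "ver" stands in for GRAPHICS_VER(dev_priv); none of these
 * helper names exist in the driver.
 */

/* Old check guarding gen11_stolen_base(): version 11 only. */
int
uses_gen11_base_old(int ver)
{
	return ver == 11;
}

/* New check guarding gen11_stolen_base(): versions 11 and 12. */
int
uses_gen11_base_new(int ver)
{
	return ver == 11 || ver == 12;
}

/* Old check guarding gen9_stolen_size(): versions 9 through 11. */
int
uses_gen9_size_old(int ver)
{
	return ver >= 9 && ver < 12;
}

/* New check guarding gen9_stolen_size(): versions 9 through 12. */
int
uses_gen9_size_new(int ver)
{
	return ver >= 9 && ver <= 12;
}
```

Under the old checks, a version 12 device would reach the
"stolen_base == 0 || stolen_size == 0" test with whatever values the two
variables were initialised to; that initialisation is not part of the diff.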
