The actual bug is showing in 
https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14598
if CONFIG_INTEL_IOMMU_DEFAULT_ON=y  ,  that IOMMU prevent the invalid access,  
but if  CONFIG_INTEL_IOMMU_DEFAULT_ON=n,   the invalid access will directly 
cause system crash after kexec reboot.

-----Original Message-----
From: Ville Syrjälä <[email protected]> 
Sent: Wednesday, October 8, 2025 5:22 AM
To: Yao, Jia <[email protected]>
Cc: [email protected]; Zuo, Alex <[email protected]>; Lin, 
Shuicheng <[email protected]>; Askar Safin <[email protected]>; 
Pingfan Liu <[email protected]>; Chris Wilson <[email protected]>
Subject: Re: [PATCH v2] drm/i915: Setting/clearing the memory access bit when 
en/disabling i915

On Tue, Oct 07, 2025 at 09:40:45PM +0000, Yao, Jia wrote:
> You mean  intel_pxp_fini(i915)  ?
> This is because mei_me_shutdown  is called after i915_driver_shutdown 
> in pci_device_shutdown sequence.  If we don't close pxp in advance, it 
> will cause
> 
> [  295.584775] i915 0000:00:02.0: [drm] *ERROR* gt: MMIO unreliable 
> (forcewake register returns 0xFFFFFFFF)!

So that is the actual bug you're trying to fix? Please just submit the pxp fix 
on its own.

> 
> Since we disabled PCI_COMMAND_MEMORY in  i915_driver_shutdown
> 
> Thanks,
> Jia
> 
> -----Original Message-----
> From: Ville Syrjälä <[email protected]>
> Sent: Tuesday, October 7, 2025 2:25 PM
> To: Yao, Jia <[email protected]>
> Cc: [email protected]; Zuo, Alex <[email protected]>; 
> Lin, Shuicheng <[email protected]>; Askar Safin 
> <[email protected]>; Pingfan Liu <[email protected]>; Chris Wilson 
> <[email protected]>
> Subject: Re: [PATCH v2] drm/i915: Setting/clearing the memory access 
> bit when en/disabling i915
> 
> On Tue, Oct 07, 2025 at 08:25:14PM +0000, Jia Yao wrote:
> > Make i915's PCI device management more robust by always 
> > setting/clearing the memory access bit when enabling/disabling the 
> > device, and by consolidating this logic into helper functions.
> > 
> > It fixes kexec reboot issue by disabling memory access before 
> > shutting down the device, which can block unsafe and unwanted access from 
> > DMA.
> > 
> > v2:
> >   - follow brace style
> > 
> > Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/14598
> > Cc: Alex Zuo <[email protected]>
> > Cc: Shuicheng Lin <[email protected]>
> > Cc: Askar Safin <[email protected]>
> > Cc: Pingfan Liu <[email protected]>
> > Suggested-by: Chris Wilson <[email protected]>
> > Signed-off-by: Jia Yao <[email protected]>
> > ---
> >  drivers/gpu/drm/i915/i915_driver.c | 35
> > +++++++++++++++++++++++++++---
> >  1 file changed, 32 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_driver.c
> > b/drivers/gpu/drm/i915/i915_driver.c
> > index b46cb54ef5dc..766f85726b67 100644
> > --- a/drivers/gpu/drm/i915/i915_driver.c
> > +++ b/drivers/gpu/drm/i915/i915_driver.c
> > @@ -118,6 +118,33 @@
> >  
> >  static const struct drm_driver i915_drm_driver;
> >  
> > +static int i915_enable_device(struct pci_dev *pdev) {
> > +   u32 cmd;
> > +   int ret;
> > +
> > +   ret = pci_enable_device(pdev);
> > +   if (ret)
> > +           return ret;
> > +
> > +   pci_read_config_dword(pdev, PCI_COMMAND, &cmd);
> > +   if (!(cmd & PCI_COMMAND_MEMORY))
> > +           pci_write_config_dword(pdev, PCI_COMMAND, cmd | 
> > +PCI_COMMAND_MEMORY);
> > +
> > +   return 0;
> > +}
> 
> NAK. If the pci code is broken then fix the problem there.
> Do not add ugly hacks into random drivers.
> 
> > +
> > +static void i915_disable_device(struct pci_dev *pdev) {
> > +   u32 cmd;
> > +
> > +   pci_read_config_dword(pdev, PCI_COMMAND, &cmd);
> > +   if (cmd & PCI_COMMAND_MEMORY)
> > +           pci_write_config_dword(pdev, PCI_COMMAND, cmd & 
> > +~PCI_COMMAND_MEMORY);
> > +
> > +   pci_disable_device(pdev);
> > +}
> > +
> >  static int i915_workqueues_init(struct drm_i915_private *dev_priv)  {
> >     /*
> > @@ -788,7 +815,7 @@ int i915_driver_probe(struct pci_dev *pdev, const 
> > struct pci_device_id *ent)
> >     struct intel_display *display;
> >     int ret;
> >  
> > -   ret = pci_enable_device(pdev);
> > +   ret = i915_enable_device(pdev);
> >     if (ret) {
> >             pr_err("Failed to enable graphics device: %pe\n", ERR_PTR(ret));
> >             return ret;
> > @@ -796,7 +823,7 @@ int i915_driver_probe(struct pci_dev *pdev, 
> > const struct pci_device_id *ent)
> >  
> >     i915 = i915_driver_create(pdev, ent);
> >     if (IS_ERR(i915)) {
> > -           pci_disable_device(pdev);
> > +           i915_disable_device(pdev);
> >             return PTR_ERR(i915);
> >     }
> >  
> > @@ -885,7 +912,7 @@ int i915_driver_probe(struct pci_dev *pdev, const 
> > struct pci_device_id *ent)
> >     enable_rpm_wakeref_asserts(&i915->runtime_pm);
> >     i915_driver_late_release(i915);
> >  out_pci_disable:
> > -   pci_disable_device(pdev);
> > +   i915_disable_device(pdev);
> >     i915_probe_error(i915, "Device initialization failed (%d)\n", ret);
> >     return ret;
> >  }
> > @@ -1003,6 +1030,7 @@ void i915_driver_shutdown(struct 
> > drm_i915_private *i915)
> >  
> >     intel_dmc_suspend(display);
> >  
> > +   intel_pxp_fini(i915);
> 
> What is that doing in this patch?
> 
> >     i915_gem_suspend(i915);
> >  
> >     /*
> > @@ -1020,6 +1048,7 @@ void i915_driver_shutdown(struct drm_i915_private 
> > *i915)
> >     enable_rpm_wakeref_asserts(&i915->runtime_pm);
> >  
> >     intel_runtime_pm_driver_last_release(&i915->runtime_pm);
> > +   i915_disable_device(to_pci_dev(i915->drm.dev));
> >  }
> >  
> >  static bool suspend_to_idle(struct drm_i915_private *dev_priv)
> > --
> > 2.34.1
> 
> --
> Ville Syrjälä
> Intel

--
Ville Syrjälä
Intel

Reply via email to