> -----Original Message-----
> From: Baolu Lu <baolu...@linux.intel.com>
> Sent: Thursday, March 13, 2025 7:53 PM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.bo...@intel.com>
> Cc: baolu...@linux.intel.com; intel-gfx@lists.freedesktop.org; intel-x...@lists.freedesktop.org; io...@lists.linux.dev
> Subject: Re: Regression on drm-tip
>
> On 2025/3/13 16:51, Borah, Chaitanya Kumar wrote:
> > Hello Lu,
> >
> > Hope you are doing well. I am Chaitanya from the linux graphics team in Intel.
> >
> > This mail is regarding a regression we are seeing in our CI runs[1] on drm-tip repository.
> >
> > `````````````````````````````````````````````````````````````````````````````````
> > <4>[ 2.856622] WARNING: possible circular locking dependency detected
> > <4>[ 2.856631] 6.14.0-rc5-CI_DRM_16217-gc55ef90b69d3+ #1 Tainted: G I
> > <4>[ 2.856642] ------------------------------------------------------
> > <4>[ 2.856650] swapper/0/1 is trying to acquire lock:
> > <4>[ 2.856657] ffffffff8360ecc8 (iommu_probe_device_lock){+.+.}-{3:3}, at: iommu_probe_device+0x1d/0x70
> > <4>[ 2.856679]
> > but task is already holding lock:
> > <4>[ 2.856686] ffff888102ab6fa8 (&device->physical_node_lock){+.+.}-{3:3}, at: intel_iommu_init+0xea1/0x1220
> > `````````````````````````````````````````````````````````````````````````````````
> > Details log can be found in [2].
> >
> > After bisecting the tree, the following patch [3] seems to be the first "bad" commit
> >
> > `````````````````````````````````````````````````````````````````````````````````````````````````````````
> > commit b150654f74bf0df8e6a7936d5ec51400d9ec06d8
> > Author: Lu Baolu <baolu...@linux.intel.com>
> > Date:   Fri Feb 28 18:27:26 2025 +0800
> >
> >     iommu/vt-d: Fix suspicious RCU usage
> >
> > `````````````````````````````````````````````````````````````````````````````````````````````````````````
> >
> > We also verified that if we revert the patch the issue is not seen.
> >
> > Could you please check why the patch causes this regression and provide a fix if necessary?
>
> Can you please take a quick test to check if the following fix works?
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index e540092d664d..06debeaec643 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -2051,8 +2051,13 @@ int enable_drhd_fault_handling(unsigned int cpu)
>  		if (iommu->irq || iommu->node != cpu_to_node(cpu))
>  			continue;
>
> +		/*
> +		 * Call dmar_alloc_hwirq() with dmar_global_lock held,
> +		 * could cause possible lock race condition.
> +		 */
> +		up_read(&dmar_global_lock);
>  		ret = dmar_set_interrupt(iommu);
> -
> +		down_read(&dmar_global_lock);
>  		if (ret) {
>  			pr_err("DRHD %Lx: failed to enable fault, interrupt, ret %d\n",
>  			       (unsigned long long)drhd->reg_base_addr, ret);
>
> Thanks,
> baolu

We still see the issue with this change.

Regards

Chaitanya