Re: [PATCH v11 00/11] PCI: brcmstb: enable PCIe for STB chips
On Tue, Aug 25, 2020 at 10:40:27AM -0700, Florian Fainelli wrote:
> Hi,
>
> On 8/24/2020 12:30 PM, Jim Quinlan wrote:
>>
>> Patchset Summary:
>>   Enhance a PCIe host controller driver.  Because of its unusual design
>>   we are forced to change dev->dma_pfn_offset into a more general role
>>   allowing multiple offsets.  See the 'v1' notes below for more info.
>
> We are version 11 and counting, and it is not clear to me whether there is
> any chance of getting these patches reviewed and hopefully merged for the
> 5.10 merge window.
>
> There are a lot of different files being touched, so what would be the
> ideal way of routing those changes towards inclusion?

FYI, I offered to take the dma-mapping bits through the dma-mapping tree.
I have a bit of a backlog, but plan to review and, if Jim is OK with that,
apply the current version.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/1] iommu/vt-d: Use device numa domain if RHSA is missing
If there are multiple NUMA domains but the RHSA is missing in ACPI/DMAR
table, we could default to the device NUMA domain as fall back. This also
benefits the vIOMMU use case where only a single vIOMMU is exposed, hence
no RHSA will be present but device numa domain can be correct.

Cc: Jacob Pan
Cc: Kevin Tian
Cc: Ashok Raj
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/iommu.c | 31 +--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index e0516d64d7a3..bce158468abf 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -700,12 +700,41 @@ static int domain_update_iommu_superpage(struct dmar_domain *domain,
 	return fls(mask);
 }
 
+static int domain_update_device_node(struct dmar_domain *domain)
+{
+	struct device_domain_info *info;
+	int nid = NUMA_NO_NODE;
+
+	assert_spin_locked(&device_domain_lock);
+
+	if (list_empty(&domain->devices))
+		return NUMA_NO_NODE;
+
+	list_for_each_entry(info, &domain->devices, link) {
+		if (!info->dev)
+			continue;
+
+		nid = dev_to_node(info->dev);
+		if (nid != NUMA_NO_NODE)
+			break;
+	}
+
+	return nid;
+}
+
 /* Some capabilities may be different across iommus */
 static void domain_update_iommu_cap(struct dmar_domain *domain)
 {
 	domain_update_iommu_coherency(domain);
 	domain->iommu_snooping = domain_update_iommu_snooping(NULL);
 	domain->iommu_superpage = domain_update_iommu_superpage(domain, NULL);
+
+	/*
+	 * If RHSA is missing, we should default to the device numa domain
+	 * as fall back.
+	 */
+	if (domain->nid == NUMA_NO_NODE)
+		domain->nid = domain_update_device_node(domain);
 }
 
 struct context_entry *iommu_context_addr(struct intel_iommu *iommu, u8 bus,
@@ -5086,8 +5115,6 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type)
 		if (type == IOMMU_DOMAIN_DMA)
 			intel_init_iova_domain(dmar_domain);
 
-		domain_update_iommu_cap(dmar_domain);
-
 		domain = &dmar_domain->domain;
 		domain->geometry.aperture_start = 0;
 		domain->geometry.aperture_end   =
-- 
2.17.1
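For readers who want to try the fallback order outside the kernel, here is a
minimal userspace sketch. It is an illustration only, not code from the patch:
struct dev_info, first_device_node() and resolve_domain_node() are hypothetical
names, and the list walk is modeled with a plain array instead of
list_for_each_entry(). It assumes the same rule the patch describes: keep a
node that is already known (for example from RHSA), otherwise take the node of
the first attached device that reports one.

```c
#include <assert.h>
#include <stddef.h>

#define NUMA_NO_NODE (-1)

/* Hypothetical stand-in for struct device_domain_info (one attached device). */
struct dev_info {
	int has_dev;	/* models info->dev != NULL */
	int node;	/* models dev_to_node(info->dev) */
};

/* Return the node of the first device that reports one, else NUMA_NO_NODE. */
static int first_device_node(const struct dev_info *devs, size_t n)
{
	assert(devs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!devs[i].has_dev)
			continue;
		if (devs[i].node != NUMA_NO_NODE)
			return devs[i].node;
	}
	return NUMA_NO_NODE;
}

/* Keep an already known node; otherwise fall back to the device node. */
static int resolve_domain_node(int known_node, const struct dev_info *devs,
			       size_t n)
{
	if (known_node != NUMA_NO_NODE)
		return known_node;
	return first_device_node(devs, n);
}
```

With devices reporting (no device), NUMA_NO_NODE, 2 and 5 in that order, the
sketch picks node 2; a domain whose node is already set is left alone.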
RE: [PATCH v2 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications
> From: Lu Baolu
> Sent: Thursday, August 27, 2020 12:25 PM
>
> The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
> Description) that:
>
> If multiple control fields in this register need to be modified, software
> must serialize the modifications through multiple writes to this register.
>
> However, in irq_remapping.c, modifications of IRE and CFI are done in one
> write. We need to do two separate writes with STS checking after each.
>
> Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out
> security warning mess")
> Cc: Andy Lutomirski
> Cc: Jacob Pan
> Cc: Kevin Tian
> Cc: Ashok Raj
> Signed-off-by: Lu Baolu
> ---
>  drivers/iommu/intel/irq_remapping.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> Change log:
> v1->v2:
>   - v1 posted here
>     https://lore.kernel.org/linux-iommu/20200826025825.2322-1-baolu...@linux.intel.com/;
>   - Add status check before disabling CFI. (Kevin)
>
> diff --git a/drivers/iommu/intel/irq_remapping.c
> b/drivers/iommu/intel/irq_remapping.c
> index 9564d23d094f..7552bb7e92c8 100644
> --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -507,12 +507,19 @@ static void iommu_enable_irq_remapping(struct
> intel_iommu *iommu)
>
>  	/* Enable interrupt-remapping */
>  	iommu->gcmd |= DMA_GCMD_IRE;
> -	iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format MSIs */
>  	writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> -
>  	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
>  		      readl, (sts & DMA_GSTS_IRES), sts);
>
> +	/* Block compatibility-format MSIs */
> +	sts = readl(iommu->reg + DMAR_GSTS_REG);

No need for this readl, as the status is already there in IOMMU_WAIT_OP.

> +	if (sts & DMA_GSTS_CFIS) {
> +		iommu->gcmd &= ~DMA_GCMD_CFI;
> +		writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
> +		IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
> +			      readl, !(sts & DMA_GSTS_CFIS), sts);
> +	}
> +
>  	/*
>  	 * With CFI clear in the Global Command register, we should be
>  	 * protected from dangerous (i.e. compatibility) interrupts
> --
> 2.17.1
Re: [RESEND PATCH v4] iommu/mediatek: check 4GB mode by reading infracfg
On Wed, 2020-08-26 at 16:56 +0800, Miles Chen wrote:
> In previous discussion [1] and [2], we found that it is risky to
> use max_pfn or totalram_pages to tell if 4GB mode is enabled.
>
> Check 4GB mode by reading infracfg register, remove the usage
> of the un-exported symbol max_pfn.
>
> This is a step towards building mtk_iommu as a kernel module.
>
> [1] https://lore.kernel.org/lkml/20200603161132.2441-1-miles.c...@mediatek.com/
> [2] https://lore.kernel.org/lkml/20200604080120.2628-1-miles.c...@mediatek.com/
> [3] https://lore.kernel.org/lkml/20200715205120.GA778876@bogus/
>
> Cc: Mike Rapoport
> Cc: David Hildenbrand
> Cc: Yong Wu
> Cc: Yingjoe Chen
> Cc: Christoph Hellwig
> Cc: Rob Herring
> Cc: Matthias Brugger
> Signed-off-by: Miles Chen
>
> ---
>
> Change since v3
> - use lore.kernel.org links
> - move "change since..." after "---"
>
> Change since v2:
> - determine compatible string by m4u_plat
> - rebase to next-20200720
> - add "---"
>
> Change since v1:
> - remove the phandle usage, search for infracfg instead [3]
> - use infracfg instead of infracfg_regmap
> - move infracfg definitions to linux/soc/mediatek/infracfg.h
> - update enable_4GB only when has_4gb_mode
> ---
>  drivers/iommu/mtk_iommu.c             | 34 +++
>  include/linux/soc/mediatek/infracfg.h |  3 +++
>  2 files changed, 32 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 785b228d39a6..adc350150492 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -3,7 +3,6 @@
>   * Copyright (c) 2015-2016 MediaTek Inc.
>   * Author: Yong Wu
>   */
> -#include
>  #include
>  #include
>  #include
> @@ -15,13 +14,16 @@
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
>  #include
>  #include
>  #include
> +#include
>  #include
>  #include
> +#include
>  #include
>  #include
>
> @@ -640,8 +642,11 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  	struct resource         *res;
>  	resource_size_t		ioaddr;
>  	struct component_match  *match = NULL;
> +	struct regmap		*infracfg;
>  	void                    *protect;
>  	int                     i, larb_nr, ret;
> +	u32			val;
> +	char                    *p;
>
>  	data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
>  	if (!data)
> @@ -655,10 +660,29 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  		return -ENOMEM;
>  	data->protect_base = ALIGN(virt_to_phys(protect), MTK_PROTECT_PA_ALIGN);
>
> -	/* Whether the current dram is over 4GB */
> -	data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
> -	if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
> -		data->enable_4GB = false;
> +	data->enable_4GB = false;
> +	if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE)) {
> +		switch (data->plat_data->m4u_plat) {
> +		case M4U_MT2712:
> +			p = "mediatek,mt2712-infracfg";
> +			break;
> +		case M4U_MT8173:
> +			p = "mediatek,mt8173-infracfg";
> +			break;
> +		default:
> +			p = NULL;
> +		}
> +

This can be simplified:

	if (data->plat_data->m4u_plat == M4U_MT2712)
		p = "mediatek,mt2712-infracfg";
	else if (data->plat_data->m4u_plat == M4U_MT8173)
		p = "mediatek,mt8173-infracfg";
	else
		return -EINVAL;

Then,
Reviewed-by: Yong Wu

> +		infracfg = syscon_regmap_lookup_by_compatible(p);
> +
> +		if (IS_ERR(infracfg))
> +			return PTR_ERR(infracfg);
> +
> +		ret = regmap_read(infracfg, REG_INFRA_MISC, &val);
> +		if (ret)
> +			return ret;
> +		data->enable_4GB = !!(val & F_DDR_4GB_SUPPORT_EN);
> +	}
>
>  	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>  	data->base = devm_ioremap_resource(dev, res);
> diff --git a/include/linux/soc/mediatek/infracfg.h
> b/include/linux/soc/mediatek/infracfg.h
> index fd25f0148566..233463d789c6 100644
> --- a/include/linux/soc/mediatek/infracfg.h
> +++ b/include/linux/soc/mediatek/infracfg.h
> @@ -32,6 +32,9 @@
>  #define MT7622_TOP_AXI_PROT_EN_WB		(BIT(2) | BIT(6) | \
>  						 BIT(7) | BIT(8))
>
> +#define REG_INFRA_MISC				0xf00
> +#define F_DDR_4GB_SUPPORT_EN			BIT(13)
> +
>  int mtk_infracfg_set_bus_protection(struct regmap *infracfg, u32 mask,
>  		bool reg_update);
>  int mtk_infracfg_clear_bus_protection(struct regmap *infracfg, u32 mask,
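The selection logic under review can be tried in isolation with a small
userspace sketch. This is an illustration under stated assumptions, not the
driver code: infracfg_compatible() and is_4gb_mode() are hypothetical helpers,
the enum only needs to contain the two platforms named in the patch (the other
members are placeholders), and the register value is passed in directly
instead of being read through a regmap. The bit position follows the
F_DDR_4GB_SUPPORT_EN definition (bit 13) from the quoted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder platform list; only MT2712 and MT8173 come from the patch. */
enum m4u_plat { M4U_OTHER_A, M4U_MT2712, M4U_MT8173, M4U_OTHER_B };

#define F_DDR_4GB_SUPPORT_EN (1u << 13)	/* BIT(13), as in the patch */

/* Map a platform to its infracfg compatible string, or NULL if unknown. */
static const char *infracfg_compatible(enum m4u_plat plat)
{
	if (plat == M4U_MT2712)
		return "mediatek,mt2712-infracfg";
	if (plat == M4U_MT8173)
		return "mediatek,mt8173-infracfg";
	return NULL;	/* a caller would turn this into -EINVAL */
}

/* Decode the 4GB-mode flag from a raw INFRA_MISC register value. */
static int is_4gb_mode(uint32_t infra_misc)
{
	return !!(infra_misc & F_DDR_4GB_SUPPORT_EN);
}
```

Returning early for an unknown platform, as suggested in the review, avoids
handing a NULL compatible string to the lookup in the first place.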
[PATCH v2 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications
The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
Description) that:

If multiple control fields in this register need to be modified, software
must serialize the modifications through multiple writes to this register.

However, in irq_remapping.c, modifications of IRE and CFI are done in one
write. We need to do two separate writes with STS checking after each.

Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out security warning mess")
Cc: Andy Lutomirski
Cc: Jacob Pan
Cc: Kevin Tian
Cc: Ashok Raj
Signed-off-by: Lu Baolu
---
 drivers/iommu/intel/irq_remapping.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

Change log:
v1->v2:
  - v1 posted here
    https://lore.kernel.org/linux-iommu/20200826025825.2322-1-baolu...@linux.intel.com/;
  - Add status check before disabling CFI. (Kevin)

diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 9564d23d094f..7552bb7e92c8 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -507,12 +507,19 @@ static void iommu_enable_irq_remapping(struct intel_iommu *iommu)
 
 	/* Enable interrupt-remapping */
 	iommu->gcmd |= DMA_GCMD_IRE;
-	iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format MSIs */
 	writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
-
 	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
 		      readl, (sts & DMA_GSTS_IRES), sts);
 
+	/* Block compatibility-format MSIs */
+	sts = readl(iommu->reg + DMAR_GSTS_REG);
+	if (sts & DMA_GSTS_CFIS) {
+		iommu->gcmd &= ~DMA_GCMD_CFI;
+		writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
+		IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
+			      readl, !(sts & DMA_GSTS_CFIS), sts);
+	}
+
 	/*
 	 * With CFI clear in the Global Command register, we should be
 	 * protected from dangerous (i.e. compatibility) interrupts
-- 
2.17.1
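The serialization rule can be checked against a toy register model. The sketch
below is an assumption-laden illustration, not driver code: struct fake_iommu,
fake_write_gcmd() and enable_irq_remapping() are hypothetical, the bit
positions are illustrative, and the fake hardware mirrors each command bit
into the status register immediately, so the wait loops (standing in for
IOMMU_WAIT_OP) never actually spin. What it shows is the shape of the fix:
each write changes exactly one control field, and CFI is only cleared when the
status says it is set.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative bit positions; shared by command and status in this model. */
#define CMD_IRE (1u << 25)
#define CMD_CFI (1u << 23)

struct fake_iommu {
	uint32_t gcmd;			/* software shadow of the command register */
	uint32_t gsts;			/* simulated status register */
	unsigned int writes;		/* command-register writes issued */
	unsigned int multi_bit_writes;	/* writes that changed more than one field */
};

/* Model a command write; the fake hardware completes it immediately. */
static void fake_write_gcmd(struct fake_iommu *iommu, uint32_t val)
{
	uint32_t changed = val ^ iommu->gsts;

	iommu->writes++;
	if (changed & (changed - 1))	/* more than one bit differs */
		iommu->multi_bit_writes++;
	iommu->gsts = val;
}

/* Enable remapping first, then clear CFI in a separate write if it is set. */
static void enable_irq_remapping(struct fake_iommu *iommu)
{
	iommu->gcmd |= CMD_IRE;
	fake_write_gcmd(iommu, iommu->gcmd);
	while (!(iommu->gsts & CMD_IRE))
		;	/* stands in for the status wait */

	if (iommu->gsts & CMD_CFI) {
		iommu->gcmd &= ~CMD_CFI;
		fake_write_gcmd(iommu, iommu->gcmd);
		while (iommu->gsts & CMD_CFI)
			;	/* stands in for the status wait */
	}
}
```

Starting with CFI set, the model issues two single-field writes; starting with
CFI clear, it issues only one.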
Re: [PATCH 1/1] iommu/vt-d: Serialize IOMMU GCMD register modifications
Hi Kevin,

On 8/26/20 1:29 PM, Tian, Kevin wrote:
>> From: Lu Baolu
>> Sent: Wednesday, August 26, 2020 10:58 AM
>>
>> The VT-d spec requires (10.4.4 Global Command Register, GCMD_REG General
>> Description) that:
>>
>> If multiple control fields in this register need to be modified, software
>> must serialize the modifications through multiple writes to this register.
>>
>> However, in irq_remapping.c, modifications of IRE and CFI are done in one
>> write. We need to do two separate writes with STS checking after each.
>>
>> Fixes: af8d102f999a4 ("x86/intel/irq_remapping: Clean up x2apic opt-out
>> security warning mess")
>> Cc: Andy Lutomirski
>> Cc: Jacob Pan
>> Signed-off-by: Lu Baolu
>> ---
>>  drivers/iommu/intel/irq_remapping.c | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/irq_remapping.c
>> b/drivers/iommu/intel/irq_remapping.c
>> index 9564d23d094f..19d7e18876fe 100644
>> --- a/drivers/iommu/intel/irq_remapping.c
>> +++ b/drivers/iommu/intel/irq_remapping.c
>> @@ -507,12 +507,16 @@ static void iommu_enable_irq_remapping(struct
>> intel_iommu *iommu)
>>
>>  	/* Enable interrupt-remapping */
>>  	iommu->gcmd |= DMA_GCMD_IRE;
>> -	iommu->gcmd &= ~DMA_GCMD_CFI;  /* Block compatibility-format MSIs */
>>  	writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
>> -
>>  	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
>>  		      readl, (sts & DMA_GSTS_IRES), sts);
>>
>> +	/* Block compatibility-format MSIs */
>> +	iommu->gcmd &= ~DMA_GCMD_CFI;
>> +	writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
>> +	IOMMU_WAIT_OP(iommu, DMAR_GSTS_REG,
>> +		      readl, !(sts & DMA_GSTS_CFIS), sts);
>> +
>
> Better do it only when CFI is actually enabled (by checking sts).

Yes, makes sense. I will send a new version with this changed.

Best regards,
baolu

>>  	/*
>>  	 * With CFI clear in the Global Command register, we should be
>>  	 * protected from dangerous (i.e. compatibility) interrupts
>> --
>> 2.17.1
Re: Is: virtio_gpu_object_shmem_init issues? Was:Re: upstream boot error: general protection fault in swiotlb_map
Hello,

I had a similar panic when booting an ARM VM with kernel v5.9-rc1.
git bisect identified the following bad commit. After reverting it,
the VM boots OK. Maybe we should look into this commit.

d323bb44e4d23802eb25d13de1f93f2335bd60d0 is the first bad commit
commit d323bb44e4d23802eb25d13de1f93f2335bd60d0
Author: Daniel Vetter
Date:   Mon May 11 11:35:49 2020 +0200

    drm/virtio: Call the right shmem helpers

    drm_gem_shmem_get_sg_table is meant to implement
    obj->funcs->get_sg_table, for prime exporting. The one we want is
    drm_gem_shmem_get_pages_sgt, which also handles imported dma-buf,
    not just native objects.

    v2: Rebase, this stuff moved around in

    commit 2f2aa13724d56829d910b2fa8e80c502d388f106
    Author: Gerd Hoffmann
    Date:   Fri Feb 7 08:46:38 2020 +0100

        drm/virtio: move virtio_gpu_mem_entry initialization to new function

    Acked-by: Thomas Zimmermann
    Signed-off-by: Daniel Vetter
    Cc: David Airlie
    Cc: Gerd Hoffmann
    Cc: virtualizat...@lists.linux-foundation.org
    Link: https://patchwork.freedesktop.org/patch/msgid/20200511093554.211493-5-daniel.vet...@ffwll.ch

Thank you,
Thomas

On 2020-08-24 11:06 a.m., Konrad Rzeszutek Wilk wrote:
On Thu, Aug 06, 2020 at 03:46:23AM -0700, syzbot wrote:
Hello,

syzbot found the following issue on:

HEAD commit:    47ec5303 Merge git://git.kernel.org/pub/scm/linux/kernel/g..
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16fe1dea90
kernel config:  https://syzkaller.appspot.com/x/.config?x=7c06047f622c5724
dashboard link: https://syzkaller.appspot.com/bug?extid=3f86afd0b1e4bf1cb64c
compiler:       gcc (GCC) 10.1.0-syz 20200507

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+3f86afd0b1e4bf1cb...@syzkaller.appspotmail.com

ceph: loaded (mds proto 32)
NET: Registered protocol family 38
async_tx: api initialized (async)
Key type asymmetric registered
Asymmetric key parser 'x509' registered
Asymmetric key parser 'pkcs8' registered
Key type pkcs7_test registered
Asymmetric key parser 'tpm_parser' registered
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 243)
io scheduler mq-deadline registered
io scheduler kyber registered
io scheduler bfq registered
hgafb: HGA card not detected.
hgafb: probe of hgafb.0 failed with error -22
usbcore: registered new interface driver udlfb
uvesafb: failed to execute /sbin/v86d
uvesafb: make sure that the v86d helper is installed and executable
uvesafb: Getting VBE info block failed (eax=0x4f00, err=-2)
uvesafb: vbe_init() failed with -22
uvesafb: probe of uvesafb.0 failed with error -22
vga16fb: mapped to 0x8aac772d
Console: switching to colour frame buffer device 80x30
fb0: VGA16 VGA frame buffer device
input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
ACPI: Power Button [PWRF]
ioatdma: Intel(R) QuickData Technology Driver 5.00
PCI Interrupt Link [GSIF] enabled at IRQ 21
PCI Interrupt Link [GSIG] enabled at IRQ 22
PCI Interrupt Link [GSIH] enabled at IRQ 23
N_HDLC line discipline registered with maxframe=4096
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
00:05: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
Cyclades driver 2.6
Initializing Nozomi driver 2.1d
RocketPort device driver module, version 2.09, 12-June-2003
No rocketport ports found; unloading driver
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
[drm] Initialized vgem 1.0.0 20120112 for vgem on minor 0
[drm] Initialized vkms 1.0.0 20180514 for vkms on minor 1
usbcore: registered new interface driver udl
[drm] pci: virtio-vga detected at :00:01.0
fb0: switching to virtiodrmfb from VGA16 VGA
Console: switching to colour VGA+ 80x25
virtio-pci :00:01.0: vgaarb: deactivate vga console
Console: switching to colour dummy device 80x25
[drm] features: -virgl +edid
[drm] number of scanouts: 1
[drm] number of cap sets: 0
[drm] Initialized virtio_gpu 0.1.0 0 for virtio0 on minor 2
general protection fault, probably for non-canonical address 0xdc00: [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x-0x0007]
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.8.0-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
RIP: 0010:swiotlb_map+0x5ac/0x700 kernel/dma/swiotlb.c:683
Code: 28 04 00 00 48 c1 ea 03 80 3c 02 00 0f 85 4d 01 00 00 4c 8b a5 18 04 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 1e 01 00 00 48 8d 7d 50 4d 8b 24 24 48 b8 00 00
RSP: :c934f3e0 EFLAGS: 00010246
RAX: dc00 RBX: RCX: 8162cc1d
RDX: RSI: 8162cc98 RDI: 88802971a470
RBP: 88802971a048 R08: 0001 R09: 8c5dba77
R10: R11: R12:
R13: 7ac0 R14: d
Re: [patch V2 29/46] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
On Wed, 26 Aug 2020 20:47:38 +0100, Thomas Gleixner wrote:
>
> On Wed, Aug 26 2020 at 20:06, Marc Zyngier wrote:
> > On Wed, 26 Aug 2020 12:16:57 +0100,
> > Thomas Gleixner wrote:
> >>  /**
> >> - * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain associated tp @dev
> >> - * @domain:	The domain to managing the interrupts
> >> + * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain
> >> + * @domain:	The domain to allocate from
> >>   * @dev:	Pointer to device struct of the device for which the interrupts
> >> - *		are free
> >> + *		are allocated
> >> + * @nvec:	The number of interrupts to allocate
> >> + *
> >> + * Returns 0 on success or an error code.
> >>   */
> >> -void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev)
> >> +int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
> >> +			  int nvec)
> >> +{
> >> +	struct msi_domain_info *info = domain->host_data;
> >> +	struct msi_domain_ops *ops = info->ops;
> >
> > Rework leftovers, I imagine.
>
> Hmm, no. How would it call ops->domain_alloc_irqs() without getting the
> ops. I know, that the diff is horrible, but don't blame me for it. diff
> sucks at times.

I can't read. Time to put the laptop away!

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [patch V2 04/46] genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
On Wed, 26 Aug 2020 22:19:56 +0100, Thomas Gleixner wrote:
>
> On Wed, Aug 26 2020 at 20:50, Marc Zyngier wrote:
> > On Wed, 26 Aug 2020 12:16:32 +0100,
> > Thomas Gleixner wrote:
> >> ---
> >> V2: New patch. Note, that this might break other stuff which relies on the
> >>     current behaviour, but the hierarchy composition of DT based chips is
> >>     really hard to follow.
> >

[...]

> What about the below?
>
> Thanks,
>
>         tglx
> ---
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -473,6 +473,15 @@ static inline void irq_domain_deactivate
>  }
>  #endif
>
> +static inline struct irq_data *irqd_get_parent_data(struct irq_data *irqd)
> +{
> +#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
> +	return irqd->parent_data;
> +#else
> +	return NULL;
> +#endif
> +}
> +

We obviously should have had this forever.

>  #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
>  #include
>
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -1541,18 +1541,17 @@ EXPORT_SYMBOL_GPL(irq_chip_release_resou
>   */
>  int irq_chip_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
>  {
> -	struct irq_data *pos = NULL;
> +	struct irq_data *pos;
>
> -#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
> -	for (; data; data = data->parent_data)
> -#endif
> +	for (pos = NULL; !pos && data; data = irqd_get_parent_data(data)) {
>  		if (data->chip && data->chip->irq_compose_msi_msg)
>  			pos = data;
> +	}
> +
>  	if (!pos)
>  		return -ENOSYS;
>
>  	pos->chip->irq_compose_msi_msg(pos, msg);
> -
>  	return 0;
>  }

Perfect, ship it! ;-)

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
On Wed, Aug 26 2020 at 22:14, Marc Zyngier wrote:
> On Wed, 26 Aug 2020 12:17:02 +0100,
> Thomas Gleixner wrote:
>> @@ -103,6 +105,7 @@ config PCIE_XILINX_CPM
>>  	bool "Xilinx Versal CPM host bridge support"
>>  	depends on ARCH_ZYNQMP || COMPILE_TEST
>>  	select PCI_HOST_COMMON
>> +	select PCI_MSI_ARCH_FALLBACKS
>
> This guy actually doesn't implement MSIs at all (it seems to delegate
> them to an ITS present in the system, if I read the DT binding
> correctly). However its older brother from the same silicon dealer
> seems to need it. The patchlet below should fix it.

Gah, at some point my eyes went squared and I lost track..
Re: [patch V2 41/46] platform-msi: Provide default irq_chip:: Ack
On Wed, 26 Aug 2020 12:17:09 +0100, Thomas Gleixner wrote:
>
> From: Thomas Gleixner
>
> For the upcoming device MSI support it's required to have a default
> irq_chip::ack implementation (irq_chip_ack_parent) so the drivers do not
> need to care.
>
> Signed-off-by: Thomas Gleixner
>
> ---
>  drivers/base/platform-msi.c |    2 ++
>  1 file changed, 2 insertions(+)
>
> --- a/drivers/base/platform-msi.c
> +++ b/drivers/base/platform-msi.c
> @@ -95,6 +95,8 @@ static void platform_msi_update_chip_ops
>  		chip->irq_mask = irq_chip_mask_parent;
>  	if (!chip->irq_unmask)
>  		chip->irq_unmask = irq_chip_unmask_parent;
> +	if (!chip->irq_ack)
> +		chip->irq_ack = irq_chip_ack_parent;
>  	if (!chip->irq_eoi)
>  		chip->irq_eoi = irq_chip_eoi_parent;
>  	if (!chip->irq_set_affinity)
>
>

Acked-by: Marc Zyngier

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [patch V2 04/46] genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
On Wed, Aug 26 2020 at 20:50, Marc Zyngier wrote:
> On Wed, 26 Aug 2020 12:16:32 +0100,
> Thomas Gleixner wrote:
>> ---
>> V2: New patch. Note, that this might break other stuff which relies on the
>>     current behaviour, but the hierarchy composition of DT based chips is
>>     really hard to follow.
>
> Grepping around, I don't think there is any occurrence of two irqchips
> providing irq_compose_msi() that can share a hierarchy on any real
> system, so we should be fine. Famous last words.

Knocking on wood :)

>>  #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
>> -	for (; data; data = data->parent_data)
>> -#endif
>> -		if (data->chip && data->chip->irq_compose_msi_msg)
>> +	for (; data; data = data->parent_data) {
>> +		if (data->chip && data->chip->irq_compose_msi_msg) {
>>  			pos = data;
>> +			break;
>> +		}
>> +	}
>> +#else
>> +	if (data->chip && data->chip->irq_compose_msi_msg)
>> +		pos = data;
>> +#endif
>>  	if (!pos)
>>  		return -ENOSYS;
>
> Is it just me, or is this last change more complex than it ought to
> be?

Kinda.

> diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
> index 857f5f4c8098..25e18b73699c 100644
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -1544,7 +1544,7 @@ int irq_chip_compose_msi_msg(struct irq_data *data,
> struct msi_msg *msg)
>  	struct irq_data *pos = NULL;
>
>  #ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
> -	for (; data; data = data->parent_data)
> +	for (; data && !pos; data = data->parent_data)
>  #endif
>  		if (data->chip && data->chip->irq_compose_msi_msg)
>  			pos = data;
>
> Though the for loop in a #ifdef is admittedly an acquired taste...

Checking !pos is simpler obviously. That doesn't make me hate the loop
in the #ifdef less. :)

What about the below?

Thanks,

        tglx
---
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -473,6 +473,15 @@ static inline void irq_domain_deactivate
 }
 #endif
 
+static inline struct irq_data *irqd_get_parent_data(struct irq_data *irqd)
+{
+#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
+	return irqd->parent_data;
+#else
+	return NULL;
+#endif
+}
+
 #ifdef CONFIG_GENERIC_IRQ_DEBUGFS
 #include
 
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1541,18 +1541,17 @@ EXPORT_SYMBOL_GPL(irq_chip_release_resou
  */
 int irq_chip_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 {
-	struct irq_data *pos = NULL;
+	struct irq_data *pos;
 
-#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
-	for (; data; data = data->parent_data)
-#endif
+	for (pos = NULL; !pos && data; data = irqd_get_parent_data(data)) {
 		if (data->chip && data->chip->irq_compose_msi_msg)
 			pos = data;
+	}
+
 	if (!pos)
 		return -ENOSYS;
 
 	pos->chip->irq_compose_msi_msg(pos, msg);
-
 	return 0;
 }
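The "first chip wins" walk discussed in this thread can be exercised in
userspace with a small model. This is only a sketch under stated assumptions:
struct chip_model, struct irq_data_model, get_parent() and compose_msi_msg()
are hypothetical stand-ins for the kernel structures, and the compose callback
is reduced to a flag plus an id so the result can be inspected. The loop has
the same shape as the proposal above: stop at the first level of the hierarchy
whose chip can compose a message, instead of continuing to the outermost one.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_chip, irq_data and msi_msg. */
struct chip_model {
	int id;
	int can_compose;	/* models chip->irq_compose_msi_msg != NULL */
};

struct irq_data_model {
	const struct chip_model *chip;
	struct irq_data_model *parent;
};

struct msg_model {
	int composed_by;	/* id of the chip that composed the message */
};

/* Models irqd_get_parent_data() with hierarchy support enabled. */
static struct irq_data_model *get_parent(struct irq_data_model *d)
{
	return d->parent;
}

/* Walk towards the root and let the first capable chip compose the message. */
static int compose_msi_msg(struct irq_data_model *data, struct msg_model *msg)
{
	struct irq_data_model *pos;

	for (pos = NULL; !pos && data; data = get_parent(data)) {
		if (data->chip && data->chip->can_compose)
			pos = data;
	}

	if (!pos)
		return -ENOSYS;

	msg->composed_by = pos->chip->id;
	return 0;
}
```

In a three-level hierarchy where the middle and root chips can both compose,
the model reports the middle chip; the older loop without the !pos check would
have ended up at the root.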
Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
On Wed, 26 Aug 2020 12:17:02 +0100, Thomas Gleixner wrote:
>
> From: Thomas Gleixner
>
> The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture
> requires them or not. Architectures which are fully utilizing hierarchical
> irq domains should never call into that code.
>
> It's not only architectures which depend on that by implementing one or
> more of the weak functions, there is also a bunch of drivers which relies
> on the weak functions which invoke msi_controller::setup_irq[s] and
> msi_controller::teardown_irq.
>
> Make the architectures and drivers which rely on them select them in Kconfig
> and if not selected replace them by stub functions which emit a warning and
> fail the PCI/MSI interrupt allocation.
>
> Signed-off-by: Thomas Gleixner
> ---
> V2: Make the architectures (and drivers) which need the fallbacks select them
>     and not the other way round (Bjorn).
> ---
>  arch/ia64/Kconfig              |    1 +
>  arch/mips/Kconfig              |    1 +
>  arch/powerpc/Kconfig           |    1 +
>  arch/s390/Kconfig              |    1 +
>  arch/sparc/Kconfig             |    1 +
>  arch/x86/Kconfig               |    1 +
>  drivers/pci/Kconfig            |    3 +++
>  drivers/pci/controller/Kconfig |    3 +++
>  drivers/pci/msi.c              |    3 ++-
>  include/linux/msi.h            |   31 ++-
>  10 files changed, 40 insertions(+), 6 deletions(-)
>

[...]

> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -41,6 +41,7 @@ config PCI_TEGRA
>  	bool "NVIDIA Tegra PCIe controller"
>  	depends on ARCH_TEGRA || COMPILE_TEST
>  	depends on PCI_MSI_IRQ_DOMAIN
> +	select PCI_MSI_ARCH_FALLBACKS
>  	help
>  	  Say Y here if you want support for the PCIe host controller found
>  	  on NVIDIA Tegra SoCs.
> @@ -67,6 +68,7 @@ config PCIE_RCAR_HOST
>  	bool "Renesas R-Car PCIe host controller"
>  	depends on ARCH_RENESAS || COMPILE_TEST
>  	depends on PCI_MSI_IRQ_DOMAIN
> +	select PCI_MSI_ARCH_FALLBACKS
>  	help
>  	  Say Y here if you want PCIe controller support on R-Car SoCs in host
>  	  mode.
> @@ -103,6 +105,7 @@ config PCIE_XILINX_CPM
>  	bool "Xilinx Versal CPM host bridge support"
>  	depends on ARCH_ZYNQMP || COMPILE_TEST
>  	select PCI_HOST_COMMON
> +	select PCI_MSI_ARCH_FALLBACKS

This guy actually doesn't implement MSIs at all (it seems to delegate
them to an ITS present in the system, if I read the DT binding
correctly). However its older brother from the same silicon dealer
seems to need it. The patchlet below should fix it.

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 9ad13919bcaa..f56ff049d469 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -96,6 +96,7 @@ config PCI_HOST_GENERIC
 config PCIE_XILINX
 	bool "Xilinx AXI PCIe host bridge support"
+	select PCI_MSI_ARCH_FALLBACKS
 	depends on OF || COMPILE_TEST
 	help
 	  Say 'Y' here if you want kernel to support the Xilinx AXI PCIe
@@ -105,7 +106,6 @@ config PCIE_XILINX_CPM
 	bool "Xilinx Versal CPM host bridge support"
 	depends on ARCH_ZYNQMP || COMPILE_TEST
 	select PCI_HOST_COMMON
-	select PCI_MSI_ARCH_FALLBACKS
 	help
 	  Say 'Y' here if you want kernel support for the
 	  Xilinx Versal CPM host bridge.

With that fixed,

Acked-by: Marc Zyngier

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [patch V2 23/46] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
On Wed, 2020-08-26 at 21:42 +0100, Marc Zyngier wrote:
> On Wed, 26 Aug 2020 12:16:51 +0100,
> Thomas Gleixner wrote:
> > From: Thomas Gleixner
> >
> > PCI devices behind a VMD bus are not subject to interrupt remapping, but
> > the irq domain for VMD MSI cannot be distinguished from a regular PCI/MSI
> > irq domain.
> >
> > Add a new domain bus token and allow it in the bus token check in
> > msi_check_reservation_mode() to keep the functionality the same once VMD
> > uses this token.
> >
> > Signed-off-by: Thomas Gleixner
> >
> > ---
> >  include/linux/irqdomain.h |    1 +
> >  kernel/irq/msi.c          |    7 ++-
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > --- a/include/linux/irqdomain.h
> > +++ b/include/linux/irqdomain.h
> > @@ -84,6 +84,7 @@ enum irq_domain_bus_token {
> >  	DOMAIN_BUS_FSL_MC_MSI,
> >  	DOMAIN_BUS_TI_SCI_INTA_MSI,
> >  	DOMAIN_BUS_WAKEUP,
> > +	DOMAIN_BUS_VMD_MSI,
> >  };
> >
> >  /**
> > --- a/kernel/irq/msi.c
> > +++ b/kernel/irq/msi.c
> > @@ -370,8 +370,13 @@ static bool msi_check_reservation_mode(s
> >  {
> >  	struct msi_desc *desc;
> >
> > -	if (domain->bus_token != DOMAIN_BUS_PCI_MSI)
> > +	switch(domain->bus_token) {
> > +	case DOMAIN_BUS_PCI_MSI:
> > +	case DOMAIN_BUS_VMD_MSI:
> > +		break;
> > +	default:
> >  		return false;
> > +	}
> >
> >  	if (!(info->flags & MSI_FLAG_MUST_REACTIVATE))
> >  		return false;
>
> Acked-by: Marc Zyngier
>
>	M.
>

Acked-by: Jon Derrick
Re: [patch V2 15/46] x86/irq: Consolidate DMAR irq allocation
On Wed, Aug 26 2020 at 20:32, Thomas Gleixner wrote:
> On Wed, Aug 26 2020 at 09:50, Megha Dey wrote:
>>> @@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll
>>>  static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info,
>>>  					  msi_alloc_info_t *arg)
>>>  {
>>> -	return arg->dmar_id;
>>> +	return arg->hwirq;
>>
>> Shouldn't this return the arg->devid which gets set in dmar_alloc_hwirq?
>
> Indeed.

But for simplicity we can set arg->hwirq to the dmar id right in the alloc
function and then once the generic ops are enabled remove the dmar callback
completely.
Re: [patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
On Wed, 26 Aug 2020 12:16:52 +0100, Thomas Gleixner wrote: > > From: Thomas Gleixner > > Devices on the VMD bus use their own MSI irq domain, but it is not > distinguishable from regular PCI/MSI irq domains. This is required > to exclude VMD devices from getting the irq domain pointer set by > interrupt remapping. > > Override the default bus token. > > Signed-off-by: Thomas Gleixner > Acked-by: Bjorn Helgaas > --- > drivers/pci/controller/vmd.c |6 ++ > 1 file changed, 6 insertions(+) > > --- a/drivers/pci/controller/vmd.c > +++ b/drivers/pci/controller/vmd.c > @@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_ > return -ENODEV; > } > > + /* > + * Override the irq domain bus token so the domain can be distinguished > + * from a regular PCI/MSI domain. > + */ > + irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI); > + One day, we'll be able to set the token at domain creation time. In the meantime, Acked-by: Marc Zyngier M. -- Without deviation from the norm, progress is not possible. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch V2 19/46] x86/msi: Use generic MSI domain ops
On Wed, Aug 26 2020 at 21:21, Marc Zyngier wrote:
> On Wed, 26 Aug 2020 12:16:47 +0100,
> Thomas Gleixner wrote:
>> -void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc)
>> -{
>> -	arg->desc = desc;
>> -	arg->hwirq = pci_msi_domain_calc_hwirq(desc);
>> -}
>> -EXPORT_SYMBOL_GPL(pci_msi_set_desc);
>
> I think that at this stage, pci_msi_domain_calc_hwirq() can be made
> static, as it was only ever exported for this call site. Nice cleanup!

Doh indeed. Let me fix that.
Re: [patch V2 23/46] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
On Wed, 26 Aug 2020 12:16:51 +0100, Thomas Gleixner wrote: > > From: Thomas Gleixner > > PCI devices behind a VMD bus are not subject to interrupt remapping, but > the irq domain for VMD MSI cannot be distinguished from a regular PCI/MSI > irq domain. > > Add a new domain bus token and allow it in the bus token check in > msi_check_reservation_mode() to keep the functionality the same once VMD > uses this token. > > Signed-off-by: Thomas Gleixner > > --- > include/linux/irqdomain.h |1 + > kernel/irq/msi.c |7 ++- > 2 files changed, 7 insertions(+), 1 deletion(-) > > --- a/include/linux/irqdomain.h > +++ b/include/linux/irqdomain.h > @@ -84,6 +84,7 @@ enum irq_domain_bus_token { > DOMAIN_BUS_FSL_MC_MSI, > DOMAIN_BUS_TI_SCI_INTA_MSI, > DOMAIN_BUS_WAKEUP, > + DOMAIN_BUS_VMD_MSI, > }; > > /** > --- a/kernel/irq/msi.c > +++ b/kernel/irq/msi.c > @@ -370,8 +370,13 @@ static bool msi_check_reservation_mode(s > { > struct msi_desc *desc; > > - if (domain->bus_token != DOMAIN_BUS_PCI_MSI) > + switch(domain->bus_token) { > + case DOMAIN_BUS_PCI_MSI: > + case DOMAIN_BUS_VMD_MSI: > + break; > + default: > return false; > + } > > if (!(info->flags & MSI_FLAG_MUST_REACTIVATE)) > return false; Acked-by: Marc Zyngier M. -- Without deviation from the norm, progress is not possible. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
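For readers following the thread, the effect of the new bus-token check can be modelled outside the kernel. What follows is a minimal user-space sketch, not kernel code: the enum subset, the trimmed struct and the name reservation_mode_token_ok() are stand-ins invented for illustration. Only the shape of the switch mirrors the quoted hunk.

```c
#include <stdbool.h>

/* Illustrative subset standing in for enum irq_domain_bus_token. */
enum bus_token {
	BUS_ANY,
	BUS_PCI_MSI,
	BUS_WAKEUP,
	BUS_VMD_MSI,
};

/* Hypothetical, trimmed-down irq domain: only the token matters here. */
struct domain {
	enum bus_token bus_token;
};

/*
 * Models the token test at the top of msi_check_reservation_mode() after
 * the patch: PCI/MSI and VMD MSI domains pass, everything else is rejected.
 */
static bool reservation_mode_token_ok(const struct domain *d)
{
	switch (d->bus_token) {
	case BUS_PCI_MSI:
	case BUS_VMD_MSI:
		return true;
	default:
		return false;
	}
}
```

With this shape, adding another token that should behave like PCI/MSI is a one-line case label, which is presumably why the patch moved from an if to a switch.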
Re: [patch V2 17/46] PCI/MSI: Rework pci_msi_domain_calc_hwirq()
On Wed, 26 Aug 2020 12:16:45 +0100,
Thomas Gleixner wrote:
>
> From: Thomas Gleixner
>
> Retrieve the PCI device from the msi descriptor instead of doing so at the
> call sites.
>
> Signed-off-by: Thomas Gleixner
> Acked-by: Bjorn Helgaas

Acked-by: Marc Zyngier

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [patch V2 19/46] x86/msi: Use generic MSI domain ops
On Wed, 26 Aug 2020 12:16:47 +0100, Thomas Gleixner wrote: > > From: Thomas Gleixner > > pci_msi_get_hwirq() and pci_msi_set_desc are not longer special. Enable the > generic MSI domain ops in the core and PCI MSI code unconditionally and get > rid of the x86 specific implementations in the X86 MSI code and in the > hyperv PCI driver. > > Signed-off-by: Thomas Gleixner > > --- > arch/x86/include/asm/msi.h |2 -- > arch/x86/kernel/apic/msi.c | 15 --- > drivers/pci/controller/pci-hyperv.c |8 > drivers/pci/msi.c |4 > kernel/irq/msi.c|6 -- > 5 files changed, 35 deletions(-) > > --- a/arch/x86/include/asm/msi.h > +++ b/arch/x86/include/asm/msi.h > @@ -9,6 +9,4 @@ typedef struct irq_alloc_info msi_alloc_ > int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, > msi_alloc_info_t *arg); > > -void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc); > - > #endif /* _ASM_X86_MSI_H */ > --- a/arch/x86/kernel/apic/msi.c > +++ b/arch/x86/kernel/apic/msi.c > @@ -204,12 +204,6 @@ void native_teardown_msi_irq(unsigned in > irq_domain_free_irqs(irq, 1); > } > > -static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info, > - msi_alloc_info_t *arg) > -{ > - return arg->hwirq; > -} > - > int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, > msi_alloc_info_t *arg) > { > @@ -228,17 +222,8 @@ int pci_msi_prepare(struct irq_domain *d > } > EXPORT_SYMBOL_GPL(pci_msi_prepare); > > -void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc) > -{ > - arg->desc = desc; > - arg->hwirq = pci_msi_domain_calc_hwirq(desc); > -} > -EXPORT_SYMBOL_GPL(pci_msi_set_desc); I think that at this stage, pci_msi_domain_calc_hwirq() can be made static, as it was only ever exported for this call site. Nice cleanup! Reviewed-by: Marc Zyngier M. -- Without deviation from the norm, progress is not possible. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch V2 04/46] genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
On Wed, 26 Aug 2020 12:16:32 +0100, Thomas Gleixner wrote: > > The documentation of irq_chip_compose_msi_msg() claims that with > hierarchical irq domains the first chip in the hierarchy which has an > irq_compose_msi_msg() callback is chosen. But the code just keeps > iterating after it finds a chip with a compose callback. > > The x86 HPET MSI implementation relies on that behaviour, but that does not > make it more correct. > > The message should always be composed at the domain which manages the > underlying resource (e.g. APIC or remap table) because that domain knows > about the required layout of the message. > > On X86 the following hierarchies exist: > > 1) vector PCI/MSI > 2) vector -- IR -- PCI/MSI > > The vector domain has a different message format than the IR (remapping) > domain. So obviously the PCI/MSI domain can't compose the message without > having knowledge about the parent domain, which is exactly the opposite of > what hierarchical domains want to achieve. > > X86 actually has two different PCI/MSI chips where #1 has a compose > callback and #2 does not. #2 delegates the composition to the remap domain > where it belongs, but #1 does it at the PCI/MSI level. > > For the upcoming device MSI support it's necessary to change this and just > let the first domain which can compose the message take care of it. That > way the top level chip does not have to worry about it and the device MSI > code does not need special knowledge about topologies. It just sets the > compose callback to NULL and lets the hierarchy pick the first chip which > has one. > > Due to that the attempt to move the compose callback from the direct > delivery PCI/MSI domain to the vector domain made the system fail to boot > with interrupt remapping enabled because in the remapping case > irq_chip_compose_msi_msg() keeps iterating and choses the compose callback > of the vector domain which obviously creates the wrong format for the remap > table. 
> > Break out of the loop when the first irq chip with a compose callback is > found and fixup the HPET code temporarily. That workaround will be removed > once the direct delivery compose callback is moved to the place where it > belongs in the vector domain. > > Signed-off-by: Thomas Gleixner > --- > V2: New patch. Note, that this might break other stuff which relies on the > current behaviour, but the hierarchy composition of DT based chips is > really hard to follow. Grepping around, I don't think there is any occurrence of two irqchips providing irq_compose_msi() that can share a hierarchy on any real system, so we should be fine. Famous last words. > --- > arch/x86/kernel/apic/msi.c |7 +-- > kernel/irq/chip.c | 12 +--- > 2 files changed, 14 insertions(+), 5 deletions(-) > > --- a/arch/x86/kernel/apic/msi.c > +++ b/arch/x86/kernel/apic/msi.c > @@ -479,10 +479,13 @@ struct irq_domain *hpet_create_irq_domai > info.type = X86_IRQ_ALLOC_TYPE_HPET; > info.hpet_id = hpet_id; > parent = irq_remapping_get_ir_irq_domain(&info); > - if (parent == NULL) > + if (parent == NULL) { > parent = x86_vector_domain; > - else > + } else { > hpet_msi_controller.name = "IR-HPET-MSI"; > + /* Temporary fix: Will go away */ > + hpet_msi_controller.irq_compose_msi_msg = NULL; > + } > > fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name, > hpet_id); > --- a/kernel/irq/chip.c > +++ b/kernel/irq/chip.c > @@ -1544,10 +1544,16 @@ int irq_chip_compose_msi_msg(struct irq_ > struct irq_data *pos = NULL; > > #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY > - for (; data; data = data->parent_data) > -#endif > - if (data->chip && data->chip->irq_compose_msi_msg) > + for (; data; data = data->parent_data) { > + if (data->chip && data->chip->irq_compose_msi_msg) { > pos = data; > + break; > + } > + } > +#else > + if (data->chip && data->chip->irq_compose_msi_msg) > + pos = data; > +#endif > if (!pos) > return -ENOSYS; > > > Is it just me, or is this last change more complex than it ought to be? 
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c index 857f5f4c8098..25e18b73699c 100644 --- a/kernel/irq/chip.c +++ b/kernel/irq/chip.c @@ -1544,7 +1544,7 @@ int irq_chip_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) struct irq_data *pos = NULL; #ifdef CONFIG_IRQ_DOMAIN_HIERARCHY - for (; data; data = data->parent_data) + for (; data && !pos; data = data->parent_data) #endif if (data->chip && data->chip->irq_compose_msi_msg) pos = data; Though the for loop in a #ifdef in admittedly an acquired taste... Reviewed-by: Marc Zyngier M. -- Without deviation from the norm, progress
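The difference between "last chip wins" and "first chip wins" discussed in this message can be checked with a small user-space model. This is only a sketch under assumptions: struct chip, struct data, compose_stub() and the two pick_* helpers are hypothetical stand-ins for struct irq_chip, struct irq_data and the loop in irq_chip_compose_msi_msg(). The second helper uses the "data && !pos" loop condition suggested in the reply above.

```c
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_chip and struct irq_data. */
struct chip {
	const char *name;
	int (*compose)(void);	/* plays the role of irq_compose_msi_msg */
};

struct data {
	const struct chip *chip;
	struct data *parent_data;
};

/* Placeholder callback; only its presence matters for the selection. */
static int compose_stub(void)
{
	return 0;
}

/* Old behaviour: keep walking, so the last chip with a callback wins. */
static const struct data *pick_last(const struct data *d)
{
	const struct data *pos = NULL;

	for (; d; d = d->parent_data)
		if (d->chip && d->chip->compose)
			pos = d;
	return pos;
}

/* Behaviour after the patch: stop at the first chip with a callback. */
static const struct data *pick_first(const struct data *d)
{
	const struct data *pos = NULL;

	for (; d && !pos; d = d->parent_data)
		if (d->chip && d->chip->compose)
			pos = d;
	return pos;
}
```

For a modelled hierarchy of PCI/MSI (no callback) on top of IR on top of vector, pick_first() selects the IR level while pick_last() selects the vector level, which matches the boot failure described in the changelog.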
Re: [patch V2 29/46] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
On Wed, Aug 26 2020 at 20:06, Marc Zyngier wrote: > On Wed, 26 Aug 2020 12:16:57 +0100, > Thomas Gleixner wrote: >> /** >> - * msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain >> associated tp @dev >> - * @domain: The domain to managing the interrupts >> + * msi_domain_alloc_irqs - Allocate interrupts from a MSI interrupt domain >> + * @domain: The domain to allocate from >> * @dev:Pointer to device struct of the device for which the interrupts >> - * are free >> + * are allocated >> + * @nvec: The number of interrupts to allocate >> + * >> + * Returns 0 on success or an error code. >> */ >> -void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev) >> +int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, >> + int nvec) >> +{ >> +struct msi_domain_info *info = domain->host_data; >> +struct msi_domain_ops *ops = info->ops; > > Rework leftovers, I imagine. Hmm, no. How would it call ops->domain_alloc_irqs() without getting the ops. I know, that the diff is horrible, but don't blame me for it. diff sucks at times. >> + >> +return ops->domain_alloc_irqs(domain, dev, nvec); >> +} >> + >> +void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev) >> { >> struct msi_desc *desc; >> >> @@ -513,6 +525,20 @@ void msi_domain_free_irqs(struct irq_dom >> } >> >> /** >> + * __msi_domain_free_irqs - Free interrupts from a MSI interrupt @domain >> associated tp @dev > > Spurious __. Yup. Thanks, tglx ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
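The reason the wrapper has to fetch the ops first may be easier to see in isolation. Below is a minimal user-space sketch, assuming a much-reduced ops structure: struct dom, struct dom_ops, default_alloc(), override_alloc(), update_ops() and alloc_irqs() are all hypothetical names standing in for the irq domain, struct msi_domain_ops, __msi_domain_alloc_irqs(), a XEN-like override, the default-filling step and msi_domain_alloc_irqs(). The "default when NULL" rule is taken from the kernel-doc in the patch under discussion.

```c
#include <stddef.h>

struct dom;

/* Hypothetical, trimmed-down version of struct msi_domain_ops. */
struct dom_ops {
	int (*domain_alloc_irqs)(struct dom *d, int nvec);
};

struct dom {
	struct dom_ops *ops;
};

/* Plays the role of the generic allocation function. */
static int default_alloc(struct dom *d, int nvec)
{
	(void)d;
	return nvec > 0 ? 0 : -1;
}

/* Plays the role of a special-purpose override; 42 is an arbitrary marker. */
static int override_alloc(struct dom *d, int nvec)
{
	(void)d;
	(void)nvec;
	return 42;
}

/* Fill in the default when no override was supplied. */
static void update_ops(struct dom_ops *ops)
{
	if (ops->domain_alloc_irqs == NULL)
		ops->domain_alloc_irqs = default_alloc;
}

/* Public entry point: fetch the ops, then dispatch through them. */
static int alloc_irqs(struct dom *d, int nvec)
{
	struct dom_ops *ops = d->ops;

	return ops->domain_alloc_irqs(d, nvec);
}
```

Because the callback is always populated, callers can use the public entry point unconditionally and never need to special-case a domain that brings its own allocator.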
Re: [patch V2 29/46] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
On Wed, 26 Aug 2020 12:16:57 +0100, Thomas Gleixner wrote: > > From: Thomas Gleixner > > To support MSI irq domains which do not fit at all into the regular MSI > irqdomain scheme, like the XEN MSI interrupt management for PV/HVM/DOM0, > it's necessary to allow to override the alloc/free implementation. > > This is a preperatory step to switch X86 away from arch_*_msi_irqs() and > store the irq domain pointer right in struct device. > > No functional change for existing MSI irq domain users. > > Aside of the evil XEN wrapper this is also useful for special MSI domains > which need to do extra alloc/free work before/after calling the generic > core function. Work like allocating/freeing MSI descriptors, MSI storage > space etc. > > Signed-off-by: Thomas Gleixner > > --- > include/linux/msi.h | 27 > kernel/irq/msi.c| 70 > +++- > 2 files changed, 75 insertions(+), 22 deletions(-) > > --- a/include/linux/msi.h > +++ b/include/linux/msi.h > @@ -241,6 +241,10 @@ struct msi_domain_info; > * @msi_finish: Optional callback to finalize the allocation > * @set_desc:Set the msi descriptor for an interrupt > * @handle_error:Optional error handler if the allocation fails > + * @domain_alloc_irqs: Optional function to override the default > allocation > + * function. > + * @domain_free_irqs:Optional function to override the default free > + * function. > * > * @get_hwirq, @msi_init and @msi_free are callbacks used by > * msi_create_irq_domain() and related interfaces > @@ -248,6 +252,22 @@ struct msi_domain_info; > * @msi_check, @msi_prepare, @msi_finish, @set_desc and @handle_error > * are callbacks used by msi_domain_alloc_irqs() and related > * interfaces which are based on msi_desc. > + * > + * @domain_alloc_irqs, @domain_free_irqs can be used to override the > + * default allocation/free functions (__msi_domain_alloc/free_irqs). 
This > + * is initially for a wrapper around XENs seperate MSI universe which can't > + * be wrapped into the regular irq domains concepts by mere mortals. This > + * allows to universally use msi_domain_alloc/free_irqs without having to > + * special case XEN all over the place. > + * > + * Contrary to other operations @domain_alloc_irqs and @domain_free_irqs > + * are set to the default implementation if NULL and even when > + * MSI_FLAG_USE_DEF_DOM_OPS is not set to avoid breaking existing users and > + * because these callbacks are obviously mandatory. > + * > + * This is NOT meant to be abused, but it can be useful to build wrappers > + * for specialized MSI irq domains which need extra work before and after > + * calling __msi_domain_alloc_irqs()/__msi_domain_free_irqs(). > */ > struct msi_domain_ops { > irq_hw_number_t (*get_hwirq)(struct msi_domain_info *info, > @@ -270,6 +290,10 @@ struct msi_domain_ops { > struct msi_desc *desc); > int (*handle_error)(struct irq_domain *domain, > struct msi_desc *desc, int error); > + int (*domain_alloc_irqs)(struct irq_domain *domain, > + struct device *dev, int nvec); > + void(*domain_free_irqs)(struct irq_domain *domain, > + struct device *dev); > }; > > /** > @@ -327,8 +351,11 @@ int msi_domain_set_affinity(struct irq_d > struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode, >struct msi_domain_info *info, >struct irq_domain *parent); > +int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, > + int nvec); > int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev, > int nvec); > +void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev); > void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev); > struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain); > > --- a/kernel/irq/msi.c > +++ b/kernel/irq/msi.c > @@ -229,11 +229,13 @@ static int msi_domain_ops_check(struct i > } > > static struct msi_domain_ops 
msi_domain_ops_default = { > - .get_hwirq = msi_domain_ops_get_hwirq, > - .msi_init = msi_domain_ops_init, > - .msi_check = msi_domain_ops_check, > - .msi_prepare= msi_domain_ops_prepare, > - .set_desc = msi_domain_ops_set_desc, > + .get_hwirq = msi_domain_ops_get_hwirq, > + .msi_init = msi_domain_ops_init, > + .msi_check = msi_domain_ops_check, > + .msi_prepare= msi_domain_ops_prepare, > + .set_desc = msi_domain_ops_set_desc, > + .domain_alloc_irqs = __msi_domain_allo
Re: [patch V2 15/46] x86/irq: Consolidate DMAR irq allocation
On Wed, Aug 26 2020 at 09:50, Megha Dey wrote:
>> @@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll
>>  static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info,
>>  					  msi_alloc_info_t *arg)
>>  {
>> -	return arg->dmar_id;
>> +	return arg->hwirq;
>
> Shouldn't this return the arg->devid which gets set in dmar_alloc_hwirq?

Indeed.
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
On Mon, Aug 24, 2020 at 2:56 AM Tom Murphy wrote:
>
> Hi Logan/All,
>
> I have added a check for the sg_dma_len == 0 :
> """
> } __sgt_iter(struct scatterlist *sgl, bool dma) {
>         struct sgt_iter s = { .sgp = sgl };
>
> +       if (sgl && sg_dma_len(sgl) == 0)
> +               s.sgp = NULL;
>
>         if (s.sgp) {
>         .
> """
> at location [0],
> but it doesn't fix the problem.
>
> You're right though, this change does need to be made, this code
> doesn't handle pages of sg_dma_len(sg) == 0 correctly.
> So my guess is that we have more bugs in other parts of the i915
> driver (or there is a problem with my "sg_dma_len == 0" fix above).
> I have been trying to spot where else the code might be buggy but I
> haven't had any luck so far.
>
> I'm doing a microconference (at LPC 2020) this Wednesday [1] on this
> if you're interested in attending.
> I'm hoping I can chat about it with a few people and find out how we can
> reproduce and fix this issue. I don't have any more time I can give
> to this unfortunately and it would be a shame for the work to go to
> waste.
>
> [0] https://github.com/torvalds/linux/blob/d012a7190fc1fd72ed48911e77ca97ba4521bccd/drivers/gpu/drm/i915/i915_scatterlist.h#L28
> [1] https://linuxplumbersconf.org/event/7/contributions/846/
>
> On Fri, 29 May 2020 at 22:21, Logan Gunthorpe wrote:
> >
> >
> >
> > On 2020-05-29 3:11 p.m., Marek Szyprowski wrote:
> > > Patches are pending:
> > > https://lore.kernel.org/linux-iommu/20200513132114.6046-1-m.szyprow...@samsung.com/T/
> >
> > Cool, nice! Though, I still don't think that fixes the issue in
> > i915_scatterlist.h given it still ignores sg_dma_len() and strictly
> > relies on sg_next()/sg_is_last() to stop iterating -- and I suspect this
> > is the bug that got in Tom's way.
> >
> > >> However, as Robin pointed out, there are other ugly tricks like stopping
> > >> iterating through the SGL when sg_dma_len() is zero. For example, the
> > >> AMD driver appears to use drm_prime_sg_to_page_addr_arrays() which does
> > >> this trick and thus likely isn't buggy (otherwise, I'd expect someone to
> > >> have complained by now seeing AMD has already switched to IOMMU-DMA.

We ran into the same issue with amdgpu and radeon when the AMD IOMMU
driver was converted and had to fix it as well. The relevant fixes
were:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=42e67b479eab6d26459b80b4867298232b0435e7
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0199172f933342d8b1011aae2054a695c25726f4
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=47f7826c520ecd92ffbffe59ecaa2fe61e42ec70
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c0f83d164fb8f3a2b7bc379a6c1e27d1123a9eab

Alex

> > >
> > > I'm not sure that this is a trick. Stopping at zero sg_dma_len() was
> > > somewhere documented.
> >
> > Well whatever you want to call it, it is ugly to have some drivers doing
> > one thing with the returned value and others assuming there's an extra
> > zero at the end. It just causes confusion for people reading/copying the
> > code. It would be better if they are all consistent. However, I concede
> > stopping at zero should not be broken, presently.
> >
> > Logan
> ___
> dri-devel mailing list
> dri-de...@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
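The two iteration conventions argued about in this thread can be compared in a small user-space model. This is a sketch under assumptions, not driver code: struct seg and both walk_* helpers are hypothetical, and the array stands for a scatterlist whose four original entries were merged into two DMA segments by the mapping call, leaving the trailing entries with a zero DMA length.

```c
#include <stddef.h>

/* Hypothetical stand-in for one entry of a DMA-mapped scatterlist. */
struct seg {
	unsigned long dma_addr;
	unsigned int dma_len;
};

/*
 * Convention 1: trust the count returned by the mapping call and visit
 * exactly that many entries.
 */
static int walk_by_count(const struct seg *sg, int mapped, size_t *bytes)
{
	int i;

	*bytes = 0;
	for (i = 0; i < mapped; i++)
		*bytes += sg[i].dma_len;
	return i;
}

/*
 * Convention 2: walk the original list but stop at the first entry whose
 * DMA length is zero.
 */
static int walk_until_zero(const struct seg *sg, int nents, size_t *bytes)
{
	int i;

	*bytes = 0;
	for (i = 0; i < nents && sg[i].dma_len != 0; i++)
		*bytes += sg[i].dma_len;
	return i;
}
```

Both helpers visit the same segments on such a list. A walker that does neither, and simply visits all original entries, would treat the stale trailing entries as valid DMA segments, which is the failure mode suspected above.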
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
That would be great! On Wed., Aug. 26, 2020, 2:14 p.m. Robin Murphy, wrote: > Hi Tom, > > On 2019-12-21 15:03, Tom Murphy wrote: > > This patchset converts the intel iommu driver to the dma-iommu api. > > > > While converting the driver I exposed a bug in the intel i915 driver > which causes a huge amount of artifacts on the screen of my laptop. You can > see a picture of it here: > > > https://github.com/pippy360/kernelPatches/blob/master/IMG_20191219_225922.jpg > > > > This issue is most likely in the i915 driver and is most likely caused > by the driver not respecting the return value of the dma_map_ops::map_sg > function. You can see the driver ignoring the return value here: > > > https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51 > > > > Previously this didn’t cause issues because the intel map_sg always > returned the same number of elements as the input scatter gather list but > with the change to this dma-iommu api this is no longer the case. I wasn’t > able to track the bug down to a specific line of code unfortunately. > > > > Could someone from the intel team look at this? > > > > > > I have been testing on a lenovo x1 carbon 5th generation. Let me know if > there’s any more information you need. > > > > To allow my patch set to be tested I have added a patch (patch 8/8) in > this series to disable combining sg segments in the dma-iommu api which > fixes the bug but it doesn't fix the actual problem. > > > > As part of this patch series I copied the intel bounce buffer code to > the dma-iommu path. The addition of the bounce buffer code took me by > surprise. I did most of my development on this patch series before the > bounce buffer code was added and my reimplementation in the dma-iommu path > is very rushed and not properly tested but I’m running out of time to work > on this patch set. 
> > > > On top of that I also didn’t port over the intel tracing code from this > commit: > > > https://github.com/torvalds/linux/commit/3b53034c268d550d9e8522e613a14ab53b8840d8#diff-6b3e7c4993f05e76331e463ab1fc87e1 > > So all the work in that commit is now wasted. The code will need to be > removed and reimplemented in the dma-iommu path. I would like to take the > time to do this but I really don’t have the time at the moment and I want > to get these changes out before the iommu code changes any more. > > Further to what we just discussed at LPC, I've realised that tracepoints > are actually something I could do with *right now* for debugging my Arm > DMA ops series, so if I'm going to hack something up anyway I may as > well take responsibility for polishing it into a proper patch as well :) > > Robin. > > > > > Tom Murphy (8): > >iommu/vt-d: clean up 32bit si_domain assignment > >iommu/vt-d: Use default dma_direct_* mapping functions for direct > > mapped devices > >iommu/vt-d: Remove IOVA handling code from non-dma_ops path > >iommu: Handle freelists when using deferred flushing in iommu drivers > >iommu: Add iommu_dma_free_cpu_cached_iovas function > >iommu: allow the dma-iommu api to use bounce buffers > >iommu/vt-d: Convert intel iommu driver to the iommu ops > >DO NOT MERGE: iommu: disable list appending in dma-iommu > > > > drivers/iommu/Kconfig | 1 + > > drivers/iommu/amd_iommu.c | 14 +- > > drivers/iommu/arm-smmu-v3.c | 3 +- > > drivers/iommu/arm-smmu.c| 3 +- > > drivers/iommu/dma-iommu.c | 183 +-- > > drivers/iommu/exynos-iommu.c| 3 +- > > drivers/iommu/intel-iommu.c | 936 > > drivers/iommu/iommu.c | 39 +- > > drivers/iommu/ipmmu-vmsa.c | 3 +- > > drivers/iommu/msm_iommu.c | 3 +- > > drivers/iommu/mtk_iommu.c | 3 +- > > drivers/iommu/mtk_iommu_v1.c| 3 +- > > drivers/iommu/omap-iommu.c | 3 +- > > drivers/iommu/qcom_iommu.c | 3 +- > > drivers/iommu/rockchip-iommu.c | 3 +- > > drivers/iommu/s390-iommu.c | 3 +- > > drivers/iommu/tegra-gart.c | 3 +- > > 
drivers/iommu/tegra-smmu.c | 3 +- > > drivers/iommu/virtio-iommu.c| 3 +- > > drivers/vfio/vfio_iommu_type1.c | 2 +- > > include/linux/dma-iommu.h | 3 + > > include/linux/intel-iommu.h | 1 - > > include/linux/iommu.h | 32 +- > > 23 files changed, 345 insertions(+), 908 deletions(-) > > > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/8] Convert the intel iommu driver to the dma-iommu api
Hi Tom, On 2019-12-21 15:03, Tom Murphy wrote: This patchset converts the intel iommu driver to the dma-iommu api. While converting the driver I exposed a bug in the intel i915 driver which causes a huge amount of artifacts on the screen of my laptop. You can see a picture of it here: https://github.com/pippy360/kernelPatches/blob/master/IMG_20191219_225922.jpg This issue is most likely in the i915 driver and is most likely caused by the driver not respecting the return value of the dma_map_ops::map_sg function. You can see the driver ignoring the return value here: https://github.com/torvalds/linux/blob/7e0165b2f1a912a06e381e91f0f4e495f4ac3736/drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c#L51 Previously this didn’t cause issues because the intel map_sg always returned the same number of elements as the input scatter gather list but with the change to this dma-iommu api this is no longer the case. I wasn’t able to track the bug down to a specific line of code unfortunately. Could someone from the intel team look at this? I have been testing on a lenovo x1 carbon 5th generation. Let me know if there’s any more information you need. To allow my patch set to be tested I have added a patch (patch 8/8) in this series to disable combining sg segments in the dma-iommu api which fixes the bug but it doesn't fix the actual problem. As part of this patch series I copied the intel bounce buffer code to the dma-iommu path. The addition of the bounce buffer code took me by surprise. I did most of my development on this patch series before the bounce buffer code was added and my reimplementation in the dma-iommu path is very rushed and not properly tested but I’m running out of time to work on this patch set. On top of that I also didn’t port over the intel tracing code from this commit: https://github.com/torvalds/linux/commit/3b53034c268d550d9e8522e613a14ab53b8840d8#diff-6b3e7c4993f05e76331e463ab1fc87e1 So all the work in that commit is now wasted. 
The code will need to be removed and reimplemented in the dma-iommu path. I would like to take the time to do this but I really don’t have the time at the moment and I want to get these changes out before the iommu code changes any more. Further to what we just discussed at LPC, I've realised that tracepoints are actually something I could do with *right now* for debugging my Arm DMA ops series, so if I'm going to hack something up anyway I may as well take responsibility for polishing it into a proper patch as well :) Robin. Tom Murphy (8): iommu/vt-d: clean up 32bit si_domain assignment iommu/vt-d: Use default dma_direct_* mapping functions for direct mapped devices iommu/vt-d: Remove IOVA handling code from non-dma_ops path iommu: Handle freelists when using deferred flushing in iommu drivers iommu: Add iommu_dma_free_cpu_cached_iovas function iommu: allow the dma-iommu api to use bounce buffers iommu/vt-d: Convert intel iommu driver to the iommu ops DO NOT MERGE: iommu: disable list appending in dma-iommu drivers/iommu/Kconfig | 1 + drivers/iommu/amd_iommu.c | 14 +- drivers/iommu/arm-smmu-v3.c | 3 +- drivers/iommu/arm-smmu.c| 3 +- drivers/iommu/dma-iommu.c | 183 +-- drivers/iommu/exynos-iommu.c| 3 +- drivers/iommu/intel-iommu.c | 936 drivers/iommu/iommu.c | 39 +- drivers/iommu/ipmmu-vmsa.c | 3 +- drivers/iommu/msm_iommu.c | 3 +- drivers/iommu/mtk_iommu.c | 3 +- drivers/iommu/mtk_iommu_v1.c| 3 +- drivers/iommu/omap-iommu.c | 3 +- drivers/iommu/qcom_iommu.c | 3 +- drivers/iommu/rockchip-iommu.c | 3 +- drivers/iommu/s390-iommu.c | 3 +- drivers/iommu/tegra-gart.c | 3 +- drivers/iommu/tegra-smmu.c | 3 +- drivers/iommu/virtio-iommu.c| 3 +- drivers/vfio/vfio_iommu_type1.c | 2 +- include/linux/dma-iommu.h | 3 + include/linux/intel-iommu.h | 1 - include/linux/iommu.h | 32 +- 23 files changed, 345 insertions(+), 908 deletions(-) ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch V2 15/46] x86/irq: Consolidate DMAR irq allocation
Hi Thomas, On 8/26/2020 4:16 AM, Thomas Gleixner wrote: From: Thomas Gleixner None of the DMAR specific fields are required. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h |6 -- arch/x86/kernel/apic/msi.c| 10 +- 2 files changed, 5 insertions(+), 11 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -83,12 +83,6 @@ struct irq_alloc_info { irq_hw_number_t msi_hwirq; }; #endif -#ifdef CONFIG_DMAR_TABLE - struct { - int dmar_id; - void*dmar_data; - }; -#endif #ifdefCONFIG_X86_UV struct { int uv_limit; --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info, msi_alloc_info_t *arg) { - return arg->dmar_id; + return arg->hwirq; Shouldn't this return the arg->devid which gets set in dmar_alloc_hwirq? -Megha } static int dmar_msi_init(struct irq_domain *domain, struct msi_domain_info *info, unsigned int virq, irq_hw_number_t hwirq, msi_alloc_info_t *arg) { - irq_domain_set_info(domain, virq, arg->dmar_id, info->chip, NULL, - handle_edge_irq, arg->dmar_data, "edge"); + irq_domain_set_info(domain, virq, arg->devid, info->chip, NULL, + handle_edge_irq, arg->data, "edge"); return 0; } @@ -384,8 +384,8 @@ int dmar_alloc_hwirq(int id, int node, v init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_DMAR; - info.dmar_id = id; - info.dmar_data = arg; + info.devid = id; + info.data = arg; return irq_domain_alloc_irqs(domain, 1, node, &info); } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu/arm-smmu-v3: Fix l1 stream table size in the error message
The actual size of level-1 stream table is l1size. This looks like an
oversight on commit d2e88e7c081ef ("iommu/arm-smmu: Fix LOG2SIZE setting
for 2-level stream tables") which forgot to update the @size in error
message as well.

As memory allocation failure is already bad enough, nothing worse would
happen. But let's be careful.

Signed-off-by: Zenghui Yu
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index c192544e874b..bb458d0c7b73 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -3280,7 +3280,7 @@ static int arm_smmu_init_strtab_2lvl(struct arm_smmu_device *smmu)
 	if (!strtab) {
 		dev_err(smmu->dev,
 			"failed to allocate l1 stream table (%u bytes)\n",
-			size);
+			l1size);
 		return -ENOMEM;
 	}
 	cfg->strtab = strtab;
-- 
2.19.1
RE: [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is active
[AMD Public Use]

+ Christian

> -----Original Message-----
> From: Kuehling, Felix
> Sent: Wednesday, August 26, 2020 11:22 AM
> To: Deucher, Alexander; Joerg Roedel;
> iommu@lists.linux-foundation.org; Huang, Ray
> Cc: jroe...@suse.de; Lendacky, Thomas;
> Suthikulpanit, Suravee; linux-ker...@vger.kernel.org
> Subject: Re: [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is
> active
>
> [+Ray]
>
>
> Thanks for the heads up. Currently KFD won't work on APUs when IOMMUv2
> is disabled. But Ray is working on fallbacks that will allow KFD to work on
> APUs even without IOMMUv2, similar to our dGPUs. Along with changes in
> ROCm user mode, those fallbacks are necessary for making ROCm on APUs
> generally useful.
>
>
> How common is SME on typical PCs or laptops that would use AMD APUs?

I think the hw supports it, but as far as I know it's not formally
productized on client parts.

>
>
> Alex, do you know if anyone has tested amdgpu on an APU with SME
> enabled? Is this considered something we support?

It's not something we've tested. I'm not even sure the GPU portion of APUs
will work properly without an identity mapping. SME should work properly
with dGPUs however, so this is a proper fix for them. We don't use the
IOMMUv2 path on dGPUs at all.

Alex

>
>
> Thanks,
> Felix
>
>
> On 2020-08-26 at 10:14 a.m., Deucher, Alexander wrote:
> >
> > [AMD Official Use Only - Internal Distribution Only]
> >
> >
> > + Felix
> > --
> > *From:* Joerg Roedel
> > *Sent:* Monday, August 24, 2020 6:54 AM
> > *To:* iommu@lists.linux-foundation.org
> > *Cc:* Joerg Roedel; jroe...@suse.de; Lendacky, Thomas;
> > Suthikulpanit, Suravee; Deucher, Alexander;
> > linux-ker...@vger.kernel.org
> > *Subject:* [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is
> > active
> >
> > From: Joerg Roedel
> >
> > Hi,
> >
> > Some IOMMUv2 capable devices do not work correctly when SME is active,
> > because their DMA mask does not include the encryption bit, so that
> > they can not DMA to encrypted memory directly.
> >
> > The IOMMU can jump in here, but the AMD IOMMU driver puts IOMMUv2
> > capable devices into an identity mapped domain. Fix that by not
> > forcing an identity mapped domain on devices when SME is active and
> > forbid using their IOMMUv2 functionality.
> >
> > Please review.
> >
> > Thanks,
> >
> > Joerg
> >
> > Joerg Roedel (2):
> >   iommu/amd: Do not force direct mapping when SME is active
> >   iommu/amd: Do not use IOMMUv2 functionality when SME is active
> >
> >  drivers/iommu/amd/iommu.c    | 7 ++-
> >  drivers/iommu/amd/iommu_v2.c | 7 +++
> >  2 files changed, 13 insertions(+), 1 deletion(-)
> >
> > --
> > 2.28.0
RE: [PATCH 1/2] iommu/amd: Do not force direct mapping when SME is active
[AMD Public Use] + Felix, Christian > -Original Message- > From: Joerg Roedel > Sent: Monday, August 24, 2020 6:54 AM > To: iommu@lists.linux-foundation.org > Cc: Joerg Roedel ; jroe...@suse.de; Lendacky, Thomas > ; Suthikulpanit, Suravee > ; Deucher, Alexander > ; linux-ker...@vger.kernel.org > Subject: [PATCH 1/2] iommu/amd: Do not force direct mapping when SME is > active > > From: Joerg Roedel > > Do not force devices supporting IOMMUv2 to be direct mapped when > memory encryption is active. This might cause them to be unusable because > their DMA mask does not include the encryption bit. > > Signed-off-by: Joerg Roedel > --- > drivers/iommu/amd/iommu.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c > index ba9f3dbc5b94..77e4268e41cf 100644 > --- a/drivers/iommu/amd/iommu.c > +++ b/drivers/iommu/amd/iommu.c > @@ -2659,7 +2659,12 @@ static int amd_iommu_def_domain_type(struct > device *dev) > if (!dev_data) > return 0; > > - if (dev_data->iommu_v2) > + /* > + * Do not identity map IOMMUv2 capable devices when memory > encryption is > + * active, because some of those devices (AMD GPUs) don't have the > + * encryption bit in their DMA-mask and require remapping. > + */ I think on the integrated GPUs in APUs I'd prefer to have the identity mapping over SME, but I guess this is fine because you have to explicitly enable SME and if you do that you know what you are getting into. Alex > + if (!mem_encrypt_active() && dev_data->iommu_v2) > return IOMMU_DOMAIN_IDENTITY; > > return 0; > -- > 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is active
[+Ray] Thanks for the heads up. Currently KFD won't work on APUs when IOMMUv2 is disabled. But Ray is working on fallbacks that will allow KFD to work on APUs even without IOMMUv2, similar to our dGPUs. Along with changes in ROCm user mode, those fallbacks are necessary for making ROCm on APUs generally useful. How common is SME on typical PCs or laptops that would use AMD APUs? Alex, do you know if anyone has tested amdgpu on an APU with SME enabled? Is this considered something we support? Thanks, Felix On 2020-08-26 at 10:14 a.m., Deucher, Alexander wrote: > > [AMD Official Use Only - Internal Distribution Only] > > > + Felix > > *From:* Joerg Roedel > *Sent:* Monday, August 24, 2020 6:54 AM > *To:* iommu@lists.linux-foundation.org > *Cc:* Joerg Roedel ; jroe...@suse.de > ; Lendacky, Thomas ; > Suthikulpanit, Suravee ; Deucher, > Alexander ; linux-ker...@vger.kernel.org > > *Subject:* [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is active > > From: Joerg Roedel > > Hi, > > Some IOMMUv2 capable devices do not work correctly when SME is > active, because their DMA mask does not include the encryption bit, so > that they can not DMA to encrypted memory directly. > > The IOMMU can jump in here, but the AMD IOMMU driver puts IOMMUv2 > capable devices into an identity mapped domain. Fix that by not > forcing an identity mapped domain on devices when SME is active and > forbid using their IOMMUv2 functionality. > > Please review. > > Thanks, > > Joerg > > Joerg Roedel (2): > iommu/amd: Do not force direct mapping when SME is active > iommu/amd: Do not use IOMMUv2 functionality when SME is active > > drivers/iommu/amd/iommu.c | 7 ++- > drivers/iommu/amd/iommu_v2.c | 7 +++ > 2 files changed, 13 insertions(+), 1 deletion(-) > > -- > 2.28.0 > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
On Wed, Aug 26 2020 at 13:17, Thomas Gleixner wrote: > + * If CONFIG_PCI_MSI_ARCH_FALLBACKS is not selected they are replaced by > + * stubs with warnings. > */ > +#ifdef CONFIG_PCI_MSI_DISABLE_ARCH_FALLBACKS Groan, I obviously failed to pull that back from the test box where I fixed it. That wants to be: +#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS Doing five things at once does not work well ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names
Hi, On Wed, Aug 26, 2020 at 8:01 AM Sai Prakash Ranjan wrote: > > On 2020-08-26 19:21, Robin Murphy wrote: > > On 2020-08-26 13:17, Sai Prakash Ranjan wrote: > >> On 2020-08-26 17:07, Robin Murphy wrote: > >>> On 2020-08-25 16:42, Sai Prakash Ranjan wrote: > Currently the non-strict or lazy mode of TLB invalidation can only > be set > for all or no domains. This works well for development platforms > where > setting to non-strict/lazy mode is fine for performance reasons but > on > production devices, we need a more fine grained control to allow > only > certain peripherals to support this mode where we can be sure that > it is > safe. So add support to filter non-strict/lazy mode based on the > device > names that are passed via cmdline parameter > "iommu.nonstrict_device". > >>> > >>> There seems to be considerable overlap here with both the existing > >>> patches for per-device default domain control [1], and the broader > >>> ongoing development on how to define, evaluate and handle "trusted" > >>> vs. "untrusted" devices (e.g. [2],[3]). I'd rather see work done to > >>> make sure those integrate properly together and work well for > >>> everyone's purposes, than add more disjoint mechanisms that only > >>> address small pieces of the overall issue. > >>> > >>> Robin. > >>> > >>> [1] > >>> https://lore.kernel.org/linux-iommu/20200824051726.7xaJRTTszJuzdFWGJ8YNsshCtfNR0BNeMrlILAyqt_0@z/ > >>> [2] > >>> https://lore.kernel.org/linux-iommu/20200630044943.3425049-1-raja...@google.com/ > >>> [3] > >>> https://lore.kernel.org/linux-iommu/20200626002710.110200-2-raja...@google.com/ > >> > >> Thanks for the links, [1] definitely sounds interesting, I was under > >> the impression > >> that changing such via sysfs is late, but seems like other Sai has got > >> it working > >> for the default domain type. 
So we can extend that and add a strict > >> attribute as well, > >> we should be definitely OK with system booting with default strict > >> mode for all > >> peripherals as long as we have an option to change that later, Doug? > > > > Right, IIRC there was initially a proposal of a command line option > > there too, and it faced the same criticism around not being very > > generic or scalable. I believe sysfs works as a reasonable compromise > > since in many cases it can be tweaked relatively early from an initrd, > > and non-essential devices can effectively be switched at any time by > > removing and reprobing their driver. > > > > Ah I see, so the catch is that device must not be bound to the driver > and won't work for the internal devices or builtin drivers probed early. Hrm, that wouldn't work so well for us for eMMC. I don't think I'm going to manage to convince folks that we need an initrd just for this. I'm probably being naive and I haven't looked at the code, but it does seem a little weird that this isn't the kind of thing that could just be tweaked for transfers going forward... > > As for a general approach for internal devices where you do believe > > the hardware is honest but don't necessarily trust whatever firmware > > it happens to be running, I'm pretty sure that's come up already, but > > I'll be sure to mention it at Rajat's imminent LPC talk if nobody else > > does. I'll at least attend. We'll see how useful my contributions are since, as per usual, I'm wandering into an area I'm not an expert in here. ;-) -Doug ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names
On 2020-08-26 19:21, Robin Murphy wrote: On 2020-08-26 13:17, Sai Prakash Ranjan wrote: On 2020-08-26 17:07, Robin Murphy wrote: On 2020-08-25 16:42, Sai Prakash Ranjan wrote: Currently the non-strict or lazy mode of TLB invalidation can only be set for all or no domains. This works well for development platforms where setting to non-strict/lazy mode is fine for performance reasons but on production devices, we need a more fine grained control to allow only certain peripherals to support this mode where we can be sure that it is safe. So add support to filter non-strict/lazy mode based on the device names that are passed via cmdline parameter "iommu.nonstrict_device". There seems to be considerable overlap here with both the existing patches for per-device default domain control [1], and the broader ongoing development on how to define, evaluate and handle "trusted" vs. "untrusted" devices (e.g. [2],[3]). I'd rather see work done to make sure those integrate properly together and work well for everyone's purposes, than add more disjoint mechanisms that only address small pieces of the overall issue. Robin. [1] https://lore.kernel.org/linux-iommu/20200824051726.7xaJRTTszJuzdFWGJ8YNsshCtfNR0BNeMrlILAyqt_0@z/ [2] https://lore.kernel.org/linux-iommu/20200630044943.3425049-1-raja...@google.com/ [3] https://lore.kernel.org/linux-iommu/20200626002710.110200-2-raja...@google.com/ Thanks for the links, [1] definitely sounds interesting, I was under the impression that changing such via sysfs is late, but seems like other Sai has got it working for the default domain type. So we can extend that and add a strict attribute as well, we should be definitely OK with system booting with default strict mode for all peripherals as long as we have an option to change that later, Doug? Right, IIRC there was initially a proposal of a command line option there too, and it faced the same criticism around not being very generic or scalable. 
I believe sysfs works as a reasonable compromise since in many cases it can be tweaked relatively early from an initrd, and non-essential devices can effectively be switched at any time by removing and reprobing their driver. Ah I see, so the catch is that device must not be bound to the driver and won't work for the internal devices or builtin drivers probed early. -Sai As for a general approach for internal devices where you do believe the hardware is honest but don't necessarily trust whatever firmware it happens to be running, I'm pretty sure that's come up already, but I'll be sure to mention it at Rajat's imminent LPC talk if nobody else does. Robin. -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [PATCH 2/2] iommu/amd: Do not use IOMMUv2 functionality when SME is active
[AMD Public Use] + Felix, Christian > -Original Message- > From: Joerg Roedel > Sent: Monday, August 24, 2020 6:54 AM > To: iommu@lists.linux-foundation.org > Cc: Joerg Roedel ; jroe...@suse.de; Lendacky, Thomas > ; Suthikulpanit, Suravee > ; Deucher, Alexander > ; linux-ker...@vger.kernel.org > Subject: [PATCH 2/2] iommu/amd: Do not use IOMMUv2 functionality when > SME is active > > From: Joerg Roedel > > When memory encryption is active the device is likely not in a direct mapped > domain. Forbid using IOMMUv2 functionality for now until finer grained > checks for this have been implemented. > > Signed-off-by: Joerg Roedel > --- > drivers/iommu/amd/iommu_v2.c | 7 +++ > 1 file changed, 7 insertions(+) > > diff --git a/drivers/iommu/amd/iommu_v2.c > b/drivers/iommu/amd/iommu_v2.c index c259108ab6dd..0d175aed1d92 > 100644 > --- a/drivers/iommu/amd/iommu_v2.c > +++ b/drivers/iommu/amd/iommu_v2.c > @@ -737,6 +737,13 @@ int amd_iommu_init_device(struct pci_dev *pdev, > int pasids) > > might_sleep(); > > + /* > + * When memory encryption is active the device is likely not in a > + * direct-mapped domain. Forbid using IOMMUv2 functionality for > now. > + */ > + if (mem_encrypt_active()) > + return -ENODEV; > + > if (!amd_iommu_v2_supported()) > return -ENODEV; > > -- > 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is active
[AMD Official Use Only - Internal Distribution Only] + Felix From: Joerg Roedel Sent: Monday, August 24, 2020 6:54 AM To: iommu@lists.linux-foundation.org Cc: Joerg Roedel ; jroe...@suse.de ; Lendacky, Thomas ; Suthikulpanit, Suravee ; Deucher, Alexander ; linux-ker...@vger.kernel.org Subject: [PATCH 0/2] iommu/amd: Fix IOMMUv2 devices when SME is active From: Joerg Roedel Hi, Some IOMMUv2 capable devices do not work correctly when SME is active, because their DMA mask does not include the encryption bit, so that they can not DMA to encrypted memory directly. The IOMMU can jump in here, but the AMD IOMMU driver puts IOMMUv2 capable devices into an identity mapped domain. Fix that by not forcing an identity mapped domain on devices when SME is active and forbid using their IOMMUv2 functionality. Please review. Thanks, Joerg Joerg Roedel (2): iommu/amd: Do not force direct mapping when SME is active iommu/amd: Do not use IOMMUv2 functionality when SME is active drivers/iommu/amd/iommu.c| 7 ++- drivers/iommu/amd/iommu_v2.c | 7 +++ 2 files changed, 13 insertions(+), 1 deletion(-) -- 2.28.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH RESEND v3] iommu/tegra-smmu: Add missing locks around mapping operations
Hi [This is an automated email] This commit has been processed because it contains a -stable tag. The stable tag indicates that it's relevant for the following trees: all The bot has tested the following trees: v5.8.2, v5.7.16, v5.4.59, v4.19.140, v4.14.193, v4.9.232, v4.4.232. v5.8.2: Build OK! v5.7.16: Build OK! v5.4.59: Failed to apply! Possible dependencies: 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map") v4.19.140: Failed to apply! Possible dependencies: 06d60728ff5c ("iommu/dma: move the arm64 wrappers to common code") 44f6876a00e8 ("iommu/arm-smmu: Support non-strict mode") 46053c736854 ("dma-mapping: clear dev->dma_ops in arch_teardown_dma_ops") 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map") 886643b76632 ("arm64: use the generic swiotlb_dma_ops") 92aec09cc879 ("iommu/dma: Move __iommu_dma_map") 96a299d24cfb ("iommu/arm-smmu: Add pm_runtime/sleep ops") c4dae366925f ("swiotlb: refactor swiotlb_map_page") d4a44f0750bb ("iommu/arm-smmu: Invoke pm_runtime across the driver") dff8d6c1ed58 ("swiotlb: remove the overflow buffer") fafadcd16595 ("swiotlb: don't dip into swiotlb pool for coherent allocations") v4.14.193: Failed to apply! Possible dependencies: 06d60728ff5c ("iommu/dma: move the arm64 wrappers to common code") 10dac04c79b1 ("mips: fix an off-by-one in dma_capable") 32b124492bdf ("iommu/io-pgtable-arm: Convert to IOMMU API TLB sync") 32ce3862af3c ("powerpc/lib: Implement PMEM API") 44f6876a00e8 ("iommu/arm-smmu: Support non-strict mode") 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map") 92aec09cc879 ("iommu/dma: Move __iommu_dma_map") 96a299d24cfb ("iommu/arm-smmu: Add pm_runtime/sleep ops") d4a44f0750bb ("iommu/arm-smmu: Invoke pm_runtime across the driver") ea8c64ace866 ("dma-mapping: move swiotlb arch helpers to a new header") v4.9.232: Failed to apply! 
Possible dependencies: 125458ab3aef ("iommu/arm-smmu: Fix 16-bit ASID configuration") 280b683ceace ("iommu/arm-smmu: Simplify ASID/VMID handling") 32b124492bdf ("iommu/io-pgtable-arm: Convert to IOMMU API TLB sync") 3677a649a751 ("iommu/arm-smmu: Fix for ThunderX erratum #27704") 44f6876a00e8 ("iommu/arm-smmu: Support non-strict mode") 452107c79035 ("iommu/arm-smmu: Tidy up context bank indexing") 523d7423e21b ("iommu/arm-smmu: Remove io-pgtable spinlock") 58188afeb727 ("iommu/arm-smmu-v3: Remove io-pgtable spinlock") 61bc671179f1 ("iommu/arm-smmu: Install bypass S2CRs for IOMMU_DOMAIN_IDENTITY domains") 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map") bdf95923086f ("iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed") d4a44f0750bb ("iommu/arm-smmu: Invoke pm_runtime across the driver") v4.4.232: Failed to apply! Possible dependencies: 267b62a96951 ("clk: tegra: pll: Update PLLM handling") 287980e49ffc ("remove lots of IS_ERR_VALUE abuses") 407254da291c ("clk: tegra: pll: Add logic for out-of-table rates for T210") 56fd27b31f1a ("clk: tegra: pll: Change misc_reg count from 3 to 6") 58188afeb727 ("iommu/arm-smmu-v3: Remove io-pgtable spinlock") 6583a6309e83 ("clk: tegra: pll: Add tegra_pll_wait_for_lock to clk header") 6929715cf6b9 ("clk: tegra: pll: Add support for PLLMB for Tegra210") 6b301a059eb2 ("clk: tegra: Add support for Tegra210 clocks") 781ca2de89ba ("iommu: Add gfp parameter to iommu_ops::map") 7db864c9deb2 ("clk: tegra: pll: Simplify clk_enable_path") 8cfb0cdf07e2 ("ACPI / debugger: Add IO interface to access debugger functionalities") 8f78515425da ("iommu/arm-smmu: Implement of_xlate() for SMMUv3") 9adb95949a34 ("iommu/arm-smmu: Support DMA-API domains") bc7f2ce0a7b5 ("iommu/arm-smmu: Don't fail device attach if already attached to a domain") bdf95923086f ("iommu/arm-smmu: Return IOVA in iova_to_phys when SMMU is bypassed") d907f4b4a178 ("clk: tegra: pll: Add logic for handling SDM data") f8d31489629c ("ACPICA: Debugger: 
Convert some mechanisms to OSPM specific") NOTE: The patch will not be queued to stable trees until it is upstream. How should we proceed with this patch? -- Thanks Sasha ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names
On 2020-08-26 13:17, Sai Prakash Ranjan wrote: On 2020-08-26 17:07, Robin Murphy wrote: On 2020-08-25 16:42, Sai Prakash Ranjan wrote: Currently the non-strict or lazy mode of TLB invalidation can only be set for all or no domains. This works well for development platforms where setting to non-strict/lazy mode is fine for performance reasons but on production devices, we need a more fine grained control to allow only certain peripherals to support this mode where we can be sure that it is safe. So add support to filter non-strict/lazy mode based on the device names that are passed via cmdline parameter "iommu.nonstrict_device". There seems to be considerable overlap here with both the existing patches for per-device default domain control [1], and the broader ongoing development on how to define, evaluate and handle "trusted" vs. "untrusted" devices (e.g. [2],[3]). I'd rather see work done to make sure those integrate properly together and work well for everyone's purposes, than add more disjoint mechanisms that only address small pieces of the overall issue. Robin. [1] https://lore.kernel.org/linux-iommu/20200824051726.7xaJRTTszJuzdFWGJ8YNsshCtfNR0BNeMrlILAyqt_0@z/ [2] https://lore.kernel.org/linux-iommu/20200630044943.3425049-1-raja...@google.com/ [3] https://lore.kernel.org/linux-iommu/20200626002710.110200-2-raja...@google.com/ Thanks for the links, [1] definitely sounds interesting, I was under the impression that changing such via sysfs is late, but seems like other Sai has got it working for the default domain type. So we can extend that and add a strict attribute as well, we should be definitely OK with system booting with default strict mode for all peripherals as long as we have an option to change that later, Doug? Right, IIRC there was initially a proposal of a command line option there too, and it faced the same criticism around not being very generic or scalable. 
I believe sysfs works as a reasonable compromise since in many cases it can be tweaked relatively early from an initrd, and non-essential devices can effectively be switched at any time by removing and reprobing their driver. As for a general approach for internal devices where you do believe the hardware is honest but don't necessarily trust whatever firmware it happens to be running, I'm pretty sure that's come up already, but I'll be sure to mention it at Rajat's imminent LPC talk if nobody else does. Robin. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v9 12/32] drm: msm: fix common struct sg_table related issues
Hi Marek, I love your patch! Yet something to improve: [auto build test ERROR on linuxtv-media/master] [also build test ERROR on drm-intel/for-linux-next linus/master v5.9-rc2 next-20200826] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Marek-Szyprowski/DRM-fix-struct-sg_table-nents-vs-orig_nents-misuse/20200826-143908 base: git://linuxtv.org/media_tree.git master config: arm64-randconfig-r002-20200826 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project 7cfcecece0e0430937cf529ce74d3a071a4dedc6) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install arm64 cross compiling tool for clang build # apt-get install binutils-aarch64-linux-gnu # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot All errors (new ones prefixed by >>): aarch64-linux-gnu-ld: warning: -z norelro ignored aarch64-linux-gnu-ld: fs/orangefs/orangefs-debugfs.o: in function `orangefs_debug_read': fs/orangefs/orangefs-debugfs.c:375: undefined reference to `stpcpy' aarch64-linux-gnu-ld: security/apparmor/lsm.o: in function `param_get_mode': security/apparmor/lsm.c:1559: undefined reference to `stpcpy' aarch64-linux-gnu-ld: security/apparmor/lsm.o: in function `param_get_audit': security/apparmor/lsm.c:1530: undefined reference to `stpcpy' aarch64-linux-gnu-ld: crypto/async_tx/async_tx.o: in function `async_tx_channel_switch': crypto/async_tx/async_tx.c:118: undefined reference to `dma_wait_for_async_tx' aarch64-linux-gnu-ld: crypto/async_tx/async_tx.o: in function `async_tx_quiesce': crypto/async_tx/async_tx.c:270: 
undefined reference to `dma_wait_for_async_tx' aarch64-linux-gnu-ld: crypto/async_tx/async_tx.c:270: undefined reference to `dma_wait_for_async_tx' aarch64-linux-gnu-ld: crypto/async_tx/async_memcpy.o: in function `async_memcpy': crypto/async_tx/async_memcpy.c:43: undefined reference to `dmaengine_get_unmap_data' aarch64-linux-gnu-ld: crypto/async_tx/async_memcpy.c:89: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.o: in function `async_xor': crypto/async_tx/async_xor.c:172: undefined reference to `dmaengine_get_unmap_data' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.c:199: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.c:199: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.c:196: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.o: in function `async_xor_val': crypto/async_tx/async_xor.c:268: undefined reference to `dmaengine_get_unmap_data' aarch64-linux-gnu-ld: crypto/async_tx/async_xor.c:324: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_pq.o: in function `async_gen_syndrome': crypto/async_tx/async_pq.c:176: undefined reference to `dmaengine_get_unmap_data' aarch64-linux-gnu-ld: crypto/async_tx/async_pq.c:233: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_pq.c:229: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: crypto/async_tx/async_pq.o: in function `async_syndrome_val': crypto/async_tx/async_pq.c:295: undefined reference to `dmaengine_get_unmap_data' aarch64-linux-gnu-ld: crypto/async_tx/async_pq.c:412: undefined reference to `dmaengine_unmap_put' aarch64-linux-gnu-ld: drivers/xen/sys-hypervisor.o: in function `buildid_show': drivers/xen/sys-hypervisor.c:375: undefined reference to `stpcpy' aarch64-linux-gnu-ld: drivers/tty/tty_io.o: in function `tty_line_name': 
drivers/tty/tty_io.c:1139: undefined reference to `stpcpy' aarch64-linux-gnu-ld: drivers/tty/tty_io.c:1139: undefined reference to `stpcpy' aarch64-linux-gnu-ld: drivers/tty/tty_io.c:1139: undefined reference to `stpcpy' aarch64-linux-gnu-ld: drivers/gpu/drm/vc4/vc4_dsi.o: in function `dsi_dma_workaround_write': drivers/gpu/drm/vc4/vc4_dsi.c:581: undefined reference to `dma_sync_wait' aarch64-linux-gnu-ld: drivers/gpu/drm/vc4/vc4_dsi.c:581: undefined reference to `dma_sync_wait' aarch64-linux-gnu-ld: drivers/gpu/drm/vc4/vc4_dsi.c:581: undefined reference to `dma_sync_wait' a
Re: [PATCH v3 0/6] Add virtio-iommu built-in topology
On Fri, Aug 21, 2020 at 03:15:34PM +0200, Jean-Philippe Brucker wrote: > Add a topology description to the virtio-iommu driver and enable x86 > platforms. > > Since [v2] we have made some progress on adding ACPI support for > virtio-iommu, which is the preferred boot method on x86. It will be a > new vendor-agnostic table describing para-virtual topologies in a > minimal format. However some platforms don't use either ACPI or DT for > booting (for example microvm), and will need the alternative topology > description method proposed here. In addition, since the process to get > a new ACPI table will take a long time, this provides a boot method even > to ACPI-based platforms, if only temporarily for testing and > development. OK should I park this in next now? Seems appropriate ... > v3: > * Add patch 1 that moves virtio-iommu to a subfolder. > * Split the rest: > * Patch 2 adds topology-helper.c, which will be shared with the ACPI > support. > * Patch 4 adds definitions. > * Patch 5 adds parser in topology.c. > * Address other comments. 
> > Linux and QEMU patches available at: > https://jpbrucker.net/git/linux virtio-iommu/devel > https://jpbrucker.net/git/qemu virtio-iommu/devel > > [spec] https://lists.oasis-open.org/archives/virtio-dev/202008/msg00067.html > [v2] > https://lore.kernel.org/linux-iommu/20200228172537.377327-1-jean-phili...@linaro.org/ > [v1] > https://lore.kernel.org/linux-iommu/20200214160413.1475396-1-jean-phili...@linaro.org/ > [rfc] > https://lore.kernel.org/linux-iommu/20191122105000.800410-1-jean-phili...@linaro.org/ > > Jean-Philippe Brucker (6): > iommu/virtio: Move to drivers/iommu/virtio/ > iommu/virtio: Add topology helpers > PCI: Add DMA configuration for virtual platforms > iommu/virtio: Add topology definitions > iommu/virtio: Support topology description in config space > iommu/virtio: Enable x86 support > > drivers/iommu/Kconfig | 18 +- > drivers/iommu/Makefile| 3 +- > drivers/iommu/virtio/Makefile | 4 + > drivers/iommu/virtio/topology-helpers.h | 50 + > include/linux/virt_iommu.h| 15 ++ > include/uapi/linux/virtio_iommu.h | 44 > drivers/iommu/virtio/topology-helpers.c | 196 > drivers/iommu/virtio/topology.c | 259 ++ > drivers/iommu/{ => virtio}/virtio-iommu.c | 4 + > drivers/pci/pci-driver.c | 5 + > MAINTAINERS | 3 +- > 11 files changed, 597 insertions(+), 4 deletions(-) > create mode 100644 drivers/iommu/virtio/Makefile > create mode 100644 drivers/iommu/virtio/topology-helpers.h > create mode 100644 include/linux/virt_iommu.h > create mode 100644 drivers/iommu/virtio/topology-helpers.c > create mode 100644 drivers/iommu/virtio/topology.c > rename drivers/iommu/{ => virtio}/virtio-iommu.c (99%) > > -- > 2.28.0 > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names
On 2020-08-26 17:07, Robin Murphy wrote: On 2020-08-25 16:42, Sai Prakash Ranjan wrote: Currently the non-strict or lazy mode of TLB invalidation can only be set for all or no domains. This works well for development platforms where setting to non-strict/lazy mode is fine for performance reasons but on production devices, we need a more fine grained control to allow only certain peripherals to support this mode where we can be sure that it is safe. So add support to filter non-strict/lazy mode based on the device names that are passed via cmdline parameter "iommu.nonstrict_device". There seems to be considerable overlap here with both the existing patches for per-device default domain control [1], and the broader ongoing development on how to define, evaluate and handle "trusted" vs. "untrusted" devices (e.g. [2],[3]). I'd rather see work done to make sure those integrate properly together and work well for everyone's purposes, than add more disjoint mechanisms that only address small pieces of the overall issue. Robin. [1] https://lore.kernel.org/linux-iommu/20200824051726.7xaJRTTszJuzdFWGJ8YNsshCtfNR0BNeMrlILAyqt_0@z/ [2] https://lore.kernel.org/linux-iommu/20200630044943.3425049-1-raja...@google.com/ [3] https://lore.kernel.org/linux-iommu/20200626002710.110200-2-raja...@google.com/ Thanks for the links, [1] definitely sounds interesting, I was under the impression that changing such via sysfs is late, but seems like other Sai has got it working for the default domain type. So we can extend that and add a strict attribute as well, we should be definitely OK with system booting with default strict mode for all peripherals as long as we have an option to change that later, Doug? Thanks, Sai -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 45/46] irqdomain/msi: Provide msi_alloc/free_store() callbacks
From: Thomas Gleixner For devices which don't have a standard storage for MSI messages like the upcoming IMS (Interrupt Message Store) it's required to allocate storage space before allocating interrupts and to free it again after the interrupts have been freed. This could be achieved with the existing callbacks, but that would be awkward because they operate on msi_alloc_info_t which is not uniform across architectures. Also these callbacks are invoked per interrupt but the allocation might have bulk requirements depending on the device. As such devices can operate on different architectures it is simpler to have separate callbacks which operate on struct device. The resulting storage information has to be stored in struct msi_desc so the underlying irq chip implementation can retrieve it for the relevant operations. Signed-off-by: Thomas Gleixner --- include/linux/msi.h |8 kernel/irq/msi.c| 11 +++ 2 files changed, 19 insertions(+) --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -279,6 +279,10 @@ struct msi_domain_info; * function. * @domain_free_irqs: Optional function to override the default free * function. 
+ * @msi_alloc_store: Optional callback to allocate storage in a device + * specific non-standard MSI store + * @msi_alloc_free:Optional callback to free storage in a device + * specific non-standard MSI store * * @get_hwirq, @msi_init and @msi_free are callbacks used by * msi_create_irq_domain() and related interfaces @@ -328,6 +332,10 @@ struct msi_domain_ops { struct device *dev, int nvec); void(*domain_free_irqs)(struct irq_domain *domain, struct device *dev); + int (*msi_alloc_store)(struct irq_domain *domain, + struct device *dev, int nvec); + void(*msi_free_store)(struct irq_domain *domain, + struct device *dev); }; /** --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -434,6 +434,12 @@ int __msi_domain_alloc_irqs(struct irq_d if (ret) return ret; + if (ops->msi_alloc_store) { + ret = ops->msi_alloc_store(domain, dev, nvec); + if (ret) + return ret; + } + for_each_msi_entry(desc, dev) { ops->set_desc(&arg, desc); @@ -533,6 +539,8 @@ int msi_domain_alloc_irqs(struct irq_dom void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev) { + struct msi_domain_info *info = domain->host_data; + struct msi_domain_ops *ops = info->ops; struct msi_desc *desc; for_each_msi_entry(desc, dev) { @@ -546,6 +554,9 @@ void __msi_domain_free_irqs(struct irq_d desc->irq = 0; } } + + if (ops->msi_free_store) + ops->msi_free_store(domain, dev); } /** ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
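For readers following the ordering in the patch above, here is a minimal userspace sketch of it: the optional store callback runs once, in bulk, before any per-interrupt work, and the matching free callback runs after the interrupts are gone. All sketch_* and demo_* names, the struct layouts and the 8-slot limit are assumptions made up for illustration; they are not the kernel's types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct sketch_dev {
	int store_slots;	/* slots currently reserved in the device store */
	int irqs;		/* interrupts currently allocated */
};

struct sketch_ops {
	/* Optional: reserve storage for nvec messages in one go. */
	int (*msi_alloc_store)(struct sketch_dev *dev, int nvec);
	/* Optional: release that storage again. */
	void (*msi_free_store)(struct sketch_dev *dev);
};

/* Same ordering as the patched allocation path: store first, then irqs. */
static int sketch_alloc_irqs(const struct sketch_ops *ops,
			     struct sketch_dev *dev, int nvec)
{
	int ret;

	if (ops->msi_alloc_store) {
		ret = ops->msi_alloc_store(dev, nvec);
		if (ret)
			return ret;
	}

	dev->irqs = nvec;	/* stands in for the per-descriptor loop */
	return 0;
}

/* Same ordering as the patched free path: irqs first, then the store. */
static void sketch_free_irqs(const struct sketch_ops *ops,
			     struct sketch_dev *dev)
{
	dev->irqs = 0;

	if (ops->msi_free_store)
		ops->msi_free_store(dev);
}

/* Example callbacks for a made-up device with room for 8 messages. */
static int demo_alloc_store(struct sketch_dev *dev, int nvec)
{
	if (nvec > 8)
		return -ENOSPC;
	dev->store_slots = nvec;
	return 0;
}

static void demo_free_store(struct sketch_dev *dev)
{
	dev->store_slots = 0;
}
```

A domain which leaves both callbacks NULL behaves exactly as before, which is why the hooks can be optional.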
[patch V2 41/46] platform-msi: Provide default irq_chip::ack
From: Thomas Gleixner

For the upcoming device MSI support it's required to have a default
irq_chip::ack implementation (irq_chip_ack_parent) so the drivers do not
need to care.

Signed-off-by: Thomas Gleixner
---
 drivers/base/platform-msi.c |    2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/base/platform-msi.c
+++ b/drivers/base/platform-msi.c
@@ -95,6 +95,8 @@ static void platform_msi_update_chip_ops
 		chip->irq_mask = irq_chip_mask_parent;
 	if (!chip->irq_unmask)
 		chip->irq_unmask = irq_chip_unmask_parent;
+	if (!chip->irq_ack)
+		chip->irq_ack = irq_chip_ack_parent;
 	if (!chip->irq_eoi)
 		chip->irq_eoi = irq_chip_eoi_parent;
 	if (!chip->irq_set_affinity)
[patch V2 42/46] genirq/proc: Take buslock on affinity write
Until now interrupt chips which support setting affinity are not locking
the associated bus lock for two reasons:

 - All chips which support affinity setting do not use buslock because they
   just can operate directly on the hardware.

 - All chips which use buslock do not support affinity setting because
   their interrupt chips are not capable. These chips are usually connected
   over a bus like I2C, SPI etc. and have an interrupt output which is
   connected to CPU interrupt of some sort. So there is no way to set the
   affinity on the chip itself.

Upcoming hardware which is PCIE based sports a non standard MSI(X) variant
which stores the MSI message in RAM which is associated to e.g. a device
queue. The device manages this RAM and writes have to be issued via command
queues or similar mechanisms which is obviously not possible from interrupt
disabled, raw spinlock held context.

The buslock mechanism of irq chips can be utilized to support that. The
affinity write to the chip writes to shadow state, marks it pending and the
irq chip's irq_bus_sync_unlock() callback handles the command queue and
waits for completion similar to the other chip operations on I2C or SPI
busses.

Change the locking in irq_set_affinity() to bus_lock/unlock to help with
that. There are a few other callers than the proc interface, but none of
them is affected by this change as none of them affects an irq chip with
bus lock support.

Signed-off-by: Thomas Gleixner
---
V2: New patch
---
 kernel/irq/manage.c |    6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -373,16 +373,16 @@ int irq_set_affinity_locked(struct irq_d
 
 int __irq_set_affinity(unsigned int irq, const struct cpumask *mask, bool force)
 {
-	struct irq_desc *desc = irq_to_desc(irq);
+	struct irq_desc *desc;
 	unsigned long flags;
 	int ret;
 
+	desc = irq_get_desc_buslock(irq, &flags, IRQ_GET_DESC_CHECK_GLOBAL);
 	if (!desc)
 		return -EINVAL;
 
-	raw_spin_lock_irqsave(&desc->lock, flags);
 	ret = irq_set_affinity_locked(irq_desc_get_irq_data(desc), mask, force);
-	raw_spin_unlock_irqrestore(&desc->lock, flags);
+	irq_put_desc_busunlock(desc, flags);
 	return ret;
 }
 
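The shadow-state scheme described in the changelog above can be modelled in plain C. This is only a sketch under assumptions: struct sketch_chip, its fields and the sketch_* helpers are invented here, and the irqs_off flag merely models the raw spinlock held section in which the slow device must not be touched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a chip whose message store sits behind a slow queue. */
struct sketch_chip {
	unsigned int hw_affinity;	/* what the device currently uses */
	unsigned int shadow_affinity;	/* what software asked for */
	bool pending;			/* shadow state not yet written out */
	bool bus_locked;		/* models the sleepable bus lock */
	bool irqs_off;			/* models the raw spinlock held section */
	int queue_writes;		/* slow device commands issued so far */
};

static void sketch_bus_lock(struct sketch_chip *c)
{
	c->bus_locked = true;
}

/* Runs with the "spinlock" held: only shadow state may be updated. */
static void sketch_set_affinity(struct sketch_chip *c, unsigned int cpu)
{
	assert(c->irqs_off);
	c->shadow_affinity = cpu;
	c->pending = true;
}

/* Runs after the "spinlock" is dropped: slow queue access is fine here. */
static void sketch_bus_sync_unlock(struct sketch_chip *c)
{
	assert(!c->irqs_off);
	if (c->pending) {
		c->hw_affinity = c->shadow_affinity;
		c->queue_writes++;
		c->pending = false;
	}
	c->bus_locked = false;
}

/* Locking order after the patch: bus lock, spinlock, unlock, bus sync. */
static void sketch_irq_set_affinity(struct sketch_chip *c, unsigned int cpu)
{
	sketch_bus_lock(c);
	c->irqs_off = true;		/* stands in for raw_spin_lock_irqsave() */
	sketch_set_affinity(c, cpu);
	c->irqs_off = false;		/* stands in for raw_spin_unlock_irqrestore() */
	sketch_bus_sync_unlock(c);
}
```

The point of the split is that the only step which talks to the device happens outside the atomic section, which is what the existing I2C/SPI style chips already rely on.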
[patch V2 40/46] x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI
From: Thomas Gleixner Rename it to x86_msi_prepare() and handle the allocation type setup depending on the device type. Add a new arch_msi_prepare define which will be utilized by the upcoming device MSI support. Define it to NULL if not provided by an architecture in the generic MSI header. One arch specific function for MSI support is truly enough. Signed-off-by: Thomas Gleixner --- V2: Polish subject line --- arch/x86/include/asm/msi.h |4 +++- arch/x86/kernel/apic/msi.c | 27 --- drivers/pci/controller/pci-hyperv.c |2 +- include/linux/msi.h |4 4 files changed, 28 insertions(+), 9 deletions(-) --- a/arch/x86/include/asm/msi.h +++ b/arch/x86/include/asm/msi.h @@ -6,7 +6,9 @@ typedef struct irq_alloc_info msi_alloc_info_t; -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, msi_alloc_info_t *arg); +#define arch_msi_prepare x86_msi_prepare + #endif /* _ASM_X86_MSI_H */ --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -181,26 +181,39 @@ static struct irq_chip pci_msi_controlle .flags = IRQCHIP_SKIP_SET_WAKE, }; -int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, - msi_alloc_info_t *arg) +static void pci_msi_prepare(struct device *dev, msi_alloc_info_t *arg) { - struct pci_dev *pdev = to_pci_dev(dev); - struct msi_desc *desc = first_pci_msi_entry(pdev); + struct msi_desc *desc = first_msi_entry(dev); - init_irq_alloc_info(arg, NULL); if (desc->msi_attrib.is_msix) { arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX; } else { arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI; arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; } +} + +static void dev_msi_prepare(struct device *dev, msi_alloc_info_t *arg) +{ + arg->type = X86_IRQ_ALLOC_TYPE_DEV_MSI; +} + +int x86_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, + msi_alloc_info_t *arg) +{ + init_irq_alloc_info(arg, NULL); + + if (dev_is_pci(dev)) + pci_msi_prepare(dev, arg); + 
else + dev_msi_prepare(dev, arg); return 0; } -EXPORT_SYMBOL_GPL(pci_msi_prepare); +EXPORT_SYMBOL_GPL(x86_msi_prepare); static struct msi_domain_ops pci_msi_domain_ops = { - .msi_prepare= pci_msi_prepare, + .msi_prepare= x86_msi_prepare, }; static struct msi_domain_info pci_msi_domain_info = { --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1532,7 +1532,7 @@ static struct irq_chip hv_msi_irq_chip = }; static struct msi_domain_ops hv_msi_ops = { - .msi_prepare= pci_msi_prepare, + .msi_prepare= arch_msi_prepare, .msi_free = hv_msi_free, }; --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -430,4 +430,8 @@ static inline struct irq_domain *pci_msi } #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */ +#ifndef arch_msi_prepare +# define arch_msi_prepare NULL +#endif + #endif /* LINUX_MSI_H */ ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
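As a rough illustration of the dispatch in x86_msi_prepare() and of the arch_msi_prepare fallback define, the following userspace sketch may help. Every sk_* name, the enum values and the boolean device fields are assumptions for this example only; the real code works on struct device and msi_alloc_info_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical allocation types, loosely modelled on X86_IRQ_ALLOC_TYPE_*. */
enum sk_alloc_type {
	SK_TYPE_NONE,
	SK_TYPE_PCI_MSI,
	SK_TYPE_PCI_MSIX,
	SK_TYPE_DEV_MSI,
};

struct sk_dev {
	bool is_pci;	/* stands in for dev_is_pci() */
	bool is_msix;	/* stands in for desc->msi_attrib.is_msix */
};

struct sk_alloc_info {
	enum sk_alloc_type type;
	bool contiguous;	/* stands in for X86_IRQ_ALLOC_CONTIGUOUS_VECTORS */
};

/* One prepare function for all device kinds, as in the patch. */
static int sk_x86_msi_prepare(const struct sk_dev *dev,
			      struct sk_alloc_info *arg)
{
	/* Reset first, like init_irq_alloc_info() does. */
	arg->type = SK_TYPE_NONE;
	arg->contiguous = false;

	if (dev->is_pci) {
		if (dev->is_msix) {
			arg->type = SK_TYPE_PCI_MSIX;
		} else {
			arg->type = SK_TYPE_PCI_MSI;
			arg->contiguous = true;
		}
	} else {
		arg->type = SK_TYPE_DEV_MSI;
	}
	return 0;
}

/* An architecture header would provide the define ... */
#define sk_arch_msi_prepare sk_x86_msi_prepare

/* ... and the generic header falls back to NULL when it is absent. */
#ifndef sk_arch_msi_prepare
# define sk_arch_msi_prepare NULL
#endif
```

With this shape a generic user can assign sk_arch_msi_prepare to an ops member without caring whether the architecture supplies an implementation.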
[patch V2 44/46] platform-msi: Add device MSI infrastructure
From: Thomas Gleixner Add device specific MSI domain infrastructure for devices which have their own resource management and interrupt chip. These devices are not related to PCI and contrary to platform MSI they do not share a common resource and interrupt chip. They provide their own domain specific resource management and interrupt chip. This utilizes the new alloc/free override in a non evil way which avoids having yet another set of specialized alloc/free functions. Just using msi_domain_alloc/free_irqs() is sufficient While initially it was suggested and tried to piggyback device MSI on platform MSI, the better variant is to reimplement platform MSI on top of device MSI. Signed-off-by: Thomas Gleixner --- drivers/base/platform-msi.c | 131 include/linux/irqdomain.h |1 include/linux/msi.h | 24 kernel/irq/Kconfig |4 + 4 files changed, 160 insertions(+) --- a/drivers/base/platform-msi.c +++ b/drivers/base/platform-msi.c @@ -412,3 +412,134 @@ int platform_msi_domain_alloc(struct irq return err; } + +#ifdef CONFIG_DEVICE_MSI +/* + * Device specific MSI domain infrastructure for devices which have their + * own resource management and interrupt chip. These devices are not + * related to PCI and contrary to platform MSI they do not share a common + * resource and interrupt chip. They provide their own domain specific + * resource management and interrupt chip. + */ + +static void device_msi_free_msi_entries(struct device *dev) +{ + struct list_head *msi_list = dev_to_msi_list(dev); + struct msi_desc *entry, *tmp; + + list_for_each_entry_safe(entry, tmp, msi_list, list) { + list_del(&entry->list); + free_msi_entry(entry); + } +} + +/** + * device_msi_free_irqs - Free MSI interrupts assigned to a device + * @dev: Pointer to the device + * + * Frees the interrupt and the MSI descriptors. 
+ */ +static void device_msi_free_irqs(struct irq_domain *domain, struct device *dev) +{ + __msi_domain_free_irqs(domain, dev); + device_msi_free_msi_entries(dev); +} + +/** + * device_msi_alloc_irqs - Allocate MSI interrupts for a device + * @dev: Pointer to the device + * @nvec: Number of vectors + * + * Allocates the required number of MSI descriptors and the corresponding + * interrupt descriptors. + */ +static int device_msi_alloc_irqs(struct irq_domain *domain, struct device *dev, int nvec) +{ + int i, ret = -ENOMEM; + + for (i = 0; i < nvec; i++) { + struct msi_desc *entry = alloc_msi_entry(dev, 1, NULL); + + if (!entry) + goto fail; + list_add_tail(&entry->list, dev_to_msi_list(dev)); + } + + ret = __msi_domain_alloc_irqs(domain, dev, nvec); + if (!ret) + return 0; +fail: + device_msi_free_msi_entries(dev); + return ret; +} + +static void device_msi_update_dom_ops(struct msi_domain_info *info) +{ + if (!info->ops->domain_alloc_irqs) + info->ops->domain_alloc_irqs = device_msi_alloc_irqs; + if (!info->ops->domain_free_irqs) + info->ops->domain_free_irqs = device_msi_free_irqs; + if (!info->ops->msi_prepare) + info->ops->msi_prepare = arch_msi_prepare; +} + +/** + * device_msi_create_msi_irq_domain - Create an irq domain for devices + * @fwnode:Firmware node of the interrupt controller + * @info: MSI domain info to configure the new domain + * @parent:Parent domain + */ +struct irq_domain *device_msi_create_irq_domain(struct fwnode_handle *fn, + struct msi_domain_info *info, + struct irq_domain *parent) +{ + struct irq_domain *domain; + + if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS) + platform_msi_update_chip_ops(info); + + if (info->flags & MSI_FLAG_USE_DEF_DOM_OPS) + device_msi_update_dom_ops(info); + + msi_domain_set_default_info_flags(info); + + domain = msi_create_irq_domain(fn, info, parent); + if (domain) + irq_domain_update_bus_token(domain, DOMAIN_BUS_DEVICE_MSI); + return domain; +} + +#ifdef CONFIG_PCI +#include + +/** + * 
pci_subdevice_msi_create_irq_domain - Create an irq domain for subdevices + * @pdev: Pointer to PCI device for which the subdevice domain is created + * @info: MSI domain info to configure the new domain + */ +struct irq_domain *pci_subdevice_msi_create_irq_domain(struct pci_dev *pdev, + struct msi_domain_info *info) +{ + struct irq_domain *domain, *pdev_msi; + struct fwnode_handle *fn; + + /* +* Retrieve the MSI domain of the underlying PCI device's MSI +* domain. The PCI device domain's parent domain is also the parent +* domain of the
[patch V2 43/46] genirq/msi: Provide and use msi_domain_set_default_info_flags()
MSI interrupts have some common flags which should be set not only for
PCI/MSI interrupts.

Move the PCI/MSI flag setting into a common function so it can be reused.

Signed-off-by: Thomas Gleixner
---
V2: New patch
---
 drivers/pci/msi.c   |    7 +------
 include/linux/msi.h |    1 +
 kernel/irq/msi.c    |   24 ++++++++++++++++++++++++
 3 files changed, 26 insertions(+), 6 deletions(-)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1469,12 +1469,7 @@ struct irq_domain *pci_msi_create_irq_do
 	if (info->flags & MSI_FLAG_USE_DEF_CHIP_OPS)
 		pci_msi_domain_update_chip_ops(info);
 
-	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
-	if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
-		info->flags |= MSI_FLAG_MUST_REACTIVATE;
-
-	/* PCI-MSI is oneshot-safe */
-	info->chip->flags |= IRQCHIP_ONESHOT_SAFE;
+	msi_domain_set_default_info_flags(info);
 
 	domain = msi_create_irq_domain(fwnode, info, parent);
 	if (!domain)
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -410,6 +410,7 @@ int platform_msi_domain_alloc(struct irq
 void platform_msi_domain_free(struct irq_domain *domain, unsigned int virq,
 			      unsigned int nvec);
 void *platform_msi_get_host_data(struct irq_domain *domain);
+void msi_domain_set_default_info_flags(struct msi_domain_info *info);
 #endif /* CONFIG_GENERIC_MSI_IRQ_DOMAIN */
 
 #ifdef CONFIG_PCI_MSI_IRQ_DOMAIN
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -70,6 +70,30 @@ void get_cached_msi_msg(unsigned int irq
 EXPORT_SYMBOL_GPL(get_cached_msi_msg);
 
 #ifdef CONFIG_GENERIC_MSI_IRQ_DOMAIN
+void msi_domain_set_default_info_flags(struct msi_domain_info *info)
+{
+	/* Required so that a device latches a valid MSI message on startup */
+	info->flags |= MSI_FLAG_ACTIVATE_EARLY;
+
+	/*
+	 * Interrupt reservation mode allows to steer the MSI message of an
+	 * inactive device to a special (usually spurious interrupt) target.
+	 * This allows to prevent interrupt vector exhaustion e.g. on x86.
+	 * But (PCI)MSI interrupts are activated early - see above - so the
+	 * interrupt request/startup sequence would not try to allocate a
+	 * usable vector which means that the device interrupts would end
+	 * up on the special vector and issue spurious interrupt messages.
+	 * Setting the reactivation flag ensures that when the interrupt
+	 * is requested the activation is invoked again so that a real
+	 * vector can be allocated.
+	 */
+	if (IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE))
+		info->flags |= MSI_FLAG_MUST_REACTIVATE;
+
+	/* MSI is oneshot-safe at least in theory */
+	info->chip->flags |= IRQCHIP_ONESHOT_SAFE;
+}
+
 static inline void irq_chip_write_msi_msg(struct irq_data *data,
 					  struct msi_msg *msg)
 {
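To make the flag logic of msi_domain_set_default_info_flags() easy to poke at outside the kernel, here is a small sketch. The SK_* bit values and the sk_* structs are placeholders chosen for this example (the kernel's real values differ), and the reservation_mode parameter stands in for IS_ENABLED(CONFIG_GENERIC_IRQ_RESERVATION_MODE).

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder bit values for illustration; not the kernel's definitions. */
#define SK_MSI_FLAG_ACTIVATE_EARLY	(1u << 0)
#define SK_MSI_FLAG_MUST_REACTIVATE	(1u << 1)
#define SK_IRQCHIP_ONESHOT_SAFE		(1u << 0)

struct sk_chip {
	unsigned int flags;
};

struct sk_info {
	unsigned int flags;
	struct sk_chip *chip;
};

/*
 * Sets the defaults shared by all MSI domains. Existing bits are only
 * added to, never cleared, so callers may preset their own flags.
 */
static void sk_set_default_info_flags(struct sk_info *info,
				      bool reservation_mode)
{
	/* Device must latch a valid message on startup. */
	info->flags |= SK_MSI_FLAG_ACTIVATE_EARLY;

	/* Early activation plus reservation mode needs a second activation. */
	if (reservation_mode)
		info->flags |= SK_MSI_FLAG_MUST_REACTIVATE;

	info->chip->flags |= SK_IRQCHIP_ONESHOT_SAFE;
}
```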
[patch V2 46/46] irqchip: Add IMS (Interrupt Message Store) driver - NOT FOR MERGING
Generic IMS irq chips and irq domain implementations for IMS based devices in both variants: - Message store in an array in device memory - Message store in system RAM (part of queue memory) Allocation and freeing of interrupts happens via the generic msi_domain_alloc/free_irqs() interface. No special purpose IMS magic required as long as the interrupt domain is stored in the underlying device struct. Completely untested of course and mostly for illustration and educational purpose. This should of course be a modular irq chip, but adding that support is left as an exercise for the people who care about this deeply. Signed-off-by: Thomas Gleixner --- V2: Reworked to handle both devmem arrays and queue based storage. --- drivers/irqchip/Kconfig | 19 + drivers/irqchip/Makefile|1 drivers/irqchip/irq-ims-msi.c | 343 include/linux/irqchip/irq-ims-msi.h | 95 + 4 files changed, 458 insertions(+) --- a/drivers/irqchip/Kconfig +++ b/drivers/irqchip/Kconfig @@ -571,4 +571,23 @@ config LOONGSON_PCH_MSI help Support for the Loongson PCH MSI Controller. 
+config IMS_MSI + depends on PCI + select DEVICE_MSI + bool + +config IMS_MSI_ARRAY + bool "IMS Interrupt Message Storm MSI controller for device memory storage arrays" + select IMS_MSI + help + Support for IMS Interrupt Message Storm MSI controller + with IMS slot storage in a slot array in device memory + +config IMS_MSI_QUEUE + bool "IMS Interrupt Message Storm MSI controller for IMS queue storage" + select IMS_MSI + help + Support for IMS Interrupt Message Storm MSI controller + with IMS slot storage in the queue storage of a device + endmenu --- a/drivers/irqchip/Makefile +++ b/drivers/irqchip/Makefile @@ -111,3 +111,4 @@ obj-$(CONFIG_LOONGSON_HTPIC)+= irq-loo obj-$(CONFIG_LOONGSON_HTVEC) += irq-loongson-htvec.o obj-$(CONFIG_LOONGSON_PCH_PIC) += irq-loongson-pch-pic.o obj-$(CONFIG_LOONGSON_PCH_MSI) += irq-loongson-pch-msi.o +obj-$(CONFIG_IMS_MSI) += irq-ims-msi.o --- /dev/null +++ b/drivers/irqchip/irq-ims-msi.c @@ -0,0 +1,343 @@ +// SPDX-License-Identifier: GPL-2.0 +// (C) Copyright 2020 Thomas Gleixner +/* + * Shared interrupt chips and irq domains for IMS devices + */ +#include +#include +#include +#include + +#include + +#ifdef CONFIG_IMS_ARRAY + +struct ims_array_data { + struct ims_array_info info; + unsigned long map[0]; +}; + +static void ims_array_mask_irq(struct irq_data *data) +{ + struct msi_desc *desc = irq_data_get_msi_desc(data); + struct ims_slot __iomem *slot = desc->device_msi.priv_iomem; + u32 __iomem *ctrl = &slot->ctrl; + + iowrite32(ioread32(ctrl) & ~IMS_VECTOR_CTRL_UNMASK, ctrl); +} + +static void ims_array_unmask_irq(struct irq_data *data) +{ + struct msi_desc *desc = irq_data_get_msi_desc(data); + struct ims_slot __iomem *slot = desc->device_msi.priv_iomem; + u32 __iomem *ctrl = &slot->ctrl; + + iowrite32(ioread32(ctrl) | IMS_VECTOR_CTRL_UNMASK, ctrl); +} + +static void ims_array_write_msi_msg(struct irq_data *data, struct msi_msg *msg) +{ + struct msi_desc *desc = irq_data_get_msi_desc(data); + struct ims_slot __iomem *slot = 
desc->device_msi.priv_iomem; + + iowrite32(msg->address_lo, &slot->address_lo); + iowrite32(msg->address_hi, &slot->address_hi); + iowrite32(msg->data, &slot->data); +} + +static const struct irq_chip ims_array_msi_controller = { + .name = "IMS", + .irq_mask = ims_array_mask_irq, + .irq_unmask = ims_array_unmask_irq, + .irq_write_msi_msg = ims_array_write_msi_msg, + .irq_retrigger = irq_chip_retrigger_hierarchy, + .flags = IRQCHIP_SKIP_SET_WAKE, +}; + +static void ims_array_reset_slot(struct ims_slot __iomem *slot) +{ + iowrite32(0, &slot->address_lo); + iowrite32(0, &slot->address_hi); + iowrite32(0, &slot->data); + iowrite32(0, &slot->ctrl); +} + +static void ims_array_free_msi_store(struct irq_domain *domain, +struct device *dev) +{ + struct msi_domain_info *info = domain->host_data; + struct ims_array_data *ims = info->data; + struct msi_desc *entry; + + for_each_msi_entry(entry, dev) { + if (entry->device_msi.priv_iomem) { + clear_bit(entry->device_msi.hwirq, ims->map); + ims_array_reset_slot(entry->device_msi.priv_iomem); +
[patch V2 23/46] irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
From: Thomas Gleixner PCI devices behind a VMD bus are not subject to interrupt remapping, but the irq domain for VMD MSI cannot be distinguished from a regular PCI/MSI irq domain. Add a new domain bus token and allow it in the bus token check in msi_check_reservation_mode() to keep the functionality the same once VMD uses this token. Signed-off-by: Thomas Gleixner --- include/linux/irqdomain.h |1 + kernel/irq/msi.c |7 ++- 2 files changed, 7 insertions(+), 1 deletion(-) --- a/include/linux/irqdomain.h +++ b/include/linux/irqdomain.h @@ -84,6 +84,7 @@ enum irq_domain_bus_token { DOMAIN_BUS_FSL_MC_MSI, DOMAIN_BUS_TI_SCI_INTA_MSI, DOMAIN_BUS_WAKEUP, + DOMAIN_BUS_VMD_MSI, }; /** --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -370,8 +370,13 @@ static bool msi_check_reservation_mode(s { struct msi_desc *desc; - if (domain->bus_token != DOMAIN_BUS_PCI_MSI) + switch(domain->bus_token) { + case DOMAIN_BUS_PCI_MSI: + case DOMAIN_BUS_VMD_MSI: + break; + default: return false; + } if (!(info->flags & MSI_FLAG_MUST_REACTIVATE)) return false; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 38/46] iommu/amd: Remove domain search for PCI/MSI
Now that the domain can be retrieved through device::msi_domain the domain
search for PCI_MSI[X] is no longer required. Remove it.

Signed-off-by: Thomas Gleixner
---
V2: New patch
---
 drivers/iommu/amd/iommu.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3548,9 +3548,6 @@ static struct irq_domain *get_irq_domain
 	case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT:
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
 		return iommu->ir_domain;
-	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
-	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		return iommu->msi_domain;
 	default:
 		WARN_ON_ONCE(1);
 		return NULL;
[patch V2 28/46] x86/xen: Consolidate XEN-MSI init
From: Thomas Gleixner X86 cannot store the irq domain pointer in struct device without breaking XEN because the irq domain pointer takes precedence over arch_*_msi_irqs() fallbacks. To achieve this XEN MSI interrupt management needs to be wrapped into an irq domain. Move the x86_msi ops setup into a single function to prepare for this. Signed-off-by: Thomas Gleixner --- V2: Use the proper logic to select initial domain setup --- arch/x86/pci/xen.c | 51 --- 1 file changed, 32 insertions(+), 19 deletions(-) --- a/arch/x86/pci/xen.c +++ b/arch/x86/pci/xen.c @@ -372,7 +372,10 @@ static void xen_initdom_restore_msi_irqs WARN(ret && ret != -ENOSYS, "restore_msi -> %d\n", ret); } } -#endif +#else /* CONFIG_XEN_DOM0 */ +#define xen_initdom_setup_msi_irqs NULL +#define xen_initdom_restore_msi_irqs NULL +#endif /* !CONFIG_XEN_DOM0 */ static void xen_teardown_msi_irqs(struct pci_dev *dev) { @@ -404,7 +407,31 @@ static void xen_teardown_msi_irq(unsigne WARN_ON_ONCE(1); } -#endif +static __init void xen_setup_pci_msi(void) +{ + if (xen_pv_domain()) { + if (xen_initial_domain()) { + x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs; + x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs; + } else { + x86_msi.setup_msi_irqs = xen_setup_msi_irqs; + } + x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs; + pci_msi_ignore_mask = 1; + } else if (xen_hvm_domain()) { + x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs; + x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs; + } else { + WARN_ON_ONCE(1); + return; + } + + x86_msi.teardown_msi_irq = xen_teardown_msi_irq; +} + +#else /* CONFIG_PCI_MSI */ +static inline void xen_setup_pci_msi(void) { } +#endif /* CONFIG_PCI_MSI */ int __init pci_xen_init(void) { @@ -421,12 +448,7 @@ int __init pci_xen_init(void) /* Keep ACPI out of the picture */ acpi_noirq_set(); -#ifdef CONFIG_PCI_MSI - x86_msi.setup_msi_irqs = xen_setup_msi_irqs; - x86_msi.teardown_msi_irq = xen_teardown_msi_irq; - x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs; 
- pci_msi_ignore_mask = 1; -#endif + xen_setup_pci_msi(); return 0; } @@ -446,10 +468,7 @@ static void __init xen_hvm_msi_init(void ((eax & XEN_HVM_CPUID_APIC_ACCESS_VIRT) && boot_cpu_has(X86_FEATURE_APIC))) return; } - - x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs; - x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs; - x86_msi.teardown_msi_irq = xen_teardown_msi_irq; + xen_setup_pci_msi(); } #endif @@ -482,13 +501,7 @@ int __init pci_xen_initial_domain(void) { int irq; -#ifdef CONFIG_PCI_MSI - x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs; - x86_msi.teardown_msi_irq = xen_teardown_msi_irq; - x86_msi.teardown_msi_irqs = xen_teardown_pv_msi_irqs; - x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs; - pci_msi_ignore_mask = 1; -#endif + xen_setup_pci_msi(); __acpi_register_gsi = acpi_register_gsi_xen; __acpi_unregister_gsi = NULL; /* ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 37/46] iommu/vt-d: Remove domain search for PCI/MSI[X]
Now that the domain can be retrieved through device::msi_domain the domain
search for PCI_MSI[X] is no longer required. Remove it.

Signed-off-by: Thomas Gleixner
---
V2: New patch
---
 drivers/iommu/intel/irq_remapping.c |    3 ---
 1 file changed, 3 deletions(-)

--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1132,9 +1132,6 @@ static struct irq_domain *intel_get_irq_
 		return map_ioapic_to_ir(info->devid);
 	case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT:
 		return map_hpet_to_ir(info->devid);
-	case X86_IRQ_ALLOC_TYPE_PCI_MSI:
-	case X86_IRQ_ALLOC_TYPE_PCI_MSIX:
-		return map_dev_to_ir(msi_desc_to_pci_dev(info->desc));
 	default:
 		WARN_ON_ONCE(1);
 		return NULL;
[patch V2 32/46] iommu/amd: Store irq domain in struct device
From: Thomas Gleixner

As the next step to make X86 utilize the direct MSI irq domain operations
store the irq domain pointer in the device struct when a device is probed.

It only overrides the irqdomain of devices which are handled by a regular
PCI/MSI irq domain which protects PCI devices behind special busses like
VMD which have their own irq domain.

No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interrupt.

Signed-off-by: Thomas Gleixner
---
 drivers/iommu/amd/iommu.c |   17 ++++++++++++++++-
 1 file changed, 16 insertions(+), 1 deletion(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -729,7 +729,21 @@ static void iommu_poll_ga_log(struct amd
 		}
 	}
 }
-#endif /* CONFIG_IRQ_REMAP */
+
+static void
+amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu)
+{
+	if (!irq_remapping_enabled || !dev_is_pci(dev) ||
+	    pci_dev_has_special_msi_domain(to_pci_dev(dev)))
+		return;
+
+	dev_set_msi_domain(dev, iommu->msi_domain);
+}
+
+#else /* CONFIG_IRQ_REMAP */
+static inline void
+amd_iommu_set_pci_msi_domain(struct device *dev, struct amd_iommu *iommu) { }
+#endif /* !CONFIG_IRQ_REMAP */
 
 #define AMD_IOMMU_INT_MASK	\
 	(MMIO_STATUS_EVT_INT_MASK | \
@@ -2157,6 +2171,7 @@ static struct iommu_device *amd_iommu_pr
 		iommu_dev = ERR_PTR(ret);
 		iommu_ignore_device(dev);
 	} else {
+		amd_iommu_set_pci_msi_domain(dev, iommu);
 		iommu_dev = &iommu->iommu;
 	}
 
[patch V2 19/46] x86/msi: Use generic MSI domain ops
From: Thomas Gleixner pci_msi_get_hwirq() and pci_msi_set_desc are not longer special. Enable the generic MSI domain ops in the core and PCI MSI code unconditionally and get rid of the x86 specific implementations in the X86 MSI code and in the hyperv PCI driver. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/msi.h |2 -- arch/x86/kernel/apic/msi.c | 15 --- drivers/pci/controller/pci-hyperv.c |8 drivers/pci/msi.c |4 kernel/irq/msi.c|6 -- 5 files changed, 35 deletions(-) --- a/arch/x86/include/asm/msi.h +++ b/arch/x86/include/asm/msi.h @@ -9,6 +9,4 @@ typedef struct irq_alloc_info msi_alloc_ int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, msi_alloc_info_t *arg); -void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc); - #endif /* _ASM_X86_MSI_H */ --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -204,12 +204,6 @@ void native_teardown_msi_irq(unsigned in irq_domain_free_irqs(irq, 1); } -static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info, -msi_alloc_info_t *arg) -{ - return arg->hwirq; -} - int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, msi_alloc_info_t *arg) { @@ -228,17 +222,8 @@ int pci_msi_prepare(struct irq_domain *d } EXPORT_SYMBOL_GPL(pci_msi_prepare); -void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc) -{ - arg->desc = desc; - arg->hwirq = pci_msi_domain_calc_hwirq(desc); -} -EXPORT_SYMBOL_GPL(pci_msi_set_desc); - static struct msi_domain_ops pci_msi_domain_ops = { - .get_hwirq = pci_msi_get_hwirq, .msi_prepare= pci_msi_prepare, - .set_desc = pci_msi_set_desc, }; static struct msi_domain_info pci_msi_domain_info = { --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1531,16 +1531,8 @@ static struct irq_chip hv_msi_irq_chip = .irq_unmask = hv_irq_unmask, }; -static irq_hw_number_t hv_msi_domain_ops_get_hwirq(struct msi_domain_info *info, - msi_alloc_info_t *arg) -{ - return arg->hwirq; -} - 
static struct msi_domain_ops hv_msi_ops = { - .get_hwirq = hv_msi_domain_ops_get_hwirq, .msi_prepare= pci_msi_prepare, - .set_desc = pci_msi_set_desc, .msi_free = hv_msi_free, }; --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -1401,16 +1401,12 @@ static int pci_msi_domain_handle_error(s return error; } -#ifdef GENERIC_MSI_DOMAIN_OPS static void pci_msi_domain_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc) { arg->desc = desc; arg->hwirq = pci_msi_domain_calc_hwirq(desc); } -#else -#define pci_msi_domain_set_descNULL -#endif static struct msi_domain_ops pci_msi_domain_ops_default = { .set_desc = pci_msi_domain_set_desc, --- a/kernel/irq/msi.c +++ b/kernel/irq/msi.c @@ -187,7 +187,6 @@ static const struct irq_domain_ops msi_d .deactivate = msi_domain_deactivate, }; -#ifdef GENERIC_MSI_DOMAIN_OPS static irq_hw_number_t msi_domain_ops_get_hwirq(struct msi_domain_info *info, msi_alloc_info_t *arg) { @@ -206,11 +205,6 @@ static void msi_domain_ops_set_desc(msi_ { arg->desc = desc; } -#else -#define msi_domain_ops_get_hwirq NULL -#define msi_domain_ops_prepare NULL -#define msi_domain_ops_set_descNULL -#endif /* !GENERIC_MSI_DOMAIN_OPS */ static int msi_domain_ops_init(struct irq_domain *domain, struct msi_domain_info *info, ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 12/46] x86/irq: Prepare consolidation of irq_alloc_info
From: Thomas Gleixner struct irq_alloc_info is a horrible zoo of unnamed structs in a union. Many of the struct fields can be generic and don't have to be type specific like hpet_id, ioapic_id... Provide a generic set of members to prepare for the consolidation. The goal is to make irq_alloc_info have the same basic member as the generic msi_alloc_info so generic MSI domain ops can be reused and yet more mess can be avoided when (non-PCI) device MSI support comes along. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h | 22 -- 1 file changed, 16 insertions(+), 6 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -44,10 +44,25 @@ enum irq_alloc_type { X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT, }; +/** + * irq_alloc_info - X86 specific interrupt allocation info + * @type: X86 specific allocation type + * @flags: Flags for allocation tweaks + * @devid: Device ID for allocations + * @hwirq: Associated hw interrupt number in the domain + * @mask: CPU mask for vector allocation + * @desc: Pointer to msi descriptor + * @data: Allocation specific data + */ struct irq_alloc_info { enum irq_alloc_type type; u32 flags; - const struct cpumask*mask; /* CPU mask for vector allocation */ + u32 devid; + irq_hw_number_t hwirq; + const struct cpumask*mask; + struct msi_desc *desc; + void*data; + union { int unused; #ifdef CONFIG_HPET_TIMER @@ -88,11 +103,6 @@ struct irq_alloc_info { char*uv_name; }; #endif -#if IS_ENABLED(CONFIG_VMD) - struct { - struct msi_desc *desc; - }; -#endif }; }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 25/46] PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
From: Thomas Gleixner

Provide a helper function to check whether a PCI device is handled by a
non-standard PCI/MSI domain. This will be used to exclude devices which
hang off a special bus, e.g. VMD, from the irq domain override in irq
remapping.

Signed-off-by: Thomas Gleixner
Acked-by: Bjorn Helgaas
---
 drivers/pci/msi.c   |   22 ++++++++++++++++++++++
 include/linux/msi.h |    1 +
 2 files changed, 23 insertions(+)

--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -1553,4 +1553,26 @@ struct irq_domain *pci_msi_get_device_do
 				       DOMAIN_BUS_PCI_MSI);
 	return dom;
 }
+
+/**
+ * pci_dev_has_special_msi_domain - Check whether the device is handled by
+ *				    a non-standard PCI-MSI domain
+ * @pdev:	The PCI device to check.
+ *
+ * Returns: True if the device irqdomain or the bus irqdomain is
+ * non-standard PCI/MSI.
+ */
+bool pci_dev_has_special_msi_domain(struct pci_dev *pdev)
+{
+	struct irq_domain *dom = dev_get_msi_domain(&pdev->dev);
+
+	if (!dom)
+		dom = dev_get_msi_domain(&pdev->bus->dev);
+
+	if (!dom)
+		return true;
+
+	return dom->bus_token != DOMAIN_BUS_PCI_MSI;
+}
+
 #endif /* CONFIG_PCI_MSI_IRQ_DOMAIN */
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -374,6 +374,7 @@ int pci_msi_domain_check_cap(struct irq_
 			     struct msi_domain_info *info, struct device *dev);
 u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev);
 struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev);
+bool pci_dev_has_special_msi_domain(struct pci_dev *pdev);
 #else
 static inline struct irq_domain *pci_msi_get_device_domain(struct pci_dev *pdev)
 {
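The lookup order of the helper above (device domain first, then the bus domain, and "special" when neither exists or the token is not the plain PCI/MSI one) can be tried out with a small userspace model. The sk_* types below are hypothetical stand-ins, and SK_BUS_VMD_MSI only mirrors the role that DOMAIN_BUS_VMD_MSI plays in the VMD patches of this series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only the lookup order mirrors the patch. */
enum sk_bus_token {
	SK_BUS_PCI_MSI,
	SK_BUS_VMD_MSI,
};

struct sk_domain {
	enum sk_bus_token bus_token;
};

struct sk_bus {
	const struct sk_domain *msi_domain;	/* may be NULL */
};

struct sk_pdev {
	const struct sk_domain *msi_domain;	/* may be NULL */
	const struct sk_bus *bus;
};

static bool sk_has_special_msi_domain(const struct sk_pdev *pdev)
{
	const struct sk_domain *dom = pdev->msi_domain;

	/* Fall back to the domain of the bus the device sits on. */
	if (!dom)
		dom = pdev->bus->msi_domain;

	/* No domain at all: treat as special, i.e. leave it alone. */
	if (!dom)
		return true;

	return dom->bus_token != SK_BUS_PCI_MSI;
}
```

Returning true for the "no domain" case errs on the side of not overriding anything, which matches how the AMD IOMMU patch in this series uses the helper as an early-return condition.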
[patch V2 24/46] PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
From: Thomas Gleixner

Devices on the VMD bus use their own MSI irq domain, but it is not
distinguishable from regular PCI/MSI irq domains. Being able to tell the
two apart is required to exclude VMD devices from getting the irq domain
pointer set by interrupt remapping.

Override the default bus token.

Signed-off-by: Thomas Gleixner
Acked-by: Bjorn Helgaas
---
 drivers/pci/controller/vmd.c | 6 ++
 1 file changed, 6 insertions(+)

--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -579,6 +579,12 @@ static int vmd_enable_domain(struct vmd_
 		return -ENODEV;
 	}
 
+	/*
+	 * Override the irq domain bus token so the domain can be distinguished
+	 * from a regular PCI/MSI domain.
+	 */
+	irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
+
 	pci_add_resource(&resources, &vmd->resources[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[2], offset[1]);
[patch V2 34/46] PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
From: Thomas Gleixner The arch_.*_msi_irq[s] fallbacks are compiled in whether an architecture requires them or not. Architectures which are fully utilizing hierarchical irq domains should never call into that code. It's not only architectures which depend on that by implementing one or more of the weak functions, there is also a bunch of drivers which relies on the weak functions which invoke msi_controller::setup_irq[s] and msi_controller::teardown_irq. Make the architectures and drivers which rely on them select them in Kconfig and if not selected replace them by stub functions which emit a warning and fail the PCI/MSI interrupt allocation. Signed-off-by: Thomas Gleixner --- V2: Make the architectures (and drivers) which need the fallbacks select them and not the other way round (Bjorn). --- arch/ia64/Kconfig |1 + arch/mips/Kconfig |1 + arch/powerpc/Kconfig |1 + arch/s390/Kconfig |1 + arch/sparc/Kconfig |1 + arch/x86/Kconfig |1 + drivers/pci/Kconfig|3 +++ drivers/pci/controller/Kconfig |3 +++ drivers/pci/msi.c |3 ++- include/linux/msi.h| 31 ++- 10 files changed, 40 insertions(+), 6 deletions(-) --- a/arch/ia64/Kconfig +++ b/arch/ia64/Kconfig @@ -56,6 +56,7 @@ config IA64 select NEED_DMA_MAP_STATE select NEED_SG_DMA_LENGTH select NUMA if !FLATMEM + select PCI_MSI_ARCH_FALLBACKS default y help The Itanium Processor Family is Intel's 64-bit successor to --- a/arch/mips/Kconfig +++ b/arch/mips/Kconfig @@ -86,6 +86,7 @@ config MIPS select MODULES_USE_ELF_REL if MODULES select MODULES_USE_ELF_RELA if MODULES && 64BIT select PERF_USE_VMALLOC + select PCI_MSI_ARCH_FALLBACKS select RTC_LIB select SYSCTL_EXCEPTION_TRACE select VIRT_TO_BUS --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -246,6 +246,7 @@ config PPC select OLD_SIGACTIONif PPC32 select OLD_SIGSUSPEND select PCI_DOMAINS if PCI + select PCI_MSI_ARCH_FALLBACKS select PCI_SYSCALL if PCI select PPC_DAWR if PPC64 select RTC_LIB --- a/arch/s390/Kconfig +++ b/arch/s390/Kconfig @@ -185,6 +185,7 @@ config S390 
select OLD_SIGSUSPEND3 select PCI_DOMAINS if PCI select PCI_MSI if PCI + select PCI_MSI_ARCH_FALLBACKS select SPARSE_IRQ select SYSCTL_EXCEPTION_TRACE select THREAD_INFO_IN_TASK --- a/arch/sparc/Kconfig +++ b/arch/sparc/Kconfig @@ -43,6 +43,7 @@ config SPARC select GENERIC_STRNLEN_USER select MODULES_USE_ELF_RELA select PCI_SYSCALL if PCI + select PCI_MSI_ARCH_FALLBACKS select ODD_RT_SIGACTION select OLD_SIGSUSPEND select CPU_NO_EFFICIENT_FFS --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -225,6 +225,7 @@ config X86 select NEED_SG_DMA_LENGTH select PCI_DOMAINS if PCI select PCI_LOCKLESS_CONFIG if PCI + select PCI_MSI_ARCH_FALLBACKS select PERF_EVENTS select RTC_LIB select RTC_MC146818_LIB --- a/drivers/pci/Kconfig +++ b/drivers/pci/Kconfig @@ -56,6 +56,9 @@ config PCI_MSI_IRQ_DOMAIN depends on PCI_MSI select GENERIC_MSI_IRQ_DOMAIN +config PCI_MSI_ARCH_FALLBACKS + bool + config PCI_QUIRKS default y bool "Enable PCI quirk workarounds" if EXPERT --- a/drivers/pci/controller/Kconfig +++ b/drivers/pci/controller/Kconfig @@ -41,6 +41,7 @@ config PCI_TEGRA bool "NVIDIA Tegra PCIe controller" depends on ARCH_TEGRA || COMPILE_TEST depends on PCI_MSI_IRQ_DOMAIN + select PCI_MSI_ARCH_FALLBACKS help Say Y here if you want support for the PCIe host controller found on NVIDIA Tegra SoCs. @@ -67,6 +68,7 @@ config PCIE_RCAR_HOST bool "Renesas R-Car PCIe host controller" depends on ARCH_RENESAS || COMPILE_TEST depends on PCI_MSI_IRQ_DOMAIN + select PCI_MSI_ARCH_FALLBACKS help Say Y here if you want PCIe controller support on R-Car SoCs in host mode. @@ -103,6 +105,7 @@ config PCIE_XILINX_CPM bool "Xilinx Versal CPM host bridge support" depends on ARCH_ZYNQMP || COMPILE_TEST select PCI_HOST_COMMON + select PCI_MSI_ARCH_FALLBACKS help Say 'Y' here if you want kernel support for the Xilinx Versal CPM host bridge. 
--- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -58,8 +58,8 @@ static void pci_msi_teardown_msi_irqs(st #define pci_msi_teardown_msi_irqs arch_teardown_msi_irqs #endif +#ifdef CONFIG_PCI_MSI_ARCH_FALLBACKS /* Arch hooks */ - int __weak arch_setup_msi_irq(struct pci_d
[patch V2 17/46] PCI/MSI: Rework pci_msi_domain_calc_hwirq()
From: Thomas Gleixner Retrieve the PCI device from the msi descriptor instead of doing so at the call sites. Signed-off-by: Thomas Gleixner Acked-by: Bjorn Helgaas --- V2: Address Bjorns comments (subject prefix, pdev/dev) --- arch/x86/kernel/apic/msi.c |2 +- drivers/pci/msi.c |9 - include/linux/msi.h|3 +-- 3 files changed, 6 insertions(+), 8 deletions(-) --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -232,7 +232,7 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare); void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc) { - arg->msi_hwirq = pci_msi_domain_calc_hwirq(arg->msi_dev, desc); + arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc); } EXPORT_SYMBOL_GPL(pci_msi_set_desc); --- a/drivers/pci/msi.c +++ b/drivers/pci/msi.c @@ -1346,14 +1346,14 @@ void pci_msi_domain_write_msg(struct irq /** * pci_msi_domain_calc_hwirq - Generate a unique ID for an MSI source - * @dev: Pointer to the PCI device * @desc: Pointer to the MSI descriptor * * The ID number is only used within the irqdomain. 
*/ -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev, - struct msi_desc *desc) +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc) { + struct pci_dev *dev = msi_desc_to_pci_dev(desc); + return (irq_hw_number_t)desc->msi_attrib.entry_nr | pci_dev_id(dev) << 11 | (pci_domain_nr(dev->bus) & 0x) << 27; @@ -1406,8 +1406,7 @@ static void pci_msi_domain_set_desc(msi_ struct msi_desc *desc) { arg->desc = desc; - arg->hwirq = pci_msi_domain_calc_hwirq(msi_desc_to_pci_dev(desc), - desc); + arg->hwirq = pci_msi_domain_calc_hwirq(desc); } #else #define pci_msi_domain_set_descNULL --- a/include/linux/msi.h +++ b/include/linux/msi.h @@ -369,8 +369,7 @@ void pci_msi_domain_write_msg(struct irq struct irq_domain *pci_msi_create_irq_domain(struct fwnode_handle *fwnode, struct msi_domain_info *info, struct irq_domain *parent); -irq_hw_number_t pci_msi_domain_calc_hwirq(struct pci_dev *dev, - struct msi_desc *desc); +irq_hw_number_t pci_msi_domain_calc_hwirq(struct msi_desc *desc); int pci_msi_domain_check_cap(struct irq_domain *domain, struct msi_domain_info *info, struct device *dev); u32 pci_msi_domain_get_msi_rid(struct irq_domain *domain, struct pci_dev *pdev); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
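As a worked example of the ID layout computed by pci_msi_domain_calc_hwirq(), the sketch below packs the same three fields in userspace. The model_* names are hypothetical. The domain mask constant is garbled in the archived copy above, so the full 32-bit mask used here is an assumption, and the (bus << 8) | devfn device ID follows the usual PCI_DEVID() convention.

```c
#include <stdint.h>

/* Device ID as bus number in bits 15:8 and devfn in bits 7:0. */
static uint16_t model_pci_dev_id(uint8_t bus, uint8_t devfn)
{
	return (uint16_t)((bus << 8) | devfn);
}

/*
 * Pack the MSI source ID:
 *   bits 10:0   entry number within the device
 *   bits 26:11  PCI device ID (bus/devfn)
 *   bits 27+    PCI domain (segment) number, mask assumed to be 32 bits
 */
static uint64_t model_calc_hwirq(uint32_t domain_nr, uint8_t bus,
				 uint8_t devfn, uint16_t entry_nr)
{
	return (uint64_t)entry_nr |
	       (uint64_t)model_pci_dev_id(bus, devfn) << 11 |
	       ((uint64_t)domain_nr & 0xFFFFFFFFu) << 27;
}
```

For domain 0, bus 1, devfn 0x10 and entry 3 the device ID is 0x110, which shifted by 11 is 0x88000, so the result is 0x88003. The same device in domain 1 yields 0x8088003. The 11 low bits are enough for the entry number because MSI-X allows at most 2048 vectors per function.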
[patch V2 30/46] x86/xen: Wrap XEN MSI management into irqdomain
From: Thomas Gleixner

To allow utilizing the irq domain pointer in struct device it is necessary
to make XEN/MSI irq domain compatible.

While the right solution would be to truly convert XEN to irq domains, this
is an exercise which is not possible for mere mortals with limited
XENology.

Provide a plain irqdomain wrapper around XEN. While this is a blatant
violation of the irqdomain design, it's the only solution for a XEN
ignorant person to make progress on the issue which triggered this change.

Signed-off-by: Thomas Gleixner
Acked-by: Juergen Gross
---
Note: This is completely untested, but it compiles so it must be perfect.
---
 arch/x86/pci/xen.c | 63 +
 1 file changed, 63 insertions(+)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -407,6 +407,63 @@ static void xen_teardown_msi_irq(unsigne
 	WARN_ON_ONCE(1);
 }
 
+static int xen_msi_domain_alloc_irqs(struct irq_domain *domain,
+				     struct device *dev, int nvec)
+{
+	int type;
+
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return -EINVAL;
+
+	if (first_msi_entry(dev)->msi_attrib.is_msix)
+		type = PCI_CAP_ID_MSIX;
+	else
+		type = PCI_CAP_ID_MSI;
+
+	return x86_msi.setup_msi_irqs(to_pci_dev(dev), nvec, type);
+}
+
+static void xen_msi_domain_free_irqs(struct irq_domain *domain,
+				     struct device *dev)
+{
+	if (WARN_ON_ONCE(!dev_is_pci(dev)))
+		return;
+
+	x86_msi.teardown_msi_irqs(to_pci_dev(dev));
+}
+
+static struct msi_domain_ops xen_pci_msi_domain_ops = {
+	.domain_alloc_irqs	= xen_msi_domain_alloc_irqs,
+	.domain_free_irqs	= xen_msi_domain_free_irqs,
+};
+
+static struct msi_domain_info xen_pci_msi_domain_info = {
+	.ops			= &xen_pci_msi_domain_ops,
+};
+
+/*
+ * This irq domain is a blatant violation of the irq domain design, but
+ * distangling XEN into real irq domains is not a job for mere mortals with
+ * limited XENology. But it's the least dangerous way for a mere mortal to
+ * get rid of the arch_*_msi_irqs() hackery in order to store the irq
+ * domain pointer in struct device. This irq domain wrappery allows to do
+ * that without breaking XEN terminally.
+ */
+static __init struct irq_domain *xen_create_pci_msi_domain(void)
+{
+	struct irq_domain *d = NULL;
+	struct fwnode_handle *fn;
+
+	fn = irq_domain_alloc_named_fwnode("XEN-MSI");
+	if (fn)
+		d = msi_create_irq_domain(fn, &xen_pci_msi_domain_info, NULL);
+
+	/* FIXME: No idea how to survive if this fails */
+	BUG_ON(!d);
+
+	return d;
+}
+
 static __init void xen_setup_pci_msi(void)
 {
 	if (xen_pv_domain()) {
@@ -427,6 +484,12 @@ static __init void xen_setup_pci_msi(voi
 	}
 
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+
+	/*
+	 * Override the PCI/MSI irq domain init function. No point
+	 * in allocating the native domain and never use it.
+	 */
+	x86_init.irqs.create_pci_msi_domain = xen_create_pci_msi_domain;
 }
 
 #else /* CONFIG_PCI_MSI */
[patch V2 09/46] iommu/vt-d: Consolidate irq domain getter
From: Thomas Gleixner The irq domain request mode is now indicated in irq_alloc_info::type. Consolidate the two getter functions into one. Signed-off-by: Thomas Gleixner --- drivers/iommu/intel/irq_remapping.c | 67 1 file changed, 24 insertions(+), 43 deletions(-) --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -204,35 +204,40 @@ static int modify_irte(struct irq_2_iomm return rc; } -static struct intel_iommu *map_hpet_to_ir(u8 hpet_id) +static struct irq_domain *map_hpet_to_ir(u8 hpet_id) { int i; - for (i = 0; i < MAX_HPET_TBS; i++) + for (i = 0; i < MAX_HPET_TBS; i++) { if (ir_hpet[i].id == hpet_id && ir_hpet[i].iommu) - return ir_hpet[i].iommu; + return ir_hpet[i].iommu->ir_domain; + } return NULL; } -static struct intel_iommu *map_ioapic_to_ir(int apic) +static struct intel_iommu *map_ioapic_to_iommu(int apic) { int i; - for (i = 0; i < MAX_IO_APICS; i++) + for (i = 0; i < MAX_IO_APICS; i++) { if (ir_ioapic[i].id == apic && ir_ioapic[i].iommu) return ir_ioapic[i].iommu; + } return NULL; } -static struct intel_iommu *map_dev_to_ir(struct pci_dev *dev) +static struct irq_domain *map_ioapic_to_ir(int apic) { - struct dmar_drhd_unit *drhd; + struct intel_iommu *iommu = map_ioapic_to_iommu(apic); - drhd = dmar_find_matched_drhd_unit(dev); - if (!drhd) - return NULL; + return iommu ? iommu->ir_domain : NULL; +} + +static struct irq_domain *map_dev_to_ir(struct pci_dev *dev) +{ + struct dmar_drhd_unit *drhd = dmar_find_matched_drhd_unit(dev); - return drhd->iommu; + return drhd ? 
drhd->iommu->ir_msi_domain : NULL; } static int clear_entries(struct irq_2_iommu *irq_iommu) @@ -996,7 +1001,7 @@ static int __init parse_ioapics_under_ir for (ioapic_idx = 0; ioapic_idx < nr_ioapics; ioapic_idx++) { int ioapic_id = mpc_ioapic_id(ioapic_idx); - if (!map_ioapic_to_ir(ioapic_id)) { + if (!map_ioapic_to_iommu(ioapic_id)) { pr_err(FW_BUG "ioapic %d has no mapping iommu, " "interrupt remapping will be disabled\n", ioapic_id); @@ -1101,47 +1106,23 @@ static void prepare_irte(struct irte *ir irte->redir_hint = 1; } -static struct irq_domain *intel_get_ir_irq_domain(struct irq_alloc_info *info) +static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info) { - struct intel_iommu *iommu = NULL; - if (!info) return NULL; switch (info->type) { case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: - iommu = map_ioapic_to_ir(info->ioapic_id); - break; + return map_ioapic_to_ir(info->ioapic_id); case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: - iommu = map_hpet_to_ir(info->hpet_id); - break; - default: - BUG_ON(1); - break; - } - - return iommu ? 
iommu->ir_domain : NULL; -} - -static struct irq_domain *intel_get_irq_domain(struct irq_alloc_info *info) -{ - struct intel_iommu *iommu; - - if (!info) - return NULL; - - switch (info->type) { + return map_hpet_to_ir(info->hpet_id); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - iommu = map_dev_to_ir(info->msi_dev); - if (iommu) - return iommu->ir_msi_domain; - break; + return map_dev_to_ir(info->msi_dev); default: - break; + WARN_ON_ONCE(1); + return NULL; } - - return NULL; } struct irq_remap_ops intel_irq_remap_ops = { @@ -1150,7 +1131,7 @@ struct irq_remap_ops intel_irq_remap_ops .disable= disable_irq_remapping, .reenable = reenable_irq_remapping, .enable_faulting= enable_drhd_fault_handling, - .get_ir_irq_domain = intel_get_ir_irq_domain, + .get_ir_irq_domain = intel_get_irq_domain, .get_irq_domain = intel_get_irq_domain, }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
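To illustrate why a single getter is enough once the request mode lives in the allocation type, here is a small userspace sketch. The model_* enum, the two static domains and the assumption that every lookup succeeds (one IOMMU, no failed ioapic/hpet/device mapping) are simplifications made for this example and are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for the relevant enum irq_alloc_type values. */
enum model_alloc_type {
	MODEL_TYPE_IOAPIC_GET_PARENT,
	MODEL_TYPE_HPET_GET_PARENT,
	MODEL_TYPE_PCI_MSI,
	MODEL_TYPE_PCI_MSIX,
	MODEL_TYPE_OTHER,
};

struct model_domain { const char *name; };
struct model_info   { enum model_alloc_type type; };

/* Stand-ins for iommu->ir_domain and iommu->ir_msi_domain. */
static struct model_domain model_ir_domain     = { "remapping parent" };
static struct model_domain model_ir_msi_domain = { "remapped PCI/MSI" };

/* One getter: the type says whether the parent or the device domain is wanted. */
static struct model_domain *model_get_irq_domain(const struct model_info *info)
{
	if (!info)
		return NULL;

	switch (info->type) {
	case MODEL_TYPE_IOAPIC_GET_PARENT:
	case MODEL_TYPE_HPET_GET_PARENT:
		return &model_ir_domain;
	case MODEL_TYPE_PCI_MSI:
	case MODEL_TYPE_PCI_MSIX:
		return &model_ir_msi_domain;
	default:
		/* The patch warns once here instead of calling BUG(). */
		return NULL;
	}
}
```

Because both callbacks in the ops structure can now point at the same function, the distinction between "parent" and "device" domain retrieval is carried entirely by the caller-supplied type.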
[patch V2 26/46] x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
From: Thomas Gleixner

The only user is in the same file and the name is too generic because this
function is only ever used for HVM domains.

Signed-off-by: Thomas Gleixner
Reviewed-by: Juergen Gross
---
 arch/x86/pci/xen.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -420,7 +420,7 @@ int __init pci_xen_init(void)
 }
 
 #ifdef CONFIG_PCI_MSI
-void __init xen_msi_init(void)
+static void __init xen_hvm_msi_init(void)
 {
 	if (!disable_apic) {
 		/*
@@ -460,7 +460,7 @@ int __init pci_xen_hvm_init(void)
 	 * We need to wait until after x2apic is initialized
 	 * before we can set MSI IRQ ops.
 	 */
-	x86_platform.apic_post_init = xen_msi_init;
+	x86_platform.apic_post_init = xen_hvm_msi_init;
 #endif
 	return 0;
 }
[patch V2 21/46] x86/pci: Reduce #ifdeffery in PCI init code
From: Thomas Gleixner

Adding a function call before the first #ifdef in pci_arch_init() triggers
a 'mixed declarations and code' warning if PCI_DIRECT is enabled.

Use stub functions and move the #ifdeffery to the header file where it is
not in the way.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/pci_x86.h | 11 +++
 arch/x86/pci/init.c            | 10 +++---
 2 files changed, 14 insertions(+), 7 deletions(-)

--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -114,9 +114,20 @@ extern const struct pci_raw_ops pci_dire
 extern bool port_cf9_safe;
 
 /* arch_initcall level */
+#ifdef CONFIG_PCI_DIRECT
 extern int pci_direct_probe(void);
 extern void pci_direct_init(int type);
+#else
+static inline int pci_direct_probe(void) { return -1; }
+static inline void pci_direct_init(int type) { }
+#endif
+
+#ifdef CONFIG_PCI_BIOS
 extern void pci_pcbios_init(void);
+#else
+static inline void pci_pcbios_init(void) { }
+#endif
+
 extern void __init dmi_check_pciprobe(void);
 extern void __init dmi_check_skip_isa_align(void);
 
--- a/arch/x86/pci/init.c
+++ b/arch/x86/pci/init.c
@@ -8,11 +8,9 @@
    in the right sequence from here. */
 static __init int pci_arch_init(void)
 {
-#ifdef CONFIG_PCI_DIRECT
-	int type = 0;
+	int type;
 
 	type = pci_direct_probe();
-#endif
 
 	if (!(pci_probe & PCI_PROBE_NOEARLY))
 		pci_mmcfg_early_init();
@@ -20,18 +18,16 @@ static __init int pci_arch_init(void)
 	if (x86_init.pci.arch_init && !x86_init.pci.arch_init())
 		return 0;
 
-#ifdef CONFIG_PCI_BIOS
 	pci_pcbios_init();
-#endif
+
 	/*
 	 * don't check for raw_pci_ops here because we want pcbios as last
 	 * fallback, yet it's needed to run first to set pcibios_last_bus
 	 * in case legacy PCI probing is used. otherwise detecting peer busses
 	 * fails.
 	 */
-#ifdef CONFIG_PCI_DIRECT
 	pci_direct_init(type);
-#endif
+
 	if (!raw_pci_ops && !raw_pci_ext_ops)
 		printk(KERN_ERR
 		"PCI: Fatal: No config space access function found\n");
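The stub pattern used here can be tried in isolation. The sketch below is a userspace illustration only: MODEL_HAVE_DIRECT plays the role of CONFIG_PCI_DIRECT and is deliberately left undefined, and the model_* functions are hypothetical names, not kernel symbols.

```c
/* "Header" part: the only place that knows about the config option. */
#ifdef MODEL_HAVE_DIRECT
int model_direct_probe(void);
void model_direct_init(int type);
#else
/* Stubs compile to nothing useful but keep the callers free of #ifdefs. */
static inline int model_direct_probe(void) { return -1; }
static inline void model_direct_init(int type) { (void)type; }
#endif

/* "Caller" part: one declaration at the top, then plain code. */
static int model_arch_init(void)
{
	int type;

	type = model_direct_probe();
	/* further init calls can be added here without reshuffling #ifdefs */
	model_direct_init(type);

	return type;
}
```

With the option disabled the probe stub reports -1, mirroring the stub added to pci_x86.h, and the init function body reads the same in both configurations.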
[patch V2 31/46] iommu/vt-d: Store irq domain in struct device
From: Thomas Gleixner

As a first step to make X86 utilize the direct MSI irq domain operations
store the irq domain pointer in the device struct when a device is probed.

This is done from dmar_pci_bus_add_dev() because it has to work even when
DMA remapping is disabled. It only overrides the irqdomain of devices which
are handled by a regular PCI/MSI irq domain, which protects PCI devices
behind special busses like VMD which have their own irq domain.

No functional change. It just avoids the redirection through
arch_*_msi_irqs() and allows the PCI/MSI core to directly invoke the irq
domain alloc/free functions instead of having to look up the irq domain for
every single MSI interrupt.

Signed-off-by: Thomas Gleixner
---
V2: Add missing forward declaration
---
 drivers/iommu/intel/dmar.c          |  3 +++
 drivers/iommu/intel/irq_remapping.c | 16
 include/linux/intel-iommu.h         |  7 +++
 3 files changed, 26 insertions(+)

--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -316,6 +316,9 @@ static int dmar_pci_bus_add_dev(struct d
 	if (ret < 0 && dmar_dev_scope_status == 0)
 		dmar_dev_scope_status = ret;
 
+	if (ret >= 0)
+		intel_irq_remap_add_device(info);
+
 	return ret;
 }
 
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -1086,6 +1086,22 @@ static int reenable_irq_remapping(int ei
 	return -1;
 }
 
+/*
+ * Store the MSI remapping domain pointer in the device if enabled.
+ *
+ * This is called from dmar_pci_bus_add_dev() so it works even when DMA
+ * remapping is disabled. Only update the pointer if the device is not
+ * already handled by a non default PCI/MSI interrupt domain. This protects
+ * e.g. VMD devices.
+ */
+void intel_irq_remap_add_device(struct dmar_pci_notify_info *info)
+{
+	if (!irq_remapping_enabled || pci_dev_has_special_msi_domain(info->dev))
+		return;
+
+	dev_set_msi_domain(&info->dev->dev, map_dev_to_ir(info->dev));
+}
+
 static void prepare_irte(struct irte *irte, int vector, unsigned int dest)
 {
 	memset(irte, 0, sizeof(*irte));
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -425,6 +425,8 @@ struct q_inval {
 	int             free_cnt;
 };
 
+struct dmar_pci_notify_info;
+
 #ifdef CONFIG_IRQ_REMAP
 /* 1MB - maximum possible interrupt remapping table size */
 #define INTR_REMAP_PAGE_ORDER	8
@@ -439,6 +441,11 @@ struct ir_table {
 	struct irte *base;
 	unsigned long *bitmap;
 };
+
+void intel_irq_remap_add_device(struct dmar_pci_notify_info *info);
+#else
+static inline void
+intel_irq_remap_add_device(struct dmar_pci_notify_info *info) { }
 #endif
 
 struct iommu_flush {
[patch V2 08/46] x86/irq: Add allocation type for parent domain retrieval
From: Thomas Gleixner irq_remapping_ir_irq_domain() is used to retrieve the remapping parent domain for an allocation type. irq_remapping_irq_domain() is for retrieving the actual device domain for allocating interrupts for a device. The two functions are similar and can be unified by using explicit modes for parent irq domain retrieval. Add X86_IRQ_ALLOC_TYPE_IOAPIC/HPET_GET_PARENT and use it in the iommu implementations. Drop the parent domain retrieval for PCI_MSI/X as that is unused. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h |2 ++ arch/x86/kernel/apic/io_apic.c |2 +- arch/x86/kernel/apic/msi.c |2 +- drivers/iommu/amd/iommu.c |8 drivers/iommu/hyperv-iommu.c|2 +- drivers/iommu/intel/irq_remapping.c |8 ++-- 6 files changed, 15 insertions(+), 9 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -40,6 +40,8 @@ enum irq_alloc_type { X86_IRQ_ALLOC_TYPE_PCI_MSIX, X86_IRQ_ALLOC_TYPE_DMAR, X86_IRQ_ALLOC_TYPE_UV, + X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT, + X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT, }; struct irq_alloc_info { --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2296,7 +2296,7 @@ static int mp_irqdomain_create(int ioapi return 0; init_irq_alloc_info(&info, NULL); - info.type = X86_IRQ_ALLOC_TYPE_IOAPIC; + info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT; info.ioapic_id = mpc_ioapic_id(ioapic); parent = irq_remapping_get_ir_irq_domain(&info); if (!parent) --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -476,7 +476,7 @@ struct irq_domain *hpet_create_irq_domai domain_info->data = (void *)(long)hpet_id; init_irq_alloc_info(&info, NULL); - info.type = X86_IRQ_ALLOC_TYPE_HPET; + info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT; info.hpet_id = hpet_id; parent = irq_remapping_get_ir_irq_domain(&info); if (parent == NULL) --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3534,6 +3534,14 @@ static struct irq_domain *get_ir_irq_dom if (!info) return NULL; 
+ switch (info->type) { + case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: + case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: + break; + default: + return NULL; + } + devid = get_devid(info); if (devid >= 0) { iommu = amd_iommu_rlookup_table[devid]; --- a/drivers/iommu/hyperv-iommu.c +++ b/drivers/iommu/hyperv-iommu.c @@ -184,7 +184,7 @@ static int __init hyperv_enable_irq_rema static struct irq_domain *hyperv_get_ir_irq_domain(struct irq_alloc_info *info) { - if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC) + if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT) return ioapic_ir_domain; else return NULL; --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1109,16 +1109,12 @@ static struct irq_domain *intel_get_ir_i return NULL; switch (info->type) { - case X86_IRQ_ALLOC_TYPE_IOAPIC: + case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: iommu = map_ioapic_to_ir(info->ioapic_id); break; - case X86_IRQ_ALLOC_TYPE_HPET: + case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: iommu = map_hpet_to_ir(info->hpet_id); break; - case X86_IRQ_ALLOC_TYPE_PCI_MSI: - case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - iommu = map_dev_to_ir(info->msi_dev); - break; default: BUG_ON(1); break; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[patch V2 29/46] irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
From: Thomas Gleixner

To support MSI irq domains which do not fit at all into the regular MSI
irqdomain scheme, like the XEN MSI interrupt management for PV/HVM/DOM0,
it's necessary to allow overriding the alloc/free implementation.

This is a preparatory step to switch X86 away from arch_*_msi_irqs() and
store the irq domain pointer right in struct device.

No functional change for existing MSI irq domain users.

Aside from the evil XEN wrapper this is also useful for special MSI domains
which need to do extra alloc/free work before/after calling the generic
core function, such as allocating/freeing MSI descriptors, MSI storage
space etc.

Signed-off-by: Thomas Gleixner
---
 include/linux/msi.h | 27
 kernel/irq/msi.c    | 70 +++-
 2 files changed, 75 insertions(+), 22 deletions(-)

--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -241,6 +241,10 @@ struct msi_domain_info;
  * @msi_finish:		Optional callback to finalize the allocation
  * @set_desc:		Set the msi descriptor for an interrupt
  * @handle_error:	Optional error handler if the allocation fails
+ * @domain_alloc_irqs:	Optional function to override the default allocation
+ *			function.
+ * @domain_free_irqs:	Optional function to override the default free
+ *			function.
  *
  * @get_hwirq, @msi_init and @msi_free are callbacks used by
  * msi_create_irq_domain() and related interfaces
@@ -248,6 +252,22 @@ struct msi_domain_info;
  * @msi_check, @msi_prepare, @msi_finish, @set_desc and @handle_error
  * are callbacks used by msi_domain_alloc_irqs() and related
  * interfaces which are based on msi_desc.
+ *
+ * @domain_alloc_irqs, @domain_free_irqs can be used to override the
+ * default allocation/free functions (__msi_domain_alloc/free_irqs). This
+ * is initially for a wrapper around XENs seperate MSI universe which can't
+ * be wrapped into the regular irq domains concepts by mere mortals. This
+ * allows to universally use msi_domain_alloc/free_irqs without having to
+ * special case XEN all over the place.
+ *
+ * Contrary to other operations @domain_alloc_irqs and @domain_free_irqs
+ * are set to the default implementation if NULL and even when
+ * MSI_FLAG_USE_DEF_DOM_OPS is not set to avoid breaking existing users and
+ * because these callbacks are obviously mandatory.
+ *
+ * This is NOT meant to be abused, but it can be useful to build wrappers
+ * for specialized MSI irq domains which need extra work before and after
+ * calling __msi_domain_alloc_irqs()/__msi_domain_free_irqs().
  */
 struct msi_domain_ops {
 	irq_hw_number_t	(*get_hwirq)(struct msi_domain_info *info,
@@ -270,6 +290,10 @@ struct msi_domain_ops {
 				    struct msi_desc *desc);
 	int		(*handle_error)(struct irq_domain *domain,
 					struct msi_desc *desc, int error);
+	int		(*domain_alloc_irqs)(struct irq_domain *domain,
+					     struct device *dev, int nvec);
+	void		(*domain_free_irqs)(struct irq_domain *domain,
+					    struct device *dev);
 };
 
 /**
@@ -327,8 +351,11 @@ int msi_domain_set_affinity(struct irq_d
 struct irq_domain *msi_create_irq_domain(struct fwnode_handle *fwnode,
 					 struct msi_domain_info *info,
 					 struct irq_domain *parent);
+int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
+			    int nvec);
 int msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
 			  int nvec);
+void __msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
 void msi_domain_free_irqs(struct irq_domain *domain, struct device *dev);
 struct msi_domain_info *msi_get_domain_info(struct irq_domain *domain);
 
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -229,11 +229,13 @@ static int msi_domain_ops_check(struct i
 }
 
 static struct msi_domain_ops msi_domain_ops_default = {
-	.get_hwirq	= msi_domain_ops_get_hwirq,
-	.msi_init	= msi_domain_ops_init,
-	.msi_check	= msi_domain_ops_check,
-	.msi_prepare	= msi_domain_ops_prepare,
-	.set_desc	= msi_domain_ops_set_desc,
+	.get_hwirq		= msi_domain_ops_get_hwirq,
+	.msi_init		= msi_domain_ops_init,
+	.msi_check		= msi_domain_ops_check,
+	.msi_prepare		= msi_domain_ops_prepare,
+	.set_desc		= msi_domain_ops_set_desc,
+	.domain_alloc_irqs	= __msi_domain_alloc_irqs,
+	.domain_free_irqs	= __msi_domain_free_irqs,
 };
 
 static void msi_domain_update_dom_ops(struct msi_domain_info *info)
@@ -245,6 +247,14 @@ static void msi_domain_update_dom_ops(st
 		return;
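The archived copy of this patch is cut off before the body of the update function, so the following is not the patch's code. It is a small userspace sketch of the behaviour the kernel-doc comment describes: NULL alloc/free callbacks always get the default, and a wrapper can do extra work around the generic function. All model_* names and the simplified callback signatures are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct msi_domain_ops. */
struct model_ops {
	int  (*domain_alloc_irqs)(int nvec);
	void (*domain_free_irqs)(void);
};

/* Counts how often the wrapper's domain specific work ran. */
static int model_extra_work_calls;

/* Stand-ins for __msi_domain_alloc_irqs() and __msi_domain_free_irqs(). */
static int  model_default_alloc(int nvec) { return nvec; }
static void model_default_free(void) { }

/* Wrapper domain: domain specific work first, then the generic allocation. */
static int model_wrapped_alloc(int nvec)
{
	model_extra_work_calls++;
	return model_default_alloc(nvec);
}

/* NULL callbacks are replaced by the defaults, independent of any flags. */
static void model_update_dom_ops(struct model_ops *ops)
{
	if (!ops->domain_alloc_irqs)
		ops->domain_alloc_irqs = model_default_alloc;
	if (!ops->domain_free_irqs)
		ops->domain_free_irqs = model_default_free;
}
```

A domain that only overrides the allocation side, as in the test values used with this sketch, keeps its own alloc callback and inherits the default free callback.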
[patch V2 16/46] x86/irq: Consolidate UV domain allocation
From: Thomas Gleixner Move the UV specific fields into their own struct for readability sake. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h | 21 - arch/x86/platform/uv/uv_irq.c | 16 2 files changed, 20 insertions(+), 17 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -53,6 +53,14 @@ struct ioapic_alloc_info { struct IO_APIC_route_entry *entry; }; +struct uv_alloc_info { + int limit; + int blade; + unsigned long offset; + char*name; + +}; + /** * irq_alloc_info - X86 specific interrupt allocation info * @type: X86 specific allocation type @@ -64,7 +72,8 @@ struct ioapic_alloc_info { * @data: Allocation specific data * * @ioapic:IOAPIC specific allocation data - */ + * @uv:UV specific allocation data +*/ struct irq_alloc_info { enum irq_alloc_type type; u32 flags; @@ -76,6 +85,8 @@ struct irq_alloc_info { union { struct ioapic_alloc_infoioapic; + struct uv_alloc_infouv; + int unused; #ifdef CONFIG_PCI_MSI struct { @@ -83,14 +94,6 @@ struct irq_alloc_info { irq_hw_number_t msi_hwirq; }; #endif -#ifdef CONFIG_X86_UV - struct { - int uv_limit; - int uv_blade; - unsigned long uv_offset; - char*uv_name; - }; -#endif }; }; --- a/arch/x86/platform/uv/uv_irq.c +++ b/arch/x86/platform/uv/uv_irq.c @@ -90,15 +90,15 @@ static int uv_domain_alloc(struct irq_do ret = irq_domain_alloc_irqs_parent(domain, virq, nr_irqs, arg); if (ret >= 0) { - if (info->uv_limit == UV_AFFINITY_CPU) + if (info->uv.limit == UV_AFFINITY_CPU) irq_set_status_flags(virq, IRQ_NO_BALANCING); else irq_set_status_flags(virq, IRQ_MOVE_PCNTXT); - chip_data->pnode = uv_blade_to_pnode(info->uv_blade); - chip_data->offset = info->uv_offset; + chip_data->pnode = uv_blade_to_pnode(info->uv.blade); + chip_data->offset = info->uv.offset; irq_domain_set_info(domain, virq, virq, &uv_irq_chip, chip_data, - handle_percpu_irq, NULL, info->uv_name); + 
handle_percpu_irq, NULL, info->uv.name); } else { kfree(chip_data); } @@ -193,10 +193,10 @@ int uv_setup_irq(char *irq_name, int cpu init_irq_alloc_info(&info, cpumask_of(cpu)); info.type = X86_IRQ_ALLOC_TYPE_UV; - info.uv_limit = limit; - info.uv_blade = mmr_blade; - info.uv_offset = mmr_offset; - info.uv_name = irq_name; + info.uv.limit = limit; + info.uv.blade = mmr_blade; + info.uv.offset = mmr_offset; + info.uv.name = irq_name; return irq_domain_alloc_irqs(domain, 1, uv_blade_to_memory_nid(mmr_blade), &info); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
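The readability gain from named sub-structures can be seen in a stripped-down userspace version of the layout. The model_* types below are hypothetical; in particular the members of model_ioapic_info are invented for the example, and only the four UV fields follow the patch.

```c
/* Hypothetical second union member, just to give the union two users. */
struct model_ioapic_info {
	int pin;
	int node;
};

/* Mirrors the four fields of struct uv_alloc_info from the patch. */
struct model_uv_info {
	int		limit;
	int		blade;
	unsigned long	offset;
	const char	*name;
};

struct model_alloc_info {
	int type;
	union {
		struct model_ioapic_info ioapic;
		struct model_uv_info	 uv;
	};
};

/* Callers now write info.uv.limit instead of info.uv_limit. */
static struct model_alloc_info model_uv_setup(int limit, int blade,
					      unsigned long offset,
					      const char *name)
{
	struct model_alloc_info info = { .type = 1 };

	info.uv.limit  = limit;
	info.uv.blade  = blade;
	info.uv.offset = offset;
	info.uv.name   = name;

	return info;
}
```

The anonymous union needs C11 or the GNU extension; the named members inside it make it obvious at each use site which allocation type a field belongs to.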
[patch V2 33/46] x86/pci: Set default irq domain in pcibios_add_device()
From: Thomas Gleixner Now that interrupt remapping sets the irqdomain pointer when a PCI device is added it's possible to store the default irq domain in the device struct in pcibios_add_device(). If the bus to which a device is connected has an irq domain associated then this domain is used otherwise the default domain (PCI/MSI native or XEN PCI/MSI) is used. Using the bus domain ensures that special MSI bus domains like VMD work. This makes XEN and the non-remapped native case work solely based on the irq domain pointer in struct device for PCI/MSI and allows to remove the arch fallback and make most of the x86_msi ops private to XEN in the next steps. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/irqdomain.h |2 ++ arch/x86/kernel/apic/msi.c |2 +- arch/x86/pci/common.c| 18 +- 3 files changed, 20 insertions(+), 2 deletions(-) --- a/arch/x86/include/asm/irqdomain.h +++ b/arch/x86/include/asm/irqdomain.h @@ -53,9 +53,11 @@ extern int mp_irqdomain_ioapic_idx(struc #ifdef CONFIG_PCI_MSI void x86_create_pci_msi_domain(void); struct irq_domain *native_create_pci_msi_domain(void); +extern struct irq_domain *x86_pci_msi_default_domain; #else static inline void x86_create_pci_msi_domain(void) { } #define native_create_pci_msi_domain NULL +#define x86_pci_msi_default_domain NULL #endif #endif --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -21,7 +21,7 @@ #include #include -static struct irq_domain *x86_pci_msi_default_domain __ro_after_init; +struct irq_domain *x86_pci_msi_default_domain __ro_after_init; static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg) { --- a/arch/x86/pci/common.c +++ b/arch/x86/pci/common.c @@ -19,6 +19,7 @@ #include #include #include +#include unsigned int pci_probe = PCI_PROBE_BIOS | PCI_PROBE_CONF1 | PCI_PROBE_CONF2 | PCI_PROBE_MMCONF; @@ -633,8 +634,9 @@ static void set_dev_domain_options(struc int pcibios_add_device(struct pci_dev *dev) { - struct setup_data *data; struct pci_setup_rom *rom; + 
struct irq_domain *msidom; + struct setup_data *data; u64 pa_data; pa_data = boot_params.hdr.setup_data; @@ -661,6 +663,20 @@ int pcibios_add_device(struct pci_dev *d memunmap(data); } set_dev_domain_options(dev); + + /* +* Setup the initial MSI domain of the device. If the underlying +* bus has a PCI/MSI irqdomain associated use the bus domain, +* otherwise set the default domain. This ensures that special irq +* domains e.g. VMD are preserved. The default ensures initial +* operation if irq remapping is not active. If irq remapping is +* active it will overwrite the domain pointer when the device is +* associated to a remapping domain. +*/ + msidom = dev_get_msi_domain(&dev->bus->dev); + if (!msidom) + msidom = x86_pci_msi_default_domain; + dev_set_msi_domain(&dev->dev, msidom); return 0; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
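For illustration only, the selection order described in the changelog (bus domain first, default domain otherwise) can be modelled in plain userspace C. This is a minimal sketch, not kernel code: `toy_domain`, `toy_device`, `toy_pci_dev`, `toy_default_domain` and `toy_set_initial_msi_domain()` are hypothetical stand-ins for `struct irq_domain`, `struct device`, `struct pci_dev`, `x86_pci_msi_default_domain` and the new block in pcibios_add_device().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_domain and struct device. */
struct toy_domain { const char *name; };
struct toy_device { struct toy_domain *msi_domain; };

struct toy_pci_dev {
	struct toy_device dev;
	struct toy_device *bus_dev;	/* models &dev->bus->dev */
};

/* Models x86_pci_msi_default_domain; NULL until init code sets it. */
static struct toy_domain *toy_default_domain;

/* Prefer the bus domain (e.g. VMD), otherwise use the default domain. */
void toy_set_initial_msi_domain(struct toy_pci_dev *pdev)
{
	struct toy_domain *d;

	assert(pdev && pdev->bus_dev);
	d = pdev->bus_dev->msi_domain;
	if (!d)
		d = toy_default_domain;
	pdev->dev.msi_domain = d;
}
```

In this model a device on a bus without its own domain ends up with the default, while a device behind a bus that carries a domain keeps that one. Interrupt remapping overwriting the pointer later is not modelled.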
[patch V2 07/46] x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
From: Thomas Gleixner No functional change. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h |4 ++-- arch/x86/kernel/apic/msi.c |6 +++--- drivers/iommu/amd/iommu.c | 24 drivers/iommu/intel/irq_remapping.c | 18 +- 4 files changed, 26 insertions(+), 26 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -36,8 +36,8 @@ struct msi_desc; enum irq_alloc_type { X86_IRQ_ALLOC_TYPE_IOAPIC = 1, X86_IRQ_ALLOC_TYPE_HPET, - X86_IRQ_ALLOC_TYPE_MSI, - X86_IRQ_ALLOC_TYPE_MSIX, + X86_IRQ_ALLOC_TYPE_PCI_MSI, + X86_IRQ_ALLOC_TYPE_PCI_MSIX, X86_IRQ_ALLOC_TYPE_DMAR, X86_IRQ_ALLOC_TYPE_UV, }; --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -188,7 +188,7 @@ int native_setup_msi_irqs(struct pci_dev struct irq_alloc_info info; init_irq_alloc_info(&info, NULL); - info.type = X86_IRQ_ALLOC_TYPE_MSI; + info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI; info.msi_dev = dev; domain = irq_remapping_get_irq_domain(&info); @@ -220,9 +220,9 @@ int pci_msi_prepare(struct irq_domain *d init_irq_alloc_info(arg, NULL); arg->msi_dev = pdev; if (desc->msi_attrib.is_msix) { - arg->type = X86_IRQ_ALLOC_TYPE_MSIX; + arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX; } else { - arg->type = X86_IRQ_ALLOC_TYPE_MSI; + arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSI; arg->flags |= X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; } --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3514,8 +3514,8 @@ static int get_devid(struct irq_alloc_in case X86_IRQ_ALLOC_TYPE_HPET: devid = get_hpet_devid(info->hpet_id); break; - case X86_IRQ_ALLOC_TYPE_MSI: - case X86_IRQ_ALLOC_TYPE_MSIX: + case X86_IRQ_ALLOC_TYPE_PCI_MSI: + case X86_IRQ_ALLOC_TYPE_PCI_MSIX: devid = get_device_id(&info->msi_dev->dev); break; default: @@ -3553,8 +3553,8 @@ static struct irq_domain *get_irq_domain return NULL; switch (info->type) { - case X86_IRQ_ALLOC_TYPE_MSI: - case X86_IRQ_ALLOC_TYPE_MSIX: + case X86_IRQ_ALLOC_TYPE_PCI_MSI: + case X86_IRQ_ALLOC_TYPE_PCI_MSIX: devid = 
get_device_id(&info->msi_dev->dev); if (devid < 0) return NULL; @@ -3615,8 +3615,8 @@ static void irq_remapping_prepare_irte(s break; case X86_IRQ_ALLOC_TYPE_HPET: - case X86_IRQ_ALLOC_TYPE_MSI: - case X86_IRQ_ALLOC_TYPE_MSIX: + case X86_IRQ_ALLOC_TYPE_PCI_MSI: + case X86_IRQ_ALLOC_TYPE_PCI_MSIX: msg->address_hi = MSI_ADDR_BASE_HI; msg->address_lo = MSI_ADDR_BASE_LO; msg->data = irte_info->index; @@ -3660,15 +3660,15 @@ static int irq_remapping_alloc(struct ir if (!info) return -EINVAL; - if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_MSI && - info->type != X86_IRQ_ALLOC_TYPE_MSIX) + if (nr_irqs > 1 && info->type != X86_IRQ_ALLOC_TYPE_PCI_MSI && + info->type != X86_IRQ_ALLOC_TYPE_PCI_MSIX) return -EINVAL; /* * With IRQ remapping enabled, don't need contiguous CPU vectors * to support multiple MSI interrupts. */ - if (info->type == X86_IRQ_ALLOC_TYPE_MSI) + if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI) info->flags &= ~X86_IRQ_ALLOC_CONTIGUOUS_VECTORS; devid = get_devid(info); @@ -3700,9 +3700,9 @@ static int irq_remapping_alloc(struct ir } else { index = -ENOMEM; } - } else if (info->type == X86_IRQ_ALLOC_TYPE_MSI || - info->type == X86_IRQ_ALLOC_TYPE_MSIX) { - bool align = (info->type == X86_IRQ_ALLOC_TYPE_MSI); + } else if (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI || + info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) { + bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI); index = alloc_irq_index(devid, nr_irqs, align, info->msi_dev); } else { --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1115,8 +1115,8 @@ static struct irq_domain *intel_get_ir_i case X86_IRQ_ALLOC_TYPE_HPET: iommu = map_hpet_to_ir(info->hpet_id); break; - case X86_IRQ_ALLOC_TYPE_MSI: - case X86_IRQ_ALLOC_TYPE_MSIX: + case X86_IRQ_ALLOC_TYPE_PCI_MSI: + case X86_IRQ_ALLOC_TYPE_PCI_MSIX: iommu = map_dev_to_ir(info->msi_dev); break; default: @@ -1135,8 +1135,8 @@ static struct irq_domain *intel_get_irq_ return NULL; switch (info->t
[patch V2 13/46] x86/msi: Consolidate HPET allocation
From: Thomas Gleixner None of the magic HPET fields are required in any way. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h |7 --- arch/x86/kernel/apic/msi.c | 14 +++--- drivers/iommu/amd/iommu.c |2 +- drivers/iommu/intel/irq_remapping.c |4 ++-- 4 files changed, 10 insertions(+), 17 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -65,13 +65,6 @@ struct irq_alloc_info { union { int unused; -#ifdef CONFIG_HPET_TIMER - struct { - int hpet_id; - int hpet_index; - void*hpet_data; - }; -#endif #ifdef CONFIG_PCI_MSI struct { struct pci_dev *msi_dev; --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -427,7 +427,7 @@ static struct irq_chip hpet_msi_controll static irq_hw_number_t hpet_msi_get_hwirq(struct msi_domain_info *info, msi_alloc_info_t *arg) { - return arg->hpet_index; + return arg->hwirq; } static int hpet_msi_init(struct irq_domain *domain, @@ -435,8 +435,8 @@ static int hpet_msi_init(struct irq_doma irq_hw_number_t hwirq, msi_alloc_info_t *arg) { irq_set_status_flags(virq, IRQ_MOVE_PCNTXT); - irq_domain_set_info(domain, virq, arg->hpet_index, info->chip, NULL, - handle_edge_irq, arg->hpet_data, "edge"); + irq_domain_set_info(domain, virq, arg->hwirq, info->chip, NULL, + handle_edge_irq, arg->data, "edge"); return 0; } @@ -477,7 +477,7 @@ struct irq_domain *hpet_create_irq_domai init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT; - info.hpet_id = hpet_id; + info.devid = hpet_id; parent = irq_remapping_get_irq_domain(&info); if (parent == NULL) parent = x86_vector_domain; @@ -506,9 +506,9 @@ int hpet_assign_irq(struct irq_domain *d init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_HPET; - info.hpet_data = hc; - info.hpet_id = hpet_dev_id(domain); - info.hpet_index = dev_num; + info.data = hc; + info.devid = hpet_dev_id(domain); + info.hwirq = dev_num; return irq_domain_alloc_irqs(domain, 1, NUMA_NO_NODE, &info); } --- 
a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3511,7 +3511,7 @@ static int get_devid(struct irq_alloc_in return get_ioapic_devid(info->ioapic_id); case X86_IRQ_ALLOC_TYPE_HPET: case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: - return get_hpet_devid(info->hpet_id); + return get_hpet_devid(info->devid); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: return get_device_id(&info->msi_dev->dev); --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1115,7 +1115,7 @@ static struct irq_domain *intel_get_irq_ case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: return map_ioapic_to_ir(info->ioapic_id); case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: - return map_hpet_to_ir(info->hpet_id); + return map_hpet_to_ir(info->devid); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: return map_dev_to_ir(info->msi_dev); @@ -1285,7 +1285,7 @@ static void intel_irq_remapping_prepare_ case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: if (info->type == X86_IRQ_ALLOC_TYPE_HPET) - set_hpet_sid(irte, info->hpet_id); + set_hpet_sid(irte, info->devid); else set_msi_sid(irte, info->msi_dev); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
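The effect of dropping the HPET specific union members can be shown with a reduced userspace model. This is a sketch under stated assumptions: `toy_alloc_info` keeps only the shared fields that the diff uses (`devid`, `hwirq`, `data`), and `toy_hpet_fill()` / `toy_hpet_get_hwirq()` are hypothetical names loosely mirroring hpet_assign_irq() and hpet_msi_get_hwirq().

```c
#include <assert.h>
#include <stddef.h>

enum toy_alloc_type { TOY_ALLOC_HPET = 1, TOY_ALLOC_HPET_GET_PARENT };

/* Hypothetical reduced model of irq_alloc_info: shared fields only. */
struct toy_alloc_info {
	enum toy_alloc_type type;
	int devid;		/* replaces hpet_id */
	unsigned long hwirq;	/* replaces hpet_index */
	void *data;		/* replaces hpet_data */
};

/* Fill the common fields the way hpet_assign_irq() does after the patch. */
void toy_hpet_fill(struct toy_alloc_info *info, int hpet_id,
		   unsigned long channel_index, void *channel)
{
	assert(info);
	info->type = TOY_ALLOC_HPET;
	info->devid = hpet_id;
	info->hwirq = channel_index;
	info->data = channel;
}

/* The hwirq getter reads the common field, no HPET special case left. */
unsigned long toy_hpet_get_hwirq(const struct toy_alloc_info *info)
{
	return info->hwirq;
}
```

The point of the consolidation is that consumers such as the remapping drivers read one `devid` regardless of which interrupt source filled it in.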
[patch V2 35/46] x86/irq: Cleanup the arch_*_msi_irqs() leftovers
From: Thomas Gleixner

Get rid of all the gunk and remove the 'select PCI_MSI_ARCH_FALLBACKS' from
the x86 Kconfig so the weak functions in the PCI core are replaced by stubs
which emit a warning, which ensures that any failure to set the irq domain
pointer results in a warning when the device is used.

Signed-off-by: Thomas Gleixner
---
V2: Adjust to the PCI_MSI_ARCH_FALLBACKS change, i.e. remove it instead of
    selecting the disabler.
---
 arch/x86/Kconfig                |    1 -
 arch/x86/include/asm/pci.h      |   11 -----------
 arch/x86/include/asm/x86_init.h |    1 -
 arch/x86/kernel/apic/msi.c      |   22 ----------------------
 arch/x86/kernel/x86_init.c      |   18 ------------------
 arch/x86/pci/xen.c              |    7 -------
 6 files changed, 60 deletions(-)

--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -225,7 +225,6 @@ config X86
 	select NEED_SG_DMA_LENGTH
 	select PCI_DOMAINS			if PCI
 	select PCI_LOCKLESS_CONFIG		if PCI
-	select PCI_MSI_ARCH_FALLBACKS
 	select PERF_EVENTS
 	select RTC_LIB
 	select RTC_MC146818_LIB
--- a/arch/x86/include/asm/pci.h
+++ b/arch/x86/include/asm/pci.h
@@ -105,17 +105,6 @@ static inline void early_quirks(void) {

 extern void pci_iommu_alloc(void);

-#ifdef CONFIG_PCI_MSI
-/* implemented in arch/x86/kernel/apic/io_apic. */
-struct msi_desc;
-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type);
-void native_teardown_msi_irq(unsigned int irq);
-void native_restore_msi_irqs(struct pci_dev *dev);
-#else
-#define native_setup_msi_irqs		NULL
-#define native_teardown_msi_irq		NULL
-#endif
-
 /* generic pci stuff */
 #include

--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -277,7 +277,6 @@ struct pci_dev;

 struct x86_msi_ops {
 	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*teardown_msi_irq)(unsigned int irq);
 	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 };
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -182,28 +182,6 @@ static struct irq_chip pci_msi_controlle
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };

-int native_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	struct irq_domain *domain;
-	struct irq_alloc_info info;
-
-	init_irq_alloc_info(&info, NULL);
-	info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI;
-
-	domain = irq_remapping_get_irq_domain(&info);
-	if (domain == NULL)
-		domain = x86_pci_msi_default_domain;
-	if (domain == NULL)
-		return -ENOSYS;
-
-	return msi_domain_alloc_irqs(domain, &dev->dev, nvec);
-}
-
-void native_teardown_msi_irq(unsigned int irq)
-{
-	irq_domain_free_irqs(irq, 1);
-}
-
 int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec,
 		    msi_alloc_info_t *arg)
 {
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -146,28 +146,10 @@ EXPORT_SYMBOL_GPL(x86_platform);

 #if defined(CONFIG_PCI_MSI)
 struct x86_msi_ops x86_msi __ro_after_init = {
-	.setup_msi_irqs		= native_setup_msi_irqs,
-	.teardown_msi_irq	= native_teardown_msi_irq,
-	.teardown_msi_irqs	= default_teardown_msi_irqs,
 	.restore_msi_irqs	= default_restore_msi_irqs,
 };

 /* MSI arch specific hooks */
-int arch_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
-{
-	return x86_msi.setup_msi_irqs(dev, nvec, type);
-}
-
-void arch_teardown_msi_irqs(struct pci_dev *dev)
-{
-	x86_msi.teardown_msi_irqs(dev);
-}
-
-void arch_teardown_msi_irq(unsigned int irq)
-{
-	x86_msi.teardown_msi_irq(irq);
-}
-
 void arch_restore_msi_irqs(struct pci_dev *dev)
 {
 	x86_msi.restore_msi_irqs(dev);
--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -402,11 +402,6 @@ static void xen_pv_teardown_msi_irqs(str
 	xen_teardown_msi_irqs(dev);
 }

-static void xen_teardown_msi_irq(unsigned int irq)
-{
-	WARN_ON_ONCE(1);
-}
-
 static int xen_msi_domain_alloc_irqs(struct irq_domain *domain,
 				     struct device *dev, int nvec)
 {
@@ -483,8 +478,6 @@ static __init void xen_setup_pci_msi(voi
 		return;
 	}

-	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-
 	/*
 	 * Override the PCI/MSI irq domain init function. No point
 	 * in allocating the native domain and never use it.
[patch V2 20/46] x86/irq: Move apic_post_init() invocation to one place
From: Thomas Gleixner

No point to call it from both 32bit and 64bit implementations of
default_setup_apic_routing(). Move it to the caller.

Signed-off-by: Thomas Gleixner
---
 arch/x86/kernel/apic/apic.c     |    3 +++
 arch/x86/kernel/apic/probe_32.c |    3 ---
 arch/x86/kernel/apic/probe_64.c |    3 ---
 3 files changed, 3 insertions(+), 6 deletions(-)

--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1429,6 +1429,9 @@ void __init apic_intr_mode_init(void)
 		break;
 	}

+	if (x86_platform.apic_post_init)
+		x86_platform.apic_post_init();
+
 	apic_bsp_setup(upmode);
 }

--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -170,9 +170,6 @@ void __init default_setup_apic_routing(v

 	if (apic->setup_apic_routing)
 		apic->setup_apic_routing();
-
-	if (x86_platform.apic_post_init)
-		x86_platform.apic_post_init();
 }

 void __init generic_apic_probe(void)
--- a/arch/x86/kernel/apic/probe_64.c
+++ b/arch/x86/kernel/apic/probe_64.c
@@ -32,9 +32,6 @@ void __init default_setup_apic_routing(v
 			break;
 		}
 	}
-
-	if (x86_platform.apic_post_init)
-		x86_platform.apic_post_init();
 }

 int __init default_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
[patch V2 36/46] x86/irq: Make most MSI ops XEN private
From: Thomas Gleixner

Nothing except XEN uses the setup/teardown ops. Hide them there.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/x86_init.h |    2 --
 arch/x86/pci/xen.c              |   21 ++++++++++++++-------
 2 files changed, 14 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -276,8 +276,6 @@ struct x86_platform_ops {
 struct pci_dev;

 struct x86_msi_ops {
-	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
-	void (*teardown_msi_irqs)(struct pci_dev *dev);
 	void (*restore_msi_irqs)(struct pci_dev *dev);
 };

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -157,6 +157,13 @@ static int acpi_register_gsi_xen(struct
 struct xen_pci_frontend_ops *xen_pci_frontend;
 EXPORT_SYMBOL_GPL(xen_pci_frontend);

+struct xen_msi_ops {
+	int (*setup_msi_irqs)(struct pci_dev *dev, int nvec, int type);
+	void (*teardown_msi_irqs)(struct pci_dev *dev);
+};
+
+static struct xen_msi_ops xen_msi_ops __ro_after_init;
+
 static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
 {
 	int irq, ret, i;
@@ -415,7 +422,7 @@ static int xen_msi_domain_alloc_irqs(str
 	else
 		type = PCI_CAP_ID_MSI;

-	return x86_msi.setup_msi_irqs(to_pci_dev(dev), nvec, type);
+	return xen_msi_ops.setup_msi_irqs(to_pci_dev(dev), nvec, type);
 }

 static void xen_msi_domain_free_irqs(struct irq_domain *domain,
@@ -424,7 +431,7 @@ static void xen_msi_domain_free_irqs(str
 	if (WARN_ON_ONCE(!dev_is_pci(dev)))
 		return;

-	x86_msi.teardown_msi_irqs(to_pci_dev(dev));
+	xen_msi_ops.teardown_msi_irqs(to_pci_dev(dev));
 }

 static struct msi_domain_ops xen_pci_msi_domain_ops = {
@@ -463,16 +470,16 @@ static __init void xen_setup_pci_msi(voi
 {
 	if (xen_pv_domain()) {
 		if (xen_initial_domain()) {
-			x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
+			xen_msi_ops.setup_msi_irqs = xen_initdom_setup_msi_irqs;
 			x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
 		} else {
-			x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
+			xen_msi_ops.setup_msi_irqs = xen_setup_msi_irqs;
 		}
-		x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
+		xen_msi_ops.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
 		pci_msi_ignore_mask = 1;
 	} else if (xen_hvm_domain()) {
-		x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
-		x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+		xen_msi_ops.setup_msi_irqs = xen_hvm_setup_msi_irqs;
+		xen_msi_ops.teardown_msi_irqs = xen_teardown_msi_irqs;
 	} else {
 		WARN_ON_ONCE(1);
 		return;
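The pattern used here, a file-local ops table that is filled in once at init time and then only dispatched through, can be sketched in userspace C. All names below (`toy_msi_ops`, `toy_setup_pci_msi()`, `toy_pv_setup()` and so on) are hypothetical; only the shape of the struct follows `struct xen_msi_ops` from the diff, and the PV/HVM distinction is reduced to a single backend.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pci_dev { int id; };

/* Private ops table, same shape as struct xen_msi_ops in the diff. */
struct toy_msi_ops {
	int (*setup_msi_irqs)(struct toy_pci_dev *dev, int nvec, int type);
	void (*teardown_msi_irqs)(struct toy_pci_dev *dev);
};

static struct toy_msi_ops toy_msi_ops;	/* visible in this file only */
static int toy_torn_down;

static int toy_pv_setup(struct toy_pci_dev *dev, int nvec, int type)
{
	(void)dev;
	(void)type;
	return nvec > 0 ? 0 : -1;
}

static void toy_pv_teardown(struct toy_pci_dev *dev)
{
	(void)dev;
	toy_torn_down++;
}

/* Init code selects the backend once. */
void toy_setup_pci_msi(void)
{
	toy_msi_ops.setup_msi_irqs = toy_pv_setup;
	toy_msi_ops.teardown_msi_irqs = toy_pv_teardown;
}

/* Domain callbacks dispatch through the private table. */
int toy_domain_alloc_irqs(struct toy_pci_dev *dev, int nvec, int type)
{
	assert(toy_msi_ops.setup_msi_irqs);
	return toy_msi_ops.setup_msi_irqs(dev, nvec, type);
}

void toy_domain_free_irqs(struct toy_pci_dev *dev)
{
	assert(toy_msi_ops.teardown_msi_irqs);
	toy_msi_ops.teardown_msi_irqs(dev);
}
```

Keeping the table static means nothing outside the file can call or replace the setup/teardown hooks, which is what moving them out of the global `x86_msi` achieves.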
[patch V2 14/46] x86/ioapic: Consolidate IOAPIC allocation
From: Thomas Gleixner Move the IOAPIC specific fields into their own struct and reuse the common devid. Get rid of the #ifdeffery as it does not matter at all whether the alloc info is a couple of bytes longer or not. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h | 23 ++- arch/x86/kernel/apic/io_apic.c | 70 ++-- arch/x86/kernel/devicetree.c|4 +- drivers/iommu/amd/iommu.c | 14 +++ drivers/iommu/hyperv-iommu.c|2 - drivers/iommu/intel/irq_remapping.c | 18 - 6 files changed, 66 insertions(+), 65 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -44,6 +44,15 @@ enum irq_alloc_type { X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT, }; +struct ioapic_alloc_info { + int pin; + int node; + u32 trigger : 1; + u32 polarity : 1; + u32 valid : 1; + struct IO_APIC_route_entry *entry; +}; + /** * irq_alloc_info - X86 specific interrupt allocation info * @type: X86 specific allocation type @@ -53,6 +62,8 @@ enum irq_alloc_type { * @mask: CPU mask for vector allocation * @desc: Pointer to msi descriptor * @data: Allocation specific data + * + * @ioapic:IOAPIC specific allocation data */ struct irq_alloc_info { enum irq_alloc_type type; @@ -64,6 +75,7 @@ struct irq_alloc_info { void*data; union { + struct ioapic_alloc_infoioapic; int unused; #ifdef CONFIG_PCI_MSI struct { @@ -71,17 +83,6 @@ struct irq_alloc_info { irq_hw_number_t msi_hwirq; }; #endif -#ifdef CONFIG_X86_IO_APIC - struct { - int ioapic_id; - int ioapic_pin; - int ioapic_node; - u32 ioapic_trigger : 1; - u32 ioapic_polarity : 1; - u32 ioapic_valid : 1; - struct IO_APIC_route_entry *ioapic_entry; - }; -#endif #ifdef CONFIG_DMAR_TABLE struct { int dmar_id; --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -860,10 +860,10 @@ void ioapic_set_alloc_attr(struct irq_al { init_irq_alloc_info(info, NULL); info->type = X86_IRQ_ALLOC_TYPE_IOAPIC; - info->ioapic_node = node; - info->ioapic_trigger = trigger; - info->ioapic_polarity = polarity; - 
info->ioapic_valid = 1; + info->ioapic.node = node; + info->ioapic.trigger = trigger; + info->ioapic.polarity = polarity; + info->ioapic.valid = 1; } #ifndef CONFIG_ACPI @@ -878,32 +878,32 @@ static void ioapic_copy_alloc_attr(struc copy_irq_alloc_info(dst, src); dst->type = X86_IRQ_ALLOC_TYPE_IOAPIC; - dst->ioapic_id = mpc_ioapic_id(ioapic_idx); - dst->ioapic_pin = pin; - dst->ioapic_valid = 1; - if (src && src->ioapic_valid) { - dst->ioapic_node = src->ioapic_node; - dst->ioapic_trigger = src->ioapic_trigger; - dst->ioapic_polarity = src->ioapic_polarity; + dst->devid = mpc_ioapic_id(ioapic_idx); + dst->ioapic.pin = pin; + dst->ioapic.valid = 1; + if (src && src->ioapic.valid) { + dst->ioapic.node = src->ioapic.node; + dst->ioapic.trigger = src->ioapic.trigger; + dst->ioapic.polarity = src->ioapic.polarity; } else { - dst->ioapic_node = NUMA_NO_NODE; + dst->ioapic.node = NUMA_NO_NODE; if (acpi_get_override_irq(gsi, &trigger, &polarity) >= 0) { - dst->ioapic_trigger = trigger; - dst->ioapic_polarity = polarity; + dst->ioapic.trigger = trigger; + dst->ioapic.polarity = polarity; } else { /* * PCI interrupts are always active low level * triggered. */ - dst->ioapic_trigger = IOAPIC_LEVEL; - dst->ioapic_polarity = IOAPIC_POL_LOW; + dst->ioapic.trigger = IOAPIC_LEVEL; + dst->ioapic.polarity = IOAPIC_POL_LOW; } } } static int ioapic_alloc_attr_node(struct irq_alloc_info *info) { - return (info && info->ioapic_valid) ? info->ioapic_node : NUMA_NO_NODE; + return (info && info->ioapic.valid) ? info->ioapic.node : NUMA_NO_NODE; } static void mp_register_handler(unsigned int ir
[patch V2 27/46] x86/xen: Rework MSI teardown
From: Thomas Gleixner

X86 cannot store the irq domain pointer in struct device without breaking
XEN because the irq domain pointer takes precedence over arch_*_msi_irqs()
fallbacks.

XEN's MSI teardown relies on default_teardown_msi_irqs() which invokes
arch_teardown_msi_irq(). default_teardown_msi_irqs() is a trivial iterator
over the msi entries associated to a device.

Implement this loop in xen_teardown_msi_irqs() to prepare for removal of
the fallbacks for X86.

This is a preparatory step to wrap XEN MSI alloc/free into an irq domain
which in turn allows to store the irq domain pointer in struct device and
to use the irq domain functions directly.

Signed-off-by: Thomas Gleixner
---
V2: Use teardown_pv... for initial domain (Juergen)
---
 arch/x86/pci/xen.c |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

--- a/arch/x86/pci/xen.c
+++ b/arch/x86/pci/xen.c
@@ -377,20 +377,31 @@ static void xen_initdom_restore_msi_irqs
 static void xen_teardown_msi_irqs(struct pci_dev *dev)
 {
 	struct msi_desc *msidesc;
+	int i;
+
+	for_each_pci_msi_entry(msidesc, dev) {
+		if (msidesc->irq) {
+			for (i = 0; i < msidesc->nvec_used; i++)
+				xen_destroy_irq(msidesc->irq + i);
+		}
+	}
+}
+
+static void xen_pv_teardown_msi_irqs(struct pci_dev *dev)
+{
+	struct msi_desc *msidesc = first_pci_msi_entry(dev);

-	msidesc = first_pci_msi_entry(dev);
 	if (msidesc->msi_attrib.is_msix)
 		xen_pci_frontend_disable_msix(dev);
 	else
 		xen_pci_frontend_disable_msi(dev);

-	/* Free the IRQ's and the msidesc using the generic code. */
-	default_teardown_msi_irqs(dev);
+	xen_teardown_msi_irqs(dev);
 }

 static void xen_teardown_msi_irq(unsigned int irq)
 {
-	xen_destroy_irq(irq);
+	WARN_ON_ONCE(1);
 }

 #endif
@@ -413,7 +424,7 @@ int __init pci_xen_init(void)
 #ifdef CONFIG_PCI_MSI
 	x86_msi.setup_msi_irqs = xen_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
-	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
+	x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
 	pci_msi_ignore_mask = 1;
 #endif
 	return 0;
@@ -437,6 +448,7 @@ static void __init xen_hvm_msi_init(void
 	}

 	x86_msi.setup_msi_irqs = xen_hvm_setup_msi_irqs;
+	x86_msi.teardown_msi_irqs = xen_teardown_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
 }
 #endif
@@ -473,6 +485,7 @@ int __init pci_xen_initial_domain(void)
 #ifdef CONFIG_PCI_MSI
 	x86_msi.setup_msi_irqs = xen_initdom_setup_msi_irqs;
 	x86_msi.teardown_msi_irq = xen_teardown_msi_irq;
+	x86_msi.teardown_msi_irqs = xen_pv_teardown_msi_irqs;
 	x86_msi.restore_msi_irqs = xen_initdom_restore_msi_irqs;
 	pci_msi_ignore_mask = 1;
 #endif
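The loop that the patch open-codes in xen_teardown_msi_irqs() walks every MSI descriptor of a device and releases each vector of each allocated descriptor. A userspace sketch of that iteration follows; the linked list, `toy_msi_desc` and the recording `toy_destroy_irq()` are hypothetical replacements for for_each_pci_msi_entry(), `struct msi_desc` and xen_destroy_irq().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced MSI descriptor; irq == 0 means "not allocated". */
struct toy_msi_desc {
	unsigned int irq;
	unsigned int nvec_used;
	struct toy_msi_desc *next;
};

/* Record destroyed interrupts instead of talking to a hypervisor. */
static unsigned int toy_destroyed[16];
static size_t toy_ndestroyed;

static void toy_destroy_irq(unsigned int irq)
{
	assert(toy_ndestroyed < 16);
	toy_destroyed[toy_ndestroyed++] = irq;
}

/* Walk all descriptors; free irq .. irq + nvec_used - 1 for each one. */
void toy_teardown_msi_irqs(struct toy_msi_desc *head)
{
	struct toy_msi_desc *d;
	unsigned int i;

	for (d = head; d; d = d->next) {
		if (!d->irq)
			continue;
		for (i = 0; i < d->nvec_used; i++)
			toy_destroy_irq(d->irq + i);
	}
}
```

A multi-MSI descriptor owns a contiguous block of Linux interrupt numbers starting at `irq`, which is why the inner loop adds the vector index to the base.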
[patch V2 22/46] x86/irq: Initialize PCI/MSI domain at PCI init time
From: Thomas Gleixner No point in initializing the default PCI/MSI interrupt domain early and no point to create it when XEN PV/HVM/DOM0 are active. Move the initialization to pci_arch_init() and convert it to init ops so that XEN can override it as XEN has it's own PCI/MSI management. The XEN override comes in a later step. Signed-off-by: Thomas Gleixner --- V2: Add missing include --- arch/x86/include/asm/irqdomain.h |6 -- arch/x86/include/asm/x86_init.h |3 +++ arch/x86/kernel/apic/msi.c | 31 +++ arch/x86/kernel/apic/vector.c|2 -- arch/x86/kernel/x86_init.c |4 +++- arch/x86/pci/init.c |3 +++ 6 files changed, 32 insertions(+), 17 deletions(-) --- a/arch/x86/include/asm/irqdomain.h +++ b/arch/x86/include/asm/irqdomain.h @@ -51,9 +51,11 @@ extern int mp_irqdomain_ioapic_idx(struc #endif /* CONFIG_X86_IO_APIC */ #ifdef CONFIG_PCI_MSI -extern void arch_init_msi_domain(struct irq_domain *domain); +void x86_create_pci_msi_domain(void); +struct irq_domain *native_create_pci_msi_domain(void); #else -static inline void arch_init_msi_domain(struct irq_domain *domain) { } +static inline void x86_create_pci_msi_domain(void) { } +#define native_create_pci_msi_domain NULL #endif #endif --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -8,6 +8,7 @@ struct mpc_bus; struct mpc_cpu; struct mpc_table; struct cpuinfo_x86; +struct irq_domain; /** * struct x86_init_mpparse - platform specific mpparse ops @@ -42,12 +43,14 @@ struct x86_init_resources { * @intr_init: interrupt init code * @intr_mode_select: interrupt delivery mode selection * @intr_mode_init:interrupt delivery mode setup + * @create_pci_msi_domain: Create the PCI/MSI interrupt domain */ struct x86_init_irqs { void (*pre_vector_init)(void); void (*intr_init)(void); void (*intr_mode_select)(void); void (*intr_mode_init)(void); + struct irq_domain *(*create_pci_msi_domain)(void); }; /** --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -21,7 +21,7 @@ #include #include -static 
struct irq_domain *msi_default_domain; +static struct irq_domain *x86_pci_msi_default_domain __ro_after_init; static void __irq_msi_compose_msg(struct irq_cfg *cfg, struct msi_msg *msg) { @@ -192,7 +192,7 @@ int native_setup_msi_irqs(struct pci_dev domain = irq_remapping_get_irq_domain(&info); if (domain == NULL) - domain = msi_default_domain; + domain = x86_pci_msi_default_domain; if (domain == NULL) return -ENOSYS; @@ -235,25 +235,32 @@ static struct msi_domain_info pci_msi_do .handler_name = "edge", }; -void __init arch_init_msi_domain(struct irq_domain *parent) +struct irq_domain * __init native_create_pci_msi_domain(void) { struct fwnode_handle *fn; + struct irq_domain *d; if (disable_apic) - return; + return NULL; fn = irq_domain_alloc_named_fwnode("PCI-MSI"); - if (fn) { - msi_default_domain = - pci_msi_create_irq_domain(fn, &pci_msi_domain_info, - parent); - } - if (!msi_default_domain) { + if (!fn) + return NULL; + + d = pci_msi_create_irq_domain(fn, &pci_msi_domain_info, + x86_vector_domain); + if (!d) { irq_domain_free_fwnode(fn); - pr_warn("failed to initialize irqdomain for MSI/MSI-x.\n"); + pr_warn("Failed to initialize PCI-MSI irqdomain.\n"); } else { - msi_default_domain->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK; + d->flags |= IRQ_DOMAIN_MSI_NOMASK_QUIRK; } + return d; +} + +void __init x86_create_pci_msi_domain(void) +{ + x86_pci_msi_default_domain = x86_init.irqs.create_pci_msi_domain(); } #ifdef CONFIG_IRQ_REMAP --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -713,8 +713,6 @@ int __init arch_early_irq_init(void) BUG_ON(x86_vector_domain == NULL); irq_set_default_host(x86_vector_domain); - arch_init_msi_domain(x86_vector_domain); - BUG_ON(!alloc_cpumask_var(&vector_searchmask, GFP_KERNEL)); /* --- a/arch/x86/kernel/x86_init.c +++ b/arch/x86/kernel/x86_init.c @@ -24,6 +24,7 @@ #include #include #include +#include void x86_init_noop(void) { } void __init x86_init_uint_noop(unsigned int unused) { } @@ -76,7 +77,8 @@ struct 
x86_init_ops x86_init __initdata .pre_vector_init= init_ISA_irqs, .intr_init = native_init_IRQ, .intr_mode_select = apic_intr_mode_select, - .intr_mode_init = apic_intr_mode_init + .intr_mode_init
[patch V2 18/46] x86/msi: Consolidate MSI allocation
From: Thomas Gleixner Convert the interrupt remap drivers to retrieve the pci device from the msi descriptor and use info::hwirq. This is the first step to prepare x86 for using the generic MSI domain ops. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/hw_irq.h |8 arch/x86/kernel/apic/msi.c |7 +++ drivers/iommu/amd/iommu.c |5 +++-- drivers/iommu/intel/irq_remapping.c |4 ++-- drivers/pci/controller/pci-hyperv.c |2 +- 5 files changed, 9 insertions(+), 17 deletions(-) --- a/arch/x86/include/asm/hw_irq.h +++ b/arch/x86/include/asm/hw_irq.h @@ -85,14 +85,6 @@ struct irq_alloc_info { union { struct ioapic_alloc_infoioapic; struct uv_alloc_infouv; - - int unused; -#ifdef CONFIG_PCI_MSI - struct { - struct pci_dev *msi_dev; - irq_hw_number_t msi_hwirq; - }; -#endif }; }; --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -189,7 +189,6 @@ int native_setup_msi_irqs(struct pci_dev init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_PCI_MSI; - info.msi_dev = dev; domain = irq_remapping_get_irq_domain(&info); if (domain == NULL) @@ -208,7 +207,7 @@ void native_teardown_msi_irq(unsigned in static irq_hw_number_t pci_msi_get_hwirq(struct msi_domain_info *info, msi_alloc_info_t *arg) { - return arg->msi_hwirq; + return arg->hwirq; } int pci_msi_prepare(struct irq_domain *domain, struct device *dev, int nvec, @@ -218,7 +217,6 @@ int pci_msi_prepare(struct irq_domain *d struct msi_desc *desc = first_pci_msi_entry(pdev); init_irq_alloc_info(arg, NULL); - arg->msi_dev = pdev; if (desc->msi_attrib.is_msix) { arg->type = X86_IRQ_ALLOC_TYPE_PCI_MSIX; } else { @@ -232,7 +230,8 @@ EXPORT_SYMBOL_GPL(pci_msi_prepare); void pci_msi_set_desc(msi_alloc_info_t *arg, struct msi_desc *desc) { - arg->msi_hwirq = pci_msi_domain_calc_hwirq(desc); + arg->desc = desc; + arg->hwirq = pci_msi_domain_calc_hwirq(desc); } EXPORT_SYMBOL_GPL(pci_msi_set_desc); --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3514,7 +3514,7 @@ static int 
get_devid(struct irq_alloc_in return get_hpet_devid(info->devid); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - return get_device_id(&info->msi_dev->dev); + return get_device_id(msi_desc_to_dev(info->desc)); default: WARN_ON_ONCE(1); return -1; @@ -3688,7 +3688,8 @@ static int irq_remapping_alloc(struct ir info->type == X86_IRQ_ALLOC_TYPE_PCI_MSIX) { bool align = (info->type == X86_IRQ_ALLOC_TYPE_PCI_MSI); - index = alloc_irq_index(devid, nr_irqs, align, info->msi_dev); + index = alloc_irq_index(devid, nr_irqs, align, + msi_desc_to_pci_dev(info->desc)); } else { index = alloc_irq_index(devid, nr_irqs, false, NULL); } --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1118,7 +1118,7 @@ static struct irq_domain *intel_get_irq_ return map_hpet_to_ir(info->devid); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - return map_dev_to_ir(info->msi_dev); + return map_dev_to_ir(msi_desc_to_pci_dev(info->desc)); default: WARN_ON_ONCE(1); return NULL; @@ -1287,7 +1287,7 @@ static void intel_irq_remapping_prepare_ if (info->type == X86_IRQ_ALLOC_TYPE_HPET) set_hpet_sid(irte, info->devid); else - set_msi_sid(irte, info->msi_dev); + set_msi_sid(irte, msi_desc_to_pci_dev(info->desc)); msg->address_hi = MSI_ADDR_BASE_HI; msg->data = sub_handle; --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1534,7 +1534,7 @@ static struct irq_chip hv_msi_irq_chip = static irq_hw_number_t hv_msi_domain_ops_get_hwirq(struct msi_domain_info *info, msi_alloc_info_t *arg) { - return arg->msi_hwirq; + return arg->hwirq; } static struct msi_domain_ops hv_msi_ops = { ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
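The conversion replaces a device pointer stored next to the descriptor with a lookup through the descriptor itself. A minimal userspace sketch of that idea, with hypothetical types: `toy_msi_desc` carries the owning device, `toy_set_desc()` loosely mirrors pci_msi_set_desc(), and `toy_info_to_pci_dev()` stands in for the `msi_desc_to_pci_dev(info->desc)` calls in the remapping drivers. The `hwirq_base` member is an assumption that replaces the real hwirq calculation.

```c
#include <assert.h>
#include <stddef.h>

struct toy_pci_dev { int devfn; };

/* Hypothetical descriptor: knows its device and its hardware irq number. */
struct toy_msi_desc {
	struct toy_pci_dev *pdev;
	unsigned long hwirq_base;
};

/* Reduced alloc info: no separate msi_dev / msi_hwirq members any more. */
struct toy_alloc_info {
	struct toy_msi_desc *desc;
	unsigned long hwirq;
};

/* Store the descriptor and the hwirq in the common fields. */
void toy_set_desc(struct toy_alloc_info *arg, struct toy_msi_desc *desc)
{
	assert(arg && desc);
	arg->desc = desc;
	arg->hwirq = desc->hwirq_base;
}

/* Consumers derive the device from the descriptor on demand. */
struct toy_pci_dev *toy_info_to_pci_dev(const struct toy_alloc_info *info)
{
	return info->desc ? info->desc->pdev : NULL;
}
```

With a single source of truth for the device, the alloc info no longer needs a PCI specific union member, which is what allows the `CONFIG_PCI_MSI` block to be removed from `struct irq_alloc_info`.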
[patch V2 10/46] iommu/amd: Consolidate irq domain getter
From: Thomas Gleixner The irq domain request mode is now indicated in irq_alloc_info::type. Consolidate the two getter functions into one. Signed-off-by: Thomas Gleixner --- drivers/iommu/amd/iommu.c | 65 ++ 1 file changed, 21 insertions(+), 44 deletions(-) --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3505,77 +3505,54 @@ static void irte_ga_clear_allocated(stru static int get_devid(struct irq_alloc_info *info) { - int devid = -1; - switch (info->type) { case X86_IRQ_ALLOC_TYPE_IOAPIC: - devid = get_ioapic_devid(info->ioapic_id); - break; + case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: + return get_ioapic_devid(info->ioapic_id); case X86_IRQ_ALLOC_TYPE_HPET: - devid = get_hpet_devid(info->hpet_id); - break; + case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: + return get_hpet_devid(info->hpet_id); case X86_IRQ_ALLOC_TYPE_PCI_MSI: case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - devid = get_device_id(&info->msi_dev->dev); - break; + return get_device_id(&info->msi_dev->dev); default: - BUG_ON(1); - break; + WARN_ON_ONCE(1); + return -1; } - - return devid; } -static struct irq_domain *get_ir_irq_domain(struct irq_alloc_info *info) +static struct irq_domain *get_irq_domain_for_devid(struct irq_alloc_info *info, + int devid) { - struct amd_iommu *iommu; - int devid; + struct amd_iommu *iommu = amd_iommu_rlookup_table[devid]; - if (!info) + if (!iommu) return NULL; switch (info->type) { case X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT: case X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT: - break; + return iommu->ir_domain; + case X86_IRQ_ALLOC_TYPE_PCI_MSI: + case X86_IRQ_ALLOC_TYPE_PCI_MSIX: + return iommu->msi_domain; default: + WARN_ON_ONCE(1); return NULL; } - - devid = get_devid(info); - if (devid >= 0) { - iommu = amd_iommu_rlookup_table[devid]; - if (iommu) - return iommu->ir_domain; - } - - return NULL; } static struct irq_domain *get_irq_domain(struct irq_alloc_info *info) { - struct amd_iommu *iommu; int devid; if (!info) return NULL; - switch (info->type) { - case 
X86_IRQ_ALLOC_TYPE_PCI_MSI: - case X86_IRQ_ALLOC_TYPE_PCI_MSIX: - devid = get_device_id(&info->msi_dev->dev); - if (devid < 0) - return NULL; - - iommu = amd_iommu_rlookup_table[devid]; - if (iommu) - return iommu->msi_domain; - break; - default: - break; - } - - return NULL; + devid = get_devid(info); + if (devid < 0) + return NULL; + return get_irq_domain_for_devid(info, devid); } struct irq_remap_ops amd_iommu_irq_ops = { @@ -3584,7 +3561,7 @@ struct irq_remap_ops amd_iommu_irq_ops = .disable= amd_iommu_disable, .reenable = amd_iommu_reenable, .enable_faulting= amd_iommu_enable_faulting, - .get_ir_irq_domain = get_ir_irq_domain, + .get_ir_irq_domain = get_irq_domain, .get_irq_domain = get_irq_domain, }; ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
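The shape of the consolidated getter can be modelled outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not the driver code: every toy_* name is hypothetical, the device id is assumed to be resolved already (the patch derives it from the request type in get_devid()), and the lookup table is a small fixed array instead of amd_iommu_rlookup_table. It only shows the idea of a single entry point that first resolves the IOMMU and then selects the domain by request type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-ins for the kernel types. */
enum toy_alloc_type {
	TOY_IOAPIC_GET_PARENT,
	TOY_HPET_GET_PARENT,
	TOY_PCI_MSI,
	TOY_PCI_MSIX,
	TOY_UNKNOWN,
};

struct toy_domain {
	const char *name;
};

struct toy_iommu {
	struct toy_domain *ir_domain;	/* parent for IOAPIC and HPET */
	struct toy_domain *msi_domain;	/* domain for PCI MSI and MSI-X */
};

struct toy_alloc_info {
	enum toy_alloc_type type;
	int devid;	/* assumed to be resolved already */
};

#define TOY_MAX_DEVID 4
static struct toy_iommu *toy_rlookup_table[TOY_MAX_DEVID];

/* One getter for every request type: resolve the IOMMU, then pick by type. */
static struct toy_domain *toy_get_irq_domain(const struct toy_alloc_info *info)
{
	struct toy_iommu *iommu;

	if (!info || info->devid < 0 || info->devid >= TOY_MAX_DEVID)
		return NULL;

	iommu = toy_rlookup_table[info->devid];
	if (!iommu)
		return NULL;

	switch (info->type) {
	case TOY_IOAPIC_GET_PARENT:
	case TOY_HPET_GET_PARENT:
		return iommu->ir_domain;
	case TOY_PCI_MSI:
	case TOY_PCI_MSIX:
		return iommu->msi_domain;
	default:
		return NULL;
	}
}
```

Because both request classes go through the same function, a caller no longer needs to know which of two callbacks applies to its request type.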
[patch V2 39/46] x86/irq: Add DEV_MSI allocation type
From: Thomas Gleixner

For the upcoming device MSI support a new allocation type is required.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/hw_irq.h |    1 +
 1 file changed, 1 insertion(+)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -40,6 +40,7 @@ enum irq_alloc_type {
 	X86_IRQ_ALLOC_TYPE_PCI_MSIX,
 	X86_IRQ_ALLOC_TYPE_DMAR,
 	X86_IRQ_ALLOC_TYPE_UV,
+	X86_IRQ_ALLOC_TYPE_DEV_MSI,
 	X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT,
 	X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT,
 };
[patch V2 02/46] x86/init: Remove unused init ops
From: Thomas Gleixner Some past platform removal forgot to get rid of this unused ballast. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/mpspec.h | 10 -- arch/x86/include/asm/x86_init.h | 10 -- arch/x86/kernel/mpparse.c | 26 -- arch/x86/kernel/x86_init.c |4 4 files changed, 4 insertions(+), 46 deletions(-) --- a/arch/x86/include/asm/mpspec.h +++ b/arch/x86/include/asm/mpspec.h @@ -67,21 +67,11 @@ static inline void find_smp_config(void) #ifdef CONFIG_X86_MPPARSE extern void e820__memblock_alloc_reserved_mpc_new(void); extern int enable_update_mptable; -extern int default_mpc_apic_id(struct mpc_cpu *m); -extern void default_smp_read_mpc_oem(struct mpc_table *mpc); -# ifdef CONFIG_X86_IO_APIC -extern void default_mpc_oem_bus_info(struct mpc_bus *m, char *str); -# else -# define default_mpc_oem_bus_info NULL -# endif extern void default_find_smp_config(void); extern void default_get_smp_config(unsigned int early); #else static inline void e820__memblock_alloc_reserved_mpc_new(void) { } #define enable_update_mptable 0 -#define default_mpc_apic_id NULL -#define default_smp_read_mpc_oem NULL -#define default_mpc_oem_bus_info NULL #define default_find_smp_config x86_init_noop #define default_get_smp_config x86_init_uint_noop #endif --- a/arch/x86/include/asm/x86_init.h +++ b/arch/x86/include/asm/x86_init.h @@ -11,22 +11,12 @@ struct cpuinfo_x86; /** * struct x86_init_mpparse - platform specific mpparse ops - * @mpc_record:platform specific mpc record accounting * @setup_ioapic_ids: platform specific ioapic id override - * @mpc_apic_id: platform specific mpc apic id assignment - * @smp_read_mpc_oem: platform specific oem mpc table setup - * @mpc_oem_pci_bus: platform specific pci bus setup (default NULL) - * @mpc_oem_bus_info: platform specific mpc bus info * @find_smp_config: find the smp configuration * @get_smp_config:get the smp configuration */ struct x86_init_mpparse { - void (*mpc_record)(unsigned int mode); void (*setup_ioapic_ids)(void); - int 
(*mpc_apic_id)(struct mpc_cpu *m); - void (*smp_read_mpc_oem)(struct mpc_table *mpc); - void (*mpc_oem_pci_bus)(struct mpc_bus *m); - void (*mpc_oem_bus_info)(struct mpc_bus *m, char *name); void (*find_smp_config)(void); void (*get_smp_config)(unsigned int early); }; --- a/arch/x86/kernel/mpparse.c +++ b/arch/x86/kernel/mpparse.c @@ -46,11 +46,6 @@ static int __init mpf_checksum(unsigned return sum & 0xFF; } -int __init default_mpc_apic_id(struct mpc_cpu *m) -{ - return m->apicid; -} - static void __init MP_processor_info(struct mpc_cpu *m) { int apicid; @@ -61,7 +56,7 @@ static void __init MP_processor_info(str return; } - apicid = x86_init.mpparse.mpc_apic_id(m); + apicid = m->apicid; if (m->cpuflag & CPU_BOOTPROCESSOR) { bootup_cpu = " (Bootup-CPU)"; @@ -73,7 +68,7 @@ static void __init MP_processor_info(str } #ifdef CONFIG_X86_IO_APIC -void __init default_mpc_oem_bus_info(struct mpc_bus *m, char *str) +static void __init mpc_oem_bus_info(struct mpc_bus *m, char *str) { memcpy(str, m->bustype, 6); str[6] = 0; @@ -84,7 +79,7 @@ static void __init MP_bus_info(struct mp { char str[7]; - x86_init.mpparse.mpc_oem_bus_info(m, str); + mpc_oem_bus_info(m, str); #if MAX_MP_BUSSES < 256 if (m->busid >= MAX_MP_BUSSES) { @@ -100,9 +95,6 @@ static void __init MP_bus_info(struct mp mp_bus_id_to_type[m->busid] = MP_BUS_ISA; #endif } else if (strncmp(str, BUSTYPE_PCI, sizeof(BUSTYPE_PCI) - 1) == 0) { - if (x86_init.mpparse.mpc_oem_pci_bus) - x86_init.mpparse.mpc_oem_pci_bus(m); - clear_bit(m->busid, mp_bus_not_pci); #ifdef CONFIG_EISA mp_bus_id_to_type[m->busid] = MP_BUS_PCI; @@ -198,8 +190,6 @@ static void __init smp_dump_mptable(stru 1, mpc, mpc->length, 1); } -void __init default_smp_read_mpc_oem(struct mpc_table *mpc) { } - static int __init smp_read_mpc(struct mpc_table *mpc, unsigned early) { char str[16]; @@ -218,14 +208,7 @@ static int __init smp_read_mpc(struct mp if (early) return 1; - if (mpc->oemptr) - x86_init.mpparse.smp_read_mpc_oem(mpc); - - /* -* Now process 
the configuration blocks. -*/ - x86_init.mpparse.mpc_record(0); - + /* Now process the configuration blocks. */ while (count < mpc->length) { switch (*mpt) { case MP_PROCESSOR: @@ -256,7 +239,6 @@ static int __init smp_read_mpc(struct mp count = mpc->length; break; } - x86_i
[patch V2 05/46] x86/msi: Move compose message callback where it belongs
Composing the MSI message at the MSI chip level is wrong because the
underlying parent domain is the one which knows how the message should be
composed for the direct vector delivery or the interrupt remapping table
entry.

The interrupt remapping aware PCI/MSI chip does that already. Make the
direct delivery chip do the same and move the composition of the direct
delivery MSI message to the vector domain irq chip.

This prepares for the upcoming device MSI support to avoid having
architecture specific knowledge in the device MSI domain irq chips.

Signed-off-by: Thomas Gleixner
---
V2: New patch
---
 arch/x86/include/asm/apic.h   |    8 ++++++++
 arch/x86/kernel/apic/msi.c    |   12 +++---------
 arch/x86/kernel/apic/vector.c |    1 +
 3 files changed, 12 insertions(+), 9 deletions(-)

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -519,6 +519,14 @@ static inline bool apic_id_is_primary_th
 static inline void apic_smt_update(void) { }
 #endif
 
+struct msi_msg;
+
+#ifdef CONFIG_PCI_MSI
+void x86_vector_msi_compose_msg(struct irq_data *data, struct msi_msg *msg);
+#else
+# define x86_vector_msi_compose_msg NULL
+#endif
+
 extern void ioapic_zap_locks(void);
 
 #endif /* _ASM_X86_APIC_H */
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -45,7 +45,7 @@ static void __irq_msi_compose_msg(struct
 		MSI_DATA_VECTOR(cfg->vector);
 }
 
-static void irq_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
+void x86_vector_msi_compose_msg(struct irq_data *data, struct msi_msg *msg)
 {
 	__irq_msi_compose_msg(irqd_cfg(data), msg);
 }
@@ -177,7 +177,6 @@ static struct irq_chip pci_msi_controlle
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
-	.irq_compose_msi_msg	= irq_msi_compose_msg,
 	.irq_set_affinity	= msi_set_affinity,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
@@ -321,7 +320,6 @@ static struct irq_chip dmar_msi_controll
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_set_affinity	= msi_domain_set_affinity,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
-	.irq_compose_msi_msg	= irq_msi_compose_msg,
 	.irq_write_msi_msg	= dmar_msi_write_msg,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
@@ -419,7 +417,6 @@ static struct irq_chip hpet_msi_controll
 	.irq_ack = irq_chip_ack_parent,
 	.irq_set_affinity = msi_domain_set_affinity,
 	.irq_retrigger = irq_chip_retrigger_hierarchy,
-	.irq_compose_msi_msg = irq_msi_compose_msg,
 	.irq_write_msi_msg = hpet_msi_write_msg,
 	.flags = IRQCHIP_SKIP_SET_WAKE,
 };
@@ -479,13 +476,10 @@ struct irq_domain *hpet_create_irq_domai
 	info.type = X86_IRQ_ALLOC_TYPE_HPET;
 	info.hpet_id = hpet_id;
 	parent = irq_remapping_get_ir_irq_domain(&info);
-	if (parent == NULL) {
+	if (parent == NULL)
 		parent = x86_vector_domain;
-	} else {
+	else
 		hpet_msi_controller.name = "IR-HPET-MSI";
-		/* Temporary fix: Will go away */
-		hpet_msi_controller.irq_compose_msi_msg = NULL;
-	}
 
 	fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name,
 					      hpet_id);
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -823,6 +823,7 @@ static struct irq_chip lapic_controller
 	.name			= "APIC",
 	.irq_ack		= apic_ack_edge,
 	.irq_set_affinity	= apic_set_affinity,
+	.irq_compose_msi_msg	= x86_vector_msi_compose_msg,
 	.irq_retrigger		= apic_retrigger_irq,
 };
[patch V2 06/46] x86/msi: Remove pointless vcpu_affinity callback
Setting the irq_set_vcpu_affinity() callback to
irq_chip_set_vcpu_affinity_parent() is a pointless exercise because the
function which utilizes it searches the domain hierarchy to find a parent
domain which has such a callback.

Remove the useless indirection.

Signed-off-by: Thomas Gleixner
---
V2: New patch. The same is probably true for lots of other irq chips.
---
 arch/x86/kernel/apic/msi.c |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -278,7 +278,6 @@ static struct irq_chip pci_msi_ir_contro
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
-	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
[patch V2 00/46] x86, PCI, XEN, genirq ...: Prepare for device MSI
This is the second version of providing a base to support device MSI (non
PCI based) and on top of that support for IMS (Interrupt Message Storm)
based devices in a halfways architecture independent way.

The first version can be found here:

    https://lore.kernel.org/r/20200821002424.119492...@linutronix.de

It's still a mixed bag of bug fixes, cleanups and general improvements
which are worthwhile independent of device MSI.

There are quite a bunch of issues to solve:

  - X86 does not use the device::msi_domain pointer for historical reasons
    and due to XEN, which makes it impossible to create an architecture
    agnostic device MSI infrastructure.

  - X86 has its own msi_alloc_info data type which is pointlessly
    different from the generic version and does not allow to share code.

  - The logic of composing MSI messages in a hierarchy is busted at the
    core level and of course some (x86) drivers depend on that.

  - A few minor shortcomings as usual

This series addresses that in several steps:

 1) Accidental bug fixes

      iommu/amd: Prevent NULL pointer dereference

 2) Janitoring

      x86/init: Remove unused init ops
      PCI: vmd: Dont abuse vector irqomain as parent
      x86/msi: Remove pointless vcpu_affinity callback

 3) Sanitizing the composition of MSI messages in a hierarchy

      genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
      x86/msi: Move compose message callback where it belongs

 4) Simplification of the x86 specific interrupt allocation mechanism

      x86/irq: Rename X86_IRQ_ALLOC_TYPE_MSI* to reflect PCI dependency
      x86/irq: Add allocation type for parent domain retrieval
      iommu/vt-d: Consolidate irq domain getter
      iommu/amd: Consolidate irq domain getter
      iommu/irq_remapping: Consolidate irq domain lookup

 5) Consolidation of the X86 specific interrupt allocation mechanism to be
    as close as possible to the generic MSI allocation mechanism which
    allows to get rid of quite a bunch of x86'isms which are pointless

      x86/irq: Prepare consolidation of irq_alloc_info
      x86/msi: Consolidate HPET allocation
      x86/ioapic: Consolidate IOAPIC allocation
      x86/irq: Consolidate DMAR irq allocation
      x86/irq: Consolidate UV domain allocation
      PCI/MSI: Rework pci_msi_domain_calc_hwirq()
      x86/msi: Consolidate MSI allocation
      x86/msi: Use generic MSI domain ops

 6) x86 specific cleanups to remove the dependency on arch_*_msi_irqs()

      x86/irq: Move apic_post_init() invocation to one place
      x86/pci: Reducde #ifdeffery in PCI init code
      x86/irq: Initialize PCI/MSI domain at PCI init time
      irqdomain/msi: Provide DOMAIN_BUS_VMD_MSI
      PCI: vmd: Mark VMD irqdomain with DOMAIN_BUS_VMD_MSI
      PCI/MSI: Provide pci_dev_has_special_msi_domain() helper
      x86/xen: Make xen_msi_init() static and rename it to xen_hvm_msi_init()
      x86/xen: Rework MSI teardown
      x86/xen: Consolidate XEN-MSI init
      irqdomain/msi: Allow to override msi_domain_alloc/free_irqs()
      x86/xen: Wrap XEN MSI management into irqdomain
      iommm/vt-d: Store irq domain in struct device
      iommm/amd: Store irq domain in struct device
      x86/pci: Set default irq domain in pcibios_add_device()
      PCI/MSI: Make arch_.*_msi_irq[s] fallbacks selectable
      x86/irq: Cleanup the arch_*_msi_irqs() leftovers
      x86/irq: Make most MSI ops XEN private
      iommu/vt-d: Remove domain search for PCI/MSI[X]
      iommu/amd: Remove domain search for PCI/MSI

 7) X86 specific preparation for device MSI

      x86/irq: Add DEV_MSI allocation type
      x86/msi: Rename and rework pci_msi_prepare() to cover non-PCI MSI

 8) Generic device MSI infrastructure

      platform-msi: Provide default irq_chip:: Ack
      genirq/proc: Take buslock on affinity write
      genirq/msi: Provide and use msi_domain_set_default_info_flags()
      platform-msi: Add device MSI infrastructure
      irqdomain/msi: Provide msi_alloc/free_store() callbacks

 9) POC of IMS (Interrupt Message Storm) irq domain and irqchip
    implementations for both device array and queue storage.

      irqchip: Add IMS (Interrupt Message Storm) driver - NOT FOR MERGING

Changes vs. V1:

  - Addressed various review comments and addressed the 0day fallout.
  - Corrected the XEN logic (Jürgen)
  - Make the arch fallback in PCI/MSI opt-in not opt-out (Bjorn)
  - Fixed the compose MSI message inconsistency
  - Ensure that the necessary flags are set for device MSI
  - Make the irq bus logic work for affinity setting to prepare support
    for IMS storage in queue memory. It turned out to be less scary than I
    feared.
  - Remove leftovers in iommu/intel|amd
  - Reworked the IMS POC driver to cover queue storage so Jason can have a
    look whether that fits the needs of MLX devices.

The whole lot is also available from git:

    git://git.kernel.org/pub/scm/linux/kernel/git/tglx/devel.git device-msi

This has
[patch V2 11/46] iommu/irq_remapping: Consolidate irq domain lookup
From: Thomas Gleixner Now that the iommu implementations handle the X86_*_GET_PARENT_DOMAIN types, consolidate the two getter functions. Signed-off-by: Thomas Gleixner --- arch/x86/include/asm/irq_remapping.h |8 arch/x86/kernel/apic/io_apic.c |2 +- arch/x86/kernel/apic/msi.c |2 +- drivers/iommu/amd/iommu.c|1 - drivers/iommu/hyperv-iommu.c |4 ++-- drivers/iommu/intel/irq_remapping.c |1 - drivers/iommu/irq_remapping.c| 23 +-- drivers/iommu/irq_remapping.h|5 + 8 files changed, 6 insertions(+), 40 deletions(-) --- a/arch/x86/include/asm/irq_remapping.h +++ b/arch/x86/include/asm/irq_remapping.h @@ -45,8 +45,6 @@ extern int irq_remap_enable_fault_handli extern void panic_if_irq_remap(const char *msg); extern struct irq_domain * -irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info); -extern struct irq_domain * irq_remapping_get_irq_domain(struct irq_alloc_info *info); /* Create PCI MSI/MSIx irqdomain, use @parent as the parent irqdomain. */ @@ -74,12 +72,6 @@ static inline void panic_if_irq_remap(co } static inline struct irq_domain * -irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info) -{ - return NULL; -} - -static inline struct irq_domain * irq_remapping_get_irq_domain(struct irq_alloc_info *info) { return NULL; --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2298,7 +2298,7 @@ static int mp_irqdomain_create(int ioapi init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT; info.ioapic_id = mpc_ioapic_id(ioapic); - parent = irq_remapping_get_ir_irq_domain(&info); + parent = irq_remapping_get_irq_domain(&info); if (!parent) parent = x86_vector_domain; else --- a/arch/x86/kernel/apic/msi.c +++ b/arch/x86/kernel/apic/msi.c @@ -478,7 +478,7 @@ struct irq_domain *hpet_create_irq_domai init_irq_alloc_info(&info, NULL); info.type = X86_IRQ_ALLOC_TYPE_HPET_GET_PARENT; info.hpet_id = hpet_id; - parent = irq_remapping_get_ir_irq_domain(&info); + parent = irq_remapping_get_irq_domain(&info); if (parent == 
NULL) parent = x86_vector_domain; else --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -3561,7 +3561,6 @@ struct irq_remap_ops amd_iommu_irq_ops = .disable= amd_iommu_disable, .reenable = amd_iommu_reenable, .enable_faulting= amd_iommu_enable_faulting, - .get_ir_irq_domain = get_irq_domain, .get_irq_domain = get_irq_domain, }; --- a/drivers/iommu/hyperv-iommu.c +++ b/drivers/iommu/hyperv-iommu.c @@ -182,7 +182,7 @@ static int __init hyperv_enable_irq_rema return IRQ_REMAP_X2APIC_MODE; } -static struct irq_domain *hyperv_get_ir_irq_domain(struct irq_alloc_info *info) +static struct irq_domain *hyperv_get_irq_domain(struct irq_alloc_info *info) { if (info->type == X86_IRQ_ALLOC_TYPE_IOAPIC_GET_PARENT) return ioapic_ir_domain; @@ -193,7 +193,7 @@ static struct irq_domain *hyperv_get_ir_ struct irq_remap_ops hyperv_irq_remap_ops = { .prepare= hyperv_prepare_irq_remapping, .enable = hyperv_enable_irq_remapping, - .get_ir_irq_domain = hyperv_get_ir_irq_domain, + .get_irq_domain = hyperv_get_irq_domain, }; #endif --- a/drivers/iommu/intel/irq_remapping.c +++ b/drivers/iommu/intel/irq_remapping.c @@ -1131,7 +1131,6 @@ struct irq_remap_ops intel_irq_remap_ops .disable= disable_irq_remapping, .reenable = reenable_irq_remapping, .enable_faulting= enable_drhd_fault_handling, - .get_ir_irq_domain = intel_get_irq_domain, .get_irq_domain = intel_get_irq_domain, }; --- a/drivers/iommu/irq_remapping.c +++ b/drivers/iommu/irq_remapping.c @@ -160,33 +160,12 @@ void panic_if_irq_remap(const char *msg) } /** - * irq_remapping_get_ir_irq_domain - Get the irqdomain associated with the IOMMU - * device serving request @info - * @info: interrupt allocation information, used to identify the IOMMU device - * - * It's used to get parent irqdomain for HPET and IOAPIC irqdomains. - * Returns pointer to IRQ domain, or NULL on failure. 
- */ -struct irq_domain * -irq_remapping_get_ir_irq_domain(struct irq_alloc_info *info) -{ - if (!remap_ops || !remap_ops->get_ir_irq_domain) - return NULL; - - return remap_ops->get_ir_irq_domain(info); -} - -/** * irq_remapping_get_irq_domain - Get the irqdomain serving the request @info * @info: interrupt allocation information, used to identify the IOMMU device * - * There will be one PCI MSI/MSIX irqdomain associated with each interrupt - * remapping device, so this interfa
[patch V2 04/46] genirq/chip: Use the first chip in irq_chip_compose_msi_msg()
The documentation of irq_chip_compose_msi_msg() claims that with
hierarchical irq domains the first chip in the hierarchy which has an
irq_compose_msi_msg() callback is chosen. But the code just keeps
iterating after it finds a chip with a compose callback.

The x86 HPET MSI implementation relies on that behaviour, but that does not
make it more correct.

The message should always be composed at the domain which manages the
underlying resource (e.g. APIC or remap table) because that domain knows
about the required layout of the message.

On X86 the following hierarchies exist:

1)   vector -- PCI/MSI
2)   vector -- IR -- PCI/MSI

The vector domain has a different message format than the IR (remapping)
domain. So obviously the PCI/MSI domain can't compose the message without
having knowledge about the parent domain, which is exactly the opposite of
what hierarchical domains want to achieve.

X86 actually has two different PCI/MSI chips where #1 has a compose
callback and #2 does not. #2 delegates the composition to the remap domain
where it belongs, but #1 does it at the PCI/MSI level.

For the upcoming device MSI support it's necessary to change this and just
let the first domain which can compose the message take care of it. That
way the top level chip does not have to worry about it and the device MSI
code does not need special knowledge about topologies. It just sets the
compose callback to NULL and lets the hierarchy pick the first chip which
has one.

Due to that the attempt to move the compose callback from the direct
delivery PCI/MSI domain to the vector domain made the system fail to boot
with interrupt remapping enabled because in the remapping case
irq_chip_compose_msi_msg() keeps iterating and chooses the compose callback
of the vector domain which obviously creates the wrong format for the remap
table.

Break out of the loop when the first irq chip with a compose callback is
found and fixup the HPET code temporarily. That workaround will be removed
once the direct delivery compose callback is moved to the place where it
belongs in the vector domain.

Signed-off-by: Thomas Gleixner
---
V2: New patch. Note, that this might break other stuff which relies on the
    current behaviour, but the hierarchy composition of DT based chips is
    really hard to follow.
---
 arch/x86/kernel/apic/msi.c |    7 +++++--
 kernel/irq/chip.c          |   12 +++++++++---
 2 files changed, 14 insertions(+), 5 deletions(-)

--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -479,10 +479,13 @@ struct irq_domain *hpet_create_irq_domai
 	info.type = X86_IRQ_ALLOC_TYPE_HPET;
 	info.hpet_id = hpet_id;
 	parent = irq_remapping_get_ir_irq_domain(&info);
-	if (parent == NULL)
+	if (parent == NULL) {
 		parent = x86_vector_domain;
-	else
+	} else {
 		hpet_msi_controller.name = "IR-HPET-MSI";
+		/* Temporary fix: Will go away */
+		hpet_msi_controller.irq_compose_msi_msg = NULL;
+	}
 
 	fn = irq_domain_alloc_named_id_fwnode(hpet_msi_controller.name,
 					      hpet_id);
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -1544,10 +1544,16 @@ int irq_chip_compose_msi_msg(struct irq_
 	struct irq_data *pos = NULL;
 
 #ifdef	CONFIG_IRQ_DOMAIN_HIERARCHY
-	for (; data; data = data->parent_data)
-#endif
-		if (data->chip && data->chip->irq_compose_msi_msg)
+	for (; data; data = data->parent_data) {
+		if (data->chip && data->chip->irq_compose_msi_msg) {
 			pos = data;
+			break;
+		}
+	}
+#else
+	if (data->chip && data->chip->irq_compose_msi_msg)
+		pos = data;
+#endif
 	if (!pos)
 		return -ENOSYS;
 
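The difference between "first chip wins" and "keep iterating" can be modelled in a few lines of userspace C. This is a minimal sketch under stated assumptions and not the kernel implementation: struct model_chip, struct model_irq_data, first_composer(), last_composer() and the two compose stubs are hypothetical stand-ins for struct irq_chip, struct irq_data and the real callbacks. The walk starts at the outermost (PCI/MSI) level and follows parent_data towards the vector domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct irq_chip and struct irq_data. */
struct model_chip {
	const char *name;
	int (*compose)(void);	/* NULL when the chip cannot compose a message */
};

struct model_irq_data {
	const struct model_chip *chip;
	struct model_irq_data *parent_data;	/* next level towards the vector domain */
};

/* Stub callbacks; the return value only tells the two formats apart. */
static int compose_ir(void)     { return 1; }	/* remap table format */
static int compose_vector(void) { return 2; }	/* direct delivery format */

/* Patched behaviour: stop at the first level that has a compose callback. */
static struct model_irq_data *first_composer(struct model_irq_data *data)
{
	for (; data; data = data->parent_data) {
		if (data->chip && data->chip->compose)
			return data;
	}
	return NULL;
}

/* Previous behaviour: keep iterating, so the innermost parent wins. */
static struct model_irq_data *last_composer(struct model_irq_data *data)
{
	struct model_irq_data *pos = NULL;

	for (; data; data = data->parent_data) {
		if (data->chip && data->chip->compose)
			pos = data;
	}
	return pos;
}
```

For a modelled "vector -- IR -- PCI/MSI" hierarchy in which both the IR and the vector level provide a callback, first_composer() selects the IR level while last_composer() selects the vector level, which corresponds to the wrong-format case the changelog describes.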
[patch V2 15/46] x86/irq: Consolidate DMAR irq allocation
From: Thomas Gleixner

None of the DMAR specific fields are required.

Signed-off-by: Thomas Gleixner
---
 arch/x86/include/asm/hw_irq.h |    6 ------
 arch/x86/kernel/apic/msi.c    |   10 +++++-----
 2 files changed, 5 insertions(+), 11 deletions(-)

--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -83,12 +83,6 @@ struct irq_alloc_info {
 			irq_hw_number_t	msi_hwirq;
 		};
 #endif
-#ifdef	CONFIG_DMAR_TABLE
-		struct {
-			int		dmar_id;
-			void		*dmar_data;
-		};
-#endif
 #ifdef	CONFIG_X86_UV
 		struct {
 			int		uv_limit;
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -329,15 +329,15 @@ static struct irq_chip dmar_msi_controll
 static irq_hw_number_t dmar_msi_get_hwirq(struct msi_domain_info *info,
 					  msi_alloc_info_t *arg)
 {
-	return arg->dmar_id;
+	return arg->hwirq;
 }
 
 static int dmar_msi_init(struct irq_domain *domain,
 			 struct msi_domain_info *info, unsigned int virq,
 			 irq_hw_number_t hwirq, msi_alloc_info_t *arg)
 {
-	irq_domain_set_info(domain, virq, arg->dmar_id, info->chip, NULL,
-			    handle_edge_irq, arg->dmar_data, "edge");
+	irq_domain_set_info(domain, virq, arg->devid, info->chip, NULL,
+			    handle_edge_irq, arg->data, "edge");
 
 	return 0;
 }
@@ -384,8 +384,8 @@ int dmar_alloc_hwirq(int id, int node, v
 
 	init_irq_alloc_info(&info, NULL);
 	info.type = X86_IRQ_ALLOC_TYPE_DMAR;
-	info.dmar_id = id;
-	info.dmar_data = arg;
+	info.devid = id;
+	info.data = arg;
 
 	return irq_domain_alloc_irqs(domain, 1, node, &info);
 }
[patch V2 01/46] iommu/amd: Prevent NULL pointer dereference
From: Thomas Gleixner

Dereferencing irq_data before checking it for NULL is suboptimal.

Signed-off-by: Thomas Gleixner
---
 drivers/iommu/amd/iommu.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -3717,8 +3717,8 @@ static int irq_remapping_alloc(struct ir
 
 	for (i = 0; i < nr_irqs; i++) {
 		irq_data = irq_domain_get_irq_data(domain, virq + i);
-		cfg = irqd_cfg(irq_data);
-		if (!irq_data || !cfg) {
+		cfg = irq_data ? irqd_cfg(irq_data) : NULL;
+		if (!cfg) {
 			ret = -EINVAL;
 			goto out_free_data;
 		}
[patch V2 03/46] PCI: vmd: Dont abuse vector irqomain as parent
VMD has its own PCI/MSI interrupt domain which is not in any way depending
on the x86 vector domain. PCI devices behind VMD share the VMD MSIX vector
entries via a VMD specific message translation to the actual VMD MSIX
vector. The VMD device interrupt handler for the VMD MSIX vectors invokes
all interrupt handlers of the devices which share a vector.

Making the x86 vector domain the actual parent of the VMD irq domain is
pointless and actually counterproductive. When a device interrupt is
requested then it will activate the interrupt which traverses down the
hierarchy and consumes an interrupt vector in the vector domain which is
never used.

The domain is self contained and has no parent dependencies, so just hand
in NULL for the parent and be done with it.

Signed-off-by: Thomas Gleixner
Cc: Jonathan Derrick
Cc: linux-...@vger.kernel.org
---
V2: New patch.
---
 drivers/pci/controller/vmd.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -573,7 +573,8 @@ static int vmd_enable_domain(struct vmd_
 		return -ENODEV;
 
 	vmd->irq_domain = pci_msi_create_irq_domain(fn, &vmd_msi_domain_info,
-						    x86_vector_domain);
+						    NULL);
+
 	if (!vmd->irq_domain) {
 		irq_domain_free_fwnode(fn);
 		return -ENODEV;
Re: [PATCH] iommu: Add support to filter non-strict/lazy mode based on device names
On 2020-08-25 16:42, Sai Prakash Ranjan wrote: Currently the non-strict or lazy mode of TLB invalidation can only be set for all or no domains. This works well for development platforms where setting to non-strict/lazy mode is fine for performance reasons but on production devices, we need a more fine grained control to allow only certain peripherals to support this mode where we can be sure that it is safe. So add support to filter non-strict/lazy mode based on the device names that are passed via cmdline parameter "iommu.nonstrict_device". There seems to be considerable overlap here with both the existing patches for per-device default domain control [1], and the broader ongoing development on how to define, evaluate and handle "trusted" vs. "untrusted" devices (e.g. [2],[3]). I'd rather see work done to make sure those integrate properly together and work well for everyone's purposes, than add more disjoint mechanisms that only address small pieces of the overall issue. Robin. [1] https://lore.kernel.org/linux-iommu/20200824051726.7xaJRTTszJuzdFWGJ8YNsshCtfNR0BNeMrlILAyqt_0@z/ [2] https://lore.kernel.org/linux-iommu/20200630044943.3425049-1-raja...@google.com/ [3] https://lore.kernel.org/linux-iommu/20200626002710.110200-2-raja...@google.com/ Example: iommu.nonstrict_device="7c4000.sdhci,a60.dwc3,6048000.etr" Signed-off-by: Sai Prakash Ranjan --- drivers/iommu/iommu.c | 37 + 1 file changed, 33 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 609bd25bf154..fd10a073f557 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -32,6 +32,9 @@ static unsigned int iommu_def_domain_type __read_mostly; static bool iommu_dma_strict __read_mostly = true; static u32 iommu_cmd_line __read_mostly; +#define DEVICE_NAME_LEN 1024 +static char nonstrict_device[DEVICE_NAME_LEN] __read_mostly; + struct iommu_group { struct kobject kobj; struct kobject *devices_kobj; @@ -327,6 +330,32 @@ static int __init iommu_dma_setup(char 
*str) } early_param("iommu.strict", iommu_dma_setup); +static int __init iommu_nonstrict_filter_setup(char *str) +{ + strlcpy(nonstrict_device, str, DEVICE_NAME_LEN); + return 1; +} +__setup("iommu.nonstrict_device=", iommu_nonstrict_filter_setup); + +static bool iommu_nonstrict_device(struct device *dev) +{ + char *filter, *device; + + if (!dev) + return false; + + filter = kstrdup(nonstrict_device, GFP_KERNEL); + if (!filter) + return false; + + while ((device = strsep(&filter, ","))) { + if (!strcmp(device, dev_name(dev))) + return true; + } + + return false; +} + static ssize_t iommu_group_attr_show(struct kobject *kobj, struct attribute *__attr, char *buf) { @@ -1470,7 +1499,7 @@ static int iommu_get_def_domain_type(struct device *dev) static int iommu_group_alloc_default_domain(struct bus_type *bus, struct iommu_group *group, - unsigned int type) + unsigned int type, struct device *dev) { struct iommu_domain *dom; @@ -1489,7 +1518,7 @@ static int iommu_group_alloc_default_domain(struct bus_type *bus, if (!group->domain) group->domain = dom; - if (!iommu_dma_strict) { + if (!iommu_dma_strict || iommu_nonstrict_device(dev)) { int attr = 1; iommu_domain_set_attr(dom, DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, @@ -1509,7 +1538,7 @@ static int iommu_alloc_default_domain(struct iommu_group *group, type = iommu_get_def_domain_type(dev); - return iommu_group_alloc_default_domain(dev->bus, group, type); + return iommu_group_alloc_default_domain(dev->bus, group, type, dev); } /** @@ -1684,7 +1713,7 @@ static void probe_alloc_default_domain(struct bus_type *bus, if (!gtype.type) gtype.type = iommu_def_domain_type; - iommu_group_alloc_default_domain(bus, group, gtype.type); + iommu_group_alloc_default_domain(bus, group, gtype.type, NULL); } base-commit: e46b3c0d011eab9933c183d5b47569db8e377281 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
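A note on the quoted iommu_nonstrict_device() helper: as posted, it duplicates the filter string with kstrdup() and returns without freeing the copy, and strsep() advances the only pointer to it. The comma-separated match itself does not need a copy at all. The following is a userspace sketch of that idea only, under the assumption that an exact, whole-entry match on the device name is what is wanted; name_in_filter() is a hypothetical name and is not part of the patch or of the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if @name appears as a complete entry in the comma-separated
 * @filter. The list is walked in place, so nothing is copied or allocated
 * and there is nothing to free on any return path.
 */
static bool name_in_filter(const char *filter, const char *name)
{
	size_t name_len;

	if (!filter || !name)
		return false;

	name_len = strlen(name);
	while (*filter) {
		const char *end = strchr(filter, ',');
		size_t len = end ? (size_t)(end - filter) : strlen(filter);

		/* Compare lengths first so "dwc3" does not match "a60.dwc3". */
		if (len == name_len && !strncmp(filter, name, len))
			return true;
		if (!end)
			break;
		filter = end + 1;	/* step past the comma to the next entry */
	}
	return false;
}
```

With the example string from the patch, name_in_filter("7c4000.sdhci,a60.dwc3,6048000.etr", "a60.dwc3") is true, while a partial name such as "dwc3" is rejected.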
[PATCH] dma-pool: Fix an uninitialized variable bug in atomic_pool_expand()
The "page" pointer can be used without being initialized.

Fixes: d7e673ec2c8e ("dma-pool: Only allocate from CMA when in same memory zone")
Signed-off-by: Dan Carpenter
---
 kernel/dma/pool.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/pool.c b/kernel/dma/pool.c
index 06582b488e31..1281c0f0442b 100644
--- a/kernel/dma/pool.c
+++ b/kernel/dma/pool.c
@@ -84,7 +84,7 @@ static int atomic_pool_expand(struct gen_pool *pool, size_t pool_size,
 			      gfp_t gfp)
 {
 	unsigned int order;
-	struct page *page;
+	struct page *page = NULL;
 	void *addr;
 	int ret = -ENOMEM;
 
-- 
2.28.0
Re: [PATCH 2/3] iommu/vt-d:Add support for probing ACPI device in RMRR
On Wed, Aug 26, 2020 at 06:27:51AM -0400, FelixCuioc wrote:
> After acpi device in RMRR is detected,it is necessary
> to establish a mapping for these devices.
> In acpi_device_create_direct_mappings(),create a mapping
> for the acpi device in RMRR.
> Add a helper to achieve the acpi namespace device can
> access the RMRR region.
>
> Signed-off-by: FelixCuioc
> ---
>  drivers/iommu/intel/iommu.c | 27 +++
>  drivers/iommu/iommu.c | 6 ++
>  include/linux/iommu.h | 3 +++
>  3 files changed, 36 insertions(+)
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index f774ef63d473..b31f02f41c96 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4797,6 +4797,20 @@ static int __init platform_optin_force_iommu(void)
>
>  	return 1;
>  }
> +static int acpi_device_create_direct_mappings(struct device *pn_dev, struct device *acpi_device)

Blank line.

> +{
> +	struct iommu_group *group;
> +
> +	acpi_device->bus->iommu_ops = &intel_iommu_ops;
> +	group = iommu_group_get(pn_dev);
> +	if (!group) {
> +		pr_warn("ACPI name space devices create direct mappings wrong!\n");
> +		return -1;

Use a proper error code. -ENOMEM? -EINVAL?

> +	}
> +	__acpi_device_create_direct_mappings(group, acpi_device);
> +
> +	return 0;
> +}
>
>  static int __init probe_acpi_namespace_devices(void)
>  {
> @@ -4812,6 +4826,7 @@ static int __init probe_acpi_namespace_devices(void)
>  		struct acpi_device_physical_node *pn;
>  		struct iommu_group *group;
>  		struct acpi_device *adev;
> +		struct device *pn_dev = NULL;
>
>  		if (dev->bus != &acpi_bus_type)
>  			continue;
> @@ -4822,6 +4837,7 @@ static int __init probe_acpi_namespace_devices(void)
>  					    &adev->physical_node_list, node) {
>  			group = iommu_group_get(pn->dev);
>  			if (group) {
> +				pn_dev = pn->dev;
>  				iommu_group_put(group);
>  				continue;
>  			}
> @@ -4830,7 +4846,18 @@ static int __init probe_acpi_namespace_devices(void)
>  			ret = iommu_probe_device(pn->dev);
>  			if (ret)
>  				break;
> +			pn_dev = pn->dev;
> +		}
> +		if (pn_dev == NULL) {

Run checkpatch.pl --strict on this patch. Use "if (!pn_dev) {".

> +			dev->bus->iommu_ops = &intel_iommu_ops;
> +			ret = iommu_probe_device(dev);
> +			if (ret) {
> +				pr_err("acpi_device probe fail! ret:%d\n", ret);
> +				return ret;
				^^
This should be goto unlock;

> +			}
> +			return 0;
>  		}
> +		ret = acpi_device_create_direct_mappings(pn_dev, dev);

unlock:

>  		mutex_unlock(&adev->physical_node_lock);
		^^^

regards,
dan carpenter
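The "goto unlock" suggestion in the review above is the usual single-exit pattern for code that holds a lock. The following is a minimal userspace sketch of that pattern, not the driver code. struct fake_lock, probe_one() and probe_locked() are hypothetical names, and the counter-based lock is only an assumption made so the example can show that the lock is released on every path. The failing step returns -EINVAL rather than a bare -1, in line with the other review comment.

```c
#include <errno.h>

/* Hypothetical lock that only counts how many times it is currently held. */
struct fake_lock {
	int held;
};

static void fake_lock_acquire(struct fake_lock *lock)
{
	lock->held++;
}

static void fake_lock_release(struct fake_lock *lock)
{
	lock->held--;
}

/* Hypothetical probe step: fails with -EINVAL for negative ids. */
static int probe_one(int id)
{
	return id < 0 ? -EINVAL : 0;
}

/*
 * Every path taken after fake_lock_acquire() leaves through the "unlock"
 * label, so an early failure cannot return with the lock still held.
 */
static int probe_locked(struct fake_lock *lock, int id)
{
	int ret;

	fake_lock_acquire(lock);

	ret = probe_one(id);
	if (ret)
		goto unlock;

	ret = probe_one(id + 1);

unlock:
	fake_lock_release(lock);
	return ret;
}
```

Whether the call succeeds or fails, lock->held is back to its previous value when probe_locked() returns. A direct "return ret;" inside the error branch would not guarantee that.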
Re: [PATCH 3/3] iommu/vt-d:Add mutex_unlock() before returning
On Wed, Aug 26, 2020 at 06:27:52AM -0400, FelixCuioc wrote:
> In the probe_acpi_namespace_devices function,when the physical
> node of the acpi device is NULL,the unlock function is missing.
> Add mutex_unlock(&adev->physical_node_lock).
>
> Reported-by: Dan Carpenter
> Signed-off-by: FelixCuioc

Oh... Crap. I wondered why I was being CC'd on this patchset. Just fold this into the earlier patch. Don't worry about the Reported-by.

regards,
dan carpenter
Re: [PATCH 1/3] iommu/vt-d:Add support for detecting ACPI device in RMRR
On Wed, Aug 26, 2020 at 06:27:50AM -0400, FelixCuioc wrote:
> Some ACPI devices need to issue dma requests to access
> the reserved memory area.BIOS uses the device scope type
> ACPI_NAMESPACE_DEVICE in RMRR to report these ACPI devices.
> This patch add support for detecting ACPI devices in RMRR.
>
> Signed-off-by: FelixCuioc
> ---
>  drivers/iommu/intel/dmar.c | 74 -
>  drivers/iommu/intel/iommu.c | 22 ++-
>  include/linux/dmar.h | 12 +-
>  3 files changed, 72 insertions(+), 36 deletions(-)
>
> diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
> index 93e6345f3414..024ca38dba12 100644
> --- a/drivers/iommu/intel/dmar.c
> +++ b/drivers/iommu/intel/dmar.c
> @@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info *info, int bus,
>  }
>
>  /* Return: > 0 if match found, 0 if no match found, < 0 if error happens */
> -int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
> +int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
>  			  void *start, void*end, u16 segment,
>  			  struct dmar_dev_scope *devices,
>  			  int devices_cnt)
> @@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info *info)
>
>  		drhd = container_of(dmaru->hdr,
>  				    struct acpi_dmar_hardware_unit, header);
> -		ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
> +		ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
>  				((void *)drhd) + drhd->header.length,
>  				dmaru->segment,
>  				dmaru->devices, dmaru->devices_cnt);
> @@ -696,48 +696,56 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
>
>  	return dmaru;
>  }
> -
> -static void __init dmar_acpi_insert_dev_scope(u8 device_number,
> -					      struct acpi_device *adev)
> +int dmar_acpi_insert_dev_scope(u8 device_number,

The patch deleted the blank line between functions. Make this function bool, change the 1/0 to true/false. Add a comment explaining what the return values mean.

> +				struct acpi_device *adev,
> +				void *start, void *end,
> +				struct dmar_dev_scope *devices,
> +				int devices_cnt)
>  {
> -	struct dmar_drhd_unit *dmaru;
> -	struct acpi_dmar_hardware_unit *drhd;
>  	struct acpi_dmar_device_scope *scope;
>  	struct device *tmp;
>  	int i;
>  	struct acpi_dmar_pci_path *path;
>
> +	for (; start < end; start += scope->length) {
> +		scope = start;

Are we sure that there is enough space for sizeof(*scope) in end - start? (I don't know this code so maybe we are).

> +		if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
> +			continue;
> +		if (scope->enumeration_id != device_number)
> +			continue;
> +		path = (void *)(scope + 1);
> +		for_each_dev_scope(devices, devices_cnt, i, tmp)
> +			if (tmp == NULL) {
> +				devices[i].bus = scope->bus;
> +				devices[i].devfn = PCI_DEVFN(path->device, path->function);
> +				rcu_assign_pointer(devices[i].dev,
> +						   get_device(&adev->dev));
> +				return 1;
> +			}
> +		WARN_ON(i >= devices_cnt);
> +	}
> +	return 0;
> +}
> +static int dmar_acpi_bus_add_dev(u8 device_number, struct acpi_device *adev)
> +{
> +	struct dmar_drhd_unit *dmaru;
> +	struct acpi_dmar_hardware_unit *drhd;
> +	int ret = 0;

This initialization is never used.

> +
>  	for_each_drhd_unit(dmaru) {
>  		drhd = container_of(dmaru->hdr,
>  				    struct acpi_dmar_hardware_unit,
>  				    header);
> +		ret = dmar_acpi_insert_dev_scope(device_number, adev, (void *)(drhd+1),
> +						((void *)drhd)+drhd->header.length,
> +						dmaru->devices, dmaru->devices_cnt);
> +		if (ret)
> +			break;
> +	}
> +	ret = dmar_rmrr_add_acpi_dev(device_number, adev);

What about if dmar_acpi_insert_dev_scope() always returns zero, do we still want to call dmar_rmrr_add_acpi_dev()?

>
> -		for (scope = (void *)(drhd + 1);
> -		     (unsigned long)scope < ((unsigned long)drhd) + drhd->header.length;
> -		     scope = ((void *)scope) + scope->length) {
> -			if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
> -				continue;
> -			if (scope->enume
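The review question above, whether end - start always leaves room for sizeof(*scope), applies to any walk over variable-length records. Below is a minimal userspace sketch of a bounds-checked walk. It does not use the real DMAR structures. struct rec_hdr and count_records() are hypothetical, and the two-byte header layout is an assumption chosen only to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record header: one type byte, one total-length byte. */
struct rec_hdr {
	uint8_t type;
	uint8_t length; /* header plus payload, in bytes */
};

/*
 * Count records of the given type in [start, end). Returns -1 if a header
 * would cross "end", or if a length is shorter than the header or longer
 * than the space that remains.
 */
static int count_records(const uint8_t *start, const uint8_t *end, uint8_t type)
{
	int n = 0;

	while (start < end) {
		struct rec_hdr hdr;

		/* Make sure a whole header fits before reading it. */
		if ((size_t)(end - start) < sizeof(hdr))
			return -1;
		memcpy(&hdr, start, sizeof(hdr));

		/* A too-small length would never advance; a too-large one overruns. */
		if (hdr.length < sizeof(hdr) || hdr.length > (size_t)(end - start))
			return -1;

		if (hdr.type == type)
			n++;
		start += hdr.length;
	}
	return n;
}
```

The two checks are deliberately separate. The first protects the header read itself, and the second protects the step to the next record. A zero length in particular would otherwise leave the loop spinning on the same record.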
[PATCH v2 3/3] iommu/vt-d:Add mutex_unlock() before returning
In probe_acpi_namespace_devices(), when the physical node of the ACPI device is NULL, the function returns without unlocking. Add mutex_unlock(&adev->physical_node_lock).

Reported-by: Dan Carpenter
Signed-off-by: FelixCuioc
---
 drivers/iommu/intel/iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b31f02f41c96..25e9853cba1b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4851,6 +4851,7 @@ static int __init probe_acpi_namespace_devices(void)
 		if (pn_dev == NULL) {
 			dev->bus->iommu_ops = &intel_iommu_ops;
 			ret = iommu_probe_device(dev);
+			mutex_unlock(&adev->physical_node_lock);
 			if (ret) {
 				pr_err("acpi_device probe fail! ret:%d\n", ret);
 				return ret;
-- 
2.17.1
[PATCH v2 2/3] iommu/vt-d:Add support for probing ACPI device in RMRR
After an ACPI device in RMRR is detected, a mapping must be established for it. acpi_device_create_direct_mappings() creates a mapping for the ACPI device in RMRR. Add a helper so that the ACPI namespace device can access the RMRR region.

Signed-off-by: FelixCuioc
---
 drivers/iommu/intel/iommu.c | 27 +++
 drivers/iommu/iommu.c | 6 ++
 include/linux/iommu.h | 3 +++
 3 files changed, 36 insertions(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f774ef63d473..b31f02f41c96 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4797,6 +4797,20 @@ static int __init platform_optin_force_iommu(void)
 
 	return 1;
 }
+static int acpi_device_create_direct_mappings(struct device *pn_dev, struct device *acpi_device)
+{
+	struct iommu_group *group;
+
+	acpi_device->bus->iommu_ops = &intel_iommu_ops;
+	group = iommu_group_get(pn_dev);
+	if (!group) {
+		pr_warn("ACPI name space devices create direct mappings wrong!\n");
+		return -1;
+	}
+	__acpi_device_create_direct_mappings(group, acpi_device);
+
+	return 0;
+}
 
 static int __init probe_acpi_namespace_devices(void)
 {
@@ -4812,6 +4826,7 @@ static int __init probe_acpi_namespace_devices(void)
 		struct acpi_device_physical_node *pn;
 		struct iommu_group *group;
 		struct acpi_device *adev;
+		struct device *pn_dev = NULL;
 
 		if (dev->bus != &acpi_bus_type)
 			continue;
@@ -4822,6 +4837,7 @@ static int __init probe_acpi_namespace_devices(void)
 					    &adev->physical_node_list, node) {
 			group = iommu_group_get(pn->dev);
 			if (group) {
+				pn_dev = pn->dev;
 				iommu_group_put(group);
 				continue;
 			}
@@ -4830,7 +4846,18 @@ static int __init probe_acpi_namespace_devices(void)
 			ret = iommu_probe_device(pn->dev);
 			if (ret)
 				break;
+			pn_dev = pn->dev;
+		}
+		if (pn_dev == NULL) {
+			dev->bus->iommu_ops = &intel_iommu_ops;
+			ret = iommu_probe_device(dev);
+			if (ret) {
+				pr_err("acpi_device probe fail! ret:%d\n", ret);
+				return ret;
+			}
+			return 0;
 		}
+		ret = acpi_device_create_direct_mappings(pn_dev, dev);
 		mutex_unlock(&adev->physical_node_lock);
 
 		if (ret)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 609bd25bf154..4f714a2d5ef7 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -779,6 +779,12 @@ static bool iommu_is_attach_deferred(struct iommu_domain *domain,
 	return false;
 }
 
+void __acpi_device_create_direct_mappings(struct iommu_group *group, struct device *acpi_device)
+{
+	iommu_create_device_direct_mappings(group, acpi_device);
+}
+EXPORT_SYMBOL_GPL(__acpi_device_create_direct_mappings);
+
 /**
  * iommu_group_add_device - add a device to an iommu group
  * @group: the group into which to add the device (reference should be held)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index fee209efb756..9be134775886 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -514,6 +514,9 @@ extern void iommu_domain_window_disable(struct iommu_domain *domain, u32 wnd_nr)
 extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
 			      unsigned long iova, int flags);
 
+extern void __acpi_device_create_direct_mappings(struct iommu_group *group,
+						struct device *acpi_device);
+
 static inline void iommu_flush_tlb_all(struct iommu_domain *domain)
 {
 	if (domain->ops->flush_iotlb_all)
-- 
2.17.1
[PATCH v2 1/3] iommu/vt-d:Add support for detecting ACPI device in RMRR
Some ACPI devices need to issue DMA requests to access the reserved memory area. The BIOS uses the device scope type ACPI_NAMESPACE_DEVICE in RMRR to report these ACPI devices. This patch adds support for detecting ACPI devices in RMRR.

Signed-off-by: FelixCuioc
---
 drivers/iommu/intel/dmar.c | 74 -
 drivers/iommu/intel/iommu.c | 22 ++-
 include/linux/dmar.h | 12 +-
 3 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 93e6345f3414..024ca38dba12 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info *info, int bus,
 }
 
 /* Return: > 0 if match found, 0 if no match found, < 0 if error happens */
-int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
+int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
 			  void *start, void*end, u16 segment,
 			  struct dmar_dev_scope *devices,
 			  int devices_cnt)
@@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info *info)
 
 		drhd = container_of(dmaru->hdr,
 				    struct acpi_dmar_hardware_unit, header);
-		ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
+		ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
 				((void *)drhd) + drhd->header.length,
 				dmaru->segment,
 				dmaru->devices, dmaru->devices_cnt);
@@ -696,48 +696,56 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
 
 	return dmaru;
 }
-
-static void __init dmar_acpi_insert_dev_scope(u8 device_number,
-					      struct acpi_device *adev)
+int dmar_acpi_insert_dev_scope(u8 device_number,
+				struct acpi_device *adev,
+				void *start, void *end,
+				struct dmar_dev_scope *devices,
+				int devices_cnt)
 {
-	struct dmar_drhd_unit *dmaru;
-	struct acpi_dmar_hardware_unit *drhd;
 	struct acpi_dmar_device_scope *scope;
 	struct device *tmp;
 	int i;
 	struct acpi_dmar_pci_path *path;
 
+	for (; start < end; start += scope->length) {
+		scope = start;
+		if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
+			continue;
+		if (scope->enumeration_id != device_number)
+			continue;
+		path = (void *)(scope + 1);
+		for_each_dev_scope(devices, devices_cnt, i, tmp)
+			if (tmp == NULL) {
+				devices[i].bus = scope->bus;
+				devices[i].devfn = PCI_DEVFN(path->device, path->function);
+				rcu_assign_pointer(devices[i].dev,
+						   get_device(&adev->dev));
+				return 1;
+			}
+		WARN_ON(i >= devices_cnt);
+	}
+	return 0;
+}
+static int dmar_acpi_bus_add_dev(u8 device_number, struct acpi_device *adev)
+{
+	struct dmar_drhd_unit *dmaru;
+	struct acpi_dmar_hardware_unit *drhd;
+	int ret = 0;
+
 	for_each_drhd_unit(dmaru) {
 		drhd = container_of(dmaru->hdr,
 				    struct acpi_dmar_hardware_unit,
 				    header);
+		ret = dmar_acpi_insert_dev_scope(device_number, adev, (void *)(drhd+1),
+						((void *)drhd)+drhd->header.length,
+						dmaru->devices, dmaru->devices_cnt);
+		if (ret)
+			break;
+	}
+	ret = dmar_rmrr_add_acpi_dev(device_number, adev);
 
-		for (scope = (void *)(drhd + 1);
-		     (unsigned long)scope < ((unsigned long)drhd) + drhd->header.length;
-		     scope = ((void *)scope) + scope->length) {
-			if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
-				continue;
-			if (scope->enumeration_id != device_number)
-				continue;
+	return ret;
-			path = (void *)(scope + 1);
-			pr_info("ACPI device \"%s\" under DMAR at %llx as %02x:%02x.%d\n",
-				dev_name(&adev->dev), dmaru->reg_base_addr,
-				scope->bus, path->device, path->function);
-			for_each_dev_scope(dmaru->devices, dmaru->devices_cnt, i, tmp)
-				if (tmp == NULL) {
-					dmaru->devices[i].b
[PATCH v2 0/3] Add support for ACPI device in RMRR to access reserved memory
The BIOS allocates reserved memory ranges that may be DMA targets. It may report each such region through the RMRR structures, along with the devices that require access to that region.

The purpose of this series is to let an ACPI device listed in RMRR access its reserved memory. To do that, the series parses ACPI devices in RMRR and establishes a mapping for each such device.

The first patch adds interfaces for detecting an ACPI device in RMRR. Some interface functions are modified to distinguish them from the PCI device ones.

The second patch adds support for probing an ACPI device in RMRR. In probe_acpi_namespace_devices(), it adds direct mapping of the ACPI device and handles the case where the physical node of the ACPI device is NULL.

The last patch adds mutex_unlock(&adev->physical_node_lock) before returning in probe_acpi_namespace_devices().

v1->v2:
 - Split the patch set into a small series of patches
 - Move the handling of a NULL physical node of the ACPI device to
   probe_acpi_namespace_devices().
 - Add mutex_unlock(&adev->physical_node_lock) before returning in
   probe_acpi_namespace_devices().

FelixCuioc (3):
  iommu/vt-d:Add support for detecting ACPI device in RMRR
  iommu/vt-d:Add support for probing ACPI device in RMRR
  iommu/vt-d:Add mutex_unlock() before returning

 drivers/iommu/intel/dmar.c | 74 -
 drivers/iommu/intel/iommu.c | 50 -
 drivers/iommu/iommu.c | 6 +++
 include/linux/dmar.h | 12 +-
 include/linux/iommu.h | 3 ++
 5 files changed, 109 insertions(+), 36 deletions(-)

-- 
2.17.1
[PATCH 3/3] iommu/vt-d:Add mutex_unlock() before returning
In probe_acpi_namespace_devices(), when the physical node of the ACPI device is NULL, the function returns without unlocking. Add mutex_unlock(&adev->physical_node_lock).

Reported-by: Dan Carpenter
Signed-off-by: FelixCuioc
---
 drivers/iommu/intel/iommu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index b31f02f41c96..25e9853cba1b 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -4851,6 +4851,7 @@ static int __init probe_acpi_namespace_devices(void)
 		if (pn_dev == NULL) {
 			dev->bus->iommu_ops = &intel_iommu_ops;
 			ret = iommu_probe_device(dev);
+			mutex_unlock(&adev->physical_node_lock);
 			if (ret) {
 				pr_err("acpi_device probe fail! ret:%d\n", ret);
 				return ret;
-- 
2.17.1
[PATCH 1/3] iommu/vt-d:Add support for detecting ACPI device in RMRR
Some ACPI devices need to issue DMA requests to access the reserved memory area. The BIOS uses the device scope type ACPI_NAMESPACE_DEVICE in RMRR to report these ACPI devices. This patch adds support for detecting ACPI devices in RMRR.

Signed-off-by: FelixCuioc
---
 drivers/iommu/intel/dmar.c | 74 -
 drivers/iommu/intel/iommu.c | 22 ++-
 include/linux/dmar.h | 12 +-
 3 files changed, 72 insertions(+), 36 deletions(-)

diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 93e6345f3414..024ca38dba12 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -215,7 +215,7 @@ static bool dmar_match_pci_path(struct dmar_pci_notify_info *info, int bus,
 }
 
 /* Return: > 0 if match found, 0 if no match found, < 0 if error happens */
-int dmar_insert_dev_scope(struct dmar_pci_notify_info *info,
+int dmar_pci_insert_dev_scope(struct dmar_pci_notify_info *info,
 			  void *start, void*end, u16 segment,
 			  struct dmar_dev_scope *devices,
 			  int devices_cnt)
@@ -304,7 +304,7 @@ static int dmar_pci_bus_add_dev(struct dmar_pci_notify_info *info)
 
 		drhd = container_of(dmaru->hdr,
 				    struct acpi_dmar_hardware_unit, header);
-		ret = dmar_insert_dev_scope(info, (void *)(drhd + 1),
+		ret = dmar_pci_insert_dev_scope(info, (void *)(drhd + 1),
 				((void *)drhd) + drhd->header.length,
 				dmaru->segment,
 				dmaru->devices, dmaru->devices_cnt);
@@ -696,48 +696,56 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
 
 	return dmaru;
 }
-
-static void __init dmar_acpi_insert_dev_scope(u8 device_number,
-					      struct acpi_device *adev)
+int dmar_acpi_insert_dev_scope(u8 device_number,
+				struct acpi_device *adev,
+				void *start, void *end,
+				struct dmar_dev_scope *devices,
+				int devices_cnt)
 {
-	struct dmar_drhd_unit *dmaru;
-	struct acpi_dmar_hardware_unit *drhd;
 	struct acpi_dmar_device_scope *scope;
 	struct device *tmp;
 	int i;
 	struct acpi_dmar_pci_path *path;
 
+	for (; start < end; start += scope->length) {
+		scope = start;
+		if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
+			continue;
+		if (scope->enumeration_id != device_number)
+			continue;
+		path = (void *)(scope + 1);
+		for_each_dev_scope(devices, devices_cnt, i, tmp)
+			if (tmp == NULL) {
+				devices[i].bus = scope->bus;
+				devices[i].devfn = PCI_DEVFN(path->device, path->function);
+				rcu_assign_pointer(devices[i].dev,
+						   get_device(&adev->dev));
+				return 1;
+			}
+		WARN_ON(i >= devices_cnt);
+	}
+	return 0;
+}
+static int dmar_acpi_bus_add_dev(u8 device_number, struct acpi_device *adev)
+{
+	struct dmar_drhd_unit *dmaru;
+	struct acpi_dmar_hardware_unit *drhd;
+	int ret = 0;
+
 	for_each_drhd_unit(dmaru) {
 		drhd = container_of(dmaru->hdr,
 				    struct acpi_dmar_hardware_unit,
 				    header);
+		ret = dmar_acpi_insert_dev_scope(device_number, adev, (void *)(drhd+1),
+						((void *)drhd)+drhd->header.length,
+						dmaru->devices, dmaru->devices_cnt);
+		if (ret)
+			break;
+	}
+	ret = dmar_rmrr_add_acpi_dev(device_number, adev);
 
-		for (scope = (void *)(drhd + 1);
-		     (unsigned long)scope < ((unsigned long)drhd) + drhd->header.length;
-		     scope = ((void *)scope) + scope->length) {
-			if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_NAMESPACE)
-				continue;
-			if (scope->enumeration_id != device_number)
-				continue;
+	return ret;
-			path = (void *)(scope + 1);
-			pr_info("ACPI device \"%s\" under DMAR at %llx as %02x:%02x.%d\n",
-				dev_name(&adev->dev), dmaru->reg_base_addr,
-				scope->bus, path->device, path->function);
-			for_each_dev_scope(dmaru->devices, dmaru->devices_cnt, i, tmp)
-				if (tmp == NULL) {
-					dmaru->devices[i].b