[PATCH 1/1] iommu: Fix deferred domain attachment in iommu_probe_device()
The IOMMU core has support for deferring the attachment of the default
domain to a device. Fix a missed deferred-attaching check in
iommu_probe_device().

Fixes: cf193888bfbd3 ("iommu: Move new probe_device path to separate function")
Cc: sta...@vger.kernel.org # v5.8+
Signed-off-by: Lu Baolu
---
 drivers/iommu/iommu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 6d847027d35e..690abf60239d 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -93,6 +93,8 @@ static void __iommu_detach_group(struct iommu_domain *domain,
 static int iommu_create_device_direct_mappings(struct iommu_group *group,
					       struct device *dev);
 static struct iommu_group *iommu_group_get_for_dev(struct device *dev);
+static bool iommu_is_attach_deferred(struct iommu_domain *domain,
+				     struct device *dev);
 
 #define IOMMU_GROUP_ATTR(_name, _mode, _show, _store)		\
 struct iommu_group_attribute iommu_group_attr_##_name =	\
@@ -264,7 +266,8 @@ int iommu_probe_device(struct device *dev)
	 */
	iommu_alloc_default_domain(group, dev);
 
-	if (group->default_domain)
+	if (group->default_domain &&
+	    !iommu_is_attach_deferred(group->default_domain, dev))
		ret = __iommu_attach_device(group->default_domain, dev);
 
	iommu_create_device_direct_mappings(group, dev);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu/amd: Increase interrupt remapping table limit to 512 entries
Hi Joerg,

Do you have any concerns regarding this patch?

Thanks,
Suravee

On 10/15/20 9:50 AM, Suravee Suthikulpanit wrote:
Certain device drivers allocate IO queues on a per-CPU basis. On the
AMD EPYC platform, which can support up to 256 CPU threads, this can
exceed the current MAX_IRQS_PER_TABLE limit of 256 and result in the
error message:

    AMD-Vi: Failed to allocate IRTE

This has been observed with certain NVMe devices.

AMD IOMMU hardware can actually support up to 512 interrupt remapping
table entries. Therefore, update the driver to match the hardware limit.

Please note that this also increases the size of the interrupt
remapping table to 8KB per device when using the 128-bit IRTE format.

Signed-off-by: Suravee Suthikulpanit
---
 drivers/iommu/amd/amd_iommu_types.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 30a5d412255a..427484c45589 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -406,7 +406,11 @@ extern bool amd_iommu_np_cache;
 /* Only true if all IOMMUs support device IOTLBs */
 extern bool amd_iommu_iotlb_sup;
 
-#define MAX_IRQS_PER_TABLE	256
+/*
+ * AMD IOMMU hardware only supports 512 IRTEs despite
+ * the architectural limitation of 2048 entries.
+ */
+#define MAX_IRQS_PER_TABLE	512
 #define IRQ_TABLE_ALIGNMENT	128
 
 struct irq_remap_table {
RE: [PATCH v3 17/35] x86/pci/xen: Use msi_msg shadow structs
From: David Woodhouse
> Sent: 25 October 2020 10:26
> To: David Laight ; x...@kernel.org
>
> On Sun, 2020-10-25 at 09:49 +0000, David Laight wrote:
> > Just looking at a random one of these patches...
> >
> > Does the compiler manage to optimise that reasonably?
> > Or does it generate a lot of shifts and masks as each
> > bitfield is set?
> >
> > The code generation for bitfields is often a lot worse
> > than that for |= setting bits in a word.
>
> Indeed, it appears to be utterly appalling. That was one of my
> motivations for doing it with masks and shifts in the first place.

I thought it would be.

I'm not even sure using bitfields to map hardware registers makes the
code any more readable (or foolproof).

I suspect using #define constants and explicit |= and &= ~ is actually
best - provided the names are descriptive enough. If you set all the
fields the compiler will merge the values and do a single write -
provided nothing you read might alias the target.

The only way to get strongly typed integers is to cast them to a
structure pointer type. Together with a static inline function to
remove the casts and | the values together it might make things
foolproof. But I've never tried it.

ISTR once doing something like that with error codes, but it didn't
ever make source control.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [PATCH v3 17/35] x86/pci/xen: Use msi_msg shadow structs
On Sun, 2020-10-25 at 09:49 +0000, David Laight wrote:
> Just looking at a random one of these patches...
>
> Does the compiler manage to optimise that reasonably?
> Or does it generate a lot of shifts and masks as each
> bitfield is set?
>
> The code generation for bitfields is often a lot worse
> than that for |= setting bits in a word.

Indeed, it appears to be utterly appalling. That was one of my
motivations for doing it with masks and shifts in the first place.

Because in ioapic_setup_msg_from_msi(), for example, these two are
consecutive in both source and destination:

	entry->vector			= msg.arch_data.vector;
	entry->delivery_mode		= msg.arch_data.delivery_mode;

And so are those:

	entry->ir_format		= msg.arch_addr_lo.dmar_format;
	entry->ir_index_0_14		= msg.arch_addr_lo.dmar_index_0_14;

But GCC 7.5.0 here is doing them all separately...

11e0 :
{
    11e0:	53                   	push   %rbx
    11e1:	48 89 f3             	mov    %rsi,%rbx
    11e4:	48 83 ec 18          	sub    $0x18,%rsp
	irq_chip_compose_msi_msg(irq_data, &msg);
    11e8:	48 89 e6             	mov    %rsp,%rsi
{
    11eb:	65 48 8b 04 25 28 00 	mov    %gs:0x28,%rax
    11f2:	00 00
    11f4:	48 89 44 24 10       	mov    %rax,0x10(%rsp)
    11f9:	31 c0                	xor    %eax,%eax
	irq_chip_compose_msi_msg(irq_data, &msg);
    11fb:	e8 00 00 00 00       	callq  1200
	entry->vector			= msg.arch_data.vector;
    1200:	0f b6 44 24 08       	movzbl 0x8(%rsp),%eax
	entry->delivery_mode		= msg.arch_data.delivery_mode;
    1205:	0f b6 53 01          	movzbl 0x1(%rbx),%edx
    1209:	0f b6 74 24 09       	movzbl 0x9(%rsp),%esi
	entry->vector			= msg.arch_data.vector;
    120e:	88 03                	mov    %al,(%rbx)
	entry->dest_mode_logical	= msg.arch_addr_lo.dest_mode_logical;
    1210:	0f b6 04 24          	movzbl (%rsp),%eax
	entry->delivery_mode		= msg.arch_data.delivery_mode;
    1214:	83 e2 f0             	and    $0xfffffff0,%edx
    1217:	83 e6 07             	and    $0x7,%esi
	entry->dest_mode_logical	= msg.arch_addr_lo.dest_mode_logical;
    121a:	09 f2                	or     %esi,%edx
    121c:	8d 0c 00             	lea    (%rax,%rax,1),%ecx
	entry->ir_format		= msg.arch_addr_lo.dmar_format;
    121f:	c0 e8 04             	shr    $0x4,%al
	entry->dest_mode_logical	= msg.arch_addr_lo.dest_mode_logical;
    1222:	83 e1 08             	and    $0x8,%ecx
    1225:	09 ca                	or     %ecx,%edx
    1227:	88 53 01             	mov    %dl,0x1(%rbx)
	entry->ir_format		= msg.arch_addr_lo.dmar_format;
    122a:	89 c2                	mov    %eax,%edx
	entry->ir_index_0_14		= msg.arch_addr_lo.dmar_index_0_14;
    122c:	8b 04 24             	mov    (%rsp),%eax
    122f:	83 e2 01             	and    $0x1,%edx
    1232:	c1 e8 05             	shr    $0x5,%eax
    1235:	66 25 ff 7f          	and    $0x7fff,%ax
    1239:	8d 0c 00             	lea    (%rax,%rax,1),%ecx
    123c:	66 c1 e8 07          	shr    $0x7,%ax
    1240:	88 43 07             	mov    %al,0x7(%rbx)
    1243:	09 ca                	or     %ecx,%edx
}
    1245:	48 8b 44 24 10       	mov    0x10(%rsp),%rax
    124a:	65 48 33 04 25 28 00 	xor    %gs:0x28,%rax
    1251:	00 00
	entry->ir_index_0_14		= msg.arch_addr_lo.dmar_index_0_14;
    1253:	88 53 06             	mov    %dl,0x6(%rbx)
}
    1256:	75 06                	jne    125e
    1258:	48 83 c4 18          	add    $0x18,%rsp
    125c:	5b                   	pop    %rbx
    125d:	c3                   	retq
    125e:	e8 00 00 00 00       	callq  1263
    1263:	0f 1f 00             	nopl   (%rax)
    1266:	66 2e 0f 1f 84 00 00 	nopw   %cs:0x0(%rax,%rax,1)
    126d:	00 00 00

Still, this isn't really a fast path so I'm content to file the GCC PR
for the really *obvious* misses and let it take its course.
RE: [PATCH v3 17/35] x86/pci/xen: Use msi_msg shadow structs
From: David Woodhouse
> Sent: 24 October 2020 22:35
>
> From: Thomas Gleixner
>
> Use the msi_msg shadow structs and compose the message with named
> bitfields instead of the unreadable macro maze.
>
> Signed-off-by: Thomas Gleixner
> Signed-off-by: David Woodhouse
> ---
> arch/x86/pci/xen.c | 26 +++++++++++---------------
> 1 file changed, 11 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/pci/xen.c b/arch/x86/pci/xen.c
> index c552cd2d0632..3d41a09c2c14 100644
> --- a/arch/x86/pci/xen.c
> +++ b/arch/x86/pci/xen.c
> @@ -152,7 +152,6 @@ static int acpi_register_gsi_xen(struct device *dev, u32 gsi,
>
>  #if defined(CONFIG_PCI_MSI)
>  #include
> -#include
>
>  struct xen_pci_frontend_ops *xen_pci_frontend;
>  EXPORT_SYMBOL_GPL(xen_pci_frontend);
> @@ -210,23 +209,20 @@ static int xen_setup_msi_irqs(struct pci_dev *dev, int nvec, int type)
>  	return ret;
>  }
>
> -#define XEN_PIRQ_MSI_DATA	(MSI_DATA_TRIGGER_EDGE | \
> -		MSI_DATA_LEVEL_ASSERT | (3 << 8) | MSI_DATA_VECTOR(0))
> -
>  static void xen_msi_compose_msg(struct pci_dev *pdev, unsigned int pirq,
>  				struct msi_msg *msg)
>  {
> -	/* We set vector == 0 to tell the hypervisor we don't care about it,
> -	 * but we want a pirq setup instead.
> -	 * We use the dest_id field to pass the pirq that we want. */
> -	msg->address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(pirq);
> -	msg->address_lo =
> -		MSI_ADDR_BASE_LO |
> -		MSI_ADDR_DEST_MODE_PHYSICAL |
> -		MSI_ADDR_REDIRECTION_CPU |
> -		MSI_ADDR_DEST_ID(pirq);
> -
> -	msg->data = XEN_PIRQ_MSI_DATA;
> +	/*
> +	 * We set vector == 0 to tell the hypervisor we don't care about
> +	 * it, but we want a pirq setup instead. We use the dest_id fields
> +	 * to pass the pirq that we want.
> +	 */
> +	memset(msg, 0, sizeof(*msg));
> +	msg->address_hi = X86_MSI_BASE_ADDRESS_HIGH;
> +	msg->arch_addr_hi.destid_8_31 = pirq >> 8;
> +	msg->arch_addr_lo.destid_0_7 = pirq & 0xFF;
> +	msg->arch_addr_lo.base_address = X86_MSI_BASE_ADDRESS_LOW;
> +	msg->arch_data.delivery_mode = APIC_DELIVERY_MODE_EXTINT;
>  }

Just looking at a random one of these patches...

Does the compiler manage to optimise that reasonably?
Or does it generate a lot of shifts and masks as each
bitfield is set?

The code generation for bitfields is often a lot worse
than that for |= setting bits in a word.

	David
Re: [PATCH v3 22/35] genirq/irqdomain: Implement get_name() method on irqchip fwnodes
Hi David,

nit: please use my kernel.org address for kernel related stuff.

On Sat, 24 Oct 2020 22:35:22 +0100, David Woodhouse wrote:
>
> From: David Woodhouse
>
> Prerequesite to make x86 more irqdomain compliant.

Prerequisite?

>
> Signed-off-by: David Woodhouse
> Signed-off-by: Thomas Gleixner
> ---
> kernel/irq/irqdomain.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index cf8b374b892d..673fa64c1c44 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -42,7 +42,16 @@ static inline void debugfs_add_domain_dir(struct irq_domain *d) { }
>  static inline void debugfs_remove_domain_dir(struct irq_domain *d) { }
>  #endif
>
> -const struct fwnode_operations irqchip_fwnode_ops;
> +static const char *irqchip_fwnode_get_name(const struct fwnode_handle *fwnode)
> +{
> +	struct irqchip_fwid *fwid = container_of(fwnode, struct irqchip_fwid, fwnode);
> +
> +	return fwid->name;
> +}
> +
> +const struct fwnode_operations irqchip_fwnode_ops = {
> +	.get_name = irqchip_fwnode_get_name,
> +};
>  EXPORT_SYMBOL_GPL(irqchip_fwnode_ops);
>
>  /**

Acked-by: Marc Zyngier

--
Without deviation from the norm, progress is not possible.
Re: [PATCH v3 00/35] Fix x2apic enablement and allow more CPUs, clean up I/OAPIC and MSI bitfields
On Sat, 2020-10-24 at 22:35 +0100, David Woodhouse wrote:
> Fix the conditions for enabling x2apic on guests without interrupt
> remapping, and support 15-bit Extended Destination ID to allow 32768
> CPUs without IR on hypervisors that support it.
>
> Make the I/OAPIC code generate its RTE directly from the MSI message
> created by the parent irqchip, and fix up a bunch of magic mask/shift
> macros to use bitfields for MSI messages and I/OAPIC RTEs while we're
> at it.

Forgot to mention (since I thought I'd posted it in a previous series)
that v3 also ditches irq_remapping_get_irq_domain() and some icky
special cases of hard-coding "x86_vector_domain", and makes HPET and
I/OAPIC use irq_find_matching_fwspec() to find their parent irqdomain.

> v3:
>  • Lots of bitfield cleanups from Thomas.
>  • Disable hyperv-iommu if 15-bit extension is present.
>  • Fix inconsistent CONFIG_PCI_MSI/CONFIG_GENERIC_MSI_IRQ in hpet.c
>  • Split KVM_FEATURE_MSI_EXT_DEST_ID patch, half of which is going
>    upstream through KVM tree (and the other half needs to wait, or
>    have an #ifdef) so is left at the top of the tree.
>
> v2:
>  • Minor cleanups.
>  • Move __irq_msi_compose_msg() to apic.c, make virt_ext_dest_id static.
>  • Generate I/OAPIC RTE directly from parent irqchip's MSI messages.
>  • Clean up HPET MSI support into hpet.c now that we can.