Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
>>> On 13.05.19 at 16:45, wrote:
> On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
>> >>> On 13.05.19 at 15:48, wrote:
>> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> >> --- a/xen/drivers/passthrough/vtd/iommu.c
>> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>> >>                               : NUMA_NO_NODE;
>> >>      const cpumask_t *cpumask = &cpu_online_map;
>> >> +    struct irq_desc *desc;
>> >> 
>> >>      if ( node < MAX_NUMNODES && node_online(node) &&
>> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>> >>          cpumask = &node_to_cpumask(node);
>> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> >> +
>> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> >> +    spin_lock_irq(&desc->lock);
>> >
>> > I would use the irqsave/irqrestore variants here for extra safety.
>>
>> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
>> ended up enabled at any of the two points where this function
>> gets called.
>
> I think I'm misreading the above, but if you expect
> adjust_irq_affinity to always be called with interrupts disabled using
> spin_unlock_irq is wrong as it unconditionally enables interrupts.

Oops - s/enabled/disabled/ in my earlier reply.

Jan

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
On Mon, May 13, 2019 at 08:19:04AM -0600, Jan Beulich wrote:
> >>> On 13.05.19 at 15:48, wrote:
> > On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> >> --- a/xen/drivers/passthrough/vtd/iommu.c
> >> +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
> >>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
> >>                               : NUMA_NO_NODE;
> >>      const cpumask_t *cpumask = &cpu_online_map;
> >> +    struct irq_desc *desc;
> >> 
> >>      if ( node < MAX_NUMNODES && node_online(node) &&
> >>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
> >>          cpumask = &node_to_cpumask(node);
> >> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> >> +
> >> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> >> +    spin_lock_irq(&desc->lock);
> >
> > I would use the irqsave/irqrestore variants here for extra safety.
>
> Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
> ended up enabled at any of the two points where this function
> gets called.

I think I'm misreading the above, but if you expect
adjust_irq_affinity to always be called with interrupts disabled using
spin_unlock_irq is wrong as it unconditionally enables interrupts.

Roger.
Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
>>> On 13.05.19 at 15:48, wrote:
> On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
>> --- a/xen/drivers/passthrough/vtd/iommu.c
>> +++ b/xen/drivers/passthrough/vtd/iommu.c
>> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>>                               : NUMA_NO_NODE;
>>      const cpumask_t *cpumask = &cpu_online_map;
>> +    struct irq_desc *desc;
>> 
>>      if ( node < MAX_NUMNODES && node_online(node) &&
>>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>>          cpumask = &node_to_cpumask(node);
>> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
>> +
>> +    desc = irq_to_desc(drhd->iommu->msi.irq);
>> +    spin_lock_irq(&desc->lock);
>
> I would use the irqsave/irqrestore variants here for extra safety.

Hmm, maybe. But I think we're in bigger trouble if IRQs indeed
ended up enabled at any of the two points where this function
gets called.

Jan
Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
On Wed, May 08, 2019 at 07:10:59AM -0600, Jan Beulich wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
>
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
>
> Drop one of the two leading underscores from all three functions at
> the same time.
>
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.
>
> Signed-off-by: Jan Beulich

Reviewed-by: Roger Pau Monné

> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -2134,11 +2134,16 @@ static void adjust_irq_affinity(struct a
>      unsigned int node = rhsa ? pxm_to_node(rhsa->proximity_domain)
>                               : NUMA_NO_NODE;
>      const cpumask_t *cpumask = &cpu_online_map;
> +    struct irq_desc *desc;
> 
>      if ( node < MAX_NUMNODES && node_online(node) &&
>           cpumask_intersects(&node_to_cpumask(node), cpumask) )
>          cpumask = &node_to_cpumask(node);
> -    dma_msi_set_affinity(irq_to_desc(drhd->iommu->msi.irq), cpumask);
> +
> +    desc = irq_to_desc(drhd->iommu->msi.irq);
> +    spin_lock_irq(&desc->lock);

I would use the irqsave/irqrestore variants here for extra safety.

Thanks, Roger.
Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
> From: Jan Beulich [mailto:jbeul...@suse.com]
> Sent: Wednesday, May 8, 2019 9:16 PM
>
> >>> On 08.05.19 at 15:10, wrote:
> > All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> > fields, and hence ought to be called with the descriptor lock held in
> > addition to vector_lock. This is currently the case for only
> > set_desc_affinity() (in the common case) and destroy_irq(), which also
> > clarifies what the nesting behavior between the locks has to be.
> > Reflect the new expectation by having these functions all take a
> > descriptor as parameter instead of an interrupt number.
> >
> > Also take care of the two special cases of calls to set_desc_affinity():
> > set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> > directly as well, and in these cases the descriptor locks hadn't got
> > acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> > releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
> >
> > Drop one of the two leading underscores from all three functions at
> > the same time.
> >
> > There's one case left where descriptors get manipulated with just
> > vector_lock held: setup_vector_irq() assumes its caller to acquire
> > vector_lock, and hence can't itself acquire the descriptor locks (wrong
> > lock order). I don't currently see how to address this.
> >
> > Signed-off-by: Jan Beulich
> > ---
> > v2: Also adjust set_ioapic_affinity_irq() and VT-d's
> >     dma_msi_set_affinity().
>
> I'm sorry, Kevin, I should have Cc-ed you on this one.

Reviewed-by: Kevin Tian for vtd part.
Re: [Xen-devel] [PATCH v2 07/12] x86/IRQ: fix locking around vector management
>>> On 08.05.19 at 15:10, wrote:
> All of __{assign,bind,clear}_irq_vector() manipulate struct irq_desc
> fields, and hence ought to be called with the descriptor lock held in
> addition to vector_lock. This is currently the case for only
> set_desc_affinity() (in the common case) and destroy_irq(), which also
> clarifies what the nesting behavior between the locks has to be.
> Reflect the new expectation by having these functions all take a
> descriptor as parameter instead of an interrupt number.
>
> Also take care of the two special cases of calls to set_desc_affinity():
> set_ioapic_affinity_irq() and VT-d's dma_msi_set_affinity() get called
> directly as well, and in these cases the descriptor locks hadn't got
> acquired till now. For set_ioapic_affinity_irq() this means acquiring /
> releasing of the IO-APIC lock can be plain spin_{,un}lock() then.
>
> Drop one of the two leading underscores from all three functions at
> the same time.
>
> There's one case left where descriptors get manipulated with just
> vector_lock held: setup_vector_irq() assumes its caller to acquire
> vector_lock, and hence can't itself acquire the descriptor locks (wrong
> lock order). I don't currently see how to address this.
>
> Signed-off-by: Jan Beulich
> ---
> v2: Also adjust set_ioapic_affinity_irq() and VT-d's
>     dma_msi_set_affinity().

I'm sorry, Kevin, I should have Cc-ed you on this one.
Jan

> --- a/xen/arch/x86/io_apic.c
> +++ b/xen/arch/x86/io_apic.c
> @@ -550,14 +550,14 @@ static void clear_IO_APIC (void)
>  static void
>  set_ioapic_affinity_irq(struct irq_desc *desc, const cpumask_t *mask)
>  {
> -    unsigned long flags;
>      unsigned int dest;
>      int pin, irq;
>      struct irq_pin_list *entry;
> 
>      irq = desc->irq;
> 
> -    spin_lock_irqsave(&ioapic_lock, flags);
> +    spin_lock(&ioapic_lock);
> +
>      dest = set_desc_affinity(desc, mask);
>      if (dest != BAD_APICID) {
>          if ( !x2apic_enabled )
> @@ -580,8 +580,8 @@ set_ioapic_affinity_irq(struct irq_desc
>              entry = irq_2_pin + entry->next;
>          }
>      }
> -    spin_unlock_irqrestore(&ioapic_lock, flags);
> 
> +    spin_unlock(&ioapic_lock);
>  }
> 
>  /*
> @@ -674,16 +674,19 @@ void /*__init*/ setup_ioapic_dest(void)
>      for (ioapic = 0; ioapic < nr_ioapics; ioapic++) {
>          for (pin = 0; pin < nr_ioapic_entries[ioapic]; pin++) {
>              struct irq_desc *desc;
> +            unsigned long flags;
> 
>              irq_entry = find_irq_entry(ioapic, pin, mp_INT);
>              if (irq_entry == -1)
>                  continue;
>              irq = pin_2_irq(irq_entry, ioapic, pin);
>              desc = irq_to_desc(irq);
> +
> +            spin_lock_irqsave(&desc->lock, flags);
>              BUG_ON(!cpumask_intersects(desc->arch.cpu_mask,
>                                         &cpu_online_map));
>              set_ioapic_affinity_irq(desc, desc->arch.cpu_mask);
> +            spin_unlock_irqrestore(&desc->lock, flags);
>          }
> -
>      }
>  }
> 
> --- a/xen/arch/x86/irq.c
> +++ b/xen/arch/x86/irq.c
> @@ -27,6 +27,7 @@
>  #include
>  #include
> 
>  static int parse_irq_vector_map_param(const char *s);
> +static void _clear_irq_vector(struct irq_desc *desc);
> 
>  /* opt_noirqbalance: If true, software IRQ balancing/affinity is disabled.
>   */
>  bool __read_mostly opt_noirqbalance;
> @@ -120,13 +121,12 @@ static void trace_irq_mask(uint32_t even
>      trace_var(event, 1, sizeof(d), &d);
>  }
> 
> -static int __init __bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
> +static int __init _bind_irq_vector(struct irq_desc *desc, int vector,
> +                                   const cpumask_t *cpu_mask)
>  {
>      cpumask_t online_mask;
>      int cpu;
> -    struct irq_desc *desc = irq_to_desc(irq);
> 
> -    BUG_ON((unsigned)irq >= nr_irqs);
>      BUG_ON((unsigned)vector >= NR_VECTORS);
> 
>      cpumask_and(&online_mask, cpu_mask, &cpu_online_map);
> @@ -137,9 +137,9 @@ static int __init __bind_irq_vector(int
>          return 0;
>      if ( desc->arch.vector != IRQ_VECTOR_UNASSIGNED )
>          return -EBUSY;
> -    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, irq, vector, &online_mask);
> +    trace_irq_mask(TRC_HW_IRQ_BIND_VECTOR, desc->irq, vector, &online_mask);
>      for_each_cpu(cpu, &online_mask)
> -        per_cpu(vector_irq, cpu)[vector] = irq;
> +        per_cpu(vector_irq, cpu)[vector] = desc->irq;
>      desc->arch.vector = vector;
>      cpumask_copy(desc->arch.cpu_mask, &online_mask);
>      if ( desc->arch.used_vectors )
> @@ -153,12 +153,18 @@ static int __init __bind_irq_vector(int
> 
>  int __init bind_irq_vector(int irq, int vector, const cpumask_t *cpu_mask)
>  {
> +    struct irq_desc *desc = irq_to_desc(irq);
>      unsigned long flags;
>      int ret;
> 
> -    spin_lock_irqsave(&vector_lock, flags);
> -    ret = __bind_irq_vector(irq, vector, cpu_mask);
> -    spin_unlock_irqrestore(&vector_lock, flags);
> +    BUG_ON((unsigned)irq >= nr_irqs);
> +
> +    spin_lock_irqsave(&desc->lock, flags);
> +    spin_lock(&vector_lock);
> +    ret = _bind_irq_vector(desc, vector, cpu_mask);
>