Re: [Xen-devel] [PATCHv1 1/2] passthrough: use per-interrupt lock when injecting an interrupt
On Thu, Oct 29, 2015 at 03:11:39AM -0600, Jan Beulich wrote:
> >>> On 28.10.15 at 21:18, wrote:
> >> > @@ -481,6 +489,8 @@ int pt_irq_destroy_bind(
> >> >      pirq = pirq_info(d, machine_gsi);
> >> >      pirq_dpci = pirq_dpci(pirq);
> >> >
> >> > +    spin_lock(&pirq_dpci->lock);
> >>
> >> Considering that code further down in this function checks
> >> pirq_dpci to be non-NULL, this doesn't look safe (or else those
> >> checks should go away, as after this addition they would be
> >> likely to trigger e.g. Coverity warnings).
> >
> > ? The checks are for pirq_dpci->dom.
>
> What about
>
>     /* clear the mirq info */
>     if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )
>
> and
>
>     if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) &&
>          list_empty(&pirq_dpci->digl_list) )

Yes. I was looking at the code of pt_irq_create_bind, not pt_irq_destroy_bind!

> ? In fact I can't spot any access to pirq_dpci->dom in this function.
>
> >> > @@ -675,7 +687,7 @@ static void hvm_dirq_assist(struct domain *d, struct
> >> > hvm_pirq_dpci *pirq_dpci)
> >> >  {
> >> >      ASSERT(d->arch.hvm_domain.irq.dpci);
> >> >
> >> > -    spin_lock(&d->event_lock);
> >> > +    spin_lock(&pirq_dpci->lock);
> >> >      if ( test_and_clear_bool(pirq_dpci->masked) )
> >> >      {
> >> >          struct pirq *pirq = dpci_pirq(pirq_dpci);
> >>
> >> Along the same lines - it's not obvious that the uses of pirq here are
> >> safe with event_lock no longer held. In particular I wonder about
> >> send_guest_pirq() - despite the other use in __do_IRQ_guest() not
> >> doing any locking either I'm not convinced this is correct.
> >
> > It seems that the event channel mechanism only uses the event channel
> > lock when expanding and initializing (FIFO). For the old mechanism
> > it was for binding, closing (uhuh), status, reset, and set_priority.
>
> Well, the event lock is also used for some pIRQ management iirc.

Correct - pirq_guest_bind, pirq_guest_unbind, map_domain_pirq,
map_domain_emuirq_pirq, unmap_domain_pirq_emuirq, and so on.

> Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel
Re: [Xen-devel] [PATCHv1 1/2] passthrough: use per-interrupt lock when injecting an interrupt
>>> On 28.10.15 at 21:18, wrote:
>> > @@ -481,6 +489,8 @@ int pt_irq_destroy_bind(
>> >      pirq = pirq_info(d, machine_gsi);
>> >      pirq_dpci = pirq_dpci(pirq);
>> >
>> > +    spin_lock(&pirq_dpci->lock);
>>
>> Considering that code further down in this function checks
>> pirq_dpci to be non-NULL, this doesn't look safe (or else those
>> checks should go away, as after this addition they would be
>> likely to trigger e.g. Coverity warnings).
>
> ? The checks are for pirq_dpci->dom.

What about

    /* clear the mirq info */
    if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) )

and

    if ( pirq_dpci && (pirq_dpci->flags & HVM_IRQ_DPCI_MAPPED) &&
         list_empty(&pirq_dpci->digl_list) )

? In fact I can't spot any access to pirq_dpci->dom in this function.

>> > @@ -675,7 +687,7 @@ static void hvm_dirq_assist(struct domain *d, struct
>> > hvm_pirq_dpci *pirq_dpci)
>> >  {
>> >      ASSERT(d->arch.hvm_domain.irq.dpci);
>> >
>> > -    spin_lock(&d->event_lock);
>> > +    spin_lock(&pirq_dpci->lock);
>> >      if ( test_and_clear_bool(pirq_dpci->masked) )
>> >      {
>> >          struct pirq *pirq = dpci_pirq(pirq_dpci);
>>
>> Along the same lines - it's not obvious that the uses of pirq here are
>> safe with event_lock no longer held. In particular I wonder about
>> send_guest_pirq() - despite the other use in __do_IRQ_guest() not
>> doing any locking either I'm not convinced this is correct.
>
> It seems that the event channel mechanism only uses the event channel
> lock when expanding and initializing (FIFO). For the old mechanism
> it was for binding, closing (uhuh), status, reset, and set_priority.

Well, the event lock is also used for some pIRQ management iirc.

Jan
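[Editorial note: the review above leans on test_and_clear_bool() reading and clearing the masked flag in a single atomic step, which is what makes the flag safe to clear without the per-domain event_lock. A minimal sketch of those semantics, using C11 atomics rather than Xen's xchg-based helper:]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Returns the old value and clears the flag in one atomic operation, so
 * exactly one of any number of concurrent callers observes 'true'. */
static bool test_and_clear_bool(atomic_bool *b)
{
    return atomic_exchange(b, false);
}
```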
Re: [Xen-devel] [PATCHv1 1/2] passthrough: use per-interrupt lock when injecting an interrupt
.snip..
> > @@ -481,6 +489,8 @@ int pt_irq_destroy_bind(
> >      pirq = pirq_info(d, machine_gsi);
> >      pirq_dpci = pirq_dpci(pirq);
> >
> > +    spin_lock(&pirq_dpci->lock);
>
> Considering that code further down in this function checks
> pirq_dpci to be non-NULL, this doesn't look safe (or else those
> checks should go away, as after this addition they would be
> likely to trigger e.g. Coverity warnings).

? The checks are for pirq_dpci->dom.

> > @@ -621,7 +633,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
> >      return 1;
> >  }
> >
> > -/* called with d->event_lock held */
> > +/* called with pirq_dpci->lock held */
> >  static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)
>
> The fact that this is a safe change to the locking model imo needs
> to be spelled out explicitly in the commit message. Afaict it's safe
> only because desc_guest_eoi() uses pirq for nothing else than to
> (atomically!) clear pirq->masked.
>
> > @@ -675,7 +687,7 @@ static void hvm_dirq_assist(struct domain *d, struct
> > hvm_pirq_dpci *pirq_dpci)
> >  {
> >      ASSERT(d->arch.hvm_domain.irq.dpci);
> >
> > -    spin_lock(&d->event_lock);
> > +    spin_lock(&pirq_dpci->lock);
> >      if ( test_and_clear_bool(pirq_dpci->masked) )
> >      {
> >          struct pirq *pirq = dpci_pirq(pirq_dpci);
>
> Along the same lines - it's not obvious that the uses of pirq here are
> safe with event_lock no longer held. In particular I wonder about
> send_guest_pirq() - despite the other use in __do_IRQ_guest() not
> doing any locking either I'm not convinced this is correct.

It seems that the event channel mechanism only uses the event channel
lock when expanding and initializing (FIFO). For the old mechanism
it was for binding, closing (uhuh), status, reset, and set_priority.

The 'closing' case is interesting. What if the guest closes the port
while we are about to notify it of an interrupt (i.e. inside
hvm_dirq_assist)? evtchn_close takes the event_lock, calls
pirq_cleanup_check, which calls pt_pirq_cleanup_check. We check for
STATE_RUN (still set, as dpci_softirq hasn't finished calling
hvm_dirq_assist), so it returns zero and does not call
radix_tree_delete. Then evtchn_close goes on to unlink_pirq_port and
follows up with unmap_domain_pirq_emuirq. That sets emuirq to
IRQ_UNBOUND.

OK, now hvm_dirq_assist is inside the first 'if' and checks:

 684     if ( hvm_domain_use_pirq(d, pirq) )

which would have been true, but is now false (as emuirq is now
IRQ_UNBOUND). Let's also assume this is for an MSI. pt_irq_need_timer
returns 0:

 142     return !(flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_TRANSLATE));

which hits this:

 723     ASSERT(pt_irq_need_timer(pirq_dpci->flags));

and blows up. If we are not compiled with 'debug=y' we would instead
set a timer which 8 ms later calls pt_irq_time_out. That clears
STATE_RUN and the softirq is done.

So let's switch back to evtchn_close, which unlocks the event_lock and
is done. pt_irq_time_out is pretty much a nop - it takes the
event_lock, does not iterate over the list (this is an MSI, so there
are no legacy interrupts), and calls pt_irq_guest_eoi over the radix
tree - and pt_irq_guest_eoi does nothing as
_HVM_IRQ_DPCI_EOI_LATCH_SHIFT is not set.

I think that is the only issue - when using the legacy event channels
and the guest fiddling with the pirq. I suppose that means anybody
using unmap_domain_pirq_emuirq can hit this ASSERT(). That means:

 arch_domain_soft_reset
 physdev_unmap_pirq
 evtchn_close

Though you could 'fix' this by doing:

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index bda9374..6ea59db 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -720,8 +720,8 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
          * guest will never deal with the irq, then the physical interrupt line
          * will never be deasserted.
          */
-        ASSERT(pt_irq_need_timer(pirq_dpci->flags));
-        set_timer(&pirq_dpci->timer, NOW() + PT_IRQ_TIME_OUT);
+        if ( pt_irq_need_timer(pirq_dpci->flags) )
+            set_timer(&pirq_dpci->timer, NOW() + PT_IRQ_TIME_OUT);
     }
     spin_unlock(&d->event_lock);
 }

> Jan
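[Editorial note: the race above turns on pt_irq_need_timer() returning 0 for MSI, so the ASSERT fires once the emuirq binding vanishes underneath hvm_dirq_assist. A compilable sketch of that check and the proposed ASSERT-to-if conversion - the flag values here are made up for the sketch; only the logic matches the io.c line 142 quoted above:]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; Xen defines its own. */
#define HVM_IRQ_DPCI_GUEST_MSI  (1u << 0)
#define HVM_IRQ_DPCI_TRANSLATE  (1u << 1)

/* Logic as quoted from io.c:142 - only non-MSI, non-translated
 * interrupts need the EOI timeout timer. */
static bool pt_irq_need_timer(unsigned int flags)
{
    return !(flags & (HVM_IRQ_DPCI_GUEST_MSI | HVM_IRQ_DPCI_TRANSLATE));
}

/* The proposed fix: guard the timer instead of asserting, so an MSI
 * whose emuirq binding was torn down concurrently no longer trips an
 * ASSERT.  Returns whether the timer would have been armed. */
static bool maybe_set_timer(unsigned int flags)
{
    if ( pt_irq_need_timer(flags) )
        return true;   /* would call set_timer() here */
    return false;      /* MSI/translated: no timer, and no crash either */
}
```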
Re: [Xen-devel] [PATCHv1 1/2] passthrough: use per-interrupt lock when injecting an interrupt
>>> On 23.10.15 at 13:05, wrote:
> The use of the per-domain event_lock in hvm_dirq_assist() does not scale
> with many VCPUs or interrupts.
>
> Add a per-interrupt lock to reduce contention. When an interrupt for a
> passthrough device is being set up or torn down, we must take both
> the event_lock and this new lock.
>
> Signed-off-by: David Vrabel

Konrad, could you please take a look too?

> @@ -245,11 +246,11 @@ int pt_irq_create_bind(
>      * would have spun forever and would do the same thing (wait to flush out
>      * outstanding hvm_dirq_assist calls.
>      */
> -    if ( pt_pirq_softirq_active(pirq_dpci) )
> +    while ( pt_pirq_softirq_active(pirq_dpci) )
>      {
> -        spin_unlock(&d->event_lock);
> +        spin_unlock(&pirq_dpci->lock);
>          cpu_relax();
> -        goto restart;
> +        spin_lock(&pirq_dpci->lock);
>      }

The comment ahead of pt_pirq_softirq_active() (mentioning event_lock)
will thus need updating.

> @@ -481,6 +489,8 @@ int pt_irq_destroy_bind(
>      pirq = pirq_info(d, machine_gsi);
>      pirq_dpci = pirq_dpci(pirq);
>
> +    spin_lock(&pirq_dpci->lock);

Considering that code further down in this function checks
pirq_dpci to be non-NULL, this doesn't look safe (or else those
checks should go away, as after this addition they would be
likely to trigger e.g. Coverity warnings).

> @@ -621,7 +633,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
>      return 1;
>  }
>
> -/* called with d->event_lock held */
> +/* called with pirq_dpci->lock held */
>  static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)

The fact that this is a safe change to the locking model imo needs
to be spelled out explicitly in the commit message. Afaict it's safe
only because desc_guest_eoi() uses pirq for nothing else than to
(atomically!) clear pirq->masked.

> @@ -675,7 +687,7 @@ static void hvm_dirq_assist(struct domain *d, struct
> hvm_pirq_dpci *pirq_dpci)
>  {
>      ASSERT(d->arch.hvm_domain.irq.dpci);
>
> -    spin_lock(&d->event_lock);
> +    spin_lock(&pirq_dpci->lock);
>      if ( test_and_clear_bool(pirq_dpci->masked) )
>      {
>          struct pirq *pirq = dpci_pirq(pirq_dpci);

Along the same lines - it's not obvious that the uses of pirq here are
safe with event_lock no longer held. In particular I wonder about
send_guest_pirq() - despite the other use in __do_IRQ_guest() not
doing any locking either I'm not convinced this is correct.

Jan
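[Editorial note: Jan's first objection is an ordering hazard - the added spin_lock() dereferences pirq_dpci before the function's existing NULL checks run. A toy sketch of the safe ordering (check first, then lock), using simplified stand-in types rather than Xen's:]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical simplified stand-in for struct hvm_pirq_dpci. */
struct toy_dpci {
    pthread_mutex_t lock;
    unsigned int flags;
};

#define TOY_MAPPED 0x1u   /* made-up flag, cf. HVM_IRQ_DPCI_MAPPED */

/* Safe ordering: the NULL/flags check must precede the lock
 * acquisition, otherwise a NULL pirq_dpci is dereferenced by the
 * spin_lock() itself.  Returns 1 if a binding was torn down. */
int toy_destroy_bind(struct toy_dpci *pirq_dpci)
{
    if ( !pirq_dpci || !(pirq_dpci->flags & TOY_MAPPED) )
        return 0;                     /* nothing bound - nothing to do */

    pthread_mutex_lock(&pirq_dpci->lock);
    pirq_dpci->flags &= ~TOY_MAPPED;  /* tear down the binding */
    pthread_mutex_unlock(&pirq_dpci->lock);
    return 1;
}
```

The alternative Jan hints at - locking unconditionally and dropping the later NULL checks - only works if the caller can guarantee pirq_dpci is never NULL on this path.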
[Xen-devel] [PATCHv1 1/2] passthrough: use per-interrupt lock when injecting an interrupt
The use of the per-domain event_lock in hvm_dirq_assist() does not scale
with many VCPUs or interrupts.

Add a per-interrupt lock to reduce contention. When an interrupt for a
passthrough device is being set up or torn down, we must take both
the event_lock and this new lock.

Signed-off-by: David Vrabel
---
 xen/drivers/passthrough/io.c | 34 +++++++++++++++++++++-----------
 xen/include/xen/hvm/irq.h    |  1 +
 2 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/xen/drivers/passthrough/io.c b/xen/drivers/passthrough/io.c
index bda9374..7c86c20 100644
--- a/xen/drivers/passthrough/io.c
+++ b/xen/drivers/passthrough/io.c
@@ -106,7 +106,7 @@ static void pt_pirq_softirq_reset(struct hvm_pirq_dpci *pirq_dpci)
 {
     struct domain *d = pirq_dpci->dom;
 
-    ASSERT(spin_is_locked(&d->event_lock));
+    ASSERT(spin_is_locked(&pirq_dpci->lock));
 
     switch ( cmpxchg(&pirq_dpci->state, 1 << STATE_SCHED, 0) )
     {
@@ -209,7 +209,6 @@ int pt_irq_create_bind(
     if ( pirq < 0 || pirq >= d->nr_pirqs )
         return -EINVAL;
 
- restart:
     spin_lock(&d->event_lock);
 
     hvm_irq_dpci = domain_get_irq_dpci(d);
@@ -237,6 +236,8 @@ int pt_irq_create_bind(
     }
     pirq_dpci = pirq_dpci(info);
 
+    spin_lock(&pirq_dpci->lock);
+
     /*
      * A crude 'while' loop with us dropping the spinlock and giving
      * the softirq_dpci a chance to run.
@@ -245,11 +246,11 @@ int pt_irq_create_bind(
      * would have spun forever and would do the same thing (wait to flush out
      * outstanding hvm_dirq_assist calls.
      */
-    if ( pt_pirq_softirq_active(pirq_dpci) )
+    while ( pt_pirq_softirq_active(pirq_dpci) )
     {
-        spin_unlock(&d->event_lock);
+        spin_unlock(&pirq_dpci->lock);
         cpu_relax();
-        goto restart;
+        spin_lock(&pirq_dpci->lock);
     }
 
     switch ( pt_irq_bind->irq_type )
@@ -301,6 +302,7 @@ int pt_irq_create_bind(
                 pirq_dpci->dom = NULL;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
+                spin_unlock(&pirq_dpci->lock);
                 spin_unlock(&d->event_lock);
                 return rc;
             }
@@ -311,6 +313,7 @@ int pt_irq_create_bind(
 
             if ( (pirq_dpci->flags & mask) != mask )
             {
+                spin_unlock(&pirq_dpci->lock);
                 spin_unlock(&d->event_lock);
                 return -EBUSY;
             }
@@ -331,6 +334,7 @@ int pt_irq_create_bind(
         dest_mode = !!(pirq_dpci->gmsi.gflags & VMSI_DM_MASK);
         dest_vcpu_id = hvm_girq_dest_2_vcpu_id(d, dest, dest_mode);
         pirq_dpci->gmsi.dest_vcpu_id = dest_vcpu_id;
+        spin_unlock(&pirq_dpci->lock);
         spin_unlock(&d->event_lock);
         if ( dest_vcpu_id >= 0 )
             hvm_migrate_pirqs(d->vcpu[dest_vcpu_id]);
@@ -351,6 +355,7 @@ int pt_irq_create_bind(
 
         if ( !digl || !girq )
         {
+            spin_unlock(&pirq_dpci->lock);
             spin_unlock(&d->event_lock);
             xfree(girq);
             xfree(digl);
@@ -412,6 +417,7 @@ int pt_irq_create_bind(
                 hvm_irq_dpci->link_cnt[link]--;
                 pirq_dpci->flags = 0;
                 pirq_cleanup_check(info, d);
+                spin_unlock(&pirq_dpci->lock);
                 spin_unlock(&d->event_lock);
                 xfree(girq);
                 xfree(digl);
@@ -419,6 +425,7 @@ int pt_irq_create_bind(
             }
         }
 
+        spin_unlock(&pirq_dpci->lock);
         spin_unlock(&d->event_lock);
 
         if ( iommu_verbose )
@@ -430,6 +437,7 @@ int pt_irq_create_bind(
     }
 
     default:
+        spin_unlock(&pirq_dpci->lock);
         spin_unlock(&d->event_lock);
         return -EOPNOTSUPP;
     }
@@ -481,6 +489,8 @@ int pt_irq_destroy_bind(
     pirq = pirq_info(d, machine_gsi);
     pirq_dpci = pirq_dpci(pirq);
 
+    spin_lock(&pirq_dpci->lock);
+
     if ( pt_irq_bind->irq_type != PT_IRQ_TYPE_MSI )
     {
         unsigned int bus = pt_irq_bind->u.pci.bus;
@@ -549,6 +559,7 @@ int pt_irq_destroy_bind(
         pirq_cleanup_check(pirq, d);
     }
 
+    spin_unlock(&pirq_dpci->lock);
     spin_unlock(&d->event_lock);
 
     if ( what && iommu_verbose )
@@ -566,6 +577,7 @@ int pt_irq_destroy_bind(
 
 void pt_pirq_init(struct domain *d, struct hvm_pirq_dpci *dpci)
 {
+    spin_lock_init(&dpci->lock);
     INIT_LIST_HEAD(&dpci->digl_list);
     dpci->gmsi.dest_vcpu_id = -1;
 }
@@ -621,7 +633,7 @@ int hvm_do_IRQ_dpci(struct domain *d, struct pirq *pirq)
     return 1;
 }
 
-/* called with d->event_lock held */
+/* called with pirq_dpci->lock held */
 static void __msi_pirq_eoi(struct hvm_pirq_dpci *pirq_dpci)
 {
     irq_desc_t *desc;
@@ -675,7 +687,7 @@ static void hvm_dirq_assist(struct domain *d, struct hvm_pirq_dpci *pirq_dpci)
 {
     ASSERT(d->arch.hvm_domain.irq.dpci);
 
-    spin_lock(&d->event_lock);
+    spin_lock(&pirq_dpci->lock);
     if ( test_and_clear_bool(pirq_dpci->masked) )
     {
         struct pirq *pirq = dpci_pirq(pirq_dpci);
@@ -687,7 +699,7 @@ static void hvm_dirq_assist(struct domain
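[Editorial note: two mechanisms in the patch above lend themselves to a compilable sketch - the new wait loop (drop only the per-interrupt lock, cpu_relax(), re-take it, instead of 'goto restart' re-taking the event_lock) and pt_pirq_softirq_reset()'s cmpxchg, which cancels only a scheduled-but-not-yet-running softirq. This is a toy model with C11 atomics and pthreads; the types and STATE_* values are simplified stand-ins:]

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

enum { STATE_SCHED = 0, STATE_RUN = 1 };   /* bit positions, as in Xen */

struct toy_dpci {
    pthread_mutex_t lock;                  /* per-interrupt lock */
    atomic_uint state;                     /* softirq state bits */
};

/* cf. pt_pirq_softirq_active(): any state bit set means work pending. */
static int softirq_active(struct toy_dpci *p)
{
    return atomic_load(&p->state) != 0;
}

/* The patch's wait loop: drop only the per-interrupt lock, give the
 * softirq a chance to run, re-take the lock.  Returns with lock held. */
void wait_for_softirq(struct toy_dpci *p)
{
    pthread_mutex_lock(&p->lock);
    while ( softirq_active(p) )
    {
        pthread_mutex_unlock(&p->lock);
        sched_yield();                     /* stand-in for cpu_relax() */
        pthread_mutex_lock(&p->lock);
    }
}

/* cf. pt_pirq_softirq_reset()'s cmpxchg: only a scheduled-but-not-yet-
 * running softirq (state == 1 << STATE_SCHED) can be cancelled to 0;
 * a running one (STATE_RUN set) must be waited out instead. */
int softirq_reset(struct toy_dpci *p)
{
    unsigned int expect = 1u << STATE_SCHED;
    return atomic_compare_exchange_strong(&p->state, &expect, 0);
}
```

The cmpxchg is what makes the wait loop terminate: either the reset cancels the scheduled softirq outright, or the softirq runs and clears its own state, letting the loop fall through.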