Konrad Rzeszutek Wilk <[email protected]> writes:

> On Thu, Aug 01, 2013 at 11:24:01PM +0900, Jonghwan Choi wrote:
>> From: David Vrabel <[email protected]>
>> 
>> This patch looks like it should be in the 3.10-stable tree, should we apply
>> it?
>
> Yes please!
>
> Thank you for catching that. If it would be possible to backport to
> earlier version of Linux that would be super!

Great, thanks.  I'll queue it for the 3.5 kernel.

Cheers,
-- 
Luis


>> 
>> ------------------
>> 
>> From: "David Vrabel <[email protected]>"
>> 
>> commit 179fbd5a45f0d4034cc6fd37b8d367a3b79663c4 upstream
>> 
>> Unbinding an event channel (either with the ioctl or when the evtchn
>> device is closed) may deadlock because disable_irq() is called with
>> port_user_lock held which is also locked by the interrupt handler.
>> 
>> Think of the IOCTL_EVTCHN_UNBIND is being serviced, the routine has
>> just taken the lock, and an interrupt happens. The evtchn_interrupt
>> is invoked, tries to take the lock and spins forever.
>> 
>> A quick glance at the code shows that the spinlock is a local IRQ
>> variant. Unfortunately that does not help as "disable_irq() waits for
>> the interrupt handler on all CPUs to stop running.  If the irq occurs
>> on another VCPU, it tries to take port_user_lock and can't because
>> the unbind ioctl is holding it." (from David). Hence we cannot
>> depend on the said spinlock to protect us. We could make it a system
>> wide IRQ disable spinlock but there is a better way.
>> 
>> We can piggyback on the fact that the existence of the spinlock is
>> to make get_port_user() checks be up-to-date. And we can alter those
>> checks to not depend on the spin lock (as it's protected by u->bind_mutex
>> in the ioctl) and can remove the unnecessary locking (this is
>> IOCTL_EVTCHN_UNBIND) path.
>> 
>> In the interrupt handler we cannot use the mutex, but we do not
>> need it.
>> 
>> "The unbind disables the irq before making the port user stale, so when
>> you clear it you are guaranteed that the interrupt handler that might
>> use that port cannot be running." (from David).
>> 
>> Hence this patch removes the spinlock usage on the teardown path
>> and piggybacks on disable_irq happening before we muck with the
>> get_port_user() data. This ensures that the interrupt handler will
>> never run on stale data.
>> 
>> Signed-off-by: David Vrabel <[email protected]>
>> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]>
>> [v1: Expanded the commit description a bit]
>> Signed-off-by: Jonghwan Choi <[email protected]>
>> ---
>>  drivers/xen/evtchn.c | 21 ++-------------------
>>  1 file changed, 2 insertions(+), 19 deletions(-)
>> 
>> diff --git a/drivers/xen/evtchn.c b/drivers/xen/evtchn.c
>> index 45c8efa..34924fb 100644
>> --- a/drivers/xen/evtchn.c
>> +++ b/drivers/xen/evtchn.c
>> @@ -377,18 +377,12 @@ static long evtchn_ioctl(struct file *file,
>>              if (unbind.port >= NR_EVENT_CHANNELS)
>>                      break;
>>  
>> -            spin_lock_irq(&port_user_lock);
>> -
>>              rc = -ENOTCONN;
>> -            if (get_port_user(unbind.port) != u) {
>> -                    spin_unlock_irq(&port_user_lock);
>> +            if (get_port_user(unbind.port) != u)
>>                      break;
>> -            }
>>  
>>              disable_irq(irq_from_evtchn(unbind.port));
>>  
>> -            spin_unlock_irq(&port_user_lock);
>> -
>>              evtchn_unbind_from_user(u, unbind.port);
>>  
>>              rc = 0;
>> @@ -488,26 +482,15 @@ static int evtchn_release(struct inode *inode, struct 
>> file *filp)
>>      int i;
>>      struct per_user_data *u = filp->private_data;
>>  
>> -    spin_lock_irq(&port_user_lock);
>> -
>> -    free_page((unsigned long)u->ring);
>> -
>>      for (i = 0; i < NR_EVENT_CHANNELS; i++) {
>>              if (get_port_user(i) != u)
>>                      continue;
>>  
>>              disable_irq(irq_from_evtchn(i));
>> -    }
>> -
>> -    spin_unlock_irq(&port_user_lock);
>> -
>> -    for (i = 0; i < NR_EVENT_CHANNELS; i++) {
>> -            if (get_port_user(i) != u)
>> -                    continue;
>> -
>>              evtchn_unbind_from_user(get_port_user(i), i);
>>      }
>>  
>> +    free_page((unsigned long)u->ring);
>>      kfree(u->name);
>>      kfree(u);
>>  
>> -- 
>> 1.8.1.2
>> 
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to [email protected]
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe stable" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to