RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > -Original Message-
> > From: Thomas Gleixner [mailto:t...@linutronix.de]
> > Sent: Friday, February 21, 2014 7:53 PM
> > To: Liu, Chuansheng
> > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> > wait-forever
> > 
> > On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > > > > I think you have a point there, but not on x86 wherre the atomic_dec
> > > > > > and the spinlock on the queueing side are full barriers. For non-x86
> > > > > > there is definitely a potential issue.
> > > > > >
> > > > > But even on X86, spin_unlock has no full barrier, the following 
> > > > > scenario:
> > > > > CPU0   CPU1
> > > > > spin_lock
> > > > >atomic_dec_and_test
> > > > > insert into queue
> > > > > spin_unlock
> > > > >checking waitqueue_active
> > > >
> > > > But CPU0 sees the 0, right?
> > > Not be clear here:)
> > > The atomic_read has no barrier.
> > >
> > > Found commit 6cb2a21049b89 has one similar smp_mb() calling before
> > > waitqueue_active() on one X86 CPU.
> > 
> > Indeed, you are completely right. Great detective work!
> Thanks your encouraging.
> 
> > 
> > I'm inclined to remove the waitqueue_active() alltogether. It's
> > creating more headache than it's worth.
> If I am understanding well, removing the checking of waitqueue_active(),
> and call wakeup() directly which will check list with spinlock protection.

Correct.
 
> If so, I can prepare one patch for it:)

Appreciated.

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, February 21, 2014 7:53 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> wait-forever
> 
> On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > > > I think you have a point there, but not on x86 wherre the atomic_dec
> > > > > and the spinlock on the queueing side are full barriers. For non-x86
> > > > > there is definitely a potential issue.
> > > > >
> > > > But even on X86, spin_unlock has no full barrier, the following 
> > > > scenario:
> > > > CPU0   CPU1
> > > > spin_lock
> > > >atomic_dec_and_test
> > > > insert into queue
> > > > spin_unlock
> > > >checking waitqueue_active
> > >
> > > But CPU0 sees the 0, right?
> > Not be clear here:)
> > The atomic_read has no barrier.
> >
> > Found commit 6cb2a21049b89 has one similar smp_mb() calling before
> > waitqueue_active() on one X86 CPU.
> 
> Indeed, you are completely right. Great detective work!
Thanks your encouraging.

> 
> I'm inclined to remove the waitqueue_active() alltogether. It's
> creating more headache than it's worth.
If I am understanding well, removing the checking of waitqueue_active(),
and call wakeup() directly which will check list with spinlock protection.

If so, I can prepare one patch for it:)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > > I think you have a point there, but not on x86 wherre the atomic_dec
> > > > and the spinlock on the queueing side are full barriers. For non-x86
> > > > there is definitely a potential issue.
> > > >
> > > But even on X86, spin_unlock has no full barrier, the following scenario:
> > > CPU0   CPU1
> > > spin_lock
> > >atomic_dec_and_test
> > > insert into queue
> > > spin_unlock
> > >checking waitqueue_active
> > 
> > But CPU0 sees the 0, right?
> Not be clear here:)
> The atomic_read has no barrier.
> 
> Found commit 6cb2a21049b89 has one similar smp_mb() calling before
> waitqueue_active() on one X86 CPU.

Indeed, you are completely right. Great detective work!

I'm inclined to remove the waitqueue_active() alltogether. It's
creating more headache than it's worth.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng


> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, February 21, 2014 7:11 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> wait-forever
> 
> On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > Hello Thomas,
> >
> > > -Original Message-
> > > From: Thomas Gleixner [mailto:t...@linutronix.de]
> > > Sent: Friday, February 21, 2014 6:34 PM
> > > To: Liu, Chuansheng
> > > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> > > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq()
> wait-forever
> > >
> > > On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > > But feels there is another case which the synchronize_irq waited there
> > > forever,
> > > > it is no waking up action from irq_thread().
> > > >
> > > > CPU0  CPU1
> > > > disable_irq() irq_thread()
> > > >   synchronize_irq()
> > > > wait_event()
> > > >  adding the __wait into the queue  wake_threads_waitq
> > > >test threads_active==0
> > > >  
> > > > atomic_dec_and_test(threads_active) 1 -- > 0
> > > >
> > > waitqueue_active(>wait_for_threads)
> > > >   <== Here without smp_mb(),
> CPU1
> > > maybe detect
> > > >   the queue is still empty??
> > > >  schedule()
> > > >
> > > > It will cause although the threads_active is 0, but irq_thread() didn't 
> > > > do
> the
> > > waking up action.
> > > > Is it reasonable? Then maybe we can add one smp_mb() before
> > > waitqueue_active.
> > >
> > > I think you have a point there, but not on x86 wherre the atomic_dec
> > > and the spinlock on the queueing side are full barriers. For non-x86
> > > there is definitely a potential issue.
> > >
> > But even on X86, spin_unlock has no full barrier, the following scenario:
> > CPU0   CPU1
> > spin_lock
> >atomic_dec_and_test
> > insert into queue
> > spin_unlock
> >checking waitqueue_active
> 
> But CPU0 sees the 0, right?
Not be clear here:)
The atomic_read has no barrier.

Found commit 6cb2a21049b89 has one similar smp_mb() calling before
waitqueue_active() on one X86 CPU.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> Hello Thomas,
> 
> > -Original Message-
> > From: Thomas Gleixner [mailto:t...@linutronix.de]
> > Sent: Friday, February 21, 2014 6:34 PM
> > To: Liu, Chuansheng
> > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> > Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> > wait-forever
> > 
> > On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > > But feels there is another case which the synchronize_irq waited there
> > forever,
> > > it is no waking up action from irq_thread().
> > >
> > > CPU0  CPU1
> > > disable_irq() irq_thread()
> > >   synchronize_irq()
> > > wait_event()
> > >  adding the __wait into the queue  wake_threads_waitq
> > >test threads_active==0
> > >atomic_dec_and_test(threads_active) 1 
> > > -- > 0
> > >
> > waitqueue_active(>wait_for_threads)
> > >   <== Here without smp_mb(), CPU1
> > maybe detect
> > >   the queue is still empty??
> > >  schedule()
> > >
> > > It will cause although the threads_active is 0, but irq_thread() didn't 
> > > do the
> > waking up action.
> > > Is it reasonable? Then maybe we can add one smp_mb() before
> > waitqueue_active.
> > 
> > I think you have a point there, but not on x86 wherre the atomic_dec
> > and the spinlock on the queueing side are full barriers. For non-x86
> > there is definitely a potential issue.
> > 
> But even on X86, spin_unlock has no full barrier, the following scenario:
> CPU0   CPU1
> spin_lock   
>atomic_dec_and_test 
> insert into queue 
> spin_unlock
>checking waitqueue_active

But CPU0 sees the 0, right?

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Friday, February 21, 2014 6:34 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> wait-forever
> 
> On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> > But feels there is another case which the synchronize_irq waited there
> forever,
> > it is no waking up action from irq_thread().
> >
> > CPU0  CPU1
> > disable_irq() irq_thread()
> >   synchronize_irq()
> > wait_event()
> >  adding the __wait into the queue  wake_threads_waitq
> >test threads_active==0
> >  atomic_dec_and_test(threads_active) 1 
> > -- > 0
> >
> waitqueue_active(>wait_for_threads)
> >   <== Here without smp_mb(), CPU1
> maybe detect
> >   the queue is still empty??
> >  schedule()
> >
> > It will cause although the threads_active is 0, but irq_thread() didn't do 
> > the
> waking up action.
> > Is it reasonable? Then maybe we can add one smp_mb() before
> waitqueue_active.
> 
> I think you have a point there, but not on x86 wherre the atomic_dec
> and the spinlock on the queueing side are full barriers. For non-x86
> there is definitely a potential issue.
> 
But even on X86, spin_unlock has no full barrier, the following scenario:
CPU0   CPU1
spin_lock   
   atomic_dec_and_test 
insert into queue 
spin_unlock
   checking waitqueue_active

Here after inserting into the queue, before waitqueue_active,
there is no mb.

So is it still the case? Thanks.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
> But feels there is another case which the synchronize_irq waited there 
> forever,
> it is no waking up action from irq_thread().
> 
> CPU0  CPU1
> disable_irq() irq_thread()
>   synchronize_irq()
> wait_event()
>  adding the __wait into the queue  wake_threads_waitq
>test threads_active==0  
>atomic_dec_and_test(threads_active) 1 
> -- > 0
>  
> waitqueue_active(>wait_for_threads)
>   <== Here without smp_mb(), CPU1 maybe detect
>   the queue is still empty??
>  schedule()
> 
> It will cause although the threads_active is 0, but irq_thread() didn't do 
> the waking up action.
> Is it reasonable? Then maybe we can add one smp_mb() before waitqueue_active.

I think you have a point there, but not on x86 wherre the atomic_dec
and the spinlock on the queueing side are full barriers. For non-x86
there is definitely a potential issue.

Thanks,

tglx



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
 But feels there is another case which the synchronize_irq waited there 
 forever,
 it is no waking up action from irq_thread().
 
 CPU0  CPU1
 disable_irq() irq_thread()
   synchronize_irq()
 wait_event()
  adding the __wait into the queue  wake_threads_waitq
test threads_active==0  
atomic_dec_and_test(threads_active) 1 
 --  0
  
 waitqueue_active(desc-wait_for_threads)
   == Here without smp_mb(), CPU1 maybe detect
   the queue is still empty??
  schedule()
 
 It will cause although the threads_active is 0, but irq_thread() didn't do 
 the waking up action.
 Is it reasonable? Then maybe we can add one smp_mb() before waitqueue_active.

I think you have a point there, but not on x86 wherre the atomic_dec
and the spinlock on the queueing side are full barriers. For non-x86
there is definitely a potential issue.

Thanks,

tglx



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng
Hello Thomas,

 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: Friday, February 21, 2014 6:34 PM
 To: Liu, Chuansheng
 Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
 Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
 wait-forever
 
 On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
  But feels there is another case which the synchronize_irq waited there
 forever,
  it is no waking up action from irq_thread().
 
  CPU0  CPU1
  disable_irq() irq_thread()
synchronize_irq()
  wait_event()
   adding the __wait into the queue  wake_threads_waitq
 test threads_active==0
   atomic_dec_and_test(threads_active) 1 
  --  0
 
 waitqueue_active(desc-wait_for_threads)
== Here without smp_mb(), CPU1
 maybe detect
the queue is still empty??
   schedule()
 
  It will cause although the threads_active is 0, but irq_thread() didn't do 
  the
 waking up action.
  Is it reasonable? Then maybe we can add one smp_mb() before
 waitqueue_active.
 
 I think you have a point there, but not on x86 wherre the atomic_dec
 and the spinlock on the queueing side are full barriers. For non-x86
 there is definitely a potential issue.
 
But even on X86, spin_unlock has no full barrier, the following scenario:
CPU0   CPU1
spin_lock   
   atomic_dec_and_test 
insert into queue 
spin_unlock
   checking waitqueue_active

Here after inserting into the queue, before waitqueue_active,
there is no mb.

So is it still the case? Thanks.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
 Hello Thomas,
 
  -Original Message-
  From: Thomas Gleixner [mailto:t...@linutronix.de]
  Sent: Friday, February 21, 2014 6:34 PM
  To: Liu, Chuansheng
  Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
  Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
  wait-forever
  
  On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
   But feels there is another case which the synchronize_irq waited there
  forever,
   it is no waking up action from irq_thread().
  
   CPU0  CPU1
   disable_irq() irq_thread()
 synchronize_irq()
   wait_event()
adding the __wait into the queue  wake_threads_waitq
  test threads_active==0
  atomic_dec_and_test(threads_active) 1 
   --  0
  
  waitqueue_active(desc-wait_for_threads)
 == Here without smp_mb(), CPU1
  maybe detect
 the queue is still empty??
schedule()
  
   It will cause although the threads_active is 0, but irq_thread() didn't 
   do the
  waking up action.
   Is it reasonable? Then maybe we can add one smp_mb() before
  waitqueue_active.
  
  I think you have a point there, but not on x86 wherre the atomic_dec
  and the spinlock on the queueing side are full barriers. For non-x86
  there is definitely a potential issue.
  
 But even on X86, spin_unlock has no full barrier, the following scenario:
 CPU0   CPU1
 spin_lock   
atomic_dec_and_test 
 insert into queue 
 spin_unlock
checking waitqueue_active

But CPU0 sees the 0, right?

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng


 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: Friday, February 21, 2014 7:11 PM
 To: Liu, Chuansheng
 Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
 Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
 wait-forever
 
 On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
  Hello Thomas,
 
   -Original Message-
   From: Thomas Gleixner [mailto:t...@linutronix.de]
   Sent: Friday, February 21, 2014 6:34 PM
   To: Liu, Chuansheng
   Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
   Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq()
 wait-forever
  
   On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
But feels there is another case which the synchronize_irq waited there
   forever,
it is no waking up action from irq_thread().
   
CPU0  CPU1
disable_irq() irq_thread()
  synchronize_irq()
wait_event()
 adding the __wait into the queue  wake_threads_waitq
   test threads_active==0
 
atomic_dec_and_test(threads_active) 1 --  0
   
   waitqueue_active(desc-wait_for_threads)
  == Here without smp_mb(),
 CPU1
   maybe detect
  the queue is still empty??
 schedule()
   
It will cause although the threads_active is 0, but irq_thread() didn't 
do
 the
   waking up action.
Is it reasonable? Then maybe we can add one smp_mb() before
   waitqueue_active.
  
   I think you have a point there, but not on x86 wherre the atomic_dec
   and the spinlock on the queueing side are full barriers. For non-x86
   there is definitely a potential issue.
  
  But even on X86, spin_unlock has no full barrier, the following scenario:
  CPU0   CPU1
  spin_lock
 atomic_dec_and_test
  insert into queue
  spin_unlock
 checking waitqueue_active
 
 But CPU0 sees the 0, right?
Not be clear here:)
The atomic_read has no barrier.

Found commit 6cb2a21049b89 has one similar smp_mb() calling before
waitqueue_active() on one X86 CPU.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
I think you have a point there, but not on x86 wherre the atomic_dec
and the spinlock on the queueing side are full barriers. For non-x86
there is definitely a potential issue.
   
   But even on X86, spin_unlock has no full barrier, the following scenario:
   CPU0   CPU1
   spin_lock
  atomic_dec_and_test
   insert into queue
   spin_unlock
  checking waitqueue_active
  
  But CPU0 sees the 0, right?
 Not be clear here:)
 The atomic_read has no barrier.
 
 Found commit 6cb2a21049b89 has one similar smp_mb() calling before
 waitqueue_active() on one X86 CPU.

Indeed, you are completely right. Great detective work!

I'm inclined to remove the waitqueue_active() alltogether. It's
creating more headache than it's worth.

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Liu, Chuansheng
Hello Thomas,

 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: Friday, February 21, 2014 7:53 PM
 To: Liu, Chuansheng
 Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
 Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
 wait-forever
 
 On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
 I think you have a point there, but not on x86 wherre the atomic_dec
 and the spinlock on the queueing side are full barriers. For non-x86
 there is definitely a potential issue.

But even on X86, spin_unlock has no full barrier, the following 
scenario:
CPU0   CPU1
spin_lock
   atomic_dec_and_test
insert into queue
spin_unlock
   checking waitqueue_active
  
   But CPU0 sees the 0, right?
  Not be clear here:)
  The atomic_read has no barrier.
 
  Found commit 6cb2a21049b89 has one similar smp_mb() calling before
  waitqueue_active() on one X86 CPU.
 
 Indeed, you are completely right. Great detective work!
Thanks your encouraging.

 
 I'm inclined to remove the waitqueue_active() alltogether. It's
 creating more headache than it's worth.
If I am understanding well, removing the checking of waitqueue_active(),
and call wakeup() directly which will check list with spinlock protection.

If so, I can prepare one patch for it:)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-21 Thread Thomas Gleixner
On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
  -Original Message-
  From: Thomas Gleixner [mailto:t...@linutronix.de]
  Sent: Friday, February 21, 2014 7:53 PM
  To: Liu, Chuansheng
  Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
  Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
  wait-forever
  
  On Fri, 21 Feb 2014, Liu, Chuansheng wrote:
  I think you have a point there, but not on x86 wherre the atomic_dec
  and the spinlock on the queueing side are full barriers. For non-x86
  there is definitely a potential issue.
 
 But even on X86, spin_unlock has no full barrier, the following 
 scenario:
 CPU0   CPU1
 spin_lock
atomic_dec_and_test
 insert into queue
 spin_unlock
checking waitqueue_active
   
But CPU0 sees the 0, right?
   Not be clear here:)
   The atomic_read has no barrier.
  
   Found commit 6cb2a21049b89 has one similar smp_mb() calling before
   waitqueue_active() on one X86 CPU.
  
  Indeed, you are completely right. Great detective work!
 Thanks your encouraging.
 
  
  I'm inclined to remove the waitqueue_active() alltogether. It's
  creating more headache than it's worth.
 If I am understanding well, removing the checking of waitqueue_active(),
 and call wakeup() directly which will check list with spinlock protection.

Correct.
 
 If so, I can prepare one patch for it:)

Appreciated.

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-20 Thread Liu, Chuansheng
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Thursday, February 20, 2014 8:53 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> wait-forever
> 
> On Thu, 20 Feb 2014, Liu, Chuansheng wrote:
> > Hello Thomas,
> >
> > > -Original Message-
> > > From: Thomas Gleixner [mailto:t...@linutronix.de]
> > > Sent: Monday, February 10, 2014 4:58 PM
> > > To: Liu, Chuansheng
> > > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> > > Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq()
> wait-forever
> > >
> > > On Mon, 10 Feb 2014, Chuansheng Liu wrote:
> > > > There is below race between irq handler and irq thread:
> > > > irq handler irq thread
> > > >
> > > > irq_wake_thread()   irq_thread()
> > > >   set bit RUNTHREAD
> > > >   ...clear bit RUNTHREAD
> > > >  thread_fn()
> > > >  [A]test_and_decrease
> > > >thread_active
> > > >   [B]increase thread_active
> > > >
> > > > If action A is before action B, after that the thread_active
> > > > will be always > 0, and for synchronize_irq() calling, which
> > > > will be waiting there forever.
> > >
> > > No. thread_active is 0, simply because after the atomic_dec_and_test()
> > > it is -1 and the atomic_inc on the other side will bring it back to 0.
> > >
> > Yes, you are right. The thread_active is back to 0 at last.
> >
> > The case we meet is:
> > 1/ T1: blocking at disable_irq() -- > sync_irq() -- > wait_event()
> > [  142.678681]  [] schedule+0x23/0x60
> > [  142.683466]  [] synchronize_irq+0x75/0xb0
> > [  142.688931]  [] ? wake_up_bit+0x30/0x30
> > [  142.694201]  [] disable_irq+0x1b/0x20
> > [  142.699278]  [] smb347_shutdown+0x2c/0x50
> > [  142.704744]  [] i2c_device_shutdown+0x2d/0x40
> > [  142.710597]  [] device_shutdown+0x14/0x140
> > [  142.716161]  [] kernel_restart_prepare+0x32/0x40
> > [  142.722307]  [] kernel_restart+0x13/0x60
> >
> > 2/ The corresponding irq thread is at sleep state:
> > [  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2
> 0x
> > [  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c
> 0001 c1a5f3a5
> > [  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0
> c20469c0 f1d6bef0
> > [  587.552497]  0296  0296 f1d6bef0 c1a5bfa6
> f1c47620 f1d6bf14 c126e329
> > [  587.552501] Call Trace:
> > [  587.552519]  [] ? sub_preempt_count+0x55/0xe0
> > [  587.552535]  [] ? _raw_spin_unlock_irqrestore+0x26/0x50
> > [  587.552548]  [] ? set_cpus_allowed_ptr+0x59/0xe0
> > [  587.552563]  [] schedule+0x23/0x60
> > [  587.552576]  [] irq_thread+0xa1/0x130
> > [  587.552588]  [] ? irq_thread_dtor+0xa0/0xa0
> >
> > 3/ All the cpus are in the idle task;
> 
> Lets look at it again:
> 
> CPU 0  CPU1
> 
> irq handlerirq thread
>   set IRQS_INPROGRESS
>   ...
>   irq_wake_thread()irq_thread()
> set bit RUNTHREAD
> ...  clear bit RUNTHREAD
>  thread_fn()
>atomic_dec_and_test(threads_active) ( 0 -> -1)
> 
> atomic_inc(threads_active) ( -1 -> 0)
>   clr IRQS_INPROGRESS
> 
> Now synchronize_irq comes into play, that's what caused you to look
> into this.
> 
> synchronize_irq() can never observe the -1 state because it is
> serialized against IRQS_INPROGESS. And when IRQS_INPROGRESS is
> cleared, the threads_active state is back to 0.
> 
> I'm really not seing how this can happen. Any chance you can reproduce
> this by executing the situation which led to this in a loop?
We can have a try to forcedly reproduce the case of threads_active -1/0.

But feels there is another case which the synchronize_irq waited there forever,
it is no waking up action from irq_thread().

CPU0 CPU1
disable_irq()  irq_thread()
  synchronize_irq()
wait_event()
 adding the __wait into the queuewake_threads_waitq
   test threads_active==0  
  

RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-20 Thread Thomas Gleixner
On Thu, 20 Feb 2014, Liu, Chuansheng wrote:
> Hello Thomas,
> 
> > -Original Message-
> > From: Thomas Gleixner [mailto:t...@linutronix.de]
> > Sent: Monday, February 10, 2014 4:58 PM
> > To: Liu, Chuansheng
> > Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> > Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> > wait-forever
> > 
> > On Mon, 10 Feb 2014, Chuansheng Liu wrote:
> > > There is below race between irq handler and irq thread:
> > > irq handler irq thread
> > >
> > > irq_wake_thread()   irq_thread()
> > >   set bit RUNTHREAD
> > >   ...clear bit RUNTHREAD
> > >  thread_fn()
> > >  [A]test_and_decrease
> > >thread_active
> > >   [B]increase thread_active
> > >
> > > If action A is before action B, after that the thread_active
> > > will be always > 0, and for synchronize_irq() calling, which
> > > will be waiting there forever.
> > 
> > No. thread_active is 0, simply because after the atomic_dec_and_test()
> > it is -1 and the atomic_inc on the other side will bring it back to 0.
> > 
> Yes, you are right. The thread_active is back to 0 at last.
> 
> The case we meet is:
> 1/ T1: blocking at disable_irq() -- > sync_irq() -- > wait_event()
> [  142.678681]  [] schedule+0x23/0x60
> [  142.683466]  [] synchronize_irq+0x75/0xb0
> [  142.688931]  [] ? wake_up_bit+0x30/0x30
> [  142.694201]  [] disable_irq+0x1b/0x20
> [  142.699278]  [] smb347_shutdown+0x2c/0x50
> [  142.704744]  [] i2c_device_shutdown+0x2d/0x40
> [  142.710597]  [] device_shutdown+0x14/0x140
> [  142.716161]  [] kernel_restart_prepare+0x32/0x40
> [  142.722307]  [] kernel_restart+0x13/0x60
> 
> 2/ The corresponding irq thread is at sleep state:
> [  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2 0x
> [  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c 
> 0001 c1a5f3a5
> [  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 
> c20469c0 f1d6bef0
> [  587.552497]  0296  0296 f1d6bef0 c1a5bfa6 f1c47620 
> f1d6bf14 c126e329
> [  587.552501] Call Trace:
> [  587.552519]  [] ? sub_preempt_count+0x55/0xe0
> [  587.552535]  [] ? _raw_spin_unlock_irqrestore+0x26/0x50
> [  587.552548]  [] ? set_cpus_allowed_ptr+0x59/0xe0
> [  587.552563]  [] schedule+0x23/0x60
> [  587.552576]  [] irq_thread+0xa1/0x130
> [  587.552588]  [] ? irq_thread_dtor+0xa0/0xa0
> 
> 3/ All the cpus are in the idle task;

Lets look at it again:

CPU 0  CPU1

irq handlerirq thread
  set IRQS_INPROGRESS
  ...
  irq_wake_thread()irq_thread()
set bit RUNTHREAD
...  clear bit RUNTHREAD
 thread_fn()
 atomic_dec_and_test(threads_active) ( 0 -> -1)

atomic_inc(threads_active) ( -1 -> 0)
  clr IRQS_INPROGRESS

Now synchronize_irq comes into play, that's what caused you to look
into this. 

synchronize_irq() can never observe the -1 state because it is
serialized against IRQS_INPROGESS. And when IRQS_INPROGRESS is
cleared, the threads_active state is back to 0.

I'm really not seing how this can happen. Any chance you can reproduce
this by executing the situation which led to this in a loop?

Thanks,

tglx


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-20 Thread Thomas Gleixner
On Thu, 20 Feb 2014, Liu, Chuansheng wrote:
 Hello Thomas,
 
  -Original Message-
  From: Thomas Gleixner [mailto:t...@linutronix.de]
  Sent: Monday, February 10, 2014 4:58 PM
  To: Liu, Chuansheng
  Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
  Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
  wait-forever
  
  On Mon, 10 Feb 2014, Chuansheng Liu wrote:
   There is below race between irq handler and irq thread:
   irq handler irq thread
  
   irq_wake_thread()   irq_thread()
 set bit RUNTHREAD
 ...clear bit RUNTHREAD
thread_fn()
[A]test_and_decrease
  thread_active
 [B]increase thread_active
  
   If action A is before action B, after that the thread_active
   will be always  0, and for synchronize_irq() calling, which
   will be waiting there forever.
  
  No. thread_active is 0, simply because after the atomic_dec_and_test()
  it is -1 and the atomic_inc on the other side will bring it back to 0.
  
 Yes, you are right. The thread_active is back to 0 at last.
 
 The case we meet is:
 1/ T1: blocking at disable_irq() --  sync_irq() --  wait_event()
 [  142.678681]  [c1a5b353] schedule+0x23/0x60
 [  142.683466]  [c12b24c5] synchronize_irq+0x75/0xb0
 [  142.688931]  [c125fad0] ? wake_up_bit+0x30/0x30
 [  142.694201]  [c12b33ab] disable_irq+0x1b/0x20
 [  142.699278]  [c17a79bc] smb347_shutdown+0x2c/0x50
 [  142.704744]  [c1789f7d] i2c_device_shutdown+0x2d/0x40
 [  142.710597]  [c1601734] device_shutdown+0x14/0x140
 [  142.716161]  [c12535f2] kernel_restart_prepare+0x32/0x40
 [  142.722307]  [c1253613] kernel_restart+0x13/0x60
 
 2/ The corresponding irq thread is at sleep state:
 [  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2 0x
 [  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c 
 0001 c1a5f3a5
 [  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 
 c20469c0 f1d6bef0
 [  587.552497]  0296  0296 f1d6bef0 c1a5bfa6 f1c47620 
 f1d6bf14 c126e329
 [  587.552501] Call Trace:
 [  587.552519]  [c1a5f3a5] ? sub_preempt_count+0x55/0xe0
 [  587.552535]  [c1a5bfa6] ? _raw_spin_unlock_irqrestore+0x26/0x50
 [  587.552548]  [c126e329] ? set_cpus_allowed_ptr+0x59/0xe0
 [  587.552563]  [c1a5b093] schedule+0x23/0x60
 [  587.552576]  [c12b2ae1] irq_thread+0xa1/0x130
 [  587.552588]  [c12b27f0] ? irq_thread_dtor+0xa0/0xa0
 
 3/ All the cpus are in the idle task;

Lets look at it again:

CPU 0  CPU1

irq handlerirq thread
  set IRQS_INPROGRESS
  ...
  irq_wake_thread()irq_thread()
set bit RUNTHREAD
...  clear bit RUNTHREAD
 thread_fn()
 atomic_dec_and_test(threads_active) ( 0 - -1)

atomic_inc(threads_active) ( -1 - 0)
  clr IRQS_INPROGRESS

Now synchronize_irq comes into play, that's what caused you to look
into this. 

synchronize_irq() can never observe the -1 state because it is
serialized against IRQS_INPROGESS. And when IRQS_INPROGRESS is
cleared, the threads_active state is back to 0.

I'm really not seing how this can happen. Any chance you can reproduce
this by executing the situation which led to this in a loop?

Thanks,

tglx


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-20 Thread Liu, Chuansheng
Hello Thomas,

 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: Thursday, February 20, 2014 8:53 PM
 To: Liu, Chuansheng
 Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
 Subject: RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
 wait-forever
 
 On Thu, 20 Feb 2014, Liu, Chuansheng wrote:
  Hello Thomas,
 
   -Original Message-
   From: Thomas Gleixner [mailto:t...@linutronix.de]
   Sent: Monday, February 10, 2014 4:58 PM
   To: Liu, Chuansheng
   Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
   Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq()
 wait-forever
  
   On Mon, 10 Feb 2014, Chuansheng Liu wrote:
There is below race between irq handler and irq thread:
irq handler irq thread
   
irq_wake_thread()   irq_thread()
  set bit RUNTHREAD
  ...clear bit RUNTHREAD
 thread_fn()
 [A]test_and_decrease
   thread_active
  [B]increase thread_active
   
If action A is before action B, after that the thread_active
will be always  0, and for synchronize_irq() calling, which
will be waiting there forever.
  
   No. thread_active is 0, simply because after the atomic_dec_and_test()
   it is -1 and the atomic_inc on the other side will bring it back to 0.
  
  Yes, you are right. The thread_active is back to 0 at last.
 
  The case we meet is:
  1/ T1: blocking at disable_irq() --  sync_irq() --  wait_event()
  [  142.678681]  [c1a5b353] schedule+0x23/0x60
  [  142.683466]  [c12b24c5] synchronize_irq+0x75/0xb0
  [  142.688931]  [c125fad0] ? wake_up_bit+0x30/0x30
  [  142.694201]  [c12b33ab] disable_irq+0x1b/0x20
  [  142.699278]  [c17a79bc] smb347_shutdown+0x2c/0x50
  [  142.704744]  [c1789f7d] i2c_device_shutdown+0x2d/0x40
  [  142.710597]  [c1601734] device_shutdown+0x14/0x140
  [  142.716161]  [c12535f2] kernel_restart_prepare+0x32/0x40
  [  142.722307]  [c1253613] kernel_restart+0x13/0x60
 
  2/ The corresponding irq thread is at sleep state:
  [  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2
 0x
  [  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c
 0001 c1a5f3a5
  [  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0
 c20469c0 f1d6bef0
  [  587.552497]  0296  0296 f1d6bef0 c1a5bfa6
 f1c47620 f1d6bf14 c126e329
  [  587.552501] Call Trace:
  [  587.552519]  [c1a5f3a5] ? sub_preempt_count+0x55/0xe0
  [  587.552535]  [c1a5bfa6] ? _raw_spin_unlock_irqrestore+0x26/0x50
  [  587.552548]  [c126e329] ? set_cpus_allowed_ptr+0x59/0xe0
  [  587.552563]  [c1a5b093] schedule+0x23/0x60
  [  587.552576]  [c12b2ae1] irq_thread+0xa1/0x130
  [  587.552588]  [c12b27f0] ? irq_thread_dtor+0xa0/0xa0
 
  3/ All the cpus are in the idle task;
 
 Lets look at it again:
 
 CPU 0  CPU1
 
 irq handlerirq thread
   set IRQS_INPROGRESS
   ...
   irq_wake_thread()irq_thread()
 set bit RUNTHREAD
 ...  clear bit RUNTHREAD
  thread_fn()
atomic_dec_and_test(threads_active) ( 0 - -1)
 
 atomic_inc(threads_active) ( -1 - 0)
   clr IRQS_INPROGRESS
 
 Now synchronize_irq comes into play, that's what caused you to look
 into this.
 
 synchronize_irq() can never observe the -1 state because it is
 serialized against IRQS_INPROGESS. And when IRQS_INPROGRESS is
 cleared, the threads_active state is back to 0.
 
 I'm really not seing how this can happen. Any chance you can reproduce
 this by executing the situation which led to this in a loop?
We can have a try to forcedly reproduce the case of threads_active -1/0.

But feels there is another case which the synchronize_irq waited there forever,
it is no waking up action from irq_thread().

CPU0 CPU1
disable_irq()  irq_thread()
  synchronize_irq()
wait_event()
 adding the __wait into the queuewake_threads_waitq
   test threads_active==0  

 atomic_dec_and_test(threads_active) 1 --  0
 
waitqueue_active(desc-wait_for_threads)
   == Here without smp_mb(), 
CPU1 maybe detect
   the queue is still 
empty??
 schedule()

It will cause although the threads_active is 0, but irq_thread() didn't do the 
waking up action.
Is it reasonable? Then maybe we can add one smp_mb() before waitqueue_active.

Thanks.








--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body

RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-19 Thread Liu, Chuansheng
Hello Thomas,

> -Original Message-
> From: Thomas Gleixner [mailto:t...@linutronix.de]
> Sent: Monday, February 10, 2014 4:58 PM
> To: Liu, Chuansheng
> Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
> Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
> wait-forever
> 
> On Mon, 10 Feb 2014, Chuansheng Liu wrote:
> > There is below race between irq handler and irq thread:
> > irq handler irq thread
> >
> > irq_wake_thread()   irq_thread()
> >   set bit RUNTHREAD
> >   ...clear bit RUNTHREAD
> >  thread_fn()
> >  [A]test_and_decrease
> >thread_active
> >   [B]increase thread_active
> >
> > If action A is before action B, after that the thread_active
> > will be always > 0, and for synchronize_irq() calling, which
> > will be waiting there forever.
> 
> No. thread_active is 0, simply because after the atomic_dec_and_test()
> it is -1 and the atomic_inc on the other side will bring it back to 0.
> 
Yes, you are right. The thread_active is back to 0 at last.

The case we meet is:
1/ T1: blocking at disable_irq() -- > sync_irq() -- > wait_event()
[  142.678681]  [] schedule+0x23/0x60
[  142.683466]  [] synchronize_irq+0x75/0xb0
[  142.688931]  [] ? wake_up_bit+0x30/0x30
[  142.694201]  [] disable_irq+0x1b/0x20
[  142.699278]  [] smb347_shutdown+0x2c/0x50
[  142.704744]  [] i2c_device_shutdown+0x2d/0x40
[  142.710597]  [] device_shutdown+0x14/0x140
[  142.716161]  [] kernel_restart_prepare+0x32/0x40
[  142.722307]  [] kernel_restart+0x13/0x60

2/ The corresponding irq thread is at sleep state:
[  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2 0x
[  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c 0001 
c1a5f3a5
[  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 c20469c0 
f1d6bef0
[  587.552497]  0296  0296 f1d6bef0 c1a5bfa6 f1c47620 f1d6bf14 
c126e329
[  587.552501] Call Trace:
[  587.552519]  [] ? sub_preempt_count+0x55/0xe0
[  587.552535]  [] ? _raw_spin_unlock_irqrestore+0x26/0x50
[  587.552548]  [] ? set_cpus_allowed_ptr+0x59/0xe0
[  587.552563]  [] schedule+0x23/0x60
[  587.552576]  [] irq_thread+0xa1/0x130
[  587.552588]  [] ? irq_thread_dtor+0xa0/0xa0

3/ All the cpus are in the idle task;

So we guess the thread_active is not 0 at that time, but irq thread is doing 
nothing at that time.
Thought for a long time, but there is no idea, and it is just hit once.

Appreciated if you have some idea, thanks.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


RE: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-19 Thread Liu, Chuansheng
Hello Thomas,

 -Original Message-
 From: Thomas Gleixner [mailto:t...@linutronix.de]
 Sent: Monday, February 10, 2014 4:58 PM
 To: Liu, Chuansheng
 Cc: linux-kernel@vger.kernel.org; Wang, Xiaoming
 Subject: Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() 
 wait-forever
 
 On Mon, 10 Feb 2014, Chuansheng Liu wrote:
  There is below race between irq handler and irq thread:
  irq handler irq thread
 
  irq_wake_thread()   irq_thread()
set bit RUNTHREAD
...clear bit RUNTHREAD
   thread_fn()
   [A]test_and_decrease
 thread_active
[B]increase thread_active
 
  If action A is before action B, after that the thread_active
  will be always  0, and for synchronize_irq() calling, which
  will be waiting there forever.
 
 No. thread_active is 0, simply because after the atomic_dec_and_test()
 it is -1 and the atomic_inc on the other side will bring it back to 0.
 
Yes, you are right. The thread_active is back to 0 at last.

The case we meet is:
1/ T1: blocking at disable_irq() --  sync_irq() --  wait_event()
[  142.678681]  [c1a5b353] schedule+0x23/0x60
[  142.683466]  [c12b24c5] synchronize_irq+0x75/0xb0
[  142.688931]  [c125fad0] ? wake_up_bit+0x30/0x30
[  142.694201]  [c12b33ab] disable_irq+0x1b/0x20
[  142.699278]  [c17a79bc] smb347_shutdown+0x2c/0x50
[  142.704744]  [c1789f7d] i2c_device_shutdown+0x2d/0x40
[  142.710597]  [c1601734] device_shutdown+0x14/0x140
[  142.716161]  [c12535f2] kernel_restart_prepare+0x32/0x40
[  142.722307]  [c1253613] kernel_restart+0x13/0x60

2/ The corresponding irq thread is at sleep state:
[  587.552408] irq/388-SMB0349 S f1c47620  7276   119  2 0x
[  587.552439]  f1d6bf20 0046 f1c47a48 f1c47620 f1d6bec4 9e91731c 0001 
c1a5f3a5
[  587.552468]  c20469c0 0001 c20469c0 f36559c0 f1c47620 f307bde0 c20469c0 
f1d6bef0
[  587.552497]  0296  0296 f1d6bef0 c1a5bfa6 f1c47620 f1d6bf14 
c126e329
[  587.552501] Call Trace:
[  587.552519]  [c1a5f3a5] ? sub_preempt_count+0x55/0xe0
[  587.552535]  [c1a5bfa6] ? _raw_spin_unlock_irqrestore+0x26/0x50
[  587.552548]  [c126e329] ? set_cpus_allowed_ptr+0x59/0xe0
[  587.552563]  [c1a5b093] schedule+0x23/0x60
[  587.552576]  [c12b2ae1] irq_thread+0xa1/0x130
[  587.552588]  [c12b27f0] ? irq_thread_dtor+0xa0/0xa0

3/ All the cpus are in the idle task;

So we guess the thread_active is not 0 at that time, but irq thread is doing 
nothing at that time.
Thought for a long time, but there is no idea, and it is just hit once.

Appreciated if you have some idea, thanks.



--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-10 Thread Thomas Gleixner
On Mon, 10 Feb 2014, Chuansheng Liu wrote:
> There is below race between irq handler and irq thread:
> irq handler irq thread
> 
> irq_wake_thread()   irq_thread()
>   set bit RUNTHREAD
>   ...clear bit RUNTHREAD
>  thread_fn()
>  [A]test_and_decrease
>thread_active
>   [B]increase thread_active
> 
> If action A is before action B, after that the thread_active
> will be always > 0, and for synchronize_irq() calling, which
> will be waiting there forever.

No. thread_active is 0, simply because after the atomic_dec_and_test()
it is -1 and the atomic_inc on the other side will bring it back to 0.

Thanks,

tglx
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-10 Thread Chuansheng Liu
There is below race between irq handler and irq thread:
irq handler irq thread

irq_wake_thread()   irq_thread()
  set bit RUNTHREAD
  ...clear bit RUNTHREAD
 thread_fn()
 [A]test_and_decrease
   thread_active
  [B]increase thread_active

If action A is before action B, after that the thread_active
will be always > 0, and for synchronize_irq() calling, which
will be waiting there forever.

Here put the increasing thread-active before setting bit
RUNTHREAD, which can resolve such race.

Signed-off-by: xiaoming wang 
Signed-off-by: Chuansheng Liu 
---
 kernel/irq/handle.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 131ca17..5f9fbb7 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -65,7 +65,7 @@ static void irq_wake_thread(struct irq_desc *desc, struct 
irqaction *action)
 * Wake up the handler thread for this action. If the
 * RUNTHREAD bit is already set, nothing to do.
 */
-   if (test_and_set_bit(IRQTF_RUNTHREAD, >thread_flags))
+   if (test_bit(IRQTF_RUNTHREAD, >thread_flags))
return;
 
/*
@@ -126,6 +126,25 @@ static void irq_wake_thread(struct irq_desc *desc, struct 
irqaction *action)
 */
atomic_inc(>threads_active);
 
+   /*
+* set the RUNTHREAD bit after increasing the threads_active,
+* it can avoid the below race:
+* irq handler  irq thread in case it is in
+*running state
+*
+*  set RUNTHREAD bit
+*  clear the RUNTHREAD bit
+*...   thread_fn()
+*
+*  due to threads_active==0,
+*  no waking up wait_for_threads
+*
+* threads_active ++
+* After that, the threads_active will be always > 0, which
+* will block the synchronize_irq().
+*/
+   set_bit(IRQTF_RUNTHREAD, >thread_flags);
+
wake_up_process(action->thread);
 }
 
-- 
1.9.rc0

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-10 Thread Chuansheng Liu
There is below race between irq handler and irq thread:
irq handler irq thread

irq_wake_thread()   irq_thread()
  set bit RUNTHREAD
  ...clear bit RUNTHREAD
 thread_fn()
 [A]test_and_decrease
   thread_active
  [B]increase thread_active

If action A is before action B, after that the thread_active
will be always  0, and for synchronize_irq() calling, which
will be waiting there forever.

Here put the increasing thread-active before setting bit
RUNTHREAD, which can resolve such race.

Signed-off-by: xiaoming wang xiaoming.w...@intel.com
Signed-off-by: Chuansheng Liu chuansheng@intel.com
---
 kernel/irq/handle.c | 21 -
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/kernel/irq/handle.c b/kernel/irq/handle.c
index 131ca17..5f9fbb7 100644
--- a/kernel/irq/handle.c
+++ b/kernel/irq/handle.c
@@ -65,7 +65,7 @@ static void irq_wake_thread(struct irq_desc *desc, struct 
irqaction *action)
 * Wake up the handler thread for this action. If the
 * RUNTHREAD bit is already set, nothing to do.
 */
-   if (test_and_set_bit(IRQTF_RUNTHREAD, action-thread_flags))
+   if (test_bit(IRQTF_RUNTHREAD, action-thread_flags))
return;
 
/*
@@ -126,6 +126,25 @@ static void irq_wake_thread(struct irq_desc *desc, struct 
irqaction *action)
 */
atomic_inc(desc-threads_active);
 
+   /*
+* set the RUNTHREAD bit after increasing the threads_active,
+* it can avoid the below race:
+* irq handler  irq thread in case it is in
+*running state
+*
+*  set RUNTHREAD bit
+*  clear the RUNTHREAD bit
+*...   thread_fn()
+*
+*  due to threads_active==0,
+*  no waking up wait_for_threads
+*
+* threads_active ++
+* After that, the threads_active will be always  0, which
+* will block the synchronize_irq().
+*/
+   set_bit(IRQTF_RUNTHREAD, action-thread_flags);
+
wake_up_process(action-thread);
 }
 
-- 
1.9.rc0

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/2] genirq: Fix the possible synchronize_irq() wait-forever

2014-02-10 Thread Thomas Gleixner
On Mon, 10 Feb 2014, Chuansheng Liu wrote:
 There is below race between irq handler and irq thread:
 irq handler irq thread
 
 irq_wake_thread()   irq_thread()
   set bit RUNTHREAD
   ...clear bit RUNTHREAD
  thread_fn()
  [A]test_and_decrease
thread_active
   [B]increase thread_active
 
 If action A is before action B, after that the thread_active
 will be always  0, and for synchronize_irq() calling, which
 will be waiting there forever.

No. thread_active is 0, simply because after the atomic_dec_and_test()
it is -1 and the atomic_inc on the other side will bring it back to 0.

Thanks,

tglx
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/