Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-10 Thread Andi Kleen
> wait_event:
> 
>   prepare_to_wait(wq) // takes wq->lock
> 
>   if (!CONDITION)
>   schedule();
> 
> Now,
> 
>   CONDITION = 1;
>   wake_up(wq);
> 
> at least need the full mb() before lits_empty().

You're right, but it would probably only matter for inlining with LTO
(if the LTO compiler ever decides to do that)

Without that a call should be always enough barrier in practice.

So yes I would add the mb, but most likely it will not make much
difference. Just make sure to comment it.

-Andi
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-10 Thread Andi Kleen
 wait_event:
 
   prepare_to_wait(wq) // takes wq-lock
 
   if (!CONDITION)
   schedule();
 
 Now,
 
   CONDITION = 1;
   wake_up(wq);
 
 at least need the full mb() before lits_empty().

You're right, but it would probably only matter for inlining with LTO
(if the LTO compiler ever decides to do that)

Without that a call should be always enough barrier in practice.

So yes I would add the mb, but most likely it will not make much
difference. Just make sure to comment it.

-Andi
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Oleg Nesterov
On 10/09, Peter Zijlstra wrote:
>
> One thing you might need to consider is the memory ordering, will the
> list_empty -- either careful or not -- observe the right list pointer,
> or could it -- when racing with wait_event()/prepare_to_wait() --
> observe a stale value. Or.. is that all already covered in on the use
> site.

I agree.

Without spin_lock(q->lock) (or some other barriers) wait_event-like
code can miss an event.

wait_event:

prepare_to_wait(wq) // takes wq->lock

if (!CONDITION)
schedule();

Now,

CONDITION = 1;
wake_up(wq);

at least need the full mb() before lits_empty().

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Peter Zijlstra
On Tue, 2012-10-09 at 06:37 -0700, Andi Kleen wrote:
> Ivo Sieben  writes:
> 
> > Check the waitqueue task list to be non empty before entering the critical
> > section. This prevents locking the spin lock needlessly in case the queue
> > was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
> > system.
> >
> > Signed-off-by: Ivo Sieben 
> > ---
> >
> >  REPOST:
> >  Request for comments:
> >  - Does this make any sense?
> 
> Looks good to me. Avoiding any spinlock is good.  I don't even think you
> need "careful", if you miss an update it was just as it happened a
> little later.

Yeah, so I've started looking at this several times, but never had the
time to actually think it through. I think I'll agree with you that
using the careful list empty check isn't needed.

In general there's already the race of doing a wakeup before the wait
and if the code using the waitqueue doesn't deal with that its already
broken, so in that respect you should be good, since you're simply
shifting the timing a little.

One thing you might need to consider is the memory ordering, will the
list_empty -- either careful or not -- observe the right list pointer,
or could it -- when racing with wait_event()/prepare_to_wait() --
observe a stale value. Or.. is that all already covered in on the use
site.

I started auditing a few users to see what they all assume, if they're
already broken and or if they would now be broken.. but like said, I
didn't get anywhere.


I'd like to see this patch/Changelog amended to at least cover these
points so that I feel all warm and fuzzy when I read it and not have to
think too hard ;-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Andi Kleen
Ivo Sieben  writes:

> Check the waitqueue task list to be non empty before entering the critical
> section. This prevents locking the spin lock needlessly in case the queue
> was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
> system.
>
> Signed-off-by: Ivo Sieben 
> ---
>
>  REPOST:
>  Request for comments:
>  - Does this make any sense?

Looks good to me. Avoiding any spinlock is good.  I don't even think you
need "careful", if you miss an update it was just as it happened a
little later.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Ivo Sieben
Check the waitqueue task list to be non empty before entering the critical
section. This prevents locking the spin lock needlessly in case the queue
was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
system.

Signed-off-by: Ivo Sieben 
---

 REPOST:
 Request for comments:
 - Does this make any sense?
 - I assume that I can safely use the list_empty_careful() function here, but is
   that correct?

 Background to this patch:
 Testing on a PREEMPT_RT system with TTY serial communication. Each time the TTY
 line discipline is dereferenced the Idle handling wait queue is woken up (see
 function put_ldisc in /drivers/tty/tty_ldisc.c)
 However line discipline idle handling is not used very often so the wait queue
 is empty most of the time. But still the wake_up() function enters the critical
 section guarded by spin locks. This causes additional scheduling overhead when
 a lower priority thread has control of that same lock.

 kernel/sched/core.c |   16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c177472..c1667c4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3090,9 +3090,19 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
 {
unsigned long flags;
 
-   spin_lock_irqsave(>lock, flags);
-   __wake_up_common(q, mode, nr_exclusive, 0, key);
-   spin_unlock_irqrestore(>lock, flags);
+   /*
+* We can check for list emptiness outside the lock by using the
+* "careful" check that verifies both the next and prev pointers, so
+* that there cannot be any half-pending updates in progress.
+*
+* This prevents the wake up to enter the critical section needlessly
+* when the task list is empty.
+*/
+   if (!list_empty_careful(>task_list)) {
+   spin_lock_irqsave(>lock, flags);
+   __wake_up_common(q, mode, nr_exclusive, 0, key);
+   spin_unlock_irqrestore(>lock, flags);
+   }
 }
 EXPORT_SYMBOL(__wake_up);
 
-- 
1.7.9.5


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Ivo Sieben
Check the waitqueue task list to be non empty before entering the critical
section. This prevents locking the spin lock needlessly in case the queue
was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
system.

Signed-off-by: Ivo Sieben meltedpiano...@gmail.com
---

 REPOST:
 Request for comments:
 - Does this make any sense?
 - I assume that I can safely use the list_empty_careful() function here, but is
   that correct?

 Background to this patch:
 Testing on a PREEMPT_RT system with TTY serial communication. Each time the TTY
 line discipline is dereferenced the Idle handling wait queue is woken up (see
 function put_ldisc in /drivers/tty/tty_ldisc.c)
 However line discipline idle handling is not used very often so the wait queue
 is empty most of the time. But still the wake_up() function enters the critical
 section guarded by spin locks. This causes additional scheduling overhead when
 a lower priority thread has control of that same lock.

 kernel/sched/core.c |   16 +---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c177472..c1667c4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3090,9 +3090,19 @@ void __wake_up(wait_queue_head_t *q, unsigned int mode,
 {
unsigned long flags;
 
-   spin_lock_irqsave(q-lock, flags);
-   __wake_up_common(q, mode, nr_exclusive, 0, key);
-   spin_unlock_irqrestore(q-lock, flags);
+   /*
+* We can check for list emptiness outside the lock by using the
+* careful check that verifies both the next and prev pointers, so
+* that there cannot be any half-pending updates in progress.
+*
+* This prevents the wake up to enter the critical section needlessly
+* when the task list is empty.
+*/
+   if (!list_empty_careful(q-task_list)) {
+   spin_lock_irqsave(q-lock, flags);
+   __wake_up_common(q, mode, nr_exclusive, 0, key);
+   spin_unlock_irqrestore(q-lock, flags);
+   }
 }
 EXPORT_SYMBOL(__wake_up);
 
-- 
1.7.9.5


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Andi Kleen
Ivo Sieben meltedpiano...@gmail.com writes:

 Check the waitqueue task list to be non empty before entering the critical
 section. This prevents locking the spin lock needlessly in case the queue
 was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
 system.

 Signed-off-by: Ivo Sieben meltedpiano...@gmail.com
 ---

  REPOST:
  Request for comments:
  - Does this make any sense?

Looks good to me. Avoiding any spinlock is good.  I don't even think you
need careful, if you miss an update it was just as it happened a
little later.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Peter Zijlstra
On Tue, 2012-10-09 at 06:37 -0700, Andi Kleen wrote:
 Ivo Sieben meltedpiano...@gmail.com writes:
 
  Check the waitqueue task list to be non empty before entering the critical
  section. This prevents locking the spin lock needlessly in case the queue
  was empty, and therefor also prevent scheduling overhead on a PREEMPT_RT
  system.
 
  Signed-off-by: Ivo Sieben meltedpiano...@gmail.com
  ---
 
   REPOST:
   Request for comments:
   - Does this make any sense?
 
 Looks good to me. Avoiding any spinlock is good.  I don't even think you
 need careful, if you miss an update it was just as it happened a
 little later.

Yeah, so I've started looking at this several times, but never had the
time to actually think it through. I think I'll agree with you that
using the careful list empty check isn't needed.

In general there's already the race of doing a wakeup before the wait
and if the code using the waitqueue doesn't deal with that its already
broken, so in that respect you should be good, since you're simply
shifting the timing a little.

One thing you might need to consider is the memory ordering, will the
list_empty -- either careful or not -- observe the right list pointer,
or could it -- when racing with wait_event()/prepare_to_wait() --
observe a stale value. Or.. is that all already covered in on the use
site.

I started auditing a few users to see what they all assume, if they're
already broken and or if they would now be broken.. but like said, I
didn't get anywhere.


I'd like to see this patch/Changelog amended to at least cover these
points so that I feel all warm and fuzzy when I read it and not have to
think too hard ;-)
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [REPOST] RFC: sched: Prevent wakeup to enter critical section needlessly

2012-10-09 Thread Oleg Nesterov
On 10/09, Peter Zijlstra wrote:

 One thing you might need to consider is the memory ordering, will the
 list_empty -- either careful or not -- observe the right list pointer,
 or could it -- when racing with wait_event()/prepare_to_wait() --
 observe a stale value. Or.. is that all already covered in on the use
 site.

I agree.

Without spin_lock(q-lock) (or some other barriers) wait_event-like
code can miss an event.

wait_event:

prepare_to_wait(wq) // takes wq-lock

if (!CONDITION)
schedule();

Now,

CONDITION = 1;
wake_up(wq);

at least need the full mb() before lits_empty().

Oleg.

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/