Re: Patch for lost wakeups

2013-08-11 Thread James Bottomley
On Sun, 2013-08-11 at 19:39 +0200, Oleg Nesterov wrote:
> On 08/10, Long Gao wrote:
> >
> > By the way, could you help me join the linux kernel mailing list?
>
> Do you mean, you want to subscribe?
>
> Well, from http://www.tux.org/lkml/#s3-1
>
> send the line "subscribe linux-kernel

Re: Patch for lost wakeups

2013-08-11 Thread Oleg Nesterov
On 08/10, Long Gao wrote:
> By the way, could you help me join the linux kernel mailing list?
Do you mean, you want to subscribe?
Well, from http://www.tux.org/lkml/#s3-1
send the line "subscribe linux-kernel your_email@your_ISP" in the body of the message to

Re: Patch for lost wakeups

2013-08-11 Thread Oleg Nesterov
On 08/11, Oleg Nesterov wrote:
> On 08/09, Linus Torvalds wrote:
> >
> > I guess that instead of a "smp_wmb()", we could do another
> > "smp_mb__before_spinlock()" thing, like we already allow for other
> > architectures to do a weaker form of mb in case the spinlock is
> > already a full mb.

Re: Patch for lost wakeups

2013-08-11 Thread Oleg Nesterov
On 08/09, Linus Torvalds wrote:
> I guess that instead of a "smp_wmb()", we could do another
> "smp_mb__before_spinlock()" thing, like we already allow for other
> architectures to do a weaker form of mb in case the spinlock is
> already a full mb. That would allow avoiding extra

Re: Patch for lost wakeups

2013-08-09 Thread Linus Torvalds
On Fri, Aug 9, 2013 at 6:04 AM, Oleg Nesterov wrote:
>> The case of signals is special, in that the "wakeup criteria" is
>> inside the scheduler itself, but conceptually the rule is the same.
>
> yes, and because the waiter lacks mb().
Hmm. Ok. So say we have a sleeper that does
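
For readers of the archive, the general sleeper/waker ordering rule this exchange refers to can be sketched in user space. This is only an analogy under stated assumptions, not the kernel code and not the example from the truncated message above: `struct waiter`, `sleeper_prepare()`, `waker_signal()` and the `*_SKETCH` constants are hypothetical names, and C11 `atomic_thread_fence(memory_order_seq_cst)` stands in for the full barrier ("mb") on each side. The sleeper publishes its state and then re-checks the condition; the waker sets the condition and then checks the state. With a full barrier between the store and the load on both sides, at least one side observes the other's store, so the wakeup is not lost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names throughout: a user-space analogy, not kernel code. */
enum { TASK_RUNNING_SKETCH = 0, TASK_SLEEPING_SKETCH = 1 };

struct waiter {
    atomic_int  state;      /* stands in for the task's ->state */
    atomic_bool condition;  /* stands in for the "wakeup criteria" */
};

static void waiter_init(struct waiter *w)
{
    atomic_init(&w->state, TASK_RUNNING_SKETCH);
    atomic_init(&w->condition, false);
}

/*
 * Sleeper side: publish "about to sleep", issue a full barrier, then
 * re-check the condition. Returns true if the caller should block.
 */
static bool sleeper_prepare(struct waiter *w)
{
    atomic_store_explicit(&w->state, TASK_SLEEPING_SKETCH,
                          memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* the waiter's "mb" */
    if (atomic_load_explicit(&w->condition, memory_order_relaxed)) {
        /* Condition already set: cancel the sleep instead of blocking. */
        atomic_store_explicit(&w->state, TASK_RUNNING_SKETCH,
                              memory_order_relaxed);
        return false;
    }
    return true;
}

/*
 * Waker side: set the condition, issue a full barrier, then test the
 * state. Returns true if a sleeping waiter was actually woken.
 */
static bool waker_signal(struct waiter *w)
{
    int expected = TASK_SLEEPING_SKETCH;

    atomic_store_explicit(&w->condition, true, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);  /* the waker's "mb" */
    return atomic_compare_exchange_strong(&w->state, &expected,
                                          TASK_RUNNING_SKETCH);
}
```

If either fence is dropped, the store and the following load on that side may be reordered, and both sides can miss each other: the sleeper sees no condition and the waker sees no sleeper. That is the shape of a lost wakeup.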

block_all_signals() must die (Was: Patch for lost wakeups)

2013-08-09 Thread Oleg Nesterov
And sorry for off-topic email, but I can't resist. Can't we finally kill block_all_signals() and ->notifier ? This is very, very wrong and doesn't work anyway. I tried to ask many, many times. Starting from 2007 at least. And every time the discussion "hangs". I am quoting the last email I sent

Re: Patch for lost wakeups

2013-08-09 Thread Oleg Nesterov
On 08/08, Linus Torvalds wrote:
> > Name: Xorg
> > State: S (sleeping)
> > Tgid: 2597
> > Pid: 2597
> > PPid: 2595
> > TracerPid: 0
> > Uid:0000
> > Gid:0000
> > FDSize: 64
> > Groups:
> > VmPeak: 44640 kB
> > VmSize: 31232 kB

Re: Patch for lost wakeups

2013-08-09 Thread Oleg Nesterov
On 08/08, Linus Torvalds wrote:
> On Thu, Aug 8, 2013 at 12:17 PM, Oleg Nesterov wrote:
> >
> >> and as far as I can tell we have proper barriers for those (the
> >> scheduler gets the rq lock
> >
> > Yes, but... ttwu() takes another lock, ->pi_lock to test ->state.
> > The lock is different,

Re: Patch for lost wakeups

2013-08-08 Thread Linus Torvalds
On Thu, Aug 8, 2013 at 12:17 PM, Oleg Nesterov wrote:
>> - somebody setting TASK_SLEEPING -> __schedule() testing the
>> signal_pending_state()
>>
>> and as far as I can tell we have proper barriers for those (the
>> scheduler gets the rq lock
>
> Yes, but... ttwu() takes another lock,

Re: Patch for lost wakeups

2013-08-08 Thread Oleg Nesterov
On 08/08, Linus Torvalds wrote:
> As a result, doing a "recalc_sigpending_and_wake()"
and btw it should die, I think.
> is definitely
> incorrect, because sigpending state cannot actually have changed.
Yes, if we need to wake up in this case something is already wrong.
> - somebody setting

Re: Patch for lost wakeups

2013-08-08 Thread Linus Torvalds
[ Adding proper people, and the kernel mailing list ] The patch is definitely incorrect, but the bug is interesting, so I'm cc'ing more people in case anybody else has any input on this. The reason I say that the patch is incorrect is because "legacy_queue()" doesn't actually *do* anything to