Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
On (11/09/17 00:06), Steven Rostedt wrote:
> What does safe context mean?

"safe" means that we don't cause lockups, stalls, sched throttlings, etc.
by doing console_unlock() from that context [task].


> Do we really want to allow the printk thread to sleep when there's more
> to print? What happens if there's a crash at that moment? How do we safely
> flush out all the data when the printk thread is sleeping?

printk-kthread does not schedule with the console_sem locked. one
of the changes to console_unlock() introduced with printk-kthread,
which we can't have without offloading.


> Now we could have something that uses both nicely. When the
> printk_thread wakes up (we need to figure out when to do that), then it
> could constantly take over.

certainly we can have a better hand-off scheme in printk-kthread patch set.


> 
>   CPU1CPU2
>   
>console_unlock()
>  start printing a lot
>  (more than one, wake up printk_thread)
> 
>   printk thread wakes up
> 
>   becomes the waiter
> 
>sees waiter hands off
> 
>   starts printing
> 
>printk()
>  becomes waiter
> 
>   sees waiter hands off
>   then becomes new waiter! <-- key
> 
> starts printing
> sees waiter hands off
>   continues printing

there are corners cases here. learned the hard way. real reproducers
do exist.

wake_up_process() may enqueue printk_thread on the same rq that
current printk task is running on. so if your printk(), for instance,
is from IRQ then offloading won't happen.


> That is, we keep the waiter logic, and if anyone starts printing too
> much, it wakes up the printk thread (hopefully on another CPU, or the
> printk thread should migrate)  when the printk thread starts running it

it must migrate, yes. currently I'm playing games with the affinity
mask of printk-kthread when I do offloading.


> becomes the new waiter if the console lock is still held (just like in
> printk). Then it gets handed off the printk. We could just have the
> printk thread keep going, though I'm not sure I would want to let it
> schedule while printing. 

yes, scheduling under console_sem is not right. we don't want this.
not anymore, at least.


> But it could also hand off printks (like
> above), but then take it back immediately. This would mean that a
> printk caller from a "critical" path will only get to do one message,
> before the printk thread asks for it again.
> 
> Perhaps we could have more than one printk thread that migrates around,
> and they each hand off the printing. This makes sure the printing
> always happens and that it never stops due to the console_lock holder
> sleeping and we never lock up one CPU that does printing. This would
> work with just two printk threads. When one starts a printk loop,
> another one wakes up on another CPU and becomes the waiter to get the
> handoff of the console_lock. Then the first could schedule out (migrate
> if the current CPU is busy), and take over. In  fact, this would
> basically have two CPUs bouncing back and forth to do the printing.

can be. I pushed it much further, once. [probably too far].
and had per-CPU printk-kthreads :)


> This gives us our cake and we get to eat it too.
> 
> One, printing never stops (no scheduling out), as there's two threads
> to share the load (obiously only on SMP machines).
> 
> There's no lock up. There's two threads that print a little, pass off
> the console lock, do a cond_resched(), then takes over again.
> 
> Bascially, what I'm saying is that this is not two different solutions.
> There is two algorithms that can work together to give us reliable
> output and not lock up the system in doing so.

sure, I understand.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
On (11/09/17 00:06), Steven Rostedt wrote:
> What does safe context mean?

"safe" means that we don't cause lockups, stalls, sched throttlings, etc.
by doing console_unlock() from that context [task].


> Do we really want to allow the printk thread to sleep when there's more
> to print? What happens if there's a crash at that moment? How do we safely
> flush out all the data when the printk thread is sleeping?

printk-kthread does not schedule with the console_sem locked. one
of the changes to console_unlock() introduced with printk-kthread,
which we can't have without offloading.


> Now we could have something that uses both nicely. When the
> printk_thread wakes up (we need to figure out when to do that), then it
> could constantly take over.

certainly we can have a better hand-off scheme in printk-kthread patch set.


> 
>   CPU1CPU2
>   
>console_unlock()
>  start printing a lot
>  (more than one, wake up printk_thread)
> 
>   printk thread wakes up
> 
>   becomes the waiter
> 
>sees waiter hands off
> 
>   starts printing
> 
>printk()
>  becomes waiter
> 
>   sees waiter hands off
>   then becomes new waiter! <-- key
> 
> starts printing
> sees waiter hands off
>   continues printing

there are corners cases here. learned the hard way. real reproducers
do exist.

wake_up_process() may enqueue printk_thread on the same rq that
current printk task is running on. so if your printk(), for instance,
is from IRQ then offloading won't happen.


> That is, we keep the waiter logic, and if anyone starts printing too
> much, it wakes up the printk thread (hopefully on another CPU, or the
> printk thread should migrate)  when the printk thread starts running it

it must migrate, yes. currently I'm playing games with the affinity
mask of printk-kthread when I do offloading.


> becomes the new waiter if the console lock is still held (just like in
> printk). Then it gets handed off the printk. We could just have the
> printk thread keep going, though I'm not sure I would want to let it
> schedule while printing. 

yes, scheduling under console_sem is not right. we don't want this.
not anymore, at least.


> But it could also hand off printks (like
> above), but then take it back immediately. This would mean that a
> printk caller from a "critical" path will only get to do one message,
> before the printk thread asks for it again.
> 
> Perhaps we could have more than one printk thread that migrates around,
> and they each hand off the printing. This makes sure the printing
> always happens and that it never stops due to the console_lock holder
> sleeping and we never lock up one CPU that does printing. This would
> work with just two printk threads. When one starts a printk loop,
> another one wakes up on another CPU and becomes the waiter to get the
> handoff of the console_lock. Then the first could schedule out (migrate
> if the current CPU is busy), and take over. In  fact, this would
> basically have two CPUs bouncing back and forth to do the printing.

can be. I pushed it much further, once. [probably too far].
and had per-CPU printk-kthreads :)


> This gives us our cake and we get to eat it too.
> 
> One, printing never stops (no scheduling out), as there's two threads
> to share the load (obiously only on SMP machines).
> 
> There's no lock up. There's two threads that print a little, pass off
> the console lock, do a cond_resched(), then takes over again.
> 
> Bascially, what I'm saying is that this is not two different solutions.
> There is two algorithms that can work together to give us reliable
> output and not lock up the system in doing so.

sure, I understand.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Thu, 9 Nov 2017 13:45:48 +0900
Sergey Senozhatsky  wrote:

>
> so what we are looking at
> 
>a) we take over printing. can be from safe context to unsafe context
>   [well, bad karma]. can be from unsafe context to a safe one. or from
>   safe context to another safe context... or from one unsafe context to
>   another unsafe context [bad karma again]. we really never know, no
>   one does.
> 
>   lots of uncertainties - "may be X, may be Y, may be Z". a bigger
>   picture: we still can have the same lockup scenarios as we do
>   have today.
> 
>   and we also bring busy loop with us, so the new console_sem
>   owner [regardless its current context] CPU must wait until the
>   current console_sem finishes its call_console_drivers(). I
>   mentioned it in my another email, you seemed to jump over that
>   part. was it irrelevant or wrong?
> 
> vs.
> 
>b) we offload to printk_kthread [safe context].
> 
> 
> why (a) is better than (b)?
> 


What does safe context mean? Do we really want to allow the printk
thread to sleep when there's more to print? What happens if there's a
crash at that moment? How do we safely flush out all the data when the
printk thread is sleeping?

Now we could have something that uses both nicely. When the
printk_thread wakes up (we need to figure out when to do that), then it
could constantly take over.


CPU1CPU2

   console_unlock()
 start printing a lot
 (more than one, wake up printk_thread)

printk thread wakes up

becomes the waiter

   sees waiter hands off

starts printing

   printk()
 becomes waiter

sees waiter hands off
then becomes new waiter! <-- key

starts printing
sees waiter hands off
continues printing


That is, we keep the waiter logic, and if anyone starts printing too
much, it wakes up the printk thread (hopefully on another CPU, or the
printk thread should migrate)  when the printk thread starts running it
becomes the new waiter if the console lock is still held (just like in
printk). Then it gets handed off the printk. We could just have the
printk thread keep going, though I'm not sure I would want to let it
schedule while printing. But it could also hand off printks (like
above), but then take it back immediately. This would mean that a
printk caller from a "critical" path will only get to do one message,
before the printk thread asks for it again.

Perhaps we could have more than one printk thread that migrates around,
and they each hand off the printing. This makes sure the printing
always happens and that it never stops due to the console_lock holder
sleeping and we never lock up one CPU that does printing. This would
work with just two printk threads. When one starts a printk loop,
another one wakes up on another CPU and becomes the waiter to get the
handoff of the console_lock. Then the first could schedule out (migrate
if the current CPU is busy), and take over. In  fact, this would
basically have two CPUs bouncing back and forth to do the printing.

This gives us our cake and we get to eat it too.

One, printing never stops (no scheduling out), as there's two threads
to share the load (obiously only on SMP machines).

There's no lock up. There's two threads that print a little, pass off
the console lock, do a cond_resched(), then takes over again.

Bascially, what I'm saying is that this is not two different solutions.
There is two algorithms that can work together to give us reliable
output and not lock up the system in doing so.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Thu, 9 Nov 2017 13:45:48 +0900
Sergey Senozhatsky  wrote:

>
> so what we are looking at
> 
>a) we take over printing. can be from safe context to unsafe context
>   [well, bad karma]. can be from unsafe context to a safe one. or from
>   safe context to another safe context... or from one unsafe context to
>   another unsafe context [bad karma again]. we really never know, no
>   one does.
> 
>   lots of uncertainties - "may be X, may be Y, may be Z". a bigger
>   picture: we still can have the same lockup scenarios as we do
>   have today.
> 
>   and we also bring busy loop with us, so the new console_sem
>   owner [regardless its current context] CPU must wait until the
>   current console_sem finishes its call_console_drivers(). I
>   mentioned it in my another email, you seemed to jump over that
>   part. was it irrelevant or wrong?
> 
> vs.
> 
>b) we offload to printk_kthread [safe context].
> 
> 
> why (a) is better than (b)?
> 


What does safe context mean? Do we really want to allow the printk
thread to sleep when there's more to print? What happens if there's a
crash at that moment? How do we safely flush out all the data when the
printk thread is sleeping?

Now we could have something that uses both nicely. When the
printk_thread wakes up (we need to figure out when to do that), then it
could constantly take over.


CPU1CPU2

   console_unlock()
 start printing a lot
 (more than one, wake up printk_thread)

printk thread wakes up

becomes the waiter

   sees waiter hands off

starts printing

   printk()
 becomes waiter

sees waiter hands off
then becomes new waiter! <-- key

starts printing
sees waiter hands off
continues printing


That is, we keep the waiter logic, and if anyone starts printing too
much, it wakes up the printk thread (hopefully on another CPU, or the
printk thread should migrate)  when the printk thread starts running it
becomes the new waiter if the console lock is still held (just like in
printk). Then it gets handed off the printk. We could just have the
printk thread keep going, though I'm not sure I would want to let it
schedule while printing. But it could also hand off printks (like
above), but then take it back immediately. This would mean that a
printk caller from a "critical" path will only get to do one message,
before the printk thread asks for it again.

Perhaps we could have more than one printk thread that migrates around,
and they each hand off the printing. This makes sure the printing
always happens and that it never stops due to the console_lock holder
sleeping and we never lock up one CPU that does printing. This would
work with just two printk threads. When one starts a printk loop,
another one wakes up on another CPU and becomes the waiter to get the
handoff of the console_lock. Then the first could schedule out (migrate
if the current CPU is busy), and take over. In  fact, this would
basically have two CPUs bouncing back and forth to do the printing.

This gives us our cake and we get to eat it too.

One, printing never stops (no scheduling out), as there's two threads
to share the load (obiously only on SMP machines).

There's no lock up. There's two threads that print a little, pass off
the console lock, do a cond_resched(), then takes over again.

Bascially, what I'm saying is that this is not two different solutions.
There is two algorithms that can work together to give us reliable
output and not lock up the system in doing so.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
On (11/08/17 22:29), Steven Rostedt wrote:
> > On (11/08/17 09:29), Steven Rostedt wrote:
> > > On Wed, 8 Nov 2017 14:19:55 +0900
> > > Sergey Senozhatsky  wrote:
> > >   
> > > > the change goes further. I did express some of my concerns during the 
> > > > KS,
> > > > I'll just bring them to the list.
> > > > 
> > > > 
> > > > we now always shift printing from a save - scheduleable - context to
> > > > a potentially unsafe one - atomic. by example:  
> > > 
> > > And vice versa. We are now likely to go from a unscheduleable context
> > > to a schedule one, where before, that didn't exist.  
> > 
> > the existence of "and vice versa" is kinda alarming, isn't it? it's sort
> > of "yes, we can break some things, but we also can improve some things."
> 
> Not really. Because the heuristic is that what calls printk will do the
> printk.

so what we are looking at

   a) we take over printing. can be from safe context to unsafe context
  [well, bad karma]. can be from unsafe context to a safe one. or from
  safe context to another safe context... or from one unsafe context to
  another unsafe context [bad karma again]. we really never know, no
  one does.

  lots of uncertainties - "may be X, may be Y, may be Z". a bigger
  picture: we still can have the same lockup scenarios as we do
  have today.

  and we also bring busy loop with us, so the new console_sem
  owner [regardless its current context] CPU must wait until the
  current console_sem finishes its call_console_drivers(). I
  mentioned it in my another email, you seemed to jump over that
  part. was it irrelevant or wrong?

vs.

   b) we offload to printk_kthread [safe context].


why (a) is better than (b)?

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
On (11/08/17 22:29), Steven Rostedt wrote:
> > On (11/08/17 09:29), Steven Rostedt wrote:
> > > On Wed, 8 Nov 2017 14:19:55 +0900
> > > Sergey Senozhatsky  wrote:
> > >   
> > > > the change goes further. I did express some of my concerns during the 
> > > > KS,
> > > > I'll just bring them to the list.
> > > > 
> > > > 
> > > > we now always shift printing from a save - scheduleable - context to
> > > > a potentially unsafe one - atomic. by example:  
> > > 
> > > And vice versa. We are now likely to go from a unscheduleable context
> > > to a schedule one, where before, that didn't exist.  
> > 
> > the existence of "and vice versa" is kinda alarming, isn't it? it's sort
> > of "yes, we can break some things, but we also can improve some things."
> 
> Not really. Because the heuristic is that what calls printk will do the
> printk.

so what we are looking at

   a) we take over printing. can be from safe context to unsafe context
  [well, bad karma]. can be from unsafe context to a safe one. or from
  safe context to another safe context... or from one unsafe context to
  another unsafe context [bad karma again]. we really never know, no
  one does.

  lots of uncertainties - "may be X, may be Y, may be Z". a bigger
  picture: we still can have the same lockup scenarios as we do
  have today.

  and we also bring busy loop with us, so the new console_sem
  owner [regardless its current context] CPU must wait until the
  current console_sem finishes its call_console_drivers(). I
  mentioned it in my another email, you seemed to jump over that
  part. was it irrelevant or wrong?

vs.

   b) we offload to printk_kthread [safe context].


why (a) is better than (b)?

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Thu, 9 Nov 2017 09:56:35 +0900
Sergey Senozhatsky  wrote:

> Hello Steven,
> 
> On (11/08/17 09:29), Steven Rostedt wrote:
> > On Wed, 8 Nov 2017 14:19:55 +0900
> > Sergey Senozhatsky  wrote:
> >   
> > > the change goes further. I did express some of my concerns during the KS,
> > > I'll just bring them to the list.
> > > 
> > > 
> > > we now always shift printing from a save - scheduleable - context to
> > > a potentially unsafe one - atomic. by example:  
> > 
> > And vice versa. We are now likely to go from a unscheduleable context
> > to a schedule one, where before, that didn't exist.  
> 
> the existence of "and vice versa" is kinda alarming, isn't it? it's sort
> of "yes, we can break some things, but we also can improve some things."

Not really. Because the heuristic is that what calls printk will do the
printk.

> 
> > And my approach, makes it more likely that the task doing the printk
> > prints its own message, and less likely to print someone else's.
> >   
> > > 
> > > CPU0  CPU1~CPU10  CPU11
> > > 
> > > console_lock()
> > > 
> > >   printk();
> > > 
> > > console_unlock()  IRQ
> > >  set console_ownerprintk()
> > >sees console_owner
> > >set console_waiter
> > >  sees console_waiter
> > >  break
> > >console_unlock()
> > > lockup [?]  
> > 
> > How?  
> 
> oh, yes, the missing part - assume CPU1~CPU10 did 5000 printk() calls,
> while console_sem was locked on CPU0. then we console_unlock() from CPU0
> and shortly after IRQ->printk() from CPU11 forcibly takes over, so now
> we are in console_unlock() from atomic, printing some 5000 messages.

I'd say remove those 5000 printks ;-)

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Thu, 9 Nov 2017 09:56:35 +0900
Sergey Senozhatsky  wrote:

> Hello Steven,
> 
> On (11/08/17 09:29), Steven Rostedt wrote:
> > On Wed, 8 Nov 2017 14:19:55 +0900
> > Sergey Senozhatsky  wrote:
> >   
> > > the change goes further. I did express some of my concerns during the KS,
> > > I'll just bring them to the list.
> > > 
> > > 
> > > we now always shift printing from a save - scheduleable - context to
> > > a potentially unsafe one - atomic. by example:  
> > 
> > And vice versa. We are now likely to go from a unscheduleable context
> > to a schedule one, where before, that didn't exist.  
> 
> the existence of "and vice versa" is kinda alarming, isn't it? it's sort
> of "yes, we can break some things, but we also can improve some things."

Not really. Because the heuristic is that what calls printk will do the
printk.

> 
> > And my approach, makes it more likely that the task doing the printk
> > prints its own message, and less likely to print someone else's.
> >   
> > > 
> > > CPU0  CPU1~CPU10  CPU11
> > > 
> > > console_lock()
> > > 
> > >   printk();
> > > 
> > > console_unlock()  IRQ
> > >  set console_ownerprintk()
> > >sees console_owner
> > >set console_waiter
> > >  sees console_waiter
> > >  break
> > >console_unlock()
> > > lockup [?]  
> > 
> > How?  
> 
> oh, yes, the missing part - assume CPU1~CPU10 did 5000 printk() calls,
> while console_sem was locked on CPU0. then we console_unlock() from CPU0
> and shortly after IRQ->printk() from CPU11 forcibly takes over, so now
> we are in console_unlock() from atomic, printing some 5000 messages.

I'd say remove those 5000 printks ;-)

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
Hello Steven,

On (11/08/17 09:29), Steven Rostedt wrote:
> On Wed, 8 Nov 2017 14:19:55 +0900
> Sergey Senozhatsky  wrote:
> 
> > the change goes further. I did express some of my concerns during the KS,
> > I'll just bring them to the list.
> > 
> > 
> > we now always shift printing from a save - scheduleable - context to
> > a potentially unsafe one - atomic. by example:
> 
> And vice versa. We are now likely to go from a unscheduleable context
> to a schedule one, where before, that didn't exist.

the existence of "and vice versa" is kinda alarming, isn't it? it's sort
of "yes, we can break some things, but we also can improve some things."

> And my approach, makes it more likely that the task doing the printk
> prints its own message, and less likely to print someone else's.
> 
> > 
> > CPU0CPU1~CPU10  CPU11
> > 
> > console_lock()
> > 
> > printk();
> > 
> > console_unlock()IRQ
> >  set console_owner  printk()
> >  sees console_owner
> >  set console_waiter
> >  sees console_waiter
> >  break
> >  console_unlock()
> >   lockup [?]
> 
> How?

oh, yes, the missing part - assume CPU1~CPU10 did 5000 printk() calls,
while console_sem was locked on CPU0. then we console_unlock() from CPU0
and shortly after IRQ->printk() from CPU11 forcibly takes over, so now
we are in console_unlock() from atomic, printing some 5000 messages.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Sergey Senozhatsky
Hello Steven,

On (11/08/17 09:29), Steven Rostedt wrote:
> On Wed, 8 Nov 2017 14:19:55 +0900
> Sergey Senozhatsky  wrote:
> 
> > the change goes further. I did express some of my concerns during the KS,
> > I'll just bring them to the list.
> > 
> > 
> > we now always shift printing from a save - scheduleable - context to
> > a potentially unsafe one - atomic. by example:
> 
> And vice versa. We are now likely to go from a unscheduleable context
> to a schedule one, where before, that didn't exist.

the existence of "and vice versa" is kinda alarming, isn't it? it's sort
of "yes, we can break some things, but we also can improve some things."

> And my approach, makes it more likely that the task doing the printk
> prints its own message, and less likely to print someone else's.
> 
> > 
> > CPU0CPU1~CPU10  CPU11
> > 
> > console_lock()
> > 
> > printk();
> > 
> > console_unlock()IRQ
> >  set console_owner  printk()
> >  sees console_owner
> >  set console_waiter
> >  sees console_waiter
> >  break
> >  console_unlock()
> >   lockup [?]
> 
> How?

oh, yes, the missing part - assume CPU1~CPU10 did 5000 printk() calls,
while console_sem was locked on CPU0. then we console_unlock() from CPU0
and shortly after IRQ->printk() from CPU11 forcibly takes over, so now
we are in console_unlock() from atomic, printing some 5000 messages.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Wed, 8 Nov 2017 14:19:55 +0900
Sergey Senozhatsky  wrote:

> the change goes further. I did express some of my concerns during the KS,
> I'll just bring them to the list.
> 
> 
> we now always shift printing from a save - scheduleable - context to
> a potentially unsafe one - atomic. by example:

And vice versa. We are now likely to go from a unscheduleable context
to a schedule one, where before, that didn't exist.

And my approach, makes it more likely that the task doing the printk
prints its own message, and less likely to print someone else's.

> 
> CPU0  CPU1~CPU10  CPU11
> 
> console_lock()
> 
>   printk();
> 
> console_unlock()  IRQ
>  set console_ownerprintk()
>sees console_owner
>set console_waiter
>  sees console_waiter
>  break
>console_unlock()
> lockup [?]

How?

> 
> 
> so we are forcibly moving console_unlock() from safe CPU0 to unsafe CPU11.
> previously we would continue printing from a schedulable context.

And previously, we could be in unsafe CPU11 printing, and keep adding
to the buffer from safe CPUs, keeping CPU11 from ever stopping.

If anything, the patch makes the situation better, not worse.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-08 Thread Steven Rostedt
On Wed, 8 Nov 2017 14:19:55 +0900
Sergey Senozhatsky  wrote:

> the change goes further. I did express some of my concerns during the KS,
> I'll just bring them to the list.
> 
> 
> we now always shift printing from a save - scheduleable - context to
> a potentially unsafe one - atomic. by example:

And vice versa. We are now likely to go from a unscheduleable context
to a schedule one, where before, that didn't exist.

And my approach, makes it more likely that the task doing the printk
prints its own message, and less likely to print someone else's.

> 
> CPU0  CPU1~CPU10  CPU11
> 
> console_lock()
> 
>   printk();
> 
> console_unlock()  IRQ
>  set console_ownerprintk()
>sees console_owner
>set console_waiter
>  sees console_waiter
>  break
>console_unlock()
> lockup [?]

How?

> 
> 
> so we are forcibly moving console_unlock() from safe CPU0 to unsafe CPU11.
> previously we would continue printing from a schedulable context.

And previously, we could be in unsafe CPU11 printing, and keep adding
to the buffer from safe CPUs, keeping CPU11 from ever stopping.

If anything, the patch makes the situation better, not worse.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-07 Thread Sergey Senozhatsky
(Ccing Tejun)

On (11/07/17 10:40), Sergey Senozhatsky wrote:
> On (11/06/17 21:06), Tetsuo Handa wrote:
> > I tried your patch with warn_alloc() torture. It did not cause lockups.
> > But I felt that possibility of failing to flush last second messages (such
> > as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?
> 
> do I understand it correctly that there are "lost messages"?
> 
> sysrq-b does an immediate emergency reboot. "normally" it's not expected
> to flush any pending logbuf messages because it's an emergency-reboot...
> but in fact it does. and this is why sysrq-b is not 100% reliable:
> 
>   __handle_sysrq()
>   {
> pr_info("SysRq : ");
> 
> op_p = __sysrq_get_key_op(key);
> pr_cont("%s\n", op_p->action_msg);
> 
>   op_p->handler(key);
> 
> pr_cont("\n");
>   }
> 
> those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
> badly the system is screwed. if pr_info() deadlocks, then we never
> go to op_p->handler(key)->emergency_restart(). even if you suppress
> printing of info loglevel messages, pr_info() still goes to
> console_unlock() and prints [console_seq, log_next_seq] messages,
> if there any.
> 
> there is, however, a subtle behaviour change, I think.
> 
> previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
> would flush logbuf messages. now we have that "break out of console_unlock()
> loop even though there are pending messages, there is another CPU doing
> printk()". so sysrb-b instead of looping in console_unlock() goes directly
> to emergency_restart(). without the change it would have continued looping
> in console_unlock() and would have called emergency_restart() only when
> "console_seq == log_next_seq".
> 
> now... the "subtle" part here is that we had that thing:
>   - *IF* __handle_sysrq() grabs the console_sem then it will not
> return from console_unlock() until logbuf is empty. so
> concurrent printk() messages won't get lost.
> 
> what we have now is:
>   - if there are concurrent printk() then __handle_sysrq() does not
> fully flush the logbuf *even* if it grabbed the console_sem.

the change goes further. I did express some of my concerns during the KS,
I'll just bring them to the list.


we now always shift printing from a save - scheduleable - context to
a potentially unsafe one - atomic. by example:

CPU0CPU1~CPU10  CPU11

console_lock()

printk();

console_unlock()IRQ
 set console_owner  printk()
 sees console_owner
 set console_waiter
 sees console_waiter
 break
 console_unlock()
  lockup [?]


so we are forcibly moving console_unlock() from safe CPU0 to unsafe CPU11.
previously we would continue printing from a schedulable context.


another case. bare with me.

suppose that call_console_drivers() is slower than printk() -> log_store(),
which is often the case.

now assume the following:

CPU0CPU1

IRQ IRQ

printk()printk()
printk()printk()
printk()printk()


which probably could have been handled something like this:

CPU0CPU1

IRQ IRQ

printk()printk()
 log_store()
 log_store()
 console_unlock()
  call_console_drivers()
printk()
 log_store()
 goto again;
  call_console_drivers()
printk()
 log_store()
 goto again;
  call_console_drivers()
printk()
 log_store()
  console_unlock()
   call_console_drivers()
printk()
 log_store()
  console_unlock()
   call_console_drivers()


so CPU0 printed all the messages.
CPU1 simply did 3 * log_store()
// + spent some cycles on logbuf_lock spin_lock
// + console_sem trylock


but now every CPU will do call_console_drivers() + busy loop.


CPU0CPU1

IRQ IRQ

printk()printk()
 log_store()
 log_store()
 console_unlock()
  set console_owner
 sees console_owner
 sets console_waiter
 spin
  call_console_drivers()
  sees console_waiter
   break

printk()
 log_store()
 console_unlock()
  set console_owner
 sees console_owner
 sets console_waiter
 spin
 call_console_drivers()
 sees console_waiter
  break

   

Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-07 Thread Sergey Senozhatsky
(Ccing Tejun)

On (11/07/17 10:40), Sergey Senozhatsky wrote:
> On (11/06/17 21:06), Tetsuo Handa wrote:
> > I tried your patch with warn_alloc() torture. It did not cause lockups.
> > But I felt that possibility of failing to flush last second messages (such
> > as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?
> 
> do I understand it correctly that there are "lost messages"?
> 
> sysrq-b does an immediate emergency reboot. "normally" it's not expected
> to flush any pending logbuf messages because it's an emergency-reboot...
> but in fact it does. and this is why sysrq-b is not 100% reliable:
> 
>   __handle_sysrq()
>   {
> pr_info("SysRq : ");
> 
> op_p = __sysrq_get_key_op(key);
> pr_cont("%s\n", op_p->action_msg);
> 
>   op_p->handler(key);
> 
> pr_cont("\n");
>   }
> 
> those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
> badly the system is screwed. if pr_info() deadlocks, then we never
> go to op_p->handler(key)->emergency_restart(). even if you suppress
> printing of info loglevel messages, pr_info() still goes to
> console_unlock() and prints [console_seq, log_next_seq] messages,
> if there any.
> 
> there is, however, a subtle behaviour change, I think.
> 
> previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
> would flush logbuf messages. now we have that "break out of console_unlock()
> loop even though there are pending messages, there is another CPU doing
> printk()". so sysrb-b instead of looping in console_unlock() goes directly
> to emergency_restart(). without the change it would have continued looping
> in console_unlock() and would have called emergency_restart() only when
> "console_seq == log_next_seq".
> 
> now... the "subtle" part here is that we had that thing:
>   - *IF* __handle_sysrq() grabs the console_sem then it will not
> return from console_unlock() until logbuf is empty. so
> concurrent printk() messages won't get lost.
> 
> what we have now is:
>   - if there are concurrent printk() then __handle_sysrq() does not
> fully flush the logbuf *even* if it grabbed the console_sem.

the change goes further. I did express some of my concerns during the KS,
I'll just bring them to the list.


we now always shift printing from a save - scheduleable - context to
a potentially unsafe one - atomic. by example:

CPU0CPU1~CPU10  CPU11

console_lock()

printk();

console_unlock()IRQ
 set console_owner  printk()
 sees console_owner
 set console_waiter
 sees console_waiter
 break
 console_unlock()
  lockup [?]


so we are forcibly moving console_unlock() from safe CPU0 to unsafe CPU11.
previously we would continue printing from a schedulable context.


another case. bare with me.

suppose that call_console_drivers() is slower than printk() -> log_store(),
which is often the case.

now assume the following:

CPU0CPU1

IRQ IRQ

printk()printk()
printk()printk()
printk()printk()


which probably could have been handled something like this:

CPU0CPU1

IRQ IRQ

printk()printk()
 log_store()
 log_store()
 console_unlock()
  call_console_drivers()
printk()
 log_store()
 goto again;
  call_console_drivers()
printk()
 log_store()
 goto again;
  call_console_drivers()
printk()
 log_store()
  console_unlock()
   call_console_drivers()
printk()
 log_store()
  console_unlock()
   call_console_drivers()


so CPU0 printed all the messages.
CPU1 simply did 3 * log_store()
// + spent some cycles on logbuf_lock spin_lock
// + console_sem trylock


but now every CPU will do call_console_drivers() + busy loop.


CPU0CPU1

IRQ IRQ

printk()printk()
 log_store()
 log_store()
 console_unlock()
  set console_owner
 sees console_owner
 sets console_waiter
 spin
  call_console_drivers()
  sees console_waiter
   break

printk()
 log_store()
 console_unlock()
  set console_owner
 sees console_owner
 sets console_waiter
 spin
 call_console_drivers()
 sees console_waiter
  break

   

Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-06 Thread Sergey Senozhatsky
On (11/06/17 21:06), Tetsuo Handa wrote:
> I tried your patch with warn_alloc() torture. It did not cause lockups.
> But I felt that possibility of failing to flush last second messages (such
> as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?

do I understand it correctly that there are "lost messages"?

sysrq-b does an immediate emergency reboot. "normally" it's not expected
to flush any pending logbuf messages because it's an emergency-reboot...
but in fact it does. and this is why sysrq-b is not 100% reliable:

__handle_sysrq()
{
  pr_info("SysRq : ");

  op_p = __sysrq_get_key_op(key);
  pr_cont("%s\n", op_p->action_msg);

op_p->handler(key);

  pr_cont("\n");
}

those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
badly the system is screwed. if pr_info() deadlocks, then we never
go to op_p->handler(key)->emergency_restart(). even if you suppress
printing of info loglevel messages, pr_info() still goes to
console_unlock() and prints [console_seq, log_next_seq] messages,
if there any.

there is, however, a subtle behaviour change, I think.

previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
would flush logbuf messages. now we have that "break out of console_unlock()
loop even though there are pending messages, there is another CPU doing
printk()". so sysrb-b instead of looping in console_unlock() goes directly
to emergency_restart(). without the change it would have continued looping
in console_unlock() and would have called emergency_restart() only when
"console_seq == log_next_seq".

now... the "subtle" part here is that we had that thing:
- *IF* __handle_sysrq() grabs the console_sem then it will not
  return from console_unlock() until logbuf is empty. so
  concurrent printk() messages won't get lost.

what we have now is:
- if there are concurrent printk() then __handle_sysrq() does not
  fully flush the logbuf *even* if it grabbed the console_sem.

> -- vmcore-dmesg start --
> [  169.016198] postgres cpuset=
> [  169.032544]  filemap_fault+0x311/0x790
> [  169.047745] /
> [  169.047780]  mems_allowed=0
> [  169.050577]  ? xfs_ilock+0x126/0x1a0 [xfs]
> [  169.062769]  mems_allowed=0
> [  169.065754]  ? down_read_nested+0x3a/0x60
> [  169.065783]  ? xfs_ilock+0x126/0x1a0 [xfs]
> [  189.700206] sysrq: SysRq :
> [  189.700639]  __xfs_filemap_fault.isra.19+0x3f/0xe0 [xfs]
> [  189.700799]  xfs_filemap_fault+0xb/0x10 [xfs]
> [  189.703981] Trigger a crash
> [  189.707032]  __do_fault+0x19/0xa0
> [  189.710008] BUG: unable to handle kernel
> [  189.713387]  __handle_mm_fault+0xbb3/0xda0
> [  189.716473] NULL pointer dereference
> [  189.719674]  handle_mm_fault+0x14f/0x300
> [  189.722969]  at   (null)
> [  189.722974] IP: sysrq_handle_crash+0x3b/0x70
> [  189.726156]  ? handle_mm_fault+0x39/0x300
> [  189.729537] PGD 1170dc067
> [  189.732841]  __do_page_fault+0x23e/0x4f0
> [  189.735876] P4D 1170dc067
> [  189.739171]  do_page_fault+0x30/0x80
> [  189.742323] PUD 1170dd067
> [  189.745437]  page_fault+0x22/0x30
> [  189.748329] PMD 0
> [  189.751106] RIP: 0033:0x650390
> [  189.756583] RSP: 002b:7fffef6b1568 EFLAGS: 00010246
> [  189.759574] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
> [  189.762607] RAX:  RBX: 7fffef6b1594 RCX: 
> 7fae949caa20
> [  189.765665] Modules linked in:
> [  189.768423] RDX: 0008 RSI:  RDI: 
> 
> [  189.768425] RBP: 7fffef6b1590 R08: 0002 R09: 
> 0010
> [  189.771478]  ip6t_rpfilter
> [  189.774297] R10: 0001 R11: 0246 R12: 
> 
> [  189.777016]  ipt_REJECT
> [  189.779366] R13:  R14: 7fae969787e0 R15: 
> 0004
> [  189.782114]  nf_reject_ipv4
> [  189.784839] CPU: 7 PID: 6959 Comm: sleep Not tainted 4.14.0-rc8+ #302
> [  189.785113] Mem-Info:
> -- vmcore-dmesg end --

hm... wondering if this is a regression.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-06 Thread Sergey Senozhatsky
On (11/06/17 21:06), Tetsuo Handa wrote:
> I tried your patch with warn_alloc() torture. It did not cause lockups.
> But I felt that possibility of failing to flush last second messages (such
> as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?

do I understand it correctly that there are "lost messages"?

sysrq-b does an immediate emergency reboot. "normally" it's not expected
to flush any pending logbuf messages because it's an emergency-reboot...
but in fact it does. and this is why sysrq-b is not 100% reliable:

__handle_sysrq()
{
  pr_info("SysRq : ");

  op_p = __sysrq_get_key_op(key);
  pr_cont("%s\n", op_p->action_msg);

op_p->handler(key);

  pr_cont("\n");
}

those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
badly the system is screwed. if pr_info() deadlocks, then we never
go to op_p->handler(key)->emergency_restart(). even if you suppress
printing of info loglevel messages, pr_info() still goes to
console_unlock() and prints [console_seq, log_next_seq] messages,
if there any.

there is, however, a subtle behaviour change, I think.

previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
would flush logbuf messages. now we have that "break out of console_unlock()
loop even though there are pending messages, there is another CPU doing
printk()". so sysrb-b instead of looping in console_unlock() goes directly
to emergency_restart(). without the change it would have continued looping
in console_unlock() and would have called emergency_restart() only when
"console_seq == log_next_seq".

now... the "subtle" part here is that we had that thing:
- *IF* __handle_sysrq() grabs the console_sem then it will not
  return from console_unlock() until logbuf is empty. so
  concurrent printk() messages won't get lost.

what we have now is:
- if there are concurrent printk() then __handle_sysrq() does not
  fully flush the logbuf *even* if it grabbed the console_sem.

> -- vmcore-dmesg start --
> [  169.016198] postgres cpuset=
> [  169.032544]  filemap_fault+0x311/0x790
> [  169.047745] /
> [  169.047780]  mems_allowed=0
> [  169.050577]  ? xfs_ilock+0x126/0x1a0 [xfs]
> [  169.062769]  mems_allowed=0
> [  169.065754]  ? down_read_nested+0x3a/0x60
> [  169.065783]  ? xfs_ilock+0x126/0x1a0 [xfs]
> [  189.700206] sysrq: SysRq :
> [  189.700639]  __xfs_filemap_fault.isra.19+0x3f/0xe0 [xfs]
> [  189.700799]  xfs_filemap_fault+0xb/0x10 [xfs]
> [  189.703981] Trigger a crash
> [  189.707032]  __do_fault+0x19/0xa0
> [  189.710008] BUG: unable to handle kernel
> [  189.713387]  __handle_mm_fault+0xbb3/0xda0
> [  189.716473] NULL pointer dereference
> [  189.719674]  handle_mm_fault+0x14f/0x300
> [  189.722969]  at   (null)
> [  189.722974] IP: sysrq_handle_crash+0x3b/0x70
> [  189.726156]  ? handle_mm_fault+0x39/0x300
> [  189.729537] PGD 1170dc067
> [  189.732841]  __do_page_fault+0x23e/0x4f0
> [  189.735876] P4D 1170dc067
> [  189.739171]  do_page_fault+0x30/0x80
> [  189.742323] PUD 1170dd067
> [  189.745437]  page_fault+0x22/0x30
> [  189.748329] PMD 0
> [  189.751106] RIP: 0033:0x650390
> [  189.756583] RSP: 002b:7fffef6b1568 EFLAGS: 00010246
> [  189.759574] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
> [  189.762607] RAX:  RBX: 7fffef6b1594 RCX: 
> 7fae949caa20
> [  189.765665] Modules linked in:
> [  189.768423] RDX: 0008 RSI:  RDI: 
> 
> [  189.768425] RBP: 7fffef6b1590 R08: 0002 R09: 
> 0010
> [  189.771478]  ip6t_rpfilter
> [  189.774297] R10: 0001 R11: 0246 R12: 
> 
> [  189.777016]  ipt_REJECT
> [  189.779366] R13:  R14: 7fae969787e0 R15: 
> 0004
> [  189.782114]  nf_reject_ipv4
> [  189.784839] CPU: 7 PID: 6959 Comm: sleep Not tainted 4.14.0-rc8+ #302
> [  189.785113] Mem-Info:
> -- vmcore-dmesg end --

hm... wondering if this is a regression.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-06 Thread Tetsuo Handa
I tried your patch with warn_alloc() torture. It did not cause lockups.
But I felt that possibility of failing to flush last second messages (such
as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?

-- vmcore-dmesg start --
[  169.016198] postgres cpuset=
[  169.032544]  filemap_fault+0x311/0x790
[  169.047745] /
[  169.047780]  mems_allowed=0
[  169.050577]  ? xfs_ilock+0x126/0x1a0 [xfs]
[  169.062769]  mems_allowed=0
[  169.065754]  ? down_read_nested+0x3a/0x60
[  169.065783]  ? xfs_ilock+0x126/0x1a0 [xfs]
[  189.700206] sysrq: SysRq :
[  189.700639]  __xfs_filemap_fault.isra.19+0x3f/0xe0 [xfs]
[  189.700799]  xfs_filemap_fault+0xb/0x10 [xfs]
[  189.703981] Trigger a crash
[  189.707032]  __do_fault+0x19/0xa0
[  189.710008] BUG: unable to handle kernel
[  189.713387]  __handle_mm_fault+0xbb3/0xda0
[  189.716473] NULL pointer dereference
[  189.719674]  handle_mm_fault+0x14f/0x300
[  189.722969]  at   (null)
[  189.722974] IP: sysrq_handle_crash+0x3b/0x70
[  189.726156]  ? handle_mm_fault+0x39/0x300
[  189.729537] PGD 1170dc067
[  189.732841]  __do_page_fault+0x23e/0x4f0
[  189.735876] P4D 1170dc067
[  189.739171]  do_page_fault+0x30/0x80
[  189.742323] PUD 1170dd067
[  189.745437]  page_fault+0x22/0x30
[  189.748329] PMD 0
[  189.751106] RIP: 0033:0x650390
[  189.756583] RSP: 002b:7fffef6b1568 EFLAGS: 00010246
[  189.759574] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[  189.762607] RAX:  RBX: 7fffef6b1594 RCX: 7fae949caa20
[  189.765665] Modules linked in:
[  189.768423] RDX: 0008 RSI:  RDI: 
[  189.768425] RBP: 7fffef6b1590 R08: 0002 R09: 0010
[  189.771478]  ip6t_rpfilter
[  189.774297] R10: 0001 R11: 0246 R12: 
[  189.777016]  ipt_REJECT
[  189.779366] R13:  R14: 7fae969787e0 R15: 0004
[  189.782114]  nf_reject_ipv4
[  189.784839] CPU: 7 PID: 6959 Comm: sleep Not tainted 4.14.0-rc8+ #302
[  189.785113] Mem-Info:
-- vmcore-dmesg end --

-- serial console start --
[  168.975447] Mem-Info:
[  168.975453] active_anon:827953 inactive_anon:3376 isolated_anon:0
[  168.975453]  active_file:55 inactive_file:449 isolated_file:246
[  168.975453]  unevictable:0 dirty:2 writeback:68 unstable:0
[  168.975453]  slab_reclaimable:4344 slab_unreclaimable:36066
[  168.975453]  mapped:2250 shmem:3543 pagetables:9568 bounce:0
[  168.975453]  free:21398 free_pcp:175 free_cma:0
[  168.975458] Node 0 active_anon:3311812kB inactive_anon:13504kB 
active_file:220kB inactive_file:1796kB unevictable:0kB isolated(anon):0kB 
isolated(file):984kB mapped:9000kB dirty:8kB writeback:272kB shmem:14172kB 
shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2869248kB writeback_tmp:0kB 
unstable:0kB all_unreclaimable? no
[  168.975460] Node 0 DMA free:14756kB min:288kB low:360kB high:432kB 
active_anon:1088kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB 
kernel_stack:0kB pagetables:28kB bounce:0kB free_pcp:0kB local_pcp:0kB 
free_cma:0kB
[  168.975482] lowmem_reserve[]: 0 2686 3619 3619
[  168.975489] Node 0 DMA32 free:53624kB min:49956kB low:62444kB high:74932kB 
active_anon:2691088kB inactive_anon:16kB active_file:0kB inactive_file:0kB 
unevictable:0kB writepending:0kB present:3129152kB managed:2751400kB 
mlocked:0kB kernel_stack:32kB pagetables:4232kB bounce:0kB free_pcp:0kB 
local_pcp:0kB free_cma:0kB
[  168.975494] lowmem_reserve[]: 0 0 932 932
[  168.975501] Node 0 Normal free:17212kB min:17336kB low:21668kB high:26000kB 
active_anon:619636kB inactive_anon:13488kB active_file:220kB 
inactive_file:2872kB unevictable:0kB writepending:280kB present:1048576kB 
managed:954828kB mlocked:0kB kernel_stack:22784kB pagetables:34012kB bounce:0kB 
free_pcp:700kB local_pcp:40kB free_cma:0kB
[  168.975505] lowmem_reserve[]: 0 0 0 0
[  168.975512] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (UM) 
2*128kB (UM) 2*256kB (UM) 1*512kB (M) 1*1024kB (U) 0*2048kB 3*4096kB (M) = 
14756kB
[  168.975536] Node 0 DMA32: 18*4kB (U) 14*8kB (UM) 14*16kB (UE) 37*32kB (UE) 
9*64kB (UE) 4*128kB (UME) 1*256kB (M) 3*512kB (UME) 2*1024kB (UE) 3*2048kB 
(UME) 10*4096kB (M) = 53624kB
[0.00] Linux version 4.14.0-rc8+ (root@localhost.localdomain) (gcc 
version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)) #302 SMP Mon Nov 6 12:15:00 
JST 2017
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.0-rc8+ 
root=UUID=98df1583-260a-423a-a193-182dade5d085 ro security=none 
sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off 
udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug 
transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867704K
-- serial console end --


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-06 Thread Tetsuo Handa
I tried your patch with warn_alloc() torture. It did not cause lockups.
But I felt that possibility of failing to flush last second messages (such
as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?

-- vmcore-dmesg start --
[  169.016198] postgres cpuset=
[  169.032544]  filemap_fault+0x311/0x790
[  169.047745] /
[  169.047780]  mems_allowed=0
[  169.050577]  ? xfs_ilock+0x126/0x1a0 [xfs]
[  169.062769]  mems_allowed=0
[  169.065754]  ? down_read_nested+0x3a/0x60
[  169.065783]  ? xfs_ilock+0x126/0x1a0 [xfs]
[  189.700206] sysrq: SysRq :
[  189.700639]  __xfs_filemap_fault.isra.19+0x3f/0xe0 [xfs]
[  189.700799]  xfs_filemap_fault+0xb/0x10 [xfs]
[  189.703981] Trigger a crash
[  189.707032]  __do_fault+0x19/0xa0
[  189.710008] BUG: unable to handle kernel
[  189.713387]  __handle_mm_fault+0xbb3/0xda0
[  189.716473] NULL pointer dereference
[  189.719674]  handle_mm_fault+0x14f/0x300
[  189.722969]  at   (null)
[  189.722974] IP: sysrq_handle_crash+0x3b/0x70
[  189.726156]  ? handle_mm_fault+0x39/0x300
[  189.729537] PGD 1170dc067
[  189.732841]  __do_page_fault+0x23e/0x4f0
[  189.735876] P4D 1170dc067
[  189.739171]  do_page_fault+0x30/0x80
[  189.742323] PUD 1170dd067
[  189.745437]  page_fault+0x22/0x30
[  189.748329] PMD 0
[  189.751106] RIP: 0033:0x650390
[  189.756583] RSP: 002b:7fffef6b1568 EFLAGS: 00010246
[  189.759574] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
[  189.762607] RAX:  RBX: 7fffef6b1594 RCX: 7fae949caa20
[  189.765665] Modules linked in:
[  189.768423] RDX: 0008 RSI:  RDI: 
[  189.768425] RBP: 7fffef6b1590 R08: 0002 R09: 0010
[  189.771478]  ip6t_rpfilter
[  189.774297] R10: 0001 R11: 0246 R12: 
[  189.777016]  ipt_REJECT
[  189.779366] R13:  R14: 7fae969787e0 R15: 0004
[  189.782114]  nf_reject_ipv4
[  189.784839] CPU: 7 PID: 6959 Comm: sleep Not tainted 4.14.0-rc8+ #302
[  189.785113] Mem-Info:
-- vmcore-dmesg end --

-- serial console start --
[  168.975447] Mem-Info:
[  168.975453] active_anon:827953 inactive_anon:3376 isolated_anon:0
[  168.975453]  active_file:55 inactive_file:449 isolated_file:246
[  168.975453]  unevictable:0 dirty:2 writeback:68 unstable:0
[  168.975453]  slab_reclaimable:4344 slab_unreclaimable:36066
[  168.975453]  mapped:2250 shmem:3543 pagetables:9568 bounce:0
[  168.975453]  free:21398 free_pcp:175 free_cma:0
[  168.975458] Node 0 active_anon:3311812kB inactive_anon:13504kB 
active_file:220kB inactive_file:1796kB unevictable:0kB isolated(anon):0kB 
isolated(file):984kB mapped:9000kB dirty:8kB writeback:272kB shmem:14172kB 
shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 2869248kB writeback_tmp:0kB 
unstable:0kB all_unreclaimable? no
[  168.975460] Node 0 DMA free:14756kB min:288kB low:360kB high:432kB 
active_anon:1088kB inactive_anon:0kB active_file:0kB inactive_file:0kB 
unevictable:0kB writepending:0kB present:15988kB managed:15904kB mlocked:0kB 
kernel_stack:0kB pagetables:28kB bounce:0kB free_pcp:0kB local_pcp:0kB 
free_cma:0kB
[  168.975482] lowmem_reserve[]: 0 2686 3619 3619
[  168.975489] Node 0 DMA32 free:53624kB min:49956kB low:62444kB high:74932kB 
active_anon:2691088kB inactive_anon:16kB active_file:0kB inactive_file:0kB 
unevictable:0kB writepending:0kB present:3129152kB managed:2751400kB 
mlocked:0kB kernel_stack:32kB pagetables:4232kB bounce:0kB free_pcp:0kB 
local_pcp:0kB free_cma:0kB
[  168.975494] lowmem_reserve[]: 0 0 932 932
[  168.975501] Node 0 Normal free:17212kB min:17336kB low:21668kB high:26000kB 
active_anon:619636kB inactive_anon:13488kB active_file:220kB 
inactive_file:2872kB unevictable:0kB writepending:280kB present:1048576kB 
managed:954828kB mlocked:0kB kernel_stack:22784kB pagetables:34012kB bounce:0kB 
free_pcp:700kB local_pcp:40kB free_cma:0kB
[  168.975505] lowmem_reserve[]: 0 0 0 0
[  168.975512] Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (UM) 
2*128kB (UM) 2*256kB (UM) 1*512kB (M) 1*1024kB (U) 0*2048kB 3*4096kB (M) = 
14756kB
[  168.975536] Node 0 DMA32: 18*4kB (U) 14*8kB (UM) 14*16kB (UE) 37*32kB (UE) 
9*64kB (UE) 4*128kB (UME) 1*256kB (M) 3*512kB (UME) 2*1024kB (UE) 3*2048kB 
(UME) 10*4096kB (M) = 53624kB
[0.00] Linux version 4.14.0-rc8+ (root@localhost.localdomain) (gcc 
version 4.8.5 20150623 (Red Hat 4.8.5-16) (GCC)) #302 SMP Mon Nov 6 12:15:00 
JST 2017
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.14.0-rc8+ 
root=UUID=98df1583-260a-423a-a193-182dade5d085 ro security=none 
sysrq_always_enabled console=ttyS0,115200n8 console=tty0 LANG=en_US.UTF-8 
irqpoll nr_cpus=1 reset_devices cgroup_disable=memory mce=off numa=off 
udev.children-max=2 panic=10 rootflags=nofail acpi_no_memhotplug 
transparent_hugepage=never disable_cpu_apicid=0 elfcorehdr=867704K
-- serial console end --


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Sergey Senozhatsky
On (11/02/17 23:15), Steven Rostedt wrote:
> On Thu, 2 Nov 2017 23:16:16 +0100
> Vlastimil Babka  wrote:
> 
> > > + if (spin) {
> > > + /* We spin waiting for the owner to release us 
> > > */
> > > + spin_acquire(_owner_dep_map, 0, 0, 
> > > _THIS_IP_);
> > > + /* Owner will clear console_waiter on hand off 
> > > */
> > > + while (!READ_ONCE(console_waiter))  
> > 
> > This should not be negated, right? We should spin while it's true, not
> > false.
> 
> Ug, yes. How did that not crash in my tests.

Ah, right... Good catch, Vlastimil. The V1 didn't work as expected
on my tests.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Sergey Senozhatsky
On (11/02/17 23:15), Steven Rostedt wrote:
> On Thu, 2 Nov 2017 23:16:16 +0100
> Vlastimil Babka  wrote:
> 
> > > + if (spin) {
> > > + /* We spin waiting for the owner to release us 
> > > */
> > > + spin_acquire(_owner_dep_map, 0, 0, 
> > > _THIS_IP_);
> > > + /* Owner will clear console_waiter on hand off 
> > > */
> > > + while (!READ_ONCE(console_waiter))  
> > 
> > This should not be negated, right? We should spin while it's true, not
> > false.
> 
> Ug, yes. How did that not crash in my tests.

Ah, right... Good catch, Vlastimil. The V1 didn't work as expected
on my tests.

-ss


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Fri, 3 Nov 2017 07:54:04 -0400
Steven Rostedt  wrote:

> The new waiter gets set only if there isn't already a waiter *and*
> there is an owner that is not current (and with the printk_safe_enter I
> don't think that is even needed).
> 
> + while (!READ_ONCE(console_waiter))
> + cpu_relax();

I still need to fix the patch. I cut and pasted the bad version. This
should have been:

while (READ_ONCE(console_waiter))
cpu_relax();

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Fri, 3 Nov 2017 07:54:04 -0400
Steven Rostedt  wrote:

> The new waiter gets set only if there isn't already a waiter *and*
> there is an owner that is not current (and with the printk_safe_enter I
> don't think that is even needed).
> 
> + while (!READ_ONCE(console_waiter))
> + cpu_relax();

I still need to fix the patch. I cut and pasted the bad version. This
should have been:

while (READ_ONCE(console_waiter))
cpu_relax();

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Fri, 3 Nov 2017 07:21:21 -0400
Steven Rostedt  wrote:

> On Thu, 2 Nov 2017 21:09:32 -0700
> John Hubbard  wrote:
>

 > 
> > For example, if there are 3 or more threads, you can do the following:
> > 
> > thread A: holds the console lock, is printing, then moves into the 
> > console_unlock
> >   phase
> > 
> > thread B: goes into the waiter spin loop above, and (once the polarity is 
> > corrected)
> >   waits for console_waiter to become 0
> > 
> > thread A: finishing up, sets console_waiter --> 0
> > 
> > thread C: before thread B notices, thread C goes into the "else" section, 
> > sees that
> >   console_waiter == 0, and sets console_waiter --> 1. So thread C 
> > now
> >   becomes the waiter  
> 
> But console_waiter only gets set to 1 if console_waiter is 0 *and*
> console_owner is not NULL and is not current. console_owner is only
> updated under a spin lock and console_waiter is only set under a spin
> lock when console_owner is not NULL.
> 
> This means this scenario can not happen.
> 
> 
> > 
> > thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
> >   console_waiter, so it continues waiting.  And now we have both B
> >   and C in the same spin loop, and this is now broken.
> > 
> > At the root, this is really due to the absence of a pre-existing "hand-off 
> > this lock"
> > mechanism. And this one here is not quite correct.
> > 
> > Solution ideas: for a true hand-off, there needs to be a bit more 
> > information
> > exchanged. Conceptually, a (lock-protected) list of waiters (which would 
> > only ever have zero or one entries) is a good way to start thinking about 
> > it.  
> 
> As stated above, the console owner check will prevent this issue.
> 

I'll condense the patch to show what I mean:

To become a waiter, a task must do the following:

+   printk_safe_enter_irqsave(flags);
+
+   raw_spin_lock(_owner_lock);
+   owner = READ_ONCE(console_owner);
+   waiter = READ_ONCE(console_waiter);
+   if (!waiter && owner && owner != current) {
+   WRITE_ONCE(console_waiter, true);
+   spin = true;
+   }
+   raw_spin_unlock(_owner_lock);


The new waiter gets set only if there isn't already a waiter *and*
there is an owner that is not current (and with the printk_safe_enter I
don't think that is even needed).

+   while (!READ_ONCE(console_waiter))
+   cpu_relax();

The spin is outside the spin lock. But only the owner can clear it.

Now the owner is doing a loop of this (with interrupts disabled)

+   raw_spin_lock(_owner_lock);
+   console_owner = current;
+   raw_spin_unlock(_owner_lock);

Write to consoles.

+   raw_spin_lock(_owner_lock);
+   waiter = READ_ONCE(console_waiter);
+   console_owner = NULL;
+   raw_spin_unlock(_owner_lock);

+   if (waiter)
+   break;

At this moment console_owner is NULL, and no new waiters can happen.
The next owner will be the waiter that is spinning.

+   if (waiter) {
+   WRITE_ONCE(console_waiter, false);

There is no possibility of another task sneaking in and becoming a
waiter at this moment. The console_owner was cleared under spin lock,
and a waiter is only set under the same spin lock if owner is set.
There will be no new owner sneaking in because to become the owner, you
must have the console lock. Since it is never released between the time
the owner clears console_waiter and the waiter takes the console lock,
there is no race.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Fri, 3 Nov 2017 07:21:21 -0400
Steven Rostedt  wrote:

> On Thu, 2 Nov 2017 21:09:32 -0700
> John Hubbard  wrote:
>

 > 
> > For example, if there are 3 or more threads, you can do the following:
> > 
> > thread A: holds the console lock, is printing, then moves into the 
> > console_unlock
> >   phase
> > 
> > thread B: goes into the waiter spin loop above, and (once the polarity is 
> > corrected)
> >   waits for console_waiter to become 0
> > 
> > thread A: finishing up, sets console_waiter --> 0
> > 
> > thread C: before thread B notices, thread C goes into the "else" section, 
> > sees that
> >   console_waiter == 0, and sets console_waiter --> 1. So thread C 
> > now
> >   becomes the waiter  
> 
> But console_waiter only gets set to 1 if console_waiter is 0 *and*
> console_owner is not NULL and is not current. console_owner is only
> updated under a spin lock and console_waiter is only set under a spin
> lock when console_owner is not NULL.
> 
> This means this scenario can not happen.
> 
> 
> > 
> > thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
> >   console_waiter, so it continues waiting.  And now we have both B
> >   and C in the same spin loop, and this is now broken.
> > 
> > At the root, this is really due to the absence of a pre-existing "hand-off 
> > this lock"
> > mechanism. And this one here is not quite correct.
> > 
> > Solution ideas: for a true hand-off, there needs to be a bit more 
> > information
> > exchanged. Conceptually, a (lock-protected) list of waiters (which would 
> > only ever have zero or one entries) is a good way to start thinking about 
> > it.  
> 
> As stated above, the console owner check will prevent this issue.
> 

I'll condense the patch to show what I mean:

To become a waiter, a task must do the following:

+   printk_safe_enter_irqsave(flags);
+
+   raw_spin_lock(_owner_lock);
+   owner = READ_ONCE(console_owner);
+   waiter = READ_ONCE(console_waiter);
+   if (!waiter && owner && owner != current) {
+   WRITE_ONCE(console_waiter, true);
+   spin = true;
+   }
+   raw_spin_unlock(_owner_lock);


The new waiter gets set only if there isn't already a waiter *and*
there is an owner that is not current (and with the printk_safe_enter I
don't think that is even needed).

+   while (!READ_ONCE(console_waiter))
+   cpu_relax();

The spin is outside the spin lock. But only the owner can clear it.

Now the owner is doing a loop of this (with interrupts disabled)

+   raw_spin_lock(_owner_lock);
+   console_owner = current;
+   raw_spin_unlock(_owner_lock);

Write to consoles.

+   raw_spin_lock(_owner_lock);
+   waiter = READ_ONCE(console_waiter);
+   console_owner = NULL;
+   raw_spin_unlock(_owner_lock);

+   if (waiter)
+   break;

At this moment console_owner is NULL, and no new waiters can happen.
The next owner will be the waiter that is spinning.

+   if (waiter) {
+   WRITE_ONCE(console_waiter, false);

There is no possibility of another task sneaking in and becoming a
waiter at this moment. The console_owner was cleared under spin lock,
and a waiter is only set under the same spin lock if owner is set.
There will be no new owner sneaking in because to become the owner, you
must have the console lock. Since it is never released between the time
the owner clears console_waiter and the waiter takes the console lock,
there is no race.

-- Steve


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Thu, 2 Nov 2017 21:09:32 -0700
John Hubbard  wrote:

> On 11/02/2017 03:16 PM, Vlastimil Babka wrote:
> > On 11/02/2017 06:45 PM, Steven Rostedt wrote:  
> > ...>__DEVKMSG_LOG_BIT_ON = 0,
> >>__DEVKMSG_LOG_BIT_OFF,
> >> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> >> * semaphore.  The release will print out buffers and wake up
> >> * /dev/kmsg and syslog() users.
> >> */
> >> -  if (console_trylock())
> >> +  if (console_trylock()) {
> >>console_unlock();
> >> +  } else {
> >> +  struct task_struct *owner = NULL;
> >> +  bool waiter;
> >> +  bool spin = false;
> >> +
> >> +  printk_safe_enter_irqsave(flags);
> >> +
> >> +  raw_spin_lock(_owner_lock);
> >> +  owner = READ_ONCE(console_owner);
> >> +  waiter = READ_ONCE(console_waiter);
> >> +  if (!waiter && owner && owner != current) {
> >> +  WRITE_ONCE(console_waiter, true);
> >> +  spin = true;
> >> +  }
> >> +  raw_spin_unlock(_owner_lock);
> >> +
> >> +  /*
> >> +   * If there is an active printk() writing to the
> >> +   * consoles, instead of having it write our data too,
> >> +   * see if we can offload that load from the active
> >> +   * printer, and do some printing ourselves.
> >> +   * Go into a spin only if there isn't already a waiter
> >> +   * spinning, and there is an active printer, and
> >> +   * that active printer isn't us (recursive printk?).
> >> +   */
> >> +  if (spin) {
> >> +  /* We spin waiting for the owner to release us 
> >> */
> >> +  spin_acquire(_owner_dep_map, 0, 0, 
> >> _THIS_IP_);
> >> +  /* Owner will clear console_waiter on hand off 
> >> */
> >> +  while (!READ_ONCE(console_waiter))  
> > 
> > This should not be negated, right? We should spin while it's true, not
> > false.
> >   
> 
> Vlastimil's right about the polarity problem above, but while I was trying
> to verify that, I noticed another problem: the "handoff" of the console lock
> is broken.
> 
> For example, if there are 3 or more threads, you can do the following:
> 
> thread A: holds the console lock, is printing, then moves into the 
> console_unlock
>   phase
> 
> thread B: goes into the waiter spin loop above, and (once the polarity is 
> corrected)
>   waits for console_waiter to become 0
> 
> thread A: finishing up, sets console_waiter --> 0
> 
> thread C: before thread B notices, thread C goes into the "else" section, 
> sees that
>   console_waiter == 0, and sets console_waiter --> 1. So thread C now
>   becomes the waiter

But console_waiter only gets set to 1 if console_waiter is 0 *and*
console_owner is not NULL and is not current. console_owner is only
updated under a spin lock and console_waiter is only set under a spin
lock when console_owner is not NULL.

This means this scenario can not happen.


> 
> thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
>   console_waiter, so it continues waiting.  And now we have both B
>   and C in the same spin loop, and this is now broken.
> 
> At the root, this is really due to the absence of a pre-existing "hand-off 
> this lock"
> mechanism. And this one here is not quite correct.
> 
> Solution ideas: for a true hand-off, there needs to be a bit more information
> exchanged. Conceptually, a (lock-protected) list of waiters (which would 
> only ever have zero or one entries) is a good way to start thinking about it.

As stated above, the console owner check will prevent this issue.

-- Steve

> 
> I talked it over with Mark Hairgrove here, he suggested a more sophisticated
> way of doing that sort of hand-off, using compare-and-exchange. I can turn 
> that
> into a patch if you like (I'm not as fast as some folks, so I didn't attempt 
> to
> do that right away), although I'm sure you have lots of ideas on how to do it.
> 


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-03 Thread Steven Rostedt
On Thu, 2 Nov 2017 21:09:32 -0700
John Hubbard  wrote:

> On 11/02/2017 03:16 PM, Vlastimil Babka wrote:
> > On 11/02/2017 06:45 PM, Steven Rostedt wrote:  
> > ...>__DEVKMSG_LOG_BIT_ON = 0,
> >>__DEVKMSG_LOG_BIT_OFF,
> >> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
> >> * semaphore.  The release will print out buffers and wake up
> >> * /dev/kmsg and syslog() users.
> >> */
> >> -  if (console_trylock())
> >> +  if (console_trylock()) {
> >>console_unlock();
> >> +  } else {
> >> +  struct task_struct *owner = NULL;
> >> +  bool waiter;
> >> +  bool spin = false;
> >> +
> >> +  printk_safe_enter_irqsave(flags);
> >> +
> >> +  raw_spin_lock(_owner_lock);
> >> +  owner = READ_ONCE(console_owner);
> >> +  waiter = READ_ONCE(console_waiter);
> >> +  if (!waiter && owner && owner != current) {
> >> +  WRITE_ONCE(console_waiter, true);
> >> +  spin = true;
> >> +  }
> >> +  raw_spin_unlock(_owner_lock);
> >> +
> >> +  /*
> >> +   * If there is an active printk() writing to the
> >> +   * consoles, instead of having it write our data too,
> >> +   * see if we can offload that load from the active
> >> +   * printer, and do some printing ourselves.
> >> +   * Go into a spin only if there isn't already a waiter
> >> +   * spinning, and there is an active printer, and
> >> +   * that active printer isn't us (recursive printk?).
> >> +   */
> >> +  if (spin) {
> >> +  /* We spin waiting for the owner to release us 
> >> */
> >> +  spin_acquire(_owner_dep_map, 0, 0, 
> >> _THIS_IP_);
> >> +  /* Owner will clear console_waiter on hand off 
> >> */
> >> +  while (!READ_ONCE(console_waiter))  
> > 
> > This should not be negated, right? We should spin while it's true, not
> > false.
> >   
> 
> Vlastimil's right about the polarity problem above, but while I was trying
> to verify that, I noticed another problem: the "handoff" of the console lock
> is broken.
> 
> For example, if there are 3 or more threads, you can do the following:
> 
> thread A: holds the console lock, is printing, then moves into the 
> console_unlock
>   phase
> 
> thread B: goes into the waiter spin loop above, and (once the polarity is 
> corrected)
>   waits for console_waiter to become 0
> 
> thread A: finishing up, sets console_waiter --> 0
> 
> thread C: before thread B notices, thread C goes into the "else" section, 
> sees that
>   console_waiter == 0, and sets console_waiter --> 1. So thread C now
>   becomes the waiter

But console_waiter only gets set to 1 if console_waiter is 0 *and*
console_owner is not NULL and is not current. console_owner is only
updated under a spin lock and console_waiter is only set under a spin
lock when console_owner is not NULL.

This means this scenario can not happen.


> 
> thread B: gets *very* unlucky and never sees the 1 --> 0 --> 1 transition of
>   console_waiter, so it continues waiting.  And now we have both B
>   and C in the same spin loop, and this is now broken.
> 
> At the root, this is really due to the absence of a pre-existing "hand-off 
> this lock"
> mechanism. And this one here is not quite correct.
> 
> Solution ideas: for a true hand-off, there needs to be a bit more information
> exchanged. Conceptually, a (lock-protected) list of waiters (which would 
> only ever have zero or one entries) is a good way to start thinking about it.

As stated above, the console owner check will prevent this issue.

-- Steve

> 
> I talked it over with Mark Hairgrove here, he suggested a more sophisticated
> way of doing that sort of hand-off, using compare-and-exchange. I can turn 
> that
> into a patch if you like (I'm not as fast as some folks, so I didn't attempt 
> to
> do that right away), although I'm sure you have lots of ideas on how to do it.
> 


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-02 Thread Steven Rostedt
On Thu, 2 Nov 2017 23:16:16 +0100
Vlastimil Babka  wrote:

> > +   if (spin) {
> > +   /* We spin waiting for the owner to release us 
> > */
> > +   spin_acquire(_owner_dep_map, 0, 0, 
> > _THIS_IP_);
> > +   /* Owner will clear console_waiter on hand off 
> > */
> > +   while (!READ_ONCE(console_waiter))  
> 
> This should not be negated, right? We should spin while it's true, not
> false.

Ug, yes. How did that not crash in my tests.

Will fix.

-- Steve



Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-02 Thread Steven Rostedt
On Thu, 2 Nov 2017 23:16:16 +0100
Vlastimil Babka  wrote:

> > +   if (spin) {
> > +   /* We spin waiting for the owner to release us 
> > */
> > +   spin_acquire(_owner_dep_map, 0, 0, 
> > _THIS_IP_);
> > +   /* Owner will clear console_waiter on hand off 
> > */
> > +   while (!READ_ONCE(console_waiter))  
> 
> This should not be negated, right? We should spin while it's true, not
> false.

Ug, yes. How did that not crash in my tests.

Will fix.

-- Steve



Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-02 Thread Vlastimil Babka
On 11/02/2017 06:45 PM, Steven Rostedt wrote:
...>__DEVKMSG_LOG_BIT_ON = 0,
>   __DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>* semaphore.  The release will print out buffers and wake up
>* /dev/kmsg and syslog() users.
>*/
> - if (console_trylock())
> + if (console_trylock()) {
>   console_unlock();
> + } else {
> + struct task_struct *owner = NULL;
> + bool waiter;
> + bool spin = false;
> +
> + printk_safe_enter_irqsave(flags);
> +
> + raw_spin_lock(_owner_lock);
> + owner = READ_ONCE(console_owner);
> + waiter = READ_ONCE(console_waiter);
> + if (!waiter && owner && owner != current) {
> + WRITE_ONCE(console_waiter, true);
> + spin = true;
> + }
> + raw_spin_unlock(_owner_lock);
> +
> + /*
> +  * If there is an active printk() writing to the
> +  * consoles, instead of having it write our data too,
> +  * see if we can offload that load from the active
> +  * printer, and do some printing ourselves.
> +  * Go into a spin only if there isn't already a waiter
> +  * spinning, and there is an active printer, and
> +  * that active printer isn't us (recursive printk?).
> +  */
> + if (spin) {
> + /* We spin waiting for the owner to release us 
> */
> + spin_acquire(_owner_dep_map, 0, 0, 
> _THIS_IP_);
> + /* Owner will clear console_waiter on hand off 
> */
> + while (!READ_ONCE(console_waiter))

This should not be negated, right? We should spin while it's true, not
false.

> + cpu_relax();
> +
> + spin_release(_owner_dep_map, 1, 
> _THIS_IP_);
> + printk_safe_exit_irqrestore(flags);
> +
> + /*
> +  * The owner passed the console lock to us.
> +  * Since we did not spin on console lock, 
> annotate
> +  * this as a trylock. Otherwise lockdep will
> +  * complain.
> +  */
> + mutex_acquire(_lock_dep_map, 0, 1, 
> _THIS_IP_);
> + console_unlock();
> + printk_safe_enter_irqsave(flags);
> + }
> + printk_safe_exit_irqrestore(flags);
> +
> + }
>   }


Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes

2017-11-02 Thread Vlastimil Babka
On 11/02/2017 06:45 PM, Steven Rostedt wrote:
...>__DEVKMSG_LOG_BIT_ON = 0,
>   __DEVKMSG_LOG_BIT_OFF,
> @@ -1753,8 +1760,56 @@ asmlinkage int vprintk_emit(int facility
>* semaphore.  The release will print out buffers and wake up
>* /dev/kmsg and syslog() users.
>*/
> - if (console_trylock())
> + if (console_trylock()) {
>   console_unlock();
> + } else {
> + struct task_struct *owner = NULL;
> + bool waiter;
> + bool spin = false;
> +
> + printk_safe_enter_irqsave(flags);
> +
> + raw_spin_lock(_owner_lock);
> + owner = READ_ONCE(console_owner);
> + waiter = READ_ONCE(console_waiter);
> + if (!waiter && owner && owner != current) {
> + WRITE_ONCE(console_waiter, true);
> + spin = true;
> + }
> + raw_spin_unlock(_owner_lock);
> +
> + /*
> +  * If there is an active printk() writing to the
> +  * consoles, instead of having it write our data too,
> +  * see if we can offload that load from the active
> +  * printer, and do some printing ourselves.
> +  * Go into a spin only if there isn't already a waiter
> +  * spinning, and there is an active printer, and
> +  * that active printer isn't us (recursive printk?).
> +  */
> + if (spin) {
> + /* We spin waiting for the owner to release us 
> */
> + spin_acquire(_owner_dep_map, 0, 0, 
> _THIS_IP_);
> + /* Owner will clear console_waiter on hand off 
> */
> + while (!READ_ONCE(console_waiter))

This should not be negated, right? We should spin while it's true, not
false.

> + cpu_relax();
> +
> + spin_release(_owner_dep_map, 1, 
> _THIS_IP_);
> + printk_safe_exit_irqrestore(flags);
> +
> + /*
> +  * The owner passed the console lock to us.
> +  * Since we did not spin on console lock, 
> annotate
> +  * this as a trylock. Otherwise lockdep will
> +  * complain.
> +  */
> + mutex_acquire(_lock_dep_map, 0, 1, 
> _THIS_IP_);
> + console_unlock();
> + printk_safe_enter_irqsave(flags);
> + }
> + printk_safe_exit_irqrestore(flags);
> +
> + }
>   }