Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Wed 2016-07-13 22:24:38, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/13/16 13:14), Petr Mladek wrote:
> [..]
> > > > [ 4913.000310] 
> > > > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > > > [ 4913.000327] 
> > > > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > > > [ 4913.000342] 
> > > > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > > > [ 4913.000356] item8780
> > > > [ 4913.000357] item8781
> > > > [ 4913.000358] item8782
> > > > [...]
> > > 
> > > hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> > > that are not getting printed on the console (because of loglevel 
> > > restrictions),
> > > but still screw up the cont buffer? otherwise, my expectation was 
> > > that in
> > > this particular case cpu issues a new pr_cont() only after it has printed
> > > the current message via call_console_drivers()->write(). so partially 
> > > flushed
> > > cont buffer was not really expected to happen. I was wrong, obviously.
> > 
> > To be honest. I am not 100% sure what happens here. One theory is
> > that the printk kthread is waken because of some previous
> > non-continuous message. It steals console_lock() and partially flushes
> > the cont buffer. In this case, the pr_cont() calls are not able to get
> > console_lock() and basically work in the async mode. The pr_cont()
> > calls have to store each piece sepatately because the partially
> > flushed cont buffer is blocked until flushed completely.
> > 
> > Hmm, the strange thing is that I see this problem even when I force
> > the global synch more by
> > 
> > echo Y >  /sys/module/printk/parameters/synchronous
> 
> oh, even in sync mode. hm...

It seems that the rsyslogd daemon steps in. I have dumped all
processes that called console_cont_flush() and really flushed
something. It was mostly the process that triggered the test
for-cycle and called console_unlock() from every pr_cont
the syncronous way. But there was the following sinner
from time to time:

   rs:main Q:Reg-771   [001] d... 87486.753353: console_unlock:
 rs:main Q:Reg calling console_cont_flush
rs:main Q:Reg-771   [001] d... 87486.753360: 
 => console_unlock
 => do_con_write.part.22
 => con_write
 => do_output_char
 => n_tty_write
 => tty_write
 => __vfs_write
 => vfs_write
 => SyS_write
 => entry_SYSCALL_64_fastpath

It means that some other process does console_unlock() and blocks
using the cont buffer.

I am not sure what is the exact purpose of this syscall. But
it seems reasonable that rsystemlog is active when there is
some activity in the kernel log buffer.


> > > just an idea.
> > > ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> > > sites. ACPI is one notable example. the others include OOM, some cgroup
> > > related output (or... was it memcg), etc., etc.
> > > 
> > > so we *may be* can have a per-cpu cont buffer and add new API
> > > pr_cont_begin()/pr_cont_end(), that would disable preemption.
> > > 
> > > 
> > > + pr_cont_begin() /* preempt_disable() */
> > > 
> > >   for (.)
> > >   pr_cont("%pS );
> > > 
> > > + pr_cont_end() /* preempt_enable() */
> > >
> > > pr_cont_end() also can flush this CPU's cont buffer and store the log
> > > line. this will probably break very long cont lines (not sure if we
> > > have any of those though). and may be flush_on_panic() would have to
> > > do some extra work iterating each cpu.
> > 
> > It would work but I am a bit scared by the complexity. I think
> > that we should find a compromise between complexity and
> > reliability.
> 
> I understand your concerns.
> but, realistically, KERN_CONT will still be SMP unsafe and will not do
> what people probably expect it to do: e.g. ACPI_ERROR or ACPI_anything.
> it does acpi_os_vprintf() -> printk(KERN_CONT).
> 
> the whole thing is already complicated, we already have all those len,
> cons etc. checks in various places, it also has that owner task, etc.
> 
> static struct cont {
>   char buf[LOG_LINE_MAX];
>   size_t len; /* length == 0 means unused buffer */
>   size_t cons;/* bytes written to console */
>   struct task_struct *owner;  /* task of first print*/
>   u64 ts_nsec;/* time of first print */
>   u8 level;   /* log level of first message */
>   u8 facility;/* log facility of first message */
>   enum log_flags flags;   /* prefix, newline flags */
>   bool flushed:1; /* buffer sealed and committed */
> } cont;
> 
> so I think, personally... it's sort of kind of not exactly what people
> need (aka 'broken'). so my idea is to forbid concurrent cont buffer usage.
> each CPU will have it's own buffer, and touch it with preemption_disabled.
> there will be no more partial flushes, cont buffer will be flushed when it's
> done -- via 

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Wed 2016-07-13 22:24:38, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/13/16 13:14), Petr Mladek wrote:
> [..]
> > > > [ 4913.000310] 
> > > > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > > > [ 4913.000327] 
> > > > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > > > [ 4913.000342] 
> > > > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > > > [ 4913.000356] item8780
> > > > [ 4913.000357] item8781
> > > > [ 4913.000358] item8782
> > > > [...]
> > > 
> > > hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> > > that are not getting printed on the console (because of loglevel 
> > > restrictions),
> > > but still screw up the cont buffer? otherwise, my expectation was 
> > > that in
> > > this particular case cpu issues a new pr_cont() only after it has printed
> > > the current message via call_console_drivers()->write(). so partially 
> > > flushed
> > > cont buffer was not really expected to happen. I was wrong, obviously.
> > 
> > To be honest. I am not 100% sure what happens here. One theory is
> > that the printk kthread is waken because of some previous
> > non-continuous message. It steals console_lock() and partially flushes
> > the cont buffer. In this case, the pr_cont() calls are not able to get
> > console_lock() and basically work in the async mode. The pr_cont()
> > calls have to store each piece sepatately because the partially
> > flushed cont buffer is blocked until flushed completely.
> > 
> > Hmm, the strange thing is that I see this problem even when I force
> > the global synch more by
> > 
> > echo Y >  /sys/module/printk/parameters/synchronous
> 
> oh, even in sync mode. hm...

It seems that the rsyslogd daemon steps in. I have dumped all
processes that called console_cont_flush() and really flushed
something. It was mostly the process that triggered the test
for-cycle and called console_unlock() from every pr_cont
the syncronous way. But there was the following sinner
from time to time:

   rs:main Q:Reg-771   [001] d... 87486.753353: console_unlock:
 rs:main Q:Reg calling console_cont_flush
rs:main Q:Reg-771   [001] d... 87486.753360: 
 => console_unlock
 => do_con_write.part.22
 => con_write
 => do_output_char
 => n_tty_write
 => tty_write
 => __vfs_write
 => vfs_write
 => SyS_write
 => entry_SYSCALL_64_fastpath

It means that some other process does console_unlock() and blocks
using the cont buffer.

I am not sure what is the exact purpose of this syscall. But
it seems reasonable that rsystemlog is active when there is
some activity in the kernel log buffer.


> > > just an idea.
> > > ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> > > sites. ACPI is one notable example. the others include OOM, some cgroup
> > > related output (or... was it memcg), etc., etc.
> > > 
> > > so we *may be* can have a per-cpu cont buffer and add new API
> > > pr_cont_begin()/pr_cont_end(), that would disable preemption.
> > > 
> > > 
> > > + pr_cont_begin() /* preempt_disable() */
> > > 
> > >   for (.)
> > >   pr_cont("%pS );
> > > 
> > > + pr_cont_end() /* preempt_enable() */
> > >
> > > pr_cont_end() also can flush this CPU's cont buffer and store the log
> > > line. this will probably break very long cont lines (not sure if we
> > > have any of those though). and may be flush_on_panic() would have to
> > > do some extra work iterating each cpu.
> > 
> > It would work but I am a bit scared by the complexity. I think
> > that we should find a compromise between complexity and
> > reliability.
> 
> I understand your concerns.
> but, realistically, KERN_CONT will still be SMP unsafe and will not do
> what people probably expect it to do: e.g. ACPI_ERROR or ACPI_anything.
> it does acpi_os_vprintf() -> printk(KERN_CONT).
> 
> the whole thing is already complicated, we already have all those len,
> cons etc. checks in various places, it also has that owner task, etc.
> 
> static struct cont {
>   char buf[LOG_LINE_MAX];
>   size_t len; /* length == 0 means unused buffer */
>   size_t cons;/* bytes written to console */
>   struct task_struct *owner;  /* task of first print*/
>   u64 ts_nsec;/* time of first print */
>   u8 level;   /* log level of first message */
>   u8 facility;/* log facility of first message */
>   enum log_flags flags;   /* prefix, newline flags */
>   bool flushed:1; /* buffer sealed and committed */
> } cont;
> 
> so I think, personally... it's sort of kind of not exactly what people
> need (aka 'broken'). so my idea is to forbid concurrent cont buffer usage.
> each CPU will have it's own buffer, and touch it with preemption_disabled.
> there will be no more partial flushes, cont buffer will be flushed when it's
> done -- via 

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Sergey Senozhatsky
Hello,

On (07/13/16 13:14), Petr Mladek wrote:
[..]
> > > [ 4913.000310] 
> > > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > > [ 4913.000327] 
> > > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > > [ 4913.000342] 
> > > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > > [ 4913.000356] item8780
> > > [ 4913.000357] item8781
> > > [ 4913.000358] item8782
> > > [...]
> > 
> > hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> > that are not getting printed on the console (because of loglevel 
> > restrictions),
> > but still screw up the cont buffer? otherwise, my expectation was that 
> > in
> > this particular case cpu issues a new pr_cont() only after it has printed
> > the current message via call_console_drivers()->write(). so partially 
> > flushed
> > cont buffer was not really expected to happen. I was wrong, obviously.
> 
> To be honest. I am not 100% sure what happens here. One theory is
> that the printk kthread is waken because of some previous
> non-continuous message. It steals console_lock() and partially flushes
> the cont buffer. In this case, the pr_cont() calls are not able to get
> console_lock() and basically work in the async mode. The pr_cont()
> calls have to store each piece sepatately because the partially
> flushed cont buffer is blocked until flushed completely.
> 
> Hmm, the strange thing is that I see this problem even when I force
> the global synch more by
> 
> echo Y >  /sys/module/printk/parameters/synchronous

oh, even in sync mode. hm...

> > just an idea.
> > ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> > sites. ACPI is one notable example. the others include OOM, some cgroup
> > related output (or... was it memcg), etc., etc.
> > 
> > so we *may be* can have a per-cpu cont buffer and add new API
> > pr_cont_begin()/pr_cont_end(), that would disable preemption.
> > 
> > 
> > +   pr_cont_begin() /* preempt_disable() */
> > 
> > for (.)
> > pr_cont("%pS );
> > 
> > +   pr_cont_end() /* preempt_enable() */
> >
> > pr_cont_end() also can flush this CPU's cont buffer and store the log
> > line. this will probably break very long cont lines (not sure if we
> > have any of those though). and may be flush_on_panic() would have to
> > do some extra work iterating each cpu.
> 
> It would work but I am a bit scared by the complexity. I think
> that we should find a compromise between complexity and
> reliability.

I understand your concerns.
but, realistically, KERN_CONT will still be SMP unsafe and will not do
what people probably expect it to do: e.g. ACPI_ERROR or ACPI_anything.
it does acpi_os_vprintf() -> printk(KERN_CONT).

the whole thing is already complicated, we already have all those len,
cons etc. checks in various places, it also has that owner task, etc.

static struct cont {
char buf[LOG_LINE_MAX];
size_t len; /* length == 0 means unused buffer */
size_t cons;/* bytes written to console */
struct task_struct *owner;  /* task of first print*/
u64 ts_nsec;/* time of first print */
u8 level;   /* log level of first message */
u8 facility;/* log facility of first message */
enum log_flags flags;   /* prefix, newline flags */
bool flushed:1; /* buffer sealed and committed */
} cont;

so I think, personally... it's sort of kind of not exactly what people
need (aka 'broken'). so my idea is to forbid concurrent cont buffer usage.
each CPU will have it's own buffer, and touch it with preemption_disabled.
there will be no more partial flushes, cont buffer will be flushed when it's
done -- via pr_cont_end()->log_store()->wake_up(printk_kthread). IOW it will
just add the cont buffer content to log_buf and print it via console_unlock(),
like the rest of the messages. no more console_cont_flush().

> > we still can have pr_cont() happening on several cpus simultaneously.
> > console_seq is getting reset, when we register a new CON_PRINTBUFFER
> > console.
> 
> I agree that mixing part of lines from different processes/cpus
> is not ideal. But it is not much worse than mixing whole lines.
> We should do a best effort but we do not need to be perfect.

sure, thanks! I'll try to think of something other than "throw it away".
but.., well, you know my opinion by now :)

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Sergey Senozhatsky
Hello,

On (07/13/16 13:14), Petr Mladek wrote:
[..]
> > > [ 4913.000310] 
> > > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > > [ 4913.000327] 
> > > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > > [ 4913.000342] 
> > > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > > [ 4913.000356] item8780
> > > [ 4913.000357] item8781
> > > [ 4913.000358] item8782
> > > [...]
> > 
> > hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> > that are not getting printed on the console (because of loglevel 
> > restrictions),
> > but still screw up the cont buffer? otherwise, my expectation was that 
> > in
> > this particular case cpu issues a new pr_cont() only after it has printed
> > the current message via call_console_drivers()->write(). so partially 
> > flushed
> > cont buffer was not really expected to happen. I was wrong, obviously.
> 
> To be honest. I am not 100% sure what happens here. One theory is
> that the printk kthread is waken because of some previous
> non-continuous message. It steals console_lock() and partially flushes
> the cont buffer. In this case, the pr_cont() calls are not able to get
> console_lock() and basically work in the async mode. The pr_cont()
> calls have to store each piece sepatately because the partially
> flushed cont buffer is blocked until flushed completely.
> 
> Hmm, the strange thing is that I see this problem even when I force
> the global synch more by
> 
> echo Y >  /sys/module/printk/parameters/synchronous

oh, even in sync mode. hm...

> > just an idea.
> > ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> > sites. ACPI is one notable example. the others include OOM, some cgroup
> > related output (or... was it memcg), etc., etc.
> > 
> > so we *may be* can have a per-cpu cont buffer and add new API
> > pr_cont_begin()/pr_cont_end(), that would disable preemption.
> > 
> > 
> > +   pr_cont_begin() /* preempt_disable() */
> > 
> > for (.)
> > pr_cont("%pS );
> > 
> > +   pr_cont_end() /* preempt_enable() */
> >
> > pr_cont_end() also can flush this CPU's cont buffer and store the log
> > line. this will probably break very long cont lines (not sure if we
> > have any of those though). and may be flush_on_panic() would have to
> > do some extra work iterating each cpu.
> 
> It would work but I am a bit scared by the complexity. I think
> that we should find a compromise between complexity and
> reliability.

I understand your concerns.
but, realistically, KERN_CONT will still be SMP unsafe and will not do
what people probably expect it to do: e.g. ACPI_ERROR or ACPI_anything.
it does acpi_os_vprintf() -> printk(KERN_CONT).

the whole thing is already complicated, we already have all those len,
cons etc. checks in various places, it also has that owner task, etc.

static struct cont {
char buf[LOG_LINE_MAX];
size_t len; /* length == 0 means unused buffer */
size_t cons;/* bytes written to console */
struct task_struct *owner;  /* task of first print*/
u64 ts_nsec;/* time of first print */
u8 level;   /* log level of first message */
u8 facility;/* log facility of first message */
enum log_flags flags;   /* prefix, newline flags */
bool flushed:1; /* buffer sealed and committed */
} cont;

so I think, personally... it's sort of kind of not exactly what people
need (aka 'broken'). so my idea is to forbid concurrent cont buffer usage.
each CPU will have it's own buffer, and touch it with preemption_disabled.
there will be no more partial flushes, cont buffer will be flushed when it's
done -- via pr_cont_end()->log_store()->wake_up(printk_kthread). IOW it will
just add the cont buffer content to log_buf and print it via console_unlock(),
like the rest of the messages. no more console_cont_flush().

> > we still can have pr_cont() happening on several cpus simultaneously.
> > console_seq is getting reset, when we register a new CON_PRINTBUFFER
> > console.
> 
> I agree that mixing part of lines from different processes/cpus
> is not ideal. But it is not much worse than mixing whole lines.
> We should do a best effort but we do not need to be perfect.

sure, thanks! I'll try to think of something other than "throw it away".
but.., well, you know my opinion by now :)

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Tue 2016-07-12 10:04:59, Viresh Kumar wrote:
> On 12-07-16, 17:59, Petr Mladek wrote:
> > On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> > > We were also screwed because of the problem this thread is trying to 
> > > solve,
> > > which I posted last week [1].
> > > 
> > > And then I found this thread which fixed my issue, thanks a lot guys :)
> > > 
> > > Tested-by: Viresh Kumar 
> > > 
> > > Though, I would also like to mention one unwanted thing that happened on
> > > my setup :)
> > > 
> > > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > > (cpu 0)
> > > [   12.874909] potential CPU hogs:
> > > [   12.874909]  printk (292)
> > > 
> > > On my system, the excessive printing happens during suspend/resume and 
> > > this
> > > happened after all the non-boot CPUs were offlined. So, only CPU 0 was 
> > > left and
> > > that was doing printing for a long time and so these errors :)
> > > 
> > > It resulted in missing some print messages eventually as the scheduler 
> > > probably
> > > didn't schedule this thread for sometime after that.
> > > 
> > > Will it be fine to get the priority of this kthread to a somewhat lower 
> > > value,
> > > etc ?
> > 
> > I think that this patch helped only by chance.
> 
> Sorry got confused a bit here..
> 
> Which patch are you taking about here ? As you replied below my
> message, I believe that you are talking about the 3 patches that were
> posted in this series. But the rest of your comment looks like the
> fourth fixup patch that Sergey shared :)

You are right. I was talking about the 4th patch. It was sent as
another reply to this mail. I thought the 4th patch solved your
problems with missing messages during the suspend. 
 
> So, the fixup patch doesn't fix any issues in my case, to make it
> clear. I still have my system hanging somewhere.

I see. Thanks for clarification and I am sorry for the confusion.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Tue 2016-07-12 10:04:59, Viresh Kumar wrote:
> On 12-07-16, 17:59, Petr Mladek wrote:
> > On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> > > We were also screwed because of the problem this thread is trying to 
> > > solve,
> > > which I posted last week [1].
> > > 
> > > And then I found this thread which fixed my issue, thanks a lot guys :)
> > > 
> > > Tested-by: Viresh Kumar 
> > > 
> > > Though, I would also like to mention one unwanted thing that happened on
> > > my setup :)
> > > 
> > > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > > (cpu 0)
> > > [   12.874909] potential CPU hogs:
> > > [   12.874909]  printk (292)
> > > 
> > > On my system, the excessive printing happens during suspend/resume and 
> > > this
> > > happened after all the non-boot CPUs were offlined. So, only CPU 0 was 
> > > left and
> > > that was doing printing for a long time and so these errors :)
> > > 
> > > It resulted in missing some print messages eventually as the scheduler 
> > > probably
> > > didn't schedule this thread for sometime after that.
> > > 
> > > Will it be fine to get the priority of this kthread to a somewhat lower 
> > > value,
> > > etc ?
> > 
> > I think that this patch helped only by chance.
> 
> Sorry got confused a bit here..
> 
> Which patch are you taking about here ? As you replied below my
> message, I believe that you are talking about the 3 patches that were
> posted in this series. But the rest of your comment looks like the
> fourth fixup patch that Sergey shared :)

You are right. I was talking about the 4th patch. It was sent as
another reply to this mail. I thought the 4th patch solved your
problems with missing messages during the suspend. 
 
> So, the fixup patch doesn't fix any issues in my case, to make it
> clear. I still have my system hanging somewhere.

I see. Thanks for clarification and I am sorry for the confusion.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Wed 2016-07-13 16:42:42, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/12/16 18:28), Petr Mladek wrote:
> > I have tried the following:
> > 
> >int a,b;
> > 
> >for (a = 0; a < 1000; a++) {
> >for (b = 0; b < 10; b++) {
> >pr_cont("item%04d", a*10 + b);
> >}
> >pr_cont("\n");
> >}
> > 
> > 
> > [ 4913.000298] 
> > item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
> > [ 4913.000310] 
> > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > [ 4913.000327] 
> > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > [ 4913.000342] 
> > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > [ 4913.000356] item8780
> > [ 4913.000357] item8781
> > [ 4913.000358] item8782
> > [...]
> 
> hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> that are not getting printed on the console (because of loglevel 
> restrictions),
> but still screw up the cont buffer? otherwise, my expectation was that in
> this particular case cpu issues a new pr_cont() only after it has printed
> the current message via call_console_drivers()->write(). so partially flushed
> cont buffer was not really expected to happen. I was wrong, obviously.

To be honest. I am not 100% sure what happens here. One theory is
that the printk kthread is waken because of some previous
non-continuous message. It steals console_lock() and partially flushes
the cont buffer. In this case, the pr_cont() calls are not able to get
console_lock() and basically work in the async mode. The pr_cont()
calls have to store each piece sepatately because the partially
flushed cont buffer is blocked until flushed completely.

Hmm, the strange thing is that I see this problem even when I force
the global synch more by

echo Y >  /sys/module/printk/parameters/synchronous


I need to dig more into it.

> > Another solution would be to remember cont.cons outside
> > of struct cont. Then the cont buffer could be reused
> > immediately.
> 
> just an idea.
> ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> sites. ACPI is one notable example. the others include OOM, some cgroup
> related output (or... was it memcg), etc., etc.
> 
> so we *may be* can have a per-cpu cont buffer and add new API
> pr_cont_begin()/pr_cont_end(), that would disable preemption.
> 
> 
> + pr_cont_begin() /* preempt_disable() */
> 
>   for (.)
>   pr_cont("%pS );
> 
> + pr_cont_end() /* preempt_enable() */
>
> pr_cont_end() also can flush this CPU's cont buffer and store the log
> line. this will probably break very long cont lines (not sure if we
> have any of those though). and may be flush_on_panic() would have to
> do some extra work iterating each cpu.

It would work but I am a bit scared by the complexity. I think
that we should find a compromise between complexity and
reliability.

> > The saved position will be used either for the actual
> > cont buffer or for the very first message in the ring
> > buffer. The situation might be detected by comparing
> > console_seq, log_next_seq, and log_first_seq.
> 
> we still can have pr_cont() happening on several cpus simultaneously.
> console_seq is getting reset, when we register a new CON_PRINTBUFFER
> console.

I agree that mixing part of lines from different processes/cpus
is not ideal. But it is not much worse than mixing whole lines.
We should do a best effort but we do not need to be perfect.

I continue with scratching my head.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Petr Mladek
On Wed 2016-07-13 16:42:42, Sergey Senozhatsky wrote:
> Hello,
> 
> On (07/12/16 18:28), Petr Mladek wrote:
> > I have tried the following:
> > 
> >int a,b;
> > 
> >for (a = 0; a < 1000; a++) {
> >for (b = 0; b < 10; b++) {
> >pr_cont("item%04d", a*10 + b);
> >}
> >pr_cont("\n");
> >}
> > 
> > 
> > [ 4913.000298] 
> > item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
> > [ 4913.000310] 
> > item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> > [ 4913.000327] 
> > item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> > [ 4913.000342] 
> > item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> > [ 4913.000356] item8780
> > [ 4913.000357] item8781
> > [ 4913.000358] item8782
> > [...]
> 
> hm.. so are there any 'concurrent' printk()-s coming from other CPUs
> that are not getting printed on the console (because of loglevel 
> restrictions),
> but still screw up the cont buffer? otherwise, my expectation was that in
> this particular case cpu issues a new pr_cont() only after it has printed
> the current message via call_console_drivers()->write(). so partially flushed
> cont buffer was not really expected to happen. I was wrong, obviously.

To be honest. I am not 100% sure what happens here. One theory is
that the printk kthread is waken because of some previous
non-continuous message. It steals console_lock() and partially flushes
the cont buffer. In this case, the pr_cont() calls are not able to get
console_lock() and basically work in the async mode. The pr_cont()
calls have to store each piece sepatately because the partially
flushed cont buffer is blocked until flushed completely.

Hmm, the strange thing is that I see this problem even when I force
the global synch more by

echo Y >  /sys/module/printk/parameters/synchronous


I need to dig more into it.

> > Another solution would be to remember cont.cons outside
> > of struct cont. Then the cont buffer could be reused
> > immediately.
> 
> just an idea.
> ... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
> sites. ACPI is one notable example. the others include OOM, some cgroup
> related output (or... was it memcg), etc., etc.
> 
> so we *may be* can have a per-cpu cont buffer and add new API
> pr_cont_begin()/pr_cont_end(), that would disable preemption.
> 
> 
> + pr_cont_begin() /* preempt_disable() */
> 
>   for (.)
>   pr_cont("%pS );
> 
> + pr_cont_end() /* preempt_enable() */
>
> pr_cont_end() also can flush this CPU's cont buffer and store the log
> line. this will probably break very long cont lines (not sure if we
> have any of those though). and may be flush_on_panic() would have to
> do some extra work iterating each cpu.

It would work but I am a bit scared by the complexity. I think
that we should find a compromise between complexity and
reliability.

> > The saved position will be used either for the actual
> > cont buffer or for the very first message in the ring
> > buffer. The situation might be detected by comparing
> > console_seq, log_next_seq, and log_first_seq.
> 
> we still can have pr_cont() happening on several cpus simultaneously.
> console_seq is getting reset, when we register a new CON_PRINTBUFFER
> console.

I agree that mixing part of lines from different processes/cpus
is not ideal. But it is not much worse than mixing whole lines.
We should do a best effort but we do not need to be perfect.

I continue with scratching my head.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Sergey Senozhatsky
Hello,

On (07/12/16 18:28), Petr Mladek wrote:
> I have tried the following:
> 
>int a,b;
> 
>for (a = 0; a < 1000; a++) {
>for (b = 0; b < 10; b++) {
>pr_cont("item%04d", a*10 + b);
>}
>pr_cont("\n");
>}
> 
> 
> [ 4913.000298] 
> item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
> [ 4913.000310] 
> item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> [ 4913.000327] 
> item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> [ 4913.000342] 
> item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> [ 4913.000356] item8780
> [ 4913.000357] item8781
> [ 4913.000358] item8782
> [...]

hm.. so are there any 'concurrent' printk()-s coming from other CPUs
that are not getting printed on the console (because of loglevel restrictions),
but still screw up the cont buffer? otherwise, my expectation was that in
this particular case cpu issues a new pr_cont() only after it has printed
the current message via call_console_drivers()->write(). so partially flushed
cont buffer was not really expected to happen. I was wrong, obviously.

> > KERN_CONT output is heavily relying on the fact there will be no other
> > CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s 
> > simultaneously.
> > This basically means that pr_cont() CPU is expected to be the one that 
> > actually
> > does the printing and log_store() of the entire cont buffer:
> 
> In particular, the partially flushed cont buffer could not be used for new
> messages until the rest of it is flushed to the console. It is because
> the number of flushed characters is stored in the struct cont.
> > pr_cont()
> > console_unlock()
> > console_cont_flush()
> > log_store() /* cont buffer */
> > pr_cont()
> > console_unlock()
> > console_cont_flush()
> > log_store() /* cont buffer */
> > 
> > 
> > Async printk breaks this contract, because console_cont_flush() is now 
> > deferred.
> > Thus pr_cont() will see a not fully flushed cont buffer, which will force 
> > it to
> > append a new KERN_CONT not to cont buffer, but to buffer log as a separate 
> > log
> > entry via log_store(). Due to the deferred nature of console_cont_flush() 
> > the
> > very next pr_cont() may see cont buffer being flushed, so it will keep the
> > message in the cont buffer and log_store() it later. So the cont line will 
> > split
> > across several log entries -- printk_kthread can take some time to wkeup,
> > during which pr_cont() will keep splitting cont line:
> 
> I guess that the problem was there even before async printk. For
> example, we do not keep lockbuf_lock between console_cont_flush()
> and printing other lines from the ring buffer. It means that
> new lines might appear in the buffer in the meantime.
> 
> Note that we call consoles in console_cont_flush() without
> lockbuf_lock. So there is relatively long time for the race.
> 
[...]
> Another solution would be to remember cont.cons outside
> of struct cont. Then the cont buffer could be reused
> immediately.

just an idea.
... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
sites. ACPI is one notable example. the others include OOM, some cgroup
related output (or... was it memcg), etc., etc.

so we *may be* can have a per-cpu cont buffer and add new API
pr_cont_begin()/pr_cont_end(), that would disable preemption.


+   pr_cont_begin() /* preempt_disable() */

for (.)
pr_cont("%pS );

+   pr_cont_end() /* preempt_enable() */

pr_cont_end() also can flush this CPU's cont buffer and store the log
line. this will probably break very long cont lines (not sure if we
have any of those though). and may be flush_on_panic() would have to
do some extra work iterating each cpu.

> The saved position will be used either for the actual
> cont buffer or for the very first message in the ring
> buffer. The situation might be detected by comparing
> console_seq, log_next_seq, and log_first_seq.

we still can have pr_cont() happening on several cpus simultaneously.
console_seq is getting reset, when we register a new CON_PRINTBUFFER
console.

> if (console_seq == log_next_seq)
>cont buffer still includes the very last message
> 
> if (console_seq < log_first_seq)
>the message was lost => no partialy flushed message around
> 
> if (console_seq < log_next_seq && cont.cons)
>the current message in the log buffer is partially printed
> 
> 
> Well, this looks quite complicated. And I am afraid that
> I have missed something.
> 
> I am still scratching my head around it.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-13 Thread Sergey Senozhatsky
Hello,

On (07/12/16 18:28), Petr Mladek wrote:
> I have tried the following:
> 
>int a,b;
> 
>for (a = 0; a < 1000; a++) {
>for (b = 0; b < 10; b++) {
>pr_cont("item%04d", a*10 + b);
>}
>pr_cont("\n");
>}
> 
> 
> [ 4913.000298] 
> item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
> [ 4913.000310] 
> item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
> [ 4913.000327] 
> item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
> [ 4913.000342] 
> item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
> [ 4913.000356] item8780
> [ 4913.000357] item8781
> [ 4913.000358] item8782
> [...]

hm.. so are there any 'concurrent' printk()-s coming from other CPUs
that are not getting printed on the console (because of loglevel restrictions),
but still screw up the cont buffer? otherwise, my expectation was that in
this particular case cpu issues a new pr_cont() only after it has printed
the current message via call_console_drivers()->write(). so partially flushed
cont buffer was not really expected to happen. I was wrong, obviously.

> > KERN_CONT output is heavily relying on the fact there will be no other
> > CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s 
> > simultaneously.
> > This basically means that pr_cont() CPU is expected to be the one that 
> > actually
> > does the printing and log_store() of the entire cont buffer:
> 
> In particular, the partially flushed cont buffer could not be used for new
> messages until the rest of it is flushed to the console. It is because
> the number of flushed characters is stored in the struct cont.
> > pr_cont()
> > console_unlock()
> > console_cont_flush()
> > log_store() /* cont buffer */
> > pr_cont()
> > console_unlock()
> > console_cont_flush()
> > log_store() /* cont buffer */
> > 
> > 
> > Async printk breaks this contract, because console_cont_flush() is now 
> > deferred.
> > Thus pr_cont() will see a not fully flushed cont buffer, which will force 
> > it to
> > append a new KERN_CONT not to cont buffer, but to buffer log as a separate 
> > log
> > entry via log_store(). Due to the deferred nature of console_cont_flush() 
> > the
> > very next pr_cont() may see cont buffer being flushed, so it will keep the
> > message in the cont buffer and log_store() it later. So the cont line will 
> > split
> > across several log entries -- printk_kthread can take some time to wkeup,
> > during which pr_cont() will keep splitting cont line:
> 
> I guess that the problem was there even before async printk. For
> example, we do not keep lockbuf_lock between console_cont_flush()
> and printing other lines from the ring buffer. It means that
> new lines might appear in the buffer in the meantime.
> 
> Note that we call consoles in console_cont_flush() without
> lockbuf_lock. So there is relatively long time for the race.
> 
[...]
> Another solution would be to remember cont.cons outside
> of struct cont. Then the cont buffer could be reused
> immediately.

just an idea.
... or try to make KERN_CONT SMP-safe. there are many pr_cont() call
sites. ACPI is one notable example. the others include OOM, some cgroup
related output (or... was it memcg), etc., etc.

so we *may be* can have a per-cpu cont buffer and add new API
pr_cont_begin()/pr_cont_end(), that would disable preemption.


+   pr_cont_begin() /* preempt_disable() */

for (.)
pr_cont("%pS );

+   pr_cont_end() /* preempt_enable() */

pr_cont_end() also can flush this CPU's cont buffer and store the log
line. this will probably break very long cont lines (not sure if we
have any of those though). and may be flush_on_panic() would have to
do some extra work iterating each cpu.

> The saved position will be used either for the actual
> cont buffer or for the very first message in the ring
> buffer. The situation might be detected by comparing
> console_seq, log_next_seq, and log_first_seq.

we still can have pr_cont() happening on several cpus simultaneously.
console_seq is getting reset, when we register a new CON_PRINTBUFFER
console.

> if (console_seq == log_next_seq)
>cont buffer still includes the very last message
> 
> if (console_seq < log_first_seq)
>the message was lost => no partialy flushed message around
> 
> if (console_seq < log_next_seq && cont.cons)
>the current message in the log buffer is partially printed
> 
> 
> Well, this looks quite complicated. And I am afraid that
> I have missed something.
> 
> I am still scratching my head around it.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Viresh Kumar
On 12-07-16, 17:59, Petr Mladek wrote:
> On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> > We were also screwed because of the problem this thread is trying to solve,
> > which I posted last week [1].
> > 
> > And then I found this thread which fixed my issue, thanks a lot guys :)
> > 
> > Tested-by: Viresh Kumar 
> > 
> > Though, I would also like to mention one unwanted thing that happened on
> > my setup :)
> > 
> > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > (cpu 0)
> > [   12.874909] potential CPU hogs:
> > [   12.874909]  printk (292)
> > 
> > On my system, the excessive printing happens during suspend/resume and this
> > happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> > and
> > that was doing printing for a long time and so these errors :)
> > 
> > It resulted in missing some print messages eventually as the scheduler 
> > probably
> > didn't schedule this thread for sometime after that.
> > 
> > Will it be fine to get the priority of this kthread to a somewhat lower 
> > value,
> > etc ?
> 
> I think that this patch helped only by chance.

Sorry got confused a bit here..

Which patch are you taking about here ? As you replied below my
message, I believe that you are talking about the 3 patches that were
posted in this series. But the rest of your comment looks like the
fourth fixup patch that Sergey shared :)

So, the fixup patch doesn't fix any issues in my case, to make it
clear. I still have my system hanging somewhere.

> I am still scratching my head about the problem fixed by this patch
> and also about suspend problems.

:)

-- 
viresh


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Viresh Kumar
On 12-07-16, 17:59, Petr Mladek wrote:
> On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> > We were also screwed because of the problem this thread is trying to solve,
> > which I posted last week [1].
> > 
> > And then I found this thread which fixed my issue, thanks a lot guys :)
> > 
> > Tested-by: Viresh Kumar 
> > 
> > Though, I would also like to mention one unwanted thing that happened on
> > my setup :)
> > 
> > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > (cpu 0)
> > [   12.874909] potential CPU hogs:
> > [   12.874909]  printk (292)
> > 
> > On my system, the excessive printing happens during suspend/resume and this
> > happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> > and
> > that was doing printing for a long time and so these errors :)
> > 
> > It resulted in missing some print messages eventually as the scheduler 
> > probably
> > didn't schedule this thread for sometime after that.
> > 
> > Will it be fine to get the priority of this kthread to a somewhat lower 
> > value,
> > etc ?
> 
> I think that this patch helped only by chance.

Sorry got confused a bit here..

Which patch are you taking about here ? As you replied below my
message, I believe that you are talking about the 3 patches that were
posted in this series. But the rest of your comment looks like the
fourth fixup patch that Sergey shared :)

So, the fixup patch doesn't fix any issues in my case, to make it
clear. I still have my system hanging somewhere.

> I am still scratching my head about the problem fixed by this patch
> and also about suspend problems.

:)

-- 
viresh


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Sergey Senozhatsky
On (07/12/16 17:59), Petr Mladek wrote:
[..]
> > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > (cpu 0)
> > [   12.874909] potential CPU hogs:
> > [   12.874909]  printk (292)
> > 
> > On my system, the excessive printing happens during suspend/resume and this
> > happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> > and
> > that was doing printing for a long time and so these errors :)
> > 
> > It resulted in missing some print messages eventually as the scheduler 
> > probably
> > didn't schedule this thread for sometime after that.
> > 
> > Will it be fine to get the priority of this kthread to a somewhat lower 
> > value,
> > etc ?
> 
> I think that this patch helped only by chance. It causes that any
> message without a new line will force printk to the sync mode.
> Then it will print even the buffered messages immediately. It
> causes printk to behave more or less in the sync mode all the time.

KERN_CONT messages are not so often, because they are not SMP-safe, any 
intruding
non-KERN_CONT message will wake_up() printk_kthread. there is a chance, you are
right, that printk will operate in sync mode for some time, when we would want 
it
to be async, so 0004 probably better try harder.

but KERN_CONT is not what's going on in this case, I think. at lest 
"[   12.874909]  printk (292)" line suggests so -- printk was in async
mode... and somehow non-preemptible for some time... a delay from spin_lock()
in call_console_drivers()->write()?... but why would async printk worsen it.

> I am still scratching my head about the problem fixed by this patch
> and also about suspend problems.

yeah, quite puzzled myself. thanks Petr and Viresh for digging this up.
I need some sleep now, it's 2 am here, will return back tomorrow.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Sergey Senozhatsky
On (07/12/16 17:59), Petr Mladek wrote:
[..]
> > [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 
> > (cpu 0)
> > [   12.874909] potential CPU hogs:
> > [   12.874909]  printk (292)
> > 
> > On my system, the excessive printing happens during suspend/resume and this
> > happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> > and
> > that was doing printing for a long time and so these errors :)
> > 
> > It resulted in missing some print messages eventually as the scheduler 
> > probably
> > didn't schedule this thread for sometime after that.
> > 
> > Will it be fine to get the priority of this kthread to a somewhat lower 
> > value,
> > etc ?
> 
> I think that this patch helped only by chance. It causes that any
> message without a new line will force printk to the sync mode.
> Then it will print even the buffered messages immediately. It
> causes printk to behave more or less in the sync mode all the time.

KERN_CONT messages are not so often, because they are not SMP-safe, any 
intruding
non-KERN_CONT message will wake_up() printk_kthread. there is a chance, you are
right, that printk will operate in sync mode for some time, when we would want 
it
to be async, so 0004 probably better try harder.

but KERN_CONT is not what's going on in this case, I think. at lest 
"[   12.874909]  printk (292)" line suggests so -- printk was in async
mode... and somehow non-preemptible for some time... a delay from spin_lock()
in call_console_drivers()->write()?... but why would async printk worsen it.

> I am still scratching my head about the problem fixed by this patch
> and also about suspend problems.

yeah, quite puzzled myself. thanks Petr and Viresh for digging this up.
I need some sleep now, it's 2 am here, will return back tomorrow.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Petr Mladek
On Wed 2016-06-29 14:08:35, Sergey Senozhatsky wrote:
> Hello,
> 
> the patch to fix the async KERN_CONT printk regression I mentioned
> several days ago in another thread.
> 
> as a separate patch for now, to ease the review. will squash with 0001.
> 
> == 8< ==
> 
> >From 70f65ed55eb82a1b8b74fdbd1a7afc9e77e9b380 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Subject: [PATCH] printk: print KERN_CONT buffer in sync printk mode
> 
> One of my boxen has revealed that there is a race condition now between
> pr_cot() output and printk_kthread wake_up() time. Schematically, the CPU
> that triggered the race conditions was doing
> something like this:
> 
>   for (...) {
>   for () {
>   ...
>   pr_cont(...);
>   ...
>   }
>   pr_cont("\n");
>   }

I have tried the following:

   int a,b;

   for (a = 0; a < 1000; a++) {
   for (b = 0; b < 10; b++) {
   pr_cont("item%04d", a*10 + b);
   }
   pr_cont("\n");
   }


> the resulting output was:
> 
> $ dmesg
> 
> 0001 0002 0003 0004 0005 0006 0007
> 0008
> 0009 000a 000b 000c 000d 000e
> 000f 0010 0011 0012 0013 0014 0015
> 002b 002c 002d 002e 002f 0030 0031
> 

This patch helped a lot but I still see the problem from time to time.
For example,

[ 4913.000290] 
item8720item8721item8722item8723item8724item8725item8726item8727item8728item8729
[ 4913.000294] 
item8730item8731item8732item8733item8734item8735item8736item8737item8738item8739
[ 4913.000298] 
item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
[ 4913.000310] 
item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
[ 4913.000327] 
item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
[ 4913.000342] 
item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
[ 4913.000356] item8780
[ 4913.000357] item8781
[ 4913.000358] item8782
[ 4913.000358] item8783
[ 4913.000359] item8784
[ 4913.000359] item8785
[ 4913.000359] item8786
[ 4913.000360] item8787
[ 4913.000360] item8788
[ 4913.000360] item8789

[ 4913.000361] item8790
[ 4913.000361] item8791
[ 4913.000361] item8792
[...]


> 
> KERN_CONT output is heavily relying on the fact there will be no other
> CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s 
> simultaneously.
> This basically means that pr_cont() CPU is expected to be the one that 
> actually
> does the printing and log_store() of the entire cont buffer:

In particular, the partially flushed cont buffer could not be used for new
messages until the rest of it is flushed to the console. It is because
the number of flushed characters is stored in the struct cont.


>   pr_cont()
>   console_unlock()
>   console_cont_flush()
>   log_store() /* cont buffer */
>   pr_cont()
>   console_unlock()
>   console_cont_flush()
>   log_store() /* cont buffer */
>   
> 
> Async printk breaks this contract, because console_cont_flush() is now 
> deferred.
> Thus pr_cont() will see a not fully flushed cont buffer, which will force it 
> to
> append a new KERN_CONT not to cont buffer, but to buffer log as a separate log
> entry via log_store(). Due to the deferred nature of console_cont_flush() the
> very next pr_cont() may see cont buffer being flushed, so it will keep the
> message in the cont buffer and log_store() it later. So the cont line will 
> split
> across several log entries -- printk_kthread can take some time to wkeup,
> during which pr_cont() will keep splitting cont line:

I guess that the problem was there even before async printk. For
example, we do not keep lockbuf_lock between console_cont_flush()
and printing other lines from the ring buffer. It means that
new lines might appear in the buffer in the meantime.

Note that we call consoles in console_cont_flush() without
lockbuf_lock. So there is relatively long time for the race.

In each case, the async printk made it worse because there
might be longer periods between the printk thread is scheduled
and the rest of the cont buffer is flushed.



Anyway, this solution makes the async printk less helpful.
Any message that does not end with a new line will force
the sync mode and the risk of the soft lookup.


Another solution would be to remember cont.cons outside
of struct cont. Then the cont buffer could be reused
immediately.

The saved position will be used either for the actual
cont buffer or for the very first message in the ring
buffer. The situation might be detected by comparing
console_seq, log_next_seq, and log_first_seq.

if (console_seq == 

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Petr Mladek
On Wed 2016-06-29 14:08:35, Sergey Senozhatsky wrote:
> Hello,
> 
> the patch to fix the async KERN_CONT printk regression I mentioned
> several days ago in another thread.
> 
> as a separate patch for now, to ease the review. will squash with 0001.
> 
> == 8< ==
> 
> >From 70f65ed55eb82a1b8b74fdbd1a7afc9e77e9b380 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Subject: [PATCH] printk: print KERN_CONT buffer in sync printk mode
> 
> One of my boxen has revealed that there is a race condition now between
> pr_cot() output and printk_kthread wake_up() time. Schematically, the CPU
> that triggered the race conditions was doing
> something like this:
> 
>   for (...) {
>   for () {
>   ...
>   pr_cont(...);
>   ...
>   }
>   pr_cont("\n");
>   }

I have tried the following:

   int a,b;

   for (a = 0; a < 1000; a++) {
   for (b = 0; b < 10; b++) {
   pr_cont("item%04d", a*10 + b);
   }
   pr_cont("\n");
   }


> the resulting output was:
> 
> $ dmesg
> 
> 0001 0002 0003 0004 0005 0006 0007
> 0008
> 0009 000a 000b 000c 000d 000e
> 000f 0010 0011 0012 0013 0014 0015
> 002b 002c 002d 002e 002f 0030 0031
> 

This patch helped a lot but I still see the problem from time to time.
For example,

[ 4913.000290] 
item8720item8721item8722item8723item8724item8725item8726item8727item8728item8729
[ 4913.000294] 
item8730item8731item8732item8733item8734item8735item8736item8737item8738item8739
[ 4913.000298] 
item8740item8741item8742item8743item8744item8745item8746item8747item8748item8749
[ 4913.000310] 
item8750item8751item8752item8753item8754item8755item8756item8757item8758item8759
[ 4913.000327] 
item8760item8761item8762item8763item8764item8765item8766item8767item8768item8769
[ 4913.000342] 
item8770item8771item8772item8773item8774item8775item8776item8777item8778item8779
[ 4913.000356] item8780
[ 4913.000357] item8781
[ 4913.000358] item8782
[ 4913.000358] item8783
[ 4913.000359] item8784
[ 4913.000359] item8785
[ 4913.000359] item8786
[ 4913.000360] item8787
[ 4913.000360] item8788
[ 4913.000360] item8789

[ 4913.000361] item8790
[ 4913.000361] item8791
[ 4913.000361] item8792
[...]


> 
> KERN_CONT output is heavily relying on the fact there will be no other
> CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s 
> simultaneously.
> This basically means that pr_cont() CPU is expected to be the one that 
> actually
> does the printing and log_store() of the entire cont buffer:

In particular, the partially flushed cont buffer could not be used for new
messages until the rest of it is flushed to the console. It is because
the number of flushed characters is stored in the struct cont.


>   pr_cont()
>   console_unlock()
>   console_cont_flush()
>   log_store() /* cont buffer */
>   pr_cont()
>   console_unlock()
>   console_cont_flush()
>   log_store() /* cont buffer */
>   
> 
> Async printk breaks this contract, because console_cont_flush() is now 
> deferred.
> Thus pr_cont() will see a not fully flushed cont buffer, which will force it 
> to
> append a new KERN_CONT not to cont buffer, but to buffer log as a separate log
> entry via log_store(). Due to the deferred nature of console_cont_flush() the
> very next pr_cont() may see cont buffer being flushed, so it will keep the
> message in the cont buffer and log_store() it later. So the cont line will 
> split
> across several log entries -- printk_kthread can take some time to wkeup,
> during which pr_cont() will keep splitting cont line:

I guess that the problem was there even before async printk. For
example, we do not keep lockbuf_lock between console_cont_flush()
and printing other lines from the ring buffer. It means that
new lines might appear in the buffer in the meantime.

Note that we call consoles in console_cont_flush() without
lockbuf_lock. So there is relatively long time for the race.

In each case, the async printk made it worse because there
might be longer periods between the printk thread is scheduled
and the rest of the cont buffer is flushed.



Anyway, this solution makes the async printk less helpful.
Any message that does not end with a new line will force
the sync mode and the risk of the soft lookup.


Another solution would be to remember cont.cons outside
of struct cont. Then the cont buffer could be reused
immediately.

The saved position will be used either for the actual
cont buffer or for the very first message in the ring
buffer. The situation might be detected by comparing
console_seq, log_next_seq, and log_first_seq.

if (console_seq == log_next_seq)
   cont buffer still 

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Petr Mladek
On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> On Fri, May 13, 2016 at 6:18 AM, Sergey Senozhatsky
>  wrote:
> > Hello,
> >
> > no code changes, just refreshing the series so it'll be easier to
> > pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
> >
> > This patch permits to change printk() to operate in completely
> > asynchronous mode: new messages are getting upended to the kernel
> > printk buffer, but instead of 'direct' printing the actual print
> > job is performed by a dedicated kthread. This has the advantage
> > that printing always happens from a schedulable context and thus
> > we don't lockup any particular CPU or even interrupts.
> >
> > against next-20160513
> >
> >
> > v12:
> > -- rename printk_kthread_can_run bool flag
> > -- update printk_kthread_can_run comment (Petr)
> > -- drop mutex from printk_sync_set(), sysfs writes are synchronised (Petr)
> 
> We were also screwed because of the problem this thread is trying to solve,
> which I posted last week [1].
> 
> And then I found this thread which fixed my issue, thanks a lot guys :)
> 
> Tested-by: Viresh Kumar 
> 
> Though, I would also like to mention one unwanted thing that happened on
> my setup :)
> 
> [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 (cpu 
> 0)
> [   12.874909] potential CPU hogs:
> [   12.874909]  printk (292)
> 
> On my system, the excessive printing happens during suspend/resume and this
> happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> and
> that was doing printing for a long time and so these errors :)
> 
> It resulted in missing some print messages eventually as the scheduler 
> probably
> didn't schedule this thread for sometime after that.
> 
> Will it be fine to get the priority of this kthread to a somewhat lower value,
> etc ?

I think that this patch helped only by chance. It causes that any
message without a new line will force printk to the sync mode.
Then it will print even the buffered messages immediately. It
causes printk to behave more or less in the sync mode all the time.

I am still scratching my head about the problem fixed by this patch
and also about suspend problems.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-12 Thread Petr Mladek
On Mon 2016-07-11 12:25:11, Viresh Kumar wrote:
> On Fri, May 13, 2016 at 6:18 AM, Sergey Senozhatsky
>  wrote:
> > Hello,
> >
> > no code changes, just refreshing the series so it'll be easier to
> > pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
> >
> > This patch permits to change printk() to operate in completely
> > asynchronous mode: new messages are getting upended to the kernel
> > printk buffer, but instead of 'direct' printing the actual print
> > job is performed by a dedicated kthread. This has the advantage
> > that printing always happens from a schedulable context and thus
> > we don't lockup any particular CPU or even interrupts.
> >
> > against next-20160513
> >
> >
> > v12:
> > -- rename printk_kthread_can_run bool flag
> > -- update printk_kthread_can_run comment (Petr)
> > -- drop mutex from printk_sync_set(), sysfs writes are synchronised (Petr)
> 
> We were also screwed because of the problem this thread is trying to solve,
> which I posted last week [1].
> 
> And then I found this thread which fixed my issue, thanks a lot guys :)
> 
> Tested-by: Viresh Kumar 
> 
> Though, I would also like to mention one unwanted thing that happened on
> my setup :)
> 
> [   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 (cpu 
> 0)
> [   12.874909] potential CPU hogs:
> [   12.874909]  printk (292)
> 
> On my system, the excessive printing happens during suspend/resume and this
> happened after all the non-boot CPUs were offlined. So, only CPU 0 was left 
> and
> that was doing printing for a long time and so these errors :)
> 
> It resulted in missing some print messages eventually as the scheduler 
> probably
> didn't schedule this thread for sometime after that.
> 
> Will it be fine to get the priority of this kthread to a somewhat lower value,
> etc ?

I think that this patch helped only by chance. It causes that any
message without a new line will force printk to the sync mode.
Then it will print even the buffered messages immediately. It
causes printk to behave more or less in the sync mode all the time.

I am still scratching my head about the problem fixed by this patch
and also about suspend problems.

Best Regards,
Petr


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-11 Thread Viresh Kumar
On Fri, May 13, 2016 at 6:18 AM, Sergey Senozhatsky
 wrote:
> Hello,
>
> no code changes, just refreshing the series so it'll be easier to
> pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
>
> This patch permits to change printk() to operate in completely
> asynchronous mode: new messages are getting upended to the kernel
> printk buffer, but instead of 'direct' printing the actual print
> job is performed by a dedicated kthread. This has the advantage
> that printing always happens from a schedulable context and thus
> we don't lockup any particular CPU or even interrupts.
>
> against next-20160513
>
>
> v12:
> -- rename printk_kthread_can_run bool flag
> -- update printk_kthread_can_run comment (Petr)
> -- drop mutex from printk_sync_set(), sysfs writes are synchronised (Petr)

We were also screwed because of the problem this thread is trying to solve,
which I posted last week [1].

And then I found this thread which fixed my issue, thanks a lot guys :)

Tested-by: Viresh Kumar 

Though, I would also like to mention one unwanted thing that happened on
my setup :)

[   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 (cpu 0)
[   12.874909] potential CPU hogs:
[   12.874909]  printk (292)

On my system, the excessive printing happens during suspend/resume and this
happened after all the non-boot CPUs were offlined. So, only CPU 0 was left and
that was doing printing for a long time and so these errors :)

It resulted in missing some print messages eventually as the scheduler probably
didn't schedule this thread for sometime after that.

Will it be fine to get the priority of this kthread to a somewhat lower value,
etc ?

--
viresh

[1] http://marc.info/?l=linux-kernel=146739243615687


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-07-11 Thread Viresh Kumar
On Fri, May 13, 2016 at 6:18 AM, Sergey Senozhatsky
 wrote:
> Hello,
>
> no code changes, just refreshing the series so it'll be easier to
> pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
>
> This patch permits to change printk() to operate in completely
> asynchronous mode: new messages are getting upended to the kernel
> printk buffer, but instead of 'direct' printing the actual print
> job is performed by a dedicated kthread. This has the advantage
> that printing always happens from a schedulable context and thus
> we don't lockup any particular CPU or even interrupts.
>
> against next-20160513
>
>
> v12:
> -- rename printk_kthread_can_run bool flag
> -- update printk_kthread_can_run comment (Petr)
> -- drop mutex from printk_sync_set(), sysfs writes are synchronised (Petr)

We were also screwed because of the problem this thread is trying to solve,
which I posted last week [1].

And then I found this thread which fixed my issue, thanks a lot guys :)

Tested-by: Viresh Kumar 

Though, I would also like to mention one unwanted thing that happened on
my setup :)

[   12.874909] sched: RT throttling activated for rt_rq ffc0ac13fcd0 (cpu 0)
[   12.874909] potential CPU hogs:
[   12.874909]  printk (292)

On my system, the excessive printing happens during suspend/resume and this
happened after all the non-boot CPUs were offlined. So, only CPU 0 was left and
that was doing printing for a long time and so these errors :)

It resulted in missing some print messages eventually as the scheduler probably
didn't schedule this thread for sometime after that.

Will it be fine to get the priority of this kthread to a somewhat lower value,
etc ?

--
viresh

[1] http://marc.info/?l=linux-kernel=146739243615687


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Sergey Senozhatsky
On (06/28/16 22:16), Joe Perches wrote:
> On Wed, 2016-06-29 at 14:08 +0900, Sergey Senozhatsky wrote:
> > the patch to fix the async KERN_CONT printk regression I mentioned
> > several days ago in another thread.
> 
> KERN_CONT is effectively a no-op.
> This is any printk without a KERN_

if the message has no LOG_NEWLINE in flags then it's stored as LOG_CONT
in log buffer, but before getting there it sits in cont buffer. and those
are usually KERN_CONT messages that don't have LOG_NEWLINE in flags. may
be the wording is not exactly correct in the original message.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Sergey Senozhatsky
On (06/28/16 22:16), Joe Perches wrote:
> On Wed, 2016-06-29 at 14:08 +0900, Sergey Senozhatsky wrote:
> > the patch to fix the async KERN_CONT printk regression I mentioned
> > several days ago in another thread.
> 
> KERN_CONT is effectively a no-op.
> This is any printk without a KERN_

if the message has no LOG_NEWLINE in flags then it's stored as LOG_CONT
in log buffer, but before getting there it sits in cont buffer. and those
are usually KERN_CONT messages that don't have LOG_NEWLINE in flags. may
be the wording is not exactly correct in the original message.

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Joe Perches
On Wed, 2016-06-29 at 14:08 +0900, Sergey Senozhatsky wrote:
> the patch to fix the async KERN_CONT printk regression I mentioned
> several days ago in another thread.

KERN_CONT is effectively a no-op.
This is any printk without a KERN_




Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Joe Perches
On Wed, 2016-06-29 at 14:08 +0900, Sergey Senozhatsky wrote:
> the patch to fix the async KERN_CONT printk regression I mentioned
> several days ago in another thread.

KERN_CONT is effectively a no-op.
This is any printk without a KERN_




Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Sergey Senozhatsky
Hello,

the patch to fix the async KERN_CONT printk regression I mentioned
several days ago in another thread.

as a separate patch for now, to ease the review. will squash with 0001.

== 8< ==

>From 70f65ed55eb82a1b8b74fdbd1a7afc9e77e9b380 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Subject: [PATCH] printk: print KERN_CONT buffer in sync printk mode

One of my boxen has revealed that there is a race condition now between
pr_cot() output and printk_kthread wake_up() time. Schematically, the CPU
that triggered the race conditions was doing
something like this:

for (...) {
for () {
...
pr_cont(...);
...
}
pr_cont("\n");
}

the resulting output was:

$ dmesg

0001 0002 0003 0004 0005 0006 0007
0008
0009 000a 000b 000c 000d 000e
000f 0010 0011 0012 0013 0014 0015
002b 002c 002d 002e 002f 0030 0031



KERN_CONT output is heavily relying on the fact there will be no other
CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s simultaneously.
This basically means that pr_cont() CPU is expected to be the one that actually
does the printing and log_store() of the entire cont buffer:

pr_cont()
console_unlock()
console_cont_flush()
log_store() /* cont buffer */
pr_cont()
console_unlock()
console_cont_flush()
log_store() /* cont buffer */


Async printk breaks this contract, because console_cont_flush() is now deferred.
Thus pr_cont() will see a not fully flushed cont buffer, which will force it to
append a new KERN_CONT not to cont buffer, but to buffer log as a separate log
entry via log_store(). Due to the deferred nature of console_cont_flush() the
very next pr_cont() may see cont buffer being flushed, so it will keep the
message in the cont buffer and log_store() it later. So the cont line will split
across several log entries -- printk_kthread can take some time to wkeup,
during which pr_cont() will keep splitting cont line:

$ dmesg

0001 0002 0003 0004 0005 0006 0007
0008
0009
000a
000b
000c
000d
000e


To fix the issue we need to switch to sync printk mode for KERN_CONT messages
and when we cont_flush() the cont buffer.


CPU0CPU1

vprintk_emit(KERN_CONT "0001")
if (!(lflags & LOG_NEWLINE))
cont_add()
if (!cont.len)
starts a new cont line
wake_up() printk_kthread
...

vprintk_emit(KERN_CONT "004")   entering 
console_unlock()
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthread

vprintk_emit(KERN_CONT "005")
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthreadconsole_cont_flush()

cont_print_text()

vprintk_emit(KERN_CONT "006")
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthread
leaving console_unlock()

vprintk_emit(KERN_CONT "007")
if (!(lflags & LOG_NEWLINE))entering 
console_unlock()
cont_add()
wake_up() printk_kthreadconsole_cont_flush()

cont_print_text()
leaving console_unlock()
vprintk_emit(KERN_CONT "\n")
if (lflags & LOG_NEWLINE)
if (cont.len) {
cont_add()
cont_flush()
if (cont.cons) {
cont.flags = flags;
cont.flushed = true;

}
log_store() /* "001 .. 007" */
^^
}
wake_up() process   leaving console_unlock()

vprintk_emit(KERN_CONT "008")
if (!(lflags & LOG_NEWLINE))entering 
console_unlock()
cont_add()
{
if (cont.len && cont.flushed)
^^^  not flushed yet!
return false;
...
}

log_store() /* 

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-28 Thread Sergey Senozhatsky
Hello,

the patch to fix the async KERN_CONT printk regression I mentioned
several days ago in another thread.

as a separate patch for now, to ease the review. will squash with 0001.

== 8< ==

>From 70f65ed55eb82a1b8b74fdbd1a7afc9e77e9b380 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Subject: [PATCH] printk: print KERN_CONT buffer in sync printk mode

One of my boxen has revealed that there is a race condition now between
pr_cot() output and printk_kthread wake_up() time. Schematically, the CPU
that triggered the race conditions was doing
something like this:

for (...) {
for () {
...
pr_cont(...);
...
}
pr_cont("\n");
}

the resulting output was:

$ dmesg

0001 0002 0003 0004 0005 0006 0007
0008
0009 000a 000b 000c 000d 000e
000f 0010 0011 0012 0013 0014 0015
002b 002c 002d 002e 002f 0030 0031



KERN_CONT output is heavily relying on the fact there will be no other
CPU(-s)/process(-es) doing any sort (KERN_CONT etc.) of printk-s simultaneously.
This basically means that pr_cont() CPU is expected to be the one that actually
does the printing and log_store() of the entire cont buffer:

pr_cont()
console_unlock()
console_cont_flush()
log_store() /* cont buffer */
pr_cont()
console_unlock()
console_cont_flush()
log_store() /* cont buffer */


Async printk breaks this contract, because console_cont_flush() is now deferred.
Thus pr_cont() will see a not fully flushed cont buffer, which will force it to
append a new KERN_CONT not to cont buffer, but to buffer log as a separate log
entry via log_store(). Due to the deferred nature of console_cont_flush() the
very next pr_cont() may see cont buffer being flushed, so it will keep the
message in the cont buffer and log_store() it later. So the cont line will split
across several log entries -- printk_kthread can take some time to wkeup,
during which pr_cont() will keep splitting cont line:

$ dmesg

0001 0002 0003 0004 0005 0006 0007
0008
0009
000a
000b
000c
000d
000e


To fix the issue we need to switch to sync printk mode for KERN_CONT messages
and when we cont_flush() the cont buffer.


CPU0CPU1

vprintk_emit(KERN_CONT "0001")
if (!(lflags & LOG_NEWLINE))
cont_add()
if (!cont.len)
starts a new cont line
wake_up() printk_kthread
...

vprintk_emit(KERN_CONT "004")   entering 
console_unlock()
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthread

vprintk_emit(KERN_CONT "005")
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthreadconsole_cont_flush()

cont_print_text()

vprintk_emit(KERN_CONT "006")
if (!(lflags & LOG_NEWLINE))
cont_add()
wake_up() printk_kthread
leaving console_unlock()

vprintk_emit(KERN_CONT "007")
if (!(lflags & LOG_NEWLINE))entering 
console_unlock()
cont_add()
wake_up() printk_kthreadconsole_cont_flush()

cont_print_text()
leaving console_unlock()
vprintk_emit(KERN_CONT "\n")
if (lflags & LOG_NEWLINE)
if (cont.len) {
cont_add()
cont_flush()
if (cont.cons) {
cont.flags = flags;
cont.flushed = true;

}
log_store() /* "001 .. 007" */
^^
}
wake_up() process   leaving console_unlock()

vprintk_emit(KERN_CONT "008")
if (!(lflags & LOG_NEWLINE))entering 
console_unlock()
cont_add()
{
if (cont.len && cont.flushed)
^^^  not flushed yet!
return false;
...
}

log_store() /* "008" */

Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-08 Thread Sergey Senozhatsky
On (05/13/16 22:18), Sergey Senozhatsky wrote:
> Hello,
> 
> no code changes, just refreshing the series so it'll be easier to
> pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
> 

Andrew,
Did you get a chance to look at the series?

-ss


Re: [PATCH v12 0/3] printk: Make printk() completely async

2016-06-08 Thread Sergey Senozhatsky
On (05/13/16 22:18), Sergey Senozhatsky wrote:
> Hello,
> 
> no code changes, just refreshing the series so it'll be easier to
> pick up. added Reviwed-by-s and corrected a typo spotted by Petr.
> 

Andrew,
Did you get a chance to look at the series?

-ss