Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-03-02 Thread Jan Kara
On Tue 01-03-16 18:22:25, Denys Vlasenko wrote:
> On Mon, Oct 26, 2015 at 5:52 AM, Jan Kara  wrote:
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> 
> 
> > +/*
> > + * Number of kernel threads for offloading printing. We need at least two 
> > so
> > + * that they can hand over printing from one to another one and thus switch
> > + * CPUs.
> > + */
> > +#define PRINTING_TASKS 2
> > +
> > +/* Wait queue printing kthreads sleep on when idle */
> > +static DECLARE_WAIT_QUEUE_HEAD(print_queue);
> 
> Having two tasks, not one, for printking for the case
> when console output is slow... sounds wasteful.
> 
> Can this be improved so that only one task is needed?

Probably we'll go with workqueue in the next version of the patch series.
But at least in this version you needed two tasks so that one task can hand
over printing to the other one and thus relieve the load from a CPU.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-03-02 Thread Jan Kara
On Tue 01-03-16 18:22:25, Denys Vlasenko wrote:
> On Mon, Oct 26, 2015 at 5:52 AM, Jan Kara  wrote:
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> 
> 
> > +/*
> > + * Number of kernel threads for offloading printing. We need at least two 
> > so
> > + * that they can hand over printing from one to another one and thus switch
> > + * CPUs.
> > + */
> > +#define PRINTING_TASKS 2
> > +
> > +/* Wait queue printing kthreads sleep on when idle */
> > +static DECLARE_WAIT_QUEUE_HEAD(print_queue);
> 
> Having two tasks, not one, for printking for the case
> when console output is slow... sounds wasteful.
> 
> Can this be improved so that only one task is needed?

Probably we'll go with workqueue in the next version of the patch series.
But at least in this version you needed two tasks so that one task can hand
over printing to the other one and thus relieve the load from a CPU.

Honza
-- 
Jan Kara 
SUSE Labs, CR


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-03-01 Thread Denys Vlasenko
On Mon, Oct 26, 2015 at 5:52 AM, Jan Kara  wrote:
> This patch implements a mechanism where after printing specified number
> of characters (tunable as a kernel parameter printk.offload_chars), CPU
> doing printing asks for help by waking up one of dedicated kthreads.  As
> soon as the printing CPU notices kthread got scheduled and is spinning
> on print_lock dedicated for that purpose, it drops console_sem,
> print_lock, and exits console_unlock(). Kthread then takes over printing
> instead. This way no CPU should spend printing too long even if there
> is heavy printk traffic.


> +/*
> + * Number of kernel threads for offloading printing. We need at least two so
> + * that they can hand over printing from one to another one and thus switch
> + * CPUs.
> + */
> +#define PRINTING_TASKS 2
> +
> +/* Wait queue printing kthreads sleep on when idle */
> +static DECLARE_WAIT_QUEUE_HEAD(print_queue);

Having two tasks, not one, for printking for the case
when console output is slow... sounds wasteful.

Can this be improved so that only one task is needed?


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-03-01 Thread Denys Vlasenko
On Mon, Oct 26, 2015 at 5:52 AM, Jan Kara  wrote:
> This patch implements a mechanism where after printing specified number
> of characters (tunable as a kernel parameter printk.offload_chars), CPU
> doing printing asks for help by waking up one of dedicated kthreads.  As
> soon as the printing CPU notices kthread got scheduled and is spinning
> on print_lock dedicated for that purpose, it drops console_sem,
> print_lock, and exits console_unlock(). Kthread then takes over printing
> instead. This way no CPU should spend printing too long even if there
> is heavy printk traffic.


> +/*
> + * Number of kernel threads for offloading printing. We need at least two so
> + * that they can hand over printing from one to another one and thus switch
> + * CPUs.
> + */
> +#define PRINTING_TASKS 2
> +
> +/* Wait queue printing kthreads sleep on when idle */
> +static DECLARE_WAIT_QUEUE_HEAD(print_queue);

Having two tasks, not one, for printking for the case
when console output is slow... sounds wasteful.

Can this be improved so that only one task is needed?


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Jan Kara
On Wed 06-01-16 15:48:36, Sergey Senozhatsky wrote:
> On (01/05/16 15:37), Jan Kara wrote:
> > > How about setting 'sync_print' to 'true' in...
> > >   bust_spinlocks() /* only set to true */
> > > or
> > >   console_verbose() /* um... may be... */
> > > or
> > >   having a separate one-liner for that
> > > 
> > > void console_panic_mode(void)
> > > {
> > >   sync_print = true;
> > > }
> > > 
> > > and call it early in panic(), before we send out IPI_STOP.
> > 
> > I like using console_verbose() for setting sync_print to true. That will
> > likely be more reliable than using oops in progress. After all
> > console_verbose() is used like console_panic_mode() anyway and in quite a
> > few places so it is a reasonable match.
> 
> another corner case.
> 
> a quote from -mm a74b6533ead8 
> http://www.spinics.net/lists/linux-mm/msg98990.html
> 
> :This patch reduces the probability of such a lockup by introducing a
> :specialized kernel thread (oom_reaper) which tries to reclaim additional
> :memory by preemptively reaping the anonymous or swapped out memory owned
> :by the oom victim under an assumption that such a memory won't be needed
> :when its owner is killed and kicked from the userspace anyway.  There is
> :one notable exception to this, though, if the OOM victim was in the
> :process of coredumping the result would be incomplete.  This is 
> considered
> :a reasonable constrain because the overall system health is more 
> important
> :than debugability of a particular application.
> :
> :A kernel thread has been chosen because we need a reliable way of
> :invocation so workqueue context is not appropriate because all the 
> workers
> :might be busy (e.g.  allocating memory).  Kswapd which sounds like 
> another
> :good fit is not appropriate as well because it might get blocked on locks
> :during reclaim as well.
> 
> particularly this "workqueue context is not appropriate because all the 
> workers
> might be busy (e.g.  allocating memory)" part. I think printk should switch to
> sync mode in this case, since printk now does queue_work(system_wq, work).
> um... console_verbose() call from oom kill? but it'll be nice to return back
> to async mode once (if) memory pressure goes away.

Hum, yes, some mechanism to switch to sync printing in case work cannot be
executed for a long time is probably needed. I'll think about it.

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Sergey Senozhatsky
On (01/06/16 11:21), Jan Kara wrote:
[..]
> > 
> > or like this on another setup ([k|u]_ts updated to u64)
> > 
> > # cat /proc/1/time_in_console_unlock
> > kern:[12.755920] user:[38.367332]
> 
> So maybe that is worth addressing if it bothers you but please as a
> separate patch set. This seems fairly independent and I think even current
> version of the patches will be controversial enough...

Agree

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Jan Kara
On Wed 06-01-16 17:36:53, Sergey Senozhatsky wrote:
> On (01/06/16 12:38), Sergey Senozhatsky wrote:
> > On (01/05/16 15:48), Jan Kara wrote:
> > > > [..]
> > > > > cond_resched() does its job there, of course. well, a user process 
> > > > > still can
> > > > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > > > calling
> > > > > console_unlock() and if we have "!printk_sync && !oops_in_progress" 
> > > > > (or just printk_sync
> > > > > test) AND a user process then return from console_unlock() doing 
> > > > > irq_work_queue()
> > > > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() 
> > > > > does it.
> > > > 
> > > > attached two patches, I ended up having on top of yours. just in case.
> > > > 
> > > > printk: factor out can_printk_async()
> > > > 
> > > > console_unlock() can be called directly or indirectly by a user
> > > > space process, so it can end up doing call_console_drivers() loop,
> > > > which will hold it from returning back to user-space from a syscall
> > > > for unpredictable amount of time.
> > > > 
> > > > Factor out can_printk_async() function, which queues an irq work and
> > > > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async 
> > > > printk).
> > > > vprintk_emit() already does it, add can_printk_async() call to
> > > > console_unlock() for !PF_KTHREAD processes.
> > > 
> > > I'd be cautious about changing this userspace visible behavior. Someone 
> > > may
> > > be relying on it... I agree that sometimes we can block userspace process
> > > in kernel for a long time (e.g. in my testing I often see syslog process
> > > doing the printing) but so far I didn't see / was notified about some real
> > > problem with this. So unless I see some real user issues with user
> > > processes doing printing for too long I would not touch this.
> > 
> > and w/o a lot of effort (no heavy printk message traffic)
> 
> or like this on another setup ([k|u]_ts updated to u64)
> 
> # cat /proc/1/time_in_console_unlock
> kern:[12.755920] user:[38.367332]

So maybe that is worth addressing if it bothers you but please as a
separate patch set. This seems fairly independent and I think even current
version of the patches will be controversial enough...

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Sergey Senozhatsky
On (01/06/16 12:38), Sergey Senozhatsky wrote:
> On (01/05/16 15:48), Jan Kara wrote:
> > > [..]
> > > > cond_resched() does its job there, of course. well, a user process 
> > > > still can
> > > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > > calling
> > > > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > > > just printk_sync
> > > > test) AND a user process then return from console_unlock() doing 
> > > > irq_work_queue()
> > > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does 
> > > > it.
> > > 
> > > attached two patches, I ended up having on top of yours. just in case.
> > > 
> > > printk: factor out can_printk_async()
> > > 
> > > console_unlock() can be called directly or indirectly by a user
> > > space process, so it can end up doing call_console_drivers() loop,
> > > which will hold it from returning back to user-space from a syscall
> > > for unpredictable amount of time.
> > > 
> > > Factor out can_printk_async() function, which queues an irq work and
> > > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> > > vprintk_emit() already does it, add can_printk_async() call to
> > > console_unlock() for !PF_KTHREAD processes.
> > 
> > I'd be cautious about changing this userspace visible behavior. Someone may
> > be relying on it... I agree that sometimes we can block userspace process
> > in kernel for a long time (e.g. in my testing I often see syslog process
> > doing the printing) but so far I didn't see / was notified about some real
> > problem with this. So unless I see some real user issues with user
> > processes doing printing for too long I would not touch this.
> 
> and w/o a lot of effort (no heavy printk message traffic)

or like this on another setup ([k|u]_ts updated to u64)

# cat /proc/1/time_in_console_unlock
kern:[12.755920] user:[38.367332]

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Sergey Senozhatsky
On (01/06/16 12:38), Sergey Senozhatsky wrote:
> On (01/05/16 15:48), Jan Kara wrote:
> > > [..]
> > > > cond_resched() does its job there, of course. well, a user process 
> > > > still can
> > > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > > calling
> > > > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > > > just printk_sync
> > > > test) AND a user process then return from console_unlock() doing 
> > > > irq_work_queue()
> > > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does 
> > > > it.
> > > 
> > > attached two patches, I ended up having on top of yours. just in case.
> > > 
> > > printk: factor out can_printk_async()
> > > 
> > > console_unlock() can be called directly or indirectly by a user
> > > space process, so it can end up doing call_console_drivers() loop,
> > > which will hold it from returning back to user-space from a syscall
> > > for unpredictable amount of time.
> > > 
> > > Factor out can_printk_async() function, which queues an irq work and
> > > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> > > vprintk_emit() already does it, add can_printk_async() call to
> > > console_unlock() for !PF_KTHREAD processes.
> > 
> > I'd be cautious about changing this userspace visible behavior. Someone may
> > be relying on it... I agree that sometimes we can block userspace process
> > in kernel for a long time (e.g. in my testing I often see syslog process
> > doing the printing) but so far I didn't see / was notified about some real
> > problem with this. So unless I see some real user issues with user
> > processes doing printing for too long I would not touch this.
> 
> and w/o a lot of effort (no heavy printk message traffic)

or like this on another setup ([k|u]_ts updated to u64)

# cat /proc/1/time_in_console_unlock
kern:[12.755920] user:[38.367332]

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Jan Kara
On Wed 06-01-16 17:36:53, Sergey Senozhatsky wrote:
> On (01/06/16 12:38), Sergey Senozhatsky wrote:
> > On (01/05/16 15:48), Jan Kara wrote:
> > > > [..]
> > > > > cond_resched() does its job there, of course. well, a user process 
> > > > > still can
> > > > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > > > calling
> > > > > console_unlock() and if we have "!printk_sync && !oops_in_progress" 
> > > > > (or just printk_sync
> > > > > test) AND a user process then return from console_unlock() doing 
> > > > > irq_work_queue()
> > > > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() 
> > > > > does it.
> > > > 
> > > > attached two patches, I ended up having on top of yours. just in case.
> > > > 
> > > > printk: factor out can_printk_async()
> > > > 
> > > > console_unlock() can be called directly or indirectly by a user
> > > > space process, so it can end up doing call_console_drivers() loop,
> > > > which will hold it from returning back to user-space from a syscall
> > > > for unpredictable amount of time.
> > > > 
> > > > Factor out can_printk_async() function, which queues an irq work and
> > > > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async 
> > > > printk).
> > > > vprintk_emit() already does it, add can_printk_async() call to
> > > > console_unlock() for !PF_KTHREAD processes.
> > > 
> > > I'd be cautious about changing this userspace visible behavior. Someone 
> > > may
> > > be relying on it... I agree that sometimes we can block userspace process
> > > in kernel for a long time (e.g. in my testing I often see syslog process
> > > doing the printing) but so far I didn't see / was notified about some real
> > > problem with this. So unless I see some real user issues with user
> > > processes doing printing for too long I would not touch this.
> > 
> > and w/o a lot of effort (no heavy printk message traffic)
> 
> or like this on another setup ([k|u]_ts updated to u64)
> 
> # cat /proc/1/time_in_console_unlock
> kern:[12.755920] user:[38.367332]

So maybe that is worth addressing if it bothers you but please as a
separate patch set. This seems fairly independent and I think even current
version of the patches will be controversial enough...

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Sergey Senozhatsky
On (01/06/16 11:21), Jan Kara wrote:
[..]
> > 
> > or like this on another setup ([k|u]_ts updated to u64)
> > 
> > # cat /proc/1/time_in_console_unlock
> > kern:[12.755920] user:[38.367332]
> 
> So maybe that is worth addressing if it bothers you but please as a
> separate patch set. This seems fairly independent and I think even current
> version of the patches will be controversial enough...

Agree

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-06 Thread Jan Kara
On Wed 06-01-16 15:48:36, Sergey Senozhatsky wrote:
> On (01/05/16 15:37), Jan Kara wrote:
> > > How about setting 'sync_print' to 'true' in...
> > >   bust_spinlocks() /* only set to true */
> > > or
> > >   console_verbose() /* um... may be... */
> > > or
> > >   having a separate one-liner for that
> > > 
> > > void console_panic_mode(void)
> > > {
> > >   sync_print = true;
> > > }
> > > 
> > > and call it early in panic(), before we send out IPI_STOP.
> > 
> > I like using console_verbose() for setting sync_print to true. That will
> > likely be more reliable than using oops in progress. After all
> > console_verbose() is used like console_panic_mode() anyway and in quite a
> > few places so it is a reasonable match.
> 
> another corner case.
> 
> a quote from -mm a74b6533ead8 
> http://www.spinics.net/lists/linux-mm/msg98990.html
> 
> :This patch reduces the probability of such a lockup by introducing a
> :specialized kernel thread (oom_reaper) which tries to reclaim additional
> :memory by preemptively reaping the anonymous or swapped out memory owned
> :by the oom victim under an assumption that such a memory won't be needed
> :when its owner is killed and kicked from the userspace anyway.  There is
> :one notable exception to this, though, if the OOM victim was in the
> :process of coredumping the result would be incomplete.  This is 
> considered
> :a reasonable constrain because the overall system health is more 
> important
> :than debugability of a particular application.
> :
> :A kernel thread has been chosen because we need a reliable way of
> :invocation so workqueue context is not appropriate because all the 
> workers
> :might be busy (e.g.  allocating memory).  Kswapd which sounds like 
> another
> :good fit is not appropriate as well because it might get blocked on locks
> :during reclaim as well.
> 
> particularly this "workqueue context is not appropriate because all the 
> workers
> might be busy (e.g.  allocating memory)" part. I think printk should switch to
> sync mode in this case, since printk now does queue_work(system_wq, work).
> um... console_verbose() call from oom kill? but it'll be nice to return back
> to async mode once (if) memory pressure goes away.

Hum, yes, some mechanism to switch to sync printing in case work cannot be
executed for a long time is probably needed. I'll think about it.

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
On (01/05/16 15:37), Jan Kara wrote:
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;
> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.
> 
> I like using console_verbose() for setting sync_print to true. That will
> likely be more reliable than using oops in progress. After all
> console_verbose() is used like console_panic_mode() anyway and in quite a
> few places so it is a reasonable match.

another corner case.

a quote from -mm a74b6533ead8 
http://www.spinics.net/lists/linux-mm/msg98990.html

:This patch reduces the probability of such a lockup by introducing a
:specialized kernel thread (oom_reaper) which tries to reclaim additional
:memory by preemptively reaping the anonymous or swapped out memory owned
:by the oom victim under an assumption that such a memory won't be needed
:when its owner is killed and kicked from the userspace anyway.  There is
:one notable exception to this, though, if the OOM victim was in the
:process of coredumping the result would be incomplete.  This is considered
:a reasonable constrain because the overall system health is more important
:than debugability of a particular application.
:
:A kernel thread has been chosen because we need a reliable way of
:invocation so workqueue context is not appropriate because all the workers
:might be busy (e.g.  allocating memory).  Kswapd which sounds like another
:good fit is not appropriate as well because it might get blocked on locks
:during reclaim as well.

particularly this "workqueue context is not appropriate because all the workers
might be busy (e.g.  allocating memory)" part. I think printk should switch to
sync mode in this case, since printk now does queue_work(system_wq, work).
um... console_verbose() call from oom kill? but it'll be nice to return back
to async mode once (if) memory pressure goes away.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
Hello,

On (01/05/16 15:48), Jan Kara wrote:
> > [..]
> > > cond_resched() does its job there, of course. well, a user process still 
> > > can
> > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > calling
> > > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > > just printk_sync
> > > test) AND a user process then return from console_unlock() doing 
> > > irq_work_queue()
> > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.
> > 
> > attached two patches, I ended up having on top of yours. just in case.
> > 
> > printk: factor out can_printk_async()
> > 
> > console_unlock() can be called directly or indirectly by a user
> > space process, so it can end up doing call_console_drivers() loop,
> > which will hold it from returning back to user-space from a syscall
> > for unpredictable amount of time.
> > 
> > Factor out can_printk_async() function, which queues an irq work and
> > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> > vprintk_emit() already does it, add can_printk_async() call to
> > console_unlock() for !PF_KTHREAD processes.
> 
> I'd be cautious about changing this userspace visible behavior. Someone may
> be relying on it... I agree that sometimes we can block userspace process
> in kernel for a long time (e.g. in my testing I often see syslog process
> doing the printing) but so far I didn't see / was notified about some real
> problem with this. So unless I see some real user issues with user
> processes doing printing for too long I would not touch this.

hm, interesting point.

/* random thoughts, I'm still on sick leave */

do we really have a user visible behaviour here that it really wants to have?
a task that does tty_open, for instance, hardly wants to end up doing a bunch
of call_console_drivers() calls in console_unlock(). it does look to me mostly
as unexpected side effect.

I probably can imagine someone writing a /usr/bin/flush_logs_to_serial
app that specifically depends on that behaviour, but that will require a bit
of hackery and trickery, since (seems) there is no syscall this app can call
that will perform *only* the required action:

void force_flush_logs_to_serial(void)
{
console_lock();
console_unlock();
}

returning back to user space from syscalls quicker is a good thing, I'd
prefer user space apps to do less kernel job (by kernel job I mean
call_console_drivers() loop). well, at least on admittedly weird setups
that I have to deal with. but I may be missing something here.

some numbers

I added global `unsigned long k_ts, u_ts;' to accumulate time spent
in console_unlock() by PF_KTHREAD and !PF_KTHREAD correspondingly.

void console_unlock(void)
{
...
s_ts = local_clock();
console_cont_flush(text, sizeof(text));
again:

...

if (wake_klogd)
wake_up_klogd();

e_ts = local_clock();
if (time_after(e_ts, s_ts)) {
if (current->flags & PF_KTHREAD)
k_ts += (e_ts - s_ts);
else
u_ts += (e_ts - s_ts);
}
}

and a procfs file to read the values

 unsigned long k = k_ts;
 unsigned long u = u_ts;
 unsigned long rem_nsec_k = do_div(k, 10);
 unsigned long rem_nsec_u = do_div(u, 10);

 return sprintf(buf, "kern:[%lu.%06lu] user:[%lu.%06lu]\n",
 k, rem_nsec_k / 1000,
 u, rem_nsec_u / 1000);

and w/o a lot of effort (no heavy printk message traffic)

$ cat /proc/1/time_in_console_unlock
kern:[4.241475] user:[4.077787]

that's user space spent almost the same amount of time to print kernel
messages as the kernel did on its own. which is hard to formulate as an
issue, it's just user space was doing for 4 seconds something it was not
really meant to do (at least from user space app developer's point of
view); so there is an unpredictable additional cost X added to some of
the syscalls.

-ss

> > printk: introduce console_sync_mode
> > 
> > console_sync_mode() should be called early in panic() to switch
> > printk from async mode to sync. Otherwise, STOP IPIs can arrive
> > to other CPUs too late and those CPUs will see oops_in_progress
> > being 0 again.
> 
> So as I wrote, I like this in principle but there are much more places
> calling console_verbose() and all of them want console_sync_mode() as well.
> So I prefer hiding the sync printing in console_verbose() and possibly
> renaming it to something better but I'm not sure renaming is worth it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
Hello,

On (01/05/16 15:37), Jan Kara wrote:
> Hi,
> 
> On Wed 23-12-15 10:54:49, Sergey Senozhatsky wrote:
> > slowly looking through the patches.
> 
> Back from Christmas vacation...
> 
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;
> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.
> 
> I like using console_verbose() for setting sync_print to true. That will
> likely be more reliable than using oops in progress. After all
> console_verbose() is used like console_panic_mode() anyway and in quite a
> few places so it is a reasonable match.

Agree, only arch/microblaze/kernel/setup.c and arch/nios2/kernel/setup.c
do console_verbose() early in setup_arch(), the rest seems to be what I
was thinking of.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Jan Kara
On Thu 31-12-15 13:58:59, Sergey Senozhatsky wrote:
> On (12/31/15 12:13), Sergey Senozhatsky wrote:
> [..]
> > cond_resched() does its job there, of course. well, a user process still can
> > do a lot of call_console_drivers() calls. may be we can check who is calling
> > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > just printk_sync
> > test) AND a user process then return from console_unlock() doing 
> > irq_work_queue()
> > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.
> 
> attached two patches, I ended up having on top of yours. just in case.
> 
> printk: factor out can_printk_async()
> 
> console_unlock() can be called directly or indirectly by a user
> space process, so it can end up doing call_console_drivers() loop,
> which will hold it from returning back to user-space from a syscall
> for unpredictable amount of time.
> 
> Factor out can_printk_async() function, which queues an irq work and
> sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> vprintk_emit() already does it, add can_printk_async() call to
> console_unlock() for !PF_KTHREAD processes.

I'd be cautious about changing this userspace visible behavior. Someone may
be relying on it... I agree that sometimes we can block userspace process
in kernel for a long time (e.g. in my testing I often see syslog process
doing the printing) but so far I didn't see / was notified about some real
problem with this. So unless I see some real user issues with user
processes doing printing for too long I would not touch this.
 
> and
> 
> printk: introduce console_sync_mode
> 
> console_sync_mode() should be called early in panic() to switch
> printk from async mode to sync. Otherwise, STOP IPIs can arrive
> to other CPUs too late and those CPUs will see oops_in_progress
> being 0 again.

So as I wrote, I like this in principle but there are much more places
calling console_verbose() and all of them want console_sync_mode() as well.
So I prefer hiding the sync printing in console_verbose() and possibly
renaming it to something better but I'm not sure renaming is worth it.

Honza

> 
>   -ss

> From c3fc955809adab8f497cdc7581e67e1fa29d6517 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Date: Wed, 30 Dec 2015 20:39:12 +0900
> Subject: [PATCH 1/2] printk: introduce console_sync_mode
> 
> console_sync_mode() should be called early in panic() to switch
> printk from async mode to sync. Otherwise, STOP IPIs can arrive
> to other CPUs too late and those CPUs will see oops_in_progress
> being 0 again.
> ---
>  include/linux/console.h | 1 +
>  kernel/panic.c  | 1 +
>  kernel/printk/printk.c  | 5 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/include/linux/console.h b/include/linux/console.h
> index bd19434..f068985 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -150,6 +150,7 @@ extern int console_trylock(void);
>  extern void console_unlock(void);
>  extern void console_conditional_schedule(void);
>  extern void console_unblank(void);
> +extern void console_sync_mode(void);
>  extern struct tty_driver *console_device(int *);
>  extern void console_stop(struct console *);
>  extern void console_start(struct console *);
> diff --git a/kernel/panic.c b/kernel/panic.c
> index b80..04c8ff4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -117,6 +117,7 @@ void panic(const char *fmt, ...)
>   if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
>   panic_smp_self_stop();
>  
> + console_sync_mode();
>   console_verbose();
>   bust_spinlocks(1);
>   va_start(args, fmt);
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index de9d31b..47a70a2 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2501,6 +2501,11 @@ void console_unblank(void)
>   console_unlock();
>  }
>  
> +void console_sync_mode(void)
> +{
> + printk_sync = true;
> +}
> +
>  /*
>   * Return the console tty driver structure and its associated index
>   */
> -- 
> 2.6.4
> 

> From 92f2c0f2a5ed015caa2757dcfec4407d708f8628 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Date: Thu, 31 Dec 2015 13:39:58 +0900
> Subject: [PATCH 2/2] printk: factor out can_printk_async()
> 
> console_unlock() can be called directly or indirectly by a user
> space process, so it can end up doing call_console_drivers() loop,
> which will hold it from returning back to user-space from a syscall
> for unpredictable amount of time.
> 
> Factor out can_printk_async() function, which queues an irq work and
> sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> vprintk_emit() already does it, add can_printk_async() call to
> console_unlock() for !PF_KTHREAD processes.
> ---
>  kernel/printk/printk.c | 42 --

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Jan Kara
Hi,

On Wed 23-12-15 10:54:49, Sergey Senozhatsky wrote:
> slowly looking through the patches.

Back from Christmas vacation...

> How about setting 'sync_print' to 'true' in...
>   bust_spinlocks() /* only set to true */
> or
>   console_verbose() /* um... may be... */
> or
>   having a separate one-liner for that
> 
> void console_panic_mode(void)
> {
>   sync_print = true;
> }
> 
> and call it early in panic(), before we send out IPI_STOP.

I like using console_verbose() for setting sync_print to true. That will
likely be more reliable than using oops in progress. After all
console_verbose() is used like console_panic_mode() anyway and in quite a
few places so it is a reasonable match.

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Jan Kara
Hi,

On Wed 23-12-15 10:54:49, Sergey Senozhatsky wrote:
> slowly looking through the patches.

Back from Christmas vacation...

> How about setting 'sync_print' to 'true' in...
>   bust_spinlocks() /* only set to true */
> or
>   console_verbose() /* um... may be... */
> or
>   having a separate one-liner for that
> 
> void console_panic_mode(void)
> {
>   sync_print = true;
> }
> 
> and call it early in panic(), before we send out IPI_STOP.

I like using console_verbose() for setting sync_print to true. That will
likely be more reliable than using oops in progress. After all
console_verbose() is used like console_panic_mode() anyway and in quite a
few places so it is a reasonable match.

Honza
-- 
Jan Kara 
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Jan Kara
On Thu 31-12-15 13:58:59, Sergey Senozhatsky wrote:
> On (12/31/15 12:13), Sergey Senozhatsky wrote:
> [..]
> > cond_resched() does its job there, of course. well, a user process still can
> > do a lot of call_console_drivers() calls. may be we can check who is calling
> > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > just printk_sync
> > test) AND a user process then return from console_unlock() doing 
> > irq_work_queue()
> > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.
> 
> attached two patches, I ended up having on top of yours. just in case.
> 
> printk: factor out can_printk_async()
> 
> console_unlock() can be called directly or indirectly by a user
> space process, so it can end up doing call_console_drivers() loop,
> which will hold it from returning back to user-space from a syscall
> for unpredictable amount of time.
> 
> Factor out can_printk_async() function, which queues an irq work and
> sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> vprintk_emit() already does it, add can_printk_async() call to
> console_unlock() for !PF_KTHREAD processes.

I'd be cautious about changing this userspace visible behavior. Someone may
be relying on it... I agree that sometimes we can block userspace process
in kernel for a long time (e.g. in my testing I often see syslog process
doing the printing) but so far I didn't see / was notified about some real
problem with this. So unless I see some real user issues with user
processes doing printing for too long I would not touch this.
 
> and
> 
> printk: introduce console_sync_mode
> 
> console_sync_mode() should be called early in panic() to switch
> printk from async mode to sync. Otherwise, STOP IPIs can arrive
> to other CPUs too late and those CPUs will see oops_in_progress
> being 0 again.

So as I wrote, I like this in principle but there are much more places
calling console_verbose() and all of them want console_sync_mode() as well.
So I prefer hiding the sync printing in console_verbose() and possibly
renaming it to something better but I'm not sure renaming is worth it.

Honza

> 
>   -ss

> From c3fc955809adab8f497cdc7581e67e1fa29d6517 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Date: Wed, 30 Dec 2015 20:39:12 +0900
> Subject: [PATCH 1/2] printk: introduce console_sync_mode
> 
> console_sync_mode() should be called early in panic() to switch
> printk from async mode to sync. Otherwise, STOP IPIs can arrive
> to other CPUs too late and those CPUs will see oops_in_progress
> being 0 again.
> ---
>  include/linux/console.h | 1 +
>  kernel/panic.c  | 1 +
>  kernel/printk/printk.c  | 5 +
>  3 files changed, 7 insertions(+)
> 
> diff --git a/include/linux/console.h b/include/linux/console.h
> index bd19434..f068985 100644
> --- a/include/linux/console.h
> +++ b/include/linux/console.h
> @@ -150,6 +150,7 @@ extern int console_trylock(void);
>  extern void console_unlock(void);
>  extern void console_conditional_schedule(void);
>  extern void console_unblank(void);
> +extern void console_sync_mode(void);
>  extern struct tty_driver *console_device(int *);
>  extern void console_stop(struct console *);
>  extern void console_start(struct console *);
> diff --git a/kernel/panic.c b/kernel/panic.c
> index b80..04c8ff4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -117,6 +117,7 @@ void panic(const char *fmt, ...)
>   if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
>   panic_smp_self_stop();
>  
> + console_sync_mode();
>   console_verbose();
>   bust_spinlocks(1);
>   va_start(args, fmt);
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index de9d31b..47a70a2 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2501,6 +2501,11 @@ void console_unblank(void)
>   console_unlock();
>  }
>  
> +void console_sync_mode(void)
> +{
> + printk_sync = true;
> +}
> +
>  /*
>   * Return the console tty driver structure and its associated index
>   */
> -- 
> 2.6.4
> 

> From 92f2c0f2a5ed015caa2757dcfec4407d708f8628 Mon Sep 17 00:00:00 2001
> From: Sergey Senozhatsky 
> Date: Thu, 31 Dec 2015 13:39:58 +0900
> Subject: [PATCH 2/2] printk: factor out can_printk_async()
> 
> console_unlock() can be called directly or indirectly by a user
> space process, so it can end up doing call_console_drivers() loop,
> which will hold it from returning back to user-space from a syscall
> for unpredictable amount of time.
> 
> Factor out can_printk_async() function, which queues an irq work and
> sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> vprintk_emit() already does it, add can_printk_async() call to
> console_unlock() for !PF_KTHREAD processes.
> ---
>  

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
On (01/05/16 15:37), Jan Kara wrote:
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;
> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.
> 
> I like using console_verbose() for setting sync_print to true. That will
> likely be more reliable than using oops in progress. After all
> console_verbose() is used like console_panic_mode() anyway and in quite a
> few places so it is a reasonable match.

another corner case.

a quote from -mm a74b6533ead8 
http://www.spinics.net/lists/linux-mm/msg98990.html

:This patch reduces the probability of such a lockup by introducing a
:specialized kernel thread (oom_reaper) which tries to reclaim additional
:memory by preemptively reaping the anonymous or swapped out memory owned
:by the oom victim under an assumption that such a memory won't be needed
:when its owner is killed and kicked from the userspace anyway.  There is
:one notable exception to this, though, if the OOM victim was in the
:process of coredumping the result would be incomplete.  This is considered
:a reasonable constrain because the overall system health is more important
:than debugability of a particular application.
:
:A kernel thread has been chosen because we need a reliable way of
:invocation so workqueue context is not appropriate because all the workers
:might be busy (e.g.  allocating memory).  Kswapd which sounds like another
:good fit is not appropriate as well because it might get blocked on locks
:during reclaim as well.

particularly this "workqueue context is not appropriate because all the workers
might be busy (e.g.  allocating memory)" part. I think printk should switch to
sync mode in this case, since printk now does queue_work(system_wq, work).
um... console_verbose() call from oom kill? but it'll be nice to return back
to async mode once (if) memory pressure goes away.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
Hello,

On (01/05/16 15:48), Jan Kara wrote:
> > [..]
> > > cond_resched() does its job there, of course. well, a user process still 
> > > can
> > > do a lot of call_console_drivers() calls. may be we can check who is 
> > > calling
> > > console_unlock() and if we have "!printk_sync && !oops_in_progress" (or 
> > > just printk_sync
> > > test) AND a user process then return from console_unlock() doing 
> > > irq_work_queue()
> > > and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.
> > 
> > attached two patches, I ended up having on top of yours. just in case.
> > 
> > printk: factor out can_printk_async()
> > 
> > console_unlock() can be called directly or indirectly by a user
> > space process, so it can end up doing call_console_drivers() loop,
> > which will hold it from returning back to user-space from a syscall
> > for unpredictable amount of time.
> > 
> > Factor out can_printk_async() function, which queues an irq work and
> > sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
> > vprintk_emit() already does it, add can_printk_async() call to
> > console_unlock() for !PF_KTHREAD processes.
> 
> I'd be cautious about changing this userspace visible behavior. Someone may
> be relying on it... I agree that sometimes we can block userspace process
> in kernel for a long time (e.g. in my testing I often see syslog process
> doing the printing) but so far I didn't see / was notified about some real
> problem with this. So unless I see some real user issues with user
> processes doing printing for too long I would not touch this.

hm, interesting point.

/* random thoughts, I'm still on sick leave */

do we really have a user visible behaviour here that it really wants to have?
a task that does tty_open, for instance, hardly wants to end up doing a bunch
of call_console_drivers() calls in console_unlock(). it does look to me mostly
as unexpected side effect.

I probably can imagine someone writing a /usr/bin/flush_logs_to_serial
app that specifically depends on that behaviour, but that will require a bit
of hackery and trickery, since (seems) there is no syscall this app can call
that will perform *only* the required action:

void force_flush_logs_to_serial(void)
{
console_lock();
console_unlock();
}

returning back to user space from syscalls quicker is a good thing, I'd
prefer user space apps to do less kernel job (by kernel job I mean
call_console_drivers() loop). well, at least on admittedly weird setups
that I have to deal with. but I may be missing something here.

some numbers

I added global `unsigned long k_ts, u_ts;' to accumulate time spent
in console_unlock() by PF_KTHREAD and !PF_KTHREAD correspondingly.

void console_unlock(void)
{
...
s_ts = local_clock();
console_cont_flush(text, sizeof(text));
again:

...

if (wake_klogd)
wake_up_klogd();

e_ts = local_clock();
if (time_after(e_ts, s_ts)) {
if (current->flags & PF_KTHREAD)
k_ts += (e_ts - s_ts);
else
u_ts += (e_ts - s_ts);
}
}

and a procfs file to read the values

 unsigned long k = k_ts;
 unsigned long u = u_ts;
 unsigned long rem_nsec_k = do_div(k, 10);
 unsigned long rem_nsec_u = do_div(u, 10);

 return sprintf(buf, "kern:[%lu.%06lu] user:[%lu.%06lu]\n",
 k, rem_nsec_k / 1000,
 u, rem_nsec_u / 1000);

and w/o a lot of effort (no heavy printk message traffic)

$ cat /proc/1/time_in_console_unlock
kern:[4.241475] user:[4.077787]

that's user space spent almost the same amount of time to print kernel
messages as the kernel did on its own. which is hard to formulate as an
issue, it's just user space was doing for 4 seconds something it was not
really meant to do (at least from user space app developer's point of
view); so there is an unpredictable additional cost X added to some of
the syscalls.

-ss

> > printk: introduce console_sync_mode
> > 
> > console_sync_mode() should be called early in panic() to switch
> > printk from async mode to sync. Otherwise, STOP IPIs can arrive
> > to other CPUs too late and those CPUs will see oops_in_progress
> > being 0 again.
> 
> So as I wrote, I like this in principle but there are much more places
> calling console_verbose() and all of them want console_sync_mode() as well.
> So I prefer hiding the sync printing in console_verbose() and possibly
> renaming it to something better but I'm not sure renaming is worth it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2016-01-05 Thread Sergey Senozhatsky
Hello,

On (01/05/16 15:37), Jan Kara wrote:
> Hi,
> 
> On Wed 23-12-15 10:54:49, Sergey Senozhatsky wrote:
> > slowly looking through the patches.
> 
> Back from Christmas vacation...
> 
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;
> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.
> 
> I like using console_verbose() for setting sync_print to true. That will
> likely be more reliable than using oops in progress. After all
> console_verbose() is used like console_panic_mode() anyway and in quite a
> few places so it is a reasonable match.

Agree, only arch/microblaze/kernel/setup.c and arch/nios2/kernel/setup.c
do console_verbose() early in setup_arch(), the rest seems to be what I
was thinking of.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
On (12/31/15 12:13), Sergey Senozhatsky wrote:
[..]
> cond_resched() does its job there, of course. well, a user process still can
> do a lot of call_console_drivers() calls. may be we can check who is calling
> console_unlock() and if we have "!printk_sync && !oops_in_progress" (or just 
> printk_sync
> test) AND a user process then return from console_unlock() doing 
> irq_work_queue()
> and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.

attached two patches, I ended up having on top of yours. just in case.

printk: factor out can_printk_async()

console_unlock() can be called directly or indirectly by a user
space process, so it can end up doing call_console_drivers() loop,
which will hold it from returning back to user-space from a syscall
for unpredictable amount of time.

Factor out can_printk_async() function, which queues an irq work and
sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
vprintk_emit() already does it, add can_printk_async() call to
console_unlock() for !PF_KTHREAD processes.

and

printk: introduce console_sync_mode

console_sync_mode() should be called early in panic() to switch
printk from async mode to sync. Otherwise, STOP IPIs can arrive
to other CPUs too late and those CPUs will see oops_in_progress
being 0 again.

-ss
>From c3fc955809adab8f497cdc7581e67e1fa29d6517 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Date: Wed, 30 Dec 2015 20:39:12 +0900
Subject: [PATCH 1/2] printk: introduce console_sync_mode

console_sync_mode() should be called early in panic() to switch
printk from async mode to sync. Otherwise, STOP IPIs can arrive
to other CPUs too late and those CPUs will see oops_in_progress
being 0 again.
---
 include/linux/console.h | 1 +
 kernel/panic.c  | 1 +
 kernel/printk/printk.c  | 5 +
 3 files changed, 7 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index bd19434..f068985 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -150,6 +150,7 @@ extern int console_trylock(void);
 extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
+extern void console_sync_mode(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index b80..04c8ff4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -117,6 +117,7 @@ void panic(const char *fmt, ...)
 	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
 		panic_smp_self_stop();
 
+	console_sync_mode();
 	console_verbose();
 	bust_spinlocks(1);
 	va_start(args, fmt);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index de9d31b..47a70a2 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2501,6 +2501,11 @@ void console_unblank(void)
 	console_unlock();
 }
 
+void console_sync_mode(void)
+{
+	printk_sync = true;
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
-- 
2.6.4

>From 92f2c0f2a5ed015caa2757dcfec4407d708f8628 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Date: Thu, 31 Dec 2015 13:39:58 +0900
Subject: [PATCH 2/2] printk: factor out can_printk_async()

console_unlock() can be called directly or indirectly by a user
space process, so it can end up doing call_console_drivers() loop,
which will hold it from returning back to user-space from a syscall
for unpredictable amount of time.

Factor out can_printk_async() function, which queues an irq work and
sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
vprintk_emit() already does it, add can_printk_async() call to
console_unlock() for !PF_KTHREAD processes.
---
 kernel/printk/printk.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 47a70a2..7d3a8e1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -355,6 +355,26 @@ int printk_deferred(const char *fmt, ...)
 	return r;
 }
 
+static bool can_printk_async(bool sync)
+{
+	/*
+	 * By default we print message to console asynchronously so that kernel
+	 * doesn't get stalled due to slow serial console. That can lead to
+	 * softlockups, lost interrupts, or userspace timing out under heavy
+	 * printing load.
+	 *
+	 * However we resort to synchronous printing of messages during early
+	 * boot, when oops is in progress, or when synchronous printing was
+	 * explicitely requested by kernel parameter.
+	 */
+	if (keventd_up() && !oops_in_progress && !sync) {
+		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+		irq_work_queue(this_cpu_ptr(_up_klogd_work));
+		return true;
+	}
+	return false;
+}
+
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
@@ -1889,20 +1909,7 @@ asmlinkage int vprintk_emit(int 

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
On (12/31/15 11:44), Sergey Senozhatsky wrote:
> On (12/22/15 14:47), Jan Kara wrote:
> [..]
> > +int printk_deferred(const char *fmt, ...)
> > +{
> > +   va_list args;
> > +   int r;
> > +
> > +   va_start(args, fmt);
> > +   r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> > +   va_end(args);
> > +
> > +   return r;
> > +}
> [..]
> > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
> > logbuf_cpu = UINT_MAX;
> > raw_spin_unlock(_lock);
> > lockdep_on();
> > +   /*
> > +* By default we print message to console asynchronously so that kernel
> > +* doesn't get stalled due to slow serial console. That can lead to
> > +* softlockups, lost interrupts, or userspace timing out under heavy
> > +* printing load.
> > +*
> > +* However we resort to synchronous printing of messages during early
> > +* boot, when oops is in progress, or when synchronous printing was
> > +* explicitely requested by kernel parameter.
> > +*/
> > +   if (keventd_up() && !oops_in_progress && !sync_print) {
> > +   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > +   irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > +   } else
> > +   sync_print = true;
> > local_irq_restore(flags);
> 
> So this fixes printk() and printk_deferred(), but it doesn't address any of 
> the
> direct and indirect console_lock/console_unlock callers.
> 
> for example, direct:
> ~/_mmots$ git grep console_unlock | egrep -v "printk\.c|panic\.c|console\.h" 
> | wc -l
> 199
> 
> indirect (e.g. via console_devices()):
> ~/_mmots$ git grep console_device | egrep -v 
> "printk\.c|panic\.c|console\.h|_console_device" | wc -l
> 4
> 
> One of those indirect callers is tty_lookup_driver(), called from tty_open(). 
> Which
> is quite big to ignore, I suspect.
>
> 
> A user space process opening a tty can end up doing that while (1) 
> call_console_drivers()
> loop, I suspect. At least nothing prevents it, at a glance.

d'oh... sorry. that cold that I have is affecting me... no more emails for 
today.


cond_resched() does its job there, of course. well, a user process still can
do a lot of call_console_drivers() calls. may be we can check who is calling
console_unlock() and if we have "!printk_sync && !oops_in_progress" (or just 
printk_sync
test) AND a user process then return from console_unlock() doing 
irq_work_queue()
and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.

-ss

> A side note, isn't it too often to cond_resched() from console_unlock()? What 
> if
> we have 1000 very short printk() messages (e.g. no more than 32 chars).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
Hello,

On (12/22/15 14:47), Jan Kara wrote:
[..]
> +int printk_deferred(const char *fmt, ...)
> +{
> + va_list args;
> + int r;
> +
> + va_start(args, fmt);
> + r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> + va_end(args);
> +
> + return r;
> +}
[..]
> @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);
>   lockdep_on();
> + /*
> +  * By default we print message to console asynchronously so that kernel
> +  * doesn't get stalled due to slow serial console. That can lead to
> +  * softlockups, lost interrupts, or userspace timing out under heavy
> +  * printing load.
> +  *
> +  * However we resort to synchronous printing of messages during early
> +  * boot, when oops is in progress, or when synchronous printing was
> +  * explicitely requested by kernel parameter.
> +  */
> + if (keventd_up() && !oops_in_progress && !sync_print) {
> + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> + } else
> + sync_print = true;
>   local_irq_restore(flags);

So this fixes printk() and printk_deferred(), but it doesn't address any of the
direct and indirect console_lock/console_unlock callers.

for example, direct:
~/_mmots$ git grep console_unlock | egrep -v "printk\.c|panic\.c|console\.h" | 
wc -l
199

indirect (e.g. via console_devices()):
~/_mmots$ git grep console_device | egrep -v 
"printk\.c|panic\.c|console\.h|_console_device" | wc -l
4

One of those indirect callers is tty_lookup_driver(), called from tty_open(). 
Which
is quite big to ignore, I suspect.

A user space process opening a tty can end up doing that while (1) 
call_console_drivers()
loop, I suspect. At least nothing prevents it, at a glance.

A side note, isn't it too often to cond_resched() from console_unlock()? What if
we have 1000 very short printk() messages (e.g. no more than 32 chars).

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
On (12/31/15 12:13), Sergey Senozhatsky wrote:
[..]
> cond_resched() does its job there, of course. well, a user process still can
> do a lot of call_console_drivers() calls. may be we can check who is calling
> console_unlock() and if we have "!printk_sync && !oops_in_progress" (or just 
> printk_sync
> test) AND a user process then return from console_unlock() doing 
> irq_work_queue()
> and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.

attached two patches, I ended up having on top of yours. just in case.

printk: factor out can_printk_async()

console_unlock() can be called directly or indirectly by a user
space process, so it can end up doing call_console_drivers() loop,
which will hold it from returning back to user-space from a syscall
for unpredictable amount of time.

Factor out can_printk_async() function, which queues an irq work and
sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
vprintk_emit() already does it, add can_printk_async() call to
console_unlock() for !PF_KTHREAD processes.

and

printk: introduce console_sync_mode

console_sync_mode() should be called early in panic() to switch
printk from async mode to sync. Otherwise, STOP IPIs can arrive
to other CPUs too late and those CPUs will see oops_in_progress
being 0 again.

-ss
>From c3fc955809adab8f497cdc7581e67e1fa29d6517 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Date: Wed, 30 Dec 2015 20:39:12 +0900
Subject: [PATCH 1/2] printk: introduce console_sync_mode

console_sync_mode() should be called early in panic() to switch
printk from async mode to sync. Otherwise, STOP IPIs can arrive
to other CPUs too late and those CPUs will see oops_in_progress
being 0 again.
---
 include/linux/console.h | 1 +
 kernel/panic.c  | 1 +
 kernel/printk/printk.c  | 5 +
 3 files changed, 7 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index bd19434..f068985 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -150,6 +150,7 @@ extern int console_trylock(void);
 extern void console_unlock(void);
 extern void console_conditional_schedule(void);
 extern void console_unblank(void);
+extern void console_sync_mode(void);
 extern struct tty_driver *console_device(int *);
 extern void console_stop(struct console *);
 extern void console_start(struct console *);
diff --git a/kernel/panic.c b/kernel/panic.c
index b80..04c8ff4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -117,6 +117,7 @@ void panic(const char *fmt, ...)
 	if (old_cpu != PANIC_CPU_INVALID && old_cpu != this_cpu)
 		panic_smp_self_stop();
 
+	console_sync_mode();
 	console_verbose();
 	bust_spinlocks(1);
 	va_start(args, fmt);
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index de9d31b..47a70a2 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2501,6 +2501,11 @@ void console_unblank(void)
 	console_unlock();
 }
 
+void console_sync_mode(void)
+{
+	printk_sync = true;
+}
+
 /*
  * Return the console tty driver structure and its associated index
  */
-- 
2.6.4

>From 92f2c0f2a5ed015caa2757dcfec4407d708f8628 Mon Sep 17 00:00:00 2001
From: Sergey Senozhatsky 
Date: Thu, 31 Dec 2015 13:39:58 +0900
Subject: [PATCH 2/2] printk: factor out can_printk_async()

console_unlock() can be called directly or indirectly by a user
space process, so it can end up doing call_console_drivers() loop,
which will hold it from returning back to user-space from a syscall
for unpredictable amount of time.

Factor out can_printk_async() function, which queues an irq work and
sets a PRINTK_PENDING_OUTPUT pending bit (if we can do async printk).
vprintk_emit() already does it, add can_printk_async() call to
console_unlock() for !PF_KTHREAD processes.
---
 kernel/printk/printk.c | 42 --
 1 file changed, 28 insertions(+), 14 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 47a70a2..7d3a8e1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -355,6 +355,26 @@ int printk_deferred(const char *fmt, ...)
 	return r;
 }
 
+static bool can_printk_async(bool sync)
+{
+	/*
+	 * By default we print message to console asynchronously so that kernel
+	 * doesn't get stalled due to slow serial console. That can lead to
+	 * softlockups, lost interrupts, or userspace timing out under heavy
+	 * printing load.
+	 *
+	 * However we resort to synchronous printing of messages during early
+	 * boot, when oops is in progress, or when synchronous printing was
+	 * explicitely requested by kernel parameter.
+	 */
+	if (keventd_up() && !oops_in_progress && !sync) {
+		__this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
+		irq_work_queue(this_cpu_ptr(_up_klogd_work));
+		return true;
+	}
+	return false;
+}
+
 /* Return log buffer address */
 char *log_buf_addr_get(void)
 {
@@ 

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
Hello,

On (12/22/15 14:47), Jan Kara wrote:
[..]
> +int printk_deferred(const char *fmt, ...)
> +{
> + va_list args;
> + int r;
> +
> + va_start(args, fmt);
> + r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> + va_end(args);
> +
> + return r;
> +}
[..]
> @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);
>   lockdep_on();
> + /*
> +  * By default we print message to console asynchronously so that kernel
> +  * doesn't get stalled due to slow serial console. That can lead to
> +  * softlockups, lost interrupts, or userspace timing out under heavy
> +  * printing load.
> +  *
> +  * However we resort to synchronous printing of messages during early
> +  * boot, when oops is in progress, or when synchronous printing was
> +  * explicitely requested by kernel parameter.
> +  */
> + if (keventd_up() && !oops_in_progress && !sync_print) {
> + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> + } else
> + sync_print = true;
>   local_irq_restore(flags);

So this fixes printk() and printk_deferred(), but it doesn't address any of the
direct and indirect console_lock/console_unlock callers.

for example, direct:
~/_mmots$ git grep console_unlock | egrep -v "printk\.c|panic\.c|console\.h" | 
wc -l
199

indirect (e.g. via console_devices()):
~/_mmots$ git grep console_device | egrep -v 
"printk\.c|panic\.c|console\.h|_console_device" | wc -l
4

One of those indirect callers is tty_lookup_driver(), called from tty_open(). 
Which
is quite big to ignore, I suspect.

A user space process opening a tty can end up doing that while (1) 
call_console_drivers()
loop, I suspect. At least nothing prevents it, at a glance.

A side note, isn't it too often to cond_resched() from console_unlock()? What if
we have 1000 very short printk() messages (e.g. no more than 32 chars).

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-30 Thread Sergey Senozhatsky
On (12/31/15 11:44), Sergey Senozhatsky wrote:
> On (12/22/15 14:47), Jan Kara wrote:
> [..]
> > +int printk_deferred(const char *fmt, ...)
> > +{
> > +   va_list args;
> > +   int r;
> > +
> > +   va_start(args, fmt);
> > +   r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args);
> > +   va_end(args);
> > +
> > +   return r;
> > +}
> [..]
> > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
> > logbuf_cpu = UINT_MAX;
> > raw_spin_unlock(_lock);
> > lockdep_on();
> > +   /*
> > +* By default we print message to console asynchronously so that kernel
> > +* doesn't get stalled due to slow serial console. That can lead to
> > +* softlockups, lost interrupts, or userspace timing out under heavy
> > +* printing load.
> > +*
> > +* However we resort to synchronous printing of messages during early
> > +* boot, when oops is in progress, or when synchronous printing was
> > +* explicitely requested by kernel parameter.
> > +*/
> > +   if (keventd_up() && !oops_in_progress && !sync_print) {
> > +   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > +   irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > +   } else
> > +   sync_print = true;
> > local_irq_restore(flags);
> 
> So this fixes printk() and printk_deferred(), but it doesn't address any of 
> the
> direct and indirect console_lock/console_unlock callers.
> 
> for example, direct:
> ~/_mmots$ git grep console_unlock | egrep -v "printk\.c|panic\.c|console\.h" 
> | wc -l
> 199
> 
> indirect (e.g. via console_devices()):
> ~/_mmots$ git grep console_device | egrep -v 
> "printk\.c|panic\.c|console\.h|_console_device" | wc -l
> 4
> 
> One of those indirect callers is tty_lookup_driver(), called from tty_open(). 
> Which
> is quite big to ignore, I suspect.
>
> 
> A user space process opening a tty can end up doing that while (1) 
> call_console_drivers()
> loop, I suspect. At least nothing prevents it, at a glance.

d'oh... sorry. that cold that I have is affecting me... no more emails for 
today.


cond_resched() does its job there, of course. well, a user process still can
do a lot of call_console_drivers() calls. may be we can check who is calling
console_unlock() and if we have "!printk_sync && !oops_in_progress" (or just 
printk_sync
test) AND a user process then return from console_unlock() doing 
irq_work_queue()
and set PRINTK_PENDING_OUTPUT pending bit, the way vprintk_emit() does it.

-ss

> A side note, isn't it too often to cond_resched() from console_unlock()? What 
> if
> we have 1000 very short printk() messages (e.g. no more than 32 chars).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 12:57), Sergey Senozhatsky wrote:
[..]
> > > can we replace this oops_in_progress check with something more reliable?
> > > 
> > > CPU0CPU1 - CPUN
> > > panic()
> > >  local_irq_disable()executing foo() with irqs disabled,
> > >  console_verbose()  or processing an extremely long irq 
> > > handler.
> > >  bust_spinlocks()
> > > oops_in_progress++
> 
>   or we huge enough number of CPUs, 
> `deep' stack
>   traces, slow serial and CPU doing 
> dump_stack()
>   under raw_spin_lock(_lock), so it 
> can take
>   longer than 1 second to print the 
> stacks and
>   thus panic CPU will set 
> oops_in_progress back
>   to 0.
> 
> > >  smp_send_stop()
> > > 
> > >  bust_spinlocks()
> > > oops_in_progress--  ok, IPI arrives
> > > dump_stack()/printk()/etc from 
> > > IPI_CPU_STOP
> > >   "while (1) cpu_relax()" with irq/fiq 
> > > disabled/halt/etc.
> > > 
> > > smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is 
> > > arch specific,
> > > and some platforms don't do any IPI-delivered (e.g. via 
> > > num_online_cpus()) checks at
> > > all. Some do. For example, arm/arm64:
> > > 
> > > void smp_send_stop(void)
> > > ...
> > > /* Wait up to one second for other CPUs to stop */
> > > timeout = USEC_PER_SEC;
> > > while (num_online_cpus() > 1 && timeout--)
> > > udelay(1);
> > > 
> > > if (num_online_cpus() > 1)
> > > pr_warn("SMP: failed to stop secondary CPUs\n");
> > > ...
> > > 
> > > 
> > > so there are non-zero chances that IPI will arrive to CPU after 
> > > 'oops_in_progress--',
> > > and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> > > 
> > > 
> > > bust_spinlocks(0) does
> > > ...
> > >   if (--oops_in_progress == 0)
> > >   wake_up_klogd();
> > > ...
> > > 
> > > but local cpu has irqs disabled and `panic_timeout' can be zero.

well, if panic_timeout != 0, then wake_up_klogd() calls irq_work_queue() which
schedule_work. what if we have the following

CPU0CPU1 - CPUN

foo
preempt_disable
bar
panic   irq/fiq disable
schedule_work   while (1) cpu_relax

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 12:37), Sergey Senozhatsky wrote:
> Date: Wed, 23 Dec 2015 12:37:24 +0900
> From: Sergey Senozhatsky 
> To: Sergey Senozhatsky 
> Cc: Jan Kara , Sergey Senozhatsky
>  , Andrew Morton ,
>  Petr Mladek , KY Sri nivasan , Steven
>  Rostedt , linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/7] printk: Hand over printing to console if printing
>  too long
> User-Agent: Mutt/1.5.24 (2015-08-30)
> 
> On (12/23/15 10:54), Sergey Senozhatsky wrote:
> > On (12/22/15 14:47), Jan Kara wrote:
> > [..]
> > > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int 
> > > level,
> > >   logbuf_cpu = UINT_MAX;
> > >   raw_spin_unlock(_lock);
> > >   lockdep_on();
> > > + /*
> > > +  * By default we print message to console asynchronously so that kernel
> > > +  * doesn't get stalled due to slow serial console. That can lead to
> > > +  * softlockups, lost interrupts, or userspace timing out under heavy
> > > +  * printing load.
> > > +  *
> > > +  * However we resort to synchronous printing of messages during early
> > > +  * boot, when oops is in progress, or when synchronous printing was
> > > +  * explicitely requested by kernel parameter.
> > > +  */
> > > + if (keventd_up() && !oops_in_progress && !sync_print) {
> > > + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > > + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > > + } else
> > > + sync_print = true;
> 
> oops, didn't have enough coffee... missed that `else sync_print = true' :(
> 

ah, never mind my previous email... it's a local variable, so the very next 
printk()
happening right after bust_spinlocks(0) will irq_work_queue(). I'd prefer CPUs 
to
print stacks rather than burn cpu cycles in `while (1) cpu_relax()' loop.

so

else {
printk_sync = true;
sync_print = true; /* and remove this local variable entirely may be*/
}

> > can we replace this oops_in_progress check with something more reliable?
> > 
> > CPU0CPU1 - CPUN
> > panic()
> >  local_irq_disable()executing foo() with irqs disabled,
> >  console_verbose()  or processing an extremely long irq 
> > handler.
> >  bust_spinlocks()
> > oops_in_progress++

or we huge enough number of CPUs, 
`deep' stack
traces, slow serial and CPU doing 
dump_stack()
under raw_spin_lock(_lock), so it 
can take
longer than 1 second to print the 
stacks and
thus panic CPU will set 
oops_in_progress back
to 0.

> >  smp_send_stop()
> > 
> >  bust_spinlocks()
> > oops_in_progress--  ok, IPI arrives
> > dump_stack()/printk()/etc from 
> > IPI_CPU_STOP
> > "while (1) cpu_relax()" with irq/fiq 
> > disabled/halt/etc.
> > 
> > smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
> > specific,
> > and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
> > checks at
> > all. Some do. For example, arm/arm64:
> > 
> > void smp_send_stop(void)
> > ...
> > /* Wait up to one second for other CPUs to stop */
> > timeout = USEC_PER_SEC;
> > while (num_online_cpus() > 1 && timeout--)
> > udelay(1);
> > 
> > if (num_online_cpus() > 1)
> > pr_warn("SMP: failed to stop secondary CPUs\n");
> > ...
> > 
> > 
> > so there are non-zero chances that IPI will arrive to CPU after 
> > 'oops_in_progress--',
> > and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> > 
> > 
> > bust_spinlocks(0) does
> > ...
> > if (--oops_in_progress == 0)
> > wake_up_klogd();
> > ...
> > 
> > but local cpu has irqs disabled and `panic_timeout' can be zero.
> > 
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;

printk_sync = true;

> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.


-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 10:54), Sergey Senozhatsky wrote:
> On (12/22/15 14:47), Jan Kara wrote:
> [..]
> > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
> > logbuf_cpu = UINT_MAX;
> > raw_spin_unlock(_lock);
> > lockdep_on();
> > +   /*
> > +* By default we print message to console asynchronously so that kernel
> > +* doesn't get stalled due to slow serial console. That can lead to
> > +* softlockups, lost interrupts, or userspace timing out under heavy
> > +* printing load.
> > +*
> > +* However we resort to synchronous printing of messages during early
> > +* boot, when oops is in progress, or when synchronous printing was
> > +* explicitely requested by kernel parameter.
> > +*/
> > +   if (keventd_up() && !oops_in_progress && !sync_print) {
> > +   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > +   irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > +   } else
> > +   sync_print = true;

oops, didn't have enough coffee... missed that `else sync_print = true' :(

-ss

> > local_irq_restore(flags);
> 
> can we replace this oops_in_progress check with something more reliable?
> 
> CPU0CPU1 - CPUN
> panic()
>  local_irq_disable()executing foo() with irqs disabled,
>  console_verbose()  or processing an extremely long irq 
> handler.
>  bust_spinlocks()
> oops_in_progress++
> 
>  smp_send_stop()
> 
>  bust_spinlocks()
> oops_in_progress--  ok, IPI arrives
> dump_stack()/printk()/etc from 
> IPI_CPU_STOP
>   "while (1) cpu_relax()" with irq/fiq 
> disabled/halt/etc.
> 
> smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
> specific,
> and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
> checks at
> all. Some do. For example, arm/arm64:
> 
> void smp_send_stop(void)
> ...
> /* Wait up to one second for other CPUs to stop */
> timeout = USEC_PER_SEC;
> while (num_online_cpus() > 1 && timeout--)
> udelay(1);
> 
> if (num_online_cpus() > 1)
> pr_warn("SMP: failed to stop secondary CPUs\n");
> ...
> 
> 
> so there are non-zero chances that IPI will arrive to CPU after 
> 'oops_in_progress--',
> and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> 
> 
> bust_spinlocks(0) does
> ...
>   if (--oops_in_progress == 0)
>   wake_up_klogd();
> ...
> 
> but local cpu has irqs disabled and `panic_timeout' can be zero.
> 
> How about setting 'sync_print' to 'true' in...
>   bust_spinlocks() /* only set to true */
> or
>   console_verbose() /* um... may be... */
> or
>   having a separate one-liner for that
> 
> void console_panic_mode(void)
> {
>   sync_print = true;
> }
> 
> and call it early in panic(), before we send out IPI_STOP.
> 
>   -ss
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
Hi,

slowly looking through the patches.

On (12/22/15 14:47), Jan Kara wrote:
[..]
> @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);
>   lockdep_on();
> + /*
> +  * By default we print message to console asynchronously so that kernel
> +  * doesn't get stalled due to slow serial console. That can lead to
> +  * softlockups, lost interrupts, or userspace timing out under heavy
> +  * printing load.
> +  *
> +  * However we resort to synchronous printing of messages during early
> +  * boot, when oops is in progress, or when synchronous printing was
> +  * explicitely requested by kernel parameter.
> +  */
> + if (keventd_up() && !oops_in_progress && !sync_print) {
> + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> + } else
> + sync_print = true;
>   local_irq_restore(flags);

can we replace this oops_in_progress check with something more reliable?

CPU0CPU1 - CPUN
panic()
 local_irq_disable()executing foo() with irqs disabled,
 console_verbose()  or processing an extremely long irq handler.
 bust_spinlocks()
oops_in_progress++

 smp_send_stop()

 bust_spinlocks()
oops_in_progress--  ok, IPI arrives
dump_stack()/printk()/etc from IPI_CPU_STOP
"while (1) cpu_relax()" with irq/fiq 
disabled/halt/etc.

smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
specific,
and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
checks at
all. Some do. For example, arm/arm64:

void smp_send_stop(void)
...
/* Wait up to one second for other CPUs to stop */
timeout = USEC_PER_SEC;
while (num_online_cpus() > 1 && timeout--)
udelay(1);

if (num_online_cpus() > 1)
pr_warn("SMP: failed to stop secondary CPUs\n");
...


so there are non-zero chances that IPI will arrive to CPU after 
'oops_in_progress--',
and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.


bust_spinlocks(0) does
...
if (--oops_in_progress == 0)
wake_up_klogd();
...

but local cpu has irqs disabled and `panic_timeout' can be zero.

How about setting 'sync_print' to 'true' in...
  bust_spinlocks() /* only set to true */
or
  console_verbose() /* um... may be... */
or
  having a separate one-liner for that

void console_panic_mode(void)
{
sync_print = true;
}

and call it early in panic(), before we send out IPI_STOP.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/22/15 14:47), Jan Kara wrote:
[..]
> Thanks for looking into my patches and sorry for replying with a delay. As
> I wrote in my previous email [1] even the referenced patches are not quite
> enough. Over last few days I have worked on redoing the stuff as we
> discussed with Linus and Andrew at Kernel Summit and I have new patches
> which are working fine for me. I still want to test them on some machines
> having real issues with udev during boot but so far stress-testing with
> serial console slowed down to ~1000 chars/sec on other machines and VMs
> looks promising.
> 
> I'm attaching them in case you want to have a look. They are on top of
> Tejun's patch adding cond_resched() (which is essential). I'll officially
> submit the patches once the testing is finished (but I'm not sure when I
> get to the problematic HW...).
> 

Hello,

Thanks a lot! Will take a look.

-ss

> From 2e9675abbfc0df4a24a8c760c58e8150b9a31259 Mon Sep 17 00:00:00 2001
> From: Jan Kara 
> Date: Mon, 21 Dec 2015 13:10:31 +0100
> Subject: [PATCH 1/2] printk: Make printk() completely async
> 
> Currently, printk() sometimes waits for message to be printed to console
> and sometimes it does not (when console_sem is held by some other
> process). In case printk() grabs console_sem and starts printing to
> console, it prints messages from kernel printk buffer until the buffer
> is empty. When serial console is attached, printing is slow and thus
> other CPUs in the system have plenty of time to append new messages to
> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> amount of time doing printing in console_unlock().  This is especially
> serious problem if the printk() calling console_unlock() was called with
> interrupts disabled.
> 
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Another observed issue is that due to slow printk,
> hardware discovery is slow and udev times out before kernel manages to
> discover all the attached HW. Also during artificial stress testing SATA
> disk disappears from the system because its interrupts aren't served for
> too long.
> 
> This patch makes printk() completely asynchronous (similar to what
> printk_deferred() did until now). It appends message to the kernel
> printk buffer and queues work to do the printing to console. This has
> the advantage that printing always happens from a schedulable contex and
> thus we don't lockup any particular CPU or even interrupts. Also it has
> the advantage that printk() is fast and thus kernel booting is not
> slowed down by slow serial console. Disadvantage of this method is that
> in case of crash there is higher chance that important messages won't
> appear in console output (we may need working scheduling to print
> message to console). We somewhat mitigate this risk by switching printk
> to the original method of immediate printing to console if oops is in
> progress.  Also for debugging purposes we provide printk.synchronous
> kernel parameter which resorts to the original printk behavior.
> 
> Signed-off-by: Jan Kara 
> ---
>  Documentation/kernel-parameters.txt |  10 +++
>  kernel/printk/printk.c  | 144 
> +---
>  2 files changed, 95 insertions(+), 59 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 742f69d18fc8..4cf1bddeffc7 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3000,6 +3000,16 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   printk.time=Show timing data prefixed to each printk message line
>   Format:   (1/Y/y=enable, 0/N/n=disable)
>  
> + printk.synchronous=
> + By default kernel messages are printed to console
> + asynchronously (except during early boot or when oops
> + is happening). That avoids kernel stalling behind slow
> + serial console and thus avoids softlockups, interrupt
> + timeouts, or userspace timing out during heavy printing.
> + However for debugging problems, printing messages to
> + console immediately may be desirable. This option
> + enables such behavior.
> +
>   processor.max_cstate=   [HW,ACPI]
>   Limit processor to 

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Jan Kara
On Thu 10-12-15 23:52:51, Sergey Senozhatsky wrote:
> Hello,
> 
> *** in this email and in every later emails ***
> Sorry, if I messed up with Cc list or message-ids. It's suprisingly
> hard to jump in into a loop that has never been in your inbox. It took
> some `googling' effort.
> 
> I haven't tested the patch set yet, I just 'ported' it to linux-next.
> I reverted 073696a8bc7779b ("printk: do cond_resched() between lines while
> outputting to consoles") as a first step, but it comes in later again. I can
> send out the updated series (off list is OK).
> 
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> >
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> >
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> >
> > Signed-off-by: Jan Kara 
> 
> I think we better use raw_spin_lock as a print_lock; and, apart from that,
> seems that we don't re-init in zap_lock(). So I ended up with the following
> patch on top of yours (to be folded):
> 
> - use raw_spin_lock
> - do not forget to re-init `print_lock' in zap_locks()

Thanks for looking into my patches and sorry for replying with a delay. As
I wrote in my previous email [1] even the referenced patches are not quite
enough. Over last few days I have worked on redoing the stuff as we
discussed with Linus and Andrew at Kernel Summit and I have new patches
which are working fine for me. I still want to test them on some machines
having real issues with udev during boot but so far stress-testing with
serial console slowed down to ~1000 chars/sec on other machines and VMs
looks promising.

I'm attaching them in case you want to have a look. They are on top of
Tejun's patch adding cond_resched() (which is essential). I'll officially
submit the patches once the testing is finished (but I'm not sure when I
get to the problematic HW...).

Honza

[1] http://www.spinics.net/lists/stable/msg111535.html
-- 
Jan Kara 
SUSE Labs, CR
>From 2e9675abbfc0df4a24a8c760c58e8150b9a31259 Mon Sep 17 00:00:00 2001
From: Jan Kara 
Date: Mon, 21 Dec 2015 13:10:31 +0100
Subject: [PATCH 1/2] printk: Make printk() completely async

Currently, printk() sometimes waits for message to be printed to console
and sometimes it does not (when console_sem is held by some other
process). In case printk() grabs console_sem and starts printing to
console, it prints messages from kernel printk buffer until the buffer
is empty. When serial console is attached, printing is slow and thus
other CPUs in the system have plenty of time to append new messages to
the buffer while one CPU is printing. Thus the CPU can spend unbounded
amount of time doing printing in console_unlock().  This is especially
serious problem if the printk() calling console_unlock() was called with
interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to printk buffer faster than
we are able to 

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
Hi,

slowly looking through the patches.

On (12/22/15 14:47), Jan Kara wrote:
[..]
> @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
>   logbuf_cpu = UINT_MAX;
>   raw_spin_unlock(_lock);
>   lockdep_on();
> + /*
> +  * By default we print message to console asynchronously so that kernel
> +  * doesn't get stalled due to slow serial console. That can lead to
> +  * softlockups, lost interrupts, or userspace timing out under heavy
> +  * printing load.
> +  *
> +  * However we resort to synchronous printing of messages during early
> +  * boot, when oops is in progress, or when synchronous printing was
> +  * explicitely requested by kernel parameter.
> +  */
> + if (keventd_up() && !oops_in_progress && !sync_print) {
> + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> + } else
> + sync_print = true;
>   local_irq_restore(flags);

can we replace this oops_in_progress check with something more reliable?

CPU0CPU1 - CPUN
panic()
 local_irq_disable()executing foo() with irqs disabled,
 console_verbose()  or processing an extremely long irq handler.
 bust_spinlocks()
oops_in_progress++

 smp_send_stop()

 bust_spinlocks()
oops_in_progress--  ok, IPI arrives
dump_stack()/printk()/etc from IPI_CPU_STOP
"while (1) cpu_relax()" with irq/fiq 
disabled/halt/etc.

smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
specific,
and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
checks at
all. Some do. For example, arm/arm64:

void smp_send_stop(void)
...
/* Wait up to one second for other CPUs to stop */
timeout = USEC_PER_SEC;
while (num_online_cpus() > 1 && timeout--)
udelay(1);

if (num_online_cpus() > 1)
pr_warn("SMP: failed to stop secondary CPUs\n");
...


so there are non-zero chances that IPI will arrive to CPU after 
'oops_in_progress--',
and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.


bust_spinlocks(0) does
...
if (--oops_in_progress == 0)
wake_up_klogd();
...

but local cpu has irqs disabled and `panic_timeout' can be zero.

How about setting 'sync_print' to 'true' in...
  bust_spinlocks() /* only set to true */
or
  console_verbose() /* um... may be... */
or
  having a separate one-liner for that

void console_panic_mode(void)
{
sync_print = true;
}

and call it early in panic(), before we send out IPI_STOP.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 10:54), Sergey Senozhatsky wrote:
> On (12/22/15 14:47), Jan Kara wrote:
> [..]
> > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int level,
> > logbuf_cpu = UINT_MAX;
> > raw_spin_unlock(_lock);
> > lockdep_on();
> > +   /*
> > +* By default we print message to console asynchronously so that kernel
> > +* doesn't get stalled due to slow serial console. That can lead to
> > +* softlockups, lost interrupts, or userspace timing out under heavy
> > +* printing load.
> > +*
> > +* However we resort to synchronous printing of messages during early
> > +* boot, when oops is in progress, or when synchronous printing was
> > +* explicitely requested by kernel parameter.
> > +*/
> > +   if (keventd_up() && !oops_in_progress && !sync_print) {
> > +   __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > +   irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > +   } else
> > +   sync_print = true;

oops, didn't have enough coffee... missed that `else sync_print = true' :(

-ss

> > local_irq_restore(flags);
> 
> can we replace this oops_in_progress check with something more reliable?
> 
> CPU0CPU1 - CPUN
> panic()
>  local_irq_disable()executing foo() with irqs disabled,
>  console_verbose()  or processing an extremely long irq 
> handler.
>  bust_spinlocks()
> oops_in_progress++
> 
>  smp_send_stop()
> 
>  bust_spinlocks()
> oops_in_progress--  ok, IPI arrives
> dump_stack()/printk()/etc from 
> IPI_CPU_STOP
>   "while (1) cpu_relax()" with irq/fiq 
> disabled/halt/etc.
> 
> smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
> specific,
> and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
> checks at
> all. Some do. For example, arm/arm64:
> 
> void smp_send_stop(void)
> ...
> /* Wait up to one second for other CPUs to stop */
> timeout = USEC_PER_SEC;
> while (num_online_cpus() > 1 && timeout--)
> udelay(1);
> 
> if (num_online_cpus() > 1)
> pr_warn("SMP: failed to stop secondary CPUs\n");
> ...
> 
> 
> so there are non-zero chances that IPI will arrive to CPU after 
> 'oops_in_progress--',
> and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> 
> 
> bust_spinlocks(0) does
> ...
>   if (--oops_in_progress == 0)
>   wake_up_klogd();
> ...
> 
> but local cpu has irqs disabled and `panic_timeout' can be zero.
> 
> How about setting 'sync_print' to 'true' in...
>   bust_spinlocks() /* only set to true */
> or
>   console_verbose() /* um... may be... */
> or
>   having a separate one-liner for that
> 
> void console_panic_mode(void)
> {
>   sync_print = true;
> }
> 
> and call it early in panic(), before we send out IPI_STOP.
> 
>   -ss
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 12:57), Sergey Senozhatsky wrote:
[..]
> > > can we replace this oops_in_progress check with something more reliable?
> > > 
> > > CPU0CPU1 - CPUN
> > > panic()
> > >  local_irq_disable()executing foo() with irqs disabled,
> > >  console_verbose()  or processing an extremely long irq 
> > > handler.
> > >  bust_spinlocks()
> > > oops_in_progress++
> 
>   or we huge enough number of CPUs, 
> `deep' stack
>   traces, slow serial and CPU doing 
> dump_stack()
>   under raw_spin_lock(_lock), so it 
> can take
>   longer than 1 second to print the 
> stacks and
>   thus panic CPU will set 
> oops_in_progress back
>   to 0.
> 
> > >  smp_send_stop()
> > > 
> > >  bust_spinlocks()
> > > oops_in_progress--  ok, IPI arrives
> > > dump_stack()/printk()/etc from 
> > > IPI_CPU_STOP
> > >   "while (1) cpu_relax()" with irq/fiq 
> > > disabled/halt/etc.
> > > 
> > > smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is 
> > > arch specific,
> > > and some platforms don't do any IPI-delivered (e.g. via 
> > > num_online_cpus()) checks at
> > > all. Some do. For example, arm/arm64:
> > > 
> > > void smp_send_stop(void)
> > > ...
> > > /* Wait up to one second for other CPUs to stop */
> > > timeout = USEC_PER_SEC;
> > > while (num_online_cpus() > 1 && timeout--)
> > > udelay(1);
> > > 
> > > if (num_online_cpus() > 1)
> > > pr_warn("SMP: failed to stop secondary CPUs\n");
> > > ...
> > > 
> > > 
> > > so there are non-zero chances that IPI will arrive to CPU after 
> > > 'oops_in_progress--',
> > > and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> > > 
> > > 
> > > bust_spinlocks(0) does
> > > ...
> > >   if (--oops_in_progress == 0)
> > >   wake_up_klogd();
> > > ...
> > > 
> > > but local cpu has irqs disabled and `panic_timeout' can be zero.

well, if panic_timeout != 0, then wake_up_klogd() calls irq_work_queue() which
schedule_work. what if we have the following

CPU0CPU1 - CPUN

foo
preempt_disable
bar
panic   irq/fiq disable
schedule_work   while (1) cpu_relax

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/23/15 12:37), Sergey Senozhatsky wrote:
> Date: Wed, 23 Dec 2015 12:37:24 +0900
> From: Sergey Senozhatsky <sergey.senozhatsky.w...@gmail.com>
> To: Sergey Senozhatsky <sergey.senozhatsky.w...@gmail.com>
> Cc: Jan Kara <j...@suse.cz>, Sergey Senozhatsky
>  <sergey.senozhat...@gmail.com>, Andrew Morton <a...@linux-foundation.org>,
>  Petr Mladek <pmla...@suse.cz>, KY Sri nivasan <k...@microsoft.com>, Steven
>  Rostedt <rost...@goodmis.org>, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 1/7] printk: Hand over printing to console if printing
>  too long
> User-Agent: Mutt/1.5.24 (2015-08-30)
> 
> On (12/23/15 10:54), Sergey Senozhatsky wrote:
> > On (12/22/15 14:47), Jan Kara wrote:
> > [..]
> > > @@ -1803,10 +1869,24 @@ asmlinkage int vprintk_emit(int facility, int 
> > > level,
> > >   logbuf_cpu = UINT_MAX;
> > >   raw_spin_unlock(_lock);
> > >   lockdep_on();
> > > + /*
> > > +  * By default we print message to console asynchronously so that kernel
> > > +  * doesn't get stalled due to slow serial console. That can lead to
> > > +  * softlockups, lost interrupts, or userspace timing out under heavy
> > > +  * printing load.
> > > +  *
> > > +  * However we resort to synchronous printing of messages during early
> > > +  * boot, when oops is in progress, or when synchronous printing was
> > > +  * explicitely requested by kernel parameter.
> > > +  */
> > > + if (keventd_up() && !oops_in_progress && !sync_print) {
> > > + __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT);
> > > + irq_work_queue(this_cpu_ptr(_up_klogd_work));
> > > + } else
> > > + sync_print = true;
> 
> oops, didn't have enough coffee... missed that `else sync_print = true' :(
> 

ah, never mind my previous email... it's a local variable, so the very next 
printk()
happening right after bust_spinlocks(0) will irq_work_queue(). I'd prefer CPUs 
to
print stacks rather than burn cpu cycles in `while (1) cpu_relax()' loop.

so

else {
printk_sync = true;
sync_print = true; /* and remove this local variable entirely may be*/
}

> > can we replace this oops_in_progress check with something more reliable?
> > 
> > CPU0CPU1 - CPUN
> > panic()
> >  local_irq_disable()executing foo() with irqs disabled,
> >  console_verbose()  or processing an extremely long irq 
> > handler.
> >  bust_spinlocks()
> > oops_in_progress++

or we huge enough number of CPUs, 
`deep' stack
traces, slow serial and CPU doing 
dump_stack()
under raw_spin_lock(_lock), so it 
can take
longer than 1 second to print the 
stacks and
thus panic CPU will set 
oops_in_progress back
to 0.

> >  smp_send_stop()
> > 
> >  bust_spinlocks()
> > oops_in_progress--  ok, IPI arrives
> > dump_stack()/printk()/etc from 
> > IPI_CPU_STOP
> > "while (1) cpu_relax()" with irq/fiq 
> > disabled/halt/etc.
> > 
> > smp_send_stop() wrapped in `oops_in_progress++/oops_in_progress--' is arch 
> > specific,
> > and some platforms don't do any IPI-delivered (e.g. via num_online_cpus()) 
> > checks at
> > all. Some do. For example, arm/arm64:
> > 
> > void smp_send_stop(void)
> > ...
> > /* Wait up to one second for other CPUs to stop */
> > timeout = USEC_PER_SEC;
> > while (num_online_cpus() > 1 && timeout--)
> > udelay(1);
> > 
> > if (num_online_cpus() > 1)
> > pr_warn("SMP: failed to stop secondary CPUs\n");
> > ...
> > 
> > 
> > so there are non-zero chances that IPI will arrive to CPU after 
> > 'oops_in_progress--',
> > and thus dump_stack()/etc. happening on that/those cpu/cpus will be lost.
> > 
> > 
> > bust_spinlocks(0) does
> > ...
> > if (--oops_in_progress == 0)
> > wake_up_klogd();
> > ...
> > 
> > but local cpu has irqs disabled and `panic_timeout' can be zero.
> > 
> > How about setting 'sync_print' to 'true' in...
> >   bust_spinlocks() /* only set to true */
> > or
> >   console_verbose() /* um... may be... */
> > or
> >   having a separate one-liner for that
> > 
> > void console_panic_mode(void)
> > {
> > sync_print = true;

printk_sync = true;

> > }
> > 
> > and call it early in panic(), before we send out IPI_STOP.


-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Jan Kara
On Thu 10-12-15 23:52:51, Sergey Senozhatsky wrote:
> Hello,
> 
> *** in this email and in every later emails ***
> Sorry, if I messed up with Cc list or message-ids. It's suprisingly
> hard to jump in into a loop that has never been in your inbox. It took
> some `googling' effort.
> 
> I haven't tested the patch set yet, I just 'ported' it to linux-next.
> I reverted 073696a8bc7779b ("printk: do cond_resched() between lines while
> outputting to consoles") as a first step, but it comes in later again. I can
> send out the updated series (off list is OK).
> 
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> >
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> >
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> >
> > Signed-off-by: Jan Kara 
> 
> I think we better use raw_spin_lock as a print_lock; and, apart from that,
> seems that we don't re-init in zap_lock(). So I ended up with the following
> patch on top of yours (to be folded):
> 
> - use raw_spin_lock
> - do not forget to re-init `print_lock' in zap_locks()

Thanks for looking into my patches and sorry for replying with a delay. As
I wrote in my previous email [1] even the referenced patches are not quite
enough. Over last few days I have worked on redoing the stuff as we
discussed with Linus and Andrew at Kernel Summit and I have new patches
which are working fine for me. I still want to test them on some machines
having real issues with udev during boot but so far stress-testing with
serial console slowed down to ~1000 chars/sec on other machines and VMs
looks promising.

I'm attaching them in case you want to have a look. They are on top of
Tejun's patch adding cond_resched() (which is essential). I'll officially
submit the patches once the testing is finished (but I'm not sure when I
get to the problematic HW...).

Honza

[1] http://www.spinics.net/lists/stable/msg111535.html
-- 
Jan Kara 
SUSE Labs, CR
>From 2e9675abbfc0df4a24a8c760c58e8150b9a31259 Mon Sep 17 00:00:00 2001
From: Jan Kara 
Date: Mon, 21 Dec 2015 13:10:31 +0100
Subject: [PATCH 1/2] printk: Make printk() completely async

Currently, printk() sometimes waits for message to be printed to console
and sometimes it does not (when console_sem is held by some other
process). In case printk() grabs console_sem and starts printing to
console, it prints messages from kernel printk buffer until the buffer
is empty. When serial console is attached, printing is slow and thus
other CPUs in the system have plenty of time to append new messages to
the buffer while one CPU is printing. Thus the CPU can spend unbounded
amount of time doing printing in console_unlock().  This is especially
serious problem if the printk() calling console_unlock() was called with
interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to 

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-22 Thread Sergey Senozhatsky
On (12/22/15 14:47), Jan Kara wrote:
[..]
> Thanks for looking into my patches and sorry for replying with a delay. As
> I wrote in my previous email [1] even the referenced patches are not quite
> enough. Over last few days I have worked on redoing the stuff as we
> discussed with Linus and Andrew at Kernel Summit and I have new patches
> which are working fine for me. I still want to test them on some machines
> having real issues with udev during boot but so far stress-testing with
> serial console slowed down to ~1000 chars/sec on other machines and VMs
> looks promising.
> 
> I'm attaching them in case you want to have a look. They are on top of
> Tejun's patch adding cond_resched() (which is essential). I'll officially
> submit the patches once the testing is finished (but I'm not sure when I
> get to the problematic HW...).
> 

Hello,

Thanks a lot! Will take a look.

-ss

> From 2e9675abbfc0df4a24a8c760c58e8150b9a31259 Mon Sep 17 00:00:00 2001
> From: Jan Kara 
> Date: Mon, 21 Dec 2015 13:10:31 +0100
> Subject: [PATCH 1/2] printk: Make printk() completely async
> 
> Currently, printk() sometimes waits for message to be printed to console
> and sometimes it does not (when console_sem is held by some other
> process). In case printk() grabs console_sem and starts printing to
> console, it prints messages from kernel printk buffer until the buffer
> is empty. When serial console is attached, printing is slow and thus
> other CPUs in the system have plenty of time to append new messages to
> the buffer while one CPU is printing. Thus the CPU can spend unbounded
> amount of time doing printing in console_unlock().  This is especially
> serious problem if the printk() calling console_unlock() was called with
> interrupts disabled.
> 
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Another observed issue is that due to slow printk,
> hardware discovery is slow and udev times out before kernel manages to
> discover all the attached HW. Also during artificial stress testing SATA
> disk disappears from the system because its interrupts aren't served for
> too long.
> 
> This patch makes printk() completely asynchronous (similar to what
> printk_deferred() did until now). It appends message to the kernel
> printk buffer and queues work to do the printing to console. This has
> the advantage that printing always happens from a schedulable contex and
> thus we don't lockup any particular CPU or even interrupts. Also it has
> the advantage that printk() is fast and thus kernel booting is not
> slowed down by slow serial console. Disadvantage of this method is that
> in case of crash there is higher chance that important messages won't
> appear in console output (we may need working scheduling to print
> message to console). We somewhat mitigate this risk by switching printk
> to the original method of immediate printing to console if oops is in
> progress.  Also for debugging purposes we provide printk.synchronous
> kernel parameter which resorts to the original printk behavior.
> 
> Signed-off-by: Jan Kara 
> ---
>  Documentation/kernel-parameters.txt |  10 +++
>  kernel/printk/printk.c  | 144 
> +---
>  2 files changed, 95 insertions(+), 59 deletions(-)
> 
> diff --git a/Documentation/kernel-parameters.txt 
> b/Documentation/kernel-parameters.txt
> index 742f69d18fc8..4cf1bddeffc7 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -3000,6 +3000,16 @@ bytes respectively. Such letter suffixes can also be 
> entirely omitted.
>   printk.time=Show timing data prefixed to each printk message line
>   Format:   (1/Y/y=enable, 0/N/n=disable)
>  
> + printk.synchronous=
> + By default kernel messages are printed to console
> + asynchronously (except during early boot or when oops
> + is happening). That avoids kernel stalling behind slow
> + serial console and thus avoids softlockups, interrupt
> + timeouts, or userspace timing out during heavy printing.
> + However for debugging problems, printing messages to
> + console immediately may be desirable. This option
> + enables such behavior.
> +
>   processor.max_cstate=   [HW,ACPI]
>   

Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/11/15 13:27), Sergey Senozhatsky wrote:
[..]
> > static bool cpu_stop_printing(int printed_chars)
> > {
> > /* Oops? Print everything now to maximize chances user will see it 
> > */
> > if (oops_in_progress)
> > return false;
> > if (!printk_offload_chars || printed_chars < printk_offload_chars)
> > return false;
> > /*
> >  * Make sure we load fresh value of printing_tasks_spinning. Matches
> >  * the barrier in printing_task()
> >  */
> > smp_rmb();
> > if (atomic_read(_tasks_spinning))
> > return true;
> > wake_up(_queue);
> >
> - return false;
> + return true;
> > }

*just as an idea*, I was thinking about having two different offload
limits -- for user space processes doing console_unlock() for whatever
reason (printk in syscall or because of console_lock, etc.) and for
KTHREADS. the kernel threads can have normal offload_limit, while user
space processes can return back from syscal sooner (doing only half of
printk worload, for example). but this is probably too `custom', though
sort of make some sense.

(compile tested only)

---
 kernel/printk/printk.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 79915da..cff1dd1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -330,6 +330,7 @@ static struct kernel_param_ops offload_chars_ops = {
  * 0.1s maximum latency due to printing.
  */
 static unsigned int __read_mostly printk_offload_chars = 1000;
+static unsigned int __read_mostly printk_offload_limits[2] = {500, 1000};
 
 module_param_cb(offload_chars, _chars_ops, _offload_chars,
   S_IRUGO | S_IWUSR);
@@ -343,10 +344,14 @@ MODULE_PARM_DESC(offload_chars, "offload printing to 
console to a different"
  */
 static bool cpu_stop_printing(int printed_chars)
 {
+   bool type = current->flags & PF_KTHREAD;
+
/* Oops? Print everything now to maximize chances user will see it */
if (oops_in_progress)
return false;
-   if (!printk_offload_chars || printed_chars < printk_offload_chars)
+   if (!printk_offload_chars)
+   return false;
+   if (printed_chars < printk_offload_limits[type])
return false;
/*
 * Make sure we load fresh value of printing_tasks_spinning. Matches
@@ -2995,6 +3000,12 @@ out_err:
return ret;
 }
 
+static void offload_limits_set(void)
+{
+   printk_offload_limits[0] = printk_offload_chars >> 1;
+   printk_offload_limits[1] = printk_offload_chars;
+}
+
 static int offload_chars_set(const char *val, const struct kernel_param *kp)
 {
int ret;
@@ -3006,6 +3017,8 @@ static int offload_chars_set(const char *val, const 
struct kernel_param *kp)
mutex_unlock(_kthread_mutex);
return ret;
}
+
+   offload_limits_set();
ret = printk_start_offload_kthreads();
mutex_unlock(_kthread_mutex);
return ret;
@@ -3014,11 +3027,13 @@ static int offload_chars_set(const char *val, const 
struct kernel_param *kp)
 static void printk_offload_init(void)
 {
mutex_lock(_kthread_mutex);
+   offload_limits_set();
if (num_possible_cpus() <= 1) {
/* Offloading doesn't make sense. Disable print offloading. */
printk_offload_chars = 0;
-   } else
+   } else {
printk_start_offload_kthreads();
+   }
mutex_unlock(_kthread_mutex);
 }
 
-- 
2.6.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/10/15 23:52), Sergey Senozhatsky wrote:
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> >
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> >
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> >
> > Signed-off-by: Jan Kara 


> static bool cpu_stop_printing(int printed_chars)
> {
> /* Oops? Print everything now to maximize chances user will see it */
> if (oops_in_progress)
> return false;
> if (!printk_offload_chars || printed_chars < printk_offload_chars)
> return false;
> /*
>  * Make sure we load fresh value of printing_tasks_spinning. Matches
>  * the barrier in printing_task()
>  */
> smp_rmb();
> if (atomic_read(_tasks_spinning))
> return true;
> wake_up(_queue);
>
-   return false;
+   return true;
> }


we just woke up a task we will offload printing to. no need to do another round
of call_console_drivers() from current, possibly overrunning 
printk_offload_chars
by strlen of next msg, which can be close to max text length in the worst case,
while woken up print task will burn cpu cycles spinning on the print_lock.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/10/15 23:52), Sergey Senozhatsky wrote:
> 
> I think we better use raw_spin_lock as a print_lock; and, apart from that,
> seems that we don't re-init in zap_lock(). So I ended up with the following
> patch on top of yours (to be folded):
> 
> - use raw_spin_lock
> - do not forget to re-init `print_lock' in zap_locks()

while we are on this, what do you guys think?


CPU1CPU2
console_unlock()
  call_console_drivers()
con->write()
   ... spin_lock ... uart, etc  panic

zap_lock() will raw_spin_lock_init(_lock) and
sema_init(_sem, 1), but we still have `spin_lock'
held by con->write().

so the `panic' flush or print out will see con->write() being already
blocked.


===8<

We do zap_lock() in printk to make a panic print out possible, but we
can end up having a locked serial console - e.g. panic has occurred
whilst CPUx was in con->write(), which takes some internal locks, thus
call_console_drivers() will perform con->write() on an already locked
console.

Try to reset() every console in zap_lock() via console specific
->reset() call.

Signed-off-by: Sergey Senozhatsky 
---
 include/linux/console.h | 1 +
 kernel/printk/printk.c  | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index bd19434..1cb8f72 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -125,6 +125,7 @@ struct console {
void(*unblank)(void);
int (*setup)(struct console *, char *);
int (*match)(struct console *, char *name, int idx, char *options);
+   void(*reset)(struct console *);
short   flags;
short   index;
int cflag;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f4a9565..ad172c4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1580,6 +1580,7 @@ static void call_console_drivers(int level,
  */
 static void zap_locks(void)
 {
+   struct console *c;
static unsigned long oops_timestamp;
 
if (time_after_eq(jiffies, oops_timestamp) &&
@@ -1589,6 +1590,11 @@ static void zap_locks(void)
oops_timestamp = jiffies;
 
debug_locks_off();
+
+   for_each_console(c)
+   if ((c->flags & CON_ENABLED) && c->reset)
+   c->reset(c);
+
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(_lock);
/* And make sure that we print immediately */
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
Hello,

*** in this email and in every later emails ***
Sorry, if I messed up with Cc list or message-ids. It's suprisingly
hard to jump in into a loop that has never been in your inbox. It took
some `googling' effort.

I haven't tested the patch set yet, I just 'ported' it to linux-next.
I reverted 073696a8bc7779b ("printk: do cond_resched() between lines while
outputting to consoles") as a first step, but it comes in later again. I can
send out the updated series (off list is OK).

> Currently, console_unlock() prints messages from kernel printk buffer to
> console while the buffer is non-empty. When serial console is attached,
> printing is slow and thus other CPUs in the system have plenty of time
> to append new messages to the buffer while one CPU is printing. Thus the
> CPU can spend unbounded amount of time doing printing in console_unlock().
> This is especially serious problem if the printk() calling
> console_unlock() was called with interrupts disabled.
>
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Also during artificial stress testing SATA disk
> disappears from the system because its interrupts aren't served for too
> long.
>
> This patch implements a mechanism where after printing specified number
> of characters (tunable as a kernel parameter printk.offload_chars), CPU
> doing printing asks for help by waking up one of dedicated kthreads.  As
> soon as the printing CPU notices kthread got scheduled and is spinning
> on print_lock dedicated for that purpose, it drops console_sem,
> print_lock, and exits console_unlock(). Kthread then takes over printing
> instead. This way no CPU should spend printing too long even if there
> is heavy printk traffic.
>
> Signed-off-by: Jan Kara 

I think we better use raw_spin_lock as a print_lock; and, apart from that,
seems that we don't re-init in zap_lock(). So I ended up with the following
patch on top of yours (to be folded):

- use raw_spin_lock
- do not forget to re-init `print_lock' in zap_locks()
---
 kernel/printk/printk.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d986599..2a86ff1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -85,7 +85,7 @@ EXPORT_SYMBOL_GPL(console_drivers);
  * we can spin on it when some other thread wants to take over printing to
  * console.
  */
-static DEFINE_SPINLOCK(print_lock);
+static DEFINE_RAW_SPINLOCK(print_lock);
 
 /*
  * Number of printing threads spinning on print_lock. Can go away once
@@ -1516,6 +1516,7 @@ static void zap_locks(void)
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(_lock);
/* And make sure that we print immediately */
+   raw_spin_lock_init(_lock);
sema_init(_sem, 1);
 }
 
@@ -2311,7 +2312,7 @@ void console_unlock(void)
console_cont_flush(text, sizeof(text));
 again:
retry = false;
-   spin_lock_irqsave(_lock, flags);
+   raw_spin_lock_irqsave(_lock, flags);
for (;;) {
struct printk_log *msg;
size_t ext_len = 0;
@@ -2410,7 +2411,7 @@ skip:
 * succeeds in getting console_sem (unless someone else takes it and
 * then he'll be responsible for printing).
  */
-   spin_unlock_irqrestore(_lock, flags);
+   raw_spin_unlock_irqrestore(_lock, flags);
 
/*
 * In case we cannot trylock the console_sem again, there's a new owner
@@ -2773,9 +2774,9 @@ static int printing_task(void *arg)
 * want to sleep once we got scheduled to make sure we take
 * over printing without depending on the scheduler.
 */
-   spin_lock_irqsave(_lock, flags);
+   raw_spin_lock_irqsave(_lock, flags);
atomic_dec(_tasks_spinning);
-   spin_unlock_irqrestore(_lock, flags);
+   raw_spin_unlock_irqrestore(_lock, flags);
if (console_trylock())
console_unlock();
preempt_enable();
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/10/15 23:52), Sergey Senozhatsky wrote:
> > Currently, console_unlock() prints messages from kernel printk buffer to
> > console while the buffer is non-empty. When serial console is attached,
> > printing is slow and thus other CPUs in the system have plenty of time
> > to append new messages to the buffer while one CPU is printing. Thus the
> > CPU can spend unbounded amount of time doing printing in console_unlock().
> > This is especially serious problem if the printk() calling
> > console_unlock() was called with interrupts disabled.
> >
> > In practice users have observed a CPU can spend tens of seconds printing
> > in console_unlock() (usually during boot when hundreds of SCSI devices
> > are discovered) resulting in RCU stalls (CPU doing printing doesn't
> > reach quiescent state for a long time), softlockup reports (IPIs for the
> > printing CPU don't get served and thus other CPUs are spinning waiting
> > for the printing CPU to process IPIs), and eventually a machine death
> > (as messages from stalls and lockups append to printk buffer faster than
> > we are able to print). So these machines are unable to boot with serial
> > console attached. Also during artificial stress testing SATA disk
> > disappears from the system because its interrupts aren't served for too
> > long.
> >
> > This patch implements a mechanism where after printing specified number
> > of characters (tunable as a kernel parameter printk.offload_chars), CPU
> > doing printing asks for help by waking up one of dedicated kthreads.  As
> > soon as the printing CPU notices kthread got scheduled and is spinning
> > on print_lock dedicated for that purpose, it drops console_sem,
> > print_lock, and exits console_unlock(). Kthread then takes over printing
> > instead. This way no CPU should spend printing too long even if there
> > is heavy printk traffic.
> >
> > Signed-off-by: Jan Kara 


> static bool cpu_stop_printing(int printed_chars)
> {
> /* Oops? Print everything now to maximize chances user will see it */
> if (oops_in_progress)
> return false;
> if (!printk_offload_chars || printed_chars < printk_offload_chars)
> return false;
> /*
>  * Make sure we load fresh value of printing_tasks_spinning. Matches
>  * the barrier in printing_task()
>  */
> smp_rmb();
> if (atomic_read(_tasks_spinning))
> return true;
> wake_up(_queue);
>
-   return false;
+   return true;
> }


we just woke up a task we will offload printing to. no need to do another round
of call_console_drivers() from current, possibly overrunning 
printk_offload_chars
by strlen of next msg, which can be close to max text length in the worst case,
while woken up print task will burn cpu cycles spinning on the print_lock.

-ss
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/11/15 13:27), Sergey Senozhatsky wrote:
[..]
> > static bool cpu_stop_printing(int printed_chars)
> > {
> > /* Oops? Print everything now to maximize chances user will see it 
> > */
> > if (oops_in_progress)
> > return false;
> > if (!printk_offload_chars || printed_chars < printk_offload_chars)
> > return false;
> > /*
> >  * Make sure we load fresh value of printing_tasks_spinning. Matches
> >  * the barrier in printing_task()
> >  */
> > smp_rmb();
> > if (atomic_read(_tasks_spinning))
> > return true;
> > wake_up(_queue);
> >
> - return false;
> + return true;
> > }

*just as an idea*, I was thinking about having two different offload
limits -- for user space processes doing console_unlock() for whatever
reason (printk in syscall or because of console_lock, etc.) and for
KTHREADS. the kernel threads can have normal offload_limit, while user
space processes can return back from syscal sooner (doing only half of
printk worload, for example). but this is probably too `custom', though
sort of make some sense.

(compile tested only)

---
 kernel/printk/printk.c | 19 +--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 79915da..cff1dd1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -330,6 +330,7 @@ static struct kernel_param_ops offload_chars_ops = {
  * 0.1s maximum latency due to printing.
  */
 static unsigned int __read_mostly printk_offload_chars = 1000;
+static unsigned int __read_mostly printk_offload_limits[2] = {500, 1000};
 
 module_param_cb(offload_chars, _chars_ops, _offload_chars,
   S_IRUGO | S_IWUSR);
@@ -343,10 +344,14 @@ MODULE_PARM_DESC(offload_chars, "offload printing to 
console to a different"
  */
 static bool cpu_stop_printing(int printed_chars)
 {
+   bool type = current->flags & PF_KTHREAD;
+
/* Oops? Print everything now to maximize chances user will see it */
if (oops_in_progress)
return false;
-   if (!printk_offload_chars || printed_chars < printk_offload_chars)
+   if (!printk_offload_chars)
+   return false;
+   if (printed_chars < printk_offload_limits[type])
return false;
/*
 * Make sure we load fresh value of printing_tasks_spinning. Matches
@@ -2995,6 +3000,12 @@ out_err:
return ret;
 }
 
+static void offload_limits_set(void)
+{
+   printk_offload_limits[0] = printk_offload_chars >> 1;
+   printk_offload_limits[1] = printk_offload_chars;
+}
+
 static int offload_chars_set(const char *val, const struct kernel_param *kp)
 {
int ret;
@@ -3006,6 +3017,8 @@ static int offload_chars_set(const char *val, const 
struct kernel_param *kp)
mutex_unlock(_kthread_mutex);
return ret;
}
+
+   offload_limits_set();
ret = printk_start_offload_kthreads();
mutex_unlock(_kthread_mutex);
return ret;
@@ -3014,11 +3027,13 @@ static int offload_chars_set(const char *val, const 
struct kernel_param *kp)
 static void printk_offload_init(void)
 {
mutex_lock(_kthread_mutex);
+   offload_limits_set();
if (num_possible_cpus() <= 1) {
/* Offloading doesn't make sense. Disable print offloading. */
printk_offload_chars = 0;
-   } else
+   } else {
printk_start_offload_kthreads();
+   }
mutex_unlock(_kthread_mutex);
 }
 
-- 
2.6.4

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
Hello,

*** in this email and in every later emails ***
Sorry, if I messed up with Cc list or message-ids. It's suprisingly
hard to jump in into a loop that has never been in your inbox. It took
some `googling' effort.

I haven't tested the patch set yet, I just 'ported' it to linux-next.
I reverted 073696a8bc7779b ("printk: do cond_resched() between lines while
outputting to consoles") as a first step, but it comes in later again. I can
send out the updated series (off list is OK).

> Currently, console_unlock() prints messages from kernel printk buffer to
> console while the buffer is non-empty. When serial console is attached,
> printing is slow and thus other CPUs in the system have plenty of time
> to append new messages to the buffer while one CPU is printing. Thus the
> CPU can spend unbounded amount of time doing printing in console_unlock().
> This is especially serious problem if the printk() calling
> console_unlock() was called with interrupts disabled.
>
> In practice users have observed a CPU can spend tens of seconds printing
> in console_unlock() (usually during boot when hundreds of SCSI devices
> are discovered) resulting in RCU stalls (CPU doing printing doesn't
> reach quiescent state for a long time), softlockup reports (IPIs for the
> printing CPU don't get served and thus other CPUs are spinning waiting
> for the printing CPU to process IPIs), and eventually a machine death
> (as messages from stalls and lockups append to printk buffer faster than
> we are able to print). So these machines are unable to boot with serial
> console attached. Also during artificial stress testing SATA disk
> disappears from the system because its interrupts aren't served for too
> long.
>
> This patch implements a mechanism where after printing specified number
> of characters (tunable as a kernel parameter printk.offload_chars), CPU
> doing printing asks for help by waking up one of dedicated kthreads.  As
> soon as the printing CPU notices kthread got scheduled and is spinning
> on print_lock dedicated for that purpose, it drops console_sem,
> print_lock, and exits console_unlock(). Kthread then takes over printing
> instead. This way no CPU should spend printing too long even if there
> is heavy printk traffic.
>
> Signed-off-by: Jan Kara 

I think we better use raw_spin_lock as a print_lock; and, apart from that,
seems that we don't re-init in zap_lock(). So I ended up with the following
patch on top of yours (to be folded):

- use raw_spin_lock
- do not forget to re-init `print_lock' in zap_locks()
---
 kernel/printk/printk.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index d986599..2a86ff1 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -85,7 +85,7 @@ EXPORT_SYMBOL_GPL(console_drivers);
  * we can spin on it when some other thread wants to take over printing to
  * console.
  */
-static DEFINE_SPINLOCK(print_lock);
+static DEFINE_RAW_SPINLOCK(print_lock);
 
 /*
  * Number of printing threads spinning on print_lock. Can go away once
@@ -1516,6 +1516,7 @@ static void zap_locks(void)
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(_lock);
/* And make sure that we print immediately */
+   raw_spin_lock_init(_lock);
sema_init(_sem, 1);
 }
 
@@ -2311,7 +2312,7 @@ void console_unlock(void)
console_cont_flush(text, sizeof(text));
 again:
retry = false;
-   spin_lock_irqsave(_lock, flags);
+   raw_spin_lock_irqsave(_lock, flags);
for (;;) {
struct printk_log *msg;
size_t ext_len = 0;
@@ -2410,7 +2411,7 @@ skip:
 * succeeds in getting console_sem (unless someone else takes it and
 * then he'll be responsible for printing).
  */
-   spin_unlock_irqrestore(_lock, flags);
+   raw_spin_unlock_irqrestore(_lock, flags);
 
/*
 * In case we cannot trylock the console_sem again, there's a new owner
@@ -2773,9 +2774,9 @@ static int printing_task(void *arg)
 * want to sleep once we got scheduled to make sure we take
 * over printing without depending on the scheduler.
 */
-   spin_lock_irqsave(_lock, flags);
+   raw_spin_lock_irqsave(_lock, flags);
atomic_dec(_tasks_spinning);
-   spin_unlock_irqrestore(_lock, flags);
+   raw_spin_unlock_irqrestore(_lock, flags);
if (console_trylock())
console_unlock();
preempt_enable();
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 1/7] printk: Hand over printing to console if printing too long

2015-12-10 Thread Sergey Senozhatsky
On (12/10/15 23:52), Sergey Senozhatsky wrote:
> 
> I think we better use raw_spin_lock as a print_lock; and, apart from that,
> seems that we don't re-init in zap_lock(). So I ended up with the following
> patch on top of yours (to be folded):
> 
> - use raw_spin_lock
> - do not forget to re-init `print_lock' in zap_locks()

while we are on this, what do you guys think?


CPU1CPU2
console_unlock()
  call_console_drivers()
con->write()
   ... spin_lock ... uart, etc  panic

zap_lock() will raw_spin_lock_init(_lock) and
sema_init(_sem, 1), but we still have `spin_lock'
held by con->write().

so the `panic' flush or print out will see con->write() being already
blocked.


===8<

We do zap_lock() in printk to make a panic print out possible, but we
can end up having a locked serial console - e.g. panic has occurred
whilst CPUx was in con->write(), which takes some internal locks, thus
call_console_drivers() will perform con->write() on an already locked
console.

Try to reset() every console in zap_lock() via console specific
->reset() call.

Signed-off-by: Sergey Senozhatsky 
---
 include/linux/console.h | 1 +
 kernel/printk/printk.c  | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/include/linux/console.h b/include/linux/console.h
index bd19434..1cb8f72 100644
--- a/include/linux/console.h
+++ b/include/linux/console.h
@@ -125,6 +125,7 @@ struct console {
void(*unblank)(void);
int (*setup)(struct console *, char *);
int (*match)(struct console *, char *name, int idx, char *options);
+   void(*reset)(struct console *);
short   flags;
short   index;
int cflag;
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index f4a9565..ad172c4 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1580,6 +1580,7 @@ static void call_console_drivers(int level,
  */
 static void zap_locks(void)
 {
+   struct console *c;
static unsigned long oops_timestamp;
 
if (time_after_eq(jiffies, oops_timestamp) &&
@@ -1589,6 +1590,11 @@ static void zap_locks(void)
oops_timestamp = jiffies;
 
debug_locks_off();
+
+   for_each_console(c)
+   if ((c->flags & CON_ENABLED) && c->reset)
+   c->reset(c);
+
/* If a crash is occurring, make sure we can't deadlock */
raw_spin_lock_init(_lock);
/* And make sure that we print immediately */
-- 
2.6.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/7] printk: Hand over printing to console if printing too long

2015-10-25 Thread Jan Kara
From: Jan Kara 

Currently, console_unlock() prints messages from kernel printk buffer to
console while the buffer is non-empty. When serial console is attached,
printing is slow and thus other CPUs in the system have plenty of time
to append new messages to the buffer while one CPU is printing. Thus the
CPU can spend unbounded amount of time doing printing in console_unlock().
This is especially serious problem if the printk() calling
console_unlock() was called with interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to printk buffer faster than
we are able to print). So these machines are unable to boot with serial
console attached. Also during artificial stress testing SATA disk
disappears from the system because its interrupts aren't served for too
long.

This patch implements a mechanism where after printing specified number
of characters (tunable as a kernel parameter printk.offload_chars), CPU
doing printing asks for help by waking up one of dedicated kthreads.  As
soon as the printing CPU notices kthread got scheduled and is spinning
on print_lock dedicated for that purpose, it drops console_sem,
print_lock, and exits console_unlock(). Kthread then takes over printing
instead. This way no CPU should spend printing too long even if there
is heavy printk traffic.

Signed-off-by: Jan Kara 
---
 Documentation/kernel-parameters.txt |  15 
 kernel/printk/printk.c  | 173 
 2 files changed, 171 insertions(+), 17 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 22a4b687ea5b..df8adee975ba 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2958,6 +2958,21 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
Format:   (1/Y/y=enable, 0/N/n=disable)
default: disabled
 
+   printk.offload_chars=
+   Printing to console can be relatively slow especially
+   in case of serial console. When there is intensive
+   printing happening from several cpus (as is the case
+   during boot), a cpu can be spending significant time
+   (seconds or more) doing printing. To avoid softlockups,
+   lost interrupts, and similar problems other cpus
+   will take over printing after the currently printing
+   cpu has printed 'printk.offload_chars' characters.
+   Higher value means possibly longer interrupt and other
+   latencies but lower overhead of printing due to handing
+   over of printing.
+   Format:  (0 = disabled)
+   default: 1000
+
printk.time=Show timing data prefixed to each printk message line
Format:   (1/Y/y=enable, 0/N/n=disable)
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 8f0324ef72ab..1b26263edfa7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -78,6 +79,29 @@ static DEFINE_SEMAPHORE(console_sem);
 struct console *console_drivers;
 EXPORT_SYMBOL_GPL(console_drivers);
 
+/*
+ * This spinlock is taken when printing to console. It is used only so that
+ * we can spin on it when some other thread wants to take over printing to
+ * console.
+ */
+static DEFINE_SPINLOCK(print_lock);
+
+/*
+ * Number of printing threads spinning on print_lock. Can go away once
+ * spin_is_contended() is reliable.
+ */
+static atomic_t printing_tasks_spinning = ATOMIC_INIT(0);
+
+/*
+ * Number of kernel threads for offloading printing. We need at least two so
+ * that they can hand over printing from one to another one and thus switch
+ * CPUs.
+ */
+#define PRINTING_TASKS 2
+
+/* Wait queue printing kthreads sleep on when idle */
+static DECLARE_WAIT_QUEUE_HEAD(print_queue);
+
 #ifdef CONFIG_LOCKDEP
 static struct lockdep_map console_lock_dep_map = {
.name = "console_lock"
@@ -279,6 +303,18 @@ static u32 clear_idx;
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 static char *log_buf = __log_buf;
 static u32 log_buf_len = __LOG_BUF_LEN;
+/*
+ * How many characters can we print in one call of printk before asking
+ * other cpus to continue printing. 0 means infinity. Tunable via
+ * printk.offload_chars kernel parameter. Our default 1000 means about

[PATCH 1/7] printk: Hand over printing to console if printing too long

2015-10-25 Thread Jan Kara
From: Jan Kara 

Currently, console_unlock() prints messages from kernel printk buffer to
console while the buffer is non-empty. When serial console is attached,
printing is slow and thus other CPUs in the system have plenty of time
to append new messages to the buffer while one CPU is printing. Thus the
CPU can spend unbounded amount of time doing printing in console_unlock().
This is especially serious problem if the printk() calling
console_unlock() was called with interrupts disabled.

In practice users have observed a CPU can spend tens of seconds printing
in console_unlock() (usually during boot when hundreds of SCSI devices
are discovered) resulting in RCU stalls (CPU doing printing doesn't
reach quiescent state for a long time), softlockup reports (IPIs for the
printing CPU don't get served and thus other CPUs are spinning waiting
for the printing CPU to process IPIs), and eventually a machine death
(as messages from stalls and lockups append to printk buffer faster than
we are able to print). So these machines are unable to boot with serial
console attached. Also during artificial stress testing SATA disk
disappears from the system because its interrupts aren't served for too
long.

This patch implements a mechanism where after printing specified number
of characters (tunable as a kernel parameter printk.offload_chars), CPU
doing printing asks for help by waking up one of dedicated kthreads.  As
soon as the printing CPU notices kthread got scheduled and is spinning
on print_lock dedicated for that purpose, it drops console_sem,
print_lock, and exits console_unlock(). Kthread then takes over printing
instead. This way no CPU should spend printing too long even if there
is heavy printk traffic.

Signed-off-by: Jan Kara 
---
 Documentation/kernel-parameters.txt |  15 
 kernel/printk/printk.c  | 173 
 2 files changed, 171 insertions(+), 17 deletions(-)

diff --git a/Documentation/kernel-parameters.txt 
b/Documentation/kernel-parameters.txt
index 22a4b687ea5b..df8adee975ba 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2958,6 +2958,21 @@ bytes respectively. Such letter suffixes can also be 
entirely omitted.
Format:   (1/Y/y=enable, 0/N/n=disable)
default: disabled
 
+   printk.offload_chars=
+   Printing to console can be relatively slow especially
+   in case of serial console. When there is intensive
+   printing happening from several cpus (as is the case
+   during boot), a cpu can be spending significant time
+   (seconds or more) doing printing. To avoid softlockups,
+   lost interrupts, and similar problems other cpus
+   will take over printing after the currently printing
+   cpu has printed 'printk.offload_chars' characters.
+   Higher value means possibly longer interrupt and other
+   latencies but lower overhead of printing due to handing
+   over of printing.
+   Format:  (0 = disabled)
+   default: 1000
+
printk.time=Show timing data prefixed to each printk message line
Format:   (1/Y/y=enable, 0/N/n=disable)
 
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 8f0324ef72ab..1b26263edfa7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -46,6 +46,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -78,6 +79,29 @@ static DEFINE_SEMAPHORE(console_sem);
 struct console *console_drivers;
 EXPORT_SYMBOL_GPL(console_drivers);
 
+/*
+ * This spinlock is taken when printing to console. It is used only so that
+ * we can spin on it when some other thread wants to take over printing to
+ * console.
+ */
+static DEFINE_SPINLOCK(print_lock);
+
+/*
+ * Number of printing threads spinning on print_lock. Can go away once
+ * spin_is_contended() is reliable.
+ */
+static atomic_t printing_tasks_spinning = ATOMIC_INIT(0);
+
+/*
+ * Number of kernel threads for offloading printing. We need at least two so
+ * that they can hand over printing from one to another one and thus switch
+ * CPUs.
+ */
+#define PRINTING_TASKS 2
+
+/* Wait queue printing kthreads sleep on when idle */
+static DECLARE_WAIT_QUEUE_HEAD(print_queue);
+
 #ifdef CONFIG_LOCKDEP
 static struct lockdep_map console_lock_dep_map = {
.name = "console_lock"
@@ -279,6 +303,18 @@ static u32 clear_idx;
 static char __log_buf[__LOG_BUF_LEN] __aligned(LOG_ALIGN);
 static char *log_buf = __log_buf;
 static u32 log_buf_len = __LOG_BUF_LEN;
+/*
+ * How many characters can we print in one call of printk before asking
+ * other cpus to continue printing. 0 means infinity. Tunable via
+ * printk.offload_chars kernel parameter.