Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-28 Thread Thomas Gleixner
Michael,

On Fri, 16 Nov 2018, Zhivich, Michael wrote:
> On 11/15/18, 12:17 PM, "John Stultz"  wrote:
> Would a more focused fix be to move the clocksource watchdog from a
> normal timer to a hrtimer?
> 
> That's an interesting idea - it would get clocksource watchdog out of
> ksoftirqd.  However, clocksource watchdog iterates over available CPUs to
> check the TSC on each core (see add_timer_on() call in
> clocksource_watchdog()).  I'm not seeing an API to start an hrtimer on a
> specific CPU - is this possible and I'm missing it?  Or would something
> like this have to be added to hrtimer?

It's surely an interesting idea, but it's not trivial.

There is no way to reliably queue hrtimers remote when high resolution mode
is enabled. It only works when the to be queued timer is not the first to
expire one. If it ends up being the first to expire timer, then there is
currently no way to rearm the remote per cpu clockevent device because it's
not remotely accesible.

It would need an SMP function call, but that needs to be asynchronous due
to locking/interrupt disabled constraints. Maybe ...

Thanks,

tglx


Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-28 Thread Thomas Gleixner
Michael,

On Fri, 16 Nov 2018, Zhivich, Michael wrote:
> On 11/15/18, 12:17 PM, "John Stultz"  wrote:
> Would a more focused fix be to move the clocksource watchdog from a
> normal timer to a hrtimer?
> 
> That's an interesting idea - it would get clocksource watchdog out of
> ksoftirqd.  However, clocksource watchdog iterates over available CPUs to
> check the TSC on each core (see add_timer_on() call in
> clocksource_watchdog()).  I'm not seeing an API to start an hrtimer on a
> specific CPU - is this possible and I'm missing it?  Or would something
> like this have to be added to hrtimer?

It's surely an interesting idea, but it's not trivial.

There is no way to reliably queue hrtimers remote when high resolution mode
is enabled. It only works when the to be queued timer is not the first to
expire one. If it ends up being the first to expire timer, then there is
currently no way to rearm the remote per cpu clockevent device because it's
not remotely accesible.

It would need an SMP function call, but that needs to be asynchronous due
to locking/interrupt disabled constraints. Maybe ...

Thanks,

tglx


Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-16 Thread Zhivich, Michael
On 11/15/18, 12:17 PM, "John Stultz"  wrote:

On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  
wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then 
the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate 
patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << 
TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john

Hi John,

That's an interesting idea - it would get clocksource watchdog out of 
ksoftirqd.  However, clocksource watchdog iterates over available CPUs to check 
the TSC on each core (see add_timer_on() call in clocksource_watchdog()).  I'm 
not seeing an API to start an hrtimer on a specific CPU - is this possible and 
I'm missing it?  Or would something like this have to be added to hrtimer?

Thanks,
~ Michael



Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-16 Thread Zhivich, Michael
On 11/15/18, 12:17 PM, "John Stultz"  wrote:

On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  
wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then 
the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate 
patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << 
TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john

Hi John,

That's an interesting idea - it would get clocksource watchdog out of 
ksoftirqd.  However, clocksource watchdog iterates over available CPUs to check 
the TSC on each core (see add_timer_on() call in clocksource_watchdog()).  I'm 
not seeing an API to start an hrtimer on a specific CPU - is this possible and 
I'm missing it?  Or would something like this have to be added to hrtimer?

Thanks,
~ Michael



Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-15 Thread John Stultz
On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john


Re: [PATCH] softirq: don't push timer softirq handling to ksoftirqd

2018-11-15 Thread John Stultz
On Thu, Nov 15, 2018 at 9:07 AM, Michael Zhivich  wrote:
> Require TIMER_SOFTIRQ to be handled immediately instead of delaying until
> ksoftirqd runs, thus preventing problems with reading clocksources that
> wrap often (e.g. acpi_pm).
>
> If acpi_pm is used as the clocksource watchdog, and machine is under heavy
> load, the time period for the watchdog check may be significantly longer
> than the requested 0.5 seconds.  If the watchdog check is delayed by 2
> seconds (observed behavior), then acpi_pm time delta will be
>
> 2.5 sec * 3579545 ticks/sec = 8948863 = 0x888c3f
>
> which will be treated as negative (since acpi_pm is only 24-bits wide) and
> truncated to 0.  This behavior will cause tsc to be incorrectly declared
> unstable in clocksource_watchdog(), as it no longer agrees with acpi_pm.
> If the clocksource watchdog check is delayed by more than 4.7 sec, then the
> acpi_pm clocksource will wrap altogether and produce incorrect time delta.
>
> The likely cause of this delay is that timer interrupts are serviced in
> ksoftirqd when the machine is very busy.
>
> Per Linus' comment in commit 3c53776e29f8 ("Mark HI and TASKLET softirq
> synchronous"):
>...
>We should probably also consider the timer softirqs to be synchronous
>and not be delayed to ksoftirqd (since they were the issue with the
>earlier watchdog problems), but that should be done as a separate patch.
>...
>
> Signed-off-by: Michael Zhivich 
> ---
>  kernel/softirq.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index d28813306b2c..6d517ce0fba8 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -82,7 +82,8 @@ static void wakeup_softirqd(void)
>   * right now. Let ksoftirqd handle this at its own rate, to get fairness,
>   * unless we're doing some of the synchronous softirqs.
>   */
> -#define SOFTIRQ_NOW_MASK ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ))
> +#define SOFTIRQ_NOW_MASK \
> +   ((1 << HI_SOFTIRQ) | (1 << TASKLET_SOFTIRQ) | (1 << TIMER_SOFTIRQ))
>  static bool ksoftirqd_running(unsigned long pending)
>  {
> struct task_struct *tsk = __this_cpu_read(ksoftirqd);

Thanks so much for sending this along! Sorry I didn't get back to your
mail earlier this week, I've been at Plumbers.

So while this does try to attack the reliability issue w/ the
clocksource watchdog being delayed, I worry this will have to many
side-effects elsewhere.

Would a more focused fix be to move the clocksource watchdog from a
normal timer to a hrtimer?

thanks
-john