Re: [PATCH RT 0/1] Linux v4.19.306-rt132-rc1

2024-02-05 Thread Sebastian Andrzej Siewior
On 2024-02-02 18:04:56 [+0100], Daniel Wagner wrote:
> Dear RT Folks,
Hi,

> This is the RT stable review cycle of patch 4.19.306-rt132-rc1.
…
> Please scream at me if I messed something up. Please test the patches
> too.

Good.

> Enjoy!
> Daniel

Sebastian



Re: [PATCH net-next 16/24] net: netkit, veth, tun, virt*: Use nested-BH locking for XDP redirect.

2024-01-12 Thread Sebastian Andrzej Siewior
On 2023-12-18 09:52:05 [+0100], Daniel Borkmann wrote:
> Hi Sebastian,
Hi Daniel,

> Please exclude netkit from this set given it does not support XDP, but
> instead only accepts tc BPF typed programs.

okay, thank you.

> Thanks,
> Daniel

Sebastian



[PATCH net-next 16/24] net: netkit, veth, tun, virt*: Use nested-BH locking for XDP redirect.

2023-12-15 Thread Sebastian Andrzej Siewior
The per-CPU variables used during bpf_prog_run_xdp() invocation and
later during xdp_do_redirect() rely on disabled BH for their protection.
Without locking in local_bh_disable() on PREEMPT_RT, these data
structures require explicit locking.

This is a follow-up on the previous change which introduced
bpf_run_lock.redirect_lock and uses it now within drivers.

The simple way is to acquire the lock before bpf_prog_run_xdp() is
invoked and hold it until the end of the function.
This does not always work because some drivers (cpsw, atlantic) invoke
xdp_do_flush() in the same context.
Acquiring the lock in bpf_prog_run_xdp() and dropping it in
xdp_do_redirect() (without touching drivers) does not work because not
all drivers that use bpf_prog_run_xdp() support XDP_REDIRECT (and
invoke xdp_do_redirect()).

Ideally the minimal locking scope would be bpf_prog_run_xdp() +
xdp_do_redirect() and everything else (error recovery, DMA unmapping,
free/alloc of memory, …) would happen outside of the locked section.

Cc: "K. Y. Srinivasan" 
Cc: "Michael S. Tsirkin" 
Cc: Alexei Starovoitov 
Cc: Andrii Nakryiko 
Cc: Dexuan Cui 
Cc: Haiyang Zhang 
Cc: Hao Luo 
Cc: Jesper Dangaard Brouer 
Cc: Jiri Olsa 
Cc: John Fastabend 
Cc: Juergen Gross 
Cc: KP Singh 
Cc: Martin KaFai Lau 
Cc: Nikolay Aleksandrov 
Cc: Song Liu 
Cc: Stanislav Fomichev 
Cc: Stefano Stabellini 
Cc: Wei Liu 
Cc: Willem de Bruijn 
Cc: Xuan Zhuo 
Cc: Yonghong Song 
Cc: b...@vger.kernel.org
Cc: virtualizat...@lists.linux.dev
Cc: xen-de...@lists.xenproject.org
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/net/hyperv/netvsc_bpf.c |  1 +
 drivers/net/netkit.c| 13 +++
 drivers/net/tun.c   | 28 +--
 drivers/net/veth.c  | 40 -
 drivers/net/virtio_net.c|  1 +
 drivers/net/xen-netfront.c  |  1 +
 6 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/drivers/net/hyperv/netvsc_bpf.c b/drivers/net/hyperv/netvsc_bpf.c
index 4a9522689fa4f..55f8ca92ca199 100644
--- a/drivers/net/hyperv/netvsc_bpf.c
+++ b/drivers/net/hyperv/netvsc_bpf.c
@@ -58,6 +58,7 @@ u32 netvsc_run_xdp(struct net_device *ndev, struct netvsc_channel *nvchan,
 
memcpy(xdp->data, data, len);
 
+   guard(local_lock_nested_bh)(&bpf_run_lock.redirect_lock);
act = bpf_prog_run_xdp(prog, xdp);
 
switch (act) {
diff --git a/drivers/net/netkit.c b/drivers/net/netkit.c
index 39171380ccf29..fbcf78477bda8 100644
--- a/drivers/net/netkit.c
+++ b/drivers/net/netkit.c
@@ -80,8 +80,15 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
netkit_prep_forward(skb, !net_eq(dev_net(dev), dev_net(peer)));
skb->dev = peer;
entry = rcu_dereference(nk->active);
-   if (entry)
-   ret = netkit_run(entry, skb, ret);
+   if (entry) {
+   scoped_guard(local_lock_nested_bh, &bpf_run_lock.redirect_lock) {
+   ret = netkit_run(entry, skb, ret);
+   if (ret == NETKIT_REDIRECT) {
+   dev_sw_netstats_tx_add(dev, 1, len);
+   skb_do_redirect(skb);
+   }
+   }
+   }
switch (ret) {
case NETKIT_NEXT:
case NETKIT_PASS:
@@ -95,8 +102,6 @@ static netdev_tx_t netkit_xmit(struct sk_buff *skb, struct net_device *dev)
}
break;
case NETKIT_REDIRECT:
-   dev_sw_netstats_tx_add(dev, 1, len);
-   skb_do_redirect(skb);
break;
case NETKIT_DROP:
default:
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index afa5497f7c35c..fe0d31f11e4b6 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1708,16 +1708,18 @@ static struct sk_buff *tun_build_skb(struct tun_struct *tun,
	xdp_init_buff(&xdp, buflen, &tfile->xdp_rxq);
	xdp_prepare_buff(&xdp, buf, pad, len, false);
 
-	act = bpf_prog_run_xdp(xdp_prog, &xdp);
-	if (act == XDP_REDIRECT || act == XDP_TX) {
-		get_page(alloc_frag->page);
-		alloc_frag->offset += buflen;
-	}
-	err = tun_xdp_act(tun, xdp_prog, &xdp, act);
-	if (err < 0) {
-		if (act == XDP_REDIRECT || act == XDP_TX)
-			put_page(alloc_frag->page);
-		goto out;
+	scoped_guard(local_lock_nested_bh, &bpf_run_lock.redirect_lock) {
+		act = bpf_prog_run_xdp(xdp_prog, &xdp);
+		if (act == XDP_REDIRECT || act == XDP_TX) {
+			get_page(alloc_frag->page);
+			alloc_frag->offset += buflen;
+		}
+		err = tun_xdp_act(tun, xdp_prog, &xdp, act);
+		if (err < 0) {
+  

Re: [tip: core/rcu] softirq: Don't try waking ksoftirqd before it has been spawned

2021-04-14 Thread Sebastian Andrzej Siewior
On 2021-04-12 11:36:45 [-0700], Paul E. McKenney wrote:
> > Color me confused. I did not follow the discussion around this
> > completely, but wasn't it agreed on that this rcu torture muck can wait
> > until the threads are brought up?
> 
> Yes, we can cause rcutorture to wait.  But in this case, rcutorture
> is just the messenger, and making it wait would simply be ignoring
> the message.  The message is that someone could invoke any number of
> things that wait on a softirq handler's invocation during the interval
> before ksoftirqd has been spawned.

My memory of this is that the only user that required this early
behaviour was kprobes, which was recently changed to no longer need it.
That makes the test the only remaining user. I therefore thought that
this test would be moved to a later position (when ksoftirqd is up and
running) and that there would be no more requirement for RCU to be
completely up that early in the boot process.

Did I miss anything?

> 
>   Thanx, Paul

Sebastian


[tip: locking/core] locking/rtmutex: Remove rt_mutex_timed_lock()

2021-03-29 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: c15380b72d7ae821ee090ba5a56fc6310828dbda
Gitweb:
https://git.kernel.org/tip/c15380b72d7ae821ee090ba5a56fc6310828dbda
Author:Sebastian Andrzej Siewior 
AuthorDate:Fri, 26 Mar 2021 16:29:30 +01:00
Committer: Ingo Molnar 
CommitterDate: Mon, 29 Mar 2021 15:57:02 +02:00

locking/rtmutex: Remove rt_mutex_timed_lock()

rt_mutex_timed_lock() has no callers since:

  c051b21f71d1f ("rtmutex: Confine deadlock logic to futex")

Remove it.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210326153943.061103...@linutronix.de
---
 include/linux/rtmutex.h  |  3 +---
 kernel/locking/rtmutex.c | 46 +---
 2 files changed, 49 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 6fd615a..32f4a35 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -115,9 +115,6 @@ extern void rt_mutex_lock(struct rt_mutex *lock);
 #endif
 
 extern int rt_mutex_lock_interruptible(struct rt_mutex *lock);
-extern int rt_mutex_timed_lock(struct rt_mutex *lock,
-  struct hrtimer_sleeper *timeout);
-
 extern int rt_mutex_trylock(struct rt_mutex *lock);
 
 extern void rt_mutex_unlock(struct rt_mutex *lock);
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index db31bce..ca93e5d 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1395,21 +1395,6 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 }
 
 static inline int
-rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
-   struct hrtimer_sleeper *timeout,
-   enum rtmutex_chainwalk chwalk,
-   int (*slowfn)(struct rt_mutex *lock, int state,
- struct hrtimer_sleeper *timeout,
- enum rtmutex_chainwalk chwalk))
-{
-   if (chwalk == RT_MUTEX_MIN_CHAINWALK &&
-   likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
-   return 0;
-
-   return slowfn(lock, state, timeout, chwalk);
-}
-
-static inline int
 rt_mutex_fasttrylock(struct rt_mutex *lock,
 int (*slowfn)(struct rt_mutex *lock))
 {
@@ -1517,37 +1502,6 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
 }
 
 /**
- * rt_mutex_timed_lock - lock a rt_mutex interruptible
- * the timeout structure is provided
- * by the caller
- *
- * @lock:  the rt_mutex to be locked
- * @timeout:   timeout structure or NULL (no timeout)
- *
- * Returns:
- *  0  on success
- * -EINTR  when interrupted by a signal
- * -ETIMEDOUT  when the timeout expired
- */
-int
-rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
-{
-   int ret;
-
-   might_sleep();
-
-   mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-   ret = rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
-  RT_MUTEX_MIN_CHAINWALK,
-  rt_mutex_slowlock);
-   if (ret)
-   mutex_release(&lock->dep_map, _RET_IP_);
-
-   return ret;
-}
-EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
-
-/**
  * rt_mutex_trylock - try to lock a rt_mutex
  *
  * @lock:  the rt_mutex to be locked


[tip: locking/core] locking/rtmutex: Remove output from deadlock detector

2021-03-29 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 6d41c675a5394057f6fb1dc97cc0a0e360f2c2f8
Gitweb:
https://git.kernel.org/tip/6d41c675a5394057f6fb1dc97cc0a0e360f2c2f8
Author:Sebastian Andrzej Siewior 
AuthorDate:Fri, 26 Mar 2021 16:29:32 +01:00
Committer: Ingo Molnar 
CommitterDate: Mon, 29 Mar 2021 15:57:02 +02:00

locking/rtmutex: Remove output from deadlock detector

The rtmutex specific deadlock detector predates lockdep coverage of rtmutex
and since commit f5694788ad8da ("rt_mutex: Add lockdep annotations") it
contains a lot of redundant functionality:

 - lockdep will detect a potential deadlock before rtmutex-debug
   has a chance to do so

 - the deadlock debugging is restricted to rtmutexes which are not
   associated to futexes and have an active waiter, which is covered by
   lockdep already

Remove the redundant functionality and move the actual deadlock WARN() into the
deadlock code path. The latter needs a separate cleanup.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210326153943.320398...@linutronix.de
---
 include/linux/rtmutex.h |  7 +--
 kernel/locking/rtmutex-debug.c  | 97 +
 kernel/locking/rtmutex-debug.h  |  9 +---
 kernel/locking/rtmutex.c|  7 +--
 kernel/locking/rtmutex.h|  7 +--
 kernel/locking/rtmutex_common.h |  4 +-
 6 files changed, 1 insertion(+), 130 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 48b334b..0725c4b 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -31,9 +31,6 @@ struct rt_mutex {
raw_spinlock_t  wait_lock;
struct rb_root_cached   waiters;
struct task_struct  *owner;
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-   const char  *name;
-#endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map  dep_map;
 #endif
@@ -56,8 +53,6 @@ struct hrtimer_sleeper;
 #endif
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
-# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
-   , .name = #mutexname
 
 # define rt_mutex_init(mutex) \
 do { \
@@ -67,7 +62,6 @@ do { \
 
  extern void rt_mutex_debug_task_free(struct task_struct *tsk);
 #else
-# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname)
# define rt_mutex_init(mutex)  __rt_mutex_init(mutex, NULL, NULL)
 # define rt_mutex_debug_task_free(t)   do { } while (0)
 #endif
@@ -83,7 +77,6 @@ do { \
{ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
, .waiters = RB_ROOT_CACHED \
, .owner = NULL \
-   __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
__DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)}
 
 #define DEFINE_RT_MUTEX(mutexname) \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 7e411b9..fb15010 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -32,105 +32,12 @@
 
 #include "rtmutex_common.h"
 
-static void printk_task(struct task_struct *p)
-{
-   if (p)
-   printk("%16s:%5d [%p, %3d]", p->comm, task_pid_nr(p), p, 
p->prio);
-   else
-   printk("");
-}
-
-static void printk_lock(struct rt_mutex *lock, int print_owner)
-{
-   printk(" [%p] {%s}\n", lock, lock->name);
-
-   if (print_owner && rt_mutex_owner(lock)) {
-   printk(".. ->owner: %p\n", lock->owner);
-   printk(".. held by:  ");
-   printk_task(rt_mutex_owner(lock));
-   printk("\n");
-   }
-}
-
 void rt_mutex_debug_task_free(struct task_struct *task)
 {
	DEBUG_LOCKS_WARN_ON(!RB_EMPTY_ROOT(&task->pi_waiters.rb_root));
DEBUG_LOCKS_WARN_ON(task->pi_blocked_on);
 }
 
-/*
- * We fill out the fields in the waiter to store the information about
- * the deadlock. We print when we return. act_waiter can be NULL in
- * case of a remove waiter operation.
- */
-void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
-struct rt_mutex_waiter *act_waiter,
-struct rt_mutex *lock)
-{
-   struct task_struct *task;
-
-   if (!debug_locks || chwalk == RT_MUTEX_FULL_CHAINWALK || !act_waiter)
-   return;
-
-   task = rt_mutex_owner(act_waiter->lock);
-   if (task && task != current) {
-   act_waiter->deadlock_task_pid = get_pid(task_pid(task));
-   act_waiter->deadlock_lock = lock;
-   }
-}
-
-void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter)
-{
-   struct task_struct *task;
-
-   if (!waiter->deadlock_lock || !debug_locks)
-   return;
-
-   rcu_read_lock();
-   task = pid_task(waiter->deadlock_task_pid, PI

[tip: locking/core] locking/rtmutex: Remove rtmutex deadlock tester leftovers

2021-03-29 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 2d445c3e4a8216cfa9703998124c13250cc13e5e
Gitweb:
https://git.kernel.org/tip/2d445c3e4a8216cfa9703998124c13250cc13e5e
Author:Sebastian Andrzej Siewior 
AuthorDate:Fri, 26 Mar 2021 16:29:31 +01:00
Committer: Ingo Molnar 
CommitterDate: Mon, 29 Mar 2021 15:57:02 +02:00

locking/rtmutex: Remove rtmutex deadlock tester leftovers

The following debug members of 'struct rtmutex' are unused:

 - save_state: No users

 - file,line: Printed if ::name is NULL. This is only used for non-futex
  locks so ::name is never NULL

 - magic: Assigned to NULL by rt_mutex_destroy(), no further usage

Remove them along with unused inline and macro leftovers related to
the long gone deadlock tester.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210326153943.195064...@linutronix.de
---
 include/linux/rtmutex.h | 7 ++-
 kernel/locking/rtmutex-debug.c  | 7 +--
 kernel/locking/rtmutex-debug.h  | 2 --
 kernel/locking/rtmutex.c| 3 ---
 kernel/locking/rtmutex.h| 2 --
 kernel/locking/rtmutex_common.h | 1 -
 6 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 32f4a35..48b334b 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -32,10 +32,7 @@ struct rt_mutex {
struct rb_root_cached   waiters;
struct task_struct  *owner;
 #ifdef CONFIG_DEBUG_RT_MUTEXES
-   int save_state;
-   const char  *name, *file;
-   int line;
-   void*magic;
+   const char  *name;
 #endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map  dep_map;
@@ -60,7 +57,7 @@ struct hrtimer_sleeper;
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
 # define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
-   , .name = #mutexname, .file = __FILE__, .line = __LINE__
+   , .name = #mutexname
 
 # define rt_mutex_init(mutex) \
 do { \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 36e6910..7e411b9 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -42,12 +42,7 @@ static void printk_task(struct task_struct *p)
 
 static void printk_lock(struct rt_mutex *lock, int print_owner)
 {
-   if (lock->name)
-   printk(" [%p] {%s}\n",
-   lock, lock->name);
-   else
-   printk(" [%p] {%s:%d}\n",
-   lock, lock->file, lock->line);
+   printk(" [%p] {%s}\n", lock, lock->name);
 
if (print_owner && rt_mutex_owner(lock)) {
printk(".. ->owner: %p\n", lock->owner);
diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h
index fc54971..772c9b0 100644
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -22,8 +22,6 @@ extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
struct rt_mutex_waiter *waiter,
struct rt_mutex *lock);
 extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter);
-# define debug_rt_mutex_reset_waiter(w)\
-   do { (w)->deadlock_lock = NULL; } while (0)
 
 static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
  enum rtmutex_chainwalk walk)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ca93e5d..11abc60 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1594,9 +1594,6 @@ void __sched rt_mutex_futex_unlock(struct rt_mutex *lock)
 void rt_mutex_destroy(struct rt_mutex *lock)
 {
WARN_ON(rt_mutex_is_locked(lock));
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-   lock->magic = NULL;
-#endif
 }
 EXPORT_SYMBOL_GPL(rt_mutex_destroy);
 
diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h
index 732f96a..4dbdec1 100644
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -11,7 +11,6 @@
  * Non-debug version.
  */
 
-#define rt_mutex_deadlock_check(l) (0)
 #define debug_rt_mutex_init_waiter(w)  do { } while (0)
 #define debug_rt_mutex_free_waiter(w)  do { } while (0)
 #define debug_rt_mutex_lock(l) do { } while (0)
@@ -21,7 +20,6 @@
 #define debug_rt_mutex_init(m, n, k)   do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)   do { } while (0)
 #define debug_rt_mutex_print_deadlock(w)   do { } while (0)
-#define debug_rt_mutex_reset_waiter(w) do { } while (0)
 
 static inline void rt_mutex_print_deadlock(struct r

[tip: locking/core] locking/rtmutex: Consolidate rt_mutex_init()

2021-03-29 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 199cacd1a625cfc499d624b98b10dc763062f7dd
Gitweb:
https://git.kernel.org/tip/199cacd1a625cfc499d624b98b10dc763062f7dd
Author:Sebastian Andrzej Siewior 
AuthorDate:Fri, 26 Mar 2021 16:29:33 +01:00
Committer: Ingo Molnar 
CommitterDate: Mon, 29 Mar 2021 15:57:02 +02:00

locking/rtmutex: Consolidate rt_mutex_init()

rt_mutex_init() only initializes lockdep if CONFIG_DEBUG_RT_MUTEXES is
enabled, which is fine because all lockdep variants select it, but there is
no reason to do so.

Move the function outside of the CONFIG_DEBUG_RT_MUTEXES block which
removes #ifdeffery.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210326153943.437405...@linutronix.de
---
 include/linux/rtmutex.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 0725c4b..243fabc 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -43,6 +43,7 @@ struct hrtimer_sleeper;
  extern int rt_mutex_debug_check_no_locks_freed(const void *from,
unsigned long len);
  extern void rt_mutex_debug_check_no_locks_held(struct task_struct *task);
+ extern void rt_mutex_debug_task_free(struct task_struct *tsk);
 #else
  static inline int rt_mutex_debug_check_no_locks_freed(const void *from,
   unsigned long len)
@@ -50,22 +51,15 @@ struct hrtimer_sleeper;
return 0;
  }
 # define rt_mutex_debug_check_no_locks_held(task)  do { } while (0)
+# define rt_mutex_debug_task_free(t)   do { } while (0)
 #endif
 
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-
-# define rt_mutex_init(mutex) \
+#define rt_mutex_init(mutex) \
 do { \
static struct lock_class_key __key; \
__rt_mutex_init(mutex, __func__, &__key); \
 } while (0)
 
- extern void rt_mutex_debug_task_free(struct task_struct *tsk);
-#else
-# define rt_mutex_init(mutex)  __rt_mutex_init(mutex, NULL, NULL)
-# define rt_mutex_debug_task_free(t)   do { } while (0)
-#endif
-
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
, .dep_map = { .name = #mutexname }


[tip: irq/core] drm/i915: Use tasklet_unlock_spin_wait() in __tasklet_disable_sync_once()

2021-03-25 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: 6e457914935a3161eeb74e319abf9fd511aa1e4d
Gitweb:
https://git.kernel.org/tip/6e457914935a3161eeb74e319abf9fd511aa1e4d
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 23 Mar 2021 10:22:21 +01:00
Committer: Thomas Gleixner 
CommitterDate: Thu, 25 Mar 2021 18:21:03 +01:00

drm/i915: Use tasklet_unlock_spin_wait() in __tasklet_disable_sync_once()

The i915 driver has its own tasklet interface which was overlooked in the
tasklet rework. __tasklet_disable_sync_once() is a wrapper around
tasklet_unlock_wait(). tasklet_unlock_wait() might sleep, but the i915
wrapper invokes it from non-preemptible contexts with bottom halves disabled.

Use tasklet_unlock_spin_wait() instead which can be invoked from
non-preemptible contexts.

Fixes: da044747401fc ("tasklets: Replace spin wait in tasklet_unlock_wait()")
Reported-by: kernel test robot 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Link: https://lore.kernel.org/r/20210323092221.awq7g5b2muzypjw3@flow
---
 drivers/gpu/drm/i915/i915_gem.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index e622aee..440c35f 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
@@ -105,7 +105,7 @@ static inline bool tasklet_is_locked(const struct tasklet_struct *t)
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
	if (!atomic_fetch_inc(&t->count))
-   tasklet_unlock_wait(t);
+   tasklet_unlock_spin_wait(t);
 }
 
 static inline bool __tasklet_is_enabled(const struct tasklet_struct *t)


[PATCH] locking/rtmutex: Use correct define for debug

2021-03-24 Thread Sebastian Andrzej Siewior
There is no such thing as CONFIG_RT_MUTEX_DEBUG. The debugging code
for rtmutex is hiding behind CONFIG_DEBUG_RT_MUTEXES.

Use CONFIG_DEBUG_RT_MUTEXES for debugging.

Fixes: 3ac7d0ecf0e18 ("locking/rtmutex: Restrict the trylock WARN_ON() to debug")
Signed-off-by: Sebastian Andrzej Siewior 
---
 kernel/locking/rtmutex.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index d584e32dae50a..56f1278150637 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1504,7 +1504,7 @@ int __sched rt_mutex_trylock(struct rt_mutex *lock)
 {
int ret;
 
-   if (IS_ENABLED(CONFIG_RT_MUTEX_DEBUG) && WARN_ON_ONCE(!in_task()))
+   if (IS_ENABLED(CONFIG_DEBUG_RT_MUTEXES) && WARN_ON_ONCE(!in_task()))
return 0;
 
ret = rt_mutex_fasttrylock(lock, rt_mutex_slowtrylock);
-- 
2.31.0



[tip: locking/core] locking/rtmutex: Remove rt_mutex_timed_lock()

2021-03-24 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: ba8c437e7cf3c8cc92f4b68b32b6b2217d2036d9
Gitweb:
https://git.kernel.org/tip/ba8c437e7cf3c8cc92f4b68b32b6b2217d2036d9
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 23 Mar 2021 22:30:20 +01:00
Committer: Ingo Molnar 
CommitterDate: Wed, 24 Mar 2021 08:06:06 +01:00

locking/rtmutex: Remove rt_mutex_timed_lock()

rt_mutex_timed_lock() has no callers since commit:

  c051b21f71d1f ("rtmutex: Confine deadlock logic to futex")

Remove it.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210323213707.465154...@linutronix.de
---
 include/linux/rtmutex.h  |  3 +---
 kernel/locking/rtmutex.c | 46 +---
 2 files changed, 49 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 6fd615a..32f4a35 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -115,9 +115,6 @@ extern void rt_mutex_lock(struct rt_mutex *lock);
 #endif
 
 extern int rt_mutex_lock_interruptible(struct rt_mutex *lock);
-extern int rt_mutex_timed_lock(struct rt_mutex *lock,
-  struct hrtimer_sleeper *timeout);
-
 extern int rt_mutex_trylock(struct rt_mutex *lock);
 
 extern void rt_mutex_unlock(struct rt_mutex *lock);
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index db31bce..ca93e5d 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1395,21 +1395,6 @@ rt_mutex_fastlock(struct rt_mutex *lock, int state,
 }
 
 static inline int
-rt_mutex_timed_fastlock(struct rt_mutex *lock, int state,
-   struct hrtimer_sleeper *timeout,
-   enum rtmutex_chainwalk chwalk,
-   int (*slowfn)(struct rt_mutex *lock, int state,
- struct hrtimer_sleeper *timeout,
- enum rtmutex_chainwalk chwalk))
-{
-   if (chwalk == RT_MUTEX_MIN_CHAINWALK &&
-   likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
-   return 0;
-
-   return slowfn(lock, state, timeout, chwalk);
-}
-
-static inline int
 rt_mutex_fasttrylock(struct rt_mutex *lock,
 int (*slowfn)(struct rt_mutex *lock))
 {
@@ -1517,37 +1502,6 @@ int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
 }
 
 /**
- * rt_mutex_timed_lock - lock a rt_mutex interruptible
- * the timeout structure is provided
- * by the caller
- *
- * @lock:  the rt_mutex to be locked
- * @timeout:   timeout structure or NULL (no timeout)
- *
- * Returns:
- *  0  on success
- * -EINTR  when interrupted by a signal
- * -ETIMEDOUT  when the timeout expired
- */
-int
-rt_mutex_timed_lock(struct rt_mutex *lock, struct hrtimer_sleeper *timeout)
-{
-   int ret;
-
-   might_sleep();
-
-   mutex_acquire(&lock->dep_map, 0, 0, _RET_IP_);
-   ret = rt_mutex_timed_fastlock(lock, TASK_INTERRUPTIBLE, timeout,
-  RT_MUTEX_MIN_CHAINWALK,
-  rt_mutex_slowlock);
-   if (ret)
-   mutex_release(&lock->dep_map, _RET_IP_);
-
-   return ret;
-}
-EXPORT_SYMBOL_GPL(rt_mutex_timed_lock);
-
-/**
  * rt_mutex_trylock - try to lock a rt_mutex
  *
  * @lock:  the rt_mutex to be locked


[tip: locking/core] locking/rtmutex: Consolidate rt_mutex_init()

2021-03-24 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 1ba7cf8e3a9316daa84ee774346028f84bca2957
Gitweb:
https://git.kernel.org/tip/1ba7cf8e3a9316daa84ee774346028f84bca2957
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 23 Mar 2021 22:30:23 +01:00
Committer: Ingo Molnar 
CommitterDate: Wed, 24 Mar 2021 08:06:07 +01:00

locking/rtmutex: Consolidate rt_mutex_init()

rt_mutex_init() only initializes lockdep if CONFIG_DEBUG_RT_MUTEXES=y,
which is fine because all lockdep variants select it, but there is
no reason to do so.

Move the function outside of the CONFIG_DEBUG_RT_MUTEXES block which
removes #ifdeffery.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210323213707.896403...@linutronix.de
---
 include/linux/rtmutex.h | 12 +++-
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 0725c4b..243fabc 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -43,6 +43,7 @@ struct hrtimer_sleeper;
  extern int rt_mutex_debug_check_no_locks_freed(const void *from,
unsigned long len);
  extern void rt_mutex_debug_check_no_locks_held(struct task_struct *task);
+ extern void rt_mutex_debug_task_free(struct task_struct *tsk);
 #else
  static inline int rt_mutex_debug_check_no_locks_freed(const void *from,
   unsigned long len)
@@ -50,22 +51,15 @@ struct hrtimer_sleeper;
return 0;
  }
 # define rt_mutex_debug_check_no_locks_held(task)  do { } while (0)
+# define rt_mutex_debug_task_free(t)   do { } while (0)
 #endif
 
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-
-# define rt_mutex_init(mutex) \
+#define rt_mutex_init(mutex) \
 do { \
static struct lock_class_key __key; \
__rt_mutex_init(mutex, __func__, &__key); \
 } while (0)
 
- extern void rt_mutex_debug_task_free(struct task_struct *tsk);
-#else
-# define rt_mutex_init(mutex)  __rt_mutex_init(mutex, NULL, NULL)
-# define rt_mutex_debug_task_free(t)   do { } while (0)
-#endif
-
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
 #define __DEP_MAP_RT_MUTEX_INITIALIZER(mutexname) \
, .dep_map = { .name = #mutexname }


[tip: locking/core] locking/rtmutex: Remove output from deadlock detector.

2021-03-24 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 5389737bf9b1d430143a15c37c6ee2d89a75
Gitweb:
https://git.kernel.org/tip/5389737bf9b1d430143a15c37c6ee2d89a75
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 23 Mar 2021 22:30:22 +01:00
Committer: Ingo Molnar 
CommitterDate: Wed, 24 Mar 2021 08:06:07 +01:00

locking/rtmutex: Remove output from deadlock detector.

The rtmutex specific deadlock detector predates lockdep coverage of rtmutex
and since the following commit it contains a lot of redundant functionality:

  f5694788ad8da ("rt_mutex: Add lockdep annotations")

 - lockdep will detect a potential deadlock before rtmutex-debug
   has a chance to do so

 - the deadlock debugging is restricted to rtmutexes which are not
   associated to futexes and have an active waiter, which is covered by
   lockdep already

Remove the redundant functionality and move the actual deadlock WARN() into the
deadlock code path. The latter needs a separate cleanup.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210323213707.773152...@linutronix.de
---
 include/linux/rtmutex.h |  7 +--
 kernel/locking/rtmutex-debug.c  | 97 +
 kernel/locking/rtmutex-debug.h  |  9 +---
 kernel/locking/rtmutex.c|  7 +--
 kernel/locking/rtmutex.h|  7 +--
 kernel/locking/rtmutex_common.h |  4 +-
 6 files changed, 1 insertion(+), 130 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 48b334b..0725c4b 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -31,9 +31,6 @@ struct rt_mutex {
raw_spinlock_t  wait_lock;
struct rb_root_cached   waiters;
struct task_struct  *owner;
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-   const char  *name;
-#endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map  dep_map;
 #endif
@@ -56,8 +53,6 @@ struct hrtimer_sleeper;
 #endif
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
-# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
-   , .name = #mutexname
 
 # define rt_mutex_init(mutex) \
 do { \
@@ -67,7 +62,6 @@ do { \
 
  extern void rt_mutex_debug_task_free(struct task_struct *tsk);
 #else
-# define __DEBUG_RT_MUTEX_INITIALIZER(mutexname)
# define rt_mutex_init(mutex)  __rt_mutex_init(mutex, NULL, NULL)
 # define rt_mutex_debug_task_free(t)   do { } while (0)
 #endif
@@ -83,7 +77,6 @@ do { \
{ .wait_lock = __RAW_SPIN_LOCK_UNLOCKED(mutexname.wait_lock) \
, .waiters = RB_ROOT_CACHED \
, .owner = NULL \
-   __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
__DEP_MAP_RT_MUTEX_INITIALIZER(mutexname)}
 
 #define DEFINE_RT_MUTEX(mutexname) \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 7e411b9..fb15010 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -32,105 +32,12 @@
 
 #include "rtmutex_common.h"
 
-static void printk_task(struct task_struct *p)
-{
-   if (p)
-   printk("%16s:%5d [%p, %3d]", p->comm, task_pid_nr(p), p, 
p->prio);
-   else
-   printk("");
-}
-
-static void printk_lock(struct rt_mutex *lock, int print_owner)
-{
-   printk(" [%p] {%s}\n", lock, lock->name);
-
-   if (print_owner && rt_mutex_owner(lock)) {
-   printk(".. ->owner: %p\n", lock->owner);
-   printk(".. held by:  ");
-   printk_task(rt_mutex_owner(lock));
-   printk("\n");
-   }
-}
-
 void rt_mutex_debug_task_free(struct task_struct *task)
 {
DEBUG_LOCKS_WARN_ON(!RB_EMPTY_ROOT(&task->pi_waiters.rb_root));
DEBUG_LOCKS_WARN_ON(task->pi_blocked_on);
 }
 
-/*
- * We fill out the fields in the waiter to store the information about
- * the deadlock. We print when we return. act_waiter can be NULL in
- * case of a remove waiter operation.
- */
-void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
-struct rt_mutex_waiter *act_waiter,
-struct rt_mutex *lock)
-{
-   struct task_struct *task;
-
-   if (!debug_locks || chwalk == RT_MUTEX_FULL_CHAINWALK || !act_waiter)
-   return;
-
-   task = rt_mutex_owner(act_waiter->lock);
-   if (task && task != current) {
-   act_waiter->deadlock_task_pid = get_pid(task_pid(task));
-   act_waiter->deadlock_lock = lock;
-   }
-}
-
-void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter)
-{
-   struct task_struct *task;
-
-   if (!waiter->deadlock_lock || !debug_locks)
-   return;
-
-   rcu_read_lock();
-   task = pid_task(waiter->deadlock_task_pid, PIDTYPE_PID);
-   if (!task) {
-   

[tip: locking/core] locking/rtmutex: Remove rtmutex deadlock tester leftovers

2021-03-24 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: ff033d34ec7244e6d86b4bbdd917ec0d54299f31
Gitweb:
https://git.kernel.org/tip/ff033d34ec7244e6d86b4bbdd917ec0d54299f31
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 23 Mar 2021 22:30:21 +01:00
Committer: Ingo Molnar 
CommitterDate: Wed, 24 Mar 2021 08:06:07 +01:00

locking/rtmutex: Remove rtmutex deadlock tester leftovers

The following debug members of struct rtmutex are unused:

 - save_state: No users

 - file,line:  Printed if ::name is NULL. This is only used for non-futex
   locks so ::name is never NULL

 - magic:  Assigned to NULL by rt_mutex_destroy(), no further usage

Remove them along with unused inlines and macros leftovers related to
the long gone deadlock tester.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Signed-off-by: Ingo Molnar 
Link: https://lore.kernel.org/r/20210323213707.630525...@linutronix.de
---
 include/linux/rtmutex.h | 7 ++-
 kernel/locking/rtmutex-debug.c  | 7 +--
 kernel/locking/rtmutex-debug.h  | 2 --
 kernel/locking/rtmutex.c| 3 ---
 kernel/locking/rtmutex.h| 2 --
 kernel/locking/rtmutex_common.h | 1 -
 6 files changed, 3 insertions(+), 19 deletions(-)

diff --git a/include/linux/rtmutex.h b/include/linux/rtmutex.h
index 32f4a35..48b334b 100644
--- a/include/linux/rtmutex.h
+++ b/include/linux/rtmutex.h
@@ -32,10 +32,7 @@ struct rt_mutex {
struct rb_root_cached   waiters;
struct task_struct  *owner;
 #ifdef CONFIG_DEBUG_RT_MUTEXES
-   int save_state;
-   const char  *name, *file;
-   int line;
-   void*magic;
+   const char  *name;
 #endif
 #ifdef CONFIG_DEBUG_LOCK_ALLOC
struct lockdep_map  dep_map;
@@ -60,7 +57,7 @@ struct hrtimer_sleeper;
 
 #ifdef CONFIG_DEBUG_RT_MUTEXES
 # define __DEBUG_RT_MUTEX_INITIALIZER(mutexname) \
-   , .name = #mutexname, .file = __FILE__, .line = __LINE__
+   , .name = #mutexname
 
 # define rt_mutex_init(mutex) \
 do { \
diff --git a/kernel/locking/rtmutex-debug.c b/kernel/locking/rtmutex-debug.c
index 36e6910..7e411b9 100644
--- a/kernel/locking/rtmutex-debug.c
+++ b/kernel/locking/rtmutex-debug.c
@@ -42,12 +42,7 @@ static void printk_task(struct task_struct *p)
 
 static void printk_lock(struct rt_mutex *lock, int print_owner)
 {
-   if (lock->name)
-   printk(" [%p] {%s}\n",
-   lock, lock->name);
-   else
-   printk(" [%p] {%s:%d}\n",
-   lock, lock->file, lock->line);
+   printk(" [%p] {%s}\n", lock, lock->name);
 
if (print_owner && rt_mutex_owner(lock)) {
printk(".. ->owner: %p\n", lock->owner);
diff --git a/kernel/locking/rtmutex-debug.h b/kernel/locking/rtmutex-debug.h
index fc54971..772c9b0 100644
--- a/kernel/locking/rtmutex-debug.h
+++ b/kernel/locking/rtmutex-debug.h
@@ -22,8 +22,6 @@ extern void debug_rt_mutex_deadlock(enum rtmutex_chainwalk chwalk,
struct rt_mutex_waiter *waiter,
struct rt_mutex *lock);
 extern void debug_rt_mutex_print_deadlock(struct rt_mutex_waiter *waiter);
-# define debug_rt_mutex_reset_waiter(w)\
-   do { (w)->deadlock_lock = NULL; } while (0)
 
static inline bool debug_rt_mutex_detect_deadlock(struct rt_mutex_waiter *waiter,
  enum rtmutex_chainwalk walk)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index ca93e5d..11abc60 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1594,9 +1594,6 @@ void __sched rt_mutex_futex_unlock(struct rt_mutex *lock)
 void rt_mutex_destroy(struct rt_mutex *lock)
 {
WARN_ON(rt_mutex_is_locked(lock));
-#ifdef CONFIG_DEBUG_RT_MUTEXES
-   lock->magic = NULL;
-#endif
 }
 EXPORT_SYMBOL_GPL(rt_mutex_destroy);
 
diff --git a/kernel/locking/rtmutex.h b/kernel/locking/rtmutex.h
index 732f96a..4dbdec1 100644
--- a/kernel/locking/rtmutex.h
+++ b/kernel/locking/rtmutex.h
@@ -11,7 +11,6 @@
  * Non-debug version.
  */
 
-#define rt_mutex_deadlock_check(l) (0)
 #define debug_rt_mutex_init_waiter(w)  do { } while (0)
 #define debug_rt_mutex_free_waiter(w)  do { } while (0)
 #define debug_rt_mutex_lock(l) do { } while (0)
@@ -21,7 +20,6 @@
 #define debug_rt_mutex_init(m, n, k)   do { } while (0)
 #define debug_rt_mutex_deadlock(d, a ,l)   do { } while (0)
 #define debug_rt_mutex_print_deadlock(w)   do { } while (0)
-#define debug_rt_mutex_reset_waiter(w) do { } while (0)
 
 static inline void rt_mutex_print_deadlock(struct rt_mutex_waiter *w)
 {
diff --gi

[PATCH] drm/i915: Use tasklet_unlock_spin_wait() in __tasklet_disable_sync_once()

2021-03-23 Thread Sebastian Andrzej Siewior
The i915 driver has its own tasklet interface which was overlooked in the
tasklet rework. __tasklet_disable_sync_once() is a wrapper around
tasklet_unlock_wait(). tasklet_unlock_wait() might sleep, but the i915
wrappers invoke it from non-preemptible contexts with bottom halves disabled.

Use tasklet_unlock_spin_wait() instead which can be invoked from
non-preemptible contexts.

Fixes: da044747401fc ("tasklets: Replace spin wait in tasklet_unlock_wait()")
Reported-by: kernel test robot 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/gpu/drm/i915/i915_gem.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.h b/drivers/gpu/drm/i915/i915_gem.h
index e622aee6e4be9..440c35f1abc9e 100644
--- a/drivers/gpu/drm/i915/i915_gem.h
+++ b/drivers/gpu/drm/i915/i915_gem.h
static inline bool tasklet_is_locked(const struct tasklet_struct *t)
 static inline void __tasklet_disable_sync_once(struct tasklet_struct *t)
 {
if (!atomic_fetch_inc(&t->count))
-   tasklet_unlock_wait(t);
+   tasklet_unlock_spin_wait(t);
 }
 
 static inline bool __tasklet_is_enabled(const struct tasklet_struct *t)
-- 
2.31.0



Re: [PATCH] serial: imx: drop workaround for forced irq threading

2021-03-23 Thread Sebastian Andrzej Siewior
On 2021-03-23 08:34:47 [+0100], Uwe Kleine-König wrote:
> Hello Sebastian,
Hi Uwe,

> On Mon, Mar 22, 2021 at 09:48:36PM +0100, Sebastian Andrzej Siewior wrote:
> > On 2021-03-22 14:40:32 [+0100], Uwe Kleine-König wrote:
> > > From a strictly logical point of view you indeed cannot. But if you go
> > > to the street and say to people there that they can park their car in
> > > this street free of charge between Monday and Friday, I expect that most
> > > of them will assume that they have to pay for parking on weekends.
> > 
> > If I hear that parking is free on weekdays and paid on weekends, I
> > expect it to be a scam.
> 
> I don't feel taken seriously with this reply.

I'm sorry.

> > Uwe, the patch reverts a change which was needed for !RT + threadirqs.
> 
> This would be a useful information for the commit log.
> 
> > The commit message claims that since the referenced commit "… interrupt
> > handlers always run with interrupts disabled on non-RT… ". This has
> > nothing to do with _this_ change. It argues why the workaround is not
> > needed.
> 
> It argues why the workaround is not needed on non-RT. It might be
> obvious for someone who is firm in the RT concepts, but IMHO commit logs
> should be understandable by and make sense for a wider audience than the
> deep experts. From what I know about RT "Force-threaded interrupt
> handlers used to run with interrupts enabled" still applies there.

Yes. The commit Johan referenced explains it in more detail.

> > If the referenced commit breaks RT then this is another story.
> 
> I'm surprised to hear that from you. With the goal to get RT into
> mainline I would expect you to be happy if people consider the effects
> on RT in their reviews.

Correct, I do and I am glad if people consider other aspects of the
kernel in their review including RT.

> > > So when you said that on on-RT the reason why it used to need a
> > > workaround is gone made me wonder what that implies for RT.
> > 
> > There was never reason (or a lockdep splat) for it on RT. If so you
> > should have seen it, right?
> 
> No, I don't consider myself to be an RT expert who is aware of all the
> problems. So I admit that for me the effect on RT of the patch under
> discussion isn't obvious. I just wonder that the change is justified
> with being OK on non-RT. So it's either bad that it breaks RT *or*
> improving the commit log would be great.
> 
> And even if I had reason to believe that there is no problem with the
> commit on RT, I'd still wish that the commit log wouldn't suggest to the
> casual reader that there might be a problem.

Okay. I added a sentence. What about this rewording:

  Force-threaded interrupt handlers used to run with interrupts enabled,
  something which could lead to deadlocks in case a threaded handler
  shared a lock with code running in hard interrupt context (e.g. timer
  callbacks) and did not explicitly disable interrupts.  
  
  This was specifically the case for serial drivers that take the port
  lock in their console write path as printk can be called from hard
  interrupt context also with forced threading ("threadirqs").
  
  Since commit 81e2073c175b ("genirq: Disable interrupts for force
  threaded handlers") interrupt handlers always run with interrupts
  disabled on non-RT so that drivers no longer need to handle this.
  RT is not affected by the referenced commit, and the workaround that is
  reverted was not required because spinlock_t must not be acquired on
  RT in hardirq context.
  
  Drop the now obsolete workaround added by commit 33f16855dcb9 ("tty:
  serial: imx: fix potential deadlock").

> Best regards
> Uwe
> 

Sebastian


Re: [PATCH] serial: imx: drop workaround for forced irq threading

2021-03-22 Thread Sebastian Andrzej Siewior
On 2021-03-22 14:40:32 [+0100], Uwe Kleine-König wrote:
> From a strictly logical point of view you indeed cannot. But if you go
> to the street and say to people there that they can park their car in
> this street free of charge between Monday and Friday, I expect that most
> of them will assume that they have to pay for parking on weekends.

If I hear that parking is free on weekdays and paid on weekends, I
expect it to be a scam.

Uwe, the patch reverts a change which was needed for !RT + threadirqs.
The commit message claims that since the referenced commit "… interrupt
handlers always run with interrupts disabled on non-RT… ". This has
nothing to do with _this_ change. It argues why the workaround is not
needed.
If the referenced commit breaks RT then this is another story.

> So when you said that on non-RT the reason why it used to need a
> workaround is gone made me wonder what that implies for RT.

There was never reason (or a lockdep splat) for it on RT. If so you
should have seen it, right?

Sebastian


Re: [PATCH] USB: ehci: drop workaround for forced irq threading

2021-03-22 Thread Sebastian Andrzej Siewior
On 2021-03-22 12:42:00 [-0400], Alan Stern wrote:
> What happens on RT systems?  Are they smart enough to avoid the whole 
> problem by enabling interrupts during _all_ callbacks?

tl;dr: Yes. 

The referenced commit (id 81e2073c175b) disables interrupts only on !RT
configs so for RT everything remains unchanged (the backports are
already adjusted for the old stable trees to use the proper CONFIG_* for
enabled RT).

All hrtimer callbacks run as HRTIMER_MODE_SOFT by default. The
HRTIMER_MODE_HARD ones (which expire in HARDIRQ context) were audited /
explicitly enabled.
The same goes for irq_work.
The printk code is different compared to mainline. A printk() on RT in
HARDIRQ context is printed once the HARDIRQ context is left. So the
serial/console/… driver never gets a chance to acquire its lock in
hardirq context.

An interrupt handler which is not forced-threaded must be marked as such
and must not use any spinlock_t based locking. lockdep/might_sleep
complain here already.

> Alan Stern

Sebastian


Re: [PATCH] serial: imx: drop workaround for forced irq threading

2021-03-22 Thread Sebastian Andrzej Siewior
On 2021-03-22 12:34:02 [+0100], Uwe Kleine-König wrote:
> On Mon, Mar 22, 2021 at 12:10:36PM +0100, Johan Hovold wrote:
> > Force-threaded interrupt handlers used to run with interrupts enabled,
> > something which could lead to deadlocks in case a threaded handler
> > shared a lock with code running in hard interrupt context (e.g. timer
> > callbacks) and did not explicitly disable interrupts.
> > 
> > This was specifically the case for serial drivers that take the port
> > lock in their console write path as printk can be called from hard
> > interrupt context also with forced threading ("threadirqs").
> > 
> > Since commit 81e2073c175b ("genirq: Disable interrupts for force
> > threaded handlers") interrupt handlers always run with interrupts
> > disabled on non-RT so that drivers no longer need to handle this.
> 
> So we're breaking RT knowingly here? If this is the case I'm not happy
> with your change. (And if RT is not affected a different wording would
> be good.)

Which wording? Could you be more specific? It looks good from here and
no, RT is not affected.

> Best regards
> Uwe

Sebastian


[ANNOUNCE] v5.12-rc3-rt3

2021-03-19 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.12-rc3-rt3 patch set. 

Changes since v5.12-rc3-rt2:

  - Update the softirq/tasklet patches to the latest version which has
been merged into the tip tree. Only the comments have changed.

  - In certain conditions the SLAB_TYPESAFE_BY_RCU marked SLAB pages may
have been returned to the page-allocator without waiting for the
required grace period. The problem was introduced during the
rework in v5.11.2-rt9.

  - Update John's printk patches.
With this update I can strike 
 - kdb/kgdb can easily deadlock.

off the known issues list.

Known issues
 - netconsole triggers WARN.

The delta patch against v5.12-rc3-rt2 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.12/incr/patch-5.12-rc3-rt2-rt3.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.12-rc3-rt3

The RT patch against v5.12-rc3 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.12/older/patch-5.12-rc3-rt3.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.12/older/patches-5.12-rc3-rt3.tar.xz

Sebastian

diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index 7a13bc20f0a0c..e0ced3afc667f 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -57,6 +57,7 @@ struct smp_ops_t {
 
extern int smp_send_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
extern int smp_send_safe_nmi_ipi(int cpu, void (*fn)(struct pt_regs *), u64 delay_us);
+extern void smp_send_debugger_break_cpu(unsigned int cpu);
 extern void smp_send_debugger_break(void);
 extern void start_secondary_resume(void);
 extern void smp_generic_give_timebase(void);
diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index 409080208a6c4..1f716688c9775 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -120,11 +121,19 @@ int kgdb_skipexception(int exception, struct pt_regs *regs)
 
 static int kgdb_debugger_ipi(struct pt_regs *regs)
 {
-   kgdb_nmicallback(raw_smp_processor_id(), regs);
+   int cpu = raw_smp_processor_id();
+
+   if (!console_atomic_kgdb_cpu_delay(cpu))
+   kgdb_nmicallback(cpu, regs);
return 0;
 }
 
 #ifdef CONFIG_SMP
+void kgdb_roundup_cpu(unsigned int cpu)
+{
+   smp_send_debugger_break_cpu(cpu);
+}
+
 void kgdb_roundup_cpus(void)
 {
smp_send_debugger_break();
diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 1ef55f4b389a2..3c8d9bbb51cfa 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -73,8 +73,7 @@ static const char *nvram_os_partitions[] = {
 };
 
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason,
- struct kmsg_dumper_iter *iter);
+ enum kmsg_dump_reason reason);
 
 static struct kmsg_dumper nvram_kmsg_dumper = {
.dump = oops_to_nvram
@@ -644,11 +643,11 @@ void __init nvram_init_oops_partition(int rtas_partition_exists)
  * partition.  If that's too much, go back and capture uncompressed text.
  */
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason,
- struct kmsg_dumper_iter *iter)
+ enum kmsg_dump_reason reason)
 {
struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf;
static unsigned int oops_count = 0;
+   static struct kmsg_dump_iter iter;
static bool panicking = false;
static DEFINE_SPINLOCK(lock);
unsigned long flags;
@@ -683,13 +682,14 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
return;
 
if (big_oops_buf) {
-   kmsg_dump_get_buffer(iter, false,
+   kmsg_dump_rewind(&iter);
+   kmsg_dump_get_buffer(&iter, false,
big_oops_buf, big_oops_buf_sz, &text_len);
rc = zip_oops(text_len);
}
if (rc != 0) {
-   kmsg_dump_rewind(iter);
-   kmsg_dump_get_buffer(iter, false,
+   kmsg_dump_rewind(&iter);
+   kmsg_dump_get_buffer(&iter, false,
oops_data, oops_data_sz, &text_len);
err_type = ERR_TYPE_KERNEL_PANIC;
oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 5a4d59a1070d5..b396c6eafce38 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -582,6 +582,11 @@ static void debugger_ipi_callback(struct pt_regs *regs)
debugger_ipi(regs);
 }
 
+void 

Re: [PATCH RFC 0/3] drivers/char: remove /dev/kmem for good

2021-03-19 Thread Sebastian Andrzej Siewior
On 2021-03-19 10:14:02 [-0700], Linus Torvalds wrote:
> On Fri, Mar 19, 2021 at 7:35 AM David Hildenbrand  wrote:
> >
> > Let's start a discussion if /dev/kmem is worth keeping around and
> > fixing/maintaining or if we should just remove it now for good.
> 
> I'll happily do this for the next merge window, but would really want
> distros to confirm that they don't enable it.
> 
> I can confirm that it's certainly not enabled on any of the machines I
> have, but..

Debian has CONFIG_DEVKMEM disabled since 2.6.31.

>  Linus

Sebastian


Re: [patch 1/1] genirq: Disable interrupts for force threaded handlers

2021-03-18 Thread Sebastian Andrzej Siewior
On 2021-03-17 17:23:39 [+0100], Johan Hovold wrote:
> > > thread(irq_A)
> > >   irq_handler(A)
> > > spin_lock(>lock);
> > > 
> > > interrupt(irq_B)
> > >   irq_handler(B)
> > > spin_lock(>lock);
> > 
> > It will not because both threads will wake_up(thread).
> 
> Note that the above says "interrupt(irq_B)" suggesting it's a
> non-threaded interrupt unlike irq_A.

I missed that bit, thanks.

Sebastian


[tip: irq/core] ath9k: Use tasklet_disable_in_atomic()

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: 3250aa8a293b1859d76577714a3e1fe95732c721
Gitweb:
https://git.kernel.org/tip/3250aa8a293b1859d76577714a3e1fe95732c721
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:13 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:34:02 +01:00

ath9k: Use tasklet_disable_in_atomic()

All callers of ath9k_beacon_ensure_primary_slot() are preemptible /
acquire a mutex except for this callchain:

  spin_lock_bh(&sc->sc_pcu_lock);
  ath_complete_reset()
  -> ath9k_calculate_summary_state()
 -> ath9k_beacon_ensure_primary_slot()

It's unclear how that can be disentangled, so use tasklet_disable_in_atomic()
for now. This allows tasklet_disable() to become sleepable once the
remaining atomic users are cleaned up.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Kalle Valo 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.313899...@linutronix.de

---
 drivers/net/wireless/ath/ath9k/beacon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath9k/beacon.c b/drivers/net/wireless/ath/ath9k/beacon.c
index 71e2ada..72e2e71 100644
--- a/drivers/net/wireless/ath/ath9k/beacon.c
+++ b/drivers/net/wireless/ath/ath9k/beacon.c
@@ -251,7 +251,7 @@ void ath9k_beacon_ensure_primary_slot(struct ath_softc *sc)
int first_slot = ATH_BCBUF;
int slot;
 
-   tasklet_disable(&sc->bcon_tasklet);
+   tasklet_disable_in_atomic(&sc->bcon_tasklet);
 
/* Find first taken slot. */
for (slot = 0; slot < ATH_BCBUF; slot++) {


[tip: irq/core] atm: eni: Use tasklet_disable_in_atomic() in the send() callback

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: 405698ca359a23b1ef1a502ef2bdc4597dc6da36
Gitweb:
https://git.kernel.org/tip/405698ca359a23b1ef1a502ef2bdc4597dc6da36
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:14 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:34:02 +01:00

atm: eni: Use tasklet_disable_in_atomic() in the send() callback

The atmdev_ops::send callback which calls tasklet_disable() is invoked with
bottom halves disabled from net_device_ops::ndo_start_xmit(). All other
invocations of tasklet_disable() in this driver happen in preemptible
context.

Change the send() call to use tasklet_disable_in_atomic() which allows
tasklet_disable() to be made sleepable once the remaining atomic context
usage sites are cleaned up.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.415583...@linutronix.de

---
 drivers/atm/eni.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 316a994..e96a4e8 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -2054,7 +2054,7 @@ static int eni_send(struct atm_vcc *vcc,struct sk_buff *skb)
}
submitted++;
ATM_SKB(skb)->vcc = vcc;
-   tasklet_disable(&ENI_DEV(vcc->dev)->task);
+   tasklet_disable_in_atomic(&ENI_DEV(vcc->dev)->task);
res = do_tx(skb);
tasklet_enable(&ENI_DEV(vcc->dev)->task);
if (res == enq_ok) return 0;


[tip: irq/core] net: jme: Replace link-change tasklet with work

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: c62c38e349c73cad90f59f00fe8070b3648b6d08
Gitweb:
https://git.kernel.org/tip/c62c38e349c73cad90f59f00fe8070b3648b6d08
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:11 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:33:58 +01:00

net: jme: Replace link-change tasklet with work

The link change tasklet disables the tasklets for tx/rx processing while
updating hw parameters and then enables the tasklets again.

This update can also be pushed into a workqueue where it can be performed
in preemptible context. This allows tasklet_disable() to become sleeping.

Replace the linkch_task tasklet with a work.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.106288...@linutronix.de

---
 drivers/net/ethernet/jme.c | 10 +-
 drivers/net/ethernet/jme.h |  2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/jme.c b/drivers/net/ethernet/jme.c
index e9efe07..f1b9284 100644
--- a/drivers/net/ethernet/jme.c
+++ b/drivers/net/ethernet/jme.c
@@ -1265,9 +1265,9 @@ jme_stop_shutdown_timer(struct jme_adapter *jme)
jwrite32f(jme, JME_APMC, apmc);
 }
 
-static void jme_link_change_tasklet(struct tasklet_struct *t)
+static void jme_link_change_work(struct work_struct *work)
 {
-   struct jme_adapter *jme = from_tasklet(jme, t, linkch_task);
struct jme_adapter *jme = container_of(work, struct jme_adapter, linkch_task);
struct net_device *netdev = jme->dev;
int rc;
 
@@ -1510,7 +1510,7 @@ jme_intr_msi(struct jme_adapter *jme, u32 intrstat)
 * all other events are ignored
 */
jwrite32(jme, JME_IEVE, intrstat);
-   tasklet_schedule(&jme->linkch_task);
+   schedule_work(&jme->linkch_task);
goto out_reenable;
}
 
@@ -1832,7 +1832,6 @@ jme_open(struct net_device *netdev)
jme_clear_pm_disable_wol(jme);
JME_NAPI_ENABLE(jme);
 
-   tasklet_setup(&jme->linkch_task, jme_link_change_tasklet);
tasklet_setup(&jme->txclean_task, jme_tx_clean_tasklet);
tasklet_setup(&jme->rxclean_task, jme_rx_clean_tasklet);
tasklet_setup(&jme->rxempty_task, jme_rx_empty_tasklet);
@@ -1920,7 +1919,7 @@ jme_close(struct net_device *netdev)
 
JME_NAPI_DISABLE(jme);
 
-   tasklet_kill(&jme->linkch_task);
+   cancel_work_sync(&jme->linkch_task);
tasklet_kill(&jme->txclean_task);
tasklet_kill(&jme->rxclean_task);
tasklet_kill(&jme->rxempty_task);
@@ -3035,6 +3034,7 @@ jme_init_one(struct pci_dev *pdev,
atomic_set(&jme->rx_empty, 1);

tasklet_setup(&jme->pcc_task, jme_pcc_tasklet);
+   INIT_WORK(&jme->linkch_task, jme_link_change_work);
jme->dpi.cur = PCC_P1;
 
jme->reg_ghc = 0;
diff --git a/drivers/net/ethernet/jme.h b/drivers/net/ethernet/jme.h
index a2c3b00..2af7632 100644
--- a/drivers/net/ethernet/jme.h
+++ b/drivers/net/ethernet/jme.h
@@ -411,7 +411,7 @@ struct jme_adapter {
struct tasklet_struct   rxempty_task;
struct tasklet_struct   rxclean_task;
struct tasklet_struct   txclean_task;
-   struct tasklet_struct   linkch_task;
+   struct work_struct  linkch_task;
struct tasklet_struct   pcc_task;
unsigned long   flags;
u32 reg_txcs;


[tip: irq/core] net: sundance: Use tasklet_disable_in_atomic().

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: 25cf87df1a3a85959bf1bf27df0eb2e6e04b2161
Gitweb:
https://git.kernel.org/tip/25cf87df1a3a85959bf1bf27df0eb2e6e04b2161
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:12 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:34:00 +01:00

net: sundance: Use tasklet_disable_in_atomic().

tasklet_disable() is used in the timer callback. This might be disentangled,
but without access to the hardware that's a bit risky.

Replace it with tasklet_disable_in_atomic() so tasklet_disable() can be
changed to a sleep wait once all remaining atomic users are converted.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.209110...@linutronix.de

---
 drivers/net/ethernet/dlink/sundance.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/dlink/sundance.c b/drivers/net/ethernet/dlink/sundance.c
index e3a8858..df0eab4 100644
--- a/drivers/net/ethernet/dlink/sundance.c
+++ b/drivers/net/ethernet/dlink/sundance.c
@@ -963,7 +963,7 @@ static void tx_timeout(struct net_device *dev, unsigned int txqueue)
unsigned long flag;
 
netif_stop_queue(dev);
-   tasklet_disable(&np->tx_tasklet);
+   tasklet_disable_in_atomic(&np->tx_tasklet);
iowrite16(0, ioaddr + IntrEnable);
printk(KERN_WARNING "%s: Transmit timed out, TxStatus %2.2x "
   "TxFrameId %2.2x,"


[tip: irq/core] PCI: hv: Use tasklet_disable_in_atomic()

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: be4017cea0aec6369275df7eafbb09682f810e7e
Gitweb:
https://git.kernel.org/tip/be4017cea0aec6369275df7eafbb09682f810e7e
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:15 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:34:03 +01:00

PCI: hv: Use tasklet_disable_in_atomic()

The hv_compose_msi_msg() callback in irq_chip::irq_compose_msi_msg is
invoked via irq_chip_compose_msi_msg(), which itself is always invoked from
atomic contexts from the guts of the interrupt core code.

There is no way to change this w/o rewriting the whole driver, so use
tasklet_disable_in_atomic() which allows to make tasklet_disable()
sleepable once the remaining atomic users are addressed.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Wei Liu 
Acked-by: Bjorn Helgaas 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.516519...@linutronix.de

---
 drivers/pci/controller/pci-hyperv.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 27a17a1..a313708 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1458,7 +1458,7 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 * Prevents hv_pci_onchannelcallback() from running concurrently
 * in the tasklet.
 */
-   tasklet_disable(&channel->callback_event);
+   tasklet_disable_in_atomic(&channel->callback_event);
 
/*
 * Since this function is called with IRQ locks held, can't


[tip: irq/core] firewire: ohci: Use tasklet_disable_in_atomic() where required

2021-03-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the irq/core branch of tip:

Commit-ID: f339fc16fba0167d67c4026678ef4c405bca3085
Gitweb:
https://git.kernel.org/tip/f339fc16fba0167d67c4026678ef4c405bca3085
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 09 Mar 2021 09:42:16 +01:00
Committer: Thomas Gleixner 
CommitterDate: Wed, 17 Mar 2021 16:34:05 +01:00

firewire: ohci: Use tasklet_disable_in_atomic() where required

tasklet_disable() is invoked in several places. Some of them are in atomic
context which prevents a conversion of tasklet_disable() to a sleepable
function.

The atomic callchains are:

 ar_context_tasklet()
   ohci_cancel_packet()
 tasklet_disable()

 ...
   ohci_flush_iso_completions()
 tasklet_disable()

The invocation of tasklet_disable() from at_context_flush() is always in
preemptible context.

Use tasklet_disable_in_atomic() for the two invocations in
ohci_cancel_packet() and ohci_flush_iso_completions().

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Thomas Gleixner 
Acked-by: Peter Zijlstra (Intel) 
Link: https://lore.kernel.org/r/20210309084242.616379...@linutronix.de

---
 drivers/firewire/ohci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 9811c40..17c9d82 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -2545,7 +2545,7 @@ static int ohci_cancel_packet(struct fw_card *card, struct fw_packet *packet)
struct driver_data *driver_data = packet->driver_data;
int ret = -ENOENT;
 
-   tasklet_disable(&ctx->tasklet);
+   tasklet_disable_in_atomic(&ctx->tasklet);
 
if (packet->ack != 0)
goto out;
@@ -3465,7 +3465,7 @@ static int ohci_flush_iso_completions(struct fw_iso_context *base)
struct iso_context *ctx = container_of(base, struct iso_context, base);
int ret = 0;
 
-   tasklet_disable(&ctx->context.tasklet);
+   tasklet_disable_in_atomic(&ctx->context.tasklet);
 
if (!test_and_set_bit_lock(0, &ctx->flushing_completions)) {
context_tasklet((unsigned long)&ctx->context);


Re: [patch 1/1] genirq: Disable interrupts for force threaded handlers

2021-03-17 Thread Sebastian Andrzej Siewior
On 2021-03-17 15:38:52 [+0100], Thomas Gleixner wrote:
> With interrupt force threading all device interrupt handlers are invoked
> from kernel threads. Contrary to hard interrupt context the invocation only
> disables bottom halfs, but not interrupts. This was an oversight back then
> because any code like this will have an issue:
> 
> thread(irq_A)
>   irq_handler(A)
> spin_lock(&foo->lock);
> 
> interrupt(irq_B)
>   irq_handler(B)
> spin_lock(&foo->lock);

It will not because both threads will wake_up(thread). It is an issue
- if &foo->lock is shared between a hrtimer and a threaded-IRQ
- if &foo->lock is shared between a non-threaded and a threaded-IRQ
- if &foo->lock is shared between a printk() in hardirq context and a
  threaded-IRQ, as I learned today.

> This has been triggered with networking (NAPI vs. hrtimers) and console
> drivers where printk() happens from an interrupt which interrupted the
> force threaded handler.
> 
> Now people noticed and started to change the spin_lock() in the handler to
> spin_lock_irqsave() which affects performance or add IRQF_NOTHREAD to the
> interrupt request which in turn breaks RT.
> 
> Fix the root cause and not the symptom and disable interrupts before
> invoking the force threaded handler which preserves the regular semantics
> and the usefulness of the interrupt force threading as a general debugging
> tool.
> 
> For not RT this is not changing much, except that during the execution of
> the threaded handler interrupts are delayed until the handler
> returns. Vs. scheduling and softirq processing there is no difference.
> 
> For RT kernels there is no issue.

Acked-by: Sebastian Andrzej Siewior 

> Fixes: 8d32a307e4fa ("genirq: Provide forced interrupt threading")
> Reported-by: Johan Hovold 
> Signed-off-by: Thomas Gleixner 
> Cc: Eric Dumazet 
> Cc: Sebastian Andrzej Siewior 
> Cc: netdev 
> Cc: "David S. Miller" 
> Cc: Krzysztof Kozlowski 
> Cc: Greg Kroah-Hartman 
> Cc: Andy Shevchenko 
> CC: Peter Zijlstra 
> Cc: linux-ser...@vger.kernel.org
> Cc: netdev 
> ---
>  kernel/irq/manage.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1142,11 +1142,15 @@ irq_forced_thread_fn(struct irq_desc *de
>   irqreturn_t ret;
>  
>   local_bh_disable();
> + if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> + local_irq_disable();
>   ret = action->thread_fn(action->irq, action->dev_id);
>   if (ret == IRQ_HANDLED)
>   atomic_inc(&action->threads_handled);
>  
>   irq_finalize_oneshot(desc, action);
> + if (!IS_ENABLED(CONFIG_PREEMPT_RT))
> + local_irq_enable();
>   local_bh_enable();
>   return ret;
>  }

Sebastian


[ANNOUNCE] v5.12-rc2-rt1

2021-03-12 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.12-rc2-rt1 patch set. 

Changes since v5.11.4-rt11:

  - Update to v5.12-rc2.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.12-rc2-rt1

The RT patch against v5.12-rc2 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.12/older/patch-5.12-rc2-rt1.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.12/older/patches-5.12-rc2-rt1.tar.xz

Sebastian


Re: [PATCH] extcon: Provide *extcon_*_notifier_all() stubs for !CONFIG_EXTCON

2021-03-12 Thread Sebastian Andrzej Siewior
On 2021-03-12 15:53:53 [+0100], Krzysztof Kozlowski wrote:
> Yeah, it missed the merge window...

Could you please send it for -rc3?

> Best regards,
> Krzysztof

Sebastian


Re: [PATCH] extcon: Provide *extcon_*_notifier_all() stubs for !CONFIG_EXTCON

2021-03-12 Thread Sebastian Andrzej Siewior
On 2021-03-12 15:45:48 [+0100], Krzysztof Kozlowski wrote:
> Did you base your work on next?

no, -rc2.

> Best regards,
> Krzysztof

Sebastian


[PATCH] extcon: Provide *extcon_*_notifier_all() stubs for !CONFIG_EXTCON

2021-03-12 Thread Sebastian Andrzej Siewior
CHARGER_MAX8997 fails to compile without CONFIG_EXTCON. There are stubs
already present for *extcon_*_notifier() but they are missing for the _all()
variants.

Add *extcon_*_notifier_all() stubs for !CONFIG_EXTCON.

Fixes: f384989e88d44 ("power: supply: max8997_charger: Set CHARGER current limit")
Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/extcon.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/include/linux/extcon.h b/include/linux/extcon.h
index fd183fb9c20f7..8246307f9ed38 100644
--- a/include/linux/extcon.h
+++ b/include/linux/extcon.h
@@ -276,6 +276,30 @@ static inline struct extcon_dev *extcon_get_extcon_dev(const char *extcon_name)
return ERR_PTR(-ENODEV);
 }
 
+static inline int extcon_register_notifier_all(struct extcon_dev *edev,
+  struct notifier_block *nb)
+{
+   return -EINVAL;
+}
+
+static inline int extcon_unregister_notifier_all(struct extcon_dev *edev,
+struct notifier_block *nb)
+{
+   return 0;
+}
+
+static inline int devm_extcon_register_notifier_all(struct device *dev,
+   struct extcon_dev *edev,
+   struct notifier_block *nb)
+{
+   return -EINVAL;
+}
+
+static inline void devm_extcon_unregister_notifier_all(struct device *dev,
+  struct extcon_dev *edev,
+  struct notifier_block *nb)
+{ }
+
 static inline struct extcon_dev *extcon_find_edev_by_node(struct device_node *node)
 {
return ERR_PTR(-ENODEV);
-- 
2.30.2



Re: [patch V2 3/3] signal: Allow tasks to cache one sigqueue struct

2021-03-12 Thread Sebastian Andrzej Siewior
On 2021-03-11 14:20:39 [+0100], Thomas Gleixner wrote:
> --- a/kernel/signal.c
> +++ b/kernel/signal.c
> @@ -433,7 +433,11 @@ static struct sigqueue *
>   rcu_read_unlock();
>  
>   if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) {
> - q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
> + /* Preallocation does not hold sighand::siglock */
> + if (sigqueue_flags || !t->sigqueue_cache)
> + q = kmem_cache_alloc(sigqueue_cachep, gfp_flags);
> + else
> + q = xchg(&t->sigqueue_cache, NULL);

Could it happen that two tasks saw t->sigqueue_cache != NULL, the first
one got the pointer via xchg() and the second got NULL via xchg()?

>   } else {
>   print_dropped_signal(sig);
>   }
> @@ -472,12 +481,19 @@ void flush_sigqueue(struct sigpending *q
>  }
>  
>  /*
> - * Called from __exit_signal. Flush tsk->pending and clear tsk->sighand.
> + * Called from __exit_signal. Flush tsk->pending, clear tsk->sighand and
> + * free tsk->sigqueue_cache.
>   */
>  void exit_task_sighand(struct task_struct *tsk)
>  {
> + struct sigqueue *q;
> +
>   flush_sigqueue(&tsk->pending);
>   tsk->sighand = NULL;
> +
> + q = xchg(&tsk->sigqueue_cache, NULL);
> + if (q)
> + kmem_cache_free(sigqueue_cachep, q);

Do we need this xchg() here? Only the task itself adds something here
and the task is on its way out so it should not add an entry to the
cache.

>  }
>  
>  /*

Sebastian


Re: [PATCH v3] auxdisplay: Remove in_interrupt() usage.

2021-03-10 Thread Sebastian Andrzej Siewior
On 2021-02-16 21:21:07 [+0100], Miguel Ojeda wrote:
…
> It is not an order :-) i.e. don't feel pressured that you need to sign
> off on the comment change -- I can submit the comment on my own later
> on.

I assumed you are going to apply it but I don't see it in -next as of
today. Is there anything I need to do?

> Cheers,
> Miguel

Sebastian


[ANNOUNCE] v5.11.4-rt11

2021-03-09 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11.4-rt11 patch set. 

Changes since v5.11.4-rt10:

  - Update the softirq/tasklet patches to the latest version posted to
the list.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

The delta patch against v5.11.4-rt10 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11.4-rt10-rt11.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11.4-rt11

The RT patch against v5.11.4 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11.4-rt11.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11.4-rt11.tar.xz

Sebastian

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index ed6e49bceff1a..272ffd12cf756 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -672,6 +672,7 @@ static inline int tasklet_trylock(struct tasklet_struct *t)
 void tasklet_unlock(struct tasklet_struct *t);
 void tasklet_unlock_wait(struct tasklet_struct *t);
 void tasklet_unlock_spin_wait(struct tasklet_struct *t);
+
 #else
 static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; }
 static inline void tasklet_unlock(struct tasklet_struct *t) { }
@@ -702,8 +703,8 @@ static inline void tasklet_disable_nosync(struct tasklet_struct *t)
 }
 
 /*
- * Do not use in new code. There is no real reason to invoke this from
- * atomic contexts.
+ * Do not use in new code. Disabling tasklets from atomic contexts is
+ * error prone and should be avoided.
  */
 static inline void tasklet_disable_in_atomic(struct tasklet_struct *t)
 {
diff --git a/kernel/softirq.c b/kernel/softirq.c
index a9b66aa086366..27551db2b3ccc 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -830,8 +830,8 @@ EXPORT_SYMBOL(tasklet_init);
 
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
 /*
- * Do not use in new code. There is no real reason to invoke this from
- * atomic contexts.
+ * Do not use in new code. Waiting for tasklets from atomic contexts is
+ * error prone and should be avoided.
  */
 void tasklet_unlock_spin_wait(struct tasklet_struct *t)
 {
diff --git a/localversion-rt b/localversion-rt
index d79dde624aaac..05c35cb580779 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt10
+-rt11


[ANNOUNCE] v5.10.21-rt34

2021-03-09 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.10.21-rt34 patch set.

Changes since v5.10.21-rt33:

  - The alloc/free tracker sysfs file uses one PAGE size for
collecting the results. If it runs out of space it reallocates
more memory with disabled interrupts. The reallocation is now
forbidden on PREEMPT_RT.

  - Update the softirq/tasklet patches to the latest version posted to
the list.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

The delta patch against v5.10.21-rt33 is appended below and can be found here:

 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/incr/patch-5.10.21-rt33-rt34.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.10.21-rt34

The RT patch against v5.10.21 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.21-rt34.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.21-rt34.tar.xz

Sebastian

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 2725c6ad10af6..7545a2f18560a 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -663,6 +663,7 @@ static inline int tasklet_trylock(struct tasklet_struct *t)
 void tasklet_unlock(struct tasklet_struct *t);
 void tasklet_unlock_wait(struct tasklet_struct *t);
 void tasklet_unlock_spin_wait(struct tasklet_struct *t);
+
 #else
 static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; }
 static inline void tasklet_unlock(struct tasklet_struct *t) { }
@@ -693,8 +694,8 @@ static inline void tasklet_disable_nosync(struct tasklet_struct *t)
 }
 
 /*
- * Do not use in new code. There is no real reason to invoke this from
- * atomic contexts.
+ * Do not use in new code. Disabling tasklets from atomic contexts is
+ * error prone and should be avoided.
  */
 static inline void tasklet_disable_in_atomic(struct tasklet_struct *t)
 {
diff --git a/kernel/softirq.c b/kernel/softirq.c
index f0074f1344402..c9adc5c462485 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -830,8 +830,8 @@ EXPORT_SYMBOL(tasklet_init);
 
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
 /*
- * Do not use in new code. There is no real reason to invoke this from
- * atomic contexts.
+ * Do not use in new code. Waiting for tasklets from atomic contexts is
+ * error prone and should be avoided.
  */
 void tasklet_unlock_spin_wait(struct tasklet_struct *t)
 {
diff --git a/localversion-rt b/localversion-rt
index e1d8362520178..21988f9ad53f1 100644
--- a/localversion-rt
+++ b/localversion-rt
@@ -1 +1 @@
--rt33
+-rt34
diff --git a/mm/slub.c b/mm/slub.c
index 32a87e0038776..15690db5223e7 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4697,6 +4697,9 @@ static int alloc_loc_track(struct loc_track *t, unsigned long max, gfp_t flags)
struct location *l;
int order;
 
+   if (IS_ENABLED(CONFIG_PREEMPT_RT) && flags == GFP_ATOMIC)
+   return 0;
+
order = get_order(sizeof(struct location) * max);
 
l = (void *)__get_free_pages(flags, order);


Re: [patch 07/14] tasklets: Prevent tasklet_unlock_spin_wait() deadlock on RT

2021-03-09 Thread Sebastian Andrzej Siewior
On 2021-03-09 16:00:37 [+0100], To Thomas Gleixner wrote:
> diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
> index 07c7329d21aa7..1c14ccd351091 100644
> --- a/include/linux/interrupt.h
> +++ b/include/linux/interrupt.h
> @@ -663,15 +663,6 @@ static inline int tasklet_trylock(struct tasklet_struct *t)
>  void tasklet_unlock(struct tasklet_struct *t);
>  void tasklet_unlock_wait(struct tasklet_struct *t);
>  
> -/*
> - * Do not use in new code. Waiting for tasklets from atomic contexts is
> - * error prone and should be avoided.
> - */
> -static inline void tasklet_unlock_spin_wait(struct tasklet_struct *t)
> -{
> - while (test_bit(TASKLET_STATE_RUN, &t->state))
> - cpu_relax();
> -}

Look at that. The forward declaration for tasklet_unlock_spin_wait()
should have remained. Sorry for that.

Sebastian


Re: [patch 07/14] tasklets: Prevent tasklet_unlock_spin_wait() deadlock on RT

2021-03-09 Thread Sebastian Andrzej Siewior
On 2021-03-09 09:42:10 [+0100], Thomas Gleixner wrote:
> tasklet_unlock_spin_wait() spin waits for the TASKLET_STATE_SCHED bit in
> the tasklet state to be cleared. This works on !RT nicely because the
…

Could you please fold this:

diff --git a/include/linux/interrupt.h b/include/linux/interrupt.h
index 07c7329d21aa7..1c14ccd351091 100644
--- a/include/linux/interrupt.h
+++ b/include/linux/interrupt.h
@@ -663,15 +663,6 @@ static inline int tasklet_trylock(struct tasklet_struct *t)
 void tasklet_unlock(struct tasklet_struct *t);
 void tasklet_unlock_wait(struct tasklet_struct *t);
 
-/*
- * Do not use in new code. Waiting for tasklets from atomic contexts is
- * error prone and should be avoided.
- */
-static inline void tasklet_unlock_spin_wait(struct tasklet_struct *t)
-{
-   while (test_bit(TASKLET_STATE_RUN, &t->state))
-   cpu_relax();
-}
 #else
 static inline int tasklet_trylock(struct tasklet_struct *t) { return 1; }
 static inline void tasklet_unlock(struct tasklet_struct *t) { }
diff --git a/kernel/softirq.c b/kernel/softirq.c
index f0074f1344402..c9adc5c462485 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -830,8 +830,8 @@ EXPORT_SYMBOL(tasklet_init);
 
 #if defined(CONFIG_SMP) || defined(CONFIG_PREEMPT_RT)
 /*
- * Do not use in new code. There is no real reason to invoke this from
- * atomic contexts.
+ * Do not use in new code. Waiting for tasklets from atomic contexts is
+ * error prone and should be avoided.
  */
 void tasklet_unlock_spin_wait(struct tasklet_struct *t)
 {
-- 
2.30.1



Re: [PATCH] kernel: profile: fix error return code of create_proc_profile()

2021-03-09 Thread Sebastian Andrzej Siewior
On 2021-03-09 01:02:15 [-0800], Jia-Ju Bai wrote:
> When proc_create() returns NULL to entry, no error return code of
> create_proc_profile() is assigned.
> To fix this bug, err is assigned with -ENOMEM in this case.

I preserved what was in commit
c270a817196a9 ("profile: Fix CPU hotplug callback registration")

which was already in commit
c33fff0afbef4 ("kernel: use non-racy method for proc entries creation")

and goes back to its original introduction in history tree's commit
423284ee6eb52 ("[PATCH] consolidate prof_cpu_mask")

which is ignoring a failure here.
If we decide otherwise and do not ignore the error, then it would be good to
use an earlier commit so the change also makes it into v4.4+ stable.

> Fixes: e722d8daafb9 ("profile: Convert to hotplug state machine")
> Reported-by: TOTE Robot 
> Signed-off-by: Jia-Ju Bai 
> ---
>  kernel/profile.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/profile.c b/kernel/profile.c
> index 6f69a4195d56..65bf03bb8a5e 100644
> --- a/kernel/profile.c
> +++ b/kernel/profile.c
> @@ -549,8 +549,10 @@ int __ref create_proc_profile(void)
>  #endif
>   entry = proc_create("profile", S_IWUSR | S_IRUGO,
>   NULL, &profile_proc_ops);
> - if (!entry)
> + if (!entry) {
> + err = -ENOMEM;
>   goto err_state_onl;
> + }
>   proc_set_size(entry, (1 + prof_len) * sizeof(atomic_t));
>  
>   return err;

Sebastian


[tip: sched/core] kcov: Remove kcov include from sched.h and move it to its users.

2021-03-06 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 183f47fcaa54a5ffe671d990186d330ac8c63b10
Gitweb:
https://git.kernel.org/tip/183f47fcaa54a5ffe671d990186d330ac8c63b10
Author:Sebastian Andrzej Siewior 
AuthorDate:Thu, 18 Feb 2021 18:31:24 +01:00
Committer: Ingo Molnar 
CommitterDate: Sat, 06 Mar 2021 12:40:21 +01:00

kcov: Remove kcov include from sched.h and move it to its users.

The recent addition of in_serving_softirq() to kcov.h results in
compile failure on PREEMPT_RT because it requires
task_struct::softirq_disable_cnt. This is not available if kcov.h is
included from sched.h.

It is not needed to include kcov.h from sched.h. All but the net/ user
already include the kcov header file.

Move the include of the kcov.h header from sched.h to its users.
Additionally include sched.h from kcov.h to ensure that everything
task_struct related is available.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Acked-by: Johannes Berg 
Acked-by: Andrey Konovalov 
Link: https://lkml.kernel.org/r/20210218173124.iy5iyqv3a4oia...@linutronix.de
---
 drivers/usb/usbip/usbip_common.h | 1 +
 include/linux/kcov.h | 1 +
 include/linux/sched.h| 1 -
 net/core/skbuff.c| 1 +
 net/mac80211/iface.c | 1 +
 net/mac80211/rx.c| 1 +
 6 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index d60ce17..a7dd6c6 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 
 #undef pr_fmt
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037d..55dc338 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include <linux/sched.h>
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb2..cf245bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <linux/kcov.h>
 #include 
 #include 
 #include 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 545a472..420f23c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 
 #include 
 #include 
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b80c9b0..c127deb 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include "ieee80211_i.h"
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index c1343c0..62047e9 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include 


Re: [PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-04 Thread Sebastian Andrzej Siewior
On 2021-03-03 16:09:05 [-0600], Eric W. Biederman wrote:
> Sebastian Andrzej Siewior  writes:
> 
> > From: Thomas Gleixner 
> >
> > Allow realtime tasks to cache one sigqueue in task struct. This avoids an
> > allocation which can increase the latency or fail.
> > Ideally the sigqueue is cached after first successful delivery and will be
> > available for the next signal delivery. This works under the assumption
> > that the RT task never has an unprocessed signal while a new one is about
> > to be queued.
> >
> > The caching is not used for SIGQUEUE_PREALLOC because this kind of sigqueue 
> > is
> > handled differently (and not used for regular signal delivery).
> 
> What part of this is about real time tasks?  This allows any task
> to cache a sigqueue entry.

It is limited to realtime tasks (SCHED_FIFO/RR/DL):

+static void __sigqueue_cache_or_free(struct sigqueue *q)
+{
…
+   if (!task_is_realtime(current) || !sigqueue_add_cache(current, q))
+   kmem_cache_free(sigqueue_cachep, q);
+}

> Either the patch is buggy or the description is.  Overall caching one
> sigqueue entry doesn't look insane. But it would help to have a clear
> description of what is going on.

Does this clear things up or is my logic somehow broken here?

> Eric

Sebastian


Re: [PATCH 1/2] kthread: Move prio/affinite change into the newly created thread

2021-03-03 Thread Sebastian Andrzej Siewior
On 2020-11-21 11:55:48 [+0100], Thomas Gleixner wrote:
> On Tue, Nov 17 2020 at 13:45, Peter Zijlstra wrote:
> > On Tue, Nov 10, 2020 at 12:38:47PM +0100, Sebastian Andrzej Siewior wrote:
> >
> > Moo... yes this is certainly the easiest solution, because nouveau is a
> > horrible rats nest. But when I spoke to Greg KH about this, he suggested
> > nouveau ought to be fixed.
> >
> Ben, I got terminally lost when trying to untangle nouveau init, is there
> > any chance this can be fixed to not hold that nvkm_device::mutex thing
> > while doing request_irq() ?
> 
> OTOH, creating a dependency chain vs. cpuset_rwsem and whatever lock is
> held by the caller via request_irq() or kthread_create() is not
> necessarily restricted to the nivea driver. struct device::mutex (not
> the nkvm_device::mutex) is always held when a driver is probed.
> 
> The cpuset_rwsem -> mmap_lock dependency is a given, so we're one step
> away from a circular dependency vs. mmap_lock.
> 
> That was my reasoning to move the stuff out into the thread context.

Just a friendly ping that this is still in my queue…

Ben, could you please reply here stating your view of the situation?

> Thanks,
> 
> tglx

Sebastian


[PATCH] signal: Allow RT tasks to cache one sigqueue struct

2021-03-03 Thread Sebastian Andrzej Siewior
From: Thomas Gleixner 

Allow realtime tasks to cache one sigqueue in task struct. This avoids an
allocation which can increase the latency or fail.
Ideally the sigqueue is cached after first successful delivery and will be
available for the next signal delivery. This works under the assumption that the
RT task never has an unprocessed signal while a new one is about to be queued.

The caching is not used for SIGQUEUE_PREALLOC because this kind of sigqueue is
handled differently (and not used for regular signal delivery).

[bigeasy: With a fix from Matt Fleming ]
Signed-off-by: Thomas Gleixner 
Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/sched.h  |  1 +
 include/linux/signal.h |  1 +
 kernel/exit.c  |  2 +-
 kernel/fork.c  |  1 +
 kernel/signal.c| 65 +++---
 5 files changed, 65 insertions(+), 5 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb22164cd..7009b25f48160 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -985,6 +985,7 @@ struct task_struct {
/* Signal handlers: */
struct signal_struct*signal;
struct sighand_struct __rcu *sighand;
+   struct sigqueue *sigqueue_cache;
sigset_tblocked;
sigset_treal_blocked;
/* Restored if set_restore_sigmask() was used: */
diff --git a/include/linux/signal.h b/include/linux/signal.h
index 205526c4003aa..d47a86790edc8 100644
--- a/include/linux/signal.h
+++ b/include/linux/signal.h
@@ -265,6 +265,7 @@ static inline void init_sigpending(struct sigpending *sig)
 }
 
 extern void flush_sigqueue(struct sigpending *queue);
+extern void flush_task_sigqueue(struct task_struct *tsk);
 
 /* Test if 'sig' is valid signal. Use this instead of testing _NSIG directly */
 static inline int valid_signal(unsigned long sig)
diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e35e69af..346f7b76cecaa 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -152,7 +152,7 @@ static void __exit_signal(struct task_struct *tsk)
 * Do this under ->siglock, we can race with another thread
 * doing sigqueue_free() if we have SIGQUEUE_PREALLOC signals.
 */
-   flush_sigqueue(&tsk->pending);
+   flush_task_sigqueue(tsk);
tsk->sighand = NULL;
spin_unlock(>siglock);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index d66cd1014211b..a767e4e49a692 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1993,6 +1993,7 @@ static __latent_entropy struct task_struct *copy_process(
spin_lock_init(&p->alloc_lock);
 
init_sigpending(&p->pending);
+   p->sigqueue_cache = NULL;
 
p->utime = p->stime = p->gtime = 0;
 #ifdef CONFIG_ARCH_HAS_SCALED_CPUTIME
diff --git a/kernel/signal.c b/kernel/signal.c
index ba4d1ef39a9ea..d99273b798085 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -20,6 +20,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -404,13 +405,30 @@ void task_join_group_stop(struct task_struct *task)
task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING);
 }
 
+static struct sigqueue *sigqueue_from_cache(struct task_struct *t)
+{
+   struct sigqueue *q = t->sigqueue_cache;
+
+   if (q && cmpxchg(&t->sigqueue_cache, q, NULL) == q)
+   return q;
+   return NULL;
+}
+
+static bool sigqueue_add_cache(struct task_struct *t, struct sigqueue *q)
+{
+   if (!t->sigqueue_cache && cmpxchg(&t->sigqueue_cache, NULL, q) == NULL)
+   return true;
+   return false;
+}
+
 /*
  * allocate a new signal queue record
  * - this may be called without locks if and only if t == current, otherwise an
  *   appropriate lock must be held to stop the target task from exiting
  */
 static struct sigqueue *
-__sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimit)
+__sigqueue_do_alloc(int sig, struct task_struct *t, gfp_t flags,
+   int override_rlimit, bool fromslab)
 {
struct sigqueue *q = NULL;
struct user_struct *user;
@@ -432,7 +450,10 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi
rcu_read_unlock();
 
if (override_rlimit || likely(sigpending <= task_rlimit(t, RLIMIT_SIGPENDING))) {
-   q = kmem_cache_alloc(sigqueue_cachep, flags);
+   if (!fromslab)
+   q = sigqueue_from_cache(t);
+   if (!q)
+   q = kmem_cache_alloc(sigqueue_cachep, flags);
} else {
print_dropped_signal(sig);
}
@@ -449,6 +470,13 @@ __sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags, int override_rlimi
return q;
 }
 
+static struct sigqueue *
+__sigqueue_alloc(int sig, struct task_struct *t, gfp_t flags,
+int override_rlimit)
+{
+   ret

[tip: sched/core] kcov: Remove kcov include from sched.h and move it to its users.

2021-03-03 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 4c7ee75cccbf0635cbec6528ae7fff4b7bc549fa
Gitweb:
https://git.kernel.org/tip/4c7ee75cccbf0635cbec6528ae7fff4b7bc549fa
Author:Sebastian Andrzej Siewior 
AuthorDate:Thu, 18 Feb 2021 18:31:24 +01:00
Committer: Peter Zijlstra 
CommitterDate: Wed, 03 Mar 2021 10:32:47 +01:00

kcov: Remove kcov include from sched.h and move it to its users.

The recent addition of in_serving_softirq() to kcov.h results in
compile failure on PREEMPT_RT because it requires
task_struct::softirq_disable_cnt. This is not available if kcov.h is
included from sched.h.

It is not needed to include kcov.h from sched.h. All but the net/ user
already include the kcov header file.

Move the include of the kcov.h header from sched.h to its users.
Additionally include sched.h from kcov.h to ensure that everything
task_struct related is available.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Johannes Berg 
Acked-by: Andrey Konovalov 
Link: https://lkml.kernel.org/r/20210218173124.iy5iyqv3a4oia...@linutronix.de
---
 drivers/usb/usbip/usbip_common.h | 1 +
 include/linux/kcov.h | 1 +
 include/linux/sched.h| 1 -
 net/core/skbuff.c| 1 +
 net/mac80211/iface.c | 1 +
 net/mac80211/rx.c| 1 +
 6 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/usb/usbip/usbip_common.h b/drivers/usb/usbip/usbip_common.h
index d60ce17..a7dd6c6 100644
--- a/drivers/usb/usbip/usbip_common.h
+++ b/drivers/usb/usbip/usbip_common.h
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 
 #undef pr_fmt
diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037d..55dc338 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include <linux/sched.h>
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb2..cf245bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <linux/kcov.h>
 #include 
 #include 
 #include 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 545a472..420f23c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 
 #include 
 #include 
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b80c9b0..c127deb 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include "ieee80211_i.h"
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index c1343c0..62047e9 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include 


[ANNOUNCE] v5.11.2-rt9

2021-03-02 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11.2-rt9 patch set. 

Changes since v5.11.2-rt8:

  - Move the cpu_chill() prototype to hrtimer.h to avoid a warning.

  - Polish the sigqueue cache patch. Its behaviour remains unchanged with
one difference: the cache is now used only if the task is a realtime
task. Previously it was also used if a task merely had a realtime
priority due to PI boost.

  - Refurbish the slub and page_alloc patches:
- There used to be a per-CPU list for pages which should have been
  given back to the page-allocator but the caller was atomic. The
  atomic sections from within SLUB are gone, the per-CPU list is
  gone, too.

- The SLUB_CPU_PARTIAL switch can now be enabled. I looked at the
  resulting latency numbers and enabling leads to higher latency.
  Therefore it is off by default on RT.

- We used to split the free process of the pcp-pages into two
  stages and so slightly decouple the IRQ-off section from the
  zone-lock section. I don't see the need to keep doing it since the
  local-lock removes all the IRQ-off regions on RT and for !RT it
  shouldn't make much difference.
  This split is gone now.

- The alloc/free tracker sysfs file uses one PAGE size for
  collecting the results. If it runs out of space it reallocates
  more memory with disabled interrupts. The reallocation is now
  forbidden on PREEMPT_RT.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

The delta patch against v5.11.2-rt8 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11.2-rt8-rt9.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11.2-rt9

The RT patch against v5.11.2 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11.2-rt9.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11.2-rt9.tar.xz

Sebastian

diff --git a/fs/namespace.c b/fs/namespace.c
index 45571517b9c21..d02bd66933735 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -14,7 +14,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
diff --git a/include/linux/delay.h b/include/linux/delay.h
index 02b37178b54f4..1d0e2ce6b6d9f 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -76,10 +76,4 @@ static inline void fsleep(unsigned long usecs)
msleep(DIV_ROUND_UP(usecs, 1000));
 }
 
-#ifdef CONFIG_PREEMPT_RT
-extern void cpu_chill(void);
-#else
-# define cpu_chill()   cpu_relax()
-#endif
-
 #endif /* defined(_LINUX_DELAY_H) */
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index bb5e7b0a42746..e425a26a5ed88 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -540,4 +540,10 @@ int hrtimers_dead_cpu(unsigned int cpu);
 #define hrtimers_dead_cpu  NULL
 #endif
 
+#ifdef CONFIG_PREEMPT_RT
+extern void cpu_chill(void);
+#else
+# define cpu_chill()   cpu_relax()
+#endif
+
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index fa25113eda0dc..77d356fa8668e 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1972,8 +1972,8 @@ config SHUFFLE_PAGE_ALLOCATOR
  Say Y if unsure.
 
 config SLUB_CPU_PARTIAL
-   default y
-   depends on SLUB && SMP && !PREEMPT_RT
+   default y if !PREEMPT_RT
+   depends on SLUB && SMP
bool "SLUB per cpu partial cache"
help
  Per cpu partial caches accelerate objects allocation and freeing
diff --git a/kernel/signal.c b/kernel/signal.c
index b87178eb0df69..e40ed99a62a17 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -405,20 +405,20 @@ void task_join_group_stop(struct task_struct *task)
task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING);
 }
 
-static inline struct sigqueue *get_task_cache(struct task_struct *t)
+static struct sigqueue *sigqueue_from_cache(struct task_struct *t)
 {
struct sigqueue *q = t->sigqueue_cache;
 
-   if (cmpxchg(&t->sigqueue_cache, q, NULL) != q)
-   return NULL;
-   return q;
+   if (q && cmpxchg(&t->sigqueue_cache, q, NULL) == q)
+   return q;
+   return NULL;
 }
 
-static inline int put_task_cache(struct task_struct *t, struct sigqueue *q)
+static bool sigqueue_add_cache(struct task_struct *t, struct sigqueue *q)
 {
-   if (cmpxchg(&t->sigqueue_cache, NULL, q) == NULL)
-   return 0;
-   return 1;
+   if (!t->sigqueue_cache && cmpxchg(&t->sigqueue_cache, NULL, q) == NULL)
+   return true;
+   return false;
 }
 
 /*
@@ -428,7 +428,7 @@ static inline int put_task_cache(struct task_struct *t, 
struct sigqueue *q)
  */
 static struct sigqueue *
 __sigqueue_do_alloc(int sig, struct task_struct *t, gfp_t flags,
-   int override_rlimit, int fromslab)
+

[tip: sched/core] kcov: Remove kcov include from sched.h and move it to its users.

2021-03-02 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: eae7a59d5a1e9bcf9804bcbd006ddce5cf72f8f4
Gitweb:
https://git.kernel.org/tip/eae7a59d5a1e9bcf9804bcbd006ddce5cf72f8f4
Author:Sebastian Andrzej Siewior 
AuthorDate:Thu, 18 Feb 2021 18:31:24 +01:00
Committer: Peter Zijlstra 
CommitterDate: Mon, 01 Mar 2021 18:17:22 +01:00

kcov: Remove kcov include from sched.h and move it to its users.

The recent addition of in_serving_softirq() to kcov.h results in a
compile failure on PREEMPT_RT because it requires
task_struct::softirq_disable_cnt. This is not available if kcov.h is
included from sched.h.

It is not needed to include kcov.h from sched.h. All but the net/ users
already include the kcov header file.

Move the include of the kcov.h header from sched.h to its users.
Additionally include sched.h from kcov.h to ensure that everything
task_struct related is available.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Acked-by: Johannes Berg 
Acked-by: Andrey Konovalov 
Link: https://lkml.kernel.org/r/20210218173124.iy5iyqv3a4oia...@linutronix.de
---
 include/linux/kcov.h  | 1 +
 include/linux/sched.h | 1 -
 net/core/skbuff.c | 1 +
 net/mac80211/iface.c  | 1 +
 net/mac80211/rx.c | 1 +
 5 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037d..55dc338 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include <linux/sched.h>
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index ef00bb2..cf245bc 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <linux/kcov.h>
 #include 
 #include 
 #include 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 545a472..420f23c 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 
 #include 
 #include 
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b80c9b0..c127deb 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include "ieee80211_i.h"
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index c1343c0..62047e9 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include 


Re: [RT v5.11-rt7] WARNING at include/linux/seqlock.h:271 nft_counter_eval

2021-02-23 Thread Sebastian Andrzej Siewior
On 2021-02-23 14:53:40 [+0100], Juri Lelli wrote:
> 
> So, I'm a bit confused and I'm very likely missing details (still
> digesting the seqprop_ magic), but write_seqcount_begin() has
> 
>  if (seqprop_preemptible(s))
>  preempt_disable();
> 
> which in this case (no lock associated) is defined to return false, 
> while it should return true on RT (or in some occasions)? Or maybe this
> is what you are saying already.

write_seqcount_begin() has seqprop_assert() at the very beginning which
ends in __seqprop_assert() in your case (seqcount_t). Your warning.


> Also, the check for preemption been disabled happens before we can
> actually potentially disable it, no?

That seqprop_preemptible() is true for !RT for mutex/ww_mutex locks. On
RT it is always false since it does lock()+unlock() of the lock that is
part of the seqcount.

But back to the original issue: at write_seqcount_begin() preemption is
implicitly disabled on !RT by local_bh_disable(). Therefore no warning.
On RT, local_bh_disable() disables BH on the CPU, so locking wise (since
it is a per-CPU seqcount) it should work. Preemption remains enabled, so
we get a warning.

I have no idea what annotation would be best here. Having a
local_bh_disable() type of lock, where the seqcount is not part of the
data structure it protects, is less than ideal.
However, if I understand this correctly then this nft_counter_percpu_priv
exists once per nft rule. The seqcount exists once per CPU since it is
unlikely to modify two counters at once on a single CPU :) So there is
that.

While looking at it, there is nft_counter_reset() which modifies the
values without a seqcount write lock. This might be okay.

> Thanks for the quick reply!
> 
> Best,
> Juri

Sebastian


Re: [RT v5.11-rt7] WARNING at include/linux/seqlock.h:271 nft_counter_eval

2021-02-23 Thread Sebastian Andrzej Siewior
On 2021-02-23 11:49:07 [+0100], Juri Lelli wrote:
> Hi,
Hi,

> I'm seeing the following splat right after boot (or during late boot
> phases) with v5.11-rt7 (LOCKDEP enabled).
…
> [   85.273588] WARNING: CPU: 5 PID: 1416 at include/linux/seqlock.h:271 
> nft_counter_eval+0x95/0x130 [nft_counter]
…
> [   85.273713] RIP: 0010:nft_counter_eval+0x95/0x130 [nft_counter]

This is a per-CPU seqcount_t in net/netfilter/nft_counter.c which is
only protected by local_bh_disable(). The warning expects preemption
to be disabled which is the case on !RT but not on RT.

Not sure what to do about this. It is not doing anything wrong as of
now. It is just noisy.

> Best,
> Juri

Sebastian


Re: [RFC v2] sched/rt: Fix RT (group) throttling with nohz_full

2021-02-22 Thread Sebastian Andrzej Siewior
On 2021-02-02 10:00:10 [+0100], Jonathan Schwender wrote:
> If nohz_full is enabled (more precisely HK_FLAG_TIMER is set), then
> do_sched_rt_period_timer may be called on a housekeeping CPU,
> which would not service the isolated CPU for a non-root cgroup
> (requires a kernel with RT_GROUP_SCHEDULING).
> This causes RT tasks in a non-root cgroup to get throttled
> indefinitely (unless throttling is disabled) once the timer has
> been moved to a housekeeping CPU.
> To fix this, housekeeping CPUs now service all online CPUs
> if HK_FLAG_TIMER (nohz_full) is set.

This originates from
   
https://lore.kernel.org/linux-rt-users/b07b6fc7-1a5f-0dcc-ca65-821de96cd...@gmail.com/

Could someone please take a look?

Sebastian


Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

2021-02-22 Thread Sebastian Andrzej Siewior
On 2021-02-19 10:33:36 [-0800], Paul E. McKenney wrote:
> For definiteness, here is the first part of the change, posted earlier.
> The commit log needs to be updated.  I will post the change that keeps
> the tick going as a reply to this email.
…
> diff --git a/kernel/softirq.c b/kernel/softirq.c
> index 9d71046..ba78e63 100644
> --- a/kernel/softirq.c
> +++ b/kernel/softirq.c
> @@ -209,7 +209,7 @@ static inline void invoke_softirq(void)
>   if (ksoftirqd_running(local_softirq_pending()))
>   return;
>  
> - if (!force_irqthreads) {
> + if (!force_irqthreads || !__this_cpu_read(ksoftirqd)) {
>  #ifdef CONFIG_HAVE_IRQ_EXIT_ON_IRQ_STACK
>   /*
>* We can safely execute softirq on the current stack if
> @@ -358,8 +358,8 @@ asmlinkage __visible void __softirq_entry 
> __do_softirq(void)
>  
>   pending = local_softirq_pending();
>   if (pending) {
> - if (time_before(jiffies, end) && !need_resched() &&
> - --max_restart)
> + if (!__this_cpu_read(ksoftirqd) ||
> + (time_before(jiffies, end) && !need_resched() && 
> --max_restart))
>   goto restart;

This hunk shouldn't be needed. The reason for it is probably that the
following wakeup_softirqd() would avoid further invoke_softirq()
performing the actual softirq work. It would leave early due to
ksoftirqd_running(). Unless I'm wrong, any raise_softirq() invocation
outside of an interrupt would do the same.

I would like PeterZ / tglx to comment on this one. Basically I'm not
sure if it is okay to expect softirqs to be served and waited on that
early in the boot.

>   wakeup_softirqd();

Sebastian


Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

2021-02-22 Thread Sebastian Andrzej Siewior
On 2021-02-19 10:18:11 [-0800], Paul E. McKenney wrote:
> If Masami's patch works for the PowerPC guys on v5.10-rc7, then it can
> be backported.  The patch making RCU Tasks initialize itself early won't
> have any effect and can be left or reverted, as we choose.  The self-test
> patch will need to be either adjusted or reverted.
> 
> However...
> 
> The root cause of this problem is that softirq only kind-of works
> during a window of time during boot.  It works only if the number and
> duration of softirq handlers during this time is small enough, for some
> ill-defined notion of "small enough".  If there are too many, whatever
> that means exactly, then we get failed attempt to awaken ksoftirqd, which

The number of registered softirq handlers does not matter, nor does the
number of times the individual softirqs were scheduled. The only problem
is when one schedules a softirq and then waits for its completion.
So scheduling a timer_list timer works. Waiting for its completion does
not. Once ksoftirqd is up, pending softirqs will be processed.

> (sometimes!) results in a silent hang.  Which, as you pointed out earlier,
> is a really obnoxious error message.  And any minor change could kick
> us into silent-hang state because of the heuristics used to hand off
> to ksoftirqd.  The straw that broke the camel's back and all that.

The problem is that a softirq is raised and then its completion is
waited for.
Something like synchronize_rcu() would be such a thing I guess.

> One approach would be to add WARN_ON_ONCE() so that if softirq tries
> to awaken ksoftirqd before it is spawned, we get a nice obvious splat.
> Unfortunately, this gives false positives because there is code that
> needs a softirq handler to run eventually, but is OK with that handler
> being delayed until some random point in the early_initcall() sequence.
> 
> Besides which, if we are going to add a check, why not use that check
> just make things work by forcing handler execution to remain within the
> softirq back-of-interrupt context instead of awakening a not-yet-spawned
> ksoftirqd?  We can further prevent entry into dyntick-idle state until
> the ksoftirqd kthreads have been spawned, which means that if softirq
> handlers must be deferred, they will be resumed within one jiffy by the
> next scheduler-clock interrupt.

This should work.

> Yes, this can allow softirq handlers to impose large latencies, but only
> during early boot, long before any latency-sensitive applications can
> possibly have been created.  So this does not seem like a real problem.
> 
> Am I missing something here?
> 
>   Thanx, Paul

Sebastian


[ANNOUNCE] v5.11-rt7

2021-02-19 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11-rt7 patch set. 

Changes since v5.11-rt6:

  - PowerPC could fail to compile due to an unused variable which is
only visible with PREEMPT_RT enabled.
 
  - Update John's printk patch.
With the update I can strike
  kmsg dumpers expecting not to be called in parallel can clobber
  their temp buffer.

off the known issues list.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

The delta patch against v5.11-rt6 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11-rt6-rt7.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11-rt7

The RT patch against v5.11 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11-rt7.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11-rt7.tar.xz

Sebastian

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 532f226377831..1ef55f4b389a2 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -73,7 +73,8 @@ static const char *nvram_os_partitions[] = {
 };
 
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason);
+ enum kmsg_dump_reason reason,
+ struct kmsg_dumper_iter *iter);
 
 static struct kmsg_dumper nvram_kmsg_dumper = {
.dump = oops_to_nvram
@@ -643,7 +644,8 @@ void __init nvram_init_oops_partition(int 
rtas_partition_exists)
  * partition.  If that's too much, go back and capture uncompressed text.
  */
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason)
+ enum kmsg_dump_reason reason,
+ struct kmsg_dumper_iter *iter)
 {
struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf;
static unsigned int oops_count = 0;
@@ -681,13 +683,13 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
return;
 
if (big_oops_buf) {
-   kmsg_dump_get_buffer(dumper, false,
+   kmsg_dump_get_buffer(iter, false,
 big_oops_buf, big_oops_buf_sz, &text_len);
rc = zip_oops(text_len);
}
if (rc != 0) {
-   kmsg_dump_rewind(dumper);
-   kmsg_dump_get_buffer(dumper, false,
+   kmsg_dump_rewind(iter);
+   kmsg_dump_get_buffer(iter, false,
 oops_data, oops_data_sz, &text_len);
err_type = ERR_TYPE_KERNEL_PANIC;
oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index afab328d08874..d6c3f0b79f1d1 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -54,7 +54,6 @@
 
 #include 
 
-static DEFINE_MUTEX(linear_mapping_mutex);
 unsigned long long memory_limit;
 bool init_mem_is_free;
 
@@ -72,6 +71,7 @@ pgprot_t phys_mem_access_prot(struct file *file, unsigned 
long pfn,
 EXPORT_SYMBOL(phys_mem_access_prot);
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+static DEFINE_MUTEX(linear_mapping_mutex);
 
 #ifdef CONFIG_NUMA
 int memory_add_physaddr_to_nid(u64 start)
diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c 
b/arch/powerpc/platforms/powernv/opal-kmsg.c
index 6c3bc4b4da983..ec862846bc82c 100644
--- a/arch/powerpc/platforms/powernv/opal-kmsg.c
+++ b/arch/powerpc/platforms/powernv/opal-kmsg.c
@@ -20,7 +20,8 @@
  * message, it just ensures that OPAL completely flushes the console buffer.
  */
 static void kmsg_dump_opal_console_flush(struct kmsg_dumper *dumper,
-enum kmsg_dump_reason reason)
+enum kmsg_dump_reason reason,
+struct kmsg_dumper_iter *iter)
 {
/*
 * Outside of a panic context the pollers will continue to run,
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index dcd817ca2edfd..f51367a3b2318 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3005,7 +3005,7 @@ print_address(unsigned long addr)
 static void
 dump_log_buf(void)
 {
-   struct kmsg_dumper dumper = { .active = 1 };
+   struct kmsg_dumper_iter iter = { .active = 1 };
unsigned char buf[128];
size_t len;
 
@@ -3017,9 +3017,9 @@ dump_log_buf(void)
catch_memory_errors = 1;
sync();
 
-   kmsg_dump_rewind_nolock();
+   kmsg_dump_rewind();
xmon_start_pagination();
-   while (kmsg_dump_get_line_nolock(&dumper, false, buf, sizeof(buf), &len)) {
+   while (kmsg_dump_get_line(&iter, false, buf, sizeof(buf), &len)) {
buf[len] = '\0';
printf("%s", buf);

[ANNOUNCE] v5.10.17-rt32

2021-02-19 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.10.17-rt32 patch set. 

Changes since v5.10.17-rt31:

  - Due to tracing rework, the 'L' marker (for need resched lazy) got
lost and is now back.

  - Update John's printk patch.
With the update I can strike
  kmsg dumpers expecting not to be called in parallel can clobber
  their temp buffer.

off the known issues list.

Known issues
 - kdb/kgdb can easily deadlock.
 - netconsole triggers WARN.

The delta patch against v5.10.17-rt31 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/incr/patch-5.10.17-rt31-rt32.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.10.17-rt32

The RT patch against v5.10.17 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.17-rt32.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.17-rt32.tar.xz

Sebastian

diff --git a/arch/powerpc/kernel/nvram_64.c b/arch/powerpc/kernel/nvram_64.c
index 532f226377831..1ef55f4b389a2 100644
--- a/arch/powerpc/kernel/nvram_64.c
+++ b/arch/powerpc/kernel/nvram_64.c
@@ -73,7 +73,8 @@ static const char *nvram_os_partitions[] = {
 };
 
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason);
+ enum kmsg_dump_reason reason,
+ struct kmsg_dumper_iter *iter);
 
 static struct kmsg_dumper nvram_kmsg_dumper = {
.dump = oops_to_nvram
@@ -643,7 +644,8 @@ void __init nvram_init_oops_partition(int 
rtas_partition_exists)
  * partition.  If that's too much, go back and capture uncompressed text.
  */
 static void oops_to_nvram(struct kmsg_dumper *dumper,
- enum kmsg_dump_reason reason)
+ enum kmsg_dump_reason reason,
+ struct kmsg_dumper_iter *iter)
 {
struct oops_log_info *oops_hdr = (struct oops_log_info *)oops_buf;
static unsigned int oops_count = 0;
@@ -681,13 +683,13 @@ static void oops_to_nvram(struct kmsg_dumper *dumper,
return;
 
if (big_oops_buf) {
-   kmsg_dump_get_buffer(dumper, false,
+   kmsg_dump_get_buffer(iter, false,
 big_oops_buf, big_oops_buf_sz, &text_len);
rc = zip_oops(text_len);
}
if (rc != 0) {
-   kmsg_dump_rewind(dumper);
-   kmsg_dump_get_buffer(dumper, false,
+   kmsg_dump_rewind(iter);
+   kmsg_dump_get_buffer(iter, false,
 oops_data, oops_data_sz, &text_len);
err_type = ERR_TYPE_KERNEL_PANIC;
oops_hdr->version = cpu_to_be16(OOPS_HDR_VERSION);
diff --git a/arch/powerpc/platforms/powernv/opal-kmsg.c 
b/arch/powerpc/platforms/powernv/opal-kmsg.c
index 6c3bc4b4da983..ec862846bc82c 100644
--- a/arch/powerpc/platforms/powernv/opal-kmsg.c
+++ b/arch/powerpc/platforms/powernv/opal-kmsg.c
@@ -20,7 +20,8 @@
  * message, it just ensures that OPAL completely flushes the console buffer.
  */
 static void kmsg_dump_opal_console_flush(struct kmsg_dumper *dumper,
-enum kmsg_dump_reason reason)
+enum kmsg_dump_reason reason,
+struct kmsg_dumper_iter *iter)
 {
/*
 * Outside of a panic context the pollers will continue to run,
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 5559edf36756c..d62b8e053d4c8 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -3005,7 +3005,7 @@ print_address(unsigned long addr)
 static void
 dump_log_buf(void)
 {
-   struct kmsg_dumper dumper = { .active = 1 };
+   struct kmsg_dumper_iter iter = { .active = 1 };
unsigned char buf[128];
size_t len;
 
@@ -3017,9 +3017,9 @@ dump_log_buf(void)
catch_memory_errors = 1;
sync();
 
-   kmsg_dump_rewind_nolock();
+   kmsg_dump_rewind();
xmon_start_pagination();
-   while (kmsg_dump_get_line_nolock(&dumper, false, buf, sizeof(buf), &len)) {
+   while (kmsg_dump_get_line(&iter, false, buf, sizeof(buf), &len)) {
buf[len] = '\0';
printf("%s", buf);
}
diff --git a/arch/um/kernel/kmsg_dump.c b/arch/um/kernel/kmsg_dump.c
index e4abac6c9727c..173999422ed84 100644
--- a/arch/um/kernel/kmsg_dump.c
+++ b/arch/um/kernel/kmsg_dump.c
@@ -1,15 +1,19 @@
 // SPDX-License-Identifier: GPL-2.0
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 
 static void kmsg_dumper_stdout(struct kmsg_dumper *dumper,
-   enum kmsg_dump_reason reason)
+   enum kmsg_dump_reason reason,
+   struct kmsg_dumper_iter 

Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

2021-02-19 Thread Sebastian Andrzej Siewior
On 2021-02-19 12:13:01 [+0100], Uladzislau Rezki wrote:
> I or Paul will ask for a test once it is settled down :) Looks like
> it is, so we should fix for v5.12.

Okay. Since Paul asked for powerpc test on v5.11-rc I wanted check if
parts of it are also -stable material.

Sebastian


Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

2021-02-19 Thread Sebastian Andrzej Siewior
On 2021-02-19 11:49:58 [+0100], Uladzislau Rezki wrote:
> If above fix works, we can initialize rcu_init_tasks_generic() from the
> core_initcall() including selftst. It means that such initialization can
> be done later:

Good. Please let me know once there is something for me to test.
Do I assume correctly that the self-test, I stumbled upon, is v5.12
material?

Sebastian


Re: [PATCH 34/33] netfs: Pass flag rather than use in_softirq()

2021-02-19 Thread Sebastian Andrzej Siewior
On 2021-02-18 14:02:36 [+], David Howells wrote:
> How about the attached instead?

Thank you for that flag.

> David

Sebastian


Re: [PATCH] kprobes: Fix to delay the kprobes jump optimization

2021-02-19 Thread Sebastian Andrzej Siewior
On 2021-02-18 07:15:54 [-0800], Paul E. McKenney wrote:
> Thank you, but the original report of a problem was from Sebastian
> and the connection to softirq was Uladzislau.  So could you please
> add these before (or even in place of) my Reported-by?
> 
> Reported-by: Sebastian Andrzej Siewior 
> Reported-by: Uladzislau Rezki 
> 
> Other than that, looks good!

Perfect. I'm kind of lost here, nevertheless ;) Does this mean that the
RCU selftest can now be delayed?

> Acked-by: Paul E. McKenney 
> 
>   Thanx, Paul

Sebastian


[ANNOUNCE] v5.11-rt6

2021-02-18 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11-rt6 patch set. 

Changes since v5.11-rt5:

  - Updated the "tracing: Merge irqflags + preempt counter." patch to
the version Steven posted for upstream inclusion.

  - Due to tracing rework, the 'L' marker (for need resched lazy) got
lost and is now back.

  - The patch for the zsmalloc/zswap regression in v5.11 got updated to
v2 as posted by Barry Song.

  - A kcov enabled kernel did not compile with PREEMPT_RT enabled.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.11-rt5 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11-rt5-rt6.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11-rt6

The RT patch against v5.11 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11-rt6.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11-rt6.tar.xz

Sebastian

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037dc12048..55dc338f6bcdd 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include <linux/sched.h>
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7337630326751..183e9d90841cb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <linux/kcov.h>
 #include 
 #include 
 #include 
diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
index 5d08fb467f69a..89c3f7162267b 100644
--- a/include/linux/trace_events.h
+++ b/include/linux/trace_events.h
@@ -162,9 +162,58 @@ static inline void tracing_generic_entry_update(struct 
trace_entry *entry,
entry->flags= trace_ctx >> 24;
 }
 
-unsigned int _tracing_gen_ctx_flags(unsigned long irqflags);
-unsigned int tracing_gen_ctx_flags(void);
-unsigned int tracing_gen_ctx_flags_dect(void);
+unsigned int tracing_gen_ctx_irq_test(unsigned int irqs_status);
+
+enum trace_flag_type {
+   TRACE_FLAG_IRQS_OFF = 0x01,
+   TRACE_FLAG_IRQS_NOSUPPORT   = 0x02,
+   TRACE_FLAG_NEED_RESCHED = 0x04,
+   TRACE_FLAG_HARDIRQ  = 0x08,
+   TRACE_FLAG_SOFTIRQ  = 0x10,
+   TRACE_FLAG_PREEMPT_RESCHED  = 0x20,
+   TRACE_FLAG_NMI  = 0x40,
+   TRACE_FLAG_NEED_RESCHED_LAZY= 0x80,
+};
+
+#ifdef CONFIG_TRACE_IRQFLAGS_SUPPORT
+static inline unsigned int tracing_gen_ctx_flags(unsigned long irqflags)
+{
+   unsigned int irq_status = irqs_disabled_flags(irqflags) ?
+   TRACE_FLAG_IRQS_OFF : 0;
+   return tracing_gen_ctx_irq_test(irq_status);
+}
+static inline unsigned int tracing_gen_ctx(void)
+{
+   unsigned long irqflags;
+
+   local_save_flags(irqflags);
+   return tracing_gen_ctx_flags(irqflags);
+}
+#else
+
+static inline unsigned int tracing_gen_ctx_flags(unsigned long irqflags)
+{
+   return tracing_gen_ctx_irq_test(TRACE_FLAG_IRQS_NOSUPPORT);
+}
+static inline unsigned int tracing_gen_ctx(void)
+{
+   return tracing_gen_ctx_irq_test(TRACE_FLAG_IRQS_NOSUPPORT);
+}
+#endif
+
+static inline unsigned int tracing_gen_ctx_dec(void)
+{
+   unsigned int trace_ctx;
+
+   trace_ctx = tracing_gen_ctx();
+   /*
+* Subtract one from the preemption counter if preemption is enabled,
+* see trace_event_buffer_reserve() for details.
+*/
+   if (IS_ENABLED(CONFIG_PREEMPTION))
+   trace_ctx--;
+   return trace_ctx;
+}
 
 struct trace_event_file;
 
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index f5c4f1d72a885..c54eae2ab208c 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -79,7 +79,7 @@ static void trace_note(struct blk_trace *bt, pid_t pid, int 
action,
 
if (blk_tracer) {
buffer = blk_tr->array_buffer.buffer;
-   trace_ctx = _tracing_gen_ctx_flags(0);
+   trace_ctx = tracing_gen_ctx_flags(0);
event = trace_buffer_lock_reserve(buffer, TRACE_BLK,
  sizeof(*t) + len + cgid_len,
  trace_ctx);
@@ -253,7 +253,7 @@ static void __blk_add_trace(struct blk_trace *bt, sector_t 
sector, int bytes,
tracing_record_cmdline(current);
 
buffer = blk_tr->array_buffer.buffer;
-   trace_ctx = _tracing_gen_ctx_flags(0);
+   trace_ctx = tracing_gen_ctx_flags(0);
event = trace_buffer_lock_reserve(buffer, TRACE_BLK,
  

[PATCH] kcov: Remove kcov include from sched.h and move it to its users.

2021-02-18 Thread Sebastian Andrzej Siewior
The recent addition of in_serving_softirq() to kcov.h results in a
compile failure on PREEMPT_RT because it requires
task_struct::softirq_disable_cnt. This is not available if kcov.h is
included from sched.h.

It is not needed to include kcov.h from sched.h. All but the net/ users
already include the kcov header file.

Move the include of the kcov.h header from sched.h to its users.
Additionally include sched.h from kcov.h to ensure that everything
task_struct related is available.

Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/kcov.h  | 1 +
 include/linux/sched.h | 1 -
 net/core/skbuff.c | 1 +
 net/mac80211/iface.c  | 1 +
 net/mac80211/rx.c | 1 +
 5 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/kcov.h b/include/linux/kcov.h
index 4e3037dc12048..55dc338f6bcdd 100644
--- a/include/linux/kcov.h
+++ b/include/linux/kcov.h
@@ -2,6 +2,7 @@
 #ifndef _LINUX_KCOV_H
 #define _LINUX_KCOV_H
 
+#include <linux/sched.h>
 #include 
 
 struct task_struct;
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7337630326751..183e9d90841cb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -14,7 +14,6 @@
 #include 
 #include 
 #include 
-#include <linux/kcov.h>
 #include 
 #include 
 #include 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 785daff48030d..e64d0a2e21c31 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -60,6 +60,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 
 #include 
 #include 
diff --git a/net/mac80211/iface.c b/net/mac80211/iface.c
index b31417f40bd56..39943c33abbfa 100644
--- a/net/mac80211/iface.c
+++ b/net/mac80211/iface.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include "ieee80211_i.h"
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 972895e9f22dc..3527b17f235a8 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -17,6 +17,7 @@
 #include 
 #include 
 #include 
+#include <linux/kcov.h>
 #include 
 #include 
 #include 
-- 
2.30.0



Re: Should RCU_BOOST kernels use hrtimers in GP kthread?

2021-02-17 Thread Sebastian Andrzej Siewior
On 2021-02-17 11:19:07 [-0800], Paul E. McKenney wrote:
> > Ah. One nice thing is that you can move the RCU threads to a
> > housekeeping CPU - away from the CPU(s) running the RT tasks. Would this
> > scenario be still affected (if ksoftirqd would be blocked)?
> 
> At this point, I am going to say that it is the sysadm's job to place
> the rcuo kthreads, and if they are placed poorly, life is hard.

Good. Because that is what I suggest :)

> > Oh. One thing I forgot to mention: the timer_list timer is nice in terms
> > of moving forward (the timer did not fire, the condition is true and you
> > move the timeout forward).
> > A hrtimer timer on the other hand needs to be removed, forwarded and
> > added back to the "timer tree". This is considered more expensive
> > especially if the timer does not fire.
> 
> There are some timers that are used to cause a wakeup to happen from
> a clean environment, but maybe these can instead use irq-work.

irq-work also has a "hard" mode because people ended up throwing
everything in there.

> That it can!  Aravinda Prasad prototyped a mechanism hinting to the
> hypervisor in such cases, but I don't know that this ever saw the light
> of day.

Ah, good to know.

> > My understanding of the need for RCU boosting is to get a task,
> > preempted (by a RT task) within a RCU section, back on the CPU to
> > at least close the RCU section. So it is possible to run RCU callbacks
> > and free memory.
> > The 10 seconds without RCU callbacks shouldn't be bad unless the OOM
> > killer got nervous (and if we had memory allocation failures).
> > Also, running thousands of accumulated callbacks isn't good either.
> 
> Sounds good, thank you!

I hope my understanding was correct. Glad to be of service :)

> 
>   Thanx, Paul
> 
Sebastian


Re: Should RCU_BOOST kernels use hrtimers in GP kthread?

2021-02-17 Thread Sebastian Andrzej Siewior
On 2021-02-17 07:54:47 [-0800], Paul E. McKenney wrote:
> > I thought boosting is accomplished by acquiring a rt_mutex in a
> > rcu_read() section. Do you have some code to point me to, to see how a
> > timer is involved here? Or is it the timer saying that *now* boosting is
> > needed.
> 
> Yes, this last, which is in the grace-period kthread code, for example,
> in rcu_gp_fqs_loop().
>
> > If your hrtimer is a "normal" hrtimer then it will be served by
> > ksoftirqd, too. You would additionally need one of the
> > HRTIMER_MODE_*_HARD to make it work.
> 
> Good to know.  Anything I should worry about for this mode?

Well. It is always hardirq. No spinlock_t, etc. within that callback.
If you intend to wake a thread, that thread needs an elevated priority,
otherwise it won't be scheduled (assuming there is an RT task running
which would otherwise block ksoftirqd).

Ah. One nice thing is that you can move the RCU threads to a
housekeeping CPU - away from the CPU(s) running the RT tasks. Would this
scenario still be affected (if ksoftirqd were blocked)?

Oh. One thing I forgot to mention: the timer_list timer is nice in terms
of moving forward (the timer did not fire, the condition is true and you
move the timeout forward).
A hrtimer timer on the other hand needs to be removed, forwarded and
added back to the "timer tree". This is considered more expensive
especially if the timer does not fire.

> Also, the current test expects callbacks to be invoked, which involves a
> number of additional kthreads and timers, for example, in nocb_gp_wait().
> I suppose I could instead look at grace-period sequence numbers, but I
> believe that real-life use cases needing RCU priority boosting also need
> the callbacks to be invoked reasonably quickly (as in within hundreds
> of milliseconds up through very small numbers of seconds).

A busy/overloaded kvm-host could lead to delays by not scheduling the
guest for a while.

My understanding of the need for RCU boosting is to get a task,
preempted (by a RT task) within a RCU section, back on the CPU to
at least close the RCU section. So it is possible to run RCU callbacks
and free memory.
The 10 seconds without RCU callbacks shouldn't be bad unless the OOM
killer got nervous (and if we had memory allocation failures).
Also, running thousands of accumulated callbacks isn't good either.

> Thoughts?
> 
>   Thanx, Paul
Sebastian


Re: Should RCU_BOOST kernels use hrtimers in GP kthread?

2021-02-17 Thread Sebastian Andrzej Siewior
On 2021-02-16 10:36:09 [-0800], Paul E. McKenney wrote:
> Hello, Sebastian,

Hi Paul,

> I punted on this for the moment by making RCU priority boosting testing
> depend on CONFIG_PREEMPT_RT, but longer term I am wondering if RCU's
> various timed delays and timeouts should use hrtimers rather than normal
> timers in kernels built with CONFIG_RCU_BOOST.  As it is, RCU priority
> boosting can be defeated if any of the RCU grace-period kthread's timeouts
> are serviced by the non-realtime ksoftirqd.

I thought boosting is accomplished by acquiring a rt_mutex in a
rcu_read() section. Do you have some code to point me to, to see how a
timer is involved here? Or is it the timer saying that *now* boosting is
needed.

If your hrtimer is a "normal" hrtimer then it will be served by
ksoftirqd, too. You would additionally need one of the
HRTIMER_MODE_*_HARD to make it work.

> This might require things like swait_event_idle_hrtimeout_exclusive(),
> either as primitives or just open coded.
> 
> Thoughts?
> 
>   Thanx, Paul

Sebastian


[tip: sched/core] smp: Process pending softirqs in flush_smp_call_function_from_idle()

2021-02-17 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: f9d34595ae4feed38856b88769e2ba5af22d2548
Gitweb: https://git.kernel.org/tip/f9d34595ae4feed38856b88769e2ba5af22d2548
Author: Sebastian Andrzej Siewior 
AuthorDate: Sat, 23 Jan 2021 21:10:25 +01:00
Committer: Ingo Molnar 
CommitterDate: Wed, 17 Feb 2021 14:12:42 +01:00

smp: Process pending softirqs in flush_smp_call_function_from_idle()

send_call_function_single_ipi() may wake an idle CPU without sending an
IPI. The woken up CPU will process the SMP-functions in
flush_smp_call_function_from_idle(). Any raised softirq from within the
SMP-function call will not be processed.
Should the CPU have no tasks assigned, then it will go back to idle with
pending softirqs and the NOHZ will rightfully complain.

Process pending softirqs on return from flush_smp_call_function_queue().

Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()")
Reported-by: Jens Axboe 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Signed-off-by: Ingo Molnar 
Link: https://lkml.kernel.org/r/20210123201027.3262800-2-bige...@linutronix.de
---
 kernel/smp.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index 1b6070b..aeb0adf 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -449,6 +450,9 @@ void flush_smp_call_function_from_idle(void)
 
local_irq_save(flags);
flush_smp_call_function_queue(true);
+   if (local_softirq_pending())
+   do_softirq();
+
local_irq_restore(flags);
 }
 


[PATCH v4] auxdisplay: Remove in_interrupt() usage.

2021-02-16 Thread Sebastian Andrzej Siewior
charlcd_write() is invoked as a VFS->write() callback and as such it is
always invoked from preemptible context and may sleep.

charlcd_puts() is invoked from register/unregister callback which is
preemptible. The reboot notifier callback is also invoked from
preemptible context.

Therefore there is no need to use in_interrupt() to figure out if it
is safe to sleep because it always is. in_interrupt() and related
context checks are being removed from non-core code.
Using schedule() to schedule (and be friendly to others) is
discouraged and cond_resched() should be used instead.

Remove in_interrupt() and use cond_resched() to schedule every 32nd
iteration if needed.

Link: https://lkml.kernel.org/r/20200914204209.256266...@linutronix.de
Cc: Miguel Ojeda Sandonis 
Signed-off-by: Sebastian Andrzej Siewior 
---
v2…v4: - Spelling fixes got lost.
   - Add a comment before cond_resched() as requested by Miguel Ojeda.
   - Drop the ` ' comments.
v2…v3: Extend the commit message as suggested by Miguel Ojeda.
v1…v2: Spelling fixes.

 drivers/auxdisplay/charlcd.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/drivers/auxdisplay/charlcd.c b/drivers/auxdisplay/charlcd.c
index f43430e9dceed..95accc941023a 100644
--- a/drivers/auxdisplay/charlcd.c
+++ b/drivers/auxdisplay/charlcd.c
@@ -470,12 +470,14 @@ static ssize_t charlcd_write(struct file *file, const char __user *buf,
char c;
 
for (; count-- > 0; (*ppos)++, tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
+   if (((count + 1) & 0x1f) == 0) {
/*
-* let's be a little nice with other processes
-* that need some CPU
+* charlcd_write() is invoked as a VFS->write() callback
+* and as such it is always invoked from preemptible
+* context and may sleep.
 */
-   schedule();
+   cond_resched();
+   }
 
if (get_user(c, tmp))
return -EFAULT;
@@ -537,12 +539,8 @@ static void charlcd_puts(struct charlcd *lcd, const char *s)
int count = strlen(s);
 
for (; count-- > 0; tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
charlcd_write_char(lcd, *tmp);
}
-- 
2.30.0



Re: [PATCH v3] auxdisplay: Remove in_interrupt() usage.

2021-02-16 Thread Sebastian Andrzej Siewior
On 2021-02-16 13:42:19 [+0100], Miguel Ojeda wrote:
> It is not so much about documenting the obvious, but about stating
> that 1) the precondition was properly taken into account and that 2)
> nothing non-obvious is undocumented. When code is changed later on, it
> is much more likely assumptions are broken if not documented.

That should be part of the commit message. You can always rewind to the
commit message that introduced something and check if the commit message
made sense or ignored a detail which made it wrong (or so).

> In fact, from a quick git blame, that seems to be what happened here:
> originally the function could be called from a public function
> intended to be used from inside the kernel; so I assume it was the
> intention to allow calls from softirq contexts. Then it was refactored
> and the check never removed. In this case, the extra check is not a
> big deal, but going in the opposite direction can happen too, and then
> we will have a bug.

So it was needed once, it is not needed anymore. That was what I was
arguing about in v1. No word about removing in_interrupt() from drivers
in general.

> In general, when a patch for a fix is needed, it's usually a good idea
> to add a comment right in the code. Even if only to avoid someone else
> having to backtrack the calls to see it is only called form fs_ops
> etc.

This is not a fix. It just removes code that is not needed. Also I don't
think it is a good idea to add a comment just to save someone from
backtracking / double checking something. If you rely on a comment
instead of double checking that what you do is indeed correct, you will
one day rely on a stale comment and commit a bug.

To give you another example: If I would have come along and replaced
GFP_ATOMIC with GFP_KERNEL would you ask for a comment?

Anyway, I'm posting a patch with changes as ordered in a jiffy.

> Cheers,
> Miguel

Sebastian


[ANNOUNCE] v5.11-rt5

2021-02-16 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11-rt5 patch set. 

Changes since v5.11-rt4:

  - Lazy preemption fix for 64bit PowerPC. It was broken since
v5.9-rc2-rt1. Reported by John Ogness.

  - Two patches for chelsio/cxgb network driver to avoid
tasklet_disable() usage in atomic context on !RT.

  - Due to recent softirq rework it was not possible to compile a kernel
with RT && !SMP. Reported by Jonathan Schwender, patch by Christian
Eggers.

  - Update the block-mq patches to the version, that has been staged for
upstream.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.11-rt4 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11-rt4-rt5.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11-rt5

The RT patch against v5.11 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11-rt5.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11-rt5.tar.xz

Sebastian

diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index b304c68dbcf50..092c014b0653e 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -401,7 +401,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
if (preempt_count() == 0)
preempt_schedule_irq();
} else if (unlikely(*ti_flagsp & _TIF_NEED_RESCHED_LAZY)) {
-   if (current_thread_info()->preempt_lazy_count == 0)
+   if ((preempt_count() == 0) &&
+   (current_thread_info()->preempt_lazy_count == 0))
preempt_schedule_irq();
}
}
diff --git a/drivers/net/ethernet/chelsio/cxgb/common.h b/drivers/net/ethernet/chelsio/cxgb/common.h
index 6475060649e90..0321be77366c4 100644
--- a/drivers/net/ethernet/chelsio/cxgb/common.h
+++ b/drivers/net/ethernet/chelsio/cxgb/common.h
@@ -238,7 +238,6 @@ struct adapter {
int msg_enable;
u32 mmio_len;
 
-   struct work_struct ext_intr_handler_task;
struct adapter_params params;
 
/* Terminator modules. */
@@ -257,6 +256,7 @@ struct adapter {
 
/* guards async operations */
spinlock_t async_lock cacheline_aligned;
+   u32 pending_thread_intr;
u32 slow_intr_mask;
int t1powersave;
 };
@@ -334,8 +334,7 @@ void t1_interrupts_enable(adapter_t *adapter);
 void t1_interrupts_disable(adapter_t *adapter);
 void t1_interrupts_clear(adapter_t *adapter);
 int t1_elmer0_ext_intr_handler(adapter_t *adapter);
-void t1_elmer0_ext_intr(adapter_t *adapter);
-int t1_slow_intr_handler(adapter_t *adapter);
+irqreturn_t t1_slow_intr_handler(adapter_t *adapter);
 
 int t1_link_start(struct cphy *phy, struct cmac *mac, struct link_config *lc);
 const struct board_info *t1_get_board_info(unsigned int board_id);
int t1_get_board_rev(adapter_t *adapter, const struct board_info *bi,
 int t1_init_hw_modules(adapter_t *adapter);
 int t1_init_sw_modules(adapter_t *adapter, const struct board_info *bi);
 void t1_free_sw_modules(adapter_t *adapter);
-void t1_fatal_err(adapter_t *adapter);
 void t1_link_changed(adapter_t *adapter, int port_id);
 void t1_link_negotiated(adapter_t *adapter, int port_id, int link_stat,
int speed, int duplex, int pause);
diff --git a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
index 0e4a0f413960a..512da98019c66 100644
--- a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
+++ b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
@@ -211,9 +211,10 @@ static int cxgb_up(struct adapter *adapter)
t1_interrupts_clear(adapter);
 
adapter->params.has_msi = !disable_msi && !pci_enable_msi(adapter->pdev);
-   err = request_irq(adapter->pdev->irq, t1_interrupt,
- adapter->params.has_msi ? 0 : IRQF_SHARED,
- adapter->name, adapter);
+   err = request_threaded_irq(adapter->pdev->irq, t1_interrupt,
+  t1_interrupt_thread,
+  adapter->params.has_msi ? 0 : IRQF_SHARED,
+  adapter->name, adapter);
if (err) {
if (adapter->params.has_msi)
pci_disable_msi(adapter->pdev);
@@ -916,51 +917,6 @@ static void mac_stats_task(struct work_struct *work)
spin_unlock(&adapter->work_lock);
 }
 
-/*
- * Processes elmer0 external interrupts 

Re: [PATCH v3] auxdisplay: Remove in_interrupt() usage.

2021-02-16 Thread Sebastian Andrzej Siewior
On 2021-02-16 10:32:15 [+0100], Miguel Ojeda wrote:
> Hi Sebastian,
Hi,

> On Sat, Feb 13, 2021 at 5:50 PM Sebastian Andrzej Siewior
>  wrote:
> >
> > charlcd_write() is invoked as a VFS->write() callback and as such it is
> > always invoked from preemptible context and may sleep.
> 
> Can we put this sentence as a comment in the code, right before the
> call to cond_resched()?
> 
> > charlcd_puts() is invoked from register/unregister callback which is
> > preemptible. The reboot notifier callback is also invoked from
> 
> Same for this one.

Could we please avoid documenting the obvious? It is more or less common
knowledge that the write callback (like any other) runs in preemptible user
context (in which write occurs). The same is true for register/probe
functions. The non-preemptible / atomic case is mostly the exception,
because of the calling context - like from a timer or an interrupt.

> In addition, somehow the spelling fixes got lost from the previous version.
> 
> Same for the "code quotes": some have no quotes, others have `` or `'.
> No big deal, I can fix it on my side if needed, but just letting you
> know! :-)

I'm so sorry. I must have taken the wrong patch while doing the update.
My apologies. Once we sorted out the above, I will provide an update.

> Thanks!
> 
> Cheers,
> Miguel

Sebastian


Re: [RFC PATCH 01/13] futex2: Implement wait and wake functions

2021-02-16 Thread Sebastian Andrzej Siewior
On 2021-02-16 10:56:14 [+0100], Peter Zijlstra wrote:
> So while I'm in favour of adding a new interface, I'm not sure I see
> benefit of reimplementing the basics, sure it seems simpler now, but
> that's because you've not implemented all the 'fun' stuff.

The last attempt tried to hide the updated interface within libc which
did not fly. The global hash state is one of the problems because it
leads to hash collisions of two unrelated locks.
It will get simpler if we go into the kernel for each lock/unlock
operation but this might not be very good in terms of performance for
locks which are mostly uncontended. I'm not sure how much we can cheat
in terms of the VDSO.

Sebastian


Re: [PATCH 34/33] netfs: Use in_interrupt() not in_softirq()

2021-02-16 Thread Sebastian Andrzej Siewior
On 2021-02-16 09:42:30 [+0100], Christoph Hellwig wrote:
> On Mon, Feb 15, 2021 at 10:46:23PM +, David Howells wrote:
> > The in_softirq() in netfs_rreq_terminated() works fine for the cache being
> > on a normal disk, as the completion handlers may get called in softirq
> > context, but for an NVMe drive, the completion handler may get called in
> > IRQ context.
> > 
> > Fix to use in_interrupt() instead of in_softirq() throughout the read
> > helpers, particularly when deciding whether to punt code that might sleep
> > off to a worker thread.
> 
> We must not use either check, as they all are unreliable especially
> for PREEMPT-RT.

Yes, please. I try to cleanup the users one by one
https://lore.kernel.org/r/20200914204209.256266...@linutronix.de/

https://lore.kernel.org/amd-gfx/20210209124439.408140-1-bige...@linutronix.de/

Sebastian


[ANNOUNCE] v5.10.16-rt30

2021-02-16 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.10.16-rt30 patch set. 

Changes since v5.10.16-rt29:

  - Due to recent softirq rework it was not possible to compile a kernel
with RT && !SMP. Reported by Jonathan Schwender, patch by Christian
Eggers.

  - Update the block-mq patches to the version, that has been staged for
upstream.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.10.16-rt29 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/incr/patch-5.10.16-rt29-rt30.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.10.16-rt30

The RT patch against v5.10.16 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.16-rt30.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.16-rt30.tar.xz

Sebastian

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5b27fd6c8c7c2..b293f74ea8cad 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -565,15 +565,12 @@ void blk_mq_end_request(struct request *rq, blk_status_t error)
 }
 EXPORT_SYMBOL(blk_mq_end_request);
 
-static void blk_complete_reqs(struct llist_head *cpu_list)
+static void blk_complete_reqs(struct llist_head *list)
 {
-   struct llist_node *entry;
-   struct request *rq, *rq_next;
+   struct llist_node *entry = llist_reverse_order(llist_del_all(list));
+   struct request *rq, *next;
 
-   entry = llist_del_all(cpu_list);
-   entry = llist_reverse_order(entry);
-
-   llist_for_each_entry_safe(rq, rq_next, entry, ipi_list)
+   llist_for_each_entry_safe(rq, next, entry, ipi_list)
rq->q->mq_ops->complete(rq);
 }
 
@@ -619,9 +616,34 @@ static inline bool blk_mq_complete_need_ipi(struct request *rq)
return cpu_online(rq->mq_ctx->cpu);
 }
 
+static void blk_mq_complete_send_ipi(struct request *rq)
+{
+   struct llist_head *list;
+   unsigned int cpu;
+
+   cpu = rq->mq_ctx->cpu;
+   list = &per_cpu(blk_cpu_done, cpu);
+   if (llist_add(&rq->ipi_list, list)) {
+   rq->csd.func = __blk_mq_complete_request_remote;
+   rq->csd.info = rq;
+   rq->csd.flags = 0;
+   smp_call_function_single_async(cpu, &rq->csd);
+   }
+}
+
+static void blk_mq_raise_softirq(struct request *rq)
+{
+   struct llist_head *list;
+
+   preempt_disable();
+   list = this_cpu_ptr(&blk_cpu_done);
+   if (llist_add(&rq->ipi_list, list))
+   raise_softirq(BLOCK_SOFTIRQ);
+   preempt_enable();
+}
+
 bool blk_mq_complete_request_remote(struct request *rq)
 {
-   struct llist_head *cpu_list;
WRITE_ONCE(rq->state, MQ_RQ_COMPLETE);
 
/*
@@ -632,27 +654,15 @@ bool blk_mq_complete_request_remote(struct request *rq)
return false;
 
if (blk_mq_complete_need_ipi(rq)) {
-   unsigned int cpu;
-
-   cpu = rq->mq_ctx->cpu;
-   cpu_list = &per_cpu(blk_cpu_done, cpu);
-   if (llist_add(&rq->ipi_list, cpu_list)) {
-   rq->csd.func = __blk_mq_complete_request_remote;
-   rq->csd.flags = 0;
-   smp_call_function_single_async(cpu, &rq->csd);
-   }
-   } else {
-   if (rq->q->nr_hw_queues > 1)
-   return false;
-
-   preempt_disable();
-   cpu_list = this_cpu_ptr(_cpu_done);
-   if (llist_add(>ipi_list, cpu_list))
-   raise_softirq(BLOCK_SOFTIRQ);
-   preempt_enable();
+   blk_mq_complete_send_ipi(rq);
+   return true;
}
 
-   return true;
+   if (rq->q->nr_hw_queues == 1) {
+   blk_mq_raise_softirq(rq);
+   return true;
+   }
+   return false;
 }
 EXPORT_SYMBOL_GPL(blk_mq_complete_request_remote);
 
diff --git a/kernel/smp.c b/kernel/smp.c
index 4d17501433be7..23778281aaa70 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -449,6 +450,19 @@ void flush_smp_call_function_from_idle(void)
 
local_irq_save(flags);
flush_smp_call_function_queue(true);
+
+   if (local_softirq_pending()) {
+
+   if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+   do_softirq();
+   } else {
+   struct task_struct *ksoftirqd = this_cpu_ksoftirqd();
+
+   if (ksoftirqd && ksoftirqd->state != TASK_RUNNING)
+   wake_up_process(ksoftirqd);
+   }
+   }
+
local_irq_restore(flags);
 }
 
diff --git a/kernel/softirq.c 

[PATCH RT] smp: Wake ksoftirqd on PREEMPT_RT instead do_softirq().

2021-02-15 Thread Sebastian Andrzej Siewior
The softirq implementation on PREEMPT_RT does not provide do_softirq().
The other user of do_softirq() is replaced with a local_bh_disable()
+ enable() around the possible raise-softirq invocation. This can not be
done here because migration_cpu_stop() is invoked with disabled
preemption.

Wake the softirq thread on PREEMPT_RT if there are any pending softirqs.

Signed-off-by: Sebastian Andrzej Siewior 
---
 kernel/smp.c |   14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -450,8 +450,18 @@ void flush_smp_call_function_from_idle(void)
 
local_irq_save(flags);
flush_smp_call_function_queue(true);
-   if (local_softirq_pending())
-   do_softirq();
+
+   if (local_softirq_pending()) {
+
+   if (!IS_ENABLED(CONFIG_PREEMPT_RT)) {
+   do_softirq();
+   } else {
+   struct task_struct *ksoftirqd = this_cpu_ksoftirqd();
+
+   if (ksoftirqd && ksoftirqd->state != TASK_RUNNING)
+   wake_up_process(ksoftirqd);
+   }
+   }
 
local_irq_restore(flags);
 }


[tip: core/rcu] rcu: Make RCU_BOOST default on CONFIG_PREEMPT_RT

2021-02-15 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the core/rcu branch of tip:

Commit-ID: 2341bc4a0311e4319ced6c2828bb19309dee74fd
Gitweb: https://git.kernel.org/tip/2341bc4a0311e4319ced6c2828bb19309dee74fd
Author: Sebastian Andrzej Siewior 
AuthorDate: Tue, 15 Dec 2020 15:16:45 +01:00
Committer: Paul E. McKenney 
CommitterDate: Mon, 04 Jan 2021 13:43:50 -08:00

rcu: Make RCU_BOOST default on CONFIG_PREEMPT_RT

On PREEMPT_RT kernels, RCU callbacks are deferred to the `rcuc' kthread.
This can stall RCU grace periods due to lengthy preemption not only of RCU
readers but also of 'rcuc' kthreads, either of which prevent grace periods
from completing, which can in turn result in OOM.  Because PREEMPT_RT
kernels have more kthreads that can block grace periods, it is more
important for such kernels to enable RCU_BOOST.

This commit therefore makes RCU_BOOST the default on PREEMPT_RT.
RCU_BOOST can still be manually disabled if need be.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Paul E. McKenney 
---
 kernel/rcu/Kconfig | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index cdc57b4..aa8cc8c 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -188,8 +188,8 @@ config RCU_FAST_NO_HZ
 
 config RCU_BOOST
bool "Enable RCU priority boosting"
-   depends on RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT
-   default n
+   depends on (RT_MUTEXES && PREEMPT_RCU && RCU_EXPERT) || PREEMPT_RT
+   default y if PREEMPT_RT
help
  This option boosts the priority of preempted RCU readers that
  block the current preemptible RCU grace period for too long.


Re: [PATCH 2/2] rcu-tasks: add RCU-tasks self tests

2021-02-15 Thread Sebastian Andrzej Siewior
On 2021-02-13 08:45:54 [-0800], Paul E. McKenney wrote:
> Glad you like it!  But let's see which (if any) of these patches solves
> the problem for Sebastian.

Looking at that, is there any reason for doing this which could not be
addressed by moving the self-test a little later? Maybe once we have
reached at least SYSTEM_SCHEDULING?
This happens now even before lockdep is up or the console is registered.
So if something bad happens, you end up with a blank terminal.

There is nothing else that early in the boot process that requires
working softirq. The only exception to this is wait_task_inactive()
which is used while starting a new thread (including the ksoftirqd)
which is why it was moved to schedule_hrtimeout().

>   Thanx, Paul

Sebastian


[PATCH v3] auxdisplay: Remove in_interrupt() usage.

2021-02-13 Thread Sebastian Andrzej Siewior
charlcd_write() is invoked as a VFS->write() callback and as such it is
always invoked from preemptible context and may sleep.

charlcd_puts() is invoked from register/unregister callback which is
preemptible. The reboot notifier callback is also invoked from
preemptible context.

Therefore there is no need to use `in_interrupt()' to figure out if it
is safe to sleep because it always is. `in_interrupt()` and related
context checks are being removed from non-core code.
Using `schedule()' to schedule (and be friendly to others) is
discouraged and `cond_resched()' should be used instead.

Remove `in_interrupt()' and use `cond_resched()' to schedule every 32nd
iteration if needed.

Link: https://lkml.kernel.org/r/20200914204209.256266...@linutronix.de
Cc: Miguel Ojeda Sandonis 
Signed-off-by: Sebastian Andrzej Siewior 
---
v2…v3: Extend the commit message as suggested by Miguel Ojeda.
v1…v2: Spelling fixes.

On 2021-02-10 22:46:23 [+0100], Miguel Ojeda wrote:
> Hi Sebastian,
Hi Miguel,

> Yeah, it is a bit confusing when reading without the context (it is
> hard to keep up with everything going on unless you work full-time on
> it :-)

Sorry for leaving it out. I thought it was not needed.

> > since this patch was small, simple and removing not required code I kept
> > it out. Is this enough information for you?
> 
> If you don't mind, please add a quick sentence like (I can do it on my
> side too):
> 
> `in_interrupt()` and related context checks are being removed
> from non-core code.
> 
> Plus the tag:
> 
> Link: https://lore.kernel.org/r/20200914204209.256266...@linutronix.de/

Added as suggested.

 drivers/auxdisplay/charlcd.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/auxdisplay/charlcd.c b/drivers/auxdisplay/charlcd.c
index f43430e9dceed..fbfce95919f72 100644
--- a/drivers/auxdisplay/charlcd.c
+++ b/drivers/auxdisplay/charlcd.c
@@ -470,12 +470,8 @@ static ssize_t charlcd_write(struct file *file, const char __user *buf,
char c;
 
for (; count-- > 0; (*ppos)++, tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
if (get_user(c, tmp))
return -EFAULT;
@@ -537,12 +533,8 @@ static void charlcd_puts(struct charlcd *lcd, const char *s)
int count = strlen(s);
 
for (; count-- > 0; tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
charlcd_write_char(lcd, *tmp);
}
-- 
2.30.0


Re: [PATCH 2/2] rcu-tasks: add RCU-tasks self tests

2021-02-12 Thread Sebastian Andrzej Siewior
On 2020-12-09 21:27:32 [+0100], Uladzislau Rezki (Sony) wrote:
> Add self tests for checking of RCU-tasks API functionality.
> It covers:
> - wait API functions;
> - invoking/completion call_rcu_tasks*().
> 
> Self-tests are run when CONFIG_PROVE_RCU kernel parameter is set.

I just bisected to this commit. By booting with `threadirqs' I end up
with:
[0.176533] Running RCU-tasks wait API self tests

No stall warning or so.
It boots again with:

diff --git a/init/main.c b/init/main.c
--- a/init/main.c
+++ b/init/main.c
@@ -1489,6 +1489,7 @@ void __init console_on_rootfs(void)
fput(file);
 }
 
+void rcu_tasks_initiate_self_tests(void);
 static noinline void __init kernel_init_freeable(void)
 {
/*
@@ -1514,6 +1515,7 @@ static noinline void __init kernel_init_freeable(void)
 
rcu_init_tasks_generic();
do_pre_smp_initcalls();
+   rcu_tasks_initiate_self_tests();
lockup_detector_init();
 
smp_init();
diff --git a/kernel/rcu/tasks.h b/kernel/rcu/tasks.h
--- a/kernel/rcu/tasks.h
+++ b/kernel/rcu/tasks.h
@@ -1266,7 +1266,7 @@ static void test_rcu_tasks_callback(struct rcu_head *rhp)
rttd->notrun = true;
 }
 
-static void rcu_tasks_initiate_self_tests(void)
+void rcu_tasks_initiate_self_tests(void)
 {
pr_info("Running RCU-tasks wait API self tests\n");
 #ifdef CONFIG_TASKS_RCU
@@ -1322,7 +1322,6 @@ void __init rcu_init_tasks_generic(void)
 #endif
 
// Run the self-tests.
-   rcu_tasks_initiate_self_tests();
 }
 
 #else /* #ifdef CONFIG_TASKS_RCU_GENERIC */

> Signed-off-by: Uladzislau Rezki (Sony) 

Sebastian


[tip: core/rcu] doc: Update RCU's requirements page about the PREEMPT_RT wiki

2021-02-12 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the core/rcu branch of tip:

Commit-ID: 361c0f3d80dc3b54c20a19e8ffa2ad728fc1d23d
Gitweb: https://git.kernel.org/tip/361c0f3d80dc3b54c20a19e8ffa2ad728fc1d23d
Author: Sebastian Andrzej Siewior 
AuthorDate: Tue, 15 Dec 2020 15:16:48 +01:00
Committer: Paul E. McKenney 
CommitterDate: Wed, 06 Jan 2021 16:10:41 -08:00

doc: Update RCU's requirements page about the PREEMPT_RT wiki

The PREEMPT_RT wiki moved from kernel.org to the Linux Foundation wiki.
The kernel.org wiki is read only.

This commit therefore updates the URL of the active PREEMPT_RT wiki.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/Design/Requirements/Requirements.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index 65c7839..bac1cdd 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -2317,7 +2317,7 @@ decides to throw at it.
 
 The Linux kernel is used for real-time workloads, especially in
 conjunction with the `-rt
-patchset <https://rt.wiki.kernel.org/index.php/Main_Page>`__. The
+patchset <https://wiki.linuxfoundation.org/realtime/>`__. The
 real-time-latency response requirements are such that the traditional
 approach of disabling preemption across RCU read-side critical sections
 is inappropriate. Kernels built with ``CONFIG_PREEMPT=y`` therefore use


[tip: core/rcu] doc: Use CONFIG_PREEMPTION

2021-02-12 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the core/rcu branch of tip:

Commit-ID: 81ad58be2f83f9bd675f67ca5b8f420358ddf13c
Gitweb:
https://git.kernel.org/tip/81ad58be2f83f9bd675f67ca5b8f420358ddf13c
Author:Sebastian Andrzej Siewior 
AuthorDate:Tue, 15 Dec 2020 15:16:49 +01:00
Committer: Paul E. McKenney 
CommitterDate: Wed, 06 Jan 2021 16:10:44 -08:00

doc: Use CONFIG_PREEMPTION

CONFIG_PREEMPTION is selected by CONFIG_PREEMPT and by CONFIG_PREEMPT_RT.
Both PREEMPT and PREEMPT_RT require the same functionality which today
depends on CONFIG_PREEMPT.

Update the documents and mention CONFIG_PREEMPTION. Spell out
CONFIG_PREEMPT_RT (instead PREEMPT_RT) since it is an option now.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Paul E. McKenney 
---
 Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst |  4 ++--
 Documentation/RCU/Design/Requirements/Requirements.rst                       | 22 +++---
 Documentation/RCU/checklist.rst                                              |  2 +-
 Documentation/RCU/rcubarrier.rst                                             |  6 +++---
 Documentation/RCU/stallwarn.rst                                              |  4 ++--
 Documentation/RCU/whatisRCU.rst                                              | 10 +-
 6 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
index 72f0f6f..6f89cf1 100644
--- a/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
+++ b/Documentation/RCU/Design/Expedited-Grace-Periods/Expedited-Grace-Periods.rst
@@ -38,7 +38,7 @@ sections.
 RCU-preempt Expedited Grace Periods
 ===
 
-``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
+``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
 The overall flow of the handling of a given CPU by an RCU-preempt
 expedited grace period is shown in the following diagram:
 
@@ -112,7 +112,7 @@ things.
 RCU-sched Expedited Grace Periods
 -
 
-``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
+``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
 the handling of a given CPU by an RCU-sched expedited grace period is
 shown in the following diagram:
 
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index bac1cdd..42a81e3 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -78,7 +78,7 @@ RCU treats a nested set as one big RCU read-side critical section.
 Production-quality implementations of rcu_read_lock() and
 rcu_read_unlock() are extremely lightweight, and in fact have
 exactly zero overhead in Linux kernels built for production use with
-``CONFIG_PREEMPT=n``.
+``CONFIG_PREEMPTION=n``.
 
 This guarantee allows ordering to be enforced with extremely low
 overhead to readers, for example:
@@ -1181,7 +1181,7 @@ and has become decreasingly so as memory sizes have expanded and memory
 costs have plummeted. However, as I learned from Matt Mackall's
 `bloatwatch <http://elinux.org/Linux_Tiny-FAQ>`__ efforts, memory
 footprint is critically important on single-CPU systems with
-non-preemptible (``CONFIG_PREEMPT=n``) kernels, and thus `tiny
+non-preemptible (``CONFIG_PREEMPTION=n``) kernels, and thus `tiny
 RCU <https://lore.kernel.org/r/20090113221724.ga15...@linux.vnet.ibm.com>`__
 was born. Josh Triplett has since taken over the small-memory banner
 with his `Linux kernel tinification <https://tiny.wiki.kernel.org/>`__
@@ -1497,7 +1497,7 @@ limitations.
 
 Implementations of RCU for which rcu_read_lock() and
 rcu_read_unlock() generate no code, such as Linux-kernel RCU when
-``CONFIG_PREEMPT=n``, can be nested arbitrarily deeply. After all, there
+``CONFIG_PREEMPTION=n``, can be nested arbitrarily deeply. After all, there
 is no overhead. Except that if all these instances of
 rcu_read_lock() and rcu_read_unlock() are visible to the
 compiler, compilation will eventually fail due to exhausting memory,
@@ -1769,7 +1769,7 @@ implementation can be a no-op.
 
 However, once the scheduler has spawned its first kthread, this early
 boot trick fails for synchronize_rcu() (as well as for
-synchronize_rcu_expedited()) in ``CONFIG_PREEMPT=y`` kernels. The
+synchronize_rcu_expedited()) in ``CONFIG_PREEMPTION=y`` kernels. The
 reason is that an RCU read-side critical section might be preempted,
 which means that a subsequent synchronize_rcu() really does have to
 wait for something, as opposed to simply returning immediately.
@@ -2038,7 +2038,7 @@ the following:
5 rcu_read_unlock();
6 do_something_with(v, user_v);
 
-If the compiler did make t

[tip: locking/core] locking/mutex: Kill mutex_trylock_recursive()

2021-02-10 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 0f319d49a4167e402b01b2b56639386f0b6846ba
Gitweb:
https://git.kernel.org/tip/0f319d49a4167e402b01b2b56639386f0b6846ba
Author:Sebastian Andrzej Siewior 
AuthorDate:Wed, 10 Feb 2021 09:52:47 +01:00
Committer: Peter Zijlstra 
CommitterDate: Wed, 10 Feb 2021 14:44:40 +01:00

locking/mutex: Kill mutex_trylock_recursive()

There are no users of mutex_trylock_recursive() in the tree as of
v5.11-rc7.

Remove it.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20210210085248.219210-2-bige...@linutronix.de
---
 include/linux/mutex.h  | 25 -
 kernel/locking/mutex.c | 10 --
 2 files changed, 35 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index dcd185c..0cd631a 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -199,29 +199,4 @@ extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
-/*
- * These values are chosen such that FAIL and SUCCESS match the
- * values of the regular mutex_trylock().
- */
-enum mutex_trylock_recursive_enum {
-   MUTEX_TRYLOCK_FAILED= 0,
-   MUTEX_TRYLOCK_SUCCESS   = 1,
-   MUTEX_TRYLOCK_RECURSIVE,
-};
-
-/**
- * mutex_trylock_recursive - trylock variant that allows recursive locking
- * @lock: mutex to be locked
- *
- * This function should not be used, _ever_. It is purely for hysterical GEM
- * raisins, and once those are gone this will be removed.
- *
- * Returns:
- *  - MUTEX_TRYLOCK_FAILED- trylock failed,
- *  - MUTEX_TRYLOCK_SUCCESS   - lock acquired,
- *  - MUTEX_TRYLOCK_RECURSIVE - we already owned the lock.
- */
-extern /* __deprecated */ __must_check enum mutex_trylock_recursive_enum
-mutex_trylock_recursive(struct mutex *lock);
-
 #endif /* __LINUX_MUTEX_H */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5352ce5..adb9350 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -86,16 +86,6 @@ bool mutex_is_locked(struct mutex *lock)
 }
 EXPORT_SYMBOL(mutex_is_locked);
 
-__must_check enum mutex_trylock_recursive_enum
-mutex_trylock_recursive(struct mutex *lock)
-{
-   if (unlikely(__mutex_owner(lock) == current))
-   return MUTEX_TRYLOCK_RECURSIVE;
-
-   return mutex_trylock(lock);
-}
-EXPORT_SYMBOL(mutex_trylock_recursive);
-
 static inline unsigned long __owner_flags(unsigned long owner)
 {
return owner & MUTEX_FLAGS;


[tip: locking/core] checkpatch: Don't check for mutex_trylock_recursive()

2021-02-10 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the locking/core branch of tip:

Commit-ID: 6c80408a8a0360fa9223b8c21c0ab8ef42e88bfe
Gitweb:
https://git.kernel.org/tip/6c80408a8a0360fa9223b8c21c0ab8ef42e88bfe
Author:Sebastian Andrzej Siewior 
AuthorDate:Wed, 10 Feb 2021 09:52:48 +01:00
Committer: Peter Zijlstra 
CommitterDate: Wed, 10 Feb 2021 14:44:40 +01:00

checkpatch: Don't check for mutex_trylock_recursive()

mutex_trylock_recursive() has been removed from the tree, there is no
need to check for it.

Remove traces of mutex_trylock_recursive()'s existence.

Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20210210085248.219210-3-bige...@linutronix.de
---
 scripts/checkpatch.pl | 6 --
 1 file changed, 6 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 92e888e..15f7f4f 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -7069,12 +7069,6 @@ sub process {
}
}
 
-# check for mutex_trylock_recursive usage
-   if ($line =~ /mutex_trylock_recursive/) {
-   ERROR("LOCKING",
- "recursive locking is bad, do not use this ever.\n" . $herecurr);
-   }
-
 # check for lockdep_set_novalidate_class
if ($line =~ /^.\s*lockdep_set_novalidate_class\s*\(/ ||
$line =~ /__lockdep_no_validate__\s*\)/ ) {


[tip: sched/core] smp: Process pending softirqs in flush_smp_call_function_from_idle()

2021-02-10 Thread tip-bot2 for Sebastian Andrzej Siewior
The following commit has been merged into the sched/core branch of tip:

Commit-ID: 66040b2d5d41f85cb1a752a75260595344c5ec3b
Gitweb:
https://git.kernel.org/tip/66040b2d5d41f85cb1a752a75260595344c5ec3b
Author:Sebastian Andrzej Siewior 
AuthorDate:Sat, 23 Jan 2021 21:10:25 +01:00
Committer: Peter Zijlstra 
CommitterDate: Wed, 10 Feb 2021 14:44:42 +01:00

smp: Process pending softirqs in flush_smp_call_function_from_idle()

send_call_function_single_ipi() may wake an idle CPU without sending an
IPI. The woken up CPU will process the SMP-functions in
flush_smp_call_function_from_idle(). Any raised softirq from within the
SMP-function call will not be processed.
Should the CPU have no tasks assigned, then it will go back to idle with
pending softirqs and the NOHZ will rightfully complain.

Process pending softirqs on return from flush_smp_call_function_queue().

Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()")
Reported-by: Jens Axboe 
Signed-off-by: Sebastian Andrzej Siewior 
Signed-off-by: Peter Zijlstra (Intel) 
Link: https://lkml.kernel.org/r/20210123201027.3262800-2-bige...@linutronix.de
---
 kernel/smp.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/kernel/smp.c b/kernel/smp.c
index 1b6070b..aeb0adf 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include <linux/interrupt.h>
 #include 
 #include 
 #include 
@@ -449,6 +450,9 @@ void flush_smp_call_function_from_idle(void)
 
local_irq_save(flags);
flush_smp_call_function_queue(true);
+   if (local_softirq_pending())
+   do_softirq();
+
local_irq_restore(flags);
 }
 


[PATCH 0/2] locking/mutex: Kill mutex_trylock_recursive()

2021-02-10 Thread Sebastian Andrzej Siewior
Remove mutex_trylock_recursive() from the API and tell checkpatch not to
check it for it anymore.

Sebastian




[PATCH 1/2] locking/mutex: Kill mutex_trylock_recursive()

2021-02-10 Thread Sebastian Andrzej Siewior
There are no users of mutex_trylock_recursive() in the tree as of
v5.11-rc7.

Remove it.

Signed-off-by: Sebastian Andrzej Siewior 
---
 include/linux/mutex.h  | 25 -
 kernel/locking/mutex.c | 10 --
 2 files changed, 35 deletions(-)

diff --git a/include/linux/mutex.h b/include/linux/mutex.h
index dcd185cbfe793..0cd631a197276 100644
--- a/include/linux/mutex.h
+++ b/include/linux/mutex.h
@@ -199,29 +199,4 @@ extern void mutex_unlock(struct mutex *lock);
 
 extern int atomic_dec_and_mutex_lock(atomic_t *cnt, struct mutex *lock);
 
-/*
- * These values are chosen such that FAIL and SUCCESS match the
- * values of the regular mutex_trylock().
- */
-enum mutex_trylock_recursive_enum {
-   MUTEX_TRYLOCK_FAILED= 0,
-   MUTEX_TRYLOCK_SUCCESS   = 1,
-   MUTEX_TRYLOCK_RECURSIVE,
-};
-
-/**
- * mutex_trylock_recursive - trylock variant that allows recursive locking
- * @lock: mutex to be locked
- *
- * This function should not be used, _ever_. It is purely for hysterical GEM
- * raisins, and once those are gone this will be removed.
- *
- * Returns:
- *  - MUTEX_TRYLOCK_FAILED- trylock failed,
- *  - MUTEX_TRYLOCK_SUCCESS   - lock acquired,
- *  - MUTEX_TRYLOCK_RECURSIVE - we already owned the lock.
- */
-extern /* __deprecated */ __must_check enum mutex_trylock_recursive_enum
-mutex_trylock_recursive(struct mutex *lock);
-
 #endif /* __LINUX_MUTEX_H */
diff --git a/kernel/locking/mutex.c b/kernel/locking/mutex.c
index 5352ce50a97e3..adb9350907688 100644
--- a/kernel/locking/mutex.c
+++ b/kernel/locking/mutex.c
@@ -86,16 +86,6 @@ bool mutex_is_locked(struct mutex *lock)
 }
 EXPORT_SYMBOL(mutex_is_locked);
 
-__must_check enum mutex_trylock_recursive_enum
-mutex_trylock_recursive(struct mutex *lock)
-{
-   if (unlikely(__mutex_owner(lock) == current))
-   return MUTEX_TRYLOCK_RECURSIVE;
-
-   return mutex_trylock(lock);
-}
-EXPORT_SYMBOL(mutex_trylock_recursive);
-
 static inline unsigned long __owner_flags(unsigned long owner)
 {
return owner & MUTEX_FLAGS;
-- 
2.30.0



[PATCH 2/2] checkpatch: Don't check for mutex_trylock_recursive()

2021-02-10 Thread Sebastian Andrzej Siewior
mutex_trylock_recursive() has been removed from the tree, there is no
need to check for it.

Remove traces of mutex_trylock_recursive()'s existence.

Signed-off-by: Sebastian Andrzej Siewior 
---
 scripts/checkpatch.pl | 6 --
 1 file changed, 6 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 1afe3af1cc097..4b2775fd31d9d 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -7062,12 +7062,6 @@ sub process {
}
}
 
-# check for mutex_trylock_recursive usage
-   if ($line =~ /mutex_trylock_recursive/) {
-   ERROR("LOCKING",
- "recursive locking is bad, do not use this ever.\n" . $herecurr);
-   }
-
 # check for lockdep_set_novalidate_class
if ($line =~ /^.\s*lockdep_set_novalidate_class\s*\(/ ||
$line =~ /__lockdep_no_validate__\s*\)/ ) {
-- 
2.30.0



[ANNOUNCE] v5.10.14-rt28

2021-02-09 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.10.14-rt28 patch set. 

Changes since v5.10.14-rt27:

  - Lazy preemption fix for 64bit PowerPC. It was broken since
v5.9-rc2-rt1. Reported by John Ogness.

  - Two patches for chelsio/cxgb network driver to avoid
tasklet_disable() usage in atomic context on !RT.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.10.14-rt27 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/incr/patch-5.10.14-rt27-rt28.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.10.14-rt28

The RT patch against v5.10.14 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.14-rt28.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.14-rt28.tar.xz

Sebastian

diff --git a/arch/powerpc/kernel/syscall_64.c b/arch/powerpc/kernel/syscall_64.c
index 70c8e9bda1f6c..ae3212dcf5627 100644
--- a/arch/powerpc/kernel/syscall_64.c
+++ b/arch/powerpc/kernel/syscall_64.c
@@ -367,7 +367,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs, unsign
			if (preempt_count() == 0)
				preempt_schedule_irq();
		} else if (unlikely(*ti_flagsp & _TIF_NEED_RESCHED_LAZY)) {
-			if (current_thread_info()->preempt_lazy_count == 0)
+			if ((preempt_count() == 0) &&
+			    (current_thread_info()->preempt_lazy_count == 0))
				preempt_schedule_irq();
}
}
diff --git a/drivers/net/ethernet/chelsio/cxgb/common.h b/drivers/net/ethernet/chelsio/cxgb/common.h
index 6475060649e90..0321be77366c4 100644
--- a/drivers/net/ethernet/chelsio/cxgb/common.h
+++ b/drivers/net/ethernet/chelsio/cxgb/common.h
@@ -238,7 +238,6 @@ struct adapter {
int msg_enable;
u32 mmio_len;
 
-   struct work_struct ext_intr_handler_task;
struct adapter_params params;
 
/* Terminator modules. */
@@ -257,6 +256,7 @@ struct adapter {
 
/* guards async operations */
	spinlock_t async_lock ____cacheline_aligned;
+   u32 pending_thread_intr;
u32 slow_intr_mask;
int t1powersave;
 };
@@ -334,8 +334,7 @@ void t1_interrupts_enable(adapter_t *adapter);
 void t1_interrupts_disable(adapter_t *adapter);
 void t1_interrupts_clear(adapter_t *adapter);
 int t1_elmer0_ext_intr_handler(adapter_t *adapter);
-void t1_elmer0_ext_intr(adapter_t *adapter);
-int t1_slow_intr_handler(adapter_t *adapter);
+irqreturn_t t1_slow_intr_handler(adapter_t *adapter);
 
 int t1_link_start(struct cphy *phy, struct cmac *mac, struct link_config *lc);
 const struct board_info *t1_get_board_info(unsigned int board_id);
@@ -347,7 +346,6 @@ int t1_get_board_rev(adapter_t *adapter, const struct board_info *bi,
 int t1_init_hw_modules(adapter_t *adapter);
 int t1_init_sw_modules(adapter_t *adapter, const struct board_info *bi);
 void t1_free_sw_modules(adapter_t *adapter);
-void t1_fatal_err(adapter_t *adapter);
 void t1_link_changed(adapter_t *adapter, int port_id);
 void t1_link_negotiated(adapter_t *adapter, int port_id, int link_stat,
int speed, int duplex, int pause);
diff --git a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
index 0e4a0f413960a..512da98019c66 100644
--- a/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
+++ b/drivers/net/ethernet/chelsio/cxgb/cxgb2.c
@@ -211,9 +211,10 @@ static int cxgb_up(struct adapter *adapter)
t1_interrupts_clear(adapter);
 
	adapter->params.has_msi = !disable_msi && !pci_enable_msi(adapter->pdev);
-   err = request_irq(adapter->pdev->irq, t1_interrupt,
- adapter->params.has_msi ? 0 : IRQF_SHARED,
- adapter->name, adapter);
+   err = request_threaded_irq(adapter->pdev->irq, t1_interrupt,
+  t1_interrupt_thread,
+  adapter->params.has_msi ? 0 : IRQF_SHARED,
+  adapter->name, adapter);
if (err) {
if (adapter->params.has_msi)
pci_disable_msi(adapter->pdev);
@@ -916,51 +917,6 @@ static void mac_stats_task(struct work_struct *work)
	spin_unlock(&adapter->work_lock);
 }
 
-/*
- * Processes elmer0 external interrupts in process context.
- */
-static void ext_intr_task(struct work_struct *work)
-{
-   struct adapter *adapter =
-   container_of(work, struct adapter, ext_intr_handler_task);
-
-   

Re: [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle()

2021-02-09 Thread Sebastian Andrzej Siewior
On 2021-02-09 11:02:10 [+0100], Peter Zijlstra wrote:
> Fair enough. I'll stick this in tip/sched/smp for Jens and merge that
> into tip/sched/core.

Thank you.

> Thanks!

Sebastian


Re: [PATCH] auxdisplay: Remove in_interrupt() usage.

2021-02-09 Thread Sebastian Andrzej Siewior
On 2021-02-08 23:26:57 [+0100], Miguel Ojeda wrote:
> Thanks -- can you please add a Link: tag to a lore URL or the docs or
> similar where more information can be found regarding the
> proposal/discussion for removing `in_interrupt()` etc.? It is useful
> to track why these things are happening around the kernel.

If I post series with more than just one patch I have a cover letter
including:

|in the discussion about preempt count consistency across kernel
|configurations:
|
| https://lore.kernel.org/r/20200914204209.256266...@linutronix.de/
|
|it was concluded that the usage of in_interrupt() and related context
|checks should be removed from non-core code.
|
|In the long run, usage of 'preemptible, in_*irq etc.' should be banned from
|driver code completely.

Since this patch was small, simple and only removed unneeded code, I
kept it out. Is this enough information for you?

> Also, `hacking.rst` (and related documentation) should be updated
> before this is done, so that we can link to it.

The information is not wrong; it doesn't say you have to use it in your
driver. It also does not mention that you should not. I will look into
this.

> > What I meant was GFP_KERNEL for context which can sleep vs GFP_ATOMIC for
> > context which must not sleep. The commit above also eliminates the
> > in_interrupt() usage within the driver (in multiple steps).
> 
> I was thinking something along those lines, but `in_interrupt()` nor
> `cond_resched()` take an explicit context either, so I am confused.
> Does `cond_reched()` always do the right thing regardless of context?
> The docs are not really clear:
> 
>   "cond_resched() and cond_resched_lock(): latency reduction via
> explicit rescheduling in places that are safe."
> 
> It could be read as "it will only resched whenever safe" or "only to
> be called if it is safe".

You should keep track of the context you are in and not attempt to sleep
if it is not allowed. If you are doing something that may monopolize the
CPU then you should use cond_resched(). The difference compared to
schedule() is that you don't blindly invoke the scheduler and that it is
optimized away on a preemptible kernel. But as you noticed, you must not
not use it if the context does not allow it (like in interrupt handler,
disabled preemption and so on).

> Cheers,
> Miguel

Sebastian


Re: [PATCH] auxdisplay: Remove in_interrupt() usage.

2021-02-08 Thread Sebastian Andrzej Siewior
On 2021-02-08 21:14:54 [+0100], Miguel Ojeda wrote:
> On Mon, Feb 8, 2021 at 8:07 PM Sebastian Andrzej Siewior
>  wrote:
> >
> > Yes.
> 
> In what way?

It hurts to keep in_interrupt() because the goal is to remove it from
drivers. The problem is that the pattern is often copied and people
sometimes get it wrong. For instance, the code here invoked schedule()
based on in_interrupt(). It did not check whether or not the interrupts
are disabled which is also important. It may work now, it can break in
future if an unrelated change is made. An example is commit
   c2d0f1a65ab9f ("scsi: libsas: Introduce a _gfp() variant of event notifiers")
   
https://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git/commit/?id=c2d0f1a65ab9f

in_interrupt() is often used in old code that was written before
might_sleep() and lockdep were introduced.

> > No. If you know the context, pass it along like this is done for
> > kmalloc() for instance.
> 
> What do you mean?

What I meant was GFP_KERNEL for context which can sleep vs GFP_ATOMIC for
context which must not sleep. The commit above also eliminates the
in_interrupt() usage within the driver (in multiple steps).

> Cheers,
> Miguel

Sebastian


[PATCH v2] auxdisplay: Remove in_interrupt() usage.

2021-02-08 Thread Sebastian Andrzej Siewior
charlcd_write() is invoked as a VFS->write() callback and as such it is
always invoked from preemptible context and may sleep.

charlcd_puts() is invoked from register/unregister callback which is
preemptible. The reboot notifier callback is also invoked from
preemptible context.

Therefore there is no need to use `in_interrupt()' to figure out if it
is safe to sleep because it always is.
Using `schedule()' to schedule (and be friendly to others) is
discouraged and `cond_resched()' should be used instead.

Remove `in_interrupt()' and use `cond_resched()' to schedule every 32
iterations if needed.

Cc: Miguel Ojeda Sandonis 
Signed-off-by: Sebastian Andrzej Siewior 
---

v1…v2: Spelling fixes.

 drivers/auxdisplay/charlcd.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/auxdisplay/charlcd.c b/drivers/auxdisplay/charlcd.c
index f43430e9dceed..fbfce95919f72 100644
--- a/drivers/auxdisplay/charlcd.c
+++ b/drivers/auxdisplay/charlcd.c
@@ -470,12 +470,8 @@ static ssize_t charlcd_write(struct file *file, const char __user *buf,
char c;
 
for (; count-- > 0; (*ppos)++, tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
if (get_user(c, tmp))
return -EFAULT;
@@ -537,12 +533,8 @@ static void charlcd_puts(struct charlcd *lcd, const char *s)
int count = strlen(s);
 
for (; count-- > 0; tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
charlcd_write_char(lcd, *tmp);
}
-- 
2.30.0



Re: [PATCH] auxdisplay: Remove in_interrupt() usage.

2021-02-08 Thread Sebastian Andrzej Siewior
On 2021-02-08 19:38:10 [+0100], Miguel Ojeda wrote:
> Hi Sebastian,
Hi,

> > Therefore there is no need to use `in_interrupt()' to figure out if it
> > is save to sleep because it always is.
> 
> save -> safe
> 
> Does it hurt to have `in_interrupt()`? Future patches could make it so
Yes.

> that it is no longer a preemptible context. Should it be moved to e.g.
> a `WARN_ON()` instead?

No. If you know the context, pass it along like this is done for
kmalloc() for instance. The long term plan is not make it available to
divers (i.e. core code only where the context can not be known).

> Thanks for the patch!

I'm going to resend it with your corrections.

> Cheers,
> Miguel

Sebastian


[PATCH 2/2] serial: core: Remove BUG_ON(in_interrupt()) check

2021-02-08 Thread Sebastian Andrzej Siewior
From: "Ahmed S. Darwish" 

The usage of in_interrupt() in drivers is being phased out for various
reasons.

In both exported functions where BUG_ON(in_interrupt()) is invoked,
there is a mutex_lock() afterwards. mutex_lock() contains a
might_sleep() which will already trigger a stack trace if the target
function is called from atomic context.

Remove the BUG_ON() and add a "Context: " in the kernel-doc instead.

Signed-off-by: Ahmed S. Darwish 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/tty/serial/serial_core.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
index 7dacdb6a85345..62dc7b5cd60c6 100644
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -2848,6 +2848,8 @@ static const struct attribute_group tty_dev_attr_group = {
  * @drv: pointer to the uart low level driver structure for this port
  * @uport: uart port structure to use for this port.
  *
+ * Context: task context, might sleep
+ *
  * This allows the driver to register its own uart_port structure
  * with the core driver.  The main purpose is to allow the low
  * level uart drivers to expand uart_port, rather than having yet
@@ -2861,8 +2863,6 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport)
struct device *tty_dev;
int num_groups;
 
-   BUG_ON(in_interrupt());
-
if (uport->line >= drv->nr)
return -EINVAL;
 
@@ -2951,6 +2951,8 @@ int uart_add_one_port(struct uart_driver *drv, struct uart_port *uport)
  * @drv: pointer to the uart low level driver structure for this port
  * @uport: uart port structure for this port
  *
+ * Context: task context, might sleep
+ *
  * This unhooks (and hangs up) the specified port structure from the
  * core driver.  No further calls will be made to the low-level code
  * for this port.
@@ -2963,8 +2965,6 @@ int uart_remove_one_port(struct uart_driver *drv, struct uart_port *uport)
struct tty_struct *tty;
int ret = 0;
 
-   BUG_ON(in_interrupt());
-
	mutex_lock(&port_mutex);
 
/*
-- 
2.30.0



[PATCH 1/2] vt_ioctl: Remove in_interrupt() check

2021-02-08 Thread Sebastian Andrzej Siewior
From: "Ahmed S. Darwish" 

reset_vc() uses a "!in_interrupt()" conditional before resetting the
palettes, which is a blocking operation. Since commit
   8b6312f4dcc1e ("[PATCH] vt: refactor console SAK processing")

all calls are invoked from a workqueue process context, with the
blocking console lock always acquired.

Remove the "!in_interrupt()" check.

Signed-off-by: Ahmed S. Darwish 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/tty/vt/vt_ioctl.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c
index 4a4cbd4a5f37a..89aeaf3c1bca6 100644
--- a/drivers/tty/vt/vt_ioctl.c
+++ b/drivers/tty/vt/vt_ioctl.c
@@ -930,8 +930,7 @@ void reset_vc(struct vc_data *vc)
put_pid(vc->vt_pid);
vc->vt_pid = NULL;
vc->vt_newvt = -1;
-   if (!in_interrupt())/* Via keyboard.c:SAK() - akpm */
-   reset_palette(vc);
+   reset_palette(vc);
 }
 
 void vc_SAK(struct work_struct *work)
-- 
2.30.0



tty: Remove in_interrupt() usage.

2021-02-08 Thread Sebastian Andrzej Siewior
Folks,

a small series removing in_interrupt() usage within the tty layer.

Sebastian



[PATCH] auxdisplay: Remove in_interrupt() usage.

2021-02-08 Thread Sebastian Andrzej Siewior
charlcd_write() is invoked as a VFS->write() callback and as such it is
always invoked from preemptible context and may sleep.

charlcd_puts() is invoked from register/unregister callback which is
preemtible. The reboot notifier callback is also invoked from
preemptible context.

Therefore there is no need to use `in_interrupt()' to figure out if it
is save to sleep because it always is.
Using `schedule()' to schedule (an be friendly to others) is
discouraged and `cond_resched()' should be used instead.

Remove `in_interrupt()' and use `cond_resched()' to schedule every 32
iteration if needed.

Cc: Miguel Ojeda Sandonis 
Signed-off-by: Sebastian Andrzej Siewior 
---
 drivers/auxdisplay/charlcd.c | 16 
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/auxdisplay/charlcd.c b/drivers/auxdisplay/charlcd.c
index f43430e9dceed..fbfce95919f72 100644
--- a/drivers/auxdisplay/charlcd.c
+++ b/drivers/auxdisplay/charlcd.c
@@ -470,12 +470,8 @@ static ssize_t charlcd_write(struct file *file, const char __user *buf,
char c;
 
for (; count-- > 0; (*ppos)++, tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
if (get_user(c, tmp))
return -EFAULT;
@@ -537,12 +533,8 @@ static void charlcd_puts(struct charlcd *lcd, const char *s)
int count = strlen(s);
 
for (; count-- > 0; tmp++) {
-   if (!in_interrupt() && (((count + 1) & 0x1f) == 0))
-   /*
-* let's be a little nice with other processes
-* that need some CPU
-*/
-   schedule();
+   if (((count + 1) & 0x1f) == 0)
+   cond_resched();
 
charlcd_write_char(lcd, *tmp);
}
-- 
2.30.0



[ANNOUNCE] v5.10.12-rt26

2021-02-03 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.10.12-rt26 patch set. 

Changes since v5.10.12-rt25:

  - Updated the "tracing: Merge irqflags + preempt counter." patch to
the version Steven posted for upstream inclusion.

  - Update the work-in-progress softirq patch. One difference is that
    tasklet_disable() now sleeps if the tasklet is running instead of
    busy-spinning until it is done. Drivers which invoke the function
    in atomic context on !RT have been converted.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.10.12-rt25 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/incr/patch-5.10.12-rt25-rt26.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.10.12-rt26

The RT patch against v5.10.12 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patch-5.10.12-rt26.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.10/older/patches-5.10.12-rt26.tar.xz

Sebastian

diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 316a9947541fe..e96a4e8a4a10c 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -2054,7 +2054,7 @@ static int eni_send(struct atm_vcc *vcc,struct sk_buff *skb)
	}
	submitted++;
	ATM_SKB(skb)->vcc = vcc;
-	tasklet_disable(&ENI_DEV(vcc->dev)->task);
+	tasklet_disable_in_atomic(&ENI_DEV(vcc->dev)->task);
	res = do_tx(skb);
	tasklet_enable(&ENI_DEV(vcc->dev)->task);
if (res == enq_ok) return 0;
diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 9811c40956e54..17c9d825188bb 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -2545,7 +2545,7 @@ static int ohci_cancel_packet(struct fw_card *card, 
struct fw_packet *packet)
struct driver_data *driver_data = packet->driver_data;
int ret = -ENOENT;
 
-   tasklet_disable(&ctx->tasklet);
+   tasklet_disable_in_atomic(&ctx->tasklet);
 
if (packet->ack != 0)
goto out;
@@ -3465,7 +3465,7 @@ static int ohci_flush_iso_completions(struct 
fw_iso_context *base)
struct iso_context *ctx = container_of(base, struct iso_context, base);
int ret = 0;
 
-   tasklet_disable(&ctx->context.tasklet);
+   tasklet_disable_in_atomic(&ctx->context.tasklet);
 
if (!test_and_set_bit_lock(0, &ctx->flushing_completions)) {
context_tasklet((unsigned long)&ctx->context);
diff --git a/drivers/net/arcnet/arc-rimi.c b/drivers/net/arcnet/arc-rimi.c
index 98df38fe553ce..12d085405bd05 100644
--- a/drivers/net/arcnet/arc-rimi.c
+++ b/drivers/net/arcnet/arc-rimi.c
@@ -332,7 +332,7 @@ static int __init arc_rimi_init(void)
dev->irq = 9;
 
if (arcrimi_probe(dev)) {
-   free_netdev(dev);
+   free_arcdev(dev);
return -EIO;
}
 
@@ -349,7 +349,7 @@ static void __exit arc_rimi_exit(void)
iounmap(lp->mem_start);
release_mem_region(dev->mem_start, dev->mem_end - dev->mem_start + 1);
free_irq(dev->irq, dev);
-   free_netdev(dev);
+   free_arcdev(dev);
 }
 
 #ifndef MODULE
diff --git a/drivers/net/arcnet/arcdevice.h b/drivers/net/arcnet/arcdevice.h
index 22a49c6d7ae6e..5d4a4c7efbbff 100644
--- a/drivers/net/arcnet/arcdevice.h
+++ b/drivers/net/arcnet/arcdevice.h
@@ -298,6 +298,10 @@ struct arcnet_local {
 
int excnak_pending;/* We just got an excesive nak interrupt */
 
+   /* RESET flag handling */
+   int reset_in_progress;
+   struct work_struct reset_work;
+
struct {
uint16_t sequence;  /* sequence number (incs with each 
packet) */
__be16 aborted_seq;
@@ -350,7 +354,9 @@ void arcnet_dump_skb(struct net_device *dev, struct sk_buff 
*skb, char *desc)
 
 void arcnet_unregister_proto(struct ArcProto *proto);
 irqreturn_t arcnet_interrupt(int irq, void *dev_id);
+
 struct net_device *alloc_arcdev(const char *name);
+void free_arcdev(struct net_device *dev);
 
 int arcnet_open(struct net_device *dev);
 int arcnet_close(struct net_device *dev);
diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index e04efc0a5c977..d76dd7d14299e 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -387,10 +387,44 @@ static void arcnet_timer(struct timer_list *t)
struct arcnet_local *lp = from_timer(lp, t, timer);
struct net_device *dev = lp->dev;
 
-   if (!netif_carrier_ok(dev)) {
+   spin_lock_irq(&lp->lock);
+
+   if (!lp->reset_in_progress && !netif_carrier_ok(dev)) {
netif_carrier_on(dev);
netdev_info(dev, "link up\n");
}
+
+   spin_unlock_irq(&lp->lock);
+}
+
+static void 

Re: [PATCH 1/3] smp: Process pending softirqs in flush_smp_call_function_from_idle()

2021-02-01 Thread Sebastian Andrzej Siewior
On 2021-01-23 21:10:25 [+0100], To linux-bl...@vger.kernel.org wrote:
> send_call_function_single_ipi() may wake an idle CPU without sending an
> IPI. The woken up CPU will process the SMP-functions in
> flush_smp_call_function_from_idle(). Any raised softirq from within the
> SMP-function call will not be processed.
> Should the CPU have no tasks assigned, then it will go back to idle with
> pending softirqs and the NOHZ will rightfully complain.
> 
> Process pending softirqs on return from flush_smp_call_function_queue().
> 
> Fixes: b2a02fc43a1f4 ("smp: Optimize send_call_function_single_ipi()")
> Reported-by: Jens Axboe 
> Signed-off-by: Sebastian Andrzej Siewior 

A gentle ping.
This isn't just a requirement for the series: rps_trigger_softirq() is
invoked from smp_call_function_single_async() and raises a softirq.

Sebastian


Re: [PATCH 0/4 v2] tracing: Merge irqflags + preempt counter.

2021-02-01 Thread Sebastian Andrzej Siewior
On 2021-02-01 13:32:10 [-0500], Steven Rostedt wrote:
> Hi!
Hi,

> I'll let you know if your patches have any issues, but expect to see a
> "for-next" post this week (if all goes well!).

Thanks for the update.

> -- Steve

Sebastian


Re: [PATCH 0/4 v2] tracing: Merge irqflags + preempt counter.

2021-02-01 Thread Sebastian Andrzej Siewior
On 2021-01-25 20:45:07 [+0100], To linux-kernel@vger.kernel.org wrote:
> The merge irqflags + preempt counter, v2.
> 
> v1…v2:
>  - Helper functions renamed.
>  - Added patch #2 which inlines the helper functions.
> 

a gentle ping.

Sebastian


[ANNOUNCE] v5.11-rc5-rt3

2021-01-29 Thread Sebastian Andrzej Siewior
Dear RT folks!

I'm pleased to announce the v5.11-rc5-rt3 patch set. 

Changes since v5.11-rc5-rt2:

  - Updated the work-in-progress softirq patch. One difference is that
    tasklet_disable() now sleeps if the tasklet is running instead of
    busy-spinning until it is done. Drivers which invoke the function
    in atomic context on !RT have been converted.

Known issues
 - kdb/kgdb can easily deadlock.
 - kmsg dumpers expecting not to be called in parallel can clobber
   their temp buffer.
 - netconsole triggers WARN.

The delta patch against v5.11-rc5-rt2 is appended below and can be found here:
 
 
https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/incr/patch-5.11-rc5-rt2-rt3.patch.xz

You can get this release via the git tree at:

git://git.kernel.org/pub/scm/linux/kernel/git/rt/linux-rt-devel.git 
v5.11-rc5-rt3

The RT patch against v5.11-rc5 can be found here:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patch-5.11-rc5-rt3.patch.xz

The split quilt queue is available at:


https://cdn.kernel.org/pub/linux/kernel/projects/rt/5.11/older/patches-5.11-rc5-rt3.tar.xz

Sebastian

diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 316a9947541fe..e96a4e8a4a10c 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -2054,7 +2054,7 @@ static int eni_send(struct atm_vcc *vcc,struct sk_buff 
*skb)
}
submitted++;
ATM_SKB(skb)->vcc = vcc;
-   tasklet_disable(&ENI_DEV(vcc->dev)->task);
+   tasklet_disable_in_atomic(&ENI_DEV(vcc->dev)->task);
res = do_tx(skb);
tasklet_enable(&ENI_DEV(vcc->dev)->task);
if (res == enq_ok) return 0;
diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 9811c40956e54..17c9d825188bb 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -2545,7 +2545,7 @@ static int ohci_cancel_packet(struct fw_card *card, 
struct fw_packet *packet)
struct driver_data *driver_data = packet->driver_data;
int ret = -ENOENT;
 
-   tasklet_disable(&ctx->tasklet);
+   tasklet_disable_in_atomic(&ctx->tasklet);
 
if (packet->ack != 0)
goto out;
@@ -3465,7 +3465,7 @@ static int ohci_flush_iso_completions(struct 
fw_iso_context *base)
struct iso_context *ctx = container_of(base, struct iso_context, base);
int ret = 0;
 
-   tasklet_disable(&ctx->context.tasklet);
+   tasklet_disable_in_atomic(&ctx->context.tasklet);
 
if (!test_and_set_bit_lock(0, &ctx->flushing_completions)) {
context_tasklet((unsigned long)&ctx->context);
diff --git a/drivers/net/arcnet/arc-rimi.c b/drivers/net/arcnet/arc-rimi.c
index 98df38fe553ce..12d085405bd05 100644
--- a/drivers/net/arcnet/arc-rimi.c
+++ b/drivers/net/arcnet/arc-rimi.c
@@ -332,7 +332,7 @@ static int __init arc_rimi_init(void)
dev->irq = 9;
 
if (arcrimi_probe(dev)) {
-   free_netdev(dev);
+   free_arcdev(dev);
return -EIO;
}
 
@@ -349,7 +349,7 @@ static void __exit arc_rimi_exit(void)
iounmap(lp->mem_start);
release_mem_region(dev->mem_start, dev->mem_end - dev->mem_start + 1);
free_irq(dev->irq, dev);
-   free_netdev(dev);
+   free_arcdev(dev);
 }
 
 #ifndef MODULE
diff --git a/drivers/net/arcnet/arcdevice.h b/drivers/net/arcnet/arcdevice.h
index 22a49c6d7ae6e..5d4a4c7efbbff 100644
--- a/drivers/net/arcnet/arcdevice.h
+++ b/drivers/net/arcnet/arcdevice.h
@@ -298,6 +298,10 @@ struct arcnet_local {
 
int excnak_pending;/* We just got an excesive nak interrupt */
 
+   /* RESET flag handling */
+   int reset_in_progress;
+   struct work_struct reset_work;
+
struct {
uint16_t sequence;  /* sequence number (incs with each 
packet) */
__be16 aborted_seq;
@@ -350,7 +354,9 @@ void arcnet_dump_skb(struct net_device *dev, struct sk_buff 
*skb, char *desc)
 
 void arcnet_unregister_proto(struct ArcProto *proto);
 irqreturn_t arcnet_interrupt(int irq, void *dev_id);
+
 struct net_device *alloc_arcdev(const char *name);
+void free_arcdev(struct net_device *dev);
 
 int arcnet_open(struct net_device *dev);
 int arcnet_close(struct net_device *dev);
diff --git a/drivers/net/arcnet/arcnet.c b/drivers/net/arcnet/arcnet.c
index e04efc0a5c977..d76dd7d14299e 100644
--- a/drivers/net/arcnet/arcnet.c
+++ b/drivers/net/arcnet/arcnet.c
@@ -387,10 +387,44 @@ static void arcnet_timer(struct timer_list *t)
struct arcnet_local *lp = from_timer(lp, t, timer);
struct net_device *dev = lp->dev;
 
-   if (!netif_carrier_ok(dev)) {
+   spin_lock_irq(&lp->lock);
+
+   if (!lp->reset_in_progress && !netif_carrier_ok(dev)) {
netif_carrier_on(dev);
netdev_info(dev, "link up\n");
}
+
+   spin_unlock_irq(&lp->lock);
+}
+
+static void reset_device_work(struct work_struct *work)
+{
+   struct arcnet_local *lp;
+   struct net_device *dev;
+
+   lp = 

Re: [PATCH 1/1] kernel/smp: Split call_single_queue into 3 queues

2021-01-28 Thread Sebastian Andrzej Siewior
On 2021-01-28 03:55:06 [-0300], Leonardo Bras wrote:
> Currently, during flush_smp_call_function_queue():
> - All items are traversed once, for inverting.
> - The SYNC items are traversed twice.
> - The ASYNC & IRQ_WORK items are traversed three times.
> - The TTWU items are traversed four times.
> 
> Also, a lot of extra work is done to keep track of and remove the items
> already processed in each step.
> 
> By using three queues, it's possible to avoid all this work, and
> all items in the list are traversed only twice: once for inverting,
> and once for processing.
> 
> In exchange, this requires 2 extra llist_del_all() in the beginning
> of flush_smp_call_function_queue(), and some extra logic to decide
> the correct queue to add the desired csd.
> 
> This is not supposed to cause any change in the order the items are
> processed, but will change the order of printing (cpu offlining)
> to the order the items will be processed.
> 
> (The above traversal count ignores the cpu-offlining case, in
> which all items would be traversed again, in both cases.)
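The bookkeeping saving can be sketched in plain C. This is a hedged userspace model: `struct csd`, the three-entry `queues[]` array and the type names are simplified stand-ins for the kernel's call_single_queue and CSD_TYPE_* machinery, and `enqueue()` is a plain push where the kernel uses a lock-free llist_add():

```c
#include <assert.h>
#include <stddef.h>

enum csd_type { CSD_SYNC, CSD_ASYNC, CSD_TTWU, CSD_NR_QUEUES };

struct csd {
	struct csd *next;
	enum csd_type type;
	int seq;	/* enqueue order, to check FIFO processing */
	int visits;	/* how often the flush path touched the item */
};

/* One llist-style head per type instead of a single shared head. */
static struct csd *queues[CSD_NR_QUEUES];
static int processed[16], nproc;

static void enqueue(struct csd *c)
{
	c->next = queues[c->type];
	queues[c->type] = c;
}

/* First traversal: the llist_reverse_order() step restores FIFO order. */
static struct csd *reverse(struct csd *head)
{
	struct csd *prev = NULL;

	while (head) {
		struct csd *next = head->next;

		head->visits++;
		head->next = prev;
		prev = head;
		head = next;
	}
	return prev;
}

/* Second traversal: process each queue.  With per-type queues no item
 * needs a third or fourth pass just to be sorted out by type. */
static void flush(void)
{
	for (int q = 0; q < CSD_NR_QUEUES; q++) {
		struct csd *c = reverse(queues[q]);

		queues[q] = NULL;
		while (c) {
			struct csd *next = c->next;

			c->visits++;
			processed[nproc++] = c->seq;
			c = next;
		}
	}
}
```

Every item is visited exactly twice (once while reversing, once while processing), and items of the same type keep their relative enqueue order.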

Numbers would be good. Having three queues increases the memory footprint
from one pointer to three, but we still remain within one cache line.
One difference your patch makes is this hunk:

> + if (smp_add_to_queue(cpu, node))
>   send_call_function_single_ipi(cpu);

Previously only the first addition resulted in sending an IPI. With this
change you could send two IPIs, one for each addition to two independent queues.

A quick smoke test ended up with:
  <idle>-0   [005] d..h1..   146.255996: flush_smp_call_function_queue: A1 S2 I0 T0 X3

with the patch at the bottom of the mail. This shows that in my
smoke test at least, the number of items in the individual lists is low.

diff --git a/kernel/smp.c b/kernel/smp.c
index 1b6070bf97bb0..3acce385b9f97 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -336,6 +336,11 @@ static void flush_smp_call_function_queue(bool 
warn_cpu_offline)
struct llist_node *entry, *prev;
struct llist_head *head;
static bool warned;
+   int num_async = 0;
+   int num_sync = 0;
+   int num_irqw = 0;
+   int num_twu = 0;
+   int total = 0;
 
lockdep_assert_irqs_disabled();
 
@@ -343,6 +348,33 @@ static void flush_smp_call_function_queue(bool 
warn_cpu_offline)
entry = llist_del_all(head);
entry = llist_reverse_order(entry);
 
+   llist_for_each_entry(csd, entry, node.llist) {
+   switch (CSD_TYPE(csd)) {
+   case CSD_TYPE_ASYNC:
+   num_async++;
+   break;
+   case CSD_TYPE_SYNC:
+   num_sync++;
+   break;
+
+   case CSD_TYPE_IRQ_WORK:
+   num_irqw++;
+   break;
+
+   case CSD_TYPE_TTWU:
+   num_twu++;
+   break;
+
+   default:
+   pr_warn("h\n");
+   break;
+   }
+   }
+   total = num_async + num_sync + num_irqw + num_twu;
+   if (total > 2)
+   trace_printk("A%d S%d I%d T%d X%d\n", num_async, num_sync, 
num_irqw, num_twu,
+total);
+
/* There shouldn't be any pending callbacks on an offline CPU. */
if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) &&
 !warned && !llist_empty(head))) {

Sebastian

