Re: [PATCH v10 1/9] drm/panic: Add drm panic locking
On 2024-03-15, Jocelyn Falempe wrote: > +static inline int drm_panic_trylock(struct drm_device *dev, unsigned long > *flags) > +{ > + return raw_spin_trylock_irqsave(>mode_config.panic_lock, *flags); > +} [...] > +static inline unsigned long drm_panic_lock(struct drm_device *dev) > +{ > + unsigned long flags; > + > + raw_spin_lock_irqsave(>mode_config.panic_lock, flags); > + return flags; > +} I find it a bit strange that @flags is passed as an argument in the trylock variant but returned in the lock variant. I would pass it as an argument in both variants. Just a suggestion. John Ogness
Re: [PATCH v9 1/9] drm/panic: Add drm panic locking
On 2024-03-07, Jocelyn Falempe wrote: > diff --git a/drivers/gpu/drm/drm_atomic_helper.c > b/drivers/gpu/drm/drm_atomic_helper.c > index 39ef0a6addeb..c0bb91312fb2 100644 > --- a/drivers/gpu/drm/drm_atomic_helper.c > +++ b/drivers/gpu/drm/drm_atomic_helper.c > @@ -38,6 +38,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -3099,6 +3100,7 @@ int drm_atomic_helper_swap_state(struct > drm_atomic_state *state, > } > } > > + drm_panic_lock(state->dev); > for_each_oldnew_plane_in_state(state, plane, old_plane_state, > new_plane_state, i) { > WARN_ON(plane->state != old_plane_state); > > @@ -3108,6 +3110,7 @@ int drm_atomic_helper_swap_state(struct > drm_atomic_state *state, > state->planes[i].state = old_plane_state; > plane->state = new_plane_state; > } > + drm_panic_unlock(state->dev); Is there a reason irqsave/irqrestore variants are not used? Maybe this code path is too hot? By leaving interrupts enabled, there is the risk that a panic from within any interrupt handler may block the drm panic handler. John Ogness
Re: [RFC] drm/panic: Add drm panic locking
Hi Daniel, Great to see this moving forward! On 2024-03-01, Daniel Vetter wrote: > But for the initial cut of a drm panic printing support I don't think > we need that, because the critical sections are extremely small and > only happen once per display refresh. So generally just 60 tiny locked > sections per second, which is nothing compared to a serial console > running a 115kbaud doing really slow mmio writes for each byte. So for > now the raw spintrylock in drm panic notifier callback should be good > enough. Is there a reason you do not use the irqsave/irqrestore variants? By leaving interrupts enabled, there is the risk that a panic from any interrupt handler may block the drm panic handler. John Ogness
Re: [RFC] How to test panic handlers, without crashing the kernel
[Added printk maintainer and kdb folks] Hi Jocelyn, On 2024-03-01, Jocelyn Falempe wrote: > While writing a panic handler for drm devices [1], I needed a way to > test it without crashing the machine. > So from debugfs, I called > atomic_notifier_call_chain(_notifier_list, ...), but it has the > side effect of calling all other panic notifiers registered. > > So Sima suggested to move that to the generic panic code, and test all > panic notifiers with a dedicated debugfs interface. > > I can move that code to kernel/, but before doing that, I would like to > know if you think that's the right way to test the panic code. One major event that happens before the panic notifiers is panic_other_cpus_shutdown(). This can cause special situations because CPUs can be stopped while holding resources (such as raw spin locks). And these are the situations that make it so tricky to have safe and reliable notifiers. If triggered from debugfs, these situations will never occur. My concern is that the tests via debugfs will always succeed, but in the real world panic notifiers are failing/hanging/exploding. IMHO useful panic testing requires real panic'ing. For my printk panic tests I trigger unknown NMIs while booting with "unknown_nmi_panic". Particularly with Qemu this is quite easy and amazingly effective at catching problems. In fact, a recent printk series [0] fixed seven issues that were found through this method of panic testing. > The second question is how to simulate a panic context in a > non-destructive way, so we can test the panic notifiers in CI, without > crashing the machine. I'm wondering if a "fake panic" can be implemented that quiesces all the other CPUs via NMI (similar to kdb) and then calls the panic notifiers. And finally releases everything back to normal. That might produce a fairly realistic panic situation and should be fairly non-destructive (depending on what the notifiers do and how long they take). > The worst case for a panic notifier, is when the panic occurs in NMI > context, but I don't know how to simulate that. The goal would be to > find early if a panic notifier tries to sleep, or do other things that > are not allowed in a panic context. Maybe with a new boot argument "unknown_nmi_fake_panic" that triggers the fake panic instead? John Ogness [0] https://lore.kernel.org/lkml/20240207134103.1357162-1-john.ogn...@linutronix.de
[PATCH v3] drm/nouveau: fix incorrect conversion to dma_resv_wait_timeout()
Commit 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait") converted from ttm_bo_wait_ctx() to dma_resv_wait_timeout(). However, dma_resv_wait_timeout() returns greater than zero on success as opposed to ttm_bo_wait_ctx(). As a result, relocs will fail and log errors even when it was a success. Change the return code handling to match that of nouveau_gem_ioctl_cpu_prep(), which was already using dma_resv_wait_timeout() correctly. Fixes: 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait") Reported-by: Tanmay Bhushan <0070472...@gmail.com> Link: https://lore.kernel.org/lkml/20230119225351.71657-1-0070472...@gmail.com Signed-off-by: John Ogness --- I just realized that the nouveau driver style prefers to scope variables used only in loops. v3: Define @lret within the for-loop. drivers/gpu/drm/nouveau/nouveau_gem.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index f77e44958037..ab9062e50977 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -645,7 +645,7 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, struct drm_nouveau_gem_pushbuf_reloc *reloc, struct drm_nouveau_gem_pushbuf_bo *bo) { - long ret = 0; + int ret = 0; unsigned i; for (i = 0; i < req->nr_relocs; i++) { @@ -653,6 +653,7 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, struct drm_nouveau_gem_pushbuf_bo *b; struct nouveau_bo *nvbo; uint32_t data; + long lret; if (unlikely(r->bo_index >= req->nr_buffers)) { NV_PRINTK(err, cli, "reloc bo index invalid\n"); @@ -703,13 +704,18 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, data |= r->vor; } - ret = dma_resv_wait_timeout(nvbo->bo.base.resv, - DMA_RESV_USAGE_BOOKKEEP, - false, 15 * HZ); - if (ret == 0) + lret = dma_resv_wait_timeout(nvbo->bo.base.resv, +DMA_RESV_USAGE_BOOKKEEP, +false, 15 * HZ); + if (!lret) ret = -EBUSY; + else if (lret > 0) + ret = 0; + else + ret = lret; + if (ret) { - NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n", + NV_PRINTK(err, cli, "reloc wait_idle failed: %d\n", ret); break; } base-commit: 09a9639e56c01c7a00d6c0ca63f4c7c41abe075d -- 2.30.2
[PATCH v2] drm/nouveau: fix incorrect conversion to dma_resv_wait_timeout()
Commit 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait") converted from ttm_bo_wait_ctx() to dma_resv_wait_timeout(). However, dma_resv_wait_timeout() returns greater than zero on success as opposed to ttm_bo_wait_ctx(). As a result, relocs will fail and log errors even when it was a success. Change the return code handling to match that of nouveau_gem_ioctl_cpu_prep(), which was already using dma_resv_wait_timeout() correctly. Fixes: 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait") Reported-by: Tanmay Bhushan <0070472...@gmail.com> Link: https://lore.kernel.org/lkml/20230119225351.71657-1-0070472...@gmail.com Signed-off-by: John Ogness --- The original report was actually a patch that needed fixing. Since nobody has stepped up to fix this regression correctly, I'm posting the v2. This is a real regression introduced in 6.3-rc1. drivers/gpu/drm/nouveau/nouveau_gem.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c b/drivers/gpu/drm/nouveau/nouveau_gem.c index f77e44958037..346839c24273 100644 --- a/drivers/gpu/drm/nouveau/nouveau_gem.c +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c @@ -645,8 +645,9 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, struct drm_nouveau_gem_pushbuf_reloc *reloc, struct drm_nouveau_gem_pushbuf_bo *bo) { - long ret = 0; + int ret = 0; unsigned i; + long lret; for (i = 0; i < req->nr_relocs; i++) { struct drm_nouveau_gem_pushbuf_reloc *r = [i]; @@ -703,13 +704,18 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, data |= r->vor; } - ret = dma_resv_wait_timeout(nvbo->bo.base.resv, - DMA_RESV_USAGE_BOOKKEEP, - false, 15 * HZ); - if (ret == 0) + lret = dma_resv_wait_timeout(nvbo->bo.base.resv, +DMA_RESV_USAGE_BOOKKEEP, +false, 15 * HZ); + if (!lret) ret = -EBUSY; + else if (lret > 0) + ret = 0; + else + ret = lret; + if (ret) { - NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n", + NV_PRINTK(err, cli, "reloc wait_idle failed: %d\n", ret); break; } base-commit: 09a9639e56c01c7a00d6c0ca63f4c7c41abe075d -- 2.30.2
Re: [PATCH] drm/nouveau: Fix bug in buffer relocs for Nouveau
On 2023-01-19, Tanmay Bhushan <0070472...@gmail.com> wrote: > dma_resv_wait_timeout returns greater than zero on success > as opposed to ttm_bo_wait_ctx. As a result of that relocs > will fail and give failure even when it was a success. Today I switched my workstation from 6.2 to 6.3-rc3 and started seeing lots of new kernel messages: [ 642.138313][ T1751] nouveau :f0:10.0: X[1751]: reloc wait_idle failed: 1500 [ 642.138389][ T1751] nouveau :f0:10.0: X[1751]: reloc apply: 1500 [ 646.123490][ T1751] nouveau :f0:10.0: X[1751]: reloc wait_idle failed: 1500 [ 646.123573][ T1751] nouveau :f0:10.0: X[1751]: reloc apply: 1500 The graphics seemed to go slower or hang a bit when these messages would appear. I then found your patch! However, I have some comments about it. First, it should include a fixes tag: Fixes: 41d351f29528 ("drm/nouveau: stop using ttm_bo_wait") > Signed-off-by: Tanmay Bhushan <0070472...@gmail.com> > --- > drivers/gpu/drm/nouveau/nouveau_gem.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c > b/drivers/gpu/drm/nouveau/nouveau_gem.c > index f77e44958037..0e3690459144 100644 > --- a/drivers/gpu/drm/nouveau/nouveau_gem.c > +++ b/drivers/gpu/drm/nouveau/nouveau_gem.c > @@ -706,9 +706,8 @@ nouveau_gem_pushbuf_reloc_apply(struct nouveau_cli *cli, > ret = dma_resv_wait_timeout(nvbo->bo.base.resv, > DMA_RESV_USAGE_BOOKKEEP, > false, 15 * HZ); > - if (ret == 0) > + if (ret <= 0) { > ret = -EBUSY; This is incorrect for 2 reasons: * it treats restarts as timeouts * this function now returns >0 on success > - if (ret) { > NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n", > ret); > break; I rearranged things to basically correctly translate the return code of dma_resv_wait_timeout() to match the previous ttm_bo_wait(): ret = dma_resv_wait_timeout(nvbo->bo.base.resv, DMA_RESV_USAGE_BOOKKEEP, false, 15 * HZ); if (ret == 0) ret = -EBUSY; if (ret > 0) ret = 0; if (ret) { NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n", ret); break; } So the patch just becomes: @@ -708,6 +708,8 @@ nouveau_gem_pushbuf_reloc_apply(struct n false, 15 * HZ); if (ret == 0) ret = -EBUSY; + if (ret > 0) + ret = 0; if (ret) { NV_PRINTK(err, cli, "reloc wait_idle failed: %ld\n", ret); With this variant, everything runs correctly on my workstation again. It probably deserves a comment about why @ret is being translated. Or perhaps a new variable should be introduced to separate the return value of dma_resv_wait_timeout() from the return value of this function. Either way, this is an important fix for 6.3-rc! John Ogness
[PATCH printk v5 32/40] printk, xen: fbfront: create/use safe function for forcing preferred
With commit 9e124fe16ff2("xen: Enable console tty by default in domU if it's not a dummy") a hack was implemented to make sure that the tty console remains the console behind the /dev/console device. The main problem with the hack is that, after getting the console pointer to the tty console, it is assumed the pointer is still valid after releasing the console_sem. This assumption is incorrect and unsafe. Make the hack safe by introducing a new function console_force_preferred_locked() and perform the full operation under the console_list_lock. Signed-off-by: John Ogness Reviewed-by: Petr Mladek --- drivers/video/fbdev/xen-fbfront.c | 12 +++- include/linux/console.h | 1 + kernel/printk/printk.c| 49 +-- 3 files changed, 51 insertions(+), 11 deletions(-) diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 4d2694d904aa..8752d389e382 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -504,18 +504,14 @@ static void xenfb_make_preferred_console(void) if (console_set_on_cmdline) return; - console_lock(); + console_list_lock(); for_each_console(c) { if (!strcmp(c->name, "tty") && c->index == 0) break; } - console_unlock(); - if (c) { - unregister_console(c); - c->flags |= CON_CONSDEV; - c->flags &= ~CON_PRINTBUFFER; /* don't print again */ - register_console(c); - } + if (c) + console_force_preferred_locked(c); + console_list_unlock(); } static int xenfb_resume(struct xenbus_device *dev) diff --git a/include/linux/console.h b/include/linux/console.h index f716e1dd9eaf..9cea254b34b8 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -291,6 +291,7 @@ enum con_flush_mode { }; extern int add_preferred_console(char *name, int idx, char *options); +extern void console_force_preferred_locked(struct console *con); extern void register_console(struct console *); extern int unregister_console(struct console *); extern void console_lock(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 410d3f2cdeb3..ece34abbc9cc 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -247,9 +247,10 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void console_list_lock(void) { /* -* In unregister_console(), synchronize_srcu() is called with the -* console_list_lock held. Therefore it is not allowed that the -* console_list_lock is taken with the srcu_lock held. +* In unregister_console() and console_force_preferred_locked(), +* synchronize_srcu() is called with the console_list_lock held. +* Therefore it is not allowed that the console_list_lock is taken +* with the srcu_lock held. * * Detecting if this context is really in the read-side critical * section is only possible if the appropriate debug options are @@ -3489,6 +3490,48 @@ int unregister_console(struct console *console) } EXPORT_SYMBOL(unregister_console); +/** + * console_force_preferred_locked - force a registered console preferred + * @con: The registered console to force preferred. + * + * Must be called under console_list_lock(). + */ +void console_force_preferred_locked(struct console *con) +{ + struct console *cur_pref_con; + + if (!console_is_registered_locked(con)) + return; + + cur_pref_con = console_first(); + + /* Already preferred? */ + if (cur_pref_con == con) + return; + + /* +* Delete, but do not re-initialize the entry. This allows the console +* to continue to appear registered (via any hlist_unhashed_lockless() +* checks), even though it was briefly removed from the console list. +*/ + hlist_del_rcu(>node); + + /* +* Ensure that all SRCU list walks have completed so that the console +* can be added to the beginning of the console list and its forward +* list pointer can be re-initialized. +*/ + synchronize_srcu(_srcu); + + con->flags |= CON_CONSDEV; + WARN_ON(!con->device); + + /* Only the new head can have CON_CONSDEV set. */ + console_srcu_write_flags(cur_pref_con, cur_pref_con->flags & ~CON_CONSDEV); + hlist_add_head_rcu(>node, _list); +} +EXPORT_SYMBOL(console_force_preferred_locked); + /* * Initialize the console device. This is called *early*, so * we can't necessarily depend on lots of kernel help here. -- 2.30.2
[PATCH printk v5 00/40] reduce console_lock scope
This is v5 of a series to prepare for threaded/atomic printing. v4 is here [0]. This series focuses on reducing the scope of the BKL console_lock. It achieves this by switching to SRCU and a dedicated mutex for console list iteration and modification, respectively. The console_lock will no longer offer this protection. Also, during the review of v2 it came to our attention that many console drivers are checking CON_ENABLED to see if they are registered. Because this flag can change without unregistering and because this flag does not represent an atomic point when an (un)registration process is complete, a new console_is_registered() function is introduced. This function uses the console_list_lock to synchronize with the (un)registration process to provide a reliable status. All users of the console_lock for list iteration have been modified. For the call sites where the console_lock is still needed (for other reasons), comments are added to explain exactly why the console_lock is needed. All users of CON_ENABLED for registration status have been modified to use console_is_registered(). Note that there are still users of CON_ENABLED, but this is for legitimate purposes about a registered console being able to print. The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. Changes since v4: printk: - Introduce console_init_seq() to handle the now rather complex procedure to find an appropriate start sequence number for a new console upon registration. - When registering a non-boot console and boot consoles are registered, try to flush all the consoles to get the next @seq value before falling back to use the @seq of the enabled boot console that is furthest behind. - For console_force_preferred_locked(), make the console the head of the console list. John Ogness [0] https://lore.kernel.org/lkml/20221114162932.141883-1-john.ogn...@linutronix.de [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.11.09a John Ogness (38): printk: Prepare for SRCU console list protection printk: register_console: use "registered" for variable names printk: move @seq initialization to helper printk: fix setting first seq for consoles um: kmsg_dump: only dump when no output console available tty: serial: kgdboc: document console_lock usage tty: tty_io: document console_lock usage proc: consoles: document console_lock usage printk: introduce console_list_lock console: introduce wrappers to read/write console flags um: kmsg_dumper: use srcu console list iterator kdb: use srcu console list iterator printk: console_flush_all: use srcu console list iterator printk: __pr_flush: use srcu console list iterator printk: console_is_usable: use console_srcu_read_flags printk: console_unblank: use srcu console list iterator printk: console_flush_on_panic: use srcu console list iterator printk: console_device: use srcu console list iterator console: introduce console_is_registered() serial_core: replace uart_console_enabled() with uart_console_registered() tty: nfcon: use console_is_registered() efi: earlycon: use console_is_registered() tty: hvc: use console_is_registered() tty: serial: earlycon: use console_is_registered() tty: serial: pic32_uart: use console_is_registered() tty: serial: samsung_tty: use console_is_registered() tty: serial: xilinx_uartps: use console_is_registered() usb: early: xhci-dbc: use console_is_registered() netconsole: avoid CON_ENABLED misuse to track registration printk, xen: fbfront: create/use safe function for forcing preferred tty: tty_io: use console_list_lock for list synchronization proc: consoles: use console_list_lock for list iteration tty: serial: kgdboc: use srcu console list iterator tty: serial: kgdboc: use console_list_lock for list traversal tty: serial: kgdboc: synchronize tty_find_polling_driver() and register_console() tty: serial: kgdboc: use console_list_lock to trap exit printk: relieve console_lock of list synchronization duties tty: serial: sh-sci: use setup() callback for early console Thomas Gleixner (2): serial: kgdboc: Lock console list in probe function printk: Convert console_drivers list to hlist .clang-format | 1 + arch/m68k/emu/nfcon.c | 9 +- arch/um/kernel/kmsg_dump.c | 24 +- drivers/firmware/efi/earlycon.c | 8 +- drivers/net/netconsole.c| 21 +- drivers/tty/hvc/hvc_console.c | 4 +- drivers/tty/serial/8250/8250_core.c | 2 +- drivers/tty/serial/earlycon.c | 4 +- drivers/tty/seria
Re: [PATCH printk v4 31/39] printk, xen: fbfront: create/use safe function for forcing preferred
Hi, After more detailed runtime testing I discovered that I didn't re-insert the console to the correct place in the list. More below... On 2022-11-14, John Ogness wrote: > diff --git a/include/linux/console.h b/include/linux/console.h > index f716e1dd9eaf..9cea254b34b8 100644 > --- a/include/linux/console.h > +++ b/include/linux/console.h > @@ -291,6 +291,7 @@ enum con_flush_mode { > }; > > extern int add_preferred_console(char *name, int idx, char *options); > +extern void console_force_preferred_locked(struct console *con); > extern void register_console(struct console *); > extern int unregister_console(struct console *); > extern void console_lock(void); > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c > index e770b1ede6c9..dff76c1cef80 100644 > --- a/kernel/printk/printk.c > +++ b/kernel/printk/printk.c > @@ -3461,6 +3462,48 @@ int unregister_console(struct console *console) > } > EXPORT_SYMBOL(unregister_console); > > +/** > + * console_force_preferred_locked - force a registered console preferred > + * @con: The registered console to force preferred. > + * > + * Must be called under console_list_lock(). > + */ > +void console_force_preferred_locked(struct console *con) > +{ > + struct console *cur_pref_con; > + > + if (!console_is_registered_locked(con)) > + return; > + > + cur_pref_con = console_first(); > + > + /* Already preferred? */ > + if (cur_pref_con == con) > + return; > + > + /* > + * Delete, but do not re-initialize the entry. This allows the console > + * to continue to appear registered (via any hlist_unhashed_lockless() > + * checks), even though it was briefly removed from the console list. > + */ > + hlist_del_rcu(>node); > + > + /* > + * Ensure that all SRCU list walks have completed so that the console > + * can be added to the beginning of the console list and its forward > + * list pointer can be re-initialized. > + */ > + synchronize_srcu(_srcu); > + > + con->flags |= CON_CONSDEV; > + WARN_ON(!con->device); > + > + /* Only the new head can have CON_CONSDEV set. */ > + console_srcu_write_flags(cur_pref_con, cur_pref_con->flags & > ~CON_CONSDEV); > + hlist_add_behind_rcu(>node, console_list.first); This is adding the console as the 2nd item. It should be the new head. The patch below fixes it. I have done careful runtime testing with this fixup. After the force_preferred, the console is the new head and sending data to /dev/console redirects to that console. It would be nice if we could fold this in. Sorry. John Ogness diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 8d635467882f..4b77586cf4cb 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -3494,7 +3494,7 @@ void console_force_preferred_locked(struct console *con) /* Only the new head can have CON_CONSDEV set. */ console_srcu_write_flags(cur_pref_con, cur_pref_con->flags & ~CON_CONSDEV); - hlist_add_behind_rcu(>node, console_list.first); + hlist_add_head_rcu(>node, _list); } EXPORT_SYMBOL(console_force_preferred_locked);
[PATCH printk v4 31/39] printk, xen: fbfront: create/use safe function for forcing preferred
With commit 9e124fe16ff2("xen: Enable console tty by default in domU if it's not a dummy") a hack was implemented to make sure that the tty console remains the console behind the /dev/console device. The main problem with the hack is that, after getting the console pointer to the tty console, it is assumed the pointer is still valid after releasing the console_sem. This assumption is incorrect and unsafe. Make the hack safe by introducing a new function console_force_preferred_locked() and perform the full operation under the console_list_lock. Signed-off-by: John Ogness --- drivers/video/fbdev/xen-fbfront.c | 12 +++- include/linux/console.h | 1 + kernel/printk/printk.c| 49 +-- 3 files changed, 51 insertions(+), 11 deletions(-) diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 4d2694d904aa..8752d389e382 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -504,18 +504,14 @@ static void xenfb_make_preferred_console(void) if (console_set_on_cmdline) return; - console_lock(); + console_list_lock(); for_each_console(c) { if (!strcmp(c->name, "tty") && c->index == 0) break; } - console_unlock(); - if (c) { - unregister_console(c); - c->flags |= CON_CONSDEV; - c->flags &= ~CON_PRINTBUFFER; /* don't print again */ - register_console(c); - } + if (c) + console_force_preferred_locked(c); + console_list_unlock(); } static int xenfb_resume(struct xenbus_device *dev) diff --git a/include/linux/console.h b/include/linux/console.h index f716e1dd9eaf..9cea254b34b8 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -291,6 +291,7 @@ enum con_flush_mode { }; extern int add_preferred_console(char *name, int idx, char *options); +extern void console_force_preferred_locked(struct console *con); extern void register_console(struct console *); extern int unregister_console(struct console *); extern void console_lock(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index e770b1ede6c9..dff76c1cef80 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -247,9 +247,10 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void console_list_lock(void) { /* -* In unregister_console(), synchronize_srcu() is called with the -* console_list_lock held. Therefore it is not allowed that the -* console_list_lock is taken with the srcu_lock held. +* In unregister_console() and console_force_preferred_locked(), +* synchronize_srcu() is called with the console_list_lock held. +* Therefore it is not allowed that the console_list_lock is taken +* with the srcu_lock held. * * Detecting if this context is really in the read-side critical * section is only possible if the appropriate debug options are @@ -3461,6 +3462,48 @@ int unregister_console(struct console *console) } EXPORT_SYMBOL(unregister_console); +/** + * console_force_preferred_locked - force a registered console preferred + * @con: The registered console to force preferred. + * + * Must be called under console_list_lock(). + */ +void console_force_preferred_locked(struct console *con) +{ + struct console *cur_pref_con; + + if (!console_is_registered_locked(con)) + return; + + cur_pref_con = console_first(); + + /* Already preferred? */ + if (cur_pref_con == con) + return; + + /* +* Delete, but do not re-initialize the entry. This allows the console +* to continue to appear registered (via any hlist_unhashed_lockless() +* checks), even though it was briefly removed from the console list. +*/ + hlist_del_rcu(>node); + + /* +* Ensure that all SRCU list walks have completed so that the console +* can be added to the beginning of the console list and its forward +* list pointer can be re-initialized. +*/ + synchronize_srcu(_srcu); + + con->flags |= CON_CONSDEV; + WARN_ON(!con->device); + + /* Only the new head can have CON_CONSDEV set. */ + console_srcu_write_flags(cur_pref_con, cur_pref_con->flags & ~CON_CONSDEV); + hlist_add_behind_rcu(>node, console_list.first); +} +EXPORT_SYMBOL(console_force_preferred_locked); + /* * Initialize the console device. This is called *early*, so * we can't necessarily depend on lots of kernel help here. -- 2.30.2
[PATCH printk v4 00/39] reduce console_lock scope
This is v4 of a series to prepare for threaded/atomic printing. v3 is here [0]. This series focuses on reducing the scope of the BKL console_lock. It achieves this by switching to SRCU and a dedicated mutex for console list iteration and modification, respectively. The console_lock will no longer offer this protection. Also, during the review of v2 it came to our attention that many console drivers are checking CON_ENABLED to see if they are registered. Because this flag can change without unregistering and because this flag does not represent an atomic point when an (un)registration process is complete, a new console_is_registered() function is introduced. This function uses the console_list_lock to synchronize with the (un)registration process to provide a reliable status. All users of the console_lock for list iteration have been modified. For the call sites where the console_lock is still needed (for other reasons), comments are added to explain exactly why the console_lock was needed. All users of CON_ENABLED for registration status have been modified to use console_is_registered(). Note that there are still users of CON_ENABLED, but this is for legitimate purposes about a registered console being able to print. The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. Changes since v3: general: - Implement console_srcu_read_flags() and console_srcu_write_flags() to be used for console->flags access under the srcu_read_lock or console_list_lock, respectively. The functions document their relationship to one another and use data_race(), READ_ONCE(), and WRITE_ONCE() macros to annotate their relationship. They also make use of lockdep to warn if used in improper contexts. - Replace all console_is_enabled() usage with console_srcu_read_flags() (all were under the srcu_read_lock). serial_core: - For uart_console_registered(), check uart_console() before taking the console_list_lock to avoid unnecessary lock contention for non-console ports. m68k/emu/nfcon: - Only explicitly enable the console if registering via debug=nfcon. tty/serial/sh-sci: - Add comments about why @options will always be NULL for the earlyprintk console. kdb: - Add comments explaining the expectations for console drivers to work correctly. printk: - Some code sites under SRCU were checking flags directly. Use console_srcu_read_flags() instead. - In register_console() rename bootcon_enabled/realcon_enabled to bootcon_registered/realcon_registered to avoid confusion. - In register_console() only check for boot console sequences if a boot console is registered and !keep_bootcon. In this case, also take the console_lock to guarantee safe access to console->seq. - In console_force_preferred_locked() use hlist_del_rcu() instead of hlist_del_init_rcu() so that there is never a window where the console can be viewed as unregistered. John Ogness [0] https://lore.kernel.org/lkml/20221107141638.3790965-1-john.ogn...@linutronix.de [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.11.09a John Ogness (37): printk: Prepare for SRCU console list protection printk: register_console: use "registered" for variable names printk: fix setting first seq for consoles um: kmsg_dump: only dump when no output console available tty: serial: kgdboc: document console_lock usage tty: tty_io: document console_lock usage proc: consoles: document console_lock usage printk: introduce console_list_lock console: introduce wrappers to read/write console flags um: kmsg_dumper: use srcu console list iterator kdb: use srcu console list iterator printk: console_flush_all: use srcu console list iterator printk: __pr_flush: use srcu console list iterator printk: console_is_usable: use console_srcu_read_flags printk: console_unblank: use srcu console list iterator printk: console_flush_on_panic: use srcu console list iterator printk: console_device: use srcu console list iterator console: introduce console_is_registered() serial_core: replace uart_console_enabled() with uart_console_registered() tty: nfcon: use console_is_registered() efi: earlycon: use console_is_registered() tty: hvc: use console_is_registered() tty: serial: earlycon: use console_is_registered() tty: serial: pic32_uart: use console_is_registered() tty: serial: samsung_tty: use console_is_registered() tty: serial: xilinx_uartps: use console_is_registered() usb: early: xhci-dbc: use console_is_registered() netconsole: avoid CON_ENABLED misuse to track registration printk, xen: fbfront: create/u
Re: [PATCH printk v3 33/40] printk, xen: fbfront: create/use safe function for forcing preferred
On 2022-11-10, Petr Mladek wrote: + /* Only the new head can have CON_CONSDEV set. */ + WRITE_ONCE(cur_pref_con->flags, cur_pref_con->flags & ~CON_CONSDEV); >>> >>> As mentioned in the reply for 7th patch, I would prefer to hide this >>> WRITE_ONCE into a wrapper, e.g. console_set_flag(). It might also >>> check that the console_list_lock is taken... >> >> Agreed. For v4 it will become: >> >> console_srcu_write_flags(cur_pref_con, cur_pref_con->flags & ~CON_CONSDEV); > > I am happy that your are going to introduce an API for this. > > Just to be sure. The _srcu_ in the name means that the write > will use WRITE_ONCE() so that it can be read safely in SRCU > context using READ_ONCE(). Do I get it correctly, please? Yes. > I expect that the counter part will be console_srcu_read_flags(). > I like the name. It is better than _unsafe_ that I proposed earlier. Since the only flag that is ever checked in this way is CON_ENABLED, I was planning on calling it console_srcu_is_enabled(). But I suppose I could have it be: (console_srcu_read_flags() & CON_ENABLED). That would keep it more abstract in case anyone should want to read other flag bits under SRCU. There are only 4 call sites that need this, so I suppose we don't need a special _is_enabled() variant. Thanks for the suggestion! John
Re: [PATCH printk v3 33/40] printk, xen: fbfront: create/use safe function for forcing preferred
On 2022-11-10, Petr Mladek wrote: >> +void console_force_preferred_locked(struct console *con) >> +{ >> +struct console *cur_pref_con; >> + >> +if (!console_is_registered_locked(con)) >> +return; >> + >> +cur_pref_con = console_first(); >> + >> +/* Already preferred? */ >> +if (cur_pref_con == con) >> +return; >> + >> +hlist_del_init_rcu(>node); > > We actually should re-initialize the node only after all existing > console list walks are finished. Se we should use here: > > hlist_del_rcu(>node); hlist_del_init_rcu() only re-initializes @pprev pointer. But maybe you are concerned that there is a window where list_unhashed() becomes true? I agree that it should be changed to hlist_del_rcu() because there should not be a window where this console appears unregistered. >> +/* Only the new head can have CON_CONSDEV set. */ >> +WRITE_ONCE(cur_pref_con->flags, cur_pref_con->flags & ~CON_CONSDEV); > > As mentioned in the reply for 7th patch, I would prefer to hide this > WRITE_ONCE into a wrapper, e.g. console_set_flag(). It might also > check that the console_list_lock is taken... Agreed. For v4 it will become: console_srcu_write_flags(cur_pref_con->flags & ~CON_CONSDEV); John
[PATCH printk v3 33/40] printk, xen: fbfront: create/use safe function for forcing preferred
With commit 9e124fe16ff2("xen: Enable console tty by default in domU if it's not a dummy") a hack was implemented to make sure that the tty console remains the console behind the /dev/console device. The main problem with the hack is that, after getting the console pointer to the tty console, it is assumed the pointer is still valid after releasing the console_sem. This assumption is incorrect and unsafe. Make the hack safe by introducing a new function console_force_preferred_locked() and perform the full operation under the console_list_lock. Signed-off-by: John Ogness --- drivers/video/fbdev/xen-fbfront.c | 12 +++-- include/linux/console.h | 1 + kernel/printk/printk.c| 44 --- 3 files changed, 46 insertions(+), 11 deletions(-) diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 4d2694d904aa..8752d389e382 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -504,18 +504,14 @@ static void xenfb_make_preferred_console(void) if (console_set_on_cmdline) return; - console_lock(); + console_list_lock(); for_each_console(c) { if (!strcmp(c->name, "tty") && c->index == 0) break; } - console_unlock(); - if (c) { - unregister_console(c); - c->flags |= CON_CONSDEV; - c->flags &= ~CON_PRINTBUFFER; /* don't print again */ - register_console(c); - } + if (c) + console_force_preferred_locked(c); + console_list_unlock(); } static int xenfb_resume(struct xenbus_device *dev) diff --git a/include/linux/console.h b/include/linux/console.h index cdae70e27377..b6b5d796d15c 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -273,6 +273,7 @@ enum con_flush_mode { }; extern int add_preferred_console(char *name, int idx, char *options); +extern void console_force_preferred_locked(struct console *con); extern void register_console(struct console *); extern int unregister_console(struct console *); extern void console_lock(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index be40a9688403..d74e6e609f7d 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -247,9 +247,10 @@ int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write, void console_list_lock(void) { /* -* In unregister_console(), synchronize_srcu() is called with the -* console_list_lock held. Therefore it is not allowed that the -* console_list_lock is taken with the srcu_lock held. +* In unregister_console() and console_force_preferred_locked(), +* synchronize_srcu() is called with the console_list_lock held. +* Therefore it is not allowed that the console_list_lock is taken +* with the srcu_lock held. * * Whether or not this context is in the read-side critical section * can only be detected if the appropriate debug options are enabled. @@ -3457,6 +3458,43 @@ int unregister_console(struct console *console) } EXPORT_SYMBOL(unregister_console); +/** + * console_force_preferred_locked - force a registered console preferred + * @con: The registered console to force preferred. + * + * Must be called under console_list_lock(). + */ +void console_force_preferred_locked(struct console *con) +{ + struct console *cur_pref_con; + + if (!console_is_registered_locked(con)) + return; + + cur_pref_con = console_first(); + + /* Already preferred? */ + if (cur_pref_con == con) + return; + + hlist_del_init_rcu(>node); + + /* +* Ensure that all SRCU list walks have completed so that the console +* can be added to the beginning of the console list and its forward +* list pointer can be re-initialized. +*/ + synchronize_srcu(_srcu); + + con->flags |= CON_CONSDEV; + WARN_ON(!con->device); + + /* Only the new head can have CON_CONSDEV set. */ + WRITE_ONCE(cur_pref_con->flags, cur_pref_con->flags & ~CON_CONSDEV); + hlist_add_behind_rcu(>node, console_list.first); +} +EXPORT_SYMBOL(console_force_preferred_locked); + /* * Initialize the console device. This is called *early*, so * we can't necessarily depend on lots of kernel help here. -- 2.30.2
[PATCH printk v3 00/40] reduce console_lock scope
This is v3 of a series to prepare for threaded/atomic printing. v2 is here [0]. This series focuses on reducing the scope of the BKL console_lock. It achieves this by switching to SRCU and a dedicated mutex for console list iteration and modification, respectively. The console_lock will no longer offer this protection and is completely removed from (un)register_console() and console_stop/start() code. Also, during the review of v2 it came to our attention that many console drivers are checking CON_ENABLED to see if they are registered. Because this flag can change without unregistering and because this flag does not represent an atomic point when an (un)registration process is complete, a new console_is_registered() function is introduced. This function uses the console_list_lock to synchronize with the (un)registration process to provide a reliable status. All users of the console_lock for list iteration have been modified. For the call sites where the console_lock is still needed (because of other reasons), comments are added to explain exactly why the console_lock was needed. All users of CON_ENABLED for registration status have been modified to use console_is_registered(). Note that there are still users of CON_ENABLED, but this is for legitimate purposes about a registered console being able to print. The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. Changes since v3: general: - introduce a synchronized console_is_registered() to query if a console is registered, meant to replace CON_ENABLED (mis)use for this purpose - directly read console->flags for registered consoles if it is race-free (and document that it is so) - replace uart_console_enabled() with a new uart_console_registered() based on console_is_registered() - change comments about why console_lock is used to synchronize console->device() by providing an example registration check fixups: - the following drivers were modified to use the new console_is_registered() instead of CON_ENABLED checks - arch/m68k/emu/nfcon.c - drivers/firmware/efi/earlycon.c - drivers/net/netconsole.c - drivers/tty/hvc/hvc_console.c - drivers/tty/serial/8250/8250_core.c - drivers/tty/serial/earlycon.c - drivers/tty/serial/pic32_uart.c - drivers/tty/serial/samsung_tty.c - drivers/tty/serial/serial_core.c - drivers/tty/serial/xilinx_uartps.c - drivers/usb/early/xhci-dbc.c um: kmsg_dumper: - change stdout dump criteria to match original intention kgdb/kdb: - in configure_kgdboc(), take console_list_lock to synchronize tty_find_polling_driver() against register_console() - add comments explaining why calling console->write() without locking might work tty: sh-sci: - use a setup() callback to setup the early console fbdev: xen: - implement a cleaner approach for console_force_preferred_locked() rcu: - implement debug_lockdep_rcu_enabled() for !CONFIG_DEBUG_LOCK_ALLOC printk: - check CONFIG_DEBUG_LOCK_ALLOC for srcu_read_lock_held() availability - for console_lock/_trylock/_unlock, replace "lock the console system" language with "block the console subsystem from printing" - use WRITE_ONCE() for updating console->flags of registered consoles - expand comments of synchronize_srcu() calls to explain why they are needed, and also expand comments to explain when it is not needed - change CON_BOOT consoles to always begin at earliest message - for non-BOOT/non-PRINTBUFFER consoles, initialize @seq to the minimal @seq of any of the enabled boot consoles - add comments and lockdep assertion to unregister_console_locked() because it is not clear from the name which lock is implied - dropped patches that caused unnecessary churn in the series John Ogness [0] https://lore.kernel.org/lkml/20221019145600.1282823-1-john.ogn...@linutronix.de [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.10.21a John Ogness (38): rcu: implement lockdep_rcu_enabled for !CONFIG_DEBUG_LOCK_ALLOC printk: Prepare for SRCU console list protection printk: fix setting first seq for consoles um: kmsg_dump: only dump when no output console available console: introduce console_is_enabled() wrapper printk: use console_is_enabled() um: kmsg_dump: use console_is_enabled() kdb: kdb_io: use console_is_enabled() um: kmsg_dumper: use srcu console list iterator tty: serial: kgdboc: document console_lock usage tty: tty_io: document console_lock usage proc: consoles: document console_lock usage kdb: use srcu console list iterator printk: console_flush_all: use src
Re: [PATCH printk v2 38/38] printk, xen: fbfront: create/use safe function for forcing preferred
On 2022-10-27, Petr Mladek wrote: >> -if (c) { >> -unregister_console(c); >> -c->flags |= CON_CONSDEV; >> -c->flags &= ~CON_PRINTBUFFER; /* don't print again */ >> -register_console(c); >> -} >> +if (c) >> +console_force_preferred(c); > > I would prefer to fix this a clean way. > > [...] > > I would suggest to implement: > > [...] > > It is a more code. But it is race-free. Also it is much more clear > what is going on. > > How does this sound, please? I wasn't sure if any of the other preferred-console magic in register_console() was needed, which is why I kept a full register_console() call. But if it really is just about forcing it the head and setting a new CON_CONSDEV, then your suggestion is much simpler. Thanks. John
[PATCH printk v2 24/38] xen: fbfront: use srcu console list iterator
Since the console_lock is not being used for anything other than safe console list traversal, use srcu console list iteration instead. Signed-off-by: John Ogness --- drivers/video/fbdev/xen-fbfront.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 4d2694d904aa..2552c853c6c2 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -500,16 +500,18 @@ static int xenfb_probe(struct xenbus_device *dev, static void xenfb_make_preferred_console(void) { struct console *c; + int cookie; if (console_set_on_cmdline) return; - console_lock(); - for_each_console(c) { + cookie = console_srcu_read_lock(); + for_each_console_srcu(c) { if (!strcmp(c->name, "tty") && c->index == 0) break; } - console_unlock(); + console_srcu_read_unlock(cookie); + if (c) { unregister_console(c); c->flags |= CON_CONSDEV; -- 2.30.2
[PATCH printk v2 38/38] printk, xen: fbfront: create/use safe function for forcing preferred
With commit 9e124fe16ff2("xen: Enable console tty by default in domU if it's not a dummy") a hack was implemented to make sure that the tty console remains the console behind the /dev/console device. The main problem with the hack is that, after getting the console pointer to the tty console, it is assumed the pointer is still valid after releasing the console_sem. This assumption is incorrect and unsafe. Make the hack safe by introducing a new function console_force_preferred() to perform the full operation under the console_list_lock. Signed-off-by: John Ogness --- drivers/video/fbdev/xen-fbfront.c | 8 +--- include/linux/console.h | 1 + kernel/printk/printk.c| 69 +++ 3 files changed, 46 insertions(+), 32 deletions(-) diff --git a/drivers/video/fbdev/xen-fbfront.c b/drivers/video/fbdev/xen-fbfront.c index 2552c853c6c2..aa362b25a60f 100644 --- a/drivers/video/fbdev/xen-fbfront.c +++ b/drivers/video/fbdev/xen-fbfront.c @@ -512,12 +512,8 @@ static void xenfb_make_preferred_console(void) } console_srcu_read_unlock(cookie); - if (c) { - unregister_console(c); - c->flags |= CON_CONSDEV; - c->flags &= ~CON_PRINTBUFFER; /* don't print again */ - register_console(c); - } + if (c) + console_force_preferred(c); } static int xenfb_resume(struct xenbus_device *dev) diff --git a/include/linux/console.h b/include/linux/console.h index bf1e8136424a..41378b00bbdd 100644 --- a/include/linux/console.h +++ b/include/linux/console.h @@ -235,6 +235,7 @@ enum con_flush_mode { }; extern int add_preferred_console(char *name, int idx, char *options); +extern void console_force_preferred(struct console *c); extern void register_console(struct console *); extern int unregister_console(struct console *); extern void console_lock(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 840d581c4b23..9a056a42b8d8 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -3207,38 +3207,17 @@ static void try_enable_default_console(struct console *newcon) static int unregister_console_locked(struct console *console); -/* - * The console driver calls this routine during kernel initialization - * to register the console printing procedure with printk() and to - * print any messages that were printed by the kernel before the - * console driver was initialized. - * - * This can happen pretty early during the boot process (because of - * early_printk) - sometimes before setup_arch() completes - be careful - * of what kernel features are used - they may not be initialised yet. - * - * There are two types of consoles - bootconsoles (early_printk) and - * "real" consoles (everything which is not a bootconsole) which are - * handled differently. - * - Any number of bootconsoles can be registered at any time. - * - As soon as a "real" console is registered, all bootconsoles - *will be unregistered automatically. - * - Once a "real" console is registered, any attempt to register a - *bootconsoles will be rejected - */ -void register_console(struct console *newcon) +static void register_console_locked(struct console *newcon) { struct console *con; bool bootcon_enabled = false; bool realcon_enabled = false; int err; - console_list_lock(); - for_each_console(con) { if (WARN(con == newcon, "console '%s%d' already registered\n", con->name, con->index)) { - goto unlock; + return; } if (con->flags & CON_BOOT) @@ -3251,7 +3230,7 @@ void register_console(struct console *newcon) if (newcon->flags & CON_BOOT && realcon_enabled) { pr_info("Too late to register bootconsole %s%d\n", newcon->name, newcon->index); - goto unlock; + return; } /* @@ -3282,7 +3261,7 @@ void register_console(struct console *newcon) /* printk() messages are not printed to the Braille console. */ if (err || newcon->flags & CON_BRL) - goto unlock; + return; /* * If we have a bootconsole, and are switching to a real console, @@ -3346,7 +3325,31 @@ void register_console(struct console *newcon) unregister_console_locked(con); } } -unlock: +} + +/* + * The console driver calls this routine during kernel initialization + * to register the console printing procedure with printk() and to + * print any messages that were printed by the kernel before the + * console driver was initialized. + * + * This can happen pretty early during the boot process (because of + * early
[PATCH printk v2 00/38] reduce console_lock scope
This is v2 of a series to prepare for threaded/atomic printing. It is a rework of patches 6-12 of the v1 [0]. From the v1, patches 1-5 are already mainline and a rework of patches >12 will be posted in a later series. This series focuses on reducing the scope of the BKL console_lock. It achieves this by switching to SRCU and a dedicated mutex for console list iteration and modification, respectively. The console_lock will no longer offer this protection and is completely removed from (un)register_console() and console_stop/start() code. All users of the console_lock for list iteration have been modified. For the call sites where the console_lock is still needed (because of other reasons), I added comments to explain exactly why the console_lock was needed. The base commit for this series is from Paul McKenney's RCU tree and provides an NMI-safe SRCU implementation [1]. Without the NMI-safe SRCU implementation, this series is not less safe than mainline. But we will need the NMI-safe SRCU implementation for atomic consoles anyway, so we might as well get it in now. Especially since it _does_ increase the reliability for mainline in the panic path. Changes since v2: general: - introduce console_is_enabled() to document safe data race on console->flags - switch all "console->flags & CON_ENABLED" code sites to console_is_enabled() - add "for_each_console_srcu" to .clang-format - cleanup/clarify comments relating to console_lock coverage/usage um: - kmsg_dumper: use srcu instead of console_lock for list iteration kgdb/kdb: - configure_kgdboc: keep console_lock for console->device() synchronization, use srcu for list iteration - kgdboc_earlycon_pre_exp_handler: use srcu instead of documenting unsafety for list iteration - kgdboc_earlycon_init: use console_list_lock instead of console_lock to lock list - kdb_msg_write: use srcu instead of documenting unsafety for list iteration tty: - show_cons_active: keep console_lock for console->device() synchronization fbdev: - xen-fbfront: xenfb_probe: use srcu instead of console_lock for list iteration, introduce console_force_preferred() to safely implement hack proc/consoles: - show_console_dev: keep console_lock for console->device() synchronization - c_next: use hlist_entry_safe() instead of hlist_for_each_entry_continue() printk: - remove console_lock from console_stop/start() and (un)register_console() - introduce console_srcu_read_(un)lock() to wrap scru read (un)lock - rename cons_first() macro to console_first() - for_each_console: add lockdep check instead of introducing new for_each_registered_console() - console_list_lock: add warning if in read-side critical section - release srcu read lock on handover - console_flush_all: use srcu instead of relying on console lock for list iteration - console_unblank: use srcu instead of relying on console_lock for list iteration - console_flush_on_panic: use srcu for list iteration and document console->seq race - device: keep console_lock for console->device() synchronization, usr srcu for list iteration - register_console: split list adding logic into the 3 distinct scenarios - register_console: set initial sequence number before adding to list - unregister_console: fix ENODEV return value if the console is not registered - console_stop: synchronize srcu - printk_late_init: use _safe variant of iteration - __pr_flush: use srcu instead of relying on console_lock for list iteration John Ogness [0] https://lore.kernel.org/r/20220924000454.3319186-1-john.ogn...@linutronix.de [1] https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git/log/?h=srcunmisafe.2022.10.18b John Ogness (37): printk: Convert console_drivers list to hlist printk: Prepare for SRCU console list protection printk: introduce console_is_enabled() wrapper printk: use console_is_enabled() tty: nfcon: use console_is_enabled() um: kmsg_dump: use console_is_enabled() efi: earlycon: use console_is_enabled() netconsole: use console_is_enabled() tty: hvc: use console_is_enabled() tty: serial: earlycon: use console_is_enabled() tty: serial: kgdboc: use console_is_enabled() tty: serial: pic32_uart: use console_is_enabled() tty: serial: samsung_tty: use console_is_enabled() tty: serial: serial_core: use console_is_enabled() tty: serial: xilinx_uartps: use console_is_enabled() tty: tty_io: use console_is_enabled() usb: early: xhci-dbc: use console_is_enabled() kdb: kdb_io: use console_is_enabled() um: kmsg_dumper: use srcu console list iterator serial: kgdboc: use srcu console list iterator serial: kgdboc: document console_lock usage tty: tty_io: document console_lock usage xen: fbfront: use srcu console list iterator proc: consoles: document console_lock usage kdb: use srcu console list iterator printk: console_flush_all: use srcu console list iterator printk:
Re: [PATCH] drm: fb-helper: Avoid nesting spinlock_t into raw_spinlock_t
On 2022-02-15, Sebastian Siewior wrote: >> From: Jiri Kosina >> >> drm_fb_helper_damage() is acquiring spinlock_t (helper->damage_lock), >> while it can be called from contexts where raw_spinlock_t is held (e.g. >> console_owner lock obtained on vprintk_emit() codepath). >> >> As the critical sections protected by damage_lock are super-tiny, let's >> fix this by converting it to raw_spinlock_t in order not to violate >> PREEMPT_RT-imposed lock nesting rules. >> >> This fixes the splat below. >> >> = >> [ BUG: Invalid wait context ] >> 5.17.0-rc4-2-gd567f5db412e #1 Not tainted > > rc4. Is this also the case in the RT tree which includes John's printk > changes? In the RT tree, the fbcon's write() callback is only called in preemptible() contexts. So this is only a mainline issue. The series I recently posted to LKML [0] should also address this issue (if/when it gets accepted). John [0] https://lore.kernel.org/lkml/20220207194323.273637-1-john.ogn...@linutronix.de
Re: printk deadlock due to double lock attempt on current CPU's runqueue
On 2021-11-10, Sultan Alsawaf wrote: > On Wed, Nov 10, 2021 at 11:13:37AM +0106, John Ogness wrote: >> Even after we introduce kthread printers, there will still be >> situations where direct printing is used: booting (before kthreads >> exist) and shutdown/suspend/crash situations, when the kthreads may >> not be active. > > Although I'm unaware of any ongoing kthread printer work, I'm curious > to know how a kthread approach wouldn't employ a try_to_wake_up() from > directly inside printk(), since the try_to_wake_up() hit from inside > the fbcon code is what caused my deadlock. The kthread approach triggers irq_work from printk(). The kthread printer is then woken from the irq_work. John Ogness
Re: printk deadlock due to double lock attempt on current CPU's runqueue
On 2021-11-10, Daniel Vetter wrote: > I'm a bit out of the loop but from lwn articles my understanding is > that part of upstreaming from -rt we no longer have the explicit "I'm > a safe console for direct printing" opt-in. Which I get from a > backwards compat pov, but I still think for at least fbcon we really > should never attempt a direct printk con->write, it's just all around > terrible. Right now we don't have an explicit "I'm a safe console for direct printing" option. Right now all printing is direct. But it sounds to me that we should add this console flag when we introduce kthread printers. > So yeah for fbcon at least I think we really should throw out direct > con->write from printk completely. Even after we introduce kthread printers, there will still be situations where direct printing is used: booting (before kthreads exist) and shutdown/suspend/crash situations, when the kthreads may not be active. I will introduce a console flag so that consoles can opt-out for direct printing. (opt-out rather than opt-in is probably easier, since there are only a few that would need to opt-out). Since kthread printers do not yet exist (hoping to get them in for 5.17), I am not sure how we should address the reported bug for existing kernels. John Ogness
kmemleak report: 5.15.0-rc3: nouveau_fence_new
Hello, With 5.15.0-rc3 on my ppc64 (PowerMac G5) I am seeing kmemleak reports. They are always 96 bytes and with the same stacktrace. unreferenced object 0xc00011d2a7e0 (size 96): comm "X", pid 1743, jiffies 4295010075 (age 5457.040s) hex dump (first 32 bytes): c0 00 00 00 0b 9f f0 00 c0 00 3d 00 00 b0 85 90 ..=. 00 00 00 a9 77 41 30 23 c0 00 00 00 08 db b7 c8 wA0# backtrace: [<6f102108>] .nouveau_fence_new+0x4c/0x120 [nouveau] [<395e0a83>] .nouveau_bo_move+0x4f0/0x870 [nouveau] [<f17bc6da>] .ttm_bo_handle_move_mem+0xb4/0x1e0 [ttm] [<fb36762f>] .ttm_bo_validate+0x144/0x230 [ttm] [<a84dc7b3>] .nouveau_bo_validate+0x70/0xc0 [nouveau] [<b4e870a2>] .nouveau_gem_ioctl_pushbuf+0x6e0/0x1a90 [nouveau] [<7b7c5c38>] .drm_ioctl_kernel+0x104/0x180 [drm] [<0af76e30>] .drm_ioctl+0x244/0x490 [drm] [<ebb759e8>] .nouveau_drm_ioctl+0x78/0x140 [nouveau] [<263274a7>] .__se_sys_ioctl+0xfc/0x160 [<88c39f3d>] .system_call_exception+0x178/0x2a0 [<0cfdf34f>] system_call_common+0xec/0x250 If I decode this stacktrace using decode_stacktrace.sh so that the line numbers can be seen, I get the following: .nouveau_fence_new+0x4c/0x120 [nouveau] linux-5.15-rc3/include/linux/slab.h:591 linux-5.15-rc3/include/linux/slab.h:721 linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_fence.c:424 .nouveau_bo_move+0x4f0/0x870 [nouveau] linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_bo.c:821 linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_bo.c:1032 .ttm_bo_handle_move_mem+0xb4/0x1e0 [ttm] linux-5.15-rc3/drivers/gpu/drm/ttm/ttm_bo.c:197 .ttm_bo_validate+0x144/0x230 [ttm] linux-5.15-rc3/drivers/gpu/drm/ttm/ttm_bo.c:904 linux-5.15-rc3/drivers/gpu/drm/ttm/ttm_bo.c:981 .nouveau_bo_validate+0x70/0xc0 [nouveau] linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_bo.c:647 .nouveau_gem_ioctl_pushbuf+0x6e0/0x1a90 [nouveau] linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_gem.c:548 linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_gem.c:605 linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_gem.c:799 .drm_ioctl_kernel+0x104/0x180 [drm] linux-5.15-rc3/drivers/gpu/drm/drm_ioctl.c:795 .drm_ioctl+0x244/0x490 [drm] linux-5.15-rc3/include/linux/thread_info.h:185 linux-5.15-rc3/include/linux/thread_info.h:218 linux-5.15-rc3/include/linux/uaccess.h:199 linux-5.15-rc3/drivers/gpu/drm/drm_ioctl.c:899 .nouveau_drm_ioctl+0x78/0x140 [nouveau] linux-5.15-rc3/drivers/gpu/drm/nouveau/nouveau_drm.c:1163 .__se_sys_ioctl+0xfc/0x160 linux-5.15-rc3/fs/ioctl.c:51 linux-5.15-rc3/fs/ioctl.c:874 linux-5.15-rc3/fs/ioctl.c:860 .system_call_exception+0x178/0x2a0 .system_call_exception linux-5.15-rc3/arch/powerpc/kernel/interrupt.c:233 system_call_common+0xec/0x250 linux-5.15-rc3/arch/powerpc/kernel/interrupt_64.S:314 Here are all enabled DRM and NOUVEAU configs in my kernel: CONFIG_DRM=m CONFIG_DRM_KMS_HELPER=m CONFIG_DRM_FBDEV_EMULATION=y CONFIG_DRM_FBDEV_OVERALLOC=100 CONFIG_DRM_TTM=m CONFIG_DRM_TTM_HELPER=m CONFIG_DRM_NOUVEAU=m CONFIG_NOUVEAU_DEBUG=5 CONFIG_NOUVEAU_DEBUG_DEFAULT=3 CONFIG_DRM_NOUVEAU_BACKLIGHT=y CONFIG_DRM_PANEL=y CONFIG_DRM_BRIDGE=y CONFIG_DRM_PANEL_BRIDGE=y CONFIG_DRM_PANEL_ORIENTATION_QUIRKS=m And lspci output: :f0:10.0 VGA compatible controller: NVIDIA Corporation NV34 [GeForce FX 5200 Ultra] (rev a1) I have been running 5.12 on my machine without these reports. So it might be something that showed up in 5.13 or 5.14 as well. I do not know if this is a good channel for reporting this, so please let me know if I should report it somewhere else. Also let me know if you need any additional information from me. John Ogness
Re: [PATCH 1/3] printk: use CONFIG_CONSOLE_LOGLEVEL_* directly
On 2021-02-02, Masahiro Yamada wrote: > CONSOLE_LOGLEVEL_DEFAULT is nothing more than a shorthand of > CONFIG_CONSOLE_LOGLEVEL_DEFAULT. > > When you change CONFIG_CONSOLE_LOGLEVEL_DEFAULT from Kconfig, almost > all objects are rebuilt because CONFIG_CONSOLE_LOGLEVEL_DEFAULT is > used in , which is included from most of source files. > > In fact, there are only 4 users of CONSOLE_LOGLEVEL_DEFAULT: > > arch/x86/platform/uv/uv_nmi.c > drivers/firmware/efi/libstub/efi-stub-helper.c > drivers/tty/sysrq.c > kernel/printk/printk.c > > So, when you change CONFIG_CONSOLE_LOGLEVEL_DEFAULT and rebuild the > kernel, it is enough to recompile those 4 files. > > Remove the CONSOLE_LOGLEVEL_DEFAULT definition from , > and use CONFIG_CONSOLE_LOGLEVEL_DEFAULT directly. With commit a8fe19ebfbfd ("kernel/printk: use symbolic defines for console loglevels") it can be seen that various drivers used to hard-code their own values. The introduction of the macros in an intuitive location (include/linux/printk.h) made it easier for authors to find/use the various available printk settings and thresholds. Technically there is no problem using Kconfig macros directly. But will authors bother to hunt down available Kconfig settings? Or will they only look in printk.h to see what is available? IMHO if code wants to use settings from a foreign subsystem, it should be taking those from headers of that subsystem, rather than using some Kconfig settings from that subsystem. Headers exist to make information available to external code. Kconfig (particularly for a subsystem) exist to configure that subsystem. But my feeling on this may be misguided. Is it generally accepted in the kernel that any code can use Kconfig settings of any other code? John Ogness ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v9 04/18] kunit: test: add kunit_stream a std::stream like logger
On 2019-08-01, Brendan Higgins wrote: > On Fri, Jul 26, 2019 at 1:31 AM Petr Mladek wrote: >> On Thu 2019-07-25 13:21:12, Brendan Higgins wrote: >>> On Wed, Jul 24, 2019 at 12:31 AM Petr Mladek wrote: >>>> On Mon 2019-07-22 16:54:10, Stephen Boyd wrote: >>>>> Quoting Brendan Higgins (2019-07-22 15:30:49) >>>>>> On Mon, Jul 22, 2019 at 1:03 PM Stephen Boyd wrote: >>>>>>> What's the calling context of the assertions and expectations? I >>>>>>> still don't like the fact that string stream needs to allocate >>>>>>> buffers and throw them into a list somewhere because the calling >>>>>>> context matters there. >>>>>> >>>>>> The calling context is the same as before, which is anywhere. >>>>> >>>>> Ok. That's concerning then. >>>>> >>>>>>> I'd prefer we just wrote directly to the console/log via printk >>>>>>> instead. That way things are simple because we use the existing >>>>>>> buffering path of printk, but maybe there's some benefit to the >>>>>>> string stream that I don't see? Right now it looks like it >>>>>>> builds a string and then dumps it to printk so I'm sort of lost >>>>>>> what the benefit is over just writing directly with printk. >>>>>> >>>>>> It's just buffering it so the whole string gets printed >>>>>> uninterrupted. If we were to print out piecemeal to printk, >>>>>> couldn't we have another call to printk come in causing it to >>>>>> garble the KUnit message we are in the middle of printing? >>>>> >>>>> Yes, printing piecemeal by calling printk many times could lead to >>>>> interleaving of messages if something else comes in such as an >>>>> interrupt printing something. Printk has some support to hold >>>>> "records" but I'm not sure how that would work here because >>>>> KERN_CONT talks about only being used early on in boot code. I >>>>> haven't looked at printk in detail though so maybe I'm all wrong >>>>> and KERN_CONT just works? >>>> >>>> KERN_CONT does not guarantee that the message will get printed >>>> together. The pieces get interleaved with messages printed in >>>> parallel. >>>> >>>> Note that KERN_CONT was originally really meant to be used only >>>> during boot. It was later used more widely and ended in the best >>>> effort category. >>>> >>>> There were several attempts to make it more reliable. But it was >>>> always either too complicated or error prone or both. >>>> >>>> You need to use your own buffering if you rely want perfect output. >>>> The question is if it is really worth the complexity. Also note >>>> that any buffering reduces the chance that the messages will reach >>>> the console. >>> >>> Seems like that settles it then. Thanks! >>> >>>> BTW: There is a work in progress on a lockless printk ring buffer. >>>> It will make printk() more secure regarding deadlocks. But it might >>>> make transparent handling of continuous lines even more tricky. >>>> >>>> I guess that local buffering, before calling printk(), will be >>>> even more important then. Well, it might really force us to create >>>> an API for it. >>> >>> Cool! Can you CC me on that discussion? >> >> Adding John Oggness into CC. >> >> John, please CC Brendan Higgins on the patchsets eventually switching >> printk() into the lockless buffer. The test framework is going to >> do its own buffering to keep the related messages together. >> >> The lockless ringbuffer might make handling of related (partial) >> lines worse or better. It might justify KUnit's extra buffering >> or it might allow to get rid of it. > > Thanks for CC'ing me on the printk ringbuffer thread. It looks like it > actually probably won't affect my needs for KUnit logging. The biggest > reason I need some sort of buffering system is to be able to compose > messages piece meal into a single message that will be printed out to > the user as a single message with no messages from other printk > callers printed out in the middle of mine. printk has this same requirement for its CONT messages. You can read about how I propose to implement that here[0], using a sep
Re: [PATCH v2 1/3] drm: Add support for panic message output
On 2019-03-14, John Ogness wrote: > On 2019-03-14, Daniel Vetter wrote: >> That's why we came up with the trylock + immediate bail out design if >> that fails. Plus really only render the oops int whatever is the >> current display buffer, so that we don't have to do any hw >> programming at all. > > I think this is your best option. The real work will be identifying > any/all spin locking that currently exists. For all of those, the code > needs to change to: > > 1. trylock if oops_in_progress, otherwise spinlock On second thought, you shouldn't use oops_in_progress. It would be better if DRM had its own flag to signify that it is currently being used in kmsg_dump context. > 2. if trylock fails, the code must have a sane failure > > The 2nd point will be the difficult one. For example, you may have > functions without a return value taking spinlocks. But now those > functions could fail. John Ogness ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 1/3] drm: Add support for panic message output
On 2019-03-14, Daniel Vetter wrote: > That's why we came up with the trylock + immediate bail out design if > that fails. Plus really only render the oops int whatever is the > current display buffer, so that we don't have to do any hw programming > at all. I think this is your best option. The real work will be identifying any/all spin locking that currently exists. For all of those, the code needs to change to: 1. trylock if oops_in_progress, otherwise spinlock 2. if trylock fails, the code must have a sane failure The 2nd point will be the difficult one. For example, you may have functions without a return value taking spinlocks. But now those functions could fail. John Ogness ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH v2 1/3] drm: Add support for panic message output
On 2019-03-12, Ahmed S. Darwish wrote: >>>> + >>>> +static void drm_panic_kmsg_dump(struct kmsg_dumper *dumper, >>>> + enum kmsg_dump_reason reason) >>>> +{ >>>> + class_for_each_device(drm_class, NULL, dumper, drm_panic_dev_iter); >>> >>> class_for_each_device uses klist, which only uses an irqsave >>> spinlock. I think that's good enough. Comment to that effect would >>> be good e.g. >>> >>> /* based on klist, which uses only a spin_lock_irqsave, which we >>> * assume still works */ >>> >>> If we aim for perfect this should be a trylock still, maybe using >>> our own device list. >>> > > I definitely agree here. > > The lock may already be locked either by a stopped CPU, or by the > very same CPU we execute panic() on (e.g. NMI panic() on the > printing CPU). > > This is why it's very common for example in serial consoles, which > are usually careful about re-entrance and panic contexts, to do: > > xx_console_write(...) { > if (oops_in_progress) > locked = spin_trylock_irqsave(>lock, flags); > else > spin_lock_irqsave(>lock, flags); > } > > I'm quite positive we should do the same for panic drm drivers. This construction will continue, even if the trylock fails. It only makes sense to do this if the driver has a chance of being successful. Ignoring locks is a serious issue. I personally am quite unhappy that the serial drivers do this, which was part of my motivation for the new printk design I'm working on. If the driver is not capable of doing something useful on a failed trylock, then I recommend just skipping that device. Maybe trying it again later after trying all the devices? John Ogness ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel