On Mon 2025-09-08 14:15:08, John Ogness wrote:
> On 2025-09-05, Petr Mladek <pmla...@suse.com> wrote:
> > On Tue 2025-09-02 15:33:53, Marcos Paulo de Souza wrote:
> >> These helpers will be used when calling console->write_atomic on
> >> KDB code in the next patch. It's basically the same implementation
> >> as nbcon_device_try_acquire, but using NBCON_PRIO_EMERGENCY when
> >> acquiring the context.
> >> 
> >> For release we need to flush the console, since some messages could be
> >> added before the context was acquired, as KDB emits the messages using
> >> con->{write,write_atomic} instead of storing them on the ring buffer.
> >
> > I am a bit confused by the last paragraph. It is a very long sentence.
> >
> > Sigh, I wanted to propose a simple and clear alternative. But I ended
> > up in a rabbit hole and with a rather complex text:
> >
> > <proposal>
> > The atomic flush in the release function is questionable. vkdb_printf()
> > is primarily called only when other CPUs are quiescent in kdb_main_loop()
> > and do not call the classic printk(). But, for example, the
> > write_atomic() callback might print debug messages. Or there is
> > one kdb_printf() called in kgdb_panic() before other CPUs are
> > quiescent. So the flush might be useful. Especially, when
> > the kdb code fails to quiesce the CPUs and returns early.
> >
> > Let's keep it simple and just call __nbcon_atomic_flush_pending_con().
> > It uses write_atomic() callback which is used by the locked kdb code
> > anyway.
> >
> > The legacy loop (console_trylock()/console_unlock()) is not
> > usable in kdb context.
> >
> > It might make sense to trigger the flush via the printk kthread.
> > But it would not work in panic(), which is the only known place where
> > kdb_printf() is called while other CPUs are not quiescent. So, it does
> > not look worth it.
> > </proposal>
> >
> > What do you think?
> >
> > My opinion:
> >
> > Honestly, I think that the flush is not very important because
> > it will most often have nothing to do.
> >
> > I am just not sure whether it is better to have it there
> > or avoid it. It might be better to remove it after all.
> > And just document the decision.
> 
> IMHO keeping the flush is fine. There are cases where there might be
> something to print. And since a printing kthread will get no chance to
> print as long as kdb is alive, we should have kdb flushing that
> console.
> 
> Note that this is the only console that will actually see the new
> messages immediately as all the other CPUs are quiesced.

I do not understand this argument. IMHO, this new
try_acquire()/release() API should primarily flush only
the console which was (b)locked by this API.

It will be called in kdb_msg_write() which tries to write
to all registered consoles. So the other nbcon consoles will
get flushed when the try_acquire() succeeds on them. And the
legacy consoles were never flushed.

> For this reason
> we probably want to use __nbcon_atomic_flush_pending() to try to flush
> _all_ the consoles.

I would prefer to keep __nbcon_atomic_flush_pending_con().
I mean to flush only the console which was blocked.

Note that we would need to increment oops_in_progress if we wanted
to flush legacy consoles in this context... which would spread
the mess into nbcon code...
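
For illustration only, the behavior described above can be modelled
in user space roughly as below. This is a toy model and not kernel code.
Every toy_* name and the struct layout are made up for the example.
It only shows that a loop like the one in kdb_msg_write() ends up
flushing each nbcon console it manages to acquire, one by one, while
legacy consoles are left untouched.

```c
/*
 * User-space toy model, NOT kernel code. All toy_* names are
 * hypothetical and only mimic the shape of the discussed API.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_console {
	bool is_nbcon;		/* false models a legacy console */
	bool locked;		/* currently owned by the kdb context */
	unsigned int pending;	/* backlog records not printed yet */
	unsigned int printed;	/* records which reached the console */
};

/* Models flushing the backlog of a single console (the _con() variant). */
static void toy_flush_pending_con(struct toy_console *con)
{
	con->printed += con->pending;
	con->pending = 0;
}

/* Only unlocked nbcon consoles can be acquired in this model. */
static bool toy_kdb_try_acquire(struct toy_console *con)
{
	assert(con != NULL);
	if (!con->is_nbcon || con->locked)
		return false;
	con->locked = true;
	return true;
}

static void toy_kdb_release(struct toy_console *con)
{
	assert(con != NULL && con->locked);
	con->locked = false;
	/* Flush only the console which was blocked by this context. */
	toy_flush_pending_con(con);
}

/* Models the loop in kdb_msg_write(): visit every registered console. */
static void toy_kdb_msg_write(struct toy_console *cons, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!toy_kdb_try_acquire(&cons[i]))
			continue;	/* legacy or busy console: skipped */
		cons[i].printed++;	/* the kdb message, written directly */
		toy_kdb_release(&cons[i]);
	}
}
```

In this model, a single toy_kdb_msg_write() call drains the backlog of
every nbcon console without any "flush all" step in the release path.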

> As to the last paragraph of the commit message, I would keep it simple:
> 
> After release try to flush all consoles since there may be a backlog of
> messages in the ringbuffer. The kthread console printers do not get a
> chance to run while kdb is active.

I like this text.

Best Regards,
Petr


