Re: [systemd-devel] Antw: [EXT] Re: [systemd‑devel] Q: write error, watchdog, journald core dump, ordering of entries

2021-10-11 Thread Lennart Poettering
On Mo, 11.10.21 10:57, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> > Now when journald hangs due to some underlying IO issue, then it might
> > miss the watchdog deadline, and PID 1 might then kill it to get it
> > back up. It will log about this to the journal, but given tha tthe
> > journal is hanging/being killed it's not going to write the messages
> > to disk, the mesages will remain queued in the logging socket for a
> > bit. Until eventually journald starts up again, and resumes processing
> > log messages. it will then process the messages already queued in the
> > sockets from when it was hanging, and thus the order might be
> > surprising.
>
> Hi!
>
> Thanks for explaining.
> Don't you have some OOB-logging, that is: Log a message before processing the
> queue logs.

The "Journal started" message is inserted into the log stream by
journald itself before processing the already queued messages.

Lennart

--
Lennart Poettering, Berlin


[systemd-devel] Antw: [EXT] Re: [systemd‑devel] Q: write error, watchdog, journald core dump, ordering of entries

2021-10-11 Thread Ulrich Windl
>>> Lennart Poettering  schrieb am 11.10.2021 um 09:54
in
Nachricht :
> On Mi, 06.10.21 10:29, Ulrich Windl (ulrich.wi...@rz.uni‑regensburg.de)
wrote:
> 
>> Hi!
>>
>> We had a stuck networkc card on a server that seems to have caused the RAID

> controller with two SSDs to be stuck on write as well.
>> Anyway journald dumped core with this stack:
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: Process 3321 (systemd‑journal)
of 
> user 0 dumped core.
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: Coredump diverted to 
>
/var/lib/systemd/coredump/core.systemd‑journal.0.a4eb19afcc314d99936cbdd5542e4
> fed.3321.163345758500.lz4
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: Stack trace of thread 3321:
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #0  0x7f913492d0c2 
> journal_file_append_object (libsystemd‑shared‑234.so)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #1  0x7f913492dba3 n/a 
> (libsystemd‑shared‑234.so)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #2  0x7f913492fc79 
> journal_file_append_entry (libsystemd‑shared‑234.so)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #3  0x557fe532908d n/a 
> (systemd‑journald)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #4  0x557fe532b15f n/a 
> (systemd‑journald)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #5  0x557fe5324664 n/a 
> (systemd‑journald)
>> Oct 05 20:13:25 h19 systemd‑coredump[26759]: #6  0x557fe5326a80 n/a 
> (systemd‑journald)
>> Oct 05 20:13:25 h19 kernel: printk: systemd‑coredum: 6 output lines 
> suppressed due to ratelimiting
>>
>> (systemd‑234‑24.90.1.x86_64 of SLES15 SP2 on x86_64)
>>
>> journald seems to have restarted later, but I wonder about the ordering of

> the entries following:
>> Oct 05 20:13:25 h19 systemd‑journald[26760]: Journal started
>> Oct 05 20:13:25 h19 systemd‑journald[26760]: System journal 
> (/var/log/journal/8695c89eb080463dad2ca9f9aaedf162) is 928.0M, max 4.0G,
3.0G 
> free.
>>
>> Oct 05 20:12:52 h19 systemd[1]: systemd‑journald.service: Watchdog timeout

> (limit 3min)!
>> Oct 05 20:12:52 h19 systemd[1]: systemd‑journald.service: Killing process 
> 3321 (systemd‑journal) with signal SIGABRT.
>> Oct 05 20:13:25 h19 systemd[1]: Starting Flush Journal to Persistent 
> Storage...
>> Oct 05 20:13:25 h19 systemd[1]: Started Flush Journal to Persistent
Storage.
>>
>> I don't understand why the core dump is logged before the signal
>> being sent and the watchdog timeout.
> 
> PID 1 logs to journald. PID 1 also runs and supervises
> journald. That's quite a special relationship: PID1 both is client to
> journald and manages it.
> 
> Now when journald hangs due to some underlying IO issue, then it might
> miss the watchdog deadline, and PID 1 might then kill it to get it
> back up. It will log about this to the journal, but given tha tthe
> journal is hanging/being killed it's not going to write the messages
> to disk, the mesages will remain queued in the logging socket for a
> bit. Until eventually journald starts up again, and resumes processing
> log messages. it will then process the messages already queued in the
> sockets from when it was hanging, and thus the order might be
> surprising.

Hi!

Thanks for explaining.
Don't you have some OOB-logging, that is: Log a message before processing the
queue logs.
When killing the journal process it would make sense to log the watchdog
timeout and the killing "OOB".
Of course those OOB messages couldn't actually be logged before journald
restarted, so maybe you'd need a second queue (that is emptied first).

Regards,
Ulrich Windl

> 
> ‑‑
> Lennart Poettering, Berlin