Package: logwatch
Version: 7.5.5-1
Severity: important

Dear Maintainer,

I recently discovered that logwatch did not report a systemd service
(snapper-timeline.service) that failed repeatedly on my system. Looking through
the script
/usr/share/logwatch/scripts/services/systemd
I determined that the script makes incorrect assumptions about what the log
entries look like, or at least assumptions that do not apply to a service of
type 'simple'.

The comments within the script explain the logic:
> # Failure will generate multiple messages like:
> # Feb  5 16:37:50 hostname systemd: ansible-pull.service: main process exited, code=exited, status=2/INVALIDARGUMENT
> # Feb  5 16:37:50 hostname systemd: Failed to start Run ansible-pull on boot.
> # Feb  5 16:37:50 hostname systemd: Unit ansible-pull.service entered failed state.
> # Feb  5 16:37:50 hostname systemd: ansible-pull.service failed.

and a few lines further down:
> # These events will be caught with the Unit X entered failed state message

So, everything other than "Unit {} entered failed state." will be ignored.

The problem is that services of type 'simple' do not trigger this log message
when they fail, as this example shows:

$ journalctl -u snapper-timeline.service -S 2023-02-16 -U 2023-02-17
-- Journal begins at Fri 2022-12-02 07:02:42 CET, ends at Tue 2023-02-28 01:24:53 CET. --
Feb 16 03:05:29 HomeSrv systemd[1]: Started Timeline of Snapper Snapshots.
Feb 16 03:05:43 HomeSrv systemd-helper[2690]: running timeline for 'archive'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'documents'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'home'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'photos'.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: IO Error (.snapshots is not a btrfs subvolume).
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: timeline for 'photos' failed.
Feb 16 03:05:44 HomeSrv systemd-helper[2690]: running timeline for 'root'.
Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: Main process exited, code=exited, status=1/FAILURE
Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: Failed with result 'exit-code'.

Because the log message "Unit {} entered failed state." does not appear here,
logwatch never reports the failure of the unit (snapper-timeline.service, in
this example, is a simple service). I would expect logwatch to catch and report
such failures, and I think the script should be changed to look for messages
that actually appear for all unit types when a unit fails.
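
To illustrate the kind of matching I have in mind, here is a minimal sketch in
Python. It is not a patch: the logwatch script itself is written in Perl, and
the pattern and the function name failed_unit are made up for this example. It
only assumes the two message formats quoted in this report, namely
"Unit X entered failed state." and "X: Failed with result 'Y'.".

```python
import re

# Hypothetical pattern, not taken from the logwatch script. It accepts both the
# "Unit X entered failed state." wording from the script comments and the
# "X: Failed with result 'Y'." wording from the journal output above.
FAILED_UNIT = re.compile(
    r"systemd(?:\[\d+\])?: "
    r"(?:Unit (?P<old>\S+) entered failed state\."
    r"|(?P<new>\S+): Failed with result '(?P<result>[^']+)'\.)"
)


def failed_unit(line):
    """Return the name of the failed unit, or None if the line reports no failure."""
    match = FAILED_UNIT.search(line)
    if match is None:
        return None
    return match.group("old") or match.group("new")


lines = [
    "Feb  5 16:37:50 hostname systemd: Unit ansible-pull.service entered failed state.",
    "Feb 16 03:05:45 HomeSrv systemd[1]: snapper-timeline.service: Failed with result 'exit-code'.",
    "Feb 16 03:05:29 HomeSrv systemd[1]: Started Timeline of Snapper Snapshots.",
]
for line in lines:
    print(failed_unit(line))
```

With a check along these lines, the snapper-timeline.service failure from the
journal excerpt would be picked up, while the "Started ..." line would still be
ignored.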


Thanks and regards,

Timo


-- System Information:
Debian Release: 11.6
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable-security'), (500, 'stable'), (400, 'proposed-updates')
Architecture: amd64 (x86_64)

Kernel: Linux 5.10.0-21-amd64 (SMP w/4 CPU threads)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages logwatch depends on:
ii  msmtp-mta [mail-transport-agent]  1.8.11-2.1
ii  perl                              5.32.1-4+deb11u2

Versions of packages logwatch recommends:
ii  libdate-manip-perl   6.83-1
ii  libsys-cpu-perl      0.61-2+b6
ii  libsys-meminfo-perl  0.99-1+b5

logwatch suggests no packages.

-- no debconf information