Re: Many systemd units do not start anymore

2024-02-07 Thread Michael Biebl

On 07.02.24 at 13:07, Christoph Pleger wrote:

Hello,

On Wednesday, 07.02.2024 at 12:27 +0100, Michael Biebl wrote:

On 07.02.24 at 11:32, Christoph Pleger wrote:

Hello,



systemd-time-wait-sync.service is running but has not completed.
I wonder if that is
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
https://github.com/systemd/systemd/issues/14061

Which NTP service do you use?


I am using chrony, because the server is offering time services to NTP
clients.

Regards
    Christoph

PS: As a (temporary) workaround, I created an override file for
systemd-time-wait-sync.service that just calls /bin/true .
   



Have you tried to simply disable the service?


I have now, and it works. Any idea what might have enabled it? I have
quite old entries in /root/.bash_history, but no sign that I
enabled that service manually. I did not even know about its existence
before Monday ...


Nowadays, systemd-time-wait-sync.service is disabled in presets. I don't
know if this was (already) the case for Bullseye (which I no longer use).


Might be that you enabled it via presets / it was enabled via presets.
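
One way to see what the presets say about a unit is to search the preset files directly. A rough sketch, assuming the preset directories documented in systemd.preset(5); preset_lines is a hypothetical helper name, and it only finds lines that name the unit literally, not glob patterns such as "disable *":

```shell
#!/bin/sh
# Print every preset line that mentions the given unit by name.
# Usage: preset_lines UNIT DIR...
# Limitation: glob patterns in preset files (e.g. "disable *") are not matched.
preset_lines() {
    unit=$1
    shift
    for dir in "$@"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        # -F: treat the unit name as a fixed string; -H: always show the file name.
        grep -H -F -- "$unit" "$dir"/*.preset 2>/dev/null
    done
}

# Typical use (not run here):
#   preset_lines systemd-time-wait-sync.service \
#       /etc/systemd/system-preset /usr/lib/systemd/system-preset
#   systemctl is-enabled systemd-time-wait-sync.service
```

A "disable" line only affects what "systemctl preset" would do; a unit that was enabled earlier stays enabled until it is disabled explicitly.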



OpenPGP_signature.asc
Description: OpenPGP digital signature


Re: Many systemd units do not start anymore

2024-02-07 Thread Christoph Pleger
Hello,

On Wednesday, 07.02.2024 at 12:27 +0100, Michael Biebl wrote:
> On 07.02.24 at 11:32, Christoph Pleger wrote:
> > Hello,
> > 
> > > 
> > > systemd-time-wait-sync.service is running but has not completed.
> > > I wonder if that is
> > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
> > > https://github.com/systemd/systemd/issues/14061
> > > 
> > > Which NTP service do you use?
> > 
> > I am using chrony, because the server is offering time services to NTP
> > clients.
> > 
> > Regards
> >    Christoph
> > 
> > PS: As a (temporary) workaround, I created an override file for
> > systemd-time-wait-sync.service that just calls /bin/true .
> >   
> > 
> 
> Have you tried to simply disable the service?

I have now, and it works. Any idea what might have enabled it? I have
quite old entries in /root/.bash_history, but no sign that I
enabled that service manually. I did not even know about its existence
before Monday ...

Regards
  Christoph


signature.asc
Description: This is a digitally signed message part


Re: Many systemd units do not start anymore

2024-02-07 Thread Michael Biebl

On 07.02.24 at 11:32, Christoph Pleger wrote:

Hello,



systemd-time-wait-sync.service is running but has not completed.
I wonder if that is
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
https://github.com/systemd/systemd/issues/14061

Which NTP service do you use?


I am using chrony, because the server is offering time services to NTP
clients.

Regards
   Christoph

PS: As a (temporary) workaround, I created an override file for
systemd-time-wait-sync.service that just calls /bin/true .
  


Have you tried to simply disable the service?




Re: Many systemd units do not start anymore

2024-02-07 Thread Christoph Pleger
Hello,
> - what is the hardware?

Qemu VM

> - what release of what operating system is in use?

Debian bullseye

> - what happened just before things 'suddenly' stopped working? Did you
>   upgrade the machine, or install some new software, or just reboot it
>   or what?

I rebooted the machine, maybe I did something else before, but I do not
remember. 

> - you say it is just one of your server machines. Can you draw any
>   comparisons with the state of your other server machines?


The difference is that this machine uses chrony as NTP server/client, while
most of the others use systemd-timesyncd.

Regards
  Christoph




Re: Many systemd units do not start anymore

2024-02-07 Thread Christoph Pleger
Hello,

> 
> systemd-time-wait-sync.service is running but has not completed.
> I wonder if that is
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
> https://github.com/systemd/systemd/issues/14061
> 
> Which NTP service do you use?

I am using chrony, because the server is offering time services to NTP
clients.

Regards
  Christoph

PS: As a (temporary) workaround, I created an override file for
systemd-time-wait-sync.service that just calls /bin/true .
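
Such an override could look roughly like the following. This is a minimal sketch, not the file actually used on the machine: the helper name write_override and the file name override.conf are assumptions, and the target directory is passed in so the sketch can be tried outside /etc.

```shell
#!/bin/sh
# Write a drop-in that replaces the unit's start command with /bin/true.
# Usage: write_override BASEDIR   (BASEDIR would be /etc/systemd/system on a real system)
write_override() {
    dir=$1/systemd-time-wait-sync.service.d
    mkdir -p "$dir" &&
    cat > "$dir/override.conf" <<'EOF'
[Service]
# An empty ExecStart= clears the command inherited from the packaged unit.
ExecStart=
ExecStart=/bin/true
EOF
}

# On a real system (not run here):
#   write_override /etc/systemd/system && systemctl daemon-reload
```

The empty ExecStart= line matters: without it the new command is added to the inherited one (or rejected, depending on the service type) instead of replacing it.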
 




Re: Many systemd units do not start anymore

2024-02-06 Thread debian-user
Christoph Pleger  wrote:
> Hello,
> 
> on one of my server machines, suddenly many systemd units (e.g. cron,
> autofs) do not start any more, neither at boot nor when trying to
> start manually with "systemctl start ", this hangs till I abort
> with Ctrl-C - though the commands defined in ExecStart work when I
> type them in directly. In my judgement, it also takes unusually
> long until the command "systemctl status " returns a result.
> 
> I already removed systemd completely (apt-get --purge --auto-remove
> remove *systemd*) and switched to SysV init, in which all services
> started successfully. But after switching back to systemd, I again
> had the problem of non-starting services.
> 
> Does anyone have an idea what is possibly wrong?

As appears later in the thread, you seem to have missed out some pretty
basic information, including:

- what is the hardware?
- what release of what operating system is in use?
- what happened just before things 'suddenly' stopped working? Did you
  upgrade the machine, or install some new software, or just reboot it
  or what?
- you say it is just one of your server machines. Can you draw any
  comparisons with the state of your other server machines?

> Regards
>   Christoph 



Re: Many systemd units do not start anymore

2024-02-05 Thread David Wright
On Tue 06 Feb 2024 at 09:51:02 (+0700), Max Nikulin wrote:
> On 06/02/2024 03:46, Michael Biebl wrote:
> > If you are not using systemd-timesyncd, you could also consider
> > disabling systemd-time-wait-sync.service (via systemctl disable).
> 
> My guess is that this board does not have an RTC,

I don't understand how it would have worked before.

> so NTP is a must have
> and dependency on time synchronization is intentional.
> 
> The question is whether it is Debian or some derivative like armbian.
> 
> Perhaps proper timeouts may be set for the case when network is not
> available.

Cheers,
David.



Re: Many systemd units do not start anymore

2024-02-05 Thread Max Nikulin

On 06/02/2024 03:46, Michael Biebl wrote:
If you are not using systemd-timesyncd, you could also consider 
disabling systemd-time-wait-sync.service (via systemctl disable).


My guess is that this board does not have an RTC, so NTP is a must-have and
the dependency on time synchronization is intentional.


The question is whether it is Debian or some derivative like armbian.

Perhaps proper timeouts may be set for the case when network is not 
available.




Re: Many systemd units do not start anymore

2024-02-05 Thread hw
On Mon, 2024-02-05 at 17:28 +0100, Christoph Pleger wrote:
> Hello,
> 
> > > Does anyone have an idea what is possibly wrong?
> > 
> > Look for more information.
> 
> This is the output of systemctl list-jobs :
> 
> JOB UNIT                                 TYPE  STATE
> 102 autofs.service                       start waiting
> 82  mlocate.timer                        start waiting
> 80  e2scrub_all.timer                    start waiting
> 117 cron.service                         start waiting
> 1   graphical.target                     start waiting
> 140 apache2.service                      start waiting
> 127 nullmailer.service                   start waiting
> 81  phpsessionclean.timer                start waiting
> 94  nslcd.service                        start waiting
> 40  time-sync.target                     start waiting
> 86  logrotate.timer                      start waiting
> 83  man-db.timer                         start waiting
> 84  apt-daily-upgrade.timer              start waiting
> 115 systemd-update-utmp-runlevel.service start waiting
> 135 atd.service                          start waiting
> 79  timers.target                        start waiting
> 87  apt-daily.timer                      start waiting
> 39  systemd-time-wait-sync.service       start running
> 88  fstrim.timer                         start waiting
> 2   multi-user.target                    start waiting
> 
> As you can see, there are really many failed services.

They haven't failed, at least not yet.

From the man page:


Job Commands
   list-jobs [PATTERN...]
   List jobs that are in progress. If one or more PATTERNs are
   specified, only jobs for units matching one of them are shown.

   When combined with --after or --before the list is augmented
   with information on which other job each job is waiting for,
   and which other jobs are waiting for it, see above.

   cancel [JOB...]
   Cancel one or more jobs specified on the command line by their
   numeric job IDs. If no job ID is specified, cancel all pending
   jobs.


So what are they waiting for?

(I have to admit that this is actually rather friendly to the users.)
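
The waiting/running distinction in the list-jobs output above can be filtered mechanically. A minimal sketch, assuming the four-column layout (JOB UNIT TYPE STATE) shown in the quoted output; the function name running_jobs is made up for illustration:

```shell
#!/bin/sh
# Read `systemctl list-jobs` text on stdin and print the units whose job
# STATE is "running". Jobs in state "waiting" are queued behind something;
# the running ones are the candidates for what they are queued behind.
running_jobs() {
    # Only consider rows that start with a numeric job ID, which skips the
    # header line and the trailing "N jobs listed." summary.
    awk '$1 ~ /^[0-9]+$/ && $4 == "running" { print $2 }'
}

# Typical use on the affected machine (not run here):
#   systemctl list-jobs | running_jobs
#   systemctl list-jobs --after
```

Applied to the listing quoted above, the only row in state "running" is systemd-time-wait-sync.service; the --after form mentioned in the man page excerpt shows which job each waiting job is queued behind.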



Re: Many systemd units do not start anymore

2024-02-05 Thread Michael Biebl

On 05.02.24 at 21:38, Michael Biebl wrote:

This is the output of systemctl list-jobs :

JOB UNIT                                 TYPE  STATE
102 autofs.service                       start waiting
82  mlocate.timer                        start waiting
80  e2scrub_all.timer                    start waiting
117 cron.service                         start waiting
1   graphical.target                     start waiting
140 apache2.service                      start waiting
127 nullmailer.service                   start waiting
81  phpsessionclean.timer                start waiting
94  nslcd.service                        start waiting
40  time-sync.target                     start waiting
86  logrotate.timer                      start waiting
83  man-db.timer                         start waiting
84  apt-daily-upgrade.timer              start waiting
115 systemd-update-utmp-runlevel.service start waiting
135 atd.service                          start waiting
79  timers.target                        start waiting
87  apt-daily.timer                      start waiting
39  systemd-time-wait-sync.service       start running
88  fstrim.timer                         start waiting
2   multi-user.target                    start waiting

As you can see, there are really many failed services. It seems that
systemd-time-wait-sync.service is waiting for systemd-timesyncd to
synchronize the clock, but systemd-timesyncd is not installed at all.


Those services are not failed, they are waiting for their dependencies.

systemd-time-wait-sync.service is running but has not completed.
I wonder if that is
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
https://github.com/systemd/systemd/issues/14061

Which NTP service do you use?
Could you try with systemd-timesyncd?



If you are not using systemd-timesyncd, you could also consider 
disabling systemd-time-wait-sync.service (via systemctl disable).


The default is disabled in Debian:

# systemctl status systemd-time-wait-sync.service
○ systemd-time-wait-sync.service - Wait Until Kernel Time Synchronized
     Loaded: loaded (/usr/lib/systemd/system/systemd-time-wait-sync.service; disabled; preset: disabled)
     Active: inactive (dead)
       Docs: man:systemd-time-wait-sync.service(8)
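
For reference, disabling the unit and checking the result might be scripted as below. This is only a sketch: the function name and the RUN variable are illustrative assumptions, and by default the commands are printed rather than executed so they can be reviewed first.

```shell
#!/bin/sh
# Disable systemd-time-wait-sync.service and show its enablement state.
# Dry run by default: each command is echoed. Call with RUN= (empty) as
# root to actually execute the commands.
disable_wait_sync() {
    run=${RUN-echo}
    # --now also stops the unit, in addition to disabling it.
    $run systemctl disable --now systemd-time-wait-sync.service
    $run systemctl is-enabled systemd-time-wait-sync.service
}

# Review the commands:   disable_wait_sync
# Execute them as root:  RUN= disable_wait_sync
```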







Re: Many systemd units do not start anymore

2024-02-05 Thread Michael Biebl

This is the output of systemctl list-jobs :

JOB UNIT                                 TYPE  STATE
102 autofs.service                       start waiting
82  mlocate.timer                        start waiting
80  e2scrub_all.timer                    start waiting
117 cron.service                         start waiting
1   graphical.target                     start waiting
140 apache2.service                      start waiting
127 nullmailer.service                   start waiting
81  phpsessionclean.timer                start waiting
94  nslcd.service                        start waiting
40  time-sync.target                     start waiting
86  logrotate.timer                      start waiting
83  man-db.timer                         start waiting
84  apt-daily-upgrade.timer              start waiting
115 systemd-update-utmp-runlevel.service start waiting
135 atd.service                          start waiting
79  timers.target                        start waiting
87  apt-daily.timer                      start waiting
39  systemd-time-wait-sync.service       start running
88  fstrim.timer                         start waiting
2   multi-user.target                    start waiting

As you can see, there are really many failed services. It seems that
systemd-time-wait-sync.service is waiting for systemd-timesyncd to
synchronize the clock, but systemd-timesyncd is not installed at all.


Those services are not failed, they are waiting for their dependencies.

systemd-time-wait-sync.service is running but has not completed.
I wonder if that is
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=940840
https://github.com/systemd/systemd/issues/14061

Which NTP service do you use?
Could you try with systemd-timesyncd?

Michael




Re: Many systemd units do not start anymore

2024-02-05 Thread Max Nikulin

On 05/02/2024 23:28, Christoph Pleger wrote:


This is the output of systemctl list-jobs :

JOB UNIT                                 TYPE  STATE
102 autofs.service                       start waiting
82  mlocate.timer                        start waiting
80  e2scrub_all.timer                    start waiting
117 cron.service                         start waiting
1   graphical.target                     start waiting
140 apache2.service                      start waiting


When timers are excluded, most services in the list depend on the network.
Is it configured? Which services fail?


systemctl --failed

Is there anything suspicious in output of

journalctl --boot
dmesg

Are there active processes in "top" output?

I am unsure if the following commands will provide anything useful in 
such state


systemd-analyze blame
systemd-analyze critical-chain

A wild shot... Is it a bare metal system or a virtualized host?

hostnamectl
cat /proc/sys/kernel/random/entropy_avail





Re: Many systemd units do not start anymore

2024-02-05 Thread Christoph Pleger
Hello,

> > Does anyone have an idea what is possibly wrong?
> 
> Look for more information.

This is the output of systemctl list-jobs :

JOB UNIT                                 TYPE  STATE
102 autofs.service                       start waiting
82  mlocate.timer                        start waiting
80  e2scrub_all.timer                    start waiting
117 cron.service                         start waiting
1   graphical.target                     start waiting
140 apache2.service                      start waiting
127 nullmailer.service                   start waiting
81  phpsessionclean.timer                start waiting
94  nslcd.service                        start waiting
40  time-sync.target                     start waiting
86  logrotate.timer                      start waiting
83  man-db.timer                         start waiting
84  apt-daily-upgrade.timer              start waiting
115 systemd-update-utmp-runlevel.service start waiting
135 atd.service                          start waiting
79  timers.target                        start waiting
87  apt-daily.timer                      start waiting
39  systemd-time-wait-sync.service       start running
88  fstrim.timer                         start waiting
2   multi-user.target                    start waiting

As you can see, there are really many failed services. It seems that
systemd-time-wait-sync.service is waiting for systemd-timesyncd to
synchronize the clock, but systemd-timesyncd is not installed at all.

Regards
  Christoph




Re: Many systemd units do not start anymore

2024-02-05 Thread Stanislav Vlasov
On Mon, 5 Feb 2024 at 20:57, Christoph Pleger wrote:

> Unfortunately, there is no useful further information:
>
> systemctl status
> ● autofs.service - Automounts filesystems on demand
>      Loaded: loaded (/lib/systemd/system/autofs.service; enabled; vendor preset>
>      Active: inactive (dead)
>        Docs: man:autofs(8)

I don't know about the other services (thread skipped), but autofs MUST be
configured before it is started.
It does not start without a configuration.

-- 
Stanislav



Re: Many systemd units do not start anymore

2024-02-05 Thread Christoph Pleger
Hello,

On Monday, 05.02.2024 at 07:18 -0500, Greg Wooledge wrote:
> On Mon, Feb 05, 2024 at 12:53:33PM +0100, Christoph Pleger wrote:
> > on one of my server machines, suddenly many systemd units (e.g. cron, autofs)
> > do not start any more, neither at boot nor when trying to start manually
> > with "systemctl start ", this hangs till I abort with Ctrl-C -
> 
> > Does anyone have an idea what is possibly wrong?
> 
> Look for more information.  Start with
> 
> systemctl status cron
> journalctl -u cron

Unfortunately, there is no useful further information:

systemctl status
● autofs.service - Automounts filesystems on demand
     Loaded: loaded (/lib/systemd/system/autofs.service; enabled; vendor preset>
     Active: inactive (dead)
       Docs: man:autofs(8)

journalctl -u autofs
-- Journal begins at Mon 2024-02-05 12:33:34 CET, ends at Mon 2024-02-05 16:38:>
-- No entries --

This looks almost the same for every failed service, even while a
"systemctl start " is running in another terminal window.

Regards
  Christoph




Re: Many systemd units do not start anymore

2024-02-05 Thread Greg Wooledge
On Mon, Feb 05, 2024 at 12:53:33PM +0100, Christoph Pleger wrote:
> on one of my server machines, suddenly many systemd units (e.g. cron, autofs)
> do not start any more, neither at boot nor when trying to start manually
> with "systemctl start ", this hangs till I abort with Ctrl-C -

> Does anyone have an idea what is possibly wrong?

Look for more information.  Start with

systemctl status cron
journalctl -u cron

Just on general principle, it wouldn't hurt to look at the output of
dmesg as well, to see if the kernel is complaining.  And df, to see
whether any file systems are full or nonresponsive.



Many systemd units do not start anymore

2024-02-05 Thread Christoph Pleger
Hello,

on one of my server machines, suddenly many systemd units (e.g. cron, autofs)
do not start any more, neither at boot nor when trying to start manually
with "systemctl start ", this hangs till I abort with Ctrl-C -
though the commands defined in ExecStart work when I type them in directly.
In my judgement, it also takes unusually long until
the command "systemctl status " returns a result.

I already removed systemd completely (apt-get --purge --auto-remove remove
*systemd*) and switched to SysV init, under which all services started
successfully.
But after switching back to systemd, I again had the problem of non-starting
services.

Does anyone have an idea what is possibly wrong?

Regards
  Christoph 


