Re: [systemd-devel] Service output missing from journal?

2022-07-05 Thread Uwe Geuder
On Tue, 5 Jul 2022 at 00:29, Lennart Poettering  wrote:
>
On Mo, 04.07.22 23:15, Michael Biebl (mbiebl at gmail.com) wrote:

> Am Mo., 4. Juli 2022 um 19:36 Uhr schrieb Lennart Poettering
> :
> >
> > eOn So, 03.07.22 19:29, Uwe Geuder (systemd-devel-ugeuder at snkmail.com) 
> > wrote:
> >
> > > Hi!
> > >
> > > When I run the command given below on a current Fedora CoreOS system
> > > (systemd 250 (v250.6-1.fc36)) I get a result I absolute cannot understand.
> > > Can anybody help me with what is wrong there?
> > >
> > > $ systemd-run --user sh -c 'while true; do echo foo; df -h 
> > > /var/log/journal/; echo $?; sleep 3; done'
> > > Running as unit: run-r9a155474889b4d40a1ac119823bdc2bf.service
> > > $ journalctl --user -f -u run-r9a155474889b4d40a1ac119823bdc2bf
> > > [ ... similar lines redacted ... ]
> > > Jul 03 15:25:08 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: Filesystem  Size  Used 
> > > Avail Use% Mounted on
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: /dev/nvme0n1p4  9.5G  8.1G  
> > > 1.5G  86% /var
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: 0
> > > [...]
> > >
> > > So the output from the df command appears in the journal pretty rarely,
> > > appearingly at random intervals. When I run the same loop on the
> > > command line the output occurs every time.
> > >
> > > The problem was originally noted in a somewhat loaded system. However,
> > > above reproducer (including the 2 echo commands and a shorter sleep)
> > > shows the same problem even on an idling machine.
> >
> > https://github.com/systemd/systemd/issues/2913
>
> I thought about this as well, but in this case the service is still
> running. So I'm not sure if #2913 applies here.

The service is, but the "df" process exits extremely quickly, before
we can figure out what it belongs to. See the PIDs where it works,
they are different from your shell script's PID, because they are
short-lived child processes.


Thanks for your answers!

I thought I could quickly prove Lennart's explanataion correct by making
the 2 echo commands their own processes, too. They should not be
long-running ones either.

$ systemd-run --user sh -c 'while true; do /bin/echo foo; df -h 
/var/log/journal/; /bin/echo $?; sleep 3; done'

Interestingly enough the result looks much better:

Jul 05 23:14:07 ip-172-31-13-133 sh[2944]: foo
Jul 05 23:14:07 ip-172-31-13-133 sh[2945]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:07 ip-172-31-13-133 sh[2945]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:07 ip-172-31-13-133 sh[2946]: 0
Jul 05 23:14:10 ip-172-31-13-133 sh[2948]: foo
Jul 05 23:14:10 ip-172-31-13-133 sh[2949]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:10 ip-172-31-13-133 sh[2949]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:10 ip-172-31-13-133 sh[2950]: 0
Jul 05 23:14:13 ip-172-31-13-133 sh[2954]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:13 ip-172-31-13-133 sh[2954]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:13 ip-172-31-13-133 sh[2955]: 0
Jul 05 23:14:16 ip-172-31-13-133 sh[2957]: foo
Jul 05 23:14:16 ip-172-31-13-133 sh[2958]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:16 ip-172-31-13-133 sh[2958]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:16 ip-172-31-13-133 sh[2959]: 0
Jul 05 23:14:19 ip-172-31-13-133 sh[2961]: foo
Jul 05 23:14:19 ip-172-31-13-133 sh[2962]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:19 ip-172-31-13-133 sh[2962]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:19 ip-172-31-13-133 sh[2963]: 0
Jul 05 23:14:22 ip-172-31-13-133 sh[2966]: foo
Jul 05 23:14:22 ip-172-31-13-133 sh[2967]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:22 ip-172-31-13-133 sh[2967]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:22 ip-172-31-13-133 sh[2968]: 0
Jul 05 23:14:25 ip-172-31-13-133 sh[2970]: foo
Jul 05 23:14:25 ip-172-31-13-133 sh[2971]: Filesystem  Size  Used Avail 
Use% Mounted on
Jul 05 23:14:25 ip-172-31-13-133 sh[2971]: /dev/nvme0n1p4  9.5G  2.2G  7.4G  
23% /var
Jul 05 23:14:25 ip-172-31-13-133 sh[2972]: 0
Jul 05 23:14:28 ip-172-31-13-133 sh[2974]: foo
Jul 05 23:14:28 ip-172-31-13-133 sh[2975]: Filesystem  Size  Used Avail 
Use% 

Re: [systemd-devel] Service output missing from journal?

2022-07-04 Thread Lennart Poettering
On Mo, 04.07.22 23:15, Michael Biebl (mbi...@gmail.com) wrote:

> Am Mo., 4. Juli 2022 um 19:36 Uhr schrieb Lennart Poettering
> :
> >
> > eOn So, 03.07.22 19:29, Uwe Geuder (systemd-devel-ugeu...@snkmail.com) 
> > wrote:
> >
> > > Hi!
> > >
> > > When I run the command given below on a current Fedora CoreOS system
> > > (systemd 250 (v250.6-1.fc36)) I get a result I absolute cannot understand.
> > > Can anybody help me with what is wrong there?
> > >
> > > $ systemd-run --user sh -c 'while true; do echo foo; df -h 
> > > /var/log/journal/; echo $?; sleep 3; done'
> > > Running as unit: run-r9a155474889b4d40a1ac119823bdc2bf.service
> > > $ journalctl --user -f -u run-r9a155474889b4d40a1ac119823bdc2bf
> > > [ ... similar lines redacted ... ]
> > > Jul 03 15:25:08 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: Filesystem  Size  Used 
> > > Avail Use% Mounted on
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: /dev/nvme0n1p4  9.5G  8.1G  
> > > 1.5G  86% /var
> > > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: 0
> > > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: foo
> > > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: 0
> > > [...]
> > >
> > > So the output from the df command appears in the journal pretty rarely,
> > > appearingly at random intervals. When I run the same loop on the
> > > command line the output occurs every time.
> > >
> > > The problem was originally noted in a somewhat loaded system. However,
> > > above reproducer (including the 2 echo commands and a shorter sleep)
> > > shows the same problem even on an idling machine.
> >
> > https://github.com/systemd/systemd/issues/2913
>
> I thought about this as well, but in this case the service is still
> running. So I'm not sure if #2913 applies here.

The service is, but the "df" process exits extremely quickly, before
we can figure out what it belongs to. See the PIDs where it works,
they are different from your shell script's PID, because they are
short-lived child processes.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Service output missing from journal?

2022-07-04 Thread Michael Biebl
Am Mo., 4. Juli 2022 um 19:36 Uhr schrieb Lennart Poettering
:
>
> eOn So, 03.07.22 19:29, Uwe Geuder (systemd-devel-ugeu...@snkmail.com) wrote:
>
> > Hi!
> >
> > When I run the command given below on a current Fedora CoreOS system
> > (systemd 250 (v250.6-1.fc36)) I get a result I absolute cannot understand.
> > Can anybody help me with what is wrong there?
> >
> > $ systemd-run --user sh -c 'while true; do echo foo; df -h 
> > /var/log/journal/; echo $?; sleep 3; done'
> > Running as unit: run-r9a155474889b4d40a1ac119823bdc2bf.service
> > $ journalctl --user -f -u run-r9a155474889b4d40a1ac119823bdc2bf
> > [ ... similar lines redacted ... ]
> > Jul 03 15:25:08 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: Filesystem  Size  Used 
> > Avail Use% Mounted on
> > Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: /dev/nvme0n1p4  9.5G  8.1G  
> > 1.5G  86% /var
> > Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: 0
> > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: foo
> > Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: 0
> > [...]
> >
> > So the output from the df command appears in the journal pretty rarely,
> > appearingly at random intervals. When I run the same loop on the
> > command line the output occurs every time.
> >
> > The problem was originally noted in a somewhat loaded system. However,
> > above reproducer (including the 2 echo commands and a shorter sleep)
> > shows the same problem even on an idling machine.
>
> https://github.com/systemd/systemd/issues/2913

I thought about this as well, but in this case the service is still
running. So I'm not sure if #2913 applies here.


Re: [systemd-devel] Service output missing from journal?

2022-07-04 Thread Lennart Poettering
eOn So, 03.07.22 19:29, Uwe Geuder (systemd-devel-ugeu...@snkmail.com) wrote:

> Hi!
>
> When I run the command given below on a current Fedora CoreOS system
> (systemd 250 (v250.6-1.fc36)) I get a result I absolute cannot understand.
> Can anybody help me with what is wrong there?
>
> $ systemd-run --user sh -c 'while true; do echo foo; df -h /var/log/journal/; 
> echo $?; sleep 3; done'
> Running as unit: run-r9a155474889b4d40a1ac119823bdc2bf.service
> $ journalctl --user -f -u run-r9a155474889b4d40a1ac119823bdc2bf
> [ ... similar lines redacted ... ]
> Jul 03 15:25:08 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:11 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:14 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:17 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: Filesystem  Size  Used Avail 
> Use% Mounted on
> Jul 03 15:25:20 ip-172-31-8-116 sh[366941]: /dev/nvme0n1p4  9.5G  8.1G  1.5G  
> 86% /var
> Jul 03 15:25:20 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:23 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:26 ip-172-31-8-116 sh[366900]: 0
> Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: foo
> Jul 03 15:25:29 ip-172-31-8-116 sh[366900]: 0
> [...]
>
> So the output from the df command appears in the journal pretty rarely,
> appearingly at random intervals. When I run the same loop on the
> command line the output occurs every time.
>
> The problem was originally noted in a somewhat loaded system. However,
> above reproducer (including the 2 echo commands and a shorter sleep)
> shows the same problem even on an idling machine.

https://github.com/systemd/systemd/issues/2913

Lennart

--
Lennart Poettering, Berlin