Re: [systemd-devel] udev and probing of eMMC partition devices
On Wed, 30.09.20 13:57, Alan Perry (al...@snowmoose.com) wrote:

> On 9/23/20 9:29 AM, Lennart Poettering wrote:
> > On Tue, 22.09.20 10:06, Alan Perry (al...@snowmoose.com) wrote:
> >
> > > > > device add events will get stuck at the probe step.
> > > > "Get stuck"? What does that mean? What is it actually doing? What does
> > > > a stack trace say? Anything in the logs?
> > > When this happens, the last thing seen in the log for those devices is the
> > > probe ("probe /dev/mmcblk0 raid offset=0").
> > This debug log message is generated by udev-builtin-blkid.c, right
> > after opening the block device, and right before issuing the probe,
> > i.e. reading the fs label/partition table signatures off disk. If
> > things hang there, and the blkid prober worker process freezes then
> > this really looks like a hw/driver problem, i.e. IO access from the
> > block device just hangs.
> >
>
> It does seem to be a hw/driver problem. From what I have seen searching the
> web, this seems to be something that sometimes happens with eMMC devices.
>
> In our experience, the problem resolves itself and subsequent reads and
> probes succeed. However, the systemd job is still around, hung, and stopping
> boot from completing. I think that changing udev-builtin-blkid to be able to
> time out and end the job gracefully when this happens is the right thing to
> do here. But what is a suitable timeout and what does a graceful exit here
> look like?

udev kills workers after a while. You can configure that with
event_timeout= in udev.conf. Defaults to 2min. But note that disk IO
sleeps in the kernel usually are non-interruptible, i.e. you cannot
kill processes hanging in them. Hence, YMMV.

Driver bugs are kernel bugs. Fix them in the kernel, working around
them in userspace is ultimately never going to make anyone happy.

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
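To make the timeout caveat concrete, here is a minimal Python sketch of a
supervisor that gives a probe command a deadline. It is not how udev
implements event_timeout=; the function name probe_with_timeout and the
commands passed to it are hypothetical. Note the limitation described above:
on timeout, subprocess.run() kills the child with SIGKILL and then waits for
it, so if the child is stuck in non-interruptible I/O that wait can itself
block until the kernel returns.

```python
import subprocess


def probe_with_timeout(cmd, timeout_s):
    """Run cmd and report whether it finished within timeout_s seconds.

    Returns True if the command exited in time, False if it timed out.
    """
    try:
        subprocess.run(
            cmd,
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run() has already sent SIGKILL to the child and waited
        # for it. A child blocked in uninterruptible sleep would not go away
        # until its I/O completes, so this path is only "graceful" when the
        # hang is in userspace.
        return False
```

A caller would pass the probe command as a list, for example a blkid
invocation on the affected device, and treat False as "give up on this event
for now".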
Re: [systemd-devel] udev and probing of eMMC partition devices
On 9/23/20 9:29 AM, Lennart Poettering wrote:
> On Tue, 22.09.20 10:06, Alan Perry (al...@snowmoose.com) wrote:
>
> > > > device add events will get stuck at the probe step.
> > > "Get stuck"? What does that mean? What is it actually doing? What does
> > > a stack trace say? Anything in the logs?
> > When this happens, the last thing seen in the log for those devices is the
> > probe ("probe /dev/mmcblk0 raid offset=0").
> This debug log message is generated by udev-builtin-blkid.c, right
> after opening the block device, and right before issuing the probe,
> i.e. reading the fs label/partition table signatures off disk. If
> things hang there, and the blkid prober worker process freezes then
> this really looks like a hw/driver problem, i.e. IO access from the
> block device just hangs.

It does seem to be a hw/driver problem. From what I have seen searching the
web, this seems to be something that sometimes happens with eMMC devices.

In our experience, the problem resolves itself and subsequent reads and
probes succeed. However, the systemd job is still around, hung, and stopping
boot from completing. I think that changing udev-builtin-blkid to be able to
time out and end the job gracefully when this happens is the right thing to
do here. But what is a suitable timeout and what does a graceful exit here
look like?

alan
[systemd-devel] Antw: [EXT] Re: Q: logrotate and "systemctl kill -s HUP ..."
>>> Mantas Mikulenas wrote on 30.09.2020 at 12:26:
> On Wed, Sep 30, 2020 at 11:24 AM Ulrich Windl <
> ulrich.wi...@rz.uni-regensburg.de> wrote:
>
>> Hi!
>>
>> I have a problem with logrotate: My postrotate command does not seem to
>> send a HUP signal. However the files are rotated.
>> I'm using this (not preferred way, I know):
>>
>> ...
>> postrotate
>>     test -s '/var/run/iotwatch-LOC1/iotwatch-LOC1.pid' &&
>>     systemctl kill -s HUP --kill-who=main iotwatch@LOC1.service
>> endscript
>> ...
>>
>> I've verified that the PID file exists (just rebooted the server a few
>> minutes ago):
>> # ll /var/run/iotwatch-LOC1/iotwatch-LOC1.pid
>> -rw-r--r-- 1 root root 5 Sep 30 10:07
>> /var/run/iotwatch-LOC1/iotwatch-LOC1.pid
>>
>
> Do you need to check for it in the first place?
>
> Does the same command work from interactive CLI?
>
>
>>
>> My service would log the arrival of any HUP signal, but it didn't. Also in
>> syslog I could not find any error message related to "systemctl kill".
>> What might be wrong?
>>
>> My service is using ExecStartPre, ExecStartPost, and ExecStart. Could
>> systemd be confused about "--kill-who=main" then?
>
>
> --kill-who=main means the signal will be sent to the "main" process that
> was started from ExecStart (shown as "Main PID:" in systemctl status).
>
> The more preferred way of doing this is to have "ExecReload=/bin/kill -HUP
> $MAINPID" and then `systemctl reload foo.service`.
>
> Sending HUP to ExecStartPre and ExecStartPost doesn't make sense, since
> those are supposed to be short-running commands – they are not allowed to
> actually *have* daemons.

Hi!

Thanks for the suggestions. Before I work them out, I have a simple question
(systemd-228 of SLES12):
Systemd logs quite a lot; does it log a message when it sends a signal to a
process (or fails to do so)? If so it would help to isolate the problem.
I could not see any message, so my guess is that the test for the PID file
does not work as intended.
However the docs say:
    The lines between postrotate and endscript (both of which must appear on
    lines by themselves) are executed (using /bin/sh) after the log file is
    rotated.

So the "&&" should work, but maybe a backslash is needed at the end of the
line (it is not when entering the command interactively).

Regards,
Ulrich

>
> --
> Mantas Mikulėnas
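On the backslash question: in POSIX shell grammar a newline is allowed
directly after "&&", so a line ending in "&&" continues on the next line
without a backslash. The Python sketch below checks that with /bin/sh. It
assumes logrotate hands the postrotate lines to /bin/sh unchanged, which this
thread does not confirm; the script text and the helper name run_sh are
stand-ins, and echo replaces the real systemctl call.

```python
import subprocess

# Same shape as the postrotate body: the first line ends in "&&" and the
# command to run is on the following line. "test -n" stands in for the
# PID-file check and echo stands in for "systemctl kill".
SCRIPT = "test -n 'nonempty' &&\n    echo 'signal would be sent'\n"


def run_sh(script):
    """Run script with /bin/sh -c and return (exit status, stripped stdout)."""
    result = subprocess.run(
        ["/bin/sh", "-c", script],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode, result.stdout.strip()
```

If the second command runs here but not under logrotate, the line break is
probably not the cause, and the test itself or the environment logrotate runs
in would be the next thing to check.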
Re: [systemd-devel] How to reply to the list
On Wed, 30 Sep 2020 09:35:35 +0200
"Ulrich Windl" wrote:

> >>> Dave Howorth wrote on 28.09.2020 at 16:34 in message
> <20200928153422.6bf6e...@acer-suse.lan>:
> > On Mon, 28 Sep 2020 14:10:38 +0200
> > Reindl Harald wrote:
> >> can you stop "reply-all" and breaking threads when respond to
> >> lists?
> >
> > I can't answer for the reply-all, that would annoy me as well.
> > But the thread isn't broken, my MUA is showing it nicely.
>
> Also: Some MUAs only have "reply" and "Reply to all"; the first one
> would only reply to the sender, and the last one would reply to list
> and sender. Interestingly for some lists a plain "Reply" works just
> right, but not for this list. Please do not assume everybody is using
> the same MUA as you do...

And, the relevance is ? ...

I can't help it if people use broken MUAs. They should either change to
a MUA that works properly, or edit the replies by hand.

And please do not send me private copies of mail to the list. If you do,
I will put you in my bitbucket.
[systemd-devel] Antw: Re: Antw: Re: Antw: [EXT] Re: Memory in systemctl status
>>> Benjamin Berg wrote on 30.09.2020 at 12:08 in message
<2f6a1d5b102e5dade4f578d6d704b07508d03d50.ca...@sipsolutions.net>:
> On Wed, 2020-09-30 at 11:04 +0200, Ulrich Windl wrote:
>> > > > Reindl Harald wrote on 30.09.2020 at 10:56:
>>
>> > On 30.09.20 at 09:06, Ulrich Windl wrote:
>> > > > my webserver is killed because it served at monday, tuesday, thursday
>> > > > and friday 4 different files with 2 GB?
>> > >
>> > > cgroups is for limiting resources, not for killing processes AFAIK
>> >
>> > [Service]
>> > MemoryMax=4G
>> >
>> > would call OOM killer
>>
>> Are you sure? I thought OOM is called when the _system_ memory is exhausted.
>> IMHO any memory allocation request to the process will be denied, but the
>> process wouldn't be killed. But agreed, I didn't track the cgroups changes in
>> the last few years.
>
> I think you can assume that the OOM killer will kick in rather than the
> allocation request being denied.
>
> This option does cap the amount of system memory that is used for the
> cgroup. So if memory cannot be reclaimed (e.g. swapped out, file
> backed) then the OOM killer will run within the cgroup.

OK, didn't know the OOM killer is cgroup-specific.

>
> As I understand it, what Reindl is looking for is seeing and limiting
> the amount of resident anonymous pages that the cgroup has rather than
> its real memory use.
>
> Benjamin
Re: [systemd-devel] Q: logrotate and "systemctl kill -s HUP ..."
On Wed, Sep 30, 2020 at 11:24 AM Ulrich Windl <
ulrich.wi...@rz.uni-regensburg.de> wrote:

> Hi!
>
> I have a problem with logrotate: My postrotate command does not seem to
> send a HUP signal. However the files are rotated.
> I'm using this (not preferred way, I know):
>
> ...
> postrotate
>     test -s '/var/run/iotwatch-LOC1/iotwatch-LOC1.pid' &&
>     systemctl kill -s HUP --kill-who=main iotwatch@LOC1.service
> endscript
> ...
>
> I've verified that the PID file exists (just rebooted the server a few
> minutes ago):
> # ll /var/run/iotwatch-LOC1/iotwatch-LOC1.pid
> -rw-r--r-- 1 root root 5 Sep 30 10:07
> /var/run/iotwatch-LOC1/iotwatch-LOC1.pid
>

Do you need to check for it in the first place?

Does the same command work from interactive CLI?

>
> My service would log the arrival of any HUP signal, but it didn't. Also in
> syslog I could not find any error message related to "systemctl kill".
> What might be wrong?
>
> My service is using ExecStartPre, ExecStartPost, and ExecStart. Could
> systemd be confused about "--kill-who=main" then?

--kill-who=main means the signal will be sent to the "main" process that
was started from ExecStart (shown as "Main PID:" in systemctl status).

The more preferred way of doing this is to have "ExecReload=/bin/kill -HUP
$MAINPID" and then `systemctl reload foo.service`.

Sending HUP to ExecStartPre and ExecStartPost doesn't make sense, since
those are supposed to be short-running commands – they are not allowed to
actually *have* daemons.

--
Mantas Mikulėnas
Re: [systemd-devel] Antw: Re: Antw: [EXT] Re: Memory in systemctl status
On Wed, 2020-09-30 at 11:04 +0200, Ulrich Windl wrote:
> > > > Reindl Harald wrote on 30.09.2020 at 10:56:
>
> > On 30.09.20 at 09:06, Ulrich Windl wrote:
> > > > my webserver is killed because it served at monday, tuesday, thursday
> > > > and friday 4 different files with 2 GB?
> > >
> > > cgroups is for limiting resources, not for killing processes AFAIK
> >
> > [Service]
> > MemoryMax=4G
> >
> > would call OOM killer
>
> Are you sure? I thought OOM is called when the _system_ memory is exhausted.
> IMHO any memory allocation request to the process will be denied, but the
> process wouldn't be killed. But agreed, I didn't track the cgroups changes in
> the last few years.

I think you can assume that the OOM killer will kick in rather than the
allocation request being denied.

This option does cap the amount of system memory that is used for the
cgroup. So if memory cannot be reclaimed (e.g. swapped out, file
backed) then the OOM killer will run within the cgroup.

As I understand it, what Reindl is looking for is seeing and limiting
the amount of resident anonymous pages that the cgroup has rather than
its real memory use.

Benjamin
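The distinction between anonymous pages and file-backed cache can be read
from the cgroup itself. The following Python sketch assumes a cgroup v2
(unified) hierarchy, where each cgroup has a memory.stat file with "anon" and
"file" lines in bytes; it would not apply as written to cgroup v1. The
function name and the SAMPLE numbers are made up for illustration and are not
taken from anyone's machine in this thread.

```python
# Hypothetical memory.stat excerpt: a little anonymous memory, a lot of
# page cache charged to the same cgroup.
SAMPLE = """anon 419430400
file 8589934592
kernel_stack 262144
"""


def anon_and_file(stat_text):
    """Return (anon_bytes, file_bytes) parsed from memory.stat content."""
    stats = {}
    for line in stat_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    # Missing keys are reported as 0 rather than raising.
    return stats.get("anon", 0), stats.get("file", 0)


def read_cgroup_memory(cgroup_dir):
    """Read memory.stat from a cgroup directory (path is an assumption)."""
    with open(f"{cgroup_dir}/memory.stat", encoding="ascii") as handle:
        return anon_and_file(handle.read())
```

On such a system the directory would typically be something like
/sys/fs/cgroup/system.slice/httpd.service, but that path depends on how the
hierarchy is mounted.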
Re: [systemd-devel] Antw: Re: Antw: [EXT] Re: Memory in systemctl status
On 30.09.20 at 11:04, Ulrich Windl wrote:
> >>> Reindl Harald wrote on 30.09.2020 at 10:56:
>
>>
>> On 30.09.20 at 09:06, Ulrich Windl wrote:
>>>> my webserver is killed because it served at monday, tuesday, thursday
>>>> and friday 4 different files with 2 GB?
>>>
>>> cgroups is for limiting resources, not for killing processes AFAIK
>>
>> [Service]
>> MemoryMax=4G
>>
>> would call OOM killer
>
> Are you sure? I thought OOM is called when the _system_ memory is exhausted.
> IMHO any memory allocation request to the process will be denied, but the
> process wouldn't be killed. But agreed, I didn't track the cgroups changes in
> the last few years.

hell yes i am sure because the line is from the development machine of my
co-worker who managed an endless recursion in a php script eating a
piece of memory on each iteration

after adding that httpd.service got killed by OOM killer instead of
bringing the whole machine with 16 GB down
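Whether a unit was hit by the OOM killer inside its own cgroup can be checked
after the fact. This is a small Python sketch, again assuming cgroup v2, where
a cgroup's memory.events file carries an "oom_kill" counter; the sample text
and the function name are hypothetical.

```python
def oom_kill_count(events_text):
    """Return the oom_kill counter from memory.events content, or 0."""
    for line in events_text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] == "oom_kill":
            return int(parts[1])
    return 0


# Hypothetical memory.events content after one kill inside the cgroup.
SAMPLE_EVENTS = "low 0\nhigh 0\nmax 12\noom 1\noom_kill 1\n"
```

A non-zero value means a process belonging to that cgroup was killed because
of the cgroup's own limit, which matches the behaviour described above.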
Re: [systemd-devel] How to reply to the list
On 30.09.20 at 09:35, Ulrich Windl wrote:
> >>> Dave Howorth wrote on 28.09.2020 at 16:34 in message
> <20200928153422.6bf6e...@acer-suse.lan>:
>> On Mon, 28 Sep 2020 14:10:38 +0200
>> Reindl Harald wrote:
>>> can you stop "reply-all" and breaking threads when respond to lists?
>>
>> I can't answer for the reply-all, that would annoy me as well.
>> But the thread isn't broken, my MUA is showing it nicely.
>
> Also: Some MUAs only have "reply" and "Reply to all"; the first one would only
> reply to the sender, and the last one would reply to list and sender.

and?

you are one of the "reply-all" guys and so i need "reply-all" too - now
look at my message - do you see anything else than
"systemd-devel@lists.freedesktop.org"

it's no rocket science using a 08/15 brain and remove useless RCPT's
even before start writing the response
Re: [systemd-devel] Antw: [EXT] Re: Memory in systemctl status
On 30.09.20 at 09:11, Ulrich Windl wrote:
> >>> Reindl Harald wrote on 28.09.2020 at 11:37:
>> httpd don't use 8.7 GB RAM - period
>
> Are you really sure about that?

1000% sure

even if one makes the mistake and multiply the shared opcache of 400 MB
with the count of worker processes we won't exceed 4500 MB

> I haven't checked apache recently, but years
> ago, static content was memory-mapped for performance reasons.

and you think that mapping is forever long after the request is finished
or even unconditional?

https://httpd.apache.org/docs/2.4/en/mod/core.html#enablemmap

however, the config for at least a decade:

EnableSendFile On
EnableMMAP Off
[systemd-devel] Antw: Re: Antw: [EXT] Re: Memory in systemctl status
>>> Reindl Harald wrote on 30.09.2020 at 10:56:
>
> On 30.09.20 at 09:06, Ulrich Windl wrote:
>>> my webserver is killed because it served at monday, tuesday, thursday
>>> and friday 4 different files with 2 GB?
>>
>> cgroups is for limiting resources, not for killing processes AFAIK
>
> [Service]
> MemoryMax=4G
>
> would call OOM killer

Are you sure? I thought OOM is called when the _system_ memory is exhausted.
IMHO any memory allocation request to the process will be denied, but the
process wouldn't be killed. But agreed, I didn't track the cgroups changes in
the last few years.
Re: [systemd-devel] Antw: [EXT] Re: Memory in systemctl status
On 30.09.20 at 09:06, Ulrich Windl wrote:
>> my webserver is killed because it served at monday, tuesday, thursday
>> and friday 4 different files with 2 GB?
>
> cgroups is for limiting resources, not for killing processes AFAIK

[Service]
MemoryMax=4G

would call OOM killer
[systemd-devel] Q: logrotate and "systemctl kill -s HUP ..."
Hi!

I have a problem with logrotate: My postrotate command does not seem to
send a HUP signal. However the files are rotated.
I'm using this (not preferred way, I know):

...
postrotate
    test -s '/var/run/iotwatch-LOC1/iotwatch-LOC1.pid' &&
    systemctl kill -s HUP --kill-who=main iotwatch@LOC1.service
endscript
...

I've verified that the PID file exists (just rebooted the server a few
minutes ago):
# ll /var/run/iotwatch-LOC1/iotwatch-LOC1.pid
-rw-r--r-- 1 root root 5 Sep 30 10:07 /var/run/iotwatch-LOC1/iotwatch-LOC1.pid

My service would log the arrival of any HUP signal, but it didn't. Also in
syslog I could not find any error message related to "systemctl kill".
What might be wrong?

My service is using ExecStartPre, ExecStartPost, and ExecStart. Could
systemd be confused about "--kill-who=main" then?

Regards,
Ulrich
[systemd-devel] How to reply to the list
>>> Dave Howorth wrote on 28.09.2020 at 16:34 in message
<20200928153422.6bf6e...@acer-suse.lan>:
> On Mon, 28 Sep 2020 14:10:38 +0200
> Reindl Harald wrote:
>> can you stop "reply-all" and breaking threads when respond to lists?
>
> I can't answer for the reply-all, that would annoy me as well.
> But the thread isn't broken, my MUA is showing it nicely.

Also: Some MUAs only have "reply" and "Reply to all"; the first one would only
reply to the sender, and the last one would reply to list and sender.
Interestingly for some lists a plain "Reply" works just right, but not for
this list. Please do not assume everybody is using the same MUA as you do...
[systemd-devel] Antw: [EXT] Re: Memory in systemctl status
>>> Reindl Harald wrote on 28.09.2020 at 11:37 in message
<0e47024a-faeb-bb78-9b08-fdfad2a23...@thelounge.net>:
>
> On 28.09.20 at 11:19, Benjamin Berg wrote:
>>> if i would set "MemoryMax" to 4G "Memory: 8.6G" would kill it when the
>>> caches are accounted in that context
>>
>> No, the kernel kicks in and reclaims memory at that point. Which can
>> mean either swapping or just dropping caches.
>
> caches have *nothing* to do with the service itself
>
>> It really sounds to me like ulimit fits better what you are trying to
>> do. That is available through Limit*=, see systemd.exec.
>
> hell first i want a output in "systemctl status whatever" which is true
> and don't contain a ISO image downloaded by someone two days ago
>
> not more and not less
>
> httpd don't use 8.7 GB RAM - period

Are you really sure about that? I haven't checked apache recently, but years
ago, static content was memory-mapped for performance reasons.

>
> the only interesting memory is RES of all the processes
>
> my Firefox on the desktop don't use 32 GB RAM even when VIRT shows that
> and even if the latest download of a 10 GB file is somewhere in the OS
> caches in case it's opened later - it's *free* memory
>
> Main PID: 713 (httpd)
>    Tasks: 16 (limit: 1024)
>   Memory: 8.7G
>      CPU: 2h 24min 14.348s
>   CGroup: /system.slice/httpd.service
>           ├─713 /usr/sbin/httpd -D FOREGROUND
>           ├─2435242 /usr/sbin/httpd -D FOREGROUND
>           ├─2435243 /usr/sbin/httpd -D FOREGROUND
>           ├─2435931 /usr/sbin/httpd -D FOREGROUND
>           ├─2435942 /usr/sbin/httpd -D FOREGROUND
>           ├─2435944 /usr/sbin/httpd -D FOREGROUND
>           ├─2435947 /usr/sbin/httpd -D FOREGROUND
>           ├─2435948 /usr/sbin/httpd -D FOREGROUND
>           ├─2435952 /usr/sbin/httpd -D FOREGROUND
>           ├─2435954 /usr/sbin/httpd -D FOREGROUND
>           ├─2435960 /usr/sbin/httpd -D FOREGROUND
>           ├─2435966 /usr/sbin/httpd -D FOREGROUND
>           ├─2435968 /usr/sbin/httpd -D FOREGROUND
>           ├─2435969 /usr/sbin/httpd -D FOREGROUND
>           ├─2435970 /usr/sbin/httpd -D FOREGROUND
>           └─2435972 /usr/sbin/httpd -D FOREGROUND
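For comparison with the "Memory:" line, the per-process RES figure asked for
above can be summed by hand. This Python sketch assumes a cgroup v2 directory
with a cgroup.procs file and the usual "VmRSS:" line (in kB) in
/proc/<pid>/status; the function names are hypothetical. It also over-counts:
pages shared between the workers, such as a shared opcache, are included once
per process, so the sum is an upper bound rather than the real footprint.

```python
from pathlib import Path


def vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status content.

    Kernel threads have no VmRSS line; 0 is returned for them.
    """
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return 0


def cgroup_rss_kib(cgroup_dir, proc_root="/proc"):
    """Sum VmRSS over all PIDs listed in <cgroup_dir>/cgroup.procs."""
    total = 0
    for pid in Path(cgroup_dir, "cgroup.procs").read_text().split():
        try:
            total += vm_rss_kib(Path(proc_root, pid, "status").read_text())
        except OSError:
            # The process exited between listing and reading; skip it.
            continue
    return total
```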
[systemd-devel] Antw: [EXT] Re: Memory in systemctl status
>>> Reindl Harald wrote on 28.09.2020 at 10:08 in message
<5b087cb0-9588-56db-1955-522ac9a6b...@thelounge.net>:
>
> On 27.09.20 at 23:39, Benjamin Berg wrote:
>>> however, that value makes little to no sense and if that's the same
>>> value as accounted for "MemoryMax" it's plain wrong
>> But it does make sense. File caches are part of the working set of
>> memory that a process needs. Setting MemoryMax=/MemoryMin=
>> limits/guarantees the size of this working set. These kinds of limits
>> or protections would be a lot less meaningful if caches were not
>> accounted for.
>
> sorry but that is complete nonsense
>
> caches are freed as soon whatever process asks for RAM and so they are
> *not* part of the working set
>
> that kind of limits are completely useless when i would limit a service
> to 4 GB but because it served a few million different files within the
> last weeks which are accounted to it's cache and working set it's now
> killed?

Actually there are valid reasons to limit the amount of cache a process may
allocate. For example when a process creates a lot of dirty buffers quickly
(e.g. writing to a slow disk), it may cause a read-stall for the whole system.

>
> my webserver is killed because it served at monday, tuesday, thursday
> and friday 4 different files with 2 GB?

cgroups is for limiting resources, not for killing processes AFAIK.

>
> frankly my webserver can't even do anything against caching of the VFS
> layer and is not responsible at all nor do other services
>
> BTW: stop "reply-all" to mailing-lists