Re: [systemd-devel] setting cpulimit/iolimit on mysql thread not entire process

2023-11-27 Thread Demi Marie Obenour
On Tue, Nov 28, 2023 at 08:35:29AM +0200, Mantas Mikulėnas wrote:
> On Tue, Nov 28, 2023 at 8:27 AM jai wrote:
> 
> > I am able to set cpulimit, iolimit, etc for a process using its pid
> > through cgroups v2. But for some threads of a single mysql process, how can
> > I achieve that?
> >
> 
> You cannot; 1) the limits are per-cgroup and the entire service is a single
> cgroup; 2) the threads are created by mysqld, not by systemd, and systemd
> does not monitor and move service processes across cgroups once the service
> is already running; 3) afaik, in cgroups v2 it isn't even allowed for
> threads of a single process to straddle multiple cgroups anymore.
> 
> I'm not a DBA but I've heard that one common way to handle this would be to
> create a separate MySQL instance (probably on a separate machine, even)
> that would replicate all the data, for the heavy users to query. (Or the
> other way around, main instance for the heavy updates ⇒ replica for regular
> queries.)

Generally heavy analytical queries should be on a replica.  The reason
is that analytical queries are less likely to need the very latest
data, whereas transactions probably do.
-- 
Sincerely,
Demi Marie Obenour (she/her/hers)
Invisible Things Lab




Re: [systemd-devel] setting cpulimit/iolimit on mysql thread not entire process

2023-11-27 Thread Mantas Mikulėnas
On Tue, Nov 28, 2023 at 8:27 AM jai wrote:

> I am able to set cpulimit, iolimit, etc for a process using its pid
> through cgroups v2. But for some threads of a single mysql process, how can
> I achieve that?
>

You cannot; 1) the limits are per-cgroup and the entire service is a single
cgroup; 2) the threads are created by mysqld, not by systemd, and systemd
does not monitor and move service processes across cgroups once the service
is already running; 3) afaik, in cgroups v2 it isn't even allowed for
threads of a single process to straddle multiple cgroups anymore.

I'm not a DBA but I've heard that one common way to handle this would be to
create a separate MySQL instance (probably on a separate machine, even)
that would replicate all the data, for the heavy users to query. (Or the
other way around, main instance for the heavy updates ⇒ replica for regular
queries.)
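
To illustrate point 1, here is a minimal Python sketch of how a CPU limit
attaches to a cgroup directory rather than to a PID or thread. It assumes
the cgroup v2 cpu.max file format ("<quota> <period>", both in
microseconds). The function names and the example path are hypothetical,
and for a systemd service you would normally set CPUQuota= on the unit
instead of writing the file yourself.

```python
from pathlib import Path


def cpu_max_value(percent: int, period_us: int = 100_000) -> str:
    """Return a cgroup v2 cpu.max string ("<quota> <period>") for a CPU percentage."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    quota_us = period_us * percent // 100
    return f"{quota_us} {period_us}"


def apply_cpu_limit(cgroup_dir: Path, percent: int) -> None:
    """Write the limit into the cgroup directory (hypothetical helper)."""
    # The limit is a property of the cgroup, so it covers every task
    # that is a member of that cgroup.
    (cgroup_dir / "cpu.max").write_text(cpu_max_value(percent) + "\n")
```

For example, apply_cpu_limit(Path("/sys/fs/cgroup/example.slice"), 50)
would write "50000 100000"; the path is only a placeholder.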

-- 
Mantas Mikulėnas


[systemd-devel] systemd-networkd code design documentation?

2023-11-27 Thread Muggeridge, Matt
Hi,

As I start looking at the code, is there any design documentation for 
developers that describes systemd-networkd?

Specifically, I'm looking for an overview of the data-flow when an IPv6 Router 
Advertisement is received, where it is processed and where it generates the 
reply.

I'm slowly building a picture of this flow, but if someone has already been 
down this path and is willing to share, then it will save me some time.

Thanks,
Matt.




Re: [systemd-devel] networkd 249.11 fails to create ip6gre and vti6 tunnels

2023-11-27 Thread Mantas Mikulėnas
Kernel and systemd changes aside, I kind of want to say that you need to
specify an interface for the link-local endpoint to be bound to – just as
with regular sockets. If the tunnel were device-bound and not independent,
that would happen by default.
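
As a rough illustration of the "regular sockets" analogy only (not of how
networkd or the kernel configure tunnels), the Python sketch below shows
that a link-local address needs an interface index as its scope ID, while
a global address does not. The helper names are hypothetical.

```python
import ipaddress
import socket


def needs_scope_id(address: str) -> bool:
    """True if the IPv6 address is link-local (fe80::/10)."""
    return ipaddress.IPv6Address(address).is_link_local


def scoped_sockaddr(address: str, ifname: str, port: int = 0):
    """Build an AF_INET6 sockaddr tuple, adding a scope ID only when needed."""
    scope_id = socket.if_nametoindex(ifname) if needs_scope_id(address) else 0
    # (host, port, flowinfo, scope_id) is the tuple form Python uses for AF_INET6.
    return (address, port, 0, scope_id)
```

With the addresses from the report, needs_scope_id("fe80::1") is true and
needs_scope_id("2001:dead:beef::2") is false.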

It also seems weird that the tunnel has endpoints with different scopes; I
think I've seen routers reject such packets with a "Scope Mismatch" error.

I would try building systemd from Git source; if I remember correctly,
systemd-networkd could be run directly from the build directory, making it
possible to `git bisect` down to the change that fixed this.

On Mon, Nov 27, 2023, 19:38 Danilo Egea Gondolfo <
danilo.egea.gondo...@gmail.com> wrote:

> Hello,
>
> I'm looking for help to understand an issue we are observing on Ubuntu
> 22.04.
>
> networkd is failing with "netdev could not be created: Invalid argument"
> when I try to create either an ip6gre or vti6 device.
>
> We believe this problem started when we pulled this change [1] into the
> kernel 5.15. The problem also happens with the most recent upstream kernel
> so it's not an issue introduced by Ubuntu.
>
> The problem doesn't happen on recent versions of systemd but we'd like to
> fix it on systemd 249 (used by Ubuntu 22.04).
>
> How to reproduce the problem (tested on Ubuntu 22.04 (jammy) with systemd
> 249.11-0ubuntu3.11 and kernel 5.15.0-89-generic):
>
> --- /etc/systemd/network/tun0.netdev ---
> [NetDev]
> Name=tun0
> Kind=ip6gre
>
> [Tunnel]
> Independent=true
> Local=fe80::1
> Remote=2001:dead:beef::2
> --
>
> --- /etc/systemd/network/tun0.network ---
> [Match]
> Name=tun0
>
> [Network]
> LinkLocalAddressing=ipv6
> ConfigureWithoutCarrier=yes
> --
>
> After restarting networkd I see this in the logs
> tun0: netdev could not be created: Invalid argument
> tun0: netdev removed
>
> If we boot a kernel that doesn't have [1], the interface tun0 is created.
>
> Here is the full log with debug enabled
> https://paste.ubuntu.com/p/dPbPxgRThW/
>
> As I said, the problem seems to be fixed already in systemd, but I'm
> looking for help to understand what changes fixed it.
> The theory is that the netlink attributes used to configure the tunnel
> local/remote IPs might be wrong.
>
> This problem is documented here
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2037667
>
> Thanks in advance.
>
> [1] -
> https://github.com/torvalds/linux/commit/b0ad3c179059089d809b477a1d445c1183a7b8fe
>


Re: [systemd-devel] How to properly wait for udev?

2023-11-27 Thread Richard Weinberger
On Mon, Nov 27, 2023 at 9:29 AM Lennart Poettering wrote:
> If they conceptually should be considered block device equivalents, we
> might want to extend the udev logic to such UBI devices too.  Patches
> welcome.

Why doesn't udev flock() every device it is probing?
Or asked differently, why is this feature opt-in instead of opt-out?

-- 
Thanks,
//richard


[systemd-devel] networkd 249.11 fails to create ip6gre and vti6 tunnels

2023-11-27 Thread Danilo Egea Gondolfo

Hello,

I'm looking for help to understand an issue we are observing on Ubuntu 
22.04.


networkd is failing with "netdev could not be created: Invalid argument" 
when I try to create either an ip6gre or vti6 device.


We believe this problem started when we pulled this change [1] into the
kernel 5.15. The problem also happens with the most recent upstream 
kernel so it's not an issue introduced by Ubuntu.


The problem doesn't happen on recent versions of systemd but we'd like 
to fix it on systemd 249 (used by Ubuntu 22.04).


How to reproduce the problem (tested on Ubuntu 22.04 (jammy) with 
systemd 249.11-0ubuntu3.11 and kernel 5.15.0-89-generic):


--- /etc/systemd/network/tun0.netdev ---
[NetDev]
Name=tun0
Kind=ip6gre

[Tunnel]
Independent=true
Local=fe80::1
Remote=2001:dead:beef::2
--

--- /etc/systemd/network/tun0.network ---
[Match]
Name=tun0

[Network]
LinkLocalAddressing=ipv6
ConfigureWithoutCarrier=yes
--

After restarting networkd I see this in the logs

tun0: netdev could not be created: Invalid argument
tun0: netdev removed

If we boot a kernel that doesn't have [1], the interface tun0 is created.

Here is the full log with debug enabled 
https://paste.ubuntu.com/p/dPbPxgRThW/


As I said, the problem seems to be fixed already in systemd, but I'm 
looking for help to understand what changes fixed it.
The theory is that the netlink attributes used to configure the tunnel 
local/remote IPs might be wrong.


This problem is documented here 
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2037667


Thanks in advance.

[1] - 
https://github.com/torvalds/linux/commit/b0ad3c179059089d809b477a1d445c1183a7b8fe


Re: [systemd-devel] How to properly wait for udev?

2023-11-27 Thread Richard Weinberger
On Mon, Nov 27, 2023 at 9:29 AM Lennart Poettering wrote:
> On So, 26.11.23 00:39, Richard Weinberger (richard.weinber...@gmail.com) 
> wrote:
>
> > Hello!
> >
> > After upgrading my main test worker to a recent distribution, the UBI
> > test suite [0] fails at various places with -EBUSY.
> > The reason is that these tests create and remove UBI volumes rapidly.
> > A typical test sequence is as follows:
> > 1. creation of /dev/ubi0_0
> > 2. some exclusive operation, such as atomic update or volume resize on
> > /dev/ubi0_0
> > 3. removal of /dev/ubi0_0
> >
> > Both steps 2 and 3 can fail with -EBUSY because the udev worker still
> > holds a file descriptor to /dev/ubi0_0.
>
> Hmm, I have no experience with UBI, but are you sure we open that? why
> would we? are such devices analyzed by blkid? We generally don't open
> device nodes unless we have a reason to, such as doing blkid on it or
> so.

I think it came via commit:
dbbf424c8b77 ("rules: ubi mtd - add link to named partitions (#6750)")

Here is the bpftrace output of a failed mkvol_basic run.
The test created a new volume and tried to delete it via ioctl().
Right after creating the volume, udev started inspecting it and mkvol_basic
was unable to delete it because the delete operation needs exclusive ownership.

mkvol_basic(530):   open() = /dev/ubi0
mkvol_basic(530):   ioctl(cmd: 1074032385)
(udev-worker)(531): open UBI volume 0 = 0x96644533ac80
mkvol_basic(530):   open UBI volume 0 = 0xfff0
mkvol_basic(530):   failed ioctl() = -16
(udev-worker)(531): closing UBI volume 0x96644533ac80

> What precisely fails for you? the open()? or some operation on the
> opened fd?

All of that. It depends on the test.
Basically every test assumes that it has the full ownership of a
volume it has created.

> >
> > FWIW, the problem can also get triggered using UBI's shell utilities
> > if the system is fast enough, e.g.
> > # ubimkvol -N testv -S 50 -n 0 /dev/ubi0 && ubirmvol -n 0 /dev/ubi0
> > Volume ID 0, size 50 LEBs (793600 bytes, 775.0 KiB), LEB size 15872
> > bytes (15.5 KiB), dynamic, name "testv", alignment 1
> > ubirmvol: error!: cannot UBI remove volume
> >  error 16 (Device or resource busy)
> >
> > Instead of adding a retry loop around -EBUSY, I believe the best
> > solution is to add code to wait for udev.
> > For example, having a udev barrier in ubi_mkvol() and ubi_rmvol() [1]
> > seems like a good idea to me.
>
> For block devices we implement this:
>
> https://systemd.io/BLOCK_DEVICE_LOCKING
>
> I understand UBI aren't block devices though?

Exactly, UBI volumes are character devices, just like MTDs.

> If they conceptually should be considered block device equivalents, we
> might want to extend the udev logic to such UBI devices too.  Patches
> welcome.
>
> We provide "udevadm lock" to lock a block device according to this
> scheme from shell scripts.
>
> > What function from libsystemd do you suggest for waiting until udev is
> > done with rule processing?
> > My naive approach, using udev_queue_is_empty() and
> > sd_device_get_is_initialized(), does not resolve all failures so far.
> > Firstly, udev_queue_is_empty() doesn't seem to be exported by
> > libsystemd. I have open-coded it as:
> > static int udev_queue_is_empty(void) {
> >         return access("/run/udev/queue", F_OK) < 0 ?
> >                 (errno == ENOENT ? true : -errno) : false;
> > }
>
> This doesn't really work. udev might still process the device in the
> background.

I see.

-- 
Thanks,
//richard


Re: [systemd-devel] WSL Ubuntu creates XDG_RUNTIME_DIR with incorrect permissions

2023-11-27 Thread Andrei Borzenkov
On Mon, Nov 27, 2023 at 1:06 AM Thomas Larsen Wessel wrote:
>>
>> WSL does not use systemd by default.
>
>
> According to this article, systemd has been the default on WSL Ubuntu since
> June 2023. https://learn.microsoft.com/en-us/windows/wsl/systemd
>
> "Systemd is now the default for the current version of Ubuntu that will be 
> installed using the wsl --install command default."
>
> Also when I look in the /var/log/auth.log, there are many lines with systemd, 
> e.g.:
>
> Nov 25 22:30:14 ELCON45223 systemd-logind[155]: New session 6 of user velle.
> Nov 25 22:30:14 ELCON45223 systemd: pam_unix(systemd-user:session): session 
> opened for user velle(uid=1000) by (uid=0)
>
> Could someone please help me understand exactly which part creates this 
> XDG_RUNTIME_DIR folder?

/run/user/$UID for the "console" session (the one you get when
starting a WSL instance) is created by WSL before systemd. Adding "ls
-l /run/user" as ExecStartPre= to user-runtime-dir@1000.service shows:

Nov 27 12:34:22 tumbleweed unknown: WSL (2) ERROR:
WaitForBootProcess:3237: /sbin/init failed to start within 1
Nov 27 12:34:22 tumbleweed unknown: ms
Nov 27 12:34:22 tumbleweed unknown: WSL (2): Creating login session for andrei
...
Nov 27 12:34:22 tumbleweed systemd[1]: Created slice User Slice of UID 1000.
Nov 27 12:34:22 tumbleweed systemd[1]: Starting User Runtime Directory
/run/user/1000...
Nov 27 12:34:22 tumbleweed ls[520]: total 0
Nov 27 12:34:22 tumbleweed ls[520]: drwxr-xr-x 4 andrei users 120 Nov
27 12:34 1000
Nov 27 12:34:22 tumbleweed systemd-logind[160]: New session 11 of user andrei.
Nov 27 12:34:22 tumbleweed systemd[1]: Finished User Runtime Directory
/run/user/1000.

So logind invokes user-runtime-dir@1000.service, but it sees the
existing directory and does nothing. I would suggest asking this
question on WSL support channels.
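
For anyone who wants to check a runtime directory by hand, here is a
small Python sketch of the test that the Qt warning quoted below
performs: the XDG Base Directory Specification requires the directory to
be owned by the user and to have mode 0700. The function name is
hypothetical, and this is not the check pam_systemd itself implements.

```python
import os
import stat


def runtime_dir_problems(path: str, uid: int) -> list[str]:
    """Return a list of XDG_RUNTIME_DIR requirement violations for path."""
    st = os.stat(path)
    problems = []
    if not stat.S_ISDIR(st.st_mode):
        problems.append("not a directory")
    if st.st_uid != uid:
        problems.append(f"owned by uid {st.st_uid}, expected {uid}")
    mode = stat.S_IMODE(st.st_mode)  # permission bits only
    if mode != 0o700:
        problems.append(f"mode {mode:04o} instead of 0700")
    return problems
```

Calling runtime_dir_problems("/run/user/1000", 1000) on the affected WSL
instance would be expected to report the 0755 mode shown in the ls output
above.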

> Is it part of the systemd repo or not? And if the answer is (or may be) 
> different between Ubuntu and WSL Ubuntu, I would be happy if you share what 
> you know about any of those cases :) Right now, I barely know where to 
> report this issue.
>
>
> On Sun, Nov 26, 2023 at 10:07 AM Andrei Borzenkov  wrote:
>>
>> On 26.11.2023 02:39, Thomas Larsen Wessel wrote:
>> > I set up WSL on Windows 10 and created an instance from the default Ubuntu
>> > 22.04 image.
>> >
>> > I ran some (non-GUI) software that somehow relies on Qt, and apparently Qt
>> > does some checks on the XDG environment, so I got the following.
>> >
>> > *Warning: QStandardPaths: wrong permissions on runtime directory
>> > /run/user/1000/, 0755 instead of 0700*
>> >
>> > And yes, all the user folders are set to 755, including much of their
>> > content, which violates the XDG Base Directory Specification. (screenshot:
>> > https://i.imgur.com/ISn3ebh.png).
>> >
>> > As far as I can understand, it's some part of systemd that creates this
>> > folder. So is this an issue with systemd?
>> >
>>
>> WSL does not use systemd by default.
>>
>> > The validate_runtime_directory in pam_systemd already does a number of
>> > checks on XDG_RUNTIME_DIR. How about also checking if the permissions are
>> > correct/valid?
>> >
>> > Sincerely, Thomas
>> >
>>


Re: [systemd-devel] Performance issues after migrating to systemd

2023-11-27 Thread František Šumšal

It would be great to start with `systemd-analyze blame` and `systemd-analyze 
critical-chain`
to see what's going on during boot and point out the time hog(s).

On 11/27/23 07:16, hari.prasat...@microchip.com wrote:

Hello All,

We recently migrated our Yocto project distribution for our embedded
Linux based system to Systemd from sysVinit. We have our graphics
launcher application known as EGT which is public with its own repo as
below.

https://github.com/linux4sam/egt

We are facing performance issues after migrating to systemd. We have a
set of benchmarking applications whose scores have dropped with this
migration; startup in particular is too slow. We are trying to use
profiling tools like prof to see what's happening under the hood, but
any pointers on what might be going wrong or areas to check would be
useful. The main launcher service is located at

https://github.com/linux4sam/meta-atmel/blob/kirkstone/recipes-egt/apps/egt-launcher_1.3.bb

Any leads would be helpful.

Regards,
Hari



Re: [systemd-devel] How to properly wait for udev?

2023-11-27 Thread Mantas Mikulėnas
On Mon, Nov 27, 2023 at 10:30 AM Lennart Poettering 
wrote:

> On So, 26.11.23 00:39, Richard Weinberger (richard.weinber...@gmail.com)
> wrote:
>
> > Hello!
> >
> > After upgrading my main test worker to a recent distribution, the UBI
> > test suite [0] fails at various places with -EBUSY.
> > The reason is that these tests create and remove UBI volumes rapidly.
> > A typical test sequence is as follows:
> > 1. creation of /dev/ubi0_0
> > 2. some exclusive operation, such as atomic update or volume resize on
> > /dev/ubi0_0
> > 3. removal of /dev/ubi0_0
> >
> > Both steps 2 and 3 can fail with -EBUSY because the udev worker still
> > holds a file descriptor to /dev/ubi0_0.
>
> Hmm, I have no experience with UBI, but are you sure we open that? why
> would we? are such devices analyzed by blkid? We generally don't open
> device nodes unless we have a reason to, such as doing blkid on it or
> so.
>

blkid and 60-persistent-storage indeed analyze ubi devices, it seems.

-- 
Mantas Mikulėnas


Re: [systemd-devel] How to properly wait for udev?

2023-11-27 Thread Lennart Poettering
On So, 26.11.23 00:39, Richard Weinberger (richard.weinber...@gmail.com) wrote:

> Hello!
>
> After upgrading my main test worker to a recent distribution, the UBI
> test suite [0] fails at various places with -EBUSY.
> The reason is that these tests create and remove UBI volumes rapidly.
> A typical test sequence is as follows:
> 1. creation of /dev/ubi0_0
> 2. some exclusive operation, such as atomic update or volume resize on
> /dev/ubi0_0
> 3. removal of /dev/ubi0_0
>
> Both steps 2 and 3 can fail with -EBUSY because the udev worker still
> holds a file descriptor to /dev/ubi0_0.

Hmm, I have no experience with UBI, but are you sure we open that? why
would we? are such devices analyzed by blkid? We generally don't open
device nodes unless we have a reason to, such as doing blkid on it or
so.

What precisely fails for you? the open()? or some operation on the
opened fd?

>
> FWIW, the problem can also get triggered using UBI's shell utilities
> if the system is fast enough, e.g.
> # ubimkvol -N testv -S 50 -n 0 /dev/ubi0 && ubirmvol -n 0 /dev/ubi0
> Volume ID 0, size 50 LEBs (793600 bytes, 775.0 KiB), LEB size 15872
> bytes (15.5 KiB), dynamic, name "testv", alignment 1
> ubirmvol: error!: cannot UBI remove volume
>  error 16 (Device or resource busy)
>
> Instead of adding a retry loop around -EBUSY, I believe the best
> solution is to add code to wait for udev.
> For example, having a udev barrier in ubi_mkvol() and ubi_rmvol() [1]
> seems like a good idea to me.

For block devices we implement this:

https://systemd.io/BLOCK_DEVICE_LOCKING

I understand UBI aren't block devices though?

If they conceptually should be considered block device equivalents, we
might want to extend the udev logic to such UBI devices too.  Patches
welcome.

We provide "udevadm lock" to lock a block device according to this
scheme from shell scripts.
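
As a rough sketch of that scheme from a program rather than a shell
script: the caller takes an exclusive BSD lock (flock) on the device node
and holds it while working on the device. The helper name below is
hypothetical, and as discussed in this thread the convention currently
applies to block devices, so this does not by itself solve the UBI case.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_device_lock(path: str):
    """Hold an exclusive BSD lock on path for the duration of the with-block."""
    fd = os.open(path, os.O_RDONLY | os.O_CLOEXEC)
    try:
        # Blocks until no other open file description holds a conflicting lock.
        fcntl.flock(fd, fcntl.LOCK_EX)
        yield fd
    finally:
        os.close(fd)  # closing the descriptor releases the lock
```

A caller would wrap its exclusive operations in
"with exclusive_device_lock(device_path):", where device_path is whatever
block device node is being modified.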

> What function from libsystemd do you suggest for waiting until udev is
> done with rule processing?
> My naive approach, using udev_queue_is_empty() and
> sd_device_get_is_initialized(), does not resolve all failures so far.
> Firstly, udev_queue_is_empty() doesn't seem to be exported by
> libsystemd. I have open-coded it as:
> static int udev_queue_is_empty(void) {
>         return access("/run/udev/queue", F_OK) < 0 ?
>                 (errno == ENOENT ? true : -errno) : false;
> }

This doesn't really work. udev might still process the device in the
background.

Lennart

--
Lennart Poettering, Berlin