[systemd-devel] Antw: Re: Antw: [EXT] Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
>>> Neal Gompa wrote on 11.08.2022 at 09:22:
> On Thu, Aug 11, 2022 at 3:15 AM Ulrich Windl wrote:
>>
>> >>> Lennart Poettering wrote on 10.08.2022 at 22:09:
>> > We issue copy_file_range() syscall, which should do reflinks on xfs,
>> > if it supports that. Question is if your kernel supports that too. I
>> > have no experience with xfs though, no idea how xfs hooked up reflink
>> > initially. And we never tested that really. I don't think outside RHEL
>> > many people use xfs.
>>
>> Not true: For SUSE /home is typically using XFS, and we use it with SLES for
>> (huge) database filesystems.
>
> In openSUSE, this hasn't been the default behavior for a while. SLES
> will catch up here eventually.

As it happens, I created some filesystems using "yast2 disk" on SLES15 SP4
(latest updates) today. There, the default for any "data" filesystem is
(still) XFS, while the OS itself uses Btrfs.

Agreed: if you don't have a separate filesystem for /home, it will be a
Btrfs subvolume ("fill one, you fill all"; I don't like the subvolume
concept, or perhaps I haven't understood its benefits).

Regards,
Ulrich

> --
> 真実はいつも一つ!/ Always, there's only one truth!
Re: [systemd-devel] Antw: [EXT] Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Thu, Aug 11, 2022 at 3:15 AM Ulrich Windl wrote:
>
> >>> Lennart Poettering wrote on 10.08.2022 at 22:09:
> > We issue copy_file_range() syscall, which should do reflinks on xfs,
> > if it supports that. Question is if your kernel supports that too. I
> > have no experience with xfs though, no idea how xfs hooked up reflink
> > initially. And we never tested that really. I don't think outside RHEL
> > many people use xfs.
>
> Not true: For SUSE /home is typically using XFS, and we use it with SLES for
> (huge) database filesystems.

In openSUSE, this hasn't been the default behavior for a while. SLES
will catch up here eventually.

--
真実はいつも一つ!/ Always, there's only one truth!
[systemd-devel] Antw: [EXT] Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
>>> Lennart Poettering wrote on 10.08.2022 at 22:09:
> On Wed, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:
>
>> So the next obvious question would be, is XFS reflink support on the
>> systemd-nspawn roadmap or actually, (and even better) has support been
>> incorporated already in the latest and greatest src and I'm just behind the
>> curve working with the older version of nspawn as shipped in RHEL90?
>
> We issue copy_file_range() syscall, which should do reflinks on xfs,
> if it supports that. Question is if your kernel supports that too. I
> have no experience with xfs though, no idea how xfs hooked up reflink
> initially. And we never tested that really. I don't think outside RHEL
> many people use xfs.

Not true: on SUSE, /home typically uses XFS, and we use it with SLES for
(huge) database filesystems.

> If you provide a more complete strace output, you should see the
> copy_file_range() stuff there.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
> On 10 Aug 2022, at 21:10, Lennart Poettering wrote:
>
> We issue copy_file_range() syscall, which should do reflinks on xfs,
> if it supports that. Question is if your kernel supports that too. I
> have no experience with xfs though, no idea how xfs hooked up reflink
> initially. And we never tested that really. I don't think outside RHEL
> many people use xfs.

Isn't XFS the default for Fedora Server?

Barry

> If you provide a more complete strace output, you should see the
> copy_file_range() stuff there.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Wed, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:

> So the next obvious question would be, is XFS reflink support on the
> systemd-nspawn roadmap or actually, (and even better) has support been
> incorporated already in the latest and greatest src and I'm just behind the
> curve working with the older version of nspawn as shipped in RHEL90?
>
> I'm asking because according to the RHEL 9 docs
> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
> it's the current default fs and is configured for "Reflink-based file
> copies."

We issue the copy_file_range() syscall, which should do reflinks on xfs,
if it supports that. The question is whether your kernel supports that too.
I have no experience with xfs though, and no idea how xfs hooked up reflink
initially. We never really tested that. I don't think many people outside
RHEL use xfs.

If you provide a more complete strace output, you should see the
copy_file_range() calls there.

Lennart

--
Lennart Poettering, Berlin
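For readers who want to try the copy_file_range() path outside nspawn, here is a minimal Python sketch. It is not nspawn's code; the function name copy_with_cfr and the chunk size are made up for illustration. It assumes Linux and Python 3.8 or later, where os.copy_file_range() is available, and it falls back to a plain read/write copy if the call is missing or fails. Note that the return value only says how many bytes were copied; whether the kernel shared extents (a reflink) or copied the data is not visible at this level.

```python
import os
import shutil


def copy_with_cfr(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path, preferring copy_file_range().

    Returns a string naming the method that was used. This is an
    illustrative helper, not part of systemd.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        remaining = os.fstat(src.fileno()).st_size
        try:
            while remaining > 0:
                # The kernel may satisfy this with a reflink on filesystems
                # that support it; otherwise it copies the bytes in-kernel.
                n = os.copy_file_range(src.fileno(), dst.fileno(),
                                       min(chunk, remaining))
                if n == 0:
                    break  # source ended earlier than its reported size
                remaining -= n
            return "copy_file_range"
        except (AttributeError, OSError):
            # Not available (non-Linux, old kernel) or refused for this
            # pair of files: start over with an ordinary userspace copy.
            src.seek(0)
            dst.seek(0)
            dst.truncate()
            shutil.copyfileobj(src, dst)
            return "read/write fallback"
```

Running this on a large file under strace should show the copy_file_range() calls Lennart mentions, which makes it a small reproducer for checking how a given kernel and filesystem behave.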
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Wed, Aug 10, 2022 at 11:16 AM Thomas Archambault wrote:
>
> I understand that the project supports many distros, but I'm also
> assuming that RHEL would make the list of platforms that the tool should
> run optimally on. So, Is this worth submitting an enhancement request?
> It's certainly not a bug.
>
> We like the feature set of nspawn and want to keep banging on it;
> Probably pushing it into spaces it was not intended for, but that's
> another story, and our issue...
>

You could also ask Red Hat to consider adding Btrfs to RHEL and give
them your use-case so that they could consider adding it. It's already
used by default in Fedora and many other distributions, so bringing it
back to RHEL at this point would be a matter of customers asking for
it.

--
真実はいつも一つ!/ Always, there's only one truth!
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
Thank you again Lennart, and thx Kevin.

That makes total sense, and accounts for the application's high-level
start-up delay, which appears to be what we are stuck with if we are on
xfs. Unfortunately, it's difficult to dictate to the client to change
their fs type; consequently, we can't develop / ship a tool with that
baseline latency on our primary target platform (RHEL xx).

So the next obvious question would be: is XFS reflink support on the
systemd-nspawn roadmap, or (even better) has support already been
incorporated in the latest and greatest src, and I'm just behind the
curve working with the older version of nspawn as shipped in RHEL 9.0?

I'm asking because according to the RHEL 9 docs
(https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
it's the current default fs and is configured for "Reflink-based file
copies."

I understand that the project supports many distros, but I'm also
assuming that RHEL would make the list of platforms that the tool should
run optimally on. So, is this worth submitting an enhancement request?
It's certainly not a bug.

We like the feature set of nspawn and want to keep banging on it,
probably pushing it into spaces it was not intended for, but that's
another story, and our issue...

Regards,
-Tom

On 8/10/22 04:47, Lennart Poettering wrote:
> That's the btrfs subvolume ioctl. It's expected to fail on non-btrfs
> with ENOTTY, and given you have xfs this is behaving as it should.
>
> It then starts copying things manually, which is slow. i.e. it's then
> basically doing what "cp -a" does.
>
> Lennart
>
> --
> Lennart Poettering, Berlin
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Tue, 09.08.22 12:40, Thomas Archambault (t...@tparchambault.com) wrote:

> Thank you Lennart for the follow-up.
>
> There does appear to be mostly filesystem operations prior to my manually
> killing nspawn as you suggested. I only let it run about 3 minutes prior to
> sending a signal given that the strace output = ~25M.
>
> One obvious issue is the non-zero return from an ioctl call with the
> BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my
> RHEL9.0 strace capture; this is occurring right after the initial blast of
> debug log messages. I'm trying to get a stack trace for that error
> currently.
>
> 410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0,
> name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for
> device)

That's the btrfs subvolume ioctl. It's expected to fail on non-btrfs
with ENOTTY, and given you have xfs this is behaving as it should.

It then starts copying things manually, which is slow. I.e. it's then
basically doing what "cp -a" does.

Lennart

--
Lennart Poettering, Berlin
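The pattern Lennart describes (try the btrfs-only fast path, treat ENOTTY as "not supported here", then copy everything by hand) can be sketched in Python as follows. This is only an illustration of the control flow, not nspawn's implementation: make_ephemeral_copy, try_snapshot and snapshot_unsupported are hypothetical names, and the stand-in snapshot function simply raises ENOTTY the way the ioctl does on xfs in the strace above.

```python
import errno
import os
import shutil


def make_ephemeral_copy(src, dst, try_snapshot):
    """Create dst from src, preferring a snapshot over a full copy.

    try_snapshot is a caller-supplied callable (hypothetical here) that
    either creates dst as a snapshot of src or raises OSError.
    """
    try:
        try_snapshot(src, dst)
        return "snapshot"
    except OSError as e:
        # ENOTTY means the filesystem does not know this ioctl, which is
        # the expected outcome on non-btrfs. Anything else is a real error.
        if e.errno != errno.ENOTTY:
            raise
    # Slow path: walk the tree and copy every file, roughly "cp -a".
    shutil.copytree(src, dst, symlinks=True)
    return "full copy"


def snapshot_unsupported(src, dst):
    """Stand-in for a snapshot attempt on a filesystem without support."""
    raise OSError(errno.ENOTTY, os.strerror(errno.ENOTTY))
```

Calling make_ephemeral_copy(some_dir, new_dir, snapshot_unsupported) returns "full copy", and its running time grows with the size of the tree. That is the same effect as the multi-minute delay seen when the source tree is an entire root filesystem.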
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Tue, Aug 9, 2022 at 12:43 PM Thomas Archambault wrote:

> One obvious issue is the non-zero return from an ioctl call with the
> BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my
> RHEL9.0 strace capture; this is occurring right after the initial blast
> of debug log messages. I'm trying to get a stack trace for that error
> currently.
>
> 410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0,
> name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for
> device)

Since you are using XFS and not Btrfs, this seems like an expected result.
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
Thank you Lennart for the follow-up.

There do appear to be mostly filesystem operations before I manually
killed nspawn, as you suggested. I only let it run about 3 minutes before
sending a signal, given that the strace output was already ~25M.

One obvious issue is the non-zero return from an ioctl call with the
BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my
RHEL9.0 strace capture; this occurs right after the initial blast
of debug log messages. I'm currently trying to get a stack trace for that
error.

410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0, name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for device)

...
Setting RLIMIT_SIGPENDING to 14657.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
...

The last line above is generated by the writev at line 395 below.
Unfortunately, I believe I left off the '-s 500' arg to strace. I can
run things again if that would help.

toma@toma-MacBookPro:20220808$ grep -nA25 cgroup2 nspawn.rhel90.boot.strace.out
395:2064 writev(2, [{iov_base="Found cgroup2 on /sys/fs/cgroup/"..., iov_len=56}, {iov_base="\n", iov_len=1}], 2) = 57
396-2064 rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f0298fb3db0}, NULL, 8) = 0
397-2064 umask(022) = 022
398-2064 openat(AT_FDCWD, "/", O_RDONLY|O_CLOEXEC|O_PATH|O_DIRECTORY) = 3
399-2064 close(3) = 0
400-2064 getrandom("\x8c\x75\xd8\x95\x8f\x01\x7b\xd3", 8, GRND_NONBLOCK|GRND_INSECURE) = 8
401-2064 newfstatat(AT_FDCWD, "/.#machine.c8578d59f810b73d", 0x7ffef92612a0, 0) = -1 ENOENT (No such file or directory)
402-2064 openat(AT_FDCWD, "/.#.#machine.c8578d59f810b73d.lck", O_RDWR|O_CREAT|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC, 0600) = 3
403-2064 fcntl(3, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
404-2064 newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0
405-2064 rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
406-2064 openat(AT_FDCWD, "/", O_RDONLY|O_NOCTTY|O_CLOEXEC|O_DIRECTORY) = 4
407-2064 newfstatat(4, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
408-2064 openat(AT_FDCWD, "/", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 5
409-2064 fcntl(5, F_GETFL) = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0, name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for device)
411-2064 close(5) = 0
412-2064 mkdir("/.#machine.c8578d59f810b73d", 0755) = 0
413-2064 newfstatat(4, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
414-2064 fcntl(4, F_DUPFD_CLOEXEC, 3) = 5
415-2064 getrandom("\x77\x72\x24\xdb\xb2\xcf\x6e\x46", 8, GRND_NONBLOCK|GRND_INSECURE) = 8
416-2064 newfstatat(5, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
417-2064 fcntl(5, F_GETFL) = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
418-2064 fcntl(5, F_SETFD, FD_CLOEXEC) = 0
419-2064 openat(AT_FDCWD, "/.#machine.c8578d59f810b73d", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 6
420-2064 getdents64(6, 0x7ffef9260cc0 /* 2 entries */, 840) = 48
toma@toma-MacBookPro:20220808$

That failure leads to many repeated filesystem operations for each
resource, similar to the following except with differing file paths. As
you suggested, that's the reason for the delay in spawning the container.

2297 newfstatat(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", {st_mode=S_IFDIR|0700, st_size=17, ...}, AT_SYMLINK_NOFOLLOW) = 0
2297 statx(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_TYPE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0700, stx_size=17, ...}) = 0
2297 openat(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 9 2765d331699258-switcheroo-control.service-ZzjWeB>
2297 newfstatat(9, "", {st_mode=S_IFDIR|0700, st_size=17, ...}, AT_EMPTY_PATH) = 0
2297 fcntl(9, F_GETFL) = 0x38000 (flags O_RDONLY|O_LARGEFILE|O_NOFOLLOW|O_DIRECTORY)
2297 fcntl(9, F_SETFD, FD_CLOEXEC) = 0
2297 openat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
2297 mkdirat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", 0700) = 0
2297 openat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 10
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Thu, 04.08.22 13:30, Thomas Archambault (t...@tparchambault.com) wrote:

> Following up on xfs and reflinks, it appears they are enabled on my
> out-of-box RHEL9.0. Fwiw, this is a VBox VM; however, so is the FC34 system
> which works correctly, but is using btrfs.
>
> As always, appreciate any help/references.

Try stracing nspawn, to see what it does.

    strace -f -y -s 500 -o /tmp/nspawnstrace.log systemd-nspawn …

Then look at the generated log and see what it is busy doing...

If unsure, paste things somewhere.

Lennart

--
Lennart Poettering, Berlin
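A log written by strace with -f and -o quickly grows to tens of megabytes, so a small summary helps to see what the process is busy doing. The following Python sketch counts syscalls by name and collects the failing ones with their errno names. It is a rough aid, not a full strace parser: the regular expression and the function name summarize are assumptions of this example, and lines it does not recognise (signals, "<unfinished ...>" fragments, exit notices) are simply skipped.

```python
import re
from collections import Counter

# Matches lines such as:
#   2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {...}) = -1 ENOTTY (Inappropriate ...)
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+(?P<call>\w+)\(.*\)\s+=\s+"
    r"(?P<ret>0x[0-9a-f]+|-?\d+|\?)"
    r"(?:\s+(?P<err>E[A-Z0-9]+))?"
)


def summarize(lines):
    """Return (calls, errors) counters for an iterable of strace lines.

    calls  maps syscall name to number of occurrences.
    errors maps (syscall name, errno name) to number of occurrences.
    """
    calls, errors = Counter(), Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # not a completed syscall line
        calls[m.group("call")] += 1
        if m.group("err"):
            errors[(m.group("call"), m.group("err"))] += 1
    return calls, errors
```

To use it on a real capture, pass an open file, for example summarize(open("/tmp/nspawnstrace.log")), and print calls.most_common(10). A run dominated by openat, newfstatat and mkdirat, as in this thread, points to a file-by-file tree copy.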
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
Following up on xfs and reflinks, it appears they are enabled on my
out-of-box RHEL9.0. FWIW, this is a VBox VM; however, so is the FC34 system,
which works correctly but is using btrfs.

As always, appreciate any help/references.

TIA
-Tom

[toma@localhost ~]$ xfs_info /
meta-data=/dev/mapper/rhel-root  isize=512    agcount=4, agsize=4185600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=16742400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=8175, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[toma@localhost ~]$

-------- Forwarded Message --------
Subject: Re: systemd-devel Digest, Vol 148, Issue 2
Date: Thu, 4 Aug 2022 11:22:32 -0400
From: Thomas Archambault
Reply-To: t...@tparchambault.com
To: systemd-devel-requ...@lists.freedesktop.org

Thank you Lennart. Very much appreciate the quick and clear response.

You're absolutely correct about the btrfs/xfs difference between the
working FC34 system and the problematic RHEL9.0 system:

/dev/mapper/rhel-root on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)

My strace output did indicate that there is copying going on, but I did
not know if that was a problem or not. Obviously it can be in terms
of start-up time and UX w/xfs.

- Tom
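The reflink=1 flag in the xfs_info output above is what indicates that the filesystem was created with reflink support. A script that needs to check this on a target machine could do so along these lines. This is a hedged sketch: xfs_info is the same command used in the message above, while the helper names run_xfs_info and xfs_reflink_enabled are invented for this example.

```python
import re
import subprocess


def run_xfs_info(mount_point):
    """Return the text printed by `xfs_info <mount_point>`.

    Raises CalledProcessError if the command fails, for instance when
    the path is not on an XFS filesystem.
    """
    result = subprocess.run(["xfs_info", mount_point],
                            capture_output=True, text=True, check=True)
    return result.stdout


def xfs_reflink_enabled(xfs_info_text):
    """Return True/False for reflink=1/0, or None if the flag is absent."""
    m = re.search(r"\breflink=(\d+)", xfs_info_text)
    if m is None:
        return None  # older xfsprogs output or not xfs_info output at all
    return m.group(1) == "1"
```

For the output shown in this message, xfs_reflink_enabled() returns True. As the rest of the thread shows, that alone does not make "systemd-nspawn -x" fast, because every file still has to be visited and copied individually.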
Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0
On Wed, 03.08.22 15:40, Thomas Archambault (t...@tparchambault.com) wrote:

> I've tried to boil down the duplication code to the simplest example, which
> is also an example in the man page $ sudo systemd-nspawn -xbD/. As with my
> prototyping, the container does not seem to be instantiated.
> Any help with troubleshooting, or specific known issues, or requests for
> more data would be appreciated.

"-x" is ephemeral mode. This means nspawn will make a copy of the OS
tree before booting into it, and remove it afterwards.

"-x" on btrfs is very fast and space efficient, because btrfs supports
both snapshots and reflinks. nspawn will make a subvol snapshot if the
root you specify is a subvol. It will make reflink-based file copies
otherwise.

Other file systems have a more 1990s feature set, i.e. neither reflinks
nor snapshots. (Modern xfs on very new kernels can support reflinks if
this is opted in to.) In that case we have to copy the data files with
their contents, and that's slow.

Hence, what backing fs do you use? If you use non-btrfs, it might
simply be that we are busy individually copying all files...

Lennart

--
Lennart Poettering, Berlin
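To answer the "what backing fs do you use?" question from a script, one option is to look the path up in the mount table. The sketch below takes the text of /proc/mounts (whitespace-separated fields: device, mount point, filesystem type, options, ...) and returns the type of the longest matching mount point. The function name fs_type_for is made up for this example, and the sketch ignores the octal escapes that /proc/mounts uses for unusual characters such as spaces in mount points.

```python
import posixpath


def fs_type_for(path, mounts_text):
    """Return the filesystem type backing an absolute path, or None.

    mounts_text is the content of /proc/mounts (or text in that format).
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        # Compare with a trailing slash so "/tmp" does not match "/tmpfoo".
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix) and \
                len(mount_point) >= len(best_mount):
            # ">=" lets a later mount on the same point override an
            # earlier one, matching how overmounts behave.
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On a live system this would be called as fs_type_for("/", open("/proc/mounts").read()). A result of "btrfs" suggests the fast snapshot or reflink path is available to "-x", while "xfs" or "ext4" suggests the slow file-by-file copy described above.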
[systemd-devel] systemd-nspawn container not starting on RHEL9.0
Good day everyone on the dev list,

We are adding an analysis tool to our application that uses the host's
rootfs as one of its inputs.

As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
isolated container environment using the host's rootfs as the
container's rootfs, and things worked correctly and as expected. The
host's rootfs is analyzed with tmp and results files generated within
the container, without persistent modifications affecting the host's
rootfs. Since RHEL is our ultimate target platform, I've been trying to
duplicate our work on RHEL9.0, without success: the container is not
being instantiated.

I've tried to boil down the duplication code to the simplest example,
which is also an example in the man page: $ sudo systemd-nspawn -xbD/.
As with my prototyping, the container does not seem to be instantiated.

Any help with troubleshooting, or specific known issues, or requests for
more data would be appreciated.

TIA
tparchambault

ps: Regarding security - selinux is in Permissive mode. I do not know if
seccomp filters are getting in the way or not; this is an out-of-the-box
RHEL9.0 base workstation install. In the FC34 prototype, I did need to
allow certain syscalls via --system-call-filter in order to get a daemon
within the container to run correctly, but afaik that should have no
bearing on the instantiation of the container.

On a RHEL9.0 host bash session

[toma@localhost ~]$ systemctl --version
systemd 250 (250-6.el9_0)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT default-hierarchy=unified
[toma@localhost ~]$ uname -a
Linux localhost.localdomain 5.14.0-70.17.1.el9_0.x86_64 #1 SMP PREEMPT Tue Jun 14 11:32:10 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
[toma@localhost ~]$
[toma@localhost ~]$ sudo time systemd-nspawn -D / -xb
^C^C^C^C^CCommand terminated by signal 15
40.81user 298.75system 6:29.72elapsed 87%CPU (0avgtext+0avgdata 8524maxresident)k
205032inputs+0outputs (0major+3287minor)pagefaults 0swaps
[toma@localhost ~]$

In another bash session on the same host

[toma@localhost ~]$ sudo machinectl list
[sudo] password for toma:
No machines.
[toma@localhost ~]$ sudo pkill nspawn
[toma@localhost ~]$

== In the original host bash session, w/increased logging and strace capture ==

[toma@localhost ~]$ sudo SYSTEMD_LOG_LEVEL=debug strace -o Development/nspawn.strace.rhel90.out systemd-nspawn -D / -xb
[sudo] password for toma:
Setting RLIMIT_CPU to infinity.
Setting RLIMIT_FSIZE to infinity.
Setting RLIMIT_DATA to infinity.
Setting RLIMIT_STACK to 8388608:infinity.
Setting RLIMIT_CORE to 0:infinity.
Setting RLIMIT_RSS to infinity.
Setting RLIMIT_NPROC to 14657.
Setting RLIMIT_NOFILE to 1024:524288.
Setting RLIMIT_MEMLOCK to 65536.
Setting RLIMIT_AS to infinity.
Setting RLIMIT_LOCKS to infinity.
Setting RLIMIT_SIGPENDING to 14657.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Terminated
[toma@localhost ~]$

As with the first run, it was killed via pkill from the other terminal
session.

Fwiw, on Fedora 34, the debug log output shows the instantiation of the
container after the "Found cgroup2..." line, with the container working
as documented and eventually presenting the login prompt, i.e.

...
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Spawning container fedora-1aabc34e0a52a82b on /.#machine.6e49b8aa974c6f37.
Press ^] three times within 1s to kill container.
Outer child is initializing.
Mounting / (MS_REC|MS_SLAVE "")...
...
[ OK ] Finished Update UTMP about System Runlevel Changes.

Fedora 34 (Workstation Edition)
Kernel 5.11.12-300.fc34.x86_64 on an x86_64 (console)

fedora-1aabc34e0a52a82b login: