[systemd-devel] Antw: Re: Antw: [EXT] Re: [systemd‑devel] systemd‑nspawn container not starting on RHEL9.0

2022-08-11 Thread Ulrich Windl
>>> Neal Gompa wrote on 11.08.2022 at 09:22:
> On Thu, Aug 11, 2022 at 3:15 AM Ulrich Windl wrote:
>>
>> >>> Lennart Poettering wrote on 10.08.2022 at 22:09:
>> > On Wed, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:
>> >
>> >> Thank you again Lennart, and thx Kevin.
>> >>
>> >> That makes total sense, and accounts for the application's high level
>> >> start-up delay which appears to be what we are stuck with if we are over
>> >> xfs. Unfortunately, it's difficult to dictate to the client to change their
>> >> fs type, consequently we can't develop / ship a tool with that baseline
>> >> latency on our primary target platform (RHEL xx.)
>> >>
>> >> So the next obvious question would be, is XFS reflink support on the
>> >> systemd-nspawn roadmap or actually, (and even better) has support been
>> >> incorporated already in the latest and greatest src and I'm just behind the
>> >> curve working with the older version of nspawn as shipped in RHEL90?
>> >>
>> >> I'm asking because according to the RHEL 9 docs
>> >> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
>> >> it's the current default fs and is configured for "Reflink-based file
>> >> copies."
>> >
>> > We issue copy_file_range() syscall, which should do reflinks on xfs,
>> > if it supports that. Question is if your kernel supports that too. I
>> > have no experience with xfs though, no idea how xfs hooked up reflink
>> > initially. And we never tested that really. I don't think outside RHEL
>> > many people use xfs.
>>
>> Not true: For SUSE /home is typically using XFS, and we use it with SLES for
>> (huge) database filesystems.
>>
>
> In openSUSE, this hasn't been the default behavior for a while. SLES
> will catch up here eventually.

As it happens, I created some filesystems using "yast2 disk" on SLES15 SP4
(latest updates) today: there the default for any "data" filesystem is (still)
XFS, while the OS itself uses Btrfs.
Agreed, if you don't have a separate filesystem for /home, it'll be a Btrfs
subvolume ("fill one, you fill all"; I don't like the subvolume concept, or
perhaps I haven't understood its benefits).

Regards,
Ulrich

> 
> 
> -- 
> 真実はいつも一つ!/ Always, there's only one truth!





Re: [systemd-devel] Antw: [EXT] Re: [systemd‑devel] systemd‑nspawn container not starting on RHEL9.0

2022-08-11 Thread Neal Gompa
On Thu, Aug 11, 2022 at 3:15 AM Ulrich Windl
 wrote:
>
> >>> Lennart Poettering wrote on 10.08.2022 at 22:09:
> > On Wed, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:
> >
> >> Thank you again Lennart, and thx Kevin.
> >>
> >> That makes total sense, and accounts for the application's high level
> >> start-up delay which appears to be what we are stuck with if we are over
> >> xfs. Unfortunately, it's difficult to dictate to the client to change their
> >> fs type, consequently we can't develop / ship a tool with that baseline
> >> latency on our primary target platform (RHEL xx.)
> >>
> >> So the next obvious question would be, is XFS reflink support on the
> >> systemd-nspawn roadmap or actually, (and even better) has support been
> >> incorporated already in the latest and greatest src and I'm just behind the
> >> curve working with the older version of nspawn as shipped in RHEL90?
> >>
> >> I'm asking because according to the RHEL 9 docs
> >> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
> >> it's the current default fs and is configured for "Reflink-based file
> >> copies."
> >
> > We issue copy_file_range() syscall, which should do reflinks on xfs,
> > if it supports that. Question is if your kernel supports that too. I
> > have no experience with xfs though, no idea how xfs hooked up reflink
> > initially. And we never tested that really. I don't think outside RHEL
> > many people use xfs.
>
> Not true: For SUSE /home is typically using XFS, and we use it with SLES for
> (huge) database filesystems.

In openSUSE, this hasn't been the default behavior for a while. SLES
will catch up here eventually.


-- 
真実はいつも一つ!/ Always, there's only one truth!


[systemd-devel] Antw: [EXT] Re: [systemd‑devel] systemd‑nspawn container not starting on RHEL9.0

2022-08-11 Thread Ulrich Windl
>>> Lennart Poettering wrote on 10.08.2022 at 22:09:
> On Wed, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:
>
>> Thank you again Lennart, and thx Kevin.
>>
>> That makes total sense, and accounts for the application's high level
>> start-up delay which appears to be what we are stuck with if we are over
>> xfs. Unfortunately, it's difficult to dictate to the client to change their
>> fs type, consequently we can't develop / ship a tool with that baseline
>> latency on our primary target platform (RHEL xx.)
>>
>> So the next obvious question would be, is XFS reflink support on the
>> systemd-nspawn roadmap or actually, (and even better) has support been
>> incorporated already in the latest and greatest src and I'm just behind the
>> curve working with the older version of nspawn as shipped in RHEL90?
>>
>> I'm asking because according to the RHEL 9 docs
>> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
>> it's the current default fs and is configured for "Reflink-based file
>> copies."
>
> We issue copy_file_range() syscall, which should do reflinks on xfs,
> if it supports that. Question is if your kernel supports that too. I
> have no experience with xfs though, no idea how xfs hooked up reflink
> initially. And we never tested that really. I don't think outside RHEL
> many people use xfs.

Not true: For SUSE /home is typically using XFS, and we use it with SLES for
(huge) database filesystems.

> 
> If you provide a more complete strace output, you should see the
> copy_file_range() stuff there.
> 
> Lennart
> 
> ‑‑
> Lennart Poettering, Berlin





Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-10 Thread Barry



> On 10 Aug 2022, at 21:10, Lennart Poettering  wrote:
> 
> On Mi, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:
> 
>> Thank you again Lennart, and thx Kevin.
>> 
>> That makes total sense, and accounts for the application's high level
>> start-up delay which appears to be what we are stuck with if we are over
>> xfs. Unfortunately, it's difficult to dictate to the client to change their
>> fs type, consequently we can't develop / ship a tool with that baseline
>> latency on our primary target platform (RHEL xx.)
>> 
>> So the next obvious question would be, is XFS reflink support on the
>> systemd-nspawn roadmap or actually, (and even better) has support been
>> incorporated already in the latest and greatest src and I'm just behind the
>> curve working with the older version of nspawn as shipped in RHEL90?
>> 
>> I'm asking because according to the RHEL 9 docs 
>> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
>> it's the current default fs and is configured for "Reflink-based file
>> copies."
> 
> We issue copy_file_range() syscall, which should do reflinks on xfs,
> if it supports that. Question is if your kernel supports that too. I
> have no experience with xfs though, no idea how xfs hooked up reflink
> initially. And we never tested that really. I don't think outside RHEL
> many people use xfs.

Isn’t XFS the default for Fedora Server?

Barry

> 
> If you provide a more complete strace output, you should see the
> copy_file_range() stuff there.
> 
> Lennart
> 
> --
> Lennart Poettering, Berlin
> 



Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-10 Thread Lennart Poettering
On Mi, 10.08.22 10:13, Thomas Archambault (t...@tparchambault.com) wrote:

> Thank you again Lennart, and thx Kevin.
>
> That makes total sense, and accounts for the application's high level
> start-up delay which appears to be what we are stuck with if we are over
> xfs. Unfortunately, it's difficult to dictate to the client to change their
> fs type, consequently we can't develop / ship a tool with that baseline
> latency on our primary target platform (RHEL xx.)
>
> So the next obvious question would be, is XFS reflink support on the
> systemd-nspawn roadmap or actually, (and even better) has support been
> incorporated already in the latest and greatest src and I'm just behind the
> curve working with the older version of nspawn as shipped in RHEL90?
>
> I'm asking because according to the RHEL 9 docs 
> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
> it's the current default fs and is configured for "Reflink-based file
> copies."

We issue copy_file_range() syscall, which should do reflinks on xfs,
if it supports that. Question is if your kernel supports that too. I
have no experience with xfs though, no idea how xfs hooked up reflink
initially. And we never tested that really. I don't think outside RHEL
many people use xfs.

If you provide a more complete strace output, you should see the
copy_file_range() stuff there.

Lennart

--
Lennart Poettering, Berlin
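
For anyone who wants to see whether copy_file_range() is accepted on a given filesystem, here is a minimal Python sketch. It is not nspawn's code: the helper name copy_in_kernel and the 1 MiB chunk size are made up for illustration, and whether the kernel actually shares extents (reflinks) or copies the data is decided by the filesystem, not by this loop.

```python
import os
import shutil


def copy_in_kernel(src_path, dst_path, chunk=1 << 20):
    """Copy src_path to dst_path, preferring copy_file_range().

    Returns "copy_file_range" or "read/write" to say which path was taken.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Each call copies up to `chunk` bytes at the current file
            # offsets and returns 0 once the source is exhausted.
            while os.copy_file_range(src.fileno(), dst.fileno(), chunk) > 0:
                pass
            return "copy_file_range"
        except (AttributeError, OSError):
            # Syscall unavailable or refused: start over with a plain
            # userspace read/write loop.
            src.seek(0)
            dst.seek(0)
            dst.truncate()
            shutil.copyfileobj(src, dst)
            return "read/write"
```

Running the copy under strace, as suggested above, should then show the copy_file_range() calls and their return values.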


Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-10 Thread Neal Gompa
On Wed, Aug 10, 2022 at 11:16 AM Thomas Archambault
 wrote:
>
> Thank you again Lennart, and thx Kevin.
>
> That makes total sense, and accounts for the application's high level
> start-up delay which appears to be what we are stuck with if we are over
> xfs. Unfortunately, it's difficult to dictate to the client to change
> their fs type, consequently we can't develop / ship a tool with that
> baseline latency on our primary target platform (RHEL xx.)
>
> So the next obvious question would be, is XFS reflink support on the
> systemd-nspawn roadmap or actually, (and even better) has support been
> incorporated already in the latest and greatest src and I'm just behind
> the curve working with the older version of nspawn as shipped in RHEL90?
>
> I'm asking because according to the RHEL 9 docs
> (https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems)
> it's the current default fs and is configured for "Reflink-based file
> copies."
>
> I understand that the project supports many distros, but I'm also
> assuming that RHEL would make the list of platforms that the tool should
> run optimally on. So, Is this worth submitting an enhancement request?
> It's certainly not a bug.
>
> We like the feature set of nspawn and want to keep banging on it;
> Probably pushing it into spaces it was not intended for, but that's
> another story, and our issue...
>

You could also ask Red Hat to consider adding Btrfs to RHEL and give
them your use-case so that they could consider adding it. It's already
used by default in Fedora and many other distributions, so bringing it
back to RHEL at this point would be a matter of customers asking for
it.


-- 
真実はいつも一つ!/ Always, there's only one truth!


Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-10 Thread Thomas Archambault

Thank you again Lennart, and thx Kevin.

That makes total sense, and accounts for the application's high level 
start-up delay which appears to be what we are stuck with if we are over 
xfs. Unfortunately, it's difficult to dictate to the client to change 
their fs type, consequently we can't develop / ship a tool with that 
baseline latency on our primary target platform (RHEL xx.)


So the next obvious question would be, is XFS reflink support on the 
systemd-nspawn roadmap or actually, (and even better) has support been 
incorporated already in the latest and greatest src and I'm just behind 
the curve working with the older version of nspawn as shipped in RHEL90?


I'm asking because according to the RHEL 9 docs 
(https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/managing_file_systems/index#the-xfs-file-system_assembly_overview-of-available-file-systems) 
it's the current default fs and is configured for "Reflink-based file 
copies."


I understand that the project supports many distros, but I'm also 
assuming that RHEL would make the list of platforms that the tool should 
run optimally on. So, Is this worth submitting an enhancement request? 
It's certainly not a bug.


We like the feature set of nspawn and want to keep banging on it; 
Probably pushing it into spaces it was not intended for, but that's 
another story, and our issue...


Regards,

-Tom

On 8/10/22 04:47, Lennart Poettering wrote:

> That's the btrfs subvolume ioctl. It's expected to fail on non-btrfs
> with ENOTTY, and given you have xfs this is behaving as it should.
>
> It then starts copying things manually, which is slow. i.e. it's then
> basically doing what "cp -a" does.
>
> Lennart
>
> --
> Lennart Poettering, Berlin



Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-10 Thread Lennart Poettering
On Di, 09.08.22 12:40, Thomas Archambault (t...@tparchambault.com) wrote:

> Thank you Lennart for the follow-up.
>
> There does appear to be mostly filesystem operations prior to my manually
> killing nspawn as you suggested. I only let it run about 3 minutes prior to
> sending a signal given that the strace output = ~25M.
>
> One obvious issue is the non-zero return from an ioctl call with the
> BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my
> RHEL9.0 strace capture; this is occurring right after the initial blast of
> debug log messages. I'm trying to get a stack trace for that error
> currently.
>
>
> 410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0,
> name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for
> device)

That's the btrfs subvolume ioctl. It's expected to fail on non-btrfs
with ENOTTY, and given you have xfs this is behaving as it should.

It then starts copying things manually, which is slow. i.e. it's then
basically doing what "cp -a" does.

Lennart

--
Lennart Poettering, Berlin
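
The "try the subvolume ioctl, fall back to a plain directory" pattern visible in the trace above (ioctl fails with ENOTTY, then mkdir succeeds) can be sketched in Python roughly as follows. This is an illustration only, not nspawn's implementation: make_subvol_or_dir is a hypothetical name, the ioctl number is the x86/arm encoding of BTRFS_IOC_SUBVOL_CREATE and is an assumption elsewhere, and any ioctl failure is treated as "not btrfs".

```python
import fcntl
import os
import struct

# _IOW(0x94, 14, struct btrfs_ioctl_vol_args) from <linux/btrfs.h>; this is
# the x86/arm encoding and is an assumption on other architectures.
BTRFS_IOC_SUBVOL_CREATE = 0x5000940E


def make_subvol_or_dir(parent, name):
    """Create parent/name as a btrfs subvolume, else as a plain directory.

    Returns "subvolume" or "directory".
    """
    # struct btrfs_ioctl_vol_args { __s64 fd; char name[4088]; }
    args = bytearray(struct.pack("=q4088s", 0, os.fsencode(name)))
    dir_fd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
    try:
        fcntl.ioctl(dir_fd, BTRFS_IOC_SUBVOL_CREATE, args)
        return "subvolume"
    except OSError:
        # The ioctl was rejected (ENOTTY on xfs in the trace above); this
        # sketch falls back to an ordinary directory on any such failure.
        os.mkdir(os.path.join(parent, name), 0o755)
        return "directory"
    finally:
        os.close(dir_fd)
```

On a non-btrfs filesystem the result is an empty directory that still has to be filled file by file, which is where the time goes.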


Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-09 Thread Kevin P. Fleming
On Tue, Aug 9, 2022 at 12:43 PM Thomas Archambault
 wrote:
> One obvious issue is the non-zero return from an ioctl call with the
> BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my
> RHEL9.0 strace capture; this is occurring right after the initial blast
> of debug log messages. I'm trying to get a stack trace for that error
> currently.
>
>
> 410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0,
> name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for
> device)

Since you are using XFS and not Btrfs this seems like an expected result.


Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-09 Thread Thomas Archambault

Thank you Lennart for the follow-up.

There does appear to be mostly filesystem operations prior to my 
manually killing nspawn as you suggested. I only let it run about 3 
minutes prior to sending a signal given that the strace output = ~25M.


One obvious issue is the non-zero return from an ioctl call with the 
BTRFS_IOC_SUBVOL_CREATE arg at line 410, in the snippet below from my 
RHEL9.0 strace capture; this is occurring right after the initial blast 
of debug log messages. I'm trying to get a stack trace for that error 
currently.



410-2064 ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0, 
name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for 
device)


...

Setting RLIMIT_SIGPENDING to 14657.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy

...

With the last line above generated from line 395's writev below. 
Unfortunately, I believe I left off the '-s 500' arg to strace. I can 
run things again if that's a help.


toma@toma-MacBookPro:20220808$ grep -nA25 cgroup2 nspawn.rhel90.boot.strace.out
395:2064  writev(2, [{iov_base="Found cgroup2 on /sys/fs/cgroup/"..., iov_len=56}, {iov_base="\n", iov_len=1}], 2) = 57
396-2064  rt_sigaction(SIGPIPE, {sa_handler=SIG_IGN, sa_mask=[], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f0298fb3db0}, NULL, 8) = 0
397-2064  umask(022)    = 022
398-2064  openat(AT_FDCWD, "/", O_RDONLY|O_CLOEXEC|O_PATH|O_DIRECTORY) = 3
399-2064  close(3)   = 0
400-2064  getrandom("\x8c\x75\xd8\x95\x8f\x01\x7b\xd3", 8, GRND_NONBLOCK|GRND_INSECURE) = 8
401-2064  newfstatat(AT_FDCWD, "/.#machine.c8578d59f810b73d", 0x7ffef92612a0, 0) = -1 ENOENT (No such file or directory)
402-2064  openat(AT_FDCWD, "/.#.#machine.c8578d59f810b73d.lck", O_RDWR|O_CREAT|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC, 0600) = 3
403-2064  fcntl(3, F_OFD_SETLK, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=0}) = 0
404-2064  newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0
405-2064  rt_sigprocmask(SIG_BLOCK, [INT], [], 8) = 0
406-2064  openat(AT_FDCWD, "/", O_RDONLY|O_NOCTTY|O_CLOEXEC|O_DIRECTORY) = 4
407-2064  newfstatat(4, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
408-2064  openat(AT_FDCWD, "/", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 5
409-2064  fcntl(5, F_GETFL)  = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
410-2064  ioctl(5, BTRFS_IOC_SUBVOL_CREATE, {fd=0, name=".#machine.c8578d59f810b73d"}) = -1 ENOTTY (Inappropriate ioctl for device)
411-2064  close(5)   = 0
412-2064  mkdir("/.#machine.c8578d59f810b73d", 0755) = 0
413-2064  newfstatat(4, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
414-2064  fcntl(4, F_DUPFD_CLOEXEC, 3)   = 5
415-2064  getrandom("\x77\x72\x24\xdb\xb2\xcf\x6e\x46", 8, GRND_NONBLOCK|GRND_INSECURE) = 8
416-2064  newfstatat(5, "", {st_mode=S_IFDIR|0555, st_size=4096, ...}, AT_EMPTY_PATH) = 0
417-2064  fcntl(5, F_GETFL)  = 0x18000 (flags O_RDONLY|O_LARGEFILE|O_DIRECTORY)
418-2064  fcntl(5, F_SETFD, FD_CLOEXEC)  = 0
419-2064  openat(AT_FDCWD, "/.#machine.c8578d59f810b73d", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = 6
420-2064  getdents64(6, 0x7ffef9260cc0 /* 2 entries */, 840) = 48
toma@toma-MacBookPro:20220808$

That failure leads to many repeated filesystem operations for each 
resource, similar to the following except with differing file paths. As 
you suggested that's the reason for the delay in spawning the container.


2297  newfstatat(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", {st_mode=S_IFDIR|0700, st_size=17, ...}, AT_SYMLINK_NOFOLLOW) = 0
2297  statx(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", AT_STATX_SYNC_AS_STAT|AT_SYMLINK_NOFOLLOW|AT_NO_AUTOMOUNT, STATX_TYPE, {stx_mask=STATX_BASIC_STATS|STATX_MNT_ID, stx_attributes=0, stx_mode=S_IFDIR|0700, stx_size=17, ...}) = 0
2297  openat(7, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 9
2765d331699258-switcheroo-control.service-ZzjWeB>
2297  newfstatat(9, "", {st_mode=S_IFDIR|0700, st_size=17, ...}, AT_EMPTY_PATH) = 0
2297  fcntl(9, F_GETFL) = 0x38000 (flags O_RDONLY|O_LARGEFILE|O_NOFOLLOW|O_DIRECTORY)
2297  fcntl(9, F_SETFD, FD_CLOEXEC) = 0
2297  openat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_CLOEXEC|O_DIRECTORY) = -1 ENOENT (No such file or directory)
2297  mkdirat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", 0700) = 0
2297  openat(8, "systemd-private-ceff107148c24952bc2765d331699258-switcheroo-control.service-ZzjWeB", O_RDONLY|O_NOCTTY|O_NOFOLLOW|O_CLOEXEC|O_DIRECTORY) = 10

Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-04 Thread Lennart Poettering
On Do, 04.08.22 13:30, Thomas Archambault (t...@tparchambault.com) wrote:

> Following up on xfs and reflinks, it appears they are enabled on my
> out-of-box RHEL9.0. Fwiw, this is a VBox VM; however, so is the FC34 system,
> which works correctly but is using btrfs.
>
> As always, appreciate any help/references.

Try straceing nspawn, to see what it does.

strace -f -y -s 500 -o /tmp/nspawnstrace.log systemd-nspawn …

Then look at the generated log and see what it is busy doing... If unsure,
paste things somewhere.

Lennart

--
Lennart Poettering, Berlin
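
A log like that quickly grows to tens of megabytes, so a per-syscall tally is a convenient first look at what nspawn is busy doing. A small Python sketch, assuming the "PID  syscall(...)" line layout that strace -f -o produces; the function name count_syscalls is made up for this example and the sample lines are taken from the trace posted in this thread.

```python
import re
from collections import Counter

# A line written by "strace -f -o FILE" starts with the PID, then the call.
_CALL = re.compile(r"^\d+\s+(\w+)\(")


def count_syscalls(lines):
    """Return a Counter mapping syscall name to number of occurrences."""
    counts = Counter()
    for line in lines:
        match = _CALL.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    '2064  close(5)   = 0',
    '2064  mkdir("/.#machine.c8578d59f810b73d", 0755) = 0',
    '2064  fcntl(4, F_DUPFD_CLOEXEC, 3)   = 5',
    '2064  fcntl(5, F_SETFD, FD_CLOEXEC)  = 0',
]
print(count_syscalls(sample).most_common(1))  # prints [('fcntl', 2)]
```

For a real capture, pass the open log file instead of the sample list, e.g. count_syscalls(open("/tmp/nspawnstrace.log")). A tally dominated by openat/newfstatat/mkdirat would point at a file-by-file copy.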


Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-04 Thread Thomas Archambault
Following up on xfs and reflinks, it appears they are enabled on my 
out-of-box RHEL9.0. Fwiw, this is a VBox VM; however, so is the FC34 
system, which works correctly but is using btrfs.


As always, appreciate any help/references.

TIA

-Tom

[toma@localhost ~]$ xfs_info /
meta-data=/dev/mapper/rhel-root  isize=512    agcount=4, agsize=4185600 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1    bigtime=1 inobtcount=1
data     =                       bsize=4096   blocks=16742400, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=8175, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
[toma@localhost ~]$
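
If the same check has to run on many client machines, the reflink flag can be pulled out of the xfs_info text with a few lines of Python. This is only a sketch: reflink_enabled is a hypothetical helper, and it assumes the "reflink=0/1" field appears in the output as it does above (older xfsprogs may not print it at all, in which case None is returned).

```python
import re


def reflink_enabled(xfs_info_output):
    """Return True/False for reflink=1/0, or None if the field is absent."""
    match = re.search(r"\breflink=(\d)", xfs_info_output)
    if match is None:
        return None
    return match.group(1) == "1"
```

Fed the xfs_info output above, this returns True. Note that it only says the filesystem was created with reflink support, not that a given tool actually uses it.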



 Forwarded Message 
Subject:Re: systemd-devel Digest, Vol 148, Issue 2
Date:   Thu, 4 Aug 2022 11:22:32 -0400
From:   Thomas Archambault 
Reply-To:   t...@tparchambault.com
To: systemd-devel-requ...@lists.freedesktop.org



Thank you Lennart. Very much appreciate the quick and clear response.

You're absolutely correct about the btrfs/xfs difference between the 
working FC34 system and the problematic RHEL9.0 system:


/dev/mapper/rhel-root on / type xfs (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,noquota)

My strace output did indicate that there is copying going on, but I did 
not know whether that was a problem. Obviously it can be, in terms 
of start-up time and UX, with xfs.


- Tom
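
To answer the "what backing fs do you use?" question from a script rather than by eye, the filesystem type can be looked up from the mount table. A minimal Python sketch, assuming the whitespace-separated "device mountpoint fstype options ..." layout of /proc/self/mounts (not the "on / type xfs" layout that mount prints); fs_type_for is a made-up name and the function simply picks the longest mount point containing the path.

```python
def fs_type_for(path, mounts_text):
    """Return the fstype of the longest mount point that contains path."""
    best_point, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        point, fstype = fields[1], fields[2]
        # "/home" must match "/home" and "/home/x" but not "/homework".
        prefix = point.rstrip("/") + "/"
        inside = path == point or path.startswith(prefix)
        if inside and len(point) > len(best_point):
            best_point, best_type = point, fstype
    return best_type
```

Typical use would be fs_type_for("/", open("/proc/self/mounts").read()); anything other than "btrfs" means the ephemeral copy cannot be a subvolume snapshot.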


On 8/4/22 08:00, systemd-devel-requ...@lists.freedesktop.org wrote:

Message: 1
Date: Wed, 3 Aug 2022 15:40:21 -0400
From: Thomas Archambault 
To: systemd-devel@lists.freedesktop.org
Subject: [systemd-devel] systemd-nspawn container not starting on
RHEL9.0
Message-ID: <2d4567ae-f0e5-9e6a-10fe-9592498c6...@tparchambault.com>
Content-Type: text/plain; charset="utf-8"; Format="flowed"

Good day everyone on the dev list,
We are adding an analysis tool to our application that uses the host's
rootfs as one of its inputs.

As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
isolated container environment using the host's rootfs as the
container's rootfs and things worked correctly and as expected. The
host's rootfs is analyzed with tmp and results files generated within
the container without persistent modifications affecting the host's
rootfs. Since RHEL is our ultimate target platform, I've been trying to
duplicate our work over RHEL9.0 without success with the container not
being instantiated.

I've tried to boil down the duplication code to the simplest example,
which is also an example in the man page $ sudo systemd-nspawn -xbD/. As
with my prototyping, the container does not seem to be instantiated.
Any help with troubleshooting, or specific known issues, or requests for
more data would be appreciated.

TIA
tparchambault
ps: Regarding security - selinux is in Permissive mode. I do not know if
seccomp filters are getting in the way or not; This is an out-of-the-box
RHEL9.0 base workstation install. In the FC34 prototype, I did need to
allow certain syscalls via --system-call-filter in order to get a daemon
within the container to run correctly but afaik that should have no
bearing on the instantiation of the container.


 On a RHEL9.0 host bash session 

[toma@localhost ~]$ systemctl --version
systemd 250 (250-6.el9_0)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS
+OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD
+LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4
+XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT
default-hierarchy=unified

[toma@localhost ~]$ uname -a
Linux localhost.localdomain 5.14.0-70.17.1.el9_0.x86_64 #1 SMP PREEMPT
Tue Jun 14 11:32:10 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux
[toma@localhost ~]$

[toma@localhost ~]$ sudo time systemd-nspawn -D / -xb
^C^C^C^C^CCommand terminated by signal 15
40.81use

Re: [systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-04 Thread Lennart Poettering
On Mi, 03.08.22 15:40, Thomas Archambault (t...@tparchambault.com) wrote:

> Good day everyone on the dev list,
> We are adding an analysis tool to our application that uses the host's
> rootfs as one of its inputs.
>
> As a proof of concept, we used systemd-nspawn on Fedora 34 to create an
> isolated container environment using the host's rootfs as the container's
> rootfs and things worked correctly and as expected. The host's rootfs is
> analyzed with tmp and results files generated within the container without
> persistent modifications affecting the host's rootfs. Since RHEL is our
> ultimate target platform, I've been trying to duplicate our work over
> RHEL9.0 without success with the container not being instantiated.
>
> I've tried to boil down the duplication code to the simplest example, which
> is also an example in the man page $ sudo systemd-nspawn -xbD/. As with my
> prototyping, the container does not seem to be instantiated.
> Any help with troubleshooting, or specific known issues, or requests for
> more data would be appreciated.

"-x" is ephemeral mode. This means nspawn will make a copy of the OS
tree before booting into it, and remove it afterwards.

"-x" on btrfs is very fast and space efficient, because btrfs supports
both snapshots and reflinks. nspawn will make a subvol snapshot if the
root you specify is a subvol. It will make reflink-based file copies
otherwise.

Other file systems have a more 1990's feature set, i.e. no reflinks
nor snapshots. (modern xfs on very new kernels can support reflinks if
this is opt-in'ed to.) In that case we have to copy the data files
with their contents, and that's slow.

Hence, what backing fs do you use?

if you use non-btrfs it might hence simply be that we are busy
individually copying all files...

Lennart

--
Lennart Poettering, Berlin
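
The per-file part of that fallback (share blocks if the filesystem can, otherwise copy the contents) can be tried out with a short Python sketch. It is not nspawn's code: clone_or_copy is a hypothetical helper, and the FICLONE value is hard-coded from <linux/fs.h> for x86/arm and is an assumption on other architectures.

```python
import fcntl
import shutil

# FICLONE is _IOW(0x94, 9, int) in <linux/fs.h>; this value holds on x86 and
# arm, other architectures may encode it differently.
FICLONE = 0x40049409


def clone_or_copy(src_path, dst_path):
    """Copy src_path to dst_path, sharing blocks when the filesystem allows.

    Returns "reflink" if the clone ioctl succeeded, "copy" otherwise.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # Ask the kernel to make dst share src's extents (a reflink).
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return "reflink"
        except OSError:
            # No reflink support here (or the files are on different
            # filesystems): copy every byte, much like "cp -a" would.
            shutil.copyfileobj(src, dst)
            return "copy"
```

Calling it on two paths inside the filesystem in question and checking the return value gives a quick indication of whether an ephemeral copy there could be cheap.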


[systemd-devel] systemd-nspawn container not starting on RHEL9.0

2022-08-03 Thread Thomas Archambault

Good day everyone on the dev list,
We are adding an analysis tool to our application that uses the host's 
rootfs as one of its inputs.


As a proof of concept, we used systemd-nspawn on Fedora 34 to create an 
isolated container environment using the host's rootfs as the 
container's rootfs and things worked correctly and as expected. The 
host's rootfs is analyzed with tmp and results files generated within 
the container without persistent modifications affecting the host's 
rootfs. Since RHEL is our ultimate target platform, I've been trying to 
duplicate our work over RHEL9.0 without success with the container not 
being instantiated.


I've tried to boil down the duplication code to the simplest example, 
which is also an example in the man page $ sudo systemd-nspawn -xbD/. As 
with my prototyping, the container does not seem to be instantiated.
Any help with troubleshooting, or specific known issues, or requests for 
more data would be appreciated.


TIA
tparchambault
ps: Regarding security - selinux is in Permissive mode. I do not know if 
seccomp filters are getting in the way or not; This is an out-of-the-box 
RHEL9.0 base workstation install. In the FC34 prototype, I did need to 
allow certain syscalls via --system-call-filter in order to get a daemon 
within the container to run correctly but afaik that should have no 
bearing on the instantiation of the container.



 On a RHEL9.0 host bash session 

[toma@localhost ~]$ systemctl --version
systemd 250 (250-6.el9_0)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT +GNUTLS 
+OPENSSL +ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN -IPTC +KMOD 
+LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 
+XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP +SYSVINIT 
default-hierarchy=unified


[toma@localhost ~]$ uname -a
Linux localhost.localdomain 5.14.0-70.17.1.el9_0.x86_64 #1 SMP PREEMPT 
Tue Jun 14 11:32:10 EDT 2022 x86_64 x86_64 x86_64 GNU/Linux

[toma@localhost ~]$

[toma@localhost ~]$ sudo time systemd-nspawn -D / -xb
^C^C^C^C^CCommand terminated by signal 15
40.81user 298.75system 6:29.72elapsed 87%CPU (0avgtext+0avgdata 8524maxresident)k
205032inputs+0outputs (0major+3287minor)pagefaults 0swaps
[toma@localhost ~]$

 In another bash session on the same host 
[toma@localhost ~]$ sudo machinectl list
[sudo] password for toma:
No machines.
[toma@localhost ~]$ sudo pkill nspawn
[toma@localhost ~]$

== In the original host bash session, w/increased logging and strace capture ==

[toma@localhost ~]$ sudo SYSTEMD_LOG_LEVEL=debug strace -o Development/nspawn.strace.rhel90.out systemd-nspawn -D / -xb

[sudo] password for toma:
Setting RLIMIT_CPU to infinity.
Setting RLIMIT_FSIZE to infinity.
Setting RLIMIT_DATA to infinity.
Setting RLIMIT_STACK to 8388608:infinity.
Setting RLIMIT_CORE to 0:infinity.
Setting RLIMIT_RSS to infinity.
Setting RLIMIT_NPROC to 14657.
Setting RLIMIT_NOFILE to 1024:524288.
Setting RLIMIT_MEMLOCK to 65536.
Setting RLIMIT_AS to infinity.
Setting RLIMIT_LOCKS to infinity.
Setting RLIMIT_SIGPENDING to 14657.
Setting RLIMIT_MSGQUEUE to 819200.
Setting RLIMIT_NICE to 0.
Setting RLIMIT_RTPRIO to 0.
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Terminated
[toma@localhost ~]$

As with the first run, killed via pkill from the other terminal session.

Fwiw, on Fedora 34, the log debug output shows the instantiation of the
container after the "Found cgroup2..." line, with the container working as
documented eventually presenting the login prompt, i.e.

...
Setting RLIMIT_RTTIME to infinity.
Found cgroup2 on /sys/fs/cgroup/, full unified hierarchy
Spawning container fedora-1aabc34e0a52a82b on /.#machine.6e49b8aa974c6f37.
Press ^] three times within 1s to kill container.
Outer child is initializing.
Mounting / (MS_REC|MS_SLAVE "")...
...

[  OK  ] Finished Update UTMP about System Runlevel Changes.

Fedora 34 (Workstation Edition)
Kernel 5.11.12-300.fc34.x86_64 on an x86_64 (console)

fedora-1aabc34e0a52a82b login: