Re: [systemd-devel] [EXT] Question about timestamps in the USER_RECORD spec

2021-10-28 Thread Lennart Poettering
On Do, 28.10.21 11:46, Arian van Putten (arian.vanput...@gmail.com) wrote:

> Indeed it mentions it; but after careful reading there is no normative
> suggestion to actually adhere to it. (no SHOULD and definitely not a MUST,
> not even a RECOMMENDED).
>
> They just say that to increase interoperability no more than 53 bits of
> integer precision should be assumed without making a clear normative
> decision about it.   The only normative part in that section is that
> numbers consist of an integer part and a fractional part.
>
> They also say that implementations are allowed to set any limits on the
> range and precision of numbers accepted.
>
> So yeah Lennart seems to be technically correct. Even when reading the RFC
> by the letter.

BTW:

https://github.com/systemd/systemd/pull/21168

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Question about timestamps in the USER_RECORD spec

2021-10-26 Thread Lennart Poettering
On Di, 26.10.21 10:41, Arian van Putten (arian.vanput...@gmail.com) wrote:

> Hey list,
>
> I'm reading the https://systemd.io/USER_RECORD/ spec and I have a question
>
> There are some fields in the USER_RECORD spec which are described as
> "unsigned 64 bit integer values".   Specifically the fields describing
> time.
>
> However JSON lacks integers and only has doubles [0]; which would mean 53
> bit integer precision is about the maximum we can reach.

The spec itself doesn't really mandate this is implemented in
double, the spec just says "sticking to doubles would be nice".

Actual implementations implement this differently IRL. Python based
implementations have arbitrary precision for this. sd-bus uses
uint64_t, or int64_t or long double. It prefers the integer types if
the value fits, and uses the floating point type otherwise. json-glib
uses int64_t or double.

There are plenty of specs that rely on 64-bit integers working with
full range (OCI for example, for much of its resource management stuff).
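To make the difference between parser implementations concrete, here is a minimal Python sketch. It assumes nothing beyond the standard json module; the field name lastChangeUSec is used only as an example of a 64-bit timestamp field, and parse_int=float is merely a stand-in for a parser that stores every number as a double.

```python
import json

UINT64_MAX = 2**64 - 1
text = json.dumps({"lastChangeUSec": UINT64_MAX})

# Python's json module turns integer literals into arbitrary-precision ints.
exact = json.loads(text)["lastChangeUSec"]

# parse_int=float imitates a parser that keeps every number in a double.
lossy = json.loads(text, parse_int=float)["lastChangeUSec"]

print(exact == UINT64_MAX)       # does the full range survive?
print(int(lossy) == UINT64_MAX)  # does it survive a trip through a double?
print(float(2**53) == float(2**53 + 1))  # doubles stop telling integers apart here
```

The same document is read losslessly by one parser and rounded by the other, which is the sense in which the limitation lives in the implementation rather than in the format.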

> It's unclear to me from the spec whether I should use doubles to
> encode these fields or use strings.  Would it be possible to further
> clarify it?  If it is indeed a number literal; this means the
> maximum date we can encode is 9007199254740991 which corresponds to
> Tuesday, June 5, 2255. This honestly is too soon in the future for
> my comfort.

You appear to plan for quite a long life ;-)

Frankly, this is not a problem specific to user records. A multitude
of JSON formats tend to store dates this way. The overflow is still
200y out. I think that leaves plenty of time to teach implementations
full 64bit support, and I am pretty sure that the ones that will deal
with user records will catch up sooner or later.
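The date quoted above can be re-derived with a few lines of Python. This is only a sketch of the arithmetic, assuming the timestamp counts microseconds since the UNIX epoch in UTC.

```python
from datetime import datetime, timedelta, timezone

# Largest integer a double represents exactly together with all smaller ones.
MAX_SAFE = 2**53 - 1  # 9007199254740991

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Interpret MAX_SAFE as microseconds since the epoch.
limit = EPOCH + timedelta(microseconds=MAX_SAFE)
print(limit.isoformat())
```

A double-only parser therefore represents every microsecond timestamp exactly until that point in 2255, and only starts rounding afterwards.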

> I suggest encoding 64
> bit integers as string literals instead to avoid the truncation
> problem.

I am sorry, but I am not convinced this is a pressing issue. I value
cleanliness and obviousness a lot more than theoretical issues that
might happen 200 years from now. In particular as they are issues that
can be dealt with in offending JSON implementations, and limitations
in the parser implementations shouldn't really leak into the spec, I think.

I mean, at this point it isn't even clear humanity will survive that
long, and I seriously doubt that systemd is the one project that
survives humanity.

What might make sense is to add a comment about the whole situation to
the spec and be done with it:

 "Please note that this specification assumes that JSON numbers may
 cover the full integer range of -2^63 … 2^64-1 without loss of
 accuracy (i.e. INT64_MIN … UINT64_MAX). Please read, write and
 process user records following this specification only with JSON
 implementations that guarantee this range."

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] A questions about modules-load service in systemd

2021-10-25 Thread Lennart Poettering
On Sa, 23.10.21 02:27, Joakim Zhang (qiangqing.zh...@nxp.com) wrote:

> > It doesn't do that actually. But udev when it loads kernel modules
> > does things from a bunch of worker processes all in parallel.
>
> Ok, is there a way to disable these parallel tasks in the systemd-udev
> service?

udev.children_max=1 on the kernel command line.
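To check whether that option is actually present, the kernel command line can be parsed. A small sketch follows; the helper name parse_cmdline and the sample string are made up for illustration, and on a running system the input would be the contents of /proc/cmdline.

```python
import shlex

def parse_cmdline(cmdline):
    """Split a kernel command line into a {key: value} dict (value None for bare flags)."""
    params = {}
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params

# Sample input; on a real machine use open("/proc/cmdline").read() instead.
sample = "root=/dev/sda2 ro quiet udev.children_max=1"
params = parse_cmdline(sample)
print(params.get("udev.children_max"))
```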

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] loose thoughts around portable services

2021-10-22 Thread Lennart Poettering
On Mi, 20.10.21 16:01, Umut Tezduyar Lindskog (u...@tezduyar.com) wrote:

> > That said: systemd's nss-systemd NSS module can nowadays (v249) read
> > user definitions from drop-in JSON fragments in
> > /run/host/userdb/. This is used by nspawn's --bind-user= feature to
> > make a host user easily available in a container, with group info,
> > password and so on. My plan was to also make use of this in the unit
> > executor, i.e. so that whenever RootDirectory=/RootImage= are used the
> > service manager places such minimal user info for the selected user
> > there, so that the user is perfectly resolvable inside the service
> > too. This is particularly relevant for DynamicUser=1 services. I
> > haven't gotten around to actually implementing that though. Given
> > nss-systemd is enabled in most bigger distros' nsswitch.conf file
> > these days I think this is a really nice approach to propagate user
> > databases like that.
> >
>
> Why don't we also make the varlink user API available to most of the
> profiles? This way sandboxed service doesn't need any of the nss conf and
> libraries if they don't want to. Most profiles allow dbus communication. I
> guess in a similar thought, most system services should be able to do a
> user lookup in a modern way.

I sympathize with the idea, but I am not entirely sure it is
desirable to do this 1:1, as this means we'd leak a ton of stuff that
might only make sense on the host into something that is supposed to
be an isolated container, i.e. home dir info, shell paths and things
like that.

Maybe we can find a middle ground on this though. i.e. we could make
systemd-userdb.service listen on a new varlink service socket that
provides the host's database to sandboxed environments in a restricted
form, i.e. with basically all records dumbed down to just contain
uid/gid/name info and nothing else.
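What "dumbed down" could mean for a single record is sketched below. This is an illustration only: the function name and the sample record are hypothetical, and the field names (userName, uid, gid, homeDirectory, shell) are taken from the JSON user record format.

```python
# Fields a restricted view would keep; everything else is host-specific.
ALLOWED_FIELDS = ("userName", "uid", "gid")

def restrict_record(record):
    """Return a copy of a user record reduced to name/uid/gid."""
    return {key: record[key] for key in ALLOWED_FIELDS if key in record}

# Hypothetical host record, for illustration.
host_record = {
    "userName": "alice",
    "uid": 1000,
    "gid": 1000,
    "homeDirectory": "/home/alice",
    "shell": "/bin/zsh",
}
restricted = restrict_record(host_record)
print(restricted)
```

An allow-list is used rather than a deny-list so that fields added to the record format later stay hidden by default.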

We'd then update the portabled profiles that do not use PrivateUsers=
to bind mount that one socket, so that they get the full db,
dynamically.

I kinda like the idea.

> We could implement our own profiles without needing nesting but we believe
> it is beneficial to collaborate on profiles upstream and have common
> additions to upstream profiles with nesting other profiles. If we get to it
> before other people, we would really like to contribute and send a patch on
> this.

A patch adding .d/ style drop-ins for profiles would make a ton of
sense. Happy to take that.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] A questions about modules-load service in systemd

2021-10-22 Thread Lennart Poettering
On Fr, 22.10.21 10:31, Joakim Zhang (qiangqing.zh...@nxp.com) wrote:

>
> Hi systemd experts,
>
> I saw that you have contributed a lot to the modules-load part recently.
> I have a question; any insight would be appreciated. Thanks in advance.
>
> Do you know how to load all modules in a single task? In other
> words, load all modules within a single task, as I want them processed
> sequentially.

Are you sure you mean "systemd-modules-load"? Most module loading
happens via udev, not systemd-modules-load. That service is only
required for a few select modules that do not support auto-loading.

udev loads all modules as the hw they are for shows up. And no there's
no way to make that sequential.

Why do you need this? For debugging purposes? To work around a broken driver?

> If I understand correctly, systemd-modules-load service now will
> fork many tasks to process different kernel modules in parallel.

It doesn't do that actually. But udev when it loads kernel modules
does things from a bunch of worker processes all in parallel.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] loose thoughts around portable services

2021-10-18 Thread Lennart Poettering
On Mi, 13.10.21 13:38, Umut Tezduyar Lindskog (umut.tezdu...@axis.com) wrote:

> Hi, we have been playing around more with the portable services and
> lots of loose thoughts came up. Hopefully we can initiate
> discussions.
>
> The PrivateUsers and DynamicUser settings are turned off for the trusted
> profile in portable services but none of the passwd/group and nss
> files are mapped to the sandbox by default, essentially preventing
> the sandbox from doing a user lookup. Is this a use case that should be
> offered by the “trusted” profile or should this be handled by the
> services that would like to do a lookup?

The "trusted" profile basically means you dealt with that
synchronization yourself in some way.

That said: systemd's nss-systemd NSS module can nowadays (v249) read
user definitions from drop-in JSON fragments in
/run/host/userdb/. This is used by nspawn's --bind-user= feature to
make a host user easily available in a container, with group info,
password and so on. My plan was to also make use of this in the unit
executor, i.e. so that whenever RootDirectory=/RootImage= are used the
service manager places such minimal user info for the selected user
there, so that the user is perfectly resolvable inside the service
too. This is particularly relevant for DynamicUser=1 services. I
haven't gotten around to actually implementing that though. Given
nss-systemd is enabled in most bigger distros' nsswitch.conf file
these days I think this is a really nice approach to propagate user
databases like that.

> Is there a way to have PrivateUsers=yes and map more host users to
> the sandbox? We have dynamic, uid based authorization on dbus
> methods. Upon receiving a method call, the server checks the sender uid
> against a set of rule files.

I guess we could add BindUser= or so, which could control the
/run/host/userdb/ propagation I proposed above.

> Would it benefit others if the “profile” support was moved out of
> the portable services and be part of the unit files? For example
> part of the [Install] section.

Right now profiles are a concept of portabled, not of the service
manager. There's a github issue somewhere where people asked us to
make this generically usable from services too, so I guess you are not
the only one who'd like something like that.

> Has there been any thought about nesting profiles? Example, one
> profile can include other profiles in it.

File an RFE issue. I guess we could support that: for any profile x
we'd implicitly also pull in x.d/*.conf, or so.
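A rough sketch of what such a lookup could look like, assuming a profile is a single file x with an optional directory x.d next to it. None of this exists in portabled; the layout, the function name and the file names are hypothetical.

```python
import tempfile
from pathlib import Path

def profile_fragments(profile):
    """Return the profile file followed by its x.d/*.conf drop-ins in name order."""
    dropin_dir = profile.with_name(profile.name + ".d")
    return [profile, *sorted(dropin_dir.glob("*.conf"))]

# Build a throwaway layout to demonstrate the ordering.
with tempfile.TemporaryDirectory() as tmp:
    profile = Path(tmp) / "trusted"
    profile.write_text("[Service]\n")
    (Path(tmp) / "trusted.d").mkdir()
    for name in ("20-extra.conf", "10-local.conf"):
        (Path(tmp) / "trusted.d" / name).write_text("[Service]\n")
    fragment_names = [p.name for p in profile_fragments(profile)]

print(fragment_names)
```

Sorting by file name mirrors how .d/ drop-ins are ordered elsewhere, so later fragments can override earlier ones.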

> systemd-analyze security is great! We believe it would be easier to
> audit if we had a way to compare a service file’s sandboxing
> directives against a profile and find the delta. Then score the
> service file against the delta.

Interesting idea.

Current git has all kinds of JSON hookup for systemd-analyze security
btw, so tools could do that externally too. But you are right, doing
this implicitly might indeed make sense. Please file an RFE issue on
github.
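An external tool computing such a delta could start from something like the sketch below. It is deliberately naive: the tiny parser ignores sections and repeated keys, and it does not use the JSON output of systemd-analyze at all. The function names and sample inputs are made up for illustration.

```python
def parse_directives(text):
    """Collect Key=Value lines, skipping comments and section headers."""
    directives = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            directives[key.strip()] = value.strip()
    return directives

def delta(service, profile):
    """Map each directive where the service deviates from the profile to (service, profile)."""
    return {key: (service.get(key), want)
            for key, want in profile.items() if service.get(key) != want}

profile = parse_directives("[Service]\nPrivateTmp=yes\nProtectSystem=strict\n")
service = parse_directives("[Service]\nPrivateTmp=yes\nProtectSystem=full\n")
print(delta(service, profile))
```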

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] [systemd‑devel] Removing bold fonts from boot messages

2021-10-14 Thread Lennart Poettering
On Mi, 13.10.21 18:29, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Ulrich Windl wrote:
>
> > Stupid question: If you see bold face at the end of the serial line,
> > wouldn't changing the terminal type ($TERM) do?
> > Maybe construct your own terminal capabilities.
>
> I'd need a TERM that has colors but disallows bold fonts. For some
> reason I wasn't even able to construct a terminfo that would disallow
> colors when using that $TERM inside xterm (and starting a new bash).
> It seems that xterm always has certain capabilities, i.e. "ls --color"
> is always showing colors in xterm, also with TERM=xterm-mono and
> everything else I tried.
>
> Anyway, setting a translation to bind "allow-bold-fonts(toggle)"
> to a key in xterm resources allows blocking bold fonts whenever
> watching systemd boot messages via ipmi or AMT in an xterm...

Note that systemd doesn't care about terminfo/termcap or anything like
that. We only support exactly three types of terminals:

1. TERM=dumb → you get no ANSI sequences, no fancy emojis or other
   non-ASCII unicode chars, no clickable links.

2. TERM=linux → you do get ANSI sequences, but no fancy emojis, but
   some simpler known-safe unicode chars (TERM=linux is the Linux
   console/VT subsystem), no clickable links.

3. everything else → you get ANSI sequences, fancy emojis, fancy
   unicode chars, clickable links.

And that's really it. It's 2021 and so far this was unproblematic. The
ANSI sequences we use aren't crazy exotic stuff but pretty much
baseline and virtually any terminal from the last 25 years probably
supports them.

You can turn these features off individually, too.

SYSTEMD_COLORS=0 → no ANSI color sequences (alternatively: "NO_COLOR=1"
as per https://no-color.org/)

SYSTEMD_EMOJI=0 → no unicode emojis

LC_CTYPE=ANSI_X3.4-1968 → no non-ASCII chars (which also means no emojis)

SYSTEMD_URLIFY=0 → no clickable links
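The three terminal classes and the four switches above can be summarized as a small decision function. This is a sketch of the rules as described in this mail, not systemd's code; the function name is made up, "colors" stands in for ANSI sequences generally, and treating an unset TERM like TERM=dumb is an assumption.

```python
def terminal_features(env):
    """Return the output features implied by TERM and the override variables."""
    term = env.get("TERM")
    if term is None or term == "dumb":
        # Assumption: an unset TERM is handled like TERM=dumb.
        features = {"colors": False, "emoji": False, "unicode": False, "links": False}
    elif term == "linux":
        features = {"colors": True, "emoji": False, "unicode": True, "links": False}
    else:
        features = {"colors": True, "emoji": True, "unicode": True, "links": True}

    # Individual switches can only turn features off.
    if env.get("SYSTEMD_COLORS") == "0" or env.get("NO_COLOR"):
        features["colors"] = False
    if env.get("SYSTEMD_EMOJI") == "0":
        features["emoji"] = False
    if env.get("LC_CTYPE") == "ANSI_X3.4-1968":
        features["unicode"] = False
        features["emoji"] = False  # no non-ASCII chars also means no emojis
    if env.get("SYSTEMD_URLIFY") == "0":
        features["links"] = False
    return features

print(terminal_features({"TERM": "xterm-256color", "SYSTEMD_EMOJI": "0"}))
```

Note that nothing in this table controls bold face separately from colors, which matches the answer given earlier in the thread.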

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] troubleshooting Clevis

2021-10-12 Thread Lennart Poettering
On Di, 12.10.21 16:17, lejeczek (pelj...@yahoo.co.uk) wrote:

> > > I have 'clevis' set to get luks pin from 'tang' but unlock does not happen
> > > at/during boot time and I wonder if someone can share thoughts on how to
> > > investigate that?
> > > I cannot see anything obvious fail during boot, moreover, manual
> > > 'clevis-luks-unlock' works no problems.
> > This is the systemd mailing list, not the clevis/tang mailing
> > list. Please contact the clevis/tang community instead.
>
> May I ask whether there are any plans for systemd to utilize, similarly to
> 'tpm', the 'tang' (or a similar) technique to unlock luks encrypted
> devices?

You mean that networked unlock feature? I mean, it's not always clear
what belongs in systemd and what does not. But outside of data
centers I am not sure tang/clevis really has much use, and that's
quite a limited userbase, so I'd say: no, this should be done outside
of systemd. Maybe a plugin for libcryptsetup's "token" feature.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Removing bold fonts from boot messages

2021-10-12 Thread Lennart Poettering
On Di, 12.10.21 12:09, Frank Steiner (fsteiner-ma...@bio.ifi.lmu.de) wrote:

> Hi,
>
> after upgrading from SLES 15 SP2 (systemd 234) to SP3 (systemd 246)
> the boot messages are not only colored (which I like for seeing failures
> in red) but partially printed in bold face. This makes messages indeed
> harder to read on serial console (with amt or ipmi), so I wonder if
> there is a place where the ascii sequences for colors and font faces
> are defined and can be adjusted?

Sounds like a graphics issue in your terminal emulator, no?

> Or is there some option to remove the bold face only, but not the colors?
> systemd.log_color=0 removes all formatting, but I'd like to keep the
> colors...

No, this is not configurable. We are not a themeable desktop, sorry.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Tempering the Logging Data when Knowing the Verification Key / Time Synchronization

2021-10-11 Thread Lennart Poettering
On Mo, 11.10.21 17:08, Andreas Krueger (andreas.krue...@fmc-ag.com) wrote:

> Hi Folks,
>
>
> I am currently working in an embedded project that uses Journal for
> logging. The logging data shall be protected by the Journal's sealing
> mechanism FSS and for various reasons the verification key is located
> unprotected in memory.
>
> Regarding this constellation, my first question is that:
>
> If an attacker knows the verification key, is he able to modify the
> logging data in such a way that its tampering remains undetected,
> even if this has happened e.g. one day ago (which means that several
> new sealing keys have been generated in the meantime)?

Yes, the verification key should be kept secret. (The text output when
it is generated should make this very clear, actually.)

If you don't keep it secret, then all bets are off; the construction
of the underlying cryptography does not work then.

> Since sealing is always done for a time interval, my second question is that:
>
> What will happen to the logging data and sealing mechanism when the
> system clock is suddenly modified? This can e.g. happen, when the
> board starts first with a default time value and then synchronized
> after a while by a time daemon.

The sealing key is "evolved" based on time (which means a new key is
generated from the old and the old one is securely deleted). When time
jumps forward, then this scheme automatically keeps up, and if needed
will evolve a number of steps at once, as necessary.

If time jumps backwards things are more problematic though: the key
appropriate for the old time has likely already been generated, and
while a newer key can be derived from the old one, an older one cannot
be derived from the new one (this fact is after all the whole point of
the exercise).
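The one-way property can be illustrated with a toy key ratchet. To be clear, this is not the construction journald's sealing uses; a plain SHA-256 chain is swapped in here purely to show why jumping forward is easy and jumping back is impossible. All names are made up.

```python
import hashlib

def evolve(key, steps=1):
    """Advance the key by a number of epochs; the previous key is discarded."""
    if steps < 0:
        raise ValueError("a forward-secure key cannot be rolled back")
    for _ in range(steps):
        key = hashlib.sha256(b"evolve" + key).digest()
    return key

def key_for_epoch(current_key, current_epoch, target_epoch):
    """Derive the key for target_epoch from the key held at current_epoch."""
    return evolve(current_key, target_epoch - current_epoch)

k0 = hashlib.sha256(b"toy initial state").digest()

# Clock jumps forward by three intervals: evolve three steps at once.
k3 = key_for_epoch(k0, 0, 3)

# Clock jumps backwards: the key for epoch 1 cannot be recovered from k3.
try:
    key_for_epoch(k3, 3, 1)
except ValueError as err:
    print("refused:", err)
```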

For cases like this it might make sense to ensure that flushing of the
journal to disk (i.e. systemd-journald-flush.service) is scheduled
after correct time has been acquired (i.e. time-sync.target).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] dm-integrity volume with TPM key?

2021-10-11 Thread Lennart Poettering
On Fr, 08.10.21 21:15, Sebastian Wiesner (sebast...@swsnr.de) wrote:

> On Monday, 04.10.2021 at 14:49 +0200, Lennart Poettering wrote:
> > On Do, 30.09.21 21:20, Sebastian Wiesner (sebast...@swsnr.de) wrote:
> >
> > > Hello,
> > >
> > > thanks for quick reply, I guess this explains the lack of
> > > instructions
> >
> > btw, coincidentally this was posted on github on the day you posted
> > this:
> >
> > https://github.com/systemd/systemd/pull/20902
> >
> > so hopefully we'll have the missing tools in place soon too.
>
> Great, so it looks as if everything's in place with systemd 250
> perhaps?

Dunno, we'll see. It all depends on whether the submitter rolls
another revision. Would love to see this happen, but right now the
ball is in the court of the submitter of that PR.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] [systemd]: How to set systemd not to generate loop0.device and mtdblockx.device?

2021-10-11 Thread Lennart Poettering
On Sa, 09.10.21 11:27, www (ouyangxua...@163.com) wrote:

> systemd version: V242
>
> In our system, the whole machine starts too slowly. We want to do
> some optimization. I found that two services (loop0.device and
> mtdblock5.device) started slowly. I want to remove them (I
> personally think our system does not need them). I want to ask you
> how to avoid generating these two device files and not start them?

/dev/loop0 is a loopback block device. Probably some tool you are
using needs it.

/dev/mtdblock5 is some physical hw you have. And it's probably mounted
by something you are using.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] Re: [systemd‑devel] Q: write error, watchdog, journald core dump, ordering of entries

2021-10-11 Thread Lennart Poettering
On Mo, 11.10.21 10:57, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> > Now when journald hangs due to some underlying IO issue, then it might
> > miss the watchdog deadline, and PID 1 might then kill it to get it
> > back up. It will log about this to the journal, but given that the
> > journal is hanging/being killed it's not going to write the messages
> > to disk, the messages will remain queued in the logging socket for a
> > bit. Until eventually journald starts up again, and resumes processing
> > log messages. It will then process the messages already queued in the
> > sockets from when it was hanging, and thus the order might be
> > surprising.
>
> Hi!
>
> Thanks for explaining.
> Don't you have some OOB logging, that is: log a message before
> processing the queued logs?

The "Journal started" message is inserted into the log stream by
journald itself before processing the already queued messages.
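The resulting order can be reproduced with a toy model: a queue stands in for the logging socket, and the restarted daemon writes its own start message before draining it. This is only an illustration of the sequence described here, with made-up names and shortened messages.

```python
from collections import deque

socket_queue = deque()  # messages sent while journald was hung
journal = []            # what ends up on disk, in write order

# PID 1 logs while journald is unresponsive; the messages just sit in the socket.
socket_queue.append(("20:12:52", "systemd[1]: Watchdog timeout (limit 3min)!"))
socket_queue.append(("20:12:52", "systemd[1]: Killing process 3321 with signal SIGABRT."))

def restart_journald(now):
    """Write the start message first, then drain what queued up in the meantime."""
    journal.append((now, "systemd-journald: Journal started"))
    while socket_queue:
        journal.append(socket_queue.popleft())

restart_journald("20:13:25")
for stamp, message in journal:
    print(stamp, message)
```

The entries keep the timestamps from when they were sent, so the 20:12:52 lines show up after the 20:13:25 start message.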

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Q: write error, watchdog, journald core dump, ordering of entries

2021-10-11 Thread Lennart Poettering
On Mi, 06.10.21 10:29, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> Hi!
>
> We had a stuck network card on a server that seems to have caused the RAID
> controller with two SSDs to be stuck on write as well.
> Anyway journald dumped core with this stack:
> Oct 05 20:13:25 h19 systemd-coredump[26759]: Process 3321 (systemd-journal) of user 0 dumped core.
> Oct 05 20:13:25 h19 systemd-coredump[26759]: Coredump diverted to /var/lib/systemd/coredump/core.systemd-journal.0.a4eb19afcc314d99936cbdd5542e4fed.3321.163345758500.lz4
> Oct 05 20:13:25 h19 systemd-coredump[26759]: Stack trace of thread 3321:
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #0  0x7f913492d0c2 journal_file_append_object (libsystemd-shared-234.so)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #1  0x7f913492dba3 n/a (libsystemd-shared-234.so)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #2  0x7f913492fc79 journal_file_append_entry (libsystemd-shared-234.so)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #3  0x557fe532908d n/a (systemd-journald)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #4  0x557fe532b15f n/a (systemd-journald)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #5  0x557fe5324664 n/a (systemd-journald)
> Oct 05 20:13:25 h19 systemd-coredump[26759]: #6  0x557fe5326a80 n/a (systemd-journald)
> Oct 05 20:13:25 h19 kernel: printk: systemd-coredum: 6 output lines suppressed due to ratelimiting
>
> (systemd-234-24.90.1.x86_64 of SLES15 SP2 on x86_64)
>
> journald seems to have restarted later, but I wonder about the ordering of 
> the entries following:
> Oct 05 20:13:25 h19 systemd-journald[26760]: Journal started
> Oct 05 20:13:25 h19 systemd-journald[26760]: System journal (/var/log/journal/8695c89eb080463dad2ca9f9aaedf162) is 928.0M, max 4.0G, 3.0G free.
>
> Oct 05 20:12:52 h19 systemd[1]: systemd-journald.service: Watchdog timeout (limit 3min)!
> Oct 05 20:12:52 h19 systemd[1]: systemd-journald.service: Killing process 3321 (systemd-journal) with signal SIGABRT.
> Oct 05 20:13:25 h19 systemd[1]: Starting Flush Journal to Persistent Storage...
> Oct 05 20:13:25 h19 systemd[1]: Started Flush Journal to Persistent Storage.
>
> I don't understand why the core dump is logged before the signal
> being sent and the watchdog timeout.

PID 1 logs to journald. PID 1 also runs and supervises
journald. That's quite a special relationship: PID1 both is client to
journald and manages it.

Now when journald hangs due to some underlying IO issue, then it might
miss the watchdog deadline, and PID 1 might then kill it to get it
back up. It will log about this to the journal, but given that the
journal is hanging/being killed it's not going to write the messages
to disk, the messages will remain queued in the logging socket for a
bit. Until eventually journald starts up again, and resumes processing
log messages. It will then process the messages already queued in the
sockets from when it was hanging, and thus the order might be
surprising.

--
Lennart Poettering, Berlin


Re: [systemd-devel] dm-integrity volume with TPM key?

2021-10-04 Thread Lennart Poettering
On Do, 30.09.21 21:20, Sebastian Wiesner (sebast...@swsnr.de) wrote:

> Hello,
>
> thanks for quick reply, I guess this explains the lack of
> instructions

btw, coincidentally this was posted on github on the day you posted
this:

https://github.com/systemd/systemd/pull/20902

so hopefully we'll have the missing tools in place soon too.

> As a workaround you'd use a regular file key for dm-integrity and put
> that on a TPM-protected partition, if I understand you correctly?

yes.

> I.e. you'd
>
> 1. enable secureboot (custom keys or shim),
> 2. bundle kernel & initrd into signed UEFI image for systemd-boot,
> 3. make / a LUKS-encrypted partition with systemd-cryptenroll, bound to
> the TPM (perhaps PCR 0 and 7) and unlocked automatically at boot,

only pcr 7, for the reasons explained in the blog story.

> 4. make /home a dm-integrity partition, with a regular keyfile from
> e.g. /etc/integrity.key (which is on the encrypted partition), and

actually, after thinking a bit more about this I figure the ultimate
path for this would be /etc/integritysetup-keys.d/home.key – because
we already implemented in systemd-cryptsetup a scheme where we search
for the encryption key for volume xyz in
/etc/cryptsetup-keys.d/xyz.key, and we should probably do it similarly
for verity keys, too.
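The naming scheme is simple enough to write down as a one-line helper. This is a sketch only: the cryptsetup-keys.d path is the one described above, while integritysetup-keys.d is the proposal from this mail and not something that is known to exist. The function name is made up.

```python
from pathlib import PurePosixPath

def key_path(volume, tool="cryptsetup"):
    """Per-volume key file location following the <tool>-keys.d/<volume>.key scheme."""
    return PurePosixPath("/etc") / f"{tool}-keys.d" / f"{volume}.key"

print(key_path("xyz"))                    # existing scheme for encrypted volumes
print(key_path("home", "integritysetup")) # proposed scheme for integrity volumes
```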

> 5. use homed for LUKS-encrypted home areas on /home?
>
> Does this sound reasonable?  

Yes!

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Authenticated Boot and Disk Encryption on Linux

2021-10-04 Thread Lennart Poettering
On Do, 30.09.21 18:54, Łukasz Stelmach (stl...@poczta.fm) wrote:

> > I have been working on code in homed to "balance" free space between
> > active home dirs at regular intervals (shorter intervals when disk
> > space is low, longer intervals when there's plenty). Also, right now
> > we already run FITRIM on home dirs on logout, to make sure all air is
> > removed then. I intend to also add logic to shrink to minimal size
> > then (and conversely grow on login again).
> >
> > This will only really work in case btrfs is used inside the homedir
> > images, as only then we can both shrink and grow the fs whenever we
> > want to.
>
> Interesting. Apparently[1] loopback driver punches holes in the image
> files and makes them sparse.

We currently issue FITRIM on logout (thus making the file sparse), and
on login we issue fallocate() to remove the holes again, so that we
are able to give disk space guarantees and disable overcommit during
runtime.
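The effect of filling the holes again can be observed on any scratch file. The sketch below uses os.posix_fallocate(), which is what Python exposes, on a temporary file that starts out fully sparse. It does not issue FITRIM and has nothing to do with homed's actual code; the size is arbitrary.

```python
import os
import tempfile

SIZE = 4 * 1024 * 1024  # arbitrary 4 MiB scratch image

with tempfile.NamedTemporaryFile() as image:
    fd = image.fileno()

    # Extending with ftruncate() creates one big hole: size grows, blocks do not.
    os.ftruncate(fd, SIZE)
    sparse_blocks = os.fstat(fd).st_blocks

    # posix_fallocate() reserves backing storage for the whole range.
    os.posix_fallocate(fd, 0, SIZE)
    allocated_blocks = os.fstat(fd).st_blocks
    final_size = os.fstat(fd).st_size

print("blocks while sparse:", sparse_blocks)
print("blocks after fallocate:", allocated_blocks)
```

Once the range is allocated, later writes into it cannot fail for lack of space, which is the guarantee referred to above.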

> > [Encryption] isn't typically needed for /usr/ given that it generally
> > contains no secret data
>
> This isn't IMHO precisely true. Especially not for laptops. And I don't
> mean the presence of "hacking tools" you mentioned below. Even when all
> the binaries in the /usr all come from the Internet there are many
> different versions available. Knowledge of which versions are running on a
> device may be quite valuable for an attacker to mount a remote on-line
> attack and extract data with malware.

Well, that's security through obscurity to some level. I know some
people are concerned about this, and they can encrypt that if they
really think they must. But I doubt that this makes sense for the
cases where your OS payload comes in flatpaks, containers, sysexts,
portable services, …, i.e. is not written to /usr.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Prefix for direct logging

2021-10-04 Thread Lennart Poettering
On Mi, 29.09.21 20:21, Arjun D R (drarju...@gmail.com) wrote:

> Hi Lennart,
>
> Please help me understand how the journald is figuring out the PID of the
> log line.

Google SCM_CREDENTIALS.
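For readers who want to see the mechanism rather than search for it: on Linux, a receiver that enables SO_PASSCRED on an AF_UNIX socket gets the sender's PID, UID and GID from the kernel as an SCM_CREDENTIALS control message. A minimal sketch follows, using a socketpair within one process; it is not journald's code.

```python
import os
import socket
import struct

# struct ucred { pid_t pid; uid_t uid; gid_t gid; }
UCRED = struct.Struct("iII")

receiver, sender = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)

# Ask the kernel to attach the sender's credentials to every received message.
receiver.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)

sender.send(b"hello from a log client")
data, ancdata, flags, addr = receiver.recvmsg(1024, socket.CMSG_SPACE(UCRED.size))

pid = uid = gid = None
for level, ctype, cdata in ancdata:
    if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
        pid, uid, gid = UCRED.unpack(cdata[:UCRED.size])

print("message:", data, "from pid", pid, "uid", uid, "gid", gid)

receiver.close()
sender.close()
```

Because the kernel fills in the values, an unprivileged sender cannot claim to be a different process.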

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Authenticated Boot and Disk Encryption on Linux

2021-09-30 Thread Lennart Poettering
On Mi, 29.09.21 21:09, Łukasz Stelmach (stl...@poczta.fm) wrote:

> Hi, Lennart.
>
> I read your blog post and there is little I can add regarding
> encryption/authentication*. However, distributions need to address one
> more detail, I think. You've mentioned recovery scenarios, but even with
> an additional set of keys stored securely, there are enough moving parts
> in FDE that something may go wrong beyond what recovery keys could
> fix. To help users minimise the risk of data loss distributions should
> provide backup tools and help configure them securely.
>
> This is of course outside of the scope of your original post, but IMHO
> it is a good moment to mention this.
>
> * Well there is one tiny detail.
>
> You noted double encryption needs to be avoided in case of home
> directory images by storing them on a separate partition. Separating
> /home may be considered a slight inefficiency in storage usage, but
> using LVM to distribute storage space between the root(+/usr) and /home
> might help. However, to the best of my knowledge (which I will be glad to
> update) there is no tool to dynamically and automatically manage storage
> space used by home images. In theory the code is there, but UX of
> resize2fs(8) and dd(1) is far from satisfying and I am not entirely sure
> what happens if one truncates (after resize2fs, which will work)
> a file containing a mounted image.
>
> The first solution that comes to my mind is to make systemd-homed resize
> home filesystem images according to some policy upon locking and
> unlocking. But it's not perfect as users would need to log out(?) to
> trigger allocation of more storage should they fill their home
> directory.

I have been working on code in homed to "balance" free space between
active home dirs at regular intervals (shorter intervals when disk
space is low, longer intervals when there's plenty). Also, right now
we already run FITRIM on home dirs on logout, to make sure all air is
removed then. I intend to also add logic to shrink to minimal size
then (and conversely grow on login again).

This will only really work in case btrfs is used inside the homedir
images, as only then we can both shrink and grow the fs whenever we
want to.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] dm-integrity volume with TPM key?

2021-09-30 Thread Lennart Poettering
On Mi, 29.09.21 21:53, Sebastian Wiesner (sebast...@swsnr.de) wrote:

> Hello,
>
> "Authenticated Boot and Disk Encryption on Linux" [1] suggests to "make
> /home/ its own dm-integrity volume with a HMAC, keyed by the TPM" when
> using systemd-homed for user home directories.
>
> I'd like to try that but… how? I can use systemd-cryptenroll to make an
> encrypted volume with a TPM key, but how do I make a dm-integrity
> volume with a TPM key?  I've gone through the manpage for
> integritysetup and did a few unsuccessful google searches, but I've not
> found any answer.

It's not easy to find, because it doesn't exist. ;-)

We have the TPM stuff in place, and we cover both cryptsetup +
veritysetup pretty nicely. We are still missing the final glue here
though. systemd-integritysetup + /etc/integritytab. The hard plumbing
problems are all solved, what's missing is just putting together the
porcelain for it.

I had hoped that libcryptsetup would support a mode where we can use a
LUKS2 superblock with only dm-integrity without dm-crypt (which would
give us proper key management for this thing). But the idea is not
attractive to the libcryptsetup people unfortunately, as it turns out.

My current thinking on how I'd personally deploy this is actually not
necessarily by directly enrolling the HMAC key for dm-integrity with
the TPM, but instead just piggyback things to an otherwise protected
/etc/ or /var/. i.e. define a key file /etc/integrity.key (with a
fallback to /var/lib/integrity.key) or similar, that is used as
implicit HMAC key for all dm-integrity needs. Then, because (at least
in my idealized view) /etc or /var are authenticated territory (bound
to TPM) we get the property we want, indirectly.
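What a keyed HMAC buys over a plain checksum can be shown in a few lines. The sketch below tags and verifies one sector with HMAC-SHA256; the random key stands in for the contents of a key file such as /etc/integrity.key, and dm-integrity's real on-disk layout and tag format are not modeled. Function names are made up.

```python
import hashlib
import hmac
import os

# Stand-in for the key file contents; on a real system this would be read from disk.
key = os.urandom(32)

def sector_tag(sector_no, data):
    """HMAC over the sector number and its contents, so sectors cannot be swapped."""
    message = sector_no.to_bytes(8, "little") + data
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_sector(sector_no, data, stored_tag):
    """Constant-time comparison of the recomputed tag with the stored one."""
    return hmac.compare_digest(sector_tag(sector_no, data), stored_tag)

sector = b"\x00" * 512
tag = sector_tag(7, sector)

print(verify_sector(7, sector, tag))                      # untouched sector
print(verify_sector(7, b"\x01" + sector[1:], tag))        # modified contents
print(verify_sector(8, sector, tag))                      # moved to another sector
```

Without the key an attacker cannot produce a matching tag for modified data, which is why protecting the key file indirectly protects everything authenticated with it.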

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] FDE: UEFI/Secureboot solves main part / missing link is /boot encryption

2021-09-29 Thread Lennart Poettering
On Mi, 29.09.21 12:47, Leon Fauster (leonfaus...@googlemail.com) wrote:

> > Encryption is not authentication.
> >
> > Not sure why you would encrypt your boot loader though? The boot
> > loader code is hardly a secret, is it? It's the same for everyone and
> > open source.
> >
> > And with which key? a key the user has to type in? how does that help?
> > it means the user is queried three times for a pw? once by grub, once
> > by cryptsetup and once when logging in? That's not an improvement!
>
> I think I was partly misunderstood. Let me rephrase it. My motivation was
> just a thought about one step in the implementation (in the context
> of UEFI), that has a huge benefit, i.e. protecting the initrd. That's the
> starting point.

Well, it encrypts the initrd (which I am not interested in), and it
doesn't authenticate it (which I want), and all that with an
interactively acquired key (which I don't want).

Maybe that solves your specific problem set — but it doesn't solve any
of the issues I am trying to address.

> Yes, enc != auth - but while speaking about authentication. Dracut
> could enroll the signature of the initrd into the allow db (EFI).
> So, grub2 could check both, the kernel and the initrd and making the
> above encryption completely obsolete, thought.

Well, my proposal suggests just including the basic initrd in the
kernel image. The kernel's signature would then also validate this
basic initrd. My focus is that this kernel/initrd signing happens
during build time, not at install time, i.e. the secret signature keys
should be held by the building party only, not by the local
installations.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] FDE: UEFI/Secureboot solves main part / missing link is /boot encryption

2021-09-28 Thread Lennart Poettering
On Di, 28.09.21 19:44, Leon Fauster (leonfaus...@googlemail.com) wrote:

> Hallo Lennart, corresponding to your last post about FDE:
>
> On an EFI system - would an encrypted "/boot" or /boot on
> an encrypted "/" filesystem eliminate the mentioned main
> attack vector? The whole chain would be authenticated.

Encryption is not authentication.

Not sure why you would encrypt your boot loader though? The boot
loader code is hardly a secret, is it? It's the same for everyone and
open source.

And with which key? a key the user has to type in? how does that help?
it means the user is queried three times for a pw? once by grub, once
by cryptsetup and once when logging in? That's not an improvement!

My blog story is an attempt to do things cleanly: i.e. authenticate
what needs authentication, and do so in a way that doesn't require
interactivity. The ultimate goal is that servers and embedded devices
can boot up entirely unattended in a safe way, and that desktop machines
only query the user once, and that the authentication the user does
unlocks the user's actual data.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Prefix for direct logging

2021-09-28 Thread Lennart Poettering
On Mo, 27.09.21 15:40, Arjun D R (drarju...@gmail.com) wrote:

> Hi Folks,
>
> Currently we are using systemd-journald for service logging. We run
> journalctl for a bunch of services and redirect those to the custom log
> files for every few seconds. This takes up the CPU for that particular
> time period since we have lot of IO operations as well. We came to know
> that systemd version v236+ supports direct logging
> (StandardOutput:file:) to the custom log file by the service. I
> would like to use that facility but we don't get the prefix that we used to
> get when using the journal.
>
> Is there a way to prepare a custom patch locally to add the necessary
> prefix to the stdout before writing to the custom log file? Is that a good
> idea? Any other suggestions?


You might define a socket unit 'prefixlogger@.socket' like this:

  [Unit]
  StopWhenUnneeded=yes

  [Socket]
  ListenFIFO=/run/prefixlogger.fifo.%i
  Service=prefixlogger@%i.service

And then a matching service 'prefixlogger@.service':

  [Service]
  StandardInput=socket
  StandardOutput=file:/var/log/foo.log.%i
  ExecStart=sed -e 's/^/foo:/'

And then in the services that shall run with this:

  [Unit]
  Wants=prefixlogger@%N.socket
  After=prefixlogger@%N.socket

  [Service]
  ExecStart=/my/service/binary
  StandardOutput=file:/run/prefixlogger.fifo.%N

(This is all untested, might need some minor changes to actually work,
but you get the idea).

So what this does is this: first we define a little socket service
that can be instantiated easily a bunch of times: it listens on a FIFO
in the fs, and everything it reads from it gets written to some log
file. The service is just an invocation of "sed" with standard input
being the fifo and standard output being the log file to write to.

You then use it by setting StandardOutput=… in your main unit, to connect
its stdout/stderr to that fifo. Also, you add deps so that each time a
service that needs this starts, the log prefix socket for it
starts too.
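
To see the mechanism in isolation, here is a minimal sketch that mimics
the pair above without systemd: a FIFO stands in for ListenFIFO=, and a
background sed stands in for the prefixlogger service. The function name
and the temporary paths are made up for illustration only.

```shell
#!/bin/sh
prefix_demo() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo" || return 1

    # Reader: prefixes every line and writes it to the log file,
    # like ExecStart=sed with StandardOutput=file:.
    sed -e 's/^/foo:/' < "$dir/fifo" > "$dir/log" &
    reader=$!

    # Writer: stands in for the stdout of the main service.
    printf 'hello\nworld\n' > "$dir/fifo"

    # The writer has closed the FIFO, so sed sees EOF and exits.
    wait "$reader"
    cat "$dir/log"
    rm -rf "$dir"
}

prefix_demo
```

Running it should print the two input lines, each with the "foo:"
prefix. In the real setup systemd creates the FIFO and starts sed on
demand, so none of this scaffolding is needed there.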

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] troubleshooting Clevis

2021-09-28 Thread Lennart Poettering
On Di, 28.09.21 12:26, lejeczek (pelj...@yahoo.co.uk) wrote:

> Hi guys.
>
> I have 'clevis' set to get luks pin from 'tang' but unlock does not happen
> at/during boot time and I wonder if someone can share thoughts on how to
> investigate that?
> I cannot see anything obvious fail during boot, moreover, manual
> 'clevis-luks-unlock' works no problems.

This is the systemd mailing list, not the clevis/tang mailing
list. Please contact the clevis/tang community instead.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Add LUKS disk to an Raspberry Pi 4 install

2021-09-27 Thread Lennart Poettering
On Sa, 25.09.21 17:47, Barry Scott (ba...@barrys-emacs.org) wrote:

> [I originally ask this question on the Fedora ARM list, but got no reply]
>
> I'm trying to build a RPi4 system that uses a LUKS encrypted disk.
>
> But I cannot get the volume to be unlocked when the system boots.
>
> I have installed Fedora-Minimal-34-1.2.aarch64.raw.xz to with
> arm-image-installer --target=rpi4 and that boots.
>
> Then I have added a new partition to that sdcard that I setup using this
> command on a Fedora 34 x86_86 system.
>
> cryptsetup \
>    --type luks2 \
>    --cipher xchacha20,aes-adiantum-plain64 \
>    --hash sha256 \
>    --iter-time 5000 \
>    --pbkdf argon2i \
>    luksFormat ${DEVICE}
>
> I got these settings from a blog on setting up LUKS for debian on raspberry
> pi.
>
> I add an entry to /etc/crypttab for the volume.
>
> When I boot the system I am not prompted for the password to unlock the
> volume as I was expecting.
>
> Looking in journalctl -b 0 I see these lines:
>
> Apr 06 01:01:36 clef.chelsea.private systemd[1]: dev-disk-
> by\x2duuid-8c2519ae\x2d78a9\x2d44b0\x2d871f\x2d0aa2422de03a.device: Job dev-
> disk-by\x2duuid-8c2519ae\x2d78a9\x2d44b0\x2d871f\x2d0aa2422de03a.device/start
> timed out.

This suggests that the backing device name you specified in
/etc/crypttab doesn't match reality. i.e. here you specified a device
node by the UUID of what's on it. (Presumably that's supposed to be
the UUID of the LUKS2 superblock?) And it doesn't appear to match what
is *actually* the UUID of your LUKS2 superblock?
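
One way to check this is to compare the UUID that /etc/crypttab asks
for with what is actually on the device. A minimal sketch follows; the
helper name and the partition name in the comments are made up, adjust
them to your setup.

```shell
#!/bin/sh
# Print the UUIDs that a crypttab-style file refers to in its second
# field (the backing device), skipping comment lines.
crypttab_uuids() {
    awk '!/^[ \t]*#/ && $2 ~ /^UUID=/ { sub(/^UUID=/, "", $2); print $2 }' "$1"
}

# Then compare against the LUKS superblock, as root:
#   crypttab_uuids /etc/crypttab
#   cryptsetup luksUUID /dev/mmcblk0p4   # hypothetical partition name
```

If the two values differ, the .device job for the by-uuid path can
never complete, which would match the timeout in the log above.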

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] 回复: systemd-devel Digest, Vol 137, Issue 26

2021-09-23 Thread Lennart Poettering
On Mi, 22.09.21 21:34, krave1986...@gmail.com (krave1986...@gmail.com) wrote:

> So, can I assume that journalctl --flush is for Linux internal processes not
> for end users?

Yes, pretty much.

(Well, I mean, there are scenarios you could think of where you want to
tell journald to temporarily switch to volatile logging via "journalctl
--relinquish-var", then do something with /var/log (like replace it,
back it up, overmount it, whatever), and then eventually want to switch
back to using it, which you then can do with "journalctl --flush".)
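
That sequence could be wrapped up roughly like this. It is only a
sketch: the function name and the JOURNALCTL override (handy for dry
runs) are assumptions, not anything journalctl itself provides.

```shell
#!/bin/sh
JOURNALCTL=${JOURNALCTL:-journalctl}

# Run a command while journald logs to /run only, then switch back.
with_volatile_journal() {
    "$JOURNALCTL" --relinquish-var || return
    "$@"
    rc=$?
    # Always switch back to persistent storage, even if the command failed.
    "$JOURNALCTL" --flush || return
    return "$rc"
}

# Example (as root; the archive path is hypothetical):
#   with_volatile_journal tar -C /var/log -czf /root/journal.tar.gz journal
```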

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Pre-installed portable services ?

2021-09-20 Thread Lennart Poettering
On Mo, 20.09.21 11:24, Umut Tezduyar Lindskog (umut.tezdu...@axis.com) wrote:

> Hi. Is there such thing as “pre-installed” portable services? If
> not, what is the best way to achieve it. One option can be to place
> the files that “attach” command creates on the distro but I am
> worried that the files might be outdated depending on the systemd
> version the distro is shipped.

What do you mean by "pre-installed" precisely?

I mean, portable services can be dropped into /usr/lib/portables/,
i.e. a dir that typically is included in the base OS image? (As
opposed to /var/lib/portables/, where they are usually dropped, given
they should be able to be added anytime).

Or do you mean that they are also "pre-attached" and "pre-enabled"? If
you want that you could either call "portablectl attach" at boot, or
just package the symlinks/files the call creates.

We could also add some special dirs that may contain images we'll
automatically attach + enable during boot as we discover them. That'd
be a new feature though.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Examples to distinguish Before=/After= and Wants=/Requires=/BindsTo=

2021-09-18 Thread Lennart Poettering
On Fr, 17.09.21 19:04, Kenneth Porter (sh...@sewingwitch.com) wrote:
> --On Friday, September 17, 2021 12:49 PM +0200 Lennart Poettering
>  wrote:
>
> > more specific example: you can use apache without mysql, and you can
> > use mysql without apache, but quite often they are used together, and
> > if so you likely want to start mysql first, and apache second, since
> > it likely consumes services of mysql, and not the other way
> > round. Hence in this example, you'd place an ordering dep, but not
> > requirement dep.
>
> Would such an ordering dependency without a requirements dependency allow
> apache to start without mysql?

Yes.

> Or does an ordering dependency imply a
> requirements dependency?

No.

> In which case, could systemd automatically infer
> the requirements from the ordering?

No.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Systemd-boot not properly loading device tree, when loaded by U-boot (ARM64, tested on RK3399)

2021-09-17 Thread Lennart Poettering
On Fr, 17.09.21 19:25, Qu Wenruo (w...@suse.com) wrote:

> Hi,
>
> I'm recently testing booting my RK3399 boards with the following boot
> sequence:
>
> U-boot -> systemd-boot (EFI payload) -> kernel
>
> Which provides much more flex than plain extlinux conf from U-boot.
> (More choice, easier to write config, runtime kernel change).
>
> So far "kernel" and "initramfs" key work fine.
>
> But I notice that "devicetree" key is not working properly.
>
> The Uboot fdt search path doesn't include "/dtbs" which is used by my
> distro, and my entry config specify the device-tree file like this:
>
> title      ManjaroARM boot from nvme
> linux      /Image
> devicetree /dtbs/rockchip/rk3399-rockpro64.dtb
> initrd     /initramfs-linux.img
> options    console=ttyS2,150 root=/dev/arm_nvme/root rw loglevel=7
>
> Thus if systemd-boot doesn't load the correct device-tree, kernel will
> use the default fdt passed from Uboot, which is already out-of-date and
> can cause problems for the upstream kernel I used.
>
> Unfortunately, with above config, after booting the kernel, the fdt is
> the fallback one from Uboot, not loading the proper one specified by
> systemd-boot config.
>
> The proof I went is checking the opp table.
> I have replaced the "/dtbs/rockchip/rk3399-rockpro64.dtb" with a custom
> dtb which uses op1 tables.
> But the kernel only sees a very out-of-dated fdt, which some opp is even
> invalid.
>
> How could I continue debugging the missing link?
> Like what systemd-boot needs to load the device-tree? Or U-boot EFI
> environment lacks certain facility to support systemd-boot?

Did you see this:

https://github.com/systemd/systemd/pull/19417

(and maybe this: https://github.com/systemd/systemd/pull/20601)

maybe that addresses your issues?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Examples to distinguish Before=/After= and Wants=/Requires=/BindsTo=

2021-09-17 Thread Lennart Poettering
On Mi, 15.09.21 17:15, Manuel Wagesreither (man...@fastmail.fm) wrote:

> Hello all,
>
> I'm onboarding some collegues who don't have much experience with
> systemd. One thing I would like to focus on is the difference
> between Before=/After= and Wants=/Requires=/BindsTo in systemd
> units.
>
> I think it would get immediately clear if could provide them an
> example where we want one but not the other. Unfortunately I've got
> problems coming up with such an example. In my use cases, whenever I
> needed an After= I needed an Wants= as well.
>
> Can you come up with something good?

Whenever you have a conceptually "weak" dependency between two
packages, i.e. both services can work alone, but they can also work
together, and if so, one should be ordered after the other.

A more specific example: you can use apache without mysql, and you can
use mysql without apache, but quite often they are used together, and
if so you likely want to start mysql first, and apache second, since
it likely consumes services of mysql, and not the other way
round. Hence in this example, you'd place an ordering dep, but not
requirement dep.
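
As a sketch of what that looks like in practice, the following writes
a drop-in that only orders the web server after the database. The unit
names httpd.service and mysql.service, the drop-in file name and the
helper name are assumptions; they differ between distros.

```shell
#!/bin/sh
# $1: directory holding unit drop-ins, e.g. /etc/systemd/system
write_ordering_dropin() {
    dropin_dir="$1/httpd.service.d"
    mkdir -p "$dropin_dir" || return
    cat > "$dropin_dir/10-after-mysql.conf" <<'EOF'
[Unit]
# Ordering only: if both units are started, mysql goes first.
# No requirement dep here, so starting httpd does not pull in mysql.
After=mysql.service
EOF
}

# Example (as root), followed by a reload:
#   write_ordering_dropin /etc/systemd/system && systemctl daemon-reload
```

With only After= in place, starting httpd.service alone leaves mysql
untouched; the ordering matters only when both are part of the same
transaction.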

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Preferred way to recurse over a directory?

2021-09-15 Thread Lennart Poettering
On Di, 14.09.21 23:26, Albert Brox (alb...@exypno.tech) wrote:

> I'm working on PR #20239 loadcred-dir and wondering what the preferred way
> to recurse over a directory is.
>
> I was told recursively calling the `load_credential` function is too racy so
> I'm led to ftw/nftw. However I see in the TODO file, "Get rid of nftw(). We
> should refuse to use such useless APIs on principle." Can anyone point me in
> the right direction?

Do your own recursion, use xopendirat() to get a subdir fd from a dir
fd, and then call your function on that, always iterating with
readdir() as needed.

(Probably best to keep these discussions on the PR though).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] New developer building systemd

2021-09-14 Thread Lennart Poettering
On Fr, 10.09.21 17:44, Marcus Harrison (mar...@harrisonland.co.uk) wrote:

> Hey folks,
>
> I've downloaded the systemd sources and am attempting to build with GCC 9.4 on
> KDE Neon and am receiving the build error described in build-error.txt on
> updated main branch (as of writing).
>
> I've patched around it using the change described in
> remove_unused_function.patch, which allows the build to follow through, but
> the test suite has multiple failures, and requires manual intervention
> multiple times - for example, dropping to a BusyBox recovery shell or log-in
> shell, and some of the tests will hang indefinitely.
>
> I'm wondering how much of this is intended, and if my patch broke
> anything.

You are building without libcryptsetup. Apparently this combination of
build options is currently not tested (i.e. repart on, but
libcryptsetup off).


> diff --git a/src/partition/repart.c b/src/partition/repart.c
> index 926dbb2ae4..8ee78c9b08 100644
> --- a/src/partition/repart.c
> +++ b/src/partition/repart.c
> @@ -206,7 +206,7 @@ static const char *encrypt_mode_table[_ENCRYPT_MODE_MAX] = {
>  [ENCRYPT_KEY_FILE_TPM2] = "key-file+tpm2",
>  };
>
> -DEFINE_PRIVATE_STRING_TABLE_LOOKUP_WITH_BOOLEAN(encrypt_mode, EncryptMode, ENCRYPT_KEY_FILE);
> +DEFINE_PRIVATE_STRING_TABLE_LOOKUP_FROM_STRING_WITH_BOOLEAN(encrypt_mode, EncryptMode, ENCRYPT_KEY_FILE);
>

Patch looks OK, but instead of replacing the line unconditionally, it
should be one or the other depending on `#if HAVE_LIBCRYPTSETUP`, so
that it then works in both cases.

Would be delighted if you could submit such a patch via github PR.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] [RFC] Switching to OpenSSL 3?

2021-09-14 Thread Lennart Poettering
On Di, 14.09.21 10:26, Mike Gilbert (flop...@gentoo.org) wrote:

> > Anyway, I'd be interested in your thoughts about this. i.e. hear
> > multiple takes, opinions, from differently people and positions?
>
> I would definitely like to be able to depend on one crypto/TLS
> implementation that would cover all features in systemd, instead of
> having to depend on OpenSSL for some features, and GnuTLS for other
> features. The current situation is quite messy.
>
> Settling on OpenSSL sounds fine to me.
>
> It will probably take a few months for Gentoo to get fully upgraded to
> OpenSSL 3.0. Here is our tracker for that:
>
> https://bugs.gentoo.org/797325
>
> Do you have a target date/milestone in mind for introducing this
> dependency in systemd?

Well, that depends on a) people actually agreeing that this is OK to
do, and b) someone actually doing the work.

I'd love to do it yesterday. But knowing how things work, this will be
a couple of months I guess, maybe half a year. Or could even be longer.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Portable services

2021-09-14 Thread Lennart Poettering
On Di, 14.09.21 12:10, Umut Tezduyar Lindskog (umut.tezdu...@axis.com) wrote:

> Hello,
>
> We, at Axis, have a monolithic operating system backed by a
> platform. There are teams behind the services making up the
> operating system and we have quite many services. We have been
> investigating sandboxing these services and of course systemd
> sandboxing directives are a way to go. Problem is that it is not
> realistic for us to expect teams to be on top of the directives and
> apply the right ones they need (and keep them updated). There shines
> the portable services for us with it’s “profiles”. We are trying to
> sandbox these services while giving them some host access. There
> shined for example how the default profile is set up by giving dbus
> access (binding dbus system socket to a portable service). We would
> like to create a base runtime and expect services to use the base
> runtime, still giving them the option of overriding the
> runtime. There shined the stackable services with latest “extension”
> support. All and all it fits our use case very well.
>
> I am aware that portable services is still enhancing but who out
> there is using it and I am curious about their use case. (Sorry,
> couldn’t wait for spring in Berlin).

The commit history to that dir might give you a hint which companies use
it:

https://github.com/systemd/systemd/commits/main/src/portable

But of course, that's only the ones which use it *and* contribute to
it. I am pretty sure there are others which use it, but don't
contribute.

> Seems like DynamicUsers is part of the default profile and
> DynamicUsers is a good thing. Seems like systemd creates a username
> as the same name as the portable service. Does it work with username
> based dbus policies? Is it that we need to be very careful regarding
> who can start a portable service in case they re-use service name to
> go around dbus rules (vs who can edit /etc/passwd).

So, providing D-Bus services from DynamicUser= services is messy. The two
D-Bus brokers out there want to resolve user names at the time they
load policy, and that of course conflicts with the DynamicUser=
concept to some level, since loading policy happens at early boot but
the whole point of DynamicUser= is that these users only appear the
moment the service starts.

The opposite, i.e. connecting as a client to D-Bus services from
DynamicUser= should be OK (it just means you need to be able to
connect to the D-Bus system socket from the service, i.e. you need to
bind mount that socket) — as long as your client is just a regular
client, i.e. doesn't need specific broker-side policy. Thankfully
clients that require installation of specific D-Bus policies are the
exception.

D-Bus progress is currently a bit stuck. Ideally D-Bus maintainers
would provide us with a way how we could marry socket activation and
D-Bus a bit (in the sense, that systemd passes a pre-connected D-Bus
socket to services, for example, and also uploads policy at that
moment). But I wouldn't hold my breath this happens anytime soon.

Note that portable services and system extensions are two different
things.

Regarding system extensions: at RH we are working on using them as a
way to build fully trusted initrds at the moment. background: it's
currently a major shortcoming of generic Linux distros that initrds
are entirely unprotected cryptographically, anyone can modify them at
will without this being detectable, making FDE pretty weak
conceptually; SecureBoot only covers the kernel, but once the initrd
is run all safety is off. I recently pushed a PR that adds embedded
offline-safe PKCS#7 signature support to the disk image logic that
system extensions and portable services build on (and nspawn, …). With
that you get really nice security properties, as we reinvent initrds
in a secure, trusted way: the basic initrd is now built into the kernel
(and thus validated along with it), and exotic storage is then added
in via trusted, verifiable system extensions.

Lennart

--
Lennart Poettering, Berlin


[systemd-devel] [RFC] Switching to OpenSSL 3?

2021-09-14 Thread Lennart Poettering
Heya!

Some of the systemd developers have been discussing switching
systemd's crypto libraries to be exclusively OpenSSL 3.0, and drop
support for older OpenSSL versions, as well as any GNUTLS/libgcrypt
support. As you might have noticed OpenSSL 3.0 has been released
recently, and for the first time resolves the GPL2 license
incompatibility mess comprehensively, which opens this door to us.

I personally care a lot about reducing the combinatorial explosion of
deps a bit, and keeping our tree as maintainable as we can, with a
single implementation of everything, not multiple, and no abstraction
layers and such, and thus removing any compat kludges for other
libraries or other library versions.

Now, before we make a decision on this, I'd like to collect feedback
on such a move. I know that there are some people who backport new
systemd onto old distros. How big would the pain be if that required
porting OpenSSL 3, too, at the same time?

(What's not up for discussion: for new additions to systemd we'll do
only OpenSSL, and won't accept anything else. My question is really
just about the stuff we already have, where we currently support
GNUTLS/libgcrypt.)

Anyway, I'd be interested in your thoughts about this. i.e. hear
multiple takes, opinions, from different people and positions?

Thanks,

Lennart


Re: [systemd-devel] Filter/Parse NETLINK_KOBJECT_UEVENT Messages

2021-09-14 Thread Lennart Poettering
On Di, 14.09.21 01:08, Ryan McClue (re.mcc...@protonmail.com) wrote:

> I understand this is slightly off-topic, but I'm completely new to
> BPF. Analyzing libudev source and Internet I understand the general
> idea. However, I don't understand how information/what information
> is passed to the filter from the socket. For example, in my case the
> socket payload, i.e. buf_str =
> add@/devices/pci:00/:00:14.0/usb1/1-2/1-2.4/1-2.4:1.0/input/input38/event14

> 1. How do I pass this string to the sock_filter/sock_fprog
> structures?

You don't. The bpf filtering, and in particular the bloom filter that
is used for that, is mostly internal to udev, and not something that is
considered official API or that should be reimplemented.

Use sd-device/libudev, it implements all of this, and is the only official API
to the bpf bloom filter stuff udev does there.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Filter/Parse NETLINK_KOBJECT_UEVENT Messages

2021-09-13 Thread Lennart Poettering
On Mo, 13.09.21 09:29, Ryan McClue (re.mcc...@protonmail.com) wrote:

> Currently, I'm listening to NETLINK_KOBJECT_UEVENT messages with the 
> following code:

Don't. Use the sd-device API from C. Or the old libudev API.

Listening to uevents directly means you either only get the kernel's
own data (i.e. no udev props added by userspace), or you'll have to
reimplement the extended packet format udev uses when propagating the
messages. But the latter is not really considered stable (I mean,
effectively it is, we haven't changed it in years, but it's still
undocumented, and we keep the liberty to change it if we must).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] resolved: disabling automatic resolution of hostname and IP?

2021-09-10 Thread Lennart Poettering
On Do, 09.09.21 18:16, François Cami (fc...@redhat.com) wrote:

> Hi,
>
> Is there a way via the resolved configuration file to disable the automatic
> resolution of the hostname and the IP of the host?

There is no way to do this globally or for the DNS stub,
currently. You could parse the upstream DNS servers from
/run/systemd/resolve/resolv.conf and query those DNS servers
directly. That file always contains a valid resolv.conf with all known
upstream DNS servers and is updated instantly when DNS config changes.

You could also explicitly resolve via resolved (either via D-Bus, or
varlink), where in very recent versions you can set a flag to disable
such "synthetic" RRs. This is also exposed via "resolvectl query
--synthesize=no …".
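
For the first option, pulling the upstream servers out of that file is
a one-liner. A small sketch; the helper name is made up, and the dig
and resolvectl invocations in the comments simply reuse the address
from the question.

```shell
#!/bin/sh
# Print the upstream DNS servers, one per line.
# $1: resolv.conf-style file; defaults to the one resolved maintains.
upstream_dns() {
    awk '$1 == "nameserver" { print $2 }' "${1:-/run/systemd/resolve/resolv.conf}"
}

# Query the first upstream server directly, bypassing the stub:
#   server=$(upstream_dns | head -n 1)
#   dig +short -x 192.168.115.40 "@$server"
# Or, with a recent resolved, ask it but without synthetic records:
#   resolvectl query --synthesize=no 192.168.115.40
```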

> The reverse DNS resolution like:
>
> # dig +short -x 192.168.115.40
> ipa0.ipa.test
> ipa0.
> ipa0.local.
>
> is problematic when FreeIPA needs to detect whether the IP of the host
> already belongs to a reverse zone. I'd expect NXDOMAIN there instead.
>
> Any input will be much appreciated.
>
> Thank you,
> François
>
>

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Unable to boot Linux distribution ISO files that have systemd services

2021-09-02 Thread Lennart Poettering
On Mi, 01.09.21 11:39, EpicLemon99 (epiclemo...@protonmail.com) wrote:

> I am unable to boot up ISO files of Linux distributions that use systemd. My 
> computer is a HP Pavilion TG01-2856no, it is recent hardware. The boot gets 
> stuck when it tries to start systemd services, such as Network Time 
> Synchronization.
>
> For example, there are the messages I get when trying to boot up the Arch 
> Linux ISO: https://imgur.com/a/oKGjZk7
>
> Booting it with the kernel argument init=/bin/bash however works.

The second screenshot shows that you have some kernel issue, i.e. the
upper part of the screen shows kernel debug output that happens on
kernel oops.

i.e. it's a driver issue, and systemd hangs simply because the kernel
hangs/crashed.


Please work with your distro, they might be able to
help. Kernel/driver issues like this are out of scope for systemd
though.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Using LoadCredential for passing API key to s3 bucket mount unit

2021-09-02 Thread Lennart Poettering
On Mi, 01.09.21 13:31, Vladimir Timofeenko (vladi...@vtimofeenko.com) wrote:

> Hi,
>
> I am playing with the idea of using systemd mount to mount S3 bucket on
> the system using s3fs.
>
> To mount a bucket, an API key is required. s3fs can read the API key
> from a file specified as an option:
>
> s3fs $bucket_name $where -o passwd_file=${PATH_TO_PASSWORD_FILE} ...
>
> I tried to set up a .mount unit with LoadCredential directive:
>
> [Unit]
> Description=tmp bucket mount
> After=network.target
>
> [Mount]
> What=temp-bucket
> Where=/mnt/tmp
> Type=fuse.s3fs
> LoadCredential=password_file:/etc/s3fs/tmp_key
> Options=passwd_file="${CREDENTIALS_DIRECTORY}"/password_file,url=https://s3...

systemd only resolves env vars in ExecXYZ= lines, nowhere else. And
definitely not in Options=

> I have used a small wrapper that calls env before calling s3fs to
> investigate, and it appears that during the mount command execution
> ${CREDENTIALS_DIRECTORY} is created, but there is no subdirectory
> corresponding to the unit name.

$CREDENTIALS_DIRECTORY should already point to a dir with the unit name
in it. i.e. what is the precise value?

I must admit I never tested credentials with mount units. We might be
missing something there, though I see no reason why it shouldn't work.

Consider filing an issue on github, if the creds stuff doesn't
work. But note that you need to do the env var replacement in your
mount.fuse.s3fs wrapper script really, PID 1 won't do that for you.
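
A rough sketch of the substitution such a wrapper would have to do,
assuming the env var is in fact set for the mount helper (which, as
said, is untested). The function name is made up, and the placeholder
has to appear in Options= literally and unquoted.

```shell
#!/bin/sh
# Replace every literal ${CREDENTIALS_DIRECTORY} in an option string
# with the value of the environment variable of the same name.
expand_creds() {
    opts=$1
    placeholder='${CREDENTIALS_DIRECTORY}'
    out=
    while :; do
        case $opts in
            *"$placeholder"*)
                # Keep the text before the placeholder, append the value,
                # then continue with whatever follows the placeholder.
                out=$out${opts%%"$placeholder"*}$CREDENTIALS_DIRECTORY
                opts=${opts#*"$placeholder"}
                ;;
            *) break ;;
        esac
    done
    printf '%s\n' "$out$opts"
}

# In the wrapper itself, after picking the source, target and the -o
# argument out of "$@", one would then do something like:
#   exec s3fs "$what" "$where" -o "$(expand_creds "$opts")"
```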

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Use of systemd-cat

2021-08-31 Thread Lennart Poettering
On Di, 31.08.21 13:16, Nishant Nayan (nayan.nishant2...@gmail.com) wrote:

> I read systemd-cat manpage and its functiinality, how exactly can it be
> useful for logging and debugging, any example would be helpful.

The man page lists two examples already.

Here's a third: a service script that logs its output to stdout/stderr,
which you want to invoke from the shell, but still have the logs go to
the journal:

myscript | systemd-cat

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] why log_set_prohibit_ipc() is set in journald

2021-08-27 Thread Lennart Poettering
On Fr, 27.08.21 17:34, Nishant Nayan (nayan.nishant2...@gmail.com) wrote:

> So then where does journald logs its own messages if he wants to?

Depends, some of them it logs to the journal itself, via
server_driver_message(). But unless explicitly written that way log
messages of the journal go to kmsg.

Thing is, if you want to log an error message about your inability to
write a log message you better don't write that as a log message
that will then result in an error, and thus another log message, and
thus another log message and so on, and thus a cycle. Thus by default
all our own log messages go to kmsg, except for a bunch where we know
for sure they aren't the immediate effect of an attempt to write a log
message, and thus won't result in a cycle.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] why log_set_prohibit_ipc() is set in journald

2021-08-27 Thread Lennart Poettering
On Fr, 27.08.21 10:01, Mantas Mikulėnas (graw...@gmail.com) wrote:

> On Fri, Aug 27, 2021, 08:52 Nishant Nayan 
> wrote:
>
> > I have just started to learn journald and in its main function (in
> > journald.c) I encountered a function call "log_set_prohibit_ipc(true);"
> > In systemd source, I can see the declaration in src/basic/log.h:
> >
> > /* If turned on, then we'll never use IPC-based logging, i.e. never log to
> >  * syslog or the journal. We'll only log to stderr, the console or kmsg */
> > void log_set_prohibit_ipc(bool b);
> >
> > I did not get this because Journald not writing to journal itself by
> > default is strange, isn't it?
> > What is the reason behind it?
> >
>
> My understanding is that the point isn't to prevent logging to journal, but
> to prevent logging *through IPC* specifically, i.e. make sure journald
> doesn't try to create loopback connections to its own sockets. The journald
> daemon is single-threaded, so if it tries to connect to itself, it'll
> deadlock.
>
> But also if journald wants to log a critical error (e.g. running out of
> space or something like that), then it can't really *rely* on journal still
> working...
>
> Afaik, messages written to kmsg will be imported back into the journal
> anyway, but that happens asynchronously so it's fine.

The above describes exactly how it is, and why journald turns off
logging via IPC. journald should not be a client to itself.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How does journald talks to other services?

2021-08-24 Thread Lennart Poettering
On Di, 24.08.21 09:11, Nishant Nayan (nayan.nishant2...@gmail.com) wrote:

> So what are the cases where syslog forwards logs to journal?
> Is there a case where both journal and syslog end up sending same logs to
> each other ( like a cycle ) resulting in duplicate logs?

systemd does not pick up messages from another syslog service, only from
syslog clients. Thus, there is no loop.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How does journald talks to other services?

2021-08-20 Thread Lennart Poettering
On Fr, 20.08.21 15:01, Nishant Nayan (nayan.nishant2...@gmail.com) wrote:

> Hi,
> My query is how does systemd-journald talk to other services so that it
> stores their logs/output in journal files, which could be displayed using
> journalctl utlity.
>   I am currently looking into systemd journal code to find this out but so
> far no luck.
>   Any suggestions would be appreciated.

See docs:

https://www.freedesktop.org/software/systemd/man/systemd-journald.service

There's a list at the very beginning of the description there that
lists the 5 different ways how log messages are delivered to journald.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] [hostnamed] Why the service will automatically exit after 30 seconds

2021-08-19 Thread Lennart Poettering
On Do, 19.08.21 11:38, Michael Chapman (m...@very.puzzling.org) wrote:

> I've looked at this more closely now, and it's a bit more complicated than
> I would have liked.
>
> While hostnamed's own idle timeout can easily be disabled while a
> polkit conversation is in progress, that won't necessarily help anybody
> using hostnamectl. hostnamectl uses sd-bus's default method call timeout,
> which is 25 seconds.

If the exit-on-idle logic doesn't take outstanding policykit requests
into account that's a bug and should be fixed. Please file a github issue.

> Perhaps this should be increased for method calls that are likely to
> result in using polkit? 25 seconds might be too short for some people to
> enter their password.

We have a define HOME_SLOW_BUS_CALL_TIMEOUT_USEC which is what we
typically use for "slow" bus calls, instead of the default of 25s. We
use it wherever we expect "slow" method calls, because of
interactivity and such. It's used in a bunch of places, but likely not
in enough of them. File a bug if you find a place where we should use it.

> Is it possible for sd_bus_call to detect that the recipient of a call has
> dropped off the bus and is never going to return a response? If that were
> possible we could possibly rely on that rather than an explicit timeout. I
> think the answer to this question might be "no" though...

It detects that out-of-the-box.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Upgraded multiple systems to systemd 249.3 and all had eth1 not started / configured

2021-08-16 Thread Lennart Poettering
On Mo, 16.08.21 17:31, Amish (anon.am...@gmail.com) wrote:

>
> On 16/08/21 5:25 pm, Lennart Poettering wrote:
> > On Mo, 16.08.21 16:09, Amish (anon.am...@gmail.com) wrote:
> >
> > > Some old scripts that we have expect interface names starting with eth. But
> > > those names are not predictable.
> > >
> > > So to get predictable names starting with eth*, first I temporarily rename
> > > all interface with tmpeth*. This is done via udev rules.
> > >
> > > SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX",
> > > NAME="tmpeth0"
> > > SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY",
> > > NAME="tmpeth1"
> > > SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ",
> > > NAME="tmpeth2"
> > >
> > > Then I have a small service (script) which runs before network-pre.target to
> > > convert these names back to eth*
> > >
> > > #search for network interface with name starting from "tmpeth" and rename
> > > them to "eth"
> > > /usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf
> > > "%f\n" | while read tmpiface; do /usr/bin/ip link set dev "$tmpiface" name
> > > "$(echo $tmpiface | sed s/tmpeth/eth/)"; done
> > >
> > > This ensures that I have predictable names starting with eth*. And it is
> > > working fine from 2-3 years. Even with current issue, name assignment is
> > > working fine.
> > This cannot work and is necessarily racy. Stay out of the ethXYZ
> > namespace, that's the kernel's namespace. Pick any other names,
> > e.g. "foobar0", "foobar1", but otherwise you just have a racy racy
> > mess, because the kernel might take the name whenever it pleases.
>
> No, I don't think this is a race, because my script runs after udev has finished
> assigning the interface names.

Device probing can take any time it wants. There isn't a point in time
where everything is probed.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Upgraded multiple systems to systemd 249.3 and all had eth1 not started / configured

2021-08-16 Thread Lennart Poettering
On Mo, 16.08.21 16:09, Amish (anon.am...@gmail.com) wrote:

> Some old scripts that we have expect interface names starting with eth. But
> those names are not predictable.
>
> So to get predictable names starting with eth*, first I temporarily rename
> all interface with tmpeth*. This is done via udev rules.
>
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:XX",
> NAME="tmpeth0"
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:YY",
> NAME="tmpeth1"
> SUBSYSTEM=="net", ACTION=="add", ATTR{address}=="XX:XX:XX:XX:XX:ZZ",
> NAME="tmpeth2"
>
> Then I have a small service (script) which runs before network-pre.target to
> convert these names back to eth*
>
> #search for network interface with name starting from "tmpeth" and rename
> them to "eth"
> /usr/bin/find /sys/class/net -maxdepth 1 -name "tmpeth[0-9]" -type l -printf
> "%f\n" | while read tmpiface; do /usr/bin/ip link set dev "$tmpiface" name
> "$(echo $tmpiface | sed s/tmpeth/eth/)"; done
>
> This ensures that I have predictable names starting with eth*. And it is
> working fine from 2-3 years. Even with current issue, name assignment is
> working fine.

This cannot work and is necessarily racy. Stay out of the ethXYZ
namespace, that's the kernel's namespace. Pick any other names,
e.g. "foobar0", "foobar1", but otherwise you just have a racy racy
mess, because the kernel might take the name whenever it pleases.
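
To illustrate the advice above: a provisioning script could refuse any
name that falls into the kernel's ethN namespace before it writes udev
rules. This is only a sketch. The helper name and the pattern are
assumptions made for illustration, they are not taken from systemd or
the kernel.

```python
import re

# Assumption for illustration: names handed out by the kernel look like "eth<N>".
_KERNEL_ETH_NAME = re.compile(r"eth\d+")


def is_safe_custom_name(name):
    """Return True if `name` stays out of the kernel's ethN namespace."""
    return _KERNEL_ETH_NAME.fullmatch(name) is None
```

With this check "foobar0" is accepted and "eth1" is rejected. A
temporary name such as "tmpeth0" passes too, but renaming it back to
"eth0" afterwards would be rejected.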

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] --luks-offline-discard option has no effect on systemd-homed

2021-07-28 Thread Lennart Poettering
On Sa, 17.07.21 23:55, Gibeom Gwon (gb.g...@stackframe.dev) wrote:

> Hello,
>
> I'm trying to create a systemd-homed based user, there is a problem with
> --luks-offline-discard option. According to homectl(1), if this option is
> true,
> home directory loopback file is minified on logout. But in my case it's not
> at all. Here is my executed command and output.

I thought I had fixed that issue. Which systemd version is this?


Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Do Systemd service have limitation w.r.t IPC on forked process

2021-07-28 Thread Lennart Poettering
On Mi, 28.07.21 22:17, Stiju (stiju.e...@gmail.com) wrote:

> Hi,
> I have a templated service and socket, for my program.
> though it may not be relevant here, the socket is written for inetd (i.e. FD 0 and 1)
> the service type is put as "simple" in systemd configuration,
> the service spawns many processes in some flows ( fork + execv ) .
> the spawned process opens , IPC connection ( normal socket to another
> system ), the process is killed by systemd.
> is this expected?

Consider enabling debug logging in PID 1. It logs about everything it
kills then, and the logs should explain why.

"systemd-analyze log-level debug" is the command to do that during runtime.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Problem : service ( systemd )

2021-07-28 Thread Lennart Poettering
On Do, 22.07.21 20:54, Webstrucs (webstr...@gmail.com) wrote:

> What command do I use or how do I clean, delete or delete services that no
> longer exist and are not found from the memory, as I believe that this is
> influencing the activation of services I add, follow the attached terminal
> print:

Typically your package manager does this automatically for you when
you remove the package that contains the unit file.

The usual approach is to issue "systemctl disable --now …" on the unit
files before removing them, to ensure the units are stopped and
disabled. This will implicitly then do a "systemctl daemon-reload" for
you, too. You can also do that part manually, maybe after you actually
removed the unit files from disk.
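
A minimal sketch of that sequence, assuming you drive systemctl from a
script. The two systemctl invocations are the ones named above; the
function names are made up for this example.

```python
import subprocess


def removal_commands(unit):
    """Return the systemctl calls discussed above, in order."""
    return [
        # Stop and disable the unit before its file is removed.
        ["systemctl", "disable", "--now", unit],
        # Manual reload, e.g. after the unit file is gone from disk.
        ["systemctl", "daemon-reload"],
    ]


def run_all(commands):
    """Run each command and raise if one of them fails."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(removal_commands("foo.service")) needs the appropriate
privileges. "foo.service" is just a placeholder unit name.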

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Changing the Priority Level of Journald during Runtime

2021-07-28 Thread Lennart Poettering
On Mi, 28.07.21 10:02, Andreas Krueger (andreas.krue...@fmc-ag.com) wrote:

> Hi Folks,
>
> I'm a programmer who has to realize a dynamic priority level for
> Journald, i.e. the priority level for incoming messages that decide
> whether they are stored in the journal shall be changeable at
> runtime. As far as I know this is not directly supported by
> Journald, I would like to use a trick to accomplish this:
>
> Each time the level is to be changed, the configuration value of
> MaxLevelStore is changed accordingly and Journald is restarted by
> command systemctl restart systemd-journald. Since this will take some
> time, I ask myself, what will happen to messages that arrive during
> this time? Are they gone or buffered and available after the
> restart?

The sockets that journald listens on are allocated outside of
journald, before it is started, via the systemd-journald*.socket unit
files. They also stay up while journald is restarted, and thus any
messages queued in them will remain so. Thus restarting journald
during runtime should be safe and won't lose you any messages.
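
The queueing behaviour can be reproduced outside of systemd. The sketch
below is not journald code. It only assumes a plain AF_UNIX datagram
socket that is bound before any reader exists, standing in for the
socket unit, and it shows that datagrams sent while nobody is reading
are still there once a reader shows up. The function name and socket
path are made up for the example.

```python
import os
import socket
import tempfile


def queued_messages_survive_restart(messages):
    """Send datagrams while no reader is active, then read them afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log.sock")
        # Stand-in for the socket unit: the socket exists independently
        # of the service that will eventually read from it.
        listener = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        listener.bind(path)
        try:
            # Clients log while the "service" is down (nobody calls recv()).
            client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
            try:
                for msg in messages:
                    client.sendto(msg, path)
            finally:
                client.close()
            # The "service" is back and drains what the kernel queued.
            listener.setblocking(False)
            received = []
            while True:
                try:
                    received.append(listener.recv(4096))
                except BlockingIOError:
                    break
            return received
        finally:
            listener.close()
```

The socket queue is finite of course, so this only demonstrates the
mechanism for a handful of small messages.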

> Or is a dynamic priority level available in the meantime?

It's not available right now. Please file an RFE issue on github so
that we can look into it. Or provide a patch that adds it.

We nowadays have the Varlink IPC API in journald, so it should be pretty
straightforward to add this logic there.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Exception safety od sd-bus

2021-07-27 Thread Lennart Poettering
On Do, 22.07.21 14:04, Stanislav Angelovič (angelovi...@gmail.com) wrote:

> Hi guys!
>
> Assuming sd-bus is used in a C++ application, is sd-bus safe against
> exceptions flying from e.g. a sd-bus vtable callback handler (provided by
> the C++ application) and catching them in the caller of sd_bus_process()
> (which is the same C++ app)?
>
> Or this is not supported (so leaks or whatever obscure situations may
> happen then)?

systemd does not use C++, we are a C project.

We make sure superficially our C header files are compatible with C++,
but that's how far our support for C++ goes.

I have no experience with C++ exceptions and C stack frames. We have
no explicit support for any of it, so they are handled like in any
program where C++ is called from C contexts, and I figure there will
be plenty docs about that.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Failing UnitTest for Journald

2021-07-06 Thread Lennart Poettering
On Mo, 05.07.21 18:46, Andreas Krueger (andreas.krue...@fmc-ag.com) wrote:

> Hi Folks,
>
> for a customer I have to verify the behavior of the logger in its
> system (Linux debianVM 4.19.0-6-amd64 #1 SMP Debian
> 4.19.67-2+deb10u1 (2019-09-20) x86_64 GNU/Linux), which is journald
> (systemd 241 (241)).  For this, I have written some unit tests that
> work all well, when executed separately. But running together they
> lead to some erroneous behavior that I cannot explain - maybe you
> have an idea what's going wrong...

I am not sure I follow entirely what you are doing. But please be
aware of the following race we cannot fix without kernel changes:

Whenever journald receives a log message from a client it uses the
SCM_CREDENTIALS metadata of the incoming packet to retrieve further
metadata from /proc/$PID/ about the client. If the sender's process
exited by the time the message is processed this data is not available
anymore, and thus will not be stored along with the message, and can
thus not be used to search/filter for the message.

Thus: whenever you have a process that logs and immediately exits
there's a chance that once journald processes that log message, it is
seen and written to disk but without much of the metadata. If you then
use "journalctl -u" or a similar command (e.g. the log output of
systemctl status) to look for the logs of the unit it will likely not
be found since the unit name is one of the metadata fields not
available in this case (since it is extracted from the
/proc/$PID/cgroup file by journald).

You should see the log message if you do not use filtering though,
i.e. "journalctl -e" or so should show it.
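
A simplified model of the lookup that races here, for illustration
only. It is not journald's code: it merely reads the cgroup file of a
PID, looks only at the unified-hierarchy ("0::") line, and returns
None when the process is already gone. The function name and the
proc_root parameter are assumptions of this sketch.

```python
import os


def read_cgroup_path(pid, proc_root="/proc"):
    """Return the cgroup path of `pid`, or None if the process is gone.

    Simplification: only the unified-hierarchy line ("0::<path>") is used.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "cgroup")) as f:
            for line in f:
                hierarchy_id, _controllers, path = line.rstrip("\n").split(":", 2)
                if hierarchy_id == "0":
                    return path
    except (FileNotFoundError, ProcessLookupError):
        # The sender exited before we got around to looking it up.
        return None
    return None
```

If the sender has exited between sending the datagram and this lookup,
the result is None, and that is exactly the case where the unit name
cannot be attached to the message.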

We'd really like to fix this one day, but unfortunately kernel
developers so far shot down any attempt to optionally attach more
metadata to AF_UNIX datagrams (if we had just the cgroup this would
already make things *so* much better for us).

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Are there any circumstances under which we would *expect* init.scope to not exist?

2021-07-01 Thread Lennart Poettering
On Di, 29.06.21 20:29, McKay, Sean (sean.mc...@hpe.com) wrote:

> Hi there,
>
> I have (what I think will be) an easy question: Are there any
> circumstances or configuration behaviors under which we would expect
> the init.scope cgroup to not get created (still on v1, if
> applicable). I suspect the answer is no, but I wanted to verify.

init.scope is where PID 1 moves itself into early on if it's not in
there already. Note that if PID 1 is started in a cgroup somewhere
down the tree it will create the init.scope cgroup below *that*, and
not at the top. It's just that the kernel by default will of course
run PID 1 in the top cgroup. Which means if you have a weird initrd
that moves things around, then PID 1 might live in some arbitrarily
chosen subcgroup.

> Context if you care: we've got a source based distribution (yocto
> project based. 3.0, specifically) running systemd 243 (not
> supported, I know) and I've just discovered that init.scope is not
> created on our systems. I assume that this is a bug and something
> that we've broken, but it seemed easy enough to ask if there are any
> circumstances where it might be expected before I get out my shovel
> and start digging.  Thanks!  -Sean McKay

Really old systemd versions didn't do init.scope. They ran PID 1 in
the top-level cgroup. We introduced the init.scope mostly to fulfill
cgroupv2's requirement of not running processes in inner cgroups. We
introduced it on both cgroupv1 and cgroupv2 when we added it, even
though not strictly necessary on the former, to minimize behavioural
differences.

Maybe your old systemd is just *that* old?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] minimum space needed for reload/reexec

2021-07-01 Thread Lennart Poettering
On Mi, 30.06.21 11:39, CHEN, Jack (zenghuc...@siemens.com) wrote:

> Hi,
>
> I happened to see a fix in systemd which adds a free-space check before
> reload/reexec.
> From the code:
> /* Require 16MiB free in /run/systemd for reloading/reexecing. After all we need
>  * to serialize our state there, and if we can't we'll fail badly. */
> #define RELOAD_DISK_SPACE_MIN (UINT64_C(16) * UINT64_C(1024) * UINT64_C(1024))
> And I got further explanation from poettering:
> it's a "safety buffer", see commit msg of the fix. It's set to 16M because it
> has to be set to something, and it sounded like a reasonable value, and so
> far we got no feedback to the contrary
>
> However, just for “serialize our state there”, literally, there is
> no need for such a big space (16M).

We can't precisely estimate that, since the space required for tmpfs
inodes is not clear to us from userspace in advance.

Note that 16M is a safety limit. It doesn't mean that we actually
intend to use that much. It just says, let's better not try if there's
less than this available. Most likely we'll use much less than the
16M. Hence, please don't misunderstand this as "this is how much we'll
use", but as "let's not try if less than this is available".

There's simply no Linux kernel API for querying in advance explicitly
how much space would be subtracted from the current free space of a
tmpfs instance if we'd write a single file of size X with a filename
of Y there.

> There is no problem for PC. But for embedded devices, 16M is quite prodigal.
> And in my test, even 1M free space would allow reloading/reexecing
> to work normally.

Bigger systems have bigger state serializations though.

> Shall we remove this restriction? (By removing I mean we just check if there is
> free space or not.) Or at least lower the threshold.
> It is not good to forbid reloading/reexecing when there is enough free space,
> just because it is smaller than the threshold (16M).

Removing is not an option. We added it for a reason: people filled up
their /run/ by accident and then issued "systemctl daemon-reload"
which would fail half-way because the file system was full.

I'd be willing to take a patch that changes this to

MIN(16*1024*1024, physical_memory_scale(2, 100));

or something like this.

This would mean we'd scale the safety net by the amount of physical
memory in the system. i.e. 2% of physical RAM, but 16M at most. This
should then cover your case too? i.e. enforce a lower limit on smaller
systems, and the existing 16M limit on bigger ones?

(I pulled the 2% out of my hat, whatever makes sense..)
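
To get a feeling for the numbers, here is the proposed formula as a
small sketch. It is not the systemd implementation: the helper below
takes the total amount of memory as an explicit argument so that it
can be tried with arbitrary values, and the function names are chosen
for this example only.

```python
import os

SAFETY_CAP = 16 * 1024 * 1024  # the existing 16M limit, in bytes


def physical_memory_scale(total_bytes, v, maximum):
    """Scale total_bytes by v/maximum using integer arithmetic."""
    return total_bytes * v // maximum


def reload_space_min(total_bytes):
    """2% of physical memory, but never more than 16M."""
    return min(SAFETY_CAP, physical_memory_scale(total_bytes, 2, 100))


def physical_memory_bytes():
    """Physical memory of the running machine (Linux sysconf names)."""
    return os.sysconf("SC_PHYS_PAGES") * os.sysconf("SC_PAGE_SIZE")
```

With these definitions a device with 256M of RAM would need roughly 5M
free, and everything from 800M of RAM upwards stays at the current
16M.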

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Fast respawning jobs

2021-06-25 Thread Lennart Poettering
On Fr, 25.06.21 11:17, Szymanski, Kai (kai.szyman...@luerssen.de) wrote:

> Hi Lennart,
>
>
> the description for StartLimitIntervalSec && StartLimitBurst:
>
> "more than burst times within an interval time span are not permitted to 
> start any more"
>
>
> But I need: a job that returns after 4 seconds with status code 0
> has to be started again. Of course I can raise the StartLimit... but
> this is not a real solution in my eyes

I don't understand what the problem is supposed to be?

You want quick restarts or you don't?

You configured your unit to prohibit that via the start limits you
defined. If you want to allow quick, repeated starts then raise the
limit.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Fast respawning jobs

2021-06-25 Thread Lennart Poettering
On Fr, 25.06.21 09:32, Szymanski, Kai (kai.szyman...@luerssen.de) wrote:

> Hi,

Please always mention the systemd version you are operating with. (And
distro would be good too).

> I have a job that respawns very fast if there is nothing to do. The
> service file looks like this:
>
>
> [Unit]
> Description=job1
> StartLimitIntervalSec=100
> StartLimitBurst=5

With these two lines you declare that starting the unit shall fail
when attempted more than 5 times in 100s.
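
A simplified model of such a limit, for illustration. It assumes a
fixed window that opens with the first start and allows at most
"burst" starts until the interval has passed. It is a sketch of the
idea, not systemd's rate limiting code, and the class name is made up.

```python
class StartLimit:
    """Allow at most `burst` starts per `interval_sec` window."""

    def __init__(self, interval_sec, burst):
        self.interval = interval_sec
        self.burst = burst
        self.begin = None  # start time of the current window
        self.count = 0     # starts seen in the current window

    def allow(self, now):
        """Return True if a start at time `now` (seconds) is permitted."""
        if self.begin is None or now - self.begin > self.interval:
            # No window yet, or the old one expired: open a new window.
            self.begin = now
            self.count = 1
            return True
        if self.count < self.burst:
            self.count += 1
            return True
        return False
```

With StartLimit(100, 5) a job that exits and is restarted once per
second gets its starts at 0s to 4s, and the sixth attempt at 5s is
refused. In this model a new window only opens after the 100s have
passed.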

> Problem is that after several retries the job goes to failed state (but it
> exits after 1 second with result code 0).

Raise the start limit?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Alias for SMTP providers [ie. mutually exclusive service alternatives]

2021-06-16 Thread Lennart Poettering
On Di, 15.06.21 02:03, Kenneth Porter (sh...@sewingwitch.com) wrote:

> What happens if I list multiple services in a Wants= and After= clause that
> are mutually exclusive (eg. sendmail/postfix/exim? How can I say "This unit
> needs to send mail" without knowing which is enabled?

What does "needs to send mail" even mean? That /usr/sbin/sendmail can
be called to queue a message? That you can talk to localhost:25?

A well-behaved MTA actually makes /usr/sbin/sendmail work without the
main mail daemon being up. The mail is then only enqueued, not
dispatched, but that'll be done once the service is fully up.

A well-behaved MTA would also support socket activation, so that
port 25 is already bound while the daemon starts up. This means that no
ordering for regular services is necessary.

Anyway, we do not provide any generic target for this, but your distro
might add that.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Running systemd unprivileged in Docker container

2021-06-14 Thread Lennart Poettering
On Fr, 11.06.21 16:55, Johannes Ernst (johannes.er...@gmail.com) wrote:

> I can run a full Arch system (with systemd as PID 1) in a Docker container in 
> Docker privileged mode:
> sudo docker run -i -t --privileged archlinux /usr/lib/systemd/systemd
> but privileged mode is, well, a bit privileged. I believe I used to be
> able to tone this down with something like:

So, Docker has an upstream that is pretty hostile towards systemd. As
a result, while pretty much all other container managers mostly just
work with systemd as payload, Docker does not.

We document extensively what expectations we have on a container
manager for things to just work:

https://systemd.io/CONTAINER_INTERFACE

The requirements aren't crazy, the few requirements of the above you
really need should be pretty common sense, yet Docker isn't interested.

My recommendation would be to pick an alternative container manager
with a less hostile upstream. e.g. podman is supposedly a drop-in
replacement and should just work.

If you want to use Docker anyway, I figure you have to make sure you
boot in cgroupsv1 mode (last time I looked the cgroupsv2 support in
Docker wasn't really more than an experiment), and stick to that. Make
sure that cgroupns is enabled, and that /sys/fs/cgroup/ is a tmpfs,
and /sys/fs/cgroup/systemd a cgroupfs mount of the top of the cgroup
namespace the container runs in, and that it is writable.

Not sure how to configure that with Docker, as I am not a Docker
person. Ideally this would be the default setup of Docker, but well,
apparently it isn't.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Alais for SMTP providers

2021-06-14 Thread Lennart Poettering
On Sa, 12.06.21 03:33, Kenneth Porter (sh...@sewingwitch.com) wrote:

> I just finished adding a custom service to send an email on system
> shutdown/startup, based on this article:
>
> <https://askubuntu.com/questions/851946/send-me-an-email-on-computer-shutdown>
>
> I ended up coding an After for postfix.service so the mail would get
> delivered before the system shut down. I'd like to be able to use the same
> unit file on older systems that use sendmail, and I know there are other
> packages that provide SMTP and local mail. So it would be desirable to have
> an Alias for those services. I'm using CentOS 8 (7 on older systems). Is
> this perhaps already present in a newer commit? Is there a registry for
> well-known aliases for package writers?

You mean a generic target or so that generically encapsulates "mail
server is up"?

We have no such target upstream, and I am not sure we should add
that. Maybe your downstream distro has that though.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd.socket man pages update suggestion

2021-06-14 Thread Lennart Poettering
On Do, 10.06.21 13:44, Ted Toth (txt...@gmail.com) wrote:

>  SELinuxContextFromNet=
>Takes a boolean argument. When true, systemd will attempt to
>figure out the SELinux label used for the instantiated
>service from the information handed by the peer over the
>network. Note that only the security level is used from the
>information provided by the peer. Other parts of the
>resulting SELinux context originate from either the target
>binary that is effectively triggered by socket unit or from
>the value of the SELinuxContext= option. This configuration
>option only affects sockets with Accept= mode set to "yes".
>Also note that this option is useful only when MLS/MCS
>SELinux policy is deployed. Defaults to "false".
>
> Add:
> One or more of the associated service files
> StandardInput/StandardOutput/StandardError options should be set to
> socket for this option to work.
>
> From execute.c:
>         if (context->std_input == EXEC_INPUT_SOCKET ||
>             context->std_output == EXEC_OUTPUT_SOCKET ||
>             context->std_error == EXEC_OUTPUT_SOCKET) {
>
>                 if (params->n_fds != 1) {
>                         log_unit_error(params->unit_id, "Got more than one socket.");
>                         return -EINVAL;
>                 }
>
>                 socket_fd = params->fds[0];
>         } else {
>                 socket_fd = -1;
>                 fds = params->fds;
>                 n_fds = params->n_fds;
>         }
>
> When socket_fd is -1 the SELinux context is not computed. Text like
> this would have saved a lot of head scratching and code reading :(

We should probably make this work for any service that is instantiated
with a single fd. Can you file a bug on github asking for this?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Are Pathnames in /tmp/systemd-private-foo predictable?

2021-06-14 Thread Lennart Poettering
On So, 13.06.21 21:04, Marc Haber (mh+systemd-de...@zugschlus.de) wrote:

> Hi,
>
> I am wondering where the 32 xdigit number in pathnames like
>
> systemd-private-27aa635a15cf4da0a7ebda10f25c3950-chrony.service-9DShFi/
>
> comes from. I always had the impression that it's the systemd/dbus
> machine id, but that does not seem to be the case. Is that just an
> arbitrary random number, or can it be predicted in a way?

It's the boot ID, i.e. /proc/sys/kernel/random/boot_id. We include it
in the name so that we can distinguish such dirs of the current boot
from those of earlier boots (which can be retained because of abnormal
shutdown or so). In the latter case we can safely remove them to avoid
collecting left-over directories.

The dirs are not predictable in their name. The 6 char suffix you see
is the mkstemp() randomized suffix to make them safe against collision
attacks.
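
As a sketch of how such a name can be taken apart and compared
against the current boot: the layout assumed below (32 hex digits of
boot ID, then the unit name, then a 6 character random suffix) is
derived from the example name quoted above. The function names are
invented for this illustration.

```python
import re

# Assumed layout: systemd-private-<boot id, 32 hex digits>-<unit>-<6 char suffix>
_PRIVATE_DIR = re.compile(
    r"systemd-private-(?P<boot_id>[0-9a-f]{32})-(?P<unit>.+)-(?P<suffix>[A-Za-z0-9]{6})"
)


def parse_private_dir(name):
    """Split a directory name into (boot_id, unit, suffix), or return None."""
    m = _PRIVATE_DIR.fullmatch(name)
    if m is None:
        return None
    return m.group("boot_id"), m.group("unit"), m.group("suffix")


def current_boot_id(path="/proc/sys/kernel/random/boot_id"):
    """Read the boot ID and drop the dashes of its UUID formatting."""
    with open(path) as f:
        return f.read().strip().replace("-", "")


def is_from_current_boot(name, boot_id):
    """True if the directory name carries the given boot ID."""
    parsed = parse_private_dir(name)
    return parsed is not None and parsed[0] == boot_id
```

Directories for which is_from_current_boot(name, current_boot_id()) is
false would be the left-overs from earlier boots.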

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Discrepancy in using dhclient b/w ubuntu 20.04 and ubuntu 16.04

2021-06-08 Thread Lennart Poettering
On Di, 08.06.21 15:05, Reindl Harald (h.rei...@thelounge.net) wrote:

> > I have attached a minimalistic repro along with the codes of all the
> > scripts, service files. I suppose Silvio was able to see the files.
>
> i don't get the bash-nonsense for a handful of lines (most of them doing
> nothing at all) to begin with and given that there is no "Type=" in the unit
> file you may read the docs and try the different types
>
> i also don't get the trial-binary
>
> why in the world don't you trhow away all that crap inlcuding the docker
> container and start dhclient at your own from a trivial systemd-unit?

Reindl, I warned you very explicitly not to behave like this:

https://lists.freedesktop.org/archives/systemd-devel/2021-February/046028.html

You ignored that now. You are now blocked on this mailing list.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-nspawn with filesystem id mapping

2021-06-08 Thread Lennart Poettering
On Fr, 04.06.21 14:53, systemd-de...@notandy.de (systemd-de...@notandy.de) wrote:

> Hi again,
>
> after some more debugging this EOVERFLOW seems to be the result of a call to 
> may_o_create in fs/namei.c in the kernel.
> There is a check:
>
> if (!fsuidgid_has_mapping(dir->dentry->d_sb, mnt_userns))
>   return -EOVERFLOW;
>
> This seems to be the one returning EOVERFLOW to nspawn and resulting in the
> container spawn to fail.
> My guess would be that this is a systemd bug when combining filesystem id
> mapping with --bind.
> Before I start spending more time debugging this, has anyone so far used
> --bind with --private-users=pick and --private-users-ownership=map
> successfully?
>
> As far as I understand, the pull request #19438 didn't add any handling to
> the mount_bind function. Was this maybe overlooked?
> In my understanding there is a remount_idmap missing in that function, and
> the touch needs to be done in the correct user namespace or with mapped
> uid/gids.
>
> I'm new to the systemd source code, could somebody confirm that I'm on the 
> right track there and not heading in the wrong direction?

Let's follow up on the PR, it's the better place for development
discussions on specific bugs or problems. I replied on it the other
day.


Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] rg...@outlook.com

2021-06-07 Thread Lennart Poettering
On Mo, 07.06.21 22:47, Aravindhan Krishnan (aravindhan...@gmail.com) wrote:

> Hi Lennart,
>
> Thanks for the quick response. Yes, we are running systemd inside the
> docker. We were also able to see the same issue even on top of
> Centos 7.9.

Unlike pretty much all other container managers Docker doesn't really
make it easy to run systemd inside it. Docker upstream is pretty
hostile towards systemd, so this is unlikely to change.

We document pretty extensively what container managers have to do to
make sure systemd just works inside containers. Pretty much all
container managers just implement that, but Docker doesn't. This is
what they need to implement:

https://systemd.io/CONTAINER_INTERFACE

Consider switching to a different container manager implementation,
there are plenty others. (in particular podman is mostly a drop-in
replacement for Docker, if you need Docker semantics. Podman upstream
isn't hostile towards systemd, so things mostly just work there.)

> Attaching the kernel and OS details of the centos host
>
> # uname -r
> 3.10.0-1160.25.1.el7.x86_64
>
> # cat /etc/centos-release
> CentOS Linux release 7.9.2009 (Core)

This is very old. You might want to switch to a newer OS for this
anyway.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] rg...@outlook.com

2021-06-07 Thread Lennart Poettering
On Mo, 07.06.21 21:26, Aravindhan Krishnan (aravindhan...@gmail.com) wrote:

> Hi Folks,
>
> I am finding anomalous behavior when I am trying to run dhclient process
> inside my docker container in vanilla Ubuntu 16.04 host. The service gets
> into "deactivating" state and is stuck forever. In the mail I have attached
> a minimalistic reproduction of the issue seen.

Are you running systemd inside of a Docker container on Ubuntu 16.04?

Docker isn't really up to that. In particular not 5y old versions of it.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] DHCP6 client failing when /etc is mounted as overlayfs

2021-06-02 Thread Lennart Poettering
On Di, 01.06.21 09:42, Alessandro Tagliapietra (tagliapietra.alessan...@gmail.com) wrote:

> Thanks for helping Mantas,
>
> What I saw is:
>  - before first boot /etc/machine-id is empty (and I think that's expected)
>  - right after boot, /etc/machine-id isn't writable because the root fs is
> mounted as readonly from fstab
>  - after the /etc overlay is mounted /etc/machine-id should still be the
> one from the underlying filesystem and at this point is also writable,
> however it's still empty
>
> During boot I see:
>
> [3.577477] systemd[1]: Initializing machine ID from random generator.
> [3.584284] systemd[1]: Installed transient /etc/machine-id file.
>
> however /etc/machine-id shouldn't be writable at that point, what should I
> do? Make our overlay mount unit depend on whatever service is generating
> machine-id and make sure our mount happens before the generation of
> machine-id?

The assumption is that the machine-id is accessible and remains stable
during the entire system uptime, once the host PID 1 is initialized
(i.e. after transitioning from the initrd). Apps should be able to rely
on the machine ID just working and being cacheable.

If you replace /etc/ with a different file system during runtime,
that's OK as long as that file remains accessible throughout.

Note that if /etc/machine-id is empty at boot and /etc is read-only, PID 1
will generate a transient machine ID and write it to a file in /run,
which it then bind mounts over /etc/machine-id, so that it appears
there unconditionally. If you now replace /etc with your own overlayfs
you need to make sure to cover this bind mount too. Note that the
lower layers of an overlayfs refer to the specified top-level mount
points only: a lower layer is not the whole tree of mounts but only the
mount you explicitly list.

This means you probably want to prepare your overlayfs at some
temporary location first, then bind mount the existing bind mount that
is /etc/machine-id over the overlayfs at the same place, and then move
the whole overlayfs to /etc into place. That way /etc/ is suddenly
replaced by your overlayfs but /etc/machine-id will be accessible in a
stable way continuously.
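
A rough sketch of that sequence, written as a small Python helper that
only builds the mount command lines and does not run them. The staging
path /run/etc-staging and the upper/work directories are made-up
placeholders, and whether "mount --move" is permitted may depend on the
mount propagation settings of your system, so treat this as an outline
rather than a tested recipe:

```python
def etc_overlay_commands(staging="/run/etc-staging",
                         upper="/var/overlay/etc/upper",
                         work="/var/overlay/etc/work"):
    """Return the mount invocations, in order, as argv lists.

    All three path defaults are hypothetical placeholders.
    """
    options = f"lowerdir=/etc,upperdir={upper},workdir={work}"
    return [
        # 1. Assemble the overlayfs at a temporary location, not on /etc itself.
        ["mount", "-t", "overlay", "overlay", "-o", options, staging],
        # 2. Carry the existing /etc/machine-id bind mount over into the overlay.
        ["mount", "--bind", "/etc/machine-id", f"{staging}/machine-id"],
        # 3. Move the finished overlay, including the bind mount, onto /etc.
        ["mount", "--move", staging, "/etc"],
    ]


if __name__ == "__main__":
    for argv in etc_overlay_commands():
        print(" ".join(argv))
```

The point of the ordering is that /etc/machine-id is never without its
bind mount: it is already in place inside the overlay by the time the
overlay lands on /etc.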

Note that /etc/machine-id is used by various parts of systemd. DHCP
stuff is just one case. Logging uses it too and plenty other
stuff. Hence, you really should follow the documented behaviour of
machine-id, because if you don't then things will break all over the
place.

Please see machine-id(5) for details about the file.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] luks - a particular device systemd treats differently?

2021-06-02 Thread Lennart Poettering
On Mi, 02.06.21 10:00, lejeczek (pelj...@yahoo.co.uk) wrote:
> Conditional check - systemctl is-failed ... - works for all devices but that
> one.

I am sorry, but I don't really follow.

I understand though that one of the instances of
systemd-cryptsetup@.service fails for you? Please provide the full
logs of that unit. "journalctl -u ".

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] what's the order in which systemd .device units are created ?

2021-06-02 Thread Lennart Poettering
On Mi, 02.06.21 14:00, Abder (koute102...@gmail.com) wrote:

> Hi,
>
> Even though I've browsed through a lot of resources I haven't found any
> satisfying answer to my question. I've been trying to minimize the booting
> time in the user land on my embedded board, and when I run the classic
> $systemd-analyze plot > plot.svg I saw that there is a non-negligible slot
> of time in which all what systemd does is creating device units that were
> discovered via udev.
>
> My problem is in the order in which these device units are created,
> specially for the block device that contains my rootfs. What I noticed is
> that when these device units are created, the one corresponding to my
> rootfs blockdev partition is always the last one created, causing the other
> services depending on it to wait much more than if the device unit was
> created earlier.
>
> So, I would like to know if systemd follows a special order when creating
> these units, and if yes, what can I do so the device unit of my rootfs
> blockdev partition can be the first one created ?

The systemd-udev-trigger.service unit invokes "udevadm trigger" to
trigger all devices that are already discovered by the kernel at that
point. PID 1 listens to that and synthesizes .device units from all
devices popping up. The order in which "udevadm trigger" triggers them
is pretty much the order in which readdir() returns the devices when
iterating through /sys/, i.e. basically undefined.

There's a github issue open somewhere where I proposed adding a new
switch to "udevadm trigger" called --priorize-subsystem=… or so, which
you can use to prioritize one or more subsystems: any subsystems listed
that way are triggered first. That way you could do "udevadm trigger
--priorize-subsystem=block" to ensure block devices are triggered
first (in fact, we probably should use this as default once we have
the concept, given that the most relevant devices we wait for at boot
are probably all block devices). Would be happy to review/merge a patch
for that.

Note that the order in which devices are triggered does not directly
translate to the order in which devices actually are processed and
make it through udevd and are picked up by PID 1. udevd runs rules for
each device, and if those are slow and take a bit of time, this might
delay delivery of the events to PID 1. However, there's certainly some
relationship here: if certain devices are the ones we start processing
first they are likely also the devices where we finish processing them
first, even if there's no strict guarantee for that.
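
To illustrate the idea behind such a switch, here is a small Python
sketch. It is not how udevadm is implemented, the device paths are only
example data, and the function name is invented for this illustration:
devices from the listed subsystems are moved to the front, everything
else keeps whatever order the enumeration happened to produce.

```python
def trigger_order(devices, prioritized):
    """Order (subsystem, syspath) pairs so prioritized subsystems come first.

    Devices of the subsystems named in `prioritized` are moved to the front,
    in the order the subsystems are listed; all other devices keep the
    (arbitrary) enumeration order they came in.
    """
    rank = {name: i for i, name in enumerate(prioritized)}
    # sorted() is stable, so devices with equal rank keep their input order.
    return sorted(devices, key=lambda dev: rank.get(dev[0], len(rank)))


if __name__ == "__main__":
    enumerated = [
        ("tty", "/sys/devices/virtual/tty/tty0"),
        ("block", "/sys/devices/virtual/block/loop0"),
        ("net", "/sys/devices/virtual/net/lo"),
        ("block", "/sys/devices/virtual/block/loop1"),
    ]
    for subsystem, syspath in trigger_order(enumerated, ["block"]):
        print(subsystem, syspath)
```

As said above, this only affects the order in which events are
triggered, not the order in which udevd finishes processing them.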

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] luks - a particular device systemd treats differently?

2021-06-01 Thread Lennart Poettering
On Di, 01.06.21 16:55, lejeczek (pelj...@yahoo.co.uk) wrote:

> Hi guys.
>
> I have a crypttab here:
>
> luks-devs /dev/mapper/dev1-devs /etc/.etc.enc.loop/crypttab.key
> discard,nofail,timeout=3s,noauto
> luks-home /dev/mapper/dev1-home /etc/.etc.enc.loop/crypttab.key
> discard,nofail,timeout=3s,noauto
> ...
> plus a few more lines with all options just as those two. I have a fstab
> here:
>
> /dev/mapper/luks-home /home   ext4
> noauto,nofail,noatime,nobarrier,noatime,x-systemd.device-timeout=3s 1 2
> /dev/mapper/luks-devs /devs   ext4
> noauto,nofail,noatime,nobarrier,noatime,x-systemd.device-timeout=3s 1 2
> ...
>
> when I check devices manually here:
> ...
> systemctl is-failed "systemd-cryptsetup@luks\x2ddevs.service" -q && {
> systemctl restart "systemd-cryptsetup@luks\x2ddevs.service" && fsck.ext4
> /dev/mapper/luks-devs && mount /devs; }
>
> then I get asked for passphrase and the rest gets going (intentional, as
> those luks devs do not get opened at boot time) for all the devices except
> for "devs"
> As  I understand systemd here does not see, mark that one device as "failed"
> and I have no idea why systemd would do that for that one device.
> Would somebody care to share some ideas?

I am not sure I properly grok what you are trying to say, but: did you
check the logs?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Adding USB ID to hwdb/usb.ids

2021-06-01 Thread Lennart Poettering
On Do, 13.05.21 11:54, Thomas A (thomas...@gmail.com) wrote:

> Hi,
>
> I'm trying to add the info for Thrustmaster T150 Racing Wheel to the hwdb. I
> have found that the USB values are stored hwdb.d/20-usb-vendor-model.hwdb.
> However, it seems that this file is never edited manually, but just pulled
> from linux-usb.org usb.ids.
>
> www.linux-usb.org seems mostly broken and various Google results indicate
> that there is no response to e-mails. I have, just now, tried to submit a
> patch by following the guide for the PCI IDs, from which the USB IDs site
> was copied. I could not register in any way, so I suspect my e-mail will be
> dropped.
>
> Is there any better way to get my ID into the USB ID list in hwdb?
>
> Patch on systemd hwdb:
> https://github.com/thomasa88/systemd/commit/7af622d9d2335f9fc0e94b3b8a5139ef959bef9c

If this is indeed the case I guess we could start maintaining a
database for this kind of stuff in hwdb.d/ and ship it with
systemd. Maybe called 20-usb-vendor-model-extra.hwdb or so?

Kinda sucks though. Any reference where this is discussed? i.e. the
Google results you mentioned? A quick Google search I did myself didn't
reveal anything.

Anyway, please consider submitting the addition as a PR if it's indeed
unlikely linux-usb.org comes back as a maintainer for this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] syntax checker

2021-06-01 Thread Lennart Poettering
On Sa, 22.05.21 14:03, Johannes Köhler (koehler.johan...@googlemail.com) wrote:

> much valued maintainers of systemd!
>
> about myself and my network appearance youll
> find appended... + on(e)of(f) my acronyms is kefko
>
> I am using systemd since it was included within
> ARCH Linux...
>
> = QUESTION =
>
> Like i understood - the principle by systemd is using
> configuration files works like "first come first serve".
> Means, when a keyword matched other keywords will get
> ignored.
>
> Is there something like a "syntax checker tool" to
> generate a summary of the overhead, errors, dependencies
> etc., before running the daemon into the new configuration
> setting...

"systemd-analyze verify"


Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Antw: [EXT] Re: What causes "systemd-journald[3256]: Missed 127 kernel messages"

2021-06-01 Thread Lennart Poettering
On Di, 01.06.21 14:33, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> >>> Lennart Poettering wrote on 01.06.2021 at 13:39:
> > On Di, 01.06.21 12:42, Ulrich Windl (ulrich.wi...@rz.uni‑regensburg.de)
> wrote:
> >
> >> Jun 01 12:33:10 h18 systemd‑journald[3256]: Missed 195 kernel messages
> >>
> >> A few questions:
> >> 1) What causes this?
> >
> > Dunno. Something is massively flooding the kernel log buffer. Probably
> > some borked driver or so. "dmesg" might tell you what.
>
> I had meant the dropping of messages, not the creation of such. It seems it's
> intentional.

Ahumm. It does not. Generating such high frequency log messages is a
bug. Please report to your kernel maintainers.

> > You could also enlarge the kernel log buffer, see log_buf_len= kernel
> > cmdline switch.
>
> Confused: So is it the kernel dropping/losing messages, or is it journald?

The kernel is generating them faster than userspace can keep up with them.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] What causes "systemd-journald[3256]: Missed 127 kernel messages"

2021-06-01 Thread Lennart Poettering
On Di, 01.06.21 12:42, Ulrich Windl (ulrich.wi...@rz.uni-regensburg.de) wrote:

> Jun 01 12:33:10 h18 systemd-journald[3256]: Missed 195 kernel messages
>
> A few questions:
> 1) What causes this?

Dunno. Something is massively flooding the kernel log buffer. Probably
some borked driver or so. "dmesg" might tell you what.

> 2) How can systemd know how many messages were missed?

Kernel messages delivered to userspace come with a sequence number. If
there are some missing we know the kernel dropped them from the kmsg
log buffer before we could read them.
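
For illustration, a minimal Python sketch of that bookkeeping. This is
not journald's actual code; it only assumes the record layout used by
/dev/kmsg, where each record starts with "priority,sequence,timestamp,flags;"
followed by the message text, and the sample records are made up:

```python
def count_missed(records):
    """Count kernel messages missing between consecutive /dev/kmsg records."""
    missed = 0
    expected = None
    for record in records:
        if record.startswith(" "):
            continue  # continuation line carrying KEY=value metadata
        header = record.split(";", 1)[0]
        seq = int(header.split(",")[1])
        if expected is not None and seq > expected:
            # The kernel overwrote these records before we read them.
            missed += seq - expected
        expected = seq + 1
    return missed


if __name__ == "__main__":
    sample = [
        "6,100,5140900,-;first message",
        "6,101,5140950,-;second message",
        "4,105,5141000,-;next message after a gap",
    ]
    print("Missed", count_missed(sample), "kernel messages")
```

Sequence numbers 102 to 104 never show up in the sample, so the reader
can tell exactly how many records it lost.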

> 3) Can I avoid that problem?

Figure out which kernel driver/subsystem is responsible.

You could also enlarge the kernel log buffer, see log_buf_len= kernel
cmdline switch.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] /etc overlay

2021-05-31 Thread Lennart Poettering
On Fr, 28.05.21 16:31, Barbier, Renaud (renaud.barb...@abaco.com) wrote:

> So is it my understanding that as long as the mount or overlay
> happen early enough which is around the service for var-volatile-etc
> then there is a rescan and all config from the /etc in the volume
> will be used then?

The clean codepaths in systemd mean that systemd loads unit files
comprehensively only during early boot before it runs the first
services, or if something invokes "systemctl daemon-reload". The
latter is a bit dirty though, and won't change the boot transaction
already being executed at that time.

Thus, if you intend to drop in additional files as services you should
ideally do so before PID 1 initializes, i.e. in the initrd.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Running pam-enabled /bin/login sessions in unprivileged terminal emulators

2021-05-27 Thread Lennart Poettering
On Do, 27.05.21 22:25, nerdopolis (bluescreen_aven...@verizon.net) wrote:

> I guess I meant to say getty, but getty ends up calling /bin/login anyway 
> after
> resetting the terminal and reading /etc/issue anyway. Or at least I thought.
>
> Interesting I found some simple enough looking samples for granting users the
> ability to start one service. Dang, it might not work with Debian's
> fraken-polkit-0.105 they still have.
>
> I am able to tweak up a test copy of container-getty@.service,
> setting TERM to xterm-256color and doing the XDG_SEAT=seat-vtty workaround so
> the logged in session has PAM too, and nmtui doesn't do this
> https://i.imgur.com/dt7xAMz.png
> so that works.
>
> Something like that is what I was originally looking for, so thanks!
> but I will admit, one thing I've come to like about the socat client/server
> thing is that if say cage or vte takes a segfault during say an apt-get 
> install,
> the running command doesn't die...

The service that implements your terminal emulator could upload the
pty master fds to systemd via the fdstore logic. That way the master
will stay open across restart of that service or when it fails.
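
A minimal sketch of what such an upload could look like in Python,
speaking the sd_notify() datagram protocol directly (FDSTORE=1 plus the
fd attached via SCM_RIGHTS). The name "pty-master" is just an example,
and the unit needs FileDescriptorStoreMax= set to a non-zero value for
the manager to accept the fd. Treat this as an illustration under those
assumptions, not a complete implementation:

```python
import array
import os
import socket


def store_fd(fd: int, name: str) -> bool:
    """Ask the service manager to keep `fd` open under the label `name`.

    Returns False if no notification socket is available.
    """
    path = os.environ.get("NOTIFY_SOCKET")
    if not path:
        return False  # not started by a manager offering a notify socket
    if path.startswith("@"):
        path = "\0" + path[1:]  # abstract namespace socket
    payload = f"FDSTORE=1\nFDNAME={name}".encode()
    # The file descriptor itself travels as SCM_RIGHTS ancillary data.
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS,
                  array.array("i", [fd]).tobytes())]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendmsg([payload], ancillary, 0, path)
    return True
```

After a restart the stored fds are handed back to the service like
socket-activation fds, i.e. via $LISTEN_FDS and $LISTEN_FDNAMES.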

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Running pam-enabled /bin/login sessions in unprivileged terminal emulators

2021-05-27 Thread Lennart Poettering
On Sa, 22.05.21 13:50, Pekka Paalanen (ppaala...@gmail.com) wrote:

> All in all, this stack would replace the usual stack where
> /bin/login runs directly on the TTY of a VT, allowing to use a more
> featureful terminal, custom display modes, multi-output support,
> maybe multiple parallel sessions for different users a la fast user
> switching, and more.

When you say /bin/login do you actually intend to say "getty"? What is
/bin/login good for here? It's a stub that expects you to already give it
a user and it then only asks for a pw. It's the second part of a getty
pretty much.

We have multiple services that you can instantiate on ttys, for
example getty@.service (for true VTs), serial-getty@.service (for
serial ports), console-getty.service (for /dev/console),
container-getty@.service (for gettys on pseudo TTYs, pretty much).

It appears to me that the right approach for your case is to do what
container-getty@.service effectively does and instantiate an
appropriate instance of a template service modelled after it for the
"other" side of the pty your terminal app allocates.

Instantiating -getty@.service requires privs, but you can use
polkit to grant that to your terminal app's user. The polkit auth
request carries the unit name as additional metadata, hence that
should be pretty easily done with some minimal polkit JS.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Running pam-enabled /bin/login sessions in unprivileged terminal emulators

2021-05-27 Thread Lennart Poettering
On Fr, 21.05.21 21:29, nerdopolis (bluescreen_aven...@verizon.net) wrote:

> On Thursday, May 20, 2021 10:53:24 AM EDT  wrote:
> > On So, 16.05.21 19:41, nerdopolis (bluescreen_avenger at verizon.net) wrote:
> >
> > > Hi
> > >
> > > I am trying to experiment around replacing the text mode TTYs with 
> > > usermode
> > > utilities.
> >
> > I don't follow?
> >
>
>
> I am sorry, I will try to be more clear. That first email, went through 
> several
> drafts, as at first it was longer and rambly, and I didn't want to bombard the
> list with a giant wall of text.
>
>
> (I actually found my solution was to use socat since sending that last email.)
>
>
> I was seeking some more existing programs I could try to cobble together to
> come up with a solution. The main things were:

I still don't get what the "end goal" is. You start with the tools you
want to reach your end goal, but never specify what precisely that end
goal is.

Do you intend to replace the Linux VT with a userspace implementation
of the same concept?

Or do you want to run a full-screen graphical terminal app on one
Linux graphical VT that behaves like a text VT but renders stuff
graphically?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] how to prevent systemd-logind from moving process to other cgroups when executing su command

2021-05-25 Thread Lennart Poettering
On Di, 25.05.21 22:23, 吾为男子 (csren...@qq.com) wrote:

> Systemd provides pam_systemd.so for PAM module and for many
> commands, such as su command, pam_systemd.so will be called and the
> process will be moved to the cgroup that systemd managed.
>
> Generally, if we move the bash process from its related session
> cgroup created by systemd under /sys/fs/cgroup/systemd/user.slice to
> some other cgroup, then systemd will move the new bash process into
> a new group named as session-c.scope under
> /sys/fs/cgroup/systemd/user.slice after executing su command.
>
> We would like to manage the cgroups for a set of processes created
> by ourselves, how to prevent systemd to do such routines, without
> disabling pam_systemd in PAM module.

This is simply not supported by systemd. If you use systemd then it is
systemd that manages the cgroup tree for you. You may request a
delegated subtree you can manage your own stuff in, but the top-level
of the tree is always owned and controlled by systemd and if you
interfere with it, you get to keep the pieces.

This is explained here:

https://systemd.io/CGROUP_DELEGATION

Sorry if this is disappointing,

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] On the IRC situation

2021-05-25 Thread Lennart Poettering
On Di, 25.05.21 12:32, Mantas Mikulėnas (graw...@gmail.com) wrote:

> Hi,
>
> As you might've heard, the freenode IRC network is suddenly under new
> ownership. Neither the process nor the result are making many people
> comfortable, so the old staff collectively packed up and left to start a
> new network in its place, with many channels either following them to the
> new place (#ubuntu and the OG #gentoo included) or to somewhere else that
> is not freenode.
>
> The #systemd regulars have pretty much already moved the channel to this
> new network on their own, so I have registered it there as well (as a
> "community" since I'm ~not really~ a representative). So if there are no
> objections I'll make a PR to update systemd's README files to "s/
> freenode.org/libera.chat/g" sometime later.

Sounds good. Please do!

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Running pam-enabled /bin/login sessions in unprivileged terminal emulators

2021-05-20 Thread Lennart Poettering
On So, 16.05.21 19:41, nerdopolis (bluescreen_aven...@verizon.net) wrote:

> Hi
>
> I am trying to experiment around replacing the text mode TTYs with usermode
> utilities.

I don't follow?

> While kmscon exists, the problem with it that I see is that it runs as root.
> It's most likely so it can run /bin/login as root, and /bin/login is not 
> setuid
>
>
> I found that doing something like (Can't fit the command in 80 chars, 
> sorry)
> systemd-run --setenv XDG_SEAT=$XDG_SEAT --setenv XDG_VTNR=$XDG_VTNR -t 
> /bin/login -p
> can work in a way to run /bin/login within a non-privleged terminal emulator,
> however authentication is needed to run that command.

Hmm? XDG_VTNR is for the Linux VT subsystem, but though I don't
understand what you are trying to do I get the impression you don't
want to use VTs? Or do you? Not following...

> First question:
> Is there a supported way to allow a system user account to run one command
> without a password prompt with systemd-run? Otherwise I guess I can just make 
> a
> setuid binary that calls the systemd-run command...

It's PolicyKit enabled, you can allow your user to run unpriv
commands, but it's an all-or-nothing thing.

> The second thing: Things like nmtui need a full logind session to be able to
> run, and do polkit actions. However on seat0, it seems you need to decide on a
> empty TTY to use, which while you can use TTY63, that doesn't seem to be a
> 'clean' idea.

Can't parse this, sorry.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] manually lading kernel modules and have created /dev/* in container?

2021-05-18 Thread Lennart Poettering
On Mo, 17.05.21 19:08, Marc Weber (marco-owe...@gmx.de) wrote:

> > devtmpfs
>
> thanks. So I can modprobe (-r) the modules from both host/container,
>
> eg dahdi_transcode makes /dev/dahdi/transcode appear.
>
> But when mounting from container I can write / read from it (getting errors
>
> about channels not setup which is probably expected), but I when trying same 
> from the container I
>
> just get operation not permitted. chmod 777 or such doesn't help.
>
> I am not using UID/-U id rewriting in any way. I run the container with 
> --capability=all.
>
> Is there something else I am missing ?

nspawn containers have a strict device policy set up by default. You
need to allow-list your device nodes if you want to be able to use
them from inside the container. Use nspawn's --property= parameter to
tweak this, and set the DeviceAllow= property with it, as needed.

Devices aren't reasonably virtualized for containers
though. i.e. sysfs isn't virtualized and udev doesn't even get
started. Thus, by using --property=DeviceAllow= in combination with
--bind= to make specific device nodes of the host available in a
container you'll really just get the naked devicenodes and not
more. This is typically not enough to run any non-trivial software
that wants to do device management, since they enumerate/monitor
devices via sysfs/uevents/udev and that kind of stuff simply doesn't
work in containers.
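
As a sketch of how the two switches go together, here is a small Python
helper that merely assembles the nspawn command line and does not
execute anything. The container directory /var/lib/machines/pbx is a
made-up example; the device node is the one from your mail:

```python
def nspawn_argv(machine_dir, device_node, access="rwm"):
    """Build a systemd-nspawn command line exposing one host device node."""
    return [
        "systemd-nspawn",
        f"--directory={machine_dir}",
        # Make the host's device node visible inside the container ...
        f"--bind={device_node}",
        # ... and allow-list it in the container's device policy.
        f"--property=DeviceAllow={device_node} {access}",
        "--boot",
    ]


if __name__ == "__main__":
    # Hypothetical container directory, device node taken from the thread.
    print(nspawn_argv("/var/lib/machines/pbx", "/dev/dahdi/transcode"))
```

Both parts are needed: --bind= only makes the node show up, while the
DeviceAllow= property is what lifts the "operation not permitted".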

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] APIs for JournalD commands

2021-05-06 Thread Lennart Poettering
On Do, 06.05.21 09:00, Andreas Krueger (andreas.krue...@fmc-ag.com) wrote:

> Hi Folks,
>
> I have to write some tests to ensure the functionality of JournalD,
> which is used by a project I'm working on. For this I've found the
> APIs defined in header file  that can be used
> for many of my issues, but there is a gap between what this header
> file offers and what can be done by command 'journalctl'. For
> example, for verifying the sealing I haven't found any corresponding
> API in the header file. As well as for rotating.
>
> So, is there somewhere a header file with the missing APIs? Or can
> verification (or rotation) be done only by command 'journalctl'?

Rotation can be triggered via a varlink API or via a UNIX process
signal. The varlink API is not officially documented, i.e. we don't
commit to API stability for it yet. The signal is documented on the
journald man page.
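
For a test suite, the signal route could look roughly like the
following Python sketch. It assumes the rotation signal is SIGUSR2, as
listed in systemd-journald.service(8), and looks up journald's PID via
"systemctl show"; the function names are invented for this example:

```python
import os
import signal
import subprocess


def journald_pid() -> int:
    """Look up the main PID of systemd-journald via systemctl."""
    out = subprocess.run(
        ["systemctl", "show", "--property=MainPID", "--value",
         "systemd-journald.service"],
        check=True, capture_output=True, text=True,
    ).stdout
    return int(out.strip())


def request_rotation(pid: int) -> None:
    """Send the signal that asks journald to rotate its journal files."""
    os.kill(pid, signal.SIGUSR2)
```

A test would call request_rotation(journald_pid()) with sufficient
privileges and then check that new journal files appeared. Note that
the signal is asynchronous, so the test has to wait for the effect.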

Verification is only available in journalctl.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-run / Failed to create bus connection: Input/output error

2021-05-03 Thread Lennart Poettering
On Fr, 30.04.21 14:33, lejeczek (pelj...@yahoo.co.uk) wrote:

> Hi guys.
>
> I'm do on my pretty vanilla, so I'd like to think, setup this:
>
> -> $ systemd-run --machine=qemu-8-c8kubernode1 /bin/cat /etc/centos-release
> Failed to create bus connection: Input/output error
>
> Someone would care to decipher that for me or/and shed bit more light on
> possible troubleshooting?

which host OS, which payload OS? which host systemd, which payload
systemd? is this an nspawn container? is the container fully booted up?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] early mounts in systemd

2021-05-03 Thread Lennart Poettering
On Fr, 30.04.21 15:14, Kenneth Porter (sh...@sewingwitch.com) wrote:

> --On Friday, April 30, 2021 11:39 AM -0400 Rick Winscot
>  wrote:
>
> > Early in the project it was decided to make the rootfs read-only... in an
> > effort to improve durability in environments where power fluctuations
> > might cause problems on the eMMC. At the same time, making logging (e.g.
> > /var) persistent for debugging was added to requirements. Persistent
> > storage would be achieved by mounting /var to a separate partition that is
> > read-write.
>
> Does /etc need to be read-only? On my last server I decided to make /usr
> read-only but root is writable and /var is part of that. I put /home on its
> own partition.

I think making /usr read-only makes a ton of sense.

The way I see it, besides the traditional Linux scheme where the whole
fs is writable the following two scenarios make the most sense, and
are what I personally intend to support in systemd very well:

1. root fs writable, /var/ part of it, but /usr/ separate and
   read-only/immutable.

2. rootfs read-only/immutable, /usr/ part of it, but /var/ separate
   and writable.

The main difference is that in the second case the configuration is
immutable too, while the first case allows it to be changed locally.

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] early mounts in systemd

2021-05-03 Thread Lennart Poettering
On Fr, 30.04.21 18:23, Silvio Knizek (killermoe...@gmx.net) wrote:

> Am Freitag, dem 30.04.2021 um 10:39 -0400 schrieb Rick Winscot:
> > My question for anyone on the list, is the method outlined below a
> > reasonable solution to mounting /var early in the start-up cycle?
> >
> > Or... is there a better way? Some trimming
> >
>
> Hi Rick,
>
> by definition if you need to mount /var (or /usr for this argument),
> you need an initrd [1] which actually set up everything as you
> requires. Anything else has a tendency to break in unpleasant ways due
> to race conditions and such. You don't need much, just enough to set up
> everything required for the root and API file systems.

From systemd's side we actually explicitly support environments where
/var is mounted after the initrd transition. From our side everything
should just work, we should have all the necessary deps in place to
make /var being mounted post-initrd but still during early boot just
work. I'd consider it a bug in systemd if any of systemd's own
components can't deal with /var/ being mounted after the transition.

(I mean, there have been discussions on whether we shouldn't require
/var to be mounted from initrd, but so far we didn't decide that this
was necessary, given the political effort this would take to require)

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-30 Thread Lennart Poettering
On Fr, 30.04.21 02:00, Pengpeng Sun (pengpe...@vmware.com) wrote:

> > > Here is where our code calls “/sbin/telinit 6” to reboot Linux.
> > > https://github.com/vmware/open-vm-tools/blob/master/open-vm-tools/libDeployPkg/linuxDeployment.c#L1466
> > > This code is executed when systemd vmtoolsd.service starts, attached the
> > > vmtoolsd.service file.
> > > Not seeing this issue before /sbin/telinit becomes a softlink to
> > > systemctl.
> >
> > vmtoolsd.service is probably asked to shutdown because of the system
> > shutdown, and the forked off /sbin/telinit is part of that service, so
> > it gets terminated too?
>
> Yes, this could be the reason. The issue is system does NOT always reboot 
> after
> “uncaught signal 15” happens, while sometimes it does reboot during my local
> testing.  Is there a way/command to make sure system get rebooted?

Check the logs?

https://freedesktop.org/wiki/Software/systemd/Debugging/#index2h1

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-29 Thread Lennart Poettering
On Do, 29.04.21 16:55, Pengpeng Sun (pengpe...@vmware.com) wrote:

> > > Hi Lennart,
> > >
> > > After modify journald.conf, got systemd log when the issue
> > > reproduced. Please find it in attachment.
> >
> > Looking at the logs there seems to be a lot missing, in particular all
> > the debug output of PID 1 is eventually going away.
> >
> > My educated guess: you are running "systemctl reboot" from user
> > context? i.e. some script you run as part of your user session? If so,
> > when the system goes down your user session of course will be
> > terminated. Thus it's a race: either your session (and its associated
> > services) are terminated via SIGTERM first, including the systemctl
> > command you issued, or the systemctl is quicker and exits before it
> > gets killed.
> >
>
> Here is where our code calls “/sbin/telinit 6” to reboot Linux.
> https://github.com/vmware/open-vm-tools/blob/master/open-vm-tools/libDeployPkg/linuxDeployment.c#L1466
> This code is executed when systemd vmtoolsd.service starts, attached the
> vmtoolsd.service file.
> Not seeing this issue before /sbin/telinit becomes a softlink to
> systemctl.

vmtoolsd.service is probably asked to shutdown because of the system
shutdown, and the forked off /sbin/telinit is part of that service, so
it gets terminated too?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-28 Thread Lennart Poettering
On So, 25.04.21 07:09, Pengpeng Sun (pengpe...@vmware.com) wrote:

> Hi Lennart,
>
> After modifying journald.conf, I got the systemd log when the issue
> reproduced. Please find it in the attachment.

Looking at the logs there seems to be a lot missing, in particular all
the debug output of PID 1 is eventually going away.

My educated guess: you are running "systemctl reboot" from user
context? i.e. some script you run as part of your user session? If so,
when the system goes down your user session of course will be
terminated. Thus it's a race: either your session (and its associated
services) are terminated via SIGTERM first, including the systemctl
command you issued, or the systemctl is quicker and exits before it
gets killed.

The net effect is the same: before we go down we need to terminate all
processes, of course. And user processes are terminated before system
processes, of course.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-23 Thread Lennart Poettering
On Fr, 23.04.21 14:15, Pengpeng Sun (pengpe...@vmware.com) wrote:

> Hi Lennart,
>
> The issue reproduced at 2021-04-22T15:45:30.230Z. I ran 'sudo journalctl
> -b'; the log began at Apr 22 15:45:48, which is later than the time the
> issue reproduced.
> How can I get earlier and more detailed systemd logs?

man 5 journald.conf

Maybe your distro didn't enable persistent storage of journald, and
thus journald uses only in-memory storage in /run, and is thus
constrained by its diminutive size?
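
If that is the cause, a minimal sketch of switching to persistent storage
could look like the following. It assumes a drop-in directory such as
/etc/systemd/journald.conf.d is acceptable on your distro; the function name
and the drop-in file name are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a journald drop-in that enables persistent storage.
# The function name and the drop-in file name are illustrative assumptions.
write_persistent_journal_dropin() (
    # $1: target directory, normally /etc/systemd/journald.conf.d
    dir=$1
    mkdir -p "$dir" || exit 1
    cat > "$dir/10-persistent.conf" <<'EOF'
[Journal]
Storage=persistent
EOF
)

# Typical use, as root:
#   write_persistent_journal_dropin /etc/systemd/journald.conf.d
#   reboot, reproduce the issue, then read the earlier boot with:
#   journalctl -b -1
```

With Storage=persistent the journal is kept below /var/log/journal, so the
log of the boot in which the problem happened should still be readable
afterwards.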

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] EXT: sdbus_event loop state mark as volatile?

2021-04-23 Thread Lennart Poettering
On Fr, 23.04.21 11:24, Stephen Hemminger (step...@networkplumber.org) wrote:

> On Fri, 6 Sep 2019 16:04:33 +0100
> Simon McVittie  wrote:
>
> > On Fri, 06 Sep 2019 at 06:57:22 +, Ray, Ian (GE Healthcare) wrote:
> > > If thread-safety is a design goal (and I don’t believe that it is [1])
> > > then atomic or thread-safe primitives should be used.
> > >
> > > [1] 
> > > https://lists.freedesktop.org/archives/systemd-devel/2017-March/038519.html
> >
> > [1] is about sd-bus, not sd-event, and doesn't say anything about whether
> > sd-event is designed to be thread-safe or not.
> >
> > However, I think you're correct to say that struct sd_event is also only
> > designed to be used from the single thread that "owns" it.
> >
> > If you need a thread-safe event loop, then you need something like
> > GLib's GMainContext, with mutexes to protect its data structures against
> > concurrent access, and a well-defined mechanism for one main-context to
> > "post" events to other main-contexts (which might be running concurrently
> > in a different thread). Many other event loops are available; GMainContext
> > happens to be the one I'm most familiar with, and I know that it is
> > designed to be thread-safe.
> >
> > The price that things like GMainContext pay for being thread-safe is
> > that they are more complex and less efficient than sd-event: in general,
> > all operations on a thread-aware event loop have to pay the complexity
> > and performance cost of being thread-aware, even if the current program
> > only has one thread.
> >
> > smcv
>
> Excuse me for reviving an old thread. But I see similar problem today
> (especially on Arm). The sd-event model uses signals so it is inherently
> subject to thread issues.

Hmm? sd-event uses signals only via signalfd(), so that it can
dispatch them as part of regular event loop handling. It doesn't
install a signal handler or anything like that.

> It looks like a stronger memory model is needed here (not volatile).
> Other projects use __atomic builtins for this.

All of sd-event's data structures should be accessed from a single
thread only, in a single non-signal execution context.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-23 Thread Lennart Poettering
On Di, 20.04.21 16:48, Pengpeng Sun (pengpe...@vmware.com) wrote:

> Thanks Lennart!
> I reproduced the issue after setting "systemd-analyze log-level debug". It
> turns out 'systemctl reboot' was triggered before all the steps logged in
> 'journalctl -a'. But I did get the stderr of 'systemctl reboot', pasted
> below; please check if it helps to locate the root cause.
> And besides 'journalctl -a', where can I get the systemd debug logs?
>
> stderr: Bus n/a: changing state UNSET → OPENING
> Bus n/a: changing state OPENING → AUTHENTICATING
> Executing dbus call org.freedesktop.systemd1.Manager StartUnit(reboot.target, replace-irreversibly)
> Bus n/a: changing state AUTHENTICATING → RUNNING
> Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=StartUnit cookie=1 reply_cookie=0 signature=ss error-name=n/a error-message=n/a
> Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=5 reply_cookie=1 signature=o error-name=n/a error-message=n/a
> Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1 interface=org.freedesktop.systemd1.Manager member=GetUnit cookie=2 reply_cookie=0 signature=s error-name=n/a error-message=n/a

That looks like the debug log of systemctl on the client side? I was
more interested in the logs of systemd, i.e. of PID 1.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemd-networkd: How to configure network with environment variables?

2021-04-23 Thread Lennart Poettering
On Fr, 23.04.21 08:17, Paul Menzel (pmenzel+systemd-de...@molgen.mpg.de) wrote:

> Dear systemd folks,
>
>
> Due to historical reasons, in our environment we have a configuration file
> with the network device name and the to be assigned IP address:
>
> $ more /etc/local/mxhost.conf
> MX_NETDEV=net02
> MX_IPADDR=141.14.18.X
>
> Then a custom service unit `network.service` [1] configures the network with
> the configuration file as environment file.
>
> [Unit]
> Description=Network Connectivity
> DefaultDependencies=no
>
> [Service]
> EnvironmentFile=/etc/local/mxhost.conf
> Type=oneshot
> RemainAfterExit=yes
> ExecStart=/usr/sbin/mxnetctl start
> ExecStart=/sbin/ip addr add ${MX_IPADDR}/20 broadcast 141.14.31.255 dev ${MX_NETDEV}
> ExecStart=/sbin/ip link set up dev ${MX_NETDEV}
> ExecStart=/sbin/ip route add default via 141.14.16.X
> ExecStop=/sbin/ip addr del ${MX_IPADDR}/20 dev ${MX_NETDEV}
> StandardOutput=syslog
>
> [Install]
> WantedBy=network.target
>
> Wanting to use systemd-networkd but keep the local device configuration in
> `/etc/local`, is there an easy way? systemd.network(5) does not say anything
> about whether environment variables can be used.
>
> If that does not work, do you have another suggestion? Possible, but not
> nice, solutions, I came up with:

No, networkd has no support for that.

> 1.  Use a generator to create .network files from `/etc/local/mxhost.conf`.

Yes, that should be really easy to implement, i.e. just write a small
shell script, that sources that config file and outputs the .network
file into /run somewhere via a here document, and then run this script
during boot, and order it before networkd, so that the conversion is
completed on each boot, before networkd is run.
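
A rough sketch of such a script, under a few assumptions: the function name
and the output file name 10-mxhost.network are invented for illustration,
the /20 prefix and the (redacted) gateway are copied from the unit file
quoted above, and the broadcast address is left to networkd's defaults.

```shell
#!/bin/sh
# Sketch: turn /etc/local/mxhost.conf into a .network file for networkd.
# Function name and output file name are illustrative assumptions.
mxhost_to_network() (
    # $1: config file defining MX_NETDEV and MX_IPADDR
    # $2: output directory, normally /run/systemd/network
    conf=$1
    outdir=$2
    . "$conf" || exit 1
    [ -n "$MX_NETDEV" ] && [ -n "$MX_IPADDR" ] || exit 1
    mkdir -p "$outdir" || exit 1
    # Gateway below is the redacted placeholder from the original unit file.
    cat > "$outdir/10-mxhost.network" <<EOF
[Match]
Name=$MX_NETDEV

[Network]
Address=$MX_IPADDR/20
Gateway=141.14.16.X
EOF
)

# Typical use from a oneshot service ordered before networkd:
#   mxhost_to_network /etc/local/mxhost.conf /run/systemd/network
```

One way to run it early enough would be a Type=oneshot unit carrying
Before=systemd-networkd.service; the exact unit layout is left open here.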

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] Is there a way to know inside of systemd if it's in a reboot state?

2021-04-22 Thread Lennart Poettering
On Mo, 19.04.21 20:19, Tia, Javier (javier@hpe.com) wrote:

> Hi,
>
> Is there a way to know inside of systemd if it's in a reboot state?

systemctl is-system-running
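
For use in a script, a small sketch follows. The function name is an
assumption; "stopping" is the state printed while the manager is shutting
down, and it does not distinguish a reboot from a poweroff or halt.

```shell
#!/bin/sh
# Sketch: report whether the system manager is currently shutting down.
# The function name is an illustrative assumption.
system_is_stopping() {
    # is-system-running exits non-zero for every state except "running",
    # so compare the printed state instead of relying on the exit code.
    state=$(systemctl is-system-running 2>/dev/null) || true
    [ "$state" = "stopping" ]
}

# Typical use:
#   if system_is_stopping; then echo "shutdown in progress"; fi
```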

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] resolved: wrong address w/ cache off, wo/ querying DNS server

2021-04-22 Thread Lennart Poettering
On Do, 22.04.21 01:42, Dénes Türei (turei.de...@gmail.com) wrote:

> Dear Systemd Developers,
>
> I got stuck with investigating an issue and I would be very grateful
> for some advice.
> Briefly, systemd-resolved keeps returning a wrong answer to a query,
> even though the cache is disabled. The debug log doesn't show the origin
> of the answer. Systemd-resolved doesn't query the configured DNS
> server when answering this query. If I query the DNS server directly,
> it returns the correct answer. I described the issue in this thread:
> https://bbs.archlinux.org/viewtopic.php?pid=196#p196 -- please
> ignore the starting comment, the last 3 are the most relevant. See
> some system info below the mail.

Maybe your local hostname or an /etc/hosts entry exist that match the
domain name you are looking up?

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] syscvall-filters killing CGI after update to Fedora 33

2021-04-22 Thread Lennart Poettering
On Mo, 19.04.21 18:24, Reindl Harald (h.rei...@thelounge.net) wrote:

> after a long time using this SystemCallFilter perl-cgi with Fedora 33 get
> killed - anyone an idea what changed that's obviously covered by the second
> line
>
> SystemCallFilter=@system-service @network-io @privileged
> SystemCallFilter=~@clock @cpu-emulation @debug @keyring @module @mount
> @obsolete @raw-io @reboot @resources @swap

@resources is included in @system-service for a reason: its syscalls
are typically used by programs. Regular system services use them, and
that's totally OK and expected.

i.e. you basically explicitly created a configuration that can't
work. My recommendation: just drop the second line altogether. Your
first line implements an allowlist already, hence besides the
@resources thing the second line is entirely redundant, and the
@resources stuff you really don't want.
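
As a sketch of what that looks like when applied through a drop-in: the
helper below writes only the allowlist line, preceded by an empty assignment
that resets whatever filter the unit set before. The function name, the
drop-in file name and the unit directory in the usage comment are
assumptions for illustration.

```shell
#!/bin/sh
# Sketch: drop-in that keeps only the allowlist line of the filter.
# Function name and drop-in file name are illustrative assumptions.
write_syscall_filter_dropin() (
    # $1: drop-in directory of the affected unit
    dir=$1
    mkdir -p "$dir" || exit 1
    cat > "$dir/10-syscall-filter.conf" <<'EOF'
[Service]
# An empty assignment resets any filter configured earlier in the unit.
SystemCallFilter=
SystemCallFilter=@system-service @network-io @privileged
EOF
)

# Typical use (hypothetical unit name), as root:
#   write_syscall_filter_dropin /etc/systemd/system/httpd.service.d
#   systemctl daemon-reload
# To see which syscalls a set contains:
#   systemd-analyze syscall-filter @resources
```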

> either the blacklist of the new systemd version covers more than before or
> something changed in the perl stack

Yeah, programs change the APIs they use. System call filters need to
be put together with an understanding of what the programs do, and hence
are best put together upstream or by the distro. If you do it
downstream you might run into issues like this.

The idea of @system-service is that it mostly tries to isolate you
from this, but in your case you overrode what it does, so it fell apart.

Lennart

--
Lennart Poettering, Berlin
___
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/systemd-devel


Re: [systemd-devel] Request for Feedback on Design Issue with Systemd and "Consistent Network Device Naming"

2021-04-21 Thread Lennart Poettering
On Mi, 21.04.21 10:13, Simon Foley (si...@simonfoley.net) wrote:

> The issue is around the deprecation of the support for the HWADDR= argument
> in the ifcfg files (RHEL, other distros are available).

systemd upstream is not involved in that. ifcfg is specific to Red Hat
distributions and systemd doesn't mandate the concept to be
deprecated. It doesn't support them natively, but there's no need to.

The .link concept systemd provides is more powerful and works across
distributions. You can use that to name your interfaces by MAC
address, it's very well supported.

> How HPC architects try to help sysadmins and application teams in the
> process is to have post build modifications.
> Here we can use the HWADDR= variable in the ifcfg-[device name] files to
> move a *specific* device name to these targeted NIC cards and ports.

systemd doesn't stop you from doing that.

It provides a more generic way to do this via .link files, but from
systemd's PoV you don't have to migrate, if you don't want.

You could easily write a conversion script btw, that takes your ifcfg
files and converts them to .link files in /run, if you like.
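
A minimal sketch of such a conversion, assuming each ifcfg file can be
sourced as shell and sets both DEVICE= and HWADDR=. The function name and
the "10-" file name prefix are illustrative; the prefix is only there so the
generated file sorts ahead of a distribution's default .link file.

```shell
#!/bin/sh
# Sketch: convert one ifcfg file into a .link file that names by MAC address.
# Assumes the ifcfg file is shell-sourceable and sets DEVICE= and HWADDR=.
# Function name and output file naming are illustrative assumptions.
ifcfg_to_link() (
    # $1: ifcfg file, $2: output directory (normally /run/systemd/network)
    src=$1
    outdir=$2
    . "$src" || exit 1
    [ -n "$DEVICE" ] && [ -n "$HWADDR" ] || exit 1
    mkdir -p "$outdir" || exit 1
    cat > "$outdir/10-$DEVICE.link" <<EOF
[Match]
MACAddress=$HWADDR

[Link]
Name=$DEVICE
EOF
)

# Typical use:
#   for f in /etc/sysconfig/network-scripts/ifcfg-*; do
#       ifcfg_to_link "$f" /run/systemd/network
#   done
```

.link files are applied by udev when the device shows up, so the conversion
has to run before that point; this sketch does not handle that ordering.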

> It would appear in RHEL8 that due to systemd the HWADDR= is no longer
> supported and we have lost this fundamentally important feature.

If RHEL deprecated this, that's a decision by RHEL, and the upstream
systemd project does not mandate anything in this area. It provides a
generic mechanism to do the same, but you can use whatever you want.

Anyway, the upstream systemd project is the wrong forum to discuss any
of this. You are apparently upset by a RHEL decision. While I
sympathize with the decision, it's not a decision the systemd project
took, but RHEL did, and technically nothing in systemd mandates
this.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] systemctl reboot get terminated by signal 15

2021-04-19 Thread Lennart Poettering
On Mo, 19.04.21 14:29, Pengpeng Sun (pengpe...@vmware.com) wrote:

> Hi there,
>
> Our program executes 'systemctl reboot' in a child process to reboot
> Linux right after it's booted. Sometimes there is no error, but
> sometimes the child process terminated due to received uncaught
> signal 15, then no reboot happened. WIFSIGNALED evaluated a non-zero
> value, WTERMSIG evaluated 15. Don't understand why the uncaught
> signal 15 happened here, could you please shed light on this,
> Thanks.

15 is SIGTERM, i.e. the signal sent when a process is politely asked
to shut down. Something is terminating your process.

It could be systemd, could be something else.

To track down what it is, maybe turn on debug logging in systemd, maybe
you find the explanation there. i.e. "systemd-analyze log-level debug"
and then reproduce the issue.

Alternatively, install a signal handler for SIGTERM via sigaction, and
look into the .si_pid field of the siginfo_t you can receive in the
handler. It tells you which process sent the SIGTERM.

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] How to reboot from within a service

2021-04-19 Thread Lennart Poettering
On So, 18.04.21 12:01, Norbert Lange (nolang...@gmail.com) wrote:

> Hello,
>
> My setup is some embedded system without logind, running a service
> that should be able to poweroff, reboot or reboot-and-update.
> Generally I guess this also includes triggering further services
> programmatically from another.
>
> 1) As I do currently: send SIGRTMIN+5 to PID1
> (not really the complete solution, would need to discern between
> reboot/update).
> 2) exec 'systemctl start reboot.target'
> basically a variant of 1).
> 3) Same thing as 2), but do it with dbus or varlink.
> 4) Some unknown and likely not existing configuration of unit files.
> doing something like "IfExitCode=121 then start reboot.target"
> 5) Use the Watchdog and let it expire.
>
> While working, I would expect the first 3 options to depend on
> various levels of rights to interfere with
> PID1, as well as being systemd-specific (using dbus or systemd DSO).
> Some sort of separation between

If you want to reboot the system in a sysv compatible way you can only
fork off "reboot" or "shutdown -r", or maybe send SIGINT to PID 1. The
latter is pretty ugly though, since this will be treated as if people
actually used Ctrl-Alt-Del on the console by PID 1, i.e. this is
subject to misleading log messages and to the hard reboot triggered by
hitting this 7x in 2s.

> advertising the need for reboot and acting on it would be cleaner (ie.
> hooking it up in service files).
>
> What are the best option(s) here?

Use logind's D-Bus APIs. It's the cleanest way to reboot, as it
honours inhibitors and stuff.
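
From a shell, one way to reach that API is busctl. The sketch below assumes
logind is actually running (which the setup described above lacks) and that
the caller is allowed to reboot; BUSCTL is a hypothetical override so the
call can be inspected without rebooting anything.

```shell
#!/bin/sh
# Sketch: ask logind to reboot via its D-Bus API using busctl.
# BUSCTL is a hypothetical override, handy for dry runs with "echo".
request_reboot() {
    # Reboot takes one boolean ("interactive"); false means the caller
    # does not want an interactive authorization prompt.
    "${BUSCTL:-busctl}" call org.freedesktop.login1 /org/freedesktop/login1 \
        org.freedesktop.login1.Manager Reboot b false
}

# Dry run, prints the arguments instead of calling the bus:
#   BUSCTL=echo request_reboot
```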

Lennart

--
Lennart Poettering, Berlin


Re: [systemd-devel] .local searches not working

2021-04-10 Thread Lennart Poettering
On Fr, 09.04.21 15:20, Phillip Susi (ph...@thesusis.net) wrote:

>
> Silvio Knizek writes:
>
> > So in fact your network is not standards-compliant. You have to define
> > .local as search and routing domain in the configuration of sd-
> > resolved.
>
> Interesting... so what are you supposed to name your local, private
> domains?

This draft RFC suggests .home or .corp:

https://www.ietf.org/archive/id/draft-chapin-additional-reserved-tlds-02.txt

It never made it beyond a draft, but I think that's already enough to
be pretty sure these domains are unlikely to be used elsewhere.

RFC 6762, Appendix G suggests using .lan, .intranet, .internal and
.private.

RFC 8375 suggests .home.arpa. This is probably the RFC that is the
most official one, but OTOH it's probably at the moment the least
widely used one. Still, probably the safest bet, though it does sound
a bit weird when used in a corporate context.

> I believe Microsoft used to ( or still do? ) recommend using
> .local to name your domain if you don't have a public domain name, so
> surely I'm not the first person to run into this?  Why does
> systemd-resolved not fall back to DNS if it can't first resolve the name
> using mDNS?  That appears to be allowed by the RFC.

You can enable this, just add ~local to the routing domains of the
relevant DNS server.

We won't do this automatically for security reasons, as locally scoped
names should not be routed to Internet DNS servers, as that leaks
pretty sensitive information about the local network infrastructure.
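
If you do want it for a given link, a runtime-only sketch with resolvectl
could look like this. The function name is made up, RESOLVECTL is a
hypothetical override for dry runs, and note that setting the domain list
this way replaces whatever domains the link had before.

```shell
#!/bin/sh
# Sketch: route .local lookups on one link to that link's DNS server.
# RESOLVECTL is a hypothetical override, handy for dry runs with "echo".
route_local_to_dns() {
    # $1: interface whose DNS server should answer .local queries
    # The "~" prefix marks a routing-only domain (not a search domain).
    "${RESOLVECTL:-resolvectl}" domain "$1" '~local'
}

# Typical use, as root (interface name is an example):
#   route_local_to_dns eth0
```

For a persistent setup the same "~local" entry would go into the Domains=
setting of the network configuration that carries the DNS server.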

Lennart

--
Lennart Poettering, Berlin

