Re: [lfs-book] [LFS Trac] #4745: systemd-247

LFS Trac via lfs-book Tue, 01 Dec 2020 14:19:54 -0800

#4745: systemd-247
--------------------+-----------------------
 Reporter:  renodr  |       Owner:  renodr
     Type:  task    |      Status:  assigned
 Priority:  normal  |   Milestone:  10.1
Component:  Book    |     Version:  SVN
 Severity:  normal  |  Resolution:
 Keywords:          |
--------------------+-----------------------


Comment (by renodr):

 {{{
 CHANGES WITH 247:

         * KERNEL API INCOMPATIBILITY: Linux 4.14 introduced two new
 uevents
           "bind" and "unbind" to the Linux device model. When this kernel
           change was made, systemd-udevd was only minimally updated to
 handle
           and propagate these new event types. The introduction of these
 new
           uevents (which are typically generated for USB devices and
 devices
           needing a firmware upload before being functional) resulted in a
           number of issues which we so far didn't address. We hoped the
 kernel
           maintainers would themselves address these issues in some form,
 but
           that did not happen. To handle them properly, many (if not most)
 udev
           rules files shipped in various packages need updating, and so do
 many
           programs that monitor or enumerate devices with libudev or sd-
 device,
           or otherwise process uevents. Please note that this
 incompatibility
           is not the fault of systemd or udev, but caused by an incompatible
 kernel
           change that happened back in Linux 4.14, but is becoming more
 and
           more visible as the new uevents are generated by more kernel
 drivers.

           To minimize issues resulting from this kernel change (but not
 avoid
           them entirely) starting with systemd-udevd 247 the udev "tags"
           concept (which is a concept for marking and filtering devices
 during
           enumeration and monitoring) has been reworked: udev tags are now
           "sticky", meaning that once a tag is assigned to a device it
 will not
           be removed from the device again until the device itself is
 removed
           (i.e. unplugged). This makes sure that any application
 monitoring
           devices that match a specific tag is guaranteed to both see
 uevents
           where the device starts being relevant, and those where it stops
           being relevant (the latter now regularly happening due to the
 new
           "unbind" uevent type). The udev tags concept is hence now a
 concept
           tied to a *device* instead of a device *event* — unlike for
 example
           udev properties whose lifecycle (as before) is generally tied to
 a
           device event, meaning that the previously determined properties
 are
           forgotten whenever a new uevent is processed.

           With the newly redefined udev tags concept, sometimes it's
 necessary
           to determine which tags are the ones applied by the most recent
           uevent/database update, in order to discern them from those
           originating from earlier uevents/database updates of the same
           device. To accommodate for this a new automatic property
 CURRENT_TAGS
           has been added that works similarly to the existing TAGS property
 but
           only lists tags set by the most recent uevent/database
           update. Similarly, the libudev/sd-device API has been updated
 with
           new functions to enumerate these 'current' tags, in addition to
 the
           existing APIs that now enumerate the 'sticky' ones.

           To properly handle "bind"/"unbind" on Linux 4.14 and newer it is
           essential that all udev rules files and applications are updated
 to
           handle the new events. Specifically:

           • All rule files that currently use a header guard similar to
             ACTION!="add|change",GOTO="xyz_end" should be updated to use
             ACTION=="remove",GOTO="xyz_end" instead, so that the
             properties/tags they add are also applied whenever "bind" (or
             "unbind") is seen. (This is most important for all physical
 device
             types — those for which "bind" and "unbind" are currently
             generated, for all other device types this change is still
             recommended but not as important — but certainly prepares for
             future kernel uevent type additions).

           • Similarly, all code monitoring devices that contains an 'if'
 branch
             discerning the "add" + "change" uevent actions from all other
             uevents actions (i.e. considering devices only relevant after
 "add"
             or "change", and irrelevant on all other events) should be
 reworked
             to instead negatively check for "remove" only (i.e.
 considering
             devices relevant after all event types, except for "remove",
 which
             invalidates the device). Note that this also means that
 devices
             should be considered relevant on "unbind", even though
 conceptually
             this — in some form — invalidates the device. Since the
 precise
             effect of "unbind" is not generically defined, devices should
 be
             considered relevant even after "unbind", however I/O errors
             accessing the device should then be handled gracefully.

           • Any code that uses device tags for deciding whether a device
 is
             relevant or not most likely needs to be updated to use the new
             udev_device_has_current_tag() API (or
 sd_device_has_current_tag()
             in case sd-device is used), to check whether the tag is set at
 the
             moment an uevent is seen (as opposed to the existing
             udev_device_has_tag() API which checks if the tag ever existed
 on
             the device, following the API concept redefinition explained
             above).

           We are very sorry for this breakage and the requirement to
 update
           packages using these interfaces. We'd again like to underline
 that
           this is not caused by systemd/udev changes, but is the result of a
 kernel
           behaviour change.
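
The guidance for monitoring code above can be sketched roughly as follows.
This is only a model of the decision logic in Python, not libudev code: the
function names are hypothetical, and a real program would obtain the action
string from udev_device_get_action() and test tags with the new
udev_device_has_current_tag() call.

```python
def is_relevant_old(action: str) -> bool:
    # Pre-4.14 style check: silently drops "bind"/"unbind" and any
    # uevent type the kernel may add later.
    return action in ("add", "change")


def is_relevant(action: str) -> bool:
    # Recommended check: every uevent keeps the device relevant
    # except "remove", which invalidates it.
    return action != "remove"


def should_track(action: str, current_tags: set, wanted_tag: str) -> bool:
    # Hypothetical helper: the tag must be set by the most recent
    # uevent (CURRENT_TAGS), not merely have existed at some point (TAGS).
    return is_relevant(action) and wanted_tag in current_tags
```

With this model a device stays relevant on "unbind", so the caller still has
to handle I/O errors on the device gracefully.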

         * UPCOMING INCOMPATIBILITY: So far most downstream distribution
           packages have not retriggered devices once the udev package (or
 any
           auxiliary package installing additional udev rules) is updated.
 We
           intend to work with major distributions to change this, so that
           "udevadm trigger -a change" is issued on such upgrades, ensuring
 that
           the updated ruleset is applied to the devices already
 discovered, so
           that (asynchronously) after the upgrade completed the udev
 database
           is consistent with the updated rule set. This means udev rules
 must
           be ready to be retriggered with a "change" action any time, and
           result in correct and complete udev database entries. While the
           majority of udev rule files known to us currently get this
 right,
           some don't. Specifically, there are udev rules files included in
           various packages that only set udev properties on the "add"
 action,
           but do not handle the "change" action. If a device matching
 those
           rules is retriggered with the "change" action (as is intended
 here)
           it would suddenly lose the relevant properties. This always has
 been
           problematic, but as soon as all udev devices are triggered on
 relevant
           package upgrades this will become particularly so. It is
 strongly
           recommended to fix offending rules so that they can handle a
 "change"
           action at any time, and acquire all necessary udev properties
 even
           then. Or in other words: the header guard mentioned above
           (ACTION=="remove",GOTO="xyz_end") is the correct approach to
 handle
           this, as it makes sure rules are rerun on "change" correctly,
 and
           accumulate the correct and complete set of udev properties. udev
 rule
           definitions that cannot handle "change" events being triggered
 at
           arbitrary times should be considered buggy.
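
To illustrate why an "add"-only guard loses properties on retrigger, here is
a small Python model. It assumes, as described above, that properties are
recomputed from scratch for every uevent. The guard names and the ID_XYZ
property are made up for the example; this is not how udev itself is
implemented.

```python
def run_rules(action: str, guard: str) -> dict:
    """Return the properties a hypothetical rules file sets for one uevent."""
    props = {}
    if guard == "add-only" and action != "add":
        return props  # rules skipped: nothing is set for this event
    if guard == "not-remove" and action == "remove":
        return props  # models ACTION=="remove", GOTO="xyz_end"
    props["ID_XYZ"] = "1"  # hypothetical property
    return props


def replay(actions: list, guard: str) -> dict:
    """Process uevents in order; each event replaces the previous properties."""
    database = {}
    for action in actions:
        database = run_rules(action, guard)
    return database
```

Replaying "add" followed by "change" leaves an empty property set with the
"add-only" guard, while the "not-remove" guard keeps ID_XYZ.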

         * The MountAPIVFS= service file setting now defaults to on if
           RootImage= and RootDirectory= are used, which means that with
 those
           two settings /proc/, /sys/ and /dev/ are automatically properly
 set
           up for services. Previous behaviour may be restored by
 explicitly
           setting MountAPIVFS=off.

         * Since PAM 1.2.0 (2015) configuration snippets may be placed in
           /usr/lib/pam.d/ in addition to /etc/pam.d/. If a file exists in
 the
           latter it takes precedence over the former, similar to how most
 of
           systemd's own configuration is handled. Given that PAM stack
           definitions are primarily put together by OS
 vendors/distributions
           (though possibly overridden by users), this systemd release
 moves its
           own PAM stack configuration for the "systemd-user" PAM service
 (i.e.
           for the PAM session invoked by the per-user user@.service
 instance)
           from /etc/pam.d/ to /usr/lib/pam.d/. We recommend moving all
           packages' vendor versions of their PAM stack definitions from
           /etc/pam.d/ to /usr/lib/pam.d/, but if such OS-wide migration is
 not
           desired the location to which systemd installs its PAM stack
           configuration may be changed via the -Dpamconfdir Meson option.
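
The precedence described above (a file in /etc/pam.d/ wins over one of the
same name in /usr/lib/pam.d/) can be sketched like this. It is a simplified
model with a hypothetical function name, not PAM's actual lookup code.

```python
from pathlib import Path
from typing import Optional


def find_pam_stack(service: str,
                   etc_dir: str = "/etc/pam.d",
                   vendor_dir: str = "/usr/lib/pam.d") -> Optional[Path]:
    # Check the administrator's directory first, then the vendor one.
    for directory in (etc_dir, vendor_dir):
        candidate = Path(directory) / service
        if candidate.is_file():
            return candidate
    return None
```

Calling find_pam_stack("systemd-user") on a system that only ships the
vendor file would return the path below /usr/lib/pam.d/.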

         * The runtime dependencies on libqrencode, libpcre2,
 libidn/libidn2,
           libpwquality and libcryptsetup have been changed to be based on
           dlopen(): instead of regular dynamic library dependencies
 declared in
           the binary ELF headers, these libraries are now loaded on demand
           only, if they are available. If the libraries cannot be found
 the
           relevant operations will fail gracefully, or a suitable fallback
           logic is chosen. This is supposed to be useful for general
 purpose
           distributions, as it allows minimizing the list of dependencies
 the
           systemd packages pull in, permitting building of more minimal OS
           images, while still making use of these "weak" dependencies
 should
           they be installed. Since many package managers automatically
           synthesize package dependencies from ELF shared library
 dependencies,
           some additional manual packaging work has to be done now to
 replace
           those (slightly downgraded from "required" to "recommended" or
           whatever is conceptually suitable for the package manager). Note
 that
           this change does not alter build-time behaviour: as before the
           build-time dependencies have to be installed during build, even
 if
           they now are optional during runtime.
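
The "load on demand, fall back if missing" pattern can be illustrated with a
Python analogue. systemd itself does this in C with dlopen(). The sketch
below uses ctypes.CDLL, which wraps dlopen() on Linux; the helper name is
hypothetical.

```python
import ctypes
from typing import Optional


def load_optional(soname: str) -> Optional[ctypes.CDLL]:
    # Try to load the shared library at run time; a missing library
    # is not an error, the caller simply falls back.
    try:
        return ctypes.CDLL(soname)
    except OSError:
        return None
```

A caller would check the result for None and either skip the feature or
pick a fallback, instead of failing at program start-up.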

         * sd-event.h gained a new call sd_event_add_time_relative() for
           installing timers relative to the current time. This is mostly a
           convenience wrapper around the pre-existing sd_event_add_time()
 call
           which installs absolute timers.

         * sd-event event sources may now be placed in a new "exit-on-
 failure"
           mode, which may be controlled via the new
           sd_event_source_get_exit_on_failure() and
           sd_event_source_set_exit_on_failure() functions. If enabled, any
           failure returned by the event source handler functions will
 result in
           exiting the event loop (unlike the default behaviour of just
           disabling the event source but continuing with the event loop).
 This
           feature is useful to set for all event sources that define
 "primary"
           program behaviour (where failure should be fatal) in contrast to
           "auxiliary" behaviour (where failure should remain local).

         * Most event source types sd-event supports now accept a NULL
 handler
           function, in which case the event loop is exited once the event
           source is to be dispatched, using the userdata pointer —
 converted to
           a signed integer — as exit code of the event loop. Previously
 this
           was supported for IO and signal event sources already. Exit
 event
           sources still do not support this (simply because it makes
 little
           sense there, as the event loop is already exiting when they are
           dispatched).

         * A new per-unit setting RootImageOptions= has been added which
 allows
           tweaking the mount options for any file system mounted as an effect
 of
           the RootImage= setting.

         * Another new per-unit setting MountImages= has been added, that
 allows
           mounting additional disk images into the file system tree
 accessible
           to the service.

         * Timer units gained a new FixedRandomDelay= boolean setting. If
           enabled, the random delay configured with RandomizedDelaySec= is
           selected in a way that is stable on a given system (though still
           different for different units).
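
One way to get a delay that is stable on a given system but differs between
units is to derive it from a hash of a machine identifier and the unit name.
The sketch below only illustrates that idea. The hash function and inputs
are assumptions, and it does not claim to reproduce the derivation systemd
actually uses.

```python
import hashlib


def fixed_delay_usec(machine_id: str, unit_name: str,
                     max_delay_usec: int) -> int:
    # Same machine and unit always yield the same delay in [0, max].
    if max_delay_usec <= 0:
        return 0
    data = f"{machine_id}\0{unit_name}".encode()
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "big") % (max_delay_usec + 1)
```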

         * Socket units gained a new setting Timestamping= that takes "us",
 "ns"
           or "off". This controls the SO_TIMESTAMP/SO_TIMESTAMPNS socket
           options.

         * systemd-repart now generates JSON output when requested with the
 new
           --json= switch.

         * systemd-machined's OpenMachineShell() bus call will now pass
           additional policy metadata fields to the PolicyKit
           authentication request.

         * systemd-tmpfiles gained a new -E switch, which is equivalent to
           --exclude-prefix=/dev --exclude-prefix=/proc --exclude-prefix=/run
           --exclude-prefix=/sys. It's particularly useful in combination with
 --root=,
           when operating on OS trees that do not have any of these four
 runtime
           directories mounted, as this means no files below these subtrees
 are
           created or modified, since those mount points should probably
 remain
           empty.

         * systemd-tmpfiles gained a new --image= switch which is like
 --root=,
           but takes a disk image instead of a directory as argument. The
           specified disk image is mounted inside a temporary mount
 namespace
           and the tmpfiles.d/ drop-ins stored in the image are executed
 and
           applied to the image. systemd-sysusers similarly gained a new
           --image= switch, that allows the sysusers.d/ drop-ins stored in
 the
           image to be applied onto the image.

         * Similarly, the journalctl command also gained an --image=
 switch,
           which is a quick one-step solution to look at the log data
 included
           in OS disk images.

         * journalctl's --output=cat option (which outputs the log content
           without any metadata, just the pure text messages) will now make
 use
           of terminal colors when run on a suitable terminal, similarly to
 the
           other output modes.

         * JSON group records now support a "description" string that may
 be
           used to add a human-readable textual description to such groups.
 This
           is supposed to match the user's GECOS field which traditionally
           didn't have a counterpart for group records.

         * The "systemd-dissect" tool that may be used to inspect OS disk
 images
           and that was previously installed to /usr/lib/systemd/ has now
 been
           moved to /usr/bin/, reflecting its updated status of an
 officially
           supported tool with a stable interface. It gained support for a
 new
           --mkdir switch which when combined with --mount has the effect
 of
           creating the directory to mount the image to if it is missing
           first. It also gained two new commands --copy-from and --copy-to
 for
           copying files and directories in and out of an OS image without
 the
           need to manually mount it. It also acquired support for a new
 option
           --json= to generate JSON output when inspecting an OS image.

         * The cgroup2 file system is now mounted with the
           "memory_recursiveprot" mount option, supported since kernel 5.7.
 This
           means that the MemoryLow= and MemoryMin= unit file settings now
 apply
           recursively to whole subtrees.

         * systemd-homed now defaults to using the btrfs file system — if
           available — when creating home directories in LUKS volumes. This
 may
           be changed with the DefaultFileSystemType= setting in
 homed.conf.
           It's now the default file system in various major distributions
 and
           has the major benefit for homed that it can be grown and shrunk
 while
           mounted, unlike the other contenders ext4 and xfs, which can
 both be
           grown online, but not shrunk (in fact xfs is the technically
 most
           limited option here, as it cannot be shrunk at all).

         * JSON user records managed by systemd-homed gained support for
           "recovery keys". These are basically secondary passphrases that
 can
           unlock user accounts/home directories. They are computer-
 generated
           rather than user-chosen, and typically have greater entropy.
           homectl's --recovery-key= option may be used to add a recovery
 key to
           a user account. The generated recovery key is displayed as a QR
 code,
           so that it can be scanned to be kept in a safe place. This
 feature is
           particularly useful in combination with systemd-homed's support
 for
           FIDO2 or PKCS#11 authentication, as a secure fallback in case
 the
           security tokens are lost. Recovery keys may be entered wherever
 the
           system asks for a password.

         * systemd-homed now maintains a "dirty" flag for each LUKS
 encrypted
           home directory which indicates that a home directory has not
 been
           deactivated cleanly when offline. This flag is useful to
 identify
           home directories for which the offline discard logic did not run
 when
           offlining, and where it would be a good idea to log in again to
 catch
           up.

         * systemctl gained a new parameter --timestamp= which may be used
 to
           change the style in which timestamps are output, i.e. whether to
 show
           them in local timezone or UTC, or whether to show µs
 granularity.

         * Alibaba's "pouch" container manager is now detected by
           systemd-detect-virt, ConditionVirtualization= and similar
           constructs. Similarly, they now also recognize IBM PowerVM machine
           virtualization.

         * systemd-nspawn has been reworked to use /run/host/incoming/ as
           the place for propagating external mounts into the
           container. Similarly /run/host/notify is now used as the socket
 path
           for container payloads to communicate with the container manager
           using sd_notify(). The container manager now uses the
           /run/host/inaccessible/ directory to place "inaccessible" file
 nodes
           of all relevant types which may be used by the container payload
 as
           bind mount source to over-mount inodes to make them
 inaccessible.
           /run/host/container-manager will now be initialized with the
 same
           string as the $container environment variable passed to the
           container's PID 1. /run/host/container-uuid will be initialized
 with
           the same string as $container_uuid. This means the /run/host/
           hierarchy is now the primary way to make host resources
 available to
           the container. The Container Interface documents these new files
 and
           directories:

           https://systemd.io/CONTAINER_INTERFACE

         * Support for the "ConditionNull=" unit file condition has been
           deprecated and undocumented for 6 years. systemd started to warn
           about its use 1.5 years ago. It has now been removed entirely.

         * sd-bus.h gained a new API call sd_bus_error_has_names(), which
 takes
           a sd_bus_error struct and a list of error names, and checks if
 the
           error matches one of these names. It's a convenience wrapper
 that is
           useful in cases where multiple errors shall be handled the same
 way.

         * A new system call filter list "@known" has been added, that
 contains
           all system calls known at the time systemd was built.

         * Behaviour of system call filter allow lists has changed
 slightly:
           system calls that are contained in @known will result in a EPERM
 by
           default, while those not contained in it result in ENOSYS. This
           should improve compatibility because known system calls will
 thus be
           communicated as prohibited, while unknown (and thus newer ones)
 will
           be communicated as not implemented, which hopefully has the
 greatest
           chance of triggering the right fallback code paths in client
           applications.
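
The EPERM/ENOSYS split described above can be modelled as below. The
function and the example system call names are hypothetical. The real
decision is made by the seccomp filter that systemd installs, not by
Python code.

```python
import errno


def filtered_syscall_errno(name: str, allow_list: set, known: set) -> int:
    # Allowed calls pass through (0 stands for "no error" here).
    if name in allow_list:
        return 0
    # Known but not allowed: report "prohibited".
    # Unknown (likely newer than the build): report "not implemented".
    return errno.EPERM if name in known else errno.ENOSYS
```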

         * "systemd-analyze syscall-filter" will now show two separate
 sections
           at the bottom of the output: system calls known during systemd
 build
           time but not included in any of the filter groups shown above,
 and
           system calls defined on the local kernel but not known during
 systemd
           build time.

         * If the $SYSTEMD_LOG_SECCOMP=1 environment variable is set for
           systemd-nspawn all system call filter violations will be logged
 by
           the kernel (audit). This is useful for tracking down system
 calls
           invoked by container payloads that are prohibited by the
 container's
           system call filter policy.

         * If the $SYSTEMD_SECCOMP=0 environment variable is set for
           systemd-nspawn (and other programs that use seccomp) all seccomp
           filtering is turned off.

         * Two new unit file settings ProtectProc= and ProcSubset= have
 been
           added that expose the hidepid= and subset= mount options of
 procfs.
           All processes of the unit will only see processes in /proc that
           are owned by the unit's user. This is an important new
 sandboxing
           option that is recommended to be set on all system services. All
           long-running system services that are included in systemd itself
 set
           this option now. This option is only supported on kernel 5.8 and
           above, since the hidepid= option supported on older kernels was
 not a
           per-mount option but actually applied to the whole PID
 namespace.

         * Socket units gained a new boolean setting FlushPending=. If
 enabled
           all pending socket data/connections are flushed whenever the
 socket
           unit enters the "listening" state, i.e. after the associated
 service
           exited.

         * The unit file setting NUMAMask= gained a new "all" value: when
 used,
           all existing NUMA nodes are added to the NUMA mask.

         * A new "credentials" logic has been added to system services.
 This is
           a simple mechanism to pass privileged data to services in a safe
 and
           secure way. It's supposed to be used to pass per-service secret
 data
           such as passwords or cryptographic keys but also associated less
           private information such as user names, certificates, and
 similar to
           system services. Each credential is identified by a short user-
 chosen
           name and may contain arbitrary binary data. Two new unit file
           settings have been added: SetCredential= and LoadCredential=.
 The
           former allows setting a credential to a literal string, the
 latter
           sets a credential to the contents of a file (or data read from a
           user-chosen AF_UNIX stream socket). Credentials are passed to
 the
           service via a special credentials directory, one file for each
           credential. The path to the credentials directory is passed in a
 new
           $CREDENTIALS_DIRECTORY environment variable. Since the
 credentials
           are passed in the file system they may be easily referenced in
           ExecStart= command lines too, thus no explicit support for the
           credentials logic in daemons is required (though ideally daemons
           would look for the bits they need in $CREDENTIALS_DIRECTORY
           themselves automatically, if set). The $CREDENTIALS_DIRECTORY is
           backed by unswappable memory if privileges allow it, immutable
 if
           privileges allow it, is accessible only to the service's UID,
 and is
           automatically destroyed when the service stops.
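
A daemon that wants to pick up its credentials itself could do roughly the
following. This is a minimal sketch: the helper name and the credential name
"db-password" are made up for the example, and only the
$CREDENTIALS_DIRECTORY variable comes from the text above.

```python
import os
from pathlib import Path
from typing import Optional


def read_credential(name: str) -> Optional[bytes]:
    # The service manager exports the directory path in this variable.
    directory = os.environ.get("CREDENTIALS_DIRECTORY")
    if not directory:
        return None
    try:
        # One file per credential; contents may be arbitrary binary data.
        return (Path(directory) / name).read_bytes()
    except FileNotFoundError:
        return None
```

For example, read_credential("db-password") returns the raw bytes of that
credential, or None when the service was started without it.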

         * systemd-nspawn supports the same credentials logic. It can both
           consume credentials passed to it via the aforementioned
           $CREDENTIALS_DIRECTORY protocol as well as pass these
 credentials on
           to its payload. The service manager/PID 1 has been updated to
 match
           this: it can also accept credentials from the container manager
 that
           invokes it (in fact: any process that invokes it), and passes
 them on
           to its services. Thus, credentials can be propagated recursively
 down
           the tree: from a system's service manager to a systemd-nspawn
           service, to the service manager that runs as container payload
 and to
           the service it runs below. Credentials may also be added on the
           systemd-nspawn command line, using new --set-credential= and
           --load-credential= command line switches that match the
           aforementioned service settings.

         * systemd-repart gained new settings Format=, Encrypt=, CopyFiles=
 in
           the partition drop-ins which may be used to format/LUKS
           encrypt/populate any created partitions. The partitions are
           encrypted/formatted/populated before they are registered in the
           partition table, so that they appear atomically: either the
           partitions do not exist yet or they exist fully encrypted,
 formatted,
           and populated — there is no time window where they are
           "half-initialized". Thus the system is robust to abrupt
 shutdown: if
           the tool is terminated half-way during its operations on next
 boot it
           will start from the beginning.

         * systemd-repart's --size= operation gained a new "auto" value. If
           specified, and operating on a loopback file it is automatically
 sized
           to the minimal size the size constraints permit. This is useful
 to
           use "systemd-repart" as an image builder for minimally sized
 images.

         * systemd-resolved now gained a third IPC interface for requesting
 name
           resolution: besides D-Bus and local DNS to 127.0.0.53 a Varlink
           interface is now supported. The nss-resolve NSS module has been
           modified to use this new interface instead of D-Bus. Using
 Varlink
           has a major benefit over D-Bus: it works without a broker
 service,
           and thus already during earliest boot, before the dbus daemon
 has
           been started. This means name resolution via systemd-resolved
 now
           works at the same time systemd-networkd operates: from earliest
 boot
           on, including in the initrd.

         * systemd-resolved gained support for a new DNSStubListenerExtra=
           configuration file setting which may be used to specify
 additional IP
           addresses the built-in DNS stub shall listen on, in addition to
 the
           main one on 127.0.0.53:53.

         * Name lookups issued via systemd-resolved's D-Bus and Varlink
           interfaces (and thus also via glibc NSS if nss-resolve is used)
 will
           now honour a trailing dot in the hostname: if specified the
 search
           path logic is turned off. Thus "resolvectl query foo." is now
           equivalent to "resolvectl query --search=off foo.".

         * systemd-resolved gained a new D-Bus property "ResolvConfMode"
 that
           exposes how /etc/resolv.conf is currently managed: by resolved
 (and
           in which mode if so) or another subsystem. "resolvectl" will
 display
           this property in its status output.

         * The resolv.conf snippets systemd-resolved provides will now set
 "."
           as the search domain if no other search domain is known. This
 turns
           off the derivation of an implicit search domain by nss-dns for
 the
           hostname, when the hostname is set to an FQDN. This change is
 done to
           make nss-dns using resolv.conf provided by systemd-resolved
 behave
           more similarly to nss-resolve.

         * systemd-tmpfiles' file "aging" logic (i.e. the automatic clean-
 up of
           /tmp/ and /var/tmp/ based on file timestamps) now looks at the
           "birth" time (btime) of a file in addition to the atime, mtime,
 and
           ctime.

         * systemd-analyze gained a new verb "capability" that lists all
           capabilities known to the systemd build and to the kernel.

         * If a file /usr/lib/clock-epoch exists, PID 1 will read its mtime
 and
           advance the system clock to it at boot if it is noticed to be
 before
           that time. Previously, PID 1 would only advance the time to an
 epoch
           time that is set during build-time. With this new file OS
 builders
           can change this epoch timestamp on individual OS images without
           having to rebuild systemd.
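
The clock adjustment can be sketched as below. This is a model only. It
assumes the file's mtime is used as the epoch when the file exists and the
build-time epoch otherwise, and the function name is hypothetical.

```python
import os


def boot_clock(now: float, epoch_file: str, build_epoch: float) -> float:
    # Pick the epoch: the file's mtime if present, else the built-in one.
    try:
        epoch = os.stat(epoch_file).st_mtime
    except FileNotFoundError:
        epoch = build_epoch
    # The clock is only ever advanced, never set backwards.
    return max(now, epoch)
```

An image builder could thus bump the epoch by touching the file with a
newer timestamp, without rebuilding anything.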

         * systemd-logind will now listen to the KEY_RESTART key from the
 Linux
           input layer and reboot the system if it is pressed, similarly to
 how
           it already handles KEY_POWER, KEY_SUSPEND or KEY_SLEEP.
 KEY_RESTART
           was originally defined in the Multimedia context (to restart
 playback
           of a song or film), but is now primarily used in various
 embedded
           devices for "Reboot" buttons. Accordingly, systemd-logind will
 now
           honour it as such. This may be configured in more detail via the
 new
           HandleRebootKey= and RebootKeyIgnoreInhibited=.

         * systemd-nspawn/systemd-machined will now reconstruct hardlinks
 when
           copying OS trees, for example in "systemd-nspawn --ephemeral",
           "systemd-nspawn --template=", "machinectl clone" and similar.
 This is
           useful when operating with OSTree images, which use hardlinks
 heavily
           throughout, and where such copies previously resulted in
 "exploding"
           hardlinks.

         * systemd-nspawn's --console= setting gained support for a new
           "autopipe" value, which is identical to "interactive" when
 invoked on
           a TTY, and "pipe" otherwise.
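
The "autopipe" selection amounts to the following decision. This is a
sketch with a hypothetical function name; a real caller might pass the
result of os.isatty() on its standard input as on_tty.

```python
def resolve_console(mode: str, on_tty: bool) -> str:
    # "autopipe" picks one of the two existing modes; other values
    # are passed through unchanged.
    if mode == "autopipe":
        return "interactive" if on_tty else "pipe"
    return mode
```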

         * systemd-networkd's .network files gained support for explicitly
           configuring the multicast membership entries of bridge devices
 in the
           [BridgeMDB] section. It also gained support for the PIE queuing
           discipline in the [FlowQueuePIE] sections.

         * systemd-networkd's .netdev files may now be used to create
 "BareUDP"
           tunnels, configured in the new [BareUDP] section.

         * systemd-networkd's Gateway= setting in .network files now
 accepts the
           special values "_dhcp4" and "_ipv6ra" to configure additional,
           locally defined, explicit routes to the gateway acquired via
 DHCP or
           IPv6 Router Advertisements. The old setting "_dhcp" is
 deprecated,
           but still accepted for backwards compatibility.

         * systemd-networkd's [IPv6PrefixDelegation] section and
           IPv6PrefixDelegation= options have been renamed as [IPv6SendRA] and
           IPv6SendRA= (the old names are still accepted for backwards
           compatibility).

         * systemd-networkd's .network files gained the DHCPv6PrefixDelegation=
           boolean setting in the [Network] section. If enabled, the delegated
           prefix gained by another link will be configured, and an address
           within the prefix will be assigned.

         * systemd-networkd's .network files gained the Announce= boolean setting
           in the [DHCPv6PrefixDelegation] section. When enabled, the delegated
           prefix will be announced through IPv6 router advertisement (IPv6 RA).
           The setting is enabled by default.

         * VXLAN tunnels may now be marked as independent of any underlying
           network interface via the new Independent= boolean setting.

         * systemctl gained support for two new verbs: "service-log-level" and
           "service-log-target" may be used on services that implement the
           generic org.freedesktop.LogControl1 D-Bus interface to dynamically
           adjust the log level and target. All of systemd's long-running
           services support this now, but ideally all system services would
           implement this interface to make the system more uniformly
           debuggable.

         * The SystemCallErrorNumber= unit file setting now accepts the new
           "kill" and "log" actions, in addition to arbitrary error number
           specifications as before. With "kill", the processes are killed when
           the event occurs; with "log", the offending system call is audit
           logged.

         * A new SystemCallLog= unit file setting has been added that accepts a
           list of system calls that shall be logged (via audit).

         * The OS image dissection logic (as used by RootImage= in unit files or
           systemd-nspawn's --image= switch) has gained support for identifying
           and mounting explicit /usr/ partitions, which are now defined in the
           discoverable partition specification. This should be useful for
           environments where the root file system is
           generated/formatted/populated dynamically on first boot and combined
           with an immutable /usr/ tree that is supplied by the vendor.

         * In the final phase of shutdown, within the systemd-shutdown binary
           we'll now try to detach MD devices (i.e. software RAID) in addition to
           loopback block devices and DM devices as before. This is supposed to
           be a safety net only, in order to increase robustness if things go
           wrong. Storage subsystems are expected to properly detach their
           storage volumes during regular shutdown already (or in case of
           storage backing the root file system: in the initrd hook we return to
           later).

         * If the SYSTEMD_LOG_TID environment variable is set all systemd tools
           will now log the thread ID in their log output. This is useful when
           working with heavily threaded programs.

         * If the SYSTEMD_RDRAND environment variable is set to "0", systemd will
           not use the RDRAND CPU instruction. This is useful in environments
           such as replay debuggers where non-deterministic behaviour is not
           desirable.

         * The autopaging logic in systemd's various tools (such as systemctl)
           has been updated to turn on "secure" mode in "less"
           (i.e. $LESSSECURE=1) if execution in a "sudo" environment is
           detected. This disables invoking external programs from the pager,
           via the pipe logic. This behaviour may be overridden via the new
           $SYSTEMD_PAGERSECURE environment variable.

         * Units which have resource limits (.service, .mount, .swap, .slice,
           .socket, and .scope) gained new configuration settings
           ManagedOOMSwap=, ManagedOOMMemoryPressure=, and
           ManagedOOMMemoryPressureLimitPercent= that specify resource pressure
           limits and the optional action taken by systemd-oomd.

         * A new service systemd-oomd has been added. It monitors resource
           contention for selected parts of the unit hierarchy using the PSI
           information reported by the kernel, and kills processes when memory
           or swap pressure is above configured limits. This service is only
           enabled by default in developer mode (see below) and should be
           considered a preview in this release. Behaviour details and option
           names are subject to change without the usual backwards-compatibility
           promises.

         * A new helper oomctl has been added to introspect systemd-oomd state.
           It is only enabled by default in developer mode and should be
           considered a preview without the usual backwards-compatibility
           promises.

         * New meson option -Dcompat-mutable-uid-boundaries= has been added. If
           enabled, systemd reads the system UID boundaries from /etc/login.defs
           at runtime, instead of using the built-in values selected during
           build. This is an option to improve compatibility for upgrades from
           old systems. It's strongly recommended not to make use of this
           functionality on new systems (or even enable it during build), as it
           makes something runtime-configurable that is mostly an implementation
           detail of the OS, and permits avoidable differences in deployments
           that create all kinds of problems in the long run.

         * New meson option '-Dmode=developer|release' has been added. When
           'developer', additional checks and features are enabled that are
           relevant during upstream development, e.g. verification that
           semi-automatically-generated documentation has been properly updated
           following API changes. Those checks are considered hints for
           developers and are not actionable in downstream builds. In addition,
           extra features that are not ready for general consumption may be
           enabled in developer mode. It is thus recommended to set
           '-Dmode=release' in end-user and distro builds.

         * systemd-cryptsetup gained support for processing detached LUKS
           headers specified on the kernel command line via the header=
           parameter of the luks.options= kernel command line option. The same
           device/path syntax as for key files is supported for header files
           like this.

         * The "net_id" built-in of udev has been updated to ignore ACPI _SUN
           slot index data for devices that are connected through a PCI bridge
           where the _SUN index is associated with the bridge instead of the
           network device itself. Previously this would create ambiguous device
           naming if multiple network interfaces were connected to the same PCI
           bridge. Since this is a naming scheme incompatibility on systems that
           possess hardware like this, it has been introduced as the new naming
           scheme "v247". The previous scheme can be selected via the
           "net.naming-scheme=v245" kernel command line parameter.

         * ConditionFirstBoot= semantics have been modified to be safe towards
           abnormal system power-off during first boot. Specifically, the
           "systemd-machine-id-commit.service" service now acts as a boot
           milestone indicating when the first boot process is sufficiently
           complete in order to not consider the following boot also a
           first boot. If the system is reset before this unit is reached the
           first time, the next boot will still be considered a first boot; once
           it has been reached, no further boots will be considered a first
           boot. The "first-boot-complete.target" unit now acts as the official
           hook point to order against this. If a service shall be run on every
           boot until the first boot fully succeeds, it may thus be ordered
           before this target unit (and pull it in) and carry
           ConditionFirstBoot= appropriately.

         * bootctl's set-default and set-oneshot commands now accept the three
           special strings "@default", "@oneshot", "@current" in place of a boot
           entry id. These strings are resolved to the current default and
           oneshot boot loader entry, as well as the currently booted one. Thus
           a command "bootctl set-default @current" may be used to make the
           currently booted menu item the new default for all subsequent boots.

         * "systemctl edit" has been updated to show the original effective unit
           contents in commented form in the text editor.

         * Units in user mode are now segregated into three new slices:
           session.slice (units that form the core of the graphical session),
           app.slice ("normal" user applications), and background.slice
           (low-priority tasks). Unless otherwise configured, user units are
           placed in app.slice. The plan is to add resource limits and
           protections for the different slices in the future.

         * New GPT partition types for RISCV32/64 for the root and /usr
           partitions, and their associated Verity partitions have been defined,
           and are now understood by systemd-gpt-auto-generator, and the OS
           image dissection logic.

         Contributions from: Adolfo Jayme Barrientos, afg, Alec Moskvin, Alyssa
         Ross, Amitanand Chikorde, Andrew Hangsleben, Anita Zhang, Ansgar
         Burchardt, Arian van Putten, Aurelien Jarno, Axel Rasmussen, bauen1,
         Beniamino Galvani, Benjamin Berg, Bjørn Mork, brainrom, Chandradeep
         Dey, Charles Lee, Chris Down, Christian Göttsche, Christof Efkemann,
         Christoph Ruegge, Clemens Gruber, Daan De Meyer, Daniele Medri, Daniel
         Mack, Daniel Rusek, Dan Streetman, David Tardon, Dimitri John Ledkov,
         Dmitry Borodaenko, Elias Probst, Elisei Roca, ErrantSpore, Etienne
         Doms, Fabrice Fontaine, fangxiuning, Felix Riemann, Florian Klink,
         Franck Bui, Frantisek Sumsal, fwSmit, George Rawlinson, germanztz,
         Gibeom Gwon, Glen Whitney, Gogo Gogsi, Göran Uddeborg, Grant Mathews,
         Hans de Goede, Hans Ulrich Niedermann, Haochen Tong, Harald Seiler,
         huangyong, Hubert Kario, igo95862, Ikey Doherty, Insun Pyo, Jan Chren,
         Jan Schlüter, Jérémy Nouhaud, Jian-Hong Pan, Joerg Behrmann, Jonathan
         Lebon, Jörg Thalheim, Josh Brobst, Juergen Hoetzel, Julien Humbert,
         Kai-Chuan Hsieh, Kairui Song, Kamil Dudka, Kir Kolyshkin, Kristijan
         Gjoshev, Kyle Huey, Kyle Russell, Lee Whalen, Lennart Poettering,
         lichangze, Luca Boccassi, Lucas Werkmeister, Luca Weiss, Marc
         Kleine-Budde, Marco Wang, Martin Wilck, Marti Raudsepp, masmullin2000,
         Máté Pozsgay, Matt Fenwick, Michael Biebl, Michael Scherer, Michal
         Koutný, Michal Sekletár, Michal Suchanek, Mikael Szreder, Milo
         Casagrande, mirabilos, Mitsuha_QuQ, mog422, Muhammet Kara, Nazar
         Vinnichuk, Nicholas Narsing, Nicolas Fella, Njibhu, nl6720, Oğuz Ersen,
         Olivier Le Moal, Ondrej Kozina, onlybugreports, Pass Automated Testing
         Suite, Pat Coulthard, Pavel Sapezhko, Pedro Ruiz, perry_yuan, Peter
         Hutterer, Phaedrus Leeds, PhoenixDiscord, Piotr Drąg, Plan C,
         Purushottam choudhary, Rasmus Villemoes, Renaud Métrich, Robert Marko,
         Roman Beranek, Ronan Pigott, Roy Chen (陳彥廷), RussianNeuroMancer,
         Samanta Navarro, Samuel BF, scootergrisen, Sorin Ionescu, Steve Dodd,
         Susant Sahani, Timo Rothenpieler, Tobias Hunger, Tobias Kaufmann, Topi
         Miettinen, vanou, Vito Caputo, Weblate, Wen Yang, Whired Planck,
         williamvds, Yu, Li-Yu, Yuri Chornoivan, Yu Watanabe, Zbigniew
         Jędrzejewski-Szmek, Zmicer Turok, Дамјан Георгиевски

         – Warsaw, 2020-11-26
 }}}
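
 For anyone wanting to try the new first-boot ordering described in the
 notes above, here is a minimal sketch of a unit that uses it. The unit
 name and the ExecStart= path are hypothetical placeholders and are not
 shipped by systemd or LFS; only ConditionFirstBoot= and
 first-boot-complete.target are taken from the release notes.

 {{{
 # /etc/systemd/system/example-first-boot.service  (hypothetical name)
 [Unit]
 Description=Example one-time setup (illustration only)
 # Run only while the system still counts as being on its first boot.
 ConditionFirstBoot=yes
 # Order before the milestone target and pull it in, as the notes suggest.
 Before=first-boot-complete.target
 Wants=first-boot-complete.target

 [Service]
 Type=oneshot
 # Placeholder command; substitute the real setup program.
 ExecStart=/usr/local/sbin/example-first-boot-setup

 [Install]
 WantedBy=multi-user.target
 }}}

 Going by the notes, if the machine is reset before the milestone is
 reached, the next boot should still count as a first boot, so a unit
 like this would be expected to run again.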

--
Ticket URL: <http://wiki.linuxfromscratch.org/lfs/ticket/4745#comment:9>
LFS Trac <http://wiki.linuxfromscratch.org/lfs/>
Linux From Scratch: Your Distro, Your Rules.
-- 
http://lists.linuxfromscratch.org/listinfo/lfs-book
FAQ: http://www.linuxfromscratch.org/blfs/faq.html
Unsubscribe: See the above information page
