[Touch-packages] [Bug 2002445] Re: udev NIC renaming race with mlx5_core driver

2023-03-24 Thread Mustafa Kemal Gilor
** Changed in: systemd (Ubuntu Focal)
 Assignee: (unassigned) => Mustafa Kemal Gilor (mustafakemalgilor)

** Changed in: systemd (Ubuntu Focal)
   Status: Triaged => In Progress

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/2002445

Title:
  udev NIC renaming race with mlx5_core driver

Status in systemd package in Ubuntu:
  Fix Committed
Status in systemd source package in Focal:
  In Progress
Status in systemd source package in Jammy:
  In Progress
Status in systemd source package in Kinetic:
  In Progress
Status in systemd source package in Lunar:
  Fix Committed

Bug description:
  [Impact]
  On systems with Mellanox NICs, udev's NIC renaming races with the
  mlx5_core driver's own configuration of subordinate interfaces. When
  the kernel wins this race, the device cannot be renamed as udev has
  attempted, and this causes network-online.target to time out waiting
  for links to be configured. This ultimately results in boot being
  delayed by about 2 minutes.

  [Test Plan]
  Repeated launches of Standard_D8ds_v5 instance types will generally hit
  this race in around 1 in 10 runs. Create a VM snapshot with updated
  systemd from ppa:enr0n/systemd-245. Launch 100 Standard_D8ds_v5
  instances with updated systemd. Assert no failure in cloud-init status
  and no 2 minute delay in network-online.target.

  To check for failure symptom:
    - Assert that network-online.target isn't the longest pole from
      systemd-analyze blame.
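
A minimal sketch of this check in Python, assuming the output of `systemd-analyze blame` has been captured as text. The helper names and the sample lines are hypothetical, and the sketch assumes the delay shows up in the blame output as a wait-online service (for example systemd-networkd-wait-online.service) rather than as the target itself:

```python
import re

# Seconds per time unit as printed in blame output (e.g. "2min 309ms", "5.012s").
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001, "us": 0.000001}
_TOKEN = re.compile(r"^(\d+(?:\.\d+)?)(h|min|ms|us|s)$")


def parse_blame(text):
    """Return (unit, seconds) pairs parsed from blame-style text."""
    entries = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        seconds = 0.0
        for token in parts[:-1]:
            match = _TOKEN.match(token)
            if match is None:
                break  # not a duration line, skip it
            seconds += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        else:
            entries.append((parts[-1], seconds))
    return entries


def slowest_unit(text):
    """Name of the unit with the longest activation time."""
    return max(parse_blame(text), key=lambda entry: entry[1])[0]


def wait_online_is_slowest(text):
    """True when a wait-online unit is the longest pole (failure symptom)."""
    return "wait-online" in slowest_unit(text)


# Hypothetical sample, not taken from a real instance.
SAMPLE_BLAME = """\
2min 309ms systemd-networkd-wait-online.service
    5.012s cloud-init.service
     812ms snapd.service
"""
```

On an instance that hit the failure, `wait_online_is_slowest()` would be expected to return True for the captured output; on a fixed instance it should return False.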

  To assert success condition during net rename busy race:
    - Assert that when "eth1" is still the primary device name, two
      altnames are listed (preserving the altname due to the primary NIC
      rename being hit).
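
The altname condition could be checked along these lines, assuming the link state was captured with `ip -json link show` and that iproute2 reports alternative names under an `altnames` key. The helper names and the sample altnames are hypothetical:

```python
import json


def altnames_for(ip_json_text, ifname):
    """Return the altnames list for ifname, or None if the link is absent."""
    for link in json.loads(ip_json_text):
        if link.get("ifname") == ifname:
            return link.get("altnames", [])
    return None


def rename_race_handled(ip_json_text, ifname="eth1"):
    """True when ifname is still present and carries exactly two altnames."""
    names = altnames_for(ip_json_text, ifname)
    return names is not None and len(names) == 2


# Hypothetical sample: the primary rename lost the race, altnames were kept.
SAMPLE_IP_JSON = '[{"ifname": "eth1", "altnames": ["enP1p0s2", "enP1s1"]}]'
```

If "eth1" no longer exists because the rename succeeded, `altnames_for()` returns None and the run counts as a success without the race.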

  Sample script uses pycloudlib to create modified base image for test
  and launches 100 VMs of type Standard_D8ds_v5, counting both successes
  and any failures seen.

  #!/usr/bin/env python3
  # This file is part of pycloudlib. See LICENSE file for license information.
  """Basic examples of various lifecycle with an Azure instance."""

  import logging
  import json

  import pycloudlib

  LOG = logging.getLogger()

  base_cfg = """#cloud-config
  ssh-import-id: [chad.smith, enr0n]
  """

  apt_cfg = """
  # Add developer PPA
  apt:
   sources:
     systemd-testing:
       source: "deb [allow-insecure=yes] https://ppa.launchpadcontent.net/enr0n/systemd-245/ubuntu focal main"
  # upgrade systemd after cloud-init is nearly done
  runcmd:
   - apt install systemd udev -y --allow-unauthenticated
  """

  debug_systemd_cfg = """
  # Create systemd-udev debug override.conf in base image
  write_files:
  - path: /etc/systemd/system/systemd-networkd.service.d/override.conf
    owner: root:root
    defer: {defer}
    content: |
      [Service]
      Environment=SYSTEMD_LOG_LEVEL=debug

  - path: /etc/systemd/system/systemd-udevd.service.d/override.conf
    owner: root:root
    defer: {defer}
    content: |
      [Service]
      Environment=SYSTEMD_LOG_LEVEL=debug
      LogRateLimitIntervalSec=0
  """

  # cloud_config installs the PPA build; cloud_config2 only enables debugging.
  cloud_config = base_cfg + apt_cfg + debug_systemd_cfg
  cloud_config2 = base_cfg + debug_systemd_cfg

  def debug_systemd_image_launch_overlake_v5_with_snapshot():
      """Test overlake v5 timeouts

      test procedure:
      - Launch base focal image
      - enable ppa:enr0n/systemd-245 and systemd/udev debugging
      - cloud-init clean --logs && deconfigure walinuxagent before shutdown
      - snapshot a base image
      - launch v5 system from snapshot
      - check systemd-analyze for expected timeout
      """
      client = pycloudlib.Azure(tag="azure")

      image_id = client.daily_image(release="focal")
      pub_path = "/home/ubuntu/.ssh/id_rsa.pub"
      priv_path = "/home/ubuntu/.ssh/id_rsa"

      client.use_key(pub_path, priv_path)

      # Base instance: gets the updated systemd and the debug overrides.
      base_instance = client.launch(
          image_id=image_id,
          instance_type="Standard_DS1_v2",
          user_data=cloud_config.format(defer="true"),
      )

      LOG.info(f"base instance: ssh ubuntu@{base_instance.ip}")
      base_instance.wait()
      LOG.info(base_instance.execute("apt-cache policy systemd"))
      snapshotted_image_id = client.snapshot(base_instance)

      # Counters for the 100-instance run.
      reproducer = False
      tries = 0
      success_count_with_race = 0
      success_count_no_race = 0
      failure_count_network_delay = 0
      failure_count_no_altnames = 0
      TEST_SUMMARY_TMPL = """
  - Test run complete: {tries} attempted -
  Successes without rename race: {success_count_no_race}
  Successes with rename race and preserved altname: {success_count_with_race}
  Failures d

[Touch-packages] [Bug 1978079] Re: EFI pstore not cleared on boot

2022-09-09 Thread Mustafa Kemal Gilor
Verification done for focal:

- Environment -

ubuntu@crustle:~$ uname -a
Linux crustle 5.4.0-125-generic #141-Ubuntu SMP Wed Aug 10 13:42:03 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@crustle:~$ cat /etc/os-release 
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal

ubuntu@crustle:~$ systemd --version
systemd 245 (245.4-4ubuntu3.18)
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP 
+GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 
default-hierarchy=hybrid

root@crustle:/home/ubuntu# cat /sys/module/pstore/parameters/backend
efi

-

- Test steps -

See [Test Plan]

-

- Result -

Verification : OK

root@crustle:/home/ubuntu# echo 1 > /proc/sys/kernel/sysrq
root@crustle:/home/ubuntu# echo 1 > /proc/sys/kernel/panic
root@crustle:/home/ubuntu# echo "c" > /proc/sysrq-trigger

* system reboots *

root@crustle:/home/ubuntu# ls /sys/fs/pstore
root@crustle:/home/ubuntu# ls /var/lib/systemd/pstore
166271364
root@crustle:/home/ubuntu# systemctl status systemd-pstore
● systemd-pstore.service - Platform Persistent Storage Archival
     Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
     Active: active (exited) since Fri 2022-09-09 08:57:29 UTC; 3min 10s ago
       Docs: man:systemd-pstore(8)
    Process: 639 ExecStart=/lib/systemd/systemd-pstore (code=exited, status=0/SUCCESS)
   Main PID: 639 (code=exited, status=0/SUCCESS)

Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364709001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364709001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364708001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364708001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364607001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364607001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364606001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364606001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364605001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364605001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364604001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364604001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364603001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364603001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364602001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364602001
Sep 09 08:57:29 crustle systemd-pstore[639]: PStore dmesg-efi-166271364601001 moved to /var/lib/systemd/pstore/166271364/dmesg-efi-166271364601001
Sep 09 08:57:29 crustle systemd[1]: Finished Platform Persistent Storage Archival.
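
The pass condition shown in this transcript (an empty /sys/fs/pstore and a populated /var/lib/systemd/pstore after the reboot) can be expressed as a small Python check. This is only a sketch: the function names are hypothetical, and the demo runs against temporary directories rather than the real paths:

```python
import os
import tempfile


def pstore_archived(pstore_dir="/sys/fs/pstore",
                    archive_dir="/var/lib/systemd/pstore"):
    """True when pstore_dir is empty and archive_dir has at least one entry."""
    return not os.listdir(pstore_dir) and bool(os.listdir(archive_dir))


def demo():
    """Exercise the check on temporary stand-ins for the two directories."""
    with tempfile.TemporaryDirectory() as pstore, \
            tempfile.TemporaryDirectory() as archive:
        before = pstore_archived(pstore, archive)  # nothing archived yet
        os.mkdir(os.path.join(archive, "166271364"))  # simulate an archived dump
        after = pstore_archived(pstore, archive)
        return before, after
```

Run as root on the rebooted system, `pstore_archived()` with its default paths should return True when the service has moved the dumps as in the log above.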

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Ubuntu
Touch seeded packages, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1978079

Title:
  EFI pstore not cleared on boot

Status in systemd package in Ubuntu:
  Fix Released
Status in systemd source package in Focal:
  Fix Committed
Status in systemd source package in Impish:
  Won't Fix
Status in systemd source package in Jammy:
  Fix Released
Status in systemd source package in Kinetic:
  Fix Released

Bug description:
  [Impact]

  Systemd has a systemd-pstore component that scans the pstore on boot
  and if non-empty, takes all previously created dumps, transfers them
  into its journal and removes the pstore elements. This is very
  important on UEFI systems, which only have a limited amount of space
  for variables.

  In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m
  which means the EFI pstore support gets loaded dynamically. In all of
  my boots, this dynamic module loading happened *after* systemd tried
  to check for pstore variables. So systemd-pstore never starts and
  never clears the UEFI variable store. I see this happening in AWS on
  Graviton instances, which eventually run out of space to store the
  dumps. On real hardware, this behavior may lead to unbootable systems.

  ```
  $ systemctl status systemd-pstore
  ○ systemd-pstore.service - Platform Persistent Storage Archival
   Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; 
vendor preset: enabled)
   Active: inactive (dead)
    Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
   └─ 

[Touch-packages] [Bug 1978079] Re: EFI pstore not cleared on boot

2022-06-20 Thread Mustafa Kemal Gilor
** Description changed:

+ [Impact]
+ 
  Systemd has a systemd-pstore component that scans the pstore on boot and
  if non-empty, takes all previously created dumps, transfers them into
  its journal and removes the pstore elements. This is very important on
  UEFI systems, which only have a limited amount of space for variables.
  
  In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m which
  means the EFI pstore support gets loaded dynamically. In all of my
  boots, this dynamic module loading happened *after* systemd tried to
  check for pstore variables. So systemd-pstore never starts and never
  clears the UEFI variable store. I see this happening in AWS on Graviton
  instances, which eventually run out of space to store the dumps. On real
  hardware, this behavior may lead to unbootable systems.
  
  ```
  $ systemctl status systemd-pstore
  ○ systemd-pstore.service - Platform Persistent Storage Archival
-  Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; 
vendor preset: enabled)
-  Active: inactive (dead)
-   Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
-  └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
-Docs: man:systemd-pstore(8)
+  Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; 
vendor preset: enabled)
+  Active: inactive (dead)
+   Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
+  └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
+    Docs: man:systemd-pstore(8)
  
  Jun 09 09:11:41 ip-172-31-0-61 systemd[1]: Condition check resulted in
  Platform Persistent Storage Archival being skipped.
  
  $ ls -la /sys/fs/pstore
  total 0
  drwxr-x--- 2 root root    0 Jun  9 09:11 .
  drwxr-xr-x 8 root root    0 Jun  9 09:11 ..
  -r--r--r-- 1 root root 1803 Jun  9 09:07 dmesg-efi-165476562001001
  -r--r--r-- 1 root root 1777 Jun  9 09:07 dmesg-efi-165476562002001
  -r--r--r-- 1 root root 1773 Jun  9 09:07 dmesg-efi-165476562003001
  -r--r--r-- 1 root root 1815 Jun  9 09:07 dmesg-efi-165476562004001
  -r--r--r-- 1 root root 1826 Jun  9 09:07 dmesg-efi-165476562005001
  -r--r--r-- 1 root root 1754 Jun  9 09:07 dmesg-efi-165476562006001
  -r--r--r-- 1 root root 1821 Jun  9 09:07 dmesg-efi-165476562007001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562008001
  -r--r--r-- 1 root root 1729 Jun  9 09:07 dmesg-efi-165476562009001
  -r--r--r-- 1 root root 1819 Jun  9 09:07 dmesg-efi-165476562010001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562011001
  -r--r--r-- 1 root root 1775 Jun  9 09:07 dmesg-efi-165476562012001
  -r--r--r-- 1 root root 1802 Jun  9 09:07 dmesg-efi-165476562013001
  -r--r--r-- 1 root root 1812 Jun  9 09:07 dmesg-efi-165476562014001
  -r--r--r-- 1 root root 1764 Jun  9 09:07 dmesg-efi-165476562015001
  -r--r--r-- 1 root root 1795 Jun  9 09:11 dmesg-efi-165476589801001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589802001
  -r--r--r-- 1 root root 1683 Jun  9 09:11 dmesg-efi-165476589803001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589804001
  -r--r--r-- 1 root root 1771 Jun  9 09:11 dmesg-efi-165476589805001
  -r--r--r-- 1 root root 1797 Jun  9 09:11 dmesg-efi-165476589806001
  -r--r--r-- 1 root root 1805 Jun  9 09:11 dmesg-efi-165476589807001
  -r--r--r-- 1 root root 1781 Jun  9 09:11 dmesg-efi-165476589808001
  -r--r--r-- 1 root root 1806 Jun  9 09:11 dmesg-efi-165476589809001
  -r--r--r-- 1 root root 1821 Jun  9 09:11 dmesg-efi-165476589810001
  -r--r--r-- 1 root root 1763 Jun  9 09:11 dmesg-efi-165476589811001
  -r--r--r-- 1 root root 1783 Jun  9 09:11 dmesg-efi-165476589812001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589813001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589814001
  -r--r--r-- 1 root root 1786 Jun  9 09:11 dmesg-efi-165476589815001
  ```
  
  This problem affects (at least) Ubuntu 20.04 and 22.04. A quick fix
  would be to configure CONFIG_EFI_VARS_PSTORE=y so that it's always
  available. A long term fix would make systemd rescan the directory after
  all module probing settled.
+ 
+ [Test Plan]
+ 
+ In order to be able to reproduce this issue, the system must have EFI-
+ backed pstore.
+ 
+ To check which backend pstore uses, run `cat
+ /sys/module/pstore/parameters/backend`.
+ 
+ If it says `efi`, the steps below are applicable. Otherwise, find an
+ environment that has EFI-backed pstore.
+ 
+ # Enable the pstore service. This service is supposed to move the data in /sys/fs/pstore
+ # to the `/var/lib/systemd/pstore` path on boot.
+ systemctl enable systemd-pstore.service # (or can be vendor enabled)
+ 
+ # Crash the kernel
+ echo 1 > /proc/sys/kernel/sysrq
+ echo 1 > /proc/sys/kernel/panic # this is usually set to zero, causing the kernel to loop over the panic and freeze
+ echo "c" > /proc/sysrq-trigger
+ 
+ # The system will reboot itself. Check `/sys/fs/pstore` path first:
+ ls 

[Touch-packages] [Bug 1978079] Re: EFI pstore not cleared on boot

2022-06-20 Thread Mustafa Kemal Gilor
** Tags added: ubuntu-sponsors

Title:
  EFI pstore not cleared on boot

Status in systemd package in Ubuntu:
  In Progress

Bug description:
  Systemd has a systemd-pstore component that scans the pstore on boot
  and if non-empty, takes all previously created dumps, transfers them
  into its journal and removes the pstore elements. This is very
  important on UEFI systems, which only have a limited amount of space
  for variables.

  In Ubuntu, the kernel is configured with CONFIG_EFI_VARS_PSTORE=m
  which means the EFI pstore support gets loaded dynamically. In all of
  my boots, this dynamic module loading happened *after* systemd tried
  to check for pstore variables. So systemd-pstore never starts and
  never clears the UEFI variable store. I see this happening in AWS on
  Graviton instances, which eventually run out of space to store the
  dumps. On real hardware, this behavior may lead to unbootable systems.

  ```
  $ systemctl status systemd-pstore
  ○ systemd-pstore.service - Platform Persistent Storage Archival
       Loaded: loaded (/lib/systemd/system/systemd-pstore.service; enabled; vendor preset: enabled)
       Active: inactive (dead)
    Condition: start condition failed at Thu 2022-06-09 09:11:41 UTC; 29min ago
               └─ ConditionDirectoryNotEmpty=/sys/fs/pstore was not met
         Docs: man:systemd-pstore(8)

  Jun 09 09:11:41 ip-172-31-0-61 systemd[1]: Condition check resulted in
  Platform Persistent Storage Archival being skipped.

  $ ls -la /sys/fs/pstore
  total 0
  drwxr-x--- 2 root root    0 Jun  9 09:11 .
  drwxr-xr-x 8 root root    0 Jun  9 09:11 ..
  -r--r--r-- 1 root root 1803 Jun  9 09:07 dmesg-efi-165476562001001
  -r--r--r-- 1 root root 1777 Jun  9 09:07 dmesg-efi-165476562002001
  -r--r--r-- 1 root root 1773 Jun  9 09:07 dmesg-efi-165476562003001
  -r--r--r-- 1 root root 1815 Jun  9 09:07 dmesg-efi-165476562004001
  -r--r--r-- 1 root root 1826 Jun  9 09:07 dmesg-efi-165476562005001
  -r--r--r-- 1 root root 1754 Jun  9 09:07 dmesg-efi-165476562006001
  -r--r--r-- 1 root root 1821 Jun  9 09:07 dmesg-efi-165476562007001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562008001
  -r--r--r-- 1 root root 1729 Jun  9 09:07 dmesg-efi-165476562009001
  -r--r--r-- 1 root root 1819 Jun  9 09:07 dmesg-efi-165476562010001
  -r--r--r-- 1 root root 1767 Jun  9 09:07 dmesg-efi-165476562011001
  -r--r--r-- 1 root root 1775 Jun  9 09:07 dmesg-efi-165476562012001
  -r--r--r-- 1 root root 1802 Jun  9 09:07 dmesg-efi-165476562013001
  -r--r--r-- 1 root root 1812 Jun  9 09:07 dmesg-efi-165476562014001
  -r--r--r-- 1 root root 1764 Jun  9 09:07 dmesg-efi-165476562015001
  -r--r--r-- 1 root root 1795 Jun  9 09:11 dmesg-efi-165476589801001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589802001
  -r--r--r-- 1 root root 1683 Jun  9 09:11 dmesg-efi-165476589803001
  -r--r--r-- 1 root root 1785 Jun  9 09:11 dmesg-efi-165476589804001
  -r--r--r-- 1 root root 1771 Jun  9 09:11 dmesg-efi-165476589805001
  -r--r--r-- 1 root root 1797 Jun  9 09:11 dmesg-efi-165476589806001
  -r--r--r-- 1 root root 1805 Jun  9 09:11 dmesg-efi-165476589807001
  -r--r--r-- 1 root root 1781 Jun  9 09:11 dmesg-efi-165476589808001
  -r--r--r-- 1 root root 1806 Jun  9 09:11 dmesg-efi-165476589809001
  -r--r--r-- 1 root root 1821 Jun  9 09:11 dmesg-efi-165476589810001
  -r--r--r-- 1 root root 1763 Jun  9 09:11 dmesg-efi-165476589811001
  -r--r--r-- 1 root root 1783 Jun  9 09:11 dmesg-efi-165476589812001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589813001
  -r--r--r-- 1 root root 1788 Jun  9 09:11 dmesg-efi-165476589814001
  -r--r--r-- 1 root root 1786 Jun  9 09:11 dmesg-efi-165476589815001
  ```

  This problem affects (at least) Ubuntu 20.04 and 22.04. A quick fix
  would be to configure CONFIG_EFI_VARS_PSTORE=y so that it's always
  available. A long term fix would make systemd rescan the directory
  after all module probing settled.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1978079/+subscriptions


-- 
Mailing list: https://launchpad.net/~touch-packages
Post to : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp


[Touch-packages] [Bug 1978079] Re: EFI pstore not cleared on boot

2022-06-20 Thread Mustafa Kemal Gilor
** Tags added: seg sts sts-sponsor

Title:
  EFI pstore not cleared on boot

Status in systemd package in Ubuntu:
  In Progress



[Touch-packages] [Bug 1978079] Re: EFI pstore not cleared on boot

2022-06-16 Thread Mustafa Kemal Gilor
** Changed in: systemd (Ubuntu)
 Assignee: (unassigned) => Mustafa Kemal Gilor (mustafakemalgilor)

** Changed in: systemd (Ubuntu)
   Status: New => In Progress

Title:
  EFI pstore not cleared on boot

Status in systemd package in Ubuntu:
  In Progress
