Public bug reported:
### uname -a (64-bit ARM, official image):
`Linux ubuntu 5.4.0-1015-raspi #15-Ubuntu SMP Fri Jul 10 05:34:24 UTC 2020
aarch64 aarch64 aarch64 GNU/Linux`
### LSB release (Ubuntu *Server*, focal):
Description: Ubuntu 20.04.1 LTS
### Interesting packages installed
- zfs-dkms (with initramfs support) @ 0.8.3-1ubuntu12.2
* spl-dkms @ 0.8.3-1ubuntu12.2
- dphys-swapfile
### Hardware model:
Raspberry Pi 3 Model B
- 32 GiB SD card with root partition
* had a swap partition; now unused
* migrated to dphys-swapfile
- Attached 32 GiB USB stick as zpool for storage (not root FS)
- Current PSU reportedly outputs 2.4A supply for the Pi
* Still have occasional undervolt warnings (formally requires 2.5A)
* Lightning indicator not present however
- Connected over wireless networking
## Issue
- At some point under significant computational load, the machine appears to
freeze.
* I usually log in headlessly via ssh, so from the outside the machine is
simply frozen and I need to pull the power cable
- Connecting an HDMI monitor shows the following messages, in a different order
each time:
```terminal
cpu cpu0: dev_pm_opp_set_rate: failed to find current OPP for freq
9,223,372,036,854,775,698 ({illegible on my photograph, presumably -110})
hwmon hwmon1: Failed to get throttled (-110)
raspberrypi-clk firmware clocks: Failed to change plib frequency: -110
mmc0: timeout waiting for hardware interrupt
# mmc0 would be the root partition
### ... typically later on in the output
rcu: INFO: rcu_sched detected stalls on CPU/tasks
rcu: $1-...0: (1 GPs behind) idle=.../1/0x40000{more 0s...}02
softirq=66377/66378 {or 26106/26107} fqs={this value varies}
INFO: task kworker/{...} blocked for more than 120 seconds
TAINTED: P WC OE 5.4.0-1015-raspi #15-ubuntu
watchdog: BUG: soft lockup - CPU #3 stuck for 22s!
```
The OPP frequency above looks to me like the likely cause of the issue.
I added the commas to the output myself, but it appears to be a rubbish
value. [This](https://lkml.org/lkml/2020/7/24/683) mailing list archive,
which I found whilst searching for terms from the messages, appears to
back up my belief that we should be seeing a sensible CPU frequency
here, expressed in integer Hz. The above would be 9.2 EHz assuming hertz
is the base unit, and higher still if it is kHz, MHz or GHz. My best
guess is that this value has been produced somewhere as garbage and,
understandably, the system then fails to scale the clock speed, with the
resulting crashes presumably due to this.
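One observation about the value itself, offered as a possible hint rather than a diagnosis: it sits exactly 110 below 2^63, and 110 is the number in the `-110` error codes in the messages above (on Linux, errno 110 is `ETIMEDOUT`). The short Python check below only verifies that arithmetic and the 9.2 EHz figure. It does not show where in the kernel such a value would arise, and the match could be coincidental.

```python
# Value copied from the console message, with the commas removed.
observed = 9_223_372_036_854_775_698

# Linux errno for ETIMEDOUT; hard-coded because errno values differ by platform.
ETIMEDOUT_LINUX = 110

# Distance of the reported "frequency" below 2**63.
gap = 2**63 - observed
print(gap)                      # 110
print(gap == ETIMEDOUT_LINUX)   # True

# The same value read as plain hertz, in exahertz (1 EHz = 1e18 Hz).
print(f"{observed / 1e18:.1f} EHz if the unit is Hz")
```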
Beyond this point there is no kernel panic, but the machine locks up
externally: it does not respond to NumLock on a USB keyboard and is
invisible on the network, while more and more errors are gradually
output to the console on the HDMI display, the most notable being that
the SD card is not responding.
Just before encountering this issue I had added a swap partition to the
SD card, as I had none by default and the system seemed to be hanging,
presumably when it was sending bad_allocs to userland processes after
failing to allocate memory. As the SD card was mentioned in the errors,
I have tried a variety of power supplies (I was getting several
undervolt warnings) and eventually removed the swap partition in favour
of a swapfile managed by `dphys-swapfile`, knowing that the way the Pi
accesses the SD card is somewhat different from a typical machine.
However, neither change has resolved the issue, which is further
evidence that the frequency scaling may well be the primary issue and
the rest is simply the carnage that ensues.
## Steps to Reproduce
- Seems to happen sporadically when the machine is under stress, within 5-25
minutes
- Currently I am trying to set up a rootless docker compose file
* Attempting to pull the images eventually leads to the issue
* The images are being downloaded to the zpool on the USB stick and *not* the
SD card
- The system seems to hang initially waiting on the SD card to respond to an
IRQ; however, I believe the CPU scaling message is the root cause
- None of the important messages appear in the `syslog`; I need an external
HDMI monitor to see the kernel ring buffer output on screen
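
Since the interesting messages never reach the `syslog`, one way to capture the last frequency the kernel reported before a hang might be to poll the standard cpufreq sysfs node and force every sample out to storage. This is only a minimal sketch and has not been run on the affected machine. The function names and log format are made up for illustration, and it assumes the cpufreq node exists on this kernel and that the log is written somewhere other than the SD card.

```python
import os
import time

# Standard Linux cpufreq sysfs node; reports the current frequency in kHz.
CPUFREQ_NODE = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq"


def read_khz(node=CPUFREQ_NODE):
    """Return the frequency in kHz as an int, or None if the read fails."""
    try:
        with open(node) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def format_sample(now, khz):
    """Render one log line; a failed read is recorded rather than skipped."""
    value = "read-failed" if khz is None else f"{khz} kHz"
    return f"{now:.0f} cpu0 {value}\n"


def log_forever(log_path, interval=1.0):
    """Append one sample per interval, forcing each line out to storage."""
    with open(log_path, "a") as log:
        while True:
            log.write(format_sample(time.time(), read_khz()))
            log.flush()
            os.fsync(log.fileno())  # push past the page cache before a possible hang
            time.sleep(interval)
```

Calling `log_forever()` with a path on the USB zpool, then reading the tail of that file after the next forced power cycle, would show whether the reported frequency went bad before the SD card timeouts began.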
## Links
- [Related AskUbuntu
question](https://askubuntu.com/questions/1241412/ubuntu-20-04-lts-hangs-with-error-hwmon1-failed-to-get-throttled-110)
- [Potentially related bug - the frequency issue seems to be the same, however
the specific cause and a workaround are
different](https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/1875148)
## Extra
- Attaching /proc/cpuinfo
- Please let me know if any more diagnostics are required; I would use hardinfo
or inxi, but both want to install large parts of X, which I don't want to do
** Affects: ubuntu
Importance: Undecided
Status: New
** Tags: raspberrypi
** Attachment added: "/proc/cpuinfo"
https://bugs.launchpad.net/bugs/1889637/+attachment/5397187/+files/cpuinfo
https://bugs.launchpad.net/bugs/1889637
Title:
Raspberry Pi 3B hangs - dev_pm_opp_set_rate: failed to find current
OPP, Failed to get throttled, Failed to change plib frequency; mmc
timeout waiting for hardware interrupt