I'm afraid if you're getting undervolt warnings that's almost certainly
the cause of any serious issues like kernel hangs. The lightning bolt
not being visible simply means that the undervolt condition is not
sustained, but nevertheless any such warnings indicate that your power
supply is dropping below 4.7V (usually under load). Before investigating
this further, we'd need to eliminate that as a potential cause.
** Changed in: linux (Ubuntu)
Status: Confirmed => Incomplete
https://bugs.launchpad.net/bugs/1889637
Title:
Raspberry Pi 3B hangs - dev_pm_opp_set_rate: failed to find current
OPP, Failed to get throttled, Failed to change plib frequency; mmc
timeout waiting for hardware interrupt
Status in linux package in Ubuntu:
Incomplete
Bug description:
### uname -a (64-bit ARM, official image):
`Linux ubuntu 5.4.0-1015-raspi #15-Ubuntu SMP Fri Jul 10 05:34:24 UTC 2020
aarch64 aarch64 aarch64 GNU/Linux`
### LSB release (Ubuntu *Server*, focal):
Description: Ubuntu 20.04.1 LTS
### Interesting packages installed
- zfs-dkms (with initramfs support) @ 0.8.3-1ubuntu12.2
* spl-dkms @ 0.8.3-1ubuntu12.2
- dphys-swapfile
### Hardware model:
Raspberry Pi 3 Model B
- 32 GiB SD card with root partition
* had a swap partition; now unused
* migrated to dphys-swapfile
- Attached 32 GiB USB stick as zpool for storage (not root FS)
- Current PSU reportedly outputs 2.4A supply for the Pi
* Still have occasional undervolt warnings (formally requires 2.5A)
* Lightning indicator not present however
- Connected over wireless networking
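The combination above (occasional undervolt warnings but no lightning indicator) can be checked against the firmware's throttling bitmask, which keeps separate bits for "happening now" and "has happened since boot". The following is a minimal sketch in Python, assuming the `vcgencmd` tool is installed and its `get_throttled` output has the form `throttled=0x50000`. The helper name `decode_throttled` is hypothetical, and the sketch only decodes a string that is passed to it rather than calling the tool itself.

```python
# Bit meanings as documented for `vcgencmd get_throttled`.
THROTTLED_FLAGS = {
    0: "under-voltage detected (now)",
    1: "ARM frequency capped (now)",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred since boot",
    17: "ARM frequency capping has occurred since boot",
    18: "throttling has occurred since boot",
    19: "soft temperature limit has occurred since boot",
}


def decode_throttled(output):
    """Return descriptions of the flags set in e.g. 'throttled=0x50000'."""
    # Take the part after '=' and parse it as hexadecimal.
    value = int(output.strip().split("=")[-1], 16)
    return [
        description
        for bit, description in sorted(THROTTLED_FLAGS.items())
        if value & (1 << bit)
    ]


# Example input (hypothetical, not taken from the affected machine):
for flag in decode_throttled("throttled=0x50000"):
    print(flag)
```

A value with bit 16 set but bit 0 clear would match a supply that has dipped at some point without the condition being sustained. Note that on the affected system the `Failed to get throttled (-110)` message suggests this query may itself time out once the hang begins.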
## Issue
- Under significant computational load, the machine eventually appears to
freeze.
* I usually log in headless via ssh, so from the outside the machine is
simply frozen and I need to pull the power cable
- With an HDMI monitor connected, the following messages appear, in a
different order each time:
```terminal
cpu cpu0: dev_pm_opp_set_rate: failed to find current OPP for freq
9,223,372,036,854,775,698 ({illegible on my photograph, presumably -110})
hwmon hwmon1: Failed to get throttled (-110)
raspberrypi-clk firmware clocks: Failed to change plib frequency: -110
mmc0: timeout waiting for hardware interrupt
# mmc0 would be the root partition
### ... typically later on in the output
rcu: INFO: rcu_sched detected stalls on CPU/tasks
rcu: $1-...0: (1 GPs behind) idle=.../1/0x40000{more 0s...}02
softirq=66377/66378 {or 26106/26107} fqs={this value varies}
INFO: task kworker/{...} blocked for more than 120 seconds
TAINTED: P WC OE 5.4.0-1015-raspi #15-ubuntu
watchdog: BUG: soft lockup - CPU #3 stuck for 22s!
```
The OPP frequency above looks to me like the likely cause of the
issue. I added the commas to the output myself, but it appears to be a
garbage value.
[This](https://lkml.org/lkml/2020/7/24/683) mailing list post, which I
found while searching for terms from the messages, appears to back up
my belief that we should be seeing a sensible CPU frequency here,
expressed in integer Hz. The value above would be about 9.2 EHz if Hz
is the base unit, and higher still if it is kHz/MHz/GHz. My best guess
is that this value has been produced somewhere as garbage, the system
understandably fails to scale the clock speed, and the resulting
crashes are presumably due to this.
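As a quick sanity check on the figure quoted above, the arithmetic can be sketched in Python. This only shows how far the value sits from 2^63 and what it would mean in Hz; it does not identify where in the kernel the value originates. Treating 110 as the magnitude of the `-110` error code seen in the other messages is an assumption, and the helper name `distance_below_2_63` is hypothetical.

```python
# Frequency as transcribed from the console photograph (commas removed).
reported_freq = 9_223_372_036_854_775_698

# Assumption: 110 is the magnitude of the "-110" seen in the other messages
# (ETIMEDOUT on Linux).
ERROR_MAGNITUDE = 110


def distance_below_2_63(freq):
    """Return how far freq lies below 2**63."""
    return 2**63 - freq


print(distance_below_2_63(reported_freq))  # → 110
print(distance_below_2_63(reported_freq) == ERROR_MAGNITUDE)
print(round(reported_freq / 1e18, 2), "EHz if the unit is Hz")
```

The value is exactly 110 below 2^63, which is at least consistent with an error code having ended up in the frequency rather than a real clock rate.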
Beyond this point there is no kernel panic, but the machine appears
locked up from the outside: it does not respond to NumLock on a USB
keyboard and is invisible on the network, while more and more errors
are gradually output to the console on the HDMI display, the most
notable being that the SD card is not responding.
Just before encountering this issue I had added a swap partition to
the SD card, as I had none by default and the system seemed to hang,
presumably while sending bad_allocs to userland processes as it failed
to allocate memory. Because the SD card is mentioned in the messages,
I removed the swap partition and switched to a swapfile with
`dphys-swapfile`, knowing that the way the Pi accesses the SD card is
somewhat different from a typical machine. I have also tried a variety
of power supplies, as I was getting several undervolt warnings.
However, neither change seems to have resolved the issue, which is
further evidence that the frequency scaling may well be the primary
problem and the rest is simply the carnage that ensues.
## Steps to Reproduce
- Seems to happen sporadically when the machine is under stress, within 5-25
minutes
- Currently I am trying to set up a rootless docker compose file
* Attempting to pull the images eventually leads to the issue
* The images are being downloaded to the zpool on the USB stick and *not*
the SD card
- The system seems to hang initially while waiting for the SD card to
respond to an IRQ
- However, I believe the CPU scaling message points to the root cause
- None of the important messages appear in the `syslog`; I need an
external HDMI monitor to see the output from the kernel ring buffer
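Since the messages never reach the `syslog` on the SD card, one option is to copy kernel ring buffer records to another machine as they appear. The following is a rough sketch in Python that reads `/dev/kmsg` and forwards each record over UDP. The function names and the host and port in the usage comment are hypothetical, reading `/dev/kmsg` may require root, and a user-space forwarder may itself stop running once the hang begins, so it may only capture the first few messages.

```python
import socket


def parse_kmsg_record(record):
    """Split one /dev/kmsg record into its header fields and message text.

    Records look like "priority,sequence,timestamp_usec,flags;message",
    optionally followed by continuation lines that begin with a space.
    """
    header, _, body = record.partition(";")
    fields = header.split(",")
    priority = int(fields[0])
    return {
        "facility": priority >> 3,   # upper bits of the syslog priority
        "level": priority & 7,       # lower three bits: 0 (emerg) to 7 (debug)
        "seq": int(fields[1]),
        "usec_since_boot": int(fields[2]),
        "message": body.split("\n", 1)[0],
    }


def forward_kmsg(host, port, path="/dev/kmsg"):
    """Send each kernel log record to host:port as one UDP datagram."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    with open(path, "rb", buffering=0) as kmsg:
        while True:
            try:
                raw = kmsg.read(8192)  # each read returns a single record
            except BrokenPipeError:
                continue  # records were overwritten before being read
            if not raw:
                break
            rec = parse_kmsg_record(raw.decode("utf-8", "replace"))
            line = "[%12.6f] <%d> %s" % (
                rec["usec_since_boot"] / 1e6, rec["level"], rec["message"])
            sock.sendto(line.encode("utf-8"), (host, port))


# Usage (hypothetical address of a second machine listening for UDP):
# forward_kmsg("192.0.2.10", 6666)
```

UDP is used here so that the sender never blocks waiting for the receiver.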
## Links
- [Related AskUbuntu
question](https://askubuntu.com/questions/1241412/ubuntu-20-04-lts-hangs-with-error-hwmon1-failed-to-get-throttled-110)
- [Potentially related bug - the frequency issue seems to be the same,
however the specific cause and a workaround are
different](https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/1875148)
## Extra
- Attaching /proc/cpuinfo
- Please let me know if any more diagnostics are required; I would use
hardinfo or inxi, but both want to install large parts of X, which I don't
want to do