If anyone has to stay on this kernel, like I do, or is just too stubborn to downgrade, also like me: I've created a setup script that enables the watchdog with some timeouts that seem to work on my Pi 5 with Ubuntu 25.10. I assume your setup is the same if you are here chasing this issue.
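A quick way to check whether a node matches that description is sketched below. It assumes the affected series is 6.17 (taken from this bug's title) and that the Ethernet driver is loaded as the macb module, so that it shows up under /sys/module. The function names is_617_kernel and macb_loaded are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a kernel release string belongs to the 6.17 series.
is_617_kernel() {
    [[ "$1" == 6.17.* || "$1" == 6.17-* ]]
}

# Hypothetical helper: succeeds if the macb driver shows up under /sys/module.
# A driver built into the kernel may not appear there, so "no" is not conclusive.
macb_loaded() {
    [[ -d /sys/module/macb ]]
}

release=$(uname -r)
if is_617_kernel "${release}"; then
    echo "kernel ${release}: 6.17 series"
else
    echo "kernel ${release}: not 6.17"
fi

if macb_loaded; then
    echo "macb module: loaded"
else
    echo "macb module: not found under /sys/module"
fi
```

Either way, the check changes nothing on the node; the real work is done by the setup script.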
It will ping a list of IP addresses (they should be on the same network but must not belong to the machine in question). If ALL of them fail, it unloads and reloads the failed macb driver module and checks again after 10 seconds. If that does not work, the system reboots in ~60 seconds as a fallback.

The idea here is to detect the issue and attempt a repair quickly, to avoid inter-node timeouts and degraded replicas that would have to be rebuilt. The fallback reboot is still there for robustness if the repair fails.

Watchdog uses a hardware timer that must be "petted" every so often or it will reboot the system. Beware of boot loops and test /etc/watchdog.d/ping-targets before starting/enabling the watchdog. Should you get stuck in one, as I did when I set the watchdog timer too low, spamming the node with this saved me:

ssh <node> sudo systemctl disable watchdog

Hope this helps

---
# Based on post in https://forums.raspberrypi.com/viewtopic.php?t=89527 by Denny Fox

sudo apt-get install watchdog -y
sudo mkdir -p /etc/watchdog.d

# Set a range of IPs that currently ping and are not local in /etc/watchdog.d/targets
{
    # If this prints a loopback address such as 127.0.1.1 on your system, try hostname -I instead
    host_ips=$(hostname -i)
    prefix=192.168.220.      # <--- Set prefix
    for i in 11 {25..29}     # <--- Set range of IPs, in this case: 11, 25, 26, 27, 28, 29
    do
        ip=${prefix}${i}
        # -w -F: match the whole address literally, so .11 does not also match .110
        ! grep -qwF "${ip}" <<<"${host_ips}" && ping -q -c1 -W2 "${ip}" &> /dev/null && echo "${ip}"
    done
} | sudo tee /etc/watchdog.d/targets

# Ping script /etc/watchdog.d/ping-targets
cat <<"EOF" | sudo tee /etc/watchdog.d/ping-targets
#!/usr/bin/env bash
# A test/repair script for the raspberry pi watchdog
# This script only returns an error if *all* the hosts listed in the datafile
# do not respond to ping

LOGFILE=/var/log/ping-targets.log

log() {
    echo "$(date +'%Y%m%d %H:%M:%S') $*" >> "${LOGFILE}"
}

# Watchdog calls us again with arg1 = repair if we signal an error in test mode
if [[ "${1}" == "repair" ]]
then
    log "attempting repair"
    log "unloading macb module"
    modprobe -r macb
    log "reloading macb module"
    modprobe macb
    log "sleeping for 10 seconds"
    sleep 10
    log "confirming repair - pinging targets"
fi

# Try to ping each IP and exit status 0 on any success
# (the || test also handles a last line without a trailing newline)
while read -r ip || [[ -n ${ip} ]]
do
    ping -q -c1 -W 0.1 "${ip}" &> /dev/null && exit 0   # <--- Adjust -W ping timeout if needed
    log "${ip} ping failed"
done < /etc/watchdog.d/targets

log "all pings failed, exiting status 1"
exit 1
EOF
sudo chmod a+x /etc/watchdog.d/ping-targets

# Watchdog config
cat <<"EOF" | sudo tee /etc/watchdog.conf
watchdog-device = /dev/watchdog
watchdog-timeout = 60
test-timeout = 60
repair-timeout = 60
interval = 2
retry-timeout = 0
realtime = yes
priority = 1
EOF

sudo systemctl start watchdog

# Recommend enabling the watchdog to survive reboots when you are happy this works for you
sudo systemctl enable watchdog
---

To monitor:

sudo tail -f /var/log/ping-targets.log -n 50

A successful repair for the node with IP 192.168.220.27 would look something like this (note that ping successes are not recorded, to avoid spamming the log):

20251222 12:07:08 192.168.220.11 ping failed
20251222 12:07:08 192.168.220.25 ping failed
20251222 12:07:08 192.168.220.26 ping failed
20251222 12:07:08 192.168.220.28 ping failed
20251222 12:07:08 192.168.220.29 ping failed
20251222 12:07:08 all pings failed, exiting status 1
20251222 12:07:10 attempting repair
20251222 12:07:10 unloading macb module
20251222 12:07:10 reloading macb module
20251222 12:07:10 sleeping for 10 seconds
20251222 12:07:20 confirming repair - pinging targets

--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/2133877

Title:
  Complete network hang on Raspberry Pi 5 with kernel 6.17 under load - possibly related to CPU frequency scaling

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-raspi/+bug/2133877/+subscriptions

--
ubuntu-bugs mailing list
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
