On Wed, May 20, 2026 at 05:03:55PM +0530, Mallesh Koujalagi wrote:
> When ``WEDGED=cold-reset`` is sent, it indicates that the device has
> encountered an error condition that cannot be resolved through other
> recovery methods such as driver rebind or bus reset, and requires a
> complete device power cycle to restore functionality.
...
> +Example - cold-reset
> +--------------------
> +
> +Udev rule::
> +
> + SUBSYSTEM=="drm", ENV{WEDGED}=="cold-reset", DEVPATH=="*/drm/card[0-9]",
> + RUN+="/path/to/cold-reset.sh $env{DEVPATH}"
> +
> +Recovery script::
> +
> + #!/bin/sh
> +
> + [ -z "$1" ] && echo "Usage: $0 <device-path>" && exit 1
> +
> + # Get device
> + DEVPATH=$(readlink -f /sys/$1/device 2>/dev/null || readlink -f /sys/$1)
> + DEVICE=$(basename $DEVPATH)
> +
> + echo "Cold reset: $DEVICE"
> +
> + # The PCI core exposes a 'slot' symlink on the device that sits in a
> + # registered hotplug slot. Use it directly instead of scanning every
> + # slot on the system.
> + SLOT=""
> + if [ -L "$DEVPATH/slot" ]; then
I think we'll need to walk up the device hierarchy to find the slot, since
the symlink may not be on the device itself.
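
Something along these lines, perhaps. This is only a rough sketch of the
walk: it relies on the 'slot' symlink that the comment in the patch
describes, and the helper name find_slot and its optional second argument
(where to stop climbing, /sys/devices by default) are made up for
illustration.

```shell
#!/bin/sh

# find_slot DEVPATH [STOP]  (hypothetical helper, not part of the patch)
#
# Start at DEVPATH and climb one parent directory at a time until a
# 'slot' symlink is found or STOP (default: /sys/devices) is reached.
# Prints the basename of the symlink target and returns 0 on success,
# returns 1 if neither the device nor any ancestor below STOP has one.
find_slot() {
    dir=$1
    stop=${2:-/sys/devices}

    # The "/" and "." checks keep the loop from spinning forever if
    # DEVPATH is not actually located under STOP.
    while [ -n "$dir" ] && [ "$dir" != "$stop" ] &&
          [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -L "$dir/slot" ]; then
            basename "$(readlink -f "$dir/slot")"
            return 0
        fi
        dir=$(dirname "$dir")
    done

    return 1
}
```

With that, the lookup in the script could become
SLOT=$(find_slot "$DEVPATH"), and the existing [ -n "$SLOT" ] check would
keep working unchanged because nothing is printed when no slot is found.
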
Raag
> + SLOT=$(basename "$(readlink -f "$DEVPATH/slot")")
> + fi
> +
> + if [ -n "$SLOT" ]; then
> + echo "Using slot $SLOT"
> +
> + # Remove device
> + echo 1 > /sys/bus/pci/devices/$DEVICE/remove
> +
> + # Power cycle slot. A platform-specific settle delay may be required
> + # between power off and power on; tune to the hardware as needed.
> + echo 0 > /sys/bus/pci/slots/$SLOT/power
> + echo 1 > /sys/bus/pci/slots/$SLOT/power
> +
> + # Rescan
> + echo 1 > /sys/bus/pci/rescan
> + echo "Done!"
> + else
> + echo "No slot found"
> + fi
> +
> Customization
> -------------
>
> --
> 2.34.1
>