On Wed, May 20, 2026 at 05:03:55PM +0530, Mallesh Koujalagi wrote:
> When ``WEDGED=cold-reset`` is sent, it indicates that the device has
> encountered an error condition that cannot be resolved through other
> recovery methods such as driver rebind or bus reset, and requires a
> complete device power cycle to restore functionality.

...

> +Example - cold-reset
> +--------------------
> +
> +Udev rule::
> +
> +    SUBSYSTEM=="drm", ENV{WEDGED}=="cold-reset", DEVPATH=="*/drm/card[0-9]",
> +    RUN+="/path/to/cold-reset.sh $env{DEVPATH}"
> +
> +Recovery script::
> +
> +    #!/bin/sh
> +
> +    [ -z "$1" ] && echo "Usage: $0 <device-path>" && exit 1
> +
> +    # Get device
> +    DEVPATH=$(readlink -f /sys/$1/device 2>/dev/null || readlink -f /sys/$1)
> +    DEVICE=$(basename $DEVPATH)
> +
> +    echo "Cold reset: $DEVICE"
> +
> +    # The PCI core exposes a 'slot' symlink on the device that sits in a
> +    # registered hotplug slot. Use it directly instead of scanning every
> +    # slot on the system.
> +    SLOT=""
> +    if [ -L "$DEVPATH/slot" ]; then

I think we'll need to iterate up through the device hierarchy to find the
slot.
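
Something along these lines might work. This is an untested sketch:
find_slot is a made-up helper name, and it assumes the 'slot' symlink
described in the comment above shows up on some ancestor of the device
rather than on the device itself.

```sh
#!/bin/sh

# Hypothetical helper (not part of the patch): walk up from a sysfs
# device directory until an ancestor with a 'slot' symlink is found.
# Prints the slot name and returns 0, or returns 1 if nothing is found.
find_slot() {
    dir=$1
    # Stop once dirname has reduced the path to "/" or ".".
    while [ -n "$dir" ] && [ "$dir" != "/" ] && [ "$dir" != "." ]; do
        if [ -L "$dir/slot" ]; then
            basename "$(readlink -f "$dir/slot")"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}
```

The script could then do SLOT=$(find_slot "$DEVPATH") in place of the
single -L check, and keep the existing "No slot found" branch for the
case where the walk reaches the top without a match.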

Raag

> +         SLOT=$(basename "$(readlink -f "$DEVPATH/slot")")
> +    fi
> +
> +    if [ -n "$SLOT" ]; then
> +     echo "Using slot $SLOT"
> +
> +     # Remove device
> +     echo 1 > /sys/bus/pci/devices/$DEVICE/remove
> +
> +     # Power cycle slot. A platform-specific settle delay may be required
> +     # between power off and power on; tune to the hardware as needed.
> +     echo 0 > /sys/bus/pci/slots/$SLOT/power
> +     echo 1 > /sys/bus/pci/slots/$SLOT/power
> +
> +     # Rescan
> +     echo 1 > /sys/bus/pci/rescan
> +     echo "Done!"
> +    else
> +     echo "No slot found"
> +    fi
> +
>  Customization
>  -------------
>  
> -- 
> 2.34.1
> 
