Re: [PATCH] powerpc/eeh: avoid possible crash when edev->pdev changes

2024-06-13 Ganesh G R
On 6/11/24 8:18 AM, Michael Ellerman wrote:
> Hi Ganesh,
>
> Ganesh Goudar writes:
>> If a PCI device is removed during eeh_pe_report_edev(), edev->pdev
>> will change, which can cause a crash. Hold the PCI rescan/remove lock
>> while taking a copy of edev->pdev.
>>
>> Signed-off-by: Ganesh Goudar
>> ---

Re: [PATCH] powerpc/eeh: Permanently disable the removed device

2024-04-15 Ganesh G R
On 4/9/24 14:37, Michael Ellerman wrote:
> Hi Ganesh,
>
> Ganesh Goudar writes:
>> When a device is hot removed on powernv, the hotplug driver clears the
>> device's state. However, on pseries, if a device is removed by phyp
>> after reaching the error threshold, the kernel remains unaware,
>> leading to the

Re: [RFC 0/3] Asynchronous EEH recovery

2023-07-17 Ganesh G R
On 6/13/23 8:06 AM, Oliver O'Halloran wrote:
> On Tue, Jun 13, 2023 at 11:44 AM Ganesh Goudar wrote:
>> Hi,
>>
>> EEH recovery is currently serialized. These patches shorten the time
>> taken for EEH recovery by running the recovery in parallel. The
>> original author of these patches is Sam

Re: [PATCH v2] powerpc/mce: log the error for all unrecoverable errors

2023-02-01 Ganesh G R
On 1/31/23 4:59 PM, Michael Ellerman wrote:
> Ganesh Goudar writes:
>> We currently fail to log unrecoverable errors, because
>> machine_check_log_err() is not called for them. Raise irq work in
>> save_mce_event() for unrecoverable errors, so that we log the error