When certain PHB HW failure causes phyp to recover PHB, it marks the PE state as temporarily unavailable until recovery is complete. This also triggers an EEH handler in Linux which needs to notify drivers, and perform recovery. But before notifying the driver about the pci error it uses get_adapter_state()->get-sesnor-state() operation of the hotplug_slot to determine if the slot contains a device or not. if the slot is empty, the recovery is skipped entirely.
However on certain PHB failures, the rtas call get-sesnor-state() returns extended busy error (9902) until PHB is recovered by phyp. Once PHB is recovered, the get-sensor-state() returns success with correct presence status. The rtas call interface rtas_get_sensor() loops over the rtas call on extended delay return code (9902) until the return value is either success (0) or error (-1). This causes the EEH handler to get stuck for ~6 seconds before it could notify that the pci error has been detected and stop any active operations. Hence with running I/O traffic, during this 6 seconds, the network driver continues its operation and hits a timeout (netdev watchdog). On timeouts, network driver go into ffdc capture mode and reset path assuming the PCI device is in fatal condition. This sometimes causes EEH recovery to fail. This impacts the ssh connection and leads to the system being inaccessible. ------------ [52732.244731] DEBUG: ibm_read_slot_reset_state2() [52732.244762] DEBUG: ret = 0, rets[0]=5, rets[1]=1, rets[2]=4000, rets[3]=> [52732.244798] DEBUG: in eeh_slot_presence_check [52732.244804] DEBUG: error state check [52732.244807] DEBUG: Is slot hotpluggable [52732.244810] DEBUG: hotpluggable ops ? [52732.244953] DEBUG: Calling ops->get_adapter_status [52732.244958] DEBUG: calling rpaphp_get_sensor_state [52736.564262] ------------[ cut here ]------------ [52736.564299] NETDEV WATCHDOG: enP64p1s0f3 (tg3): transmit queue 0 timed o> [52736.564324] WARNING: CPU: 1442 PID: 0 at net/sched/sch_generic.c:478 dev> [...] [52736.564505] NIP [c000000000c32368] dev_watchdog+0x438/0x440 [52736.564513] LR [c000000000c32364] dev_watchdog+0x434/0x440 ------------ Fix this issue by introducing a new rtas_get_sensor_nonblocking() that does not get blocked on BUSY condition and returns immediately with error. Use this function in pseries pci hotplug driver which can return an error if slot presence state can not be detected immediately. Please note that only in certain PHB failures, the slot presence check returns BUSY condition. In normal cases it returns immediately with a correct presence state value. Hence this change has no impact on normal pci dlpar operations. We could use rtas_get_sensor_fast() variant, but it thorws WARN_ON on BUSY condition. The rtas_get_sensor_nonblocking() suppresses WARN_ON. Signed-off-by: Mahesh Salgaonkar <mah...@linux.ibm.com> --- This is an alternate approach to fix the EEH issue instead of delaying slot presence check proposed at https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-November/236956.html Also refer: https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-November/237027.html --- arch/powerpc/include/asm/rtas.h | 1 + arch/powerpc/kernel/rtas.c | 19 ++++++++++++++++--- drivers/pci/hotplug/rpaphp_pci.c | 8 ++++---- 3 files changed, 21 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/rtas.h b/arch/powerpc/include/asm/rtas.h index 9dc97d2f9d27e..d8e8befb1c193 100644 --- a/arch/powerpc/include/asm/rtas.h +++ b/arch/powerpc/include/asm/rtas.h @@ -250,6 +250,7 @@ extern void rtas_os_term(char *str); void rtas_activate_firmware(void); extern int rtas_get_sensor(int sensor, int index, int *state); extern int rtas_get_sensor_fast(int sensor, int index, int *state); +int rtas_get_sensor_nonblocking(int sensor, int index, int *state); extern int rtas_get_power_level(int powerdomain, int *level); extern int rtas_set_power_level(int powerdomain, int level, int *setlevel); extern bool rtas_indicator_present(int token, int *maxindex); diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c index ac61e226c9af6..fd5aa3bbd46c5 100644 --- a/arch/powerpc/kernel/rtas.c +++ b/arch/powerpc/kernel/rtas.c @@ -609,7 +609,8 @@ int rtas_get_sensor(int sensor, int index, int *state) } EXPORT_SYMBOL(rtas_get_sensor); -int rtas_get_sensor_fast(int sensor, int index, int *state) +static int +__rtas_get_sensor(int sensor, int index, int *state, bool warn_on) { int token = rtas_token("get-sensor-state"); int rc; @@ -618,14 +619,26 @@ int rtas_get_sensor_fast(int sensor, int index, int *state) return -ENOENT; rc = rtas_call(token, 2, 2, state, sensor, index); - WARN_ON(rc == RTAS_BUSY || (rc >= RTAS_EXTENDED_DELAY_MIN && - rc <= RTAS_EXTENDED_DELAY_MAX)); + WARN_ON(warn_on && + (rc == RTAS_BUSY || (rc >= RTAS_EXTENDED_DELAY_MIN && + rc <= RTAS_EXTENDED_DELAY_MAX))); if (rc < 0) return rtas_error_rc(rc); return rc; } +int rtas_get_sensor_fast(int sensor, int index, int *state) +{ + return __rtas_get_sensor(sensor, index, state, true); +} + +int rtas_get_sensor_nonblocking(int sensor, int index, int *state) +{ + return __rtas_get_sensor(sensor, index, state, false); +} +EXPORT_SYMBOL(rtas_get_sensor_nonblocking); + bool rtas_indicator_present(int token, int *maxindex) { int proplen, count, i; diff --git a/drivers/pci/hotplug/rpaphp_pci.c b/drivers/pci/hotplug/rpaphp_pci.c index c380bdacd1466..8a7d681254ce9 100644 --- a/drivers/pci/hotplug/rpaphp_pci.c +++ b/drivers/pci/hotplug/rpaphp_pci.c @@ -23,7 +23,7 @@ int rpaphp_get_sensor_state(struct slot *slot, int *state) int rc; int setlevel; - rc = rtas_get_sensor(DR_ENTITY_SENSE, slot->index, state); + rc = rtas_get_sensor_nonblocking(DR_ENTITY_SENSE, slot->index, state); if (rc < 0) { if (rc == -EFAULT || rc == -EEXIST) { @@ -38,10 +38,10 @@ int rpaphp_get_sensor_state(struct slot *slot, int *state) if (rc < 0) { dbg("%s: power on slot[%s] failed rc=%d.\n", __func__, slot->name, rc); - } else { - rc = rtas_get_sensor(DR_ENTITY_SENSE, - slot->index, state); + return rc; } + rc = rtas_get_sensor_nonblocking(DR_ENTITY_SENSE, + slot->index, state); } else if (rc == -ENODEV) info("%s: slot is unusable\n", __func__); else