The PE number (@frozen_pe_no), filled by opal_pci_next_error() is in big-endian format. It should be converted to CPU-dian before it is passed to opal_pci_eeh_freeze_clear() when clearing the frozen state if the PE is invalid one. As Michael Ellerman pointed out, the issue is also detected by sparse:
gwshan@gwshan:~/sandbox/l$ make C=2 CF=-D__CHECK_ENDIAN__ \ arch/powerpc/platforms/powernv/eeh-powernv.o : arch/powerpc/platforms/powernv/eeh-powernv.c:1541:41: \ warning: incorrect type in argument 2 (different base types) arch/powerpc/platforms/powernv/eeh-powernv.c:1541:41: \ expected unsigned long long [unsigned] [usertype] pe_number arch/powerpc/platforms/powernv/eeh-powernv.c:1541:41: \ got restricted __be64 [addressable] [usertype] frozen_pe_no This passes CPU-endian PE number to opal_pci_eeh_freeze_clear() and it should be part of commit <0f36db77643b> ("powerpc/eeh: Fix wrong printed PE number"), which was merged to 4.3 kernel. Fixes: 71b540adffd9 ("powerpc/powernv: Don't escalate non-existing frozen PE") Cc: sta...@vger.kernel.org # v4.3+ Suggested-by: Paul Mackerras <pau...@samba.org> Signed-off-by: Gavin Shan <gws...@linux.vnet.ibm.com> --- arch/powerpc/platforms/powernv/eeh-powernv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/platforms/powernv/eeh-powernv.c b/arch/powerpc/platforms/powernv/eeh-powernv.c index 86544ea..75363d9 100644 --- a/arch/powerpc/platforms/powernv/eeh-powernv.c +++ b/arch/powerpc/platforms/powernv/eeh-powernv.c @@ -1538,7 +1538,7 @@ static int pnv_eeh_next_error(struct eeh_pe **pe) /* Try best to clear it */ opal_pci_eeh_freeze_clear(phb->opal_id, - frozen_pe_no, + be64_to_cpu(frozen_pe_no), OPAL_EEH_ACTION_CLEAR_FREEZE_ALL); ret = EEH_NEXT_ERR_NONE; } else if ((*pe)->state & EEH_PE_ISOLATED || -- 2.1.0 _______________________________________________ Linuxppc-dev mailing list Linuxppc-dev@lists.ozlabs.org https://lists.ozlabs.org/listinfo/linuxppc-dev