On 2/25/26 11:46, Mario Limonciello wrote:
> On 2/25/2026 1:30 PM, Lizhi Hou wrote:
>> Using legacy driver with latest firmware causes a power off issue.
>> Fix this by assigning a different filename (npu_7.sbin) to the latest
>> firmware. The driver attempts to load the latest firmware first and falls
>> back to the previous firmware version if loading fails.
>>
>> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5009
>> Fixes: f1eac46fe5f7 ("accel/amdxdna: Update firmware version check for latest firmware")
>> Signed-off-by: Lizhi Hou <[email protected]>
>
> Thanks for the quick response on this one. A few comments inline.
>
>> ---
>>  drivers/accel/amdxdna/aie2_pci.c        | 21 +++++++++++++++++++--
>>  drivers/accel/amdxdna/amdxdna_pci_drv.c |  4 +++-
>>  drivers/accel/amdxdna/npu1_regs.c       |  2 +-
>>  drivers/accel/amdxdna/npu4_regs.c       |  2 +-
>>  drivers/accel/amdxdna/npu5_regs.c       |  2 +-
>>  drivers/accel/amdxdna/npu6_regs.c       |  2 +-
>>  6 files changed, 26 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/accel/amdxdna/aie2_pci.c b/drivers/accel/amdxdna/aie2_pci.c
>> index 4b3e6bb97bd2..884e7702b674 100644
>> --- a/drivers/accel/amdxdna/aie2_pci.c
>> +++ b/drivers/accel/amdxdna/aie2_pci.c
>> @@ -32,6 +32,11 @@ static int aie2_max_col = XRS_MAX_COL;
>>  module_param(aie2_max_col, uint, 0600);
>>  MODULE_PARM_DESC(aie2_max_col, "Maximum column could be used");
>>
>> +static char *npu_fw[] = {
>> +	"npu_7.sbin",
>> +	"npu.sbin"
>> +};
>> +
>>  /*
>>   * The management mailbox channel is allocated by firmware.
>>   * The related register and ring buffer information is on SRAM BAR.
>> @@ -489,6 +494,7 @@ static int aie2_init(struct amdxdna_dev *xdna)
>>  	struct psp_config psp_conf;
>>  	const struct firmware *fw;
>>  	unsigned long bars = 0;
>> +	char *fw_full_path;
>>  	int i, nvec, ret;
>>
>>  	if (!hypervisor_is_type(X86_HYPER_NATIVE)) {
>> @@ -503,10 +509,21 @@ static int aie2_init(struct amdxdna_dev *xdna)
>>  	ndev->priv = xdna->dev_info->dev_priv;
>>  	ndev->xdna = xdna;
>>
>> -	ret = request_firmware(&fw, ndev->priv->fw_path, &pdev->dev);
>> +	for (i = 0; i < ARRAY_SIZE(npu_fw); i++) {
>> +		fw_full_path = kasprintf(GFP_KERNEL, "%s%s", ndev->priv->fw_path,
>> +					 npu_fw[i]);
>> +		if (!fw_full_path)
>> +			return -ENOMEM;
>> +
>> +		ret = request_firmware(&fw, fw_full_path, &pdev->dev);
>> +		kfree(fw_full_path);
>> +		if (!ret)
>> +			break;
>
> Since you're falling through two different binaries, I think that it
> would be a good idea to use firmware_request_nowarn() and then have your
> own warning if both are missing.

Good point. I will send V2.

>> +	}
>> +
>>  	if (ret) {
>>  		XDNA_ERR(xdna, "failed to request_firmware %s, ret %d",
>> -			 ndev->priv->fw_path, ret);
>> +			 ndev->priv->fw_path, ret);
>
> Looks like unintended whitespace change.

Will fix this.

>>  		return ret;
>>  	}
>>
>> diff --git a/drivers/accel/amdxdna/amdxdna_pci_drv.c b/drivers/accel/amdxdna/amdxdna_pci_drv.c
>> index 4ada45d06fcf..d5c699e1afe4 100644
>> --- a/drivers/accel/amdxdna/amdxdna_pci_drv.c
>> +++ b/drivers/accel/amdxdna/amdxdna_pci_drv.c
>> @@ -22,7 +22,9 @@
>>  MODULE_FIRMWARE("amdnpu/1502_00/npu.sbin");
>>  MODULE_FIRMWARE("amdnpu/17f0_10/npu.sbin");
>>  MODULE_FIRMWARE("amdnpu/17f0_11/npu.sbin");
>> -MODULE_FIRMWARE("amdnpu/17f0_20/npu.sbin");
>
> I think this should be separate commit. It's actually a fix for this right?
>
> Fixes: 3ef93841033ed ("accel/amdxdna: Remove NPU2 support")

Correct. I will remove it from this patch.

Thanks,
Lizhi

>> +MODULE_FIRMWARE("amdnpu/1502_00/npu_7.sbin");
>> +MODULE_FIRMWARE("amdnpu/17f0_10/npu_7.sbin");
>> +MODULE_FIRMWARE("amdnpu/17f0_11/npu_7.sbin");
>>
>>  /*
>>   * 0.0: Initial version
>> diff --git a/drivers/accel/amdxdna/npu1_regs.c b/drivers/accel/amdxdna/npu1_regs.c
>> index 6f36a27b5a02..6e3d3ca69c04 100644
>> --- a/drivers/accel/amdxdna/npu1_regs.c
>> +++ b/drivers/accel/amdxdna/npu1_regs.c
>> @@ -72,7 +72,7 @@ static const struct aie2_fw_feature_tbl npu1_fw_feature_table[] = {
>>  };
>>
>>  static const struct amdxdna_dev_priv npu1_dev_priv = {
>> -	.fw_path = "amdnpu/1502_00/npu.sbin",
>> +	.fw_path = "amdnpu/1502_00/",
>>  	.rt_config = npu1_default_rt_cfg,
>>  	.dpm_clk_tbl = npu1_dpm_clk_table,
>>  	.fw_feature_tbl = npu1_fw_feature_table,
>> diff --git a/drivers/accel/amdxdna/npu4_regs.c b/drivers/accel/amdxdna/npu4_regs.c
>> index a8d6f76dde5f..ce25eef5fc34 100644
>> --- a/drivers/accel/amdxdna/npu4_regs.c
>> +++ b/drivers/accel/amdxdna/npu4_regs.c
>> @@ -98,7 +98,7 @@ const struct aie2_fw_feature_tbl npu4_fw_feature_table[] = {
>>  };
>>
>>  static const struct amdxdna_dev_priv npu4_dev_priv = {
>> -	.fw_path = "amdnpu/17f0_10/npu.sbin",
>> +	.fw_path = "amdnpu/17f0_10/",
>>  	.rt_config = npu4_default_rt_cfg,
>>  	.dpm_clk_tbl = npu4_dpm_clk_table,
>>  	.fw_feature_tbl = npu4_fw_feature_table,
>> diff --git a/drivers/accel/amdxdna/npu5_regs.c b/drivers/accel/amdxdna/npu5_regs.c
>> index c0a35cfd886c..c0ac5daf32ee 100644
>> --- a/drivers/accel/amdxdna/npu5_regs.c
>> +++ b/drivers/accel/amdxdna/npu5_regs.c
>> @@ -63,7 +63,7 @@
>>  #define NPU5_SRAM_BAR_BASE MMNPU_APERTURE1_BASE
>>
>>  static const struct amdxdna_dev_priv npu5_dev_priv = {
>> -	.fw_path = "amdnpu/17f0_11/npu.sbin",
>> +	.fw_path = "amdnpu/17f0_11/",
>>  	.rt_config = npu4_default_rt_cfg,
>>  	.dpm_clk_tbl = npu4_dpm_clk_table,
>>  	.fw_feature_tbl = npu4_fw_feature_table,
>> diff --git a/drivers/accel/amdxdna/npu6_regs.c b/drivers/accel/amdxdna/npu6_regs.c
>> index 1fb07df99186..ce591ed0d483 100644
>> --- a/drivers/accel/amdxdna/npu6_regs.c
>> +++ b/drivers/accel/amdxdna/npu6_regs.c
>> @@ -63,7 +63,7 @@
>>  #define NPU6_SRAM_BAR_BASE MMNPU_APERTURE1_BASE
>>
>>  static const struct amdxdna_dev_priv npu6_dev_priv = {
>> -	.fw_path = "amdnpu/17f0_10/npu.sbin",
>> +	.fw_path = "amdnpu/17f0_10/",
>>  	.rt_config = npu4_default_rt_cfg,
>>  	.dpm_clk_tbl = npu4_dpm_clk_table,
>>  	.fw_feature_tbl = npu4_fw_feature_table,
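
For reference, the fallback shape requested in the review (try each candidate quietly, newest first, and emit a single warning only when every candidate is missing) can be sketched in plain userspace C. This is not the V2 patch and does not call the kernel firmware API: load_fw_with_fallback(), fw_loader_fn, only_legacy_present() and nothing_present() are hypothetical names used only for illustration, and the loader callback stands in for a quiet request such as firmware_request_nowarn().

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a quiet firmware request:
 * returns 0 on success, a negative errno value on failure. */
typedef int (*fw_loader_fn)(const char *path);

/* Candidate file names, newest first, mirroring npu_fw[] in the patch. */
static const char *const fw_names[] = { "npu_7.sbin", "npu.sbin" };
#define FW_NAME_COUNT (sizeof(fw_names) / sizeof(fw_names[0]))

/*
 * Try every candidate under dir (dir must end with '/').
 * On success, copy the path that loaded into out and return 0.
 * If all candidates fail, print one warning and return the last error;
 * out is left untouched in that case.
 */
static int load_fw_with_fallback(const char *dir, fw_loader_fn load,
				 char *out, size_t out_len)
{
	char path[256];
	int ret = -ENOENT;
	size_t i;

	assert(dir && load && out);

	for (i = 0; i < FW_NAME_COUNT; i++) {
		int n = snprintf(path, sizeof(path), "%s%s", dir, fw_names[i]);

		if (n < 0 || (size_t)n >= sizeof(path))
			return -ENAMETOOLONG;

		ret = load(path);	/* no per-file warning here */
		if (!ret) {
			snprintf(out, out_len, "%s", path);
			return 0;
		}
	}

	/* Single warning, emitted only after every candidate failed. */
	fprintf(stderr, "no firmware found under %s, ret %d\n", dir, ret);
	return ret;
}

/* Demo loader: pretends only the legacy npu.sbin is installed. */
static int only_legacy_present(const char *path)
{
	const char *name = strrchr(path, '/');

	name = name ? name + 1 : path;
	return strcmp(name, "npu.sbin") == 0 ? 0 : -ENOENT;
}

/* Demo loader: pretends no firmware file is installed. */
static int nothing_present(const char *path)
{
	(void)path;
	return -ENOENT;
}
```

Calling load_fw_with_fallback("amdnpu/1502_00/", only_legacy_present, buf, sizeof(buf)) falls through npu_7.sbin without printing anything and succeeds on npu.sbin; with nothing_present it prints the single warning and returns -ENOENT.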
