From: Ming Lei <[email protected]>
When SCSI-MQ is enabled, on some systems the number of possible
CPUs (num_possible_cpus()) is greater than the number of vectors
requested by the driver. As a result, the driver can be granted
more MSI-X vectors than there are online CPUs. The driver then
uses pci_alloc_irq_vectors_affinity() to set up a 1:1 mapping
and affinity between MSI-X vectors and CPUs. When a command is
submitted on an MSI-X vector that is assigned to an offline CPU,
the result is an ABTS and a system hang. The hang occurs because
the driver cannot process an interrupt on a vector assigned to
an offline CPU.
This patch fixes the issue by passing an irq_offset value to
blk_mq_pci_map_queues() so that it uses only those CPUs that have
a CPU mask affinity assigned and are online. With irq_offset, the
driver lets the online cpumask decide which vectors are used in
blk_mq_pci_map_queues().
Fixes: 5601236b6f794 ("scsi: qla2xxx: Add Block Multi Queue functionality.")
Cc: <[email protected]> #4.19
Signed-off-by: Ming Lei <[email protected]>
Reviewed-by: Himanshu Madhani <[email protected]>
Tested-by: Himanshu Madhani <[email protected]>
Signed-off-by: Himanshu Madhani <[email protected]>
---
Hi Martin,
With SCSI-MQ enabled by default starting with 4.19, we found a regression
where systems with fewer online CPUs than possible CPUs resulted in the
driver assigning MSI-X vectors to offline CPUs, causing a system hang.
Please apply this patch to the 5.0/fixes branch at your earliest convenience
for inclusion in 5.0-rc3.
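For illustration only, here is a minimal userspace sketch of why the offset
matters. It is not driver code: the vector layout, the masks, and the names
vec_affinity, map_queues, NR_CPUS, PRE_VECTORS and NR_VECTORS are all made up
for the example, and the pre-vectors are assumed to carry a mask covering
every CPU. It only models, in simplified form, the idea that hardware queue q
has to look up the affinity of vector q + offset.

```c
#include <assert.h>

/* Hypothetical layout, for illustration only (not taken from the driver). */
#define NR_CPUS      4
#define PRE_VECTORS  2	/* vectors reserved ahead of the queue vectors */
#define NR_VECTORS   (PRE_VECTORS + NR_CPUS)

/* Affinity mask per vector; bit n set means CPU n may service it. */
static const unsigned int vec_affinity[NR_VECTORS] = {
	0xf, 0xf,		/* pre-vectors: assumed not spread across CPUs */
	0x1, 0x2, 0x4, 0x8,	/* queue vectors: one CPU each */
};

/*
 * Simplified model of the mapping step: hardware queue q is assumed to
 * be driven by vector q + offset, so that vector's mask decides which
 * CPUs submit to q.
 */
static void map_queues(unsigned int *mq_map, unsigned int nr_queues,
		       unsigned int offset)
{
	unsigned int q, cpu;

	/* Never index past the last vector. */
	assert(offset + nr_queues <= NR_VECTORS);

	for (q = 0; q < nr_queues; q++) {
		unsigned int mask = vec_affinity[q + offset];

		for (cpu = 0; cpu < NR_CPUS; cpu++)
			if (mask & (1u << cpu))
				mq_map[cpu] = q;
	}
}
```

In this model, calling map_queues() with offset 0 and four queues treats the
two pre-vector masks as if they belonged to queues 0 and 1, and the resulting
CPU-to-queue map is {2, 3, 1, 1}. With offset PRE_VECTORS the map is
{0, 1, 2, 3}, so each CPU lands on the queue whose vector is affine to it.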
Thanks,
Himanshu
---
drivers/scsi/qla2xxx/qla_def.h | 2 ++
drivers/scsi/qla2xxx/qla_isr.c | 1 +
drivers/scsi/qla2xxx/qla_os.c | 2 +-
3 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h
index 26b93c563f92..d1fc4958222a 100644
--- a/drivers/scsi/qla2xxx/qla_def.h
+++ b/drivers/scsi/qla2xxx/qla_def.h
@@ -4394,6 +4394,8 @@ typedef struct scsi_qla_host {
uint16_t n2n_id;
struct list_head gpnid_list;
struct fab_scan scan;
+
+ unsigned int irq_offset;
} scsi_qla_host_t;
struct qla27xx_image_status {
diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c
index 30d3090842f8..8507c43b918c 100644
--- a/drivers/scsi/qla2xxx/qla_isr.c
+++ b/drivers/scsi/qla2xxx/qla_isr.c
@@ -3446,6 +3446,7 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp)
"Adjusted Max no of queues pairs: %d.\n",
ha->max_qpairs);
}
}
+ vha->irq_offset = desc.pre_vectors;
ha->msix_entries = kcalloc(ha->msix_count,
sizeof(struct qla_msix_entry),
GFP_KERNEL);
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index ea69dafc9774..c6ef83d0d99b 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -6939,7 +6939,7 @@ static int qla2xxx_map_queues(struct Scsi_Host *shost)
if (USER_CTRL_IRQ(vha->hw))
rc = blk_mq_map_queues(qmap);
else
- rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, 0);
+ rc = blk_mq_pci_map_queues(qmap, vha->hw->pdev, vha->irq_offset);
return rc;
}
--
2.12.0