IOPS on null_blk improves considerably when one hw queue is
created per NUMA node, so this patch applies the newly introduced
.host_tagset to hpsa to improve performance.

In practice, .can_queue is quite large and the number of NUMA
nodes is usually small, so each hw queue's depth should be high
enough to saturate the device.

Cc: Arun Easi <[email protected]>
Cc: Omar Sandoval <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: James Bottomley <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Don Brace <[email protected]>
Cc: Kashyap Desai <[email protected]>
Cc: Peter Rivera <[email protected]>
Cc: Laurence Oberman <[email protected]>
Cc: Hannes Reinecke <[email protected]>
Cc: Mike Snitzer <[email protected]>
Signed-off-by: Ming Lei <[email protected]>
---
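For illustration only, the hw queue count computed in
hpsa_scsi_add_host() can be reproduced in userspace. This is a
minimal sketch, not driver code: DIV_ROUND_UP is redefined locally
as a stand-in for the kernel macro, and min_int() and
pick_nr_hw_queues() are hypothetical helpers named here for the
example. The node count is passed in as a plain integer instead of
being read from nr_node_ids.

```c
#include <stdio.h>

/* Userspace stand-in for the kernel's DIV_ROUND_UP() macro. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical helper; the patch itself uses min_t(int, ...). */
static int min_int(int a, int b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper mirroring the arithmetic in the patch:
 * allow at most one hw queue per 256 tags of can_queue (rounded
 * up), and never more hw queues than NUMA nodes.
 */
static int pick_nr_hw_queues(int can_queue, int nr_nodes)
{
	int max_queues = DIV_ROUND_UP(can_queue, 256);

	return min_int(nr_nodes, max_queues);
}

/* Print the result for one (can_queue, nr_nodes) pair. */
static void show(int can_queue, int nr_nodes)
{
	printf("can_queue=%d nodes=%d -> nr_hw_queues=%d\n",
	       can_queue, nr_nodes,
	       pick_nr_hw_queues(can_queue, nr_nodes));
}
```

With these assumptions, can_queue = 1024 on a 2-node machine gives
2 hw queues, the same can_queue on an 8-node machine is capped at 4,
and a small can_queue of 200 always collapses to a single hw queue.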
 drivers/scsi/hpsa.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/scsi/hpsa.c b/drivers/scsi/hpsa.c
index 3a9eca163db8..0747751b7e1c 100644
--- a/drivers/scsi/hpsa.c
+++ b/drivers/scsi/hpsa.c
@@ -978,6 +978,7 @@ static struct scsi_host_template hpsa_driver_template = {
        .shost_attrs = hpsa_shost_attrs,
        .max_sectors = 1024,
        .no_write_same = 1,
+       .host_tagset = 1,
 };
 
 static inline u32 next_command(struct ctlr_info *h, u8 q)
@@ -5761,6 +5762,11 @@ static int hpsa_scsi_host_alloc(struct ctlr_info *h)
 static int hpsa_scsi_add_host(struct ctlr_info *h)
 {
        int rv;
+       /* 256 tags should be high enough to saturate device */
+       int max_queues = DIV_ROUND_UP(h->scsi_host->can_queue, 256);
+
+       /* per NUMA node hw queue */
+       h->scsi_host->nr_hw_queues = min_t(int, nr_node_ids, max_queues);
 
        rv = scsi_add_host(h->scsi_host, &h->pdev->dev);
        if (rv) {
-- 
2.9.5
