On 22/09/2014 8:37, Christoph Hellwig wrote:
> One thing that is missing is generating multiqueue-aware tags at the
> blk-mq level, which should be as simple as always adding a queue
> prefix in the tag allocation code.

Hello Christoph,

Adding a queue prefix in the tag allocation code is an interesting idea. I have considered encoding the hardware context index in the upper bits of the 'tag' field in 'struct request'. The reason I have not done so is that several block drivers assume that the rq->tag field is a number in the range [0..queue_depth-1]. Here is just one example, from the mtip32xx driver:

        fis->sect_count  = ((rq->tag << 3) | (rq->tag >> 5));

> Did you consider switching srp to use the block layer provided tags?

This is on my to-do list. The only reason I have not yet done it is that I have not had the time to work on it. Another item on my to-do list is to eliminate per-request memory allocation by using your patch that added a "cmd_size" field to the SCSI host template.

> Also do you have any performance numbers for just using multiple
> queues inside srp vs using blk-mq exposed queues?

So far I have only rerun the multithreaded write test. In that test I see about 15% more IOPS with this patch series (which exploits multiple hardware queues and a 1:1 mapping between hardware contexts and RDMA queue pairs) than with the previous implementation (one hardware queue and multiple RDMA queue pairs). Please keep in mind that in that test the CPUs of the target system are saturated, so the performance potential of using multiple hardware queues is probably larger than the difference I measured.

Bart.
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html