Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/12/13 00:16, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. The merge window has been closed early which means that I'll have to resend this patch series anyway. How about using the name queue_size instead ? Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On Mon, 2013-09-16 at 16:25 +0200, Bart Van Assche wrote: On 09/12/13 00:16, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. The merge window has been closed early which means that I'll have to resend this patch series anyway. How about using the name queue_size instead ? I'm good with that, but I should be able to make some time to pull it all into a git tree for Roland; this will keep you from having to resubmit. Dave -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/12/2013 06:30 PM, Bart Van Assche wrote: On 09/12/13 18:16, Jack Wang wrote: On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Increasing the default value is only necessary when using a hard disk array at the target side. When using a single hard disk or an SSD at the target side the default value is fine. Bart. Agree, from another side increasing default can_queue value is harmless for single harddisk/SSD, but will boot performance for multiple LUN attached to same host, so why not? Jack -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/13 10:06, Jack Wang wrote: On 09/12/2013 06:30 PM, Bart Van Assche wrote: On 09/12/13 18:16, Jack Wang wrote: On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Increasing the default value is only necessary when using a hard disk array at the target side. When using a single hard disk or an SSD at the target side the default value is fine. Agree, from another side increasing default can_queue value is harmless for single harddisk/SSD, but will boot performance for multiple LUN attached to same host, so why not? Increasing that parameter increases memory consumption for each path between initiator and target. In a setup where both the initiator and the target have multiple IB ports the number of paths can be large even when only a single LUN is exported by the target. That's why I'm not enthusiast about increasing the default value of the queue size. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/13 10:40, Bart Van Assche wrote: On 09/13/13 10:06, Jack Wang wrote: On 09/12/2013 06:30 PM, Bart Van Assche wrote: On 09/12/13 18:16, Jack Wang wrote: On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Increasing the default value is only necessary when using a hard disk array at the target side. When using a single hard disk or an SSD at the target side the default value is fine. Agree, from another side increasing default can_queue value is harmless for single harddisk/SSD, but will boot performance for multiple LUN attached to same host, so why not? Increasing that parameter increases memory consumption for each path between initiator and target. In a setup where both the initiator and the target have multiple IB ports the number of paths can be large even when only a single LUN is exported by the target. That's why I'm not enthusiast about increasing the default value of the queue size. (replying to my own e-mail) Note: with the srp_daemon implementation available on github it's possible to set the can_queue parameter for all logins initiated by srp_daemon by adding something like the following at the end of /etc/srp_daemon.conf: a can_queue=512 See also http://github.com/bvanassche/srptools. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/2013 11:24 AM, Bart Van Assche wrote: On 09/13/13 10:40, Bart Van Assche wrote: On 09/13/13 10:06, Jack Wang wrote: On 09/12/2013 06:30 PM, Bart Van Assche wrote: On 09/12/13 18:16, Jack Wang wrote: On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Increasing the default value is only necessary when using a hard disk array at the target side. When using a single hard disk or an SSD at the target side the default value is fine. Agree, from another side increasing default can_queue value is harmless for single harddisk/SSD, but will boot performance for multiple LUN attached to same host, so why not? Increasing that parameter increases memory consumption for each path between initiator and target. In a setup where both the initiator and the target have multiple IB ports the number of paths can be large even when only a single LUN is exported by the target. That's why I'm not enthusiast about increasing the default value of the queue size. (replying to my own e-mail) Note: with the srp_daemon implementation available on github it's possible to set the can_queue parameter for all logins initiated by srp_daemon by adding something like the following at the end of /etc/srp_daemon.conf: a can_queue=512 See also http://github.com/bvanassche/srptools. Bart. Thanks Bart, I tried your srp-ha branch in github, echo string SRP2=id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512 to add_target failed with ib_srp: unknown parameter or missing value 'can_queue=512 [ 122.405385] ' in target creation request Do I miss something? Cheers, Jack -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/13 14:25, Jack Wang wrote: I tried your srp-ha branch in github, echo string SRP2=id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512 to add_target failed with ib_srp: unknown parameter or missing value 'can_queue=512 [ 122.405385] ' in target creation request Do I miss something? Hello Jack, Have you already tried the same command with echo -n instead of echo ? If I remember correctly the SRP initiator doesn't like a newline at the end of the string written into the sysfs add_target attribute. Hope this helps, Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/2013 03:33 PM, Bart Van Assche wrote: On 09/13/13 14:25, Jack Wang wrote: I tried your srp-ha branch in github, echo string SRP2=id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512 to add_target failed with ib_srp: unknown parameter or missing value 'can_queue=512 [ 122.405385] ' in target creation request Do I miss something? Hello Jack, Have you already tried the same command with echo -n instead of echo ? If I remember correctly the SRP initiator doesn't like a newline at the end of the string written into the sysfs add_target attribute. Hope this helps, Bart. Thanks Bart, You're right, that is the problem, with -n option I can set can_queue successfully. But cat /sys/class/scsi_host/hostx/can_queue still show 63 Looks like it cause by the logic below: /* 1870 * Reserve credits for task management so we don't 1871 * bounce requests back to the SCSI mid-layer. 1872 */ 1873 target-scsi_host-can_queue 1874 = min(target-req_lim - SRP_TSK_MGMT_SQ_SIZE, 1875 target-scsi_host-can_queue); 1876 target-scsi_host-cmd_per_lun 1877 = min_t(int, target-scsi_host-can_queue, 1878 target-scsi_host-cmd_per_lun); Could you give some hint on this, why we need to update can_queue and cmd_per_lun here? Cheers, Jack -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/13 15:51, Jack Wang wrote: On 09/13/2013 03:33 PM, Bart Van Assche wrote: On 09/13/13 14:25, Jack Wang wrote: I tried your srp-ha branch in github, echo string SRP2=id_ext=${THCA2_GUID},ioc_guid=${THCA2_GUID},dgid=${TGID_P2},pkey=${PKEY},service_id=${THCA2_GUID},can_queue=512 to add_target failed with ib_srp: unknown parameter or missing value 'can_queue=512 [ 122.405385] ' in target creation request Do I miss something? Hello Jack, Have you already tried the same command with echo -n instead of echo ? If I remember correctly the SRP initiator doesn't like a newline at the end of the string written into the sysfs add_target attribute. Hope this helps, Bart. Thanks Bart, You're right, that is the problem, with -n option I can set can_queue successfully. But cat /sys/class/scsi_host/hostx/can_queue still show 63 Looks like it cause by the logic below: /* 1870 * Reserve credits for task management so we don't 1871 * bounce requests back to the SCSI mid-layer. 1872 */ 1873 target-scsi_host-can_queue 1874 = min(target-req_lim - SRP_TSK_MGMT_SQ_SIZE, 1875 target-scsi_host-can_queue); 1876 target-scsi_host-cmd_per_lun 1877 = min_t(int, target-scsi_host-can_queue, 1878 target-scsi_host-cmd_per_lun); Could you give some hint on this, why we need to update can_queue and cmd_per_lun here? Hello Jack, What is the value of target-req_lim on your setup ? The initiator limits the queue size to the target request limit because any attempt to queue more commands would cause the target to reply with BUSY anyway. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/13/13 16:15, Jack Wang wrote: Hello Bart, cat /sys/class/scsi_host/host36/req_lim 64 I just checked srp spec, which do define such behaviour, I wonder in SCST/SRPT, how the request limit is chosen, is it report from low level hardware driver? Hello Jack, The following code probably provides the information you are looking for: /* * Avoid QUEUE_FULL conditions by limiting the number of buffers used * for the SRP protocol to the SCST SCSI command queue size. */ ch-rq_size = min(SRPT_RQ_SIZE, scst_get_max_lun_commands(NULL, 0)); In other words, req_lim can be increased by increasing both SRPT_RQ_SIZE and SCST_MAX_DEV_COMMANDS. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. -- Hi, What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Best Jack -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/12/13 18:16, Jack Wang wrote: On 09/12/2013 12:16 AM, David Dillow wrote: On Tue, 2013-09-10 at 19:44 +0200, Bart Van Assche wrote: If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 Yes, I know it has been used before, but I'm torn between not furthering a bad naming choice and consistency. Foolish consistency and all that... I really don't like can_queue, but I'll not complain if Roland decides to take it as-is. What the allow range for this queue size? Default cmd_per_lun and can_queue with same value makes no sense to me. Could we bump can_queue to bigger value like 512? Increasing the default value is only necessary when using a hard disk array at the target side. When using a single hard disk or an SSD at the target side the default value is fine. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 09/10/13 05:01, David Dillow wrote: On Tue, 2013-08-20 at 14:50 +0200, Bart Van Assche wrote: @@ -2227,6 +2270,7 @@ static const match_table_t srp_opt_tokens = { { SRP_OPT_SG_TABLESIZE, sg_tablesize=%u }, { SRP_OPT_COMP_VECTOR, comp_vector=%u}, { SRP_OPT_TL_RETRY_COUNT, tl_retry_count=%u }, + { SRP_OPT_CAN_QUEUE,can_queue=%d }, I'm pretty much OK with the patch, but since we're stuck with it going forward, I'd like to have a better externally visible name here -- queue_depth? max_queue? queue_size? Otherwise, Acked-by: David Dillow dillo...@ornl.gov Hello Dave, If this name was not yet in use in any interface that is visible in user space, I would agree that we should come up with a better name. However, the SCSI mid-layer already uses that name today to export the queue size. To me this looks like a good reason to use the name can_queue ? An example: $ cat /sys/class/scsi_host/host93/can_queue 62 See also the shost_rd_attr(can_queue, %hd\n) statement in drivers/scsi/scsi_sysfs.c. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On Tue, 2013-08-20 at 14:50 +0200, Bart Van Assche wrote: @@ -2227,6 +2270,7 @@ static const match_table_t srp_opt_tokens = { { SRP_OPT_SG_TABLESIZE, sg_tablesize=%u }, { SRP_OPT_COMP_VECTOR, comp_vector=%u}, { SRP_OPT_TL_RETRY_COUNT, tl_retry_count=%u }, + { SRP_OPT_CAN_QUEUE,can_queue=%d }, I'm pretty much OK with the patch, but since we're stuck with it going forward, I'd like to have a better externally visible name here -- queue_depth? max_queue? queue_size? Otherwise, Acked-by: David Dillow dillo...@ornl.gov -- Dave Dillow National Center for Computational Science Oak Ridge National Laboratory (865) 241-6602 office -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 8/20/2013 8:43 PM, David Dillow wrote: On Tue, 2013-08-20 at 17:55 +0200, Bart Van Assche wrote: On 08/20/13 17:34, Sagi Grimberg wrote: Question, If srp now will allow larger queues while using a single global FMR pool of size 1024, isn't it more likely now that in stress environment srp will run out of FMRs to handle IO commands? I mean that let's say that you have x scsi hosts with can_queue size of 512 (+-) and all of them are running IO stress, is it possible that all FMRs will be inuse and no FMR is available to register the next IO SG-list? Did you try out such a scenario? I guess that in such a case IB core will return EAGAIN and SRP will return SCSI_MLQUEUE_HOST_BUSY. I think it is a good Idea to move FMR pools to be per connection rather than a global pool, what do you think? That makes sense to me. And as long as the above has not yet been implemented I'm fine with dropping patch 8/8 from this patch set. Don't drop it; most configs won't have all that many connections and shouldn't have an issue; even those that do will only see a potential slowdown when running with everything at once. We can address the FMR/BMME issues on top of this patch. Agree. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 8/8] IB/srp: Make queue size configurable
Certain storage configurations, e.g. a sufficiently large array of hard disks in a RAID configuration, need a queue depth above 64 to achieve optimal performance. Hence make the queue depth configurable. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: Konrad Grzybowski konr...@k2.pl --- drivers/infiniband/ulp/srp/ib_srp.c | 125 ++- drivers/infiniband/ulp/srp/ib_srp.h | 17 +++-- 2 files changed, 103 insertions(+), 39 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index ece1f2d..6de2323 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -299,16 +299,16 @@ static int srp_create_target_ib(struct srp_target_port *target) return -ENOMEM; recv_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_recv_completion, NULL, target, SRP_RQ_SIZE, - target-comp_vector); + srp_recv_completion, NULL, target, + target-queue_size, target-comp_vector); if (IS_ERR(recv_cq)) { ret = PTR_ERR(recv_cq); goto err; } send_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_send_completion, NULL, target, SRP_SQ_SIZE, - target-comp_vector); + srp_send_completion, NULL, target, + target-queue_size, target-comp_vector); if (IS_ERR(send_cq)) { ret = PTR_ERR(send_cq); goto err_recv_cq; @@ -317,8 +317,8 @@ static int srp_create_target_ib(struct srp_target_port *target) ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP); init_attr-event_handler = srp_qp_event; - init_attr-cap.max_send_wr = SRP_SQ_SIZE; - init_attr-cap.max_recv_wr = SRP_RQ_SIZE; + init_attr-cap.max_send_wr = target-queue_size; + init_attr-cap.max_recv_wr = target-queue_size; init_attr-cap.max_recv_sge= 1; init_attr-cap.max_send_sge= 1; init_attr-sq_sig_type = IB_SIGNAL_ALL_WR; @@ -364,6 +364,10 @@ err: return ret; } +/* + * Note: this function may be called without srp_alloc_iu_bufs() having been + * invoked. Hence the target-[rt]x_ring checks. + */ static void srp_free_target_ib(struct srp_target_port *target) { int i; @@ -375,10 +379,18 @@ static void srp_free_target_ib(struct srp_target_port *target) target-qp = NULL; target-send_cq = target-recv_cq = NULL; - for (i = 0; i SRP_RQ_SIZE; ++i) - srp_free_iu(target-srp_host, target-rx_ring[i]); - for (i = 0; i SRP_SQ_SIZE; ++i) - srp_free_iu(target-srp_host, target-tx_ring[i]); + if (target-rx_ring) { + for (i = 0; i target-queue_size; ++i) + srp_free_iu(target-srp_host, target-rx_ring[i]); + kfree(target-rx_ring); + target-rx_ring = NULL; + } + if (target-tx_ring) { + for (i = 0; i target-queue_size; ++i) + srp_free_iu(target-srp_host, target-tx_ring[i]); + kfree(target-tx_ring); + target-tx_ring = NULL; + } } static void srp_path_rec_completion(int status, @@ -564,7 +576,11 @@ static void srp_free_req_data(struct srp_target_port *target) struct srp_request *req; int i; - for (i = 0, req = target-req_ring; i SRP_CMD_SQ_SIZE; ++i, ++req) { + if (!target-req_ring) + return; + + for (i = 0; i target-req_ring_size; ++i) { + req = target-req_ring[i]; kfree(req-fmr_list); kfree(req-map_page); if (req-indirect_dma_addr) { @@ -574,6 +590,9 @@ static void srp_free_req_data(struct srp_target_port *target) } kfree(req-indirect_desc); } + + kfree(target-req_ring); + target-req_ring = NULL; } static int srp_alloc_req_data(struct srp_target_port *target) @@ -586,7 +605,12 @@ static int srp_alloc_req_data(struct srp_target_port *target) INIT_LIST_HEAD(target-free_reqs); - for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { + target-req_ring = kzalloc(target-req_ring_size * + sizeof(*target-req_ring), GFP_KERNEL); + if (!target-req_ring) + goto out; + + for (i = 0; i target-req_ring_size; ++i) { req = target-req_ring[i]; req-fmr_list = kmalloc(target-cmd_sg_cnt * sizeof(void *), GFP_KERNEL); @@ -810,7 +834,7 @@ static void srp_terminate_io(struct
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 8/20/2013 3:50 PM, Bart Van Assche wrote: Certain storage configurations, e.g. a sufficiently large array of hard disks in a RAID configuration, need a queue depth above 64 to achieve optimal performance. Hence make the queue depth configurable. Signed-off-by: Bart Van Assche bvanass...@acm.org Cc: Roland Dreier rol...@purestorage.com Cc: David Dillow dillo...@ornl.gov Cc: Vu Pham v...@mellanox.com Cc: Sebastian Riemer sebastian.rie...@profitbricks.com Cc: Konrad Grzybowski konr...@k2.pl --- drivers/infiniband/ulp/srp/ib_srp.c | 125 ++- drivers/infiniband/ulp/srp/ib_srp.h | 17 +++-- 2 files changed, 103 insertions(+), 39 deletions(-) diff --git a/drivers/infiniband/ulp/srp/ib_srp.c b/drivers/infiniband/ulp/srp/ib_srp.c index ece1f2d..6de2323 100644 --- a/drivers/infiniband/ulp/srp/ib_srp.c +++ b/drivers/infiniband/ulp/srp/ib_srp.c @@ -299,16 +299,16 @@ static int srp_create_target_ib(struct srp_target_port *target) return -ENOMEM; recv_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_recv_completion, NULL, target, SRP_RQ_SIZE, - target-comp_vector); + srp_recv_completion, NULL, target, + target-queue_size, target-comp_vector); if (IS_ERR(recv_cq)) { ret = PTR_ERR(recv_cq); goto err; } send_cq = ib_create_cq(target-srp_host-srp_dev-dev, - srp_send_completion, NULL, target, SRP_SQ_SIZE, - target-comp_vector); + srp_send_completion, NULL, target, + target-queue_size, target-comp_vector); if (IS_ERR(send_cq)) { ret = PTR_ERR(send_cq); goto err_recv_cq; @@ -317,8 +317,8 @@ static int srp_create_target_ib(struct srp_target_port *target) ib_req_notify_cq(recv_cq, IB_CQ_NEXT_COMP); init_attr-event_handler = srp_qp_event; - init_attr-cap.max_send_wr = SRP_SQ_SIZE; - init_attr-cap.max_recv_wr = SRP_RQ_SIZE; + init_attr-cap.max_send_wr = target-queue_size; + init_attr-cap.max_recv_wr = target-queue_size; init_attr-cap.max_recv_sge= 1; init_attr-cap.max_send_sge= 1; init_attr-sq_sig_type = IB_SIGNAL_ALL_WR; @@ -364,6 +364,10 @@ err: return ret; } +/* + * Note: this function may be called without srp_alloc_iu_bufs() having been + * invoked. Hence the target-[rt]x_ring checks. + */ static void srp_free_target_ib(struct srp_target_port *target) { int i; @@ -375,10 +379,18 @@ static void srp_free_target_ib(struct srp_target_port *target) target-qp = NULL; target-send_cq = target-recv_cq = NULL; - for (i = 0; i SRP_RQ_SIZE; ++i) - srp_free_iu(target-srp_host, target-rx_ring[i]); - for (i = 0; i SRP_SQ_SIZE; ++i) - srp_free_iu(target-srp_host, target-tx_ring[i]); + if (target-rx_ring) { + for (i = 0; i target-queue_size; ++i) + srp_free_iu(target-srp_host, target-rx_ring[i]); + kfree(target-rx_ring); + target-rx_ring = NULL; + } + if (target-tx_ring) { + for (i = 0; i target-queue_size; ++i) + srp_free_iu(target-srp_host, target-tx_ring[i]); + kfree(target-tx_ring); + target-tx_ring = NULL; + } } static void srp_path_rec_completion(int status, @@ -564,7 +576,11 @@ static void srp_free_req_data(struct srp_target_port *target) struct srp_request *req; int i; - for (i = 0, req = target-req_ring; i SRP_CMD_SQ_SIZE; ++i, ++req) { + if (!target-req_ring) + return; + + for (i = 0; i target-req_ring_size; ++i) { + req = target-req_ring[i]; kfree(req-fmr_list); kfree(req-map_page); if (req-indirect_dma_addr) { @@ -574,6 +590,9 @@ static void srp_free_req_data(struct srp_target_port *target) } kfree(req-indirect_desc); } + + kfree(target-req_ring); + target-req_ring = NULL; } static int srp_alloc_req_data(struct srp_target_port *target) @@ -586,7 +605,12 @@ static int srp_alloc_req_data(struct srp_target_port *target) INIT_LIST_HEAD(target-free_reqs); - for (i = 0; i SRP_CMD_SQ_SIZE; ++i) { + target-req_ring = kzalloc(target-req_ring_size * + sizeof(*target-req_ring), GFP_KERNEL); + if (!target-req_ring) + goto out; + + for (i = 0; i target-req_ring_size; ++i) { req = target-req_ring[i]; req-fmr_list = kmalloc(target-cmd_sg_cnt * sizeof(void *), GFP_KERNEL); @@ -810,7 +834,7 @@
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On 08/20/13 17:34, Sagi Grimberg wrote: On 8/20/2013 3:50 PM, Bart Van Assche wrote: Certain storage configurations, e.g. a sufficiently large array of hard disks in a RAID configuration, need a queue depth above 64 to achieve optimal performance. Hence make the queue depth configurable. [ ... ] I noticed this patch in your github and played with it, I agree that this patch is needed for a long time... Question, If srp now will allow larger queues while using a single global FMR pool of size 1024, isn't it more likely now that in stress environment srp will run out of FMRs to handle IO commands? I mean that let's say that you have x scsi hosts with can_queue size of 512 (+-) and all of them are running IO stress, is it possible that all FMRs will be inuse and no FMR is available to register the next IO SG-list? Did you try out such a scenario? I guess that in such a case IB core will return EAGAIN and SRP will return SCSI_MLQUEUE_HOST_BUSY. I think it is a good Idea to move FMR pools to be per connection rather than a global pool, what do you think? That makes sense to me. And as long as the above has not yet been implemented I'm fine with dropping patch 8/8 from this patch set. Bart. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 8/8] IB/srp: Make queue size configurable
On Tue, 2013-08-20 at 17:55 +0200, Bart Van Assche wrote: On 08/20/13 17:34, Sagi Grimberg wrote: Question, If srp now will allow larger queues while using a single global FMR pool of size 1024, isn't it more likely now that in stress environment srp will run out of FMRs to handle IO commands? I mean that let's say that you have x scsi hosts with can_queue size of 512 (+-) and all of them are running IO stress, is it possible that all FMRs will be inuse and no FMR is available to register the next IO SG-list? Did you try out such a scenario? I guess that in such a case IB core will return EAGAIN and SRP will return SCSI_MLQUEUE_HOST_BUSY. I think it is a good Idea to move FMR pools to be per connection rather than a global pool, what do you think? That makes sense to me. And as long as the above has not yet been implemented I'm fine with dropping patch 8/8 from this patch set. Don't drop it; most configs won't have all that many connections and shouldn't have an issue; even those that do will only see a potential slowdown when running with everything at once. We can address the FMR/BMME issues on top of this patch. -- To unsubscribe from this list: send the line unsubscribe linux-rdma in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html