Re: [Lustre-discuss] [PATCH] Avoid Lustre failure on temporary failure

Alexey Lyashkov Tue, 02 Sep 2014 03:10:35 -0700

Credits for Lustre? Does that work? Right now it is a strange number with no 
relation to the real network structure, and it produces over-buffering issues 
on the server side.
 
On Sep 2, 2014, at 12:22 PM, Zhen, Liang <[email protected]> wrote:


> Yes, I think this is the potential issue with this patch. For each 1M of 
> data, Lustre has 256 fragments (256 pages) on a system with a 4K page size, 
> which means we can have up to (credits x 256) outstanding work requests for 
> each connection, so decreasing max_send_wr may cause ib_post_send() failures 
> under heavy workload.
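
To make the arithmetic above concrete, here is a minimal sketch in C. The helper names are made up for illustration and are not o2iblnd symbols, and any credits value you plug in is an assumption, since the real number comes from the module parameters.

```c
#include <stddef.h>

/*
 * Fragments needed for one bulk transfer when every page is sent as its
 * own fragment. Hypothetical helper, not part of o2iblnd.
 */
static unsigned int frags_per_transfer(size_t xfer_bytes, size_t page_bytes)
{
    /* Round up so a partial trailing page still counts as a fragment. */
    return (unsigned int)((xfer_bytes + page_bytes - 1) / page_bytes);
}

/*
 * Upper bound on outstanding send work requests for one connection,
 * following the "credits x fragments" estimate quoted above.
 */
static unsigned int max_outstanding_wrs(unsigned int credits,
                                        unsigned int frags)
{
    return credits * frags;
}
```

For example, frags_per_transfer(1 << 20, 4096) gives 256, and with an assumed 8 credits max_outstanding_wrs(8, 256) gives 2048. A QP created with a smaller max_send_wr than that bound can run out of send queue slots under load.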
> 
> I understand that it may be a problem for the low-level stack to allocate a 
> big chunk of space, which can cause memory allocation failures. The solution 
> is to enable map_on_demand and use FMR. However, enabling this on some nodes 
> will prevent them from joining the cluster if other nodes do not have 
> map_on_demand enabled. We already have a patch for this which is pending 
> review; please check it (LU-3322).
> 
> Thanks
> Liang
> 
> From: David McMillen <[email protected]<mailto:[email protected]>>
> Date: Sunday, August 31, 2014 at 6:48 PM
> To: "[email protected]<mailto:[email protected]>" 
> <[email protected]<mailto:[email protected]>>, 
> Eli Cohen <[email protected]<mailto:[email protected]>>
> Subject: Re: [Lustre-discuss] [PATCH] Avoid Lustre failure on temporary 
> failure
> 
> Has this been tested with a significant I/O load?  We had tried a similar 
> approach but ran into subsequent errors and connection drops when the 
> ib_post_send() failed.  The code assumes that the original 
> init_qp_attr->cap.max_send_wr value succeeded.  Is there a second part to 
> this patch?
> 
> Dave
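
One way to picture the missing "second part" Dave asks about is a send-side check against the depth the QP was actually created with. The sketch below is only an illustration under that assumption: struct send_queue_gate and its helpers are hypothetical and do not exist in o2iblnd.

```c
#include <errno.h>

/* Hypothetical per-connection accounting, not the o2iblnd structures. */
struct send_queue_gate {
    unsigned int granted_wr; /* depth the QP was actually created with */
    unsigned int in_flight;  /* work requests posted but not yet completed */
};

/*
 * Reserve room for nwr work requests before posting them.
 * Returns 0 on success, or -EAGAIN if posting would overflow the queue,
 * in which case the caller would have to wait for completions.
 */
static int gate_reserve(struct send_queue_gate *g, unsigned int nwr)
{
    if (nwr > g->granted_wr - g->in_flight)
        return -EAGAIN;
    g->in_flight += nwr;
    return 0;
}

/* Release slots once the matching send completions have been reaped. */
static void gate_complete(struct send_queue_gate *g, unsigned int nwr)
{
    g->in_flight -= (nwr < g->in_flight) ? nwr : g->in_flight;
}
```

With granted_wr set to 512, a first reservation of 257 work requests succeeds and a second one is refused until gate_complete() frees the slots. Without some check of this kind, the sender still behaves as if the original max_send_wr had been granted.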
> 
> On Sun, Aug 31, 2014 at 2:53 AM, Eli Cohen 
> <[email protected]<mailto:[email protected]>> wrote:
> 
>> Lustre code tries to create a QP with max_send_wr which depends on a module
>> parameter.  The device capabilities do provide the maximum number of send work
>> requests that the device supports but the actual number of work requests that
>> can be supported in a specific case depends on other characteristics of the
>> work queue, the transport type, etc. This is in compliance with the IB spec:
>> 
>> 11.2.1.2 QUERY HCA
>> Description:
>> Returns the attributes for the specified HCA.
>> The maximum values defined in this section are guaranteed
>> not-to-exceed values. It is possible for an implementation to allocate
>> some HCA resources from the same space. In that case, the maximum
>> values returned are not guaranteed for all of those resources
>> simultaneously.
>> 
>> This patch tries to decrease the number of requested work requests to a level
>> that can be supported by the HCA. This prevents unnecessary failures.
>> 
>> Signed-off-by: Eli Cohen <eli at mellanox.com>
>> ---
>> lnet/klnds/o2iblnd/o2iblnd.c | 25 ++++++++++++++++++-------
>> 1 file changed, 18 insertions(+), 7 deletions(-)
>> 
>> diff --git a/lnet/klnds/o2iblnd/o2iblnd.c b/lnet/klnds/o2iblnd/o2iblnd.c
>> index 4061db00cba2..ef1c6e07cb45 100644
>> --- a/lnet/klnds/o2iblnd/o2iblnd.c
>> +++ b/lnet/klnds/o2iblnd/o2iblnd.c
>> @@ -736,6 +736,7 @@ kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,
>>      int                     cpt;
>>      int                     rc;
>>      int                     i;
>> +     int                     orig_wr;
>> 
>>      LASSERT(net != NULL);
>>      LASSERT(!in_interrupt());
>> @@ -862,13 +863,23 @@ kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,
>> 
>>      conn->ibc_sched = sched;
>> 
>> -        rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);
>> -        if (rc != 0) {
>> -                CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d\n",
>> -                       rc, init_qp_attr->cap.max_send_wr,
>> -                       init_qp_attr->cap.max_recv_wr);
>> -                goto failed_2;
>> -        }
>> +     orig_wr = init_qp_attr->cap.max_send_wr;
>> +     do {
>> +             rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);
>> +             if (!rc || init_qp_attr->cap.max_send_wr < 16)
>> +                     break;
>> +
>> +             init_qp_attr->cap.max_send_wr /= 2;
>> +     } while (rc);
>> +     if (rc != 0) {
>> +             CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d\n",
>> +                    rc, init_qp_attr->cap.max_send_wr,
>> +                    init_qp_attr->cap.max_recv_wr);
>> +             goto failed_2;
>> +     }
>> +     if (orig_wr != init_qp_attr->cap.max_send_wr)
>> +             pr_info("original send wr %d, created with %d\n",
>> +                     orig_wr, init_qp_attr->cap.max_send_wr);
>> 
>>         LIBCFS_FREE(init_qp_attr, sizeof(*init_qp_attr));
>> 
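
The retry loop in the patch can be tried out in user space with a stand-in for rdma_create_qp(). This is a rough sketch, not the kernel code: mock_create_qp() and its device_limit argument are invented here to simulate a device that rejects requests above some depth, and the floor argument plays the role of the constant 16 in the patch.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for rdma_create_qp(): fails with -ENOMEM when the
 * requested send queue depth exceeds what the simulated device can provide.
 */
static int mock_create_qp(unsigned int max_send_wr, unsigned int device_limit)
{
    return max_send_wr > device_limit ? -ENOMEM : 0;
}

/*
 * Same control flow as the patch: try the requested depth, and on failure
 * halve it and retry, giving up once a failed attempt was below the floor.
 * On success, *granted holds the depth that was actually accepted.
 */
static int create_qp_with_backoff(unsigned int requested,
                                  unsigned int device_limit,
                                  unsigned int floor,
                                  unsigned int *granted)
{
    unsigned int wr = requested;
    int rc;

    for (;;) {
        rc = mock_create_qp(wr, device_limit);
        if (rc == 0) {
            *granted = wr;
            return 0;
        }
        if (wr < floor)
            return rc;
        wr /= 2;
    }
}
```

Requesting 2056 work requests against a simulated limit of 600 ends up with 514 after two halvings, which is the kind of silent reduction the pr_info() in the patch reports. The caller has to carry the granted value forward, otherwise later sends are still sized for the original request.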
> 
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss

