credits for Lustre? Does it work? Right now it is a strange number with no relation to the real network structure, and it produces over-buffering issues on the server side.

On Sep 2, 2014, at 12:22 PM, Zhen, Liang <[email protected]> wrote:
> Yes, I think this is the potential issue of this patch, for each 1M data
> lustre has 256 fragments (256 pages) on 4K pagesize system, which means we
> can have max to (credits X 256) outstanding work requests for each
> connection, decreasing max_send_wr may hit ib_post_send() failure under heavy
> workload.
>
> I understand this may be a problem for low level stack to allocate big chunk
> of space, and cause memory allocating failures. The solution is enabling
> map_on_demand and use FMR, however, enabling this on some nodes will prevent
> them to join cluster if other nodes have no map_on_demand, we already have a
> patch for this which is pending on review, please check this (LU-3322)
>
> Thanks
> Liang
>
> From: David McMillen <[email protected]>
> Date: Sunday, August 31, 2014 at 6:48 PM
> To: "[email protected]" <[email protected]>,
> Eli Cohen <[email protected]>
> Subject: Re: [Lustre-discuss] [PATCH] Avoid Lustre failure on temporary
> failure
>
> Has this been tested with a significant I/O load? We had tried a similar
> approach but ran into subsequent errors and connection drops when the
> ib_post_send() failed. The code assumes that the original
> init_qp_attr->cap.max_send_wr value succeeded. Is there a second part to
> this patch?
>
> Dave
>
> On Sun, Aug 31, 2014 at 2:53 AM, Eli Cohen <[email protected]> wrote:
>
>> Lustre code tries to create a QP with max_send_wr which depends on a module
>> parameter. The device capabilities do provide the maximum number of send
>> work requests that the device supports but the actual number of work
>> requests that can be supported in a specific case depends on other
>> characteristics of the work queue, the transport type, etc. This is in
>> compliance with the IB spec:
>>
>> 11.2.1.2 QUERY HCA
>> Description:
>> Returns the attributes for the specified HCA.
>> The maximum values defined in this section are guaranteed
>> not-to-exceed values. It is possible for an implementation to allocate
>> some HCA resources from the same space. In that case, the maximum
>> values returned are not guaranteed for all of those resources
>> simultaneously.
>>
>> This patch tries to decrease the number of requested work requests to a level
>> that can be supported by the HCA. This prevents unnecessary failures.
>>
>> Signed-off-by: Eli Cohen <eli at mellanox.com>
>> ---
>>  lnet/klnds/o2iblnd/o2iblnd.c | 25 ++++++++++++++++++-------
>>  1 file changed, 18 insertions(+), 7 deletions(-)
>>
>> diff --git a/lnet/klnds/o2iblnd/o2iblnd.c b/lnet/klnds/o2iblnd/o2iblnd.c
>> index 4061db00cba2..ef1c6e07cb45 100644
>> --- a/lnet/klnds/o2iblnd/o2iblnd.c
>> +++ b/lnet/klnds/o2iblnd/o2iblnd.c
>> @@ -736,6 +736,7 @@ kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,
>>          int cpt;
>>          int rc;
>>          int i;
>> +        int orig_wr;
>>
>>          LASSERT(net != NULL);
>>          LASSERT(!in_interrupt());
>> @@ -862,13 +863,23 @@ kiblnd_create_conn(kib_peer_t *peer, struct rdma_cm_id *cmid,
>>
>>          conn->ibc_sched = sched;
>>
>> -        rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);
>> -        if (rc != 0) {
>> -                CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d\n",
>> -                       rc, init_qp_attr->cap.max_send_wr,
>> -                       init_qp_attr->cap.max_recv_wr);
>> -                goto failed_2;
>> -        }
>> +        orig_wr = init_qp_attr->cap.max_send_wr;
>> +        do {
>> +                rc = rdma_create_qp(cmid, conn->ibc_hdev->ibh_pd, init_qp_attr);
>> +                if (!rc || init_qp_attr->cap.max_send_wr < 16)
>> +                        break;
>> +
>> +                init_qp_attr->cap.max_send_wr /= 2;
>> +        } while (rc);
>> +        if (rc != 0) {
>> +                CERROR("Can't create QP: %d, send_wr: %d, recv_wr: %d\n",
>> +                       rc, init_qp_attr->cap.max_send_wr,
>> +                       init_qp_attr->cap.max_recv_wr);
>> +                goto failed_2;
>> +        }
>> +        if (orig_wr != init_qp_attr->cap.max_send_wr)
>> +                pr_info("original send wr %d, created with %d\n",
>> +                        orig_wr, init_qp_attr->cap.max_send_wr);
>>
>>          LIBCFS_FREE(init_qp_attr, sizeof(*init_qp_attr));
>>
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
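
To make the numbers in this thread concrete, here is a minimal user-space C sketch. It is not the o2iblnd code: mock_create_qp() is a hypothetical stand-in for rdma_create_qp(), the device limit is an invented parameter, and worst_case_send_wr() only encodes the "credits X 256" estimate quoted above (1M of data split into 4K pages). create_qp_with_fallback() mirrors the halving loop from the patch, so the gap between the send queue depth the rest of the code expects and the depth actually created is easy to see.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for rdma_create_qp(): succeeds (returns 0) only
 * when the requested send queue depth fits an assumed device limit. */
static int mock_create_qp(unsigned int max_send_wr, unsigned int device_limit)
{
	return max_send_wr <= device_limit ? 0 : -ENOMEM;
}

/* Estimate from the thread: each credit may carry one bulk transfer, and
 * each transfer is split into (bulk_bytes / page_size) fragments, so the
 * worst case is credits x fragments outstanding send work requests. */
static unsigned int worst_case_send_wr(unsigned int credits,
				       unsigned int bulk_bytes,
				       unsigned int page_size)
{
	assert(page_size > 0);
	return credits * (bulk_bytes / page_size);
}

/* Same shape as the loop in the patch: retry with half the send queue
 * depth until creation succeeds or the depth drops below 16.
 * On return, *max_send_wr holds the depth of the last attempt. */
static int create_qp_with_fallback(unsigned int *max_send_wr,
				   unsigned int device_limit)
{
	int rc;

	do {
		rc = mock_create_qp(*max_send_wr, device_limit);
		if (rc == 0 || *max_send_wr < 16)
			break;
		*max_send_wr /= 2;
	} while (rc != 0);

	return rc;
}
```

With an assumed 8 credits, 1M transfers and 4K pages, the estimate is 2048 send work requests. If the mock device accepts at most 600, the loop settles on 512, a quarter of what the sender may try to post under load. That is the situation in which ib_post_send() could start failing unless the number of in-flight sends is also reduced.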
