On Wed, 2007-09-26 at 14:06 -0500, Jim Mott wrote:
> This is a two part bug report. One is a conceptual problem that may just
> be a problem of understanding on my part. The other is what I believe
> to be a bug in the mlx4 driver.
mthca has the same issue.

>
> 1) ib_create_qp() fails with max_sge
>   If you use ib_query_device() to return the device specific
>   attribute max_sge, it seems reasonable to expect you can create
>   a QP with max_send_sge=max_sge. The problem is that this often
>   fails.
>
>   The reason is that depending on the QP type (RC, UD, etc.) and
>   how the QP will be used (send, RDMA, atomic, etc.), there can be
>   extra segments required in the WQE that eat up SGE entries. So
>   while some send WQE might have max_sge available SGEs, many will
>   not.
>
>   Normally the difference between max_sge and the actual maximum
>   value allowed (and checked) for max_send_sge is 1 or 2.
>
>   This issue may need API extensions to definitively resolve. In
>   the short term, it would be very nice if max_sge reported by
>   ib_query_device() could always return a value that ib_create_qp()
>   could use. Think of it as the minimum max_send_sge value that
>   will work for all QP types.
>
>
> 2) mlx4 setting of max send SQEs
>   The recent patch to support shrinking WQEs introduces a
>   behavior that creates a big difference between the mlx4
>   supported send SGEs (checked against 61, should be 59 or 60,
>   and reported in ib_query_device as 32 to equal receive side
>   max_rq_sg value).
>
>   The patch that follows will allow an MLX4 to support the
>   number of send SGEs returned by ib_query_device, and in fact
>   quite a few more. It probably breaks shrinking WQEs and thus
>   should not be applied directly.
>
>   Note that if ib_query_device() returned max_sge adjusted
>   for the raddr and atomic segments, this fix would not be
>   needed. MLX4 would still support more SGEs in hardware than
>   can be used through the API, but that is a different problem.
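To make the accounting in (1) concrete, here is a rough sketch of how the
usable SGE count shrinks as a work request needs more non-data segments,
and how a "works for every QP type" value would be the minimum over all of
them. Everything in it is an assumption for illustration: the segment
sizes, the wr_kind enum, and the helper names are hypothetical and are not
taken from mlx4, mthca, or the verbs API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative segment sizes in bytes. These are assumptions made for
 * this sketch, not values read from any driver or hardware manual. */
enum {
	CTRL_SEG_SZ   = 16,	/* control segment, present in every WQE */
	RADDR_SEG_SZ  = 16,	/* remote address segment (RDMA, atomic) */
	ATOMIC_SEG_SZ = 16,	/* atomic operand segment */
	DGRAM_SEG_SZ  = 48,	/* datagram (address vector) segment for UD */
	DATA_SEG_SZ   = 16,	/* one scatter/gather entry */
};

/* Hypothetical classification of how a send WQE will be used. */
enum wr_kind { WR_SEND_RC, WR_RDMA_RC, WR_ATOMIC_RC, WR_SEND_UD, WR_KIND_COUNT };

/* Bytes of the descriptor consumed before the first data segment. */
static size_t wr_overhead(enum wr_kind kind)
{
	switch (kind) {
	case WR_SEND_RC:   return CTRL_SEG_SZ;
	case WR_RDMA_RC:   return CTRL_SEG_SZ + RADDR_SEG_SZ;
	case WR_ATOMIC_RC: return CTRL_SEG_SZ + RADDR_SEG_SZ + ATOMIC_SEG_SZ;
	case WR_SEND_UD:   return CTRL_SEG_SZ + DGRAM_SEG_SZ;
	default:           return 0;
	}
}

/* Number of SGEs that fit in a descriptor of max_desc_sz bytes. */
static size_t sge_for(size_t max_desc_sz, enum wr_kind kind)
{
	size_t overhead;

	assert(kind < WR_KIND_COUNT);
	overhead = wr_overhead(kind);
	if (max_desc_sz < overhead)
		return 0;
	return (max_desc_sz - overhead) / DATA_SEG_SZ;
}

/* Smallest SGE count over all kinds: a value that any QP could accept. */
static size_t safe_max_sge(size_t max_desc_sz)
{
	size_t best = sge_for(max_desc_sz, WR_SEND_RC);
	int k;

	for (k = 0; k < WR_KIND_COUNT; k++) {
		size_t n = sge_for(max_desc_sz, (enum wr_kind)k);

		if (n < best)
			best = n;
	}
	return best;
}
```

With a made-up 512 byte descriptor limit, sge_for(512, WR_SEND_RC) gives
31 while safe_max_sge(512) gives 28, which is the kind of gap between the
advertised and the usable value that the report describes.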
>
> --- ofa_1_3_dev_kernel.orig/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:27:47.000000000 -0500
> +++ ofa_1_3_dev_kernel/drivers/infiniband/hw/mlx4/qp.c 2007-09-26 13:36:40.000000000 -0500
> @@ -370,7 +370,7 @@ static int set_kernel_sq_size(struct mlx
>  	qp->sq.wqe_shift = ilog2(roundup_pow_of_two(s));
>
>  	for (;;) {
> -		if (1 << qp->sq.wqe_shift > dev->dev->caps.max_sq_desc_sz)
> +		if (s > dev->dev->caps.max_sq_desc_sz)
>  			return -EINVAL;
>
>  		qp->sq_max_wqes_per_wr = DIV_ROUND_UP(s, 1 << qp->sq.wqe_shift);
>
> _______________________________________________
> general mailing list
> general@lists.openfabrics.org
> http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
>
> To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
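
For anyone reading the one-line change in the patch above, the difference
between the two checks can be sketched in isolation. The removed line
compares the WQE size rounded up to a power of two against the descriptor
limit, while the added line compares the unrounded size s. The helpers
below are hypothetical stand-ins written for this sketch (roundup_pow2 is a
local substitute for the kernel's roundup_pow_of_two), and they only model
the first pass through the loop, where wqe_shift still equals
ilog2(roundup_pow_of_two(s)).

```c
#include <assert.h>

/* Smallest power of two >= n. Local stand-in for the kernel helper. */
static unsigned int roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n > 0 && n <= (1u << 31));
	while (p < n)
		p <<= 1;
	return p;
}

/* Models the removed check: the rounded-up size must fit. */
static int fits_rounded(unsigned int s, unsigned int max_desc_sz)
{
	return roundup_pow2(s) <= max_desc_sz;
}

/* Models the added check: only the actual size must fit. */
static int fits_exact(unsigned int s, unsigned int max_desc_sz)
{
	return s <= max_desc_sz;
}
```

Taking a made-up limit of 1008 bytes and s = 600, fits_rounded() rejects
the request because 600 rounds up to 1024, while fits_exact() accepts it.
Any s that is not a power of two and lies above half the limit behaves the
same way, which is why the exact check admits more SGEs.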