On Wed, Dec 09, 2015 at 11:34:10AM +0200, Moni Shoua wrote:
> > Eh? I think you've missed the point, there is no net device when
> > looking at a wc.
> >
> > Look, here is a concrete direction:
> >
> > Replace all the crap in
> > ib_init_ah_from_wc/get_sgid_index_from_eth/rdma_addr_find_dmac_by_grh
> >
> > with a straightforward
> >
> >    rdma_dgid_index_from_wc(
> >                         const struct ib_qp *qp,
> >                         const struct ib_wc *wc,
> >                         const struct ib_grh *grh,
> >                         u16 *gid_index)
> >
> > Sort of function that reads the GRH and wc and returns the unambiguous
> > gid index that was used to receive that packet on the UD QP.
> >
> 
> I already answered this too, but I'll do it again.
> RoCEv2 spec says that L3 header will be scattered to receive WQE in
> the following way
> IPv6 and RoCEv1 - 40 bytes of the L3 header (GRH or IPv6) to the first
> 40 bytes of the receive bufs
> IPv4 - 20 bytes of the L3 header to the second half of the first 40
> bytes of the receive bufs. The first 20 bytes remain undefined.
> 
> Now, if you think how you deduce network_type from GRH you'll see that
> it requires tools like checksum validation and other validations and
> you end up with a method that is not 100% error free. So, to eliminate
> the need for heavy computation (with regards to the other option) and
> be free from false deductions you have the option of getting
> network_type from the hardware. So, if you do have hardware that
> supports it why give it up?

I understand how it works.
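
For anyone following along, the scatter layout quoted above and the
software-only deduction it implies can be sketched roughly like this.
This is a user-space illustration only, not kernel code: the names
guess_l3_from_buf and enum l3_guess are made up for the example, and
the checks shown (version nibble, IHL of 5, IPv4 header checksum) are
just one plausible heuristic. Because the first 20 bytes are undefined
in the IPv4 case, and bytes 20..39 of an IPv6 header or GRH are address
bits, a guess like this can be wrong, which is the "not 100% error
free" point.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not a kernel enum. */
enum l3_guess { L3_IPV4, L3_IPV6_OR_GRH, L3_UNKNOWN };

/*
 * Ones-complement sum over a 20-byte IPv4 header (no options).
 * A header carrying a correct checksum folds to 0xffff.
 */
static int ipv4_csum_ok(const uint8_t *h)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < 20; i += 2)
		sum += ((uint32_t)h[i] << 8) | h[i + 1];
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum == 0xffff;
}

/*
 * Guess the L3 type from the first 40 bytes of a UD receive buffer:
 *   IPv6 / RoCEv1: 40-byte header at offset 0 (version nibble is 6)
 *   IPv4:          20-byte header at offset 20, bytes 0..19 undefined
 */
enum l3_guess guess_l3_from_buf(const uint8_t buf[40])
{
	const uint8_t *v4 = buf + 20;

	/* Version 4, IHL 5, and a valid header checksum look like IPv4. */
	if ((v4[0] >> 4) == 4 && (v4[0] & 0x0f) == 5 && ipv4_csum_ok(v4))
		return L3_IPV4;
	/* Both an IPv6 header and a GRH carry 6 in the version nibble. */
	if ((buf[0] >> 4) == 6)
		return L3_IPV6_OR_GRH;
	return L3_UNKNOWN;
}
```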

From an API perspective, I want to see something that can be used
*correctly* and that means more along the lines of
rdma_dgid_index_from_wc and not what is proposed in this series.

That means not exposing the callers to this awful mess.
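
As a rough illustration of the shape I mean, here is a user-space
sketch with mock types. Everything in it is hypothetical
(sketch_gid_entry, sketch_l3, sketch_dgid_index_from_buf); it is not
the kernel's GID cache and not a proposed implementation. It assumes
IPv4 destination addresses are stored in the table as IPv4-mapped
GIDs (::ffff:a.b.c.d), and it ignores GID type, VLAN and netdev, all
of which a real version would have to use to make the answer
unambiguous. The point is only that the caller hands over the receive
buffer and gets back an index, and never touches the parsing itself.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical types for this sketch; not kernel definitions. */
enum sketch_l3 { SKETCH_L3_GRH_OR_IPV6, SKETCH_L3_IPV4 };

struct sketch_gid_entry {
	uint8_t gid[16];
	int valid;
};

/*
 * Extract the destination GID from the first 40 bytes of a UD
 * receive buffer and look it up in a GID table.
 * Returns 0 and sets *gid_index on a match, -ENOENT otherwise.
 */
int sketch_dgid_index_from_buf(const struct sketch_gid_entry *tbl, size_t n,
			       const uint8_t buf[40], enum sketch_l3 l3,
			       uint16_t *gid_index)
{
	uint8_t dgid[16] = {0};
	size_t i;

	if (l3 == SKETCH_L3_IPV4) {
		/* IPv4 header sits at offset 20; daddr is its last 4 bytes. */
		dgid[10] = 0xff;
		dgid[11] = 0xff;
		memcpy(dgid + 12, buf + 36, 4);
	} else {
		/* GRH / IPv6: destination address is bytes 24..39. */
		memcpy(dgid, buf + 24, 16);
	}

	for (i = 0; i < n && i <= UINT16_MAX; i++) {
		if (tbl[i].valid && memcmp(tbl[i].gid, dgid, 16) == 0) {
			*gid_index = (uint16_t)i;
			return 0;
		}
	}
	return -ENOENT;
}
```

If the hardware reports the network type, or better, the matched
index, a driver callback can fill that in and skip the lookup; the
caller-facing signature does not need to change either way.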

> > That said, I wouldn't object to vendor-specific bits in the wc. Ie if
> > mlx hardware needs a network_type bit to implement
> > rdma_find_dgid_index_from_wc, then fine - define a vendor specific
> > place to put it. In this case rdma_find_dgid_index_from_wc would be a
> > driver call back, which is fine, and what Caitlin was talking about.
> 
> This is not a Mellanox specific flag. See a quote from the spec
>
> A17.4.5.1 UD COMPLETION QUEUE ENTRIES (CQES)
> For UD, the Completion Queue Entry (CQE) includes remote address
> information (InfiniBand Specification Vol. 1 Rev 1.2.1 Section
> 11.4.2.1). For RoCEv2, the remote address information comprises the
> source L2 Address and a flag that indicates if the received frame is
> an IPv4, IPv6 or RoCE packet.

rdma_dgid_index_from_wc satisfies the above spec requirements without
creating a horrible API, or requiring all vendors to implement a
network type flag.

> > But, it is not part of our verbs API, and I'd *strongly* encourage
> > other vendors and future hardware to simply return the gid index that
> > the hardware matched instead of requiring the software to try and
> > guess after the fact.
> 
> Could be problematic for virtual machine architectures that give a
> portion of the entire GID table to a VM that index it 0..N

I doubt it.

Jason