On Wed, Dec 09, 2015 at 11:34:10AM +0200, Moni Shoua wrote:
> > Eh? I think you've missed the point, there is no net device when
> > looking at a wc.
> >
> > Look, here is a concrete direction:
> >
> > Replace all the crap in
> > ib_init_ah_from_wc/get_sgid_index_from_eth/rdma_addr_find_dmac_by_grh
> >
> > with a straightforward
> >
> > rdma_dgid_index_from_wc(
> >                const struct ib_qp *qp,
> >                const struct ib_wc *wc,
> >                const struct ib_grh *grh,
> >                u16 *gid_index)
> >
> > Sort of function that reads the GRH and wc and returns the unambiguous
> > gid index that was used to receive that packet on the UD QP.
> >
>
> I already answered this too but I'll do it again
> RoCEv2 spec says that L3 header will be scattered to receive WQE in
> the following way
> IPv6 and RoCEv1 - 40 bytes of the L3 header (GRH or IPv6) to the first
> 40 bytes of the receive bufs
> IPv4 - 20 bytes of the L3 header to the second half of the first 40
> bytes of the receive bufs. The first 20 bytes remain undefined.
>
> Now, if you think how you deduce network_type from GRH you'll see that
> it requires tools like checksum validation and other validations and
> you end up with a method that is not 100% error free. So, to eliminate
> the need for heavy computation (with regards to the other option) and
> be free from false deductions you have the option of getting
> network_type from the hardware. So, if you do have hardware that
> supports it why give it up?

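For readers following along, the software deduction described above can be sketched roughly as below. This is only an illustration of the kind of heuristic being discussed, not code from the series or the kernel: the names guess_l3_type, ipv4_hdr_csum_ok and enum l3_guess are hypothetical, and the particular checks (version nibble, IHL of 5, IPv4 header checksum) are my assumption of what "checksum validation and other validations" would involve. It is inherently a guess: the last four bytes of the SGID plus the DGID of a GRH can happen to look like a valid IPv4 header, and the undefined first 20 bytes of an IPv4 receive can happen to start with a version nibble of 6.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type for this sketch; not the kernel's enum. */
enum l3_guess { L3_GUESS_GRH_OR_IPV6, L3_GUESS_IPV4, L3_GUESS_UNKNOWN };

/*
 * One's-complement sum over a 20-byte IPv4 header. With the checksum
 * field included, a valid header folds to 0xffff.
 */
static int ipv4_hdr_csum_ok(const uint8_t *h)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < 20; i += 2)
		sum += (uint32_t)((h[i] << 8) | h[i + 1]);
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);
	return sum == 0xffff;
}

/*
 * Guess the L3 type from the first 40 bytes of a UD receive buffer.
 * IPv4 headers are expected in bytes 20..39; GRH/IPv6 in bytes 0..39.
 */
static enum l3_guess guess_l3_type(const uint8_t hdr[40])
{
	const uint8_t *ip4 = hdr + 20;

	/* Version 4, IHL 5 (no options), and a valid header checksum. */
	if ((ip4[0] >> 4) == 4 && (ip4[0] & 0x0f) == 5 &&
	    ipv4_hdr_csum_ok(ip4))
		return L3_GUESS_IPV4;

	/* GRH and IPv6 both carry version 6 in the top nibble. */
	if ((hdr[0] >> 4) == 6)
		return L3_GUESS_GRH_OR_IPV6;

	return L3_GUESS_UNKNOWN;
}
```

Note that even when this returns L3_GUESS_GRH_OR_IPV6 it cannot tell RoCEv1 from RoCEv2 over IPv6 by looking at these 40 bytes alone.
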
I understand how it works.

From an API perspective, I want to see something that can be used
*correctly* and that means more along the lines of
rdma_dgid_index_from_wc and not what is proposed in this series.

That means not exposing the callers to this awful mess.

> > That said, I wouldn't object to vendor-specific bits in the wc. Ie if
> > mlx hardware needs a network_type bit to implement
> > rdma_find_dgid_index_from_wc, then fine - define a vendor specific
> > place to put it. In this case rdma_find_dgid_index_from_wc would be a
> > driver call back, which is fine, and what Caitlin was talking about.
>
> This is not a Mellanox specific flag. See a quote from the spec
>
> A17.4.5.1 UD COMPLETION QUEUE ENTRIES (CQES)
> For UD, the Completion Queue Entry (CQE) includes remote address
> information (InfiniBand Specification Vol. 1 Rev 1.2.1 Section
> 11.4.2.1). For RoCEv2, the remote address information comprises the
> source L2 Address and a flag that indicates if the received frame is
> an IPv4, IPv6 or RoCE packet.

rdma_dgid_index_from_wc satisfies the above spec requirements without
creating a horrible API, or requiring all vendors to implement a
network type flag.

> > But, it is not part of our verbs API, and I'd *strongly* encourage
> > other vendors and future hardware to simply return the gid index that
> > the hardware matched instead of requiring the software to try and
> > guess after the fact.
>
> Could be problematic for virtual machine architectures that give a
> portion of the entire GID table to a VM that index it 0..N

I doubt it.

Jason
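
As a rough illustration of the API shape argued for above, here is a toy, userspace-only sketch in which the caller hands over the received header and gets back just a gid index. Everything in it is hypothetical: struct gid_entry, enum net_type, dgid_from_hdr and dgid_index_from_hdr are stand-ins, not kernel types or functions, and the table is a plain array rather than the real GID cache. It assumes IPv4 addresses are stored in the table as IPv4-mapped IPv6 GIDs (::ffff:a.b.c.d), and it takes the network type as an input; where that value comes from (a hardware flag or a software guess) would stay hidden behind the call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for this sketch; not kernel definitions. */
enum net_type { NET_ROCE_V1, NET_IPV4, NET_IPV6 };

struct gid_entry {
	uint8_t gid[16];
	enum net_type type;
};

/* Pull the destination GID out of the first 40 bytes of the buffer. */
static void dgid_from_hdr(const uint8_t hdr[40], enum net_type type,
			  uint8_t dgid[16])
{
	if (type == NET_IPV4) {
		/*
		 * IPv4 header sits in bytes 20..39; the destination
		 * address is at header offset 16. Assumption: store it
		 * as an IPv4-mapped IPv6 address.
		 */
		memset(dgid, 0, 10);
		dgid[10] = 0xff;
		dgid[11] = 0xff;
		memcpy(dgid + 12, hdr + 20 + 16, 4);
	} else {
		/* GRH or IPv6: destination is bytes 24..39. */
		memcpy(dgid, hdr + 24, 16);
	}
}

/* Returns 0 and sets *gid_index on a match, -1 if nothing matches. */
static int dgid_index_from_hdr(const struct gid_entry *tbl, size_t n,
			       const uint8_t hdr[40], enum net_type type,
			       uint16_t *gid_index)
{
	uint8_t dgid[16];
	size_t i;

	dgid_from_hdr(hdr, type, dgid);
	for (i = 0; i < n; i++) {
		if (tbl[i].type == type &&
		    memcmp(tbl[i].gid, dgid, 16) == 0) {
			*gid_index = (uint16_t)i;
			return 0;
		}
	}
	return -1;
}
```

The point of the shape is that callers only ever see the index; the header layout and the network-type question stay in one place.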