> From: Jason Gunthorpe [mailto:[email protected]] > Sent: Thursday, June 04, 2015 4:20 PM > > On Thu, Jun 04, 2015 at 08:26:49AM -0400, [email protected] wrote: > > From: Kaike Wan <[email protected]> > > > > This patch routes a SA pathrecord query to netlink first and processes > > the response appropriately. If a failure is returned, the request will > > be sent through IB. The decision whether to route the request to > > netlink first is determined by the presence of a listener for the > > local service netlink multicast group. If the user-space local service > > netlink multicast group listener is not present, the request will be > > sent through IB, just like what is currently being done. > > Why doesn't this follow the usual netlink usage idioms? I don't see a single > nla_put or nla_nest_start, which, based on the RFC, is what I expect to see. > Don't home brew net link message construction without a really great reason.
The only reason why we did not use nla_put is to avoid data copying. For example, to translate the input struct ib_sa_path_rec into struct ib_user_path_rec, we directly use the skb buffer reserved for the pathrecord attribute as the destination (ib_nl_set_path_rec_attr()) of ib_copy_path_rec_to_user, which is essentially the guts of nla_put. To use nla_put directly, we would have to use a temp buffer to hold ib_user_path_rec and then copy it to the skb. > > I am very surprised to see this layered under send_mad, all the context is > lost > at that point, and it makes it impossible to provide all the non-IB context > (IP > addrs, netdevs, QOS, etc) As Sean pointed out in a separate e-mail, this provides benefits to more applications. Currently, rdma_cm, ipoib, and SRP calls ib_sa_path_rec_get() to query pathrecord. In addition, this does not preclude optimization in other locations in the kernel in the future. > > Confused why all sorts of LS_NLA_PATH_F_X are defined but only > LS_NLA_PATH_F_USER is referenced. What does F_USER even mean? it indicates that the format of the attribute is struct ib_user_path_rec instead of packed MAD wire format (or struct ibv_path_rec in user space). This has been documented in the original RFC. I'm > pretty sure that should be the same as cma: > (IB_PATH_GMP | IB_PATH_PRIMARY | IB_PATH_BIDIRECTIONAL)) > > Why define new path flags when this is already done as a UAPI? yes, the flag is really copied from the pathrecord flags for struct ib_path_rec_data in include/uapi/rdma/ib_user_sa.h because our pathrecord attribute (struct rdma_nla_ls_path_rec) is similar to struct ib_path_rec_data, but not exactly the same. That's why we should define its own flags as a separate entity. > > Not sure the reply parse is done right either, where is the for loop? Well, for ib_sa_path_rec_get(), we are expecting to see only one pathrecord in the response. For other kernel locations where multiple pathrecords are expected, one may have to iterate through all returned attributes. > > The entire point of the original comment was to move away from using IB SA > Path query MADs in netlink, but this really only changes the reply format. > Better, but wondering if we should go further. > > I was hoping to see the three callers of ib_sa_path_rec_get contribute their > information to the netlink message. > > Suggest putting all the netlink stuff in its own file.. All our netlink stuff is based on the rdma_netlink.h definitions (client, multicast group, opcode, attributes) and I think that the include/uapi/rdma/rdma_entlink.h file is its natural habitat. > > Bunch of smaller stuff, can look at v4.. > > Overall, much improved, I think. > > Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to [email protected] More majordomo info at http://vger.kernel.org/majordomo-info.html
