> From: Jason Gunthorpe [mailto:[email protected]]
> Sent: Thursday, June 04, 2015 4:20 PM
> 
> On Thu, Jun 04, 2015 at 08:26:49AM -0400, [email protected] wrote:
> > From: Kaike Wan <[email protected]>
> >
> > This patch routes a SA pathrecord query to netlink first and processes
> > the response appropriately. If a failure is returned, the request will
> > be sent through IB. The decision whether to route the request to
> > netlink first is determined by the presence of a listener for the
> > local service netlink multicast group. If the user-space local service
> > netlink multicast group listener is not present, the request will be
> > sent through IB, just like what is currently being done.
> 
> Why doesn't this follow the usual netlink usage idioms? I don't see a single
> nla_put or nla_nest_start, which, based on the RFC, is what I expect to see.
> Don't home brew net link message construction without a really great reason.

The only reason why we did not use nla_put is to avoid data copying. For 
example, to translate the input struct ib_sa_path_rec into struct 
ib_user_path_rec, we directly use the skb buffer reserved for the pathrecord 
attribute as the destination (ib_nl_set_path_rec_attr()) of 
ib_copy_path_rec_to_user, which is essentially the guts of nla_put. To use 
nla_put directly, we would have to use a temp buffer to hold ib_user_path_rec 
and then copy it to the skb.

> 
> I am very surprised to see this layered under send_mad, all the context is 
> lost
> at that point, and it makes it impossible to provide all the non-IB context 
> (IP
> addrs, netdevs, QOS, etc)

As Sean pointed out in a separate e-mail, this provides benefits to more 
applications. Currently, rdma_cm, ipoib, and SRP calls ib_sa_path_rec_get() to 
query pathrecord. In addition, this does not preclude optimization in other 
locations in the kernel in the future.

> 
> Confused why all sorts of LS_NLA_PATH_F_X are defined but only
> LS_NLA_PATH_F_USER is referenced. What does F_USER even mean? 

it indicates that the format of the attribute is struct ib_user_path_rec 
instead of packed MAD wire format (or struct ibv_path_rec in user space). This 
has been documented in the original RFC.

I'm
> pretty sure that should be the same as cma:
>   (IB_PATH_GMP | IB_PATH_PRIMARY | IB_PATH_BIDIRECTIONAL))
> 
> Why define new path flags when this is already done as a UAPI?

yes, the flag is really copied from the pathrecord flags for struct 
ib_path_rec_data in include/uapi/rdma/ib_user_sa.h because our pathrecord 
attribute (struct rdma_nla_ls_path_rec) is similar to struct ib_path_rec_data, 
but not exactly the same.  That's why we should define its own flags as a 
separate entity.

> 
> Not sure the reply parse is done right either, where is the for loop?

Well, for ib_sa_path_rec_get(), we are expecting to see only one pathrecord in 
the response. For other kernel locations where multiple pathrecords are 
expected, one may have to iterate through all returned attributes.

> 
> The entire point of the original comment was to move away from using IB SA
> Path query MADs in netlink, but this really only changes the reply format.
> Better, but wondering if we should go further.
> 
> I was hoping to see the three callers of ib_sa_path_rec_get contribute their
> information to the netlink message.
> 
> Suggest putting all the netlink stuff in its own file..

All our netlink stuff is based on the rdma_netlink.h definitions (client, 
multicast group, opcode, attributes) and I think that the 
include/uapi/rdma/rdma_entlink.h file is its natural habitat.

> 
> Bunch of smaller stuff, can look at v4..
> 
> Overall, much improved, I think.
> 
> Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to