On Sun, Apr 26, 2020 at 10:17:12AM +0300, Maor Gottlieb wrote:
> +int rdma_lag_get_ah_roce_slave(struct ib_device *device,
> + struct rdma_ah_attr *ah_attr,
> + struct net_device **xmit_slave)

Please do not combine a ** output parameter with an int return. The
function should return the net_device pointer directly and report
errors with ERR_PTR().

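To make the suggestion concrete, here is a rough userspace sketch of the return convention. ERR_PTR(), PTR_ERR() and IS_ERR() are re-created locally as stand-ins for the kernel's <linux/err.h> helpers, and struct net_device and lag_get_xmit_slave() are hypothetical, stripped-down names used only for illustration. Returning NULL for "no LAG slave applies" is an assumption that mirrors the early "return 0" paths in the quoted patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the helpers in the kernel's <linux/err.h>. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	/* The top MAX_ERRNO addresses are reserved for encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, minimal stand-in for the real struct net_device. */
struct net_device {
	int is_bond_master;
	struct net_device *xmit_slave;
};

/*
 * Sketch of the suggested shape: one return value carries the slave,
 * NULL (assumed to mean "no LAG slave applies"), or an encoded errno.
 */
static struct net_device *lag_get_xmit_slave(struct net_device *master)
{
	if (!master)
		return ERR_PTR(-ENODEV);
	if (!master->is_bond_master)
		return NULL;
	if (!master->xmit_slave)
		return ERR_PTR(-EINVAL);
	return master->xmit_slave;
}
```

A caller would then check IS_ERR() on the result and convert it with PTR_ERR(), instead of tracking a separate int and an output pointer.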
> +{
> + struct net_device *master;
> + struct net_device *slave;
> + int err = 0;
> +
> + *xmit_slave = NULL;
> + if (!(ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE &&
> + ah_attr->grh.sgid_attr->gid_type == IB_GID_TYPE_ROCE_UDP_ENCAP))
> + return 0;
> +
> + rcu_read_lock();
> + master = rdma_read_gid_attr_ndev_rcu(ah_attr->grh.sgid_attr);
> + if (IS_ERR(master)) {
> + err = PTR_ERR(master);
> + goto unlock;
> + }
> + dev_hold(master);

What is the point of this dev_hold()? This whole thing runs under
rcu_read_lock(), so master stays valid without an extra reference.

> +
> + if (!netif_is_bond_master(master))
> + goto put;
> +
> + slave = rdma_get_xmit_slave_udp(device, master, ah_attr);

IMHO it is probably better to keep the dev_hold() and drop the RCU
read lock while doing rdma_build_skb, so that the allocation in there
doesn't have to be atomic. This path isn't performance sensitive, so
the extra atomic operation for the dev_hold() is better than an
unnecessary GFP_ATOMIC allocation.

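Roughly what I mean, as a userspace model rather than kernel code. The rcu_read_lock(), rcu_read_unlock(), dev_hold() and dev_put() below are simplified stand-ins with the same names as the kernel primitives; lookup_master_rcu(), alloc_may_sleep() and build_with_ref() are hypothetical helpers, with alloc_may_sleep() modelling a GFP_KERNEL allocation that is only legal outside the read-side section.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Userspace stand-ins; the real primitives live in the kernel. */
static int rcu_depth;	/* models RCU read-side nesting */

static void rcu_read_lock(void)   { rcu_depth++; }
static void rcu_read_unlock(void) { rcu_depth--; }

/* Hypothetical, minimal stand-in for the real struct net_device. */
struct net_device {
	atomic_int refcnt;
};

static void dev_hold(struct net_device *d) { atomic_fetch_add(&d->refcnt, 1); }
static void dev_put(struct net_device *d)  { atomic_fetch_sub(&d->refcnt, 1); }

/* Models a GFP_KERNEL allocation: may sleep, so it refuses to run under RCU. */
static void *alloc_may_sleep(size_t n)
{
	return rcu_depth == 0 ? malloc(n) : NULL;
}

/* Hypothetical lookup whose result is only stable inside the RCU section. */
static struct net_device *lookup_master_rcu(struct net_device *d)
{
	return d;
}

/* Returns 0 on success, -1 if the allocation failed. */
static int build_with_ref(struct net_device *dev)
{
	struct net_device *master;
	void *skb;
	int ok;

	rcu_read_lock();
	master = lookup_master_rcu(dev);
	dev_hold(master);		/* pin the device before leaving RCU */
	rcu_read_unlock();

	skb = alloc_may_sleep(64);	/* sleeping allocation is legal here */
	ok = skb != NULL;
	free(skb);

	dev_put(master);		/* drop the pin once the work is done */
	return ok ? 0 : -1;
}
```

The reference costs one extra atomic increment and decrement, and in exchange the allocation no longer has to be done in atomic context.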
> + if (!slave) {
> + ibdev_warn(device, "Failed to get lag xmit slave\n");
> + err = -EINVAL;
> + goto put;
> + }
> +
> + dev_hold(slave);

And I think the dev_hold() should live inside rdma_get_xmit_slave_udp(),
as functions named 'get' really ought to return with a reference held.

Jason