On 29/01/2015 23:59, Yann Droneaud wrote:
> But I wonder about this part of commit 860f10a799c8:
>
> diff --git a/drivers/infiniband/core/uverbs_cmd.c
> b/drivers/infiniband/core/uverbs_cmd.c
> index c7a43624c96b..f9326ccda4b5 100644
> --- a/drivers/infiniband/core/uverbs_cmd.c
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -953,6 +953,18 @@ ssize_t ib_uverbs_reg_mr(struct ib_uverbs_file *file,
> goto err_free;
> }
>
> + if (cmd.access_flags & IB_ACCESS_ON_DEMAND) {
> + struct ib_device_attr attr;
> +
> + ret = ib_query_device(pd->device, &attr);
> + if (ret || !(attr.device_cap_flags &
> + IB_DEVICE_ON_DEMAND_PAGING)) {
> + pr_debug("ODP support not available\n");
> + ret = -EINVAL;
> + goto err_put;
> + }
> + }
> +
>
> AFAICT (1 << 6) bit from struct ib_uverbs_reg_mr access_flags field
> was not enforced to be 0 previously, as ib_check_mr_access() only check
> for some coherency between a subset of the bits (it's not a function
> dedicated to check flags provided by userspace):
>
> include/rdma/ib_verbs.h:
>
> 1094 enum ib_access_flags {
> 1095 IB_ACCESS_LOCAL_WRITE = 1,
> 1096 IB_ACCESS_REMOTE_WRITE = (1<<1),
> 1097 IB_ACCESS_REMOTE_READ = (1<<2),
> 1098 IB_ACCESS_REMOTE_ATOMIC = (1<<3),
> 1099 IB_ACCESS_MW_BIND = (1<<4),
> 1100 IB_ZERO_BASED = (1<<5),
> 1101 IB_ACCESS_ON_DEMAND = (1<<6),
> 1102 };
>
> drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()
>
> 961 ret = ib_check_mr_access(cmd.access_flags);
> 962 if (ret)
> 963 return ret;
>
> include/rdma/ib_verbs.h:
>
> 2643 static inline int ib_check_mr_access(int flags)
> 2644 {
> 2645 /*
> 2646 * Local write permission is required if remote write or
> 2647 * remote atomic permission is also requested.
> 2648 */
> 2649 if (flags & (IB_ACCESS_REMOTE_ATOMIC |
> IB_ACCESS_REMOTE_WRITE) &&
> 2650 !(flags & IB_ACCESS_LOCAL_WRITE))
> 2651 return -EINVAL;
> 2652
> 2653 return 0;
> 2654 }
>
> drivers/infiniband/core/uverbs_cmd.c: ib_uverbs_reg_mr()
>
> 990 mr = pd->device->reg_user_mr(pd, cmd.start, cmd.length,
> cmd.hca_va,
> 991 cmd.access_flags, &udata);
>
> reg_user_mr() functions may call ib_umem_get() and pass access_flags to
> it:
>
> drivers/infiniband/core/umem.c: ib_umem_get()
>
> 114 /*
> 115 * We ask for writable memory if any of the following
> 116 * access flags are set. "Local write" and "remote write"
> 117 * obviously require write access. "Remote atomic" can do
> 118 * things like fetch and add, which will modify memory, and
> 119 * "MW bind" can change permissions by binding a window.
> 120 */
> 121 umem->writable = !!(access &
> 122 (IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_WRITE |
> 123 IB_ACCESS_REMOTE_ATOMIC | IB_ACCESS_MW_BIND));
> 124
> 125 if (access & IB_ACCESS_ON_DEMAND) {
> 126 ret = ib_umem_odp_get(context, umem);
> 127 if (ret) {
> 128 kfree(umem);
> 129 return ERR_PTR(ret);
> 130 }
> 131 return umem;
> 132 }
>
>
> As you can see only a few bits in access_flags are checked in the end,
> so it may exist a very unlikely possibility that an existing userspace
> program is setting this IB_ACCESS_ON_DEMAND bit without the intention
> of enabling on demand paging as this would be unnoticed by older kernel.
>
> In the other hand, a newer program built with on-demand-paging in mind
> will set the bit, but when run on older kernel, it will be a no-op,
> allowing the program to continue, perhaps thinking on-demand-paging
> is available.
This is why we added a method for userspace to check the kernel
capabilities both when adding the memory windows support and with ODP.
User-space can still send garbage to the kernel, but if it does so
without checking the kernel supports its request, it's user-space's problem.
> That should be avoided as much as possible.
>
> Unfortunately, I think this cannot be fixed as it's was long since
> IB_ZERO_BASED was added by commit 7083e42ee2 ("IB/core: Add "type 2"
> memory windows support").
> Anyway there was no check for IB_ACCESS_REMOTE_READ, nor
> IB_ACCESS_MW_BIND in the uverb layer either.
I think it would be perfectly fine to add a check today that validates
the MR access flags input is known. Because the newer feature require
user-space to check capabilities, I don't think it would matter if we
returned an error on newer kernels that do not support these features.
Haggai
--
To unsubscribe from this list: send the line "unsubscribe linux-api" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html