On Wed, May 13, 2026 at 03:12:36AM +0000, [email protected] wrote: > Thank you for your contribution! Sashiko AI review found 5 potential issue(s) > to consider: > - [Critical] TOCTOU heap buffer overflow due to unvalidated `num_sge` from > user-shared memory. > - [High] Memory leak of the kernel queue structure (`srq->rq.kwq`) on > user-backed SRQ modifications. > - [High] Locking imbalance and freeing memory while locked. > - [High] Inconsistent state and Use-After-Free on error path. > - [Low] Uninitialized variable compiler warning for `offset_addr`. > --
These are all pre-existing and rvt is too confusing to try to fix be me. Dennis will have to handle them > commit dac7699d9e58c179b3a8859ac64da79e5fd21f46 > Author: Jason Gunthorpe <[email protected]> > Subject: IB/rdmavt: Don't abuse udata and ib_respond_udata() > > This commit modifies rvt_modify_srq to use copy_to_user directly instead > of ib_copy_to_udata for returning the offset address. > > > diff --git a/drivers/infiniband/sw/rdmavt/srq.c > > b/drivers/infiniband/sw/rdmavt/srq.c > > index fe125bf85b272..d022aa56c5bfd 100644 > > --- a/drivers/infiniband/sw/rdmavt/srq.c > > +++ b/drivers/infiniband/sw/rdmavt/srq.c > > @@ -128,6 +128,7 @@ int rvt_modify_srq(struct ib_srq *ibsrq, struct > > ib_srq_attr *attr, > > struct rvt_srq *srq = ibsrq_to_rvtsrq(ibsrq); > > struct rvt_dev_info *dev = ib_to_rvt(ibsrq->device); > > struct rvt_rq tmp_rq = {}; > > + __u64 offset_addr; > > int ret = 0; > > > > if (attr_mask & IB_SRQ_MAX_WR) { > > Could hoisting offset_addr lead to a compiler warning for being potentially > uninitialized? > > The compiler might assume udata->inlen could change concurrently across the > spin_lock_irq() and spin_unlock_irq() calls. If the first check fails > but the second one passes: > > if (udata && udata->inlen >= sizeof(__u64)) > > Does this allow offset_addr to be used without initialization? > > [ ... ] > > > @@ -149,19 +150,17 @@ int rvt_modify_srq(struct ib_srq *ibsrq, struct > > ib_srq_attr *attr, > > return -ENOMEM; > > /* Check that we can write the offset to mmap. */ > > if (udata && udata->inlen >= sizeof(__u64)) { > > - __u64 offset_addr; > > __u64 offset = 0; > > > > ret = ib_copy_from_udata(&offset_addr, udata, > > sizeof(offset_addr)); > > if (ret) > > goto bail_free; > > - udata->outbuf = (void __user *) > > - (unsigned long)offset_addr; > > - ret = ib_copy_to_udata(udata, &offset, > > - sizeof(offset)); > > - if (ret) > > + if (copy_to_user(u64_to_user_ptr(offset_addr), &offset, > > + sizeof(offset))) { > > + ret = -EFAULT; > > goto bail_free; > > + } > > } > > > > spin_lock_irq(&srq->rq.kwq->c_lock); > > Can this code leak the kernel queue structure? > > When resizing a user-backed SRQ, the code fetches elements from the old > user-mapped queue. In the udata path, okwq is left as NULL: > > if (udata) { > owq = srq->rq.wq; > head = RDMA_READ_UAPI_ATOMIC(owq->head); > tail = RDMA_READ_UAPI_ATOMIC(owq->tail); > } else { > okwq = srq->rq.kwq; > ... > > When it later calls kvfree(okwq), it seems nothing is freed, which could > permanently leak the old kwq structure. > > Is there a lock imbalance here when the queue is replaced? > > The original queue is locked with: > > spin_lock_irq(&srq->rq.kwq->c_lock); > > But then the queue pointer is replaced: > > srq->rq.kwq = tmp_rq.kwq; > > When the lock is released: > > spin_unlock_irq(&srq->rq.kwq->c_lock); > > Does this unlock the new queue instead of the old one, and then free the old > queue via kvfree(okwq) while its lock is still held? > > Can the loop migrating WQEs overflow the destination buffer? > > The number of scatter-gather elements is read directly from user-shared > memory: > > wqe = rvt_get_rwqe_ptr(&srq->rq, tail); > p->wr_id = wqe->wr_id; > p->num_sge = wqe->num_sge; > for (i = 0; i < wqe->num_sge; i++) > p->sg_list[i] = wqe->sg_list[i]; > > If a userspace program concurrently modifies num_sge to a large value, could > this loop exceed the bounds of tmp_rq.kwq->curr_wq, since the destination > size was calculated based on srq->rq.max_sge? > > [ ... ] > > > @@ -236,10 +235,10 @@ int rvt_modify_srq(struct ib_srq *ibsrq, struct > > ib_srq_attr *attr, > > * See rvt_mmap() for details. > > */ > > if (udata && udata->inlen >= sizeof(__u64)) { > > - ret = ib_copy_to_udata(udata, &ip->offset, > > - sizeof(ip->offset)); > > - if (ret) > > - return ret; > > + if (copy_to_user(u64_to_user_ptr(offset_addr), > > + &ip->offset, > > + sizeof(ip->offset))) > > + return -EFAULT; > > } > > > > Does returning -EFAULT here leave the kernel and userspace in an inconsistent > state? > > At this point, the kernel has already applied the SRQ size change, replaced > the queue, and permanently freed the old memory: > > vfree(owq); > kvfree(okwq); > > If copy_to_user fails and returns -EFAULT, userspace might interpret this as > a failed resize and continue using its existing memory mapping of the old > queue. Could this lead to a use-after-free since the pages have already > been freed? > > -- > Sashiko AI review ยท > https://sashiko.dev/#/patchset/[email protected]?part=2
