On 6/10/19 9:02 AM, Jason Gunthorpe wrote:
On Fri, Jun 07, 2019 at 02:37:07PM -0700, Ralph Campbell wrote:

On 6/6/19 11:44 AM, Jason Gunthorpe wrote:
From: Jason Gunthorpe <j...@mellanox.com>

hmm_release() is called exactly once per hmm. ops->release() cannot
accidentally trigger any action that would recurse back onto
hmm->mirrors_sem.

This fixes a use after-free race of the form:

         CPU0                                   CPU1
                                             hmm_release()
                                               up_write(&hmm->mirrors_sem);
   hmm_mirror_unregister(mirror)
    down_write(&hmm->mirrors_sem);
    up_write(&hmm->mirrors_sem);
    kfree(mirror)
                                               mirror->ops->release(mirror)

The only user we have today for ops->release is an empty function, so this
is unambiguously safe.

As a consequence of plugging this race drivers are not allowed to
register/unregister mirrors from within a release op.

Signed-off-by: Jason Gunthorpe <j...@mellanox.com>

I agree with the analysis above but I'm not sure that release() will
always be an empty function. It might be more efficient to write back
all data migrated to a device "in one pass" instead of relying
on unmap_vmas() calling hmm_start_range_invalidate() per VMA.

I think we have to focus on the *current* kernel - and we have two
users of release, nouveau_svm.c is empty and amdgpu_mn.c does
schedule_work() - so I believe we should go ahead with this simple
solution to the actual race today that both of those will suffer from.

If we find a need for a more complex version then it can be debated
and justified with proper context...

Ok?

Jason

OK.
I guess we have enough on the plate already :-)
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Reply via email to