On Sun, May 13, 2018 at 02:34:13PM -0600, Jason Gunthorpe wrote:
> On Fri, May 11, 2018 at 02:25:22PM +0900, DaeRyong Jeong wrote:
> > We report the crash: KASAN: use-after-free Read in cma_cancel_operation
> > Note that this bug is previously reported by syzkaller.
> > https://syzkaller.appspot.com/bug?id=95f89b8fb9fdc42e28ad586e657fea074e4e719b
> > Nonetheless, this bug has not fixed yet, and we hope that this report
> > and our analysis, which gets help by the RaceFuzzer's feature, will
> > helpful to fix the crash.
> > This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
> > version of Syzkaller), which we describe more at the end of this
> > report. Our analysis shows that the race occurs when invoking two
> > syscalls concurrently, write$rdma_cm and write$rdma_cm.
> Well, calling rdma_destroy_id() twice/concurrently is invalid.. The
> confusing part of this is how does it happen from ucma.c ..
> Double calls via write look OK to me, the ID is removed from the IDR
> at the top so it cannot be invoked twice.. So not sure what
> "write$rdma_cm and write$rdma_cm." is supposed to me?
It meant two write syscalls. One for ucma_listen, one for ucma_resolve_ip.
> Is your test showing that write() vs close() is the problem? The oops
> suggests that.. And the logic around ctx->closing looks tortured
> enough that it is probably wrong...
My previous diagnosis seems to be wrong. I'm sorry for the confusion.
I'm looking into the code and trying to find out the cause of the crash.
We have found one more crash: a null-ptr-deref in cma_bind_listen caused
by two ucma_listen calls.
I think I have a clear idea of why the null-ptr-deref occurred. I will send
a report for the null-ptr-deref. Please take a look at it.
We suspect that this crash and the null-ptr-deref have the same root cause:
an incomplete state check.
Hopefully, the second report will help in finding the root cause of this