Alexander Bluhm wrote:
> On Mon, Jan 16, 2017 at 08:36:46PM +0100, [email protected] wrote:
> > kernel: protection fault trap, code=0
> > Stopped at fd_getfile+0x20: testb $0x2,mptramp_gdt32_desc+0x1e(%rax)
> > ddb{3}> fd_getfile() at fd_getfile+0x20
> > sys_fstat() at sys_fstat+0x43
> > syscall() at syscall+0x27b
>
> It crashes in fd_getfile() FILE_IS_USABLE(fp) as fdp->fd_ofiles has
> been freed.
>
> fdexpand() assumes that it has the write lock, calls free(fdp->fd_ofiles)
> and then sleeps in mallocarray(M_WAITOK) before updating fdp->fd_ofiles.
>
> As fd_getfile() does not grab a readlock, it may get scheduled while
> fdexpand() sleeps.
>
> Simply calling rw_enter_read(&fdp->fd_lock) in fd_getfile() does
> not work, as sometimes it is called with the writelock already held.
> Not sure whether such a write lock check is nice style, but it avoids
> having to fix all callers.

I worry that this leaves some other race open. When I added fdplock(), I never even considered a race in getfile. But fixing all the callers will be invasive and it seems likely we'll miss one, so I guess this is the best approach.
