> Date: Sat, 13 Aug 2022 19:47:54 +0700
> From: Robert Elz <[email protected]>
>
> However, since we now know the issue we have been looking at does involve
> the raw devices, not the block ones, I'm not sure what is the point of
> reverting that specfs_vnode.c patch, which only affects the block device
> open. If that is needed, we might as well keep it, right? It shouldn't
> affect current testing either way.

When _userland_ opens the raw /dev/rdkN _character_ device, for a wedge on (say) raid0, the _kernel_ will do the equivalent of opening the /dev/raid0 _block_ device. All I/O by userland through /dev/rdkN goes through the block device in the kernel.

Normally, the paths in the kernel for open/close on /dev/rdkN arrange to open the block device only once at a time, and serialize opening and closing the block device under a lock -- well, except they _don't_ serialize closing the block device under that lock. So if, say, fsck opens and closes /dev/rdkN, and dkctl opens /dev/rdkM at about the same time, dkctl might race to open the block device (in the kernel) before fsck has finished closing it (again, in the kernel). That's the race that the patch to dk.c avoids.

The patch to spec_vnops.c is necessary to make spec_open gracefully return EBUSY instead of crashing the kernel when this happens. The patch to dk.c is necessary to serialize the /dev/rdkN open/close logic so that it never hits this case at all when opening the raid0 block device -- and thus never spuriously fails with EBUSY _or_ crashes the kernel.
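
To make the locking concrete, here is a rough user-space model in C of the serialized open/close path described above. It is only a sketch under stated assumptions: the names (raw_lock, raw_opens, parent_open, parent_close, wedge_open, wedge_close, run_model) are hypothetical and are not taken from dk.c or spec_vnops.c, and pthread mutexes stand in for the kernel locks.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical model; none of these names come from the kernel sources. */
static pthread_mutex_t raw_lock = PTHREAD_MUTEX_INITIALIZER;
static int raw_opens;           /* wedge opens sharing the parent (raw_lock) */

static pthread_mutex_t parent_lock = PTHREAD_MUTEX_INITIALIZER;
static int parent_is_open;      /* parent block device state (parent_lock) */

/* Models the parent block device open: a second open fails with EBUSY. */
static int parent_open(void)
{
    int error = 0;

    pthread_mutex_lock(&parent_lock);
    if (parent_is_open)
        error = EBUSY;
    else
        parent_is_open = 1;
    pthread_mutex_unlock(&parent_lock);
    return error;
}

static void parent_close(void)
{
    pthread_mutex_lock(&parent_lock);
    assert(parent_is_open);
    parent_is_open = 0;
    pthread_mutex_unlock(&parent_lock);
}

/* First wedge open also opens the parent, all under raw_lock. */
static int wedge_open(void)
{
    int error = 0;

    pthread_mutex_lock(&raw_lock);
    if (raw_opens == 0)
        error = parent_open();
    if (error == 0)
        raw_opens++;
    pthread_mutex_unlock(&raw_lock);
    return error;
}

/* Last wedge close also closes the parent, still holding raw_lock. */
static void wedge_close(void)
{
    pthread_mutex_lock(&raw_lock);
    assert(raw_opens > 0);
    if (--raw_opens == 0)
        parent_close();
    pthread_mutex_unlock(&raw_lock);
}

struct worker_arg {
    int iters;
    int ebusy;                  /* EBUSY failures seen by this thread */
};

static void *worker(void *cookie)
{
    struct worker_arg *arg = cookie;

    for (int i = 0; i < arg->iters; i++) {
        if (wedge_open() == EBUSY)
            arg->ebusy++;
        else
            wedge_close();
    }
    return NULL;
}

/* Run up to 16 threads doing open/close loops; return total EBUSY count. */
int run_model(int nthreads, int iters)
{
    pthread_t threads[16];
    struct worker_arg args[16];
    int started = 0, total = 0;

    if (nthreads > 16)
        nthreads = 16;
    for (int i = 0; i < nthreads; i++) {
        args[i].iters = iters;
        args[i].ebusy = 0;
        if (pthread_create(&threads[i], NULL, worker, &args[i]) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += args[i].ebusy;
    }
    return total;
}
```

With the last close done under raw_lock, run_model(4, 10000) should never count an EBUSY in this model. Moving the parent_close() call to after the unlock in wedge_close() would reopen the window in which a concurrent wedge_open() sees raw_opens == 0 while the parent is still open, which is the analogue of the fsck/dkctl race above.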
