On Mon, Feb 27, 2012 at 10:55:48AM +1030, Brett Lymn wrote: > I have been tracking through a bug with Coda where, basically, getdents(2) > is not returning all the directory entries. The files exist in the > directory but do not show up on a globbed listing. After some digging I > found that things ended up in ufs_readdir() which was terminating early > due to a bad dirent. > > What is happening is the userland part of Coda, venus, manufactures a > the dirents for a directory when a read request comes up from the > kernel. It has code to carefully avoid a dirent spanning a DIRBLKSIZ > boundary by padding a dirent near the boundary. This data is returned > to the kernel for processing. Where things come unstuck is DIRBLKSIZ is > defined in /usr/include/dirent.h as 1024 bytes, inside the kernel ufs > code sys/ufs/ufs/dir.h DIRBLKSIZE is set to DEV_BSIZE which is 512 > bytes. This means that venus can produce a block of dirents that finish > before the 512 byte mark, the kernel code tries to align back to a 512 > byte boundary and fails to find a valid dirent - the fact that ufs_readdir() > exits gracefully rather than causing a panic is more by luck than > anything else.
Ugh, what a mess. > To fix it I think I have the following choices: > > 1) Patch venus so it ignores the userland DIRBLKSIZ and, instead, uses > DEV_BSIZE (if available) or just hard code in 512 bytes as the dirent > block size. > > 2) Change DIRBLKSIZ in dirent.h so it matches the kernel > > 3) Fix mount_coda so it updates the um_dirblksiz to match userland. I don't think any of those is the right answer. Coda is not limited to running on top of ffs, so it shouldn't be doing only filesystem- independent things when talking to the filesystem it uses for storage. (Right?) Therefore it should be using the value from <dirent.h> in both the kernel and in venus. If it's running on top of ffs, ffs will provide dirents with padding at 512-byte intervals that it would think unnecessary, but I would think it shouldn't notice or care. Then again, maybe I don't understand what's going on, as there shouldn't be any way for ufs_readdir to see, much less trip on, dirents generated by venus. Note that ffs needs DIRBLKSIZ to be the same as the underlying atomic I/O size, or various unspecified bad things can happen in crashes. So you can't change what ffs is doing. Also, I have no idea why the userland value diverged from the ffs value, but I doubt it's safe to change it without adding a large pile of compat wibble. Finally, we should not not not have duplicate symbol names like this. I guess now that we've branched I can go clean it up... -- David A. Holland [email protected]
