On Sat, Nov 2, 2013 at 5:48 PM, Ingo Schwarze <schwa...@usta.de> wrote:
> Here is an updated patch which now works correctly with Otto's
> regression test, with the new test i just committed, and with
> the test from the Perl test suite Andrew pointed out, even with
> threads enabled.  It also survived quite some manual testing.
...
> Comments?  Tests?  OKs?
...
>  void
>  seekdir(DIR *dirp, long loc)
>  {
> +       struct dirent *dp;
> +
> +       /*
> +        * First check whether the directory entry to seek for
> +        * is still buffered in the directory structure in memory.
> +        */
> +
>         _MUTEX_LOCK(&dirp->dd_lock);
> -       __seekdir(dirp, loc);
> +       dp = (struct dirent *)dirp->dd_buf;
> +       if (dirp->dd_size > 0 && dp->d_off <= loc) {
> +               for (dirp->dd_loc = 0;
> +                    dirp->dd_loc < dirp->dd_size;
> +                    dirp->dd_loc += dp->d_reclen) {
> +                       dp = (struct dirent *)(dirp->dd_buf + dirp->dd_loc);
> +                       if (dp->d_off < loc)
> +                               continue;

This diff assumes that the d_off values for directory entries are
monotonically increasing as you advance through the directory.  That
is not guaranteed and is false for our implementation for some
filesystems.  The d_off values for NFS come straight from the NFS
server, and the NFS RFCs place no requirement on the cookies returned
by the server other than that the zero cookie indicates the first
entry.  (An OpenBSD NFS server returns cookies from the underlying
filesystem, so this will be difficult to see on a normal system.)

The current tmpfs code will also cause problems for this diff: the
d_off values for tmpfs are practically random, being derived from
kernel pointers returned by malloc(9).  That will need to change, as
kernel pointer values shouldn't be exposed to userland like that, but
it should serve as a way to confirm that this behavior causes problems
for the optimized seekdir().

One possibility is for the kernel to tell userland whether d_off
values are sure to be monotonically increasing for the opened
directory and then have userland only use the opimization when that's
the case.  We could add a pathconf/fpathconf name to get that
information: fpathconf(fd, _PC_DIRECTORY_MONOTONIC_OFFSETS) anyone?


Philip Guenther

Reply via email to