Ingo Schwarze wrote on Mon, Nov 04, 2013 at 09:51:41AM +0100:

> Then i will send two cleanup patches to remove useless stuff
> and put the code into the right place, not changing any functionality.

Done & committed (thanks to Otto and Todd for checking).

> Finally, we can work out how to do the optimization.
> Probably, that will naturally factorize into two steps:
> 
>  (1) Use the information available in the userland buffer
>      to avoid the getdents(2) syscall, *without* assuming
>      monotonicity.

That is done by the patch below; i'm asking for OKs.

Note that, even though it looks superficially similar to the
patch i sent originally, this one does not require monotonicity.

To confirm that it is really an optimization, i did some
measurements on my notebook:

 * The typical time required for a seekdir()/readdir() pair
   when the entry asked for is right at the beginning of the
   userland buffer is about   0.05 microseconds

 * The typical time required for searching through the
   whole userland buffer from its beginning to its end
   is about   5 microseconds

 * The typical time required for one getdents() syscall
   is about   100 microseconds

Consequently, we have the following typical edge cases:

 * Best case:
   opendir; readdir; telldir
   then 100.000 times (seekdir; readdir) at that position
    = The entry asked for is near the beginning of the buffer.
   For that case, the following patch reduces total execution
   time from 10.7s to 0.005s (-99.95%).

 * Maximum searching case:
   opendir; 500 times readdir; telldir
   then 100.000 times (seekdir; readdir) at that position
    = The entry asked for is near the end of the buffer.
   For that case, the following patch reduces total execution
   time from 10.7s to 0.5s (-95%).

 * Worst case:
   opendir; telldir
   then 100.000 times (seekdir; readdir) at that position
    = The entry asked for is the very first entry in the buffer,
      which cannot be found, because each dirent only contains
      the address of the *following* dirent, so each readdir
      triggers getdents anyway, but we still search the whole
      buffer each time.
   For that case, the following patch increases total execution
   time from 10.7s to 11.3s (+5%).

So in some cases, very large speedups are possible,
for a relatively small cost in the (very unlikely) worst case.

>  (2) Further optimize searching when monotonicity is available.

Given the data above, i don't consider further optimization
worthwhile any longer, so i consider what i'm sending here
to be the final optimization patch.

There are two cases where further optimization might in principle
be possible based on monotonicity:

 * Avoid searching through the whole buffer when monotonicity
   allows to conclude that the entry cannot be anywhere in the
   buffer.
   But in that case, getdents() is needed anyway, which is so
   much more expensive than searching the buffer that the
   latter is practically irrelevant.

 * Exploit monotonicity to replace the linear search by a binary
   search.  In the best case - the entries searched for are
   always near the end of the buffer - that might theoretically
   reduce execution times by a factor of up to 50.  However,
   that only applies to a very narrow range of problems,
   specifically directories with several hundred but less than
   about 500 entries.  For larger directories, the buffer cannot
   hold the whole directory at once, so execution times will
   be dominated by the required getdents() calls, and the search
   algorithm becomes irrelevant.  So, in the best case, we could
   save up to 5 microseconds per seekdir()/readdir() pair in
   directories of 500 entries.  That's at most a few milliseconds
   per iteration through the directory.  So, is anybody going to
   have tens of thousands of directories, each just below 500
   entries, and iterate all of them with seekdir()/readdir()?
   Not likely.  I don't think this remote possibility warrants
   additional complication of the code, and much less introducing
   a new pathconf(2) value.

That said, here is the optimization patch:

Yours,
  Ingo


Index: opendir.c
===================================================================
RCS file: /cvs/src/lib/libc/gen/opendir.c,v
retrieving revision 1.25
diff -u -p -r1.25 opendir.c
--- gen/opendir.c       6 Oct 2013 17:57:11 -0000       1.25
+++ gen/opendir.c       5 Nov 2013 22:58:40 -0000
@@ -111,6 +111,7 @@ __fdopendir(int fd)
                return (NULL);
        }
 
+       dirp->dd_size = 0;
        dirp->dd_loc = 0;
        dirp->dd_fd = fd;
        dirp->dd_lock = NULL;
Index: seekdir.c
===================================================================
RCS file: /cvs/src/lib/libc/gen/seekdir.c,v
retrieving revision 1.10
diff -u -p -r1.10 seekdir.c
--- gen/seekdir.c       5 Nov 2013 20:36:51 -0000       1.10
+++ gen/seekdir.c       5 Nov 2013 22:58:40 -0000
@@ -1,31 +1,18 @@
 /*     $OpenBSD: seekdir.c,v 1.10 2013/11/05 20:36:51 schwarze Exp $ */
 /*
- * Copyright (c) 1983, 1993
- *     The Regents of the University of California.  All rights reserved.
+ * Copyright (c) 2013 Ingo Schwarze <schwa...@openbsd.org>
  *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- *    notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- *    notice, this list of conditions and the following disclaimer in the
- *    documentation and/or other materials provided with the distribution.
- * 3. Neither the name of the University nor the names of its contributors
- *    may be used to endorse or promote products derived from this software
- *    without specific prior written permission.
+ * Permission to use, copy, modify, and distribute this software for any
+ * purpose with or without fee is hereby granted, provided that the above
+ * copyright notice and this permission notice appear in all copies.
  *
- * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
+ * THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+ * WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+ * ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+ * WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ * ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
+ * OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
  */
 
 #include <dirent.h>
@@ -41,7 +28,39 @@
 void
 seekdir(DIR *dirp, long loc)
 {
+       struct dirent *dp;
+
+       /*
+        * First check whether the directory entry to seek for
+        * is still buffered in the directory structure in memory.
+        */
+
        _MUTEX_LOCK(&dirp->dd_lock);
+       for (dirp->dd_loc = 0;
+            dirp->dd_loc < dirp->dd_size;
+            dirp->dd_loc += dp->d_reclen) {
+               dp = (struct dirent *)(dirp->dd_buf + dirp->dd_loc);
+               if (dp->d_off != loc)
+                       continue;
+
+               /*
+                * Entry found in the buffer, use it.  If readdir(3)
+                * follows, this will save us a getdents(2) syscall.
+                * Note that d_off is the offset of the _next_ entry,
+                * so advance dd_loc.
+                */
+
+               dirp->dd_loc += dp->d_reclen;
+               dirp->dd_curpos = loc;
+               _MUTEX_UNLOCK(&dirp->dd_lock);
+               return;
+       }
+
+       /*
+        * The entry is not in the buffer, prepare a call to getdents(2).
+        * In particular, invalidate dd_loc.
+        */
+
        dirp->dd_loc = 0;
        dirp->dd_curpos = lseek(dirp->dd_fd, loc, SEEK_SET);
        _MUTEX_UNLOCK(&dirp->dd_lock);

Reply via email to