[TO -= bug-gnulib, TO += findutils-patches] I'm planning to apply Jim's patch to findutils soon. I have snipped out the gnulib patch he sent, because he later changed it.
---------- Forwarded message ----------
From: Jim Meyering <[EMAIL PROTECTED]>
Date: Fri, Nov 28, 2008 at 7:19 PM
Subject: Re: [Patch] Add dirent.d_type support to Cygwin 1.7 ?
To: James Youngman <[EMAIL PROTECTED]>
Cc: Eric Blake <[EMAIL PROTECTED]>, bug-findutils mailing list <[EMAIL PROTECTED]>, bug-gnulib <[EMAIL PROTECTED]>, Christian Franke <[EMAIL PROTECTED]>

"James Youngman" <[EMAIL PROTECTED]> wrote:
> On Thu, Nov 27, 2008 at 10:53 PM, Eric Blake <[EMAIL PROTECTED]> wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> [dropping cygwin-patches, since posts are closed to non-subscribers, and
>> adding bug-findutils and bug-gnulib. Christian is working on a patch that
>> lets cygwin do initial support of dirent.d_type]
>>
>> According to Christian Franke on 11/27/2008 2:41 PM:
>>>
>>> PS: find is not as smart as expected: 'find /path -type d' calls lstat()
>>> for each entry, even if d_type != DT_UNKNOWN.
>>> So 'find /path' is 2-3 times faster than 'find /path -type d'.
>>
>> This seems like it might be a bug in gnulib's fts implementation. How
>> does 'oldfind /path -type d' perform? oldfind has the advantage of not
>> using fts, so if it performs better, then there is a hole where we need to
>> improve gnulib's fts to make directory-only or non-directory-only
>> traversals use d_type for optimization.
>
> I believe that at least a large part of the problem may be the fact
> that some files are stat()ed by both fts (as part of the traversal
> logic) and find. However, this hasn't been fully investigated.
>
> If someone would be able to investigate and discover the places where
> duplicate *stat() calls are made, this would allow us to close the
> performance gap between oldfind and find. That and the performance
> gap between find and oldfind with "-execdir ...+" are IIRC the only
> reasons why we still build oldfind at all.
>
> So if someone would be able to investigate, I'd be very grateful.

There was indeed an opportunity for improvement.
find/fts would unnecessarily stat all non-directories whenever only
type-only predicates were used, even on a file system that has
dirent.d_type support.

I've fixed it with two changes. The first is in gnulib's fts.c, to
expose the d_type information, and the second is in find, to use that
newly available data.

Now, on a file system type, e.g., ext3 or ext4, that provides usable
dirent.d_type, a use like "find . -type d" will stat only the
directories. Before, it would stat everything.

I'll apply the gnulib/fts change after a little more testing.

Jim

From d7ccb70f88ce912616b2935c3757a7aae9f64d3b Mon Sep 17 00:00:00 2001
From: Jim Meyering <[EMAIL PROTECTED]>
Date: Fri, 28 Nov 2008 17:21:55 +0100
Subject: [PATCH] find: avoid redundant stat calls when dirent.d_type is usable

When fts provides usable dirent.d_type information, and the only
stat information required by find's predicates is the type information
normally found in stat.st_mode, skip the now-redundant stat calls.
This change is useful only with the very latest version of fts.c
from gnulib.
* find/ftsfind.c (find): Use dirent.d_type info from latest fts.
---
 find/ftsfind.c |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/find/ftsfind.c b/find/ftsfind.c
index 543b80f..bd81c91 100644
--- a/find/ftsfind.c
+++ b/find/ftsfind.c
@@ -472,8 +472,8 @@ consider_visiting(FTS *p, FTSENT *ent)
       || ent->fts_info == FTS_NS /* e.g. symlink loop */)
     {
       assert (!state.have_stat);
-      assert (!state.have_type);
-      state.type = mode = 0;
+      assert (state.type != 0);
+      mode = state.type;
     }
   else
     {
@@ -614,9 +614,9 @@ find(char *arg)
     {
       while ( (ent=fts_read(p)) != NULL )
         {
-          state.have_stat = false;
-          state.have_type = false;
-          state.type = 0;
+          state.have_stat = false; // FIXME: depends on FTS_NOSTAT
+          state.have_type = !!ent->fts_statp->st_mode;
+          state.type = state.have_type ? ent->fts_statp->st_mode : 0;
           consider_visiting(p, ent);
         }
       fts_close(p);
--
1.6.0.4.1101.g642f8
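
For anyone who wants to see the underlying idea in isolation, here is a minimal sketch of the optimization discussed in this thread: trust dirent.d_type when readdir() supplies it, and call lstat() only when it reports DT_UNKNOWN. This is not code from findutils or gnulib's fts; count_subdirs() and its nstat out-parameter are hypothetical names made up for illustration, and the sketch assumes a platform (such as Linux with glibc) whose struct dirent has a d_type member.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; exposes d_type and DT_* */
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper, not part of find or fts.
   Counts the immediate subdirectories of PATH, in the spirit of
   "find PATH -maxdepth 1 -mindepth 1 -type d".
   *NSTAT receives the number of lstat() calls that were needed.
   Returns -1 if PATH cannot be opened.  */
static long
count_subdirs (const char *path, long *nstat)
{
  assert (path != NULL && nstat != NULL);
  *nstat = 0;

  DIR *d = opendir (path);
  if (d == NULL)
    return -1;

  long ndirs = 0;
  struct dirent *e;
  while ((e = readdir (d)) != NULL)
    {
      if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
        continue;

      if (e->d_type != DT_UNKNOWN)
        {
          /* The file system told us the type: no stat call needed.  */
          if (e->d_type == DT_DIR)
            ndirs++;
          continue;
        }

      /* Fallback for file systems without usable d_type.  */
      char buf[4096];
      int n = snprintf (buf, sizeof buf, "%s/%s", path, e->d_name);
      if (n < 0 || (size_t) n >= sizeof buf)
        continue;               /* name too long for this sketch; skip it */

      struct stat st;
      (*nstat)++;
      if (lstat (buf, &st) == 0 && S_ISDIR (st.st_mode))
        ndirs++;
    }

  closedir (d);
  return ndirs;
}
```

Calling count_subdirs (".", &nstat) on a file system with usable d_type should leave nstat at 0, which is the same effect the patch is after for type-only predicates. On a file system that always reports DT_UNKNOWN, nstat would instead equal the number of entries examined.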
