On Monday, August 18th, 2025 at 09:27, Bernhard Voelker 
<m...@bernhard-voelker.de> wrote:

> I've rolled the change into a proper Git commit and would push it like
> that in your name (unless you don't want that, or want another change).
> Good to go?

I haven't been able to identify another change yet! danny mcClanahan 
<dmc2@amass.energy> is fine for attribution.

> > As a side note, I have been working on a library to perform directory
> > traversal and really wished I'd looked at `find` more closely earlier.
> > In particular the discussion of symbolic link handling, stat
> > optimization with -noleaf, and the clever error behavior from
> > -ignore_readdir_race all help to resolve some questions I'd been
> > having difficulty with.
>
> Other implementations using the gnulib FTS module are e.g. in coreutils: 
> du(1), rm(1), cp(1) and
> chmod(1). There might be interesting cases for you as well.

Thanks for identifying the FTS module! I've been learning about autotools 
recently, as I am working on a build tool, but I am less familiar with gnulib 
at the moment and hadn't parsed everything used by findutils.

...[changing subject]...

I wasn't sure earlier, but it now seems appropriate to mention: one of the 
problems I've been spending a truly inordinate amount of time on is 
multithreaded directory traversal, particularly while incorporating ignore 
patterns and deduplicating overlapping symlink targets ("deduplicating" is 
difficult to even define consistently in the presence of path-based filtering). 
On top of that, I also want to construct an in-memory VFS representation which 
can be updated in response to e.g. inotify events. One use case for this would 
be to update source code indices (I mention an example application at the end 
of my emacsconf talk last year: https://emacsconf.org/2024/talks/regex/).
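
To make the deduplication problem concrete, here is a minimal single-threaded 
sketch in Python. It is not the library I mentioned, and `walk_dedup` is a 
hypothetical name chosen for illustration. It assumes a POSIX-like filesystem 
where (st_dev, st_ino) identifies a directory, follows symlinks, and reports 
each directory only under the first path that reaches it:

```python
import os

def walk_dedup(root):
    """Yield each directory reachable from root once, following symlinks.

    Directories are identified by (st_dev, st_ino), so two paths that
    resolve to the same directory are reported only for the first path seen.
    """
    seen = set()
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            st = os.stat(path)  # follows symlinks
        except OSError:
            continue  # entry vanished or is unreadable: skip it
        key = (st.st_dev, st.st_ino)
        if key in seen:
            continue  # already visited via another path (or a symlink loop)
        seen.add(key)
        yield path
        try:
            with os.scandir(path) as entries:
                children = [e.path for e in entries
                            if e.is_dir(follow_symlinks=True)]
        except OSError:
            continue
        # Reverse-sorted push so that pops come out in ascending name order.
        stack.extend(sorted(children, reverse=True))
```

Even in this simplified form, which path "wins" for a given directory depends 
on visiting order. With several threads that order is no longer fixed, and 
with path-based filters the winning path may be one that a filter would have 
excluded, which is why I find "deduplicating" hard to define consistently.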

I suspect that ignore patterns and notify events are out of scope for findutils 
(and are not terribly difficult to achieve with a small shell script invoking 
find and other utilities). However, I'd be curious to know if multithreaded 
directory traversal would ever be in scope for the find and/or updatedb 
commands (I haven't looked at updatedb yet). I suspect it would introduce a 
great deal of complexity and nondeterminism into a tool like find, which 
otherwise upholds such precise guarantees about its behavior.

Given all of that, is there a project that might be interested in such a 
feature? Or is there a way to limit the scope of such a feature that would make 
it useful for find itself?

Thanks again for your time, and please don't feel obliged to reply at length. 
I fixed a deadlock and other issues in the Rust "ignore" library used by 
ripgrep a while ago, and I would like parallel traversal to be available to 
portable C utilities as well.

Have a great day,
danny
