Pádraig Brady <[email protected]> writes:

> On 09/01/2026 07:19, Collin Funk wrote:
>> The 'readlink' and 'realpath' programs have an uncommon case where they
>> can run for a very long time. When canonicalizing file names longer than
>> PATH_MAX, we have to call 'openat' for each directory up the tree until
>> we reach root which takes a long time. Here is an example of the current
>> behavior:
>>      $ mkdir -p $(yes a/ | head -n $((32 * 1024)) | tr -d '\n')
>>      $ while cd $(yes a/ | head -n 1024 | tr -d '\n'); do :; \
>>          done 2>/dev/null
>>      $ pwd | tr '/' '\n' | wc -l
>>      32771
>>      $ env time --format=%E readlink -f $(yes . | head -n 5) > /dev/full
>>      readlink: write error: No space left on device
>>      Command exited with non-zero status 1
>>      0:59.72
>>      $ env time --format=%E realpath $(yes . | head -n 5) > /dev/full
>>      realpath: write error: No space left on device
>>      Command exited with non-zero status 1
>>      1:00.32
>> It is better to exit as soon as there is an error writing to standard
>> output:
>>      $ env time --format=%E readlink -f $(yes . | head -n 5) > /dev/full
>>      readlink: write error: No space left on device
>>      Command exited with non-zero status 1
>>      0:11.88
>>      $ env time --format=%E realpath $(yes . | head -n 5) > /dev/full
>>      realpath: write error: No space left on device
>>      Command exited with non-zero status 1
>>      0:12.04
>
> Yes this is worth doing, as even though these commands can't run indefinitely,
> they can run for a while, and the very simple code addition is worth it.
> Consider also the time difference with something like:
>
>   yes | xargs sh -c 'realpath -ms "$@" || exit 255' >/dev/full
>
> Since this is just a perf thing there is no real test I can think of.

Finding the effective argument-length limit by binary search, rather than
with 'getconf ARG_MAX' (which overstates what exec will actually accept),
we get:

    $ realpath $(seq 146961) > /dev/zero 
    $ realpath $(seq 146962) > /dev/zero 
    bash: realpath: Argument list too long
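For reference, that kind of binary search can be scripted. This is only a
sketch, not what I actually ran; the bounds and the use of nonexistent
names like '1'..'N' (which default-mode realpath accepts, since only the
last component may be missing) are my assumptions:

```shell
# Binary-search the largest N for which `realpath 1..N` still execs,
# rather than trusting `getconf ARG_MAX`, which does not account for
# environment and per-argument pointer overhead.  Bounds are assumptions.
lo=1 hi=1000000
while [ $((hi - lo)) -gt 1 ]; do
  mid=$(( (lo + hi) / 2 ))
  if realpath $(seq "$mid") >/dev/null 2>&1; then
    lo=$mid           # exec succeeded; the limit is at least $mid
  else
    hi=$mid           # "Argument list too long" (or realpath failed)
  fi
done
echo "$lo"
```

The printed number will vary with the environment size, since the kernel
charges environment strings against the same limit.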

If you pass the deep directory in my example 146961 times, it would take
around 20.5 days to execute (146961 arguments at roughly 12 seconds each
is about 1.77 million seconds).

Of course no one will do that unintentionally, but it felt worth
handling for more reasonable invocations, even if they are still
uncommon.

I assume most systems won't allow directories that deep, so I agree that
it isn't really testable.
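The shape of the fix itself is simple; here is a shell analogue (a sketch
only, not the actual C change in coreutils, and print_all is a name I
made up): stop at the first failed write instead of attempting, and
failing on, every remaining operand.

```shell
# Sketch of the early-exit idea: process operands in order, but give up
# as soon as a write to stdout fails, skipping the remaining (possibly
# expensive) operands entirely.
print_all () {
  for f in "$@"; do
    if ! printf '%s\n' "$f"; then
      echo 'write error' >&2
      return 1        # exit early; do not touch the remaining operands
    fi
  done
}
print_all a b c
```

Redirected to /dev/full, this stops after the first failed write rather
than after canonicalizing every operand, which is where the time savings
above come from.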

> Though that got me thinking that commands with --files0-from
> can have indefinite output, so I adjusted du and wc to output early
> and added tests in the attached.

Nice catch. I pushed my patch. I'll let you decide if you want to merge
the NEWS entries into one, or leave them separate to describe the
rationale for each case.

Collin
