On Sat, 28 Mar 2026 at 04:08, David Laight <[email protected]> wrote:
>
> On Fri, 27 Mar 2026 17:29:21 -0700
> Linus Torvalds <[email protected]> wrote:
> > The trivial cases don't even matter, because all the costs of execve()
> > are elsewhere for those cases.
> >
> > But the cases where the strings *do* matter, they are many and long.
>
> Is that the strncpy_from_user() path?
No. For annoying reasons, execve() mainly uses "strnlen_user()"
followed by "copy_from_user()".
See fs/exec.c: copy_strings().
The reason is that it needs to know the size of the string before it
can start copying it, because the destination address will depend on
it.
And yes, it's racy, and yes, if you modify the arguments or the
environment while an execve() is going on, you get what you deserve
(but it's not a security issue, it's just a "resulting argv[] array is
odd", but you could have made it odd in the first place, so whatever).
It would be lovely to be able to do it in one go and not walk the
source string twice, but that's sadly not how the execve() interface
works (or somebody would need to come up with a clever trick).
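To make the two-pass pattern concrete, here is a minimal user-space sketch. It is
not the kernel's copy_strings(): bounded_len() stands in for strnlen_user(),
memcpy() stands in for copy_from_user(), and struct arg_area and push_string()
are hypothetical names made up for illustration. It assumes strings are packed
downward from the top of a fixed area, so each string's start address is only
known once its length has been measured.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical destination area; strings are packed downward from the top. */
struct arg_area {
	char *buf;	/* base of the area */
	size_t pos;	/* current top; the next string ends just below it */
};

/* Pass 1 helper: length without the NUL, or 'max' if no NUL was found. */
static size_t bounded_len(const char *src, size_t max)
{
	size_t i;

	for (i = 0; i < max; i++)
		if (src[i] == '\0')
			return i;
	return max;
}

/* Returns the address of the copied string, or NULL if it does not fit. */
static char *push_string(struct arg_area *a, const char *src, size_t max)
{
	/* Pass 1: walk the source once just to measure it. */
	size_t len = bounded_len(src, max);

	if (len == max)
		return NULL;	/* unterminated within 'max' bytes */
	len++;			/* account for the terminating NUL */
	if (len > a->pos)
		return NULL;	/* not enough room left */

	/* Pass 2: the destination depends on the length, so copy only now. */
	a->pos -= len;
	memcpy(a->buf + a->pos, src, len);
	return a->buf + a->pos;
}
```

If the source changes between the two passes, the copied bytes may not match
what was measured; this sketch does nothing about that, which is the "racy"
part described above.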
The main user of strncpy_from_user() is the path copying: see the
'getname' variations in fs/namei.c.
And sometimes pathnames are short, but we had a semi-recent discussion
about the distribution of pathname lengths, prompted by some allocation
optimizations:
https://lore.kernel.org/all/cagudohemjwcolep+tdkljuguhekn9+e+azwfkyk_syptzy8...@mail.gmail.com/
so while short names are common, longer names aren't *uncommon*, and
loads that use them tend to keep using them.
We ended up aiming for ~128 bytes for the initial allocation
(EMBEDDED_NAME_MAX is 168 in one common config) for that reason.
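As a rough illustration of why the initial size matters, here is a user-space
sketch of the "small embedded buffer first, bigger allocation if the name does
not fit" idea. This is not the getname code in fs/namei.c: struct name_buf,
copy_bounded(), name_buf_init() and name_buf_release() are hypothetical names,
and the 128 and 4096 constants are assumptions picked to mirror the ~128 byte
figure above. Unlike the execve() case, the destination is known up front, so
the common short-name case needs only a single pass over the source.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_EMBEDDED_MAX	128	/* assumed inline size */
#define SKETCH_PATH_MAX		4096	/* assumed upper bound */

struct name_buf {
	char *name;	/* points at inline_buf or at a heap buffer */
	char inline_buf[SKETCH_EMBEDDED_MAX];
};

/*
 * Copy and measure in one pass, in the spirit of strncpy_from_user():
 * returns the length without the NUL, or 'max' if no NUL fit in 'max' bytes.
 */
static size_t copy_bounded(char *dst, const char *src, size_t max)
{
	size_t i;

	for (i = 0; i < max; i++) {
		dst[i] = src[i];
		if (src[i] == '\0')
			return i;
	}
	return max;
}

/* Returns 0 on success, -1 if the name is too long or allocation fails. */
static int name_buf_init(struct name_buf *nb, const char *src)
{
	char *big;

	nb->name = NULL;

	/* Fast path: short names land in the embedded buffer. */
	if (copy_bounded(nb->inline_buf, src, SKETCH_EMBEDDED_MAX)
	    < SKETCH_EMBEDDED_MAX) {
		nb->name = nb->inline_buf;
		return 0;
	}

	/* Slow path: the name did not fit, so retry into a larger buffer. */
	big = malloc(SKETCH_PATH_MAX);
	if (!big)
		return -1;
	if (copy_bounded(big, src, SKETCH_PATH_MAX) == SKETCH_PATH_MAX) {
		free(big);
		return -1;
	}
	nb->name = big;
	return 0;
}

static void name_buf_release(struct name_buf *nb)
{
	/* Only the slow path owns heap memory. */
	if (nb->name != nb->inline_buf)
		free(nb->name);
	nb->name = NULL;
}
```

The larger the embedded buffer, the more names stay on the fast path, at the
cost of a bigger initial allocation for every name.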
Don't get me wrong: there are certainly many other users of
strnlen_user() and strncpy_from_user(), but the ones I've seen in any
half-way normal loads are those two: execve() and pathname copying.
> I started looking at this because someone was trying to write the
> 'bit-masking' version for (possibly) RISC-V and I decided that they
> weren't making a good job of it and that it probably wasn't worth
> while (since x86-64 just uses the byte code).
Ok.
I do think that in user space, strlen() and friends can be absolutely
critical for some loads, because the C string model is horrible.
But in the kernel, I really don't think any of this matters. Our
strlen() is bad not because it's bad - it's bad because nobody really
should *care*.
Some of our "rep scas" users have been kept around exactly because
absolutely nobody cares, and it's a cute remnant of a very naive young
Linus who was using them because he was trying to learn things about
his new i80386 CPU, and started a whole small hobby project as a
result...
Linus