On 31/03/2026 19:28, Bernhard Voelker wrote:

A simple (wrong) command line like 'yes /dev/null | wc --files0-from=-' will surely lead to OOM, because the input is not '\0'-separated. Other tools like du(1) and sort(1) suffer from the same issue, and surely more tools, like 'find -files0-from=-' from findutils, are in the same boat.

The digesting of input file names is done by the gnulib argv-iter module, which in turn uses getdelim() to read, realloc'ing the memory endlessly.

My point is: for the --files0-from option, we definitely know that a returned file name cannot usefully be longer than PATH_MAX. Therefore, the memory for parsing the file should never need to grow beyond that, and we could fail early with ENAMETOOLONG, and possibly skip such an overly long entry in the file list.

Where in the call chain tool -> argv_iter -> getdelim could we make a better cut to fail/skip early for bad entries longer than PATH_MAX? The change will definitely be in gnulib, but I wanted to discuss this from the utilities' side first.

FWIW: I'm not sure whether we have other utilities that digest '\0'-delimited input and where we know that a useful iteration item can never be longer than N. If we find a nice solution, we might be able to make it more generic and read up to such a given N.
Good point. Without looking, it sounds like argv-iter should use getndelim2 rather than getdelim, as the former supports specifying a maximum length. Now, PATH_MAX isn't always defined, but we could use some sensible upper bound if not.

cheers,
Padraig
