On 11/10/2025 21:21, Collin Funk wrote:
Pádraig Brady <[email protected]> writes:
On 11/10/2025 19:50, Collin Funk wrote:
Collin Funk <[email protected]> writes:
$ timeout 10 sh -c \
"(ulimit -v $s && uutils base64 < /dev/zero >out 2>err)"
$ cat err && wc -c out
base64: out of memory
0 out
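For comparison, base64 itself needs no large buffer: every 3 input bytes map to 4 output characters independently of the rest of the input. A minimal Python sketch of a bounded-memory encoder follows. The function name and chunk size are made up for illustration, this is not how either implementation is written, and it omits the 76-column line wrapping that base64(1) does by default.

```python
import base64
import io

CHUNK = 3 * 1024  # illustrative read size; any positive value works


def stream_base64(src, dst, chunk_size=CHUNK):
    """Encode binary stream src to dst as base64 in O(chunk_size) memory."""
    carry = b""  # 0..2 bytes left over from the previous read
    while True:
        buf = src.read(chunk_size)
        if not buf:
            break
        buf = carry + buf
        # Only encode whole 3-byte groups, so '=' padding never
        # appears in the middle of the output, even after short reads.
        usable = len(buf) - len(buf) % 3
        dst.write(base64.b64encode(buf[:usable]))
        carry = buf[usable:]
    # Encode the final 0..2 bytes; this is where padding may appear.
    dst.write(base64.b64encode(carry))


if __name__ == "__main__":
    out = io.BytesIO()
    stream_base64(io.BytesIO(b"hello, world"), out)
    print(out.getvalue().decode())
```

Memory use is bounded by the chunk size plus at most two carried bytes, regardless of input length.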
In case anyone is looking for work to do, there are some programs in GNU
Coreutils that have this issue. For example, 'basenc --base58' requires
that the file fits into memory. It would be nice to fix those where
possible.
The 'basenc --base58' case, if I remember correctly, is because of the
gmp functions we use. If we could fix it without sacrificing the
speedups from gmp, that would be great.
It also would be a good idea to add bounded-memory.sh tests to other
programs.
I definitely would not have added a non streamable base58 implementation
if a streamable one was supported. Unfortunately the base conversion
operation used is not streamable.
For example, see the completely different output below for very similar inputs:
$ printf '|%*s|' 100 | basenc --base58
2cypwC9WHm46BUbDVHDDtpoEE8HyXVnVw3ytMvBLww1vipUBejw9HGLm3yRvvJstDJqfW11X7nyp
xsuaWQZgUD4KWXbfomaCWbqNnoejTfzhJqVGbTFU9iHzWsfdy6eZXH3pcT2a9G2w
$ printf '|%*s|' 101 | basenc --base58
89opaCCYGySdsEn27pYwujy3NwBvVc3phaA5hKif3TR67aWz1aVyYniDS852zHub3Kjnw13Hvywp
hdh4to1FD6DehL42HRjXoekXBAuYQj7euj1mq4rrqTFzrrdYAbvnYdqT9TswHTEZR
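To illustrate why the conversion is not streamable, here is a rough Python sketch of base58 as a big-integer base conversion. It assumes the common Bitcoin-style alphabet and leading-zero convention, and it is not the gmp-based code in coreutils. The whole input is read as one number, so every output digit depends on every input byte, whereas base64 output for a shared input prefix stays identical.

```python
import base64

# Assumed alphabet: the widely used Bitcoin-style base58 alphabet.
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    """Encode data as base58 by treating it as one big-endian integer."""
    # Leading zero bytes carry no numeric value; each becomes a '1'.
    n_zeros = len(data) - len(data.lstrip(b"\0"))
    num = int.from_bytes(data, "big")
    digits = []
    while num:
        # Repeated division by 58 yields digits least significant first.
        num, rem = divmod(num, 58)
        digits.append(ALPHABET[rem])
    return "1" * n_zeros + "".join(reversed(digits))


def common_prefix_len(a, b) -> int:
    """Length of the longest common prefix of two sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


if __name__ == "__main__":
    a = b"|" + b" " * 100 + b"|"
    b = b"|" + b" " * 101 + b"|"
    print("base64 shared prefix:",
          common_prefix_len(base64.b64encode(a), base64.b64encode(b)))
    print("base58 shared prefix:",
          common_prefix_len(b58encode(a), b58encode(b)))
```

Because the first output digit comes from the final division, nothing can be emitted until the last input byte has been read.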
Thanks for the explanation.
It's not a huge problem for base58 anyway because it's designed for
small amounts of data where unambiguous transcription is important.
Yep, that is what I figured from your description of what base58 was
used for when you added it. I think streamable input is already handled
in all the programs where it is important. For example, 'cksum' is
obviously important.
It is still a nicety for all programs, in my opinion.
Absolutely it's important to minimize memory usage where possible.
That reminds me of the many memory improvements in coreutils 8.22:
https://www.pixelbeat.org/programming/avoiding_large_buffers.html
Mentioned there are potential memory improvements in sort.
Currently it can both over allocate and under allocate,
so would benefit from a more dynamic allocation mechanism.
An example of under allocation is where threading is not enabled
due to too little memory being allocated when reading from a pipe.
(A pipe is treated like a small file which is wrong).
See e.g. https://superuser.com/a/938634/11613
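As a rough illustration of the pipe problem (a Python sketch with a made-up helper name, not what sort.c does), the size reported by fstat is only meaningful for regular files, so a pipe is better reported as "unknown" than as small:

```python
import os
import stat
import tempfile


def input_size_hint(fd):
    """Return the byte size for a regular file, or None if unknown."""
    st = os.fstat(fd)
    if stat.S_ISREG(st.st_mode):
        return st.st_size
    # Pipes, sockets and ttys: st_size says nothing about how much
    # data will eventually arrive, so do not size buffers from it.
    return None


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"x" * 1000)
    print("pipe:", input_size_hint(r))  # prints "pipe: None"
    os.close(r)
    os.close(w)

    with tempfile.TemporaryFile() as f:
        f.write(b"x" * 1000)
        f.flush()
        print("file:", input_size_hint(f.fileno()))  # prints "file: 1000"
```

A caller that gets None could start from a default buffer and grow it as data arrives, rather than committing up front to a size that is too small to enable threading.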
Over allocation based on a percentage of total system memory
also has various issues, like not being container aware,
being too aggressive in general as it may trash caches,
and also edge cases like --compress-program (fork) failing due to
the process being too big.
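To make the container point concrete, here is a hypothetical sizing policy sketched in Python. The function names and the 0.5 fraction are invented for illustration, and the default path assumes a cgroup v2 hierarchy mounted at /sys/fs/cgroup; none of this is taken from sort. The idea is to cap the budget by the cgroup memory limit, when there is one, instead of looking only at total system memory.

```python
def parse_cgroup_limit(text):
    """Parse cgroup v2 memory.max content: 'max' (no limit) or bytes."""
    text = text.strip()
    return None if text == "max" else int(text)


def read_cgroup_limit(path="/sys/fs/cgroup/memory.max"):
    """Return the cgroup memory limit in bytes, or None if unavailable."""
    try:
        with open(path) as f:
            return parse_cgroup_limit(f.read())
    except (OSError, ValueError):
        return None  # not cgroup v2, not Linux, or unreadable


def buffer_budget(total_ram, available_ram, cgroup_limit=None, fraction=0.5):
    """Hypothetical policy: a fraction of the effective memory ceiling,
    but never more than what is currently available."""
    ceiling = total_ram
    if cgroup_limit is not None:
        ceiling = min(total_ram, cgroup_limit)
    return int(min(available_ram, ceiling * fraction))


if __name__ == "__main__":
    GiB = 1 << 30
    # 16 GiB host, 8 GiB available, container limited to 1 GiB.
    print(buffer_budget(16 * GiB, 8 * GiB, cgroup_limit=1 * GiB))
    # Same host, with whatever limit applies to this process.
    print(buffer_budget(16 * GiB, 8 * GiB, cgroup_limit=read_cgroup_limit()))
```

A fixed fraction is still a blunt instrument; it does not address the cache or fork concerns above, which is part of why a more dynamic mechanism would help.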
cheers,
Padraig