I noticed large realloc()s still being done by fread_file() for the
common case of a regular file. As well as being inefficient,
it could needlessly fail for large files.
$ truncate -s1M t.f
$ ltrace ./a.out < t.f
__libc_start_main(0x8048624, 1, 0xbfd86c34, 0x8048940, 0x8048930 <unfinished
...>
fileno(0x7b7420) = 0
__fxstat(3, 0, 0xbfd86acc) = 0
ftello(0x7b7420, 0xbfd86acc, 0xb78c22ac, 0xf63d4e2e, 3) = 0
malloc(1048577) = 0xb77c0008
fread(0xb77c0008, 1, 1048576, 0x7b7420) = 1048576
realloc(0xb77c0008, 1572865) = 0xb763f008
fread(0xb773f008, 1, 524288, 0x7b7420) = 0
__errno_location() = 0xb78c1688
ferror(0x7b7420) = 0
realloc(0xb763f008, 1048577) = 0xb763f008
After applying the attached patch I get:
$ ltrace ./a.out < t.f
__libc_start_main(0x8048624, 1, 0xbfa14994, 0x8048940, 0x8048930 <unfinished
...>
fileno(0x7b7420) = 0
__fxstat(3, 0, 0xbfa1482c) = 0
ftello(0x7b7420, 0xbfa1482c, 0xb783a2ac, 0xf63d4e2e, 3) = 0
malloc(1048578) = 0xb7738008
fread(0xb7738008, 1, 1048577, 0x7b7420) = 1048576
__errno_location() = 0xb7839688
ferror(0x7b7420) = 0
On a related note I was wondering if we should fall back
to increasing the buffer by +=BUFSIZ rather than *=2
when we get ENOMEM? If we do then we'd need to be more careful
with the overflow checking with lines like the following:
if (size + BUFSIZ + 1 > alloc)
cheers,
Pádraig.
diff --git a/lib/read-file.c b/lib/read-file.c
index 0a15c5a..4ea38f7 100644
--- a/lib/read-file.c
+++ b/lib/read-file.c
@@ -65,7 +65,8 @@ fread_file (FILE * stream, size_t * length)
return NULL;
}
- alloc = alloc_off + 1;
+ /* 1 > file size so eof immediately detected. */
+ alloc = alloc_off + 2;
buf = malloc (alloc);
if (!buf)
@@ -120,8 +121,10 @@ fread_file (FILE * stream, size_t * length)
if (ferror (stream))
break;
- /* Shrink the allocated memory if possible. */
- if (size + 1 < alloc)
+ /* Shrink the allocated memory if possible,
+ but don't bother about the extra byte
+ allocated to detect the eof. */
+ if (size < alloc - 2)
{
char *smaller_buf = realloc (buf, size + 1);
if (smaller_buf != NULL)