Hello,
[Please CC the replies to me, as I am not subscribed to this list]
While investigating recent FTBFS bug reports [0,1] I have come to the conclusion that something is wrong with either the dynamic linker ld-linux.so.2 or the way the kernel handles certain mmap() calls (at least on sparc64, possibly on i386 as well). Below I illustrate debugging the problem on a sparc64 machine (up-to-date unstable chroot, kernel 2.6.8-1-sparc64, libc6 2.3.2.ds1-18) using this test example:
char a[134084860];
int main() { return 0; }
compiled into an a.out executable. Running 'ld-linux.so.2 ./a.out' under gdb and looking at /proc/<pid>/maps, I see (irrelevant paths and whitespace removed for brevity):
08000000-0801a000 r-xp 00000000 08:11 319415 ld-2.3.2.so
08028000-0802a000 rwxp 00018000 08:11 319415 ld-2.3.2.so
efffe000-f0000000 rw-p efffe000 00:00 0
So the dynamic linker, which is the executable being run here, is mapped starting at 0x8000000. I then continue execution until the SIGILL is caught. After that, /proc/<pid>/maps looks like this:
00010000-00012000 r-xp 00000000 08:11 458670 a.out
00020000-00024000 rwxp 00000000 08:11 458670 a.out
00024000-08002000 rwxp 00024000 00:00 0
08002000-0801a000 r-xp 00002000 08:11 319415 ld-2.3.2.so
08028000-0802a000 rwxp 00018000 08:11 319415 ld-2.3.2.so
efffe000-f0000000 rw-p efffe000 00:00 0
As the second listing shows, mapping ./a.out into memory has overwritten the region 08000000-08002000 (which contained executable code!) with zeroes, producing the SIGILL. This picture correlates nicely with the result of running it under strace:
execve("/usr/lib/debug/ld-linux.so.2", ["/usr/lib/debug/ld-linux.so.2",
"./a.out"], [/* 16 vars */]) = 0
uname({sys="Linux", node="kundera", ...}) = 0
brk(0) = 0x802a000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
open("./a.out", O_RDONLY) = 3
read(3, "\177ELF\1\2\1\0\0\0\0\0\0\0\0\0\0\2\0\2\0\0\0\1\0\1\3P"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=19133, ...}) = 0
getcwd("/root", 128) = 6
mmap(0x10000, 8192, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x10000
mmap(0x20000, 16384, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED, 3,
0) = 0x20000
mmap(0x24000, 134077000, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x24000
close(3) = 0
open("/etc/ld.so.preload", O_RDONLY) = -1 ENOENT (No such file or directory)
--- SIGILL (Illegal instruction) @ 0 (0) ---
+++ killed by SIGILL +++
So the million-dollar question is: whose fault is it? I see two possibilities: either ld-linux.so.2 is supposed to make sure that there is enough memory available for the mapping but fails to do so for some reason, or this check is supposed to be performed by the kernel and the mmap call above should not succeed. One point of view, presented by Richard Mortimer in [2] and based on the POSIX specification of mmap, leads to the conclusion that the kernel is not at fault here; it just follows the POSIX-defined behaviour. I am clearly not an expert on the issue, so any information you can provide will be greatly appreciated. The offending mmap call, as far as I can tell, comes from line 1146 in elf/dl-load.c:
mapat = __mmap ((caddr_t) zeropage, zeroend - zeropage,
                c->prot, MAP_ANON|MAP_PRIVATE|MAP_FIXED,
                ANONFD, 0);
[0] http://bugs.debian.org/268450
[1] http://lists.debian.org/debian-sparc/2004/12/msg00009.html
[2] http://marc.theaimsgroup.com/?l=linux-sparc&m=110220197504985&w=2
Best regards,
Jurij Smakov
Key: http://www.wooyd.org/pgpkey/ KeyID: C99E03CC

