Bug#544145: libc6-i686 - Segfaults on amd64/xen

2009-08-30 Thread Bastian Blank
reassign 544145 linux-2.6 2.6.30-1
thanks

On Sat, Aug 29, 2009 at 10:57:40PM +0200, Bastian Blank wrote:
 On Sat, Aug 29, 2009 at 11:31:43AM +0200, Bastian Blank wrote:
  Hmm, just below the dynlinker, we have the vdso.
 vdso was the point.

I'm pretty sure that it is a problem either in the kernel or in the
hypervisor.

 A check in the Xen source shows the following changeset, which is missing
 in this hypervisor:
 | user:Keir Fraser keir.fra...@citrix.com
 | date:Fri Dec 05 15:21:59 2008 +
 | files:   xen/arch/x86/x86_64/compat/entry.S
 | description:
 | x86/32on64: adjust address when converting syscall to fault
 Looks like the problem is caught. Will check this tomorrow.

That changeset was not the fix. The problem is also still present in Xen 3.4.1.

Bastian

-- 
It is more rational to sacrifice one life than six.
-- Spock, The Galileo Seven, stardate 2822.3



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#544145: libc6-i686 - Segfaults on amd64/xen

2009-08-29 Thread Bastian Blank
Package: libc6-i686
Version: 2.9-25
Severity: important

The last upgrade of my i386 chroot pulled in libc6-i686. After that all
binaries started to segfault:

| [   38.652407] sh[3131]: segfault at 0 ip (null) sp ffb9a138 error 14 in bash[8048000+be000]
| [  236.958701] ls[3650]: segfault at af7f22000 ip 000af7f22000 sp ff992390 error 14
| [  252.066863] ls[3651]: segfault at af7fde000 ip 000af7fde000 sp ff8745f0 error 14
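
For reference, the "error 14" field in these log lines is the x86 page-fault error code, which the kernel prints in hexadecimal. The following is a minimal sketch of decoding it. The struct and function names are hypothetical and chosen only for illustration; the bit layout is the one defined by the x86 architecture.

```c
#include <stdio.h>

/* x86 page-fault error code bits (architecture-defined). */
#define PF_PROT   0x01ul  /* 0: page not present, 1: protection violation */
#define PF_WRITE  0x02ul  /* 0: read access,      1: write access */
#define PF_USER   0x04ul  /* 0: kernel mode,      1: user mode */
#define PF_RSVD   0x08ul  /* 1: reserved bit set in a page-table entry */
#define PF_INSTR  0x10ul  /* 1: fault happened on an instruction fetch */

/* Hypothetical container for the decoded bits. */
struct pf_info {
    int protection;   /* 1 if protection violation, 0 if page not present */
    int write;
    int user;
    int reserved;
    int instr_fetch;
};

/* Split a page-fault error code into its individual flags. */
struct pf_info decode_pf_error(unsigned long code)
{
    struct pf_info info;
    info.protection  = (code & PF_PROT)  != 0;
    info.write       = (code & PF_WRITE) != 0;
    info.user        = (code & PF_USER)  != 0;
    info.reserved    = (code & PF_RSVD)  != 0;
    info.instr_fetch = (code & PF_INSTR) != 0;
    return info;
}

/* Print a one-line, human-readable summary of the error code. */
void print_pf_error(unsigned long code)
{
    struct pf_info i = decode_pf_error(code);
    printf("error %lx: %s, %s, %s mode%s\n", code,
           i.protection ? "protection violation" : "page not present",
           i.instr_fetch ? "instruction fetch" : (i.write ? "write" : "read"),
           i.user ? "user" : "kernel",
           i.reserved ? ", reserved bit set" : "");
}
```

Calling print_pf_error(0x14) reports a user-mode instruction fetch from a page that is not present, which fits the log lines above, where the fault address equals ip.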

Removing libc6-i686, or disabling it via /etc/ld.so.nohwcap, got the
system back into a usable state.

The system uses an amd64 kernel and userland on Xen.

Bastian

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (x86_64)

Kernel: Linux 2.6.31-rc8-amd64 (SMP w/4 CPU cores)
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Shell: /bin/sh linked to /bin/bash

Versions of packages libc6-i686 depends on:
ii  libc6 2.9-25 GNU C Library: Shared libraries

libc6-i686 recommends no packages.

libc6-i686 suggests no packages.
-- 
It is more rational to sacrifice one life than six.
-- Spock, The Galileo Seven, stardate 2822.3






Bug#544145: libc6-i686 - Segfaults on amd64/xen

2009-08-29 Thread Aurelien Jarno
On Sat, Aug 29, 2009 at 09:14:40AM +0200, Bastian Blank wrote:
 Package: libc6-i686
 Version: 2.9-25
 Severity: important
 
 The last upgrade of my i386 chroot pulled in libc6-i686. After that all
 binaries started to segfault:

This is because the Suggests was changed into a Recommends.

 | [   38.652407] sh[3131]: segfault at 0 ip (null) sp ffb9a138 error 14 in bash[8048000+be000]
 | [  236.958701] ls[3650]: segfault at af7f22000 ip 000af7f22000 sp ff992390 error 14
 | [  252.066863] ls[3651]: segfault at af7fde000 ip 000af7fde000 sp ff8745f0 error 14
 
 A removal of libc6-i686 or disabling via /etc/ld.so.nohwcap got the
 system into a usable state again.
 
 The system uses an amd64 kernel and userland on Xen.
 

This flavour works perfectly on a non-Xen system. Are there also some
requirements on the compilation of glibc on Xen amd64, like on i386?

-- 
Aurelien Jarno  GPG: 1024D/F1BCDB73
aurel...@aurel32.net http://www.aurel32.net






Bug#544145: libc6-i686 - Segfaults on amd64/xen

2009-08-29 Thread Bastian Blank
On Sat, Aug 29, 2009 at 10:33:24AM +0200, Aurelien Jarno wrote:
 On Sat, Aug 29, 2009 at 09:14:40AM +0200, Bastian Blank wrote:
  The system uses an amd64 kernel and userland on Xen.
 This flavour works perfectly on a non-Xen system. Are there also some
 requirements on the compilation of glibc on Xen amd64, like on i386?

Not that I'm aware of. But the memory layout may be different.

Debugging a live binary is impossible as it already breaks before the
debugging setup.

A core file shows:
| #0  0xf7fb1000 in ?? ()
| #1  0xf7e0bdc8 in _init () at /build/buildd-eglibc_2.9-25-i386-IXOntq/eglibc-2.9/build-tree/i386-i686/nptl/crti.S:24
This seems to come from /lib/i686/cmov/libpthread.so.0
| #2  0xf7fbf284 in call_init (l=0xf7dfd720, argc=1, argv=0xffa03094, env=0xffa0309c) at dl-init.c:70
| #3  0xf7fbf416 in _dl_init (main_map=0xf7fce670, argc=1, argv=0xffa03094, env=0xffa0309c) at dl-init.c:100
| #4  0xf7fb188f in _dl_start_user () from /lib/ld-linux.so.2
| (gdb) up
| #1  0xf7e0bdc8 in _init () at /build/buildd-eglibc_2.9-25-i386-IXOntq/eglibc-2.9/build-tree/i386-i686/nptl/crti.S:24
| 24      /build/buildd-eglibc_2.9-25-i386-IXOntq/eglibc-2.9/build-tree/i386-i686/nptl/crti.S: No such file or directory.
| in /build/buildd-eglibc_2.9-25-i386-IXOntq/eglibc-2.9/build-tree/i386-i686/nptl/crti.S
| Current language:  auto; currently asm
| (gdb) disassemble
| Dump of assembler code for function _init:
| 0xf7e0bdc0 <_init+0>:    push   %ebp
| 0xf7e0bdc1 <_init+1>:    mov    %esp,%ebp
| 0xf7e0bdc3 <_init+3>:    call   0xf7e0c380 <__pthread_initialize_minimal_internal>
| 0xf7e0bdc8 <_init+8>:    call   0xf7e0c2c0 <frame_dummy>
| 0xf7e0bdcd <_init+13>:   call   0xf7e18040 <__do_global_ctors_aux>
| 0xf7e0bdd2 <_init+18>:   pop    %ebp
| 0xf7e0bdd3 <_init+19>:   ret
| End of assembler dump.

| (gdb) disassemble 0xf7e0c2c0
| Dump of assembler code for function frame_dummy:
This function seems to come from gcc.
| 0xf7e0c2c0 <frame_dummy+0>:     push   %ebp
| 0xf7e0c2c1 <frame_dummy+1>:     mov    %esp,%ebp
| 0xf7e0c2c3 <frame_dummy+3>:     push   %ebx
| 0xf7e0c2c4 <frame_dummy+4>:     call   0xf7e0c230 <__i686.get_pc_thunk.bx>
| 0xf7e0c2c9 <frame_dummy+9>:     add    $0x11d2b,%ebx
| 0xf7e0c2cf <frame_dummy+15>:    sub    $0x4,%esp
| 0xf7e0c2d2 <frame_dummy+18>:    mov    -0x214(%ebx),%edx
| 0xf7e0c2d8 <frame_dummy+24>:    test   %edx,%edx
| 0xf7e0c2da <frame_dummy+26>:    je     0xf7e0c2f1 <frame_dummy+49>
| 0xf7e0c2dc <frame_dummy+28>:    mov    -0x28(%ebx),%edx
| 0xf7e0c2e2 <frame_dummy+34>:    test   %edx,%edx
| 0xf7e0c2e4 <frame_dummy+36>:    je     0xf7e0c2f1 <frame_dummy+49>
| 0xf7e0c2e6 <frame_dummy+38>:    lea    -0x214(%ebx),%eax
| 0xf7e0c2ec <frame_dummy+44>:    mov    %eax,(%esp)
| 0xf7e0c2ef <frame_dummy+47>:    call   *%edx
| 0xf7e0c2f1 <frame_dummy+49>:    add    $0x4,%esp
| 0xf7e0c2f4 <frame_dummy+52>:    pop    %ebx
| 0xf7e0c2f5 <frame_dummy+53>:    pop    %ebp
| 0xf7e0c2f6 <frame_dummy+54>:    ret
| 0xf7e0c2f7 <frame_dummy+55>:    nop
| 0xf7e0c2f8 <frame_dummy+56>:    nop
| 0xf7e0c2f9 <frame_dummy+57>:    nop
| 0xf7e0c2fa <frame_dummy+58>:    nop
| 0xf7e0c2fb <frame_dummy+59>:    nop
| 0xf7e0c2fc <frame_dummy+60>:    nop
| 0xf7e0c2fd <frame_dummy+61>:    nop
| 0xf7e0c2fe <frame_dummy+62>:    nop
| 0xf7e0c2ff <frame_dummy+63>:    nop
| End of assembler dump.

| (gdb) print {char[4]}0xf7fb1000
| $7 = "\177ELF"

Hmm, just below the dynlinker, we have the vdso. If I debug a binary
without using the i686/cmov libs, then I miss the symbols for the vdso,
so it looks like they are never used.
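
As a side note, the gdb check above (reading the "\177ELF" magic at the suspected address) can also be done from inside a process. This is a minimal sketch, assuming a glibc new enough to provide getauxval() (2.16 or later, so newer than the 2.9 discussed in this report). The helper names has_elf_magic and vdso_base are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>   /* getauxval(); pulls in AT_SYSINFO_EHDR via <elf.h> */

/* Return 1 if the four bytes at p are the ELF magic "\177ELF", else 0. */
int has_elf_magic(const unsigned char *p)
{
    return p != NULL && memcmp(p, "\177ELF", 4) == 0;
}

/*
 * Return the base address of the vdso image mapped into this process,
 * or NULL if the kernel did not pass AT_SYSINFO_EHDR in the aux vector.
 */
const unsigned char *vdso_base(void)
{
    return (const unsigned char *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
}
```

If vdso_base() returns a non-NULL pointer, has_elf_magic() on it is expected to return 1, which mirrors the print {char[4]} result above.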

Bastian

-- 
It would be illogical to assume that all conditions remain stable.
-- Spock, The Enterprise Incident, stardate 5027.3






Bug#544145: libc6-i686 - Segfaults on amd64/xen

2009-08-29 Thread Bastian Blank
On Sat, Aug 29, 2009 at 11:31:43AM +0200, Bastian Blank wrote:
 Hmm, just below the dynlinker, we have the vdso.

The vdso was the point. However, I don't know what the real problem is. A
minimal failing program is:

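| int main() {
|   unsigned int resultvar;
|   /* Load syscall number 0 into %eax, then call through the
|      vsyscall entry pointer that glibc keeps at %gs:0x10. */
|   asm volatile (
|   "movl %1, %%eax\n\t"
|   "call *%%gs:0x10\n\t"
|   : "=a" (resultvar)
|   : "i" (0) : "memory", "cc");
| }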

This just calls syscall 0 through the vdso, and on my system it
produces the following:

| (gdb) run
| Starting program: /test 
| 
| Program received signal SIGSEGV, Segmentation fault.
| 0xf7fdf42f in __kernel_vsyscall ()
| (gdb) bt
| #0  0xf7fdf42f in __kernel_vsyscall ()
| #1  0xf7fd6ff4 in ?? () from /lib/libc.so.6
| #2  0xf7eb07a5 in __libc_start_main (main=0x8048394 <main>, argc=1, ubp_av=0xd884, init=0x80483d0 <__libc_csu_init>, fini=0x80483c0 <__libc_csu_fini>, rtld_fini=0xf7fee6e0 <_dl_fini>, stack_end=0xd87c) at libc-start.c:222
| #3  0x08048301 in _start () at ../sysdeps/i386/elf/start.S:119
| (gdb) disassemble
| Dump of assembler code for function __kernel_vsyscall:
| 0xf7fdf420 <__kernel_vsyscall+0>:    push   %ebp
| 0xf7fdf421 <__kernel_vsyscall+1>:    mov    %ecx,%ebp
| 0xf7fdf423 <__kernel_vsyscall+3>:    syscall
| 0xf7fdf425 <__kernel_vsyscall+5>:    mov    $0x2b,%ecx
| 0xf7fdf42a <__kernel_vsyscall+10>:   mov    %ecx,%ss
| 0xf7fdf42c <__kernel_vsyscall+12>:   mov    %ebp,%ecx
| 0xf7fdf42e <__kernel_vsyscall+14>:   pop    %ebp
| 0xf7fdf42f <__kernel_vsyscall+15>:   ret
| End of assembler dump.
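
For context on the "call *%gs:0x10" in the test program: on i386, glibc keeps the vsyscall entry address in the sysinfo field of the thread control block, which %gs points at. The following is a sketch that models the first fields of that structure with fixed 32-bit members so the offset can be checked on any host. The struct name is hypothetical and this is not glibc's actual declaration.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the leading fields of the i386 thread control block header.
 * Pointers are written as uint32_t because they are 4 bytes wide on i386;
 * this keeps the offsets identical when compiled on a 64-bit host.
 */
struct i386_tcbhead_model {
    uint32_t tcb;               /* pointer to the TCB itself */
    uint32_t dtv;               /* dynamic thread vector */
    uint32_t self;              /* pointer to the thread descriptor */
    int32_t  multiple_threads;
    uint32_t sysinfo;           /* address of __kernel_vsyscall */
};

/* Offset of the sysinfo field, i.e. the displacement used with %gs. */
size_t sysinfo_offset(void)
{
    return offsetof(struct i386_tcbhead_model, sysinfo);
}
```

Four 4-byte fields precede sysinfo, so sysinfo_offset() evaluates to 0x10, the displacement used in the inline assembly.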

 If I debug a binary without using the i686/cmov libs, then I miss the
 symbols for the vdso, so it looks like they are never used.

These binaries don't use the vdso in the INLINE_SYSCALL macro.

A check in the Xen source shows the following changeset, which is missing
in this hypervisor:

| user:Keir Fraser keir.fra...@citrix.com
| date:Fri Dec 05 15:21:59 2008 +
| files:   xen/arch/x86/x86_64/compat/entry.S
| description:
| x86/32on64: adjust address when converting syscall to fault
| 
| The faulting address is at the start of the syscall instruction rather
| than at the following one.
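
The address adjustment described in that changeset comes down to the length of the syscall instruction, which is encoded in two bytes (0x0f 0x05). This is a sketch of the arithmetic only, not the Xen code; the function and constant names are made up for illustration.

```c
#include <stdint.h>

/* The x86 SYSCALL instruction is encoded as the two bytes 0x0f 0x05. */
enum { SYSCALL_INSN_LEN = 2 };

/*
 * Given the instruction pointer saved after a syscall instruction executed
 * (which points at the following instruction), return the address of the
 * syscall instruction itself, where a converted fault should be reported.
 */
uint32_t syscall_fault_address(uint32_t ip_after_syscall)
{
    return ip_after_syscall - SYSCALL_INSN_LEN;
}
```

With the __kernel_vsyscall listing above, the instruction after syscall is at 0xf7fdf425, so the adjusted address is 0xf7fdf423, the syscall instruction itself.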

Looks like the problem is caught. Will check this tomorrow.

Bastian

-- 
Sometimes a feeling is all we humans have to go on.
-- Kirk, A Taste of Armageddon, stardate 3193.9


