Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Helge Deller

On 7/17/23 22:21, Michael Tokarev wrote:

17.07.2023 22:58, Helge Deller wrote:

This patch seems to work. Tested with qemu-arm and qemu-amd64.


Wow!


diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a26200d9f3..b583018591 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3615,6 +3631,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+    /*
+ * adjust brk address if the interpreter was loaded above the main
+ * executable, e.g. happens with static binaries on armhf


Guess you mean dynamic binaries?  the klibc binaries we used are dynamic, no?


Well, it's a static binary, but with dynamic interpreter:

deller@abel:~$ file fstype
fstype: ELF 32-bit LSB executable, ARM, EABI5 version 1 (SYSV), statically linked, interpreter /lib/klibc-m13AniKHUCMUNN8mXSUhIi8CUSA.so, BuildID[sha1]=127738bcbae6cad12468cc4182c9b289c3452864, stripped


+ */
+    if (interp_info.brk > info->brk) {
+    info->brk = interp_info.brk;
+    }


Heh.  So it clashes with brk. Nice... ;)

You should ping upstream about this one before 8.1 is out, I think.


I've queued up quite a few other brk() fixes here:
https://github.com/hdeller/qemu-hppa/tree/upx-strace-fix-2
They hopefully fix all remaining issues.

Helge



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Michael Tokarev

17.07.2023 22:58, Helge Deller wrote:

This patch seems to work. Tested with qemu-arm and qemu-amd64.


Wow!


diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a26200d9f3..b583018591 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3615,6 +3631,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+/*
+ * adjust brk address if the interpreter was loaded above the main
+ * executable, e.g. happens with static binaries on armhf


Guess you mean dynamic binaries?  the klibc binaries we used are dynamic, no?



+ */
+if (interp_info.brk > info->brk) {
+info->brk = interp_info.brk;
+}


Heh.  So it clashes with brk. Nice... ;)

You should ping upstream about this one before 8.1 is out, I think.

/mjt



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Helge Deller
This patch seems to work. Tested with qemu-arm and qemu-amd64.

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a26200d9f3..b583018591 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3615,6 +3631,13 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)

     if (elf_interpreter) {
         load_elf_interp(elf_interpreter, &interp_info, bprm->buf);
+        /*
+         * adjust brk address if the interpreter was loaded above the main
+         * executable, e.g. happens with static binaries on armhf
+         */
+        if (interp_info.brk > info->brk) {
+            info->brk = interp_info.brk;
+        }

         /* If the program interpreter is one of these two, then assume
            an iBCS2 image.  Otherwise assume a native linux image.  */
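
The effect of the hunk above can be modelled outside QEMU. This is only a sketch, assuming a simplified loader in which each image reports nothing but the end of its data segment. `adjust_brk` is a hypothetical helper, not a QEMU function, and the interpreter end address 0x0039f000 and the amd64-like numbers are assumed values for illustration; only the brk value 0x00020cd4 is taken from this thread.

```python
from typing import Optional


def adjust_brk(main_brk: int, interp_brk: Optional[int]) -> int:
    """Return the initial break, never leaving it below the interpreter."""
    if interp_brk is not None and interp_brk > main_brk:
        return interp_brk
    return main_brk


# armhf: brk 0x00020cd4 is from the thread; 0x0039f000 is an assumed
# end address of the interpreter image.
armhf_brk = adjust_brk(0x00020CD4, 0x0039F000)
print(hex(armhf_brk))

# Hypothetical amd64-like case: the interpreter sits below the executable,
# so the break stays where the main executable put it.
amd64_brk = adjust_brk(0x00420000, 0x00217000)
print(hex(amd64_brk))
```

With the interpreter above the executable the break moves past the interpreter; in every other case it is left unchanged, which is why the change should be harmless on amd64.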



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Helge Deller

Without the patch this is the memory layout:
start    end      size     prot
00010000-00011000 00001000 r-x
00011000-00020000 0000f000 ---
00020000-00021000 00001000 rw-
40000000-40001000 00001000 ---
40001000-40801000 00800000 rwx
40801000-40802000 00001000 r-x


The difference between armhf and amd64 regarding the fstype binary is:
armhf:
fstype loads at 0001 and klibc.so loads at 4000
for amd64:
fstype loads at 0040 and klibc.so loads at 0020

So, on amd64 the brk region is above both elf binaries,
while on armhf it clashes with the klibc areas.



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Helge Deller
This patch (hack) fixes the crash on armhf.

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index a26200d9f3..2efa981061 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -3674,7 +3685,7 @@ int load_elf_binary(struct linux_binprm *bprm, struct image_info *info)
      * The implementation of do_brk in syscalls.c expects to be able
      * to mmap pages in this space.
      */
-    if (info->reserve_brk) {
+    if (0 && info->reserve_brk) {
         abi_ulong start_brk = HOST_PAGE_ALIGN(info->brk);
         abi_ulong end_brk = HOST_PAGE_ALIGN(info->brk + info->reserve_brk);
         target_munmap(start_brk, end_brk - start_brk);

Still wondering what the best fix is.

Without the patch this is the memory layout:
start    end      size     prot
00010000-00011000 00001000 r-x
00011000-00020000 0000f000 ---
00020000-00021000 00001000 rw-
40000000-40001000 00001000 ---
40001000-40801000 00800000 rwx
40801000-40802000 00001000 r-x
ffff0000-ffff1000 00001000 r-x
start_brk   0x00000000
end_code    0x00010a73
start_code  0x00010000
start_data  0x00020a78
end_data    0x00020cd0
start_stack 0x407ffe50
brk         0x00020cd4
entry       0x003800f9
argv_start  0x407ffe54
env_start   0x407ffe60
auxv_start  0x407fff28


With the patch, this is the layout:
start    end      size     prot
00010000-00011000 00001000 r-x
00011000-00020000 0000f000 ---
00020000-00021000 00001000 rw-
00021000-00380000 0035f000 ---
00380000-0038d000 0000d000 r-x
0038d000-0039c000 0000f000 ---
0039c000-0039d000 00001000 rw-
0039d000-0039f000 00002000 rw-
0039f000-01021000 00c82000 ---
40000000-40001000 00001000 ---
40001000-40801000 00800000 rwx
40801000-40802000 00001000 r-x
ffff0000-ffff1000 00001000 r-x
start_brk   0x00000000
end_code    0x00010a73
start_code  0x00010000
start_data  0x00020a78
end_data    0x00020cd0
start_stack 0x407ffe50
brk         0x00020cd4
entry       0x003800f9
argv_start  0x407ffe54
env_start   0x407ffe60
auxv_start  0x407fff28

As can be seen, the memory segment of "entry" at 0x003800f9
has been unmapped when releasing the "reserve_brk" region.
Since qemu can't then fetch the instructions, it crashes immediately.
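
The arithmetic behind that observation can be checked by hand. The following is a sketch only: it assumes a 4 KiB host page size and the 16 MiB reservation described in the "Reserve space for brk" commit, and `page_align` and `released_range` are hypothetical stand-ins for `HOST_PAGE_ALIGN` and the range handed to `target_munmap`, not QEMU code.

```python
PAGE_SIZE = 0x1000          # assumed 4 KiB host pages
RESERVE_BRK = 16 << 20      # 16 MiB, as described in the commit message


def page_align(addr: int) -> int:
    """Round addr up to the next page boundary."""
    return (addr + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1)


def released_range(brk: int, reserve: int = RESERVE_BRK):
    """Half-open range [start, end) that gets unmapped after loading."""
    return page_align(brk), page_align(brk + reserve)


brk = 0x00020CD4            # "brk" from the dump above
entry = 0x003800F9          # "entry" from the dump above

start, end = released_range(brk)
entry_unmapped = start <= entry < end
print(hex(start), hex(end), entry_unmapped)
```

The computed end of the range agrees with the upper bound of the last `---` mapping in the patched layout, and the entry point lies inside the range, which matches the crash on the first instruction fetch.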



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Michael Tokarev

17.07.2023 15:55, Helge Deller wrote:

Hello,

Could someone please try the 3 qemu patches (and one revert) which I pushed to my
"upx-fix" branch with this binary?

It's based on top of qemu git master:
https://github.com/hdeller/qemu-hppa/commits/upx-fix
You can pull from:
git pull https://github.com/hdeller/qemu-hppa.git  upx-fix

I think those fix this bug here.


It does not with the fstype reproducer:

$ ./qemu-arm /usr/lib/klibc/bin/fstype
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault
$ _

Neither on top of master nor staging-8.0.

The segfault is about the same, with same stack trace.

/mjt



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-17 Thread Helge Deller

Hello,

Could someone please try the 3 qemu patches (and one revert) which I pushed to my
"upx-fix" branch with this binary?

It's based on top of qemu git master:
https://github.com/hdeller/qemu-hppa/commits/upx-fix
You can pull from:
git pull https://github.com/hdeller/qemu-hppa.git  upx-fix

I think those fix this bug here.

Helge



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-15 Thread Ben Hutchings
On Fri, 2023-07-14 at 22:08 +0000, Thorsten Glaser wrote:
> Michael Tokarev dixit:
> 
> > > 
> > > From the comment, it reserves 16 MiB after the main executable.
> > > 
> > > In klibc/armhf, however, the main executable starts around
> > > 0x0001 whereas the interpreter starts after that, around
> > > 0x0038…
> > 
> > Doesn't this happen on all architectures, not just armhf?
> 
> bullseye/amd64:
> 
> 0x0020 interp
> 0x0040 main executable
> 
> So no, this applies only to some architectures, but not because…
> 
> > I had an impression it is not arch-specific. $subject
> 
> … it’s arch-specific but because klibc memory map is, so the
> effect only occurs on those arches where klibc puts the interp
> before the main executable.

You mean after, but it's more specific than that in practice.

> This is unfortunately hard to grep for, because…
> 
> usr/klibc/arch/arm/MCONFIG:KLIBCSHAREDFLAGS = $(LD_IMAGE_BASE_OPT) 0x38
> 
> … this applies to the interp, but for the main executables
> it uses the linker’s default AFAICT.
> 
> There is…
> 
> usr/klibc/arch/arm64/MCONFIG:KLIBCLDFLAGS  = $(LD_IMAGE_BASE_OPT) 0x0040
> usr/klibc/arch/arm64/MCONFIG:KLIBCSHAREDFLAGS = $(LD_IMAGE_BASE_OPT) 0x020
> 
> … which does transfer to main at 0040 interp at 0020 respectively,
> but only arm64 and “x32”, which really builds as amd64, do that.
> 
> And Itanic uses a linker script, putting the interp at
> 0x2000 (which seems to be standard for ia64).
> 0x41c8 is the beginning of the main executable
> there, from analysing the built binaries.
> 
> It would be more robust if klibc always specified both.
[...]

It would be a little more robust, and certainly easier to understand.

Here's what we have now:

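Architecture     klibc base (hex)     Exec base (hex)         Offset (MiB)
---------------------------------------------------------------------------
alpha                  1_c0000000          1_20000000*              -2_560
arm [THUMB=y]              380000               10000**                 -3
arm [THUMB=n]             1800000               10000**                -24
arm64                      200000              400000                    2
i386                      6000000             8000000*                  32
ia64            20000000_00000000   40000000_00000000**  2_199_023_255_552
loongarch64            1_27E00000          1_20000000*                -126
m68k                     b0000000            80000000*                -768
mips                       200000              400000*                   2
mips64                 1_2FE00000          1_20000000*                -254
parisc                   40001000               10000**             -1_024
ppc                       f800000            10000000*                   8
ppc64                     f000000            10000000*                  16
riscv64                    200000               10000*                  -2
s390                     40000000              400000**             -1_020
s390x                    40000000             1000000**             -1_008
sh                         200000              400000*                   2
sparc                    40000000               10000*              -1_024
sparc64                  80000000              100000*              -2_047
x86_64                     200000              400000                    2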

* Assumption commented in MCONFIG.
** Observed with objdump.

In 12 cases (a majority), the offset between klibc.so and the
executable is negative.  However, only riscv64 and arm with THUMB=y
have such small negative offsets (-2 and -3.4375 MiB respectively)
which I suppose puts them more at risk from this bug.
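
The offset column is simply the executable base minus the klibc base, expressed in MiB. A rough sketch of that calculation follows, together with a crude risk check against QEMU's 16 MiB brk reservation. The base addresses are assumptions (0x380000 and 0x10000 for arm with THUMB=y reproduce the -3.4375 MiB quoted above), and `interp_within_reserve` is a hypothetical helper that compares base addresses only, whereas the real reservation starts at the executable's brk.

```python
MIB = 1 << 20
RESERVE_BRK = 16 * MIB      # size of the reservation discussed in the thread


def offset_mib(klibc_base: int, exec_base: int) -> float:
    """Offset of the executable relative to klibc.so, in MiB."""
    return (exec_base - klibc_base) / MIB


def interp_within_reserve(klibc_base: int, exec_base: int,
                          reserve: int = RESERVE_BRK) -> bool:
    """True if klibc.so starts less than `reserve` bytes above the executable."""
    return 0 < klibc_base - exec_base < reserve


# Assumed (klibc base, executable base) pairs for three table rows.
bases = {
    "arm [THUMB=y]": (0x380000, 0x10000),
    "arm [THUMB=n]": (0x1800000, 0x10000),
    "x86_64": (0x200000, 0x400000),
}

for arch, (klibc, exe) in bases.items():
    risk = interp_within_reserve(klibc, exe)
    print(f"{arch:14} {offset_mib(klibc, exe):9.4f} MiB  at risk: {risk}")
```

Under these assumptions only the THUMB=y layout places the interpreter inside the reserved window; the THUMB=n layout is roughly 24 MiB away, and on x86_64 the interpreter is below the executable.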

The Debian package is built with THUMB=y for armhf but not armel, which
seems like a mistake because the Arm EABI requires Thumb support.

Ben.

-- 
Ben Hutchings
The generation of random numbers is too important to be left to chance.
   - Robert Coveyou





Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-14 Thread Thorsten Glaser
Michael Tokarev dixit:

>>
>> From the comment, it reserves 16 MiB after the main executable.
>>
>> In klibc/armhf, however, the main executable starts around
>> 0x0001 whereas the interpreter starts after that, around
>> 0x0038…
>
> Doesn't this happen on all architectures, not just armhf?

bullseye/amd64:

0x0020 interp
0x0040 main executable

So no, this applies only to some architectures, but not because…

> I had an impression it is not arch-specific. $subject

… it’s arch-specific but because klibc memory map is, so the
effect only occurs on those arches where klibc puts the interp
before the main executable.

This is unfortunately hard to grep for, because…

usr/klibc/arch/arm/MCONFIG:KLIBCSHAREDFLAGS = $(LD_IMAGE_BASE_OPT) 0x38

… this applies to the interp, but for the main executables
it uses the linker’s default AFAICT.

There is…

usr/klibc/arch/arm64/MCONFIG:KLIBCLDFLAGS  = $(LD_IMAGE_BASE_OPT) 0x0040
usr/klibc/arch/arm64/MCONFIG:KLIBCSHAREDFLAGS = $(LD_IMAGE_BASE_OPT) 0x020

… which does transfer to main at 0040 interp at 0020 respectively,
but only arm64 and “x32”, which really builds as amd64, do that.

And Itanic uses a linker script, putting the interp at
0x2000 (which seems to be standard for ia64).
0x41c8 is the beginning of the main executable
there, from analysing the built binaries.

It would be more robust if klibc always specified both.

But, as I said earlier, this won’t help bookworm and earlier
so fixing this in qemu is appreciated ;-)

>> The BSD manpage begins with…
>> DESCRIPTION
>>  The brk() and sbrk() functions are historical curiosities left over from
>>  earlier days before the advent of virtual memory management.
>> … so… oh well.
>
> That's lovely.  There's another change for brk() pending in qemu right
> now (to make it page-aligned; and no, it does not fix this issue).
> I guess it is not just curiosities :)

Perhaps. In the BSD world, malloc has been always using mmap for
ages, especially as the kernel randomises anon mmap addresses.

>> Anyway, while my proposed fix in theory moves the “end of the
>> process’ data segment” to behind the interpreter instead of
>> behind the main executable, processes are not supposed to use
>> it in combination with _end, only the returned pointers. It’s
>> something to at least consider. Will you forward this upstream?
>
> Yeah, already did, was just waiting for it to appear in the archives
> for the URL.

I meant the suggested fix. I’m not sure people over there will
dig through all of the analysis and discussion here… but maybe
a tl;dr could be posted there as well?

Thanks,
//mirabilos
-- 
“It is inappropriate to require that a time represented as
 seconds since the Epoch precisely represent the number of
 seconds between the referenced time and the Epoch.”
-- IEEE Std 1003.1b-1993 (POSIX) Section B.2.2.2



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-14 Thread Michael Tokarev

Control: forwarded -1 https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg03138.html

14.07.2023 23:07, Thorsten Glaser wrote:

Michael Tokarev dixit:


commit 6fd5944980f4ccee728ce34bdaffc117db50b34d


 From the comment, it reserves 16 MiB after the main executable.

In klibc/armhf, however, the main executable starts around
0x0001 whereas the interpreter starts after that, around
0x0038…


Doesn't this happen on all architectures, not just armhf?
I had the impression it is not arch-specific. $subject
mentions armhf only, but I think somewhere in the discussion
it's been said all architectures are affected?  Ok.


Perhaps the fix here would be to see if the interpreter comes
within 16 MiB past the main executable’s end, and if so, to
move the break (I wasn’t aware stuff on GNU/Linux still uses
that!) to start after the interpreter instead.

The BSD manpage begins with…
DESCRIPTION
  The brk() and sbrk() functions are historical curiosities left over from
  earlier days before the advent of virtual memory management.
… so… oh well.


That's lovely.  There's another change for brk() pending in qemu right
now (to make it page-aligned; and no, it does not fix this issue).
I guess it is not just curiosities :)


Anyway, while my proposed fix in theory moves the “end of the
process’ data segment” to behind the interpreter instead of
behind the main executable, processes are not supposed to use
it in combination with _end, only the returned pointers. It’s
something to at least consider. Will you forward this upstream?


Yeah, already did, was just waiting for it to appear in the archives
for the URL.

Thank you!

/mjt



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-14 Thread Thorsten Glaser
Michael Tokarev dixit:

> commit 6fd5944980f4ccee728ce34bdaffc117db50b34d

From the comment, it reserves 16 MiB after the main executable.

In klibc/armhf, however, the main executable starts around
0x0001 whereas the interpreter starts after that, around
0x0038…

Perhaps the fix here would be to see if the interpreter comes
within 16 MiB past the main executable’s end, and if so, to
move the break (I wasn’t aware stuff on GNU/Linux still uses
that!) to start after the interpreter instead.
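
A minimal sketch of that idea, assuming the loader knows where the main executable ends and which range the interpreter occupies. `proposed_brk` is a hypothetical name, and every address except the brk value 0x00020cd4 reported elsewhere in this thread is an assumed value for illustration.

```python
RESERVE_BRK = 16 << 20  # the 16 MiB reservation from the commit's comment


def proposed_brk(main_end: int, interp_start: int, interp_end: int,
                 reserve: int = RESERVE_BRK) -> int:
    """Move the break past the interpreter only if it sits in the reserve."""
    if main_end <= interp_start < main_end + reserve:
        return interp_end
    return main_end


# armhf-like: the interpreter starts about 3.4 MiB after the executable.
near = proposed_brk(0x00020CD4, 0x00380000, 0x0039F000)
# Hypothetical layout with the interpreter more than 16 MiB away.
far = proposed_brk(0x00020CD4, 0x01800000, 0x0181F000)
# Hypothetical amd64-like layout: the interpreter is below the executable.
below = proposed_brk(0x00420000, 0x00200000, 0x00217000)
print(hex(near), hex(far), hex(below))
```

Only the first case moves the break; the other two keep it directly after the main executable, so existing layouts would see no change.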

The BSD manpage begins with…
DESCRIPTION
 The brk() and sbrk() functions are historical curiosities left over from
 earlier days before the advent of virtual memory management.
… so… oh well.

Anyway, while my proposed fix in theory moves the “end of the
process’ data segment” to behind the interpreter instead of
behind the main executable, processes are not supposed to use
it in combination with _end, only the returned pointers. It’s
something to at least consider. Will you forward this upstream?


Given how this both constrains executables’ sizes and perhaps
has effects on other loaders, perhaps the klibc upstream could
consider switching to using linker scripts or at least move the
respective text basēs, on all architectures, so that the main
executable comes after the interpreter.

This will of course not help bookworm users but, perhaps, it is
something, again, to consider, at least.


bye,
//mirabilos
-- 
This space for rent.

https://paypal.me/mirabilos to support my work.



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-14 Thread Michael Tokarev

14.07.2023 21:34, Thorsten Glaser wrote:
..


And, indeed, the combination of klibc-sKNr1Fw-Rh9G1FYpGCXRnrwmP2A.so
and fstype from 2.0.4-2 (jessie) also fails, so you have something
that can reliably be used in bisect tests I think.


Ok, this works.  The bisection points to this commit between qemu v4.2.0
and v5.0.0:

https://gitlab.com/qemu-project/qemu/-/commit/6fd5944980f4ccee728ce34bdaffc117db50b34d

commit 6fd5944980f4ccee728ce34bdaffc117db50b34d
Author: Richard Henderson 
Date:   Fri Jan 17 13:02:45 2020 -1000

linux-user: Reserve space for brk

With bad luck, we can wind up with no space at all for brk,
which will generally cause the guest malloc to fail.

This bad luck is easier to come by with ET_DYN (PIE) binaries,
where either the stack or the interpreter (ld.so) gets placed
immediately after the main executable.

But there's nothing preventing this same thing from happening
with ET_EXEC (normal) binaries, during probe_guest_base().

In both cases, reserve some extra space via mmap and release
it back to the system after loading the interpreter and
allocating the stack.

The choice of 16MB is somewhat arbitrary.  It's enough for libc
to get going, but without being so large that 32-bit guests or
32-bit hosts are in danger of running out of virtual address space.
It is expected that libc will be able to fall back to mmap arenas
after the limited brk space is exhausted.

Launchpad: https://bugs.launchpad.net/qemu/+bug/1749393
Signed-off-by: Richard Henderson 
Reviewed-by: Alex Bennée 
Tested-by: Alex Bennée 
Message-Id: <20200117230245.5040-1-richard.hender...@linaro.org>
Signed-off-by: Laurent Vivier 


Does this ring any bells?  Verified on armhf.
I'll bug upstream.

Thanks,

/mjt



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-14 Thread Thorsten Glaser
Ben Hutchings dixit:

>7.2.  This introduced regressions for shared-library executables for

I did notice the following:


amd64:

/lib/klibc-YUkGbOClhnaZRUUd4cUed0X2XZI.so: file format elf64-x86-64
/lib/klibc-YUkGbOClhnaZRUUd4cUed0X2XZI.so
architecture: i386:x86-64, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x0000000000201034

Program Header:
    LOAD off    0x0000000000000000 vaddr 0x0000000000200000 paddr 0x0000000000200000 align 2**12
         filesz 0x00000000000001b4 memsz 0x00000000000001b4 flags r--
    LOAD off    0x0000000000001000 vaddr 0x0000000000201000 paddr 0x0000000000201000 align 2**12
         filesz 0x000000000000cf07 memsz 0x000000000000cf07 flags r-x
    LOAD off    0x000000000000e000 vaddr 0x000000000020e000 paddr 0x000000000020e000 align 2**12
         filesz 0x0000000000003f6f memsz 0x0000000000003f6f flags r--
    LOAD off    0x0000000000012000 vaddr 0x0000000000212000 paddr 0x0000000000212000 align 2**12
         filesz 0x0000000000000140 memsz 0x0000000000004438 flags rw-
    NOTE off    0x0000000000000190 vaddr 0x0000000000200190 paddr 0x0000000000200190 align 2**2
         filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
   STACK off    0x0000000000000000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**4
         filesz 0x0000000000000000 memsz 0x0000000000000000 flags rwx

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .note.gnu.build-id 00000024  0000000000200190  0000000000200190  00000190  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         0000cf07  0000000000201000  0000000000201000  00001000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
[…]
Program Headers:
  Type           Offset   VirtAddr           PhysAddr           FileSiz  MemSiz   Flg Align
  LOAD           0x000000 0x0000000000200000 0x0000000000200000 0x0001b4 0x0001b4 R   0x1000
  LOAD           0x001000 0x0000000000201000 0x0000000000201000 0x00cf07 0x00cf07 R E 0x1000
  LOAD           0x00e000 0x000000000020e000 0x000000000020e000 0x003f6f 0x003f6f R   0x1000
  LOAD           0x012000 0x0000000000212000 0x0000000000212000 0x000140 0x004438 RW  0x1000
  NOTE           0x000190 0x0000000000200190 0x0000000000200190 0x000024 0x000024 R   0x4
  GNU_STACK      0x000000 0x0000000000000000 0x0000000000000000 0x000000 0x000000 RWE 0x10

 Section to Segment mapping:
  Segment Sections...
   00 .note.gnu.build-id
   01 .text
   02 .rodata
   03 .data .bss
   04 .note.gnu.build-id


armhf:

/lib/klibc-m13AniKHUCMUNN8mXSUhIi8CUSA.so: file format elf32-littlearm
/lib/klibc-m13AniKHUCMUNN8mXSUhIi8CUSA.so
architecture: armv7, flags 0x00000102:
EXEC_P, D_PAGED
start address 0x003800f9

Program Header:
    LOAD off    0x00000000 vaddr 0x00380000 paddr 0x00380000 align 2**12
         filesz 0x0000cecf memsz 0x0000cecf flags r-x
    LOAD off    0x0000d000 vaddr 0x0038d000 paddr 0x0038d000 align 2**12
         filesz 0x000000b8 memsz 0x00002254 flags rw-
    NOTE off    0x000000b4 vaddr 0x003800b4 paddr 0x003800b4 align 2**2
         filesz 0x00000024 memsz 0x00000024 flags r--
   STACK off    0x00000000 vaddr 0x00000000 paddr 0x00000000 align 2**4
         filesz 0x00000000 memsz 0x00000000 flags rw-
private flags = 5000400: [Version5 EABI] [hard-float ABI]

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .note.gnu.build-id 00000024  003800b4  003800b4  000000b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .text         00009974  003800d8  003800d8  000000d8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
[…]
Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  LOAD           0x000000 0x00380000 0x00380000 0x0cecf 0x0cecf R E 0x1000
  LOAD           0x00d000 0x0038d000 0x0038d000 0x000b8 0x02254 RW  0x1000
  NOTE           0x0000b4 0x003800b4 0x003800b4 0x00024 0x00024 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x10

 Section to Segment mapping:
  Segment Sections...
   00 .note.gnu.build-id .text .rodata
   01 .data .bss
   02 .note.gnu.build-id


Specifically, what I noted is that, on amd64, the .text section
inside the text segment, which here isn’t shared with the buildid
section, has a page align increment from .note.gnu.build-id but
the armhf one doesn’t.

So I added -Ttext 0x381000 to the link command, to get…

Sections:
Idx Name          Size      VMA       LMA       File off  Algn
  0 .text         0000997c  00381000  00381000  00001000  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .note.gnu.build-id 00000024  003800b4  003800b4  000000b4  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
[…]

… but the segfault persists, so maybe that’s not it, or that’s
not yet enough and we have to use a linker script to put the
buildid section inside its own segment like it does on amd64.

The 

Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-13 Thread Helge Deller

On 7/14/23 01:56, Thorsten Glaser wrote:

Dixi quod…


My guess here is that it’s, as usual, the fault of qemu-user,


Strong evidence for that: doesn’t look like it even executes
one bit of klibc code:

$ qemu-arm-static -d cpu ./fstype --help
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)


what does this show?:
QEMU_STRACE=1 qemu-arm-static -d cpu ./fstype --help

I still believe, that the problem is that qemu's brk(NULL) doesn't return
a page-aligned address, which will have lots of other side-effects.
(see Andreas' RISC-V crash here: 
https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg00645.html)

Helge



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-13 Thread Thorsten Glaser
Dixi quod…

>My guess here is that it’s, as usual, the fault of qemu-user,

Strong evidence for that: doesn’t look like it even executes
one bit of klibc code:

$ qemu-arm-static -d cpu ./fstype --help
qemu: uncaught target signal 11 (Segmentation fault) - core dumped
Segmentation fault (core dumped)

And:

GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Reading symbols from /usr/bin/qemu-arm-static...
Downloading separate debug info for /usr/bin/qemu-arm-static...
Reading symbols from 
/home/tglase/.cache/debuginfod_client/5a14d0155c981c94a528d6468ded2c203f1e1908/debuginfo...
(gdb) r
Starting program: /usr/bin/qemu-arm-static ./fstype --help
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x77ff8700 (LWP 27273)]

Thread 1 "qemu-arm-static" received signal SIGSEGV, Segmentation fault.
0x004c5cb6 in cpu_lduw_code (env=env@entry=0xcbed30, ptr=3670264) at 
./include/qemu/bswap.h:329
Download failed: Invalid argument.  Continuing without source file 
./b/user-static/./include/qemu/bswap.h.
329 ./include/qemu/bswap.h: No such file or directory.
(gdb) bt
#0  0x004c5cb6 in cpu_lduw_code (env=env@entry=0xcbed30, ptr=3670264) 
at ./include/qemu/bswap.h:329
#1  0x0045c9ac in translator_lduw_swap (do_swap=false, pc=, env=0xcbed30)
at ./include/exec/translator.h:178
#2  arm_lduw_code (sctlr_b=false, addr=, env=0xcbed30) at 
../../target/arm/arm_ldst.h:44
#3  thumb_tr_translate_insn (dcbase=0x7fffdd50, cpu=) at 
../../target/arm/translate.c:9054
#4  0x004bc1e9 in translator_loop (ops=0xa7f180 , 
db=db@entry=0x7fffdd50,
cpu=cpu@entry=0xcb6a60, tb=tb@entry=0x7fffe840 , 
max_insns=max_insns@entry=512)
at ../../accel/tcg/translator.c:103
#5  0x00463eb3 in gen_intermediate_code (cpu=cpu@entry=0xcb6a60,
tb=tb@entry=0x7fffe840 , 
max_insns=max_insns@entry=512)
at ../../target/arm/translate.c:9283
#6  0x00512d75 in tb_gen_code (cpu=cpu@entry=0xcb6a60, pc=3670264, 
cs_base=0, flags=1196288,
cflags=-16777216, cflags@entry=0) at ../../accel/tcg/translate-all.c:1744
#7  0x004b4734 in tb_find (cf_mask=0, tb_exit=0, last_tb=0x0, 
cpu=0xcb6a60)
at ../../accel/tcg/cpu-exec.c:414
#8  cpu_exec (cpu=cpu@entry=0xcb6a60) at ../../accel/tcg/cpu-exec.c:770
#9  0x00422608 in cpu_loop (env=env@entry=0xcbed30) at 
../../linux-user/arm/cpu_loop.c:237
#10 0x00402949 in main (argc=, argv=0x7fffe230, 
envp=)
at ../../linux-user/main.c:882
(gdb) info r
rax            0x40d94000          1087979520
rbx            0x7fffffffdd50      140737488346448
rcx            0xd9a728            14264104
rdx            0xc64d60            12995936
rsi            0x3800f8            3670264
rdi            0xcbed30            13364528
rbp            0x0                 0x0
rsp            0x7fffffffdc48      0x7fffffffdc48
r8             0xc64d60            12995936
r9             0xc656e8            12998376
r10            0x0                 0
r11            0x0                 0
r12            0xcbed30            13364528
r13            0x0                 0
r14            0x0                 0
r15            0x7fffffffdd50      140737488346448
rip            0x4c5cb6            0x4c5cb6 <cpu_lduw_code+22>
eflags         0x10246             [ PF ZF IF RF ]
cs             0x33                51
ss             0x2b                43
ds             0x0                 0
es             0x0                 0
fs             0x0                 0
gs             0x0                 0
(gdb) disas
Dump of assembler code for function cpu_lduw_code:
   0x00000000004c5ca0 <+0>:     mov    QWORD PTR fs:0xffffffffffffff58,0x1
   0x00000000004c5cad <+13>:    mov    esi,esi
   0x00000000004c5caf <+15>:    mov    rax,QWORD PTR [rip+0x79efa2]        # 0xc64c58 <guest_base>
=> 0x00000000004c5cb6 <+22>:    movzx  eax,WORD PTR [rax+rsi*1]
   0x00000000004c5cba <+26>:    mov    QWORD PTR fs:0xffffffffffffff58,0x0
   0x00000000004c5cc7 <+39>:    ret
End of assembler dump.


The content of rax (guest_base) looks legit:

$ cat /proc/27269/maps
00400000-00401000 r--p 00000000 fd:00 2624234    /usr/bin/qemu-arm-static
00401000-0071e000 r-xp 00001000 fd:00 2624234

Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-13 Thread Thorsten Glaser
Hi Helge,

>Can you check if this patch fixes the problem:
>https://patchew.org/QEMU/mvmpm55qnno@suse.de/
>(linux-user: make sure brk(0) returns a page-aligned value, from Andreas Schwab)

I doubt it, klibc malloc uses mmap(2) normally.

(And given I tested it on a bullseye system, the mmap bug in the
bookworm kernel is also not applicable.)

bye,
//mirabilos
-- 
If Harry Potter gets a splitting headache in his scar
when he’s near Tom Riddle (aka Voldemort),
does Tom get pain in the arse when Harry is near him?
-- me, wondering why it’s not Jerry Potter………



Bug#1040981: klibc-utils: segfault executing armhf binaries under qemu-user

2023-07-13 Thread Thorsten Glaser
Dixi quod…

>My guess here is that it’s, as usual, the fault of qemu-user,
>which has multiple outstanding emulation bugs, some of which
>affecting klibc-built binaries especially, though this, since
>a statically linked mksh works, is probably an issue with how
>qemu-user handles .interp *shrug*

An interesting data point (here on a bullseye/amd64 system):

$ /usr/lib/klibc/bin/fstype --help
--help: No such file or directory
$ /lib/klibc-YUkGbOClhnaZRUUd4cUed0X2XZI.so  /usr/lib/klibc/bin/fstype --help
Segmentation fault (core dumped)

So running the interpreter directly is already not supported.
I’m guessing that that is what qemu-user tries, though.

Wild shoot into the blue but maybe it helps…

bye,
//mirabilos
-- 
„Cool, /usr/share/doc/mksh/examples/uhr.gz ist ja ein Grund,
mksh auf jedem System zu installieren.“
-- XTaran auf der OpenRheinRuhr, ganz begeistert
(EN: “[…]uhr.gz is a reason to install mksh on every system.”)