Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-02-21 Thread Richard Henderson

On 2/12/24 10:43, Ilya Leoshkevich wrote:

int main(void)
{
 shmat(shmget(IPC_PRIVATE, 1836016, IPC_CREAT | 0600), (void 
*)0x2804000, 0);
 open("/proc/self/maps", O_RDONLY);
}

Apparently an mmap() is missing for shmat() when g>h and shmaddr is
specified. The mismatch between the host's and the guest's view of the
mapping's tail appears to be causing the SEGV.


Yes, shmat() is handling none of the h != g page size issues;
it is in fact fairly broken.


r~



Re: Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-02-12 Thread Ilya Leoshkevich
On Mon, Feb 05, 2024 at 11:11:06AM +, Richard Purdie wrote:
> On Mon, 2024-02-05 at 13:05 +1000, Richard Henderson wrote:
> > On 1/26/24 23:52, Richard Purdie wrote:
> > > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:
> > > > 26.01.2024 16:03, Richard Purdie wrote:
> > > > > I've run into a problem with this change.
> > > > > 
> > > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
> > > > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64
> > > > > during the introspection code which runs under user mode qemu.
> > > > 
> > > > Besides your observations, please be aware there's quite a few issues 
> > > > in 8.2.0.
> > > > Please take a look at 
> > > > https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
> > > > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which 
> > > > is updated
> > > > less often) for fixes already queued up, if you haven't looked there 
> > > > already.
> > > > 8.2.1 stable/bugfix release is scheduled for the beginning of the next 
> > > > week.
> > > 
> > > Thanks.
> > > 
> > > I should note that I did test the staging-8.2 branch and nothing there
> > > helped. The issue was also present with master as of yesterday.
> > > 
> > > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto
> > > Projects tracking of the issue which has the commits for master and
> > > staging-8.2 that I tested.
> > 
> > The yocto logs referenced here are not helpful for reproducing the problem.
> 
> It took me a couple of days I didn't have to workout which commit
> caused it, which versions showed the issue and how to work around it.
> 
> It looks host kernel specific since it doesn't happen on some systems
> so even with the binaries/command/environment vars, it may not be
> enough.
> 
> I was hoping the indication of the cause might help point to the fix as
> there is quite a bit of work in trying to extract this into a
> reproducer. The failure is 20 mins into a webkitgtk compile on a remote
> CI system which no longer has the context on it.
> 
> > Please extract a binary to run, inputs, and command-line.
> 
> I wish I could say that to the bug reports I get! :)
> 
> I'll do my best but finding the time is going to be a challenge.
> 
> Cheers,
> 
> Richard

I just ran into a similar crash and could reproduce it with
5005aed8a7e7 alpha-linux-user as follows:

#include 
#include 

int main(void)
{
shmat(shmget(IPC_PRIVATE, 1836016, IPC_CREAT | 0600), (void 
*)0x2804000, 0);
open("/proc/self/maps", O_RDONLY);
}

Apparently an mmap() is missing for shmat() when g>h and shmaddr is
specified. The mismatch between the host's and the guest's view of the
mapping's tail appears to be causing the SEGV.



Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-02-05 Thread Richard Purdie
On Mon, 2024-02-05 at 13:05 +1000, Richard Henderson wrote:
> On 1/26/24 23:52, Richard Purdie wrote:
> > On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:
> > > 26.01.2024 16:03, Richard Purdie wrote:
> > > > I've run into a problem with this change.
> > > > 
> > > > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
> > > > started seeing errors cross compiling webkitgtk on x86_64 for x86_64
> > > > during the introspection code which runs under user mode qemu.
> > > 
> > > Besides your observations, please be aware there's quite a few issues in 
> > > 8.2.0.
> > > Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
> > > (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is 
> > > updated
> > > less often) for fixes already queued up, if you haven't looked there 
> > > already.
> > > 8.2.1 stable/bugfix release is scheduled for the beginning of the next 
> > > week.
> > 
> > Thanks.
> > 
> > I should note that I did test the staging-8.2 branch and nothing there
> > helped. The issue was also present with master as of yesterday.
> > 
> > https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto
> > Projects tracking of the issue which has the commits for master and
> > staging-8.2 that I tested.
> 
> The yocto logs referenced here are not helpful for reproducing the problem.

It took me a couple of days I didn't have to workout which commit
caused it, which versions showed the issue and how to work around it.

It looks host kernel specific since it doesn't happen on some systems
so even with the binaries/command/environment vars, it may not be
enough.

I was hoping the indication of the cause might help point to the fix as
there is quite a bit of work in trying to extract this into a
reproducer. The failure is 20 mins into a webkitgtk compile on a remote
CI system which no longer has the context on it.

> Please extract a binary to run, inputs, and command-line.

I wish I could say that to the bug reports I get! :)

I'll do my best but finding the time is going to be a challenge.

Cheers,

Richard



Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-02-04 Thread Richard Henderson

On 1/26/24 23:52, Richard Purdie wrote:

Hi Michael,

On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:

26.01.2024 16:03, Richard Purdie wrote:

I've run into a problem with this change.

We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
started seeing errors cross compiling webkitgtk on x86_64 for x86_64
during the introspection code which runs under user mode qemu.


Besides your observations, please be aware there's quite a few issues in 8.2.0.
Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
(and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is 
updated
less often) for fixes already queued up, if you haven't looked there already.
8.2.1 stable/bugfix release is scheduled for the beginning of the next week.


Thanks.

I should note that I did test the staging-8.2 branch and nothing there
helped. The issue was also present with master as of yesterday.

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto
Projects tracking of the issue which has the commits for master and
staging-8.2 that I tested.


The yocto logs referenced here are not helpful for reproducing the problem.
Please extract a binary to run, inputs, and command-line.


r~




Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-01-26 Thread Richard Purdie
Hi Michael,

On Fri, 2024-01-26 at 16:33 +0300, Michael Tokarev wrote:
> 26.01.2024 16:03, Richard Purdie wrote:
> > I've run into a problem with this change.
> > 
> > We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
> > started seeing errors cross compiling webkitgtk on x86_64 for x86_64
> > during the introspection code which runs under user mode qemu.
> 
> Besides your observations, please be aware there's quite a few issues in 
> 8.2.0.
> Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
> (and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is 
> updated
> less often) for fixes already queued up, if you haven't looked there already.
> 8.2.1 stable/bugfix release is scheduled for the beginning of the next week.

Thanks.

I should note that I did test the staging-8.2 branch and nothing there
helped. The issue was also present with master as of yesterday.

https://bugzilla.yoctoproject.org/show_bug.cgi?id=15367 is Yocto
Projects tracking of the issue which has the commits for master and
staging-8.2 that I tested.

Cheers,

Richard





Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-01-26 Thread Michael Tokarev

26.01.2024 16:03, Richard Purdie wrote:

Hi,

I've run into a problem with this change.

We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
started seeing errors cross compiling webkitgtk on x86_64 for x86_64
during the introspection code which runs under user mode qemu.


Hi Richard!

Besides your observations, please be aware there's quite a few issues in 8.2.0.
Please take a look at https://gitlab.com/mjt0k/qemu/-/commits/staging-8.2/
(and https://gitlab.com/qemu-project/qemu/-/commits/staging-8.2/ which is 
updated
less often) for fixes already queued up, if you haven't looked there already.
8.2.1 stable/bugfix release is scheduled for the beginning of the next week.

Thanks,

/mjt



Re: [PULL 05/13] linux-user: Use walk_memory_regions for open_self_maps

2024-01-26 Thread Richard Purdie
Hi,

I've run into a problem with this change.

We (Yocto Project) upgraded to qemu 8.2.0 recently and after that we
started seeing errors cross compiling webkitgtk on x86_64 for x86_64
during the introspection code which runs under user mode qemu.

The error we see is:

qemu-x86_64: QEMU internal SIGSEGV {code=MAPERR, addr=0x20}
Segmentation fault

e.g. here:

https://autobuilder.yoctoproject.org/typhoon/#/builders/40/builds/8488/steps/11/logs/stdio

This usually seems to happen on our debian 11 based autobuilder
machines.

I took one of the broken builds and bisected it to this change (commit
7b7a3366e142d3baeb3fd1d3660a50e7956c19eb).

There was a change in output from commit
7dfd3ca8d95f9962cdd2ebdfcdd699279b98fa18, before that it was:

ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: assertion 
failed: (cpu == current_cpu)
Bail out! ERROR:../git/accel/tcg/cpu-exec.c:532:cpu_exec_longjmp_cleanup: 
assertion failed: (cpu == current_cpu)

After digging into the code and trying to work out what is going on, I
realised that n is NULL when it fails so this makes the problem "go
away":

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index e384e14248..2577fb770d 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -8085,6 +8085,9 @@ static int open_self_maps_2(void *opaque, target_ulong 
guest_start,
 while (1) {
 IntervalTreeNode *n =
 interval_tree_iter_first(d->host_maps, host_start, host_start);
+if (!n) {
+return 0;
+}
 MapInfo *mi = container_of(n, MapInfo, itree);
 uintptr_t this_hlast = MIN(host_last, n->last);
 target_ulong this_gend = h2g(this_hlast) + 1;


I'm hoping that might be enough to give you an idea of what is going on
and what the correct fix may be?

I haven't managed to make an isolated test to reproduce the issue yet.

Cheers,

Richard