On 07/11/2023 04:49, Richard Henderson wrote:

On 11/6/23 14:02, Mark Cave-Ayland wrote:
I was working through my SPARC boot tests for your latest target/sparc series when I spotted a segfault on my FreeBSD SPARC64 image. A git bisect indicated that this was the patch that originally introduced the error, something I must have missed when testing the original decodetree conversion series.

The reproducer is:

./qemu-system-sparc64 -m 256 -cdrom FreeBSD-10.3-RELEASE-sparc64-bootonly.iso \
     -boot d -nographic

and the error is a segfault in devd:

...
...
Trying to mount root from cd9660:/dev/iso9660/10_3_RELEASE_SPARC64_BO [ro]...
Entropy harvesting: interrupts ethernet point_to_point swi.
Starting file system checks:
Mounting local file systems:.
Writing entropy file:.
/etc/rc: WARNING: $hostname is not set -- see rc.conf(5).
Starting Network: lo0 hme0.
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
         options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
         inet6 ::1 prefixlen 128
         inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
         inet 127.0.0.1 netmask 0xff000000
         nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
hme0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=8000b<RXCSUM,TXCSUM,VLAN_MTU,LINKSTATE>
         ether 52:54:00:12:34:56
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect
Starting devd.
pid 246 (ps), uid 0: exited on signal 11
Segmentation fault
^^^^^^^^^^^^^^^^^^


I certainly can't imagine that FdMULq is really at fault, because it's not implemented on real hardware (and thus I really doubt FreeBSD attempted to use it), and CPU_FEATURE_FLOAT128 is not enabled by your command-line.

The only thing that I can imagine is that this is some sort of timing related issue and bisect behaved randomly.

Hmmm you're right, it seems that there is a semi-random aspect to the issue which is why the bisection didn't give a good result.

To mitigate this, I repeated the bisection, this time booting the FreeBSD image 5 times in a row and only marking a commit as good if all 5 boot tests passed [1] without displaying a segfault message [2]. That bisection led me to this commit:


86b82fe021f46ed4501b16132f7e3fccd0a1ad5d is the first bad commit
commit 86b82fe021f46ed4501b16132f7e3fccd0a1ad5d
Author: Richard Henderson <richard.hender...@linaro.org>
Date:   Wed Oct 4 17:51:37 2023 -0700

     target/sparc: Move JMPL, RETT, RETURN to decodetree

     Tested-by: Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk>
     Acked-by: Mark Cave-Ayland <mark.cave-ayl...@ilande.co.uk>
     Signed-off-by: Richard Henderson <richard.hender...@linaro.org>

  target/sparc/insns.decode |   7 +++
  target/sparc/translate.c  | 126 +++++++++++++++++++++++++++++-----------------
  2 files changed, 88 insertions(+), 45 deletions(-)


I then repeated the 5-boots-in-a-row test on both this commit and its parent, which appeared to confirm the bisection result: the failures only appear with commit 86b82fe021.
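
For anyone wanting to reproduce the repeated-boot check, it could be automated for "git bisect run" roughly as follows. This is only a minimal sketch, not the procedure actually used above (which was done by hand): the timeout, the retry count, the use of "Starting devd." as the sign that the boot got past the hang in [1], and the names classify(), boot_once() and main() are all assumptions. It also assumes the QEMU binary has already been rebuilt for the commit under test and that the guest console arrives on stdout with -nographic.

```python
import subprocess

# Command line taken from the reproducer in this thread.
QEMU_CMD = [
    "./qemu-system-sparc64", "-m", "256",
    "-cdrom", "FreeBSD-10.3-RELEASE-sparc64-bootonly.iso",
    "-boot", "d", "-nographic",
]
SEGV_MARKER = "exited on signal 11"  # text seen in the console log
PROGRESS_MARKER = "Starting devd."   # assumed sign that the boot did not hang
BOOT_TIMEOUT = 600                   # seconds per boot; assumed value
BOOTS = 5                            # boots required for a "good" verdict
MAX_RETRIES = 3                      # retries for a hung boot; assumed value


def classify(log):
    """Classify one console log as 'bad', 'hang' or 'good'."""
    if SEGV_MARKER in log:
        return "bad"
    if PROGRESS_MARKER not in log:
        return "hang"
    return "good"


def boot_once():
    """Boot the image once, stop QEMU after BOOT_TIMEOUT, return the log."""
    proc = subprocess.Popen(
        QEMU_CMD, stdin=subprocess.DEVNULL, stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT, text=True, errors="replace",
    )
    try:
        out, _ = proc.communicate(timeout=BOOT_TIMEOUT)
    except subprocess.TimeoutExpired:
        proc.kill()
        out, _ = proc.communicate()
    return out


def main(boot=boot_once):
    """Return an exit code for git bisect run: 0 good, 1 bad, 125 skip."""
    for _ in range(BOOTS):
        verdict = "hang"
        for _ in range(MAX_RETRIES):
            verdict = classify(boot())
            if verdict != "hang":
                break
        if verdict == "bad":
            return 1    # any segfault marks the commit as bad
        if verdict == "hang":
            return 125  # never got far enough to judge: skip this commit
    return 0            # all boots reached devd without a segfault
```

Saved as a hypothetical bisect_boot.py in the directory git bisect runs from, it could be driven by a small wrapper that rebuilds QEMU and then runs python3 -c 'import sys, bisect_boot; sys.exit(bisect_boot.main())'. Treating a persistently hung boot as "skip" rather than "good" is a deliberate choice here, since a hang gives no information about the segfault.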

> All that said, I can't replicate this with master.
> Can you, now?

Yes, the issue is still present in master for me. My host is Debian bookworm (stable, x86_64) configured as './configure' '--target-list=sparc64-softmmu' '--enable-slirp' '--enable-debug' and the command line is:

./qemu-system-sparc64 -m 256 -cdrom FreeBSD-10.3-RELEASE-sparc64-bootonly.iso \
     -boot d -nographic


ATB,

Mark.

[1] Sometimes the image hangs just after "Trying to mount root .." but that appears to be a timing issue and not directly related to this series. In the cases where this happened, I simply quit QEMU and rebooted the image again.

[2] Normally the segfault message comes from devd, but during the bisection I did occasionally see it coming from other processes.
