[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-20 Thread K
An update for anyone interested: I didn't remember seeing the leading
0x10 because the values are correct when retrieved from memory. They are
then packed into a structure that is returned in a single register, so
the second element (0x10) ends up in the upper 4 bytes of x0, which is
then passed as the first argument to strcmp. strcmp doesn't appear to
clear the upper bytes of x0 in ilp32 mode before using it to access
memory. This issue is therefore either a GCC codegen problem or a
multilib selection problem in the build environment.
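
The arithmetic behind this can be checked on any host. The sketch below
assumes a hypothetical two-member structure (a 32-bit pointer followed by
a 32-bit length) returned packed in one 64-bit register; the names
pack_pair and as_ilp32_pointer and the member layout are illustrative
only and are not taken from the RTEMS, newlib, or GCC sources.

```c
#include <stdint.h>

/* Hypothetical layout: under ILP32 a pointer and a length are 4 bytes
 * each, so an 8-byte struct holding both fits in one 64-bit register. */
static inline uint64_t pack_pair(uint32_t ptr, uint32_t len)
{
    /* Little-endian member order: first member in the low half,
     * second member in the high half of the register. */
    return ((uint64_t)len << 32) | ptr;
}

/* What a callee would need to do under ILP32 before dereferencing:
 * keep only the low 32 bits of the register. */
static inline uint64_t as_ilp32_pointer(uint64_t x0)
{
    return x0 & 0xffffffffu;
}
```

With ptr = 0x4010ca28 and len = 0x10, pack_pair yields 0x104010ca28, the
faulting address from the log; as_ilp32_pointer recovers 0x4010ca28.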

Also of note, GDB prints the full 64-bit address when printing $w0
instead of the lower 4 bytes, but I don't think that's a QEMU bug
either.

-- 
You received this bug notification because you are a member of
qemu-devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1887854

Title:
  Spurious Data Abort on qemu-system-aarch64

Status in QEMU:
  Invalid

Bug description:
  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this
  code is not yet publicly available), the test generates a spurious
  data abort (the MMU and alignment checks should be disabled according
  to bits 1, 0 of SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x96000010
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5

  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101

  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer
  misalignment, and synchronous External aborts, including synchronous
  parity or ECC errors. Not used for debug related exceptions.

  ESR ISS field: 0b10000

  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.
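
  The field values quoted above can be extracted mechanically. The
  sketch below assumes the full 32-bit syndrome is 0x96000010 (EC 0x25
  with the IL bit set and a fault status code of 0x10) and uses the
  ESR_ELx field positions from the Armv8-A architecture; the helper
  names are illustrative.

```c
#include <stdint.h>

/* ESR_ELx field extraction for a Data Abort syndrome. */
static inline uint32_t esr_ec(uint32_t esr)   /* bits [31:26], exception class */
{
    return (esr >> 26) & 0x3f;
}

static inline uint32_t esr_il(uint32_t esr)   /* bit [25], instruction length */
{
    return (esr >> 25) & 0x1;
}

static inline uint32_t esr_iss(uint32_t esr)  /* bits [24:0], syndrome */
{
    return esr & 0x1ffffff;
}

static inline uint32_t esr_dfsc(uint32_t esr) /* bits [5:0], data fault status */
{
    return esr & 0x3f;
}
```

  For 0x96000010 this gives EC = 0x25 (0b100101), IL = 1, and
  ISS = DFSC = 0x10 (0b010000), matching the decoding in the report.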

  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot \
    -nographic -serial mon:stdio \
    -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe \
    -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors

  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).

  Edit: This bug can be worked around by getting and setting SCTLR
  without changing its value before each data abort would occur. This
  test needs 6 of these workarounds to operate successfully.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1887854/+subscriptions



[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-17 Thread K
Ok, thanks for rooting this out. I could swear that I checked that
address several times and I clearly remember 0x4010ca28, but I don't
remember ever seeing 0x10 ahead of it. I'll dig into it a bit and
hopefully find the root cause in my code.

** Changed in: qemu
   Status: New => Invalid




[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-17 Thread Peter Maydell
It does still crash on current QEMU. The proximate cause of the crash is
that you are trying to read from an address which is way outside RAM:

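Trace 0: 0x7f8d50054340 [0000000000000000/00000000400195d8/0x82104000] strcmp
 PC=00000000400195d8 X00=000000104010ca28 X01=000000004001ec28
X02=0000000000000fe8 X03=00000000401098c8 X04=000000004010ba40
X05=5641526f44654b00 X06=1f276f6c62717372 X07=0000000000000000
X08=000000000000ffda X09=00000000401097d0 X10=0101010101010101
X11=0000000000000000 X12=0000000000000000 X13=0000000000000000
X14=0000000000000000 X15=0000000000000000 X16=0000000040014610
X17=0000000000000008 X18=0000000000000000 X19=000000004010b9f0
X20=000000004001ec28 X21=000000084001ec20 X22=000000004001ec60
X23=000000004001ec40 X24=000000004001f548 X25=000000104001ec28
X26=000000104001ec40 X27=000000034001ec38 X28=0000000000000000
X29=00000000401098d0 X30=0000000040008a38  SP=00000000401098d0
PSTATE=40000005 -Z-- EL1h
Taking exception 4 [Data Abort]
...from EL1 to EL1
...with ESR 0x25/0x96000010
...with FAR 0x104010ca28
...with ELR 0x400195d8
...to EL1 PC 0x40018200 PSTATE 0x3c5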

where the insn at 0x400195d8 is (inside strcmp)
0x400195d8:  f8408402  ldr  x2, [x0], #8

You can see that x0 is 0x104010ca28, so QEMU is correct to give
the data abort here. Further diagnosis would require working back
through the log to find out where that address came from, which will be
easier for you to do since you have the source.
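
As a rough illustration of "way outside RAM", the check below assumes
the virt machine places RAM at 0x40000000 and that 256 MiB is configured
(-m 256M, as in the reported command line). It is a simplified sketch:
the macro and function names are made up for this example, and it
ignores the flash and device regions that are also mapped on this board.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed guest RAM window for "-machine virt -m 256M". */
#define VIRT_RAM_BASE 0x40000000ULL
#define VIRT_RAM_SIZE (256ULL * 1024 * 1024)

/* True if the access [addr, addr + len) lies entirely inside guest RAM.
 * Written to avoid overflow in addr + len. */
static inline bool in_guest_ram(uint64_t addr, uint64_t len)
{
    return addr >= VIRT_RAM_BASE &&
           len <= VIRT_RAM_SIZE &&
           addr - VIRT_RAM_BASE <= VIRT_RAM_SIZE - len;
}
```

An 8-byte load (the ldr x2 above) at 0x104010ca28 fails this check,
while the same load at 0x4010ca28 passes it.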

NB: I recommend these options for producing the logfile:
 -D /tmp/q.log -d in_asm,int,cpu_reset,exec,cpu,guest_errors,nochain -singlestep
Execution will be slower, but the crash here is pretty quick so that's
not a problem, and these options mean that every insn executed will
produce a "Trace" line and a CPU register dump. That's easier to
understand and read (especially reading backwards) than logs produced
when QEMU is doing its normal optimisations of chaining TBs together and
putting multiple guest insns in each TB.




[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-17 Thread K
I would have thought that TLB considerations would not apply when the
MMU is disabled (RTEMS runs in a completely flat memory space). I'll try
to reproduce on more modern QEMU today. Thanks for taking a look at
this.




[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-16 Thread Peter Maydell
Writing to SCTLR can cause QEMU to flush its TLB (as an internal
implementation detail), so if adding SCTLR writes is sufficient to cause
the problem to go away, I would be suspicious that your guest code is
missing necessary TLB maintenance instructions.

QEMU 3.1 and 4.1 are quite old -- can you reproduce with 5.0 or
(ideally) head-of-git?




[Bug 1887854] Re: Spurious Data Abort on qemu-system-aarch64

2020-07-16 Thread K
** Description changed:

  When running RTEMS test psxndbm01.exe built for AArch64-ilp32 (this
  code is not yet publicly available), the test generates a spurious
  data abort (the MMU and alignment checks should be disabled according
  to bits 1, 0 of SCTLR_EL1). The abort information is as follows:
  Taking exception 4 [Data Abort]
  ...from EL1 to EL1
  ...with ESR 0x25/0x96000010
  ...with FAR 0x104010ca28
  ...with ELR 0x400195d8
  ...to EL1 PC 0x40018200 PSTATE 0x3c5
  
  The ESR indicates that a synchronous external abort has occurred.
  ESR EC field: 0b100101
  
  From the ARMv8 technical manual: Data Abort taken without a change in
  Exception level. Used for MMU faults generated by data accesses,
  alignment faults other than those caused by Stack Pointer misalignment,
  and synchronous External aborts, including synchronous parity or ECC
  errors. Not used for debug related exceptions.
  
  ESR ISS field: 0b10000
  
  From the ARMv8 technical manual: Synchronous External abort, not on
  translation table walk or hardware update of translation table.
  
  The following command line is used to invoke qemu:
  qemu-system-aarch64 -machine virt -cpu cortex-a53 -m 256M -no-reboot \
    -nographic -serial mon:stdio \
    -kernel build/aarch64/a53_ilp32_qemu/testsuites/psxtests/psxndbm01.exe \
    -D qemu.log -d in_asm,int,cpu_reset,unimp,guest_errors
  
  This occurs on Qemu 3.1.0 as distributed via Debian and on Qemu 4.1 as
  built by the RTEMS source builder (4.1+minor patches).
  
  Edit: This bug can be worked around by getting and setting SCTLR without
- changing its value.
+ changing its value before each data abort would occur. This test needs 6
+ of these workarounds to operate successfully.
