On 10/01/17 15:08, Josua Mayer wrote:
Hi Alex,


On 08.01.2017 at 22:46, Alexander Graf wrote:


On 07/01/2017 19:50, Josua Mayer wrote:
Hi everybody,

I am approaching you with a collection of unusual crashes that occur on
my test machine:
It is an early version of the 8040 Community Board by SolidRun running a
patched 4.9.0 kernel with the 42.2 rootfs from
http://download.opensuse.org/ports/aarch64/distribution/leap/42.2/appliances/openSUSE-Leap42.2-ARM-JeOS.aarch64-rootfs.aarch64-2017.01.02-Build1.2.tbz


Now what I see are the following situations:

1) unhandled level XY translation fault (11) at 0xYYYYYYYY, esr
0xYYYYYYYY
I have seen this with level 0, 1, 2 and 3 so far.
So far I have seen it with zypper, where it freezes seemingly at
random, and ctrl+x produces this kind of error in dmesg.

That's really just a kernel log entry for a user space segmentation
fault. The addresses that faulted were:

[ 2312.480811] zypper[5524]: unhandled level 2 translation fault (11)
at 0x00000000, esr 0x82000006

-->

ESR 0x82000006 means "Instruction Abort from a lower Exception level"
Fault type: Translation fault. (in EL2)
The faulting address is 0x00000000 (PC)
I see.



[ 2321.136185] zypper[9319]: unhandled level 2 translation fault (11)
at 0xffffe5720449, esr 0x92000006

-->

ESR 0x92000006 means "Data Abort from a lower Exception level"
Fault type: Translation fault. (in EL2)
The faulting address is 0xffffe5720449 (x0)
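For anyone who wants to reproduce the two decodes above, here is a minimal Python sketch. It assumes the standard ARMv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, fault status code in the low six bits of the ISS) and only covers the abort classes and fault kinds relevant to this thread. The function name `decode_esr` and the dictionaries are made up for illustration and are not taken from any existing tool.

```python
# Sketch of an ESR_ELx decoder for instruction/data aborts only.
# Assumed layout (ARMv8-A): EC = bits [31:26], IL = bit 25, FSC = ISS bits [5:0].

EC_NAMES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# Upper four bits of the fault status code select the fault kind,
# the lower two bits give the translation table level.
FAULT_KINDS = {
    0b0000: "address size fault",
    0b0001: "translation fault",
    0b0010: "access flag fault",
    0b0011: "permission fault",
}


def decode_esr(esr):
    """Split an ESR value into exception class, IL bit and fault description."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    fsc = esr & 0x3F
    kind = FAULT_KINDS.get(fsc >> 2)
    if ec in EC_NAMES and kind is not None:
        fault = f"{kind}, level {fsc & 0x3}"
    else:
        # Anything else is outside the scope of this sketch.
        fault = f"not decoded by this sketch (FSC {fsc:#04x})"
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, f"other exception class {ec:#04x}"),
        "il": il,
        "fault": fault,
    }


# The two ESR values from the zypper log lines quoted above.
for value in (0x82000006, 0x92000006):
    info = decode_esr(value)
    print(f"{value:#010x}: {info['ec_name']}; {info['fault']}")
```

Note that the level in the fault status code is the translation table lookup level at which the walk failed, which matches the "level 2" printed in the kernel log line.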


I'm curious. You seem to be running in EL2. Do you have the following
option enabled in your kernel?

  CONFIG_ARM64_VHE

If so, please try to disable it and see whether that makes things work.
It is indeed enabled; I will comment below.


2) undefined instruction: pc=...
This one causes a program to crash and exit without any delay. I
actively observed this twice with zypper, but the kernel log suggests it
can happen to other applications too.

Attached to this mail you can find a full system log with both kinds of
crashes. You will find that I played with qemu when this log was
produced. Sadly I did not save the initial log with the zypper crashes,
except for one extract.

I am quite unsure what to do about this. Has anybody seen such behaviour
with other boards?

I have seen it on a different CPU type with VHE enabled, yes.
I see


So after some additional tests I can conclude that it is some kernel
option triggering the defect. I have one .config that works just fine,
and one that is broken. Unfortunately the diff is not small.
Both have VHE enabled, so I am not sure if that is the root cause.
However I will try disabling VHE and let you know how it goes.
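Since the diff between the working and the broken .config is large, it may help to reduce both files to the set of options whose values actually differ, ignoring ordering and comments. The following is a rough Python sketch of that idea, not a replacement for the kernel's own scripts/diffconfig. The helper names `parse_config` and `diff_configs` are made up for illustration, and it assumes the usual .config syntax (`CONFIG_FOO=value` and `# CONFIG_FOO is not set`).

```python
import re

NOT_SET = re.compile(r"# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option in a kernel .config to its value ('n' for 'is not set')."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = NOT_SET.match(line)
        if match:
            options[match.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(good, bad):
    """Return {option: (value_in_good, value_in_bad)} for every differing option.

    An option that is absent from one file is reported with None on that side.
    """
    changes = {}
    for name in sorted(set(good) | set(bad)):
        g, b = good.get(name), bad.get(name)
        if g != b:
            changes[name] = (g, b)
    return changes


def print_diff(good_path, bad_path):
    """Print one line per differing option; both paths are supplied by the caller."""
    with open(good_path) as fh:
        good = parse_config(fh.read())
    with open(bad_path) as fh:
        bad = parse_config(fh.read())
    for name, (g, b) in diff_configs(good, bad).items():
        print(f"{name}: {g} -> {b}")
```

With the list reduced this way, the remaining options can be flipped in the broken config a few at a time (bisecting the list) until the crashes disappear.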

From my experience, I would start by changing the page size in your broken config.

Regards,
Matthias