Hi everyone,

Sorry for the delay on this. We dropped the patched 4.9.0 kernel tree and
moved on to 4.10-rc6.
So far I have not had any issues there.

Except:
[ 8305.931816] cc1[31947]: unhandled level 0 translation fault (11) at
0x10004df5cfc19, esr 0x92000004
This time, however, it happened inside a VM running Debian arm64. The
host seems untroubled by it.
So for now, this story is closed. I will ignore the errors inside the
guest for the time being; if they need following up, Debian seems to be
the place to ask about them.
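
For anyone who wants to read the esr values in these messages, here is a
minimal Python sketch of how they can be split into fields. It assumes the
ARMv8 ESR_ELx layout (exception class in bits 31:26, instruction-length
bit 25, fault status code in the low six bits) and only covers the
instruction/data abort classes. The names decode_esr, ABORT_CLASSES and
FAULT_KINDS are made up for illustration and are not taken from the kernel.

```python
# Sketch only: decodes the abort-related fields of an ARMv8 ESR_ELx value.
# Assumption: the standard field layout (EC = bits 31:26, IL = bit 25,
# fault status code = bits 5:0). Names below are hypothetical.

ABORT_CLASSES = {
    0x20: "Instruction Abort from a lower Exception level",
    0x21: "Instruction Abort taken without a change in Exception level",
    0x24: "Data Abort from a lower Exception level",
    0x25: "Data Abort taken without a change in Exception level",
}

# For status codes below 0x10, bits 5:2 select the fault kind
# and bits 1:0 give the translation table level.
FAULT_KINDS = {
    0: "address size fault",
    1: "translation fault",
    2: "access flag fault",
    3: "permission fault",
}


def decode_esr(esr):
    """Return a dict with the exception class and fault description."""
    ec = (esr >> 26) & 0x3F
    il = (esr >> 25) & 0x1
    fsc = esr & 0x3F
    if ec in ABORT_CLASSES and fsc < 0x10:
        fault = "{}, level {}".format(FAULT_KINDS[fsc >> 2], fsc & 0x3)
    else:
        fault = "not decoded by this sketch (0x{:02x})".format(fsc)
    return {
        "ec": ec,
        "class": ABORT_CLASSES.get(ec, "not an abort class handled here"),
        "il": il,
        "fault": fault,
    }


if __name__ == "__main__":
    # The three esr values quoted in this thread.
    for value in (0x82000006, 0x92000006, 0x92000004):
        info = decode_esr(value)
        print("0x{:08x}: {}; {}".format(value, info["class"], info["fault"]))
```

For the guest fault above, 0x92000004 decodes as a data abort from a lower
exception level with a level 0 translation fault, which agrees with the
kernel message.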


On 10.01.2017 at 16:12, Matthias Brugger wrote:
>
>
> On 10/01/17 15:08, Josua Mayer wrote:
>> Hi Alex,
>>
>>
>> On 08.01.2017 at 22:46, Alexander Graf wrote:
>>>
>>>
>>> On 07/01/2017 19:50, Josua Mayer wrote:
>>>> Hi everybody,
>>>>
>>>> I am approaching you with a collection of unusual crashes that occur on
>>>> my test machine:
>>>> It is an early version of the 8040 Community Board by SolidRun running a
>>>> patched 4.9.0 kernel with the 42.2 rootfs from
>>>> http://download.opensuse.org/ports/aarch64/distribution/leap/42.2/appliances/openSUSE-Leap42.2-ARM-JeOS.aarch64-rootfs.aarch64-2017.01.02-Build1.2.tbz
>>>>
>>>>
>>>>
>>>> Now what I see are the following situations:
>>>>
>>>> 1) unhandled level XY translation fault (11) at 0xYYYYYYYY, esr
>>>> 0xYYYYYYYY
>>>> I have seen this with levels 0, 1, 2 and 3 so far.
>>>> So far I have seen it with zypper, which freezes seemingly at
>>>> random; pressing ctrl+x then produces this kind of error in dmesg.
>>>
>>> That's really just a kernel log entry for a user space segmentation
>>> fault. The addresses that faulted were:
>>>
>>> [ 2312.480811] zypper[5524]: unhandled level 2 translation fault (11)
>>> at 0x00000000, esr 0x82000006
>>>
>>> -->
>>>
>>> ESR 0x82000006 means "Instruction Abort from a lower Exception level"
>>> Fault type: Translation fault. (in EL2)
>>> The faulting address is 0x00000000 (PC)
>> I see.
>>>
>>>
>>>
>>> [ 2321.136185] zypper[9319]: unhandled level 2 translation fault (11)
>>> at 0xffffe5720449, esr 0x92000006
>>>
>>> -->
>>>
>>> ESR 0x92000006 means "Data Abort from a lower Exception level"
>>> Fault type: Translation fault. (in EL2)
>>> The faulting address is 0xffffe5720449 (x0)
>>>
>>>
>>> I'm curious. You seem to be running in EL2. Do you have the following
>>> option enabled in your kernel?
>>>
>>>   CONFIG_ARM64_VHE
>>>
>>> If so, please try to disable it and see whether that makes things work.
>> It is indeed enabled; I will comment below.
>>>
>>>>
>>>> 2) undefined instruction: pc=...
>>>> This one causes a program to crash and exit without any delay. I
>>>> actively observed this twice with zypper, but the kernel log suggests it
>>>> can happen to other applications too.
>>>>
>>>> Attached to this mail you can find a full system log with both kinds of
>>>> crashes. You will find that I played with qemu when this log was
>>>> produced. Sadly I did not save the initial log with the zypper crashes,
>>>> except for one extract.
>>>>
>>>> I am quite unsure what to do about this. Has anybody seen such behaviour
>>>> with other boards?
>>>
>>> I have seen it on a different CPU type with VHE enabled, yes.
>> I see
>>>
>>>
>> So after some additional tests I can conclude that it is some kernel
>> option triggering the defect. I have one .config that works just fine,
>> and one that is broken. Unfortunately the diff is not small.
>> Both have VHE enabled, so I am not sure if that is the root cause.
>> However I will try disabling VHE and let you know how it goes.
>
> From my experience I would start by changing the page size in your
> broken config.
>
> Regards,
> Matthias
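
On the two .config files mentioned above: since the diff is large, it may
help to reduce it to the options whose values actually differ and then
bisect that list. The following is a rough Python sketch of that idea. The
function names and the two example configs are hypothetical (the real files
are not part of this thread); only CONFIG_ARM64_VHE being enabled in both is
taken from the discussion. Options missing from a file are treated like
"is not set", which is an approximation.

```python
# Sketch only: list kernel config options that differ between two files.
# The sample config contents below are invented for illustration.

NOT_SET_SUFFIX = " is not set"


def parse_config(text):
    """Map option name to value; '# CONFIG_X is not set' becomes 'n'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(NOT_SET_SUFFIX):
            options[line[2:-len(NOT_SET_SUFFIX)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(good, bad):
    """Return sorted (name, good_value, bad_value) for differing options."""
    changed = []
    for name in sorted(set(good) | set(bad)):
        # Assumption: an option absent from one file counts as 'n'.
        g, b = good.get(name, "n"), bad.get(name, "n")
        if g != b:
            changed.append((name, g, b))
    return changed


# Hypothetical contents of a working and a broken config.
GOOD = """CONFIG_ARM64_VHE=y
CONFIG_ARM64_4K_PAGES=y
# CONFIG_ARM64_64K_PAGES is not set
"""
BAD = """CONFIG_ARM64_VHE=y
# CONFIG_ARM64_4K_PAGES is not set
CONFIG_ARM64_64K_PAGES=y
"""

if __name__ == "__main__":
    for name, g, b in diff_configs(parse_config(GOOD), parse_config(BAD)):
        print("{}: good={} bad={}".format(name, g, b))
```

With real files, the text would come from reading each .config; the kernel
tree also ships scripts/diffconfig, which serves a similar purpose.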
