On 6/25/19 3:21 PM, Adam Przybylski wrote:
> On Tuesday, 25 June 2019 at 14:50:44 UTC+2, Ralf Ramsauer wrote:
>> On 6/25/19 2:46 PM, Adam Przybylski wrote:
>>> On Tuesday, 25 June 2019 at 14:14:41 UTC+2, Ralf Ramsauer wrote:
>>>> On 6/25/19 1:31 PM, Adam Przybylski wrote:
>>>>> On Tuesday, 25 June 2019 at 12:10:03 UTC+2, Ralf Ramsauer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 6/25/19 9:38 AM, Adam Przybylski wrote:
>>>>>>> On Sunday, 23 June 2019 at 18:32:37 UTC+2, Henning Schild wrote:
>>>>>>>> On Fri, 21 Jun 2019 07:18:14 -0700,
>>>>>>>> Adam Przybylski wrote:
>>>>>>>>
>>>>>>>>> On Friday, 21 June 2019 at 15:54:15 UTC+2, Henning Schild wrote:
>>>>>>>>>> On Fri, 21 Jun 2019 14:51:30 +0200,
>>>>>>>>>> Ralf Ramsauer wrote:
>>>>>>>>>>   
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> On 6/21/19 2:22 PM, Valentine Sinitsyn wrote:  
>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>
>>>>>>>>>>>> On 21.06.2019 17:16, Adam Przybylski wrote:    
>>>>>>>>>>>>> Dear Jailhouse Community,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am trying to enable Jailhouse on the AMD EPYC 7351P 16-Core
>>>>>>>>>>>>> Processor. Unfortunately the system hangs after I execute
>>>>>>>>>>>>> "jailhouse enable sysconfig.cell".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have any hints on how to debug and instrument this issue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any kind of help is appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Attached you can find the jailhouse logs, processor info, and
>>>>>>>>>>>>> sysconfig.c.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looking forward to hearing from you.
>>>>>>>>>>>> I'd say the following line is the culprit:
>>>>>>>>>>>>     
>>>>>>>>>>>>> FATAL: Invalid PIO read, port: 814 size: 1    
>>>>>>>>>>>
>>>>>>>>>>> Could you please attach /proc/ioports? This will tell us the
>>>>>>>>>>> secret behind Port 814.  
>>>>>>>>>>
>>>>>>>>>> Not always; the driver accessing the port has to be friendly enough
>>>>>>>>>> to register the region.
>>>>>>>>>>   
>>>>>>>>>>>>
>>>>>>>>>>>> As a quick fix, you may grant your root cell access to all I/O
>>>>>>>>>>>> ports and see if it helps.    
>>>>>>>>>>>
>>>>>>>>>>> Allowing access will suppress the symptoms, yet we should
>>>>>>>>>>> investigate the cause. Depending on the semantics of port 814,
>>>>>>>>>>> allowing access might have unintended side effects.
>>>>>>>>>>>
>>>>>>>>>>> You could also try to disassemble your kernel (objdump -d
>>>>>>>>>>> vmlinux) and check what function hides behind the instruction
>>>>>>>>>>> pointer at the moment of the crash 0xffffffffa4ac3114.  
>>>>>>>>>>
>>>>>>>>>> A look in the System.map can also answer that question. On a distro,
>>>>>>>>>> it will be available to read somewhere in /boot/.
>>>>>>>>>>
>>>>>>>>>> Henning
>>>>>>>>>>   
>>>>>>>>>>>   Ralf
>>>>>>>>>>>   
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Valentine
>>>>>>>>>>>>     
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>> Adam Przybylski
>>>>>>>>>>>>>    
>>>>>>>>>>>>     
>>>>>>>>>>>  
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I looked up the function which gets executed in the Kernel. It's
>>>>>>>>> "acpi_idle_do_entry".
>>>>>>>>
>>>>>>>> Well, now you are back to what Valentine said. Open up those ports one
>>>>>>>> by one until the problem goes away. The alternative is to disable the
>>>>>>>> drivers in the root Linux. In the case of ACPI, that means acpi=off as
>>>>>>>> a kernel parameter, but you probably do not want that.
>>>>>>>>
>>>>>>>> Note that whatever you allow might weaken isolation; in this case
>>>>>>>> the impact may be real-time related.
>>>>>>>>
>>>>>>>> Henning
>>>>>>>>
>>>>>>>>> Adam
>>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> after allowing access to I/O ports 0x800-0x89f, the issue with the
>>>>>>> PIO read is solved.
>>>>>>>
>>>>>>> Now I am facing issues with IOMMU/RAM, NMI IPI, MSR. Please see 
>>>>>>> attached log.
>>>>>>
>>>>>> You can again look at the System.map to find the code behind the MSR
>>>>>> access.
>>>>>>
>>>>>> The rest can probably be solved by consolidating some scattered,
>>>>>> non-page-aligned memory regions in your config. Could you please
>>>>>> attach the output of jailhouse config collect? It should contain all
>>>>>> data that is relevant for debugging.
>>>>>>
>>>>>> Thanks
>>>>>>   Ralf
>>>>>>
>>>>>>>
>>>>>>> Any idea how to debug this?
>>>>>>>
>>>>>>> Adam
>>>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> attached is the jailhouse config collect output.
>>>>
>>>> Please try the attached config on next.
>>>>
>>>> You can use diff to see what I changed: I consolidated some memory
>>>> regions into one large, contiguous region. This should at least solve
>>>> the MMIO traps and the unknown instruction error.
>>>>
>>>> What remains is the MSR access. What code is behind the instruction
>>>> pointer?
>>>>
>>>> Thanks
>>>>   Ralf
>>>>
>>>>>
>>>>> Adam
>>>>>
>>>
>>> Hi,
>>>
>>> the attached config works fine regarding the IOMMU/RAM accesses. Thank you!
>>
>> Great, good to hear.
>>
>>>
>>> The function behind the RIP is native_read_msr_safe.
>>
>> Well... That doesn't help. :-)
>>
>> Could you please run
>> $ echo '#define CRASH_CELL_ON_PANIC 1' >> include/jailhouse/config.h
>>
>> and then recompile and reinstall jailhouse. This should give you a
>> stacktrace of the kernel when the crash happens. Then we can go on
>> debugging.
>>
>>   Ralf
>>
>>>
>>> Adam
>>>
> 
> Attached is the dmesg with the kernel crash.

Perfect. Try to add mce=off to your kernel command line.

  Ralf
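
For reference, the System.map lookup suggested earlier in the thread
(mapping an instruction pointer to the enclosing kernel function) could
be sketched roughly as follows. This is only an illustration: the
function names parse_system_map and lookup are made up here, and the
addresses in SAMPLE are hypothetical, not taken from Adam's system. It
also assumes that the addresses in System.map match the running kernel,
which may not hold when KASLR is enabled.

```python
import bisect

def parse_system_map(lines):
    """Return a sorted list of (address, name) from System.map-style lines."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols

def lookup(symbols, rip):
    """Return (name, offset) of the last symbol at or below rip, or None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, rip) - 1
    if i < 0:
        return None
    addr, name = symbols[i]
    return name, rip - addr

# Hypothetical excerpt in System.map format: "<hex address> <type> <name>".
SAMPLE = """\
ffffffff81000000 T _text
ffffffff81ac3100 t acpi_idle_do_entry
ffffffff81ac3200 T acpi_idle_enter
""".splitlines()

if __name__ == "__main__":
    syms = parse_system_map(SAMPLE)
    print(lookup(syms, 0xffffffff81ac3114))
```

On a real system, the lines would come from the System.map file under
/boot/ that belongs to the running kernel instead of SAMPLE.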

> 
> Adam
> 
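
Similarly, checking /proc/ioports for the region that contains a given
port, as requested at the start of the thread, could look roughly like
the sketch below. The helper name regions_containing and the SAMPLE
lines (including the region name "example-acpi-region") are
hypothetical. The sketch assumes the port number in the Jailhouse log
is hexadecimal (0x814), which is consistent with the 0x800-0x89f range
that was opened later. As Henning noted, an empty result only means
that no driver registered the region.

```python
import re

# Matches /proc/ioports lines such as "  0800-089f : some name".
LINE_RE = re.compile(r"^\s*([0-9a-fA-F]+)-([0-9a-fA-F]+)\s*:\s*(.*)$")

def regions_containing(lines, port):
    """Return (start, end, name) for every listed region that covers port."""
    hits = []
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a region line
        start, end = int(m.group(1), 16), int(m.group(2), 16)
        if start <= port <= end:
            hits.append((start, end, m.group(3).strip()))
    return hits

# Hypothetical excerpt in /proc/ioports format (nested regions are indented).
SAMPLE = """\
0000-0cf7 : PCI Bus 0000:00
  0800-089f : example-acpi-region
0cf8-0cff : PCI conf1
""".splitlines()

if __name__ == "__main__":
    for start, end, name in regions_containing(SAMPLE, 0x814):
        print(f"{start:04x}-{end:04x} : {name}")
```

Nested regions are reported outermost first, in the order they appear
in the file.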

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/eb09293c-1b9d-8e4f-dc7a-1a0bd1263b5f%40oth-regensburg.de.