On 6/25/19 3:21 PM, Adam Przybylski wrote:
> On Tuesday, 25 June 2019 at 14:50:44 UTC+2, Ralf Ramsauer wrote:
>> On 6/25/19 2:46 PM, Adam Przybylski wrote:
>>> On Tuesday, 25 June 2019 at 14:14:41 UTC+2, Ralf Ramsauer wrote:
>>>> On 6/25/19 1:31 PM, Adam Przybylski wrote:
>>>>> On Tuesday, 25 June 2019 at 12:10:03 UTC+2, Ralf Ramsauer wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On 6/25/19 9:38 AM, Adam Przybylski wrote:
>>>>>>> On Sunday, 23 June 2019 at 18:32:37 UTC+2, Henning Schild wrote:
>>>>>>>> On Fri, 21 Jun 2019 07:18:14 -0700,
>>>>>>>> Adam Przybylski wrote:
>>>>>>>>
>>>>>>>>> On Friday, 21 June 2019 at 15:54:15 UTC+2, Henning Schild wrote:
>>>>>>>>>> On Fri, 21 Jun 2019 14:51:30 +0200,
>>>>>>>>>> Ralf Ramsauer wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> On 6/21/19 2:22 PM, Valentine Sinitsyn wrote:
>>>>>>>>>>>> Hi Adam,
>>>>>>>>>>>>
>>>>>>>>>>>> On 21.06.2019 17:16, Adam Przybylski wrote:
>>>>>>>>>>>>> Dear Jailhouse Community,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am trying to enable Jailhouse on the AMD EPYC 7351P 16-Core
>>>>>>>>>>>>> Processor. Unfortunately the system hangs after I execute
>>>>>>>>>>>>> "jailhouse enable sysconfig.cell".
>>>>>>>>>>>>>
>>>>>>>>>>>>> Do you have any hint how to debug and instrument this issue?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Any kind of help is appreciated.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Attached you can find the jailhouse logs, processor info, and
>>>>>>>>>>>>> sysconfig.c.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Looking forward to hearing from you.
>>>>>>>>>>>> I'd say the following line is the culprit:
>>>>>>>>>>>>
>>>>>>>>>>>>> FATAL: Invalid PIO read, port: 814 size: 1
>>>>>>>>>>>
>>>>>>>>>>> Could you please attach /proc/ioports? This will tell us the
>>>>>>>>>>> secret behind Port 814.
>>>>>>>>>>
>>>>>>>>>> Not always, the driver doing that has to be friendly enough to
>>>>>>>>>> register the region.
>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> As a quick fix, you may grant your root cell access to all I/O
>>>>>>>>>>>> ports and see if it helps.
>>>>>>>>>>>
>>>>>>>>>>> Allowing access will suppress the symptoms, yet we should
>>>>>>>>>>> investigate its cause. Depending on the semantics of Port 814,
>>>>>>>>>>> allowing access might have unintended side effects.
>>>>>>>>>>>
>>>>>>>>>>> You could also try to disassemble your kernel (objdump -d
>>>>>>>>>>> vmlinux) and check what function hides behind the instruction
>>>>>>>>>>> pointer at the moment of the crash, 0xffffffffa4ac3114.
>>>>>>>>>>
>>>>>>>>>> A look in the System.map can also answer that question. On a distro
>>>>>>>>>> that will be ready to read somewhere in /boot/.
>>>>>>>>>>
>>>>>>>>>> Henning
>>>>>>>>>>
>>>>>>>>>>> Ralf
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Valentine
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kind regards,
>>>>>>>>>>>>> Adam Przybylski
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I looked up the function which gets executed in the kernel. It's
>>>>>>>>> "acpi_idle_do_entry".
>>>>>>>>
>>>>>>>> Well, now you are back to what Valentine said. Open up those ports one
>>>>>>>> by one, until the problem goes away. The alternative is to disable the
>>>>>>>> drivers in the root Linux. In the case of ACPI, i.e. acpi=off as kernel
>>>>>>>> parameter, but you probably do not want that.
>>>>>>>>
>>>>>>>> Note that whatever you allow might cause weaker isolation, in this case
>>>>>>>> maybe real-time related.
>>>>>>>>
>>>>>>>> Henning
>>>>>>>>
>>>>>>>>> Adam
>>>>>>>>>
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> after allowing access to the 0x800-0x89f I/O ports, the issue with the
>>>>>>> PIO read is solved.
>>>>>>>
>>>>>>> Now I am facing issues with IOMMU/RAM, NMI IPI, MSR. Please see
>>>>>>> attached log.
>>>>>>
>>>>>> You can again look at the System.map to find the code behind the MSR
>>>>>> access.
>>>>>>
>>>>>> The rest can probably be solved by consolidating some scattered,
>>>>>> non-page-aligned memory regions in your config -- could you please
>>>>>> attach the output of jailhouse config collect? It should contain all
>>>>>> data that is relevant for debugging.
>>>>>>
>>>>>> Thanks
>>>>>> Ralf
>>>>>>
>>>>>>>
>>>>>>> Any idea how to debug this?
>>>>>>>
>>>>>>> Adam
>>>>>>>
>>>>>
>>>>> Hi,
>>>>>
>>>>> attached the jailhouse config collect output.
>>>>
>>>> Please try the attached config on next.
>>>>
>>>> You can use diff to see what I changed: I consolidated some memory
>>>> regions into one large, contiguous region. This should at least solve
>>>> the MMIO traps and the unknown instruction error.
>>>>
>>>> That leaves the MSR access. What code is behind the instruction pointer?
>>>>
>>>> Thanks
>>>> Ralf
>>>>
>>>>>
>>>>> Adam
>>>>>
>>>
>>> Hi,
>>>
>>> the attached config works fine regarding the IOMMU/RAM accesses. Thank you!
>>
>> Great, good to hear.
>>
>>>
>>> The function behind the RIP is native_read_msr_safe.
>>
>> Well... That doesn't help. :-)
>>
>> Could you please
>>   $ echo "#define CRASH_CELL_ON_PANIC 1" >> include/jailhouse/config.h
>>
>> and then recompile and reinstall jailhouse. This should give you a
>> stack trace of the kernel when the crash happens. Then we can go on
>> debugging.
>>
>> Ralf
>>
>>>
>>> Adam
>>>
>
> Attached the dmesg with the kernel crash.
Perfect. Try to add mce=off to your kernel command line.

  Ralf

>
> Adam
>
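
For reference, the System.map lookup suggested earlier in the thread (finding the function behind an instruction pointer) can be sketched in a few lines of Python. This is only an illustration: the helper names load_symbols and resolve are made up here, and the sample symbol addresses are hypothetical, not taken from Adam's kernel. It also assumes the address is directly comparable to System.map; if KASLR is active, the randomization offset would have to be subtracted first.

```python
import bisect


def load_symbols(lines):
    """Parse System.map-style lines ('<hex address> <type> <name>').

    Returns a list of (address, name) tuples sorted by address.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, or None."""
    addresses = [a for a, _ in symbols]
    i = bisect.bisect_right(addresses, addr) - 1
    if i < 0:
        return None  # addr lies below the first known symbol
    base, name = symbols[i]
    return name, addr - base


# Hypothetical excerpt; on a real system read the lines from the
# System.map file that matches the running kernel (usually under /boot/).
SAMPLE_MAP = [
    "ffffffff81000000 T _text",
    "ffffffff81003000 t acpi_idle_do_entry",
    "ffffffff81003200 T some_other_function",
]

if __name__ == "__main__":
    table = load_symbols(SAMPLE_MAP)
    hit = resolve(table, 0xFFFFFFFF81003114)
    if hit is not None:
        print("%s+0x%x" % hit)
```

The binary search picks the nearest symbol at or below the address, which is the same rule the kernel uses when it prints "function+0xoffset" in a stack trace.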
