On 12.05.22 13:37, Lad, Prabhakar wrote:
> Hi Jan,
> 
> On Thu, May 12, 2022 at 12:16 PM Jan Kiszka <[email protected]> wrote:
>>
>> On 12.05.22 13:06, Lad, Prabhakar wrote:
>>> Hi Jan,
>>>
>>> On Thu, May 12, 2022 at 11:24 AM Jan Kiszka <[email protected]> wrote:
>>>>
>>>> On 12.05.22 09:01, Lad, Prabhakar wrote:
>>>>> Hi Jan,
>>>>>
>>>>> On Thu, May 12, 2022 at 6:45 AM Jan Kiszka <[email protected]> wrote:
>>>>>>
>>>>>> On 11.05.22 19:09, Lad, Prabhakar wrote:
>>>>>>> Hi Jan,
>>>>>>>
>>>>>>> On Wed, May 11, 2022 at 4:11 PM Jan Kiszka <[email protected]> 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On 11.05.22 13:20, Prabhakar Lad wrote:
>>>>>>>>> To add some more details:
>>>>>>>>>
>>>>>>>>> I am using jailhouse-enabling/5.10 Linux branch [0] with -next branch
>>>>>>>>> for jailhouse [1].
>>>>>>>>>
>>>>>>>>> I added some debug prints and I see the panic is caused when entry()
>>>>>>>>> function is called (in enter_hypervisor). The entry function lands 
>>>>>>>>> into
>>>>>>>>> assembly code (entry.S). I don't have a JTAG to see which exact
>>>>>>>>> instruction is causing this issue.
>>>>>>>>
>>>>>>>> So, already the first instruction in the loaded hypervisor binary is 
>>>>>>>> not
>>>>>>>> executable? That would explain why we see no hypervisor output at all.
>>>>>>>>
>>>>>>> To clarify: when the hypervisor is loaded, will the output be via the
>>>>>>> debug_console specified in the root cell config?
>>>>>>>
>>>>>>
>>>>>> Correct - if you reach entry() in setup.c.
>>>>>>
>>>>>>>> Was that memory region properly reserved from Linux (via DTB carve-out
>>>>>>>> e.g.)?
>>>>>>>>
>>>>>>> Yes, I have the following memory reserved in my dts:
>>>>>>>
>>>>>>>     memory@48000000 {
>>>>>>>         device_type = "memory";
>>>>>>>         /* first 128MB is reserved for secure area. */
>>>>>>>         reg = <0x0 0x48000000 0x0 0x78000000>;
>>>>>>>     };
>>>>>>>
>>>>>>>     reserved-memory {
>>>>>>>         #address-cells = <2>;
>>>>>>>         #size-cells = <2>;
>>>>>>>         ranges;
>>>>>>>
>>>>>>>         jh_inmate@a7f00000 {
>>>>>>>             status = "okay";
>>>>>>>             no-map;
>>>>>>>             reg = <0x00 0xa7f00000 0x00 0xf000000>;
>>>>>>>         };
>>>>>>>
>>>>>>>         jailhouse: jailhouse@b6f00000 {
>>>>>>>             status = "okay";
>>>>>>>             reg = <0x0 0xb6f00000 0x0 0x1000000>;
>>>>>>>             no-map;
>>>>>>>         };
>>>>>>>     };
>>>>>>>
>>>>>>> Linux does report the hole in RAM:
>>>>>>>
>>>>>>> root@smarc-rzg2l:~# cat /proc/iomem  |  grep RAM
>>>>>>> 48000000-a7efffff : System RAM
>>>>>>> b7f00000-bfffffff : System RAM
>>>>>>>
>>>>>>> And below is my root cell config related to hypervisor memory:
>>>>>>>
>>>>>>>         .hypervisor_memory = {
>>>>>>>             .phys_start = 0xb6f00000,
>>>>>>>             .size =       0x1000000,
>>>>>>>         },
>>>>>>>
>>>>>>> Is there anything obvious I have missed above?
>>>>>>>
>>>>>>
>>>>>> Nothing obvious to me so far.
>>>>>>
>>>>>> So, is this really the first instruction in
>>>>>> hypervisor/arch/arm64/entry.S, arch_entry, that triggers the undefinstr?
>>>>>> Check the reported address by Linux against the disassembly of the
>>>>>> hypervisor. You could also play with adding 'ret' as first instruction,
>>>>>> to check if that returns without a crash.
>>>>>>
>>>>> I did play around with ret, below are my observations:
>>>>>
>>>>> Adding a ret anywhere up to line 98 [0], i.e. just before the call to
>>>>> arm_dcaches_flush, returned without a crash. Adding a ret at line 99
>>>>> [1], i.e. after arm_dcaches_flush, never returned.
>>>>>
>>>>> So I continued by adding a ret as the first statement in
>>>>> arm_dcaches_flush, and was still able to see the panic!
>>>>
>>>> Which Jailhouse revision are you building? Have you already disassembled
>>>> hypervisor.o around arch_entry and arm_dcaches_flush? This is what I
>>>> have here for next:
>>>>
>>> I'm using the next branch too.
>>>
>>>> 0000ffffc0203efc <arm_dcaches_flush>:
>>>>     ffffc0203efc:       d53b0024        mrs     x4, ctr_el0
>>>>     ffffc0203f00:       d3504c84        ubfx    x4, x4, #16, #4
>>>>     ...
>>>>
>>>> 0000ffffc0204800 <arch_entry>:
>>>>     ffffc0204800:       aa0003f0        mov     x16, x0
>>>>     ffffc0204804:       aa1e03f1        mov     x17, x30
>>>>     ...
>>>>     ffffc0204844:       d2800042        mov     x2, #0x2        // #2
>>>>     ffffc0204848:       97fffdad        bl      ffffc0203efc <arm_dcaches_flush>
>>>>
>>>> You could check if there is such a relative call in your case as well.
>>> Yes, it does exist; below is the snippet:
>>>
>>> 0000ffffc0204000 <arch_entry>:
>>>     ffffc0204000:    aa0003f0     mov    x16, x0
>>>     ffffc0204004:    aa1e03f1     mov    x17, x30
>>>     ffffc0204008:    10fdffc0     adr    x0, ffffc0200000 <hypervisor_header>
>>>     ffffc020400c:    f9402412     ldr    x18, [x0, #72]
>>>     ffffc0204010:    5800fd0f     ldr    x15, ffffc0205fb0 <sdei_handler+0x2c>
>>>     ffffc0204014:    900000e1     adrp    x1, ffffc0220000 <__page_pool>
>>>     ffffc0204018:    79406002     ldrh    w2, [x0, #48]
>>>     ffffc020401c:    d2880003     mov    x3, #0x4000    // #16384
>>>     ffffc0204020:    9b030441     madd    x1, x2, x3, x1
>>>     ffffc0204024:    f842c02e     ldur    x14, [x1, #44]
>>>     ffffc0204028:    5800fc8d     ldr    x13, ffffc0205fb8 <sdei_handler+0x34>
>>>     ffffc020402c:    f840c02c     ldur    x12, [x1, #12]
>>>     ffffc0204030:    cb0d018b     sub    x11, x12, x13
>>>     ffffc0204034:    924051c1     and    x1, x14, #0x1fffff
>>>     ffffc0204038:    8b0101ef     add    x15, x15, x1
>>>     ffffc020403c:    f9001c0f     str    x15, [x0, #56]
>>>     ffffc0204040:    f9400401     ldr    x1, [x0, #8]
>>>     ffffc0204044:    d2800042     mov    x2, #0x2    // #2
>>>     ffffc0204048:    97ffff6c     bl    ffffc0203df8 <arm_dcaches_flush>
>>>     ffffc020404c:    5800fba1     ldr    x1, ffffc0205fc0 <sdei_handler+0x3c>
>>>     ffffc0204050:    8b0b0021     add    x1, x1, x11
>>>     ffffc0204054:    d2800000     mov    x0, #0x0    // #0
>>>     ffffc0204058:    f100025f     cmp    x18, #0x0
>>>     ffffc020405c:    54000041     b.ne    ffffc0204064 <arch_entry+0x64>  // b.any
>>>     ffffc0204060:    d2800020     mov    x0, #0x1    // #1
>>>     ffffc0204064:    d4000002     hvc    #0x0
>>>     ffffc0204068:    d4000002     hvc    #0x0
>>>     ffffc020406c:    14000000     b    ffffc020406c <arch_entry+0x6c>
>>>
>>>> Then you could check, before jumping into arch_entry from the
>>>> driver, that the opcode at the actual arm_dcaches_flush address is
>>>> as expected. But maybe the build already injects an issue here.
>>>>
>>> Any pointers on how I could do that?
>>>
>>
>> arm_dcaches_flush is loaded at arch_entry (header->entry +
>> hypervisor_mem) - 0x208. Read the u32 at that address and check if it is
>> what is in your hypervisor.o (likely also d53b0024).
>>
>> If that is the case, not the jump but that "mrs     x4, ctr_el0" may
>> actually be the problem. Check out hypervisor/arch/arm64/caches.S and
>> see if that code, specifically dcache_line_size, causes trouble outside
>> of Jailhouse as well, maybe as inline assembly in the driver module.
>>
> 
> With the below ret added, I get "JAILHOUSE_ENABLE: Success"
> 
> diff --git a/hypervisor/arch/arm64/entry.S b/hypervisor/arch/arm64/entry.S
> index a9cabf7f..4e98b292 100644
> --- a/hypervisor/arch/arm64/entry.S
> +++ b/hypervisor/arch/arm64/entry.S
> @@ -96,6 +96,7 @@ arch_entry:
>          */
>         ldr     x1, [x0, #HEADER_CORE_SIZE]
>         mov     x2, DCACHE_CLEAN_AND_INVALIDATE_ASM
> +       ret
>         bl      arm_dcaches_flush
> 
>         /* install bootstrap_vectors */
> 
> Now when I undo the above and do the below, I'm seeing a panic:
> 
> diff --git a/hypervisor/arch/arm64/caches.S b/hypervisor/arch/arm64/caches.S
> index 39dad4af..ce29b8e8 100644
> --- a/hypervisor/arch/arm64/caches.S
> +++ b/hypervisor/arch/arm64/caches.S
> @@ -40,6 +40,7 @@
>   */
>         .global arm_dcaches_flush
>  arm_dcaches_flush:
> +       ret
>         dcache_line_size x3, x4
>         add     x1, x0, x1
>         sub     x4, x3, #1
> 
> The issue is seen even though dcache_line_size is never called. Does
> that mean we are not landing in arm_dcaches_flush at all?
> 

Wait! This last "ret" will only take you back into arch_entry, not to
the caller of arch_entry. Furthermore, after "bl arm_dcaches_flush",
x30/lr no longer points to the caller of arch_entry; that value is
saved in x17.

Could it be that arm_dcaches_flush was a red herring? Maybe we are
actually crashing on the hvc in arch_entry, because Linux was not
started in HYP mode (EL2), so no hypervisor take-over stub was installed.

Could you share the boot log of the kernel?

Jan

-- 
Siemens AG, Technology
Competence Center Embedded Linux
