Hi Jan,

On Fri, May 13, 2022 at 9:52 AM Jan Kiszka <[email protected]> wrote:
>
> On 13.05.22 09:31, Lad, Prabhakar wrote:
> > Hi Jan,
> >
> > On Thu, May 12, 2022 at 6:05 PM Jan Kiszka <[email protected]> wrote:
> >>
> >> On 12.05.22 13:37, Lad, Prabhakar wrote:
> >>> Hi Jan,
> >>>
> >>> On Thu, May 12, 2022 at 12:16 PM Jan Kiszka <[email protected]> 
> >>> wrote:
> >>>>
> >>>> On 12.05.22 13:06, Lad, Prabhakar wrote:
> >>>>> Hi Jan,
> >>>>>
> >>>>> On Thu, May 12, 2022 at 11:24 AM Jan Kiszka <[email protected]> 
> >>>>> wrote:
> >>>>>>
> >>>>>> On 12.05.22 09:01, Lad, Prabhakar wrote:
> >>>>>>> Hi Jan,
> >>>>>>>
> >>>>>>> On Thu, May 12, 2022 at 6:45 AM Jan Kiszka <[email protected]> 
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> On 11.05.22 19:09, Lad, Prabhakar wrote:
> >>>>>>>>> Hi Jan,
> >>>>>>>>>
> >>>>>>>>> On Wed, May 11, 2022 at 4:11 PM Jan Kiszka <[email protected]> 
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>> On 11.05.22 13:20, Prabhakar Lad wrote:
> >>>>>>>>>>> To add further more details:
> >>>>>>>>>>>
> >>>>>>>>>>> I am using jailhouse-enabling/5.10 Linux branch [0] with -next 
> >>>>>>>>>>> branch
> >>>>>>>>>>> for jailhouse [1].
> >>>>>>>>>>>
> >>>>>>>>>>> I added some debug prints and I see the panic is caused when 
> >>>>>>>>>>> entry()
> >>>>>>>>>>> function is called (in enter_hypervisor). The entry function 
> >>>>>>>>>>> lands into
> >>>>>>>>>>> assembly code (entry.S). I dont have a JTAG to see which exact
> >>>>>>>>>>> instruction is causing this issue.
> >>>>>>>>>>
> >>>>>>>>>> So, already the first instruction in the loaded hypervisor binary 
> >>>>>>>>>> is not
> >>>>>>>>>> executable? That would explain why we see no hypervisor output at 
> >>>>>>>>>> all.
> >>>>>>>>>>
> >>>>>>>>> To clarify when the hypervisor is loaded the output will be via
> >>>>>>>>> debug_console specified in the root cell config?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Correct - if you reach entry() in setup.c.
> >>>>>>>>
> >>>>>>>>>> Was that memory region properly reserved from Linux (via DTB 
> >>>>>>>>>> carve-out
> >>>>>>>>>> e.g.)?
> >>>>>>>>>>
> >>>>>>>>> Yes I have the below memory reserved in my dts:
> >>>>>>>>>
> >>>>>>>>>     memory@48000000 {
> >>>>>>>>>         device_type = "memory";
> >>>>>>>>>         /* first 128MB is reserved for secure area. */
> >>>>>>>>>         reg = <0x0 0x48000000 0x0 0x78000000>;
> >>>>>>>>>     };
> >>>>>>>>>
> >>>>>>>>>     reserved-memory {
> >>>>>>>>>         #address-cells = <2>;
> >>>>>>>>>         #size-cells = <2>;
> >>>>>>>>>         ranges;
> >>>>>>>>>
> >>>>>>>>>         jh_inmate@a7f00000 {
> >>>>>>>>>             status = "okay";
> >>>>>>>>>             no-map;
> >>>>>>>>>             reg = <0x00 0xa7f00000 0x00 0xf000000>;
> >>>>>>>>>         };
> >>>>>>>>>
> >>>>>>>>>         jailhouse: jailhouse@b6f00000 {
> >>>>>>>>>             status = "okay";
> >>>>>>>>>             reg = <0x0 0xb6f00000 0x0 0x1000000>;
> >>>>>>>>>             no-map;
> >>>>>>>>>         };
> >>>>>>>>>     };
> >>>>>>>>>
> >>>>>>>>> Linux does report the hole in RAM:
> >>>>>>>>>
> >>>>>>>>> root@smarc-rzg2l:~# cat /proc/iomem  |  grep RAM
> >>>>>>>>> 48000000-a7efffff : System RAM
> >>>>>>>>> b7f00000-bfffffff : System RAM
> >>>>>>>>>
> >>>>>>>>> And below is my root cell config related to hypervisor memory:
> >>>>>>>>>
> >>>>>>>>>         .hypervisor_memory = {
> >>>>>>>>>             .phys_start = 0xb6f00000,
> >>>>>>>>>             .size =       0x1000000,
> >>>>>>>>>         },
> >>>>>>>>>
> >>>>>>>>> Is there anything obvious I have missed above?
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>> Nothing obvious to me so far.
> >>>>>>>>
> >>>>>>>> So, is this really the first instruction in
> >>>>>>>> hypervisor/arch/arm64/entry.S, arch_entry, that triggers the 
> >>>>>>>> undefinstr?
> >>>>>>>> Check the reported address by Linux against the disassembly of the
> >>>>>>>> hypervisor. You could also play with adding 'ret' as first 
> >>>>>>>> instruction,
> >>>>>>>> to check if that returns without a crash.
> >>>>>>>>
> >>>>>>> I did play around with ret, below is my observation:
> >>>>>>>
> >>>>>>> Up until line 98 [0] just before calling arm_dcaches_flush adding ret
> >>>>>>> returned without a crash. Adding a ret at line 99 [1] ie after
> >>>>>>> arm_dcaches_flush it never returned.
> >>>>>>>
> >>>>>>> So I continued with adding ret in  arm_dcaches_flush, I added ret as a
> >>>>>>> first statement in arm_dcaches_flush and was able to see the panic!
> >>>>>>
> >>>>>> Which Jailhouse revision are you building? Already disassembled
> >>>>>> hypervisor.o around arch_entry and arm_dcaches_flush? This is what I
> >>>>>> have here for next:
> >>>>>>
> >>>>> I'm using the next branch too.
> >>>>>
> >>>>>> 0000ffffc0203efc <arm_dcaches_flush>:
> >>>>>>     ffffc0203efc:       d53b0024        mrs     x4, ctr_el0
> >>>>>>     ffffc0203f00:       d3504c84        ubfx    x4, x4, #16, #4
> >>>>>>     ...
> >>>>>>
> >>>>>> 0000ffffc0204800 <arch_entry>:
> >>>>>>     ffffc0204800:       aa0003f0        mov     x16, x0
> >>>>>>     ffffc0204804:       aa1e03f1        mov     x17, x30
> >>>>>>     ...
> >>>>>>     ffffc0204844:       d2800042        mov     x2, #0x2               
> >>>>>>          // #2
> >>>>>>     ffffc0204848:       97fffdad        bl      ffffc0203efc 
> >>>>>> <arm_dcaches_flush>
> >>>>>>
> >>>>>> You could check if there is such a relative call in your case as well.
> >>>>> yes it does exist, below is the snippet:
> >>>>>
> >>>>> 0000ffffc0204000 <arch_entry>:
> >>>>>     ffffc0204000:    aa0003f0     mov    x16, x0
> >>>>>     ffffc0204004:    aa1e03f1     mov    x17, x30
> >>>>>     ffffc0204008:    10fdffc0     adr    x0, ffffc0200000 
> >>>>> <hypervisor_header>
> >>>>>     ffffc020400c:    f9402412     ldr    x18, [x0, #72]
> >>>>>     ffffc0204010:    5800fd0f     ldr    x15, ffffc0205fb0 
> >>>>> <sdei_handler+0x2c>
> >>>>>     ffffc0204014:    900000e1     adrp    x1, ffffc0220000 <__page_pool>
> >>>>>     ffffc0204018:    79406002     ldrh    w2, [x0, #48]
> >>>>>     ffffc020401c:    d2880003     mov    x3, #0x4000
> >>>>>  // #16384
> >>>>>     ffffc0204020:    9b030441     madd    x1, x2, x3, x1
> >>>>>     ffffc0204024:    f842c02e     ldur    x14, [x1, #44]
> >>>>>     ffffc0204028:    5800fc8d     ldr    x13, ffffc0205fb8 
> >>>>> <sdei_handler+0x34>
> >>>>>     ffffc020402c:    f840c02c     ldur    x12, [x1, #12]
> >>>>>     ffffc0204030:    cb0d018b     sub    x11, x12, x13
> >>>>>     ffffc0204034:    924051c1     and    x1, x14, #0x1fffff
> >>>>>     ffffc0204038:    8b0101ef     add    x15, x15, x1
> >>>>>     ffffc020403c:    f9001c0f     str    x15, [x0, #56]
> >>>>>     ffffc0204040:    f9400401     ldr    x1, [x0, #8]
> >>>>>     ffffc0204044:    d2800042     mov    x2, #0x2                       
> >>>>> // #2
> >>>>>     ffffc0204048:    97ffff6c     bl    ffffc0203df8 <arm_dcaches_flush>
> >>>>>     ffffc020404c:    5800fba1     ldr    x1, ffffc0205fc0 
> >>>>> <sdei_handler+0x3c>
> >>>>>     ffffc0204050:    8b0b0021     add    x1, x1, x11
> >>>>>     ffffc0204054:    d2800000     mov    x0, #0x0                       
> >>>>> // #0
> >>>>>     ffffc0204058:    f100025f     cmp    x18, #0x0
> >>>>>     ffffc020405c:    54000041     b.ne    ffffc0204064
> >>>>> <arch_entry+0x64>  // b.any
> >>>>>     ffffc0204060:    d2800020     mov    x0, #0x1                       
> >>>>> // #1
> >>>>>     ffffc0204064:    d4000002     hvc    #0x0
> >>>>>     ffffc0204068:    d4000002     hvc    #0x0
> >>>>>     ffffc020406c:    14000000     b    ffffc020406c <arch_entry+0x6c>
> >>>>>
> >>>>>> Then you could check, before jumping into arch_entry from the
> >>>>>> hypervisor, that the opcode at the actual arm_dcaches_flush address is
> >>>>>> as expected. But maybe already the building injects an issue here.
> >>>>>>
> >>>>> Any pointers on how I could do that?
> >>>>>
> >>>>
> >>>> arm_dcaches_flush is loaded at arch_entry (header->entry +
> >>>> hypervisor_mem) - 0x208. Read the u32 at that address and check if it is
> >>>> what is in your hypervisor.o (likely also d53b0024).
> >>>>
> >>>> If that is the case, not the jump but that "mrs     x4, ctr_el0" may
> >>>> actually be the problem. Check out hypervisor/arch/arm64/caches.S and
> >>>> see if that code, specifically dcache_line_size, causes trouble outside
> >>>> of Jailhouse as well, maybe as inline assembly in the driver module.
> >>>>
> >>>
> >>> With the below ret added, I get "JAILHOUSE_ENABLE: Success"
> >>>
> >>> diff --git a/hypervisor/arch/arm64/entry.S b/hypervisor/arch/arm64/entry.S
> >>> index a9cabf7f..4e98b292 100644
> >>> --- a/hypervisor/arch/arm64/entry.S
> >>> +++ b/hypervisor/arch/arm64/entry.S
> >>> @@ -96,6 +96,7 @@ arch_entry:
> >>>          */
> >>>         ldr     x1, [x0, #HEADER_CORE_SIZE]
> >>>         mov     x2, DCACHE_CLEAN_AND_INVALIDATE_ASM
> >>> +       ret
> >>>         bl      arm_dcaches_flush
> >>>
> >>>         /* install bootstrap_vectors */
> >>>
> >>> Now when I undo the above and do the below, Im seeing a panic:
> >>>
> >>> diff --git a/hypervisor/arch/arm64/caches.S 
> >>> b/hypervisor/arch/arm64/caches.S
> >>> index 39dad4af..ce29b8e8 100644
> >>> --- a/hypervisor/arch/arm64/caches.S
> >>> +++ b/hypervisor/arch/arm64/caches.S
> >>> @@ -40,6 +40,7 @@
> >>>   */
> >>>         .global arm_dcaches_flush
> >>>  arm_dcaches_flush:
> >>> +       ret
> >>>         dcache_line_size x3, x4
> >>>         add     x1, x0, x1
> >>>         sub     x4, x3, #1
> >>>
> >>> Issue is seen even without dcache_line_size being called. Does that
> >>> mean we are not landing in arm_dcaches_flush?
> >>
> >> Likely. I've never seen such an effect.
> >>
> >> If you look the reported fault address, when making it relative
> >> (subtract hypervisor_mem), is that arm_dcaches_flush (relative to
> >> arch_entry)?
> >>
> > Could you please elaborate on it more. I moved the cache.S code in
> > entry.S  file but still seeing the same issue.
>
>
> $ aarch64-linux-gnu-objdump -x hypervisor/hypervisor.o | \
>   grep "arch_entry\|arm_dcaches_flush"
> 0000ffffc0203efc g       .text  0000000000000000 arm_dcaches_flush
> 0000ffffc0204800 g       .text  0000000000000000 arch_entry
>
> -> delta of 0x904 here
>
> diff --git a/driver/main.c b/driver/main.c
> index 64e2b9a4..cb197d77 100644
> --- a/driver/main.c
> +++ b/driver/main.c
> @@ -246,6 +246,8 @@ static void enter_hypervisor(void *info)
>
>         entry = header->entry + (unsigned long) hypervisor_mem;
>
> +       printk("obcode @arm_dcaches_flush: %08x\n", *(u32 *)(entry - 0x904));
> +
>         if (cpu < header->max_cpus)
>                 /* either returns 0 or the same error code across all CPUs */
>                 err = entry(cpu);
>
>
> Untested, though.
>
Thanks for the pointer,

$aarch64-linux-gnu-objdump -x hypervisor/hypervisor.o |   grep
"arch_entry\|arm_dcaches_flush"
0000ffffc0203f64 g       .text    0000000000000000 arm_dcaches_flush
0000ffffc0204800 g       .text    0000000000000000 arch_entry

I get a difference of 0x89c, so I added the below code:
diff --git a/driver/main.c b/driver/main.c
index 64e2b9a4..8684816a 100644
--- a/driver/main.c
+++ b/driver/main.c
@@ -246,6 +246,8 @@ static void enter_hypervisor(void *info)

        entry = header->entry + (unsigned long) hypervisor_mem;

+       printk("obcode @arm_dcaches_flush: %08x\n", *(u32 *)(entry - 0x89c));

This results in:

[   18.077167] jailhouse: loading out-of-tree module taints kernel.
Reading configuration set:
  Root cell:     Renesas RZ/V2L SMARC (renesas-r9a07g054l2.cell)
Overlapping memory regions inside cell: None
Overlapping memory regions with hypervisor: None
Missing resource interceptions for architecture arm64: None
[   19.035199] obcode @arm_dcaches_flush: d53b0024
[   19.035203] obcode @arm_dcaches_flush: d53b0024
[   19.035233] ------------[ cut here ]------------
[   19.039748] ------------[ cut here ]------------
[   19.044245] kernel BUG at arch/arm64/kernel/traps.c:407!
[   19.048835] kernel BUG at arch/arm64/kernel/traps.c:407!
[   19.053427] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[   19.069449] Modules linked in: jailhouse(O)
[   19.073625] CPU: 1 PID: 143 Comm: rngd Tainted: G           O
5.10.112-cip6+ #13
[   19.081419] Hardware name: Renesas SMARC EVK based on r9a07g054l2 (DT)
[   19.087918] pstate: 00400085 (nzcv daIf +PAN -UAO -TCO BTYPE=--)
[   19.093908] pc : do_undefinstr+0x26c/0x320
[   19.097985] lr : do_undefinstr+0x1cc/0x320
[   19.102060] sp : ffff8000118d3cf0
[   19.105357] x29: ffff8000118d3cf0 x28: ffff00000ad5b800
[   19.110648] x27: 0000000000000000 x26: ffff8000118d4000
[   19.115938] x25: ffff8000118d0000 x24: 0000000000000000
[   19.121228] x23: 0000000020400085 x22: ffff800013004864
[   19.126521] x21: ffff8000118d3ed0 x20: ffff8000118d3d80
[   19.131812] x19: ffff800011107000 x18: 0000000000000001
[   19.137103] x17: ffff800008c11828 x16: 0000000000000001
[   19.142393] x15: ffff800013004864 x14: 000000001004b800
[   19.147684] x13: 0000ffffc0200000 x12: 00000000b6f00000
[   19.152974] x11: ffff0000f6d00000 x10: ffff8000118d3ed0
[   19.158265] x9 : ffff8000118d3ed0 x8 : 3062333564203a68
[   19.163556] x7 : 0000000000000000 x6 : ffff8000118d3d48
[   19.168847] x5 : 00000000d5300000 x4 : ffff800011635410
[   19.174137] x3 : 00000000d4000000 x2 : 0000000000000000
[   19.179428] x1 : ffff00000ad5b800 x0 : 0000000020400085
[   19.184720] Call trace:
[   19.187157]  do_undefinstr+0x26c/0x320
[   19.190894]  el1_undef+0x30/0x50
[   19.194108]  el1_sync_handler+0xc4/0xe0
[   19.197927]  el1_sync+0x84/0x140
[   19.201141]  0xffff800013004864
[   19.204272]  flush_smp_call_function_queue+0xf8/0x268
[   19.209302]  generic_smp_call_function_single_interrupt+0x14/0x20
[   19.215370]  ipi_handler+0x8c/0x158
[   19.218846]  handle_percpu_devid_fasteoi_ipi+0x74/0x88
[   19.223963]  generic_handle_irq+0x30/0x48
[   19.227957]  __handle_domain_irq+0x60/0xb8
[   19.232037]  gic_handle_irq+0x58/0x128
[   19.235769]  el0_irq_naked+0x4c/0x54
[   19.239332] Code: f94013b5 17fffff1 a9025bb5 f9001bb7 (d4210000)
[   19.245407] ---[ end trace e90110789d0a42e7 ]---
[   19.250004] Kernel panic - not syncing: Oops - BUG: Fatal exception
in interrupt
[   19.257368] SMP: stopping secondary CPUs
[   20.345055] SMP: failed to stop secondary CPUs 0-1
[   20.349824] Kernel Offset: disabled
[   20.353295] CPU features: 0x0040026,2a00a238
[   20.357545] Memory Limit: none
[   20.360587] ---[ end Kernel panic - not syncing: Oops - BUG: Fatal
exception in interrupt ]---

When compared to the objdump of cache.o it does match the value d53b0024,

$ aarch64-linux-gnu-objdump -D hypervisor/arch/arm64/caches.o

hypervisor/arch/arm64/caches.o:     file format elf64-littleaarch64


Disassembly of section .text:

0000000000000000 <arm_dcaches_flush>:
   0:    d53b0024     mrs    x4, ctr_el0
   4:    d3504c84     ubfx    x4, x4, #16, #4
   8:    d2800083     mov    x3, #0x4                       // #4
   c:    9ac42063     lsl    x3, x3, x4
  10:    8b010001     add    x1, x0, x1
  14:    d1000464     sub    x4, x3, #0x1
  18:    8a240000     bic    x0, x0, x4
  1c:    f100005f     cmp    x2, #0x0
  20:    54000061     b.ne    2c <arm_dcaches_flush+0x2c>  // b.any
  24:    d50b7a20     dc    cvac, x0
  28:    14000006     b    40 <arm_dcaches_flush+0x40>
  2c:    f100045f     cmp    x2, #0x1
  30:    54000061     b.ne    3c <arm_dcaches_flush+0x3c>  // b.any
  34:    d5087620     dc    ivac, x0
  38:    14000002     b    40 <arm_dcaches_flush+0x40>
  3c:    d50b7e20     dc    civac, x0
  40:    8b030000     add    x0, x0, x3
  44:    eb01001f     cmp    x0, x1
  48:    54fffea3     b.cc    1c <arm_dcaches_flush+0x1c>  // b.lo, b.ul, b.last
  4c:    d5033f9f     dsb    sy
  50:    d65f03c0     ret

So no problem above, any more pointers where I can give it a shot?

> >
> > In some of the reference platforms LPAE is enabled in the u-boot. Is
> > that a strict requirement? Also in the requirement section it's
>
> LPAE is 32-bit arm, your are on 64-bit, no?
>
My bad, yes I'm on 64-bit.

> > mentioned "Linux is started in HYP mode" does that mean Before loading
> > the jailhouse the Linux should be running on EL2? Also to be sure, do
> > we need any special configs enabled in TF-A at all?
>
> You need Linux to start in HYP mode so that Linux installs a stub that
> KVM (when not using Jailhouse) and Jailhouse can use to take over the
> hypervisor role. But your init crashes before arch_entry is able to
> issue the related hvc instructions.
>
Ah right, I misunderstood there.

> TF-A needs to be there in order to have PSCI. Special settings are
> usually only related to SDEI, which is optional.
>
OK, so TF-A without any changes should work.

> >
> > Fyi, I am using arm64_defconfig_5.10 [0] (+ additional configs to
> > enable my platform) to build the Linux kernel, should these configs be
> > sufficient for Jailhouse?
>
> Yes, at least for the various targets we cover with this so far.
>
Great!

Cheers,
Prabhakar

-- 
You received this message because you are subscribed to the Google Groups 
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/jailhouse-dev/CA%2BV-a8umio3A9LsmdwB-x3W%2BJH1wOiwXJkH-FXdFBDLvbH%3DzUw%40mail.gmail.com.

Reply via email to