On Mon, Jul 26, 2021 at 8:08 PM Jan Kiszka <[email protected]> wrote:
> On 26.07.21 19:14, Huang Shihua wrote:
> > On Wednesday, 21 July 2021 at 17:50:53 UTC+2 [email protected] wrote:
> > > On 13.07.21 18:09, Huang Shihua wrote:
> > > > Hi,
> > > >
> > > > Currently, I'm trying to run the ivshmem-demo to establish
> > > > communication between the Linux root cell and one non-root cell.
> > > > Configuration files are attached.
> > > >
> > > > Two cases were tested:
> > > >
> > > > 1. Let the non-root cell load the ivshmem-demo and then target
> > > >    itself (target=1). _All interrupts can be sent and received
> > > >    correctly_.
> > > > 2. Let the root cell and the non-root cell send interrupts to each
> > > >    other. I.e., the root cell runs ./tools/demos/ivshmem-demo -t 1,
> > > >    while the non-root cell loads inmates/demos/x86/ivshmem-demo.bin
> > > >    -s "target=0" -a 0x1000 and then runs. The result turned out
> > > >    to be:
> > > >    * the non-root cell got the interrupts from the root cell,
> > > >    * _while the root cell did not receive any interrupt._
> > > >
> > > > As Jan mentioned in
> > > > https://groups.google.com/g/jailhouse-dev/c/GRCWFzNaHX8/m/ht8z51BOCgAJ,
> > > > tuning the iommu index should do the trick.
> > > > However, unfortunately, it did not work for me :c
> > > >
> > > > There are 8 iommu units on the hardware, I tuned the iommu index
> > > > in the
> > >
> > > Wow, 8 units...
> > >
> > > > root cell configuration from 0 to 7. The same behavior, no
> > > > interrupts were received by the root cell, remains when tuning
> > > > the index from 0 to 6. When the iommu is set to 7, the kernel
> > > > crashed immediately when the demo was started on the non-root
> > > > cell.
> > > >
> > > > Any idea regarding why the root cell always failed to receive
> > > > interrupts?
> > >
> > > This may require in-detail debugging. For that, you would have to
> > > instrument the hypervisor along its virtual IRQ injection path. That
> > > starts in ivshmem_trigger_interrupt() (hypervisor/ivshmem.c). The
> > > sending side will call it on writing the doorbell registers. Check
> > > along this call path if conditions to actually send the IRQ are not
> > > met.
> > >
> > > If all are met, the hypervisor sends an IPI to a target cell CPU
> > > (will be directly delivered to the guest) that should cause the
> > > normal IRQ processing there. But usually, we do not get so far in
> > > such cases.
> > >
> > > Another function of interest here is arch_ivshmem_update_msix() when
> > > called for the root cell while it defines where ivshmem IRQs should
> > > go to. Possibly, Jailhouse decides that the programming Linux issued
> > > is not valid and therefore leaves the irq_cache that
> > > arch_ivshmem_trigger_interrupt() uses invalid. You can also check
> > > that via instrumentations (printk).
> >
> > Indeed, when .iommu is assigned as 0,1,..6, irq_cache is invalid. I
> > suspect the reason is that their corresponding VT-d interrupt
> > remapping table entries are not for ivshmem devices, i.e., unmatched
> > device ID.
> > When .iommu is tuned to 7, irq_cache becomes valid.
>
> OK, then we know what needs to be set. I will have to check eventually
> if we can read out that information also from sysfs so that this
> guessing can end.
>
> > (BTW, as I mentioned before, the kernel crashed immediately when the
> > demo was started on the non-root cell. _One missing detail here is_,
> > on the root-cell side, ./tools/demos/ivshmem-demo is running/has run,
> > i.e., init_control has been set to 1. If ./tools/demos/ivshmem-demo
> > has not been run on the root cell yet, then starting the demo on the
> > non-root cell will not kill the kernel.)
>
> Now we need to understand the crash. The root cell kernel oopses, right?
> Any logs from that?

Activating hypervisor
CAT: Using COS 0 with bitmask 000007ff for cell ivshmem-demo
Adding virtual PCI device 00:0e.0 to cell "ivshmem-demo"
Shared memory connection established, peer cells: "RootCell"
Created cell "ivshmem-demo"
Page pool usage after cell creation: mem 938/3534, remap 65603/131072
Cell "ivshmem-demo" can be loaded
CPU 1 received SIPI, vector 100
Started cell "ivshmem-demo"
IVSHMEM: Found device at 00:0e.0
IVSHMEM: bar0 is at 0x00000000ff000000
IVSHMEM: bar1 is at 0x00000000ff001000
IVSHMEM: ID is 1
IVSHMEM: max. peers is 3
IVSHMEM: state table is at 0x000000003f0f0000
IVSHMEM: R/W section is at 0x000000003f0f1000
IVSHMEM: input sections start at 0x000000003f0fa000
IVSHMEM: output section is at 0x000000003f0fc000
IVSHMEM: initialized device
state[0] = 0
state[1] = 2
state[2] = 0
rw[0] = -1347440721
rw[1] = 0
rw[2] = -1347440721
in@0x0000 = -1347440721
in@0x2000 = 0
in@0x4000 = -1347440721
IVSHMEM: sending IRQ 2 to peer 2
IVSHMEM: sending IRQ 2 to peer 2
<---------- ./tools/demos/ivshmem-demo -t 1 (root cell)
IVSHMEM: got interrupt 0 (#1)
state[0] = 0
state[1] = 2
state[2] = 3
rw[0] = -1347440721
rw[1] = 0
rw[2] = 0
in@0x0000 = -1347440721
in@0x2000 = 0
in@0x4000 = 0
IVSHMEM: sending IRQ 2 to peer 2
FATAL: Unhandled VM-Exit, reason 26
qualification 0
vectoring info: 0 interrupt info: 0
RIP: 0xffffffff8d05f6ae RSP: 0xffffafa9c0003fc0 FLAGS: 2
RAX: 0x00000000007626f0 RBX: 0x0000000000000000 RCX: 0x000000007ffefbff
RDX: 0x00000000bfebfbff RSI: 0xffffafa9c0003fc8 RDI: 0xffffafa9c0003fc4
CS: 10 BASE: 0x0000000000000000 AR-BYTES: a09b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000001fbd80a004 CR4: 0x00000000007626f0
EFER: 0x0000000000000d01
Parking CPU 0 (Cell: "RootCell")
IVSHMEM: sending IRQ 2 to peer 2
Ignoring NMI IPI to CPU 0
Ignoring NMI IPI to CPU 2
Ignoring NMI IPI to CPU 3
Ignoring NMI IPI to CPU 5
Ignoring NMI IPI to CPU 6
Ignoring NMI IPI to CPU 7
Ignoring NMI IPI to CPU 8
Ignoring NMI IPI to CPU 9
Ignoring NMI IPI to CPU 10
Ignoring NMI IPI to CPU 11
Ignoring NMI IPI to CPU 12
Ignoring NMI IPI to CPU 13
Ignoring NMI IPI to CPU 14
Ignoring NMI IPI to CPU 15
IVSHMEM: sending IRQ 2 to peer 2

> And what do you mean with init_control?

Oops, typo, it should be int_control: the int_control field of
struct ivshm_regs in ivshmem-demo.c

struct ivshm_regs {
        uint32_t id;
        uint32_t max_peer;
        uint32_t int_control;
        .....
}

*So when the root cell does an mmio_write of 1 to regs->int_control while
the non-root cell is already running, the kernel crashes.*

> > To avoid the kernel crashing situation, I only ran the demo on the
> > non-root cell. With .iommu being set validly, I would expect to at
> > least see the interrupt count increase when I grep ivshmem
> > /proc/interrupts.
> > But nope, _still no interrupts received on the root cell_.
>
> If there is no driver registered on the root side or not opened (by the
> demo app), then the interrupt reception is disabled. We need to debug
> the "hot" case.

Right, after diving into the source code, I did see that: when
ive->int_ctrl_reg=0, no interrupt will be triggered, i.e.,
arch_ivshmem_trigger_interrupt is skipped.

I have a question regarding the code below.

static void ivshmem_trigger_interrupt(struct ivshmem_endpoint *ive,
                                      unsigned int vector)
{
        /*
         * Hold the IRQ lock while sending the interrupt so that ivshmem_exit
         * and ivshmem_register_mmio can synchronize on the completion of the
         * delivery.
         */
        spin_lock(&ive->irq_lock);

        if (ive->int_ctrl_reg & IVSHMEM_INT_ENABLE) {
                if (ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] &
                    IVSHMEM_CFG_ONESHOT_INT)
                        ive->int_ctrl_reg = 0;
                arch_ivshmem_trigger_interrupt(ive, vector);
        }

        spin_unlock(&ive->irq_lock);
}

Q1: What does IVSHMEM_CFG_ONESHOT_INT mean?
Q2: What does meeting this condition mean,
    ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] & IVSHMEM_CFG_ONESHOT_INT?
Q3: Why trigger_interrupt when ive->int_ctrl_reg = 0?
Q4: I tried to add "else" a line above arch_ivshmem_trigger_interrupt,
    i.e., arch_ivshmem_trigger_interrupt is skipped when

>
> Jan
>
> --
> Siemens AG, T RDA IOT
> Corporate Competence Center Embedded Linux
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jailhouse" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jailhouse-dev/2d2c72b6-cae0-e210-8db2-630b33180335%40siemens.com
> .
>

--
You received this message because you are subscribed to the Google Groups
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/jailhouse-dev/CAPKBGcn%3Dm5f_3RGzhZ%2B%3DBF9_-v-SAN8y%3DxOCk5Zf8RgEm7Jz_Q%40mail.gmail.com.
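
A note on Q3, offered as a reading of the function quoted above and not as
an answer confirmed in this thread: the enable bit is tested first, and only
then does the one-shot branch clear it. So the interrupt that gets delivered
was still enabled at the moment of the check; clearing the register only
affects later doorbell writes, which are dropped until the receiver sets
int_control again. The stand-alone C model below mirrors that control flow.
Everything in it is hypothetical scaffolding for illustration (struct
model_endpoint, model_arch_trigger(), the oneshot flag standing in for the
cspace[] test, and the assumed bit value 0x1); it omits the spinlock and is
not the Jailhouse implementation.

```c
#include <stdint.h>

/* Assumed bit value, for illustration only. */
#define MODEL_INT_ENABLE 0x1u

/* Hypothetical, simplified stand-in for struct ivshmem_endpoint. */
struct model_endpoint {
        uint32_t int_ctrl_reg;   /* written by the receiving guest */
        int oneshot;             /* stands in for the ONESHOT_INT test */
        unsigned int delivered;  /* counts interrupts actually injected */
};

/* Stub for the arch-specific injection: it only counts deliveries. */
static void model_arch_trigger(struct model_endpoint *ep, unsigned int vector)
{
        (void)vector;
        ep->delivered++;
}

/* Same branch structure as the quoted function, without locking. */
static void model_trigger_interrupt(struct model_endpoint *ep,
                                    unsigned int vector)
{
        if (ep->int_ctrl_reg & MODEL_INT_ENABLE) {
                /* One-shot: disarm first, then deliver this one interrupt. */
                if (ep->oneshot)
                        ep->int_ctrl_reg = 0;
                model_arch_trigger(ep, vector);
        }
}

/* Ring the doorbell n times and report how many interrupts got through. */
static unsigned int model_ring(struct model_endpoint *ep, unsigned int n)
{
        unsigned int before = ep->delivered;

        while (n--)
                model_trigger_interrupt(ep, 0);
        return ep->delivered - before;
}
```

In this model, with oneshot set and int_ctrl_reg initially 1, two doorbell
writes deliver a single interrupt, and the next one only gets through after
int_ctrl_reg is written to 1 again. With int_ctrl_reg left at 0 (no driver
or demo app opened on the receiving side), nothing is delivered at all.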
