On Wed, Jul 28, 2021 at 2:10 PM Huang Shihua <[email protected]> wrote:
>
> On Mon, Jul 26, 2021 at 8:08 PM Jan Kiszka <[email protected]> wrote:
>
> > On 26.07.21 19:14, Huang Shihua wrote:
> > >
> > > On Wednesday, 21 July 2021 at 17:50:53 UTC+2 [email protected] wrote:
> > >
> > > > On 13.07.21 18:09, Huang Shihua wrote:
> > > > > HI,
> > > > >
> > > > > Currently, I'm trying to run the ivshmem-demo to establish
> > > > > communication between Linux root cell and one non-root cell.
> > > > > Configuration files are attached.
> > > > >
> > > > > Two cases were tested:
> > > > >
> > > > > 1. Let the non-root cell load the ivshmem-demo and then target at
> > > > >    itself (target=1). _All interrupts can be sent and received
> > > > >    correctly_.
> > > > > 2. Let the root cell and the non-root cell send interrupts to each
> > > > >    other. I.e., root cell runs ./tools/demos/ivshmem-demo -t 1,
> > > > >    while the non-root cell load
> > > > >    inmates/demos/x86/ivshmem-demo.bin -s "target=0" -a 0x1000
> > > > >    and then run. The result turned out to be,
> > > > >    * the non-root cell got the interrupts from the root cell,
> > > > >    * _while the root cell did not receive any interrupt._
> > > > >
> > > > > As Jan mentioned in
> > > > > https://groups.google.com/g/jailhouse-dev/c/GRCWFzNaHX8/m/ht8z51BOCgAJ,
> > > > > tuning the iommu index should do the trick.
> > > > > However, unfortunately, it did not work for me :c
> > > > >
> > > > > There are 8 iommu units on the hardware, I tuned the iommu index
> > > > > in the
> > > >
> > > > Wow, 8 units...
> > > >
> > > > > root cell configuration from 0 to 7. The same behavior, no
> > > > > interrupts were received by the root cell, remains when tuning
> > > > > the index from 0 to 6. When the iommu is set to 7, the kernel
> > > > > crashed immediately when the demo was started on the non-root
> > > > > cell.
> > > > >
> > > > > Any idea regarding why the root cell always failed to receive
> > > > > interrupts?
> > > >
> > > > This may require in-detail debugging.
> > > > For that, you would have to instrument the hypervisor along its
> > > > virtual IRQ injection path. That starts in
> > > > ivshmem_trigger_interrupt() (hypervisor/ivshmem.c). The sending
> > > > side will call it on writing the doorbell registers. Check along
> > > > this call path if conditions to actually send the IRQ are not met.
> > > >
> > > > If all are met, the hypervisor sends an IPI to a target cell CPU
> > > > (will be directly delivered to the guest) that should cause the
> > > > normal IRQ processing there. But usually, we do not get so far in
> > > > such cases.
> > > >
> > > > Another function of interest here is arch_ivshmem_update_msix()
> > > > when called for the root cell while it defines where ivshmem IRQs
> > > > should go to. Possibly, Jailhouse decides that the programming
> > > > Linux issued is not valid and therefore leaves the irq_cache that
> > > > arch_ivshmem_trigger_interrupt() uses invalid. You can also check
> > > > that via instrumentations (printk).
> > >
> > > Indeed, when .iommu is assigned as 0,1,..6, irq_cache is invalid. I
> > > suspect the reason is that their corresponding VT-d interrupt
> > > remapping table entries are not for ivshmem devices, i.e.,
> > > unmatched device ID.
> > > When .iommu is tuned to 7, irq_cache becomes valid.
> >
> > OK, then we know what needs to be set. I will have to check eventually
> > if we can read out that information also from sysfs so that this
> > guessing can end.
> >
> > > (BTW, as I mentioned before, the kernel crashed immediately when the
> > > demo was started on the non-root cell. _One missing detail here is_,
> > > on the root-cell side, ./tools/demos/ivshmem-demo is running/has
> > > run, i.e., init_control has been set to 1. If
> > > ./tools/demos/ivshmem-demo has not been run on the root cell yet,
> > > then starting the demo on the non-root cell will not kill the
> > > kernel.)
> >
> > Now we need to understand the crash. The root cell kernel oopses, right?
> > Any logs from that?
>
> Activating hypervisor
> CAT: Using COS 0 with bitmask 000007ff for cell ivshmem-demo
> Adding virtual PCI device 00:0e.0 to cell "ivshmem-demo"
> Shared memory connection established, peer cells:
>  "RootCell"
> Created cell "ivshmem-demo"
> Page pool usage after cell creation: mem 938/3534, remap 65603/131072
> Cell "ivshmem-demo" can be loaded
> CPU 1 received SIPI, vector 100
> Started cell "ivshmem-demo"
> IVSHMEM: Found device at 00:0e.0
> IVSHMEM: bar0 is at 0x00000000ff000000
> IVSHMEM: bar1 is at 0x00000000ff001000
> IVSHMEM: ID is 1
> IVSHMEM: max. peers is 3
> IVSHMEM: state table is at 0x000000003f0f0000
> IVSHMEM: R/W section is at 0x000000003f0f1000
> IVSHMEM: input sections start at 0x000000003f0fa000
> IVSHMEM: output section is at 0x000000003f0fc000
> IVSHMEM: initialized device
> state[0] = 0
> state[1] = 2
> state[2] = 0
> rw[0] = -1347440721
> rw[1] = 0
> rw[2] = -1347440721
> in@0x0000 = -1347440721
> in@0x2000 = 0
> in@0x4000 = -1347440721
>
> IVSHMEM: sending IRQ 2 to peer 2
>
> IVSHMEM: sending IRQ 2 to peer 2
> <---------- ./tools/demos/ivshmem-demo -t 1 (root cell)
> IVSHMEM: got interrupt 0 (#1)
> state[0] = 0
> state[1] = 2
> state[2] = 3
> rw[0] = -1347440721
> rw[1] = 0
> rw[2] = 0
> in@0x0000 = -1347440721
> in@0x2000 = 0
> in@0x4000 = 0
>
> IVSHMEM: sending IRQ 2 to peer 2
> FATAL: Unhandled VM-Exit, reason 26
> qualification 0
> vectoring info: 0 interrupt info: 0
> RIP: 0xffffffff8d05f6ae RSP: 0xffffafa9c0003fc0 FLAGS: 2
> RAX: 0x00000000007626f0 RBX: 0x0000000000000000 RCX: 0x000000007ffefbff
> RDX: 0x00000000bfebfbff RSI: 0xffffafa9c0003fc8 RDI: 0xffffafa9c0003fc4
> CS: 10 BASE: 0x0000000000000000 AR-BYTES: a09b EFER.LMA 1
> CR0: 0x0000000080050033 CR3: 0x0000001fbd80a004 CR4: 0x00000000007626f0
> EFER: 0x0000000000000d01
> Parking CPU 0 (Cell: "RootCell")
>
> IVSHMEM: sending IRQ 2 to peer 2
> Ignoring NMI IPI to CPU 0
> Ignoring NMI IPI to CPU 2
> Ignoring NMI IPI to CPU 3
> Ignoring NMI IPI to CPU 5
> Ignoring NMI IPI to CPU 6
> Ignoring NMI IPI to CPU 7
> Ignoring NMI IPI to CPU 8
> Ignoring NMI IPI to CPU 9
> Ignoring NMI IPI to CPU 10
> Ignoring NMI IPI to CPU 11
> Ignoring NMI IPI to CPU 12
> Ignoring NMI IPI to CPU 13
> Ignoring NMI IPI to CPU 14
> Ignoring NMI IPI to CPU 15
>
> IVSHMEM: sending IRQ 2 to peer 2
>
> > And what do you mean with init_control?
>
> oops, typo, should be int_control...
> the int_control of struct ivshm_regs in ivshmem-demo.c
> struct ivshm_regs {
>     uint32_t id;
>     uint32_t max_peer;
>     uint32_t int_control;
>     .....
> }
> *so when the root cell writes 1 to regs->int_control via MMIO while the
> non-root cell has been running, then the kernel crashes.*
>
> > > To avoid the kernel crashing situation, I only ran the demo on the
> > > non-root cell. With .iommu being set validly, I would expect at
> > > least to see the interrupt count increase when running
> > > grep ivshmem /proc/interrupts.
> > > But nope, _still no interrupts received on the root cell_.
> >
> > If there is no driver registered on the root side or not opened (by the
> > demo app), then the interrupt reception is disabled. We need to debug
> > the "hot" case.
>
> Right, after diving into the source code, I did see that when
> ive->int_ctrl_reg = 0, no interrupt will be triggered, i.e.,
> arch_ivshmem_trigger_interrupt is skipped.
>
> I have a question regarding the code below.
>
> static void ivshmem_trigger_interrupt(struct ivshmem_endpoint *ive,
>                                       unsigned int vector)
> {
>     /*
>      * Hold the IRQ lock while sending the interrupt so that ivshmem_exit
>      * and ivshmem_register_mmio can synchronize on the completion of the
>      * delivery.
>      */
>     spin_lock(&ive->irq_lock);
>
>     if (ive->int_ctrl_reg & IVSHMEM_INT_ENABLE) {
>         if (ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] &
>             IVSHMEM_CFG_ONESHOT_INT)
>             ive->int_ctrl_reg = 0;
>
>         arch_ivshmem_trigger_interrupt(ive, vector);
>     }
>
>     spin_unlock(&ive->irq_lock);
> }
>
> Q1: What does IVSHMEM_CFG_ONESHOT_INT mean?
> Q2: What does meeting this condition mean,
> ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] & IVSHMEM_CFG_ONESHOT_INT?
> Q3: Why trigger_interrupt when ive->int_ctrl_reg = 0?
> Q4: I tried to add "else" a line above arch_ivshmem_trigger_interrupt,
> i.e., arch_ivshmem_trigger_interrupt is skipped when
>

pff... by mistake the incomplete email was sent. Anyway, let me continue.

Q4: I tried to add "else" a line above arch_ivshmem_trigger_interrupt,
i.e., arch_ivshmem_trigger_interrupt was skipped when
./tools/demos/ivshmem-demo -t 1 was executed on the root cell. Thus there
was no kernel crash, the non-root cell could later receive interrupt #0
from the root cell, and :) yeah, the root cell still received nothing.

> > Jan
> >
> > --
> > Siemens AG, T RDA IOT
> > Corporate Competence Center Embedded Linux

--
You received this message because you are subscribed to the Google Groups
"Jailhouse" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/jailhouse-dev/CAPKBGc%3D%3DELN%2B6Ws%3D%3DmH%3DtCc%3DNqrJfSxvrEF3ey56VwWFsmS_3w%40mail.gmail.com.
