On Mon, Jul 26, 2021 at 8:08 PM Jan Kiszka <[email protected]> wrote:

> On 26.07.21 19:14, Huang Shihua wrote:
> >
> >
> > On Wednesday, 21 July 2021 at 17:50:53 UTC+2 [email protected]
> wrote:
> >
> >     On 13.07.21 18:09, Huang Shihua wrote:
> >     > Hi,
> >     >
> >     > Currently, I'm trying to run the ivshmem-demo to establish
> >     communication
> >     > between Linux root cell and one non-root cell. Configuration files
> >     are
> >     > attached.
> >     >
> >     > Two cases were tested:
> >     >
> >     > 1. Let the non-root cell load the ivshmem-demo and then target at
> >     > itself (target=1). _All interrupts can be sent and received
> >     correctly_.
> >     > 2. Let the root cell and the non-root cell send interrupts to each
> >     > other. I.e., the root cell runs ./tools/demos/ivshmem-demo -t 1,
> >     > while the non-root cell loads inmates/demos/x86/ivshmem-demo.bin
> >     > -s "target=0" -a 0x1000 and then runs. The result turned out to be:
> >     > * the non-root cell got the interrupts from the root cell,
> >     > * _while the root cell did not receive any interrupt._
> >     >
> >     > As Jan mentioned in
> >     > https://groups.google.com/g/jailhouse-dev/c/GRCWFzNaHX8/m/ht8z51BOCgAJ,
> >     > tuning the iommu index should do the trick.
> >     > However, unfortunately, it did not work for me :c
> >     >
> >     > There are 8 iommu units on the hardware, I tuned the iommu index
> >     in the
> >
> >     Wow, 8 units...
> >
> >     > root cell configuration from 0 to 7. The same behavior, no
> >     > interrupts were received by the root cell, remains when tuning the
> >     > index from 0 to 6. When the iommu is set to 7, the kernel crashed
> >     > immediately when the demo was started on the non-root cell.
> >     >
> >     > Any idea regarding why the root cell always failed to receive
> >     interrupts?
> >
> >     This may require in-detail debugging. For that, you would have to
> >     instrument the hypervisor along its virtual IRQ injection path. That
> >     starts in ivshmem_trigger_interrupt() (hypervisor/ivshmem.c). The
> >     sending side will call it on writing the doorbell registers. Check
> >     along
> >     this call path if conditions to actually send the IRQ are not met.
> >
> >     If all are met, the hypervisor sends an IPI to a target cell CPU
> >     (it will be directly delivered to the guest) that should cause the
> >     normal IRQ processing there. But usually, we do not get so far in
> >     such cases.
> >
> >     Another function of interest here is arch_ivshmem_update_msix() when
> >     called for the root cell, as it defines where ivshmem IRQs should go
> >     to. Possibly, Jailhouse decides that the programming Linux issued is
> >     not valid and therefore leaves the irq_cache that
> >     arch_ivshmem_trigger_interrupt() uses invalid. You can also check
> >     that via instrumentation (printk).
> >
> >
> > Indeed, when .iommu is assigned as 0, 1, ..., 6, irq_cache is invalid. I
> > suspect the reason is that their corresponding VT-d interrupt remapping
> > table entries are not for ivshmem devices, i.e., unmatched device ID.
> > When .iommu is tuned to 7, irq_cache becomes valid.
> >
>
> OK, then we know what needs to be set. I will have to check eventually
> if we can read out that information also from sysfs so that this
> guessing can end.
>
> > (BTW, as I mentioned before, the kernel crashed immediately when the
> > demo was started on the non-root cell. _One missing detail here is_, on
> > the root-cell side, ./tools/demos/ivshmem-demo is running/has run, i.e.,
> > init_control has been set to 1. If ./tools/demos/ivshmem-demo has not
> > been run on the root cell yet, then starting the demo on the non-root
> > cell will not kill the kernel.)
>
> Now we need to understand the crash. The root cell kernel oopses, right?
> Any logs from that?
>

Activating hypervisor
CAT: Using COS 0 with bitmask 000007ff for cell ivshmem-demo
Adding virtual PCI device 00:0e.0 to cell "ivshmem-demo"
Shared memory connection established, peer cells:
 "RootCell"
Created cell "ivshmem-demo"
Page pool usage after cell creation: mem 938/3534, remap 65603/131072
Cell "ivshmem-demo" can be loaded
CPU 1 received SIPI, vector 100
Started cell "ivshmem-demo"
IVSHMEM: Found device at 00:0e.0
IVSHMEM: bar0 is at 0x00000000ff000000
IVSHMEM: bar1 is at 0x00000000ff001000
IVSHMEM: ID is 1
IVSHMEM: max. peers is 3
IVSHMEM: state table is at 0x000000003f0f0000
IVSHMEM: R/W section is at 0x000000003f0f1000
IVSHMEM: input sections start at 0x000000003f0fa000
IVSHMEM: output section is at 0x000000003f0fc000
IVSHMEM: initialized device
state[0] = 0
state[1] = 2
state[2] = 0
rw[0] = -1347440721
rw[1] = 0
rw[2] = -1347440721
in@0x0000 = -1347440721
in@0x2000 = 0
in@0x4000 = -1347440721

IVSHMEM: sending IRQ 2 to peer 2

IVSHMEM: sending IRQ 2 to peer 2
<---------- ./tools/demos/ivshmem-demo -t 1 (root cell)
IVSHMEM: got interrupt 0 (#1)
state[0] = 0
state[1] = 2
state[2] = 3
rw[0] = -1347440721
rw[1] = 0
rw[2] = 0
in@0x0000 = -1347440721
in@0x2000 = 0
in@0x4000 = 0

IVSHMEM: sending IRQ 2 to peer 2
FATAL: Unhandled VM-Exit, reason 26
qualification 0
vectoring info: 0 interrupt info: 0
RIP: 0xffffffff8d05f6ae RSP: 0xffffafa9c0003fc0 FLAGS: 2
RAX: 0x00000000007626f0 RBX: 0x0000000000000000 RCX: 0x000000007ffefbff
RDX: 0x00000000bfebfbff RSI: 0xffffafa9c0003fc8 RDI: 0xffffafa9c0003fc4
CS: 10 BASE: 0x0000000000000000 AR-BYTES: a09b EFER.LMA 1
CR0: 0x0000000080050033 CR3: 0x0000001fbd80a004 CR4: 0x00000000007626f0
EFER: 0x0000000000000d01
Parking CPU 0 (Cell: "RootCell")

IVSHMEM: sending IRQ 2 to peer 2
Ignoring NMI IPI to CPU 0
Ignoring NMI IPI to CPU 2
Ignoring NMI IPI to CPU 3
Ignoring NMI IPI to CPU 5
Ignoring NMI IPI to CPU 6
Ignoring NMI IPI to CPU 7
Ignoring NMI IPI to CPU 8
Ignoring NMI IPI to CPU 9
Ignoring NMI IPI to CPU 10
Ignoring NMI IPI to CPU 11
Ignoring NMI IPI to CPU 12
Ignoring NMI IPI to CPU 13
Ignoring NMI IPI to CPU 14
Ignoring NMI IPI to CPU 15

IVSHMEM: sending IRQ 2 to peer 2


>
> And what do you mean with init_control?
>

Oops, typo, it should be int_control...
the int_control of struct ivshm_regs in ivshmem-demo.c:
struct ivshm_regs {
         uint32_t id;
         uint32_t max_peer;
         uint32_t int_control;
         .....
}
*So when the root cell does an mmio_write of 1 to regs->int_control while
the non-root cell has been running, the kernel crashes.*

>
> >
> > To avoid the kernel-crash situation, I only ran the demo on the
> > non-root cell. With .iommu set validly, I would expect to at least see
> > the interrupt count increase when grepping ivshmem in /proc/interrupts.
> > But nope, _still no interrupts received on the root cell_.
> >
>
> If there is no driver registered on the root side or not opened (by the
> demo app), then the interrupt reception is disabled. We need to debug
> the "hot" case.
>

Right, after diving into the source code, I did see that when
ive->int_ctrl_reg == 0, no interrupt is triggered, i.e.,
arch_ivshmem_trigger_interrupt() is skipped.

I have some questions regarding the code below.
static void ivshmem_trigger_interrupt(struct ivshmem_endpoint *ive,
				      unsigned int vector)
{
	/*
	 * Hold the IRQ lock while sending the interrupt so that ivshmem_exit
	 * and ivshmem_register_mmio can synchronize on the completion of the
	 * delivery.
	 */
	spin_lock(&ive->irq_lock);

	if (ive->int_ctrl_reg & IVSHMEM_INT_ENABLE) {
		if (ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] &
		    IVSHMEM_CFG_ONESHOT_INT)
			ive->int_ctrl_reg = 0;

		arch_ivshmem_trigger_interrupt(ive, vector);
	}

	spin_unlock(&ive->irq_lock);
}

Q1: What does IVSHMEM_CFG_ONESHOT_INT mean?
Q2: What does it mean when this condition is met:
ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] & IVSHMEM_CFG_ONESHOT_INT?
Q3: Why call trigger_interrupt when ive->int_ctrl_reg = 0?
Q4: I tried to add "else" a line above arch_ivshmem_trigger_interrupt,
i.e.,  arch_ivshmem_trigger_interrupt is skipped when

>
> Jan
>
> --
> Siemens AG, T RDA IOT
> Corporate Competence Center Embedded Linux
>
> --
> You received this message because you are subscribed to the Google Groups
> "Jailhouse" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/jailhouse-dev/2d2c72b6-cae0-e210-8db2-630b33180335%40siemens.com
> .
>
