On Wed, Jul 28, 2021 at 2:10 PM Huang Shihua <[email protected]>
wrote:

>
>
> On Mon, Jul 26, 2021 at 8:08 PM Jan Kiszka <[email protected]> wrote:
>
>> On 26.07.21 19:14, Huang Shihua wrote:
>> >
>> >
>> > On Wednesday, 21 July 2021 at 17:50:53 UTC+2 [email protected]
>> wrote:
>> >
>> >     On 13.07.21 18:09, Huang Shihua wrote:
>> >     > Hi,
>> >     >
>> >     > Currently, I'm trying to run the ivshmem-demo to establish
>> >     > communication between Linux root cell and one non-root cell.
>> >     > Configuration files are attached.
>> >     >
>> >     > Two cases were tested:
>> >     >
>> >     > 1. Let the non-root cell load the ivshmem-demo and then target
>> >     > itself (target=1). _All interrupts can be sent and received
>> >     > correctly_.
>> >     > 2. Let the root cell and the non-root cell send interrupts to each
>> >     > other. I.e., the root cell runs ./tools/demos/ivshmem-demo -t 1,
>> >     > while the non-root cell loads inmates/demos/x86/ivshmem-demo.bin -s
>> >     > "target=0" -a 0x1000 and then runs. The result turned out to be:
>> >     > * the non-root cell got the interrupts from the root cell,
>> >     > * _while the root cell did not receive any interrupt._
>> >     >
>> >     > As Jan mentioned in
>> >     > https://groups.google.com/g/jailhouse-dev/c/GRCWFzNaHX8/m/ht8z51BOCgAJ,
>> >     > tuning the iommu index should do the trick.
>> >     > However, unfortunately, it did not work for me :c
>> >     >
>> >     > There are 8 iommu units on the hardware. I tuned the iommu index
>> >     > in the
>> >
>> >     Wow, 8 units...
>> >
>> >     > root cell configuration from 0 to 7. The same behavior (no
>> >     > interrupts received by the root cell) remains when tuning the
>> >     > index from 0 to 6. When the iommu index is set to 7, the kernel
>> >     > crashed immediately when the demo was started on the non-root cell.
>> >     >
>> >     > Any idea why the root cell always fails to receive interrupts?
>> >
>> >     This may require in-detail debugging. For that, you would have to
>> >     instrument the hypervisor along its virtual IRQ injection path. That
>> >     starts in ivshmem_trigger_interrupt() (hypervisor/ivshmem.c). The
>> >     sending side will call it on writing the doorbell registers. Check
>> >     along this call path if conditions to actually send the IRQ are not
>> >     met.
>> >
>> >     If all are met, the hypervisor sends an IPI to a target cell CPU
>> >     (will be directly delivered to the guest) that should cause the
>> >     normal IRQ processing there. But usually, we do not get so far in
>> >     such cases.
>> >
>> >     Another function of interest here is arch_ivshmem_update_msix() when
>> >     called for the root cell while it defines where ivshmem IRQs should
>> >     go to. Possibly, Jailhouse decides that the programming Linux issued
>> >     is not valid and therefore leaves the irq_cache that
>> >     arch_ivshmem_trigger_interrupt() uses invalid. You can also check
>> >     that via instrumentations (printk).
>> >
>> >
>> > Indeed, when .iommu is assigned 0, 1, ..., 6, irq_cache is invalid. I
>> > suspect the reason is that their corresponding VT-d interrupt remapping
>> > table entries are not for ivshmem devices, i.e., the device ID does not
>> > match.
>> > When .iommu is tuned to 7, irq_cache becomes valid.
>> >
>>
>> OK, then we know what needs to be set. I will have to check eventually
>> if we can read out that information also from sysfs so that this
>> guessing can end.
>>
>> > (BTW, as I mentioned before, the kernel crashed immediately when the
>> > demo was started on the non-root cell. _One missing detail here is_ that,
>> > on the root-cell side, ./tools/demos/ivshmem-demo is running/has run,
>> > i.e., init_control has been set to 1. If ./tools/demos/ivshmem-demo has
>> > not been run on the root cell yet, then starting the demo on the non-root
>> > cell will not kill the kernel.)
>>
>> Now we need to understand the crash. The root cell kernel oopses, right?
>> Any logs from that?
>>
>
> Activating hypervisor
> CAT: Using COS 0 with bitmask 000007ff for cell ivshmem-demo
> Adding virtual PCI device 00:0e.0 to cell "ivshmem-demo"
> Shared memory connection established, peer cells:
>  "RootCell"
> Created cell "ivshmem-demo"
> Page pool usage after cell creation: mem 938/3534, remap 65603/131072
> Cell "ivshmem-demo" can be loaded
> CPU 1 received SIPI, vector 100
> Started cell "ivshmem-demo"
> IVSHMEM: Found device at 00:0e.0
> IVSHMEM: bar0 is at 0x00000000ff000000
> IVSHMEM: bar1 is at 0x00000000ff001000
> IVSHMEM: ID is 1
> IVSHMEM: max. peers is 3
> IVSHMEM: state table is at 0x000000003f0f0000
> IVSHMEM: R/W section is at 0x000000003f0f1000
> IVSHMEM: input sections start at 0x000000003f0fa000
> IVSHMEM: output section is at 0x000000003f0fc000
> IVSHMEM: initialized device
> state[0] = 0
> state[1] = 2
> state[2] = 0
> rw[0] = -1347440721
> rw[1] = 0
> rw[2] = -1347440721
> in@0x0000 = -1347440721
> in@0x2000 = 0
> in@0x4000 = -1347440721
>
> IVSHMEM: sending IRQ 2 to peer 2
>
> IVSHMEM: sending IRQ 2 to peer 2
> <---------- ./tools/demos/ivshmem-demo -t 1 (root cell)
> IVSHMEM: got interrupt 0 (#1)
> state[0] = 0
> state[1] = 2
> state[2] = 3
> rw[0] = -1347440721
> rw[1] = 0
> rw[2] = 0
> in@0x0000 = -1347440721
> in@0x2000 = 0
> in@0x4000 = 0
>
> IVSHMEM: sending IRQ 2 to peer 2
> FATAL: Unhandled VM-Exit, reason 26
> qualification 0
> vectoring info: 0 interrupt info: 0
> RIP: 0xffffffff8d05f6ae RSP: 0xffffafa9c0003fc0 FLAGS: 2
> RAX: 0x00000000007626f0 RBX: 0x0000000000000000 RCX: 0x000000007ffefbff
> RDX: 0x00000000bfebfbff RSI: 0xffffafa9c0003fc8 RDI: 0xffffafa9c0003fc4
> CS: 10 BASE: 0x0000000000000000 AR-BYTES: a09b EFER.LMA 1
> CR0: 0x0000000080050033 CR3: 0x0000001fbd80a004 CR4: 0x00000000007626f0
> EFER: 0x0000000000000d01
> Parking CPU 0 (Cell: "RootCell")
>
> IVSHMEM: sending IRQ 2 to peer 2
> Ignoring NMI IPI to CPU 0
> Ignoring NMI IPI to CPU 2
> Ignoring NMI IPI to CPU 3
> Ignoring NMI IPI to CPU 5
> Ignoring NMI IPI to CPU 6
> Ignoring NMI IPI to CPU 7
> Ignoring NMI IPI to CPU 8
> Ignoring NMI IPI to CPU 9
> Ignoring NMI IPI to CPU 10
> Ignoring NMI IPI to CPU 11
> Ignoring NMI IPI to CPU 12
> Ignoring NMI IPI to CPU 13
> Ignoring NMI IPI to CPU 14
> Ignoring NMI IPI to CPU 15
>
> IVSHMEM: sending IRQ 2 to peer 2
>
>
>>
>> And what do you mean by init_control?
>>
>
> Oops, typo, it should be int_control...
> the int_control of struct ivshm_regs in ivshmem-demo.c:
> struct ivshm_regs {
>          uint32_t id;
>          uint32_t max_peer;
>          uint32_t int_control;
>          .....
> }
> *So when the root cell does an mmio_write of 1 to regs->int_control while
> the non-root cell is already running, the kernel crashes.*
>
>>
>> >
>> > To avoid the kernel crash, I only ran the demo on the non-root cell.
>> > With .iommu set to a valid value, I would expect to at least see the
>> > interrupt count increase in grep ivshmem /proc/interrupts.
>> > But nope, _still no interrupts received on the root cell_.
>> >
>>
>> If there is no driver registered on the root side or not opened (by the
>> demo app), then the interrupt reception is disabled. We need to debug
>> the "hot" case.
>>
>
> Right, after diving into the source code, I did see that when
> ive->int_ctrl_reg=0, no interrupt will be triggered, i.e.,
> arch_ivshmem_trigger_interrupt is skipped.
>
> I have a question regarding the code below.
> static void ivshmem_trigger_interrupt(struct ivshmem_endpoint *ive,
>                                       unsigned int vector)
> {
>         /*
>          * Hold the IRQ lock while sending the interrupt so that ivshmem_exit
>          * and ivshmem_register_mmio can synchronize on the completion of the
>          * delivery.
>          */
>         spin_lock(&ive->irq_lock);
>
>         if (ive->int_ctrl_reg & IVSHMEM_INT_ENABLE) {
>                 if (ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] &
>                     IVSHMEM_CFG_ONESHOT_INT)
>                         ive->int_ctrl_reg = 0;
>
>                 arch_ivshmem_trigger_interrupt(ive, vector);
>         }
>
>         spin_unlock(&ive->irq_lock);
> }
>
>
> Q1: What does IVSHMEM_CFG_ONESHOT_INT mean?
> Q2: What does it mean when this condition is met:
> ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] & IVSHMEM_CFG_ONESHOT_INT?
> Q3: Why call arch_ivshmem_trigger_interrupt when ive->int_ctrl_reg = 0?
> Q4: I tried to add "else" a line above arch_ivshmem_trigger_interrupt,
> i.e.,  arch_ivshmem_trigger_interrupt is skipped when
>

Pff... the incomplete email was sent by mistake.
Anyway, let me continue.

Q4: I tried adding an "else" on the line above arch_ivshmem_trigger_interrupt,
i.e., arch_ivshmem_trigger_interrupt was skipped when
./tools/demos/ivshmem-demo -t 1 was executed on the root cell. With that
change there is no kernel crash, and the non-root cell can later receive
interrupt #*!*0 from the root cell, but :) yeah, the root cell still
receives nothing.
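
To make Q3 and Q4 concrete, here is a minimal stand-alone sketch of how I
read the quoted function. It is not hypervisor code: struct sketch_endpoint,
the sketch_* functions and the two SKETCH_* bit values are made up for
illustration (the real constants live in the Jailhouse headers), the locking
is left out, and the arch hook is replaced by a simple counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative bit values only, NOT taken from the Jailhouse headers. */
#define SKETCH_INT_ENABLE   0x1u
#define SKETCH_ONESHOT_INT  0x1u

/* Hypothetical stand-in for the struct ivshmem_endpoint fields used here. */
struct sketch_endpoint {
	uint32_t int_ctrl_reg;   /* plays the role of ive->int_ctrl_reg */
	uint32_t vndr_cap;       /* plays the role of ive->cspace[IVSHMEM_CFG_VNDR_CAP/4] */
	unsigned int delivered;  /* how often the arch hook would have been called */
};

/* Stand-in for arch_ivshmem_trigger_interrupt(): it only counts calls. */
static void sketch_arch_trigger(struct sketch_endpoint *ep)
{
	ep->delivered++;
}

/* Same control flow as the quoted ivshmem_trigger_interrupt(), without the lock. */
static void sketch_trigger_original(struct sketch_endpoint *ep)
{
	assert(ep != NULL);

	if (ep->int_ctrl_reg & SKETCH_INT_ENABLE) {
		/* One-shot: disable further interrupts, but still deliver this one. */
		if (ep->vndr_cap & SKETCH_ONESHOT_INT)
			ep->int_ctrl_reg = 0;

		sketch_arch_trigger(ep);
	}
}

/* The Q4 experiment: an "else" in front of the delivery call. */
static void sketch_trigger_with_else(struct sketch_endpoint *ep)
{
	assert(ep != NULL);

	if (ep->int_ctrl_reg & SKETCH_INT_ENABLE) {
		if (ep->vndr_cap & SKETCH_ONESHOT_INT)
			ep->int_ctrl_reg = 0;       /* disabled, and nothing is delivered */
		else
			sketch_arch_trigger(ep);    /* delivered only without one-shot */
	}
}
```

With both bits set, sketch_trigger_original() delivers exactly once and then
stays silent until int_ctrl_reg is written again, while
sketch_trigger_with_else() clears int_ctrl_reg without delivering anything.
If the root-cell endpoint has the one-shot bit set, that would fit what I saw
with the "else" variant: no crash, but also no interrupt on the root cell.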


>> Jan
>>
>> --
>> Siemens AG, T RDA IOT
>> Corporate Competence Center Embedded Linux
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Jailhouse" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/jailhouse-dev/2d2c72b6-cae0-e210-8db2-630b33180335%40siemens.com
>> .
>>
>
