On 9/13/23 10:47, Tim Blechmann wrote:
> many thanks for your analysis, very interesting.
>
>> please find below an extract of your trace with my analysis:
>>
>>
>> stm32mp1_bulk_p-390 [001] ..... 907.241226: rpmsg_send
>> <-rpmsg_intercore_send_buffer.constprop.0
>> stm32mp1_bulk_p-390 [001] ..... 907.241228: virtio_rpmsg_send
>> <-rpmsg_send
>> stm32mp1_bulk_p-390 [001] ..... 907.241237: virtqueue_enable_cb
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [001] ..... 907.241239:
>> virtqueue_enable_cb_prepare
>>
>> At this point seems that no more TX-buffer
>>
>> <-rpmsg_recv_single
>> kworker/0:4-67 [000] ..... 907.242533: vring_interrupt
>> <-rproc_vq_interrupt
>> kworker/0:4-67 [000] ..... 907.242536: rpmsg_xmit_done
>>
>> Here you receive an interrupt indicating that TX buffer has been released by
>> the remote. that's the expected behavior.
>>
>>
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [000] ..... 984.054941: rpmsg_send
>> <-rpmsg_intercore_send_buffer.constprop.0
>> stm32mp1_bulk_p-390 [000] ..... 984.054943: virtio_rpmsg_send
>> <-rpmsg_send
>> stm32mp1_bulk_p-390 [000] ..... 984.054956: virtqueue_enable_cb
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [000] ..... 984.054958:
>> virtqueue_enable_cb_prepare
>> <-virtqueue_enable_cb
>> stm32mp1_bulk_p-390 [000] ..... 999.398667: virtqueue_disable_cb
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [000] ..... 999.414840: rpmsg_send
>> <-rpmsg_intercore_send_buffer.constprop.0
>> stm32mp1_bulk_p-390 [000] ..... 999.414843: virtio_rpmsg_send
>> <-rpmsg_send
>> stm32mp1_bulk_p-390 [000] ..... 999.414855: virtqueue_enable_cb
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [000] ..... 999.414857:
>> virtqueue_enable_cb_prepare
>>
>> Here you have again no more TX buffer. From this point there is no more
>> activity
>> neither in TX nor in RX until the timeout of 15 seconds.
>> If you have a look to rproc_vq_interrupt the last one occurs at 907.242533
>>
>>
>> As you have no more virtqueue interrupts call for both directions, the issue
>> is
>> probably either in the Cortex-M firmware, which seems to be stalled, or due
>> to a
>> disable of the IRQs in Linux.
>
> afaict we can rule out a complete stall of the cortex-m firmware: if we change
> the rpmsg_send to a rpmsg_trysend/msleep loop, the trysend will succeed to
> get a
> buffer after a few iterations.
>
>> or due to a disable of the IRQs in Linux.
>
> do you have some recommendations how we could trace this?
First, check if the Cortex-M sends the IPCC mailbox notification during the
15-second period. Then, you can verify the IPCC registers to check if some flags
are pending. On the Linux side, there is probably literature available to
explain how to trace it.
>
> many thanks,
> tim
>
>> <-virtqueue_enable_cb
>> stm32mp1_bulk_p-390 [000] ..... 1014.758678: virtqueue_disable_cb
>> <-rpmsg_send_offchannel_raw
>> stm32mp1_bulk_p-390 [000] ..... 1014.774802: rpmsg_send
>> <-rpmsg_intercore_send_buffer.constprop.0
>> stm32mp1_bulk_p-390 [000] ..... 1014.774804: virtio_rpmsg_send
>> <-rpmsg_send
>> stm32mp1_bulk_p-390 [000] ..... 1014.774815: virtqueue_enable_cb
>
>