Hi Mathieu,
On 12/17/25 22:34, Mathieu Poirier wrote:
On Wed, Dec 17, 2025 at 11:27:44AM +0100, Stefan Roese wrote:
Hi Mathieu,
On 12/16/25 22:47, Mathieu Poirier wrote:
On Tue, Dec 16, 2025 at 03:34:18PM +0100, Stefan Roese wrote:
Hi Mathieu,
On 12/15/25 02:14, Mathieu Poirier wrote:
On Wed, Dec 10, 2025 at 12:28:52PM -0600, Tanmay Shah wrote:
Hello, please check my comments below:
On 12/10/25 2:29 AM, Stefan Roese wrote:
Hi Tanmay,
On 12/10/25 03:51, Zhongqiu Han wrote:
On 12/5/2025 8:06 PM, Stefan Roese wrote:
Hi Tanmay,
On 12/4/25 17:45, Tanmay Shah wrote:
Hello,
Thank You for your patch. Please find my comments below.
On 12/4/25 4:40 AM, Stefan Roese wrote:
Testing on our ZynqMP platform has shown that some R5 messages might
get dropped under high CPU load. This patch creates a new high-prio
This commit text should be fixed. Messages are not dropped by Linux; rather, the
R5 can't send new messages because the rx vq is not processed by Linux.
I agree.
Here, I would like to understand what is meant by "R5
messages might get dropped".
Even under high CPU load, the messages from R5 are stored in
the virtqueues. If Linux doesn't read them, they are not
really lost/dropped.
Could you please explain your use case in detail and how the
testing is conducted?
Our use case is that we send ~4k messages per second from the R5 to
Linux, sometimes even a bit more. Normally these messages are received
okay and no messages are dropped. Sometimes, under high CPU load,
the R5 has to drop messages because there is no free space in the
RPMsg buffer (which holds 256 entries, AFAIU), resulting from the
Linux driver not emptying the RX queue.
Thanks for the details. Your understanding is correct.
Could you please elaborate on these virtqueues a bit? Especially why no
message drops should happen because of them?
AFAIK, as a transport layer based on virtqueues, rpmsg is reliable once a
message has been successfully enqueued. The observed "drop" here appears
to be on the R5 side, where the application discards messages when no
free buffer entry is available.
Correct.
In the long run, while improving the Linux side is recommended,
Yes, please.
it could
also be helpful for the R5 side to implement strategies such as an
application-level buffer and retry mechanisms.
We already did this. We've added an additional buffer mechanism to the
R5, which improved this "message drop situation" a bit. Still, it did not
fix it for all our high-message-rate situations, so we keep seeing
frame drops on the R5 side (the R5 is a bit resource-constrained).
Improving the responsiveness on the Linux side seems to be the best way
for us to deal with this problem.
I agree with this. However, I just want to understand and cover the full
picture here.
On R5 side, I am assuming open-amp library is used for the RPMsg
communication.
rpmsg_send() API will end up here:
https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/rpmsg/rpmsg_virtio.c#L384
Here, if no buffer is available, the R5 is supposed to wait for
1 ms before trying to send again. After 1 ms, the R5 will try to get a
buffer again, and this continues for up to 15 seconds. This is the default
mechanism. Is this mechanism used correctly in your case?
Alternatively, you can register a platform-specific wait mechanism via this
callback:
https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/include/openamp/rpmsg_virtio.h#L42
A few questions for further understanding:
1) As per your use case, must the 4k-messages-per-second transfer rate be
maintained at all times? And is this achieved with this patch?
Even with the high-priority queue, if someone wants to achieve 8k or 16k
messages per second, at some point we will hit this issue again.
Right, I also think this patch is not the right solution.
Hmmm. My understanding of Tanmay's comments is somewhat different. He
is not "against" this patch in general, AFAIU. Please see my reply with
a more detailed description of our system setup, its message flow and
limitations, which I just sent a few minutes ago.
Regardless of how we spin things around, this patch is about running out of
resources (CPU cycles and memory). It is only a matter of time before this
solution becomes obsolete.
The main issue here is that we are adding a priority workqueue for everyone
using this driver, which may have unwanted side effects. Please add a kernel
module parameter to control what kind of workqueue is to be used.
Okay, will do.
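Such a module parameter could look roughly like the sketch below. This is a
minimal, hypothetical example, not the actual driver code; the parameter
name, workqueue name, and flags are assumptions chosen for illustration:

```c
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/workqueue.h>

/* Hypothetical parameter: false = ordinary workqueue (default),
 * true = WQ_HIGHPRI workqueue for rx vq processing. */
static bool high_prio_wq;
module_param(high_prio_wq, bool, 0444);
MODULE_PARM_DESC(high_prio_wq,
		 "Use a high-priority workqueue for RPMsg rx processing");

static struct workqueue_struct *rpmsg_wq;

static int __init example_init(void)
{
	unsigned int flags = WQ_UNBOUND;

	if (high_prio_wq)
		flags |= WQ_HIGHPRI;

	/* Name and flags are illustrative only. */
	rpmsg_wq = alloc_workqueue("rpmsg_rx_wq", flags, 0);
	if (!rpmsg_wq)
		return -ENOMEM;

	return 0;
}

static void __exit example_exit(void)
{
	destroy_workqueue(rpmsg_wq);
}

module_init(example_init);
module_exit(example_exit);
MODULE_LICENSE("GPL");
```

With this shape, the default behaviour stays unchanged for everyone, and
users who hit the rx latency problem can opt in at load time (e.g.
`modprobe <module> high_prio_wq=1`).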
Please see this patchset [1] Tanmay is currently working on. I would much
rather see that solution put to work than playing with workqueue priorities.
[1]. "[RFC PATCH 0/2] Enhance RPMsg buffer management"
Thanks for the notice. I'll take a look at it, give it a try if possible,
and report back.
Thanks,
Stefan