On Wed, Dec 17, 2025 at 11:27:44AM +0100, Stefan Roese wrote:
> Hi Mathieu,
> 
> On 12/16/25 22:47, Mathieu Poirier wrote:
> > On Tue, Dec 16, 2025 at 03:34:18PM +0100, Stefan Roese wrote:
> > > Hi Mathieu,
> > > 
> > > On 12/15/25 02:14, Mathieu Poirier wrote:
> > > > On Wed, Dec 10, 2025 at 12:28:52PM -0600, Tanmay Shah wrote:
> > > > > Hello, please check my comments below:
> > > > > 
> > > > > On 12/10/25 2:29 AM, Stefan Roese wrote:
> > > > > > Hi Tanmay,
> > > > > > 
> > > > > > On 12/10/25 03:51, Zhongqiu Han wrote:
> > > > > > > On 12/5/2025 8:06 PM, Stefan Roese wrote:
> > > > > > > > Hi Tanmay,
> > > > > > > > 
> > > > > > > > On 12/4/25 17:45, Tanmay Shah wrote:
> > > > > > > > > Hello,
> > > > > > > > > 
> > > > > > > > > Thank you for your patch. Please find my comments below.
> > > > > > > > > 
> > > > > > > > > On 12/4/25 4:40 AM, Stefan Roese wrote:
> > > > > > > > > > Testing on our ZynqMP platform has shown that some R5
> > > > > > > > > > messages might get dropped under high CPU load. This
> > > > > > > > > > patch creates a new high-prio
> > > > > 
> > > > > This commit text should be fixed. Messages are not dropped by
> > > > > Linux, but the R5 can't send new messages as the rx vq is not
> > > > > processed by Linux.
> > > > 
> > > > I agree.
> > > > 
> > > > > > > > > Here, I would like to understand what is meant by "R5
> > > > > > > > > messages might get dropped".
> > > > > > > > > 
> > > > > > > > > Even under high CPU load, the messages from the R5 are
> > > > > > > > > stored in the virtqueues. If Linux doesn't read them, they
> > > > > > > > > are not really lost/dropped.
> > > > > > > > > 
> > > > > > > > > Could you please explain your use case in detail and how
> > > > > > > > > the testing is conducted?
> > > > > > > > 
> > > > > > > > Our use case is that we send ~4k messages per second from
> > > > > > > > the R5 to Linux - sometimes even a bit more.
> > > > > > > > Normally these messages are received okay and no messages
> > > > > > > > are dropped. Sometimes, under "high CPU load" scenarios, the
> > > > > > > > R5 has to drop messages, as there is no free space in the
> > > > > > > > RPMsg buffer, which is 256 entries AFAIU - resulting from
> > > > > > > > the Linux driver not emptying the RX queue.
> > > > > 
> > > > > Thanks for the details. Your understanding is correct.
> > > > > 
> > > > > > > > Could you please elaborate on these virtqueues a bit?
> > > > > > > > Especially why no message drop should happen because of
> > > > > > > > these virtqueues?
> > > > > > > 
> > > > > > > AFAIK, as a transport layer based on virtqueues, rpmsg is
> > > > > > > reliable once a message has been successfully enqueued. The
> > > > > > > observed "drop" here appears to be on the R5 side, where the
> > > > > > > application discards messages when no entry buffer is
> > > > > > > available.
> > > > > > 
> > > > > > Correct.
> > > > > > 
> > > > > > > In the long run, while improving the Linux side is
> > > > > > > recommended,
> > > > > > 
> > > > > > Yes, please.
> > > > > > 
> > > > > > > it could also be helpful for the R5 side to implement
> > > > > > > strategies such as an application-level buffer and retry
> > > > > > > mechanisms.
> > > > > > 
> > > > > > We already did this. We've added an additional buffer mechanism
> > > > > > to the R5, which improved this "message drop situation" a bit.
> > > > > > Still, it did not fix it for all our high message rate
> > > > > > situations - still resulting in frame drops on the R5 side (the
> > > > > > R5 is a bit resource restricted).
> > > > > > 
> > > > > > Improving the responsiveness on the Linux side seems to be the
> > > > > > best way for us to deal with this problem.
> > > > > 
> > > > > I agree with this.
> > > > > However, I just want to understand and cover the full picture
> > > > > here.
> > > > > 
> > > > > On the R5 side, I am assuming the open-amp library is used for
> > > > > the RPMsg communication.
> > > > > 
> > > > > The rpmsg_send() API will end up here:
> > > > > https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/rpmsg/rpmsg_virtio.c#L384
> > > > > 
> > > > > Here, if no new buffer is available, the R5 is supposed to wait
> > > > > for 1ms before sending a new message. After 1ms, the R5 will try
> > > > > to get a buffer again, and this continues for 15 seconds. This is
> > > > > the default mechanism.
> > > > > 
> > > > > Is this mechanism used in your case?
> > > > > 
> > > > > Alternatively, you can register a platform-specific wait
> > > > > mechanism via this callback:
> > > > > https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/include/openamp/rpmsg_virtio.h#L42
> > > > > 
> > > > > A few questions for further understanding:
> > > > > 
> > > > > 1) As per your use case, must the 4k-per-second data transfer
> > > > > rate be maintained all the time? And is this achieved with this
> > > > > patch?
> > > > > 
> > > > > Even with the high-priority queue, if someone wants to achieve 8k
> > > > > or 16k messages per second, at some point we will hit this issue
> > > > > again.
> > > > 
> > > > Right, I also think this patch is not the right solution.
> > > 
> > > Hmmm. My understanding of Tanmay's comments is somewhat different.
> > > He is not "against" this patch in general AFAIU. Please see my reply
> > > with a more detailed description of our system setup, its message
> > > flow and its limitations that I just sent a few minutes ago.
> > 
> > Regardless of how we spin things around, this patch is about running
> > out of resources (CPU cycles and memory).
> > It is only a matter of time before this solution becomes obsolete.
> > 
> > The main issue here is that we are adding a priority workqueue for
> > everyone using this driver, which may have unwanted side effects.
> > Please add a kernel module parameter to control what kind of
> > workqueue is to be used.
> 
> Okay, will do.
Please see this patchset [1] Tanmay is currently working on. I would much
rather see that solution put to work than playing with workqueue
priorities.

[1] "[RFC PATCH 0/2] Enhance RPMsg buffer management"

> 
> Thanks,
> Stefan
