On 9/4/23 22:43, Mathieu Poirier wrote:
> On Mon, Sep 04, 2023 at 03:52:56PM +0200, Arnaud POULIQUEN wrote:
>> Hello Tim,
>>
>> On 9/4/23 10:36, Tim Blechmann wrote:
>>> when we cannot get a tx buffer (`get_a_tx_buf`) `rpmsg_upref_sleepers`
>>> enables tx-complete interrupt.
>>> however if the interrupt is executed after `get_a_tx_buf` and before
>>> `rpmsg_upref_sleepers` we may miss the tx-complete interrupt and sleep
>>> for the full 15 seconds.
>>
>>
>> Is there any reason why your co-processor is unable to release the TX RPMSG
>> buffers for 15 seconds? If not, you should first determine the reason why it
>> is stalled.
> 
> Arnaud's concern is valid.  If the remote processor can't consume a buffer
> within 15 seconds, something is probably wrong.
> 
> That said, I believe your assessment of the situation is correct.  *If* the TX
> callback is disabled and there is no buffer available, there is a window of
> opportunity between calls to get_a_tx_buf() and rpmsg_upref_sleepers() for an
> interrupt to arrive in function rpmsg_send_offchannel_raw().  
> 
> From here three things need to happen:
> 
> 1) You send another version of this patch with a changelog that uses proper
> English, i.e. capital letters where they are needed and no spelling mistakes.
> 
> 2) Arnaud confirms our suspicions.

It seems to me that this patch is useless:
- wait_event_interruptible_timeout() already tests the condition (and so calls
  get_a_tx_buf()) before going to sleep [1].
- ftrace shows that the vq interrupt is not called during the 15-second period,
  so it is normal behavior that vrp->sendq is never woken up.

Tim needs to analyze the reason why no mailbox interrupt occurs.

[1]https://elixir.bootlin.com/linux/latest/source/include/linux/wait.h#L534
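
To illustrate the first point, here is a rough user-space sketch of the
ordering I mean. It is not the kernel macro itself: fake_pool, try_get_buf()
and wait_for_buf() are hypothetical names, and the "sleep" is only a counter.
It models the condition being evaluated once before any sleep, and the return
convention as I understand it (0 on timeout, otherwise the remaining time, at
least 1).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the TX buffer pool (not a kernel structure). */
struct fake_pool {
	int free_bufs;	/* buffers currently available */
	int checks;	/* how many times the condition was evaluated */
};

/* Plays the role of the wait condition, i.e. (msg = get_a_tx_buf(vrp)). */
static bool try_get_buf(struct fake_pool *p)
{
	p->checks++;
	if (p->free_bufs > 0) {
		p->free_bufs--;
		return true;
	}
	return false;
}

/*
 * Models the control flow of a wait-with-timeout helper:
 * the condition is tested first, and we only "sleep" when it is false.
 * Returns 0 on timeout, otherwise the remaining ticks (at least 1).
 */
static long wait_for_buf(struct fake_pool *p, long timeout_ticks, int *slept)
{
	long remaining = timeout_ticks;

	*slept = 0;

	/* Initial check: no sleep at all if a buffer is already free. */
	if (try_get_buf(p))
		return remaining;

	while (remaining > 0) {
		(*slept)++;	/* stand-in for one tick of sleeping */
		remaining--;
		if (try_get_buf(p))
			return remaining ? remaining : 1;
	}

	return 0;	/* timed out, condition still false */
}
```

With one free buffer, wait_for_buf() returns immediately without sleeping,
which is why an extra get_a_tx_buf() call just before the wait should not
change the behavior.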


> 
> 3) This patch gets applied when rc1 comes out so that it has 6 or 7 weeks to
> soak.  No errors or locks are reported due to this patch during that time.
> 
>>
>> Regards,
>> Arnaud
>>
>>>
>>> in this case, so we re-try once before we really start to sleep
>>>
>>> Signed-off-by: Tim Blechmann <[email protected]>
>>> ---
>>>  drivers/rpmsg/virtio_rpmsg_bus.c | 24 +++++++++++++++---------
>>>  1 file changed, 15 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
>>> index 905ac7910c98..2a9d42225e60 100644
>>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
>>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
>>> @@ -587,21 +587,27 @@ static int rpmsg_send_offchannel_raw(struct rpmsg_device *rpdev,
>>>  
>>>     /* no free buffer ? wait for one (but bail after 15 seconds) */
>>>     while (!msg) {
>>>             /* enable "tx-complete" interrupts, if not already enabled */
>>>             rpmsg_upref_sleepers(vrp);
>>>  
>>> -           /*
>>> -            * sleep until a free buffer is available or 15 secs elapse.
>>> -            * the timeout period is not configurable because there's
>>> -            * little point in asking drivers to specify that.
>>> -            * if later this happens to be required, it'd be easy to add.
>>> -            */
>>> -           err = wait_event_interruptible_timeout(vrp->sendq,
>>> -                                   (msg = get_a_tx_buf(vrp)),
>>> -                                   msecs_to_jiffies(15000));
>>> +           /* make sure to retry to grab tx buffer before we start waiting */
>>> +           msg = get_a_tx_buf(vrp);
>>> +           if (msg) {
>>> +                   err = 0;
>>> +           } else {
>>> +                   /*
>>> +                    * sleep until a free buffer is available or 15 secs elapse.
>>> +                    * the timeout period is not configurable because there's
>>> +                    * little point in asking drivers to specify that.
>>> +                    * if later this happens to be required, it'd be easy to add.
>>> +                    */
>>> +                   err = wait_event_interruptible_timeout(vrp->sendq,
>>> +                                           (msg = get_a_tx_buf(vrp)),
>>> +                                           msecs_to_jiffies(15000));
>>> +           }
>>>  
>>>             /* disable "tx-complete" interrupts if we're the last sleeper */
>>>             rpmsg_downref_sleepers(vrp);
>>>  
>>>             /* timeout ? */
>>>             if (!err) {
