Send from my phone

> Op 26 mei 2023 om 20:52 heeft Ilya Maximets <[email protected]> het volgende 
> geschreven:
> 
> On 5/26/23 20:43, Ilya Maximets wrote:
>>> On 5/23/23 12:39, Frode Nordahl wrote:
>>> The tc module combines the use of the `tc_transact` helper
>>> function for communication with the in-kernel tc infrastructure
>>> with assertions on the reply data by `ofpbuf_at_assert` on the
>>> received data prior to further processing.
>>> 
>>> `tc_transact` in turn calls `nl_transact`, which via
>>> `nl_transact_multiple__` ultimately calls and handles return
>>> value from `recvmsg`.  On error a check for EAGAIN is performed
>>> and a consequence of this condition is effectively to provide a
>>> non-error (0) result and an empty reply buffer.
>> 
>> Hi, Frode, others.
>> 
>> I took a closer look at the patch and the code in the netlink-socket.
>> IIUC, the EAGAIN here is not a result of operation that we're requesting,
>> it's just a EAGAIN on a non-blocking (MSG_DONTWAIT) netlink socket while
>> trying to read.  The reply to that transaction will arrive eventually and
>> we will interpret it later as a reply to a different netlink transaction.
>> 
>> The issue appears to be introduced in the following commit:
>>  407556ac6c90 ("netlink-socket: Avoid forcing a reply for final message in a 
>> transaction.")
>> 
>> And it was a performance optimization introduced as part of the set:
>>  https://mail.openvswitch.org/pipermail/ovs-dev/2012-April/260122.html
>> 
>> The problem is that we can't tell apart socket EAGAIN if there is nothing
>> to wait (requests didn't need a reply) and EAGAIN if kernel just didn't
>> manage to reply yet.
> 
> It's still strance though that reply is delayed.  Typically netlink
> replies are formed right in the request handler.  Did you manage to
> figure out what is causing this in tc specifically?  It wasn't an issue
> for an OVS and other common families for many years.
> 

Replying from my phone so I did not look at any code. However when I looked at 
the code during the review I noticed it will ignore earlier (none processed 
messages) replies based in the receive loop on the transaction ID. So I do not 
think it should be an issue of messages getting out of sync.

I assumed the delay happens when we request something to a hardware offload 
drive which is replying async due to it being busy.

Maybe Frode can get some perf traces to confirm this. 

//Eelco
>> 
>> The real solution would be to revert commit 407556ac6c90 or be a bit
>> smarter and wait for reply only on requests that specify a reply buffer.
>> 
>> Or am I missing something?
>> 
>> Best regards, Ilya Maximets.
> 
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to