Sent from my phone

> On 26 May 2023, at 20:52, Ilya Maximets <[email protected]> wrote:
>
> On 5/26/23 20:43, Ilya Maximets wrote:
>>> On 5/23/23 12:39, Frode Nordahl wrote:
>>> The tc module combines the use of the `tc_transact` helper
>>> function for communication with the in-kernel tc infrastructure
>>> with assertions on the reply data by `ofpbuf_at_assert` on the
>>> received data prior to further processing.
>>>
>>> `tc_transact` in turn calls `nl_transact`, which via
>>> `nl_transact_multiple__` ultimately calls and handles return
>>> value from `recvmsg`. On error a check for EAGAIN is performed
>>> and a consequence of this condition is effectively to provide a
>>> non-error (0) result and an empty reply buffer.
>>
>> Hi, Frode, others.
>>
>> I took a closer look at the patch and the code in the netlink-socket.
>> IIUC, the EAGAIN here is not a result of operation that we're requesting,
>> it's just a EAGAIN on a non-blocking (MSG_DONTWAIT) netlink socket while
>> trying to read. The reply to that transaction will arrive eventually and
>> we will interpret it later as a reply to a different netlink transaction.
>>
>> The issue appears to be introduced in the following commit:
>> 407556ac6c90 ("netlink-socket: Avoid forcing a reply for final message in a
>> transaction.")
>>
>> And it was a performance optimization introduced as part of the set:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2012-April/260122.html
>>
>> The problem is that we can't tell apart socket EAGAIN if there is nothing
>> to wait (requests didn't need a reply) and EAGAIN if kernel just didn't
>> manage to reply yet.
>
> It's still strange though that reply is delayed. Typically netlink
> replies are formed right in the request handler. Did you manage to
> figure out what is causing this in tc specifically? It wasn't an issue
> for an OVS and other common families for many years.
>

Replying from my phone, so I did not look at any code.

However, when I looked at the code during the review, I noticed that the
receive loop ignores earlier (unprocessed) replies based on the
transaction ID. So I do not think this should be an issue of messages
getting out of sync. I assumed the delay happens when we send a request
to a hardware offload driver that replies asynchronously because it is
busy. Maybe Frode can get some perf traces to confirm this.

//Eelco

>>
>> The real solution would be to revert commit 407556ac6c90 or be a bit
>> smarter and wait for reply only on requests that specify a reply buffer.
>>
>> Or am I missing something?
>>
>> Best regards, Ilya Maximets.
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
