On 02.06.2016 04:32, Daniele Di Proietto wrote:
>
> On 25/05/2016 04:03, "Ilya Maximets" <i.maxim...@samsung.com> wrote:
>
>> On 23.05.2016 17:55, Traynor, Kevin wrote:
>>>> -----Original Message-----
>>>> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
>>>> Sent: Tuesday, May 17, 2016 4:09 PM
>>>> To: dev@openvswitch.org; Daniele Di Proietto <diproiet...@vmware.com>
>>>> Cc: Dyasly Sergey <s.dya...@samsung.com>; Heetae Ahn
>>>> <heetae82....@samsung.com>; Flavio Leitner <f...@sysclose.org>;
>>>> Traynor, Kevin <kevin.tray...@intel.com>; Pravin B Shelar
>>>> <pshe...@ovn.org>; Ilya Maximets <i.maxim...@samsung.com>
>>>> Subject: [PATCH] netdev-dpdk: Fix PMD threads hang in
>>>> __netdev_dpdk_vhost_send().
>>>>
>>>> There are situations when a PMD thread can hang forever inside
>>>> __netdev_dpdk_vhost_send() because of a broken virtqueue ring.
>>>>
>>>> This happens if rte_vring_available_entries() is always positive and
>>>> rte_vhost_enqueue_burst() can't send anything (possible with a broken
>>>> ring).
>>>>
>>>> In this case the time expiration is never checked and the
>>>> 'do {} while (cnt)' loop becomes infinite.
>>>>
>>>> This scenario is sometimes reproducible with dpdk-16.04-rc2 inside a
>>>> guest VM. It may also be reproduced by manually breaking the ring
>>>> structure inside the guest VM.
>>>>
>>>> Fix that by checking the time expiration even if we have available
>>>> entries.
>>>
>>> Hi Ilya,
>>
>> Hi, Kevin.
>>
>> Christian and Thiago are CC-ed because, I think, they are facing a
>> similar issue.
>>
>>>
>>> Thanks for catching this. This intersects with something else I've seen
>>> wrt the retry code, and there are a few options...
>>>
>>> 1. Remove retries when nothing is sent. For the VM that needs retries
>>> they are a good thing to have, but Bhanu and I saw in a recent test with
>>> multiple VMs that if one VM causes a lot of retries there is a large
>>> performance degradation for the other VMs. So I changed the retry to
>>> only occur when at least one packet has been sent on the previous call.
>>> I put a patch up here:
>>> http://openvswitch.org/pipermail/dev/2016-May/071517.html
>>>
>>> If we keep retries, we can either:
>>>
>>> 2. Make the coordination between rte_vring_available_entries() and
>>> rte_vhost_enqueue_burst() more robust, as per your patch.
>>>
>>> 3. As you've shown that we can't rely on rte_vring_available_entries()
>>> to know that we can enqueue, just remove it and use
>>> rte_vhost_enqueue_burst() directly in the retry loop.
>>>
>>> My preference would be for 1., because on balance I'd rather one VM did
>>> not degrade the performance of others more than I'd like it to have
>>> retries. Of course there could be some compromise between them as well,
>>> i.e. reduce the number of retries, but any retries could affect
>>> performance for another path if they are using the same core.
>>>
>>> What do you think?
>>
>> I'm worried about scenarios with "pulsing" traffic, i.e. a burst of
>> packets that is not very big but still enough to overload the vring in a
>> short time, followed by a long period of silence. The HW can keep many
>> more packets in its RX queues than can be pushed to the virtio ring. In
>> this scenario, without retrying, most of the packets will be dropped.
>>
>> How about just decreasing VHOST_ENQ_RETRY_USECS to, maybe, 1 usec, with
>> my fix applied of course? Such an interval should be enough to handle
>> 20G of traffic with 64B packets on one PMD thread.
>> And, also, this timeout may be applied to both cases (something sent or
>> not) to decrease the cost of retrying.
>>
>> Best regards, Ilya Maximets.
>
> I haven't done any performance testing with many VMs, but...
>
> I think the retry logic was introduced because at the time
> NETDEV_MAX_BURST was 192 and we felt that batches of 192 packets could
> easily overload the guest ring.
>
> Now that NETDEV_MAX_BURST is only 32, I agree with Kevin on applying
> solution 1. Retries can degrade performance for other VMs, which IMHO is
> worse than dropping packets destined for a "slow" receiver VM.
>
> Thanks,
>
> Daniele
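For readers following the thread, below is a minimal C sketch of the
retry-loop shape discussed in the patch above. It is not the actual
__netdev_dpdk_vhost_send() code: the function name, the timeout value and
the 'dev'/'qid'/'pkts'/'cnt' parameters are illustrative, and the DPDK
16.04-era vhost API (struct virtio_net handles) is assumed. The point it
tries to show is that the timeout has to be checked on every failed
enqueue; if it is only checked while rte_vring_available_entries() reports
no free entries, a broken ring that always reports free entries while
rte_vhost_enqueue_burst() sends nothing keeps the loop spinning forever.

/* Minimal sketch of the vhost transmit retry loop under discussion.
 * NOT the actual OVS __netdev_dpdk_vhost_send() code: names, the timeout
 * value and the parameters are illustrative; DPDK 16.04-era API assumed. */
#include <stdint.h>
#include <rte_cycles.h>
#include <rte_mbuf.h>
#include <rte_virtio_net.h>

#define VHOST_ENQ_RETRY_USECS 64   /* illustrative timeout */

static void
vhost_send_sketch(struct virtio_net *dev, uint16_t qid,
                  struct rte_mbuf **pkts, uint16_t cnt)
{
    uint64_t timeout = VHOST_ENQ_RETRY_USECS * rte_get_timer_hz() / 1E6;
    uint64_t start = 0;

    do {
        uint16_t sent = rte_vhost_enqueue_burst(dev, qid, pkts, cnt);

        if (sent) {
            /* Forward progress: advance past the enqueued packets. */
            cnt -= sent;
            pkts = &pkts[sent];
            continue;
        }

        if (!start) {
            start = rte_get_timer_cycles();
        }

        /* The buggy variant only reached this check after spinning in
         * 'while (!rte_vring_available_entries(dev, qid))', so a broken
         * ring that always reports free entries skipped it forever.
         * Checking expiration unconditionally, as the patch proposes,
         * guarantees that the loop terminates. */
        if (rte_get_timer_cycles() - start > timeout) {
            break;   /* Give up; the caller drops the remaining packets. */
        }
    } while (cnt);
}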
Ok. I agree that removing the timeout is reasonable.

Another thing: the implementation provided by Kevin allows a situation (in
some corner case) where rte_vhost_enqueue_burst() will be called 32 times
in a row. How much time will that take? Maybe we should limit the number
of retries or the time spent inside __netdev_dpdk_vhost_send()?

What do you think?

Best regards, Ilya Maximets.
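As a concrete illustration of the question above, here is a hypothetical
sketch of a send loop that retries only while forward progress is being
made (Kevin's option 1) and additionally caps the number of
rte_vhost_enqueue_burst() calls per invocation. The function name, the
MAX_VHOST_ENQ_RETRIES constant and its value are made up for illustration
and are not taken from any posted patch.

/* Hypothetical sketch: retry only while packets are being accepted and
 * cap the total number of enqueue calls, so one slow guest cannot
 * monopolize the PMD core.  Names and the retry limit are illustrative,
 * not taken from any posted patch. */
#include <stdint.h>
#include <rte_mbuf.h>
#include <rte_virtio_net.h>

#define MAX_VHOST_ENQ_RETRIES 8    /* illustrative cap on enqueue calls */

static uint16_t
vhost_send_bounded_sketch(struct virtio_net *dev, uint16_t qid,
                          struct rte_mbuf **pkts, uint16_t cnt)
{
    unsigned int calls = 0;

    do {
        uint16_t sent = rte_vhost_enqueue_burst(dev, qid, pkts, cnt);

        cnt -= sent;
        pkts = &pkts[sent];
        calls++;

        /* Stop on no progress or once the cap is reached; the caller
         * drops whatever is left instead of stalling other ports that
         * are polled by the same PMD thread. */
        if (!sent || calls >= MAX_VHOST_ENQ_RETRIES) {
            break;
        }
    } while (cnt);

    return cnt;   /* packets the caller still has to drop */
}

With NETDEV_MAX_BURST at 32 and a guest accepting one packet per call,
such a cap would bound the worst case to MAX_VHOST_ENQ_RETRIES enqueue
calls instead of 32.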