>
>On 27.06.2017 23:31, Bodireddy, Bhanuprakash wrote:
>>> On 26.06.2017 00:52, Bodireddy, Bhanuprakash wrote:
>>>>>> +
>>>>>> +/* Flush the txq if there are any packets available.
>>>>>> + * dynamic_txqs/concurrent_txq is disabled for vHost User ports
>>>>>> +as
>>>>>> + * 'OVS_VHOST_MAX_QUEUE_NUM' txqs are preallocated.
>>>>>> + */
>>>>>
>>>>> This comment is completely untrue. You may ignore 'concurrent_txq'
>>>>> because you *must* lock the queue in any case because of dynamic
>>>>> txq remapping inside netdev-dpdk. You must take the spinlock for
>>>>> the
>>>>> 'dev-
>>>>>> tx_q[qid % netdev->n_txq].map' before sending packets.
>>>>
>>>> Thanks for catching this and the lock should be taken before
>>>> flushing the
>>> queue. Below is how the new logic with spinlocks.
>>>>
>>>> /* Flush the txq if there are any packets available. */ static int
>>>> netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
>>>> bool concurrent_txq OVS_UNUSED) {
>>>> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>> struct dpdk_tx_queue *txq;
>>>>
>>>> qid = dev->tx_q[qid % netdev->n_txq].map;
>>>>
>>>> txq = &dev->tx_q[qid];
>>>> if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
>>>> rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>>>> netdev_dpdk_vhost_tx_burst(dev, qid);
>>>> rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>>>> }
>>>>
>>>> return 0;
>>>> }
>>>>
>>>>>
>>>>> In current implementation you're able to call send and flush
>>>>> simultaneously for the same queue from different threads because
>>>>> 'flush' doesn't care about queue remapping.
>>>>>
>>>>> See '__netdev_dpdk_vhost_send' and 'netdev_dpdk_remap_txqs' for
>>> detail.
>>>>>
>>>>> Additionally, flushing logic will be broken in case of txq
>>>>> remapping because you may have different underneath queue each
>time
>>>>> you trying to send of flush.
>>>>
>>>> I remember you raised this point earlier. To handle this case,
>'last_used_qid'
>>> was introduced in tx_port.
>>>> With this we can track any change in qid and make sure the packets
>>>> are
>>> flushed in the old queue.
>>>> This logic is in patch 3/6 of this series.
>>>
>>> I'm talking about txq remapping inside netdev-dpdk not about XPS.
>>> You're trying to flush the 'qid = tx_q[...].map' but the 'map'
>>> value can be changed at any time because of enabling/disabling vrings
>>> inside guest. Refer the 'vring_state_changed()' callback that
>>> triggers 'netdev_dpdk_remap_txqs' which I already mentioned.
>>> It's actually the reason why we're using unconditional locking for
>>> vhost-user ports ignoring 'concurrent_txq' value.
>>
>> I spent some time looking at the above mentioned functions and see that
>'qid = tx_q[...].map' is updated by 'vhost_thread' as part of callback
>function.
>This gets triggered only when the queues are enabled/disabled in the guest.
>I was using 'ethtool -L' in the guest for testing this.
>> As you rightly mentioned the flushing logic breaks and the packets may
>potentially get stuck in the queue. During my testing I found this corner case
>that poses a problem with the patch (steps in order).
>>
>> - Multiqueue is enabled on the guest with 2 rx,tx queues.
>> - In PVP case, Traffic is sent to VM and the packets are sent on the 'queue
>> 1'
>of the vHostUser port.
>> - PMD thread keeps queueing packets on the vhostuser port queue.
>> [using netdev_dpdk_vhost_send()]
>> - Meanwhile user changes queue configuration using 'ethtool -L ens3
>combined 1', enables only one queue now and disabling the other.
>> - Callback functions gets triggered and disables the queue 1.
>> - 'tx_q[...].map' is updated to '0' [queue 0] - default queue.
>>
>> - Meantime PMD thread reads the map and finds that tx_q[..].map == '0'
>and flushes the packets in the queue0.
>> This is the problem as the packets enqueued earlier on queue1 were not
>flushed.
>>
>> How about the below fix?
>> - Before disabling the queues (qid = OVS_VHOST_QUEUE_DISABLED), flush
>the packets in the queue from vring_state_changed().
>
>It's not right to send packets to queues we're going to disable in OvS. We're
>just responding on notification, the real queue was already effectively
>disabled on the guest side and packets will not be delivered anyway and we
>should not try to write to disabled queue from the OvS point of view.
>I think it's better to just drop them here.
I was in doubt how to handle this. I initially dropped the packets but then
thought maybe there is a small time window for the queue to get disabled and so
invoked flush function here.
No worries, I will drop the packets.
>
>>
>> --------------------------vring_state_changed()-----------------------
>> ------------- if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
>> if (enable) {
>> dev->tx_q[qid].map = qid;
>> } else {
>> /* If the queue is disabled in the guest, the corresponding
>> qid
>> * map should be set to OVS_VHOST_QUEUE_DISABLED(-2).
>> *
>> * The packets that were queued in 'qid' can be potentially
>> * stuck and should be flushed before it is disabled.
>> */
>> netdev_dpdk_vhost_txq_flush(&dev->up,
>> dev->tx_q[qid].map, 0);
>
>It doesn't matter because of my explanation above, but here should be just
>'qid'
>not the 'dev->tx_q[qid].map', otherwise you may pass negative value to the
>flush function and remap it again to nowhere.
Ok.
>
>> dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
>> }
>> netdev_dpdk_remap_txqs(dev);
>> exists = true;
>> ovs_mutex_unlock(&dev->mutex);
>> break;
>> }
>> ovs_mutex_unlock(&dev->mutex);
>> ----------------------------------------------------------------------
>> ------------------------
>>
>> Also update the txq_flush function to return immediately if queue is
>> disabled (qid == OVS_VHOST_QUEUE_DISABLED)
>
>Hm. That is not the only issue here. You're actually need the same checking as
>in send function:
Ok similar to what we do in the vhost_send().
>
> if (OVS_UNLIKELY(!is_vhost_running(dev) || qid < 0
> || !(dev->flags & NETDEV_UP))) {
> rte_spinlock_lock(&dev->stats_lock);
> dev->stats.tx_dropped+= cnt;
> rte_spinlock_unlock(&dev->stats_lock);
> goto out;
> }
Instead of goto I will free the packets there itself if at all any.
>
>Otherwise you may try to send packets in the middle of initialization or
>reconfiguration process where device may not have 'vid' or initialized queue
>ids and even mempool can be not allocated at this point.
Completely agree.
- Bhanuprakash.
>
>>
>> -----------------------------netdev_dpdk_vhost_txq_flush()------------
>> ----------------- netdev_dpdk_vhost_txq_flush(struct netdev *netdev,
>> int qid,
>> bool concurrent_txq OVS_UNUSED) {
>> struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>> struct dpdk_tx_queue *txq;
>>
>> qid = dev->tx_q[qid % netdev->n_txq].map;
>>
>> /* The qid may be disabled in the guest and has been set to
>> * OVS_VHOST_QUEUE_DISABLED.
>> */
>> if (OVS_UNLIKELY(qid < 0)) {
>> return 0;
>> }
>>
>> txq = &dev->tx_q[qid];
>> if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
>> rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>> netdev_dpdk_vhost_tx_burst(dev, qid);
>> rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>> }
>>
>> return 0;
>> }
>> ----------------------------------------------------------------------
>> ----------------------------
>>
>> Regards,
>> Bhanuprakash.
>>
>>>
>>>>> Have you ever tested this with multiqueue vhost?
>>>>> With disabling/enabling queues inside the guest?
>>>>
>>>> I did basic sanity testing with vhost multiqueue to verify
>>>> throughput and to
>>> check if flush logic is working at low rate(1 pkt each is sent to VM)
>>> on multiple VMs.
>>>> When you say queue 'enable/disable' in the guest are you referring
>>>> to using
>>> 'ethtool -L <inface> combined <channel no>' ?
>>>> If this is the case I did do this by configuring 5 rxqs(DPDK and
>>>> vhost User)
>>> and changed the channel nos and verified with testpmd app again for
>>> flushing scenarios. I didn't testing with kernel forwarding here.
>>>>
>>>> Regards,
>>>> Bhanuprakash.
>>>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev