On 27.06.2017 23:31, Bodireddy, Bhanuprakash wrote:
>> On 26.06.2017 00:52, Bodireddy, Bhanuprakash wrote:
>>>>> +
>>>>> +/* Flush the txq if there are any packets available.
>>>>> + * dynamic_txqs/concurrent_txq is disabled for vHost User ports as
>>>>> + * 'OVS_VHOST_MAX_QUEUE_NUM' txqs are preallocated.
>>>>> + */
>>>>
>>>> This comment is completely untrue. You may ignore 'concurrent_txq'
>>>> because you *must* lock the queue in any case because of dynamic txq
>>>> remapping inside netdev-dpdk. You must take the spinlock for the
>>>> 'dev-
>>>>> tx_q[qid % netdev->n_txq].map' before sending packets.
>>>
>>> Thanks for catching this and the lock should be taken before flushing the
>> queue. Below is how the new logic with spinlocks.
>>>
>>> /* Flush the txq if there are any packets available. */ static int
>>> netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
>>>                             bool concurrent_txq OVS_UNUSED) {
>>>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>>>     struct dpdk_tx_queue *txq;
>>>
>>>     qid = dev->tx_q[qid % netdev->n_txq].map;
>>>
>>>     txq = &dev->tx_q[qid];
>>>     if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
>>>         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>>>         netdev_dpdk_vhost_tx_burst(dev, qid);
>>>         rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>>>     }
>>>
>>>     return 0;
>>> }
>>>
>>>>
>>>> In current implementation you're able to call send and flush
>>>> simultaneously for the same queue from different threads because
>>>> 'flush' doesn't care about queue remapping.
>>>>
>>>> See '__netdev_dpdk_vhost_send' and 'netdev_dpdk_remap_txqs' for
>> detail.
>>>>
>>>> Additionally, flushing logic will be broken in case of txq remapping
>>>> because you may have different underneath queue each time you trying
>>>> to send of flush.
>>>
>>> I remember you raised this point earlier. To handle this case,  
>>> 'last_used_qid'
>> was introduced in tx_port.
>>> With this we can track any change in qid and make sure the packets are
>> flushed in the old queue.
>>> This logic is in patch 3/6 of this series.
>>
>> I'm talking about txq remapping inside netdev-dpdk not about XPS.
>> You're trying to flush the 'qid = tx_q[...].map' but the 'map'
>> value can be changed at any time because of enabling/disabling vrings inside
>> guest. Refer the 'vring_state_changed()' callback that triggers
>> 'netdev_dpdk_remap_txqs' which I already mentioned.
>> It's actually the reason why we're using unconditional locking for vhost-user
>> ports ignoring 'concurrent_txq' value.
> 
> I  spent some time looking at the above mentioned functions and see that 'qid 
> = tx_q[...].map' is updated by 'vhost_thread' as part of callback function. 
> This  gets triggered only when the queues are enabled/disabled in the guest.  
> I was using 'ethtool -L' in the guest for testing this.
> As you rightly mentioned the flushing logic breaks and the packets may 
> potentially get stuck in the queue. During my testing I found this corner 
> case that poses a problem with the patch  (steps in order).
>       
> -  Multiqueue is enabled on the guest with 2 rx,tx queues.
> - In PVP case, Traffic is sent to VM and the packets are sent on the 'queue 
> 1' of the vHostUser port.
> - PMD thread  keeps queueing packets on the vhostuser port queue. [using 
> netdev_dpdk_vhost_send()]
> - Meanwhile user changes queue configuration using 'ethtool -L ens3 combined 
> 1', enables only one queue now and disabling the other.
> - Callback functions gets triggered and disables the queue 1.
> - 'tx_q[...].map' is updated to '0' [queue 0] - default queue.
> 
> - Meantime PMD thread reads the map and finds that tx_q[..].map == '0' and 
> flushes the packets in the queue0. 
> This is the problem as the packets enqueued earlier on queue1 were not 
> flushed.
> 
> How about the below fix?
> - Before disabling the queues (qid = OVS_VHOST_QUEUE_DISABLED), flush the 
> packets in the queue from vring_state_changed().

It's not right to send packets to queues we're going to disable in OvS. We're
just responding on notification, the real queue was already effectively disabled
on the guest side and packets will not be delivered anyway and we should not
try to write to disabled queue from the OvS point of view.
I think it's better to just drop them here.

> 
> --------------------------vring_state_changed()------------------------------------
> if (strncmp(ifname, dev->vhost_id, IF_NAME_SZ) == 0) {
>             if (enable) {
>                 dev->tx_q[qid].map = qid;
>             } else {
>                 /* If the queue is disabled in the guest, the corresponding 
> qid
>                  * map should be set to OVS_VHOST_QUEUE_DISABLED(-2).
>                  *
>                  * The packets that were queued in 'qid' can be potentially
>                  * stuck and should be flushed before it is disabled.
>                  */
>                 netdev_dpdk_vhost_txq_flush(&dev->up, dev->tx_q[qid].map, 0);

It doesn't matter because of my explanation above, but here should be just 'qid'
not the 'dev->tx_q[qid].map', otherwise you may pass negative value to the flush
function and remap it again to nowhere.

>                 dev->tx_q[qid].map = OVS_VHOST_QUEUE_DISABLED;
>             }
>             netdev_dpdk_remap_txqs(dev);
>             exists = true;
>             ovs_mutex_unlock(&dev->mutex);
>             break;
>         }
>         ovs_mutex_unlock(&dev->mutex);
> ----------------------------------------------------------------------------------------------
> 
> Also update the txq_flush function to return immediately if queue is disabled 
> (qid == OVS_VHOST_QUEUE_DISABLED)

Hm. That is not the only issue here. You're actually need the same checking
as in send function:

     if (OVS_UNLIKELY(!is_vhost_running(dev) || qid < 0                         
                      || !(dev->flags & NETDEV_UP))) {                          
         rte_spinlock_lock(&dev->stats_lock);                                   
         dev->stats.tx_dropped+= cnt;                                           
         rte_spinlock_unlock(&dev->stats_lock);                                 
         goto out;                                                              
     }

Otherwise you may try to send packets in the middle of initialization
or reconfiguration process where device may not have 'vid' or
initialized queue ids and even mempool can be not allocated at this
point.

> 
> -----------------------------netdev_dpdk_vhost_txq_flush()-----------------------------
> netdev_dpdk_vhost_txq_flush(struct netdev *netdev, int qid,
>                             bool concurrent_txq OVS_UNUSED)
> {
>     struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>     struct dpdk_tx_queue *txq;
> 
>     qid = dev->tx_q[qid % netdev->n_txq].map;
> 
>     /* The qid may be disabled in the guest and has been set to
>      * OVS_VHOST_QUEUE_DISABLED.
>      */
>     if (OVS_UNLIKELY(qid < 0)) {
>         return 0;
>     }
> 
>     txq = &dev->tx_q[qid];
>     if (OVS_LIKELY(txq->vhost_pkt_cnt)) {
>         rte_spinlock_lock(&dev->tx_q[qid].tx_lock);
>         netdev_dpdk_vhost_tx_burst(dev, qid);
>         rte_spinlock_unlock(&dev->tx_q[qid].tx_lock);
>     }
> 
>     return 0;
> }
> --------------------------------------------------------------------------------------------------
> 
> Regards,
> Bhanuprakash.
> 
>>
>>>> Have you ever tested this with multiqueue vhost?
>>>> With disabling/enabling queues inside the guest?
>>>
>>> I did basic sanity testing with vhost multiqueue to verify throughput and to
>> check if flush logic is working at low rate(1 pkt each is sent to VM) on 
>> multiple
>> VMs.
>>> When you say queue 'enable/disable' in the guest are you referring to using
>> 'ethtool -L <inface> combined <channel no>' ?
>>> If this is the case I did do this by configuring 5 rxqs(DPDK and vhost User)
>> and changed the channel nos and verified with testpmd app again for flushing
>> scenarios. I didn't testing with kernel forwarding here.
>>>
>>> Regards,
>>> Bhanuprakash.
>>>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to