On 04.07.2017 14:37, Bodireddy, Bhanuprakash wrote:
> Apologies for snipping the text. I did it to keep this thread readable. 
> 
>>
>> Hi Darrell and Jan.
>> Thanks for looking at this. I agree with Darrell that mixing implementations
>> on two different levels is a bad idea but, as I already wrote in reply to
>> Bhanuprakash [2], there is no issue with implementing output batching of
>> more than one rx batch.
>>
>> [2] https://mail.openvswitch.org/pipermail/ovs-dev/2017-July/334808.html
>>
>> Look at the incremental below. This is how it may look:
> Hi Ilya,
> 
> I briefly went through the incremental patch and see that you introduced a
> config parameter to tune the latency and built the logic around it. It may
> work, but we are back to the same question.
> 
> Is the dpif layer the right place to do all of this? Shouldn't this logic be
> part of the netdev layer, as tx batching rightly belongs there?

Why do you think so? 'Batched RX' and 'per-flow batching' are parts of
dpif-netdev, so 'TX batching' logically belongs to that layer as well.

> If a specific use case warrants tuning the queuing, flushing and latency
> parameters, let this be done at the netdev layer by providing more configs
> with acceptable defaults, and leave the dpif layer as simple as it is now.

You will need to implement all the flushing logic at the dpif layer anyway
to avoid lost and stuck packets. This means you'll end up with almost the
same code as in my incremental patch, the only difference being that you'll
call 'netdev_txq_flush' instead of 'dp_netdev_pmd_flush_output_on_port'.
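
To illustrate, here is a toy sketch. Only the two flush function names above
come from this thread; the structures, fields and numbers are made up for the
example and are not the actual OVS code:

/* Toy sketch: the flush step the dpif layer needs in either design.
 * After processing its rx batches, the PMD thread must walk every
 * output port that still has buffered packets and flush it, so that
 * nothing gets stuck when no further rx batches arrive.  Only the
 * callee differs between the two designs. */
#include <stddef.h>
#include <stdio.h>

struct out_port {
    int port_no;
    int n_buffered;              /* packets queued but not yet sent */
};

/* Stand-in for netdev_txq_flush() (netdev layer) or
 * dp_netdev_pmd_flush_output_on_port() (dpif-netdev layer). */
static void
flush_port(struct out_port *p)
{
    printf("flushing %d packets on port %d\n", p->n_buffered, p->port_no);
    p->n_buffered = 0;
}

/* Called by the PMD thread after processing its rx batches. */
static void
flush_all_pending(struct out_port *ports, size_t n_ports)
{
    for (size_t i = 0; i < n_ports; i++) {
        if (ports[i].n_buffered) {
            flush_port(&ports[i]);
        }
    }
}

int
main(void)
{
    struct out_port ports[] = { { .port_no = 1, .n_buffered = 32 },
                                { .port_no = 2, .n_buffered = 0 } };

    flush_all_pending(ports, sizeof ports / sizeof ports[0]);
    return 0;
}

Whichever callee is used, this walk over the ports with pending output has to
happen in the PMD loop inside dpif-netdev.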

About the simplicity of dpif-netdev: the incremental looks large and complex,
but it mostly refactors code already implemented in the main patch, so it
won't be that complex after squashing.

About 'acceptable defaults': I guess you mean different default values for
physical DPDK ports and vhost-user. But, as Jan mentioned, the main use case
for batching the output of many rx batches is the kernel virtio-net driver in
the guest with notifications enabled, while from my point of view the DPDK
virtio_pmd driver is the main use case for the OvS+DPDK solution. This means
we won't be able to choose default values for the knobs that are acceptable
in both cases.

To provide more flexible configuration, we could make the parameter per-port
configurable. This would allow both kinds of guests to be used simultaneously
while still achieving good results on both; see the sketch below. And this
can be done with either choice of implementation layer.
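
Something along these lines, for illustration only (the field name, the
"-1 means unset" convention and the numbers are made up, this is not the
actual configuration interface):

/* Toy sketch: a per-port max output latency that falls back to the
 * global default when not set.  A vhost port serving a kernel
 * virtio-net guest could then get a nonzero latency, while ports
 * serving DPDK virtio_pmd guests keep 0 (flush every rx batch). */
#include <stdint.h>
#include <stdio.h>

#define GLOBAL_OUTPUT_MAX_LATENCY_US 0    /* 0 = flush after every rx batch */

struct port_cfg {
    int32_t output_max_latency_us;        /* -1 = unset, use the global value */
};

static int32_t
effective_output_latency(const struct port_cfg *cfg)
{
    return cfg->output_max_latency_us >= 0
           ? cfg->output_max_latency_us
           : GLOBAL_OUTPUT_MAX_LATENCY_US;
}

int
main(void)
{
    struct port_cfg vhost_kernel = { .output_max_latency_us = 50 };
    struct port_cfg vhost_dpdk   = { .output_max_latency_us = -1 };

    printf("kernel virtio-net guest port: %d us\n",
           (int) effective_output_latency(&vhost_kernel));
    printf("DPDK virtio_pmd guest port:   %d us\n",
           (int) effective_output_latency(&vhost_dpdk));
    return 0;
}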

> You referred to performance issues with flushing triggered from a non-local
> thread (on a different NUMA node). This may be because, in the lab, we
> simulate these cases and saturate the 10G link, but it may not be a very
> pressing issue in real-world scenarios.

I have a few real scenarios where the same amount of traffic goes to a single
VM from two different NUMA nodes simultaneously. I expect performance
degradation in this scenario if the packets are flushed from a different
thread, because the cross-NUMA datapath is much slower (on my system I
measured the cross-NUMA path to be around 20-25% slower).

Unfortunately, I don't currently have a lab with these scenarios set up to
run tests.

To be clear, I'm not really a fan of time-based output batching of multiple
rx batches; I like the patch as it is, without the incremental. But if Darrell
and Jan think it's really needed, I can add it the way it's implemented in the
incremental patch. I prefer to keep the default max output latency at 0
because that covers the main use case for OvS+DPDK: a DPDK-enabled guest.
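
Just to show what that default means in practice, a toy sketch (not the
incremental itself; the struct, field names and numbers are invented for the
example):

/* Toy sketch of the time-based flush decision.  Buffered output is
 * sent when the batch is full or when the oldest buffered packet has
 * waited longer than the configured maximum latency.  With the
 * default max_latency_us of 0, the time condition is always true, so
 * output is flushed after every rx batch, i.e. the behaviour wanted
 * for DPDK virtio_pmd guests. */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

struct out_queue {
    int n_buffered;                 /* packets waiting to be sent */
    int64_t first_buffered_us;      /* when the first packet was queued */
};

static bool
should_flush(const struct out_queue *q, int64_t now_us,
             int64_t max_latency_us, int batch_size)
{
    if (!q->n_buffered) {
        return false;
    }
    return q->n_buffered >= batch_size
           || now_us - q->first_buffered_us >= max_latency_us;
}

int
main(void)
{
    struct out_queue q = { .n_buffered = 3, .first_buffered_us = 1000 };

    /* Max latency 0: flush right away; max latency 50 us: keep buffering. */
    printf("latency 0:  flush=%d\n", should_flush(&q, 1001, 0, 32));
    printf("latency 50: flush=%d\n", should_flush(&q, 1001, 50, 32));
    return 0;
}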

Best regards, Ilya Maximets.