On 24.05.2019 23:51, Roni Bar Yanai wrote:
> Hi Ilya,
>
> See inline.
>
>> -----Original Message-----
>> From: Ilya Maximets <[email protected]>
>> Sent: Friday, May 24, 2019 3:21 PM
>> To: Simon Horman <[email protected]>; Roni Bar Yanai <[email protected]>
>> Cc: [email protected]; Ian Stokes <[email protected]>; Kevin Traynor <[email protected]>;
>>     Oz Shlomo <[email protected]>; Eli Britstein <[email protected]>; Eyal Lavee <[email protected]>;
>>     Rony Efraim <[email protected]>; Ben Pfaff <[email protected]>
>> Subject: Re: [ovs-dev] [RFC V2] netdev-rte-offloads: HW offload virtio-forwarder
>>
>> On 22.05.2019 15:10, Simon Horman wrote:
>>> Hi,
>>>
>>> On Thu, May 16, 2019 at 08:44:31AM +0000, Roni Bar Yanai wrote:
>>>>> -----Original Message-----
>>>>> From: Ilya Maximets <[email protected]>
>>>>> Sent: Wednesday, May 15, 2019 4:37 PM
>>>>> To: Roni Bar Yanai <[email protected]>; [email protected]; Ian Stokes <[email protected]>;
>>>>>     Kevin Traynor <[email protected]>
>>>>> Cc: Eyal Lavee <[email protected]>; Oz Shlomo <[email protected]>; Eli Britstein <[email protected]>;
>>>>>     Rony Efraim <[email protected]>; Asaf Penso <[email protected]>
>>>>> Subject: Re: [RFC V2] netdev-rte-offloads: HW offload virtio-forwarder
>>>>>
>>>>> On 15.05.2019 16:01, Roni Bar Yanai wrote:
>>>>>> Hi Ilya,
>>>>>>
>>>>>> Thanks for the comment.
>>>>>>
>>>>>> I think the suggested arch is very good and has many advantages, and
>>>>>> in fact I had something very similar as my initial internal approach.
>>>>>>
>>>>>> However, I had one problem: it doesn't solve the kernel case. It makes
>>>>>> sense to do the forwarding with DPDK even when OVS uses the kernel
>>>>>> datapath (port representors and rule offloads are done with kernel OVS).
>>>>>> It makes sense because we can have one solution and because DPDK has
>>>>>> better performance.
>>>>>
>>>>> I'm not sure it makes practical sense to run a separate userspace
>>>>> datapath just to pass packets between vhost and a VF. This actually
>>>>> matches some of your own disadvantages of separate DPDK apps.
>>>>> A separate userspace datapath will need its own complex startup,
>>>>> configuration and maintenance. It will also consume additional CPU cores
>>>>> that cannot be shared with kernel packet processing. I think that simply
>>>>> moving everything to userspace in this case would be much simpler for the
>>>>> user than maintaining such a configuration.
>>>>
>>>> Maybe it doesn't make sense for OVS-DPDK, but for OVS users it does. When
>>>> you run offload with OVS-kernel, which for some vendors is the current
>>>> status, and virtio is a requirement, you now have millions of packets
>>>> that have to be forwarded. Basically you have two options:
>>>>
>>>> 1. Use an external application (we discussed that).
>>>>
>>>> 2. Create a userspace data plane and configure forwarding (OVS), but then
>>>> you have performance issues, as OVS is not optimized for this, and the
>>>> kernel data plane is much worse off, of course.
>>>>
>>>> Regarding burning a core: in the HW offload case you will do it either
>>>> way, and there is no benefit in adding forwarder functionality to the
>>>> kernel data path, mainly because of kernel performance limitations.
>>>>
>>>> I agree that in such a case moving to userspace is a solution for some,
>>>> but keep in mind that some users don't have such DPDK support, while
>>>> others have their own OVS-based data path with their own adjustments, so
>>>> it will be a hard transition.
>>>>
>>>> While the arch is good for the two DPDK use cases, it leaves the kernel one
>>>> out. Any thoughts on how we can add this use case as well and still keep
>>>> the suggested arch?
>>>
>>> ...
>>>
>>> At Netronome we have an Open Source standalone application, called
>>> virtio-forwarder (https://github.com/Netronome/virtio-forwarder).
>>> The reason that we provide this solution is that we see this as a
>>> requirement for some customers. This includes customers using OVS
>>> with the kernel based HW offload (OVS-TC).
>>>
>>> In general I agree that integration with OVS has some advantages and
>>> I'm happy to see this issue being discussed. But as we see demand
>>> for use of virtio-forwarder in conjunction with OVS-TC, I see that
>>> as a requirement for a solution that is integrated with OVS, which leads
>>> me to lean towards the proposal put forward by Roni.
>>>
>>> I also feel that the proposal put forward by Roni is likely to prove more
>>> flexible than a port-based approach, as proposed by Ilya. For one thing,
>>> such a design ought to allow for arbitrary combinations of port types.
>>> In fact, it would be entirely feasible to run this in conjunction with a
>>> non-OVS-offload-aware NIC (SR-IOV in VEB mode).
>>>
>>> Returning to the stand-alone Netronome implementation, I would welcome
>>> discussion of how any useful portions of it could be reused.
>>>
>>
>> Hi Simon. Thanks for the link. It's very interesting.
>>
>> My key point about the proposal put forward by Roni is that Open vSwitch
>> is an *OF switch* first of all and *not a multitool*. This proposal adds
>
> Let's not forget that performance is a major factor for a switch, and
> currently there is a gap between the market demand plus HW capabilities on
> one side and performance on the other.
> The idea is not to change OVS into a multi-tool; the idea is to improve OVS
> performance and make it a complete solution: a standalone OF switch with
> great performance.
But your proposal has nothing in common with OF. It hides real ports from the
OF layer, so this is no longer an OF switch. It's a "standalone OF switch" plus
a "side tool with great performance".

>
> Currently we have a technical gap with forwarding into virtio that we must
> close in SW.
> I've suggested splitting the code into a separate module so it can be
> configured independently and will require minimal change in the OVS
> non-offload code path. In addition to what Simon mentioned, a separate module
> will also allow other performance improvements such as TSO (today it is
> always disabled in SW). It can be enabled on the forwarder, and OVS will see
> the traffic after the HW segments it.

It's not obvious that you'll see any performance benefit from enabling TSO,
because drivers for HW NICs will (at least for Intel NICs) fall back to
non-vectorized Tx functions, and in many cases you will have to parse packets
to prepare them for checksum offloading. The performance of the HW itself is
not the bottleneck in the virtio forwarding case.
Anyway, you may enable it inside the DPDK vdev implementation.

>
>> some parasite work to the main OVS workflow which isn't connected
>
> I can agree with you that it is not ideal, but it is a matter of how you look
> at it. You still want performance, and the OVS role is still to forward the
> packets, not just take decisions.
> Maybe we can think about how to minimize it even more.
>
>> with its main purpose. If you really want this implemented, this should
>> probably be done inside DPDK. You may implement a virtual device in DPDK
>> (like bonding) that forwards traffic between its subports while its receive
>> function is called. Adding this vdev to OVS as a usual DPDK port, you will
>> be able to achieve your goal. DPDK as a development kit (an actual
>> multitool) is a much more appropriate place for such solutions.
>
> I don't agree with this point. What about the other forwarding use cases?
> What would you do in vDPA?

For real vDPA we could still implement netdev-vdpa. If you want the solutions
to look the same for vDPA and the SW forwarder, you may implement a vDPA vdev
inside DPDK too.

> There is still control to do. (I guess you can create another type of port.)

All the control handling could be done by the vhost PMD. We could start using
ethdev callbacks via rte_eth_dev_callback_register() to handle LSC/QUEUE events
from the virtio-forwarder vdev. This is a standard mechanism; no need for a
separate netdev type.

> What about the kernel? You will still need a standalone.

No. You'll run the userspace datapath with virtio-forwarder DPDK ports added.
It's the same as in your proposal, but with ports. The representor inside the
kernel will receive the traffic.
But I still think that for the user there is no difference between running a
separate datapath and running a separate virtio-forwarder DPDK app.
Also, thanks to Simon, you don't need to implement it. A standalone
virtio-forwarder already exists and supports all the required features.

> I think the use case for this special port is when you want to offload to
> virtio (not just DPDK), and this exists mainly in virtualization, when you
> have a vswitch. I don't see a DPDK application that includes HW offload using
> it unless it is a switch. The ability to use it highly depends on the switch
> architecture.

There are a lot of switches; OVS is not the only one. Implementing the
forwarder inside DPDK will allow it to be used easily from other solutions
like VPP or custom DPDK-based switches.
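To make the idea a bit more concrete, here is a rough sketch (not a working
implementation) of how a DPDK-based switch could attach such a forwarder vdev
and watch its LSC/QUEUE events through the standard ethdev callback mechanism.
The vdev name "net_virtio_fwd0" and its devargs are made up for this example;
only rte_dev_probe(), rte_eth_dev_get_port_by_name() and
rte_eth_dev_callback_register() are existing DPDK APIs:

#include <stdio.h>
#include <rte_dev.h>
#include <rte_ethdev.h>

/* Called by DPDK on link-state or queue-state events of the (hypothetical)
 * forwarder vdev.  OVS could translate these into its usual netdev change
 * notifications instead of introducing a new netdev type. */
static int
fwd_event_cb(uint16_t port_id, enum rte_eth_event_type event,
             void *cb_arg, void *ret_param)
{
    (void) cb_arg;
    (void) ret_param;

    if (event == RTE_ETH_EVENT_INTR_LSC) {
        struct rte_eth_link link;

        rte_eth_link_get_nowait(port_id, &link);
        printf("port %u link is %s\n", port_id,
               link.link_status ? "up" : "down");
    } else if (event == RTE_ETH_EVENT_QUEUE_STATE) {
        printf("port %u queue state changed\n", port_id);
    }
    return 0;
}

static int
attach_forwarder(void)
{
    /* Hypothetical devargs for the proposed forwarder PMD: one VF subport
     * and one vhost-user subport that it forwards between. */
    const char *devargs =
        "net_virtio_fwd0,vf=0000:03:00.2,vhost-path=/tmp/vhost0.sock";
    uint16_t port_id;

    if (rte_dev_probe(devargs) < 0) {
        return -1;
    }
    if (rte_eth_dev_get_port_by_name("net_virtio_fwd0", &port_id) != 0) {
        return -1;
    }

    /* Standard ethdev callbacks -- nothing OVS-specific is needed to be
     * notified about link and queue state of the forwarder. */
    rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_INTR_LSC,
                                  fwd_event_cb, NULL);
    rte_eth_dev_callback_register(port_id, RTE_ETH_EVENT_QUEUE_STATE,
                                  fwd_event_cb, NULL);
    return 0;
}

From the OVS side such a device would just be another DPDK port added through
the usual dpdk-devargs options, so no changes to the OF layer would be needed.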
>
>> BTW, the root cause of this approach is the slow packet forwarding in OVS
>> compared with direct rx + tx without any parsing.
>
> This is only part of it. Those are not switch ports; those are HW offload
> ports. You don't want forwarding rules on them. In fact, you don't want the
> user to see them as part of the switch, in dpif/show for example.

You contradict yourself: "not part of the switch" vs. "OF switch with great
performance". It's not an OF switch anymore if you're hiding ports from OF and
from users.
One more thing to mention is that appctl-based configuration will not survive
a daemon restart.

>
>> OVS performance improvement is probably the right direction to move in to
>> achieve reasonably effective packet forwarding. I prepared a patch that
>> should allow much faster packet forwarding for direct output flows like
>> "in_port=1,actions=output:2". Take a look here:
>>
>> https://patchwork.ozlabs.org/patch/1104878/
>>
>> It will still be slower than "no parsing at all", but could be suitable in
>> practice for some use cases.
>>
>
> Thanks. Impressive, we should test the performance.
> BR,
> Roni
>
>> Best regards, Ilya Maximets.
