Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-03-15 Thread Stokes, Ian
> > -Original Message-
> > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > Sent: Monday, January 29, 2018 7:00 AM
> > To: d...@openvswitch.org
> > Cc: Stokes, Ian ; Finn Christensen
> > ; Darrell Ball ; Chandran, Sugesh
> > ; Simon Horman
> > ; Yuanhan Liu 
> > Subject: [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> >
> > Hi,
> >
> > Here is a joint work from Mellanox and Napatech, to enable the flow hw
> > offload with the DPDK generic flow interface (rte_flow).
> >
> 
> Thanks for this Yuanhan,

Adding Shahaf Shuler who will continue this work.

> 
> I'm currently in the middle of reviewing and testing the series. I have
> some feedback on patches 1 and 2 of the series, so I decided to provide it
> now rather than wait.
> 
> Just a general comment: I'm seeing compilation failures for sparse, clang,
> and gcc. Maybe you already run these (I know sparse can be a bit hit
> and miss), but if you don't, it's worth giving them a go before submitting;
> it will catch smaller bugs that break the OVS travis build and avoid
> version re-spins.
> 
> I'll have more comments on the rest of the series this week.
> 
> Thanks
> Ian
> 
> > The basic idea is to associate the flow with a mark id (a uint32_t
> > number).
> > Later, we then get the flow directly from the mark id, which could
> > bypass some heavy CPU operations, including but not limited to
> > miniflow extract, emc lookup, dpcls lookup, etc.
> >
> > The association is done with CMAP in patch 1. The CPU workload
> > bypassing is done in patch 2. The flow offload is done in patch 3,
> > which mainly does two things:
> >
> > - translate the ovs match to DPDK rte flow patterns
> > - bind those patterns with a RSS + MARK action.
> >
> > Patch 5 makes the offload work happen in another thread, to keep
> > the datapath as light as possible.
> >
> > A PHY-PHY forwarding test with 1000 megaflows (udp,tp_src=1000-1999) and
> > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) shows a more than
> > 260% performance boost.
> >
> > Note that it's disabled by default, which can be enabled by:
> >
> > $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> >
> > v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
> >     - set the rss_conf for the rss action to NULL, to work around a mlx5
> >       change in DPDK v17.11. Note that it will obey the RSS settings
> >       OVS-DPDK has set in the beginning. Thus, nothing should be affected.
> >
> > v6: - fixed a sparse warning
> > - added documentation
> > - used hash_int to compute mark to flow hash
> > - added more comments
> > - added lock for pot lookup
> > - rebased on top of the latest code
> >
> > v5: - fixed an issue that it took too long if we do flow add/remove
> >   repeatedly.
> > - removed an unused mutex lock
> > - turned most of the log level to DBG
> > - rebased on top of the latest code
> >
> > v4: - use RSS action instead of QUEUE action with MARK
> > - make it work with multiple queue (see patch 1)
> > - rebased on top of latest code
> >
> > v3: - The mark and id association is done with array instead of CMAP.
> > - Added a thread to do hw offload operations
> > - Removed macros completely
> > - dropped the patch to set FDIR_CONF, which is a workaround for some
> >   Intel NICs.
> > - Added a debug patch to show all flow patterns we have created.
> > - Misc fixes
> >
> > v2: - workaround the queue action issue
> > - fixed the tcp_flags being skipped issue, which also fixed the
> >   build warnings
> > - fixed l2 patterns for Intel nic
> > - Converted some macros to functions
> > - did not hardcode the max number of flow/action
> > - rebased on top of the latest code
> >
> > Thanks.
> >
> > --yliu
> >
> > ---
> > Finn Christensen (1):
> >   netdev-dpdk: implement flow offload with rte flow
> >
> > Yuanhan Liu (5):
> >   dpif-netdev: associate flow with a mark id
> >   dpif-netdev: retrieve flow directly from the flow mark
> >   netdev-dpdk: add debug for rte flow patterns
> >   dpif-netdev: do hw flow offload in a thread
> >   Documentation: document ovs-dpdk flow offload
> >
> >  Documentation/howto/dpdk.rst |  17 +
> >  NEWS |   1 +
> >  lib/dp-packet.h  |  13 +
> >  lib/dpif-netdev.c| 495 -
> >  lib/flow.c   | 155 +++--
> >  lib/flow.h   |   1 +
> >  lib/netdev-dpdk.c| 736 ++-
> >  lib/netdev.h |   6 +
> >  8 files changed, 1385 insertions(+), 39 deletions(-)
> >
> > --
> > 2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-03-15 Thread Stokes, Ian
> -Original Message-
> From: Shahaf Shuler [mailto:shah...@mellanox.com]
> Sent: Tuesday, March 13, 2018 8:53 AM
> To: Yuanhan Liu <y...@fridaylinux.org>; Stokes, Ian <ian.sto...@intel.com>
> Cc: Flavio Leitner <f...@sysclose.org>; d...@openvswitch.org; Simon Horman
> <simon.hor...@netronome.com>; Olga Shern <ol...@mellanox.com>
> Subject: RE: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> Wednesday, February 7, 2018 4:12 AM, Yuanhan Liu:
> > On Tue, Feb 06, 2018 at 09:34:36PM +0000, Stokes, Ian wrote:
> > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > rte_flow
> 
> Hi,
> 
> Just to inform you that I am taking over this patchset for integration in
> the ovs 2.10 release.
> 
> I'd appreciate it if you could send again the comments/questions you have
> on this patchset, so I can fix/answer them.

Thanks for helping with this; I'll add you to the patch threads to continue
the review discussion.

Ian
> 



Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-03-13 Thread Shahaf Shuler
Wednesday, February 7, 2018 4:12 AM, Yuanhan Liu:
> On Tue, Feb 06, 2018 at 09:34:36PM +, Stokes, Ian wrote:
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow

Hi,

Just to inform you that I am taking over this patchset for integration in
the ovs 2.10 release.

I'd appreciate it if you could send again the comments/questions you have
on this patchset, so I can fix/answer them.




Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-03-07 Thread Stokes, Ian
> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, January 29, 2018 7:00 AM
> To: d...@openvswitch.org
> Cc: Stokes, Ian ; Finn Christensen
> ; Darrell Ball ; Chandran, Sugesh
> ; Simon Horman ;
> Yuanhan Liu 
> Subject: [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> Hi,
> 
> Here is a joint work from Mellanox and Napatech, to enable the flow hw
> offload with the DPDK generic flow interface (rte_flow).
> 

Thanks for this Yuanhan,

I'm currently in the middle of reviewing and testing the series. I have some
feedback on patches 1 and 2 of the series, so I decided to provide it now
rather than wait.

Just a general comment: I'm seeing compilation failures for sparse, clang, and
gcc. Maybe you already run these (I know sparse can be a bit hit and miss),
but if you don't, it's worth giving them a go before submitting; it will catch
smaller bugs that break the OVS travis build and avoid version re-spins.

I'll have more comments on the rest of the series this week.

Thanks
Ian

> The basic idea is to associate the flow with a mark id (a uint32_t
> number).
> Later, we then get the flow directly from the mark id, which could bypass
> some heavy CPU operations, including but not limited to miniflow
> extract, emc lookup, dpcls lookup, etc.
> 
> The association is done with CMAP in patch 1. The CPU workload bypassing
> is done in patch 2. The flow offload is done in patch 3, which mainly does
> two things:
> 
> - translate the ovs match to DPDK rte flow patterns
> - bind those patterns with a RSS + MARK action.
> 
> Patch 5 makes the offload work happen in another thread, to keep the
> datapath as light as possible.
> 
> A PHY-PHY forwarding test with 1000 megaflows (udp,tp_src=1000-1999) and 1
> million streams (tp_src=1000-1999, tp_dst=2000-2999) shows a more than 260%
> performance boost.
> 
> Note that it's disabled by default, which can be enabled by:
> 
> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> 
> v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
> - set the rss_conf for the rss action to NULL, to work around a mlx5 change
>   in DPDK v17.11. Note that it will obey the RSS settings OVS-DPDK has
>   set in the beginning. Thus, nothing should be affected.
> 
> v6: - fixed a sparse warning
> - added documentation
> - used hash_int to compute mark to flow hash
> - added more comments
> - added lock for pot lookup
> - rebased on top of the latest code
> 
> v5: - fixed an issue that it took too long if we do flow add/remove
>   repeatedly.
> - removed an unused mutex lock
> - turned most of the log level to DBG
> - rebased on top of the latest code
> 
> v4: - use RSS action instead of QUEUE action with MARK
> - make it work with multiple queue (see patch 1)
> - rebased on top of latest code
> 
> v3: - The mark and id association is done with array instead of CMAP.
> - Added a thread to do hw offload operations
> - Removed macros completely
> - dropped the patch to set FDIR_CONF, which is a workaround for some
>   Intel NICs.
> - Added a debug patch to show all flow patterns we have created.
> - Misc fixes
> 
> v2: - workaround the queue action issue
> - fixed the tcp_flags being skipped issue, which also fixed the
>   build warnings
> - fixed l2 patterns for Intel nic
> - Converted some macros to functions
> - did not hardcode the max number of flow/action
> - rebased on top of the latest code
> 
> Thanks.
> 
> --yliu
> 
> ---
> Finn Christensen (1):
>   netdev-dpdk: implement flow offload with rte flow
> 
> Yuanhan Liu (5):
>   dpif-netdev: associate flow with a mark id
>   dpif-netdev: retrieve flow directly from the flow mark
>   netdev-dpdk: add debug for rte flow patterns
>   dpif-netdev: do hw flow offload in a thread
>   Documentation: document ovs-dpdk flow offload
> 
>  Documentation/howto/dpdk.rst |  17 +
>  NEWS |   1 +
>  lib/dp-packet.h  |  13 +
>  lib/dpif-netdev.c| 495 -
>  lib/flow.c   | 155 +++--
>  lib/flow.h   |   1 +
>  lib/netdev-dpdk.c| 736 ++-
>  lib/netdev.h |   6 +
>  8 files changed, 1385 insertions(+), 39 deletions(-)
> 
> --
> 2.7.4



Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-07 Thread Stokes, Ian
> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Wednesday, February 7, 2018 9:37 AM
> To: Stokes, Ian <ian.sto...@intel.com>
> Cc: Flavio Leitner <f...@sysclose.org>; d...@openvswitch.org; Simon Horman
> <simon.hor...@netronome.com>; Olga Shern <ol...@mellanox.com>; Shahaf
> Shuler <shah...@mellanox.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Wed, Feb 07, 2018 at 09:24:52AM +, Stokes, Ian wrote:
> > > -Original Message-
> > > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > > Sent: Wednesday, February 7, 2018 2:12 AM
> > > To: Stokes, Ian <ian.sto...@intel.com>
> > > Cc: Flavio Leitner <f...@sysclose.org>; d...@openvswitch.org; Simon
> > > Horman <simon.hor...@netronome.com>; Olga Shern
> > > <ol...@mellanox.com>; Shahaf Shuler <shah...@mellanox.com>
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow
> > >
> > > On Tue, Feb 06, 2018 at 09:34:36PM +, Stokes, Ian wrote:
> > > > > -Original Message-
> > > > > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > > > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > > > To: Flavio Leitner <f...@sysclose.org>; Stokes, Ian
> > > > > <ian.sto...@intel.com>
> > > > > Cc: d...@openvswitch.org; Simon Horman
> > > > > <simon.hor...@netronome.com>
> > > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > > rte_flow
> > > > >
> > > > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > > > Hi,
> > > > > > >
> > > > > > > Here is a joint work from Mellanox and Napatech, to enable
> > > > > > > the flow hw offload with the DPDK generic flow interface
> (rte_flow).
> > > > > > >
> > > > > > > The basic idea is to associate the flow with a mark id (a
> > > > > > > uint32_t number).
> > > > > > > Later, we then get the flow directly from the mark id, which
> > > > > > > could bypass some heavy CPU operations, including but not
> > > > > > > limited to miniflow extract, emc lookup, dpcls lookup, etc.
> > > > > > >
> > > > > > > The association is done with CMAP in patch 1. The CPU
> > > > > > > workload bypassing is done in patch 2. The flow offload is
> > > > > > > done in patch 3, which mainly does two things:
> > > > > > >
> > > > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > > > - bind those patterns with a RSS + MARK action.
> > > > > > >
> > > > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > > > leaving the datapath as light as possible.
> > > > > > >
> > > > > > > A PHY-PHY forwarding with 1000 mega flows
> > > > > > > (udp,tp_src=1000-1999) and
> > > > > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show
> > > > > > > more than 260% performance boost.
> > > > > > >
> > > > > > > Note that it's disabled by default, which can be enabled by:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > First of all, thanks for working on this feature.
> > > > > >
> > > > > > I have some general comments I'd like to discuss before going
> > > > > > deeper on it.
> > > > > >
> > > > > > The documentation is too simple.  It should mention the HW
> > > > > > requirements in order to use this feature.
> > > > >
> > > > > It listed the NICs that support this feature.
> > > > >
> > > > > >  Also some important limitations, like no support for IP
> > > > > > frags, MPLS or conntrack, for instance.
> > > > >
> > > > > Yes, that could be added.
> > > > >
> > > > > > It seems it would be possible to leave the HW offloading code
> > > > > > outside of dpif-netdev.c

Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-07 Thread Yuanhan Liu
On Wed, Feb 07, 2018 at 09:24:52AM +, Stokes, Ian wrote:
> > -Original Message-
> > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > Sent: Wednesday, February 7, 2018 2:12 AM
> > To: Stokes, Ian <ian.sto...@intel.com>
> > Cc: Flavio Leitner <f...@sysclose.org>; d...@openvswitch.org; Simon Horman
> > <simon.hor...@netronome.com>; Olga Shern <ol...@mellanox.com>; Shahaf
> > Shuler <shah...@mellanox.com>
> > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> > 
> > On Tue, Feb 06, 2018 at 09:34:36PM +, Stokes, Ian wrote:
> > > > -Original Message-
> > > > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > > To: Flavio Leitner <f...@sysclose.org>; Stokes, Ian
> > > > <ian.sto...@intel.com>
> > > > Cc: d...@openvswitch.org; Simon Horman <simon.hor...@netronome.com>
> > > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > > rte_flow
> > > >
> > > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > > Hi,
> > > > > >
> > > > > > Here is a joint work from Mellanox and Napatech, to enable the
> > > > > > flow hw offload with the DPDK generic flow interface (rte_flow).
> > > > > >
> > > > > > The basic idea is to associate the flow with a mark id (a
> > > > > > uint32_t number).
> > > > > > Later, we then get the flow directly from the mark id, which
> > > > > > could bypass some heavy CPU operations, including but not
> > > > > > limited to miniflow extract, emc lookup, dpcls lookup, etc.
> > > > > >
> > > > > > The association is done with CMAP in patch 1. The CPU workload
> > > > > > bypassing is done in patch 2. The flow offload is done in patch
> > > > > > 3, which mainly does two things:
> > > > > >
> > > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > > - bind those patterns with a RSS + MARK action.
> > > > > >
> > > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > > leaving the datapath as light as possible.
> > > > > >
> > > > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999)
> > > > > > and
> > > > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > > > than 260% performance boost.
> > > > > >
> > > > > > Note that it's disabled by default, which can be enabled by:
> > > > >
> > > > > Hi,
> > > > >
> > > > > First of all, thanks for working on this feature.
> > > > >
> > > > > I have some general comments I'd like to discuss before going
> > > > > deeper on it.
> > > > >
> > > > > The documentation is too simple.  It should mention the HW
> > > > > requirements in order to use this feature.
> > > >
> > > > It listed the NICs that support this feature.
> > > >
> > > > >  Also some important limitations, like no support for IP frags,
> > > > > MPLS or conntrack, for instance.
> > > >
> > > > Yes, that could be added.
> > > >
> > > > > It seems it would be possible to leave the HW offloading code
> > > > > outside of dpif-netdev.c which is quite long already. I hope it
> > > > > will improve isolation and code clarity too.
> > > >
> > > > Yes. My thought was that we could do it later, when we are going to
> > > > add some other flow models (say, full offload) in the future.
> > > > Moreover, it seems that it's customary in OVS to have long source
> > > > code files. I was just following that.
> > > >
> > > > But I have no objections to breaking it out. Ian, should I break it
> > > > now? What's your plan to merge it?
> > >
> > > Hi Yuanhan, I believe I gave the same feedback on the first review pass
> > I did. Previously I was happy to hold off on this work in order to make it
> > into the 2.9 release but as that window has passed I think it makes sense
> > to break it out now. It will save the refactor work later.
> > 
> > So it won't be in the ovs 2.9 release?
> 
> No, the code freeze was the 31st of January for new features. Only bug fixes 
> are accepted for the remaining 2.9 window.  This would be part of the 2.10 
> release.

I thought we all came to an agreement that this would be part of the 2.9
release as an experimental feature. What made us miss it?

Thanks.

--yliu


Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-07 Thread Stokes, Ian
> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Wednesday, February 7, 2018 2:12 AM
> To: Stokes, Ian <ian.sto...@intel.com>
> Cc: Flavio Leitner <f...@sysclose.org>; d...@openvswitch.org; Simon Horman
> <simon.hor...@netronome.com>; Olga Shern <ol...@mellanox.com>; Shahaf
> Shuler <shah...@mellanox.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Tue, Feb 06, 2018 at 09:34:36PM +, Stokes, Ian wrote:
> > > -Original Message-
> > > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > > Sent: Tuesday, February 6, 2018 9:10 AM
> > > To: Flavio Leitner <f...@sysclose.org>; Stokes, Ian
> > > <ian.sto...@intel.com>
> > > Cc: d...@openvswitch.org; Simon Horman <simon.hor...@netronome.com>
> > > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with
> > > rte_flow
> > >
> > > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > > Hi,
> > > > >
> > > > > Here is a joint work from Mellanox and Napatech, to enable the
> > > > > flow hw offload with the DPDK generic flow interface (rte_flow).
> > > > >
> > > > > The basic idea is to associate the flow with a mark id (a
> > > > > uint32_t number).
> > > > > Later, we then get the flow directly from the mark id, which
> > > > > could bypass some heavy CPU operations, including but not
> > > > > limited to miniflow extract, emc lookup, dpcls lookup, etc.
> > > > >
> > > > > The association is done with CMAP in patch 1. The CPU workload
> > > > > bypassing is done in patch 2. The flow offload is done in patch
> > > > > 3, which mainly does two things:
> > > > >
> > > > > - translate the ovs match to DPDK rte flow patterns
> > > > > - bind those patterns with a RSS + MARK action.
> > > > >
> > > > > Patch 5 makes the offload work happen in another thread, for
> > > > > leaving the datapath as light as possible.
> > > > >
> > > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999)
> > > > > and
> > > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > > than 260% performance boost.
> > > > >
> > > > > Note that it's disabled by default, which can be enabled by:
> > > >
> > > > Hi,
> > > >
> > > > First of all, thanks for working on this feature.
> > > >
> > > > I have some general comments I'd like to discuss before going
> > > > deeper on it.
> > > >
> > > > The documentation is too simple.  It should mention the HW
> > > > requirements in order to use this feature.
> > >
> > > It listed the NICs that support this feature.
> > >
> > > >  Also some important limitations, like no support for IP frags,
> > > > MPLS or conntrack, for instance.
> > >
> > > Yes, that could be added.
> > >
> > > > It seems it would be possible to leave the HW offloading code
> > > > outside of dpif-netdev.c which is quite long already. I hope it
> > > > will improve isolation and code clarity too.
> > >
> > > Yes. My thought was that we could do it later, when we are going to
> > > add some other flow models (say, full offload) in the future.
> > > Moreover, it seems that it's customary in OVS to have long source
> > > code files. I was just following that.
> > >
> > > But I have no objections to breaking it out. Ian, should I break it
> > > now? What's your plan to merge it?
> >
> > Hi Yuanhan, I believe I gave the same feedback on the first review pass
> I did. Previously I was happy to hold off on this work in order to make it
> into the 2.9 release but as that window has passed I think it makes sense
> to break it out now. It will save the refactor work later.
> 
> So it won't be in the ovs 2.9 release?

No, the code freeze was the 31st of January for new features. Only bug fixes 
are accepted for the remaining 2.9 window.  This would be part of the 2.10 
release.

Thanks
Ian
> 
>   --yliu
> 
> > I should have more time later this week to continue reviewing, I'll
> provide some more feedback on the rest of the patchset then.
> >
> >

Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-06 Thread Yuanhan Liu
On Tue, Feb 06, 2018 at 09:34:36PM +, Stokes, Ian wrote:
> > -Original Message-
> > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > Sent: Tuesday, February 6, 2018 9:10 AM
> > To: Flavio Leitner <f...@sysclose.org>; Stokes, Ian <ian.sto...@intel.com>
> > Cc: d...@openvswitch.org; Simon Horman <simon.hor...@netronome.com>
> > Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> > 
> > On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > > Hi,
> > > >
> > > > Here is a joint work from Mellanox and Napatech, to enable the flow
> > > > hw offload with the DPDK generic flow interface (rte_flow).
> > > >
> > > > The basic idea is to associate the flow with a mark id (a uint32_t
> > > > number).
> > > > Later, we then get the flow directly from the mark id, which could
> > > > bypass some heavy CPU operations, including but not limited to
> > > > miniflow extract, emc lookup, dpcls lookup, etc.
> > > >
> > > > The association is done with CMAP in patch 1. The CPU workload
> > > > bypassing is done in patch 2. The flow offload is done in patch 3,
> > > > which mainly does two things:
> > > >
> > > > - translate the ovs match to DPDK rte flow patterns
> > > > - bind those patterns with a RSS + MARK action.
> > > >
> > > > Patch 5 makes the offload work happen in another thread, for leaving
> > > > the datapath as light as possible.
> > > >
> > > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and
> > > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > > than 260% performance boost.
> > > >
> > > > Note that it's disabled by default, which can be enabled by:
> > >
> > > Hi,
> > >
> > > First of all, thanks for working on this feature.
> > >
> > > I have some general comments I'd like to discuss before going deeper
> > > on it.
> > >
> > > The documentation is too simple.  It should mention the HW
> > > requirements in order to use this feature.
> > 
> > It listed the NICs that support this feature.
> > 
> > >  Also some important limitations, like no support for IP frags, MPLS
> > > or conntrack, for instance.
> > 
> > Yes, that could be added.
> > 
> > > It seems it would be possible to leave the HW offloading code outside
> > > of dpif-netdev.c which is quite long already. I hope it will improve
> > > isolation and code clarity too.
> > 
> > Yes. My thought was that we could do it later, when we are going to add
> > some other flow models (say, full offload) in the future. Moreover, it
> > seems that it's customary in OVS to have long source code files. I was
> > just following that.
> > 
> > But I have no objections to breaking it out. Ian, should I break it now?
> > What's your plan to merge it?
> 
> Hi Yuanhan, I believe I gave the same feedback on the first review pass I 
> did. Previously I was happy to hold off on this work in order to make it into 
> the 2.9 release but as that window has passed I think it makes sense to break 
> it out now. It will save the refactor work later.

So it won't be in the ovs 2.9 release?

--yliu

> I should have more time later this week to continue reviewing, I'll provide 
> some more feedback on the rest of the patchset then.
> 
> Thanks
> Ian
> 
> 
> > 
> > > So far there is no synchronization between PMDs in the fast path.
> > > However, we got a new mutex to sync PMDs and a new thread to manage.
> > > Even without the patch adding the thread, there would be a new mutex
> > > in the fast path.  It seems the slow path today causes issues, so
> > > maybe the whole upcall processing could be pushed to another thread. I
> > > realize this is outside of the scope of this patchset, but it is
> > > something we should consider.
> > >
> > > As an alternative solution, maybe we could use a DPDK ring to have a
> > > lockless way to push flows to the auxiliary thread.
> > 
> > It could be an option, but it brings a hard dependency on DPDK. Note that
> > the netdev datapath could live without DPDK, IIUC.
> > 
> > > There are some memory allocations and deallocations in the fast path
> > > using OVS functions.  Perhaps it is better to use rte_* functions
> > > instead (another reason to split the code out of dpif-netdev.c)

Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-06 Thread Stokes, Ian
> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Tuesday, February 6, 2018 9:10 AM
> To: Flavio Leitner <f...@sysclose.org>; Stokes, Ian <ian.sto...@intel.com>
> Cc: d...@openvswitch.org; Simon Horman <simon.hor...@netronome.com>
> Subject: Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow
> 
> On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> > On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > > Hi,
> > >
> > > Here is a joint work from Mellanox and Napatech, to enable the flow
> > > hw offload with the DPDK generic flow interface (rte_flow).
> > >
> > > The basic idea is to associate the flow with a mark id (a uint32_t
> > > number).
> > > Later, we then get the flow directly from the mark id, which could
> > > bypass some heavy CPU operations, including but not limited to
> > > miniflow extract, emc lookup, dpcls lookup, etc.
> > >
> > > The association is done with CMAP in patch 1. The CPU workload
> > > bypassing is done in patch 2. The flow offload is done in patch 3,
> > > which mainly does two things:
> > >
> > > - translate the ovs match to DPDK rte flow patterns
> > > - bind those patterns with a RSS + MARK action.
> > >
> > > Patch 5 makes the offload work happen in another thread, for leaving
> > > the datapath as light as possible.
> > >
> > > A PHY-PHY forwarding with 1000 mega flows (udp,tp_src=1000-1999) and
> > > 1 million streams (tp_src=1000-1999, tp_dst=2000-2999) show more
> > > than 260% performance boost.
> > >
> > > Note that it's disabled by default, which can be enabled by:
> >
> > Hi,
> >
> > First of all, thanks for working on this feature.
> >
> > I have some general comments I'd like to discuss before going deeper
> > on it.
> >
> > The documentation is too simple.  It should mention the HW
> > requirements in order to use this feature.
> 
> It listed the NICs that support this feature.
> 
> >  Also some important limitations, like no support for IP frags, MPLS
> > or conntrack, for instance.
> 
> Yes, that could be added.
> 
> > It seems it would be possible to leave the HW offloading code outside
> > of dpif-netdev.c which is quite long already. I hope it will improve
> > isolation and code clarity too.
> 
> Yes. My thought was that we could do it later, when we are going to add
> some other flow models (say, full offload) in the future. Moreover, it
> seems that it's customary in OVS to have long source code files. I was
> just following that.
> 
> But I have no objections to breaking it out. Ian, should I break it now?
> What's your plan to merge it?

Hi Yuanhan, I believe I gave the same feedback on the first review pass I did. 
Previously I was happy to hold off on this work in order to make it into the 
2.9 release but as that window has passed I think it makes sense to break it 
out now. It will save the refactor work later.

I should have more time later this week to continue reviewing, I'll provide 
some more feedback on the rest of the patchset then.

Thanks
Ian


> 
> > So far there is no synchronization between PMDs in the fast path.
> > However, we got a new mutex to sync PMDs and a new thread to manage.
> > Even without the patch adding the thread, there would be a new mutex
> > in the fast path.  It seems the slow path today causes issues, so
> > maybe the whole upcall processing could be pushed to another thread. I
> > realize this is outside of the scope of this patchset, but it is
> > something we should consider.
> >
> > As an alternative solution, maybe we could use a DPDK ring to have a
> > lockless way to push flows to the auxiliary thread.
> 
> It could be an option, but it brings a hard dependency on DPDK. Note that
> the netdev datapath could live without DPDK, IIUC.
> 
> > There are some memory allocations and deallocations in the fast path
> > using OVS functions.  Perhaps it is better to use rte_* functions
> > instead (another reason to split the code out of dpif-netdev.c)
> 
> Why do you think the one from DPDK is better? Also, it brings the DPDK
> dependency to the netdev datapth.
> 
> > I am curious to know why there is no flow dump or flush?
> 
> Not quite sure I got it.
> 
> > The function to help debugging (dump_flow_pattern) should use an
> > initial condition to return asap if debug is not enabled.
> > E.g.:
> > if (VLOG_DROP_DBG(rl)) {
> > return;
> > }
> 
> Good tip. Thanks!
> 
> > I am still wrapping my head around the RSS+MARK action and rte_flow
> > API, so I can't really comment those yet.
> 
> Thanks for looking at it!
> 
>   --yliu


Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-02-06 Thread Yuanhan Liu
On Thu, Feb 01, 2018 at 01:35:15AM -0200, Flavio Leitner wrote:
> On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> > Hi,
> > 
> > Here is a joint work from Mellanox and Napatech, to enable the flow hw
> > offload with the DPDK generic flow interface (rte_flow).
> > 
> > The basic idea is to associate the flow with a mark id (a uint32_t number).
> > Later, we can get the flow directly from the mark id, which bypasses
> > some heavy CPU operations, including but not limited to miniflow extract,
> > EMC lookup, and dpcls lookup.
> > 
> > The association is done with CMAP in patch 1. The CPU workload bypassing
> > is done in patch 2. The flow offload is done in patch 3, which mainly does
> > two things:
> > 
> > - translate the ovs match to DPDK rte flow patterns
> > - bind those patterns with a RSS + MARK action.
> > 
> > Patch 5 makes the offload work happen in another thread, to keep the
> > datapath as light as possible.
> > 
> > A PHY-PHY forwarding test with 1000 megaflows (udp,tp_src=1000-1999) and 1
> > million streams (tp_src=1000-1999, tp_dst=2000-2999) shows a more than 260%
> > performance boost.
> > 
> > Note that it's disabled by default, which can be enabled by:
> 
> Hi,
> 
> First of all, thanks for working on this feature.
> 
> I have some general comments I'd like to discuss before going deeper
> on it.
> 
> The documentation is too simple.  It should mention the HW requirements
> in order to use this feature.

The documentation does list the NICs that support this feature.

>  Also some important limitations, such as no
> support for IP fragments, MPLS, or conntrack.

Yes, that could be added.

> It seems it would be possible to leave the HW offloading code outside
> of dpif-netdev.c which is quite long already. I hope it will improve
> isolation and code clarity too.

Yes. My thought was that we could do it later, when we add some
other flow models (say, full offload) in the future. Moreover, it seems
customary in OVS to have long source files, so I was just
following that convention.

But I have no objections to breaking it out. Ian, should I do that now?
What's your plan for merging it?

> So far there is no synchronization between PMDs in the fast path.
> However, we got a new mutex to sync PMDs and a new thread to manage.
> Even without the patch adding the thread, there would be a new mutex
> in the fast path.  It seems the slow path today causes issues, so maybe
> the whole upcall processing could be pushed to another thread. I
> realize this is outside of the scope of this patchset, but it is
> something we should consider.
> 
> As an alternative solution, maybe we could use a DPDK ring to have a
> lockless way to push flows to the auxiliary thread.

It could be an option, but it brings a hard dependency on DPDK. Note that
the netdev datapath could live without DPDK, IIUC.

> There are some memory allocations and deallocations in the fast path
> using OVS functions.  Perhaps it is better to use rte_* functions
> instead (another reason to split the code out of dpif-netdev.c)

Why do you think the one from DPDK is better? Also, it brings the DPDK
dependency to the netdev datapath.

> I am curious to know why there is no flow dump or flush?

Not quite sure I got it.

> The function to help debugging (dump_flow_pattern) should use an
> initial condition to return asap if debug is not enabled.
> E.g.:
> if (VLOG_DROP_DBG(rl)) {
> return;
> }   

Good tip. Thanks!

> I am still wrapping my head around the RSS+MARK action and rte_flow
> API, so I can't really comment on those yet.

Thanks for looking at it!

--yliu


Re: [ovs-dev] [PATCH v7 0/6] OVS-DPDK flow offload with rte_flow

2018-01-31 Thread Flavio Leitner
On Mon, Jan 29, 2018 at 02:59:42PM +0800, Yuanhan Liu wrote:
> Hi,
> 
> Here is a joint work from Mellanox and Napatech, to enable the flow hw
> offload with the DPDK generic flow interface (rte_flow).
> 
> The basic idea is to associate the flow with a mark id (a uint32_t number).
> Later, we can get the flow directly from the mark id, which bypasses
> some heavy CPU operations, including but not limited to miniflow extract,
> EMC lookup, and dpcls lookup.
> 
> The association is done with CMAP in patch 1. The CPU workload bypassing
> is done in patch 2. The flow offload is done in patch 3, which mainly does
> two things:
> 
> - translate the ovs match to DPDK rte flow patterns
> - bind those patterns with a RSS + MARK action.
> 
> Patch 5 makes the offload work happen in another thread, to keep the
> datapath as light as possible.
> 
> A PHY-PHY forwarding test with 1000 megaflows (udp,tp_src=1000-1999) and 1
> million streams (tp_src=1000-1999, tp_dst=2000-2999) shows a more than 260%
> performance boost.
> 
> Note that it's disabled by default, which can be enabled by:

Hi,

First of all, thanks for working on this feature.

I have some general comments I'd like to discuss before going deeper
on it.

The documentation is too simple.  It should mention the HW requirements
needed to use this feature, and also some important limitations, such as
no support for IP fragments, MPLS, or conntrack.

It seems it would be possible to leave the HW offloading code outside
of dpif-netdev.c which is quite long already. I hope it will improve
isolation and code clarity too.

So far there is no synchronization between PMDs in the fast path.
However, we got a new mutex to sync PMDs and a new thread to manage.
Even without the patch adding the thread, there would be a new mutex
in the fast path.  It seems the slow path today causes issues, so maybe
the whole upcall processing could be pushed to another thread. I
realize this is outside of the scope of this patchset, but it is
something we should consider.

As an alternative solution, maybe we could use a DPDK ring to have a
lockless way to push flows to the auxiliary thread.

There are some memory allocations and deallocations in the fast path
using OVS functions.  Perhaps it is better to use rte_* functions
instead (another reason to split the code out of dpif-netdev.c)

I am curious to know why there is no flow dump or flush?

The function to help debugging (dump_flow_pattern) should use an
initial condition to return asap if debug is not enabled.
E.g.:
if (VLOG_DROP_DBG(rl)) {
return;
}   

I am still wrapping my head around the RSS+MARK action and the rte_flow
API, so I can't really comment on those yet.
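For orientation, a MARK+RSS rule built against the DPDK 17.11-era rte_flow API roughly takes the shape below. This is an uncompiled sketch, not the patchset's code: the rte_flow types, action types, and rte_flow_create() are real DPDK symbols, but the fully wildcarded ETH/IPv4/UDP pattern and the helper itself are illustrative assumptions.

```c
#include <stdlib.h>
#include <rte_flow.h>

static struct rte_flow *
offload_flow_with_mark(uint16_t port_id, uint32_t mark_id,
                       uint16_t nb_rx_queues, struct rte_flow_error *err)
{
    /* Match ETH/IPv4/UDP; the real code fills spec/mask from the OVS
     * match, while NULL spec/mask here means "match anything". */
    struct rte_flow_item pattern[] = {
        { .type = RTE_FLOW_ITEM_TYPE_ETH },
        { .type = RTE_FLOW_ITEM_TYPE_IPV4 },
        { .type = RTE_FLOW_ITEM_TYPE_UDP },
        { .type = RTE_FLOW_ITEM_TYPE_END },
    };

    /* MARK tags matching packets with mark_id; RSS keeps spreading them
     * across all Rx queues as before.  rss_conf is left NULL so the PMD
     * reuses the device's existing RSS configuration (the mlx5
     * workaround mentioned in the v7 changelog below). */
    struct rte_flow_action_mark mark = { .id = mark_id };
    struct rte_flow_action_rss *rss =
        malloc(sizeof *rss + nb_rx_queues * sizeof rss->queue[0]);
    if (!rss) {
        return NULL;
    }
    rss->rss_conf = NULL;
    rss->num = nb_rx_queues;
    for (uint16_t i = 0; i < nb_rx_queues; i++) {
        rss->queue[i] = i;
    }

    struct rte_flow_action actions[] = {
        { .type = RTE_FLOW_ACTION_TYPE_MARK, .conf = &mark },
        { .type = RTE_FLOW_ACTION_TYPE_RSS,  .conf = rss },
        { .type = RTE_FLOW_ACTION_TYPE_END },
    };

    struct rte_flow_attr attr = { .ingress = 1 };
    /* The API guarantees PMDs copy pattern/action confs at creation
     * time, so the temporaries can be freed afterwards. */
    struct rte_flow *flow = rte_flow_create(port_id, &attr, pattern,
                                            actions, err);
    free(rss);
    return flow;
}
```

Matching packets then arrive with mbuf->hash.fdir.hi carrying mark_id, which is what the fast-path mark-to-flow lookup consumes.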

Thanks!
fbl

> 
> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> 
> v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
> - set the rss_conf for rss action to NULL, to workaround a mlx5 change
>   in DPDK v17.11. Note that it will obey the rss settings OVS-DPDK has
>   set in the beginning. Thus, nothing should be affected.
> 
> v6: - fixed a sparse warning
> - added documentation
> - used hash_int to compute mark to flow hash
> - added more comments
> - added lock for pot lookup
> - rebased on top of the latest code
> 
> v5: - fixed an issue that it took too long if we do flow add/remove
>   repeatedly.
> - removed an unused mutex lock
> - turned most of the log level to DBG
> - rebased on top of the latest code
> 
> v4: - use RSS action instead of QUEUE action with MARK
> - make it work with multiple queue (see patch 1)
> - rebased on top of latest code
> 
> v3: - The mark and id association is done with array instead of CMAP.
> - Added a thread to do hw offload operations
> - Removed macros completely
> - dropped the patch to set FDIR_CONF, which is a workaround for some
>   Intel NICs.
> - Added a debug patch to show all flow patterns we have created.
> - Misc fixes
> 
> v2: - workaround the queue action issue
> - fixed the tcp_flags being skipped issue, which also fixed the
>   build warnings
> - fixed l2 patterns for Intel nic
> - Converted some macros to functions
> - did not hardcode the max number of flow/action
> - rebased on top of the latest code
> 
> Thanks.
> 
> --yliu
> 
> ---
> Finn Christensen (1):
>   netdev-dpdk: implement flow offload with rte flow
> 
> Yuanhan Liu (5):
>   dpif-netdev: associate flow with a mark id
>   dpif-netdev: retrieve flow directly from the flow mark
>   netdev-dpdk: add debug for rte flow patterns
>   dpif-netdev: do hw flow offload in a thread
>   Documentation: document ovs-dpdk flow offload
> 
>  Documentation/howto/dpdk.rst |  17 +
>  NEWS |   1 +
>  lib/dp-packet.h  |  13 +
>  lib/dpif-netdev.c| 495 -
>  lib/flow.c   | 155 +++--
>  lib/flow.h   |