Re: [PATCH net-next,v3 00/12] add flow_rule infrastructure

2018-11-26 Thread Marcelo Ricardo Leitner
On Mon, Nov 26, 2018 at 08:33:36PM +0100, Pablo Neira Ayuso wrote:
> Hi Marcelo,

Hello!

> 
> On Thu, Nov 22, 2018 at 07:08:32PM -0200, Marcelo Ricardo Leitner wrote:
> > On Thu, Nov 22, 2018 at 02:22:20PM -0200, Marcelo Ricardo Leitner wrote:
> > > On Wed, Nov 21, 2018 at 03:51:20AM +0100, Pablo Neira Ayuso wrote:
> > > > Hi,
> > > > 
> > > > This patchset is the third iteration [1] [2] [3] to introduce a kernel
> > > > intermediate representation (IR) to express ACL hardware offloads.
> > > 
> > > On v2 cover letter you had:
> > > 
> > > """
> > > However, cost of this layer is very small, adding 1 million rules via
> > > tc -batch, perf shows:
> > > 
> > >  0.06%  tc   [kernel.vmlinux][k] tc_setup_flow_action
> > > """
> > > 
> > > The above doesn't include time spent in child calls and I'm worried
> > > about the new allocation done by flow_rule_alloc(), as it can impact
> > > rule insertion rate. I'll run some tests here and report back.
> > 
> > I'm seeing +60ms on 1.75s (~3.4%) to add 40k flower rules on ingress
> > with skip_hw and tc in batch mode, with flows like:
> > 
> > filter add dev p6p2 parent : protocol ip prio 1 flower skip_hw
> > src_mac ec:13:db:00:00:00 dst_mac ec:14:c2:00:00:00 src_ip
> > 56.0.0.0 dst_ip 55.0.0.0 action drop
> > 
> > Only 20ms out of those 60ms were consumed within fl_change() calls
> > (including child calls), though.
> > 
> > Do you see something similar?  I used current net-next (d59da3fbfe3f)
> > with and without this patchset applied.
> 
> I see lots of send() and recv() in tc -batch via strace, using this
> example rule, repeating it N times:
> 
> filter add dev eth0 parent : protocol ip pref 1 flower dst_mac 
> f4:52:14:10:df:92 action mirred egress redirect dev eth1
> 
> This is taking ~8 seconds for 40k rules on my old laptop [*], which
> is already not too fast (without my patchset).

On an E5-2643 v3 @ 3.40GHz I see a total of 1.17s with an old iproute
(4.11) (more below).

> 
> I remember we discussed adding support for real batching for tc - we
> can probably do this transparently by assuming that if the skbuff
> length does not match the nlmsghdr->len field, then we enter batch
> mode from the kernel. This would require updating iproute2 to use the
> libmnl batching routines, or code that follows a similar approach.

Yes, I believe you're referring to

commit 485d0c6001c4aa134b99c86913d6a7089b7b2ab0
Author: Chris Mi 
Date:   Fri Jan 12 14:13:16 2018 +0900

tc: Add batchsize feature for filter and actions

That commit is present in iproute2 4.16 and does transparent batching on
the application side.

With tc from today's tip, I get 1.05s for 40k rules; both measurements
are with this patchset applied.

> 
> [*] 0.5 seconds in nft (similar ruleset); this is using netlink batching.

Nice.

Cheers,
  Marcelo


Re: [PATCH net-next,v3 00/12] add flow_rule infrastructure

2018-11-26 Thread Pablo Neira Ayuso
Hi Marcelo,

On Thu, Nov 22, 2018 at 07:08:32PM -0200, Marcelo Ricardo Leitner wrote:
> On Thu, Nov 22, 2018 at 02:22:20PM -0200, Marcelo Ricardo Leitner wrote:
> > On Wed, Nov 21, 2018 at 03:51:20AM +0100, Pablo Neira Ayuso wrote:
> > > Hi,
> > > 
> > > This patchset is the third iteration [1] [2] [3] to introduce a kernel
> > > intermediate representation (IR) to express ACL hardware offloads.
> > 
> > On v2 cover letter you had:
> > 
> > """
> > However, cost of this layer is very small, adding 1 million rules via
> > tc -batch, perf shows:
> > 
> >  0.06%  tc   [kernel.vmlinux][k] tc_setup_flow_action
> > """
> > 
> > The above doesn't include time spent in child calls and I'm worried
> > about the new allocation done by flow_rule_alloc(), as it can impact
> > rule insertion rate. I'll run some tests here and report back.
> 
> I'm seeing +60ms on 1.75s (~3.4%) to add 40k flower rules on ingress
> with skip_hw and tc in batch mode, with flows like:
> 
> filter add dev p6p2 parent : protocol ip prio 1 flower skip_hw
> src_mac ec:13:db:00:00:00 dst_mac ec:14:c2:00:00:00 src_ip
> 56.0.0.0 dst_ip 55.0.0.0 action drop
> 
> Only 20ms out of those 60ms were consumed within fl_change() calls
> (including child calls), though.
> 
> Do you see something similar?  I used current net-next (d59da3fbfe3f)
> with and without this patchset applied.

I see lots of send() and recv() in tc -batch via strace, using this
example rule, repeating it N times:

filter add dev eth0 parent : protocol ip pref 1 flower dst_mac 
f4:52:14:10:df:92 action mirred egress redirect dev eth1

This is taking ~8 seconds for 40k rules on my old laptop [*], which
is already not too fast (without my patchset).

I remember we discussed adding support for real batching for tc - we
can probably do this transparently by assuming that if the skbuff
length does not match the nlmsghdr->len field, then we enter batch
mode from the kernel. This would require updating iproute2 to use the
libmnl batching routines, or code that follows a similar approach.

[*] 0.5 seconds in nft (similar ruleset); this is using netlink batching.
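
To sketch the batching idea, a userspace sender using the libmnl batching
routines could look roughly like this. Only the batch handling is shown;
the tcmsg and flower attribute payload is omitted and the function itself
is illustrative, not part of iproute2:

/* Pack many netlink requests into one buffer and flush them with a
 * single sendto(), instead of one send()/recv() round trip per rule.
 */
#include <libmnl/libmnl.h>
#include <linux/rtnetlink.h>

static void send_rules_batched(struct mnl_socket *nl, unsigned int n_rules)
{
	char buf[MNL_SOCKET_BUFFER_SIZE * 2];
	struct mnl_nlmsg_batch *b;
	unsigned int i;

	b = mnl_nlmsg_batch_start(buf, MNL_SOCKET_BUFFER_SIZE);

	for (i = 0; i < n_rules; i++) {
		struct nlmsghdr *nlh;

		nlh = mnl_nlmsg_put_header(mnl_nlmsg_batch_current(b));
		nlh->nlmsg_type = RTM_NEWTFILTER;
		nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL;
		nlh->nlmsg_seq = i;
		/* ... struct tcmsg and flower attributes go here ... */

		/* no room left for another message: flush this batch */
		if (!mnl_nlmsg_batch_next(b)) {
			mnl_socket_sendto(nl, mnl_nlmsg_batch_head(b),
					  mnl_nlmsg_batch_size(b));
			mnl_nlmsg_batch_reset(b);
		}
	}

	if (!mnl_nlmsg_batch_is_empty(b))
		mnl_socket_sendto(nl, mnl_nlmsg_batch_head(b),
				  mnl_nlmsg_batch_size(b));

	mnl_nlmsg_batch_stop(b);
}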


Re: [PATCH net-next,v3 00/12] add flow_rule infrastructure

2018-11-22 Thread Marcelo Ricardo Leitner
On Thu, Nov 22, 2018 at 02:22:20PM -0200, Marcelo Ricardo Leitner wrote:
> On Wed, Nov 21, 2018 at 03:51:20AM +0100, Pablo Neira Ayuso wrote:
> > Hi,
> > 
> > This patchset is the third iteration [1] [2] [3] to introduce a kernel
> > intermediate representation (IR) to express ACL hardware offloads.
> 
> On v2 cover letter you had:
> 
> """
> However, cost of this layer is very small, adding 1 million rules via
> tc -batch, perf shows:
> 
>  0.06%  tc   [kernel.vmlinux][k] tc_setup_flow_action
> """
> 
> The above doesn't include time spent in child calls and I'm worried
> about the new allocation done by flow_rule_alloc(), as it can impact
> rule insertion rate. I'll run some tests here and report back.

I'm seeing +60ms on 1.75s (~3.4%) to add 40k flower rules on ingress
with skip_hw and tc in batch mode, with flows like:

filter add dev p6p2 parent : protocol ip prio 1 flower skip_hw
src_mac ec:13:db:00:00:00 dst_mac ec:14:c2:00:00:00 src_ip
56.0.0.0 dst_ip 55.0.0.0 action drop

Only 20ms out of those 60ms were consumed within fl_change() calls
(including child calls), though.

Do you see something similar?  I used current net-next (d59da3fbfe3f)
with and without this patchset applied.



Re: [PATCH net-next,v3 00/12] add flow_rule infrastructure

2018-11-22 Thread Marcelo Ricardo Leitner
On Wed, Nov 21, 2018 at 03:51:20AM +0100, Pablo Neira Ayuso wrote:
> Hi,
> 
> This patchset is the third iteration [1] [2] [3] to introduce a kernel
> intermediate representation (IR) to express ACL hardware offloads.

On v2 cover letter you had:

"""
However, cost of this layer is very small, adding 1 million rules via
tc -batch, perf shows:

 0.06%  tc   [kernel.vmlinux][k] tc_setup_flow_action
"""

The above doesn't include time spent in child calls and I'm worried
about the new allocation done by flow_rule_alloc(), as it can impact
rule insertion rate. I'll run some tests here and report back.
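
(For context, the allocation in question is roughly of this shape: a
sketch based on the cover letter description, i.e. a flow_rule carrying
an action array sized by tcf_exts_num_actions(), not the exact v3 code.)

#include <linux/slab.h>
#include <net/flow_offload.h>

static struct flow_rule *example_rule_alloc(unsigned int num_actions)
{
	struct flow_rule *rule;

	/* one allocation per rule insertion - this is what could show
	 * up in the 40k-rule batch timings discussed in this thread */
	rule = kzalloc(struct_size(rule, action.entries, num_actions),
		       GFP_KERNEL);
	if (!rule)
		return NULL;

	rule->action.num_entries = num_actions;
	return rule;
}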



[PATCH net-next,v3 00/12] add flow_rule infrastructure

2018-11-20 Thread Pablo Neira Ayuso
Hi,

This patchset is the third iteration [1] [2] [3] to introduce a kernel
intermediate representation (IR) to express ACL hardware offloads.

This round addresses feedback from Jiri Pirko:

* Add net/core/flow_offload.c and include/net/flow_offload.h.
* Add flow_rule_alloc() helper function.
* Remove _key postfix and _KEY_ infix in flow_action definitions.
* Use enum flow_action_mangle_base for consistency.
* Rename key field to entries and num_keys to num_entries.
* Rename struct flow_action_key to flow_action_entry.
* Use placeholder in struct flow_action to store array of actions
  from flow_rule_alloc().
* Add tcf_exts_num_actions() and pass it to flow_rule_alloc() to
  calculate the size of the array of actions.
* Rename the structure to struct flow_stats and the function to
  flow_stats_update().
* Add struct ethtool_rx_flow_rule, keep placeholder to private
  dissector information.
* Pass struct flow_rule *rule to all parser functions in qlogic/qede
  driver.
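
To give an idea of what driver code looks like on top of this IR, here is
a rough sketch of a driver-side parser. The helper and enum names
(flow_rule_match_key(), flow_rule_match_basic(), flow_action_for_each(),
FLOW_ACTION_DROP/REDIRECT) follow the naming used in this series; the
exact v3 signatures may differ:

#include <linux/errno.h>
#include <net/flow_offload.h>

static int example_parse_rule(struct flow_rule *rule)
{
	struct flow_action_entry *act;
	int i;

	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
		struct flow_match_basic match;

		flow_rule_match_basic(rule, &match);
		/* match.key / match.mask replace the old per-key
		 * skb_flow_dissector_target() boilerplate */
	}

	flow_action_for_each(i, act, &rule->action) {
		switch (act->id) {
		case FLOW_ACTION_DROP:
			/* program a drop entry in hardware */
			break;
		case FLOW_ACTION_REDIRECT:
			/* act->dev is the destination netdevice */
			break;
		default:
			return -EOPNOTSUPP;
		}
	}

	return 0;
}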

This also fixes a bug, reported by Manish Chopra, in the ethtool_rx_spec
to flow_rule translator.

Making all these changes has been an exercise in reviewing the existing
infrastructure, understanding what has been done, and proposing
improvements to the _great work_ that core driver developers have done
so far to introduce HW offloads through the existing frontend APIs.  I
still have more feedback and technical ideas that I'm very much looking
forward to discussing with them in the future.

The main goal of this patchset is to avoid code duplication for driver
developers. There are no Netfilter changes coming in this batch;
I would like to explore Netfilter hardware offloads in the future.

Thanks a lot for reviewing!

[1] https://lwn.net/Articles/766695/
[2] https://marc.info/?l=linux-netdev&m=154233253114506&w=2
[3] https://marc.info/?l=linux-netdev&m=154258780717036&w=2

Pablo Neira Ayuso (12):
  flow_dissector: add flow_rule and flow_match structures and use them
  net/mlx5e: support for two independent packet edit actions
  flow_dissector: add flow action infrastructure
  cls_api: add translator to flow_action representation
  cls_flower: add statistics retrieval infrastructure and use it
  drivers: net: use flow action infrastructure
  cls_flower: don't expose TC actions to drivers anymore
  flow_dissector: add wake-up-on-lan and queue to flow_action
  flow_dissector: add basic ethtool_rx_flow_spec to flow_rule structure 
translator
  dsa: bcm_sf2: use flow_rule infrastructure
  qede: place ethtool_rx_flow_spec after code after TC flower codebase
  qede: use ethtool_rx_flow_rule() to remove duplicated parser code

 drivers/net/dsa/bcm_sf2_cfp.c  | 109 +--
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c   | 252 +++
 .../net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c   | 450 ++---
 drivers/net/ethernet/intel/i40e/i40e_main.c| 178 ++---
 drivers/net/ethernet/intel/iavf/iavf_main.c| 195 +++---
 drivers/net/ethernet/intel/igb/igb_main.c  |  64 +-
 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c| 743 ++---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c |   2 +-
 .../net/ethernet/mellanox/mlxsw/spectrum_flower.c  | 258 ---
 drivers/net/ethernet/netronome/nfp/flower/action.c | 196 +++---
 drivers/net/ethernet/netronome/nfp/flower/match.c  | 417 ++--
 .../net/ethernet/netronome/nfp/flower/offload.c| 150 ++---
 drivers/net/ethernet/qlogic/qede/qede_filter.c | 560 ++--
 include/linux/ethtool.h|  10 +
 include/net/flow_offload.h | 199 ++
 include/net/pkt_cls.h  |  18 +-
 net/core/Makefile  |   2 +-
 net/core/ethtool.c | 189 ++
 net/core/flow_offload.c| 153 +
 net/sched/cls_api.c| 116 
 net/sched/cls_flower.c |  69 +-
 21 files changed, 2339 insertions(+), 1991 deletions(-)
 create mode 100644 include/net/flow_offload.h
 create mode 100644 net/core/flow_offload.c

-- 
2.11.0