Re: [ovs-dev] can not update userspace vxlan tunnel neigh mac when peer VTEP mac changed

2018-03-27 Thread ychen


Hi Jan,
  Thanks for your reply.
  We have already modified the code to snoop on GARP packets, but these two 
problems still exist.
   I think the main problem is that GARP packets are not sent from the 
interfaces when we change the NIC MAC address or IP address (I read the Linux 
kernel code and found no such process),
   so we must depend on a data packet to trigger the ARP request.
  I know that in the Linux kernel, when an ARP request is triggered, data 
packets are queued for a limited time, so the first data packet can still be 
sent out once the ARP reply is received.
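
To make the queuing idea concrete, here is a minimal Python sketch of holding
packets while address resolution is pending. It is only a toy model: the names
PendingQueue, hold, release, and max_pending are hypothetical and are not Linux
or OVS APIs, and the bound of three packets is an arbitrary assumption.

```python
from collections import defaultdict, deque


class PendingQueue:
    """Toy model: hold packets per next hop until its MAC is resolved."""

    def __init__(self, max_pending=3):
        # max_pending is an assumed bound; the oldest packet is dropped first.
        self.max_pending = max_pending
        self.pending = defaultdict(lambda: deque(maxlen=self.max_pending))

    def hold(self, next_hop_ip, packet):
        """Queue a packet; return True if the caller should send an ARP request."""
        queue = self.pending[next_hop_ip]
        first = not queue
        queue.append(packet)
        return first

    def release(self, next_hop_ip):
        """Called when the ARP reply arrives; return the packets to transmit."""
        return list(self.pending.pop(next_hop_ip, ()))
```

With something like this, the first packet would be sent after the ARP reply
instead of being dropped, at the cost of a small amount of buffering per
unresolved next hop.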


  For the second problem, can we update the tunnel neigh cache when we receive a 
data packet from the remote VTEP? We can fetch tun_src and the outer source MAC 
from the data packet.
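
A rough Python sketch of what I mean, under stated assumptions: the class
TunnelNeighCache and the hook on_tunnel_rx are hypothetical names for
illustration and do not correspond to the OVS tnl-neigh-cache code, and the
15 second aging time is an arbitrary value. It also assumes the peer VTEP is on
the same L2 segment, because otherwise the outer source MAC belongs to a router
and not to the VTEP.

```python
import time


class TunnelNeighCache:
    """Toy IP -> MAC cache with aging; not the OVS implementation."""

    def __init__(self, max_age=15.0, clock=time.monotonic):
        self.max_age = max_age      # assumed aging time in seconds
        self.clock = clock
        self.entries = {}           # ip -> (mac, expiry time)

    def learn(self, ip, mac):
        """Insert or overwrite the mapping and restart its aging timer."""
        self.entries[ip] = (mac, self.clock() + self.max_age)

    def lookup(self, ip):
        """Return the cached MAC, or None if the entry is missing or expired."""
        entry = self.entries.get(ip)
        if entry is None or entry[1] <= self.clock():
            self.entries.pop(ip, None)
            return None
        return entry[0]

    def on_tunnel_rx(self, tun_src, outer_smac):
        """Hypothetical hook: refresh the entry from a received tunnel packet.

        Returns True when the MAC for tun_src was missing or has changed.
        """
        changed = self.lookup(tun_src) != outer_smac
        self.learn(tun_src, outer_smac)
        return changed
```

With the addresses from my example, a packet received from 10.182.6.81 with
outer source MAC 24:eb:26:c3:16:a5 would replace the stale fa:eb:26:c3:16:a5
entry.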







At 2018-03-28 04:41:12, "Jan Scheurich"  wrote:
>Hi Ychen,
>
>Funny! Again we are already working on a solution for problem 1. 
>
>In our scenario the situation arises with a tunnel next hop being a VRRP 
>switch pair. The switch sends periodic gratuitous ARPs (GARPs) to announce the 
>VRRP IP but OVS native tunneling doesn't snoop on GARPs, only on ARP 
>replies. The host IP stack, on the other hand, accepts these GARPs and stops 
>sending refresh ARP requests itself. Hence nothing for OVS to snoop upon.
>
>The solution is to make OVS snoop on GARP requests also.
> 
>It is quite possible that this will also fix your problem 2. If you also have 
>a VRRP tunnel next hop which just moves its VRRP IP address but not the MAC 
>address, it should send a GARP with the new IP/MAC mapping when it moves the IP 
>address, which would now update OVS' tunnel neighbor cache.
>
>@Mano: Can you submit the GARP patch in the near future?
>
>BR, Jan
>
>> -----Original Message-----
>> From: ovs-dev-boun...@openvswitch.org 
>> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of ychen
>> Sent: Tuesday, 27 March, 2018 14:44
>> To: d...@openvswitch.org
>> Subject: [ovs-dev] can not update userspace vxlan tunnel neigh mac when peer 
>> VTEP mac changed
>> 
>> Hi,
>>    I found that userspace VXLAN sometimes does not work properly.
>>    1. First data packet is lost
>> When the tunnel neigh cache is empty, the first data packet triggers 
>> sending an ARP request to the peer VTEP, and the data packet is dropped.
>> The tunnel neigh cache adds the entry only when the ARP reply is received.
>> 
>>     err = tnl_neigh_lookup(out_dev->xbridge->name, &d_ip6, &dmac);
>>     if (err) {
>>         xlate_report(ctx, OFT_DETAIL,
>>                      "neighbor cache miss for %s on bridge %s, "
>>                      "sending %s request",
>>                      buf_dip6, out_dev->xbridge->name, d_ip ? "ARP" : "ND");
>>         if (d_ip) {
>>             tnl_send_arp_request(ctx, out_dev, smac, s_ip, d_ip);
>>         } else {
>>             tnl_send_nd_request(ctx, out_dev, smac, &s_ip6, &d_ip6);
>>         }
>>         return err;
>>     }
>> 
>> 
>> 2. Connection is lost when the peer VTEP MAC changes
>> Suppose the VTEP MAC is already in the tunnel neigh cache, for example:
>> 10.182.6.81   fa:eb:26:c3:16:a5   br-phy
>> 
>> When a data packet comes in, this MAC is used to build the outer 
>> VXLAN header.
>> Then the MAC of VTEP 10.182.6.81 changes from fa:eb:26:c3:16:a5 to 
>> 24:eb:26:c3:16:a5 because the NIC was replaced.
>> 
>> Data packets continue to be sent with the old MAC fa:eb:26:c3:16:a5, but 
>> the peer VTEP does not accept these packets because the MAC does not match.
>> The stale tunnel neigh entry does not age out until the data packets stop.
>> 
>> 
>>     if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
>>         tnl_neigh_snoop(flow, wc, ctx->xbridge->name);
>>     }
>> 
>> 
>> 3. Is anybody already working on these problems?
>> 
>> 
>> 
>> ___
>> dev mailing list
>> d...@openvswitch.org
>> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] Use new default nb and sb dbs for sandbox northd:

2018-03-27 Thread aginwala
As per the new clustering change, the ovn-northd sandbox should use nb1.ovsdb
and sb1.ovsdb. This was updated in the ovn-northd --help section but missed for
the sandbox. This commit fixes that.

Reported-by: Mark Michelson 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345535.html
Signed-off-by: aginwala 
---
 tutorial/ovn-setup.sh | 4 ++++
 tutorial/ovs-sandbox  | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tutorial/ovn-setup.sh b/tutorial/ovn-setup.sh
index 943ca58..9a725cf 100755
--- a/tutorial/ovn-setup.sh
+++ b/tutorial/ovn-setup.sh
@@ -31,5 +31,9 @@ ovs-vsctl add-port br-int p2 -- \
 # View a summary of the configuration
 printf "\n=== ovn-nbctl show ===\n\n"
 ovn-nbctl show
+printf "\n=== ovn-nbctl show with wait hv ===\n\n"
+ovn-nbctl --wait=hv show
 printf "\n=== ovn-sbctl show ===\n\n"
 ovn-sbctl show
+printf "\n=== ovn-sbctl show with wait hv ===\n\n"
+ovn-sbctl --wait=hv show
diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
index babc032..c3e9f12 100755
--- a/tutorial/ovs-sandbox
+++ b/tutorial/ovs-sandbox
@@ -510,8 +510,8 @@ if $ovn; then
 fi
 rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
 --no-chdir --pidfile -vconsole:off --log-file \
---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
+--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
+--ovnnb-db=unix:"$sandbox"/nb1.ovsdb
 rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
 $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off --log-file
 rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
-- 
1.9.1



Re: [ovs-dev] [PATCH] Use new default nb and sb dbs for sandbox northd:

2018-03-27 Thread aginwala
Please ignore this. I will resend a new patch as the patch file got messed
up.

On Fri, Mar 23, 2018 at 1:04 PM,  wrote:

> From: aginwala 
>
> As per new clustering change, ovn-northd sandbox should use nb1.ovsdb and
> sb1.ovsdb. It was updated in ovn-northd --help section but missed for
> sandbox.
> This commit fixes the same
>
> Reported-by: Mark Michelson 
> Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/
> 345535.html
> Acked-By: aginwala 
> Signed-off-by: aginwala 
> ---
>  ...-default-nb-and-sb-dbs-for-sandbox-northd.patch | 50
> ++
>  tutorial/ovn-setup.sh  |  4 ++
>  tutorial/ovs-sandbox   |  4 +-
>  3 files changed, 56 insertions(+), 2 deletions(-)
>  create mode 100644 0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.
> patch
>
> diff --git a/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch
> b/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch
> new file mode 100644
> index 000..c8a0286
> --- /dev/null
> +++ b/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch
> @@ -0,0 +1,50 @@
> +From eb9051426693843797ea0f2a0bf21b1b5272fd2f Mon Sep 17 00:00:00 2001
> +From: aginwala 
> +Date: Fri, 23 Mar 2018 12:41:24 -0700
> +Subject: [PATCH] Use new default nb and sb dbs for sandbox northd:
> +
> +As per new clustering change, ovn-northd sandbox should use nb1.ovsdb and
> +sb1.ovsdb. It was updated in ovn-northd --help section but missed for
> sandbox.
> +This commit fixes the same
> +
> +Reported-by: Mark Michelson 
> +Reported-at: https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/
> 345535.html
> +Acked-By: aginwala 
> +Signed-off-by: aginwala 
> +---
> + tutorial/ovn-setup.sh | 4 
> + tutorial/ovs-sandbox  | 4 ++--
> + 2 files changed, 6 insertions(+), 2 deletions(-)
> +
> +diff --git a/tutorial/ovn-setup.sh b/tutorial/ovn-setup.sh
> +index 943ca58..9a725cf 100755
> +--- a/tutorial/ovn-setup.sh
>  b/tutorial/ovn-setup.sh
> +@@ -31,5 +31,9 @@ ovs-vsctl add-port br-int p2 -- \
> + # View a summary of the configuration
> + printf "\n=== ovn-nbctl show ===\n\n"
> + ovn-nbctl show
> ++printf "\n=== ovn-nbctl show with wait hv ===\n\n"
> ++ovn-nbctl --wait=hv show
> + printf "\n=== ovn-sbctl show ===\n\n"
> + ovn-sbctl show
> ++printf "\n=== ovn-sbctl show with wait hv ===\n\n"
> ++ovn-sbctl --wait=hv show
> +diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
> +index babc032..c3e9f12 100755
> +--- a/tutorial/ovs-sandbox
>  b/tutorial/ovs-sandbox
> +@@ -510,8 +510,8 @@ if $ovn; then
> + fi
> + rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
> + --no-chdir --pidfile -vconsole:off --log-file \
> +---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
> +---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
> ++--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
> ++--ovnnb-db=unix:"$sandbox"/nb1.ovsdb
> + rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
> + $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off
> --log-file
> + rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
> +--
> +1.9.1
> +
> diff --git a/tutorial/ovn-setup.sh b/tutorial/ovn-setup.sh
> index 943ca58..9a725cf 100755
> --- a/tutorial/ovn-setup.sh
> +++ b/tutorial/ovn-setup.sh
> @@ -31,5 +31,9 @@ ovs-vsctl add-port br-int p2 -- \
>  # View a summary of the configuration
>  printf "\n=== ovn-nbctl show ===\n\n"
>  ovn-nbctl show
> +printf "\n=== ovn-nbctl show with wait hv ===\n\n"
> +ovn-nbctl --wait=hv show
>  printf "\n=== ovn-sbctl show ===\n\n"
>  ovn-sbctl show
> +printf "\n=== ovn-sbctl show with wait hv ===\n\n"
> +ovn-sbctl --wait=hv show
> diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
> index babc032..c3e9f12 100755
> --- a/tutorial/ovs-sandbox
> +++ b/tutorial/ovs-sandbox
> @@ -510,8 +510,8 @@ if $ovn; then
>  fi
>  rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
>  --no-chdir --pidfile -vconsole:off --log-file \
> ---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
> ---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
> +--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
> +--ovnnb-db=unix:"$sandbox"/nb1.ovsdb
>  rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
>  $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off
> --log-file
>  rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
> --
> 1.9.1
>
>


[ovs-dev] [PATCH] Use new default nb and sb dbs for sandbox northd:

2018-03-27 Thread amginwal
From: aginwala 

As per new clustering change, ovn-northd sandbox should use nb1.ovsdb and
sb1.ovsdb. It was updated in ovn-northd --help section but missed for sandbox.
This commit fixes the same

Reported-by: Mark Michelson 
Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345535.html
Acked-By: aginwala 
Signed-off-by: aginwala 
---
 ...-default-nb-and-sb-dbs-for-sandbox-northd.patch | 50 ++
 tutorial/ovn-setup.sh  |  4 ++
 tutorial/ovs-sandbox   |  4 +-
 3 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch

diff --git a/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch 
b/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch
new file mode 100644
index 000..c8a0286
--- /dev/null
+++ b/0001-Use-new-default-nb-and-sb-dbs-for-sandbox-northd.patch
@@ -0,0 +1,50 @@
+From eb9051426693843797ea0f2a0bf21b1b5272fd2f Mon Sep 17 00:00:00 2001
+From: aginwala 
+Date: Fri, 23 Mar 2018 12:41:24 -0700
+Subject: [PATCH] Use new default nb and sb dbs for sandbox northd:
+
+As per new clustering change, ovn-northd sandbox should use nb1.ovsdb and
+sb1.ovsdb. It was updated in ovn-northd --help section but missed for sandbox.
+This commit fixes the same
+
+Reported-by: Mark Michelson 
+Reported-at: 
https://mail.openvswitch.org/pipermail/ovs-dev/2018-March/345535.html
+Acked-By: aginwala 
+Signed-off-by: aginwala 
+---
+ tutorial/ovn-setup.sh | 4 
+ tutorial/ovs-sandbox  | 4 ++--
+ 2 files changed, 6 insertions(+), 2 deletions(-)
+
+diff --git a/tutorial/ovn-setup.sh b/tutorial/ovn-setup.sh
+index 943ca58..9a725cf 100755
+--- a/tutorial/ovn-setup.sh
 b/tutorial/ovn-setup.sh
+@@ -31,5 +31,9 @@ ovs-vsctl add-port br-int p2 -- \
+ # View a summary of the configuration
+ printf "\n=== ovn-nbctl show ===\n\n"
+ ovn-nbctl show
++printf "\n=== ovn-nbctl show with wait hv ===\n\n"
++ovn-nbctl --wait=hv show
+ printf "\n=== ovn-sbctl show ===\n\n"
+ ovn-sbctl show
++printf "\n=== ovn-sbctl show with wait hv ===\n\n"
++ovn-sbctl --wait=hv show
+diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
+index babc032..c3e9f12 100755
+--- a/tutorial/ovs-sandbox
 b/tutorial/ovs-sandbox
+@@ -510,8 +510,8 @@ if $ovn; then
+ fi
+ rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
+ --no-chdir --pidfile -vconsole:off --log-file \
+---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
+---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
++--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
++--ovnnb-db=unix:"$sandbox"/nb1.ovsdb
+ rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
+ $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off --log-file
+ rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
+-- 
+1.9.1
+
diff --git a/tutorial/ovn-setup.sh b/tutorial/ovn-setup.sh
index 943ca58..9a725cf 100755
--- a/tutorial/ovn-setup.sh
+++ b/tutorial/ovn-setup.sh
@@ -31,5 +31,9 @@ ovs-vsctl add-port br-int p2 -- \
 # View a summary of the configuration
 printf "\n=== ovn-nbctl show ===\n\n"
 ovn-nbctl show
+printf "\n=== ovn-nbctl show with wait hv ===\n\n"
+ovn-nbctl --wait=hv show
 printf "\n=== ovn-sbctl show ===\n\n"
 ovn-sbctl show
+printf "\n=== ovn-sbctl show with wait hv ===\n\n"
+ovn-sbctl --wait=hv show
diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
index babc032..c3e9f12 100755
--- a/tutorial/ovs-sandbox
+++ b/tutorial/ovs-sandbox
@@ -510,8 +510,8 @@ if $ovn; then
 fi
 rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
 --no-chdir --pidfile -vconsole:off --log-file \
---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
+--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
+--ovnnb-db=unix:"$sandbox"/nb1.ovsdb
 rungdb $gdb_ovn_controller $gdb_ovn_controller_ex ovn-controller \
 $OVN_CTRLR_PKI --detach --no-chdir --pidfile -vconsole:off --log-file
 rungdb $gdb_ovn_controller_vtep $gdb_ovn_controller_vtep_ex \
-- 
1.9.1



Re: [ovs-dev] Clustered DB commits causes sandbox errors

2018-03-27 Thread aginwala
Hi Mark:


The thing is that the clustered DB uses the new sockets nb1.ovsdb and sb1.ovsdb.
However, northd was still trying to use the old ovnsb_db.sock and
ovnnb_db.sock.

I was able to fix the issue with the patch below:

diff --git a/tutorial/ovs-sandbox b/tutorial/ovs-sandbox
index babc032..c3e9f12 100755
--- a/tutorial/ovs-sandbox
+++ b/tutorial/ovs-sandbox
@@ -510,8 +510,8 @@ if $ovn; then
 fi
 rungdb $gdb_ovn_northd $gdb_ovn_northd_ex ovn-northd --detach \
 --no-chdir --pidfile -vconsole:off --log-file \
---ovnsb-db=unix:"$sandbox"/ovnsb_db.sock \
---ovnnb-db=unix:"$sandbox"/ovnnb_db.sock
+--ovnsb-db=unix:"$sandbox"/sb1.ovsdb \
+--ovnnb-db=unix:"$sandbox"/nb1.ovsdb

test:

#/ovs/tutorial/sandbox# ovn-nbctl --wait=hv sync
#/ovs/tutorial/sandbox#
#/ovs/tutorial/sandbox#ovn-nbctl --wait=hv ls-add ls1
#/ovs/tutorial/sandbox# ovn-nbctl --wait=hv ls-del ls1
#/ovs/tutorial/sandbox

I will submit a formal patch and add --wait=hv to ovn-setup.sh so that it is
easier to catch this condition and to confirm that northd is working.
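
As a side note, the mismatch above could also be detected before starting
northd. The following Python sketch is only an illustration under assumptions:
pick_db_socket is a hypothetical helper, not part of ovs-sandbox, and the
candidate file names are simply the socket names mentioned in this thread.

```python
import os


def pick_db_socket(sandbox, candidates, exists=os.path.exists):
    """Return a unix: remote for the first candidate socket that exists.

    `candidates` lists file names relative to the sandbox directory, in order
    of preference. Returns None if none of them is present.
    """
    for name in candidates:
        path = os.path.join(sandbox, name)
        if exists(path):
            return "unix:" + path
    return None
```

For example, calling it with ["nb1.ovsdb", "ovnnb_db.sock"] would prefer the
clustered socket and fall back to the old name only if the new one is missing.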


Let me know if someone else has any comments.


Regards,


On Tue, Mar 27, 2018 at 10:34 AM, aginwala  wrote:

> Hi Mark:
>
> I reset to HEAD~6 and the sandbox still crashes. So I took commit
> cb8cbbbe97b56401c399fa261b9670eb1698bf14, which Han recently used to rebase
> his patches, and it works fine. So the difference is somewhere between this
> commit and master.
>
>
> Also, I noticed that if we are using SSL by default for the sandbox,
> it is better to configure northd to use SSL too by generating :
> e.g. I restarted northd as below in the current master sandbox and it
> worked fine:
> ovn-northd --detach --no-chdir --pidfile -vconsole:off --log-file
> --ovnsb-db=ssl:127.0.0.1:6642 --ovnnb-db=ssl:127.0.0.1:6641 -p
> /root/ovs/tutorial/sandbox/chassis-1-privkey.pem -c
> /root/ovs/tutorial/sandbox/chassis-1-cert.pem -C
> /root/ovs/tutorial/sandbox/pki/switchca/cacert.pem
>
> I will try to see if I can get a fix quickly. Of course, the diff is
> super big! :)
>
>
> On Mon, Mar 26, 2018 at 2:42 PM, Mark Michelson 
> wrote:
>
>> Hi,
>>
>> I'm currently on the master branch of OVS, commit "1b1d2e6da ovsdb:
>> Introduce experimental support for clustered databases." I started the OVS
>> sandbox using `make sandbox SANDBOXFLAGS="--ovn"` . I tried to run some
>> tests to add some logical switch ports to OVN. Running `ovn-nbctl --wait=hv
>> lsp-add ls0 lsp0` blocks forever. I found that ovn-northd.log was peppered
>> with lines like the following:
>>
>> 2018-03-26T21:21:06.509Z|00018|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: connecting...
>> 2018-03-26T21:21:06.509Z|00019|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: connection attempt failed (No such file or directory)
>> 2018-03-26T21:21:06.509Z|00020|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: continuing to reconnect in the background but suppressing further logging
>> 2018-03-26T21:21:06.509Z|00021|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: connecting...
>> 2018-03-26T21:21:06.509Z|00022|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: connection attempt failed (No such file or directory)
>> 2018-03-26T21:21:06.509Z|00023|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: continuing to reconnect in the background but suppressing further logging
>>
>> And ovn-controller.log has lines like:
>>
>> 2018-03-26T21:21:00.202Z|00021|rconn|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/br-int.mgmt: connected
>> 2018-03-26T21:21:00.203Z|00022|ovsdb_idl|WARN|transaction error: {"details":"RBAC rules for client \"chassis-1\" role \"ovn-controller\" prohibit row insertion into table \"Encap\".","error":"permission error"}
>>
>> I attempted to bisect to see what commit introduced the problem, but I
>> ran into problems here, too. If I revert to HEAD~6 (077f03028
>> jsonrpc-server: Separate changing read_only status from reconnecting.),
>> then the ovs-sandbox works as expected. If I revert to HEAD~5, HEAD~4,
>> HEAD~3, HEAD~2, or HEAD~, I hit a compilation error:
>>
>> In file included from lib/ovsdb-idl.c:45:0:
>> lib/ovsdb-idl.c: In function ‘ovsdb_idl_send_monitor_request’:
>> lib/ovsdb-idl.c:1638:34: error: ‘struct ovsdb_idl’ has no member named
>> ‘class_’
>>idl->class_->database, column->name);
>>   ^
>> ./include/openvswitch/vlog.h:271:41: note: in definition of macro ‘VLOG’
>>  vlog(_module, level__, __VA_ARGS__);   \
>>  ^~~
>> lib/ovsdb-idl.c:1636:21: note: in expansion of macro ‘VLOG_WARN’
>>  VLOG_WARN("%s table in %s database has synthetic "
>>  ^
>>
>> Unfortunately, I have a 6 commit 

Re: [ovs-dev] [PATCH] rhel: don't drop capabilities when running as root

2018-03-27 Thread Russell Bryant
On Tue, Mar 27, 2018 at 9:26 AM, Aaron Conole  wrote:
> Aaron Conole  writes:
>
>> Currently, regardless of which user is being set as the running user,
>> Open vSwitch daemons on RHEL systems drop capabilities.  This means the
>> very powerful CAP_SYS_ADMIN is dropped, even when the user is 'root'.
>>
>> For the majority of use cases this behavior works, as the user can
>> enable or disable various configurations, regardless of which datapath
>> functions are desired.  However, when using certain DPDK PMDs, the
>> enablement and configuration calls require CAP_SYS_ADMIN.
>>
>> Instead of retaining CAP_SYS_ADMIN in all cases, which would practically
>> nullify the uid/gid and privilege drop, we don't pass the --ovs-user
>> option to the daemons.  This shunts the capability and privilege
>> dropping code.
>>
>> Reported-by: Marcos Felipe Schwarz 
>> Reported-at: 
>> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-January/045955.html
>> Fixes: e3e738a3d058 ("redhat: allow dpdk to also run as non-root user")
>> Signed-off-by: Aaron Conole 
>> ---
>
> Ping?

Applied to master and branch-2.9.

Please continue to CC me on rhel patches like this that have been
reviewed by someone and you feel are ready to be applied.

Thanks,

-- 
Russell Bryant


Re: [ovs-dev] can not update userspace vxlan tunnel neigh mac when peer VTEP mac changed

2018-03-27 Thread Jan Scheurich
Hi Ychen,

Funny! Again we are already working on a solution for problem 1. 

In our scenario the situation arises with a tunnel next hop being a VRRP switch 
pair. The switch sends periodic gratuitous ARPs (GARPs) to announce the VRRP 
IP but OVS native tunneling doesn't snoop on GARPs, only on ARP replies. 
The host IP stack, on the other hand, accepts these GARPs and stops sending 
refresh ARP requests itself. Hence nothing for OVS to snoop upon.

The solution is to make OVS snoop on GARP requests also.
 
It is quite possible that this will also fix your problem 2. If you also have a 
VRRP tunnel next hop which just moves its VRRP IP address but not the MAC 
address, it should send a GARP with the new IP/MAC mapping when it moves the IP 
address, which would now update OVS' tunnel neighbor cache.
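
To illustrate the idea, here is a minimal Python sketch of the snooping
decision. It is not the OVS code or the pending patch: snoop_arp and its
accept_garp switch are hypothetical names, and the sketch treats an ARP request
whose sender and target IP addresses are equal as a gratuitous ARP.

```python
import socket
import struct

# Ethernet/IPv4 ARP body: htype, ptype, hlen, plen, op, sha, spa, tha, tpa.
ARP_FMT = "!HHBBH6s4s6s4s"
ARP_OP_REQUEST = 1
ARP_OP_REPLY = 2


def snoop_arp(payload, accept_garp=True):
    """Return (ip, mac) to learn from an ARP body, or None to ignore it."""
    size = struct.calcsize(ARP_FMT)
    if len(payload) < size:
        return None
    htype, ptype, hlen, plen, op, sha, spa, _tha, tpa = struct.unpack(
        ARP_FMT, payload[:size])
    if (htype, ptype, hlen, plen) != (1, 0x0800, 6, 4):
        return None
    is_reply = op == ARP_OP_REPLY
    # Gratuitous ARP request: the sender announces its own address.
    is_garp = op == ARP_OP_REQUEST and spa == tpa
    if is_reply or (accept_garp and is_garp):
        mac = ":".join(f"{b:02x}" for b in sha)
        return socket.inet_ntoa(spa), mac
    return None
```

With accept_garp=False the function models the current behaviour (replies
only); with accept_garp=True a GARP request from the next hop also yields a
mapping that could refresh the tunnel neighbor cache.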

@Mano: Can you submit the GARP patch in the near future?

BR, Jan

> -----Original Message-----
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of ychen
> Sent: Tuesday, 27 March, 2018 14:44
> To: d...@openvswitch.org
> Subject: [ovs-dev] can not update userspace vxlan tunnel neigh mac when peer 
> VTEP mac changed
> 
> Hi,
>    I found that userspace VXLAN sometimes does not work properly.
>    1. First data packet is lost
> When the tunnel neigh cache is empty, the first data packet triggers 
> sending an ARP request to the peer VTEP, and the data packet is dropped.
> The tunnel neigh cache adds the entry only when the ARP reply is received.
> 
>     err = tnl_neigh_lookup(out_dev->xbridge->name, &d_ip6, &dmac);
>     if (err) {
>         xlate_report(ctx, OFT_DETAIL,
>                      "neighbor cache miss for %s on bridge %s, "
>                      "sending %s request",
>                      buf_dip6, out_dev->xbridge->name, d_ip ? "ARP" : "ND");
>         if (d_ip) {
>             tnl_send_arp_request(ctx, out_dev, smac, s_ip, d_ip);
>         } else {
>             tnl_send_nd_request(ctx, out_dev, smac, &s_ip6, &d_ip6);
>         }
>         return err;
>     }
> 
> 
> 2. Connection is lost when the peer VTEP MAC changes
> Suppose the VTEP MAC is already in the tunnel neigh cache, for example:
> 10.182.6.81   fa:eb:26:c3:16:a5   br-phy
> 
> When a data packet comes in, this MAC is used to build the outer 
> VXLAN header.
> Then the MAC of VTEP 10.182.6.81 changes from fa:eb:26:c3:16:a5 to 
> 24:eb:26:c3:16:a5 because the NIC was replaced.
> 
> Data packets continue to be sent with the old MAC fa:eb:26:c3:16:a5, but the 
> peer VTEP does not accept these packets because the MAC does not match.
> The stale tunnel neigh entry does not age out until the data packets stop.
> 
> 
>     if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
>         tnl_neigh_snoop(flow, wc, ctx->xbridge->name);
>     }
> 
> 
> 3. Is anybody already working on these problems?
> 
> 
> 


[ovs-dev] [PATCH] python-windows: Fix unicode python tests on Windows

2018-03-27 Thread Alin Gabriel Serdean
This patch enables the default Windows encodings for the Python 3 buffers,
which are already used in Python 2.

Signed-off-by: Alin Gabriel Serdean 
Co-authored-by: Alin Balutoiu 
Signed-off-by: Alin Balutoiu 
---
 tests/atlocal.in | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tests/atlocal.in b/tests/atlocal.in
index 55f9333ee..0df504be7 100644
--- a/tests/atlocal.in
+++ b/tests/atlocal.in
@@ -106,6 +106,15 @@ FreeBSD|NetBSD)
 ;;
 esac
 
+if test x"$PYTHON3" != x && test "$IS_WIN32" = yes; then
+# enables legacy windows unicode printing needed for Python3 compatibility
+# with the Python2 tests
+PYTHONLEGACYWINDOWSFSENCODING=true
+export PYTHONLEGACYWINDOWSFSENCODING
+PYTHONLEGACYWINDOWSSTDIO=true
+export PYTHONLEGACYWINDOWSSTDIO
+fi
+
 # Check whether to run IPv6 tests.
 if $PYTHON -c '
 import socket
-- 
2.16.1.windows.1
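
For anyone who wants to reproduce the same environment outside the testsuite,
a minimal Python sketch follows. The two environment variable names are the
ones set by the patch; the helper legacy_windows_env and its is_win32
parameter are hypothetical and only serve the illustration.

```python
import os


def legacy_windows_env(base_env=None, is_win32=(os.name == "nt")):
    """Return a copy of the environment with the legacy Windows settings.

    On Windows, both variables are set to a non-empty value, mirroring what
    the patch exports from tests/atlocal.in. Elsewhere the copy is unchanged.
    """
    env = dict(os.environ if base_env is None else base_env)
    if is_win32:
        env["PYTHONLEGACYWINDOWSFSENCODING"] = "true"
        env["PYTHONLEGACYWINDOWSSTDIO"] = "true"
    return env
```

The returned dictionary can be passed as the env argument of subprocess.run
when launching a Python 3 test by hand.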



Re: [ovs-dev] Clustered DB commits causes sandbox errors

2018-03-27 Thread aginwala
Hi Mark:

I reset to HEAD~6 and the sandbox still crashes. So I took commit
cb8cbbbe97b56401c399fa261b9670eb1698bf14, which Han recently used to rebase
his patches, and it works fine. So the difference is somewhere between this
commit and master.


Also, I noticed that if we are using SSL by default for the sandbox,
it is better to configure northd to use SSL too by generating :
e.g. I restarted northd as below in the current master sandbox and it
worked fine:
ovn-northd --detach --no-chdir --pidfile -vconsole:off --log-file
--ovnsb-db=ssl:127.0.0.1:6642 --ovnnb-db=ssl:127.0.0.1:6641 -p
/root/ovs/tutorial/sandbox/chassis-1-privkey.pem -c
/root/ovs/tutorial/sandbox/chassis-1-cert.pem -C
/root/ovs/tutorial/sandbox/pki/switchca/cacert.pem

I will try to see if I can get a fix quickly. Of course, the diff is
super big! :)


On Mon, Mar 26, 2018 at 2:42 PM, Mark Michelson  wrote:

> Hi,
>
> I'm currently on the master branch of OVS, commit "1b1d2e6da ovsdb:
> Introduce experimental support for clustered databases." I started the OVS
> sandbox using `make sandbox SANDBOXFLAGS="--ovn"` . I tried to run some
> tests to add some logical switch ports to OVN. Running `ovn-nbctl --wait=hv
> lsp-add ls0 lsp0` blocks forever. I found that ovn-northd.log was peppered
> with lines like the following:
>
> 2018-03-26T21:21:06.509Z|00018|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: connecting...
> 2018-03-26T21:21:06.509Z|00019|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: connection attempt failed (No such file or directory)
> 2018-03-26T21:21:06.509Z|00020|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnnb_db.sock: continuing to reconnect in the background but suppressing further logging
> 2018-03-26T21:21:06.509Z|00021|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: connecting...
> 2018-03-26T21:21:06.509Z|00022|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: connection attempt failed (No such file or directory)
> 2018-03-26T21:21:06.509Z|00023|reconnect|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/ovnsb_db.sock: continuing to reconnect in the background but suppressing further logging
>
> And ovn-controller.log has lines like:
>
> 2018-03-26T21:21:00.202Z|00021|rconn|INFO|unix:/home/putnopvut/ovs/tutorial/sandbox/br-int.mgmt: connected
> 2018-03-26T21:21:00.203Z|00022|ovsdb_idl|WARN|transaction error: {"details":"RBAC rules for client \"chassis-1\" role \"ovn-controller\" prohibit row insertion into table \"Encap\".","error":"permission error"}
>
> I attempted to bisect to see what commit introduced the problem, but I ran
> into problems here, too. If I revert to HEAD~6 (077f03028 jsonrpc-server:
> Separate changing read_only status from reconnecting.), then the
> ovs-sandbox works as expected. If I revert to HEAD~5, HEAD~4, HEAD~3,
> HEAD~2, or HEAD~, I hit a compilation error:
>
> In file included from lib/ovsdb-idl.c:45:0:
> lib/ovsdb-idl.c: In function ‘ovsdb_idl_send_monitor_request’:
> lib/ovsdb-idl.c:1638:34: error: ‘struct ovsdb_idl’ has no member named
> ‘class_’
>idl->class_->database, column->name);
>   ^
> ./include/openvswitch/vlog.h:271:41: note: in definition of macro ‘VLOG’
>  vlog(_module, level__, __VA_ARGS__);   \
>  ^~~
> lib/ovsdb-idl.c:1636:21: note: in expansion of macro ‘VLOG_WARN’
>  VLOG_WARN("%s table in %s database has synthetic "
>  ^
>
> Unfortunately, I have a 6 commit range where the error may have been
> introduced. I would love to have submitted a patch to fix this, but I don't
> have much more time left today to work on this, I'm off tomorrow, and the
> diff between HEAD~6 and HEAD is massive.
>
> Mark!


[ovs-dev] [PATCH] Windows: Fix broken build caused by a bad file extension

2018-03-27 Thread Alin Gabriel Serdean
The compiler (cl) complains:
`ovsdb/ovsdb-server.c(689) : fatal error C1083:
Cannot open include file: 'ovsdb/_server.ovsschema.inc':
   No such file or directory`
(https://ci.appveyor.com/project/blp/ovs/build/1.0.4079#L2586)

Generated compiler objects have the extension `.obj` on Windows.

This patch switches to `$(OBJEXT)` instead, so the schema will be generated.

Signed-off-by: Alin Gabriel Serdean aserd...@ovn.org
---
 ovsdb/automake.mk | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovsdb/automake.mk b/ovsdb/automake.mk
index c43acf7e4..b895f4292 100644
--- a/ovsdb/automake.mk
+++ b/ovsdb/automake.mk
@@ -120,7 +120,7 @@ OVSDB_DOT = $(run_python) $(srcdir)/ovsdb/ovsdb-dot.in
 
 EXTRA_DIST += ovsdb/_server.ovsschema
 CLEANFILES += ovsdb/_server.ovsschema.inc
-ovsdb/ovsdb-server.o: ovsdb/_server.ovsschema.inc
+ovsdb/ovsdb-server.$(OBJEXT): ovsdb/_server.ovsschema.inc
 ovsdb/_server.ovsschema.inc: ovsdb/_server.ovsschema $(srcdir)/build-aux/text2c
 	$(AM_V_GEN)$(run_python) $(srcdir)/build-aux/text2c < $< > $@.tmp
 	$(AM_V_at)mv $@.tmp $@
-- 
2.16.1.windows.1



Re: [ovs-dev] [PATCH v10 3/3] dpif-netdev: Detection and logging of suspicious PMD iterations

2018-03-27 Thread Stokes, Ian
> This patch enhances dpif-netdev-perf to detect iterations with suspicious
> statistics according to the following criteria:
> 
> - iteration lasts longer than US_THR microseconds (default 250).
>   This can be used to capture events where a PMD is blocked or
>   interrupted for such a period of time that there is a risk for
>   dropped packets on any of its Rx queues.
> 
> - max vhost qlen exceeds a threshold Q_THR (default 128). This can
>   be used to infer virtio queue overruns and dropped packets inside
>   a VM, which are not visible in OVS otherwise.
> 
> Such suspicious iterations can be logged together with their iteration
> statistics to be able to correlate them to packet drop or other events
> outside OVS.
> 
> A new command is introduced to enable/disable logging at run-time and to
> adjust the above thresholds for suspicious iterations:
> 
> ovs-appctl dpif-netdev/pmd-perf-log-set on | off
> [-b before] [-a after] [-e|-ne] [-us usec] [-q qlen]
> 
> Turn logging on or off at run-time (on|off).
> 
> -b before:  The number of iterations before the suspicious iteration to
> be logged (default 5).
> -a after:   The number of iterations after the suspicious iteration to
> be logged (default 5).
> -e: Extend logging interval if another suspicious iteration is
> detected before logging occurs.
> -ne:Do not extend logging interval (default).
> -q qlen:Suspicious vhost queue fill level threshold. Increase this
> to 512 if the Qemu supports 1024 virtio queue length.
> (default 128).
> -us usec:   change the duration threshold for a suspicious iteration
> (default 250 us).
> 
> Note: Logging of suspicious iterations itself consumes a considerable
> amount of processing cycles of a PMD which may be visible in the iteration
> history. In the worst case this can lead OVS to detect another suspicious
> iteration caused by logging.
> 
> If more than 100 iterations around a suspicious iteration have been logged
> once, OVS falls back to the safe default values (-b 5/-a 5/-ne) to avoid
> that logging itself causes continuous further logging.
> 
> Signed-off-by: Jan Scheurich 
> Acked-by: Billy O'Mahony 
> ---
>  NEWS|   2 +
>  lib/dpif-netdev-perf.c  | 201
> 
>  lib/dpif-netdev-perf.h  |  42 +
>  lib/dpif-netdev-unixctl.man |  59 +
>  lib/dpif-netdev.c   |   5 ++
>  5 files changed, 309 insertions(+)
> 
> diff --git a/NEWS b/NEWS
> index 8f66fd3..61148b1 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -76,6 +76,8 @@ v2.9.0 - 19 Feb 2018
>   * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a
> single PMD
>   * Detailed PMD performance metrics available with new command
>   ovs-appctl dpif-netdev/pmd-perf-show
> + * Supervision of PMD performance metrics and logging of suspicious
> +   iterations
> - vswitchd:
>   * Datapath IDs may now be specified as 0x1 (etc.) instead of 16
> digits.
>   * Configuring a controller, or unconfiguring all controllers, now
> deletes
> diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
> index 2b36410..410a209 100644
> --- a/lib/dpif-netdev-perf.c
> +++ b/lib/dpif-netdev-perf.c
> @@ -25,6 +25,24 @@
> 
>  VLOG_DEFINE_THIS_MODULE(pmd_perf);
> 
> +#define ITER_US_THRESHOLD 250   /* Warning threshold for iteration duration
> +   in microseconds. */
> +#define VHOST_QUEUE_FULL 128/* Size of the virtio TX queue. */
> +#define LOG_IT_BEFORE 5 /* Number of iterations to log before
> +   suspicious iteration. */
> +#define LOG_IT_AFTER 5  /* Number of iterations to log after
> +   suspicious iteration. */
> +
> +bool log_enabled = false;
> +bool log_extend = false;

This will cause a compilation error: 'error: symbol 'log_extend' was not
declared. Should it be static?'

You could declare it as extern in dpif-netdev-perf.h, as is already done for
'bool log_enabled'.
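
For illustration, a minimal sketch of the two ways to satisfy that check. It uses only the two flag names quoted above; the comments standing in for the header and the source file, the extra flag log_local_only, and the helper toggle_log_extend() are hypothetical and not part of the patch.

```c
#include <stdbool.h>

/* Option 1: what would live in a header such as dpif-netdev-perf.h.
 * An extern declaration makes the symbol visible to other files and
 * provides the prior declaration the checker asks for. */
extern bool log_enabled;
extern bool log_extend;

/* What would live in the .c file: the single definition of each flag. */
bool log_enabled = false;
bool log_extend = false;

/* Option 2: a flag that is only used inside one file can be static
 * instead, which needs no declaration in any header. */
static bool log_local_only = false;

/* Hypothetical helper that flips the flags, to show both are usable. */
bool
toggle_log_extend(void)
{
    log_extend = !log_extend;
    log_local_only = log_extend;
    return log_local_only;
}
```

Which option fits depends on whether another file needs to read the flag: extern if so, static otherwise.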

Thanks
Ian


Re: [ovs-dev] [PATCH v10 2/3] dpif-netdev: Detailed performance stats for PMDs

2018-03-27 Thread Jan Scheurich
> -Original Message-
> From: Stokes, Ian [mailto:ian.sto...@intel.com]
> Sent: Tuesday, 27 March, 2018 16:21
> To: Ilya Maximets ; Jan Scheurich 
> ; d...@openvswitch.org
> Cc: ktray...@redhat.com; O Mahony, Billy 
> Subject: RE: [PATCH v10 2/3] dpif-netdev: Detailed performance stats for PMDs
> 
> > Comments inline.
> >
> > Best regards, Ilya Maximets.
> >
> > On 18.03.2018 20:55, Jan Scheurich wrote:
> > > This patch instruments the dpif-netdev datapath to record detailed
> > > statistics of what is happening in every iteration of a PMD thread.
> > >
> > > The collection of detailed statistics can be controlled by a new
> > > Open_vSwitch configuration parameter "other_config:pmd-perf-metrics".
> > > By default it is disabled. The run-time overhead, when enabled, is
> > > in the order of 1%.
> > >
> 
> [snip]
> 
> > > +}
> > > +if (tx_packets > 0) {
> > > +ds_put_format(str,
> > > +"  Tx packets:  %12"PRIu64"  (%.0f Kpps)\n"
> > > +"  Tx batches:  %12"PRIu64"  (%.2f pkts/batch)"
> > > +"\n",
> > > +tx_packets, (tx_packets / duration) / 1000,
> > > +tx_batches, 1.0 * tx_packets / tx_batches);
> > > +} else {
> > > +ds_put_format(str,
> > > +"  Tx packets:  %12"PRIu64"\n"
> > > +"\n",
> > > +0ULL);
> >
> > I have a few interesting warnings on 64bit ARMv8.
> >
> > Clang:
> >
> > lib/dpif-netdev-perf.c:216:17: error: format specifies type 'unsigned
> > long' but the argument has type 'unsigned long long' [-Werror,-Wformat]
> > 0ULL);
> > ^~~~
> > lib/dpif-netdev-perf.c:229:17: error: format specifies type 'unsigned
> > long' but the argument has type 'unsigned long long' [-Werror,-Wformat]
> > 0ULL);
> > ^~~~
> >
> > GCC:
> >
> > lib/dpif-netdev-perf.c: In function ‘pmd_perf_format_overall_stats’:
> > lib/dpif-netdev-perf.c:215:17: error: format ‘%lu’ expects argument of
> > type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
> > [-Werror=format=]
> >  "  Rx packets:  %12"PRIu64"\n",
> >  ^
> > lib/dpif-netdev-perf.c:227:17: error: format ‘%lu’ expects argument of
> > type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
> > [-Werror=format=]
> >  "  Tx packets:  %12"PRIu64"\n"
> >  ^
> >
> > Both are coming from the fact that PRIu64 expands to '%lu'.
> > Why we need this printing at all? Can we just print 0 in a string?
> > Otherwise, the only way to fix these warnings is to cast 0 directly to
> > uint64_t.
> 
> I see the same in Travis.
> 
> In the v9 of the series the format used was 0UL. This allowed compilation in 
> Travis except for when compiling OVS with the 32 bit flag.
> From the logs the introduction of 0ULL seems to avoid the issue for 32 bit 
> compilation but introduces the problem for 64 bit
> compilation.
> 
> I don’t see a way around it either without casting.
> 
> Ian

I'll work around this by printing "0" as a string :-)
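
For illustration, a minimal sketch of the two workarounds discussed above: casting the constant to uint64_t, or printing "0" as a string. The function names are hypothetical, and snprintf() stands in for ds_put_format() so the sketch builds on its own.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Option 1: keep PRIu64 and cast the constant, so the argument type
 * always matches the format, whatever uint64_t maps to on the target. */
int
format_zero_cast(char *buf, size_t len)
{
    return snprintf(buf, len, "  Tx packets:  %12" PRIu64 "\n", (uint64_t) 0);
}

/* Option 2: drop the integer conversion and print "0" as a string,
 * keeping the same field width of 12. */
int
format_zero_string(char *buf, size_t len)
{
    return snprintf(buf, len, "  Tx packets:  %12s\n", "0");
}

/* Returns true if both variants produce identical text. */
bool
zero_formats_match(void)
{
    char a[64];
    char b[64];

    format_zero_cast(a, sizeof a);
    format_zero_string(b, sizeof b);
    return strcmp(a, b) == 0;
}
```

Neither variant depends on whether uint64_t is 'unsigned long' or 'unsigned long long' on the build target, which is what the 0UL and 0ULL literals got wrong on one platform or the other.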



Re: [ovs-dev] [PATCH v10 2/3] dpif-netdev: Detailed performance stats for PMDs

2018-03-27 Thread Stokes, Ian
> Comments inline.
> 
> Best regards, Ilya Maximets.
> 
> On 18.03.2018 20:55, Jan Scheurich wrote:
> > This patch instruments the dpif-netdev datapath to record detailed
> > statistics of what is happening in every iteration of a PMD thread.
> >
> > The collection of detailed statistics can be controlled by a new
> > Open_vSwitch configuration parameter "other_config:pmd-perf-metrics".
> > By default it is disabled. The run-time overhead, when enabled, is
> > in the order of 1%.
> >

[snip]

> > +}
> > +if (tx_packets > 0) {
> > +ds_put_format(str,
> > +"  Tx packets:  %12"PRIu64"  (%.0f Kpps)\n"
> > +"  Tx batches:  %12"PRIu64"  (%.2f pkts/batch)"
> > +"\n",
> > +tx_packets, (tx_packets / duration) / 1000,
> > +tx_batches, 1.0 * tx_packets / tx_batches);
> > +} else {
> > +ds_put_format(str,
> > +"  Tx packets:  %12"PRIu64"\n"
> > +"\n",
> > +0ULL);
> 
> I have a few interesting warnings on 64bit ARMv8.
> 
> Clang:
> 
> lib/dpif-netdev-perf.c:216:17: error: format specifies type 'unsigned
> long' but the argument has type 'unsigned long long' [-Werror,-Wformat]
> 0ULL);
> ^~~~
> lib/dpif-netdev-perf.c:229:17: error: format specifies type 'unsigned
> long' but the argument has type 'unsigned long long' [-Werror,-Wformat]
> 0ULL);
> ^~~~
> 
> GCC:
> 
> lib/dpif-netdev-perf.c: In function ‘pmd_perf_format_overall_stats’:
> lib/dpif-netdev-perf.c:215:17: error: format ‘%lu’ expects argument of
> type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
> [-Werror=format=]
>  "  Rx packets:  %12"PRIu64"\n",
>  ^
> lib/dpif-netdev-perf.c:227:17: error: format ‘%lu’ expects argument of
> type ‘long unsigned int’, but argument 3 has type ‘long long unsigned int’
> [-Werror=format=]
>  "  Tx packets:  %12"PRIu64"\n"
>  ^
> 
> Both are coming from the fact that PRIu64 expands to '%lu'.
> Why we need this printing at all? Can we just print 0 in a string?
> Otherwise, the only way to fix these warnings is to cast 0 directly to
> uint64_t.

I see the same in Travis.

In v9 of the series the argument passed was 0UL. This allowed compilation in 
Travis except when compiling OVS with the 32-bit flag.
From the logs, the introduction of 0ULL seems to avoid the issue for 32-bit 
compilation but introduces the problem for 64-bit compilation.

I don’t see a way around it either without casting.

Ian



Re: [ovs-dev] [PATCH] rhel: Stop managing the /run/openvswitch directory with systemd.

2018-03-27 Thread Markos Chandras
On 27/03/18 15:06, Aaron Conole wrote:
> There are a few advantages (and some disadvantages, also).
> 
> One thing that's nice is systemd will clean up the directories when the
> service ends.  I realize that /run is usually tmpfs, but it's nice that
> they don't linger - even if ovs-lib "breaks in the middle" (meaning
> something goes wrong .. though I'm unable to name an instance where I
> observed that).  Actually, I am looking at tmpfiles.d entries for
> managing some of these complicated directory lists (like /dev/hugepages,
> etc).
> 

I agree, tmpfiles.d is probably going to simplify things a bit.

> Another advantage is when we fully hook up with the
> user+group+capabilities (it's on my TODO list) in systemd service
> files.  At that point, it will not be possible for the ovs-lib to create
> the runtime directories.
> 
> Of course, we know the biggest disadvantage - if systemd breaks things,
> they are really broken.
> 
> Does it make sense?
> 

Yes thank you very much

-- 
markos

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg


Re: [ovs-dev] [PATCH v6] Configurable Link State Change (LSC) detection mode

2018-03-27 Thread Jan Scheurich
Hi Ilya,

This patch is the upstream version of a fix we implemented downstream a year 
ago to fix the issue with massive packet drop of OVS-DPDK on Fortville NICs.

The root cause of this packet drop was the extended blocking of the 
ovs-vswitchd by the i40e PMD during the rte_eth_link_get_nowait() function, 
which caused the PMDs to hang for up to 40ms during upcalls.

At the time, switching to LSC interrupt mode was the only viable solution. OVS 
still polls the link state from DPDK with rte_eth_link_get_nowait(), but DPDK 
returns the locally buffered link state, updated through the LSC interrupt, 
instead of going through the FVL's admin queue.
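
As a rough illustration of that difference, here is a toy model in plain C. It is not DPDK or i40e code: every name in it (toy_port, toy_admin_queue_query, toy_lsc_interrupt, toy_link_get) is hypothetical, and it only models the idea that in interrupt mode the periodic read is served from a buffered value, while otherwise each read costs a query to the hardware.

```c
#include <stdbool.h>

/* Toy model of a port; none of these names are DPDK or i40e APIs. */
struct toy_port {
    bool lsc_interrupt;        /* True if LSC interrupt mode is enabled. */
    bool cached_link_up;       /* Link state buffered in memory. */
    bool hw_link_up;           /* Actual state of the hardware link. */
    int admin_queue_queries;   /* Number of slow hardware queries made. */
};

/* Models the slow path: asking the NIC through its admin queue. */
static bool
toy_admin_queue_query(struct toy_port *port)
{
    port->admin_queue_queries++;
    return port->hw_link_up;
}

/* Models the LSC interrupt: the link changes and the handler
 * refreshes the buffered state. */
void
toy_lsc_interrupt(struct toy_port *port, bool link_up)
{
    port->hw_link_up = link_up;
    port->cached_link_up = link_up;
}

/* Models the periodic link state read done by the switch. */
bool
toy_link_get(struct toy_port *port)
{
    if (port->lsc_interrupt) {
        /* Interrupt mode: return the buffered value, no hardware access. */
        return port->cached_link_up;
    }
    /* Polling mode: every read goes to the hardware. */
    port->cached_link_up = toy_admin_queue_query(port);
    return port->cached_link_up;
}
```

In this model the cost of a read in interrupt mode is independent of how often the link state is polled, which is the property that mattered for the packet drop case described above.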

Now, the new i40e PMD fix in DPDK bypassing the admin queue should also solve 
the problem with Fortville NICs. It would have to be backported to older DPDK 
releases to be useful as a fix for OVS 2.6, 2.7, 2.8 and 2.9, whereas the LSC 
interrupt solution in OVS would work as a backport for all OVS versions since 
2.6.

That's why we still think there is value in pursuing the LSC interrupt track.

BR, Jan

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Ilya Maximets
> Sent: Tuesday, 27 March, 2018 13:17
> To: Stokes, Ian ; Róbert Mulik 
> ; d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v6] Configurable Link State Change (LSC) 
> detection mode
> 
> On 27.03.2018 13:19, Stokes, Ian wrote:
> >> It is possible to change LSC detection mode to polling or interrupt mode
> >> for DPDK interfaces. The default is polling mode. To set interrupt mode,
> >> option dpdk-lsc-interrupt has to be set to true.
> >>
> >> In polling mode more processor time is needed, since the OVS repeatedly
> >> reads the link state with a short period. It can lead to packet loss for
> >> certain systems.
> >>
> >> In interrupt mode the hardware itself triggers an interrupt when a link
> >> state change happens, so OVS needs less processing time.
> >>
> >> For detailed description and usage see the dpdk install documentation.
> 
> Could you, please, better describe why we need this change?
> Because we're not removing the polling thread. OVS will still
> poll the link states periodically. This config option has
> no effect on that side. Also, link state polling in OVS uses
> 'rte_eth_link_get_nowait()' function which will be called in both
> cases and should not wait for hardware reply in any implementation.
> 
> There was recent bug fix for intel NICs that fixes waiting of an
> admin queue on link state requests despite of 'no_wait' flag:
> http://dpdk.org/ml/archives/dev/2018-March/092156.html
> Will this fix your target case?
> 
> So, the difference of execution time of 'rte_eth_link_get_nowait()'
> with enabled and disabled interrupts should be not so significant.
> Do you have performance measurements? Measurement with above fix applied?
> 
> 
> >
> > Thanks for working on this Robert.
> >
> > I've completed some testing including the case where LSC is not supported, 
> > in which case the port will remain in a down state and
> fail rx/tx traffic. This behavior conforms to the netdev_reconfigure 
> expectations in the fail case so that's ok.
> 
> I'm not sure if this is acceptable. For example, we're not failing
> reconfiguration in case of issues with number of queues. We're trying
> different numbers until we have working configuration.
> Maybe we need the same fall-back mechanism in case of not supported LSC
> interrupts? (MTU setup errors are really uncommon unlike LSC interrupts'
> support in PMDs).
> 
> >
> > I'm a bit late to the thread but I have a few other comments below.
> >
> > I'd like to get this patch in the next pull request if possible so I'd 
> > appreciate if others can give any comments on the patch also.
> >
> > Thanks
> > Ian
> >
> >>
> >> Signed-off-by: Robert Mulik 
> >> ---
> >> v5 -> v6:
> >> - DPDK install documentation updated.
> >> - Status of lsc_interrupt_mode of DPDK interfaces can be read by command
> >>   ovs-appctl dpif/show.
> >> - It was suggested to check if the HW supports interrupt mode, but it is
> >> not
> >>   possible to do without DPDK code change, so it is skipped from this
> >> patch.
> >> ---
> >>  Documentation/intro/install/dpdk.rst | 33
> >> +
> >>  lib/netdev-dpdk.c| 24 ++--
> >>  vswitchd/vswitch.xml | 17 +
> >>  3 files changed, 72 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/intro/install/dpdk.rst
> >> b/Documentation/intro/install/dpdk.rst
> >> index ed358d5..eb1bc7b 100644
> >> --- a/Documentation/intro/install/dpdk.rst
> >> +++ b/Documentation/intro/install/dpdk.rst
> >> @@ -628,6 +628,39 @@ The average number of packets per output batch can be
> >> checked in PMD stats::
> >>
> >>  $ ovs-appctl dpif-netdev/pmd-stats-show
> >>
> >> +Link State 

Re: [ovs-dev] [PATCH v6] Configurable Link State Change (LSC) detection mode

2018-03-27 Thread Stokes, Ian
> On 27.03.2018 13:19, Stokes, Ian wrote:
> >> It is possible to change LSC detection mode to polling or interrupt
> >> mode for DPDK interfaces. The default is polling mode. To set
> >> interrupt mode, option dpdk-lsc-interrupt has to be set to true.
> >>
> >> In polling mode more processor time is needed, since the OVS
> >> repeatedly reads the link state with a short period. It can lead to
> >> packet loss for certain systems.
> >>
> >> In interrupt mode the hardware itself triggers an interrupt when a link
> >> state change happens, so OVS needs less processing time.
> >>
> >> For detailed description and usage see the dpdk install documentation.
> 
> Could you, please, better describe why we need this change?
> Because we're not removing the polling thread. OVS will still poll the
> link states periodically. This config option has no effect on that side.
> Also, link state polling in OVS uses 'rte_eth_link_get_nowait()' function
> which will be called in both cases and should not wait for hardware reply
> in any implementation.

I believe it was related to a case where bonding in active-backup mode was 
causing packet drops due to the frequency at which the LSC was being polled. 
Using an interrupt-based approach alleviated the issue. (I'm open to correction 
on this :))

@Robert/Eelco You may be able to shed some more light here and on whether the 
patches below in DPDK resolve the issue?

> 
> There was recent bug fix for intel NICs that fixes waiting of an admin
> queue on link state requests despite of 'no_wait' flag:
> http://dpdk.org/ml/archives/dev/2018-March/092156.html
> Will this fix your target case?
> 
> So, the difference of execution time of 'rte_eth_link_get_nowait()'
> with enabled and disabled interrupts should be not so significant.
> Do you have performance measurements? Measurement with above fix applied?
> 
> 
> >
> > Thanks for working on this Robert.
> >
> > I've completed some testing including the case where LSC is not
> supported, in which case the port will remain in a down state and fail
> rx/tx traffic. This behavior conforms to the netdev_reconfigure
> expectations in the fail case so that's ok.
> 
> I'm not sure if this is acceptable. For example, we're not failing
> reconfiguration in case of issues with number of queues. We're trying
> different numbers until we have working configuration.
> Maybe we need the same fall-back mechanism in case of not supported LSC
> interrupts? (MTU setup errors are really uncommon unlike LSC interrupts'
> support in PMDs).

Thanks for raising this Ilya.

I thought of this as well. I'd like to see a fall back to the PMD but didn’t 
see how it could be done in a clean way.

Unfortunately rte_eth_dev_configure() returns -EINVAL when lsc mode is 
requested but not supported.

It doesn't give us a clue if the error is related to lsc mode as it could also 
relate to a number of other configure issues such as nb_rxq/nb_txq/portid etc.

It would be better if we could query the device via ethdev api to see if it 
supports lsc interrupt mode but that’s not available currently.

The only way I have seen lsc support queries in DPDK is by querying the device 
data itself which doesn't look great, this was discussed in an earlier thread I 
think for this patch.

The alternative could be to introduce a generic retry for 
rte_eth_dev_configure() when configure fails but with lsc configured to PMD 
instead but I'd prefer to see an indication that the lsc mode was the cause.

If we have a cleaner way to fall back to PMD that’s ok by me but I think we 
have to allow for a possible failure during a reconfigure and follow the 
prescribed behavior as per netdev_reconfigure(), which this code currently does.
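
For illustration, a minimal sketch of the generic retry described above, assuming the fallback means retrying with LSC interrupts disabled. It does not call DPDK: struct fake_dev and fake_dev_configure() are hypothetical stand-ins for the device and for rte_eth_dev_configure(), modelling only the -EINVAL behaviour mentioned in this mail.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device; not a DPDK structure. */
struct fake_dev {
    bool supports_lsc;      /* Whether the driver supports LSC interrupts. */
    bool lsc_interrupt;     /* Mode that was actually configured. */
    int configure_calls;    /* How many configure attempts were made. */
};

/* Hypothetical stand-in for rte_eth_dev_configure(): fails with -EINVAL
 * when LSC interrupt mode is requested but not supported. */
static int
fake_dev_configure(struct fake_dev *dev, bool want_lsc)
{
    dev->configure_calls++;
    if (want_lsc && !dev->supports_lsc) {
        return -EINVAL;
    }
    dev->lsc_interrupt = want_lsc;
    return 0;
}

/* Generic retry: -EINVAL does not say which setting was rejected, so
 * retrying with LSC interrupts off is a guess, not a diagnosis. */
int
configure_with_fallback(struct fake_dev *dev, bool want_lsc)
{
    int err = fake_dev_configure(dev, want_lsc);

    if (err == -EINVAL && want_lsc) {
        err = fake_dev_configure(dev, false);
    }
    return err;
}
```

The sketch also shows the weakness raised above: if -EINVAL came from something else, such as the queue counts, the retry would simply fail again and cost an extra configure call.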

> 
> >
> > I'm a bit late to the thread but I have a few other comments below.
> >
> > I'd like to get this patch in the next pull request if possible so I'd
> appreciate if others can give any comments on the patch also.
> >
> > Thanks
> > Ian
> >
> >>
> >> Signed-off-by: Robert Mulik 
> >> ---
> >> v5 -> v6:
> >> - DPDK install documentation updated.
> >> - Status of lsc_interrupt_mode of DPDK interfaces can be read by
> command
> >>   ovs-appctl dpif/show.
> >> - It was suggested to check if the HW supports interrupt mode, but it
> >> is not
> >>   possible to do without DPDK code change, so it is skipped from this
> >> patch.
> >> ---
> >>  Documentation/intro/install/dpdk.rst | 33
> >> +
> >>  lib/netdev-dpdk.c| 24 ++--
> >>  vswitchd/vswitch.xml | 17 +
> >>  3 files changed, 72 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/Documentation/intro/install/dpdk.rst
> >> b/Documentation/intro/install/dpdk.rst
> >> index ed358d5..eb1bc7b 100644
> >> --- a/Documentation/intro/install/dpdk.rst
> >> +++ b/Documentation/intro/install/dpdk.rst
> >> @@ -628,6 +628,39 @@ The average number of packets per output batch
> >> 

Re: [ovs-dev] [PATCH] rhel: Stop managing the /run/openvswitch directory with systemd.

2018-03-27 Thread Aaron Conole
Markos Chandras  writes:

> On 27/03/18 14:34, Aaron Conole wrote:
>> 
>> Systemd has fixed this with commit:
>> 
>> 30c81ce2cef9 ("pid1: when creating service directories, don't chown existing 
>> files")
>> 
>> Which was caught thanks to some proactive testing:
>> 
>> https://bugzilla.redhat.com/show_bug.cgi?id=1508495
>> 
>> I think we probably don't need this fix, provided downstream versions
>> backport that commit.
>> 
>
> Hi Aaron,
>
> Thank you for the information. I am curious, do you know why we are
> managing the /run/openvswitch directory in the systemd service file
> given that ovs-lib already tries to manage it as well?

There are a few advantages (and some disadvantages, also).

One thing that's nice is systemd will clean up the directories when the
service ends.  I realize that /run is usually tmpfs, but it's nice that
they don't linger - even if ovs-lib "breaks in the middle" (meaning
something goes wrong .. though I'm unable to name an instance where I
observed that).  Actually, I am looking at tmpfiles.d entries for
managing some of these complicated directory lists (like /dev/hugepages,
etc).

Another advantage is when we fully hook up with the
user+group+capabilities (it's on my TODO list) in systemd service
files.  At that point, it will not be possible for the ovs-lib to create
the runtime directories.

Of course, we know the biggest disadvantage - if systemd breaks things,
they are really broken.

Does it make sense?


Re: [ovs-dev] [PATCH v10 3/3] dpif-netdev: Detection and logging of suspicious PMD iterations

2018-03-27 Thread Ilya Maximets
I see the following behaviour:

1. Configure low -us (like 100)
2. After that I see many logs about suspicious iterations (expected).

2018-03-27T13:58:27Z|03574|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520415762246435 duration=106 us
2018-03-27T13:58:27Z|03575|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration:
   tsc               cycles  packets  cycles/pkt  pkts/batch  vhost qlen  upcalls  cycles/upcall
   520415762297985     9711       32         303          32         424        0              0
   520415762287041    10667       32         333          32         419        0              0
   520415762277319     9722       32         303          32         429        0              0
   520415762267083     9971       32         311          32         443        0              0
   520415762257413     9670       32         302          32         451        0              0
   520415762246435    10699       32         334          32         448        0              0
   520415762235033    11109       32         347          32         455        0              0
   520415762180220     9826       32         307          32         399        0              0
   520415762169792    10229       32         319          32         413        0              0
   520415762160385     9407       32         293          32         408        0              0
   520415762150221     9891       32         309          32         434        0              0
2018-03-27T13:58:27Z|03576|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520415762469997 duration=104 us
2018-03-27T13:58:27Z|03577|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration:
   tsc               cycles  packets  cycles/pkt  pkts/batch  vhost qlen  upcalls  cycles/upcall
   520415762519119     9462       32         295          32         505        0              0
   520415762509595     9319       32         291          32         537        0              0
   520415762500154     9283       32         290          32         569        0              0
   520415762490585     9287       32         290          32         601        0              0
   520415762480693     9730       32         304          32         633        0              0
   520415762469997    10414       32         325          32         665        0              0
   520415762459348    10342       32         323          32         697        0              0
   520415762297985     9711       32         303          32         424        0              0
   520415762287041    10667       32         333          32         419        0              0
   520415762277319     9722       32         303          32         429        0              0
   520415762267083     9971       32         311          32         443        0              0

3. Configure a high -us value again (like 1000).
4. Logs are still there, now with zero duration. They are printed every second like this:

2018-03-27T14:02:08Z|04140|pmd_perf(pmd7)|WARN|Suspicious iteration (Excessive total cycles): tsc=520437806368099 duration=0 us
[Thread 0x7fb56f2910 (LWP 19754) exited]
[New Thread 0x7fb56f2910 (LWP 19755)]
2018-03-27T14:02:08Z|04141|pmd_perf(pmd7)|WARN|Neighborhood of suspicious iteration:
   tsc               cycles  packets  cycles/pkt  pkts/batch  vhost qlen  upcalls  cycles/upcall
   520437806368309       44        0           0           0           0        0              0
   520437806368266       43        0           0           0           0        0              0
   520437806368223       43        0           0           0           0        0              0
   520437806368179       44        0           0           0           0        0              0
   520437806368134       45        0           0           0           0        0              0
   520437806368099       35        0           0           0           0        0              0
   520437806005193   362819        0           0           0           0        0              0
   520437806005149       44        0           0           0           0        0              0
   520437806005105       44        0           0           0           0        0              0
   520437806005061       44        0           0           0           0        0              0
   520437806005017       44

Re: [ovs-dev] [PATCH 4/4] rhel: selinux-policy to invoke proper label macros

2018-03-27 Thread Aaron Conole
Ansis Atteka  writes:

> On 20 March 2018 at 14:05, Aaron Conole  wrote:
>> The rpm doesn't invoke all of the required selinux helpers to enact labeling
>> or relabeling on all versions of Fedora/RHEL.  According to:
>>   https://fedoraproject.org/wiki/SELinux/IndependentPolicy
>>
>> This commit switches to use the selinux rpm macros which will ensure that
>> all of the labels defined in the .fc.in file are applied properly.
>
> Ok, it seems you need to send similar patch for
> rhel/openvswitch.spec.in. Not only for fedora.

Cool, will do.

> In the meantime I will later try to add fedorabuilder to the Vagrant
> builder recipes and test what you have for Fedora.

Ansis++!! Thanks!

> Also, why was I able to reload openvswitch kernel module on CentOS
> without the ovs-kmod-ctl being properly marked? Are there some rules
> that we would need to remove now from openvswitch.te?

I'm not sure.  I'm using Fedora and RHEL for my testing, and it seems
the policies/labels are a bit different.  Maybe Lukas (cc'd) knows more?

>>
>> Signed-off-by: Aaron Conole 
>> ---
>>  rhel/openvswitch-fedora.spec.in | 10 --
>>  1 file changed, 8 insertions(+), 2 deletions(-)
>>
>> diff --git a/rhel/openvswitch-fedora.spec.in 
>> b/rhel/openvswitch-fedora.spec.in
>> index 8fbc985ce..b606cb7e0 100644
>> --- a/rhel/openvswitch-fedora.spec.in
>> +++ b/rhel/openvswitch-fedora.spec.in
>> @@ -340,6 +340,9 @@ rm -f $RPM_BUILD_ROOT%{_bindir}/ovs-parse-backtrace \
>>  %clean
>>  rm -rf $RPM_BUILD_ROOT
>>
>> +%pre selinux-policy
>> +%selinux_relabel_pre -s targeted
>> +
>>  %preun
>>  %if 0%{?systemd_preun:1}
>>  %systemd_preun %{name}.service
>> @@ -444,7 +447,7 @@ fi
>>  %endif
>>
>>  %post selinux-policy
>> -/usr/sbin/semodule -i 
>> %{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp &> /dev/null || :
>> +%selinux_modules_install -s targeted 
>> %{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp
>>
>>  %postun
>>  %if 0%{?systemd_postun:1}
>> @@ -476,9 +479,12 @@ fi
>>
>>  %postun selinux-policy
>>  if [ $1 -eq 0 ] ; then
>> -  /usr/sbin/semodule -r openvswitch-custom &> /dev/null || :
>> +  %selinux_modules_uninstall -s targeted openvswitch-custom
>>  fi
>>
>> +%posttrans selinux-policy
>> +%selinux_relabel_post -s targeted
>> +
>>  %files selinux-policy
>>  %defattr(-,root,root)
>>  %{_datadir}/selinux/packages/%{name}/openvswitch-custom.pp
>> --
>> 2.14.3
>>


Re: [ovs-dev] [PATCH 3/4] selinux: introduce domain transitioned kmod helper

2018-03-27 Thread Aaron Conole
Ansis Atteka  writes:

> On 20 March 2018 at 14:05, Aaron Conole  wrote:
>> This commit uses the previously defined selinux label to transition
>> from the openvswitch_t to openvswitch_load_module_t domain, by way of
>> a specially labelled ovs-kmod-ctl helper.
>
> s/by way of a specially labelled ovs-kmod-ctl helper/ by executing
> ovs-kmod-ctl that is labelled with openvswitch_load_module_exec_t
> type.

I like that this also eliminates a silly spelling mistake.  I'll use it.

Thanks!

>>
>> Signed-off-by: Aaron Conole 
>> ---
>>  selinux/.gitignore   | 4 
>>  selinux/automake.mk  | 3 ++-
>>  selinux/openvswitch-custom.fc.in | 1 +
>>  3 files changed, 7 insertions(+), 1 deletion(-)
>>  create mode 100644 selinux/openvswitch-custom.fc.in
>>
>> diff --git a/selinux/.gitignore b/selinux/.gitignore
>> index 83a0afb51..64e834cd1 100644
>> --- a/selinux/.gitignore
>> +++ b/selinux/.gitignore
>> @@ -1 +1,5 @@
>>  openvswitch-custom.te
>> +openvswitch-custom.fc
>> +openvswitch-custom.pp
>> +openvswitch-custom.if
>> +tmp/
>> diff --git a/selinux/automake.mk b/selinux/automake.mk
>> index b37e8f337..c7dfe6ed5 100644
>> --- a/selinux/automake.mk
>> +++ b/selinux/automake.mk
>> @@ -6,11 +6,12 @@
>>  # without warranty of any kind.
>>
>>  EXTRA_DIST += \
>> +selinux/openvswitch-custom.fc.in \
>>  selinux/openvswitch-custom.te.in
>>
>>  PHONY: selinux-policy
>>
>> -selinux-policy: selinux/openvswitch-custom.te
>> +selinux-policy: selinux/openvswitch-custom.te selinux/openvswitch-custom.fc
>> $(MAKE) -C selinux/ -f /usr/share/selinux/devel/Makefile
>>
>>  CLEANFILES += \
>> diff --git a/selinux/openvswitch-custom.fc.in 
>> b/selinux/openvswitch-custom.fc.in
>> new file mode 100644
>> index 0..c2756d04b
>> --- /dev/null
>> +++ b/selinux/openvswitch-custom.fc.in
>> @@ -0,0 +1 @@
>> +@pkgdatadir@/scripts/ovs-kmod-ctl -- 
>> gen_context(system_u:object_r:openvswitch_load_module_exec_t,s0)
>
> It seems that the above line did not work for me on CentOS 7 (at least
> automatically). If you use vagrant then you can repro by:

Right, I'm not surprised.  I discovered that there needs to still be a
relabel operation.

> # cd poc/builders
> # vagrant up centosbuilder
> # vagrant ssh centosbuilder
> # cd /var/www/html/RPMS/x86_64
> # install ovs rpm
> # cd /var/www/html/RPMS/noarch
> # install selinux rpm
> # ls -Z /usr/share/openvswitch/scripts/ovs-kmod-ctl

But I wonder if it's still not functional after 4/4 - I'll look into it.

> to see it for yourself.
>> --
>> 2.14.3
>>


Re: [ovs-dev] [PATCH 2/4] selinux: create a transition type for module loading

2018-03-27 Thread Aaron Conole
Ansis Atteka  writes:

> On 20 March 2018 at 14:05, Aaron Conole  wrote:
>> Defines a type 'openvswitch_load_module_t' used exclusively for loading
>> modules.  This means that the 'openvswitch_t' domain won't require
>> modules
>
> Are you sure the bootstrapping to intended openvswitch_load_module_t
> happens properly?

Are you asking whether the domain works?  It did for me.

> In my case it does not appear to happen correctly, because the
> ovs-kmod-ctl does not have the right SElinux type:
>
> [vagrant@centosbuilder ~]$ ls -Z /usr/share/openvswitch/scripts/ovs-kmod-ctl
> -rwxr-xr-x. root root system_u:object_r:usr_t:s0
> /usr/share/openvswitch/scripts/ovs-kmod-ctl
>
> and then in "ps -Z" I see:
>
> unconfined_u:system_r:openvswitch_t:s0 root 32013 31995  0 21:37 ?
> 00:00:00 /bin/sh /usr/share/openvswitch/scripts/ovs-kmod-ctl insert
>
> After manually:
>
> # chcon system_u:object_r:openvswitch_load_module_exec_t:s0
> /usr/share/openvswitch/scripts/ovs-kmod-ctl
>
> I see that in "ps -Z ..." output suddenly the process executing
> ovs-kmod-ctl transitions to the correct openvswitch_load_module_t
> type:
>
> unconfined_u:system_r:openvswitch_load_module_t:s0 root 12225 12215  0
> 21:33 ? 00:00:00 /bin/sh /usr/share/openvswitch/scripts/ovs-kmod-ctl
> insert
>
>
> Is this a bug or am I missing something?

This commit creates the domain, but nothing is labeled to it, until
3/4.  After 3/4, the label will exist in the policy (but only get
applied when the label operation is invoked, it seems - which was
confusing for me).  This is also why I needed 4/4 - the selinux labeling
operations weren't there.

Make sense?

>> access to the module loading facility - such access can only happen
>> after transitioning through the 'openvswitch_load_module_exec_t'
>> transition context.
>>
>> A future commit will label the appropriate script with extended attributes
>> to make use of this new domain.
>>
>> Signed-off-by: Aaron Conole 
>> ---
>>  selinux/openvswitch-custom.te.in | 79 
>> +---
>>  1 file changed, 74 insertions(+), 5 deletions(-)
>>
>> diff --git a/selinux/openvswitch-custom.te.in 
>> b/selinux/openvswitch-custom.te.in
>> index db3cf6d8d..31e8fab15 100644
>> --- a/selinux/openvswitch-custom.te.in
>> +++ b/selinux/openvswitch-custom.te.in
>> @@ -1,13 +1,31 @@
>>  module openvswitch-custom 1.0.1;
> Unrelated to your series, but I think we should peg the Open vSwitch
> selinux module version to the Open vSwitch version. What do you think?

I think it's a good idea.  I can fold it in as a new patch in the
series.  Or if you want to submit it formally, go ahead and include my
Acked-by :)

>>
>>  require {
>> +role system_r;
>> +role object_r;
>> +
>>  type openvswitch_t;
>>  type openvswitch_rw_t;
>>  type openvswitch_tmp_t;
>>  type openvswitch_var_run_t;
>>
>> +type bin_t;
>>  type ifconfig_exec_t;
>> +type init_t;
>> +type init_var_run_t;
>> +type insmod_exec_t;
>>  type hostname_exec_t;
>> +type modules_conf_t;
>> +type modules_object_t;
>> +type passwd_file_t;
>> +type plymouth_exec_t;
>> +type proc_t;
>> +type shell_exec_t;
>> +type sssd_t;
>> +type sssd_public_t;
>> +type sssd_var_lib_t;
>> +type sysfs_t;
>> +type systemd_unit_file_t;
>>  type tun_tap_device_t;
>>
>>  @begin_dpdk@
>> @@ -21,18 +39,36 @@ require {
>>
>>  class capability { dac_override audit_write };
>>  class chr_file { write getattr read open ioctl };
>> -class dir { write remove_name add_name lock read };
>> -class file { write getattr read open execute execute_no_trans 
>> create unlink };
>> +class dir { write remove_name add_name lock read getattr search 
>> open };
>> +class fd { use };
>> +class file { write getattr read open execute execute_no_trans 
>> create unlink map entrypoint lock ioctl };
>> +class fifo_file { getattr read write append ioctl lock open };
>> +class filesystem getattr;
>> +class lnk_file { read open };
>>  class netlink_audit_socket { create nlmsg_relay audit_write read 
>> write };
>>  class netlink_socket { setopt getopt create connect getattr write 
>> read };
>> -class unix_stream_socket { write getattr read connectto connect 
>> setopt getopt sendto accept bind recvfrom acceptfrom };
>> +class sock_file { write };
>> +class system module_load;
>> +class process { sigchld signull transition noatsecure siginh 
>> rlimitinh };
>> +class unix_stream_socket { write getattr read connectto connect 
>> setopt getopt sendto accept bind recvfrom acceptfrom ioctl };
>>
>>  @begin_dpdk@
>> -class sock_file { read write append getattr open };
>> +class sock_file { read append getattr 

Re: [ovs-dev] [PATCH] rhel: Stop managing the /run/openvswitch directory with systemd.

2018-03-27 Thread Markos Chandras
On 27/03/18 14:34, Aaron Conole wrote:
> 
> Systemd has fixed this with commit:
> 
> 30c81ce2cef9 ("pid1: when creating service directories, don't chown existing 
> files")
> 
> Which was caught thanks to some proactive testing:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1508495
> 
> I think we probably don't need this fix, provided downstream versions
> backport that commit.
> 

Hi Aaron,

Thank you for the information. I am curious, do you know why we are
managing the /run/openvswitch directory in the systemd service file
given that ovs-lib already tries to manage it as well?

-- 
markos

SUSE LINUX GmbH | GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg) Maxfeldstr. 5, D-90409, Nürnberg
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] rhel: Stop managing the /run/openvswitch directory with systemd.

2018-03-27 Thread Aaron Conole
Markos Chandras  writes:

> It appears that new systemd versions (tested with v237) changed the
> way RuntimeDirectory option behaves. Upstream commit 3536f49e8fa2
> ("core: add {State,Cache,Log,Configuration}Directory=") modified the
> RuntimeDirectory code to run before every ExecStart* command instead
> of running it once per service file when the service is run as 'root'.
>
> This breaks the ovsdb-server because after the chown command was applied,
> the RuntimeDirectory code was executed again, effectively wiping the
> /run/openvswitch directory and creating it again resulting in the
> following problem.
>
> |2|daemon_unix|EMER|/var/run/openvswitch/ovsdb-server.pid.tmp: create 
> failed (Permission denied)
> Mar 19 16:37:20 susetest ovs-ctl[3045]: ovsdb-server: 
> /var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)
> Mar 19 16:37:20 susetest ovs-ctl[3045]: Starting ovsdb-server ... failed!
>
> The ovs-lib code can already manage that directory for us so we can
> remove these entries from the systemd file and let ovs-vsctl do it.
>
> Cc: Aaron Conole 
> Signed-off-by: Markos Chandras 
> ---

Systemd has fixed this with commit:

30c81ce2cef9 ("pid1: when creating service directories, don't chown existing 
files")

Which was caught thanks to some proactive testing:

https://bugzilla.redhat.com/show_bug.cgi?id=1508495

I think we probably don't need this fix, provided downstream versions
backport that commit.


Re: [ovs-dev] [PATCH] rhel: don't drop capabilities when running as root

2018-03-27 Thread Aaron Conole
Aaron Conole  writes:

> Currently, regardless of which user is being set as the running user,
> Open vSwitch daemons on RHEL systems drop capabilities.  This means the
> very powerful CAP_SYS_ADMIN is dropped, even when the user is 'root'.
>
> For the majority of use cases this behavior works, as the user can
> enable or disable various configurations, regardless of which datapath
> functions are desired.  However, when using certain DPDK PMDs, the
> enablement and configuration calls require CAP_SYS_ADMIN.
>
> Instead of retaining CAP_SYS_ADMIN in all cases, which would practically
> nullify the uid/gid and privilege drop, we don't pass the --ovs-user
> option to the daemons.  This shunts the capability and privilege
> dropping code.
>
> Reported-by: Marcos Felipe Schwarz 
> Reported-at: 
> https://mail.openvswitch.org/pipermail/ovs-discuss/2018-January/045955.html
> Fixes: e3e738a3d058 ("redhat: allow dpdk to also run as non-root user")
> Signed-off-by: Aaron Conole 
> ---

Ping?


Re: [ovs-dev] [PATCH v10 2/3] dpif-netdev: Detailed performance stats for PMDs

2018-03-27 Thread Ilya Maximets
Comments inline.

Best regards, Ilya Maximets.

On 18.03.2018 20:55, Jan Scheurich wrote:
> This patch instruments the dpif-netdev datapath to record detailed
> statistics of what is happening in every iteration of a PMD thread.
> 
> The collection of detailed statistics can be controlled by a new
> Open_vSwitch configuration parameter "other_config:pmd-perf-metrics".
> By default it is disabled. The run-time overhead, when enabled, is
> in the order of 1%.
> 
> The covered metrics per iteration are:
>   - cycles
>   - packets
>   - (rx) batches
>   - packets/batch
>   - max. vhostuser qlen
>   - upcalls
>   - cycles spent in upcalls
> 
> This raw recorded data is used threefold:
> 
> 1. In histograms for each of the following metrics:
>- cycles/iteration (log.)
>- packets/iteration (log.)
>- cycles/packet
>- packets/batch
>- max. vhostuser qlen (log.)
>- upcalls
>- cycles/upcall (log)
>The histogram bins are divided linearly or logarithmically.
> 
> 2. A cyclic history of the above statistics for 999 iterations
> 
> 3. A cyclic history of the cumulative/average values per millisecond
>wall clock for the last 1000 milliseconds:
>- number of iterations
>- avg. cycles/iteration
>- packets (Kpps)
>- avg. packets/batch
>- avg. max vhost qlen
>- upcalls
>- avg. cycles/upcall
> 
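To illustrate the difference between linearly and logarithmically divided bins mentioned above, here is a minimal sketch. It is not the code from the patch: the struct, the function names, the bin count and the choice of doubling as the logarithmic step are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_N_BINS 8   /* Hypothetical bin count, chosen for illustration. */

struct sketch_histogram {
    uint32_t wall[SKETCH_N_BINS];   /* Inclusive upper bound of each bin. */
    uint64_t bin[SKETCH_N_BINS];    /* Number of samples counted per bin. */
};

/* Linear bins: upper bounds are first, first + step, first + 2 * step, ...
 * The last bin catches every larger value. */
void
sketch_histogram_init_linear(struct sketch_histogram *h,
                             uint32_t first, uint32_t step)
{
    assert(h);
    memset(h, 0, sizeof *h);
    for (int i = 0; i < SKETCH_N_BINS - 1; i++) {
        h->wall[i] = first + (uint32_t) i * step;
    }
    h->wall[SKETCH_N_BINS - 1] = UINT32_MAX;
}

/* Logarithmic bins: every upper bound is twice the previous one, so a few
 * bins cover a wide range.  'first' must be small enough that the shifts
 * below do not overflow.  The last bin catches every larger value. */
void
sketch_histogram_init_log2(struct sketch_histogram *h, uint32_t first)
{
    assert(h && first > 0);
    memset(h, 0, sizeof *h);
    for (int i = 0; i < SKETCH_N_BINS - 1; i++) {
        h->wall[i] = first << i;
    }
    h->wall[SKETCH_N_BINS - 1] = UINT32_MAX;
}

/* Counts 'value' in the first bin whose upper bound is not below it. */
void
sketch_histogram_add(struct sketch_histogram *h, uint32_t value)
{
    assert(h);
    for (int i = 0; i < SKETCH_N_BINS; i++) {
        if (value <= h->wall[i]) {
            h->bin[i]++;
            return;
        }
    }
}
```

Logarithmic bins suit metrics such as cycles/iteration that span several orders of magnitude, while linear bins suit bounded metrics such as packets/batch.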
> The gathered performance metrics can be printed at any time with the
> new CLI command
> 
> ovs-appctl dpif-netdev/pmd-perf-show [-nh] [-it iter_len] [-ms ms_len]
> [-pmd core] [dp]
> 
> The options are
> 
> -nh:Suppress the histograms
> -it iter_len:   Display the last iter_len iteration stats
> -ms ms_len: Display the last ms_len millisecond stats
> -pmd core:  Display only the specified PMD
> 
> The performance statistics are reset with the existing
> dpif-netdev/pmd-stats-clear command.
> 
> The output always contains the following global PMD statistics,
> similar to the pmd-stats-show command:
> 
> Time: 15:24:55.270
> Measurement duration: 1.008 s
> 
> pmd thread numa_id 0 core_id 1:
> 
>   Cycles:2419034712  (2.40 GHz)
>   Iterations:572817  (1.76 us/it)
>   - idle:486808  (15.9 % cycles)
>   - busy: 86009  (84.1 % cycles)
>   Rx packets:   2399607  (2381 Kpps, 848 cycles/pkt)
>   Datapath passes:  3599415  (1.50 passes/pkt)
>   - EMC hits:336472  ( 9.3 %)
>   - Megaflow hits:  3262943  (90.7 %, 1.00 subtbl lookups/hit)
>   - Upcalls:  0  ( 0.0 %, 0.0 us/upcall)
>   - Lost upcalls: 0  ( 0.0 %)
>   Tx packets:   2399607  (2381 Kpps)
>   Tx batches:171400  (14.00 pkts/batch)
> 
> Signed-off-by: Jan Scheurich 
> Acked-by: Billy O'Mahony 
> ---
>  NEWS|   3 +
>  lib/automake.mk |   1 +
>  lib/dpif-netdev-perf.c  | 350 
> +++-
>  lib/dpif-netdev-perf.h  | 258 ++--
>  lib/dpif-netdev-unixctl.man | 157 
>  lib/dpif-netdev.c   | 183 +--
>  manpages.mk |   2 +
>  vswitchd/ovs-vswitchd.8.in  |  27 +---
>  vswitchd/vswitch.xml|  12 ++
>  9 files changed, 940 insertions(+), 53 deletions(-)
>  create mode 100644 lib/dpif-netdev-unixctl.man
> 
> diff --git a/NEWS b/NEWS
> index 8d0b502..8f66fd3 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -73,6 +73,9 @@ v2.9.0 - 19 Feb 2018
>   * Add support for vHost dequeue zero copy (experimental)
> - Userspace datapath:
>   * Output packet batching support.
> + * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single 
> PMD
> + * Detailed PMD performance metrics available with new command
> + ovs-appctl dpif-netdev/pmd-perf-show

I guess this should go under Post-v2.9.0.

> - vswitchd:
>   * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
>   * Configuring a controller, or unconfiguring all controllers, now 
> deletes
> diff --git a/lib/automake.mk b/lib/automake.mk
> index 5c26e0f..7a5632d 100644
> --- a/lib/automake.mk
> +++ b/lib/automake.mk
> @@ -484,6 +484,7 @@ MAN_FRAGMENTS += \
>   lib/dpctl.man \
>   lib/memory-unixctl.man \
>   lib/netdev-dpdk-unixctl.man \
> + lib/dpif-netdev-unixctl.man \
>   lib/ofp-version.man \
>   lib/ovs.tmac \
>   lib/service.man \
> diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
> index f06991a..2b36410 100644
> --- a/lib/dpif-netdev-perf.c
> +++ b/lib/dpif-netdev-perf.c
> @@ -15,18 +15,324 @@
>   */
>  
>  #include 
> +#include 
>  
> +#include "dpif-netdev-perf.h"
>  #include "openvswitch/dynamic-string.h"
>  #include "openvswitch/vlog.h"
> -#include "dpif-netdev-perf.h"
> +#include "ovs-thread.h"
>  #include "timeval.h"
>  
>  VLOG_DEFINE_THIS_MODULE(pmd_perf);
>  
> +#ifdef DPDK_NETDEV
> +static uint64_t
> 

[ovs-dev] can not update userspace vxlan tunnel neigh mac when peer VTEP mac changed

2018-03-27 Thread ychen
Hi,
   I found that userspace VXLAN sometimes does not work properly.
   1.  First data packet is lost
When the tunnel neigh cache is empty, the first data packet triggers an
ARP request to the peer VTEP and the data packet itself is dropped. The
tunnel neigh cache only adds the entry once the ARP reply is received.
   
    err = tnl_neigh_lookup(out_dev->xbridge->name, &d_ip6, &dmac);
    if (err) {
        xlate_report(ctx, OFT_DETAIL,
                     "neighbor cache miss for %s on bridge %s, "
                     "sending %s request",
                     buf_dip6, out_dev->xbridge->name, d_ip ? "ARP" : "ND");
        if (d_ip) {
            tnl_send_arp_request(ctx, out_dev, smac, s_ip, d_ip);
        } else {
            tnl_send_nd_request(ctx, out_dev, smac, &s_ip6, &d_ip6);
        }
        return err;
    }


2. Connection lost when the peer VTEP MAC changes
Suppose the VTEP MAC is already in the tunnel neigh cache, e.g.:
10.182.6.81   fa:eb:26:c3:16:a5   br-phy

When a data packet comes in, this MAC is used to build the outer VXLAN
header. Now the MAC of VTEP 10.182.6.81 changes from fa:eb:26:c3:16:a5 to
24:eb:26:c3:16:a5 because the NIC was changed.

Data packets continue to be sent with the old MAC fa:eb:26:c3:16:a5, but the
peer VTEP does not accept them because the MAC does not match.
The wrong tunnel neigh entry only ages out after the data packets stop
being sent.


   if (ovs_native_tunneling_is_on(ctx->xbridge->ofproto)) {
tnl_neigh_snoop(flow, wc, ctx->xbridge->name);
}
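As a rough illustration of problem 2, the sketch below shows a neighbor table in which a snooped (IP, MAC) pair overwrites a stale MAC instead of waiting for the entry to age out. It is only a sketch under stated assumptions: the struct, the function names and the fixed table size are hypothetical and are not taken from the OVS tnl-neigh-cache code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_NEIGH 16   /* Hypothetical table size. */

struct sketch_neigh {
    bool in_use;
    uint32_t ip;              /* IPv4 address of the peer VTEP. */
    uint8_t mac[6];           /* Last MAC learned for that address. */
};

struct sketch_neigh_cache {
    struct sketch_neigh entry[SKETCH_MAX_NEIGH];
};

/* Returns the entry for 'ip', or NULL on a cache miss. */
struct sketch_neigh *
sketch_neigh_lookup(struct sketch_neigh_cache *c, uint32_t ip)
{
    assert(c);
    for (int i = 0; i < SKETCH_MAX_NEIGH; i++) {
        if (c->entry[i].in_use && c->entry[i].ip == ip) {
            return &c->entry[i];
        }
    }
    return NULL;
}

/* Records that a frame from 'ip' was seen with source MAC 'mac'.
 * Returns true if the cache changed (new entry or MAC replaced). */
bool
sketch_neigh_snoop(struct sketch_neigh_cache *c, uint32_t ip,
                   const uint8_t mac[6])
{
    assert(c && mac);
    struct sketch_neigh *n = sketch_neigh_lookup(c, ip);
    if (n) {
        if (!memcmp(n->mac, mac, 6)) {
            return false;             /* Mapping unchanged. */
        }
        memcpy(n->mac, mac, 6);       /* Peer MAC changed: overwrite it. */
        return true;
    }
    for (int i = 0; i < SKETCH_MAX_NEIGH; i++) {
        if (!c->entry[i].in_use) {
            c->entry[i].in_use = true;
            c->entry[i].ip = ip;
            memcpy(c->entry[i].mac, mac, 6);
            return true;
        }
    }
    return false;                     /* Table full: entry not learned. */
}
```

Whether such an update should be driven by ARP replies, gratuitous ARPs or received tunnel packets is a separate design question that this sketch does not answer.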


3. Is anybody already working on these problems?





Re: [ovs-dev] [PATCH v10 1/3] netdev: Add optional qfill output parameter to rxq_recv()

2018-03-27 Thread Ilya Maximets
I still don't like the overall implementation approach of the patch set,
but I guess we could live with it for some time while thinking
about a rework.

One minor comment inline.

Sorry again for the late responses.

Best regards, Ilya Maximets.

On 18.03.2018 20:55, Jan Scheurich wrote:
> If the caller provides a non-NULL qfill pointer and the netdev
> implementation supports reading the rx queue fill level, the rxq_recv()
> function returns the remaining number of packets in the rx queue after
> reception of the packet burst to the caller. If the implementation does
> not support this, it returns -ENOTSUP instead. Reading the remaining queue
> fill level should not substantially slow down the recv() operation.
> 
> A first implementation is provided for ethernet and vhostuser DPDK ports
> in netdev-dpdk.c.
> 
> This output parameter will be used in the upcoming commit for PMD
> performance metrics to supervise the rx queue fill level for DPDK
> vhostuser ports.
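
The contract of the optional qfill output parameter described above can be sketched as follows. This is a toy model, not the netdev code: the queue struct, the burst size and the function name are assumptions, and the real providers obtain the fill level from DPDK rather than from a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define SKETCH_MAX_BURST 32   /* Hypothetical burst size. */

struct sketch_rxq {
    int backlog;              /* Packets currently waiting in the queue. */
    int can_report_fill;      /* Nonzero if the fill level can be read. */
};

/* Receives up to SKETCH_MAX_BURST packets and stores the count in '*n_rx'.
 * If 'qfill' is non-NULL, stores the number of packets still queued after
 * the burst, or -ENOTSUP if the queue cannot report its fill level.
 * Returns 0 if at least one packet was received, otherwise EAGAIN. */
int
sketch_rxq_recv(struct sketch_rxq *rxq, int *n_rx, int *qfill)
{
    assert(rxq && n_rx);
    int n = rxq->backlog < SKETCH_MAX_BURST ? rxq->backlog : SKETCH_MAX_BURST;

    rxq->backlog -= n;
    *n_rx = n;

    if (qfill) {
        if (!rxq->can_report_fill) {
            *qfill = -ENOTSUP;
        } else if (n == SKETCH_MAX_BURST) {
            /* A full burst means more packets may be waiting, so only in
             * this case is the (possibly costly) fill level read. */
            *qfill = rxq->backlog;
        } else {
            *qfill = 0;       /* A short burst drained the queue. */
        }
    }
    return n ? 0 : EAGAIN;
}
```

Callers that do not need the fill level pass NULL and pay no extra cost, which matches the intent that reading it should not slow down the receive path.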
> 
> Signed-off-by: Jan Scheurich 
> Acked-by: Billy O'Mahony 
> ---
>  lib/dpif-netdev.c |  2 +-
>  lib/netdev-bsd.c  |  8 +++-
>  lib/netdev-dpdk.c | 25 +++--
>  lib/netdev-dummy.c|  8 +++-
>  lib/netdev-linux.c|  7 ++-
>  lib/netdev-provider.h |  7 ++-
>  lib/netdev.c  |  5 +++--
>  lib/netdev.h  |  3 ++-
>  8 files changed, 55 insertions(+), 10 deletions(-)
> 
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index b07fc6b..86d8739 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3276,7 +3276,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread 
> *pmd,
>  pmd->ctx.last_rxq = rxq;
>  dp_packet_batch_init(&batch);
>  
> -error = netdev_rxq_recv(rxq->rx, &batch);
> +error = netdev_rxq_recv(rxq->rx, &batch, NULL);
>  if (!error) {
>  /* At least one packet received. */
>  *recirc_depth_get() = 0;
> diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c
> index 05974c1..b70f327 100644
> --- a/lib/netdev-bsd.c
> +++ b/lib/netdev-bsd.c
> @@ -618,7 +618,8 @@ netdev_rxq_bsd_recv_tap(struct netdev_rxq_bsd *rxq, 
> struct dp_packet *buffer)
>  }
>  
>  static int
> -netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch)
> +netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch,
> +int *qfill)
>  {
>  struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_);
>  struct netdev *netdev = rxq->up.netdev;
> @@ -643,6 +644,11 @@ netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct 
> dp_packet_batch *batch)
>  batch->packets[0] = packet;
>  batch->count = 1;
>  }
> +
> +if (qfill) {
> +*qfill = -ENOTSUP;
> +}
> +
>  return retval;
>  }
>  
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index af9843a..66f2439 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -1808,7 +1808,7 @@ netdev_dpdk_vhost_update_rx_counters(struct 
> netdev_stats *stats,
>   */
>  static int
>  netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
> -   struct dp_packet_batch *batch)
> +   struct dp_packet_batch *batch, int *qfill)
>  {
>  struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
>  struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev);
> @@ -1846,11 +1846,24 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
>  batch->count = nb_rx;
>  dp_packet_batch_init_packet_fields(batch);
>  
> +if (qfill) {
> +if (nb_rx == NETDEV_MAX_BURST) {
> +/* The DPDK API returns a uint32_t which often has invalid bits 
> in
> + * the upper 16-bits. Need to restrict the value to uint16_t. */
> +*qfill = rte_vhost_rx_queue_count(netdev_dpdk_get_vid(dev),
> +  qid * VIRTIO_QNUM + VIRTIO_TXQ)
> +& UINT16_MAX;
> +} else {
> +*qfill = 0;
> +}
> +}
> +
>  return 0;
>  }
>  
>  static int
> -netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch)
> +netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch,
> + int *qfill)
>  {
>  struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq);
>  struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev);
> @@ -1887,6 +1900,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
> dp_packet_batch *batch)
>  batch->count = nb_rx;
>  dp_packet_batch_init_packet_fields(batch);
>  
> +if (qfill) {
> +if (nb_rx == NETDEV_MAX_BURST) {
> +*qfill = rte_eth_rx_queue_count(rx->port_id, rxq->queue_id);
> +} else {
> +*qfill = 0;
> +}
> +}
> +
>  return 0;
>  }
>  
> diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c
> index 8af9e1a..13bc580 100644
> --- a/lib/netdev-dummy.c
> +++ b/lib/netdev-dummy.c
> @@ -992,7 

Re: [ovs-dev] [PATCH v6] Configurable Link State Change (LSC) detection mode

2018-03-27 Thread Ilya Maximets
On 27.03.2018 13:19, Stokes, Ian wrote:
>> It is possible to change LSC detection mode to polling or interrupt mode
>> for DPDK interfaces. The default is polling mode. To set interrupt mode,
>> option dpdk-lsc-interrupt has to be set to true.
>>
>> In polling mode more processor time is needed, since OVS repeatedly
>> reads the link state at short intervals. This can lead to packet loss on
>> certain systems.
>>
>> In interrupt mode the hardware itself triggers an interrupt when a link
>> state change happens, so OVS needs less processing time.
>>
>> For detailed description and usage see the dpdk install documentation.

Could you please describe in more detail why we need this change?
We're not removing the polling thread, so OVS will still
poll the link states periodically. This config option has
no effect on that side. Also, link state polling in OVS uses the
'rte_eth_link_get_nowait()' function, which will be called in both
cases and should not wait for a hardware reply in any implementation.

There was a recent bug fix for Intel NICs that stops link state requests
from waiting on the admin queue despite the 'no_wait' flag:
http://dpdk.org/ml/archives/dev/2018-March/092156.html
Will this fix your target case?

So the difference in execution time of 'rte_eth_link_get_nowait()'
with interrupts enabled and disabled should not be significant.
Do you have performance measurements? Measurements with the above fix applied?


> 
> Thanks for working on this Robert.
> 
> I've completed some testing including the case where LSC is not supported, in 
> which case the port will remain in a down state and fail rx/tx traffic. This 
> behavior conforms to the netdev_reconfigure expectations in the fail case so 
> that's ok.

I'm not sure if this is acceptable. For example, we don't fail
reconfiguration when there are issues with the number of queues; we try
different numbers until we have a working configuration.
Maybe we need the same fall-back mechanism when LSC interrupts are not
supported? (MTU setup errors are really uncommon, unlike missing LSC
interrupt support in PMDs.)
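
A fall-back of the kind suggested here could look roughly like the sketch below: try the requested mode first and, if interrupt mode is rejected, retry with polling instead of leaving the port down. Everything in it is hypothetical. The callback type stands in for the real port configuration call, and the stub driver exists only to show the behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback standing in for the real port configuration call.
 * Returns 0 on success or a positive errno value on failure. */
typedef int (*sketch_configure_fn)(void *port, bool lsc_interrupt);

/* Configures 'port' with the requested LSC mode.  If interrupt mode was
 * requested but rejected, retries in polling mode.  On success, stores the
 * mode actually in effect in '*effective' and returns 0. */
int
sketch_configure_lsc(void *port, sketch_configure_fn configure,
                     bool requested_interrupt, bool *effective)
{
    assert(configure && effective);
    bool mode = requested_interrupt;
    int err = configure(port, mode);

    if (err && mode) {
        mode = false;                 /* Fall back to polling. */
        err = configure(port, mode);
    }
    if (!err) {
        *effective = mode;
    }
    return err;
}

/* Example stand-in for a driver without LSC interrupt support. */
int
sketch_port_without_lsc(void *port, bool lsc_interrupt)
{
    (void) port;
    return lsc_interrupt ? ENOTSUP : 0;
}
```

Keeping the requested and the effective mode separate lets the status output show that a fall-back happened.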

> 
> I'm a bit late to the thread but I have a few other comments below.
> 
> I'd like to get this patch in the next pull request if possible so I'd 
> appreciate if others can give any comments on the patch also.
> 
> Thanks
> Ian
> 
>>
>> Signed-off-by: Robert Mulik 
>> ---
>> v5 -> v6:
>> - DPDK install documentation updated.
>> - Status of lsc_interrupt_mode of DPDK interfaces can be read by command
>>   ovs-appctl dpif/show.
>> - It was suggested to check if the HW supports interrupt mode, but it is
>> not
>>   possible to do without DPDK code change, so it is skipped from this
>> patch.
>> ---
>>  Documentation/intro/install/dpdk.rst | 33
>> +
>>  lib/netdev-dpdk.c| 24 ++--
>>  vswitchd/vswitch.xml | 17 +
>>  3 files changed, 72 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/intro/install/dpdk.rst
>> b/Documentation/intro/install/dpdk.rst
>> index ed358d5..eb1bc7b 100644
>> --- a/Documentation/intro/install/dpdk.rst
>> +++ b/Documentation/intro/install/dpdk.rst
>> @@ -628,6 +628,39 @@ The average number of packets per output batch can be
>> checked in PMD stats::
>>
>>  $ ovs-appctl dpif-netdev/pmd-stats-show
>>
>> +Link State Change (LSC) detection configuration
>> +~~~
>> +
>> +There are two methods to get the information when Link State Change
>> +(LSC) happens on a network interface: by polling or interrupt.
>> +
>> +With the polling method, the main process checks the link state on a
>> +fixed interval. This fixed interval determines how fast a link change
>> +is detected.
>> +
>> +If interrupts are used to get LSC information, the hardware itself
>> +triggers an interrupt when link state change happens, the interrupt
>> +thread wakes up from sleep, updates the information, and goes back to
>> +sleep mode. When no link state change happens (most of the time), the
>> +thread remains in sleep mode and doesn't use processor time at all. The
>> +disadvantage of this method is that when an interrupt happens, the
>> +processor has to handle it immediately, so it puts the currently
>> +running process to background, handles the interrupt, and takes the
>> background process back.
>> +
>> +Note that not all PMD drivers support LSC interrupts.
>> +
>> +The default configuration is polling mode. To set interrupt mode,
>> +option ``dpdk-lsc-interrupt`` has to be set to ``true``.
>> +
>> +Command to set interrupt mode for a specific interface::
>> +$ ovs-vsctl set interface 
>> +options:dpdk-lsc-interrupt=true
>> +
>> +Command to set polling mode for a specific interface::
>> +$ ovs-vsctl set interface 
>> +options:dpdk-lsc-interrupt=false
>> +
>> +Command to remove ``dpdk-lsc-interrupt`` option::
>> +$ ovs-vsctl remove interface  options
>> 

Re: [ovs-dev] [PATCH v6] Configurable Link State Change (LSC) detection mode

2018-03-27 Thread Stokes, Ian
> It is possible to change LSC detection mode to polling or interrupt mode
> for DPDK interfaces. The default is polling mode. To set interrupt mode,
> option dpdk-lsc-interrupt has to be set to true.
> 
> In polling mode more processor time is needed, since OVS repeatedly
> reads the link state at short intervals. This can lead to packet loss on
> certain systems.
> 
> In interrupt mode the hardware itself triggers an interrupt when a link
> state change happens, so OVS needs less processing time.
> 
> For detailed description and usage see the dpdk install documentation.

Thanks for working on this Robert.

I've completed some testing including the case where LSC is not supported, in 
which case the port will remain in a down state and fail rx/tx traffic. This 
behavior conforms to the netdev_reconfigure expectations in the fail case so 
that's ok.

I'm a bit late to the thread but I have a few other comments below.

I'd like to get this patch in the next pull request if possible so I'd 
appreciate if others can give any comments on the patch also.

Thanks
Ian

> 
> Signed-off-by: Robert Mulik 
> ---
> v5 -> v6:
> - DPDK install documentation updated.
> - Status of lsc_interrupt_mode of DPDK interfaces can be read by command
>   ovs-appctl dpif/show.
> - It was suggested to check if the HW supports interrupt mode, but it is
> not
>   possible to do without DPDK code change, so it is skipped from this
> patch.
> ---
>  Documentation/intro/install/dpdk.rst | 33
> +
>  lib/netdev-dpdk.c| 24 ++--
>  vswitchd/vswitch.xml | 17 +
>  3 files changed, 72 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst
> b/Documentation/intro/install/dpdk.rst
> index ed358d5..eb1bc7b 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -628,6 +628,39 @@ The average number of packets per output batch can be
> checked in PMD stats::
> 
>  $ ovs-appctl dpif-netdev/pmd-stats-show
> 
> +Link State Change (LSC) detection configuration
> +~~~
> +
> +There are two methods to get the information when Link State Change
> +(LSC) happens on a network interface: by polling or interrupt.
> +
> +With the polling method, the main process checks the link state on a
> +fixed interval. This fixed interval determines how fast a link change
> +is detected.
> +
> +If interrupts are used to get LSC information, the hardware itself
> +triggers an interrupt when link state change happens, the interrupt
> +thread wakes up from sleep, updates the information, and goes back to
> +sleep mode. When no link state change happens (most of the time), the
> +thread remains in sleep mode and doesn't use processor time at all. The
> +disadvantage of this method is that when an interrupt happens, the
> +processor has to handle it immediately, so it puts the currently
> +running process to background, handles the interrupt, and takes the
> background process back.
> +
> +Note that not all PMD drivers support LSC interrupts.
> +
> +The default configuration is polling mode. To set interrupt mode,
> +option ``dpdk-lsc-interrupt`` has to be set to ``true``.
> +
> +Command to set interrupt mode for a specific interface::
> +$ ovs-vsctl set interface 
> +options:dpdk-lsc-interrupt=true
> +
> +Command to set polling mode for a specific interface::
> +$ ovs-vsctl set interface 
> +options:dpdk-lsc-interrupt=false
> +
> +Command to remove ``dpdk-lsc-interrupt`` option::
> +$ ovs-vsctl remove interface  options
> +dpdk-lsc-interrupt

Just a query: why do we need the above command to remove the LSC option? Is
setting it to true or false with the previous commands not enough?

> +
>  Limitations
>  
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 94fb163..e2794e8
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -433,6 +433,12 @@ struct netdev_dpdk {
>  /* DPDK-ETH hardware offload features,
>   * from the enum set 'dpdk_hw_ol_features' */
>  uint32_t hw_ol_features;
> +
> +/* Properties for link state change detection mode.
> + * If lsc_interrupt_mode is set to false, poll mode is used,
> + * otherwise interrupt mode is used. */
> +bool requested_lsc_interrupt_mode;
> +bool lsc_interrupt_mode;
>  );
> 
>  PADDED_MEMBERS(CACHE_LINE_SIZE,
> @@ -686,12 +692,14 @@ dpdk_watchdog(void *dummy OVS_UNUSED)  }
> 
>  static int
> -dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int n_rxq, int n_txq)
> +dpdk_eth_dev_port_config(struct netdev_dpdk *dev, int n_rxq, int n_txq)
>  {
>  int diag = 0;
>  int i;
>  struct rte_eth_conf conf = port_conf;
> 
> +conf.intr_conf.lsc = dev->lsc_interrupt_mode;

Should above be dev->requested_lsc_interrupt_mode? Similar to how we request 
MTUs, we first 

[ovs-dev] [PATCH] rhel: Stop managing the /run/openvswitch directory with systemd.

2018-03-27 Thread Markos Chandras
It appears that new systemd versions (tested with v237) changed the
way RuntimeDirectory option behaves. Upstream commit 3536f49e8fa2
("core: add {State,Cache,Log,Configuration}Directory=") modified the
RuntimeDirectory code to run before every ExecStart* command instead
of running it once per service file when the service is run as 'root'.

This breaks the ovsdb-server because after the chown command was applied,
the RuntimeDirectory code was executed again, effectively wiping the
/run/openvswitch directory and creating it again resulting in the
following problem.

|2|daemon_unix|EMER|/var/run/openvswitch/ovsdb-server.pid.tmp: create 
failed (Permission denied)
Mar 19 16:37:20 susetest ovs-ctl[3045]: ovsdb-server: 
/var/run/openvswitch/ovsdb-server.pid.tmp: create failed (Permission denied)
Mar 19 16:37:20 susetest ovs-ctl[3045]: Starting ovsdb-server ... failed!

The ovs-lib code can already manage that directory for us so we can
remove these entries from the systemd file and let ovs-vsctl do it.

Cc: Aaron Conole 
Signed-off-by: Markos Chandras 
---
 rhel/usr_lib_systemd_system_ovsdb-server.service | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/rhel/usr_lib_systemd_system_ovsdb-server.service 
b/rhel/usr_lib_systemd_system_ovsdb-server.service
index 234d39355..ce496ec30 100644
--- a/rhel/usr_lib_systemd_system_ovsdb-server.service
+++ b/rhel/usr_lib_systemd_system_ovsdb-server.service
@@ -10,7 +10,6 @@ Type=forking
 Restart=on-failure
 EnvironmentFile=/etc/openvswitch/default.conf
 EnvironmentFile=-/etc/sysconfig/openvswitch
-ExecStartPre=/usr/bin/chown ${OVS_USER_ID} /var/run/openvswitch
 ExecStart=/usr/share/openvswitch/scripts/ovs-ctl \
   --no-ovs-vswitchd --no-monitor --system-id=random \
   --ovs-user=${OVS_USER_ID} \
@@ -19,5 +18,3 @@ ExecStop=/usr/share/openvswitch/scripts/ovs-ctl 
--no-ovs-vswitchd stop
 ExecReload=/usr/share/openvswitch/scripts/ovs-ctl --no-ovs-vswitchd \
--ovs-user=${OVS_USER_ID} \
--no-monitor restart $OPTIONS
-RuntimeDirectory=openvswitch
-RuntimeDirectoryMode=0755
-- 
2.16.2



Re: [ovs-dev] [PATCH 15/15] ovsdb: Introduce experimental support for clustered databases. (2/2)

2018-03-27 Thread Numan Siddique
On Sun, Mar 25, 2018 at 6:16 AM, Ben Pfaff  wrote:

> I applied this series to master, with the exception of ovn-ctl, which
> needs some extra work and will come in a separate patch to be posted
> later.
>
>
Hi Ben,

I am on it. I will address the review comments and post a new patch.

Thanks
Numan


> On Fri, Mar 23, 2018 at 03:27:24PM -0700, Ben Pfaff wrote:
> > Thank you for the review.  As before, I am applying most of your
> > comments and not bothering to respond to them, so below you can find
> > what needed a reply.
> >
> > On Sat, Feb 24, 2018 at 03:17:51PM -0800, Justin Pettit wrote:
> > > > On Dec 31, 2017, at 9:16 PM, Ben Pfaff  wrote:
> > > > diff --git a/ovsdb/raft-private.c b/ovsdb/raft-private.c
> > > > new file mode 100644
> > > > index ..457d1292a949
> > > > --- /dev/null
> > > > +++ b/ovsdb/raft-private.c
> > > >
> > > > +struct ovsdb_error * OVS_WARN_UNUSED_RESULT
> > > > +raft_address_validate(const char *address)
> > > > +{
> > > > ...
> > > > +return ovsdb_error(NULL, "%s: expected \"tcp\" or \"ssl\"
> address",
> > > > +   address);
> > >
> > > I assume you're downplaying "unix:", since it only makes sense for
> testing purposes?
> >
> > Yes.  I don't want to encourage users to use it, since it isn't useful
> > for distribution but only for testing.
> >
> > > > +char *
> > > > +raft_address_to_nickname(const char *address, const struct uuid
> *sid)
> > > > +{
> > >
> > > I think it may be worth explaining what a nickname is and how one gets
> one.
> >
> > OK, I added a comment.
> >
> > > > +void
> > > > +raft_servers_format(const struct hmap *servers, struct ds *ds)
> > > > +{
> > > > +int i = 0;
> > > > +const struct raft_server *s;
> > > > +HMAP_FOR_EACH (s, hmap_node, servers) {
> > > > +if (i++) {
> > > > +ds_put_cstr(ds, ", ");
> > > > +}
> > > > +ds_put_format(ds, SID_FMT"(%s)", SID_ARGS(&s->sid), s->address);
> > >
> > > Do you think it's worth putting a space between the SID and
> parenthetical address?
> >
> > It's actually more readable the way it is.  I spent a lot of time
> > reading this stuff.
> >
> > > > +static void
> > > > +raft_append_request_from_jsonrpc(struct ovsdb_parser *p,
> > > > + struct raft_append_request *rq)
> > > > +{
> > > > ...
> > > > +
> > > > +const struct json *log = ovsdb_parser_member(p, "log",
> OP_ARRAY);
> > > > +if (!log) {
> > > > +return;
> > > > +}
> > >
> > > Should this parser argument include OP_OPTIONAL?
> >
> > No, an append request always includes a log entry array (although I
> > guess it could be an empty array).  Can you help me to understand what
> > makes you think so?  Perhaps there is some inconsistency here that I do
> > not yet see, that I should fix.
> >
> > > > +static void
> > > > +raft_append_reply_to_jsonrpc(const struct raft_append_reply *rpy,
> > > > + struct json *args)
> > > > +{
> > > > ...
> > > > +json_object_put_string(args, "result",
> > > > +   raft_append_result_to_string(
> rpy->result));
> > >
> > > I believe raft_append_result_to_string() can return NULL, which could
> > > cause a problem here.
> >
> > The code shouldn't try to send an invalid RPC; if it does, then a
> > segfault will allow the problem to be diagnosed about as well as an
> > assertion failure would.
> >
> > > > +static void
> > > > +raft_format_append_reply(const struct raft_append_reply *rpy,
> struct ds *s)
> > > > +{
> > > > ...
> > > > +ds_put_format(s, " result=\"%s\"",
> > > > +  raft_append_result_to_string(rpy->result));
> > >
> > > > I don't think this will crash, but it will print something strange if
> > > 'rpy->result' is not valid.
> >
> > The code shouldn't try to print an invalid RPC; such an RPC, if
> > received, would fail to demarshal before it got to that point.
> >
> > > Is it intentional that "n_entries" and "prev_log_*" aren't printed?
> >
> > They aren't very useful for debugging and clutter the output.
> >
> > > > +void
> > > > +raft_rpc_format(const union raft_rpc *rpc, struct ds *s)
> > > > +{
> > > > +ds_put_format(s, "%s", raft_rpc_type_to_string(rpc->type));
> > > > +if (rpc->common.comment) {
> > > > +ds_put_format(s, " \"%s\"", rpc->common.comment);
> > > > +}
> > > > +ds_put_char(s, ':');
> > >
> > > Is printing the SID not important?
> >
> > It's better to let the caller print it, since it has more context that
> > allows for more human-friendly names (e.g. those nicknames you asked
> > about).
> >
> > I added a comment.
> >
> > > > +/* Creates a database file that represents a new server in an
> existing Raft
> > > > + * cluster.
> > > > + *
> > > > + * Creates the local copy of the cluster's log in 'file_name',
> which must not
> > > > + * already exist.  Gives it the name 'name', which must be the same
> name
> > > > + * passed in to 

[ovs-dev] [PATCH v8 6/6] Documentation: document ovs-dpdk flow offload

2018-03-27 Thread Shahaf Shuler
From: Yuanhan Liu 

Add details in the DPDK howto guide on the way to enable the offload along
with the supported NICs and flow types.

The flow offload is marked as experimental.

Signed-off-by: Yuanhan Liu 
Signed-off-by: Shahaf Shuler 
---
 Documentation/howto/dpdk.rst | 22 ++
 NEWS |  3 ++-
 2 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/Documentation/howto/dpdk.rst b/Documentation/howto/dpdk.rst
index 79b626c..c5794bc 100644
--- a/Documentation/howto/dpdk.rst
+++ b/Documentation/howto/dpdk.rst
@@ -739,3 +739,25 @@ devices to bridge ``br0``. Once complete, follow the below steps:
Check traffic on multiple queues::
 
$ cat /proc/interrupts | grep virtio
+
+.. _dpdk-flow-hardware-offload:
+
+Flow Hardware Offload (Experimental)
+------------------------------------
+
+The flow hardware offload is disabled by default and can be enabled by::
+
+$ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
+
+So far only partial flow offload is implemented. Moreover, it only works
+with PMD drivers that support the rte_flow "MARK + RSS" action.
+
+The validated NICs are:
+
+- Mellanox (ConnectX-4, ConnectX-4 Lx, ConnectX-5)
+- Napatech (NT200B01)
+
+Supported protocols for hardware offload are:
+- L2: Ethernet, VLAN
+- L3: IPv4, IPv6
+- L4: TCP, UDP, SCTP, ICMP
diff --git a/NEWS b/NEWS
index 8d0b502..f682b25 100644
--- a/NEWS
+++ b/NEWS
@@ -15,6 +15,8 @@ Post-v2.9.0
  * OFPT_ROLE_STATUS is now available in OpenFlow 1.3.
- Linux kernel 4.14
  * Add support for compiling OVS with the latest Linux 4.14 kernel
+   - DPDK:
+ * Add experimental flow hardware offload support
 
 v2.9.0 - 19 Feb 2018
 
@@ -70,7 +72,6 @@ v2.9.0 - 19 Feb 2018
  * New appctl command 'dpif-netdev/pmd-rxq-rebalance' to rebalance rxq to
pmd assignments.
  * Add rxq utilization of pmd to appctl 'dpif-netdev/pmd-rxq-show'.
- * Add support for vHost dequeue zero copy (experimental)
- Userspace datapath:
  * Output packet batching support.
- vswitchd:
-- 
2.7.4

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v8 5/6] dpif-netdev: do hw flow offload in a thread

2018-03-27 Thread Shahaf Shuler
From: Yuanhan Liu 

Currently, the major trigger for hw flow offload is upcall handling,
which runs in the datapath. Moreover, hw offload installation
and modification are not lightweight. If many flows are added or
modified frequently, this could stall the datapath, which could
result in packet loss.

To mitigate that, all those flow operations are recorded and appended
to a list. A thread is then introduced to process this list (to do the
real flow offloading put/del operations). This keeps the datapath
as lightweight as possible.
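
The queue-plus-thread arrangement described above can be sketched roughly as follows. This is only an illustration using plain pthreads, not the code in this patch: every name prefixed with sketch_ is hypothetical, the integer flow_id stands in for a real flow pointer, and the place where the hardware put/del call would happen is marked with a comment only.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical operation codes, mirroring the add/mod/del split. */
enum sketch_offload_op { SKETCH_OP_ADD, SKETCH_OP_MOD, SKETCH_OP_DEL };

struct sketch_offload_item {
    enum sketch_offload_op op;
    int flow_id;                      /* Stand-in for a flow pointer. */
    struct sketch_offload_item *next;
};

struct sketch_offload_queue {
    pthread_mutex_t mutex;
    pthread_cond_t cond;
    struct sketch_offload_item *head, *tail;
    bool stop;                        /* No more items will be queued. */
    int n_processed;                  /* Items taken off the list so far. */
};

static void
sketch_queue_init(struct sketch_offload_queue *q)
{
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->cond, NULL);
    q->head = q->tail = NULL;
    q->stop = false;
    q->n_processed = 0;
}

/* Datapath side: record the operation and return without touching hardware. */
static void
sketch_queue_push(struct sketch_offload_queue *q,
                  enum sketch_offload_op op, int flow_id)
{
    struct sketch_offload_item *item = malloc(sizeof *item);

    assert(item != NULL);
    item->op = op;
    item->flow_id = flow_id;
    item->next = NULL;

    pthread_mutex_lock(&q->mutex);
    if (q->tail) {
        q->tail->next = item;
    } else {
        q->head = item;
    }
    q->tail = item;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->mutex);
}

/* Ask the worker to exit once the list has been drained. */
static void
sketch_queue_stop(struct sketch_offload_queue *q)
{
    pthread_mutex_lock(&q->mutex);
    q->stop = true;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->mutex);
}

/* Worker thread: sleeps until work arrives, then handles one item at a time. */
static void *
sketch_offload_worker(void *arg)
{
    struct sketch_offload_queue *q = arg;

    for (;;) {
        struct sketch_offload_item *item;

        pthread_mutex_lock(&q->mutex);
        while (!q->head && !q->stop) {
            pthread_cond_wait(&q->cond, &q->mutex);
        }
        item = q->head;
        if (!item) {                  /* Stopped and nothing left to do. */
            pthread_mutex_unlock(&q->mutex);
            return NULL;
        }
        q->head = item->next;
        if (!q->head) {
            q->tail = NULL;
        }
        q->n_processed++;
        pthread_mutex_unlock(&q->mutex);

        /* The slow hardware put/del call would go here, outside the lock. */
        free(item);
    }
}
```

The point of the split is that the push side only holds the lock long enough to link one list node, so the cost seen by the datapath stays small no matter how slow the hardware call is.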

Signed-off-by: Yuanhan Liu 
Signed-off-by: Shahaf Shuler 
---
 lib/dpif-netdev.c | 348 -
 1 file changed, 258 insertions(+), 90 deletions(-)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 7489a2f..8300286 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -345,6 +345,12 @@ enum rxq_cycles_counter_type {
 RXQ_N_CYCLES
 };
 
+enum {
+DP_NETDEV_FLOW_OFFLOAD_OP_ADD,
+DP_NETDEV_FLOW_OFFLOAD_OP_MOD,
+DP_NETDEV_FLOW_OFFLOAD_OP_DEL,
+};
+
 #define XPS_TIMEOUT 50LL/* In microseconds. */
 
 /* Contained by struct dp_netdev_port's 'rxqs' member.  */
@@ -721,6 +727,8 @@ static inline bool emc_entry_alive(struct emc_entry *ce);
 static void emc_clear_entry(struct emc_entry *ce);
 
 static void dp_netdev_request_reconfigure(struct dp_netdev *dp);
+static void queue_netdev_flow_del(struct dp_netdev_pmd_thread *pmd,
+  struct dp_netdev_flow *flow);
 
 static void
 emc_cache_init(struct emc_cache *flow_cache)
@@ -1854,13 +1862,11 @@ struct flow_mark {
 struct cmap megaflow_to_mark;
 struct cmap mark_to_flow;
 struct id_pool *pool;
-struct ovs_mutex mutex;
 };
 
 static struct flow_mark flow_mark = {
 .megaflow_to_mark = CMAP_INITIALIZER,
 .mark_to_flow = CMAP_INITIALIZER,
-.mutex = OVS_MUTEX_INITIALIZER,
 };
 
 static uint32_t
@@ -2010,7 +2016,7 @@ flow_mark_flush(struct dp_netdev_pmd_thread *pmd)
 
 CMAP_FOR_EACH (flow, mark_node, &flow_mark.mark_to_flow) {
 if (flow->pmd_id == pmd->core_id) {
-mark_to_flow_disassociate(pmd, flow);
+queue_netdev_flow_del(pmd, flow);
 }
 }
 }
@@ -2032,6 +2038,251 @@ mark_to_flow_find(const struct dp_netdev_pmd_thread *pmd,
 return NULL;
 }
 
+struct dp_flow_offload_item {
+struct dp_netdev_pmd_thread *pmd;
+struct dp_netdev_flow *flow;
+int op;
+struct match match;
+struct nlattr *actions;
+size_t actions_len;
+
+struct ovs_list node;
+};
+
+struct dp_flow_offload {
+struct ovs_mutex mutex;
+struct ovs_list list;
+pthread_cond_t cond;
+};
+
+static struct dp_flow_offload dp_flow_offload = {
+.mutex = OVS_MUTEX_INITIALIZER,
+.list  = OVS_LIST_INITIALIZER(&dp_flow_offload.list),
+};
+
+static struct ovsthread_once offload_thread_once
+= OVSTHREAD_ONCE_INITIALIZER;
+
+static struct dp_flow_offload_item *
+dp_netdev_alloc_flow_offload(struct dp_netdev_pmd_thread *pmd,
+ struct dp_netdev_flow *flow,
+ int op)
+{
+struct dp_flow_offload_item *offload;
+
+offload = xzalloc(sizeof(*offload));
+offload->pmd = pmd;
+offload->flow = flow;
+offload->op = op;
+
+dp_netdev_flow_ref(flow);
+dp_netdev_pmd_try_ref(pmd);
+
+return offload;
+}
+
+static void
+dp_netdev_free_flow_offload(struct dp_flow_offload_item *offload)
+{
+dp_netdev_pmd_unref(offload->pmd);
+dp_netdev_flow_unref(offload->flow);
+
+free(offload->actions);
+free(offload);
+}
+
+static void
+dp_netdev_append_flow_offload(struct dp_flow_offload_item *offload)
+{
+ovs_mutex_lock(&dp_flow_offload.mutex);
+ovs_list_push_back(&dp_flow_offload.list, &offload->node);
+xpthread_cond_signal(&dp_flow_offload.cond);
+ovs_mutex_unlock(&dp_flow_offload.mutex);
+}
+
+static int
+dp_netdev_flow_offload_del(struct dp_flow_offload_item *offload)
+{
+return mark_to_flow_disassociate(offload->pmd, offload->flow);
+}
+
+/*
+ * There are two flow offload operations here: addition and modification.
+ *
+ * For flow addition, this function does:
+ * - allocate a new flow mark id
+ * - perform hardware flow offload
+ * - associate the flow mark with flow and mega flow
+ *
+ * For flow modification, both flow mark and the associations are still
+ * valid, thus only item 2 needed.
+ */
+static int
+dp_netdev_flow_offload_put(struct dp_flow_offload_item *offload)
+{
+struct dp_netdev_port *port;
+struct dp_netdev_pmd_thread *pmd = offload->pmd;
+struct dp_netdev_flow *flow = offload->flow;
+odp_port_t in_port = flow->flow.in_port.odp_port;
+bool modification = offload->op == DP_NETDEV_FLOW_OFFLOAD_OP_MOD;
+struct offload_info info;
+uint32_t mark;
+int ret;
+
+if (flow->dead) {
+return -1;
+}
+
+if (modification) {
+

[ovs-dev] [PATCH v8 4/6] netdev-dpdk: add debug for rte flow patterns

2018-03-27 Thread Shahaf Shuler
From: Yuanhan Liu 

For debug purpose.

Co-authored-by: Finn Christensen 
Signed-off-by: Yuanhan Liu 
Signed-off-by: Finn Christensen 
Signed-off-by: Shahaf Shuler 
---
 lib/netdev-dpdk.c | 177 +
 1 file changed, 177 insertions(+)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index df4d480..9785b1e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -3797,6 +3797,182 @@ struct flow_actions {
 };
 
 static void
+dump_flow_pattern(struct rte_flow_item *item)
+{
+if (item->type == RTE_FLOW_ITEM_TYPE_ETH) {
+const struct rte_flow_item_eth *eth_spec = item->spec;
+const struct rte_flow_item_eth *eth_mask = item->mask;
+
+VLOG_DBG("rte flow eth pattern:\n");
+if (eth_spec) {
+VLOG_DBG("  Spec: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", "
+ "type=0x%04" PRIx16"\n",
+ eth_spec->src.addr_bytes[0], eth_spec->src.addr_bytes[1],
+ eth_spec->src.addr_bytes[2], eth_spec->src.addr_bytes[3],
+ eth_spec->src.addr_bytes[4], eth_spec->src.addr_bytes[5],
+ eth_spec->dst.addr_bytes[0], eth_spec->dst.addr_bytes[1],
+ eth_spec->dst.addr_bytes[2], eth_spec->dst.addr_bytes[3],
+ eth_spec->dst.addr_bytes[4], eth_spec->dst.addr_bytes[5],
+ ntohs(eth_spec->type));
+} else {
+VLOG_DBG("  Spec = null\n");
+}
+if (eth_mask) {
+VLOG_DBG("  Mask: src="ETH_ADDR_FMT", dst="ETH_ADDR_FMT", "
+ "type=0x%04"PRIx16"\n",
+ eth_mask->src.addr_bytes[0], eth_mask->src.addr_bytes[1],
+ eth_mask->src.addr_bytes[2], eth_mask->src.addr_bytes[3],
+ eth_mask->src.addr_bytes[4], eth_mask->src.addr_bytes[5],
+ eth_mask->dst.addr_bytes[0], eth_mask->dst.addr_bytes[1],
+ eth_mask->dst.addr_bytes[2], eth_mask->dst.addr_bytes[3],
+ eth_mask->dst.addr_bytes[4], eth_mask->dst.addr_bytes[5],
+ eth_mask->type);
+} else {
+VLOG_DBG("  Mask = null\n");
+}
+}
+
+if (item->type == RTE_FLOW_ITEM_TYPE_VLAN) {
+const struct rte_flow_item_vlan *vlan_spec = item->spec;
+const struct rte_flow_item_vlan *vlan_mask = item->mask;
+
+VLOG_DBG("rte flow vlan pattern:\n");
+if (vlan_spec) {
+VLOG_DBG("  Spec: tpid=0x%"PRIx16", tci=0x%"PRIx16"\n",
+ ntohs(vlan_spec->tpid), ntohs(vlan_spec->tci));
+} else {
+VLOG_DBG("  Spec = null\n");
+}
+
+if (vlan_mask) {
+VLOG_DBG("  Mask: tpid=0x%"PRIx16", tci=0x%"PRIx16"\n",
+ vlan_mask->tpid, vlan_mask->tci);
+} else {
+VLOG_DBG("  Mask = null\n");
+}
+}
+
+if (item->type == RTE_FLOW_ITEM_TYPE_IPV4) {
+const struct rte_flow_item_ipv4 *ipv4_spec = item->spec;
+const struct rte_flow_item_ipv4 *ipv4_mask = item->mask;
+
+VLOG_DBG("rte flow ipv4 pattern:\n");
+if (ipv4_spec) {
+VLOG_DBG("  Spec: tos=0x%"PRIx8", ttl=%"PRIx8", proto=0x%"PRIx8
+ ", src="IP_FMT", dst="IP_FMT"\n",
+ ipv4_spec->hdr.type_of_service,
+ ipv4_spec->hdr.time_to_live,
+ ipv4_spec->hdr.next_proto_id,
+ IP_ARGS(ipv4_spec->hdr.src_addr),
+ IP_ARGS(ipv4_spec->hdr.dst_addr));
+} else {
+VLOG_DBG("  Spec = null\n");
+}
+if (ipv4_mask) {
+VLOG_DBG("  Mask: tos=0x%"PRIx8", ttl=%"PRIx8", proto=0x%"PRIx8
+ ", src="IP_FMT", dst="IP_FMT"\n",
+ ipv4_mask->hdr.type_of_service,
+ ipv4_mask->hdr.time_to_live,
+ ipv4_mask->hdr.next_proto_id,
+ IP_ARGS(ipv4_mask->hdr.src_addr),
+ IP_ARGS(ipv4_mask->hdr.dst_addr));
+} else {
+VLOG_DBG("  Mask = null\n");
+}
+}
+
+if (item->type == RTE_FLOW_ITEM_TYPE_UDP) {
+const struct rte_flow_item_udp *udp_spec = item->spec;
+const struct rte_flow_item_udp *udp_mask = item->mask;
+
+VLOG_DBG("rte flow udp pattern:\n");
+if (udp_spec) {
+VLOG_DBG("  Spec: src_port=%"PRIu16", dst_port=%"PRIu16"\n",
+ ntohs(udp_spec->hdr.src_port),
+ ntohs(udp_spec->hdr.dst_port));
+} else {
+VLOG_DBG("  Spec = null\n");
+}
+if (udp_mask) {
+VLOG_DBG("  Mask: src_port=0x%"PRIx16", dst_port=0x%"PRIx16"\n",
+ udp_mask->hdr.src_port,
+ udp_mask->hdr.dst_port);
+} else {
+

[ovs-dev] [PATCH v8 3/6] netdev-dpdk: implement flow offload with rte flow

2018-03-27 Thread Shahaf Shuler
From: Finn Christensen 

The basic yet major part of this patch is translating the "match"
to rte flow patterns. We then create an rte flow with MARK + RSS
actions. Afterwards, all packets that match the flow will have the mark
id in the mbuf.

RSS is needed because, for most NICs, a MARK-only action is not
allowed. It has to be used together with some other action, such as
QUEUE, RSS, etc. However, the QUEUE action can specify one queue only,
which may break RSS. The RSS action is likely the best we can do for
now. Thus, the RSS action is chosen.

For any unsupported flows, such as MPLS, -1 is returned, meaning the
flow offload has failed and is skipped.
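
As a rough illustration of the translation step and the -1 convention, here is a small sketch. It deliberately does not use the DPDK rte_flow types: the sketch_ structs, enum values and constants below are hypothetical stand-ins, and only an IPv4 + UDP/TCP match is handled. The grow-by-doubling pattern array (start at 8, double when full) follows the strategy the patch describes for its own pattern list.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for rte_flow item types. */
enum sketch_pattern_type {
    SKETCH_PATTERN_ETH,
    SKETCH_PATTERN_IPV4,
    SKETCH_PATTERN_UDP,
    SKETCH_PATTERN_TCP,
    SKETCH_PATTERN_END
};

struct sketch_patterns {
    enum sketch_pattern_type *items;
    int cnt;
    int current_max;      /* Number of elements currently allocated. */
};

/* Hypothetical, heavily reduced view of a match. */
struct sketch_match {
    uint16_t eth_type;
    uint8_t ip_proto;
};

#define SKETCH_ETH_TYPE_IPV4 0x0800
#define SKETCH_ETH_TYPE_MPLS 0x8847
#define SKETCH_IP_PROTO_TCP  6
#define SKETCH_IP_PROTO_UDP  17

/* Appends one pattern; the array starts at 8 slots and doubles when full. */
static void
sketch_add_pattern(struct sketch_patterns *p, enum sketch_pattern_type type)
{
    if (p->cnt == 0) {
        p->current_max = 8;
        p->items = malloc(p->current_max * sizeof *p->items);
    } else if (p->cnt == p->current_max) {
        p->current_max *= 2;
        p->items = realloc(p->items, p->current_max * sizeof *p->items);
    }
    if (!p->items) {
        abort();          /* Out of memory; xmalloc-style behaviour. */
    }
    p->items[p->cnt++] = type;
}

/* Returns 0 on success, -1 if the match cannot be offloaded.
 * 'p' must start out zeroed ({ NULL, 0, 0 }). */
static int
sketch_translate_match(const struct sketch_match *m,
                       struct sketch_patterns *p)
{
    /* Reject unsupported protocols (e.g. MPLS) before building anything. */
    if (m->eth_type != SKETCH_ETH_TYPE_IPV4) {
        return -1;
    }
    if (m->ip_proto != SKETCH_IP_PROTO_UDP
        && m->ip_proto != SKETCH_IP_PROTO_TCP) {
        return -1;
    }

    sketch_add_pattern(p, SKETCH_PATTERN_ETH);
    sketch_add_pattern(p, SKETCH_PATTERN_IPV4);
    sketch_add_pattern(p, m->ip_proto == SKETCH_IP_PROTO_UDP
                          ? SKETCH_PATTERN_UDP : SKETCH_PATTERN_TCP);
    sketch_add_pattern(p, SKETCH_PATTERN_END);
    return 0;
}
```

A caller that gets -1 back simply leaves the flow in software, which is the "failed and then skipped" behaviour described above.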

Co-authored-by: Yuanhan Liu 
Signed-off-by: Finn Christensen 
Signed-off-by: Yuanhan Liu 
Signed-off-by: Shahaf Shuler 
---
 lib/netdev-dpdk.c | 563 -
 1 file changed, 562 insertions(+), 1 deletion(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index af9843a..df4d480 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -38,7 +38,9 @@
 #include 
 #include 
 #include 
+#include 
 
+#include "cmap.h"
 #include "dirs.h"
 #include "dp-packet.h"
 #include "dpdk.h"
@@ -51,6 +53,7 @@
 #include "openvswitch/list.h"
 #include "openvswitch/ofp-print.h"
 #include "openvswitch/vlog.h"
+#include "openvswitch/match.h"
 #include "ovs-numa.h"
 #include "ovs-thread.h"
 #include "ovs-rcu.h"
@@ -60,6 +63,7 @@
 #include "sset.h"
 #include "unaligned.h"
 #include "timeval.h"
+#include "uuid.h"
 #include "unixctl.h"
 
 enum {VIRTIO_RXQ, VIRTIO_TXQ, VIRTIO_QNUM};
@@ -170,6 +174,17 @@ static const struct rte_eth_conf port_conf = {
 };
 
 /*
+ * A mapping from ufid to dpdk rte_flow.
+ */
+static struct cmap ufid_to_rte_flow = CMAP_INITIALIZER;
+
+struct ufid_to_rte_flow_data {
+struct cmap_node node;
+ovs_u128 ufid;
+struct rte_flow *rte_flow;
+};
+
+/*
  * These callbacks allow virtio-net devices to be added to vhost ports when
  * configuration has been fully completed.
  */
@@ -3709,6 +3724,552 @@ unlock:
 return err;
 }
 
+
+/* Find rte_flow with @ufid */
+static struct rte_flow *
+ufid_to_rte_flow_find(const ovs_u128 *ufid) {
+size_t hash = hash_bytes(ufid, sizeof(*ufid), 0);
+struct ufid_to_rte_flow_data *data;
+
+CMAP_FOR_EACH_WITH_HASH (data, node, hash, &ufid_to_rte_flow) {
+if (ovs_u128_equals(*ufid, data->ufid)) {
+return data->rte_flow;
+}
+}
+
+return NULL;
+}
+
+static inline void
+ufid_to_rte_flow_associate(const ovs_u128 *ufid,
+   struct rte_flow *rte_flow) {
+size_t hash = hash_bytes(ufid, sizeof(*ufid), 0);
+struct ufid_to_rte_flow_data *data = xzalloc(sizeof(*data));
+
+/*
+ * We should not simply overwrite an existing rte flow.
+ * We should have deleted it first before re-adding it.
+ * Thus, if following assert triggers, something is wrong:
+ * the rte_flow is not destroyed.
+ */
+ovs_assert(ufid_to_rte_flow_find(ufid) == NULL);
+
+data->ufid = *ufid;
+data->rte_flow = rte_flow;
+
+cmap_insert(&ufid_to_rte_flow,
+CONST_CAST(struct cmap_node *, &data->node), hash);
+}
+
+static inline void
+ufid_to_rte_flow_disassociate(const ovs_u128 *ufid) {
+size_t hash = hash_bytes(ufid, sizeof(*ufid), 0);
+struct ufid_to_rte_flow_data *data;
+
+CMAP_FOR_EACH_WITH_HASH (data, node, hash, &ufid_to_rte_flow) {
+if (ovs_u128_equals(*ufid, data->ufid)) {
+cmap_remove(&ufid_to_rte_flow,
+CONST_CAST(struct cmap_node *, &data->node), hash);
+free(data);
+return;
+}
+}
+
+VLOG_WARN("ufid "UUID_FMT" is not associated with an rte flow\n",
+  UUID_ARGS((struct uuid *)ufid));
+}
+
+/*
+ * To avoid individual xrealloc calls for each new element, a 'current_max'
+ * is used to keep track of the number of elements currently allocated. It
+ * starts at 8 and doubles on each xrealloc call.
+ */
+struct flow_patterns {
+struct rte_flow_item *items;
+int cnt;
+int current_max;
+};
+
+struct flow_actions {
+struct rte_flow_action *actions;
+int cnt;
+int current_max;
+};
+
+static void
+add_flow_pattern(struct flow_patterns *patterns, enum rte_flow_item_type type,
+ const void *spec, const void *mask) {
+int cnt = patterns->cnt;
+
+if (cnt == 0) {
+patterns->current_max = 8;
+patterns->items = xcalloc(patterns->current_max, sizeof(struct rte_flow_item));
+} else if (cnt == patterns->current_max) {
+patterns->current_max *= 2;
+patterns->items = xrealloc(patterns->items, patterns->current_max *
+   sizeof(struct rte_flow_item));
+}
+
+patterns->items[cnt].type = type;
+patterns->items[cnt].spec = spec;
+patterns->items[cnt].mask = mask;
+patterns->items[cnt].last = NULL;

[ovs-dev] [PATCH v8 2/6] dpif-netdev: retrieve flow directly from the flow mark

2018-03-27 Thread Shahaf Shuler
From: Yuanhan Liu 

So that we can skip some very costly CPU operations, including but
not limited to miniflow_extract, emc lookup, dpcls lookup, etc. Thus,
performance could be greatly improved.

A PHY-PHY forwarding test with 1000 mega flows (udp,tp_src=1000-1999) and
1 million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than
a 260% performance boost.

Note that though the heavy miniflow_extract is skipped, we still have
to do per-packet checking, because we have to check the tcp_flags.
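
The receive-side shortcut can be pictured with the following sketch. It is a simplified model, not the code in this patch: the sketch_ types are hypothetical, the linear scan stands in for the CMAP lookup, and the packet's tcp_flags field is assumed to have been parsed already.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_flow {
    uint32_t mark;
    unsigned pmd_id;
    bool dead;
};

struct sketch_packet {
    bool has_mark;        /* Did the NIC attach a mark to this packet? */
    uint32_t mark;
    uint16_t tcp_flags;   /* Assumed already parsed from the headers. */
};

/* Linear stand-in for the mark -> flow lookup. */
static struct sketch_flow *
sketch_mark_to_flow_find(struct sketch_flow *flows, size_t n,
                         unsigned pmd_id, uint32_t mark)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (flows[i].mark == mark && flows[i].pmd_id == pmd_id
            && !flows[i].dead) {
            return &flows[i];
        }
    }
    return NULL;
}

/* Returns the flow found via the mark, or NULL if the caller must fall
 * back to the normal miniflow_extract + emc/dpcls path.  On a hit, the
 * packet's TCP flags are still accumulated into '*tcp_flags'. */
static struct sketch_flow *
sketch_fast_path(struct sketch_flow *flows, size_t n, unsigned pmd_id,
                 const struct sketch_packet *pkt, uint16_t *tcp_flags)
{
    struct sketch_flow *flow;

    if (!pkt->has_mark) {
        return NULL;
    }
    flow = sketch_mark_to_flow_find(flows, n, pmd_id, pkt->mark);
    if (flow) {
        *tcp_flags |= pkt->tcp_flags;
    }
    return flow;
}
```

A NULL return is not an error here; it just means the packet takes the existing software classification path.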

Co-authored-by: Finn Christensen 
Signed-off-by: Yuanhan Liu 
Signed-off-by: Finn Christensen 
Signed-off-by: Shahaf Shuler 
---
 lib/dp-packet.h   |  13 +
 lib/dpif-netdev.c |  44 --
 lib/flow.c| 155 +++--
 lib/flow.h|   1 +
 4 files changed, 175 insertions(+), 38 deletions(-)

diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 21c8ca5..dd3f17b 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -691,6 +691,19 @@ reset_dp_packet_checksum_ol_flags(struct dp_packet *p)
 #define reset_dp_packet_checksum_ol_flags(arg)
 #endif
 
+static inline bool
+dp_packet_has_flow_mark(struct dp_packet *p OVS_UNUSED,
+uint32_t *mark OVS_UNUSED)
+{
+#ifdef DPDK_NETDEV
+if (p->mbuf.ol_flags & PKT_RX_FDIR_ID) {
+*mark = p->mbuf.hash.fdir.hi;
+return true;
+}
+#endif
+return false;
+}
+
 enum { NETDEV_MAX_BURST = 32 }; /* Maximum number packets in a batch. */
 
 struct dp_packet_batch {
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 2fdb6ef..7489a2f 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -2015,6 +2015,23 @@ flow_mark_flush(struct dp_netdev_pmd_thread *pmd)
 }
 }
 
+static struct dp_netdev_flow *
+mark_to_flow_find(const struct dp_netdev_pmd_thread *pmd,
+  const uint32_t mark)
+{
+struct dp_netdev_flow *flow;
+
+CMAP_FOR_EACH_WITH_HASH (flow, mark_node, hash_int(mark, 0),
+ &flow_mark.mark_to_flow) {
+if (flow->mark == mark && flow->pmd_id == pmd->core_id &&
+flow->dead == false) {
+return flow;
+}
+}
+
+return NULL;
+}
+
 static void
 dp_netdev_pmd_remove_flow(struct dp_netdev_pmd_thread *pmd,
   struct dp_netdev_flow *flow)
@@ -5204,10 +5221,10 @@ struct packet_batch_per_flow {
 static inline void
 packet_batch_per_flow_update(struct packet_batch_per_flow *batch,
  struct dp_packet *packet,
- const struct miniflow *mf)
+ uint16_t tcp_flags)
 {
 batch->byte_count += dp_packet_size(packet);
-batch->tcp_flags |= miniflow_get_tcp_flags(mf);
+batch->tcp_flags |= tcp_flags;
 batch->array.packets[batch->array.count++] = packet;
 }
 
@@ -5241,7 +5258,7 @@ packet_batch_per_flow_execute(struct packet_batch_per_flow *batch,
 
 static inline void
 dp_netdev_queue_batches(struct dp_packet *pkt,
-struct dp_netdev_flow *flow, const struct miniflow *mf,
+struct dp_netdev_flow *flow, uint16_t tcp_flags,
 struct packet_batch_per_flow *batches,
 size_t *n_batches)
 {
@@ -5252,7 +5269,7 @@ dp_netdev_queue_batches(struct dp_packet *pkt,
 packet_batch_per_flow_init(batch, flow);
 }
 
-packet_batch_per_flow_update(batch, pkt, mf);
+packet_batch_per_flow_update(batch, pkt, tcp_flags);
 }
 
 /* Try to process all ('cnt') the 'packets' using only the exact match cache
@@ -5283,6 +5300,7 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
 const size_t cnt = dp_packet_batch_size(packets_);
 uint32_t cur_min;
 int i;
+uint16_t tcp_flags;
 
 atomic_read_relaxed(&pmd->dp->emc_insert_min, &cur_min);
 pmd_perf_update_counter(&pmd->perf_stats,
@@ -5291,6 +5309,7 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
 
 DP_PACKET_BATCH_REFILL_FOR_EACH (i, cnt, packet, packets_) {
 struct dp_netdev_flow *flow;
+uint32_t mark;
 
 if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
 dp_packet_delete(packet);
@@ -5298,6 +5317,16 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
 continue;
 }
 
+if (dp_packet_has_flow_mark(packet, &mark)) {
+flow = mark_to_flow_find(pmd, mark);
+if (flow) {
+tcp_flags = parse_tcp_flags(packet);
+dp_netdev_queue_batches(packet, flow, tcp_flags, batches,
+n_batches);
+continue;
+}
+}
+
 if (i != cnt - 1) {
 struct dp_packet **packets = packets_->packets;
 /* Prefetch next packet data and metadata. */
@@ -5323,7 +5352,8 @@ emc_processing(struct dp_netdev_pmd_thread *pmd,
 flow = NULL;
 }

[ovs-dev] [PATCH v8 1/6] dpif-netdev: associate flow with a mark id

2018-03-27 Thread Shahaf Shuler
From: Yuanhan Liu 

Most modern NICs have the ability to bind a flow with a mark, so that
every packet that matches such a flow will have that mark present in its
descriptor.

The basic idea is that, when we receive packets later, we can
get the flow directly from the mark. That avoids some very costly
CPU operations, including (but not limited to) miniflow_extract, emc
lookup, dpcls lookup, etc. Thus, performance could be greatly improved.

Thus, the major work of this patch is to associate a flow with a mark
id (a uint32_t number). The association in the netdev datapath is done
by CMAP, while in hardware it's done by the rte_flow MARK action.

One tricky thing in OVS-DPDK is that the flow tables are per-PMD. For the
case where there is only one phys port but with 2 queues, there could be 2
PMDs. In other words, even for a single mega flow (e.g. udp,tp_src=1000),
there could be 2 different dp_netdev flows, one for each PMD. That could
result in the same mega flow being offloaded twice in the hardware;
worse, we may get 2 different marks and only the last one will work.

To avoid that, a megaflow_to_mark CMAP is created. An entry will be
added for the first PMD that wants to offload a flow. Later PMDs
will see that such a megaflow is already offloaded, so the flow will not
be offloaded to HW twice.

Meanwhile, the mark to flow mapping becomes a 1:N mapping. That is
what the mark_to_flow CMAP is for. When the first PMD wants to offload
a flow, it allocates a new mark and performs the flow offload by reusing
the ->flow_put method. When it succeeds, a "mark to flow" entry will be
added. Later PMDs will get the corresponding mark from the above
megaflow_to_mark CMAP. Then, another "mark to flow" entry will be added.
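
The "offload once, share the mark" rule can be illustrated with a small sketch. It is a toy model under stated assumptions, not the code in this patch: a fixed-size array replaces the CMAPs, a 64-bit integer stands in for the 128-bit mega ufid, marks are handed out by a simple counter, and all sketch_ names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_INVALID_MARK  UINT32_MAX
#define SKETCH_MAX_MEGAFLOWS 16

struct sketch_megaflow_entry {
    bool in_use;
    uint64_t mega_ufid;   /* Stand-in for the 128-bit mega ufid. */
    uint32_t mark;
    int n_flows;          /* Per-PMD flows sharing this mark (the 1:N side). */
};

struct sketch_mark_table {
    struct sketch_megaflow_entry entries[SKETCH_MAX_MEGAFLOWS];
    uint32_t next_mark;
    int n_hw_offloads;    /* How many times "hardware" was programmed. */
};

/* Called once per PMD that wants to offload its copy of a megaflow.
 * Only the first caller for a given mega_ufid triggers a hardware
 * offload; later callers reuse the existing mark.  The table must
 * start out zeroed. */
static uint32_t
sketch_offload_flow(struct sketch_mark_table *t, uint64_t mega_ufid)
{
    struct sketch_megaflow_entry *free_slot = NULL;
    size_t i;

    for (i = 0; i < SKETCH_MAX_MEGAFLOWS; i++) {
        struct sketch_megaflow_entry *e = &t->entries[i];

        if (e->in_use && e->mega_ufid == mega_ufid) {
            e->n_flows++;             /* Another "mark to flow" entry. */
            return e->mark;
        }
        if (!e->in_use && !free_slot) {
            free_slot = e;
        }
    }
    if (!free_slot) {
        return SKETCH_INVALID_MARK;   /* Table full; stay in software. */
    }

    free_slot->in_use = true;
    free_slot->mega_ufid = mega_ufid;
    free_slot->mark = t->next_mark++;
    free_slot->n_flows = 1;
    t->n_hw_offloads++;               /* The real flow_put would go here. */
    return free_slot->mark;
}
```

With two PMDs offloading the same megaflow, both calls return the same mark and the hardware counter only increases once, which is the property the megaflow_to_mark map exists to guarantee.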

Co-authored-by: Finn Christensen 
Signed-off-by: Yuanhan Liu 
Signed-off-by: Finn Christensen 
Signed-off-by: Shahaf Shuler 
---
 lib/dpif-netdev.c | 285 +
 lib/netdev.h  |   6 ++
 2 files changed, 291 insertions(+)

diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b07fc6b..2fdb6ef 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -75,6 +75,7 @@
 #include "tnl-ports.h"
 #include "unixctl.h"
 #include "util.h"
+#include "uuid.h"
 
 VLOG_DEFINE_THIS_MODULE(dpif_netdev);
 
@@ -430,7 +431,9 @@ struct dp_netdev_flow {
 /* Hash table index by unmasked flow. */
 const struct cmap_node node; /* In owning dp_netdev_pmd_thread's */
  /* 'flow_table'. */
+const struct cmap_node mark_node; /* In owning flow_mark's mark_to_flow */
 const ovs_u128 ufid; /* Unique flow identifier. */
+const ovs_u128 mega_ufid;/* Unique mega flow identifier. */
 const unsigned pmd_id;   /* The 'core_id' of pmd thread owning this */
  /* flow. */
 
@@ -441,6 +444,7 @@ struct dp_netdev_flow {
 struct ovs_refcount ref_cnt;
 
 bool dead;
+uint32_t mark;   /* Unique flow mark assigned to a flow */
 
 /* Statistics. */
 struct dp_netdev_flow_stats stats;
@@ -1837,6 +1841,180 @@ dp_netdev_pmd_find_dpcls(struct dp_netdev_pmd_thread *pmd,
 return cls;
 }
 
+#define MAX_FLOW_MARK   (UINT32_MAX - 1)
+#define INVALID_FLOW_MARK   (UINT32_MAX)
+
+struct megaflow_to_mark_data {
+const struct cmap_node node;
+ovs_u128 mega_ufid;
+uint32_t mark;
+};
+
+struct flow_mark {
+struct cmap megaflow_to_mark;
+struct cmap mark_to_flow;
+struct id_pool *pool;
+struct ovs_mutex mutex;
+};
+
+static struct flow_mark flow_mark = {
+.megaflow_to_mark = CMAP_INITIALIZER,
+.mark_to_flow = CMAP_INITIALIZER,
+.mutex = OVS_MUTEX_INITIALIZER,
+};
+
+static uint32_t
+flow_mark_alloc(void)
+{
+uint32_t mark;
+
+if (!flow_mark.pool) {
+/* Not initialized yet, do it here */
+flow_mark.pool = id_pool_create(0, MAX_FLOW_MARK);
+}
+
+if (id_pool_alloc_id(flow_mark.pool, &mark)) {
+return mark;
+}
+
+return INVALID_FLOW_MARK;
+}
+
+static void
+flow_mark_free(uint32_t mark)
+{
+id_pool_free_id(flow_mark.pool, mark);
+}
+
+/* associate megaflow with a mark, which is a 1:1 mapping */
+static void
+megaflow_to_mark_associate(const ovs_u128 *mega_ufid, uint32_t mark)
+{
+size_t hash = dp_netdev_flow_hash(mega_ufid);
+struct megaflow_to_mark_data *data = xzalloc(sizeof(*data));
+
+data->mega_ufid = *mega_ufid;
+data->mark = mark;
+
+cmap_insert(&flow_mark.megaflow_to_mark,
+CONST_CAST(struct cmap_node *, &data->node), hash);
+}
+
+/* disassociate megaflow with a mark */
+static void
+megaflow_to_mark_disassociate(const ovs_u128 *mega_ufid)
+{
+size_t hash = dp_netdev_flow_hash(mega_ufid);
+struct megaflow_to_mark_data *data;
+
+CMAP_FOR_EACH_WITH_HASH (data, node, hash, &flow_mark.megaflow_to_mark) {
+if (ovs_u128_equals(*mega_ufid, 

[ovs-dev] [PATCH v8 0/6] OVS-DPDK flow offload with rte_flow

2018-03-27 Thread Shahaf Shuler
Hi,

Here is a joint work from Mellanox and Napatech, to enable the flow hw
offload with the DPDK generic flow interface (rte_flow).

The basic idea is to associate the flow with a mark id (a uint32_t number).
Later, we then get the flow directly from the mark id, which could bypass
some heavy CPU operations, including but not limited to mini flow extract,
emc lookup, dpcls lookup, etc.

The association is done with CMAP in patch 1. The CPU workload bypassing
is done in patch 2. The flow offload is done in patch 3, which mainly does
two things:

- translate the ovs match to DPDK rte flow patterns
- bind those patterns with a RSS + MARK action.

Patch 5 makes the offload work happen in another thread, to keep the
datapath as light as possible.

A PHY-PHY forwarding test with 1000 mega flows (udp,tp_src=1000-1999) and 1
million streams (tp_src=1000-1999, tp_dst=2000-2999) shows more than a 260%
performance boost.

Note that it's disabled by default; it can be enabled by:

$ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true

v8: - enhanced documentation with more details on supported protocols
- fixed VLOG to start with capital letter
- fixed compilation issues
- fixed coding style 
- addressed the rest of Ian's comments 

v7: - fixed wrong hash for mark_to_flow that has been refactored in v6
- set the rss_conf for rss action to NULL, to work around a mlx5 change
  in DPDK v17.11. Note that it will obey the rss settings OVS-DPDK has
  set in the beginning. Thus, nothing should be affected.

v6: - fixed a sparse warning
- added documentation
- used hash_int to compute mark to flow hash
- added more comments
- added lock for port lookup
- rebased on top of the latest code

v5: - fixed an issue that it took too long if we do flow add/remove
  repeatedly.
- removed an unused mutex lock
- turned most of the log level to DBG
- rebased on top of the latest code

v4: - use RSS action instead of QUEUE action with MARK
- make it work with multiple queue (see patch 1)
- rebased on top of latest code

v3: - The mark and id association is done with array instead of CMAP.
- Added a thread to do hw offload operations
- Removed macros completely
- dropped the patch to set FDIR_CONF, which is a workaround for some
  Intel NICs.
- Added a debug patch to show all flow patterns we have created.
- Misc fixes

v2: - worked around the queue action issue
- fixed the tcp_flags being skipped issue, which also fixed the
  build warnings
- fixed l2 patterns for Intel nic
- Converted some macros to functions
- did not hardcode the max number of flow/action
- rebased on top of the latest code

Thanks.

---

Finn Christensen (1):
  netdev-dpdk: implement flow offload with rte flow

Yuanhan Liu (5):
  dpif-netdev: associate flow with a mark id
  dpif-netdev: retrieve flow directly from the flow mark
  netdev-dpdk: add debug for rte flow patterns
  dpif-netdev: do hw flow offload in a thread
  Documentation: document ovs-dpdk flow offload

 Documentation/howto/dpdk.rst |  22 ++
 NEWS |   3 +-
 lib/dp-packet.h  |  13 +
 lib/dpif-netdev.c| 497 -
 lib/flow.c   | 155 ++--
 lib/flow.h   |   1 +
 lib/netdev-dpdk.c| 740 +-
 lib/netdev.h |   6 +
 8 files changed, 1397 insertions(+), 40 deletions(-)

-- 
2.7.4
