from:"Flavio Leitner via dev"

[ovs-dev] [PATCH] ovsdb-idlc.in: fix dict change during iteration.

2019-09-14 Thread Flavio Leitner via dev

Python3 complains if a dict key is changed during the
iteration.

Use list() to create a copy of it.

Traceback (most recent call last):
  File "./ovsdb/ovsdb-idlc.in", line 1581, in 
func(*args[1:])
  File "./ovsdb/ovsdb-idlc.in", line 185, in printCIDLHeader
replace_cplusplus_keyword(schema)
  File "./ovsdb/ovsdb-idlc.in", line 179, in replace_cplusplus_keyword
for columnName in table.columns:
RuntimeError: dictionary keys changed during iteration

Signed-off-by: Flavio Leitner 
---
 ovsdb/ovsdb-idlc.in | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovsdb/ovsdb-idlc.in b/ovsdb/ovsdb-idlc.in
index 40fef39ed..22d0a4e22 100755
--- a/ovsdb/ovsdb-idlc.in
+++ b/ovsdb/ovsdb-idlc.in
@@ -176,7 +176,7 @@ def replace_cplusplus_keyword(schema):
 'wchar_t', 'while', 'xor', 'xor_eq'}
 
 for tableName, table in schema.tables.items():
-for columnName in table.columns:
+for columnName in list(table.columns):
 if columnName in keywords:
 table.columns[columnName + '_'] = table.columns.pop(columnName)
 
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.9 1/2] Set release date for 2.9.6.

2019-09-04 Thread Flavio Leitner via dev

On Tue, Sep 03, 2019 at 03:17:03PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 
> ---

Also to the set.
Acked-by: Flavio Leitner 

thanks Justin
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.10 1/2] Set release date for 2.10.3.

2019-09-04 Thread Flavio Leitner via dev

On Tue, Sep 03, 2019 at 03:13:54PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 
> ---
To the set
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.11 1/2] Set release date for 2.11.2.

2019-09-04 Thread Flavio Leitner via dev

On Tue, Sep 03, 2019 at 03:10:17PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 
> ---

To the set
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [branch-2.12 1/2] Set release date for 2.12.0.

2019-09-04 Thread Flavio Leitner via dev

On Tue, Sep 03, 2019 at 03:02:59PM -0700, Justin Pettit wrote:
> Signed-off-by: Justin Pettit 
> ---

To the set
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] tnl-neigh: Use outgoing ofproto version.

2019-08-28 Thread Flavio Leitner via dev

On Wed, Aug 28, 2019 at 11:29:54AM -0700, Ben Pfaff wrote:
> On Tue, Aug 13, 2019 at 01:34:04PM -0300, Flavio Leitner via dev wrote:
> > When a packet needs to be encapsulated in userspace, the endpoint
> > address needs to be resolved to fill in the headers. If it is not,
> > then currently OvS sends either a Neighbor Solicitation (IPv6)
> > or an ARP Query (IPv4) to resolve it.
> 
> Thanks, I applied this to master and backported it as far as it would go.

Hi Ben,

Sorry, but 2.9, 2.8 and 2.7 have a different output indentation which
breaks the testsuite.

I sent out a patch to fix those branches:
https://mail.openvswitch.org/pipermail/ovs-dev/2019-August/362166.html

Thanks,
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH 2.9 2.8 2.7] tests: fix output indentation in tunnel.at

2019-08-28 Thread Flavio Leitner via dev

The commit b1356b50aa6a ("tnl-neigh: Use outgoing ofproto version.")
uses the output indentation from newer OvS versions which include the
commit 7be29a47576d ("ofproto-dpif: Remove tabs from output.").

Fixes: b1356b50aa6a ("tnl-neigh: Use outgoing ofproto version.")
Signed-off-by: Flavio Leitner 
---
 tests/tunnel.at | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/tests/tunnel.at b/tests/tunnel.at
index fd9fe04b6..38f7f48d8 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -408,13 +408,13 @@ AT_CHECK([ovs-vsctl add-port int-br v1 -- set Interface 
v1 type=vxlan \
 
 AT_CHECK([ovs-appctl dpif/show], [0], [dnl
 dummy@ovs-dummy: hit:0 missed:0
-  br0:
-br0 65534/100: (dummy-internal)
-p0 1/1: (dummy)
-  int-br:
-int-br 65534/2: (dummy-internal)
-v1 2/4789: (vxlan: key=123, remote_ip=172.31.1.2)
-v2 3/3: (dummy-internal)
+   br0:
+   br0 65534/100: (dummy-internal)
+   p0 1/1: (dummy)
+   int-br:
+   int-br 65534/2: (dummy-internal)
+   v1 2/4789: (vxlan: key=123, remote_ip=172.31.1.2)
+   v2 3/3: (dummy-internal)
 ])
 
 dnl First setup dummy interface IP address, then add the route
-- 
2.20.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] tnl-neigh: Use outgoing ofproto version.

2019-08-28 Thread Flavio Leitner via dev

On Wed, Aug 28, 2019 at 11:29:54AM -0700, Ben Pfaff wrote:
> On Tue, Aug 13, 2019 at 01:34:04PM -0300, Flavio Leitner via dev wrote:
> > When a packet needs to be encapsulated in userspace, the endpoint
> > address needs to be resolved to fill in the headers. If it is not,
> > then currently OvS sends either a Neighbor Solicitation (IPv6)
> > or an ARP Query (IPv4) to resolve it.
> 
> Thanks, I applied this to master and backported it as far as it would go.

Thanks a lot Ben!
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] tnl-neigh: Use outgoing ofproto version.

2019-08-28 Thread Flavio Leitner via dev



Hi Ben and Ilya,

This patch has two reviews, so any chance for you to take a look
soon as well?

Thanks in advance!
fbl

On Tue, Aug 13, 2019 at 01:34:04PM -0300, Flavio Leitner via dev wrote:
> When a packet needs to be encapsulated in userspace, the endpoint
> address needs to be resolved to fill in the headers. If it is not,
> then currently OvS sends either a Neighbor Solicitation (IPv6)
> or an ARP Query (IPv4) to resolve it.
> 
> The problem is that the NS/ARP packet will go through the flow
> rules in the new bridge, but inheriting the ofproto table version
> from the original packet to be encapsulated. When those versions
> don't match, the result is unexpected because no flow rules might
> be visible, which would cause the default table rule to be used
> to drop the packet. Or only part of the flow rules would be visible
> and so on.
> 
> Since the NS/ARP packet is created by OvS and will be injected in
> the outgoing bridge, use the corresponding ofproto version instead.
> 
> Signed-off-by: Flavio Leitner 
> ---
>  ofproto/ofproto-dpif-xlate.c |  4 +--
>  tests/tunnel.at  | 62 
>  2 files changed, 64 insertions(+), 2 deletions(-)
> 
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 28a7fdd84..5a8a46370 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3414,6 +3414,7 @@ compose_table_xlate(struct xlate_ctx *ctx, const struct 
> xport *out_dev,
>  struct dp_packet *packet)
>  {
>  struct xbridge *xbridge = out_dev->xbridge;
> +ovs_version_t version = 
> ofproto_dpif_get_tables_version(xbridge->ofproto);
>  struct ofpact_output output;
>  struct flow flow;
>  
> @@ -3423,8 +3424,7 @@ compose_table_xlate(struct xlate_ctx *ctx, const struct 
> xport *out_dev,
>  output.port = OFPP_TABLE;
>  output.max_len = 0;
>  
> -return ofproto_dpif_execute_actions__(xbridge->ofproto,
> -  ctx->xin->tables_version, ,
> +return ofproto_dpif_execute_actions__(xbridge->ofproto, version, ,
>NULL, , sizeof 
> output,
>ctx->depth, ctx->resubmits, 
> packet);
>  }
> diff --git a/tests/tunnel.at b/tests/tunnel.at
> index fc6f87936..faffb4149 100644
> --- a/tests/tunnel.at
> +++ b/tests/tunnel.at
> @@ -394,6 +394,68 @@ AT_CHECK([ovs-appctl dpif/show | tail -n +3], [0], [dnl
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
>  
> +AT_SETUP([tunnel - table version])
> +dnl check if changes in the egress bridge flow table affects
> +dnl discovering the link layer address of tunnel endpoints.
> +OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy 
> ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00])
> +AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], 
> [0])
> +AT_CHECK([ovs-vsctl add-port int-br v1 -- set Interface v1 type=vxlan \
> +   options:remote_ip=172.31.1.2 options:key=123 \
> +   ofport_request=2 \
> + -- add-port int-br v2 -- set Interface v2 type=internal \
> +   ofport_request=3 \
> +   ], [0])
> +
> +AT_CHECK([ovs-appctl dpif/show], [0], [dnl
> +dummy@ovs-dummy: hit:0 missed:0
> +  br0:
> +br0 65534/100: (dummy-internal)
> +p0 1/1: (dummy)
> +  int-br:
> +int-br 65534/2: (dummy-internal)
> +v1 2/4789: (vxlan: key=123, remote_ip=172.31.1.2)
> +v2 3/3: (dummy-internal)
> +])
> +
> +dnl First setup dummy interface IP address, then add the route
> +dnl so that tnl-port table can get valid IP address for the device.
> +AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 172.31.1.1/24], [0], [OK
> +])
> +AT_CHECK([ovs-appctl ovs/route/add 172.31.1.0/24 br0], [0], [OK
> +])
> +
> +dnl change the flow table to bump the internal table version
> +AT_CHECK([ovs-ofctl add-flow int-br action=normal])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-flow br0 action=normal])
> +AT_CHECK([ovs-ofctl del-flows br0])
> +AT_CHECK([ovs-ofctl add-f

Re: [ovs-dev] [PATCH] tnl-neigh: Use outgoing ofproto version.

2019-08-13 Thread Flavio Leitner via dev

On Tue, Aug 13, 2019 at 01:34:04PM -0300, Flavio Leitner via dev wrote:
> When a packet needs to be encapsulated in userspace, the endpoint
> address needs to be resolved to fill in the headers. If it is not,
> then currently OvS sends either a Neighbor Solicitation (IPv6)
> or an ARP Query (IPv4) to resolve it.
> 
> The problem is that the NS/ARP packet will go through the flow
> rules in the new bridge, but inheriting the ofproto table version
> from the original packet to be encapsulated. When those versions
> don't match, the result is unexpected because no flow rules might
> be visible, which would cause the default table rule to be used
> to drop the packet. Or only part of the flow rules would be visible
> and so on.
> 
> Since the NS/ARP packet is created by OvS and will be injected in
> the outgoing bridge, use the corresponding ofproto version instead.

Hi,

Please backport this up to 2.9 at least.
Thanks,
fbl


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

[ovs-dev] [PATCH] tnl-neigh: Use outgoing ofproto version.

2019-08-13 Thread Flavio Leitner via dev

When a packet needs to be encapsulated in userspace, the endpoint
address needs to be resolved to fill in the headers. If it is not,
then currently OvS sends either a Neighbor Solicitation (IPv6)
or an ARP Query (IPv4) to resolve it.

The problem is that the NS/ARP packet will go through the flow
rules in the new bridge, but inheriting the ofproto table version
from the original packet to be encapsulated. When those versions
don't match, the result is unexpected because no flow rules might
be visible, which would cause the default table rule to be used
to drop the packet. Or only part of the flow rules would be visible
and so on.

Since the NS/ARP packet is created by OvS and will be injected in
the outgoing bridge, use the corresponding ofproto version instead.

Signed-off-by: Flavio Leitner 
---
 ofproto/ofproto-dpif-xlate.c |  4 +--
 tests/tunnel.at  | 62 
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
index 28a7fdd84..5a8a46370 100644
--- a/ofproto/ofproto-dpif-xlate.c
+++ b/ofproto/ofproto-dpif-xlate.c
@@ -3414,6 +3414,7 @@ compose_table_xlate(struct xlate_ctx *ctx, const struct 
xport *out_dev,
 struct dp_packet *packet)
 {
 struct xbridge *xbridge = out_dev->xbridge;
+ovs_version_t version = ofproto_dpif_get_tables_version(xbridge->ofproto);
 struct ofpact_output output;
 struct flow flow;
 
@@ -3423,8 +3424,7 @@ compose_table_xlate(struct xlate_ctx *ctx, const struct 
xport *out_dev,
 output.port = OFPP_TABLE;
 output.max_len = 0;
 
-return ofproto_dpif_execute_actions__(xbridge->ofproto,
-  ctx->xin->tables_version, ,
+return ofproto_dpif_execute_actions__(xbridge->ofproto, version, ,
   NULL, , sizeof output,
   ctx->depth, ctx->resubmits, packet);
 }
diff --git a/tests/tunnel.at b/tests/tunnel.at
index fc6f87936..faffb4149 100644
--- a/tests/tunnel.at
+++ b/tests/tunnel.at
@@ -394,6 +394,68 @@ AT_CHECK([ovs-appctl dpif/show | tail -n +3], [0], [dnl
 OVS_VSWITCHD_STOP
 AT_CLEANUP
 
+AT_SETUP([tunnel - table version])
+dnl check if changes in the egress bridge flow table affects
+dnl discovering the link layer address of tunnel endpoints.
+OVS_VSWITCHD_START([add-port br0 p0 -- set Interface p0 type=dummy 
ofport_request=1 other-config:hwaddr=aa:55:aa:55:00:00])
+AT_CHECK([ovs-vsctl add-br int-br -- set bridge int-br datapath_type=dummy], 
[0])
+AT_CHECK([ovs-vsctl add-port int-br v1 -- set Interface v1 type=vxlan \
+   options:remote_ip=172.31.1.2 options:key=123 \
+   ofport_request=2 \
+ -- add-port int-br v2 -- set Interface v2 type=internal \
+   ofport_request=3 \
+   ], [0])
+
+AT_CHECK([ovs-appctl dpif/show], [0], [dnl
+dummy@ovs-dummy: hit:0 missed:0
+  br0:
+br0 65534/100: (dummy-internal)
+p0 1/1: (dummy)
+  int-br:
+int-br 65534/2: (dummy-internal)
+v1 2/4789: (vxlan: key=123, remote_ip=172.31.1.2)
+v2 3/3: (dummy-internal)
+])
+
+dnl First setup dummy interface IP address, then add the route
+dnl so that tnl-port table can get valid IP address for the device.
+AT_CHECK([ovs-appctl netdev-dummy/ip4addr br0 172.31.1.1/24], [0], [OK
+])
+AT_CHECK([ovs-appctl ovs/route/add 172.31.1.0/24 br0], [0], [OK
+])
+
+dnl change the flow table to bump the internal table version
+AT_CHECK([ovs-ofctl add-flow int-br action=normal])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+AT_CHECK([ovs-ofctl del-flows br0])
+AT_CHECK([ovs-ofctl add-flow br0 action=normal])
+
+dnl Check Neighbour discovery.
+AT_CHECK([ovs-vsctl -- set Interface p0 options:pcap=p0.pcap])
+
+AT_CHECK([ovs-appctl netdev-dummy/receive int-br 
'in_port(2),eth(src=aa:55:aa:55:00:00,dst=f8:bc:12:ff:ff:ff),eth_type(0x0800),ipv4(src=1.1.3.92,dst=1.1.3.88,proto=1,tos=0,ttl=64,frag=no),icmp(type=0,code=0)'])
+AT_CHECK([ovs-pcap p0.pcap > p0.pcap.txt 2>&1])
+
+dnl When the wrong version is used, the flow is not visible and the
+dnl packet is dropped.
+AT_CHECK([cat p0.pcap.txt | grep 
aa55aa5508060001080006040001aa55aa55ac1f0101ac1f0102
 | uniq], [0], [dnl
+aa55aa5508060001080006040001aa55aa55ac1f0101ac1f0102
+])
+OVS_VSWITCHD_STOP
+AT_CLEANUP
+

Re: [ovs-dev] [PATCH] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted

2019-07-08 Thread Flavio Leitner via dev

On Tue, Jul 02, 2019 at 08:50:16PM -0400, Vasu Dasari wrote:
> Hi Flavio,
> 
> I am trying to emulate the test case scenario you mentioned earlier, where
> in we need to clean you neighbor cache when an external interface goes
> down. To study that, I wrote a small script based off of test case
> system-traffic.at, "datapath - ping over vxlan tunnel". And this experiment
> gave me good experience around the understanding of netdev and kmod
> implementation of VxLAN.
> 
> Various tests I have performed using the new script include,
> 1. ovs-vswitchd started normally, for performing kernel mode VxLAN tests.
> And in this mode, tested following cases
>a. VxLAN interface being part of OVS.
>b. VxLAN interface external to OVS
>I see that in kernel mode of operation, native tunneling is off, and
> hence no ARP entries are learnt in tnl-neigh-cache. Please refer to
> ofproto-dpif-xlate.c:terminate_native_tunnel(). So, whatever changes we are
> making here(as part of my commit) are not effected in kernel mode of
> operation.

In either case my understanding is that OvS doesn't care about the
VXLAN and just hands off the packet to the interface. In another
words, it's the interface's job to encapsulate the traffic and
that's why it doesn't impact on the tnl-neigh-cache.


> 2. ovs-vswitchd started with "--disable-system" option for performing
> usermode VxLAN tests. And in this mode, tested following cases
>a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.

Right, this is the scenario you tested first, if I recall correctly.

>b. VxLAN interface external to OVS. I could not get this case to work.

I think OvS will inject the packet to the VXLAN interface, but I
never tried this as it seems unusual scenario to have.

> 3. ovs-vswitchd started normally(without --disable-system option) for
> performing usermode VxLAN tests. And in this mode, tested following cases
>a. VxLAN interface being part of OVS. This test case works. In this
> case, entries are cleanly removed by user deleting the ports from the
> bridge.
>b. VxLAN interface external to OVS. I could not get this case to work.

This looks like a perfect copy from item #2 which is okay, just checking
if there was no copy error somewhere :-)

 
> I could not 2b and 3b use cases to at all. Do you expect these to work
> normally?  The reason I ask this is, as I understand from the code, even
> though "ovs/route/show" shows the route to remote vtep, OVS does not know
> which ODP port to take to send the packet out of. Please refer to
> ofproto-dpif-xlate.c:native_tunnel_output() and tnl_route_lookup_flow() and
> hence packet transmission fails with "native tunnel routing failed".
> 
> If you agree that 2b and 3b are not supported use cases, then I can submit
> my code changes as per your review comments.

Let me step back a bit because in the scenario I told initially had
the egress device (ethX) with the endpoint address outside of the
bridge while the VXLAN interface was always part of the bridge.
Therefore we had two scenarios to cover with the difference being
the endpoint address being part or not part of the OvS bridge. 

However, looking more closely to the code if the VXLAN is part of the
bridge in userspace, then native tunnel will need the tnl-neigh-cache
to build the headers.  Then it sends out the ARP Request packet to the
datapath only, so it doesn't seem to support endpoint addresses outside
of the OvS. I guess your initial patch covering only interfaces as part
of OvS was good enough then.

Do you agree with that?

Thanks!
fbl


> 
> Please let me know what you think of.
> -Vasu
> 
> *Vasu Dasari*
> 
> 
> On Mon, Jun 24, 2019 at 6:15 PM Flavio Leitner  wrote:
> 
> > On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> > > On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner  wrote:
> > > > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > > > +{
> > > > > +struct tnl_neigh_entry *neigh;
> > > > > +bool changed = false;
> > > > > +
> > > > > +ovs_mutex_lock();
> > > > > +CMAP_FOR_EACH(neigh, cmap_node, ) {
> > > > > +if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > > > +tnl_neigh_delete(neigh);
> > > > > +changed = true;
> > > >
> > > > Do you expect to match on additional entries? Otherwise it
> > > > could bail out here.
> > > >
> > > >
> > > Say there are two or more neighbors on the port or on bridge, then by
> > > bailing out we would be missing others. So, will leave it there.
> >
> > You're right.
> >
> > [...]
> > > > However, as I mentioned in the discussion, the tnl IP address could
> > > > be on a device that doesn't belong to OVS at all. For example:
> > [...]
> > > > The tnl endpoint must have a valid route, so I suggest to hook the
> > > > tnl_neigh_cache_flush into route-table.c which receives a notification
> > > > when a route

Re: [ovs-dev] criteria/benchmarks for an improved userspace datapath classifier?

2019-07-03 Thread Flavio Leitner via dev



Hi Michal,

On Tue, Jul 02, 2019 at 10:06:47PM +0100, Michal Orsák wrote:
> Hello,
> 
> On 02/07/2019 19:47, Flavio Leitner wrote:
> > On Tue, Jul 02, 2019 at 10:00:44AM -0700, Ben Pfaff wrote:
> > > Hi Ilya and Ian.  Please allow me to introduce Michal Orsak, a grad
> > > student currently looking at packet classifiers.  He's implemented a
> > > novel classifier that is faster than the one already in OVS in the
> > > benchmarks that he's run.  His classifier is tree-based, like most
> > > high-performance classifiers, but also incremental so that flows can be
> > > inserted and deleted individually without undue delay.  Ultimately, it
> > > might make sense to replace the OVS userspace datapath classifier by one
> > > based on the concepts that he's come up with.
> > > 
> > > A difficulty with classifiers, however, is coming up with an appropriate
> > > set of benchmarks to compare them fairly.  The userspace datapath
> > > focuses on performance, so do you have a set of benchmarks that you
> > > recommend for comparison?  Are there other criteria that would be
> > > important (besides correctness)?
> > > 
> > > (I'd take answers from anyone, not just Ian and Ilya!)
> > Hi Ben,
> > 
> > We use 1M constant IPv4 flows, and 200k new and 200k retiring IPv4
> > flows per second to compare as it seems to be close to some production
> > workloads.
> 
> What is the format of the rules in classifier?

Sorry for my quick reply.  It's a good question. Most deployments I am
aware of use OpenStack and depending on the version you might have security
groups and etc... otherwise it is the action NORMAL, so it wouldn't stress
test insertion/deletion in this case.

> Do you also remove and add 200K rules/s?

No, the flow table is pretty "static", or better, a function of how many
VMs and tunnels are running. 

I guess you need a good flow table and a traffic pattern see if it
improves or not. What I am suggesting is a traffic pattern we learned
from the field in a deployment that at least we should not regress on.

HTH,
fbl 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] criteria/benchmarks for an improved userspace datapath classifier?

2019-07-02 Thread Flavio Leitner via dev

On Tue, Jul 02, 2019 at 10:00:44AM -0700, Ben Pfaff wrote:
> Hi Ilya and Ian.  Please allow me to introduce Michal Orsak, a grad
> student currently looking at packet classifiers.  He's implemented a
> novel classifier that is faster than the one already in OVS in the
> benchmarks that he's run.  His classifier is tree-based, like most
> high-performance classifiers, but also incremental so that flows can be
> inserted and deleted individually without undue delay.  Ultimately, it
> might make sense to replace the OVS userspace datapath classifier by one
> based on the concepts that he's come up with.
> 
> A difficulty with classifiers, however, is coming up with an appropriate
> set of benchmarks to compare them fairly.  The userspace datapath
> focuses on performance, so do you have a set of benchmarks that you
> recommend for comparison?  Are there other criteria that would be
> important (besides correctness)?
> 
> (I'd take answers from anyone, not just Ian and Ilya!)

Hi Ben,

We use 1M constant IPv4 flows, and 200k new and 200k retiring IPv4
flows per second to compare as it seems to be close to some production
workloads.

fbl

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 3/4] netdev-dpdk: Add stats for vhost tx retries.

2019-06-28 Thread Flavio Leitner via dev

On Fri, Jun 28, 2019 at 04:29:36PM +0300, Ilya Maximets wrote:
> On 28.06.2019 16:17, Flavio Leitner wrote:
> > On Fri, Jun 28, 2019 at 03:21:07PM +0300, Ilya Maximets wrote:
> >> On 27.06.2019 14:12, Kevin Traynor wrote:
> >>> vhost tx retries may occur, and it can be a sign that
> >>> the guest is not optimally configured.
> >>>
> >>> Add some stats so a user will know if vhost tx retries are
> >>> occurring and hence give a hint that guest config should be
> >>> examined.
> >>>
> >>> Signed-off-by: Kevin Traynor 
> >>> Acked-by: Flavio Leitner 
> >>> Acked-by: Eelco Chaudron 
> >>> ---
> >>>  Documentation/topics/dpdk/vhost-user.rst | 5 +
> >>>  include/openvswitch/netdev.h | 1 +
> >>>  lib/netdev-dpdk.c| 7 +--
> >>>  vswitchd/bridge.c| 3 ++-
> >>>  4 files changed, 13 insertions(+), 3 deletions(-)
> >>>
> >>> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> >>> b/Documentation/topics/dpdk/vhost-user.rst
> >>> index 1dd02b8b6..3caa88231 100644
> >>> --- a/Documentation/topics/dpdk/vhost-user.rst
> >>> +++ b/Documentation/topics/dpdk/vhost-user.rst
> >>> @@ -112,4 +112,9 @@ The guest should also have sufficient cores dedicated 
> >>> for consuming and
> >>>  processing packets at the required rate.
> >>>  
> >>> +The amount of Tx retries on a vhost-user or vhost-user-client interface 
> >>> can be
> >>> +shown with::
> >>> +
> >>> +  ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
> >>> +
> >>>  .. _dpdk-vhost-user:
> >>>  
> >>> diff --git a/include/openvswitch/netdev.h b/include/openvswitch/netdev.h
> >>> index 0c10f7b48..4d18b9f66 100644
> >>> --- a/include/openvswitch/netdev.h
> >>> +++ b/include/openvswitch/netdev.h
> >>> @@ -46,4 +46,5 @@ struct netdev_stats {
> >>>  uint64_t multicast; /* Multicast packets received. */
> >>>  uint64_t collisions;
> >>> +uint64_t tx_retries;/* Retries when unable to transmit.*/
> >>
> >>
> >> This seems very vhost specific counter.
> >> Maybe it's better to report it in custom_stats?
> > 
> > I thought we could implement retry for nics and rings if we want to
> > not drop any packets. Not sure if it will ever be implemented
> > though.
> 
> For HW NICs, I think, it's unlikely that we'll push packets faster than
> HW will send them. For ring ports it makes more sense, but do you know
> if anyone uses them? Since ivshmem deprecation this functionality looks
> not very interesting. I don't know if anyone uses ring ports at all.

Things might change when we mix with HWOL, but it's hard to tell
today.
 
> Anyway, with custom stats where will be no difference for the user.
> The only difference is in OVS code. At least, there is no re-try
> functionality for system interfaces in both datapaths.

Just wanted to point out that possibility, but that's not the case
today so I have no strong opinion either way.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v3 4/4] netdev-dpdk: Enable vhost-tx-retries config.

2019-06-27 Thread Flavio Leitner via dev

On Thu, Jun 27, 2019 at 12:12:32PM +0100, Kevin Traynor wrote:
> vhost tx retries can provide some mitigation against
> dropped packets due to a temporarily slow guest/limited queue
> size for an interface, but on the other hand when a system
> is fully loaded those extra cycles retrying could mean
> packets are dropped elsewhere.
> 
> Up to now max vhost tx retries have been hardcoded, which meant
> no tuning and no way to disable for debugging to see if extra
> cycles spent retrying resulted in rx drops on some other
> interface.
> 
> Add an option to change the max retries, with a value of
> 0 effectively disabling vhost tx retries.
> 
> Signed-off-by: Kevin Traynor 
> ---

Acked-by: Flavio Leitner 
Thanks Kevin!
fbl

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 4/4] netdev-dpdk: Enable vhost-tx-retries config.

2019-06-26 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:57:24PM +0100, Kevin Traynor wrote:
> vhost tx retries can provide some mitigation against
> dropped packets due to a temporarily slow guest/limited queue
> size for an interface, but on the other hand when a system
> is fully loaded those extra cycles retrying could mean
> packets are dropped elsewhere.
> 
> Up to now max vhost retries have been hardcoded, which meant
> no tuning and no way to disable for debugging to see if extra
> cycles spent retrying resulted in rx drops on some other
> interface.
> 
> Add an option to change the max retries, with a value of
> 0 effectively disabling vhost tx retries.
> 
> Signed-off-by: Kevin Traynor 
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 26 ++
>  lib/netdev-dpdk.c| 28 +---
>  vswitchd/vswitch.xml | 10 +
>  3 files changed, 61 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index a25a64237..efee19b62 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -392,4 +392,30 @@ The default value is ``false``.
>  .. _dpdk-testpmd:
>  
> +vhost-user-client tx retries
> +
> +
> +For vhost-user-client interfaces, the max amount of retries can be changed 
> from
> +the default 8 by setting ``vhost-tx-retries``.
> +
> +The minimum is 0 which means there will be no retries and if any packets in
> +each batch cannot be sent immediately they will be dropped. The maximum is 
> 31,
> +which would mean that after the first packet(s) in the batch was sent there
> +could be up to a separate retry for each of the remaining packets, until they
> +are all sent or no packet was sent during a retry.
> +
> +Retries can help with avoiding packet loss when temporarily unable to a vhost

... unable to  to a vhost interface?


> +interface because the virtqueue is full. However, spending more time retrying
> +to send to one interface, will reduce the time available for rx/tx and
> +processing packets on other interfaces, so some tuning may be required for 
> best
> +performance.
> +
> +Tx retries can be set for vhost-user-client ports::
> +
> +$ ovs-vsctl set Interface vhost-client-1 options:vhost-tx-retries=0
> +
> +.. note::
> +
> + Configurable vhost tx retries are not supported with vhost-user ports.
> +
>  DPDK in the Guest
>  -
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 65161deaf..97a90d4a5 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -160,4 +160,5 @@ typedef uint16_t dpdk_port_t;
>  
>  #define VHOST_ENQ_RETRY_NUM 8
> +#define VHOST_MAX_ENQ_RETRY_NUM 31
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)

Maybe:
  
/* One retry for each packet in the batch */
#define VHOST_ENQ_RETRY_MAX  ( NETDEV_MAX_BURST - 1 )
/* Minimum value to not retry */
#define VHOST_ENQ_RETRY_MIN 0
/* Arbitrary, debatable, though reasonable legacy value */
#define VHOST_ENQ_RETRY_DEF 8

and fix accordingly below...


>  
> @@ -409,5 +410,6 @@ struct netdev_dpdk {
>  /* True if vHost device is 'up' and has been reconfigured at least 
> once */
>  bool vhost_reconfigured;
> -/* 3 pad bytes here. */
> +atomic_uint8_t vhost_tx_retries;
> +/* 2 pad bytes here. */
>  );
>  
> @@ -1240,4 +1242,6 @@ vhost_common_construct(struct netdev *netdev)
>  }
>  
> +atomic_store_relaxed(>vhost_tx_retries, VHOST_ENQ_RETRY_NUM);
> +
>  return common_construct(netdev, DPDK_ETH_PORT_ID_INVALID,
>  DPDK_DEV_VHOST, socket_id);
> @@ -1899,4 +1903,5 @@ netdev_dpdk_vhost_client_set_config(struct netdev 
> *netdev,
>  struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
>  const char *path;
> +int max_tx_retries, cur_max_tx_retries;
>  
>  ovs_mutex_lock(>mutex);
> @@ -1915,4 +1920,16 @@ netdev_dpdk_vhost_client_set_config(struct netdev 
> *netdev,
>  }
>  }
> +
> +max_tx_retries = smap_get_int(args, "vhost-tx-retries",
> +  VHOST_ENQ_RETRY_NUM);
> +if (max_tx_retries < 0 || max_tx_retries > VHOST_MAX_ENQ_RETRY_NUM) {
> +max_tx_retries = VHOST_ENQ_RETRY_NUM;
> +}
> +atomic_read_relaxed(>vhost_tx_retries, _max_tx_retries);
> +if (max_tx_retries != cur_max_tx_retries) {
> +atomic_store_relaxed(>vhost_tx_retries, max_tx_retries);
> +VLOG_INFO("Max Tx retries for vhost device '%s' set to %d",
> +  dev->up.name, max_tx_retries);
> +}
>  ovs_mutex_unlock(>mutex);
>  
> @@ -2323,4 +2340,5 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>  int i, retries = 0;
>  int vid = netdev_dpdk_get_vid(dev);
> +int max_retries = 0;

Perhaps declare retries and max_retries close to each other?

>  
>  qid = dev->tx_q[qid % netdev->n_txq].map;
> @@ -2351,9

Re: [ovs-dev] [PATCH v2 2/4] doc: Add info on vhost tx retries.

2019-06-26 Thread Flavio Leitner via dev

On Wed, Jun 26, 2019 at 11:05:42AM -0300, Flavio Leitner via dev wrote:
> On Tue, Jun 25, 2019 at 03:57:22PM +0100, Kevin Traynor wrote:
> > Add documentation about vhost tx retries and external
> > configuration that can help reduce/avoid them.
> > 
> > Signed-off-by: Kevin Traynor 
> > Acked-by: Eelco Chaudron 
> > ---
> 
> This also has a magic number without a define, but since the
> max number of retries is exposed to the users in the next patches
> as a parameter in the documentation it seems fine to me.
> 
> Otherwise the patch looks good to me.

Actually it mentions NETDEV_MAX_BATCH where it is NETDEV_MAX_BURST
as defined in lib/dp-packet.h

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 3/4] netdev-dpdk: Add stats for vhost tx retries.

2019-06-26 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:57:23PM +0100, Kevin Traynor wrote:
> vhost tx retries may occur, and it can be a sign that
> the guest is not optimally configured.
> 
> Add some stats so a user will know if vhost tx retries are
> occurring and hence give a hint that guest config should be
> examined.
> 
> Signed-off-by: Kevin Traynor 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 2/4] doc: Add info on vhost tx retries.

2019-06-26 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:57:22PM +0100, Kevin Traynor wrote:
> Add documentation about vhost tx retries and external
> configuration that can help reduce/avoid them.
> 
> Signed-off-by: Kevin Traynor 
> Acked-by: Eelco Chaudron 
> ---

This also has a magic number without a define, but since the
max number of retries is exposed to the users in the next patches
as a parameter in the documentation it seems fine to me.

Otherwise the patch looks good to me.

Acked-by: Flavio Leitner 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2 1/4] netdev-dpdk: Fix additional vhost tx retry.

2019-06-26 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:57:21PM +0100, Kevin Traynor wrote:
> Fix minor issue of one possible additional retry.
> 
> Fixes: c6ec9d176dbf ("netdev-dpdk: Fix vHost stats.")
> Signed-off-by: Kevin Traynor 
> Acked-by: Eelco Chaudron 
> ---

Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] vswitchd: Always cleanup userspace datapath.

2019-06-26 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:59:57PM +0300, Ilya Maximets wrote:
> On 25.06.2019 15:41, Flavio Leitner wrote:
> > On Tue, Jun 25, 2019 at 03:37:09PM +0300, Ilya Maximets wrote:
> >> On 25.06.2019 15:20, Flavio Leitner wrote:
> >>> On Tue, Jun 25, 2019 at 01:07:53PM +0300, Ilya Maximets wrote:
>  On 25.06.2019 0:15, Flavio Leitner wrote:
> > On Mon, Jun 24, 2019 at 06:28:37PM +0300, Ilya Maximets wrote:
>  BTW, it's system dependent if ip/route information preserved on tap
>  device detaching. I have a few systems, where 'ovs-appctl exit'
>  leads to DOWN state for preserved internal ports and consequently
>  loosing their IP addresses. On a few other systems this doesn't happen.
> >>>
> >>> I haven't seen any bug reports yet related to that. Do you happen to
> >>> know the kernel and OvS versions?
> >>
> >> I see this with OVS master and Linux 4.2.6:
> >>
> >> # ip addr add 192.168.114.20/24 dev ovsdpdk_br0
> >> # ip addr show dev ovsdpdk_br0
> >> 43: ovsdpdk_br0:  mtu 1500 qdisc 
> >> pfifo_fast state UP group default qlen 500
> >> link/ether f2:20:dc:bf:7d:44 brd ff:ff:ff:ff:ff:ff
> >> inet 192.168.114.20/24 scope global ovsdpdk_br0
> >>valid_lft forever preferred_lft forever
> >>
> >> # ovs-appctl -t ovs-vswitchd exit
> >>
> >> # ip addr show dev ovsdpdk_br0
> >> 43: ovsdpdk_br0:  mtu 1500 qdisc pfifo_fast 
> >> state DOWN group default qlen 500
> >> link/ether f2:20:dc:bf:7d:44 brd ff:ff:ff:ff:ff:ff
> >>
> >> The same scenario on RHEL7 keeps the interface in NO-CARRIER,UP state and
> >> preserved ip address.
> >>
> >> I don't think that it's OVS. It looks more like a kernel issue.
> > 
> > Sounds like that. 
> > If you start OvS again, does it bring the device UP?
> 
> Nope. It stays DOWN just like the newly created one.

Interesting, I will see if I can recreate the issue here as well.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] ovs-thread: Add pthread spin lock support.

2019-06-26 Thread Flavio Leitner via dev



Hi William,

On Thu, Jun 20, 2019 at 11:00:56AM -0700, William Tu wrote:
> The patch adds the basic spin lock functions:
> ovs_spin_{lock, try_lock, unlock, init, destroy}.
> I have some use cases using af_xdp, patches will
> come later.

Usually the new API is merged along with an user otherwise
it is difficult to evaluate what is really needed and etc...

It also avoids the issue of having to maintain dead code if
the API's user never comes or gets merged for whatever reason.

Does that make sense to you?

Thanks!
fbl

> 
> Signed-off-by: William Tu 
> ---
>  include/openvswitch/thread.h | 18 ++
>  lib/ovs-thread.c | 23 +++
>  2 files changed, 41 insertions(+)
> 
> diff --git a/include/openvswitch/thread.h b/include/openvswitch/thread.h
> index 2987db37c9dc..44cd5b77f087 100644
> --- a/include/openvswitch/thread.h
> +++ b/include/openvswitch/thread.h
> @@ -33,6 +33,11 @@ struct OVS_LOCKABLE ovs_mutex {
>  const char *where;  /* NULL if and only if uninitialized. */
>  };
>  
> +struct OVS_LOCKABLE ovs_spin {
> +pthread_spinlock_t lock;
> +const char *where;  /* NULL if and only if uninitialized. */
> +};
> +
>  /* "struct ovs_mutex" initializer. */
>  #ifdef PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP
>  #define OVS_MUTEX_INITIALIZER { PTHREAD_ERRORCHECK_MUTEX_INITIALIZER_NP, \
> @@ -70,6 +75,19 @@ int ovs_mutex_trylock_at(const struct ovs_mutex *mutex, 
> const char *where)
>  
>  void ovs_mutex_cond_wait(pthread_cond_t *, const struct ovs_mutex *mutex)
>  OVS_REQUIRES(mutex);
> +
> +void ovs_spin_init(const struct ovs_spin *);
> +void ovs_spin_destroy(const struct ovs_spin *);
> +void ovs_spin_unlock(const struct ovs_spin *spin) OVS_RELEASES(spin);
> +void ovs_spin_lock_at(const struct ovs_spin *spin, const char *where)
> +OVS_ACQUIRES(spin);
> +#define ovs_spin_lock(spin) \
> +ovs_spin_lock_at(spin, OVS_SOURCE_LOCATOR)
> +
> +int ovs_spin_trylock_at(const struct ovs_spin *spin, const char *where)
> +OVS_TRY_LOCK(0, spin);
> +#define ovs_spin_trylock(spin) \
> +ovs_spin_trylock_at(spin, OVS_SOURCE_LOCATOR)
>  
>  /* Convenient once-only execution.
>   *
> diff --git a/lib/ovs-thread.c b/lib/ovs-thread.c
> index 159d87e5b0ca..d6b4ed1d61a9 100644
> --- a/lib/ovs-thread.c
> +++ b/lib/ovs-thread.c
> @@ -75,6 +75,7 @@ static bool multithreaded;
>  LOCK_FUNCTION(mutex, lock);
>  LOCK_FUNCTION(rwlock, rdlock);
>  LOCK_FUNCTION(rwlock, wrlock);
> +LOCK_FUNCTION(spin, lock);
>  
>  #define TRY_LOCK_FUNCTION(TYPE, FUN) \
>  int \
> @@ -103,6 +104,7 @@ LOCK_FUNCTION(rwlock, wrlock);
>  TRY_LOCK_FUNCTION(mutex, trylock);
>  TRY_LOCK_FUNCTION(rwlock, tryrdlock);
>  TRY_LOCK_FUNCTION(rwlock, trywrlock);
> +TRY_LOCK_FUNCTION(spin, trylock);
>  
>  #define UNLOCK_FUNCTION(TYPE, FUN, WHERE) \
>  void \
> @@ -125,6 +127,8 @@ UNLOCK_FUNCTION(mutex, unlock, "");
>  UNLOCK_FUNCTION(mutex, destroy, NULL);
>  UNLOCK_FUNCTION(rwlock, unlock, "");
>  UNLOCK_FUNCTION(rwlock, destroy, NULL);
> +UNLOCK_FUNCTION(spin, unlock, "");
> +UNLOCK_FUNCTION(spin, destroy, NULL);
>  
>  #define XPTHREAD_FUNC1(FUNCTION, PARAM1)\
>  void\
> @@ -268,6 +272,25 @@ ovs_mutex_cond_wait(pthread_cond_t *cond, const struct 
> ovs_mutex *mutex_)
>  }
>  }
>  
> +static void
> +ovs_spin_init__(const struct ovs_spin *l_, int pshared)
> +{
> +struct ovs_spin *l = CONST_CAST(struct ovs_spin *, l_);
> +int error;
> +
> +l->where = "";
> +error = pthread_spin_init(>lock, pshared);
> +if (OVS_UNLIKELY(error)) {
> +ovs_abort(error, "pthread_spin_failed");
> +}
> +}
> +
> +void
> +ovs_spin_init(const struct ovs_spin *spin)
> +{
> +ovs_spin_init__(spin, PTHREAD_PROCESS_PRIVATE);
> +}
> +
>  /* Initializes the 'barrier'.  'size' is the number of threads
>   * expected to hit the barrier. */
>  void
> -- 
> 2.7.4
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH v2] vswitchd: Always cleanup userspace datapath.

2019-06-25 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 01:12:11PM +0300, Ilya Maximets wrote:
> 'netdev' datapath is implemented within ovs-vswitchd process and can
> not exist without it, so it should be gracefully terminated with a
> full cleanup of resources upon ovs-vswitchd exit.
> 
> This change forces dpif cleanup for 'netdev' datapath regardless of
> passing '--cleanup' to 'ovs-appctl exit'. Such solution allowes to
> not pass this additional option everytime for userspace datapath
> installations and also allowes to not terminate system datapath in
> setups where both datapaths runs at the same time.
> 
> Exception made for 'internal' ports that could have user ip/route
> configuration. These ports will not be removed without '--cleanup'.
> 
> This change fixes OVS disappearing from the DPDK point of view
> (keeping HW NICs improperly configured, sudden closing of vhost-user
> connections) and will help with linux devices clearing with upcoming
> AF_XDP netdev support.
> 
> Signed-off-by: Ilya Maximets 
> ---

LGTM
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] vswitchd: Always cleanup userspace datapath.

2019-06-25 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 03:37:09PM +0300, Ilya Maximets wrote:
> On 25.06.2019 15:20, Flavio Leitner wrote:
> > On Tue, Jun 25, 2019 at 01:07:53PM +0300, Ilya Maximets wrote:
> >> On 25.06.2019 0:15, Flavio Leitner wrote:
> >>> On Mon, Jun 24, 2019 at 06:28:37PM +0300, Ilya Maximets wrote:
> >> BTW, it's system dependent if ip/route information preserved on tap
> >> device detaching. I have a few systems, where 'ovs-appctl exit'
> >> leads to DOWN state for preserved internal ports and consequently
> >> loosing their IP addresses. On a few other systems this doesn't happen.
> > 
> > I haven't seen any bug reports yet related to that. Do you happen to
> > know the kernel and OvS versions?
> 
> I see this with OVS master and Linux 4.2.6:
> 
> # ip addr add 192.168.114.20/24 dev ovsdpdk_br0
> # ip addr show dev ovsdpdk_br0
> 43: ovsdpdk_br0:  mtu 1500 qdisc 
> pfifo_fast state UP group default qlen 500
> link/ether f2:20:dc:bf:7d:44 brd ff:ff:ff:ff:ff:ff
> inet 192.168.114.20/24 scope global ovsdpdk_br0
>valid_lft forever preferred_lft forever
> 
> # ovs-appctl -t ovs-vswitchd exit
> 
> # ip addr show dev ovsdpdk_br0
> 43: ovsdpdk_br0:  mtu 1500 qdisc pfifo_fast 
> state DOWN group default qlen 500
> link/ether f2:20:dc:bf:7d:44 brd ff:ff:ff:ff:ff:ff
> 
> The same scenario on RHEL7 keeps the interface in NO-CARRIER,UP state and
> preserved ip address.
> 
> I don't think that it's OVS. It looks more like a kernel issue.

Sounds like that. 
If you start OvS again, does it bring the device UP?

fbl

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] vswitchd: Always cleanup userspace datapath.

2019-06-25 Thread Flavio Leitner via dev

On Tue, Jun 25, 2019 at 01:07:53PM +0300, Ilya Maximets wrote:
> On 25.06.2019 0:15, Flavio Leitner wrote:
> > On Mon, Jun 24, 2019 at 06:28:37PM +0300, Ilya Maximets wrote:
> >> 'netdev' datapath is implemented within ovs-vswitchd process and can
> >> not exist without it, so it should be gracefully terminated with a
> >> full cleanup of resources upon ovs-vswitchd exit.
> >>
> >> This change forces dpif cleanup for 'netdev' datapath regardless of
> >> passing '--cleanup' to 'ovs-appctl exit'. Such solution allowes to
> >> not pass this additional option everytime for userspace datapath
> >> installations and also allowes to not terminate system datapath in
> >> setups where both datapaths runs at the same time.
> >>
> >> This fixes OVS disappearing from the DPDK point of view (keeping HW
> >> NICs improperly configured, sudden closing of vhost-user connections)
> >> and will help with linux devices clearing with upcoming AF_XDP
> >> netdev support.
> > 
> > Does this mean that OVS bridges or internal ports will be deleted
> > from the system regardless of --cleanup parameter?
> 
> Yes. This will remove all ports and completely destroy userspace datapath.
> 
> I guess, you're warring about user ip/route settings that could be
> manually applied to internal ports and not stored in network-scripts
> or network manager.

Yes!

> I think, that it's not very important case. However, the main goal
> of this patch is to properly destroy non-internal ports and it could
> be achieved without loosing user configurations. I'll post v2 with
> this change.

Thanks to that we can replace/restart OvS without causing major
disruptions as the configuration remains the same.

 
> BTW, it's system dependent if ip/route information preserved on tap
> device detaching. I have a few systems, where 'ovs-appctl exit'
> leads to DOWN state for preserved internal ports and consequently
> loosing their IP addresses. On a few other systems this doesn't happen.

I haven't seen any bug reports yet related to that. Do you happen to
know the kernel and OvS versions?

Thanks!
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted

2019-06-24 Thread Flavio Leitner via dev

On Mon, Jun 24, 2019 at 05:43:26PM -0400, Vasu Dasari wrote:
> On Mon, Jun 24, 2019 at 3:58 PM Flavio Leitner  wrote:
> > On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> > > +{
> > > +struct tnl_neigh_entry *neigh;
> > > +bool changed = false;
> > > +
> > > +ovs_mutex_lock();
> > > +CMAP_FOR_EACH(neigh, cmap_node, ) {
> > > +if (nullable_string_is_equal(neigh->br_name, br_name)) {
> > > +tnl_neigh_delete(neigh);
> > > +changed = true;
> >
> > Do you expect to match on additional entries? Otherwise it
> > could bail out here.
> >
> >
> Say there are two or more neighbors on the port or on bridge, then by
> bailing out we would be missing others. So, will leave it there.

You're right.

[...]
> > However, as I mentioned in the discussion, the tnl IP address could
> > be on a device that doesn't belong to OVS at all. For example:
[...]
> > The tnl endpoint must have a valid route, so I suggest to hook the
> > tnl_neigh_cache_flush into route-table.c which receives a notification
> > when a route changes. If a route is deleted, we should flush the
> > device's tnl cache. Doing so, would cover both userspace and kernel
> > datapath for OVS and non-OVS devices.
> >
> >
> I see what you are saying. Let me play with code a bit and resubmit patch.

OK, looking forward to it!
Thanks Vasu, 
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 2/3] netdev-dpdk: Add stats for vhost tx retries.

2019-06-24 Thread Flavio Leitner via dev

On Fri, Jun 21, 2019 at 02:41:57PM +0100, Kevin Traynor wrote:
> vhost tx retries may occur, and it can be a sign that
> the guest is not optimally configured.
> 
> Add some stats so a user will know if vhost tx retries are
> occurring and hence give a hint that guest config should be
> examined.
> 
> Signed-off-by: Kevin Traynor 
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 5 +
>  include/openvswitch/netdev.h | 1 +
>  lib/netdev-dpdk.c| 7 +--
>  vswitchd/bridge.c| 3 ++-
>  4 files changed, 13 insertions(+), 3 deletions(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index d1682fd05..e79ed082e 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -107,4 +107,9 @@ The guest should also have sufficient cores dedicated for 
> consuming and
>  processing packets at the required rate.
>  
> +The amount of Tx retries on a vhost-user or vhost-user-client interface can 
> be
> +shown with::
> +
> +  ovs-vsctl get Interface dpdkvhostclient0 statistics:tx_retries
> +
>  .. _dpdk-vhost-user:
>  
> diff --git a/include/openvswitch/netdev.h b/include/openvswitch/netdev.h
> index 0c10f7b48..4d18b9f66 100644
> --- a/include/openvswitch/netdev.h
> +++ b/include/openvswitch/netdev.h
> @@ -46,4 +46,5 @@ struct netdev_stats {
>  uint64_t multicast; /* Multicast packets received. */
>  uint64_t collisions;
> +uint64_t tx_retries;/* Retries when unable to transmit.*/
>  
>  /* Detailed receive errors. */
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 0f3b9c6f4..03a8de380 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2298,5 +2298,6 @@ netdev_dpdk_vhost_update_tx_counters(struct 
> netdev_stats *stats,
>   struct dp_packet **packets,
>   int attempted,
> - int dropped)
> + int dropped,
> + int retries)
>  {
>  int i;
> @@ -2305,4 +2306,5 @@ netdev_dpdk_vhost_update_tx_counters(struct 
> netdev_stats *stats,
>  stats->tx_packets += sent;
>  stats->tx_dropped += dropped;
> +stats->tx_retries += MIN(retries, VHOST_ENQ_RETRY_NUM);

Why MIN(, VHOST_ENQ_RETRY_NUM)?
The retrans loop is limited to that, so it seems redundant.

fbl
 
>  for (i = 0; i < sent; i++) {
> @@ -2359,5 +2361,5 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>  rte_spinlock_lock(>stats_lock);
>  netdev_dpdk_vhost_update_tx_counters(>stats, pkts, total_pkts,
> - cnt + dropped);
> + cnt + dropped, retries);
>  rte_spinlock_unlock(>stats_lock);
>  
> @@ -2597,4 +2599,5 @@ netdev_dpdk_vhost_get_stats(const struct netdev *netdev,
>  stats->rx_errors = dev->stats.rx_errors;
>  stats->rx_length_errors = dev->stats.rx_length_errors;
> +stats->tx_retries = dev->stats.tx_retries;
>  
>  stats->rx_1_to_64_packets = dev->stats.rx_1_to_64_packets;
> diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
> index 2976771ae..3dab8d4c7 100644
> --- a/vswitchd/bridge.c
> +++ b/vswitchd/bridge.c
> @@ -2409,5 +2409,6 @@ iface_refresh_stats(struct iface *iface)
>  IFACE_STAT(rx_oversize_errors,  "rx_oversize_errors")   \
>  IFACE_STAT(rx_fragmented_errors,"rx_fragmented_errors") \
> -IFACE_STAT(rx_jabber_errors,"rx_jabber_errors")
> +IFACE_STAT(rx_jabber_errors,"rx_jabber_errors") \
> +IFACE_STAT(tx_retries,  "tx_retries")
>  
>  #define IFACE_STAT(MEMBER, NAME) + 1
> -- 
> 2.20.1
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH 1/3] netdev-dpdk: Fix and document vhost tx retries.

2019-06-24 Thread Flavio Leitner via dev

On Fri, Jun 21, 2019 at 02:41:56PM +0100, Kevin Traynor wrote:
> Fix minor issue of one additional retry and add
> documentation about vhost tx retries and ways to
> reduce/remove them.
> 
> Signed-off-by: Kevin Traynor 
> 
> ---
> There is a checkpatch warning that one of the libvirt
> lines in docs is a few characters too long. It is similar
> for some others in the file and I think it makes sense to
> leave it as is but can wrap if required.
> ---
>  Documentation/topics/dpdk/vhost-user.rst | 31 
>  lib/netdev-dpdk.c|  2 +-
>  2 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/topics/dpdk/vhost-user.rst 
> b/Documentation/topics/dpdk/vhost-user.rst
> index f7b4b338e..d1682fd05 100644
> --- a/Documentation/topics/dpdk/vhost-user.rst
> +++ b/Documentation/topics/dpdk/vhost-user.rst
> @@ -76,4 +76,35 @@ mode ports require QEMU version 2.7.  Ports of type 
> vhost-user are currently
>  deprecated and will be removed in a future release.
>  
> +vhost tx retries
> +
> +
> +When sending a batch of packets (max is 32) to a vhost-user or

Maybe use the actual define name (NETDEV_MAX_BURST) since it might change?

> +vhost-user-client interface, it may happen that some but not all of the 
> packets
> +in the batch are able to be sent to the guest. This is often because there is
> +not enough free descriptors in the virtqueue for all the packets to be sent.
> +In this case there will be a retry, with a default maximum of 8 occurring. If
> +at any time no packets can be sent, it may mean the guest is not accepting
> +packets, so there are no (more) retries.
> +
> +Tx Retries may be reduced or removed by some external configuration, such

s/removed/even avoided/ ?

> +as increasing the virtqueue size through the ``rx_queue_size`` parameter in
> +libvirt::
> +
> +  
> +  
> +  
> +  
> +  
> +   function='0x0'/>
> +  

Do we need to worry about the qemu version?

> +
> +The guest application will also need need to provide enough descriptors. For
> +example with ``testpmd`` the command line argument can be used::
> +
> + --rxd=1024 --txd=1024
> +
> +The guest should also have sufficient cores dedicated for consuming and
> +processing packets at the required rate.
> +
>  .. _dpdk-vhost-user:
>  
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> index 4856c56aa..0f3b9c6f4 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -2353,5 +2353,5 @@ __netdev_dpdk_vhost_send(struct netdev *netdev, int qid,
>  break;
>  }
> -} while (cnt && (retries++ <= VHOST_ENQ_RETRY_NUM));
> +} while (cnt && (retries++ < VHOST_ENQ_RETRY_NUM));

You're removing the packet's last hope to reach home, are you sure? :-)
fbl

>  
>  rte_spinlock_unlock(>tx_q[qid].tx_lock);
> -- 
> 2.20.1
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] vswitchd: Always cleanup userspace datapath.

2019-06-24 Thread Flavio Leitner via dev

On Mon, Jun 24, 2019 at 06:28:37PM +0300, Ilya Maximets wrote:
> 'netdev' datapath is implemented within ovs-vswitchd process and can
> not exist without it, so it should be gracefully terminated with a
> full cleanup of resources upon ovs-vswitchd exit.
> 
> This change forces dpif cleanup for 'netdev' datapath regardless of
> passing '--cleanup' to 'ovs-appctl exit'. Such solution allowes to
> not pass this additional option everytime for userspace datapath
> installations and also allowes to not terminate system datapath in
> setups where both datapaths runs at the same time.
> 
> This fixes OVS disappearing from the DPDK point of view (keeping HW
> NICs improperly configured, sudden closing of vhost-user connections)
> and will help with linux devices clearing with upcoming AF_XDP
> netdev support.

Does this mean that OVS bridges or internal ports will be deleted
from the system regardless of --cleanup parameter?

fbl


> 
> Signed-off-by: Ilya Maximets 
> ---
>  NEWS   | 2 ++
>  lib/dpif-netdev.c  | 1 +
>  lib/dpif-netlink.c | 1 +
>  lib/dpif-provider.h| 6 ++
>  lib/dpif.c | 7 +++
>  lib/dpif.h | 2 ++
>  ofproto/ofproto-dpif.c | 4 +++-
>  vswitchd/ovs-vswitchd.8.in | 3 +++
>  8 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/NEWS b/NEWS
> index af1659ce8..dbbc3827b 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -27,6 +27,8 @@ Post-v2.11.0
>   * New action "check_pkt_len".
>   * Port configuration with "other-config:priority-tags" now has a mode
> that retains the 802.1Q header even if VLAN and priority are both 
> zero.
> + * 'ovs-appctl exit' now implies userspace datapath cleanup regardless
> +   of '--cleanup' option.
> - OVSDB:
>   * OVSDB clients can now resynchronize with clustered servers much more
> quickly after a brief disconnection, saving bandwidth and CPU time.
> diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> index 4e73f96fb..ad6ff5338 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -7411,6 +7411,7 @@ dpif_netdev_ipf_dump_done(struct dpif *dpif OVS_UNUSED, 
> void *ipf_dump_ctx)
>  
>  const struct dpif_class dpif_netdev_class = {
>  "netdev",
> +true,   /* cleanup_required */
>  dpif_netdev_init,
>  dpif_netdev_enumerate,
>  dpif_netdev_port_open_type,
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index ba80a0079..985a28426 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -3384,6 +3384,7 @@ probe_broken_meters(struct dpif *dpif)
>  
>  const struct dpif_class dpif_netlink_class = {
>  "system",
> +false,  /* cleanup_required */
>  NULL,   /* init */
>  dpif_netlink_enumerate,
>  NULL,
> diff --git a/lib/dpif-provider.h b/lib/dpif-provider.h
> index b2a4dff96..f503d116a 100644
> --- a/lib/dpif-provider.h
> +++ b/lib/dpif-provider.h
> @@ -119,6 +119,12 @@ struct dpif_class {
>   * the type assumed if no type is specified when opening a dpif. */
>  const char *type;
>  
> +/* If 'true', datapath must be destroyed on ofproto destruction.
> + *
> + * This is used by the vswitch at exit, so that it can delete any
> + * datapaths that can not exist without it (e.g. netdev datapath).  */
> +bool cleanup_required;
> +
>  /* Called when the dpif provider is registered, typically at program
>   * startup.  Returning an error from this function will prevent any
>   * datapath with this class from being created.
> diff --git a/lib/dpif.c b/lib/dpif.c
> index a18fb1b02..c88b2106f 100644
> --- a/lib/dpif.c
> +++ b/lib/dpif.c
> @@ -498,6 +498,13 @@ dpif_type(const struct dpif *dpif)
>  return dpif->dpif_class->type;
>  }
>  
> +/* Checks if datapath 'dpif' requires cleanup. */
> +bool
> +dpif_cleanup_required(const struct dpif *dpif)
> +{
> +return dpif->dpif_class->cleanup_required;
> +}
> +
>  /* Returns the fully spelled out name for the given datapath 'type'.
>   *
>   * Normalized type string can be compared with strcmp().  Unnormalized type
> diff --git a/lib/dpif.h b/lib/dpif.h
> index 883472a59..289d574a0 100644
> --- a/lib/dpif.h
> +++ b/lib/dpif.h
> @@ -419,6 +419,8 @@ const char *dpif_name(const struct dpif *);
>  const char *dpif_base_name(const struct dpif *);
>  const char *dpif_type(const struct dpif *);
>  
> +bool dpif_cleanup_required(const struct dpif *);
> +
>  int dpif_delete(struct dpif *);
>  
>  /* Statistics for a dpif as a whole. */
> diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
> index cbeb6776f..957f70cfa 100644
> --- a/ofproto/ofproto-dpif.c
> +++ b/ofproto/ofproto-dpif.c
> @@ -678,7 +678,7 @@ close_dpif_backer(struct dpif_backer *backer, bool del)
>  shash_find_and_delete(_dpif_backers, backer->type);
>  free(backer->type);
>  free(backer->dp_version_string);
> -if (del) {
> +if (del ||

Re: [ovs-dev] [PATCH] tnl-neigh-cache: Purge learnt neighbors when port/bridge is deleted

2019-06-24 Thread Flavio Leitner via dev

On Wed, Jun 19, 2019 at 11:02:07PM -0400, Vasu Dasari wrote:
> Say an ARP entry is learnt on a OVS port and when such a port is deleted,
> learnt entry should be removed from the port. It would have be aged out after
> ARP ageout time. This code will clean up immediately.
> 
> Added test case(tunnel - neighbor entry add and deletion) in tunnel.at, to
> verify neighbors are added and removed on deletion of a ports and bridges.
> 
> Discussion for this addition is at:
> https://mail.openvswitch.org/pipermail/ovs-discuss/2019-June/048754.html
> 
> Signed-off-by: Vasu Dasari 
> ---
>  lib/tnl-neigh-cache.c| 20 +
>  lib/tnl-neigh-cache.h|  1 +
>  ofproto/ofproto-dpif-xlate.c |  3 +++
>  tests/tunnel.at  | 43 
>  4 files changed, 67 insertions(+)
> 
> diff --git a/lib/tnl-neigh-cache.c b/lib/tnl-neigh-cache.c
> index b28f9f1bb..8718405db 100644
> --- a/lib/tnl-neigh-cache.c
> +++ b/lib/tnl-neigh-cache.c
> @@ -220,6 +220,26 @@ tnl_neigh_cache_run(void)
>  }
>  }
>  
> +void
> +tnl_neigh_flush(const char br_name[])

It seems the file uses a convention declaring using IFNAMSIZ

> +{
> +struct tnl_neigh_entry *neigh;
> +bool changed = false;
> +
> +ovs_mutex_lock();
> +CMAP_FOR_EACH(neigh, cmap_node, ) {
> +if (nullable_string_is_equal(neigh->br_name, br_name)) {
> +tnl_neigh_delete(neigh);
> +changed = true;

Do you expect to match on additional entries? Otherwise it
could bail out here.


> +}
> +}
> +ovs_mutex_unlock();
> +
> +if (changed) {
> +seq_change(tnl_conf_seq);
> +}
> +}
> +
>  static void
>  tnl_neigh_cache_flush(struct unixctl_conn *conn, int argc OVS_UNUSED,
>  const char *argv[] OVS_UNUSED, void *aux OVS_UNUSED)
> diff --git a/lib/tnl-neigh-cache.h b/lib/tnl-neigh-cache.h
> index fee8e6a6f..ded9c2f86 100644
> --- a/lib/tnl-neigh-cache.h
> +++ b/lib/tnl-neigh-cache.h
> @@ -37,5 +37,6 @@ int tnl_neigh_lookup(const char dev_name[], const struct 
> in6_addr *dst,
>   struct eth_addr *mac);
>  void tnl_neigh_cache_init(void);
>  void tnl_neigh_cache_run(void);
> +void tnl_neigh_flush(const char dev_name[]);
>  
>  #endif
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index 73966a4e8..e0cd8edab 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -1481,6 +1481,9 @@ xlate_ofport_remove(struct ofport_dpif *ofport)
>  ovs_assert(new_xcfg);
>  
>  xport = xport_lookup(new_xcfg, ofport);
> +if(xport) {
 ^--- space needed here.

> +tnl_neigh_flush(netdev_get_name(xport->netdev));
> +}
>  xlate_xport_remove(new_xcfg, xport);


Shouldn't this be in xlate_xport_remove()?

However, as I mentioned in the discussion, the tnl IP address could
be on a device that doesn't belong to OVS at all. For example:

br-tun
   |
   +- vxlan0 --- remote-ipaddr=192.168.1.10
   |
   +- vxlan1 --- remote-ipaddr=192.168.2.10

And then you have:
   eth0 --- 192.168.1.1/24

   eth1 --- 192.168.2.1/24

In this case, if we took either eth0 or eth1 down, the cache is not
immediately flushed.

The tnl endpoint must have a valid route, so I suggest to hook the
tnl_neigh_cache_flush into route-table.c which receives a notification
when a route changes. If a route is deleted, we should flush the
device's tnl cache. Doing so, would cover both userspace and kernel
datapath for OVS and non-OVS devices.

What do you think?

Thanks,
fbl


> diff --git a/tests/tunnel.at b/tests/tunnel.at
> index 035c54f67..6d7550724 100644
> --- a/tests/tunnel.at
> +++ b/tests/tunnel.at
> @@ -920,3 +920,46 @@ dnl which is not correct
>  
>  OVS_VSWITCHD_STOP
>  AT_CLEANUP
> +
> +AT_SETUP([tunnel - neighbor entry add and deletion])
> +OVS_VSWITCHD_START([add-port br0 p1 -- set Interface p1 type=gre \
> +options:remote_ip=1.1.1.1 options:local_ip=2.2.2.2 \
> +options:key=5 ofport_request=1 \
> +-- add-port br0 p2 -- set Interface p2 type=gre \
> +options:local_ip=3.3.3.3 options:remote_ip=4.4.4.4 \
> +ofport_request=2])
> +AT_CHECK([ovs-vsctl add-br br1 -- set bridge br1 datapath_type=dummy], [0])
> +
> +dnl Populate tunnel neighbor cache table
> +AT_CHECK([
> +ovs-appctl tnl/arp/set p1 10.0.0.1 00:00:10:00:00:01
> +ovs-appctl tnl/arp/set p2 10.0.1.1 00:00:10:00:01:01
> +ovs-appctl tnl/arp/set br0 10.0.2.1 00:00:10:00:02:01
> +ovs-appctl tnl/arp/set br1 20.0.0.1 00:00:20:00:00:01
> +], [0], [stdout])
> +
> +AT_CHECK([ovs-appctl tnl/neigh/show | tail -n+3 | sort], [0], [dnl
> +10.0.0.1  00:00:10:00:00:01   p1
> +10.0.1.1  00:00:10:00:01:01   p2
> +10.0.2.1  00:00:10:00:02:01   br0
> +20.0.0.1

Re: [ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

2019-05-14 Thread Flavio Leitner via dev

On Fri, May 10, 2019 at 02:31:26PM -0400, Aaron Conole wrote:
> Aaron Conole  writes:
> 
> > Flavio Leitner  writes:
> >
> >> On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
> >>> On Thu, 18 Apr 2019 15:50:43 -0300
> >>> Flavio Leitner  wrote:
> >>> 
> >>> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
> >>> > > On Thu, 18 Apr 2019 13:56:23 -0300
> >>> > > Flavio Leitner  wrote:
> >>> > >   
> >>> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:  
> >>> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
> >>> > > > > Flavio Leitner  wrote:
> >>> > > > Another thing is that when the module is ready and the event is sent
> >>> > > > out, what holds OVS for not trying to open and get EACCESS before
> >>> > > > udev is triggered to fix the device permission?  
> >>> > > 
> >>> > > If there were a race, could ovs ever run before udev on system
> >>> > > startup?  Probably not.  
> >>> > 
> >>> > It does wait, but only for the udev to settle, which means if the
> >>> > module has not triggered an event until that time, OVS will not wait
> >>> > and we still have a race.
> >>> 
> >>> But udev isn't waiting on the module to trigger an event, the module
> >>> contains a MODULE_ALIAS, so I believe it's just the static processing
> >>> of the modules.alias that triggers the event.
> >>
> >> What I am saying is that driverctl will trigger load the module and
> >> bind the device, later on systemd will trigger OVS service which
> >> waits udev to settle, but none of that guarantees that the permissions
> >> are updated when OVS is initializing, see below.
> >>
> >>> > >  Ideally perhaps a cleaner solution might be an
> >>> > > explicit dependency on the vfio module specific to ovs startup rather
> >>> > > than changing a system policy, but it really depends on the context 
> >>> > > and
> >>> > > use cases.  Thanks,  
> >>> > 
> >>> > It does have. The driverctl will bind the devices to vfio-pci but
> >>> > the problem is that which signal we should rely on to know when
> >>> > the vfio module is still initializing, or failed or finished.
> >>> 
> >>> What signal/mechanism is being used currently?  If driverctl is asked
> >>> to set a driver override it does:
> >>> 
> >>>  1) if module is not loaded, modprobe
> >>>  2) unbinds device from existing driver, if any
> >>>  3) sets driver_override
> >>>  4) triggers drivers_probe
> >>>  5) tests if device is bound to a driver, any driver
> >>> 
> >>> There are certainly some deficiencies here, unbinding the device before
> >>> setting the driver_override leaves the device open to getting bound by
> >>> the wrong driver, and the verification in the last step could be more
> >>> specific in testing for binding to the correct driver, but step #1 is
> >>> the modprobe of the driver, which should be a synchronous operation.
> >>> We shouldn't be able to complete a 'driverctl set-override $DEV
> >>> vfio-pci' without vfio being initialized, afaict.  Thanks,
> >>
> >> Right, sounds like systemd is starting openvswitch service before
> >> the driverctl is done with the devices.
> >
> > I'm not sure.  The ordering could be a problem.
> >
> > Perhaps we could try adding:
> >
> >   After=basic.target
> >
> > for the ovs-vswitchd.service if we have a machine that exhibits this
> > behavior, but I don't know if it will resolve the race.  There is some
> > kind of strange ordering looking at:
> >
> > https://www.freedesktop.org/software/systemd/man/systemd.special.html
> > and
> > https://www.freedesktop.org/software/systemd/man/bootup.html#
> >
> > I can't find how network.target dependency really works w.r.t. ordering
> > and the driverctl+basic.target services.
> 
> Ping?  Any thoughts?  Do you have an alternative approach you'd rather
> see?  I can try asking the customer if they can test out the
> After=basic.target change I propose, but I'm not positive it will
> resolve anything.  And if it doesn't, I want to be able to say "well,
> here's a follow up."

IIRC we have a dependency on systemd-udev-settle.service, which
would mean systemd would wait for the device probing to be done,
but apparently it doesn't mean that udev rules have completed
execution.

Maybe using systemd-analyze after had reproduced the issue can
shed some light?

Or change ovs-vswitchd.service to take a screenshot of all running
processes when the service is starting (ExecStartPre) ? That way
we will know if modprobe is still running and whatnot.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next] net: openvswitch: return an error instead of doing BUG_ON()

2019-05-02 Thread Flavio Leitner via dev

On Thu, May 02, 2019 at 04:12:38PM -0400, Eelco Chaudron wrote:
> For all other error cases in queue_userspace_packet() the error is
> returned, so it makes sense to do the same for these two error cases.
> 
> Reported-by: Davide Caratti 
> Signed-off-by: Eelco Chaudron 
> ---

LGTM
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

2019-04-18 Thread Flavio Leitner via dev

On Thu, Apr 18, 2019 at 01:46:22PM -0600, Alex Williamson wrote:
> On Thu, 18 Apr 2019 15:50:43 -0300
> Flavio Leitner  wrote:
> 
> > On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
> > > On Thu, 18 Apr 2019 13:56:23 -0300
> > > Flavio Leitner  wrote:
> > >   
> > > > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:  
> > > > > On Thu, 18 Apr 2019 13:23:54 -0300
> > > > > Flavio Leitner  wrote:
> > > > Another thing is that when the module is ready and the event is sent
> > > > out, what holds OVS for not trying to open and get EACCESS before
> > > > udev is triggered to fix the device permission?  
> > > 
> > > If there were a race, could ovs ever run before udev on system
> > > startup?  Probably not.  
> > 
> > It does wait, but only for the udev to settle, which means if the
> > module has not triggered an event until that time, OVS will not wait
> > and we still have a race.
> 
> But udev isn't waiting on the module to trigger an event, the module
> contains a MODULE_ALIAS, so I believe it's just the static processing
> of the modules.alias that triggers the event.

What I am saying is that driverctl will trigger load the module and
bind the device, later on systemd will trigger OVS service which
waits udev to settle, but none of that guarantees that the permissions
are updated when OVS is initializing, see below.

> > >  Ideally perhaps a cleaner solution might be an
> > > explicit dependency on the vfio module specific to ovs startup rather
> > > than changing a system policy, but it really depends on the context and
> > > use cases.  Thanks,  
> > 
> > It does have. The driverctl will bind the devices to vfio-pci but
> > the problem is that which signal we should rely on to know when
> > the vfio module is still initializing, or failed or finished.
> 
> What signal/mechanism is being used currently?  If driverctl is asked
> to set a driver override it does:
> 
>  1) if module is not loaded, modprobe
>  2) unbinds device from existing driver, if any
>  3) sets driver_override
>  4) triggers drivers_probe
>  5) tests if device is bound to a driver, any driver
> 
> There are certainly some deficiencies here, unbinding the device before
> setting the driver_override leaves the device open to getting bound by
> the wrong driver, and the verification in the last step could be more
> specific in testing for binding to the correct driver, but step #1 is
> the modprobe of the driver, which should be a synchronous operation.
> We shouldn't be able to complete a 'driverctl set-override $DEV
> vfio-pci' without vfio being initialized, afaict.  Thanks,

Right, sounds like systemd is starting openvswitch service before
the driverctl is done with the devices.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

2019-04-18 Thread Flavio Leitner via dev

On Thu, Apr 18, 2019 at 12:06:57PM -0600, Alex Williamson wrote:
> On Thu, 18 Apr 2019 13:56:23 -0300
> Flavio Leitner  wrote:
> 
> > On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:
> > > On Thu, 18 Apr 2019 13:23:54 -0300
> > > Flavio Leitner  wrote:
> > >   
> > > > On Thu, Apr 18, 2019 at 11:05:28AM -0400, Aaron Conole wrote:  
> > > > > On some systems, it's possible that the initialization of the misc 
> > > > > chardev
> > > > > associated with /dev/vfio/vfio is delayed.  This happens on machines 
> > > > > with
> > > > > large numbers of cores (at least 88+).  If this delay exceeds the time
> > > > > required for ovs-vswitchd to call the dpdk initialization routine, the
> > > > > permissions won't be updated and the open call will return EACCES.
> > > > > 
> > > > > To fix this, we explicitly allow global open.  This means any user may
> > > > > open() the vfio device before the vfio modules are initialized, thus
> > > > > triggering a module load.  The applications (including ovs-vswitchd)
> > > > > would be pended while the module loads.  This should be safe, as any 
> > > > > user
> > > > > may open the vfio device once the module is loaded anyway, since the.
> > > > > module rewrites the permissions as 0666.
> > > > > 
> > > > > Signed-off-by: Aaron Conole 
> > > > > ---
> > > > >  rhel/usr_lib_udev_rules.d_91-vfio.rules | 1 +
> > > > >  1 file changed, 1 insertion(+)
> > > > > 
> > > > > diff --git a/rhel/usr_lib_udev_rules.d_91-vfio.rules 
> > > > > b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > > > index 8e34b2a2b..c0504ab5a 100644
> > > > > --- a/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > > > +++ b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > > > @@ -1 +1,2 @@
> > > > >  ACTION=="add", SUBSYSTEM=="vfio*", GROUP="hugetlbfs", MODE="0660"
> > > > > +ACTION=="add", SUBSYSTEM=="misc", KERNEL=="vfio", MODE="0666"
> > > > 
> > > > Wouldn't be better to fix this at the module to fix the permissions
> > > > before whatever is taking long to initialize?  
> > > 
> > > The stronger security policy is to not allow unprivileged users to
> > > trigger an operation which loads a module, so no, upper level system
> > > policy should relax that as necessary, but not the module.  VFIO is
> > > unavailable until enabled through privileged entities, so there's some
> > > degree that ovs is "jumping the gun" in making use of it, afaict.  
> > 
> > Ok, but if the machine has few cores, then the permission is set to
> > 0666 and I assumed that it was done by the module itself. Assuming
> > that it's true, then why can't the module fix the permission before?
> 
> Because if the file were initially mode 0666 then an unprivileged user
> accessing the file will load the module, which as discussed above is a
> weaker security policy. 

Agreed, but I was referring to after the proper trigger.

> Any case of the permissions being "fixed" by
> the module load, directed by a privileged entity, asynchronous to ovs
> using this interface is a race, no matter how early the module manages
> to fix those permissions.

True.

> > Another thing is that when the module is ready and the event is sent
> > out, what holds OVS for not trying to open and get EACCESS before
> > udev is triggered to fix the device permission?
> 
> If there were a race, could ovs ever run before udev on system
> startup?  Probably not.

It does wait, but only for the udev to settle, which means if the
module has not triggered an event until that time, OVS will not wait
and we still have a race.

>  Ideally perhaps a cleaner solution might be an
> explicit dependency on the vfio module specific to ovs startup rather
> than changing a system policy, but it really depends on the context and
> use cases.  Thanks,

It does have. The driverctl will bind the devices to vfio-pci but
the problem is that which signal we should rely on to know when
the vfio module is still initializing, or failed or finished.

BTW, this is not specific to OVS because any DPDK based app using
VFIO would face the issue.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

2019-04-18 Thread Flavio Leitner via dev

On Thu, Apr 18, 2019 at 10:43:11AM -0600, Alex Williamson wrote:
> On Thu, 18 Apr 2019 13:23:54 -0300
> Flavio Leitner  wrote:
> 
> > On Thu, Apr 18, 2019 at 11:05:28AM -0400, Aaron Conole wrote:
> > > On some systems, it's possible that the initialization of the misc chardev
> > > associated with /dev/vfio/vfio is delayed.  This happens on machines with
> > > large numbers of cores (at least 88+).  If this delay exceeds the time
> > > required for ovs-vswitchd to call the dpdk initialization routine, the
> > > permissions won't be updated and the open call will return EACCES.
> > > 
> > > To fix this, we explicitly allow global open.  This means any user may
> > > open() the vfio device before the vfio modules are initialized, thus
> > > triggering a module load.  The applications (including ovs-vswitchd)
> > > would be pended while the module loads.  This should be safe, as any user
> > > may open the vfio device once the module is loaded anyway, since the.
> > > module rewrites the permissions as 0666.
> > > 
> > > Signed-off-by: Aaron Conole 
> > > ---
> > >  rhel/usr_lib_udev_rules.d_91-vfio.rules | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/rhel/usr_lib_udev_rules.d_91-vfio.rules 
> > > b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > index 8e34b2a2b..c0504ab5a 100644
> > > --- a/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > +++ b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> > > @@ -1 +1,2 @@
> > >  ACTION=="add", SUBSYSTEM=="vfio*", GROUP="hugetlbfs", MODE="0660"
> > > +ACTION=="add", SUBSYSTEM=="misc", KERNEL=="vfio", MODE="0666"  
> > 
> > Wouldn't be better to fix this at the module to fix the permissions
> > before whatever is taking long to initialize?
> 
> The stronger security policy is to not allow unprivileged users to
> trigger an operation which loads a module, so no, upper level system
> policy should relax that as necessary, but not the module.  VFIO is
> unavailable until enabled through privileged entities, so there's some
> degree that ovs is "jumping the gun" in making use of it, afaict.

Ok, but if the machine has few cores, then the permission is set to
0666 and I assumed that it was done by the module itself. Assuming
that it's true, then why can't the module fix the permission before?

Another thing is that when the module is ready and the event is sent
out, what holds OVS for not trying to open and get EACCESS before
udev is triggered to fix the device permission?

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] rhel: update udev rules to allow vfio access

2019-04-18 Thread Flavio Leitner via dev

On Thu, Apr 18, 2019 at 11:05:28AM -0400, Aaron Conole wrote:
> On some systems, it's possible that the initialization of the misc chardev
> associated with /dev/vfio/vfio is delayed.  This happens on machines with
> large numbers of cores (at least 88+).  If this delay exceeds the time
> required for ovs-vswitchd to call the dpdk initialization routine, the
> permissions won't be updated and the open call will return EACCES.
> 
> To fix this, we explicitly allow global open.  This means any user may
> open() the vfio device before the vfio modules are initialized, thus
> triggering a module load.  The applications (including ovs-vswitchd)
> would be pended while the module loads.  This should be safe, as any user
> may open the vfio device once the module is loaded anyway, since the.
> module rewrites the permissions as 0666.
> 
> Signed-off-by: Aaron Conole 
> ---
>  rhel/usr_lib_udev_rules.d_91-vfio.rules | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/rhel/usr_lib_udev_rules.d_91-vfio.rules 
> b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> index 8e34b2a2b..c0504ab5a 100644
> --- a/rhel/usr_lib_udev_rules.d_91-vfio.rules
> +++ b/rhel/usr_lib_udev_rules.d_91-vfio.rules
> @@ -1 +1,2 @@
>  ACTION=="add", SUBSYSTEM=="vfio*", GROUP="hugetlbfs", MODE="0660"
> +ACTION=="add", SUBSYSTEM=="misc", KERNEL=="vfio", MODE="0666"

Wouldn't be better to fix this at the module to fix the permissions
before whatever is taking long to initialize?

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVN/OVS Split POC: version 2

2019-04-18 Thread Flavio Leitner via dev

On Wed, Apr 17, 2019 at 04:35:05PM -0700, Ben Pfaff wrote:
> On Wed, Apr 17, 2019 at 05:07:34PM -0300, Flavio Leitner via dev wrote:
> > On Tue, Mar 26, 2019 at 04:15:07PM -0400, Mark Michelson wrote:
> > > I've once again rolled another OVN/OVS split version. It can be found at
> > > https://github.com/putnopvut/ovn_mk2.git
> > > 
> > > The main changes between this and the old split POC are as follows:
> > > 
> > > * This is based on a much newer build of OVS master. Therefore, build 
> > > errors
> > > people had with dhparams.c *should* be cleared up.
> > > 
> > > * This fixes errors with manpages.mk generation/checking, so there is no
> > > need to do a pointless `touch` of manpages.mk during the build process.
> > > 
> > > * In many cases, rather than hard-coding paths to OVS, we use variables.
> > > This isn't universally applied, but it's used for the locations of C
> > > headers, libraries, and manpages.
> > > 
> > > Please give this a look and let me know what you think.
> > 
> > I see OVS binaries, headers, man-pages and libraries being built
> > and installed. Not sure what the plans are, but this could cause
> > some confusion to users once they start using the repo.
> 
> It's good to fix this issue, and anything else we find before we do the
> split, but I want to avoid the idea that it has to be perfect before we
> pull the trigger.  There will be problems.  We'll fix them.

Sure, I agree. I brought up some issues, but I am not opposing to pull
the trigger.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] OVN/OVS Split POC: version 2

2019-04-17 Thread Flavio Leitner via dev

On Tue, Mar 26, 2019 at 04:15:07PM -0400, Mark Michelson wrote:
> I've once again rolled another OVN/OVS split version. It can be found at
> https://github.com/putnopvut/ovn_mk2.git
> 
> The main changes between this and the old split POC are as follows:
> 
> * This is based on a much newer build of OVS master. Therefore, build errors
> people had with dhparams.c *should* be cleared up.
> 
> * This fixes errors with manpages.mk generation/checking, so there is no
> need to do a pointless `touch` of manpages.mk during the build process.
> 
> * In many cases, rather than hard-coding paths to OVS, we use variables.
> This isn't universally applied, but it's used for the locations of C
> headers, libraries, and manpages.
> 
> Please give this a look and let me know what you think.

I see OVS binaries, headers, man-pages and libraries being built
and installed. Not sure what the plans are, but this could cause
some confusion to users once they start using the repo.

Other than that, it builds. It does without openvswitch-devel
package installed or direct install (unpackaged) of OVS, which\
brings my point for the previous iteration back. It's okay to do
later in my opinion, but somehow this will need to be cleaned up
or you might end up with two OVSs to maintain. Also that it
becomes less clear of which fixes we need to do in OVS itself
to allow OVN to be build.

The Makefile target rpm-fedora should build OVN packages, and
there should be no rpm-fedora-kmod. The rpm-fedora-ovn becomes
the rpm-fedora as already pointed.

The make rpm-fedora-ovn works though, super cool!

Thanks Mark!
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next 2/8] netfilter: add API to manage NAT helpers.

2019-04-11 Thread Flavio Leitner via dev

On Mon, Apr 01, 2019 at 12:10:32AM +0200, Pablo Neira Ayuso wrote:
> On Tue, Mar 26, 2019 at 05:57:09PM -0300, Flavio Leitner wrote:
> > The API allows a conntrack helper to indicate its corresponding
> > NAT helper which then can be loaded and reference counted.
> > 
> > Signed-off-by: Flavio Leitner 
> > ---
> >  include/net/netfilter/nf_conntrack_helper.h |  19 +++-
> >  net/netfilter/nf_conntrack_amanda.c |   2 +
> >  net/netfilter/nf_conntrack_ftp.c|   6 +-
> >  net/netfilter/nf_conntrack_helper.c | 108 +++-
> >  net/netfilter/nf_conntrack_irc.c|   3 +-
> >  net/netfilter/nf_conntrack_sane.c   |   4 +-
> >  net/netfilter/nf_conntrack_sip.c|  12 ++-
> >  net/netfilter/nf_conntrack_tftp.c   |   6 +-
> >  8 files changed, 147 insertions(+), 13 deletions(-)
> > 
> > diff --git a/include/net/netfilter/nf_conntrack_helper.h 
> > b/include/net/netfilter/nf_conntrack_helper.h
> > index e86fadf7e7c5..0d36d6bfb522 100644
> > --- a/include/net/netfilter/nf_conntrack_helper.h
> > +++ b/include/net/netfilter/nf_conntrack_helper.h
> > @@ -58,6 +58,8 @@ struct nf_conntrack_helper {
> > unsigned int queue_num;
> > /* length of userspace private data stored in nf_conn_help->data */
> > u16 data_len;
> > +   /* name of NAT helper module */
> > +   char nat_mod_name[NF_CT_HELPER_NAME_LEN];
> >  };
> >  
> >  /* Must be kept in sync with the classes defined by helpers */
> > @@ -98,7 +100,8 @@ void nf_ct_helper_init(struct nf_conntrack_helper 
> > *helper,
> >enum ip_conntrack_info ctinfo),
> >int (*from_nlattr)(struct nlattr *attr,
> >   struct nf_conn *ct),
> > -  struct module *module);
> > +  struct module *module,
> > +  const char *nat_mod_name);
> >  
> >  int nf_conntrack_helper_register(struct nf_conntrack_helper *);
> >  void nf_conntrack_helper_unregister(struct nf_conntrack_helper *);
> > @@ -157,4 +160,18 @@ nf_ct_helper_expectfn_find_by_symbol(const void 
> > *symbol);
> >  extern struct hlist_head *nf_ct_helper_hash;
> >  extern unsigned int nf_ct_helper_hsize;
> >  
> > +struct nf_conntrack_helper_nat {
> > +   struct list_head list;
> > +   char name[NF_CT_HELPER_NAME_LEN];
> > +   struct module *module;  /* pointer to self */
> > +};
> > +
> > +void nf_ct_helper_nat_init(struct nf_conntrack_helper_nat *nat,
> > +  const char *name, struct module *module);
> 
> Instead of this nf_ct_helper_nat_init() runtime initializer, define
> the structure in C99 as static in the NAT helper module?
> 
> Telling this because we can probably also extend this structure to
> remove the RCU hook between ct helper and nat helper at some point
> through this new definition.

Sounds good, let me try that.


> > +void nf_conntrack_helper_nat_register(struct nf_conntrack_helper_nat *nat);
> 
> Shorter name suggestion:
> 
> nf_nat_helper_register()
> 
> > +void nf_conntrack_helper_nat_unregister(struct nf_conntrack_helper_nat 
> > *nat);
> 
> nf_nat_helper_unregister()
> 
> > +int nf_conntrack_helper_nat_try_module_get(const char *name, u16 l3num,
> > +  u8 protonum);
> 
> nf_nat_helper_try_module_get()
> 
> > +void nf_conntrack_helper_nat_put(struct nf_conntrack_helper *helper);
> 
> nf_nat_helper_nat_put()

Ok to all the above.

Thanks,
fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH net-next 1/8] netfilter: use macros to create module aliases.

2019-04-11 Thread Flavio Leitner via dev

On Mon, Apr 01, 2019 at 12:07:43AM +0200, Pablo Neira Ayuso wrote:
> On Tue, Mar 26, 2019 at 05:57:08PM -0300, Flavio Leitner wrote:
> > Each NAT helper creates a module alias which follows a pattern.
> > Use macros for consistency.
> > 
> > Signed-off-by: Flavio Leitner 
> > ---
> >  include/net/netfilter/nf_conntrack_helper.h | 4 
> >  net/ipv4/netfilter/nf_nat_h323.c| 2 +-
> >  net/ipv4/netfilter/nf_nat_pptp.c| 2 +-
> >  net/netfilter/nf_nat_amanda.c   | 2 +-
> >  net/netfilter/nf_nat_ftp.c  | 2 +-
> >  net/netfilter/nf_nat_irc.c  | 2 +-
> >  net/netfilter/nf_nat_sip.c  | 2 +-
> >  net/netfilter/nf_nat_tftp.c | 2 +-
> >  8 files changed, 11 insertions(+), 7 deletions(-)
> > 
> > diff --git a/include/net/netfilter/nf_conntrack_helper.h 
> > b/include/net/netfilter/nf_conntrack_helper.h
> > index ec52a8dc32fd..e86fadf7e7c5 100644
> > --- a/include/net/netfilter/nf_conntrack_helper.h
> > +++ b/include/net/netfilter/nf_conntrack_helper.h
> > @@ -15,6 +15,10 @@
> >  #include 
> >  #include 
> >  
> > +#define NF_CT_NAT_HELPER_MOD_NAME(name)"ip_nat_" name
> 
> I'd suggest a rename from NF_CT_NAT_HELPER_MOD_NAME() to
> NF_NAT_HELPER_NAME().

OK.


> Please, also use "nf_nat_" prefix instead, "ip_nat" is legacy stuff.

I don't think we can, because people might be using
the current alias which is ip_nat_something. This patch
is just making it more robust.

> > +#define MODULE_ALIAS_NFCT_HELPER_NAT(name) \
> > +   MODULE_ALIAS(NF_CT_NAT_HELPER_MOD_NAME(name))
> 
> Probably:
> 
> MODULE_ALIAS_NF_NAT_HELPER
> 
> instead of MODULE_ALIAS_NFCT_HELPER_NAT.

OK.

fbl
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Re: [ovs-dev] [PATCH] datapath: Revert "datapath: Fix template leak in error cases."

2019-04-03 Thread Flavio Leitner via dev

On Wed, Apr 03, 2019 at 09:49:13AM -0700, Yifeng Sun wrote:
> From: Flavio Leitner 
> 
> Upstream commit:
> commit 7f6d6558ae44bc193eb28df3617c364d3bb6df39
> Author: Flavio Leitner 
> Date:   Fri Sep 28 14:55:34 2018 -0300
> 
> Revert "openvswitch: Fix template leak in error cases."
> This reverts commit 90c7afc.
> 
> When the commit was merged, the code used nf_ct_put() to free
> the entry, but later on commit 7664423 ("openvswitch: Free
> tmpl with tmpl_free.") replaced that with nf_ct_tmpl_free which
> is a more appropriate. Now the original problem is removed.
> 
> Then 44d6e2f ("net: Replace NF_CT_ASSERT() with WARN_ON().")
> replaced a debug assert with a WARN_ON() which is trigged now.
> 
> Signed-off-by: Flavio Leitner 
> Acked-by: Joe Stringer 
> Signed-off-by: David S. Miller 
> 
> This patch backports this upstream patch to OVS.
> 
> Cc: Flavio Leitner 
> Signed-off-by: Yifeng Sun 
> ---

LGTM
Acked-by: Flavio Leitner 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

42 matches

Mail list logo