Re: [ovs-dev] [PATCHv2] tunnel-tests: Add test to match tunnel traffic.

2017-05-10 Thread William Tu
On Wed, May 10, 2017 at 3:07 PM, Darrell Ball  wrote:
> fyi:
>
> I submitted an updated patch here
>  https://patchwork.ozlabs.org/patch/760876/
>
> I removed the ack since there were significant changes from the
> original patch.
>
>
Hi Darrell,

Thanks for the tunnel v3 test patch. Looks good to me.
William


[ovs-dev] [RFC 2/2] doc: Convert ovs-vswitchd to rST

2017-05-10 Thread Stephen Finucane
This is the largest man page by some distance. There are a couple of
changes needed to make this work:

- Combining italics and bold without spaces (i.e. hello-world) and
  nesting italics and bold (i.e. hello-world) is not supported by rST.
  Where present, this has been removed.

- Duplicated opts are not possible. Where before we would have had a
  list like so:

      -vPATTERN:destination:pattern
          Description

      -vFACILITY:facility
          Description

  We now have:

      -v

          PATTERN:
              Description

          FACILITY:
              Description

  This is necessary if we want to make best use of the Sphinx indexing.

- TODO

Signed-off-by: Stephen Finucane 
---
This patch isn't complete, but I wanted to see what people thought of
the formatting for more complex commands. There are a couple of issues
with this that I still need to work through:

- There's a big loss of formatting for subcommands. While I don't think
  the bit of loss in the option descriptions is _that_ big a deal, the
  subcommands are longer and could benefit from the formatting more. I
  could probably fix it in Sphinx internally for Sphinx 1.6, but that'd
  take a year or two to propagate to distros.

- I'm not sure if there's a way to use this kind of macro:

\*(DX\fBadd\-dp \fIdp\fR [\fInetdev\fR[\fB,\fIoption\fR]...]

  which is called like so:

.ds DX \fBdpctl/\fR
.de DO
\\$2 \\$1 \\$3

  and yields:

dpctl/add-dp dp [netdev[,option]...]

  This will probably result in some duplication, or else a _lot_ of 'inc'
  files.

- I'm not sure how we should pass in autoconf (?) variables like
  @RUNDIR@. Could we do this in Python or something?

- Some other stuff that I've forgotten

In any case though, it should give folks a rough idea of what a larger man
page in rST/Sphinx will actually look like.
---
 Documentation/conf.py  |   2 +
 Documentation/ref/common.inc   |   7 +
 Documentation/ref/coverage-unixctl.inc |  16 ++
 Documentation/ref/daemon.inc   |  85 ++
 Documentation/ref/dpctl.inc| 199 ++
 Documentation/ref/index.rst|   1 +
 Documentation/ref/memory-unixctl.inc   |  10 +
 Documentation/ref/ofproto-dpif-unixctl.inc |  38 +++
 Documentation/ref/ofproto-tnl-unixctl.inc  |  40 +++
 Documentation/ref/ofproto-unixctl.inc  | 162 +++
 Documentation/ref/ovs-vswitchd.8.rst   | 414 +
 Documentation/ref/remote-active.inc|  22 ++
 Documentation/ref/remote-passive.inc   |  28 ++
 Documentation/ref/service.inc  |  14 +
 Documentation/ref/ssl-bootstrap.inc|  22 ++
 Documentation/ref/ssl.inc  |  33 +++
 Documentation/ref/unixctl.inc  |  16 ++
 Documentation/ref/vlog-unixctl.inc |  79 ++
 Documentation/ref/vlog.inc |  88 ++
 19 files changed, 1276 insertions(+)
 create mode 100644 Documentation/ref/common.inc
 create mode 100644 Documentation/ref/coverage-unixctl.inc
 create mode 100644 Documentation/ref/daemon.inc
 create mode 100644 Documentation/ref/dpctl.inc
 create mode 100644 Documentation/ref/memory-unixctl.inc
 create mode 100644 Documentation/ref/ofproto-dpif-unixctl.inc
 create mode 100644 Documentation/ref/ofproto-tnl-unixctl.inc
 create mode 100644 Documentation/ref/ofproto-unixctl.inc
 create mode 100644 Documentation/ref/ovs-vswitchd.8.rst
 create mode 100644 Documentation/ref/remote-active.inc
 create mode 100644 Documentation/ref/remote-passive.inc
 create mode 100644 Documentation/ref/service.inc
 create mode 100644 Documentation/ref/ssl-bootstrap.inc
 create mode 100644 Documentation/ref/ssl.inc
 create mode 100644 Documentation/ref/unixctl.inc
 create mode 100644 Documentation/ref/vlog-unixctl.inc
 create mode 100644 Documentation/ref/vlog.inc

diff --git a/Documentation/conf.py b/Documentation/conf.py
index d70ee6b..174cc7e 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -113,6 +113,8 @@ html_static_path = ['_static']
 # One entry per manual page. List of tuples
 # (source start file, name, description, authors, manual section).
 _man_pages = [
+('ovs-vswitchd.8',
+ u'Open vSwitch daemon'),
 ('ovs-test.8',
  u'Check Linux drivers for performance, vlan and L3 tunneling problems'),
 ('ovs-vlan-test.8',
diff --git a/Documentation/ref/common.inc b/Documentation/ref/common.inc
new file mode 100644
index 000..321b331
--- /dev/null
+++ b/Documentation/ref/common.inc
@@ -0,0 +1,7 @@
+.. option:: -h, --help
+
+    Prints a brief help message to the console.
+
+.. option:: -V, --version
+
+    Prints version information to the console.
diff --git a/Documentation/ref/coverage-unixctl.inc 
b/Documentation/ref/coverage-unixctl.inc
new file mode 100644
index 000..90563e9
--- /dev/null
+++ b/Documentation/ref/coverage-unixctl.inc
@@ -0,0 +1,16 @@
+Coverage Commands
+-----------------
+
+These commands manage 

Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread Andy Zhou
On Wed, May 10, 2017 at 2:30 PM, Joe Stringer  wrote:
> On 10 May 2017 at 12:59, Andy Zhou  wrote:
>> On Wed, May 10, 2017 at 7:56 AM, William Tu  wrote:
>>>> It may be cleaner if we add a new trunc action for the datapath, say
>>>> trunc2  that applies
>>>> to all outputs within the clone.
>>>>
>>>> So the translation will look like: clone(trunc2, native tunnel
>>>> translation). Would this
>>>> approach work?

>>>
>>> Or how about we apply actual packet truncation when clone action
>>> follows truncate action?
>>> Now we apply actual packet truncation when:
>>> actions=trunc, output
>>> actions=trunc, tunnel_push
>>> actions=trunc, sample
>>
>>>
>>> If we add clone as another truncate target, then
>>> actions = trunc(100), clone(tnl(...)),  actionX,
>>> Inside clone will see packet of size 100, and actionX sees original
>>> size. Then I think we don't need to introduce trunc2?
>>
>> This is a reasonable approach. Thanks for the suggestion.
>>
>> Picking up the topic of trunc on patch port.
>>
>> Instead of banning trunc output to a patch port, any down side of
>> translating that
>> to trunc, clone()? After all, native tunneling
>> looks a lot like patch port conceptually.
>
> How does clone() interact with trunc() in the datapath?
>
> As I understand, before clone() existed in datapath (at least linux
> side), the trunc() would only apply to the next output().
> Conceptually, if trunc(),clone(...) applies the truncation for the
> duration of the clone() then that sounds fine.

That's the proposal on the table. Conceptually, we just treat clone
as another output.

> I'm a little concerned
> that a really small truncate value should actually affect parsing of
> packet headers for subsequent tables/lookups though.

Isn't this the inherent problem with trunc? trunc causes a packet
to become a partial packet, which may cause subsequent actions
to fail.

> All of this is a
> lot simpler if it's tied directly to the output, but now that we're
> looking at handling the translation over another bridge without
> re-extracting the packet it gets more complex.

What is the additional complexity of handling patch ports
over that of native tunneling?


Re: [ovs-dev] ovn: SFC Patch V3

2017-05-10 Thread Mickey Spiegel
Three issues before diving in:


1. Placement of S_SWITCH_IN_CHAIN

For some reason I thought S_SWITCH_IN_CHAIN was after all the stateful
processing, but now I see that it is table 7. That breaks ACLs and other
stateful processing, since REGBIT_CONNTRACK_COMMIT is set in
S_SWITCH_IN_ACL and matched in S_SWITCH_IN_STATEFUL.

S_SWITCH_IN_CHAIN should instead be table 10. The comments below are
written assuming this change.


2. Ingress pipeline needs to be expanded to more than 16 tables

DNS went in since the v3 patch and used up the last of the 16 ingress
tables. If you rebase, you will see that there is no space in the ingress
pipeline for the addition of S_SWITCH_IN_CHAIN. Someone (not me) needs to
expand the ingress pipeline to more than 16 stages before you can proceed.


3. While writing this response, I paid a little more attention to the
"exit-lport" direction and noticed a couple of significant issues.

a. If a packet goes from VM A on port 1 to VM B on port 4, with a
logical port chain classifier on port 1 in the "entry-lport" direction and
a logical port chain classifier on port 4 in the "exit-lport" direction,
you will only go down one of the service chains. Since the priorities are
equal, I cannot even tell you which one. Logically I would think that the
packet should go down both service chains, first the port 1 "entry-lport"
service chain and then the port 4 "exit-lport" service chain.

b. This is done in the ingress pipeline, not the egress pipeline, and is
based on matching eth.dst. This assumes that the forwarding decision will
be based on eth.dst, since you are immediately going down the service
chain, skipping the other ingress pipeline stages, and at the end you go
directly to the egress pipeline with outport based on eth.dst. That is
quite restrictive for a generic forwarding architecture like OVN. I would
think that the right thing to do would be to move the classifier to the
egress pipeline stage, but then I do not know how to avoid loops. When a
packet comes to the egress pipeline stage where the VM resides, there is no
way to tell whether the packet has already gone down the service chain or
not. I guess you could put a S_SWITCH_IN_EGRESS_CHAIN ingress pipeline
stage right after L2_LKUP instead, and match on outport in addition to
eth.dst, but it feels a bit unclean.

On Tue, May 9, 2017 at 4:33 PM, John McDowall <
jmcdow...@paloaltonetworks.com> wrote:

> Mickey,
>
>
>
> Thanks for the review. I need some help understanding a couple of things:
>
>
>
> 1)   In the proposed change, I could see the previous logic where we
> inserted the flow back in the ingress pipeline just after the IN_CHAIN
> stage. The changes you suggest seem to imply that the action is still
> inserted after the IN_CHAIN stage but in the egress (OUT) pipeline. I am
> missing something here – can you give me some more info?
>
Assume you have port 1 to a VM on HV1, port 2 as the input port to a VNF on
HV2, and port 3 as the output port from that same VNF on HV2. The service
chain is just that one VNF, with direction "entry-lport".

The packet progresses as follows:

HV1, ingress pipeline, inport 1
Tables 1 to 9
Table 10 (S_SWITCH_IN_CHAIN) hits priority 100 flow setting outport = 2 and
then going to output immediately

Table 32 sends the packet through a geneve tunnel to HV2

HV2, egress pipeline, outport 2
Tables 48 to 65

After packet gets delivered to the VNF, it comes back

HV2, ingress pipeline, inport 3
Tables 1 to 9

Table 10 (S_SWITCH_IN_CHAIN)

With the code in v3, you are doing a clone with inport = outport (which is
not yet set!),
then progressing to ingress Table 11, still on HV2

Eventually you hit S_SWITCH_IN_L2_LKUP and go directly to the destination
from HV2. If the destination is in the outside world, this just broke
upstream L2 learning.

My suggestion is instead Table 10 hits a priority 150 flow setting outport
= 1 and then going to output immediately.

Table 32 sends the packet through a geneve tunnel to HV1 based on outport
== 1, but it ends up in the egress pipeline

HV1, egress pipeline, outport 1

New Table 0 (S_SWITCH_OUT_SFC_LOOPBACK) hits the entry doing a clone with
inport = outport and jumping to the ingress pipeline Table 11

HV1, ingress pipeline, inport 1
Tables 11 to 15 (continuation from where things left off at the beginning,
when the packet was redirected down the service chain)
...

Since this code does not use NSH, all metadata accumulated in Tables 1 to 9
is lost while going around the service chain, before proceeding through
Tables 11 to 15. At the moment this works, but this becomes a restriction
on future features, if they are to work with SFC. The alternative is to go
back through all of Tables 1 through 15 instead, with
REGBIT_CHAIN_LOOPBACK as in the v2 patch to avoid hitting any entries
in Table 10 (S_SWITCH_IN_CHAIN). With this alternative, all packets go
through Tables 1 to 9 on HV1 twice. We have to make a judgement call 


Re: [ovs-dev] [PATCHv2] tunnel-tests: Add test to match tunnel traffic.

2017-05-10 Thread Darrell Ball
fyi:

I submitted an updated patch here
 https://patchwork.ozlabs.org/patch/760876/

I removed the ack since there were significant changes from the
original patch.




On Wed, May 10, 2017 at 12:29 PM, Darrell Ball  wrote:

> Right now, we have the original breakage backed out.
> I will submit an updated test, merging the proposed changes and some
> others.
> Is that ok with you, William?
>
> On 5/10/17, 12:20 PM, "Joe Stringer"  wrote:
>
> On 8 May 2017 at 18:54, Darrell Ball  wrote:
> > If you add this second incremental, it is explicit that you want to
> match on the outer header
> > as part of this test (to check the present breakage).
> > It also makes it clear which packets are involved and needed in this
> test.
>
> I have no idea what the latest version of this patch is since there
> are several proposed changes flying around without independent full
> patch submission, so I'm just going to assume the person with the
> latest version will submit it for review. I'll mark the one I
> submitted as superseded in patchwork.


Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread Jarno Rajahalme
> 
> On May 10, 2017, at 12:59 PM, Andy Zhou  wrote:
> 
> On Wed, May 10, 2017 at 7:56 AM, William Tu  wrote:
>>> It may be cleaner if we add a new trunc action for the datapath, say
>>> trunc2  that applies
>>> to all outputs within the clone.
>>> 
>>> So the translation will look like: clone(trunc2, native tunnel
>>> translation). Would this
>>> approach work?
>>> 
>> 
>> Or how about we apply actual packet truncation when clone action
>> follows truncate action?
>> Now we apply actual packet truncation when:
>> actions=trunc, output
>> actions=trunc, tunnel_push
>> actions=trunc, sample
> 
>> 
>> If we add clone as another truncate target, then
>> actions = trunc(100), clone(tnl(...)),  actionX,
>> Inside clone will see packet of size 100, and actionX sees original
>> size. Then I think we don't need to introduce trunc2?
> 
> This is a reasonable approach. Thanks for the suggestion.
> 
> Picking up the topic of trunc on patch port.
> 
> Instead of banning trunc output to a patch port, any down side of
> translating that
> to trunc, clone()? After all, native tunneling
> looks a lot like patch port conceptually.
> 

Right, why should truncated OUTPUT to a patch port behave any different from 
any other OUTPUT port?

  Jarno

> 
>> 
>> Regards,
>> William
>> 
 
>>>> Without the "Avoid recirculation" patch we have two datapath flows,
>>>> because the
>>>> packet is recirculated. At the end of the first flow the packet size is
>>>> changed
>>>> and the packet with modified size enters the OF pipeline again.
>>>>
>>>> What is the reason not to change packet size when truncate action is
>>>> applied?
 
>>> 
>>> One of the reasons could be that we introduced trunc before clone. 
>>> Otherwise, a
>>> clone(trunc2, output:x) is equivalent to trunc, output:x.  Note that
>>> the trunc datapath
>>> action is different than other datapath actions, which usually applies
>>> to all following
>>> actions. Native tunneling may be the first use case that motivates
>>> trunc2, which should
>>> have the normal datapath action behavior.
>>> 


Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread Joe Stringer
On 10 May 2017 at 12:59, Andy Zhou  wrote:
> On Wed, May 10, 2017 at 7:56 AM, William Tu  wrote:
>>> It may be cleaner if we add a new trunc action for the datapath, say
>>> trunc2  that applies
>>> to all outputs within the clone.
>>>
>>> So the translation will look like: clone(trunc2, native tunnel
>>> translation). Would this
>>> approach work?
>>>
>>
>> Or how about we apply actual packet truncation when clone action
>> follows truncate action?
>> Now we apply actual packet truncation when:
>> actions=trunc, output
>> actions=trunc, tunnel_push
>> actions=trunc, sample
>
>>
>> If we add clone as another truncate target, then
>> actions = trunc(100), clone(tnl(...)),  actionX,
>> Inside clone will see packet of size 100, and actionX sees original
>> size. Then I think we don't need to introduce trunc2?
>
> This is a reasonable approach. Thanks for the suggestion.
>
> Picking up the topic of trunc on patch port.
>
> Instead of banning trunc output to a patch port, any down side of
> translating that
> to trunc, clone()? After all, native tunneling
> looks a lot like patch port conceptually.

How does clone() interact with trunc() in the datapath?

As I understand, before clone() existed in datapath (at least linux
side), the trunc() would only apply to the next output().
Conceptually, if trunc(),clone(...) applies the truncation for the
duration of the clone() then that sounds fine. I'm a little concerned
that a really small truncate value should actually affect parsing of
packet headers for subsequent tables/lookups though. All of this is a
lot simpler if it's tied directly to the output, but now that we're
looking at handling the translation over another bridge without
re-extracting the packet it gets more complex.


Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread Andy Zhou
On Wed, May 10, 2017 at 7:56 AM, William Tu  wrote:
>> It may be cleaner if we add a new trunc action for the datapath, say
>> trunc2  that applies
>> to all outputs within the clone.
>>
>> So the translation will look like: clone(trunc2, native tunnel
>> translation). Would this
>> approach work?
>>
>
> Or how about we apply actual packet truncation when clone action
> follows truncate action?
> Now we apply actual packet truncation when:
> actions=trunc, output
> actions=trunc, tunnel_push
> actions=trunc, sample

>
> If we add clone as another truncate target, then
> actions = trunc(100), clone(tnl(...)),  actionX,
> Inside clone will see packet of size 100, and actionX sees original
> size. Then I think we don't need to introduce trunc2?

This is a reasonable approach. Thanks for the suggestion.

Picking up the topic of trunc on patch port.

Instead of banning trunc output to a patch port, any down side of
translating that
to trunc, clone()? After all, native tunneling
looks a lot like patch port conceptually.


>
> Regards,
> William
>
>>>
>>> Without the "Avoid recirculation" patch we have two datapath flows, because 
>>> the
>>> packet is recirculated. At the end of the first flow the packet size is 
>>> changed
>>> and the packet with modified size enters the OF pipeline again.
>>>
>>> What is the reason not to change packet size when truncate action is 
>>> applied?
>>>
>>
>> One of the reasons could be that we introduced trunc before clone. 
>> Otherwise, a
>> clone(trunc2, output:x) is equivalent to trunc, output:x.  Note that
>> the trunc datapath
>> action is different than other datapath actions, which usually applies
>> to all following
>> actions. Native tunneling may be the first use case that motivates
>> trunc2, which should
>> have the normal datapath action behavior.
>>


Re: [ovs-dev] [PATCHv2] tunnel-tests: Add test to match tunnel traffic.

2017-05-10 Thread Darrell Ball
Right now, we have the original breakage backed out.
I will submit an updated test, merging the proposed changes and some others.
Is that ok with you, William?

On 5/10/17, 12:20 PM, "Joe Stringer"  wrote:

On 8 May 2017 at 18:54, Darrell Ball  wrote:
> If you add this second incremental, it is explicit that you want to match 
on the outer header
> as part of this test (to check the present breakage).
> It also makes it clear which packets are involved and needed in this test.

I have no idea what the latest version of this patch is since there
are several proposed changes flying around without independent full
patch submission, so I'm just going to assume the person with the
latest version will submit it for review. I'll mark the one I
submitted as superseded in patchwork.


Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread Joe Stringer
On 9 May 2017 at 04:46, Zoltán Balogh  wrote:
> Hi,
>
> Thank you for your comments. Actually, I was wrong in my previous e-mail when 
> I stated that packet is truncated only at the end of the pipeline, when 
> output is applied. The packet size is set according max_len set by truncate 
> when tunnel_push is applied. So the size of packet is correct.
>
The root cause of the problem, why the stats in the underlay bridge are
wrong, is that statistics will be incremented by the packet count and packet
size set by upcall_xlate().
>
> static void
> upcall_xlate(struct udpif *udpif, struct upcall *upcall,
>              struct ofpbuf *odp_actions, struct flow_wildcards *wc)
> {
>     struct dpif_flow_stats stats;
>     struct xlate_in xin;
>
>     stats.n_packets = 1;
>     stats.n_bytes = dp_packet_size(upcall->packet);
>     stats.used = time_msec();
>     stats.tcp_flags = ntohs(upcall->flow->tcp_flags);
>     ...
> }
>
Since we excluded recirculation in the "Avoid recirculation" patch, there is
no second upcall when the packet enters the tunnel, but the stats created by
the "first" and only upcall are used for the statistics of both the
integration and underlay bridges. And that's not correct.

I completed my old patch to solve this problem by adding two new members to
struct rule_dpif and modifying stats with their values.

Hi Zoltan, thanks again for looking into this.

I had trouble trying to review this, in part because I felt like
changes to fix multiple issues are being placed into one patch. If
there are separate issues being addressed, each issue should be fixed
in a separate patch, each with a clear description of the intent of
the change (with links to patches that introduced the breakage if
possible!). If there is refactoring to do, consider separating the
non-functional changes into a separate patch---when looking at the
original patch that started this discussion, I found it difficult to
review post-merge because the refactoring was included and I couldn't
tell if there were actually functional changes.

Given that this is taking a bit longer than I thought and we need to
take a holistic approach to how all of these interactions should work,
I plan to go ahead and apply the revert patch. I look forward to
seeing a fresh series proposal that will bring back the benefits of
the original patch.

More feedback below.

> Here comes the patch:
>
>
> Since tunneling and forwarding via patch port uses clone action, the tunneled

Where does forwarding via patch port use clone action?

> packet will not be parsed by miniflow_extract() and flow data will not be
> updated within clone. So when a tunnel header is added to the packet, the MAC
> and IP data of struct flow will not be updated according to the newly created
> tunnel header.
>
> This patch stores MAC and IP data of flow before calling
> apply_nested_clone_actions() in build_tunnel_send(), then restores the data
> after apply_nested_clone_actions() has returned.
>
> Furthermore, it introduces truncate_packet_size and tunnel_header_size struct
> rule_dpif members in order to correct n_bytes statistics of OF flows.
> ---
>  ofproto/ofproto-dpif-xlate.c | 92 
> 
>  ofproto/ofproto-dpif.c   | 14 ++-
>  ofproto/ofproto-dpif.h   |  5 +++
>  tests/ofproto-dpif.at| 11 +-
>  4 files changed, 120 insertions(+), 2 deletions(-)
>
> diff --git a/ofproto/ofproto-dpif-xlate.c b/ofproto/ofproto-dpif-xlate.c
> index b308f21..55015d7 100644
> --- a/ofproto/ofproto-dpif-xlate.c
> +++ b/ofproto/ofproto-dpif-xlate.c
> @@ -3141,6 +3141,33 @@ tnl_send_arp_request(struct xlate_ctx *ctx, const 
> struct xport *out_dev,
>  dp_packet_uninit(&packet);
>  }
>
> +/*
> + * Copy IP data of 'flow->tunnel' into 'flow' and 'base_flow'.
> + */
> +static void
> +propagate_tunnel_data_to_flow(struct flow *flow, struct flow *base_flow)
> +{
> +flow->nw_dst = flow->tunnel.ip_dst;
> +flow->nw_src = flow->tunnel.ip_src;
> +flow->ipv6_dst = flow->tunnel.ipv6_dst;
> +flow->ipv6_src = flow->tunnel.ipv6_src;
> +
> +flow->nw_tos = flow->tunnel.ip_tos;
> +flow->nw_ttl = flow->tunnel.ip_ttl;
> +flow->tp_dst = flow->tunnel.tp_dst;
> +flow->tp_src = flow->tunnel.tp_src;
> +
> +base_flow->nw_dst = flow->nw_dst;
> +base_flow->nw_src = flow->nw_src;
> +base_flow->ipv6_dst = flow->ipv6_dst;
> +base_flow->ipv6_src = flow->ipv6_src;
> +
> +base_flow->nw_tos = flow->nw_tos;
> +base_flow->nw_ttl = flow->nw_ttl;
> +base_flow->tp_dst = flow->tp_dst;
> +base_flow->tp_src = flow->tp_src;
> +}
> +
>  static int
>  build_tunnel_send(struct xlate_ctx *ctx, const struct xport *xport,
>const struct flow *flow, odp_port_t tunnel_odp_port)
> @@ -3156,6 +3183,11 @@ build_tunnel_send(struct xlate_ctx *ctx, const struct 
> xport *xport,
>  int err;
>  char buf_sip6[INET6_ADDRSTRLEN];
>  char buf_dip6[INET6_ADDRSTRLEN];
> +/* Structures to 

Re: [ovs-dev] [PATCH 0/3] Support OpenFlow 1.5 packet-out

2017-05-10 Thread Yi-Hung Wei
Hi Ben,

I get a few questions when finalizing this patch series. It would be
great if I can have some comments from you before I send the v2 out.

1) In order to support tun_metadata as the pipeline fields in the
packet-out message, it seems like it makes more sense to use
'struct match' than 'struct flow' since 'struct match'
includes 'struct tun_metadata_allocation'?
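
For reference, here is roughly what that definition looks like (a sketch
paraphrased from OVS's lib/match.h from memory, so the member names are
approximate and worth verifying against the tree):

    /* Paraphrased sketch, not authoritative: a struct match couples the
     * field values with their wildcards and with the tunnel metadata
     * allocation that a bare struct flow lacks; the member types are
     * defined in the OVS headers. */
    struct match {
        struct flow flow;                      /* Field values. */
        struct flow_wildcards wc;              /* Which bits of 'flow' matter. */
        struct tun_metadata_allocation tun_md; /* Geneve option mapping. */
    };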

2) Should we include the connection tracking fields as the pipeline
fields in the packet-out message?

Thanks,

-Yi-Hung

On Mon, May 8, 2017 at 1:55 PM, Ben Pfaff  wrote:
> On Mon, May 08, 2017 at 10:24:16AM -0700, Yi-Hung Wei wrote:
>> Hi Ben,
>>
>> Thanks for your valuable review. Yes, it makes sense to use 'struct flow'
>> instead of 'struct match' to represent metadata.
>>
>> As for the "pipeline fields", I briefly looked at ovs-fields(7), and I think
>> the
>> patch series should be updated to include at least the following fields.
>> * Tunnel fields: (tun_src/dst, tun_ipv6_src/dst, tun_gbp_id,
>> tun_gbp_flags, tun_flags, tun_metadata0 - tun_metadata63)
>> * Register fields: (reg0 - reg15, xreg0 - xreg7, xxreg0 - xxreg3)
>>
>> I will address these issues and send out a v2 soon.
>
> OK, thank you!
>
> Maybe there should be an mf_*() function that identifies pipeline
> fields.


[ovs-dev] [PATCH v6] dpif-netdev: Assign ports to pmds on non-local numa node.

2017-05-10 Thread Billy O'Mahony
From: billyom 

Previously if there is no available (non-isolated) pmd on the numa node
for a port then the port is not polled at all. This can result in a
non-operational system until such time as nics are physically
repositioned. It is preferable to operate with a pmd on the 'wrong' numa
node albeit with lower performance. Local pmds are still chosen when
available.

Signed-off-by: Billy O'Mahony 
---
v6: Change 'port' to 'queue' in a warning msg
v5: Fix warning msg; Update same in docs
v4: Fix a checkpatch error
v3: Fix warning messages not appearing when using multiqueue
v2: Add details of warning messages into docs

 Documentation/intro/install/dpdk.rst | 10 +
 lib/dpif-netdev.c| 43 +++-
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index d1c0e65..7a66bff 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -460,6 +460,16 @@ affinitized accordingly.
 pmd thread on a NUMA node is only created if there is at least one DPDK
 interface from that NUMA node added to OVS.
 
+  .. note::
+   On NUMA systems PCI devices are also local to a NUMA node.  Rx queues for
+   a PCI device will be assigned to a pmd on its local NUMA node if
+   pmd-cpu-mask has created a pmd thread on that NUMA node.  If not, the
+   queue will be assigned to a pmd on a remote NUMA node.  This will result
+   in reduced maximum throughput on that device.  In case such a queue
+   assignment is made, a warning message will be logged: "There's no available
+   (non-isolated) pmd thread on numa node N. Queue Q on port P will be
+   assigned to the pmd on core C (numa node N'). Expect reduced performance."
+
 - QEMU vCPU thread Affinity
 
   A VM performing simple packet forwarding or running complex packet pipelines
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b3a0806..34f1963 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr, int 
numa_id)
 }
 
 static void
-rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
+rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
+  int *all_numa_ids, unsigned all_numa_ids_sz,
+  int *num_ids_written)
 {
 struct dp_netdev_pmd_thread *pmd;
 struct rr_numa *numa;
+unsigned idx = 0;
 
 hmap_init(&rr->numas);
 
@@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp, struct 
rr_numa_list *rr)
 numa->n_pmds++;
 numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof *numa->pmds);
 numa->pmds[numa->n_pmds - 1] = pmd;
+
+all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
+idx++;
 }
+*num_ids_written = idx;
 }
 
 static struct dp_netdev_pmd_thread *
@@ -3202,8 +3209,15 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
 {
 struct dp_netdev_port *port;
 struct rr_numa_list rr;
+int all_numa_ids [64];
+int all_numa_ids_sz = sizeof all_numa_ids / sizeof all_numa_ids[0];
+unsigned all_numa_ids_idx = 0;
+int all_numa_ids_max_idx = 0;
+int num_numa_ids = 0;
 
-rr_numa_list_populate(dp, &rr);
+rr_numa_list_populate(dp, &rr, all_numa_ids, all_numa_ids_sz,
+  &num_numa_ids);
+all_numa_ids_max_idx = MIN(num_numa_ids - 1, all_numa_ids_sz - 1);
 
 HMAP_FOR_EACH (port, node, &dp->ports) {
 struct rr_numa *numa;
@@ -3234,10 +3248,29 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
 }
 } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
 if (!numa) {
-VLOG_WARN("There's no available (non isolated) pmd thread "
+if (all_numa_ids_max_idx < 0) {
+VLOG_ERR("There is no available (non-isolated) pmd "
+ "thread for port \'%s\' queue %d. This queue "
+ "will not be polled. Is pmd-cpu-mask set to "
+ "zero? Or are all PMDs isolated to other "
+ "queues?", netdev_get_name(port->netdev),
+ qid);
+continue;
+}
+int alt_numa_id = all_numa_ids[all_numa_ids_idx];
+struct rr_numa *alt_numa;
+alt_numa = rr_numa_list_lookup(&rr, alt_numa_id);
+q->pmd = rr_numa_get_pmd(alt_numa);
+VLOG_WARN("There's no available (non-isolated) pmd thread "
   "on numa node %d. Queue %d on port \'%s\' will "
-  "not be polled.",
-  numa_id, qid, 

Re: [ovs-dev] [PATCH ovs V8 25/26] dpif: Refactor flow logging functions to be used by other modules

2017-05-10 Thread Simon Horman
On Wed, May 10, 2017 at 05:19:26PM +0200, Simon Horman wrote:
> On Wed, May 03, 2017 at 06:08:16PM +0300, Roi Dayan wrote:
> > Signed-off-by: Roi Dayan 
> > Reviewed-by: Paul Blakey 
> 
> Reviewed-by: Simon Horman 

Sorry, I didn't notice before but could you add some text to the changelog?


Re: [ovs-dev] [PATCH ovs V8 26/26] dpif-netlink: Use dpif logging functions

2017-05-10 Thread Simon Horman
On Wed, May 03, 2017 at 06:08:17PM +0300, Roi Dayan wrote:

Please add some changelog text here.

> Signed-off-by: Roi Dayan 
> Reviewed-by: Paul Blakey 
> ---
>  lib/dpif-netlink.c | 55 
> +-
>  1 file changed, 17 insertions(+), 38 deletions(-)
> 
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 048dae6..3638358 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -2113,35 +2113,11 @@ out:
>  return err;
>  }
>  
> -static void
> -dbg_print_flow(const struct nlattr *key, size_t key_len,
> -   const struct nlattr *mask, size_t mask_len,
> -   const struct nlattr *actions, size_t actions_len,
> -   const ovs_u128 *ufid,
> -   const char *op)
> -{
> -struct ds s;
> -
> -ds_init(&s);
> -ds_put_cstr(&s, op);
> -ds_put_cstr(&s, " (");
> -odp_format_ufid(ufid, &s);
> -ds_put_cstr(&s, ")");
> -if (key_len) {
> -ds_put_cstr(&s, "\nflow (verbose): ");
> -odp_flow_format(key, key_len, mask, mask_len, NULL, &s, true);
> -ds_put_cstr(&s, "\nflow: ");
> -odp_flow_format(key, key_len, mask, mask_len, NULL, &s, false);
> -}
> -ds_put_cstr(&s, "\nactions: ");
> -format_odp_actions(&s, actions, actions_len);
> -VLOG_DBG("\n%s", ds_cstr(&s));
> -ds_destroy(&s);
> -}
> -
>  static int
>  try_send_to_netdev(struct dpif_netlink *dpif, struct dpif_op *op)
>  {
> +int err = EOPNOTSUPP;
> +

It is unclear to me how the rearrangement of error handling
is related to the stated goal of using dpif logging functions.

>  switch (op->type) {
>  case DPIF_OP_FLOW_PUT: {
>  struct dpif_flow_put *put = &op->u.flow_put;
> @@ -2149,10 +2125,10 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct 
> dpif_op *op)
>  if (!put->ufid) {
>  break;
>  }
> -dbg_print_flow(put->key, put->key_len, put->mask, put->mask_len,
> -   put->actions, put->actions_len, put->ufid,
> -   (put->flags & DPIF_FP_MODIFY ? "PUT(MODIFY)" : 
> "PUT"));
> -return parse_flow_put(dpif, put);
> +
> +log_flow_put_message(&dpif->dpif, &this_module, put, 0);
> +err = parse_flow_put(dpif, put);
> +break;
>  }
>  case DPIF_OP_FLOW_DEL: {
>  struct dpif_flow_del *del = &op->u.flow_del;
> @@ -2160,10 +2136,11 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct 
> dpif_op *op)
>  if (!del->ufid) {
>  break;
>  }
> -dbg_print_flow(del->key, del->key_len, NULL, 0, NULL, 0,
> -   del->ufid, "DEL");
> -return netdev_ports_flow_del(DPIF_HMAP_KEY(&dpif->dpif), del->ufid,
> - del->stats);
> +
> +log_flow_del_message(&dpif->dpif, &this_module, del, 0);
> +err = netdev_ports_flow_del(DPIF_HMAP_KEY(&dpif->dpif), del->ufid,
> +del->stats);
> +break;
>  }
>  case DPIF_OP_FLOW_GET: {
>  struct dpif_flow_get *get = &op->u.flow_get;
> @@ -2171,15 +2148,17 @@ try_send_to_netdev(struct dpif_netlink *dpif, struct 
> dpif_op *op)
>  if (!op->u.flow_get.ufid) {
>  break;
>  }
> -dbg_print_flow(get->key, get->key_len, NULL, 0, NULL, 0,
> -   get->ufid, "GET");
> -return parse_flow_get(dpif, get);
> +
> +log_flow_get_message(&dpif->dpif, &this_module, get, 0);
> +err = parse_flow_get(dpif, get);
> +break;
>  }
>  case DPIF_OP_EXECUTE:
>  default:
>  break;
>  }
> -return EOPNOTSUPP;
> +
> +return err;
>  }
>  
>  static void
> -- 
> 2.7.4
> 


Re: [ovs-dev] [PATCH ovs V8 25/26] dpif: Refactor flow logging functions to be used by other modules

2017-05-10 Thread Simon Horman
On Wed, May 03, 2017 at 06:08:16PM +0300, Roi Dayan wrote:
> Signed-off-by: Roi Dayan 
> Reviewed-by: Paul Blakey 

Reviewed-by: Simon Horman 


Re: [ovs-dev] [BUG] ovs-ofctl version 2.5.0 will crash with OFPFMFC_BAD_COMMAND

2017-05-10 Thread Zang MingJie

Confirmed, it can be easily reproduced using the described method.

Using ovs 2.6.2

On 01/23/2017 11:58 PM, Vidyasagara Guntaka via dev wrote:

Hi Ben,

We could reproduce this with the latest version 2.6.1.  When we compiled the 
code, we removed -O2 from CFLAGS.  This seems to make it happen more 
frequently.  With the following script running, the error starts happening 
within a few seconds and then continues to happen after every few seconds.  In 
summary, our suspicion is that having no controller set and having no NORMAL 
processing flow seems to trigger the stack that was pointed out in the gdb 
session more often and hence we hit this race condition very easily using the 
following script.  (Even if there is a default NORMAL processing flow entry, 
after the first iteration of the script below, that will be deleted).

Also, a few things about the setup - just in case:
  * enp5s0 belongs to the physical interface on this hypervisor.
  * vport0 and vport1 belong to tap interfaces corresponding to two VMs running 
on this hypervisor.
  * When the script was running, we had pings issued from the VMs so that 
packets make it to the bridge br0.

Here is a small script that makes it happen on multiple of our hypervisors:

while true
do
ovs-ofctl add-flow br0 
"priority=200,table=123,idle_timeout=1,in_port=1,actions=controller"
ovs-ofctl add-flow br0 
"priority=200,table=123,idle_timeout=1,in_port=2,actions=controller"
ovs-ofctl add-flow br0 
"priority=200,table=123,idle_timeout=1,in_port=3,actions=controller"
ovs-ofctl add-flow br0 
"priority=200,table=123,idle_timeout=1,in_port=4,actions=controller"
ovs-ofctl del-flows br0
done

Here is our bridge br0 setup:

[root@deepspace ~]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
inet 192.168.2.142  netmask 255.255.255.0  broadcast 192.168.2.255
inet6 fe80::213:3bff:fe0f:1301  prefixlen 64  scopeid 0x20<link>
ether 00:13:3b:0f:13:01  txqueuelen 1000  (Ethernet)
RX packets 89417814  bytes 12088012200 (11.2 GiB)
RX errors 0  dropped 82  overruns 0  frame 0
TX packets 32330647  bytes 3168352394 (2.9 GiB)
TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@deepspace ~]# ovs-vsctl show
54f89e00-edd2-486e-9626-6d11c7d8b0b6
Bridge "br0"
Port "vport1"
Interface "vport1"
Port "br0"
Interface "br0"
type: internal
Port vtep
Interface vtep
type: vxlan
options: {key=flow, remote_ip="192.168.1.141"}
Port "vport0"
Interface "vport0"
Port "enp5s0"
Interface "enp5s0"
ovs_version: “2.6.1"

[root@deepspace ~]# ovs-ofctl show br0
OFPT_FEATURES_REPLY (xid=0x2): dpid:00133b0f1301
n_tables:254, n_buffers:256
capabilities: FLOW_STATS TABLE_STATS PORT_STATS QUEUE_STATS ARP_MATCH_IP
actions: output enqueue set_vlan_vid set_vlan_pcp strip_vlan mod_dl_src 
mod_dl_dst mod_nw_src mod_nw_dst mod_nw_tos mod_tp_src mod_tp_dst
 1(enp5s0): addr:00:13:3b:0f:13:01
 config: 0
 state:  0
 current:1GB-FD AUTO_NEG
 advertised: 10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-HD 1GB-FD COPPER 
AUTO_NEG AUTO_PAUSE AUTO_PAUSE_ASYM
 supported:  10MB-HD 10MB-FD 100MB-HD 100MB-FD 1GB-HD 1GB-FD COPPER AUTO_NEG
 speed: 1000 Mbps now, 1000 Mbps max
 2(vport0): addr:fe:00:00:00:00:03
 config: 0
 state:  0
 current:10MB-FD COPPER
 speed: 10 Mbps now, 0 Mbps max
 3(vport1): addr:fe:00:00:00:00:04
 config: 0
 state:  0
 current:10MB-FD COPPER
 speed: 10 Mbps now, 0 Mbps max
 4(vtep): addr:aa:97:d2:a9:19:ed
 config: 0
 state:  0
 speed: 0 Mbps now, 0 Mbps max
 LOCAL(br0): addr:00:13:3b:0f:13:01
 config: 0
 state:  0
 speed: 0 Mbps now, 0 Mbps max
OFPT_GET_CONFIG_REPLY (xid=0x4): frags=normal miss_send_len=0

[root@deepspace ~]# ovs-ofctl --version
ovs-ofctl (Open vSwitch) 2.6.1
OpenFlow versions 0x1:0x4

Please let us know if you need anything else to reproduce this.

Thanks,
Sagar.


On Jan 18, 2017, at 1:19 PM, Ben Pfaff  wrote:

If you can come up with simple reproduction instructions that work for
me, I'm happy to track this down.  It's probably something very simple.

On Tue, Jan 17, 2017 at 08:50:20AM -0800, Vidyasagara Guntaka wrote:

This issue happened on our in-use systems and we were trying to find a way
to move forward avoiding this issue so that we do not have to upgrade OVS
on thousands of our hypervisors causing down time. Our debugging did help
us avoid the issue for now by installing an explicit rule to drop
packets when there is no match and this issue is not seen over many hours
of test runs.

We will definitely run this test with latest version.  But, will need more
time since we are busy with our release related activities.

Regards,
Sagar.

On Tue, Jan 17, 2017 at 8:42 AM, Ben Pfaff 

Re: [ovs-dev] [PATCH v5] tunneling: Avoid recirculation on datapath by computing the recirculate actions at translate time.

2017-05-10 Thread William Tu
> It may be cleaner if we add a new trunc action for the datapath, say
> trunc2  that applies
> to all outputs within the clone.
>
> So the translation will look like: clone(trunc2, native tunnel
> translation). Would this
> approach work?
>

Or how about we apply actual packet truncation when clone action
follows truncate action?
Now we apply actual packet truncation when:
actions=trunc, output
actions=trunc, tunnel_push
actions=trunc, sample

If we add clone as another truncate target, then
actions = trunc(100), clone(tnl(...)),  actionX,
Inside clone will see packet of size 100, and actionX sees original
size. Then I think we don't need to introduce trunc2?
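
To make the intended semantics concrete, here is a tiny self-contained C
sketch (illustrative only: struct pkt, do_clone() and do_output() are
made-up names, not the OVS datapath code):

    #include <stdio.h>

    struct pkt { size_t size; };

    /* An output transmits at most trunc_len bytes while a truncation is
     * pending. */
    static void do_output(const struct pkt *p, size_t trunc_len,
                          const char *port)
    {
        size_t wire = (trunc_len && trunc_len < p->size) ? trunc_len : p->size;
        printf("output to %s: %zu bytes\n", port, wire);
    }

    /* Treating clone as a truncate target: outputs inside the clone see
     * the truncated size, and the truncation is consumed by the clone so
     * later actions see the original packet. */
    static void do_clone(const struct pkt *p, size_t *trunc_len)
    {
        struct pkt copy = *p;                 /* nested actions get a copy */
        do_output(&copy, *trunc_len, "tnl");  /* tunnel push + output */
        *trunc_len = 0;
    }

    int main(void)
    {
        struct pkt p = { 1500 };
        size_t trunc_len = 100;  /* actions=trunc(100),clone(tnl(...)),actionX */

        do_clone(&p, &trunc_len);            /* inside the clone: 100 bytes */
        do_output(&p, trunc_len, "actionX"); /* actionX: 1500 bytes */
        return 0;
    }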

Regards,
William

>>
>> Without the "Avoid recirculation" patch we have two datapath flows, because 
>> the
>> packet is recirculated. At the end of the first flow the packet size is 
>> changed
>> and the packet with modified size enters the OF pipeline again.
>>
>> What is the reason not to change packet size when truncate action is applied?
>>
>
> One of the reasons could be that we introduced trunc before clone. Otherwise, 
> a
> clone(trunc2, output:x) is equivalent to trunc, output:x.  Note that
> the trunc datapath
> action is different than other datapath actions, which usually applies
> to all following
> actions. Native tunneling may be the first use case that motivates
> trunc2, which should
> have the normal datapath action behavior.
>


Re: [ovs-dev] [PATCH ovs V8 00/26] Introducing HW offload support for openvswitch

2017-05-10 Thread Flavio Leitner
On Wed, May 10, 2017 at 10:44:46AM +0300, Roi Dayan wrote:
> 
> 
> On 09/05/2017 22:05, Flavio Leitner wrote:
> > On Sun, May 07, 2017 at 10:55:32AM +0300, Roi Dayan wrote:
> > > 
> > > 
> > > On 03/05/2017 18:58, Federico Iezzi wrote:
> > > > On Wed, May 3, 2017 at 5:07 PM, Roi Dayan  wrote:
> > > > > This patch series introduces rule offload functionality to 
> > > > > dpif-netlink
> > > > > via netdev ports new flow offloading API. The user can specify 
> > > > > whether to
> > > > > enable rule offloading or not via OVS configuration. Netdev providers
> > > > > are able to implement netdev flow offload API in order to offload 
> > > > > rules.
> > > > > 
> > > > > This patch series also implements one offload scheme for netdev-linux,
> > > > > using TC flower classifier, which was chosen because its sort of 
> > > > > natural
> > > > > to state OVS DP rules for this classifier. However, the code can be
> > > > > extended to support other classifiers such as U32, eBPF, etc which 
> > > > > support
> > > > > offload as well.
> > > > > 
> > > > > The use-case we are currently addressing is the newly sriov switchdev 
> > > > > mode
> > > > > in the Linux kernel which was introduced in version 4.8.
> > > > > This series was tested against sriov vfs vports representors of the
> > > > > Mellanox 100G ConnectX-4 series exposed by the mlx5 kernel driver.
> > > > > 
> > > > > 
> > > > > V7->V8
> > > > > - Refactor dpif logging functions and use them in dpif-netlink
> > > > > - Ignore internal devices from netdev hashmap
> > > > > - Refactor netdev hmap naming to prefix netdev_ports_*
> > > > > - Use single hashmap with 2 nodes for ufid/tc mapping
> > > > > - Verify ifindex is valid in netdev_hmap_port_add
> > > > > - Close netdev in netdev_tc_flow_get error flow
> > > > > - Improve comments for flow offload api
> > > > > - Reorder flow api output args to be last args
> > > > > - Remove redundant netdev_flow_support
> > > > > - Fix using uninitialized var 's'
> > > > > 
> > > > > Not done:
> > > > > refactor netdev_ports_* functions to accept a typed obj
> > > > > (e.g. netdev_ports struct) instead of void*.
> > > > > We can do it as a follow-up commit later.
> > > > > 
> > > > > V6->V7:
> > > > > - Fix L3 IPv4 matching got broken
> > > > > - Refactor offloads test and testsuite to have same prefix
> > > > > - Better handling of unsupported match attributes
> > > > > 
> > > > > V5->V6:
> > > > > - Rebase over master branch, fix compilation issue
> > > > > - Add Nicira copyright to tc interface
> > > > > 
> > > > > V4->V5:
> > > > > - Fix compat
> > > > > - Fix VXLAN IPv6 tunnel matching
> > > > > - Fix order of actions in dump flows
> > > > > - Update ovs-dpctl man page about the addtion of type to 
> > > > > dump-flows
> > > > > 
> > > > > Travis
> > > > > https://travis-ci.org/roidayan/ovs/builds/213735371
> > > > > AppVeyor
> > > > > https://ci.appveyor.com/project/roidayan/ovs/build/1.0.18
> > > > > 
> > > > > V3->V4:
> > > > > - Move declarations to the right commit with implementation
> > > > > - Fix tc_get_flower flow return false success
> > > > > - Fix memory leaks - not releasing tc_transact replies
> > > > > - Fix travis failure for OSX compilation
> > > > > - Fix loop in dpif_netlink_flow_dump_next
> > > > > - Fix declared default value for tc-policy in vswitch.xml
> > > > > - Refactor loop in netdev_tc_flow_dump_next
> > > > > - Add missing error checks in parse_flow_put
> > > > > - Fix handling modify request where old rule is in hw and new
> > > > >   rule is not supported and needs to be in sw.
> > > > > - Use 2 hashmaps instead of 1 for faster reverse lookup of ufid 
> > > > > from tc
> > > > > - Init ports when enabling hw-offload after OVS is running
> > > > > 
> > > > > TODO: Fix breaking of datapath compilation
> > > > >   Fix testsuite failures
> > > > > 
> > > > > Travis
> > > > > https://travis-ci.org/roidayan/ovs/builds/210549325
> > > > > AppVeyor
> > > > > https://ci.appveyor.com/project/roidayan/ovs/build/1.0.15
> > > > > 
> > > > > V2->V3:
> > > > > - Code styling fixes
> > > > > - Bug fixes
> > > > > - Using already available macros/functions to match current OVS 
> > > > > code
> > > > > - Refactored code according to V2 review
> > > > > - Replaced bool option skip-hw for string option tc-policy
> > > > > - Added hw offload tests using policy skip_hw
> > > > > - Fixed most compatability compiling issues
> > > > > - Travis
> > > > > https://travis-ci.org/roidayan/ovs/builds/199610124
> > > > > - AppVeyor
> > > > > https://ci.appveyor.com/project/roidayan/ovs/build/1.0.14
> > > > > - Fixed compiling with DPDK enabled
> > > > > 
> > > > > TODO:
> > > > > - need to fix datapath compiling issues found in 

[ovs-dev] [PATCH v5] dpif-netdev: Assign ports to pmds on non-local numa node.

2017-05-10 Thread Billy O'Mahony
From: billyom 

Previously if there is no available (non-isolated) pmd on the numa node
for a port then the port is not polled at all. This can result in a
non-operational system until such time as nics are physically
repositioned. It is preferable to operate with a pmd on the 'wrong' numa
node albeit with lower performance. Local pmds are still chosen when
available.

Signed-off-by: Billy O'Mahony 
---
 Documentation/intro/install/dpdk.rst | 10 +
 lib/dpif-netdev.c| 43 +++-
 2 files changed, 48 insertions(+), 5 deletions(-)

diff --git a/Documentation/intro/install/dpdk.rst 
b/Documentation/intro/install/dpdk.rst
index d1c0e65..7a66bff 100644
--- a/Documentation/intro/install/dpdk.rst
+++ b/Documentation/intro/install/dpdk.rst
@@ -460,6 +460,16 @@ affinitized accordingly.
 pmd thread on a NUMA node is only created if there is at least one DPDK
 interface from that NUMA node added to OVS.
 
+  .. note::
+   On NUMA systems PCI devices are also local to a NUMA node.  Rx queues for
+   a PCI device will be assigned to a pmd on its local NUMA node if
+   pmd-cpu-mask has created a pmd thread on that NUMA node.  If not, the
+   queue will be assigned to a pmd on a remote NUMA node.  This will result
+   in reduced maximum throughput on that device.  In case such a queue
+   assignment is made, a warning message will be logged: "There's no available
+   (non-isolated) pmd thread on numa node N. Queue Q on port P will be
+   assigned to the pmd on core C (numa node N'). Expect reduced performance."
+
 - QEMU vCPU thread Affinity
 
   A VM performing simple packet forwarding or running complex packet pipelines
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index b3a0806..466a818 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr, int 
numa_id)
 }
 
 static void
-rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
+rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
+  int *all_numa_ids, unsigned all_numa_ids_sz,
+  int *num_ids_written)
 {
 struct dp_netdev_pmd_thread *pmd;
 struct rr_numa *numa;
+unsigned idx = 0;
 
 hmap_init(&rr->numas);
 
@@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp, struct 
rr_numa_list *rr)
 numa->n_pmds++;
 numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof *numa->pmds);
 numa->pmds[numa->n_pmds - 1] = pmd;
+
+all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
+idx++;
 }
+*num_ids_written = idx;
 }
 
 static struct dp_netdev_pmd_thread *
@@ -3202,8 +3209,15 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
 {
 struct dp_netdev_port *port;
 struct rr_numa_list rr;
+int all_numa_ids [64];
+int all_numa_ids_sz = sizeof all_numa_ids / sizeof all_numa_ids[0];
+unsigned all_numa_ids_idx = 0;
+int all_numa_ids_max_idx = 0;
+int num_numa_ids = 0;
 
-rr_numa_list_populate(dp, &rr);
+rr_numa_list_populate(dp, &rr, all_numa_ids, all_numa_ids_sz,
+  &num_numa_ids);
+all_numa_ids_max_idx = MIN(num_numa_ids - 1, all_numa_ids_sz - 1);
 
 HMAP_FOR_EACH (port, node, &dp->ports) {
 struct rr_numa *numa;
@@ -3234,10 +3248,29 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned) 
OVS_REQUIRES(dp->port_mutex)
 }
 } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
 if (!numa) {
-VLOG_WARN("There's no available (non isolated) pmd thread "
+if (all_numa_ids_max_idx < 0) {
+VLOG_ERR("There is no available (non-isolated) pmd "
+ "thread for port \'%s\' queue %d. This port "
+ "will not be polled. Is pmd-cpu-mask set to "
+ "zero? Or are all PMDs isolated to other "
+ "queues?", netdev_get_name(port->netdev),
+ qid);
+continue;
+}
+int alt_numa_id = all_numa_ids[all_numa_ids_idx];
+struct rr_numa *alt_numa;
+alt_numa = rr_numa_list_lookup(&rr, alt_numa_id);
+q->pmd = rr_numa_get_pmd(alt_numa);
+VLOG_WARN("There's no available (non-isolated) pmd thread "
   "on numa node %d. Queue %d on port \'%s\' will "
-  "not be polled.",
-  numa_id, qid, netdev_get_name(port->netdev));
+  "be assigned to the pmd on core %d "
+  "(numa node %d). Expect reduced performance.",
+  numa_id, qid, 

Re: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on non-local numa node.

2017-05-10 Thread Stokes, Ian
> Hi Ian,
> 
> I'll send a new patch shortly.
> 
> One comment below.
> 
> Thanks for reviewing,
> /Billy.
> 
> > -Original Message-
> > From: Stokes, Ian
> > Sent: Wednesday, May 10, 2017 1:19 PM
> > To: O Mahony, Billy ; d...@openvswitch.org
> > Subject: RE: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on
> > non- local numa node.
> >
> > > From: billyom 
> > >
> > > Previously if there is no available (non-isolated) pmd on the numa
> > > node for a port then the port is not polled at all. This can result
> > > in a non- operational system until such time as nics are physically
> > > repositioned. It is preferable to operate with a pmd on the 'wrong'
> > > numa node albeit with lower performance. Local pmds are still chosen
> > when available.
> >
> > Thanks Billy, few minor comments below.
> >
> > >
> > > Signed-off-by: Billy O'Mahony 
> > > ---
> > > v4: Fix a checkpatch error
> > > v3: Fix warning messages not appearing when using multiqueue
> > > v2: Add details of warning messages into docs
> > >
> > >  Documentation/intro/install/dpdk.rst | 10 +
> > >  lib/dpif-netdev.c| 40
> > > 
> > >  2 files changed, 46 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/Documentation/intro/install/dpdk.rst
> > > b/Documentation/intro/install/dpdk.rst
> > > index d1c0e65..ca73184 100644
> > > --- a/Documentation/intro/install/dpdk.rst
> > > +++ b/Documentation/intro/install/dpdk.rst
> > > @@ -460,6 +460,16 @@ affinitized accordingly.
> > >  pmd thread on a NUMA node is only created if there is at least
> > > one DPDK
> > >  interface from that NUMA node added to OVS.
> > >
> > > +  .. note::
> > > +   On NUMA systems PCI devices are also local to a NUMA node.  Rx
> > > + queues
> > > for
> > > +   PCI device will assigned to a pmd on it's local NUMA node if
> > > + pmd-cpu-
> > > mask
> > > +   has created a pmd thread on that NUMA node.  If not the queue will
> be
> > > +   assigned to a pmd on a remote NUMA node.  This will result in
> reduced
> > > +   maximum throughput on that device.  In the case such a queue
> > > assignment
> > > +   is made a warning message will be logged: "There's no available
> > > +   (non isolated) pmd thread on numa node N. Queue Q on port P will
> > > + be
> > > assigned
> > > +   to a pmd on numa node M. Expect reduced performance."
> >
> > The warning message above is different to what is outputted now by the
> > code (no mention of the core etc).
> > Can you update this?
> >
> > Also a minor nit on formatting, there's a mix of single and double
> > spaces at the start of new sentences throughout the paragraph, can you
> fix these?
> [[BO'M]] The single spaces are used only within the description/quote of
> the error message. Double spaces are used in the help text proper. Or at
> least that is the intention. I'll double check.


Ah, my bad. Thanks for pointing this out.

Ian
> >
> > > +
> > >  - QEMU vCPU thread Affinity
> > >
> > >A VM performing simple packet forwarding or running complex
> > > packet pipelines diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > > index
> > > b3a0806..bcbd325 100644
> > > --- a/lib/dpif-netdev.c
> > > +++ b/lib/dpif-netdev.c
> > > @@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr,
> > > int
> > > numa_id)  }
> > >
> > >  static void
> > > -rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list
> > > *rr)
> > > +rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
> > > +  int *all_numa_ids, unsigned all_numa_ids_sz,
> > > +  int *num_ids_written)
> > >  {
> > >  struct dp_netdev_pmd_thread *pmd;
> > >  struct rr_numa *numa;
> > > +unsigned idx = 0;
> > >
> > >  hmap_init(>numas);
> > >
> > > @@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp,
> > > struct rr_numa_list *rr)
> > >  numa->n_pmds++;
> > >  numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof
> > > *numa-
> > > >pmds);
> > >  numa->pmds[numa->n_pmds - 1] = pmd;
> > > +
> > > +all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
> > > +idx++;
> > >  }
> > > +*num_ids_written = idx;
> > >  }
> > >
> > >  static struct dp_netdev_pmd_thread * @@ -3202,8 +3209,15 @@
> > > rxq_scheduling(struct dp_netdev *dp, bool
> > > pinned)
> > > OVS_REQUIRES(dp->port_mutex)  {
> > >  struct dp_netdev_port *port;
> > >  struct rr_numa_list rr;
> > > +int all_numa_ids [64];
> > > +int all_numa_ids_sz = sizeof all_numa_ids / sizeof
> all_numa_ids[0];
> > > +unsigned all_numa_ids_idx = 0;
> > > +int all_numa_ids_max_idx = 0;
> > > +int num_numa_ids = 0;
> > >
> > > -rr_numa_list_populate(dp, );
> > > +rr_numa_list_populate(dp, , all_numa_ids, all_numa_ids_sz,
> > > +  _numa_ids);
> > > +

Re: [ovs-dev] [PATCH ovs V8 21/26] dpctl: Add an option to dump only certain kinds of flows

2017-05-10 Thread Simon Horman
On Wed, May 03, 2017 at 06:08:12PM +0300, Roi Dayan wrote:
> From: Paul Blakey 
> 
> Usage:
> # to dump all datapath flows (default):
> ovs-dpctl dump-flows
> 
> # to dump only flows that in kernel datapath:
> ovs-dpctl dump-flows type=ovs
> 
> # to dump only flows that are offloaded:
> ovs-dpctl dump-flows type=offloaded
> 
> Signed-off-by: Paul Blakey 
> Reviewed-by: Roi Dayan 
> Reviewed-by: Simon Horman 

As per Flavio's review of v7, can this be made to return an error
if the type is invalid? I may have missed something, but I'm not seeing
that in this patch.
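
A minimal sketch of the kind of up-front validation being asked for here
(the helper name and flag layout are illustrative, not the actual patch):

    #include <errno.h>
    #include <string.h>

    /* Returns 0 and sets the filter flags for a recognized type, or
     * EINVAL for anything else, so the caller can report an error and
     * exit non-zero instead of silently dumping all flows. */
    static int
    parse_dump_flows_type(const char *type, bool *ovs_only, bool *offloaded)
    {
        *ovs_only = *offloaded = false;
        if (!type || !strcmp(type, "all")) {
            return 0;
        } else if (!strcmp(type, "ovs")) {
            *ovs_only = true;
            return 0;
        } else if (!strcmp(type, "offloaded")) {
            *offloaded = true;
            return 0;
        }
        return EINVAL;
    }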


Re: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on non-local numa node.

2017-05-10 Thread O Mahony, Billy
Hi Ian,

I'll send a new patch shortly. 

One comment below.

Thanks for reviewing,
/Billy.

> -Original Message-
> From: Stokes, Ian
> Sent: Wednesday, May 10, 2017 1:19 PM
> To: O Mahony, Billy ; d...@openvswitch.org
> Subject: RE: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on non-
> local numa node.
> 
> > From: billyom 
> >
> > Previously if there is no available (non-isolated) pmd on the numa
> > node for a port then the port is not polled at all. This can result in
> > a non- operational system until such time as nics are physically
> > repositioned. It is preferable to operate with a pmd on the 'wrong'
> > numa node albeit with lower performance. Local pmds are still chosen
> when available.
> 
> Thanks Billy, few minor comments below.
> 
> >
> > Signed-off-by: Billy O'Mahony 
> > ---
> > v4: Fix a checkpatch error
> > v3: Fix warning messages not appearing when using multiqueue
> > v2: Add details of warning messages into docs
> >
> >  Documentation/intro/install/dpdk.rst | 10 +
> >  lib/dpif-netdev.c| 40
> > 
> >  2 files changed, 46 insertions(+), 4 deletions(-)
> >
> > diff --git a/Documentation/intro/install/dpdk.rst
> > b/Documentation/intro/install/dpdk.rst
> > index d1c0e65..ca73184 100644
> > --- a/Documentation/intro/install/dpdk.rst
> > +++ b/Documentation/intro/install/dpdk.rst
> > @@ -460,6 +460,16 @@ affinitized accordingly.
> >  pmd thread on a NUMA node is only created if there is at least
> > one DPDK
> >  interface from that NUMA node added to OVS.
> >
> > +  .. note::
> > +   On NUMA systems PCI devices are also local to a NUMA node.  Rx
> > + queues
> > for
> > +   PCI device will assigned to a pmd on it's local NUMA node if
> > + pmd-cpu-
> > mask
> > +   has created a pmd thread on that NUMA node.  If not the queue will be
> > +   assigned to a pmd on a remote NUMA node.  This will result in reduced
> > +   maximum throughput on that device.  In the case such a queue
> > assignment
> > +   is made a warning message will be logged: "There's no available
> > +   (non isolated) pmd thread on numa node N. Queue Q on port P will
> > + be
> > assigned
> > +   to a pmd on numa node M. Expect reduced performance."
> 
> The warning message above is different to what is outputted now by the
> code (no mention of the core etc).
> Can you update this?
> 
> Also a minor nit on formatting, there's a mix of single and double spaces at
> the start of new sentences throughout the paragraph, can you fix these?
[[BO'M]] The single spaces are used only within the description/quote of the 
error message. Double spaces are used in the help text proper. Or at least that 
is the intention. I'll double check.
> 
> > +
> >  - QEMU vCPU thread Affinity
> >
> >A VM performing simple packet forwarding or running complex packet
> > pipelines diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> > b3a0806..bcbd325 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr,
> > int
> > numa_id)  }
> >
> >  static void
> > -rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
> > +rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
> > +  int *all_numa_ids, unsigned all_numa_ids_sz,
> > +  int *num_ids_written)
> >  {
> >  struct dp_netdev_pmd_thread *pmd;
> >  struct rr_numa *numa;
> > +unsigned idx = 0;
> >
> >  hmap_init(>numas);
> >
> > @@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp,
> > struct rr_numa_list *rr)
> >  numa->n_pmds++;
> >  numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof
> > *numa-
> > >pmds);
> >  numa->pmds[numa->n_pmds - 1] = pmd;
> > +
> > +all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
> > +idx++;
> >  }
> > +*num_ids_written = idx;
> >  }
> >
> >  static struct dp_netdev_pmd_thread *
> > @@ -3202,8 +3209,15 @@ rxq_scheduling(struct dp_netdev *dp, bool
> > pinned)
> > OVS_REQUIRES(dp->port_mutex)  {
> >  struct dp_netdev_port *port;
> >  struct rr_numa_list rr;
> > +int all_numa_ids [64];
> > +int all_numa_ids_sz = sizeof all_numa_ids / sizeof all_numa_ids[0];
> > +unsigned all_numa_ids_idx = 0;
> > +int all_numa_ids_max_idx = 0;
> > +int num_numa_ids = 0;
> >
> > -rr_numa_list_populate(dp, );
> > +rr_numa_list_populate(dp, , all_numa_ids, all_numa_ids_sz,
> > +  _numa_ids);
> > +all_numa_ids_max_idx = MIN(num_numa_ids - 1, all_numa_ids_sz -
> > + 1);
> >
> >  HMAP_FOR_EACH (port, node, >ports) {
> >  struct rr_numa *numa;
> > @@ -3234,10 +3248,28 @@ rxq_scheduling(struct dp_netdev *dp, bool
> > pinned)
> > OVS_REQUIRES(dp->port_mutex)
> >  }
> >  } else if (!pinned && 

Re: [ovs-dev] [PATCH ovs V8 13/26] netdev-tc-offloads: Implement netdev flow put using tc interface

2017-05-10 Thread Simon Horman
On Wed, May 03, 2017 at 06:08:04PM +0300, Roi Dayan wrote:
> From: Paul Blakey 
> 
> Currently only tunnel offload is supported.
> 
> Signed-off-by: Paul Blakey 
> Reviewed-by: Roi Dayan 
> Reviewed-by: Simon Horman 
> ---
>  lib/dpif-netlink.c   |   4 +-
>  lib/netdev-tc-offloads.c | 392 
> ++-
>  2 files changed, 385 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/dpif-netlink.c b/lib/dpif-netlink.c
> index 81d513d..7e6c647 100644
> --- a/lib/dpif-netlink.c
> +++ b/lib/dpif-netlink.c
> @@ -1937,8 +1937,6 @@ parse_flow_put(struct dpif_netlink *dpif, struct 
> dpif_flow_put *put)
>  return EOPNOTSUPP;
>  }
>  
> -memset(, 0, sizeof match);
> -
>  err = parse_key_and_mask_to_match(put->key, put->key_len, put->mask,
>put->mask_len, );
>  if (err) {
> @@ -2011,7 +2009,7 @@ parse_flow_put(struct dpif_netlink *dpif, struct 
> dpif_flow_put *put)
>  
>  VLOG_DBG("added flow");
>  } else if (err != EEXIST) {
> -VLOG_ERR_RL(, "failed to offload flow: %s", ovs_strerror(err));
> +VLOG_ERR_RL(, "failed adding flow: %s", ovs_strerror(err));
>  }
>  
>  out:

I may be misunderstanding things but perhaps the above changes
belong in an earlier patch?

> diff --git a/lib/netdev-tc-offloads.c b/lib/netdev-tc-offloads.c
> index 7e33fff..e3daf62 100644
> --- a/lib/netdev-tc-offloads.c
> +++ b/lib/netdev-tc-offloads.c
> @@ -461,16 +461,392 @@ netdev_tc_flow_dump_next(struct netdev_flow_dump *dump,
>  return false;
>  }

...

> +static int
> +test_key_and_mask(struct match *match) {
> +static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(5, 20);
> +const struct flow *key = >flow;
> +struct flow *mask = >wc.masks;
> +
> +if (mask->pkt_mark) {
> +VLOG_DBG_RL(, "offloading attribute pkt_mark isn't supported");
> +return EOPNOTSUPP;
> +}
> +
> +if (mask->recirc_id && key->recirc_id != 0) {
> +VLOG_DBG_RL(, "offloading attribute recirc_id isn't supported");
> +return EOPNOTSUPP;
> +}
> +mask->recirc_id = 0;

It's not clear to me why the recirc_id mask is reset here.

> +
> +if (mask->dp_hash) {
> +VLOG_DBG_RL(, "offloading attribute dp_hash isn't supported");
> +return EOPNOTSUPP;
> +}

...

> +if (mask->metadata != 0) {

It seems to me that "if (mask->metadata)" would make the above more
consistent with other checks in this function.

> +VLOG_DBG_RL(, "offloading attribute metadata isn't supported");
> +return EOPNOTSUPP;
> +}
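
i.e., keeping the shape of the other attribute checks in this function:

    if (mask->metadata) {
        VLOG_DBG_RL(&rl, "offloading attribute metadata isn't supported");
        return EOPNOTSUPP;
    }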

...


Re: [ovs-dev] [PATCH v4] dpif-netdev: Assign ports to pmds on non-local numa node.

2017-05-10 Thread Stokes, Ian
> From: billyom 
> 
> Previously if there is no available (non-isolated) pmd on the numa node
> for a port then the port is not polled at all. This can result in a non-
> operational system until such time as nics are physically repositioned. It
> is preferable to operate with a pmd on the 'wrong' numa node albeit with
> lower performance. Local pmds are still chosen when available.

Thanks Billy, few minor comments below.

> 
> Signed-off-by: Billy O'Mahony 
> ---
> v4: Fix a checkpatch error
> v3: Fix warning messages not appearing when using multiqueue
> v2: Add details of warning messages into docs
> 
>  Documentation/intro/install/dpdk.rst | 10 +
>  lib/dpif-netdev.c| 40
> 
>  2 files changed, 46 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/intro/install/dpdk.rst
> b/Documentation/intro/install/dpdk.rst
> index d1c0e65..ca73184 100644
> --- a/Documentation/intro/install/dpdk.rst
> +++ b/Documentation/intro/install/dpdk.rst
> @@ -460,6 +460,16 @@ affinitized accordingly.
>  pmd thread on a NUMA node is only created if there is at least one
> DPDK
>  interface from that NUMA node added to OVS.
> 
> +  .. note::
> +   On NUMA systems PCI devices are also local to a NUMA node.  Rx queues
> for
> +   PCI device will assigned to a pmd on it's local NUMA node if pmd-cpu-
> mask
> +   has created a pmd thread on that NUMA node.  If not the queue will be
> +   assigned to a pmd on a remote NUMA node.  This will result in reduced
> +   maximum throughput on that device.  In the case such a queue
> assignment
> +   is made a warning message will be logged: "There's no available
> +   (non isolated) pmd thread on numa node N. Queue Q on port P will be
> assigned
> +   to a pmd on numa node M. Expect reduced performance."

The warning message above is different to what is outputted now by the code (no 
mention of the core etc).
Can you update this?

Also a minor nit on formatting, there's a mix of single and double spaces at 
the start of new sentences throughout the paragraph, can you fix these?

> +
>  - QEMU vCPU thread Affinity
> 
>A VM performing simple packet forwarding or running complex packet
> pipelines diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index
> b3a0806..bcbd325 100644
> --- a/lib/dpif-netdev.c
> +++ b/lib/dpif-netdev.c
> @@ -3149,10 +3149,13 @@ rr_numa_list_lookup(struct rr_numa_list *rr, int
> numa_id)  }
> 
>  static void
> -rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr)
> +rr_numa_list_populate(struct dp_netdev *dp, struct rr_numa_list *rr,
> +  int *all_numa_ids, unsigned all_numa_ids_sz,
> +  int *num_ids_written)
>  {
>  struct dp_netdev_pmd_thread *pmd;
>  struct rr_numa *numa;
> +unsigned idx = 0;
> 
>  hmap_init(>numas);
> 
> @@ -3170,7 +3173,11 @@ rr_numa_list_populate(struct dp_netdev *dp, struct
> rr_numa_list *rr)
>  numa->n_pmds++;
>  numa->pmds = xrealloc(numa->pmds, numa->n_pmds * sizeof *numa-
> >pmds);
>  numa->pmds[numa->n_pmds - 1] = pmd;
> +
> +all_numa_ids[idx % all_numa_ids_sz] = pmd->numa_id;
> +idx++;
>  }
> +*num_ids_written = idx;
>  }
> 
>  static struct dp_netdev_pmd_thread *
> @@ -3202,8 +3209,15 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned)
> OVS_REQUIRES(dp->port_mutex)  {
>  struct dp_netdev_port *port;
>  struct rr_numa_list rr;
> +int all_numa_ids [64];
> +int all_numa_ids_sz = sizeof all_numa_ids / sizeof all_numa_ids[0];
> +unsigned all_numa_ids_idx = 0;
> +int all_numa_ids_max_idx = 0;
> +int num_numa_ids = 0;
> 
> -rr_numa_list_populate(dp, );
> +rr_numa_list_populate(dp, , all_numa_ids, all_numa_ids_sz,
> +  _numa_ids);
> +all_numa_ids_max_idx = MIN(num_numa_ids - 1, all_numa_ids_sz - 1);
> 
>  HMAP_FOR_EACH (port, node, >ports) {
>  struct rr_numa *numa;
> @@ -3234,10 +3248,28 @@ rxq_scheduling(struct dp_netdev *dp, bool pinned)
> OVS_REQUIRES(dp->port_mutex)
>  }
>  } else if (!pinned && q->core_id == OVS_CORE_UNSPEC) {
>  if (!numa) {
> +if (all_numa_ids_max_idx < 0) {
> +VLOG_ERR("There is no available (non-isolated)
> pmd "
> + "thread for port \'%s\'. This port will
> "
> + "not be polled. Is pmd-cpu-mask set to "
> + "zero? Or are all PMDs isolated to other
> "
> + "queues?", netdev_get_name(port-
> >netdev));

Could you identify the queue in the ERR above? As it is now, the ERR will
repeat for each queue on the port that cannot be assigned a pmd, with no
discernible difference between the messages.
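
Something along these lines would do (illustrative wording; qid is already
in scope at that point):

    VLOG_ERR("There is no available (non-isolated) pmd thread for port "
             "'%s' queue %d. This port will not be polled.",
             netdev_get_name(port->netdev), qid);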

> +continue;
> +}
> + 

[ovs-dev] [PATCH v2 5/5] stp: Add link-state checking support for stp ports.

2017-05-10 Thread nickcooper-zhangtonghao
When stp is enabled on a bridge, stp ports can currently be
enabled even though their links are down. With this patch, the
link-state of each port is checked when initializing, and the
port is enabled or disabled accordingly. The patch also allows
a port to be enabled or disabled while bridge stp is running.
A stp port in the disabled state can still forward packets,
but if its link is down and this patch sets it to disabled, no
L2 loop can form.
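
To see the effect in the dummy datapath (mirroring the test added below):

    # Take a port's link down while stp is running ...
    ovs-appctl netdev-dummy/set-admin-state p3 down
    # ... and the port moves to the disabled state, so it is no longer
    # listed as forwarding by:
    ovs-appctl stp/show br0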

Signed-off-by: nickcooper-zhangtonghao 
---
 ofproto/ofproto-dpif.c | 41 -
 tests/stp.at   | 71 +-
 2 files changed, 110 insertions(+), 2 deletions(-)

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index 6d3f1fb..098d363 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -2574,6 +2574,37 @@ update_stp_port_state(struct ofport_dpif *ofport)
 }
 }
 
+static void
+stp_check_and_update_link_state(struct ofproto_dpif *ofproto)
+{
+struct ofport *ofport_;
+struct ofport_dpif *ofport;
+bool up;
+
+HMAP_FOR_EACH (ofport_, hmap_node, &ofproto->up.ports) {
+ofport = ofport_dpif_cast(ofport_);
+up = netdev_get_carrier(ofport_->netdev);
+
+if (ofport->stp_port &&
+up != (stp_port_get_state(ofport->stp_port) != STP_DISABLED)) {
+
+VLOG_DBG("bridge: %s, port: %s is %s, %s it", ofproto->up.name,
+ netdev_get_name(ofport->up.netdev),
+ up ? "up" : "down",
+ up ? "enabling" : "disabling");
+
+if (up) {
+stp_port_enable(ofport->stp_port);
+stp_port_set_aux(ofport->stp_port, ofport);
+} else {
+stp_port_disable(ofport->stp_port);
+}
+
+update_stp_port_state(ofport);
+}
+}
+}
+
 /* Configures STP on 'ofport_' using the settings defined in 's'.  The
  * caller is responsible for assigning STP port numbers and ensuring
  * there are no duplicates. */
@@ -2604,7 +2635,12 @@ set_stp_port(struct ofport *ofport_,
 /* Set name before enabling the port so that debugging messages can print
  * the name. */
 stp_port_set_name(sp, netdev_get_name(ofport->up.netdev));
-stp_port_enable(sp);
+
+if (netdev_get_carrier(ofport_->netdev)) {
+stp_port_enable(sp);
+} else {
+stp_port_disable(sp);
+}
 
 stp_port_set_aux(sp, ofport);
 stp_port_set_priority(sp, s->priority);
@@ -2666,6 +2702,9 @@ stp_run(struct ofproto_dpif *ofproto)
 stp_tick(ofproto->stp, MIN(INT_MAX, elapsed));
 ofproto->stp_last_tick = now;
 }
+
+stp_check_and_update_link_state(ofproto);
+
while (stp_get_changed_port(ofproto->stp, &sp)) {
 struct ofport_dpif *ofport = stp_port_get_aux(sp);
 
diff --git a/tests/stp.at b/tests/stp.at
index bd5d208..e27600e 100644
--- a/tests/stp.at
+++ b/tests/stp.at
@@ -420,6 +420,7 @@ AT_CHECK([ovs-vsctl add-port br1 p8 -- \
set port p8 other_config:stp-enable=false -- \
 ])
 
+ovs-appctl netdev-dummy/set-admin-state up
 ovs-appctl time/stop
 
 AT_CHECK([ovs-ofctl add-flow br0 "in_port=7 icmp actions=1"])
@@ -519,7 +520,7 @@ AT_CHECK([
 set interface p6 type=dummy options:pstream=punix:$OVS_RUNDIR/p6.sock ofport_request=6
 ], [0])
 
-
+ovs-appctl netdev-dummy/set-admin-state up
 ovs-appctl time/stop
 
 # give time for STP to move initially
@@ -626,5 +627,73 @@ AT_CHECK([ovs-appctl mdb/show br2], [0], [dnl
  port  VLAN  GROUPAge
 ])
 
+AT_CLEANUP
+
+AT_SETUP([STP - check link-state when stp is running])
+OVS_VSWITCHD_START([])
+
+AT_CHECK([
+ovs-vsctl -- \
+set port br0 other_config:stp-enable=false -- \
+set bridge br0 datapath-type=dummy stp_enable=true \
+other-config:hwaddr=aa:66:aa:66:00:00
+], [0])
+
+AT_CHECK([
+ovs-vsctl add-port br0 p1 -- \
+set interface p1 type=dummy -- \
+set port p1 other_config:stp-port-num=1
+ovs-vsctl add-port br0 p2 -- \
+set interface p2 type=dummy -- \
+set port p2 other_config:stp-port-num=2
+], [0])
+
+ovs-appctl netdev-dummy/set-admin-state up
+ovs-appctl time/stop
+
+# give time for STP to move initially
+for i in $(seq 0 30); do
+ovs-appctl time/warp 1000
+done
+
+AT_CHECK([ovs-appctl stp/show br0 | grep p1], [0], [dnl
+   p1 designated forwarding 19128.1
+])
+AT_CHECK([ovs-appctl stp/show br0 | grep p2], [0], [dnl
+   p2 designated forwarding 19128.2
+])
+
+# add a stp port
+AT_CHECK([
+ovs-vsctl add-port br0 p3 -- \
+set interface p3 type=dummy -- \
+set port p3 other_config:stp-port-num=3
+], [0])
+
+ovs-appctl netdev-dummy/set-admin-state p3 down
+
+# p3 should not be shown because its link-state is down
+AT_CHECK([ovs-appctl stp/show br0 | grep p1], [0], [dnl
+   p1 designated forwarding 19128.1
+])
+AT_CHECK([ovs-appctl stp/show br0 | grep p2], [0], [dnl
+   p2 

[ovs-dev] [PATCH v2 3/5] rstp: Add the 'ovs-appctl rstp/show' command.

2017-05-10 Thread nickcooper-zhangtonghao
The rstp/show command helps users and developers get more
details about rstp. This patch builds on the previous patches
in this series.
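
The output looks roughly like this (illustrative values, following the
format strings added below):

    Root ID:
        stp-priority    32768
        stp-system-id   aa:66:aa:66:00:00
        stp-hello-time  2s
        stp-max-age     20s
        stp-fwd-delay   15s
        This bridge is the root

    Bridge ID:
        stp-priority    32768
        stp-system-id   aa:66:aa:66:00:00
        ...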

Signed-off-by: nickcooper-zhangtonghao 
---
 NEWS   |   3 +-
 lib/rstp.c | 114 +++--
 lib/rstp.h |   2 +-
 vswitchd/ovs-vswitchd.8.in |  11 -
 4 files changed, 123 insertions(+), 7 deletions(-)

diff --git a/NEWS b/NEWS
index 7a2b185..4576408 100644
--- a/NEWS
+++ b/NEWS
@@ -28,7 +28,8 @@ Post-v2.7.0
abbreviated to 4 hex digits.
  * "ovn-sbctl lflow-list" can now print OpenFlow flows that correspond
to logical flows.
-   - Add the command 'ovs-appctl stp/show' (see ovs-vswitchd(8)).
+   - Add the command 'ovs-appctl stp/show' and 'ovs-appctl rstp/show'
+ (see ovs-vswitchd(8)).
- OpenFlow:
  * All features required by OpenFlow 1.4 are now implemented, so
ovs-vswitchd now enables OpenFlow 1.4 by default (in addition to
diff --git a/lib/rstp.c b/lib/rstp.c
index b942f6e..99ae161 100644
--- a/lib/rstp.c
+++ b/lib/rstp.c
@@ -120,6 +120,10 @@ static void rstp_port_set_mcheck__(struct rstp_port *, 
bool mcheck)
 OVS_REQUIRES(rstp_mutex);
 static void reinitialize_port__(struct rstp_port *p)
 OVS_REQUIRES(rstp_mutex);
+static void rstp_unixctl_tcn(struct unixctl_conn *, int argc,
+ const char *argv[], void *aux);
+static void rstp_unixctl_show(struct unixctl_conn *, int argc,
+  const char *argv[], void *aux);
 
 const char *
 rstp_state_name(enum rstp_state state)
@@ -208,9 +212,6 @@ rstp_port_get_number(const struct rstp_port *p)
 return number;
 }
 
-static void rstp_unixctl_tcn(struct unixctl_conn *, int argc,
- const char *argv[], void *aux);
-
 /* Decrements the State Machines' timers. */
 void
 rstp_tick_timers(struct rstp *rstp)
@@ -246,6 +247,8 @@ rstp_init(void)
 
 unixctl_command_register("rstp/tcn", "[bridge]", 0, 1, 
rstp_unixctl_tcn,
  NULL);
+unixctl_command_register("rstp/show", "[bridge]", 0, 1,
+ rstp_unixctl_show, NULL);
ovsthread_once_done(&once);
 }
 }
@@ -1398,7 +1401,7 @@ rstp_get_designated_root(const struct rstp *rstp)
  * there is no such port.
  */
 struct rstp_port *
-rstp_get_root_port(struct rstp *rstp)
+rstp_get_root_port(const struct rstp *rstp)
 OVS_EXCLUDED(rstp_mutex)
 {
 struct rstp_port *p;
@@ -1545,3 +1548,106 @@ rstp_unixctl_tcn(struct unixctl_conn *conn, int argc,
 out:
ovs_mutex_unlock(&rstp_mutex);
 }
+
+static void
+rstp_bridge_id_details(struct ds *ds, const rstp_identifier bridge_id,
+   const uint16_t hello_time, const uint16_t max_age,
+   const uint16_t forward_delay)
+OVS_REQUIRES(rstp_mutex)
+{
+uint16_t priority = bridge_id >> 48;
+ds_put_format(ds, "\tstp-priority\t%"PRIu16"\n", priority);
+
+struct eth_addr mac;
+const uint64_t mac_bits = (UINT64_C(1) << 48) - 1;
+eth_addr_from_uint64(bridge_id & mac_bits, &mac);
+ds_put_format(ds, "\tstp-system-id\t"ETH_ADDR_FMT"\n", ETH_ADDR_ARGS(mac));
+ds_put_format(ds, "\tstp-hello-time\t%"PRIu16"s\n", hello_time);
+ds_put_format(ds, "\tstp-max-age\t%"PRIu16"s\n", max_age);
+ds_put_format(ds, "\tstp-fwd-delay\t%"PRIu16"s\n", forward_delay);
+}
+
+static void
+rstp_print_details(struct ds *ds, const struct rstp *rstp)
+OVS_REQUIRES(rstp_mutex)
+{
+ds_put_format(ds, " %s \n", rstp->name);
+ds_put_cstr(ds, "Root ID:\n");
+
+bool is_root = rstp_is_root_bridge(rstp);
+struct rstp_port *p = rstp_get_root_port(rstp);
+
+rstp_identifier bridge_id =
+is_root ? rstp->bridge_identifier : rstp_get_root_id(rstp);
+uint16_t hello_time =
+is_root ? rstp->bridge_hello_time : p->designated_times.hello_time;
+uint16_t max_age =
+is_root ? rstp->bridge_max_age : p->designated_times.max_age;
+uint16_t forward_delay =
+is_root ? rstp->bridge_forward_delay : p->designated_times.forward_delay;
+
+rstp_bridge_id_details(ds, bridge_id, hello_time, max_age, forward_delay);
+if (is_root) {
+ds_put_cstr(ds, "\tThis bridge is the root\n");
+} else {
+ds_put_format(ds, "\troot-port\t%s\n", p->port_name);
+ds_put_format(ds, "\troot-path-cost\t%u\n",
+  rstp_get_root_path_cost(rstp));
+}
+ds_put_cstr(ds, "\n");
+
+ds_put_cstr(ds, "Bridge ID:\n");
+rstp_bridge_id_details(ds, rstp->bridge_identifier,
+   rstp->bridge_hello_time,
+   rstp->bridge_max_age,
+   rstp->bridge_forward_delay);
+ds_put_cstr(ds, "\n");
+
+ds_put_format(ds, "\t%-11.10s%-11.10s%-11.10s%-9.8s%-8.7s\n",
+  "Interface", "Role", "State", "Cost", "Pri.Nbr");
+ds_put_cstr(ds, "\t-- 

[ovs-dev] [PATCH v2 4/5] rstp: Increment the rstp port num counter.

2017-05-10 Thread nickcooper-zhangtonghao
OvS supports at most RSTP_MAX_PORTS rstp ports, just as stp
is capped at STP_MAX_PORTS. This patch increments the rstp
port num counter; without it the counter stays at 0 and the
limit check above can never trigger.
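
For context, the check the counter feeds is of roughly this shape
(paraphrased, not the literal bridge.c source):

    if (*port_num_counter >= RSTP_MAX_PORTS) {
        VLOG_ERR("port %s: too many RSTP ports, disabling", port->name);
        port_s->enable = false;
        return;
    }

With the counter stuck at 0, this branch could never be taken no matter
how many ports were added.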

Signed-off-by: nickcooper-zhangtonghao 
---
 vswitchd/bridge.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 31203d1..86b7c11 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -1485,6 +1485,10 @@ port_configure_rstp(const struct ofproto *ofproto, 
struct port *port,
 port_s->port_num = 0;
 }
 
+/* Increment the port num counter, because we only support
+ * RSTP_MAX_PORTS rstp ports. */
+(*port_num_counter)++;
+
config_str = smap_get(&port->cfg->other_config, "rstp-path-cost");
 if (config_str) {
 port_s->path_cost = strtoul(config_str, NULL, 10);
-- 
1.8.3.1





[ovs-dev] [PATCH v2 2/5] rstp: Add rstp port name for human reading.

2017-05-10 Thread nickcooper-zhangtonghao
This patch makes it easier to debug the rstp subsystem by
logging the port name instead of the port number. It is also
used by the next patches to display rstp info.

Signed-off-by: nickcooper-zhangtonghao 
Acked-by: Jarno Rajahalme 
---
 lib/rstp-common.h  |  1 +
 lib/rstp.c | 14 +-
 lib/rstp.h |  3 ++-
 ofproto/ofproto-dpif.c |  2 +-
 4 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/rstp-common.h b/lib/rstp-common.h
index 27e8079..c108232 100644
--- a/lib/rstp-common.h
+++ b/lib/rstp-common.h
@@ -262,6 +262,7 @@ struct rstp_port {
 struct rstp *rstp OVS_GUARDED_BY(rstp_mutex);
 struct hmap_node node OVS_GUARDED_BY(rstp_mutex); /* In rstp->ports. */
 void *aux OVS_GUARDED_BY(rstp_mutex);
+char *port_name;
 struct rstp_bpdu received_bpdu_buffer OVS_GUARDED_BY(rstp_mutex);
 /*
  * MAC status parameters
diff --git a/lib/rstp.c b/lib/rstp.c
index 6f1c1e3..b942f6e 100644
--- a/lib/rstp.c
+++ b/lib/rstp.c
@@ -760,6 +760,14 @@ rstp_port_set_port_number__(struct rstp_port *port, uint16_t port_number)
 }
 }
 
+static void
+rstp_port_set_port_name__(struct rstp_port *port, const char *name)
+OVS_REQUIRES(rstp_mutex)
+{
+free(port->port_name);
+port->port_name = xstrdup(name);
+}
+
 /* Converts the link speed to a port path cost [Table 17-3]. */
 uint32_t
 rstp_convert_speed_to_cost(unsigned int speed)
@@ -1173,6 +1181,7 @@ rstp_add_port(struct rstp *rstp)
 rstp_port_set_priority__(p, RSTP_DEFAULT_PORT_PRIORITY);
 rstp_port_set_port_number__(p, 0);
 p->aux = NULL;
+p->port_name = NULL;
 rstp_initialize_port_defaults__(p);
 VLOG_DBG("%s: RSTP port "RSTP_PORT_ID_FMT" initialized.", rstp->name,
  p->port_id);
@@ -1210,6 +1219,7 @@ rstp_port_unref(struct rstp_port *rp)
ovs_mutex_lock(&rstp_mutex);
 rstp = rp->rstp;
 rstp_port_set_state__(rp, RSTP_DISABLED);
+free(rp->port_name);
hmap_remove(&rstp->ports, &rp->node);
 VLOG_DBG("%s: removed port "RSTP_PORT_ID_FMT"", rstp->name,
  rp->port_id);
@@ -1448,13 +1458,15 @@ void
 rstp_port_set(struct rstp_port *port, uint16_t port_num, int priority,
   uint32_t path_cost, bool is_admin_edge, bool is_auto_edge,
   enum rstp_admin_point_to_point_mac_state admin_p2p_mac_state,
-  bool admin_port_state, bool do_mcheck, void *aux)
+  bool admin_port_state, bool do_mcheck, void *aux,
+  const char *name)
 OVS_EXCLUDED(rstp_mutex)
 {
ovs_mutex_lock(&rstp_mutex);
 port->aux = aux;
 rstp_port_set_priority__(port, priority);
 rstp_port_set_port_number__(port, port_num);
+rstp_port_set_port_name__(port, name);
 rstp_port_set_path_cost__(port, path_cost);
 rstp_port_set_admin_edge__(port, is_admin_edge);
 rstp_port_set_auto_edge__(port, is_auto_edge);
diff --git a/lib/rstp.h b/lib/rstp.h
index 78e07fb..fa67e3c 100644
--- a/lib/rstp.h
+++ b/lib/rstp.h
@@ -221,7 +221,8 @@ uint32_t rstp_convert_speed_to_cost(unsigned int speed);
 void rstp_port_set(struct rstp_port *, uint16_t port_num, int priority,
uint32_t path_cost, bool is_admin_edge, bool is_auto_edge,
enum rstp_admin_point_to_point_mac_state admin_p2p_mac_state,
-   bool admin_port_state, bool do_mcheck, void *aux)
+   bool admin_port_state, bool do_mcheck, void *aux,
+   const char *name)
 OVS_EXCLUDED(rstp_mutex);
 
 enum rstp_state rstp_port_get_state(const struct rstp_port *)
diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index dc5f004..6d3f1fb 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -2722,7 +2722,7 @@ set_rstp_port(struct ofport *ofport_,
 rstp_port_set(rp, s->port_num, s->priority, s->path_cost,
   s->admin_edge_port, s->auto_edge,
   s->admin_p2p_mac_state, s->admin_port_state, s->mcheck,
-  ofport);
+  ofport, netdev_get_name(ofport->up.netdev));
 update_rstp_port_state(ofport);
 /* Synchronize operational status. */
 rstp_port_set_mac_operational(rp, ofport->may_enable);
-- 
1.8.3.1





[ovs-dev] [PATCH v2 1/5] rstp: Init a recursive mutex for rstp.

2017-05-10 Thread nickcooper-zhangtonghao
* This patch will be used by the next patch. The 'rstp/show' command,
which takes the mutex, calls functions that also take the mutex, so
it must be initialized as a recursive mutex.

* Some rstp tests of OvS, which run with ovstest, do not run rstp_init
in bridge_init. We should call rstp_init when creating the rstp
instance (stp already does the same); otherwise those tests fail.

* This patch also removes rstp_mutex from the header file and makes it
static in the c file, because it is only used in lib/rstp.c.
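
A self-contained sketch of the re-entrancy pattern involved (plain
pthreads, not OVS code; ovs_mutex wraps pthread mutexes):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t m;

    static void helper(void)
    {
        /* Re-acquires the lock the caller already holds, just as the
         * helpers called from 'rstp/show' re-take rstp_mutex. */
        pthread_mutex_lock(&m);
        puts("helper ran under the lock");
        pthread_mutex_unlock(&m);
    }

    int main(void)
    {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
        pthread_mutex_init(&m, &attr);

        pthread_mutex_lock(&m);
        helper();            /* would deadlock with a default mutex */
        pthread_mutex_unlock(&m);
        return 0;
    }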

Signed-off-by: nickcooper-zhangtonghao 
---
 lib/rstp.c | 15 ---
 lib/rstp.h |  6 --
 2 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/lib/rstp.c b/lib/rstp.c
index 907a907..6f1c1e3 100644
--- a/lib/rstp.c
+++ b/lib/rstp.c
@@ -50,7 +50,7 @@
 
 VLOG_DEFINE_THIS_MODULE(rstp);
 
-struct ovs_mutex rstp_mutex = OVS_MUTEX_INITIALIZER;
+static struct ovs_mutex rstp_mutex;
 
static struct ovs_list all_rstps__ = OVS_LIST_INITIALIZER(&all_rstps__);
static struct ovs_list *const all_rstps OVS_GUARDED_BY(rstp_mutex) = &all_rstps__;
@@ -239,8 +239,15 @@ void
 rstp_init(void)
 OVS_EXCLUDED(rstp_mutex)
 {
-unixctl_command_register("rstp/tcn", "[bridge]", 0, 1, rstp_unixctl_tcn,
- NULL);
+static struct ovsthread_once once = OVSTHREAD_ONCE_INITIALIZER;
+
+if (ovsthread_once_start(&once)) {
+ovs_mutex_init_recursive(&rstp_mutex);
+
+unixctl_command_register("rstp/tcn", "[bridge]", 0, 1, 
rstp_unixctl_tcn,
+ NULL);
+ovsthread_once_done(&once);
+}
 }
 
 /* Creates and returns a new RSTP instance that initially has no ports. */
@@ -255,6 +262,8 @@ rstp_create(const char *name, rstp_identifier bridge_address,
 
 VLOG_DBG("Creating RSTP instance");
 
+rstp_init();
+
 rstp = xzalloc(sizeof *rstp);
 rstp->name = xstrdup(name);
 
diff --git a/lib/rstp.h b/lib/rstp.h
index 4942d59..78e07fb 100644
--- a/lib/rstp.h
+++ b/lib/rstp.h
@@ -36,12 +36,6 @@
 #include "compiler.h"
 #include "util.h"
 
-/* Thread Safety: Callers passing in RSTP and RSTP port object
- * pointers must hold a reference to the passed object to ensure that
- * the object does not become stale while it is being accessed. */
-
-extern struct ovs_mutex rstp_mutex;
-
 #define RSTP_MAX_PORTS 4095
 
 struct dp_packet;
-- 
1.8.3.1





Re: [ovs-dev] [PATCH] OVN localport type support

2017-05-10 Thread Daniel Alvarez Sanchez
On Tue, May 9, 2017 at 9:17 PM, Ben Pfaff  wrote:

> Thanks.
>
> This is a good summary of some of what the patch does.  Should it be in
> the commit message?
>
ack
I omitted the commands and outputs in the commit message.



In order to illustrate the purpose of this patch:

- One logical switch sw0 with 2 ports (p1, p2) and 1 localport (lp)
- Two hypervisors: HV1 and HV2
- p1 will be in HV1 (OVS port with external-id:iface-id="p1")
- p2 will be in HV2 (OVS port with external-id:iface-id="p2")
- lp will be in both (OVS port with external-id:iface-id="lp")
- p1 should be able to reach p2 and viceversa
- lp on HV1 should be able to reach p1 but not p2
- lp on HV2 should be able to reach p2 but not p1


ovn-nbctl ls-add sw0
ovn-nbctl lsp-add sw0 p1
ovn-nbctl lsp-add sw0 p2
ovn-nbctl lsp-add sw0 lp
ovn-nbctl lsp-set-addresses p1 "00:00:00:aa:bb:10 10.0.1.10"
ovn-nbctl lsp-set-addresses p2 "00:00:00:aa:bb:20 10.0.1.20"
ovn-nbctl lsp-set-addresses lp "00:00:00:aa:bb:30 10.0.1.30"
ovn-nbctl lsp-set-type lp localport

add_phys_port() {
name=$1
mac=$2
ip=$3
mask=$4
gw=$5
iface_id=$6
sudo ip netns add $name
sudo ovs-vsctl add-port br-int $name -- set interface $name type=internal
sudo ip link set $name netns $name
sudo ip netns exec $name ip link set $name address $mac
sudo ip netns exec $name ip addr add $ip/$mask dev $name
sudo ip netns exec $name ip link set $name up
sudo ip netns exec $name ip route add default via $gw
sudo ovs-vsctl set Interface $name external_ids:iface-id=$iface_id
}

# Add p1 to HV1, p2 to HV2 and localport to both

# HV1
add_phys_port p1 00:00:00:aa:bb:10 10.0.1.10 24 10.0.1.1 p1
add_phys_port lp 00:00:00:aa:bb:30 10.0.1.30 24 10.0.1.1 lp

$ sudo ip netns exec p1 ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.
64 bytes from 10.0.1.20: icmp_seq=1 ttl=64 time=0.738 ms
64 bytes from 10.0.1.20: icmp_seq=2 ttl=64 time=0.502 ms

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.502/0.620/0.738/0.118 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.
64 bytes from 10.0.1.10: icmp_seq=1 ttl=64 time=0.187 ms
64 bytes from 10.0.1.10: icmp_seq=2 ttl=64 time=0.032 ms

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 999ms
rtt min/avg/max/mdev = 0.032/0.109/0.187/0.078 ms


$ sudo ip netns exec lp ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms


$ sudo ovs-ofctl dump-flows br-int | grep table=32
cookie=0x0, duration=141.939s, table=32, n_packets=2, n_bytes=196,
idle_age=123, priority=150,reg14=0x3,reg15=0x2,metadata=0x7 actions=drop
cookie=0x0, duration=141.939s, table=32, n_packets=2, n_bytes=196,
idle_age=129, priority=100,reg15=0x2,metadata=0x7
actions=load:0x7->NXM_NX_TUN_ID[0..23],set_field:0x2->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:59



# On HV2

add_phys_port p2 00:00:00:aa:bb:20 10.0.1.20 24 10.0.1.1 p2
add_phys_port lp 00:00:00:aa:bb:30 10.0.1.30 24 10.0.1.1 lp

$ sudo ip netns exec p2 ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.
64 bytes from 10.0.1.10: icmp_seq=1 ttl=64 time=0.810 ms
64 bytes from 10.0.1.10: icmp_seq=2 ttl=64 time=0.673 ms

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.673/0.741/0.810/0.073 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.20
PING 10.0.1.20 (10.0.1.20) 56(84) bytes of data.
64 bytes from 10.0.1.20: icmp_seq=1 ttl=64 time=0.357 ms
64 bytes from 10.0.1.20: icmp_seq=2 ttl=64 time=0.062 ms

--- 10.0.1.20 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.062/0.209/0.357/0.148 ms

$ sudo ip netns exec lp ping -c 2 10.0.1.10
PING 10.0.1.10 (10.0.1.10) 56(84) bytes of data.

--- 10.0.1.10 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 999ms

$ sudo ovs-ofctl dump-flows br-int | grep table=32
 cookie=0x0, duration=24.169s, table=32, n_packets=2, n_bytes=196,
idle_age=12, priority=150,reg14=0x3,reg15=0x1,metadata=0x7 actions=drop
 cookie=0x0, duration=24.169s, table=32, n_packets=2, n_bytes=196,
idle_age=14, priority=100,reg15=0x1,metadata=0x7
actions=load:0x7->NXM_NX_TUN_ID[0..23],set_field:0x1->tun_metadata0,move:NXM_NX_REG14[0..14]->NXM_NX_TUN_METADATA0[16..30],output:40




> On Tue, May 09, 2017 at 05:09:43PM +0200, Daniel Alvarez Sanchez wrote:
> > Hi,
> >
> > I've submitted a new patch v3 where I removed the external-id
> > "ovn-localport"
> > from the Interface which I used to identify a port as localport in
> > physical.c.
> >
> > Instead, I have passed another parameter "local_lports" to physical_run.
> > When
> > inserting flows in table 32, I'm inserting higher priority 

[ovs-dev] [PATCH v4] OVN localport type support

2017-05-10 Thread Daniel Alvarez
This patch introduces a new type of OVN ports called "localport".
These ports will be present in every hypervisor and may have the
same IP/MAC addresses. They are not bound to any chassis and traffic
to these ports will never go through a tunnel.

Its main use case is the OpenStack metadata API support which relies
on a local agent running on every hypervisor and serving metadata to
VMs locally. This service is described in detail at [0].

An example to illustrate the purpose of this patch:

- One logical switch sw0 with 2 ports (p1, p2) and 1 localport (lp)
- Two hypervisors: HV1 and HV2
- p1 in HV1 (OVS port with external-id:iface-id="p1")
- p2 in HV2 (OVS port with external-id:iface-id="p2")
- lp in both hypervisors (OVS port with external-id:iface-id="lp")
- p1 should be able to reach p2 and viceversa
- lp on HV1 should be able to reach p1 but not p2
- lp on HV2 should be able to reach p2 but not p1

Explicit drop rules are inserted in table 32 with priority 150
in order to prevent traffic originating at a localport from going
over a tunnel.

[0] https://review.openstack.org/#/c/452811/
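
Concretely, on a hypervisor hosting the localport the two relevant flows
in table 32 look like this (taken from the test run earlier in this
thread; tunnel keys are examples):

    priority=150,reg14=0x3,reg15=0x2,metadata=0x7 actions=drop
    priority=100,reg15=0x2,metadata=0x7 actions=...,output:59

reg14 and reg15 carry the logical input and output port keys and metadata
the datapath key, so traffic entering from the localport is dropped before
the lower-priority tunnel-output flow can match.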

Signed-off-by: Daniel Alvarez 
---
 ovn/controller/binding.c|   3 +-
 ovn/controller/ovn-controller.c |   2 +-
 ovn/controller/physical.c   |  34 ++-
 ovn/controller/physical.h   |   4 +-
 ovn/northd/ovn-northd.8.xml |   8 +--
 ovn/northd/ovn-northd.c |   6 +-
 ovn/ovn-architecture.7.xml  |  25 ++--
 ovn/ovn-nb.xml  |   9 +++
 ovn/ovn-sb.xml  |  14 +
 tests/ovn.at| 122 
 10 files changed, 210 insertions(+), 17 deletions(-)

diff --git a/ovn/controller/binding.c b/ovn/controller/binding.c
index 95e9deb..83a7543 100644
--- a/ovn/controller/binding.c
+++ b/ovn/controller/binding.c
@@ -380,7 +380,8 @@ consider_local_datapath(struct controller_ctx *ctx,
 if (iface_rec && qos_map && ctx->ovs_idl_txn) {
 get_qos_params(binding_rec, qos_map);
 }
-our_chassis = true;
+if (strcmp(binding_rec->type, "localport"))
+our_chassis = true;
 } else if (!strcmp(binding_rec->type, "l2gateway")) {
const char *chassis_id = smap_get(&binding_rec->options,
   "l2gateway-chassis");
diff --git a/ovn/controller/ovn-controller.c b/ovn/controller/ovn-controller.c
index f22551d..0f4dd35 100644
--- a/ovn/controller/ovn-controller.c
+++ b/ovn/controller/ovn-controller.c
@@ -655,7 +655,7 @@ main(int argc, char *argv[])
 
physical_run(&ctx, mff_ovn_geneve,
 br_int, chassis, &ct_zones, &lports,
- &flow_table, &local_datapaths);
+ &flow_table, &local_datapaths, &local_lports);
 
ofctrl_put(&flow_table, &pending_ct_zones,
get_nb_cfg(ctx.ovnsb_idl));
diff --git a/ovn/controller/physical.c b/ovn/controller/physical.c
index 457fc45..c98b305 100644
--- a/ovn/controller/physical.c
+++ b/ovn/controller/physical.c
@@ -293,7 +293,8 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
   const struct sbrec_port_binding *binding,
   const struct sbrec_chassis *chassis,
   struct ofpbuf *ofpacts_p,
-  struct hmap *flow_table)
+  struct hmap *flow_table,
+  const struct sset *local_lports)
 {
 uint32_t dp_key = binding->datapath->tunnel_key;
 uint32_t port_key = binding->tunnel_key;
@@ -601,6 +602,32 @@ consider_port_binding(enum mf_field_id mff_ovn_geneve,
 } else {
 /* Remote port connected by tunnel */
 
+/* Table 32, priority 150.
+ * ===
+ *
+ * Drop traffic originating from a localport towards a remote destination.
+ */
+const char *localport;
+SSET_FOR_EACH (localport, local_lports) {
+/* Iterate over all local logical ports and insert a drop
+ * rule with higher priority for every localport in this
+ * datapath. */
+const struct sbrec_port_binding *pb = lport_lookup_by_name(
+lports, localport);
+if (pb && pb->datapath->tunnel_key == dp_key &&
+!strcmp(pb->type, "localport")) {
+match_init_catchall(&match);
+ofpbuf_clear(ofpacts_p);
+/* Match localport logical in_port. */
+match_set_reg(&match, MFF_LOG_INPORT - MFF_REG0,
+  pb->tunnel_key);
+/* Match MFF_LOG_DATAPATH, MFF_LOG_OUTPORT. */
+match_set_metadata(&match, htonll(dp_key));
+match_set_reg(&match, MFF_LOG_OUTPORT - MFF_REG0, port_key);
+ofctrl_add_flow(flow_table, OFTABLE_REMOTE_OUTPUT, 150, 0,
+, ofpacts_p);
+}
+}
 /* Table 32, priority 100.