Re: [ovs-dev] [PATCH] lex: Treat formfeeds as white space.

2016-10-11 Thread Ben Pfaff
On Tue, Oct 11, 2016 at 04:27:40PM -0700, Andy Zhou wrote:
> On Fri, Oct 7, 2016 at 10:04 AM, Ben Pfaff  wrote:
> 
> > Also vertical tabs, whatever those are.
> >
> > Signed-off-by: Ben Pfaff 
> >
> 
> Acked-by: Andy Zhou 

Thanks, I'll apply this to master in a minute.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Optimise the initialization of "port_conf"

2016-10-11 Thread xu . binbin1
"port_conf" is a global const variables, and in function 
dpdk_eth_dev_queue_setup
it is asigned to the local variables "conf". The jumbo_frame bit is set in 
the
local varables, previous configuration shouldn't influence on the other 
port.

Thanks


"Kavanagh, Mark B"  写于 2016/10/11 21:46:00:

> 发件人:  "Kavanagh, Mark B" 
> 收件人:  Binbin Xu , "dev@openvswitch.org" 
> , 
> 日期:  2016/10/11 21:46
> 主题: RE: [ovs-dev] [PATCH] netdev-dpdk: Optimise the 
> initialization of "port_conf"
> 
> >
> >The member "max_rx_pkt_len" of "port_conf" is only used if
> >jumbo_frame enabled, so it can be initialized with value '0'.
> >
> >Signed-off-by: Binbin Xu 
> >---
> > lib/netdev-dpdk.c | 4 +---
> > 1 file changed, 1 insertion(+), 3 deletions(-)
> >
> >diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
> >index 39bf930..c4a0cc0 100644
> >--- a/lib/netdev-dpdk.c
> >+++ b/lib/netdev-dpdk.c
> >@@ -154,6 +154,7 @@ static char *vhost_sock_dir = NULL;   /* 
> Location of vhost-user sockets
> >*/
> > static const struct rte_eth_conf port_conf = {
> > .rxmode = {
> > .mq_mode = ETH_MQ_RX_RSS,
> >+.max_rx_pkt_len = 0,
> > .split_hdr_size = 0,
> > .header_split   = 0, /* Header Split disabled */
> > .hw_ip_checksum = 0, /* IP checksum offload disabled */
> >@@ -648,9 +649,6 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk 
> *dev, int n_rxq, int n_txq)
> > if (dev->mtu > ETHER_MTU) {
> > conf.rxmode.jumbo_frame = 1;
> > conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
> >-} else {
> >-conf.rxmode.jumbo_frame = 0;
> >-conf.rxmode.max_rx_pkt_len = 0;
> > }
> 
> NACK: if a previous configuration already set the jumbo_frame bit, a
> subsequent non-jumbo port could inherit this attribute - that's why 
> it's reset explicitly here.
> 
> > /* A device may report more queues than it makes available (this 
has
> >  * been observed for Intel xl710, which reserves some of them for
> >--
> >2.9.3
> >
> >___
> >dev mailing list
> >dev@openvswitch.org
> >http://openvswitch.org/mailman/listinfo/dev

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] tests: Fix double-rebuild of testsuite for "check-valgrind".

2016-10-11 Thread Andy Zhou
On Wed, Oct 5, 2016 at 6:28 PM, Ben Pfaff  wrote:

> When I ran "make check-valgrind -j10" and the testsuite needed to be
> rebuilt, two copies of it were rebuilt in parallel and sometimes they
> raced with each other.  I don't have the full story on exactly why this
> happened, but this commit, which eliminates redundant dependencies from
> check-* targets, fixes the problem for me.  The dependencies are redundant
> because these targets depend on "all", which also depends on them.
>
> Signed-off-by: Ben Pfaff 
>

I was not clear to me either why the race happened, but the change looks
good to me.

Acked-by: Andy Zhou 

There are two additional targets that have the similar pattern, wonder if
we should change them as well...


diff --git a/tests/automake.mk b/tests/automake.mk
index f022feb..c170ae7 100644
--- a/tests/automake.mk
+++ b/tests/automake.mk
@@ -223,8 +223,7 @@ check-valgrind: all $(valgrind_wrappers) $(check_DATA)
@echo
'--'
@echo 'Valgrind output can be found in
tests/testsuite.dir/*/valgrind.*'
@echo
'--'
-check-helgrind: all tests/atconfig tests/atlocal $(TESTSUITE) \
-$(valgrind_wrappers) $(check_DATA)
+check-helgrind: all $(valgrind_wrappers) $(check_DATA)
-$(SHELL) '$(TESTSUITE)' -C tests CHECK_VALGRIND=true
VALGRIND='$(HELGRIND)' AUTOTEST_PATH='tests/valgri

 ^L
@@ -245,7 +244,7 @@ check-kernel: all
"$$@" || (test X'$(RECHECK)' = Xyes && "$$@" --recheck)

 # Testing the out of tree Kernel module
-check-kmod: all tests/atconfig tests/atlocal $(SYSTEM_KMOD_TESTSUITE)
+check-kmod: all
$(MAKE) modules_install
modprobe -r -a vport-geneve vport-gre vport-lisp vport-stt
vport-vxlan openvswitch
$(MAKE) check-kernel


>
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] lex: Treat formfeeds as white space.

2016-10-11 Thread Andy Zhou
On Fri, Oct 7, 2016 at 10:04 AM, Ben Pfaff  wrote:

> Also vertical tabs, whatever those are.
>
> Signed-off-by: Ben Pfaff 
>

Acked-by: Andy Zhou 
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] meta-flow: Fix the NXM_NX_* names of xxreg2 and xxreg3.

2016-10-11 Thread Jarno Rajahalme

> On Oct 11, 2016, at 8:33 AM, Ben Pfaff  wrote:
> 
> On Mon, Oct 10, 2016 at 11:38:21AM -0700, Jarno Rajahalme wrote:
>> 
>>> On Oct 7, 2016, at 5:54 PM, Justin Pettit  wrote:
>>> 
>>> 
 On Oct 7, 2016, at 5:41 PM, Jarno Rajahalme  wrote:
 
 xxreg2 and xxreg3 had the same NXM_NX_* names as xxreg0 and xxreg1,
 correspondingly.
 
 Found by inspection.
 
 CC: Justin Pettit 
 Fixes: b23ada8eecfd ("Introduce 128-bit xxregs.")
 Signed-off-by: Jarno Rajahalme 
>>> 
>>> Thanks!
>>> 
>>> Acked-by: Justin Pettit 
>> 
>> Thanks for the speedy review! Pushed to master and branch-2.6 with
>> placeholders for four more xxregs.
> 
> Thanks for fixing this.
> 
> It didn't seem to have made it to branch-2.6, so I did the backport
> myself.

It appears I forgot to push after the cherry-pick on the local branch-2.6, duh!

  Jarno

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH v1, 1/1] ovn-northd: fix router ingress table ID in comments

2016-10-11 Thread Ben Pfaff
On Tue, Oct 11, 2016 at 07:50:20AM +, Zongkai LI wrote:
> This patch fixes wrong table ID in comments for logical router ingress
> table IP Routing, ARP Resolution and ARP request.
> 
> Signed-off-by: Zongkai LI 

Applied, thanks!
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] QoS Rate limiting

2016-10-11 Thread KHAN, RAO ADNAN
Hi,

I like to set ingress policy on TAP interfaces using "tc" utility. Reason being 
that I am using a different vswitch and not using OVS, hence cannot use 
ovs-vsctl as mentioned on 
http://openvswitch.org/support/config-cookbooks/qos-rate-limiting/

My question is, how do I get the *device* name of a TAP interface, so that I 
can set qdisc and filters?

Thanks,

Rao Adnan Khan
AT Integrated Cloud (AIC) Development | SE
Software Development & Engineering (SD)
Emai: rk2...@att.com
Cell phone: 972-342-5638

RESTRICTED - PROPRIETARY INFORMATION
This email is the property of AT and intended solely for the use of the 
addressee. If you have reason to believe you have received this in error, 
please delete this immediately; any other use, retention, dissemination, 
copying or printing of this email is strictly prohibited.


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] ovn-test: Fix 'test-ovn composition' crash

2016-10-11 Thread Andy Zhou
Without this fix, the added test will core dump.

Signed-off-by: Andy Zhou 
---
 tests/ovn.at | 4 
 tests/test-ovn.c | 4 ++--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index caf9f98..1f65e75 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -175,6 +175,10 @@ ct_state = NXM_NX_CT_STATE
 ]])
 AT_CLEANUP
 
+AT_SETUP([ovn -- compsition])
+AT_CHECK([ovstest test-ovn composition 2], [0], [ignore])
+AT_CLEANUP
+
 AT_SETUP([ovn -- expression parser])
 dnl For lines without =>, input and expected output are identical.
 dnl For lines with =>, input precedes => and expected output follows =>.
diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 637ac7d..2e82a6f 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -452,11 +452,11 @@ next_composition(unsigned int *state, int s[], int sn)
 j++;
 } else {
 j--;
-s[j] = s[j + 1];
-s[j - 1]++;
 if (!j) {
 return 0;
 }
+s[j] = s[j + 1];
+s[j - 1]++;
 }
 }
 return j + 1;
-- 
1.9.1

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] The Patch netdev-dpdk: add TSO support for vhost-user ports

2016-10-11 Thread Michael Qiu


Hi, Mark


OK, once it's ready, pls let me know and I'm glade to help to test it.


BTW, you mentioned the gap, is that the TSO and CSUM of tunnel offload? 
I'm familiar with it, and I hope I could do something on it if possible.



2016/10/11 22:34, Kavanagh, Mark B :

Hi,  all


This patch is very important for users want to improve the performance
of the large packets.


But you know, in data center, lots of networks using vxlan, so if it
supports vxlan, then it will be very useful.


Would you guys has a plan to support it? I would like to help test it,
or work together on it?

Hi Michael,

Some work has already been done to enable TSO over VxLAN, but it was blocked 
until recently, on account of gaps in the DPKD i40e PMD's support for tunnel 
offload.

We believe that those gaps have now been addressed; our current focus is on 
ironing out the issues in TSO support over flat and VLAN networks - once that's 
done, we plan to resume work on VxLAN support (and GRE, IPinIP).

Hope this helps,
Mark
 


--
Thanks,
Michael



--
Thanks,
Michael


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches inside packet_batch_execute.

2016-10-11 Thread Fischetti, Antonio
Thanks Daniele for all your explanations.
Instead of the proposed changes, is it worth to add a comment to the code just 
to explain why that loop is needed? This way we will avoid other people 
attempting to make the same change in the future.
Could be something like:

+/* All the flow batches need to be reset before any call to 
+packet_batch_per_flow_execute() as it could potentially 
+trigger recirculation.
+When a packet matching flow ‘j’ happens to be recirculated, 
+the nested call to dp_netdev_input__() could potentially 
+classify the packet as matching another flow - say 'k'.
+It could happen that in the previous call to dp_netdev_input__()
+that same flow 'k' had already its own batches[k] still waiting
+to be served.  So if its ‘batch’ member is not reset, the
+recirculated packet would be wrongly appended to batches[k]
+of the 1st call to dp_netdev_input__(). */
 for (i = 0; i < n_batches; i++) {
 batches[i].flow->batch = NULL;
 }


From: Daniele Di Proietto [mailto:diproiet...@ovn.org] 
Sent: Monday, October 10, 2016 1:31 AM
To: Fischetti, Antonio 
Cc: Jarno Rajahalme ; dev@openvswitch.org
Subject: Re: [ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches inside 
packet_batch_execute.



2016-10-09 0:54 GMT-07:00 Fischetti, Antonio :
Thanks Daniele,
in this patch we moved the reset of batches[i].flow->batch
from dp_netdev_input__() into packet_batch_per_flow_execute().
So before calling dp_netdev_execute_actions() the flow->batch is already NULL.

batches[0]->flow->batch is already NULL, but batches[1]->flow->batch is not 
NULL with this patch.  The lookup after recirculation could find 
batches[1]->flow

I think this should prevent recirculation, or am I missing some detail?

Suppose the datapath has the following flow table (I'm not sure something like 
this is ever generated by the current translation code, but it's a valid 
datapath flow table):
in_port(1),recird_id(0),...,tcp(src=80) actions:set(tcp(src=443)),recirc(0)   
#flow 0
in_port(1),recird_id(0),...,tcp(src=443) actions:output:2               
   #flow 1
Suppose the following packets enter the datapath on port 1, in the same input 
batch, in this order:

in_port(1),...,tcp(src=80)   #packet 0
in_port(1),...,tcp(src=443) #packet 1
Here's how the datapath behaves:
The batch is received and classified. We create two batches_per_flow
packet_batch_per_flow[0] = { .flow = flow1, .array = [packet 1] } #batch 0
packet_batch_per_flow[1] = { .flow = flow2, .array = [packet 2] } #batch 1
Flow 0 points to batch_per_flow[0] and flow 1 points to batch_per_flow[1].
Now the datapath starts to execute the batch 0. The recirculation will execute 
immediately and it will find flow 1.  If we didn't clear flow 1->batch before 
executing batch 0 (with this patch we don't), the datapath will append #packet 
0 (after recirculation) to batch 1.  As I said, I believe this to be wrong 
because we should have created a new batch.
Every time I look at that code again I need some time to re understand this, 
and I actually got this wrong when I introduced the batch pointer into struct 
dp_netdev_flow, so I agree it's quite complicated.  Please, feel free to ask 
more if my reasoning seems wrong or if I'm making wrong assumptions.
Thanks,
Daniele


Antonio

> -Original Message-
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Daniele Di
> Proietto
> Sent: Friday, October 7, 2016 11:46 PM
> To: Jarno Rajahalme 
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 05/12] dpif-netdev: Clear flow batches
> inside packet_batch_execute.
>
> This patch basically reverts 603f2ce04d00("dpif-netdev: Clear flow batches
> before execute.")
>
> As explained in that commit message the problem is that
> packet_batch_per_flow_execute() can trigger recirculation.  This means
> that
> we will call recursively dp_netdev_input__().  Here's a stack frame:
>
> dp_netdev_input__()
> {
>     struct packet_batch_per_flow batches[PKT_ARRAY_SIZE];
>         /*...*/
>         packet_batch_per_flow_execute()
>         {
>             dp_execute_cb()
>             {
>                 case OVS_ACTION_ATTR_RECIRC:
>                     dp_netdev_input__() {
>                         struct packet_batch_per_flow
> batches[PKT_ARRAY_SIZE];
>                         /*...*/
>                     }
>             }
>
> }
>
> In the inner dp_netdev_input__() a lookup could theoretically find the
> same
> flows that it finds in the outer.
> If we don't clear _all_ the batch pointers before calling the inner
> dp_netdev_input__() we could find a flow that still has a pointer to a
> batch in the outer dp_netdev_input__(). Then we will append packets to the
> outer batch, which is clearly wrong.
>
> 2016-10-07 14:09 GMT-07:00 Jarno Rajahalme :
>
> > Daniele had a comment on this, I believe?
> >
> > > 

Re: [ovs-dev] [PATCH 09/12] dpif: Reorder elements in dpif_upcall structure.

2016-10-11 Thread Bodireddy, Bhanuprakash
>-Original Message-
>From: Jarno Rajahalme [mailto:ja...@ovn.org]
>Sent: Monday, October 10, 2016 9:01 PM
>To: Bodireddy, Bhanuprakash 
>Cc: dev@openvswitch.org
>Subject: Re: [ovs-dev] [PATCH 09/12] dpif: Reorder elements in dpif_upcall
>structure.
>
>How about making the ‘dp_packet’ member the first member and adding a
>comment that this should be first?
This makes sense and by doing so we can avoid the hole, will make  this change 
in V2.

- Bhanu Prakash. 

>
>  Jarno
>
>> On Oct 10, 2016, at 8:42 AM, Bodireddy, Bhanuprakash
> wrote:
>>
>> Hello jarno,
>>
>> Thanks for the feedback, while reordering the members of dpif_upcall, I had
>to deviate from standards due to below reason.
>> The dp_packet member has mbuf as first member that starts at new cache
>line creating hole of size 60 bytes between dpif_upcall_type and dp_packet as
>pointed below.
>>
>> struct dpif_upcall {
>>enum dpif_upcall_type  type;
>>
>>   -->  60 bytes hole
>>
>>/* --- cacheline 1 boundary (64 bytes) --- */
>>struct dp_packet {
>>struct rte_mbuf {
>>/* typedef MARKER */ void * cacheline0[0]; /* 
>>64 0 */
>>
>>   }
>>   struct nlattr *key;
>> .
>> .
>> }
>>
>> I tried to pack this hole by moving other members in to this space.
>>
>> Regards,
>> Bhanu Prakash.
>>
>>> -Original Message-
>>> From: Jarno Rajahalme [mailto:ja...@ovn.org]
>>> Sent: Friday, October 7, 2016 10:11 PM
>>> To: Bodireddy, Bhanuprakash 
>>> Cc: dev@openvswitch.org
>>> Subject: Re: [ovs-dev] [PATCH 09/12] dpif: Reorder elements in
>>> dpif_upcall structure.
>>>
>>> CodingStyle.md instructs to group struct members into related groups.
>>> Also, changing the relative order of pointers should not make any
>>> difference. Could you achieve the same by reordering just the members
>>> above the ‘DPIF_UC_ACTION only.’ comment?
>>>
>>> Jarno
>>>
 On Oct 7, 2016, at 9:17 AM, Bhanuprakash Bodireddy
>>>  wrote:

 By reordering the data elements in dpif_upcall structure, pad bytes
 can be reduced and also a cache line.

 Before: structure size:768, holes:1, sum padbytes:60, cachelines:12
 After: structure size:656, holes:1, sum padbytes:4, cachelines:11

 Signed-off-by: Bhanuprakash Bodireddy
 
 Signed-off-by: Antonio Fischetti 
 ---
 lib/dpif.h | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

 diff --git a/lib/dpif.h b/lib/dpif.h index a7c5097..4a4bb3d 100644
 --- a/lib/dpif.h
 +++ b/lib/dpif.h
 @@ -779,17 +779,18 @@ const char *dpif_upcall_type_to_string(enum
 dpif_upcall_type); struct dpif_upcall {
/* All types. */
enum dpif_upcall_type type;
 -struct dp_packet packet;   /* Packet data. */
 -struct nlattr *key; /* Flow key. */
 -size_t key_len; /* Length of 'key' in bytes. */
 -ovs_u128 ufid;  /* Unique flow identifier for 'key'. */
 -struct nlattr *mru; /* Maximum receive unit. */
 -struct nlattr *cutlen;  /* Number of bytes shrink from the end. */

/* DPIF_UC_ACTION only. */
 -struct nlattr *userdata;/* Argument to
>>> OVS_ACTION_ATTR_USERSPACE. */
 -struct nlattr *out_tun_key;/* Output tunnel key. */
struct nlattr *actions;/* Argument to
>OVS_ACTION_ATTR_USERSPACE.
>>> */
 +struct nlattr *out_tun_key;/* Output tunnel key. */
 +struct nlattr *userdata;/* Argument to
>>> OVS_ACTION_ATTR_USERSPACE. */
 +
 +struct nlattr *cutlen;  /* Number of bytes shrink from the end. */
 +struct nlattr *mru; /* Maximum receive unit. */
 +ovs_u128 ufid;  /* Unique flow identifier for 'key'. */
 +struct dp_packet packet;   /* Packet data. */
 +struct nlattr *key; /* Flow key. */
 +size_t key_len; /* Length of 'key' in bytes. */
 };

 /* A callback to notify higher layer of dpif about to be purged, so
 that
 --
 2.4.11

 ___
 dev mailing list
 dev@openvswitch.org
 http://openvswitch.org/mailman/listinfo/dev
>>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 4/4] expr: Better simplify some special cases of expressions.

2016-10-11 Thread Ben Pfaff
On Mon, Oct 10, 2016 at 04:52:07PM -0700, Andy Zhou wrote:
> On Thu, Oct 6, 2016 at 8:30 PM, Ben Pfaff  wrote:
> 
> > It's pretty unlikely that a human would write expressions like these, but
> > they can come up in machine-generated expressions and it seems best to
> > simplify them in an efficient way.
> >
> > Signed-off-by: Ben Pfaff 
> >
> Acked-by: Andy Zhou 

Thanks for the reviews!  I applied these to master and I'm planning to
apply the first three to branch-2.6 in a minute.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH 01/12] dpcls: Use 32 packet batches for lookups.

2016-10-11 Thread Fischetti, Antonio
Comments inline.

Thanks,
Antonio

> -Original Message-
> From: dev [mailto:dev-boun...@openvswitch.org] On Behalf Of Jarno
> Rajahalme
> Sent: Friday, October 7, 2016 10:08 PM
> To: Bodireddy, Bhanuprakash 
> Cc: dev@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH 01/12] dpcls: Use 32 packet batches for
> lookups.
> 
> 
> > On Oct 7, 2016, at 9:17 AM, Bhanuprakash Bodireddy
>  wrote:
> >
> > This patch increases the number of packets processed in a batch during a
> > lookup from 16 to 32. Processing batches of 32 packets improves
> > performance and also one of the internal loops can be avoided here.
> >
> 
> Can you provide some qualification of the performance test conditions? Do
> you believe the performance difference applies universally?
> 
[Antonio F] We saw a performance improvement in EMC disabled PHY2PHY loopback 
testcase with 2 physical ports and few tens of active IXIA streams. With 1 
subtable - flow: from port1 to port2 - we got a throughput increment of +150 
kpps. With 4 subtables the increment was +80 kpps. 
Also we spent some time profiling OVS with VTune. When comparing stock ovs with 
the patched one, it’s been observed that for dpcls_lookup(), there is 
significant reduction in the overall retired instructions (reduced by 
729,800,000), CPI rate improved from 0.471 to 0.427 and Front-end bound cycles 
reduced from 19.1% to 8.5%.


> > Signed-off-by: Antonio Fischetti 
> > Signed-off-by: Bhanuprakash Bodireddy 
> > ---
> > lib/dpif-netdev.c | 109 +++-
> --
> > 1 file changed, 46 insertions(+), 63 deletions(-)
> >
> > diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
> > index 6e09e44..c002dd3 100644
> > --- a/lib/dpif-netdev.c
> > +++ b/lib/dpif-netdev.c
> > @@ -4975,23 +4975,20 @@ dpcls_lookup(struct dpcls *cls, const struct
> netdev_flow_key keys[],
> >  int *num_lookups_p)
> > {
> > /* The received 'cnt' miniflows are the search-keys that will be
> processed
> > - * in batches of 16 elements.  N_MAPS will contain the number of
> these
> > - * 16-elements batches.  i.e. for 'cnt' = 32, N_MAPS will be 2.
> The batch
> > - * size 16 was experimentally found faster than 8 or 32. */
> > -typedef uint16_t map_type;
> > + * to find a matching entry into the available subtables.
> > + * The number of bits in map_type is equal to NETDEV_MAX_BURST. */
> 
> This needs a build time assertion to catch any future change in
> NETDEV_MAX_BURST.
> 
> Preferably, if you can verify that the compiler is able to remove the
> unnecessary loop in this case, you should consider not removing the extra
> loop that would kick in only if NETDEV_MAX_BURST becomes larger than 32.
> 
[Antonio F] Is that ok if we add the following line after the MAP_BITS define?
BUILD_ASSERT_DECL(MAP_BITS >= NETDEV_MAX_BURST);

It seems NETDEV_MAX_BURST was reduced to 32 from 256 a while ago, and one of 
the reasons 
being the slow VMs and packet drops at VMs. Can we safely assume that 
NETDEV_MAX_BURST is
unlikely to be increased as it has wider consequence on the system stability?


> > +typedef uint32_t map_type;
> > #define MAP_BITS (sizeof(map_type) * CHAR_BIT)
> >
> > -#if !defined(__CHECKER__) && !defined(_WIN32)
> > -const int N_MAPS = DIV_ROUND_UP(cnt, MAP_BITS);
> > -#else
> > -enum { N_MAPS = DIV_ROUND_UP(NETDEV_MAX_BURST, MAP_BITS) };
> > -#endif
> > -map_type maps[N_MAPS];
> > struct dpcls_subtable *subtable;
> >
> > -memset(maps, 0xff, sizeof maps);
> > -if (cnt % MAP_BITS) {
> > -maps[N_MAPS - 1] >>= MAP_BITS - cnt % MAP_BITS; /* Clear extra
> bits. */
> > +map_type keys_map = 0x;
> > +map_type found_map;
> > +uint32_t hashes[MAP_BITS];
> > +const struct cmap_node *nodes[MAP_BITS];
> > +
> > +if (OVS_UNLIKELY(cnt != NETDEV_MAX_BURST)) {
> > +keys_map >>= NETDEV_MAX_BURST - cnt; /* Clear extra bits. */
> > }
> > memset(rules, 0, cnt * sizeof *rules);
> >
> > @@ -5007,59 +5004,45 @@ dpcls_lookup(struct dpcls *cls, const struct
> netdev_flow_key keys[],
> > PVECTOR_FOR_EACH (subtable, >subtables) {
> > const struct netdev_flow_key *mkeys = keys;
> > struct dpcls_rule **mrules = rules;
> > -map_type remains = 0;
> > -int m;
> > -
> > -BUILD_ASSERT_DECL(sizeof remains == sizeof *maps);
> > -
> > -/* Loops on each batch of 16 search-keys. */
> > -for (m = 0; m < N_MAPS; m++, mkeys += MAP_BITS, mrules +=
> MAP_BITS) {
> > -uint32_t hashes[MAP_BITS];
> > -const struct cmap_node *nodes[MAP_BITS];
> > -unsigned long map = maps[m];
> > -int i;
> > -
> > -if (!map) {
> > -continue; /* Skip empty maps. */
> > -}
> > -
> > -/* Compute hashes for the remaining keys.  

Re: [ovs-dev] [PATCH] meta-flow: Fix the NXM_NX_* names of xxreg2 and xxreg3.

2016-10-11 Thread Ben Pfaff
On Mon, Oct 10, 2016 at 11:38:21AM -0700, Jarno Rajahalme wrote:
> 
> > On Oct 7, 2016, at 5:54 PM, Justin Pettit  wrote:
> > 
> > 
> >> On Oct 7, 2016, at 5:41 PM, Jarno Rajahalme  wrote:
> >> 
> >> xxreg2 and xxreg3 had the same NXM_NX_* names as xxreg0 and xxreg1,
> >> correspondingly.
> >> 
> >> Found by inspection.
> >> 
> >> CC: Justin Pettit 
> >> Fixes: b23ada8eecfd ("Introduce 128-bit xxregs.")
> >> Signed-off-by: Jarno Rajahalme 
> > 
> > Thanks!
> > 
> > Acked-by: Justin Pettit 
> 
> Thanks for the speedy review! Pushed to master and branch-2.6 with
> placeholders for four more xxregs.

Thanks for fixing this.

It didn't seem to have made it to branch-2.6, so I did the backport
myself.
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH 1/2] ovn-nbctl: Add NAT commands.

2016-10-11 Thread nickcooper-zhangtonghao
This patch provides the command line to create NAT rules
on logical router.

Signed-off-by: nickcooper-zhangtonghao 
---
 ovn/utilities/ovn-nbctl.8.xml |  66 +++
 ovn/utilities/ovn-nbctl.c | 185 ++
 tests/ovn-nbctl.at| 114 ++
 3 files changed, 365 insertions(+)

diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index 2cbd6e0..be802da 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -424,6 +424,72 @@
   
 
 
+NAT Commands
+
+
+  [--may-exist] lr-nat-add router 
type external_ip logical_ip
+  
+
+  Adds the specified NAT to router.
+  The type must be one of snat,
+  dnat, or dnat_and_snat.
+  The external_ip is an IPv4 address.
+  The logical_ip is an IPv4 network (e.g 192.168.1.0/24)
+  or an IPv4 address.
+
+
+  When type is dnat, the externally
+  visible IP address external_ip is DNATted to the
+  IP address logical_ip in the logical space.
+
+
+  When type is snat, IP packets with their
+  source IP address that either matches the IP address in
+  logical_ip or is in the network provided by
+  logical_ip is SNATed into the IP address in
+  external_ip.
+
+
+  When type is dnat_and_snat,
+  the externally visible IP address external_ip
+  is DNATted to the IP address logical_ip in
+  the logical space.  In addition, IP packets with the source
+  IP address that matches logical_ip is SNATed into
+  the IP address in external_ip.
+
+
+  It is an error if a NAT already exists,
+  unless --may-exist is specified.
+
+  
+
+  [--if-exists] lr-nat-del router 
[type [ip]]
+  
+
+  Deletes NATs from router.  If only router
+  is supplied, all the NATs from the logical router are
+  deleted.  If type is also specified, then all the
+  NATs that match the type will be deleted from the logical
+  router.  If all the fields are given, then a single NAT rule
+  that matches all the fields will be deleted.  When type
+  is snat, the ip should be logical_ip.
+  When type is dnat or
+  dnat_and_snat, the ip shoud be external_ip.
+
+
+
+  It is an error if ip is specified and there
+  is no matching NAT entry, unless --if-exists is
+  specified.
+
+  
+
+  lr-nat-list router
+  
+Lists the NATs on router.
+  
+
+
 Load Balancer Commands
 
   [--may-exist | --add-duplicate] 
lb-add lb vip ips 
[protocol]
diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
index ad2d2f8..916dedb 100644
--- a/ovn/utilities/ovn-nbctl.c
+++ b/ovn/utilities/ovn-nbctl.c
@@ -384,6 +384,13 @@ Route commands:\n\
 remove routes from ROUTER\n\
   lr-route-list ROUTER  print routes for ROUTER\n\
 \n\
+NAT commands:\n\
+  lr-nat-add ROUTER TYPE EXTERNAL_IP LOGICAL_IP\n\
+add a NAT to ROUTER\n\
+  lr-nat-del ROUTER [TYPE [IP]]\n\
+remove NATs from ROUTER\n\
+  lr-nat-list ROUTERprint NATs for ROUTER\n\
+\n\
 LB commands:\n\
   lb-add LB VIP[:PORT] IP[:PORT]... [PROTOCOL]\n\
 create a load-balancer or add a VIP to an\n\
@@ -2165,6 +2172,177 @@ nbctl_lr_route_del(struct ctl_context *ctx)
 }
 free(prefix);
 }
+
+static void
+nbctl_lr_nat_add(struct ctl_context *ctx)
+{
+const struct nbrec_logical_router *lr;
+const char *nat_type = ctx->argv[2];
+const char *external_ip = ctx->argv[3];
+const char *logical_ip = ctx->argv[4];
+char *new_logical_ip = NULL;
+
+lr = lr_by_name_or_uuid(ctx, ctx->argv[1], true);
+
+if (strcmp(nat_type, "dnat") && strcmp(nat_type, "snat")
+&& strcmp(nat_type, "dnat_and_snat")) {
+ctl_fatal("%s: type must be one of \"dnat\", \"snat\" and "
+"\"dnat_and_snat\".", nat_type);
+}
+
+ovs_be32 ipv4 = 0;
+unsigned int plen;
+if (!ip_parse(external_ip, )) {
+ctl_fatal("%s: should be an IPv4 address.", external_ip);
+}
+
+if (strcmp("snat", nat_type)) {
+if (!ip_parse(logical_ip, )) {
+ctl_fatal("%s: should be an IPv4 address.", logical_ip);
+}
+new_logical_ip = xstrdup(logical_ip);
+} else {
+char *error = ip_parse_cidr(logical_ip, , );
+if (error) {
+free(error);
+ctl_fatal("%s: should be an IPv4 address or network.",
+logical_ip);
+}
+new_logical_ip = normalize_ipv4_prefix(ipv4, plen);
+}
+
+bool may_exist = shash_find(>options, 

[ovs-dev] [PATCH 2/2] ovn-nbctl: Fix manpage formatting typo.

2016-10-11 Thread nickcooper-zhangtonghao
Signed-off-by: nickcooper-zhangtonghao 
---
 ovn/utilities/ovn-nbctl.8.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ovn/utilities/ovn-nbctl.8.xml b/ovn/utilities/ovn-nbctl.8.xml
index be802da..993e1c4 100644
--- a/ovn/utilities/ovn-nbctl.8.xml
+++ b/ovn/utilities/ovn-nbctl.8.xml
@@ -412,7 +412,7 @@
 
 
 
-  It is an error if prefix is specified and there
+  It is an error if prefix is specified and there
   is no matching route entry, unless --if-exists is
   specified.
 
-- 
1.8.3.1




___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] The Patch netdev-dpdk: add TSO support for vhost-user ports

2016-10-11 Thread Kavanagh, Mark B
>
>Hi,  all
>
>
>This patch is very important for users want to improve the performance
>of the large packets.
>
>
>But you know, in data center, lots of networks using vxlan, so if it
>supports vxlan, then it will be very useful.
>
>
>Would you guys has a plan to support it? I would like to help test it,
>or work together on it?

Hi Michael,

Some work has already been done to enable TSO over VxLAN, but it was blocked 
until recently, on account of gaps in the DPKD i40e PMD's support for tunnel 
offload.

We believe that those gaps have now been addressed; our current focus is on 
ironing out the issues in TSO support over flat and VLAN networks - once that's 
done, we plan to resume work on VxLAN support (and GRE, IPinIP).

Hope this helps,
Mark

>
> 
>--
>Thanks,
>Michael
>

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] The Patch netdev-dpdk: add TSO support for vhost-user ports

2016-10-11 Thread Michael Qiu

Hi,  all


This patch is very important for users want to improve the performance 
of the large packets.



But you know, in data center, lots of networks using vxlan, so if it 
supports vxlan, then it will be very useful.



Would you guys has a plan to support it? I would like to help test it, 
or work together on it?



--
Thanks,
Michael


___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


Re: [ovs-dev] [PATCH] netdev-dpdk: Optimise the initialization of "port_conf"

2016-10-11 Thread Kavanagh, Mark B
>
>The member "max_rx_pkt_len" of "port_conf" is only used if
>jumbo_frame enabled, so it can be initialized with value '0'.
>
>Signed-off-by: Binbin Xu 
>---
> lib/netdev-dpdk.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
>diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
>index 39bf930..c4a0cc0 100644
>--- a/lib/netdev-dpdk.c
>+++ b/lib/netdev-dpdk.c
>@@ -154,6 +154,7 @@ static char *vhost_sock_dir = NULL;   /* Location of 
>vhost-user sockets
>*/
> static const struct rte_eth_conf port_conf = {
> .rxmode = {
> .mq_mode = ETH_MQ_RX_RSS,
>+.max_rx_pkt_len = 0,
> .split_hdr_size = 0,
> .header_split   = 0, /* Header Split disabled */
> .hw_ip_checksum = 0, /* IP checksum offload disabled */
>@@ -648,9 +649,6 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
>n_rxq, int n_txq)
> if (dev->mtu > ETHER_MTU) {
> conf.rxmode.jumbo_frame = 1;
> conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
>-} else {
>-conf.rxmode.jumbo_frame = 0;
>-conf.rxmode.max_rx_pkt_len = 0;
> }

NACK: if a previous configuration already set the jumbo_frame bit, a subsequent 
non-jumbo port could inherit this attribute - that's why it's reset explicitly 
here.

> /* A device may report more queues than it makes available (this has
>  * been observed for Intel xl710, which reserves some of them for
>--
>2.9.3
>
>___
>dev mailing list
>dev@openvswitch.org
>http://openvswitch.org/mailman/listinfo/dev
___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH] netdev-dpdk: Optimise the initialization of "port_conf"

2016-10-11 Thread Binbin Xu
The member "max_rx_pkt_len" of "port_conf" is only used if
jumbo_frame enabled, so it can be initialized with value '0'.

Signed-off-by: Binbin Xu 
---
 lib/netdev-dpdk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 39bf930..c4a0cc0 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -154,6 +154,7 @@ static char *vhost_sock_dir = NULL;   /* Location of 
vhost-user sockets */
 static const struct rte_eth_conf port_conf = {
 .rxmode = {
 .mq_mode = ETH_MQ_RX_RSS,
+.max_rx_pkt_len = 0,
 .split_hdr_size = 0,
 .header_split   = 0, /* Header Split disabled */
 .hw_ip_checksum = 0, /* IP checksum offload disabled */
@@ -648,9 +649,6 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 if (dev->mtu > ETHER_MTU) {
 conf.rxmode.jumbo_frame = 1;
 conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
-} else {
-conf.rxmode.jumbo_frame = 0;
-conf.rxmode.max_rx_pkt_len = 0;
 }
 /* A device may report more queues than it makes available (this has
  * been observed for Intel xl710, which reserves some of them for
-- 
2.9.3

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [PATCH v1, 1/1] ovn-northd: fix router ingress table ID in comments

2016-10-11 Thread Zongkai LI
This patch fixes wrong table ID in comments for logical router ingress
table IP Routing, ARP Resolution and ARP request.

Signed-off-by: Zongkai LI 
---
 ovn/northd/ovn-northd.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ovn/northd/ovn-northd.c b/ovn/northd/ovn-northd.c
index 281dc62..9d74ec6 100644
--- a/ovn/northd/ovn-northd.c
+++ b/ovn/northd/ovn-northd.c
@@ -4014,7 +4014,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
   "ip", "flags.loopback = 1; ct_dnat;");
 }
 
-/* Logical router ingress table 4: IP Routing.
+/* Logical router ingress table 5: IP Routing.
  *
  * A packet that arrives at this table is an IP packet that should be
  * routed to the address in 'ip[46].dst'. This table sets outport to
@@ -4056,7 +4056,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
 
 /* XXX destination unreachable */
 
-/* Local router ingress table 5: ARP Resolution.
+/* Local router ingress table 6: ARP Resolution.
  *
  * Any packet that reaches this table is an IP packet whose next-hop IP
  * address is in reg0. (ip4.dst is the final destination.) This table
@@ -4251,7 +4251,7 @@ build_lrouter_flows(struct hmap *datapaths, struct hmap 
*ports,
   "get_nd(outport, xxreg0); next;");
 }
 
-/* Local router ingress table 6: ARP request.
+/* Local router ingress table 7: ARP request.
  *
  * In the common case where the Ethernet destination has been resolved,
  * this table outputs the packet (priority 0).  Otherwise, it composes
-- 
2.7.4

___
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev


[ovs-dev] [CudaMailTagged] [PATCH v3] netdev-dpdk: In some case, needn't reconfigure pmd threads when changing MTU

2016-10-11 Thread Binbin Xu
If the port is not an ethernet port, and the aligned size for the new MTU
doesn't change, we needn't to reconfigure pmd thread. What we should do
is that updating 'max_packet_len' atomic.

Signed-off-by: Binbin Xu 
---
 lib/netdev-dpdk.c | 38 --
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index 39bf930..f13bc8e 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -338,7 +338,7 @@ struct ingress_policer {
 struct netdev_dpdk {
 struct netdev up;
 int port_id;
-int max_packet_len;
+atomic_int max_packet_len;
 enum dpdk_dev_type type;
 
 struct dpdk_tx_queue *tx_q;
@@ -586,7 +586,8 @@ netdev_dpdk_mempool_configure(struct netdev_dpdk *dev)
 dev->dpdk_mp = mp;
 dev->mtu = dev->requested_mtu;
 dev->socket_id = dev->requested_socket_id;
-dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
+atomic_store_relaxed(>max_packet_len,
+MTU_TO_FRAME_LEN(dev->mtu));
 }
 
 return 0;
@@ -647,7 +648,8 @@ dpdk_eth_dev_queue_setup(struct netdev_dpdk *dev, int 
n_rxq, int n_txq)
 
 if (dev->mtu > ETHER_MTU) {
 conf.rxmode.jumbo_frame = 1;
-conf.rxmode.max_rx_pkt_len = dev->max_packet_len;
+atomic_read_relaxed(>max_packet_len, 
+_rx_pkt_len);
 } else {
 conf.rxmode.jumbo_frame = 0;
 conf.rxmode.max_rx_pkt_len = 0;
@@ -840,7 +842,7 @@ netdev_dpdk_init(struct netdev *netdev, unsigned int 
port_no,
 dev->type = type;
 dev->flags = 0;
 dev->requested_mtu = dev->mtu = ETHER_MTU;
-dev->max_packet_len = MTU_TO_FRAME_LEN(dev->mtu);
+atomic_init(>max_packet_len, MTU_TO_FRAME_LEN(dev->mtu));
 ovsrcu_index_init(>vid, -1);
 dev->vhost_reconfigured = false;
 
@@ -1511,12 +1513,15 @@ netdev_dpdk_filter_packet_len(struct netdev_dpdk *dev, 
struct rte_mbuf **pkts,
 int i = 0;
 int cnt = 0;
 struct rte_mbuf *pkt;
+int max_packet_len;
+
+atomic_read_relaxed(>max_packet_len, _packet_len);
 
 for (i = 0; i < pkt_cnt; i++) {
 pkt = pkts[i];
-if (OVS_UNLIKELY(pkt->pkt_len > dev->max_packet_len)) {
+if (OVS_UNLIKELY(pkt->pkt_len > max_packet_len)) {
 VLOG_WARN_RL(, "%s: Too big size %" PRIu32 " max_packet_len %d",
- dev->up.name, pkt->pkt_len, dev->max_packet_len);
+ dev->up.name, pkt->pkt_len, max_packet_len);
 rte_pktmbuf_free(pkt);
 continue;
 }
@@ -1620,6 +1625,7 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct 
dp_packet_batch *batch)
 int dropped = 0;
 int newcnt = 0;
 int i;
+int max_packet_len;
 
 /* If we are on a non pmd thread we have to use the mempool mutex, because
  * every non pmd thread shares the same mempool cache */
@@ -1630,12 +1636,14 @@ dpdk_do_tx_copy(struct netdev *netdev, int qid, struct 
dp_packet_batch *batch)
 
 dp_packet_batch_apply_cutlen(batch);
 
+atomic_read_relaxed(>max_packet_len, _packet_len);
+
 for (i = 0; i < batch->count; i++) {
 int size = dp_packet_size(batch->packets[i]);
 
-if (OVS_UNLIKELY(size > dev->max_packet_len)) {
+if (OVS_UNLIKELY(size > max_packet_len)) {
 VLOG_WARN_RL(, "Too big size %d max_packet_len %d",
- (int) size, dev->max_packet_len);
+ (int) size, max_packet_len);
 
 dropped++;
 continue;
@@ -1803,7 +1811,15 @@ netdev_dpdk_set_mtu(struct netdev *netdev, int mtu)
 ovs_mutex_lock(>mutex);
 if (dev->requested_mtu != mtu) {
 dev->requested_mtu = mtu;
-netdev_request_reconfigure(netdev);
+
+if (dev->type == DPDK_DEV_ETH
+|| dpdk_buf_size(dev->mtu) != dpdk_buf_size(mtu)) {
+netdev_request_reconfigure(netdev);
+} else {
+dev->mtu = mtu;
+atomic_store_relaxed(>max_packet_len, 
+MTU_TO_FRAME_LEN(dev->mtu));
+}
 }
 ovs_mutex_unlock(>mutex);
 
@@ -2234,6 +2250,7 @@ netdev_dpdk_get_status(const struct netdev *netdev, 
struct smap *args)
 {
 struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
 struct rte_eth_dev_info dev_info;
+int max_packet_len;
 
 if (!rte_eth_dev_is_valid_port(dev->port_id)) {
 return ENODEV;
@@ -2248,7 +2265,8 @@ netdev_dpdk_get_status(const struct netdev *netdev, 
struct smap *args)
rte_eth_dev_socket_id(dev->port_id));
 smap_add_format(args, "driver_name", "%s", dev_info.driver_name);
 smap_add_format(args, "min_rx_bufsize", "%u", dev_info.min_rx_bufsize);
-smap_add_format(args, "max_rx_pktlen", "%u", dev->max_packet_len);
+atomic_read_relaxed(>max_packet_len, _packet_len);
+smap_add_format(args, "max_rx_pktlen", "%u", max_packet_len);
 smap_add_format(args, "max_rx_queues", "%u",