Re: [ovs-dev] [PATCH v2] openvswitch: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread Pravin Shelar
On Mon, Sep 11, 2017 at 12:56 PM, Christophe JAILLET
 wrote:
> All other error handling paths in this function go through the 'error'
> label. This one should do the same.
>
> Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
> Signed-off-by: Christophe JAILLET 
> ---
> I think that the comment above the function could be improved. It looks
> like the commit log which has introduced this function.
>
> I'm also not sure that commit 9cc9a5cb176c is of any help. It is
> supposed to remove a warning, and I guess it does. But 
> 'ovs_nla_init_match_and_action()'
> is called unconditionnaly from 'ovs_flow_cmd_set()'. So even if the stack
> used by each function is reduced, the overall stack should be the same, if
> not larger.
>
It depends on which function stack depth are are looking at. for some
function it remains same. For nested function it goes down.

> So this commit sounds like adding a bug where the code was fine and states
> to fix an issue but, at the best, only hides it.
>
> Instead of fixing the code with the proposed patch, reverting the initial
> commit could also be considered.
>
> V2: update Subject line
> ---
>  net/openvswitch/datapath.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
> index 76cf273a56c7..c3aec6227c91 100644
> --- a/net/openvswitch/datapath.c
> +++ b/net/openvswitch/datapath.c
> @@ -1112,7 +1112,8 @@ static int ovs_nla_init_match_and_action(struct net 
> *net,
> if (!a[OVS_FLOW_ATTR_KEY]) {
> OVS_NLERR(log,
>   "Flow key attribute not present in set 
> flow.");
> -   return -EINVAL;
> +   error = -EINVAL;
> +   goto error;
> }
>
Patch looks good to me.

Acked-by: Pravin B Shelar 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] conntrack: Fix conn_type need be checked when remove rev_conn.

2017-09-11 Thread Darrell Ball
We cannot merge this patch.

Can you provide answers to the questions I asked here

https://mail.openvswitch.org/pipermail/ovs-discuss/2017-September/045308.html

Thanks Darrell


On Mon, Sep 11, 2017 at 2:49 AM, w00273186  wrote:

> From: Yunjian Wang 
>
> The rev_conn need will be removed, only when conn_type is
> CT_CONN_TYPE_UN_NAT.
> This crash will be triggered when remove conn in ct-clean thread.
>
> Reported-by: Lili Huang 
> Signed-off-by: Yunjian Wang 
> ---
>  lib/conntrack.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/lib/conntrack.c b/lib/conntrack.c
> index 419cb1d..c1adb56 100644
> --- a/lib/conntrack.c
> +++ b/lib/conntrack.c
> @@ -684,9 +684,10 @@ nat_clean(struct conntrack *ct, struct conn *conn,
>
>  /* In the unlikely event, rev conn was recreated, then skip
>   * rev_conn cleanup. */
> -if (rev_conn && (!nat_conn_key_node ||
> - conn_key_cmp(_conn_key_node->value,
> -  _conn->rev_key))) {
> +if (rev_conn &&
> +(rev_conn->conn_type == CT_CONN_TYPE_UN_NAT) &&
> +(!nat_conn_key_node || conn_key_cmp(_conn_key_node->value,
> +_conn->rev_key))) {
>  hmap_remove(>buckets[bucket_rev_conn].connections,
>  _conn->node);
>  free(rev_conn);
> --
> 1.8.3.1
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix sparse error on test-ovn.c

2017-09-11 Thread Shashank Ram




_
Subject: [ovs-dev] [PATCH] tests: Fix sparse error on test-ovn.c

Commit d5c70d4bcc344ae10a644b83f1790a0235871efc fixed the MSVC issue
however, introduced a sparse error:
"tests/test-ovn.c:205:43: warning: Using plain integer as NULL pointer"

Use 'NULL' instead of '0'.

Signed-off-by: Alin Gabriel Serdean 
---
 
Acked-by: Shashank Ram 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for reused dp_packets

2017-09-11 Thread Darrell Ball
This looks good Zoltan; we will fold in.

Thanks Darrell

On 9/11/17, 1:53 AM, "Zoltán Balogh"  wrote:

Hi Darrell,



Thank you for the clarification! I removed the line below from the source. 
I sent v5 to the dev list:


https://urldefense.proofpoint.com/v2/url?u=https-3A__mail.openvswitch.org_pipermail_ovs-2Ddev_2017-2DSeptember_338567.html=DwIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=dGZmbKhBG9tJHY4odedsGA=b0qkwdVfwE55Yjm1IETXhBSdMvSgv_mxJnzc0_8jrMc=QXJz2bcu8GG8lKrnw9ShljKTf7Gl_E7KpsUhU4NiLBs=
 


https://urldefense.proofpoint.com/v2/url?u=https-3A__patchwork.ozlabs.org_patch_812245_=DwIGaQ=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs=dGZmbKhBG9tJHY4odedsGA=b0qkwdVfwE55Yjm1IETXhBSdMvSgv_mxJnzc0_8jrMc=Fpaw04e1DeSI3ZQmzEGgU2aNQMDPQv-nkm3CrUlR02k=
 



Best regards,

Zoltan





> -Original Message-

> From: Darrell Ball [mailto:db...@vmware.com]

> Sent: Friday, September 08, 2017 6:24 PM

> To: Zoltán Balogh ; 'd...@openvswitch.org' 


> Subject: Re: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for 
reused dp_packets

> 

> Hi Zoltan

> 

> The packet_type initialization that I was referring to as becoming 
redundant now was the one moved from to

> 

> void

> dp_packet_init_dpdk(struct dp_packet *b, size_t allocated)

> {

> dp_packet_set_allocated(b, allocated);

> b->source = DPBUF_DPDK;

> b->packet_type = htonl(PT_ETH);   

> }

> 

> Essentially , it was ‘split out’ from dp_packet_init__ for dpdk.

> 

> Since the new packet_type was not being reset as would be needed if the 
buffer pool only init is done,

> we are now initing all packets for packet_type with this patch, hence the 
above line for dpdk buffer

> pool init becomes redundant.

> 

> Thanks Darrell

> 

> 

> On 9/8/17, 3:01 AM, "Zoltán Balogh"  wrote:

> 

> Hi,

> 

> Please ignore this patch. It's faulty. We cannot remove initialization

> of packet_type from dp_packet_init__(). Beacause of non-dpdk netdevs

> still need it. Here is a link to the corrected patch on patchwork:

> https://urldefense.proofpoint.com/v2/url?u=https-

> 
3A__patchwork.ozlabs.org_patch_811445_=DwIFAw=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-

> 
uEs=dGZmbKhBG9tJHY4odedsGA=8TOx50_GxR8BEr45vTRIK07hVU3zdUM74V8mfYZ71Pc=7M6jjf8mR02x3pm7OqQKPVuq0RoVq7odJtobax0

> JYJw=

> 

> Best regards,

> Zoltan

> 

> 

> > -Original Message-

> > From: ovs-dev-boun...@openvswitch.org 
[mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Zoltán Balogh

> > Sent: Friday, September 08, 2017 10:43 AM

> > To: 'd...@openvswitch.org' 

> > Subject: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for 
reused dp_packets

> >

> > DPDK uses dp-packet pool for storing received packets. The pool is

> > reused by rxq_recv funcions of the DPDK netdevs. The datapath is

> > capable to modify the packet_type property of packets. For instance

> > when encapsulated L3 packets are received on a ptap gre port.

> > In this case the packet_type property of struct dp_packet can be

> > modified and later the same dp_packet with the modified packet_type

> > can be reused in the rxq_rec function, so it can contain corrupted

> > data.

> >

> > The dp_packet_batch_init_cutlen() in the rxq_recv functions iterates

> > over dp_packets and sets their cutlen. So I modified this function

> > to set packet_type to Ethernet for the dp_packets as well. I also

> > renamed this function because of the added functionality.

> >

> > The dp_packet_batch_init_cutlen() iterates over batch->count 
dp_packet.

> > Therefore setting of batch->count = nb_rx needs to be done before 
the

> > former function is invoked. This is an additional fix.

> >

> > This commit also removes initialization of packet_type from

> > dp_packet_init__(), since that was moved to dp_packet_init_dpdk() 
before.

> >

> > Signed-off-by: Zoltan Balogh 

> > Signed-off-by: Laszlo Suru 

> > Co-authored-by: Laszlo Suru 

> > CC: Jan Scheurich 

[ovs-dev] [PATCH] tests: Fix sparse error on test-ovn.c

2017-09-11 Thread Alin Gabriel Serdean
Commit d5c70d4bcc344ae10a644b83f1790a0235871efc fixed the MSVC issue
however, introduced a sparse error:
"tests/test-ovn.c:205:43: warning: Using plain integer as NULL pointer"

Use 'NULL' instead of '0'.

Signed-off-by: Alin Gabriel Serdean 
---
 tests/test-ovn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 345fd47..4beb2b8 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -202,7 +202,7 @@ create_addr_sets(struct shash *addr_sets)
 static const char *const addrs3[] = {
 "00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03",
 };
-static const char *const addrs4[] = { 0 };
+static const char *const addrs4[] = { NULL };
 
 expr_addr_sets_add(addr_sets, "set1", addrs1, 3);
 expr_addr_sets_add(addr_sets, "set2", addrs2, 3);
-- 
2.10.2.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] 答复: Re: 答复: Re: 答复: Re: 答复: Re: 答复: Re: [PATCH v2] ovn: Support for taas(tap-as-a-service) function

2017-09-11 Thread wang . qianyu
Ok,I will do not added new taas port and mark the destination port as 
taas port. And do a limitation that vswitch do not receive any packets 
from taas port.
Thanks.





Russell Bryant 
2017/09/12 05:38
 
收件人:Gao Zhenyu , 
抄送:  "wang.qianyu" , Takashi YAMAMOTO 
, ovs dev , xurong00037997 
, zhou.huij...@zte.com.cn
主题:  Re: 答复: Re: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: 
[PATCH v2] ovn: Support for taas(tap-as-a-service) function


I was wondering the same ... it seems odd to make it both the mirror 
destination and a regular port at the same time.

On Sun, Sep 10, 2017 at 9:13 PM, Gao Zhenyu  
wrote:
A application may link to this destination port for collecting/analysing 
mirror traffic.  How to distinguish a packet whether it's regular traffic 
or mirror traffic if destination port receives both regular traffic and 
mirror traffic?

Thanks
Zhenyu Gao

2017-09-09 11:10 GMT+08:00 :
If destination port only receive mirrored traffic, this function do not 
need add port with new type of taas. In this situation, the mirror flag is 
needed. 

But, I think, destination port receive both mirrored traffic and regular 
traffic may be more flexible. 

Thanks 




Takashi YAMAMOTO  
2017/09/08 20:54 

收件人:Russell Bryant , 
抄送:wang.qia...@zte.com.cn, ovs dev , zhou.huij...@zte.com.cn, xurong00037997  
主题:Re: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH 
v2] ovn: Support for taas(tap-as-a-service) function





On Wed, Sep 6, 2017 at 3:57 AM, Russell Bryant  wrote: 
What if a mirror port *only* receives mirrored packets?  If the only
packets it ever receives are mirrored packets, a new flag would not be
necessary.

Do you intend for the port to operate as both a regular port *and* to
receive a mirror of traffic  for another port? 

in taas, a destination port is supposed to receive both of mirrored 
traffic and regular traffic. 

(i haven't looked at this implementation yet) 
  

On Thu, Aug 24, 2017 at 10:31 PM,  wrote:

> I know your mean.
> The receiver need to distinguish the traffic is regular or mirror. This
> may need some special flow table to deal with it.
>
> Thanks
>
>
>
> *Gao Zhenyu >*
>
> 2017/08/25 10:12
>
> 收件人:wang.qia...@zte.com.cn,
> 抄送:ovs dev , Russell Bryant <
> russ...@ovn.org>, xurong00037997 ,
> zhou.huij...@zte.com.cn
> 主题:Re: 答复: Re: [ovs-dev] 答复: Re: [PATCH v2] ovn: 
Support
> for taas(tap-as-a-service) function
>
>
> I mean for regular packet, ovs should not add the geneve option,  the 
new
> geneve option is only for mirror traffic.
>
> Did you meant some mirror traffic has mirror flag and some would not 
have?
>
> Thanks
> Zhenyu Gao
>
> 2017-08-25 9:44 GMT+08:00 <*wang.qia...@zte.com.cn*
> >:
> Hi zhenyu,
> Thanks for your opinion.
> The mirror flag is not always exist, so I do not think add a new geneve
> option is a good idea.
>
> Thanks.
>
>
> *Gao Zhenyu <**sysugaozhe...@gmail.com* *>*
>
> 2017/08/25 09:34
>
> 收件人:*wang.qia...@zte.com.cn* ,
> 抄送:Russell Bryant <*russ...@ovn.org* >,
> ovs dev <*d...@openvswitch.org* >,
> *zhou.huij...@zte.com.cn* , xurong00037997 <
> *xu.r...@zte.com.cn* >
> 主题:Re: [ovs-dev] 答复: Re: [PATCH v2] ovn: Support 
for
> taas(tap-as-a-service) function
>
>
>
> Although adding a new geneve option is more complicate but I think it
> still worth having that.
> Once the destination chassis found that geneve option, it can tag the
> mirror flag on packet. And it make the whole process looks same no 
matter
> on same chassis or not.
>
> Thanks
> Zhenyu Gao
>
> 2017-08-25 9:15 GMT+08:00 <*wang.qia...@zte.com.cn*
> >:
> Hi Russell,
>
> Thanks for your review.
>
> When the mirror destination is in other chassis, the mirror flag which
> used to mark the packet need be transmitted to the destination chassis.
>
> We could use the inport, geneve option or new type of out port to 
indicate
> the packet as a mirrored packet.
>
> When we use inport to indicate the flag, this may need use inport as the
> match field in the egress pipeline, I think this may conflict with the
> egress pipeline.
>
> If use geneve option to deliver the mirror flag, this may be more
> complicated. So, I add a new type of port as the destination of mirror
> flow. The port types of mirror and taas corresponding to configurations 
of
> tap-flow and tap-service.
>
> 

Re: [ovs-dev] [PATCH] test: Avoid using 'truncate' command

2017-09-11 Thread aserdean
Thanks Mark! I applied this on master and branch-2.8.

Ben: Thanks for the encouragement .

Alin.
> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Tuesday, September 12, 2017 2:03 AM
> To: aserd...@ovn.org
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] test: Avoid using 'truncate' command
> 
> Alin, since Mark isn't a committer, and you are, it's probably best to just 
> apply
> this yourself.
> 
> On Tue, Sep 12, 2017 at 12:55:35AM +0300, aserd...@ovn.org wrote:
> > Tested-by: Alin Gabriel Serdean 
> > Acked-by: Alin Gabriel Serdean 
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > > boun...@openvswitch.org] On Behalf Of Mark Michelson
> > > Sent: Monday, September 11, 2017 11:58 PM
> > > To: d...@openvswitch.org
> > > Subject: [ovs-dev] [PATCH] test: Avoid using 'truncate' command
> > >
> > > The 'truncate' command is not available on all platforms. Since the
> > command
> > > was being used to zero out a file, there are other easy options available.
> > In
> > > this case, I've replaced 'truncate' with a redirection.
> > >
> > > Reported-by: Alin Gabriel Serdean 
> > > Signed-off-by: Mark Michelson 

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 0/6] OVS-DPDK rxq to pmd assignment improvements.

2017-09-11 Thread Darrell Ball


On 9/11/17, 12:25 PM, "Federico Iezzi"  wrote:

On Fri, Aug 25, 2017 at 10:36 AM, Darrell Ball  wrote:
> I applied the series to 
https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_darball_ovs_commits_dpdk-5Fmerge=DwIBaQ=uilaK90D4TOVoH58JNXRgQ=BVhFA09CGX7JQ5Ih-uZnsw=C5lG7aon2zE9j1Xvdo9MBQuwqOjQOBRrUcTTFdcxT6s=Pb67xViu5aZuBs9GL7Vp2uG4WXE_TT35X-JlmaDd_yQ=
 

I believe this series is worth a mention in the NEWS, reengineering an
important OVS-DPDK limitation.


Part of the enhancement is not easily user visible, as an underlying algorithm 
was changed, which has a probabilistic benefit.
So, I guess you refer to the rebalance command addition ?
We had identified some enhancements that were to be done to guide the use of 
the rebalance command, which is probably the
main reason why we had not added to NEWS yet.
However, we can still add to NEWS for 2.8.1, if that is the consensus ?


>
>
>
> On 8/24/17, 4:37 PM, "Kevin Traynor"  wrote:
>
> For the DPDK datapath, by default rxqs are assigned to available pmds
> in round robin order with no weight or priority.
>
> It can happen that some very busy queues are handled by one pmd which
> does not have enough cycles to prevent packets being dropped on them.
> While at the same time another pmd which handles queues with no 
traffic
> on them is essentially idle.
>
> Rxq to pmd assignment happens as a result of a number of events and
> when it does, the same unweighted round robin approach is applied
> each time.
>
> This patchset proposes to improve the round robin nature of rxq to pmd
> assignment by counting the processing cycles used by the rxqs during
> their operation and incorporating that data into assignment.
>
> Before assigning in a round robin manner, the rxqs will be sorted in
> order of the processing cycles they have been consuming. Assuming
> multiple pmds, this ensures that the rxqs measured to be using the
> most processing cycles will be assigned to different cores.
>
> In some cases the measured cycles for an rxq may be not available as
> the rxq is new or may not be useful for assignment as traffic patterns
> may change.  In those cases the code will essentially fallback to 
being
> round round similar to what currently exists. However, in the case
> where data is available and a reliable indication of future rxq cycles
> consumption, rxq to pmd distribution will be much improved.
>
> V5 -> V6
> 
> Minor changes to 2/6, 3/6, 4/6, 5/6 from Darrell's review comments.
>
> V4 -> V5
> 
> Changed history of rxq considered during assignment to 1 min. In order
> to have data available quicker than 1 min and not to be using up to
> 1 min old data, introduced storing of data in multiple intervals
> similar to suggestion by Darrell. Some minor name changes to reflect
> this.
>
> 2/6 Added storage for multiple intervals
> 3/6 Store cycles into interval
> 4/6 Sum cycles from intervals and use total for assignment
>
> V3 -> V4
> 
> 4/6
> Rebased to accomodate new cross numa assigment.
>
> V2 -> V3
> 
> Dropped v2 1/7 as not reusing dpcls optimisation interval anymore
>
> 2/6
> Moved unused functions to 3/6 to avoid compiler warning
>
> 3/6
> Made pmd rxq interval independent from dpcls opt interval
>
> 4/6
> Moved docs about rebalance command to when it is available in 6/6
> Added logging info for pmd to rxq assignment
>
> 5/6
> Added an example to docs
>
> 6/6
> Noted in commit msg that Jan requested this for testing purposes
>
> V1 -> V2
> 
> Dropped Ciara's patch to change how pmd cycles are counted as it 
merged.
>
> 6/7: Rebased unit tests.
>
>
> Kevin Traynor (6):
>   dpif-netdev: Change polled_queue to use dp_netdev_rxq.
>   dpif-netdev: Add rxq processing cycle counters.
>   dpif-netdev: Count the rxq processing cycles for an rxq.
>   dpif-netdev: Change rxq_scheduling to use rxq processing cycles.
>   dpif-netdev: Change pmd selection order.
>   dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance.
>
>  Documentation/howto/dpdk.rst |  26 
>  lib/dpif-netdev.c| 298 
---
>  tests/pmd.at |   2 +-
>  vswitchd/ovs-vswitchd.8.in   |   2 +
>  4 files changed, 284 insertions(+), 44 deletions(-)
>
> 

Re: [ovs-dev] [PATCH] test: Avoid using 'truncate' command

2017-09-11 Thread Ben Pfaff
Alin, since Mark isn't a committer, and you are, it's probably best to
just apply this yourself.

On Tue, Sep 12, 2017 at 12:55:35AM +0300, aserd...@ovn.org wrote:
> Tested-by: Alin Gabriel Serdean 
> Acked-by: Alin Gabriel Serdean 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Mark Michelson
> > Sent: Monday, September 11, 2017 11:58 PM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH] test: Avoid using 'truncate' command
> > 
> > The 'truncate' command is not available on all platforms. Since the
> command
> > was being used to zero out a file, there are other easy options available.
> In
> > this case, I've replaced 'truncate' with a redirection.
> > 
> > Reported-by: Alin Gabriel Serdean 
> > Signed-off-by: Mark Michelson 
> > ---
> >  tests/ovn.at | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/tests/ovn.at b/tests/ovn.at index 2a00232ed..6c38b973f 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -8501,7 +8501,7 @@ ovn-nbctl --wait=hv --timeout=3 sync  # doesn't
> > have the same effect because "name" is conserved, and the  # Chassis entry
> > is not replaced.
> > 
> > -truncate -s 0 gw1/ovn-controller.log
> > +> gw1/ovn-controller.log
> > 
> >  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> > ovn-sbctl destroy Chassis $gw2_chassis
> > --
> > 2.13.5
> > 
> > ___
> > dev mailing list
> > d...@openvswitch.org
> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Skip OVN test if truncate command not found

2017-09-11 Thread aserdean
Do you mind sending out a patch for it?

Alin.

 

Patch is here: 
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338627.html

 

 

[Alin Serdean] Thanks! I tested and acked it.

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] test: Avoid using 'truncate' command

2017-09-11 Thread aserdean
Tested-by: Alin Gabriel Serdean 
Acked-by: Alin Gabriel Serdean 

> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Mark Michelson
> Sent: Monday, September 11, 2017 11:58 PM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH] test: Avoid using 'truncate' command
> 
> The 'truncate' command is not available on all platforms. Since the
command
> was being used to zero out a file, there are other easy options available.
In
> this case, I've replaced 'truncate' with a redirection.
> 
> Reported-by: Alin Gabriel Serdean 
> Signed-off-by: Mark Michelson 
> ---
>  tests/ovn.at | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/ovn.at b/tests/ovn.at index 2a00232ed..6c38b973f 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8501,7 +8501,7 @@ ovn-nbctl --wait=hv --timeout=3 sync  # doesn't
> have the same effect because "name" is conserved, and the  # Chassis entry
> is not replaced.
> 
> -truncate -s 0 gw1/ovn-controller.log
> +> gw1/ovn-controller.log
> 
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
> ovn-sbctl destroy Chassis $gw2_chassis
> --
> 2.13.5
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: 答复: Re: [PATCH v2] ovn: Support for taas(tap-as-a-service) function

2017-09-11 Thread Russell Bryant
I was wondering the same ... it seems odd to make it both the mirror
destination and a regular port at the same time.

On Sun, Sep 10, 2017 at 9:13 PM, Gao Zhenyu  wrote:

> A application may link to this destination port for collecting/analysing 
> mirror
> traffic.  How to distinguish a packet whether it's regular traffic or
> mirror traffic if destination port receives both regular traffic and mirror
> traffic?
>
> Thanks
> Zhenyu Gao
>
> 2017-09-09 11:10 GMT+08:00 :
>
>> If destination port only receive mirrored traffic, this function do not
>> need add port with new type of taas. In this situation, the mirror flag is
>> needed.
>>
>> But, I think, destination port receive both mirrored traffic and regular
>> traffic may be more flexible.
>>
>> Thanks
>>
>>
>> *Takashi YAMAMOTO >*
>>
>> 2017/09/08 20:54
>>
>> 收件人:Russell Bryant ,
>> 抄送:wang.qia...@zte.com.cn, ovs dev ,
>> zhou.huij...@zte.com.cn, xurong00037997 
>> 主题:Re: [ovs-dev] 答复: Re: 答复: Re: 答复: Re: [PATCH v2] ovn:
>> Support for taas(tap-as-a-service) function
>>
>>
>>
>>
>>
>> On Wed, Sep 6, 2017 at 3:57 AM, Russell Bryant <*russ...@ovn.org*
>> > wrote:
>> What if a mirror port *only* receives mirrored packets?  If the only
>> packets it ever receives are mirrored packets, a new flag would not be
>> necessary.
>>
>> Do you intend for the port to operate as both a regular port *and* to
>> receive a mirror of traffic  for another port?
>>
>> in taas, a destination port is supposed to receive both of mirrored
>> traffic and regular traffic.
>>
>> (i haven't looked at this implementation yet)
>>
>>
>> On Thu, Aug 24, 2017 at 10:31 PM, <*wang.qia...@zte.com.cn*
>> > wrote:
>>
>> > I know your mean.
>> > The receiver need to distinguish the traffic is regular or mirror. This
>> > may need some special flow table to deal with it.
>> >
>> > Thanks
>> >
>> >
>> >
>> > *Gao Zhenyu <*sysugaozhe...@gmail.com*  <
>> *sysugaozhe...@gmail.com* >>*
>> >
>> > 2017/08/25 10:12
>> >
>> > 收件人:*wang.qia...@zte.com.cn* ,
>> > 抄送:ovs dev <*d...@openvswitch.org* >,
>> Russell Bryant <
>> > *russ...@ovn.org* >, xurong00037997 <
>> *xu.r...@zte.com.cn* >,
>> > *zhou.huij...@zte.com.cn* 
>> > 主题:Re: 答复: Re: [ovs-dev] 答复: Re: [PATCH v2] ovn: Support
>> > for taas(tap-as-a-service) function
>> >
>> >
>> > I mean for regular packet, ovs should not add the geneve option,  the
>> new
>> > geneve option is only for mirror traffic.
>> >
>> > Did you meant some mirror traffic has mirror flag and some would not
>> have?
>> >
>> > Thanks
>> > Zhenyu Gao
>> >
>> > 2017-08-25 9:44 GMT+08:00 <**wang.qia...@zte.com.cn*
>> *
>> > <*wang.qia...@zte.com.cn* >>:
>> > Hi zhenyu,
>> > Thanks for your opinion.
>> > The mirror flag is not always exist, so I do not think add a new geneve
>> > option is a good idea.
>> >
>> > Thanks.
>> >
>> >
>> > *Gao Zhenyu <***sysugaozhe...@gmail.com* * <
>> *sysugaozhe...@gmail.com* >*>*
>> >
>> > 2017/08/25 09:34
>> >
>> > 收件人:**wang.qia...@zte.com.cn* *
>> <*wang.qia...@zte.com.cn* >,
>> > 抄送:Russell Bryant <**russ...@ovn.org* *
>> <*russ...@ovn.org* >>,
>> > ovs dev <**d...@openvswitch.org* * <
>> *d...@openvswitch.org* >>,
>> > **zhou.huij...@zte.com.cn* * <
>> *zhou.huij...@zte.com.cn* >, xurong00037997 <
>> > **xu.r...@zte.com.cn* * <*xu.r...@zte.com.cn*
>> >>
>> > 主题:Re: [ovs-dev] 答复: Re: [PATCH v2] ovn: Support for
>> > taas(tap-as-a-service) function
>> >
>> >
>> >
>> > Although adding a new geneve option is more complicate but I think it
>> > still worth having that.
>> > Once the destination chassis found that geneve option, it can tag the
>> > mirror flag on packet. And it make the whole process looks same no
>> matter
>> > on same chassis or not.
>> >
>> > Thanks
>> > Zhenyu Gao
>> >
>> > 2017-08-25 9:15 GMT+08:00 <**wang.qia...@zte.com.cn*
>> *
>> > <*wang.qia...@zte.com.cn* >>:
>> > Hi Russell,
>> >
>> > Thanks for your review.
>> >
>> > When the mirror destination is in other chassis, the mirror flag which
>> > used to mark the packet need be transmitted to the destination chassis.
>> >
>> > We could use the inport, geneve option or new type of out port to
>> indicate
>> > the packet as a mirrored packet.

[ovs-dev] MICROSOFT VERIFICATION UPDATE

2017-09-11 Thread Razim Bin Mohd Noor
MICROSOFT VERIFICATION UPDATE

Kjære kunde,
Vennligst les og følg instruksjonene for å beskytte Microsofts personvern. Som 
en del av vår innsats for å forbedre din erfaring på tvers av våre 
forbrukertjenester oppdaterer vi Microsoft Services-avtalen og Microsofts 
personvernerklæring. Vi vil benytte anledningen til å varsle deg om disse 
oppdateringene. KLIKK HER for 
oppdateringer og bekreftelse, hvis du ikke gjør dette, vil din e-postkonto bli 
suspendert.

Vi beklager eventuelle ulemper dette kan føre til.

Microsoft respekterer personvernet ditt.
Administrasjonsavdeling
© 3M 1995-2017. Alle rettigheter reservert.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix MSVC compiler error on test-ovn.c

2017-09-11 Thread aserdean
Thanks for the quick review. I applied it on master, branch-2.8, branch-2.7
and branch-2.6.

Alin.
> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Ben Pfaff
> Sent: Monday, September 11, 2017 11:08 PM
> To: Alin Gabriel Serdean 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH] tests: Fix MSVC compiler error on
test-ovn.c
> 
> On Mon, Sep 11, 2017 at 10:36:53PM +0300, Alin Gabriel Serdean wrote:
> > MSVC doesn't like an empty initializer for a flexible array.
> >
> > Add a zero initializer to keep the compiler happy.
> >
> > Fixes current build on Windows.
> >
> > Signed-off-by: Alin Gabriel Serdean 
> 
> Acked-by: Ben Pfaff 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 16/16] travis: Update build kernel list

2017-09-11 Thread Greg Rose
Linux kernel 4.13 has been released. Update the kernel build list to
the current list of kernels from kernel.org.

Signed-off-by: Greg Rose 
---
 .travis.yml | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 9d0fd44..f217840 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,16 +25,16 @@ sudo: false
 
 env:
   - OPTS="--disable-ssl"
-  - TESTSUITE=1 KERNEL=3.16.46
+  - TESTSUITE=1 KERNEL=3.16.47
   - TESTSUITE=1 OPTS="--enable-shared"
   - BUILD_ENV="-m32" OPTS="--disable-ssl"
-  - KERNEL=3.16.46 DPDK=1
-  - KERNEL=3.16.46 DPDK=1 OPTS="--enable-shared"
-  - KERNEL=4.12.2
-  - KERNEL=4.11.11
-  - KERNEL=4.9.38
-  - KERNEL=4.4.77
-  - KERNEL=4.1.42
+  - KERNEL=3.16.47 DPDK=1
+  - KERNEL=3.16.47 DPDK=1 OPTS="--enable-shared"
+  - KERNEL=4.13
+  - KERNEL=4.12.11
+  - KERNEL=4.9.48
+  - KERNEL=4.4.87
+  - KERNEL=4.1.43
   - KERNEL=3.10.107
   - TESTSUITE=1 LIBS=-ljemalloc
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 15/16] Documentation: Update NEWS and FAQ.

2017-09-11 Thread Greg Rose
Document Open vSwitch Linux kernel support for the 4.13 kernel
release.

Signed-off-by: Greg Rose 
---
 Documentation/faq/releases.rst | 2 +-
 NEWS   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index c785529..064a496 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -65,7 +65,7 @@ Q: What Linux kernel versions does each Open vSwitch release 
work with?
 2.5.x2.6.32 to 4.3
 2.6.x3.10 to 4.7
 2.7.x3.10 to 4.9
-2.8.x3.10 to 4.12
+2.8.x3.10 to 4.13
  ==
 
 Open vSwitch userspace should also work with the Linux kernel module built
diff --git a/NEWS b/NEWS
index 6a5d2bf..a3c1a02 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@ Post-v2.8.0
- OVN:
  * The "requested-chassis" option for a logical switch port now accepts a
chassis "hostname" in addition to a chassis "name".
+   - Linux kernel 4.13
+ * Add support for compiling OVS with the latest Linux 4.13 kernel
 
 v2.8.0 - xx xxx 
 -
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 14/16] acinclude: Add support for Linux 4.13

2017-09-11 Thread Greg Rose
Add configuration support for the just released 4.13 Linux kernel.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index 46efa69..cb5f3ae 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -143,10 +143,10 @@ AC_DEFUN([OVS_CHECK_LINUX], [
 AC_MSG_RESULT([$kversion])
 
 if test "$version" -ge 4; then
-   if test "$version" = 4 && test "$patchlevel" -le 12; then
+   if test "$version" = 4 && test "$patchlevel" -le 13; then
   : # Linux 4.x
else
-  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.11.x is not supported (please refer to the FAQ for advice)])
+  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.13.x is not supported (please refer to the FAQ for advice)])
fi
 elif test "$version" = 3 && test "$patchlevel" -ge 10; then
: # Linux 3.x
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 13/16] datapath: Fix up vxlan device flags

2017-09-11 Thread Greg Rose
I missed a couple of usages of the flags parameter from vxlan_dev
while adding compatibility code to handle the removal of the flags.
Add the checks so that the module can compile for Linux kernel
release 4.13

Fixes: 143656435c ("datapath: get rid of redundant vxlan_dev.flags")
Signed-off-by: Greg Rose 
---
 datapath/vport-vxlan.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
index 2910694..c7139ab 100644
--- a/datapath/vport-vxlan.c
+++ b/datapath/vport-vxlan.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015 Nicira, Inc.
+ * Copyright (c) 2015,2017 Nicira, Inc.
  * Copyright (c) 2013 Cisco Systems, Inc.
  *
  * This program is free software; you can redistribute it and/or
@@ -60,14 +60,22 @@ static int vxlan_get_options(const struct vport *vport, 
struct sk_buff *skb)
return -EMSGSIZE;
 
nla_nest_end(skb, exts);
+#ifdef HAVE_VXLAN_DEV_CFG
+   } else if (vxlan->cfg.flags & VXLAN_F_GPE) {
+#else
} else if (vxlan->flags & VXLAN_F_GPE) {
+#endif
struct nlattr *exts;
 
exts = nla_nest_start(skb, OVS_TUNNEL_ATTR_EXTENSION);
if (!exts)
return -EMSGSIZE;
 
+#ifdef HAVE_VXLAN_DEV_CFG
+   if (vxlan->cfg.flags & VXLAN_F_GPE &&
+#else
if (vxlan->flags & VXLAN_F_GPE &&
+#endif
nla_put_flag(skb, OVS_VXLAN_EXT_GPE))
return -EMSGSIZE;
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 12/16] datapath: Check for existence of nf_hook_ops member list

2017-09-11 Thread Greg Rose
The list member of nf_hook_ops has been removed in Linux kernel
release 4.13. Check for the definition of it in pre-4.13 kernels.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/stt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
index 07f5892..37d5f4b 100644
--- a/datapath/linux/compat/stt.c
+++ b/datapath/linux/compat/stt.c
@@ -2110,7 +2110,9 @@ int stt_init_module(void)
if (rc)
goto out2;
 
+#ifdef HAVE_LIST_IN_NF_HOOK_OPS
INIT_LIST_HEAD(_hook_ops.list);
+#endif
pr_info("STT tunneling driver\n");
return 0;
 out2:
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 11/16] acinclude: Check for existence of nf_hook_ops member "list".

2017-09-11 Thread Greg Rose
The "list" member of the nf_hook_ops structure is removed in Linux
kernel release 4.13.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 577437f..46efa69 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -774,6 +774,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], [rtnl_link_ops],
 [extack],
   [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netfilter.h], [nf_hook_ops],
+[list],
+[OVS_DEFINE([HAVE_LIST_IN_NF_HOOK_OPS])])
 
 
   if cmp -s datapath/linux/kcompat.h.new \
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 10/16] datapath: Fixup RTNL ops for kernel 4.13

2017-09-11 Thread Greg Rose
The RTNL ops validate and newlink functions now take the extended
netlink ack parameter.  Use the new HAVE_EXT_ACK_IN_RTNL_LINKOPS
define to check if the additional parameter is present and add the
parameter if so.

While in the modules remove the checks for Linux kernels < 2.3.39
since they are no longer supported since 2.5.x.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/lisp.c | 21 ++---
 datapath/linux/compat/stt.c  | 11 +++
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/datapath/linux/compat/lisp.c b/datapath/linux/compat/lisp.c
index c505fad..34f8232 100644
--- a/datapath/linux/compat/lisp.c
+++ b/datapath/linux/compat/lisp.c
@@ -608,7 +608,12 @@ static const struct nla_policy lisp_policy[IFLA_LISP_MAX + 
1] = {
[IFLA_LISP_PORT]  = { .type = NLA_U16 },
 };
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int lisp_validate(struct nlattr *tb[], struct nlattr *data[],
+struct netlink_ext_ack __always_unused *extack)
+#else
 static int lisp_validate(struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -660,17 +665,15 @@ static int lisp_configure(struct net *net, struct 
net_device *dev,
return 0;
 }
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39)
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
 static int lisp_newlink(struct net *net, struct net_device *dev,
-   struct nlattr *tb[], struct nlattr *data[])
-{
+   struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
 #else
-static int lisp_newlink(struct net_device *dev,
+static int lisp_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
-
-{
-   struct net *net = _net;
 #endif
+{
__be16 dst_port = htons(LISP_UDP_PORT);
 
if (data[IFLA_LISP_PORT])
@@ -679,11 +682,7 @@ static int lisp_newlink(struct net_device *dev,
return lisp_configure(net, dev, dst_port);
 }
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39)
 static void lisp_dellink(struct net_device *dev, struct list_head *head)
-#else
-static void lisp_dellink(struct net_device *dev)
-#endif
 {
struct lisp_dev *lisp = netdev_priv(dev);
 
diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
index 964d993..07f5892 100644
--- a/datapath/linux/compat/stt.c
+++ b/datapath/linux/compat/stt.c
@@ -1909,7 +1909,12 @@ static const struct nla_policy stt_policy[IFLA_STT_MAX + 
1] = {
[IFLA_STT_PORT]  = { .type = NLA_U16 },
 };
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int stt_validate(struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
+#else
 static int stt_validate(struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -1961,8 +1966,14 @@ static int stt_configure(struct net *net, struct 
net_device *dev,
return 0;
 }
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int stt_newlink(struct net *net, struct net_device *dev,
+   struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
+#else
 static int stt_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
__be16 dst_port = htons(STT_DST_PORT);
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 09/16] acinclude: Check for extended netlink ack presence

2017-09-11 Thread Greg Rose
RTNL ops validate and newlink now include the extended netlink
ack feature.  Check for it and set HAVE_EXT_ACK_IN_RTNL_LINKOPS
if found.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 772ff03..577437f 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -771,6 +771,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
   OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
   [OVS_DEFINE([HAVE_DST_NOCACHE])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], [rtnl_link_ops],
+[extack],
+  [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])])
 
 
   if cmp -s datapath/linux/kcompat.h.new \
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 08/16] compat: Add check for DST_NOCACHE

2017-09-11 Thread Greg Rose
DST_NOCACHE was removed from the Linux 4.13 kernel.  Check if
HAVE_DST_NOCACHE is defined for older kernels.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/include/net/ip6_fib.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/datapath/linux/compat/include/net/ip6_fib.h 
b/datapath/linux/compat/include/net/ip6_fib.h
index 2c8a513..4d58375 100644
--- a/datapath/linux/compat/include/net/ip6_fib.h
+++ b/datapath/linux/compat/include/net/ip6_fib.h
@@ -27,7 +27,11 @@
 static inline u32 rt6_get_cookie(const struct rt6_info *rt)
 {
if (rt->rt6i_flags & RTF_PCPU ||
-   (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from))
+#ifdef HAVE_DST_NOCACHE
+   (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from))
+#else
+   (unlikely(!list_empty(>rt6i_uncached)) && rt->dst.from))
+#endif
rt = (struct rt6_info *)(rt->dst.from);
 
return rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 07/16] acinclude: Add compat define for DST_NOCACHE

2017-09-11 Thread Greg Rose
DST_NOCACHE is removed in the 4.13 Linux kernel - add check for it
and if found set HAVE_DST_NOCACHE.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 0e98ade..772ff03 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -769,6 +769,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [nf_conntrack_helper_put])
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],[SKB_GSO_UDP],
   [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
+  OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
+  [OVS_DEFINE([HAVE_DST_NOCACHE])])
 
 
   if cmp -s datapath/linux/kcompat.h.new \
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 06/16] datapath: fix skb_panic due to the incorrect actions attrlen

2017-09-11 Thread Greg Rose
Upstream commit:
commit 494bea39f3201776cdfddc232705f54a0bd210c4
Author: Liping Zhang 
Date:   Wed Aug 16 13:30:07 2017 +0800

openvswitch: fix skb_panic due to the incorrect actions attrlen

For sw_flow_actions, the actions_len only represents the kernel part's
size, and when we dump the actions to the userspace, we will do the
convertions, so it's true size may become bigger than the actions_len.

But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len
to alloc the skbuff, so the user_skb's size may become insufficient and
oops will happen like this:
  skbuff: skb_over_panic: text:8148fabf len:1749 put:157 head:
  881300f39000 data:881300f39000 tail:0x6d5 end:0x6c0 dev:
  [ cut here ]
  kernel BUG at net/core/skbuff.c:129!
  [...]
  Call Trace:
   
   [] skb_put+0x43/0x44
   [] skb_zerocopy+0x6c/0x1f4
   [] queue_userspace_packet+0x3a3/0x448 [openvswitch]
   [] ovs_dp_upcall+0x30/0x5c [openvswitch]
   [] output_userspace+0x132/0x158 [openvswitch]
   [] ? ip6_rcv_finish+0x74/0x77 [ipv6]
   [] do_execute_actions+0xcc1/0xdc8 [openvswitch]
   [] ovs_execute_actions+0x74/0x106 [openvswitch]
   [] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
   [] ? key_extract+0x63c/0x8d5 [openvswitch]
   [] ovs_vport_receive+0xa1/0xc3 [openvswitch]
  [...]

Also we can find that the actions_len is much little than the orig_len:
  crash> struct sw_flow_actions 0x8812f539d000
  struct sw_flow_actions {
rcu = {
  next = 0x8812f5398800,
  func = 0xe3b00035db32
},
orig_len = 1384,
actions_len = 592,
actions = 0x8812f539d01c
  }

So as a quick fix, use the orig_len instead of the actions_len to alloc
the user_skb.

Last, this oops happened on our system running a relative old kernel, but
the same risk still exists on the mainline, since we use the wrong
actions_len from the beginning.

Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-pac
Cc: Neil McKee 
Signed-off-by: Liping Zhang 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Fixes: 0e469d3b380c ("datapath: Include datapath actions with sampled-packet 
upcall to userspace.")
Signed-off-by: Greg Rose 
---
 datapath/actions.c  | 1 +
 datapath/datapath.c | 7 ---
 datapath/datapath.h | 2 ++
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index 59d91b2..ad18c2c 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -1348,6 +1348,7 @@ int ovs_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
goto out;
}
 
+   OVS_CB(skb)->acts_origlen = acts->orig_len;
err = do_execute_actions(dp, skb, key,
 acts->actions, acts->actions_len);
 
diff --git a/datapath/datapath.c b/datapath/datapath.c
index b565fc5..1780819 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -388,7 +388,7 @@ static int queue_gso_packets(struct datapath *dp, struct 
sk_buff *skb,
 }
 
 static size_t upcall_msg_size(const struct dp_upcall_info *upcall_info,
- unsigned int hdrlen)
+ unsigned int hdrlen, int actions_attrlen)
 {
size_t size = NLMSG_ALIGN(sizeof(struct ovs_header))
+ nla_total_size(hdrlen) /* OVS_PACKET_ATTR_PACKET */
@@ -405,7 +405,7 @@ static size_t upcall_msg_size(const struct dp_upcall_info 
*upcall_info,
 
/* OVS_PACKET_ATTR_ACTIONS */
if (upcall_info->actions_len)
-   size += nla_total_size(upcall_info->actions_len);
+   size += nla_total_size(actions_attrlen);
 
/* OVS_PACKET_ATTR_MRU */
if (upcall_info->mru)
@@ -472,7 +472,8 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
else
hlen = skb->len;
 
-   len = upcall_msg_size(upcall_info, hlen - cutlen);
+   len = upcall_msg_size(upcall_info, hlen - cutlen,
+ OVS_CB(skb)->acts_origlen);
user_skb = genlmsg_new(len, GFP_ATOMIC);
if (!user_skb) {
err = -ENOMEM;
diff --git a/datapath/datapath.h b/datapath/datapath.h
index f20deed..70ad0ac 100644
--- a/datapath/datapath.h
+++ b/datapath/datapath.h
@@ -100,11 +100,13 @@ struct datapath {
  * when a packet is received by OVS.
  * @mru: The maximum received fragement size; 0 if the packet is not
  * fragmented.
+ * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
  */
 struct ovs_skb_cb {
struct vport*input_vport;
u16 

[ovs-dev] [PATCH V2 05/16] datapath: Remove unnecessary newlines from OVS_NLERR uses

2017-09-11 Thread Greg Rose
Upstream commit:
commit 0ed80da518a1f27562a013f106505e495e891fe4
Author: Joe Perches 
Date:   Fri Aug 11 04:26:26 2017 -0700

openvswitch: Remove unnecessary newlines from OVS_NLERR uses

OVS_NLERR already adds a newline so these just add blank
lines to the logging.

Signed-off-by: Joe Perches 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/conntrack.c| 14 +-
 datapath/flow_netlink.c |  2 +-
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index b645ab1..d517a87 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -1221,15 +1221,13 @@ static int parse_nat(const struct nlattr *attr,
int type = nla_type(a);
 
if (type > OVS_NAT_ATTR_MAX) {
-   OVS_NLERR(log,
- "Unknown NAT attribute (type=%d, max=%d).\n",
+   OVS_NLERR(log, "Unknown NAT attribute (type=%d, 
max=%d)",
  type, OVS_NAT_ATTR_MAX);
return -EINVAL;
}
 
if (nla_len(a) != ovs_nat_attr_lens[type][ip_vers]) {
-   OVS_NLERR(log,
- "NAT attribute type %d has unexpected length 
(%d != %d).\n",
+   OVS_NLERR(log, "NAT attribute type %d has unexpected 
length (%d != %d)",
  type, nla_len(a),
  ovs_nat_attr_lens[type][ip_vers]);
return -EINVAL;
@@ -1239,9 +1237,7 @@ static int parse_nat(const struct nlattr *attr,
case OVS_NAT_ATTR_SRC:
case OVS_NAT_ATTR_DST:
if (info->nat) {
-   OVS_NLERR(log,
- "Only one type of NAT may be 
specified.\n"
- );
+   OVS_NLERR(log, "Only one type of NAT may be 
specified");
return -ERANGE;
}
info->nat |= OVS_CT_NAT;
@@ -1291,13 +1287,13 @@ static int parse_nat(const struct nlattr *attr,
break;
 
default:
-   OVS_NLERR(log, "Unknown nat attribute (%d).\n", type);
+   OVS_NLERR(log, "Unknown nat attribute (%d)", type);
return -EINVAL;
}
}
 
if (rem > 0) {
-   OVS_NLERR(log, "NAT attribute has %d unknown bytes.\n", rem);
+   OVS_NLERR(log, "NAT attribute has %d unknown bytes", rem);
return -EINVAL;
}
if (!info->nat) {
diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
index 9b48612..df9d88e 100644
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1257,7 +1257,7 @@ static int ovs_key_from_nlattrs(struct net *net, struct 
sw_flow_match *match,
}
 
if (!is_mask && ipv6_key->ipv6_label & htonl(0xFFF0)) {
-   OVS_NLERR(log, "IPv6 flow label %x is out of range 
(max=%x).\n",
+   OVS_NLERR(log, "IPv6 flow label %x is out of range 
(max=%x)",
  ntohl(ipv6_key->ipv6_label), (1 << 20) - 1);
return -EINVAL;
}
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 04/16] datapath: Optimize operations for OvS flow_stats.

2017-09-11 Thread Greg Rose
Upstream commit:
commit c4b2bf6b4a35348fe6d1eb06928eb68d7b9d99a9
Author: Tonghao Zhang 
Date:   Mon Jul 17 23:28:06 2017 -0700

openvswitch: Optimize operations for OvS flow_stats.

When calling the flow_free() to free the flow, we call many times
(cpu_possible_mask, eg. 128 as default) cpumask_next(). That will
take up our CPU usage if we call the flow_free() frequently.
When we put all packets to userspace via upcall, and OvS will send
them back via netlink to ovs_packet_cmd_execute(will call flow_free).

The test topo is shown as below. VM01 sends TCP packets to VM02,
and OvS forward packtets. When testing, we use perf to report the
system performance.

VM01 --- OvS-VM --- VM02

Without this patch, perf-top show as below: The flow_free() is
3.02% CPU usage.

4.23%  [kernel][k] _raw_spin_unlock_irqrestore
3.62%  [kernel][k] __do_softirq
3.16%  [kernel][k] __memcpy
3.02%  [kernel][k] flow_free
2.42%  libc-2.17.so[.] __memcpy_ssse3_back
2.18%  [kernel][k] copy_user_generic_unrolled
2.17%  [kernel][k] find_next_bit

When applied this patch, perf-top show as below: Not shown on
the list anymore.

4.11%  [kernel][k] _raw_spin_unlock_irqrestore
3.79%  [kernel][k] __do_softirq
3.46%  [kernel][k] __memcpy
2.73%  libc-2.17.so[.] __memcpy_ssse3_back
2.25%  [kernel][k] copy_user_generic_unrolled
1.89%  libc-2.17.so[.] _int_malloc
1.53%  ovs-vswitchd[.] xlate_actions

With this patch, the TCP throughput(we dont use Megaflow Cache
+ Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec
(maybe ~10% performance imporve).

This patch adds cpumask struct, the cpu_used_mask stores the cpu_id
that the flow used. And we only check the flow_stats on the cpu we
used, and it is unncessary to check all possible cpu when getting,
cleaning, and updating the flow_stats. Adding the cpu_used_mask to
sw_flow struct does’t increase the cacheline number.

Signed-off-by: Tonghao Zhang 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/flow.c   | 7 ---
 datapath/flow.h   | 2 ++
 datapath/flow_table.c | 4 +++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 30e4d21..5da7e3e 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -71,7 +71,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int cpu = smp_processor_id();
+   unsigned int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
stats = rcu_dereference(flow->stats[cpu]);
@@ -116,6 +116,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
 
rcu_assign_pointer(flow->stats[cpu],
   new_stats);
+   cpumask_set_cpu(cpu, 
>cpu_used_mask);
goto unlock;
}
}
@@ -143,7 +144,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
memset(ovs_stats, 0, sizeof(*ovs_stats));
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
>cpu_used_mask)) {
struct flow_stats *stats = 
rcu_dereference_ovsl(flow->stats[cpu]);
 
if (stats) {
@@ -167,7 +168,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
int cpu;
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
>cpu_used_mask)) {
struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
 
if (stats) {
diff --git a/datapath/flow.h b/datapath/flow.h
index 07af912..0796b09 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -218,6 +219,7 @@ struct sw_flow {
 */
struct sw_flow_key key;
struct sw_flow_id id;
+   struct cpumask cpu_used_mask;
struct sw_flow_mask *mask;
struct sw_flow_actions __rcu *sf_acts;
struct flow_stats __rcu 

[ovs-dev] [PATCH V2 03/16] datapath: Optimize updating for OvS flow_stats.

2017-09-11 Thread Greg Rose
Upstream commit:
commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952
Author: Tonghao Zhang 
Date:   Mon Jul 17 23:28:05 2017 -0700

openvswitch: Optimize updating for OvS flow_stats.

In the ovs_flow_stats_update(), we only use the node
var to alloc flow_stats struct. But this is not a
common case, it is unnecessary to call the numa_node_id()
everytime. This patch is not a bugfix, but there maybe
a small increase.

Signed-off-by: Tonghao Zhang 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/flow.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 599b4e5..30e4d21 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -71,7 +71,6 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int node = numa_node_id();
int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
@@ -107,7 +106,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   __GFP_THISNODE |
   __GFP_NOWARN |
  __GFP_NOMEMALLOC,
- node);
+ numa_node_id());
if (likely(new_stats)) {
new_stats->used = jiffies;
new_stats->packet_count = 1;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 02/16] datapath: Remove all references to SKB_GSO_UDP.

2017-09-11 Thread Greg Rose
Upstream commit:
commit 880388aa3c07fdea4f9b85e35641753017b1852f
Author: David S. Miller 
Date:   Mon Jul 3 07:29:12 2017 -0700

net: Remove all references to SKB_GSO_UDP.

Such packets are no longer possible.

Signed-off-by: David S. Miller 

SKB_GSO_UDP is removed in the upstream kernel.  Use HAVE_SKB_GSO_UDP
define from acinclude to detect if SKB_GSO_UDP exists and if so apply
openvswitch section of this upstream patch.

Signed-off-by: Greg Rose 
---
 datapath/flow.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/datapath/flow.c b/datapath/flow.c
index c4f63b0..599b4e5 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -589,8 +589,12 @@ static int key_extract(struct sk_buff *skb, struct 
sw_flow_key *key)
key->ip.frag = OVS_FRAG_TYPE_LATER;
return 0;
}
+#ifdef HAVE_SKB_GSO_UDP
if (nh->frag_off & htons(IP_MF) ||
skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+#else
+   if (nh->frag_off & htons(IP_MF))
+#endif
key->ip.frag = OVS_FRAG_TYPE_FIRST;
else
key->ip.frag = OVS_FRAG_TYPE_NONE;
@@ -707,9 +711,11 @@ static int key_extract(struct sk_buff *skb, struct 
sw_flow_key *key)
 
if (key->ip.frag == OVS_FRAG_TYPE_LATER)
return 0;
+#ifdef HAVE_SKB_GSO_UDP
if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
key->ip.frag = OVS_FRAG_TYPE_FIRST;
 
+#endif
/* Transport layer. */
if (key->ip.proto == NEXTHDR_TCP) {
if (tcphdr_ok(skb)) {
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH V2 01/16] acinclude: Check for SKB_GSO_UDP

2017-09-11 Thread Greg Rose
Removed in kernel 4.13

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index aeb594a..0e98ade 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -767,6 +767,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
 [OVS_DEFINE([HAVE_VXLAN_DEV_CFG])])
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_helper.h],
   [nf_conntrack_helper_put])
+  OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],[SKB_GSO_UDP],
+  [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
+
 
   if cmp -s datapath/linux/kcompat.h.new \
 datapath/linux/kcompat.h >/dev/null 2>&1; then
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Skip OVN test if truncate command not found

2017-09-11 Thread Mark Michelson
On Mon, Sep 11, 2017 at 3:37 PM Alin Serdean <
aserd...@cloudbasesolutions.com> wrote:

> > On Mon, Sep 11, 2017 at 2:59 PM Mark Michelson 
> > wrote:
> >
> > > On Mon, Sep 11, 2017 at 2:37 PM Alin Gabriel Serdean
> > > 
> > > wrote:
> > >
> > >> Test: "testing ovn -- ensure one gw controller restart in HA doesn't
> > >> bounce the master" uses the command `truncate`.
> > >>
> > >> Skip the test if the command is not found.
> > >>
> > >>
> > > The use of the truncate command here is to make the file 0 bytes.
> > > Rather than skipping the test if the truncate command is not present,
> > > the test could be altered to make the file 0 bytes in some other way.
> > >
> > > # dd if=/dev/null of=gw1/ovn-controller.log # > gw1-controller.log #
> > > rm gw1-controller.log && touch gw1/ovn-controller.log
> > >
> > > Are potential ways of expressing the same thing without needing to use
> > > truncate.
> > >
> >
> > OOps, I typoed the filename in those last two suggestions. They should be
> >
> > # > gw1/ovn-controller.log
> > # rm gw1/ovn-controller.log && touch gw1/ovn-controller.log
> > ___
> [Alin Serdean] Thanks for the quick review Mark!
> Sorry I didn't look at the actual test.
> I just saw it failed and:
> "ovs/tests/testsuite.dir/at-groups/2364/test-source: line 82: truncate:
> command not found"
> in the log.
> Short story msys does not have truncate, msys2 has it that's why I was so
> eager .
>
> I would avoid using dd with /dev/null because that would also a bit
> specific.
>
> I tested with the following
> diff --git a/tests/ovn.at b/tests/ovn.at
> index 2a00232..572102b 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8501,7 +8501,8 @@ ovn-nbctl --wait=hv --timeout=3 sync
>  # doesn't have the same effect because "name" is conserved, and the
>  # Chassis entry is not replaced.
>
> -truncate -s 0 gw1/ovn-controller.log
> +#truncate -s 0 gw1/ovn-controller.log
> +> gw1/ovn-controller.log
>
>  gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
>  ovn-sbctl destroy Chassis $gw2_chassis
>
> Do you mind sending out a patch for it?
>
> Alin.
>

Patch is here:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338627.html
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] test: Avoid using 'truncate' command

2017-09-11 Thread Mark Michelson
The 'truncate' command is not available on all platforms. Since the
command was being used to zero out a file, there are other easy options
available. In this case, I've replaced 'truncate' with a redirection.

Reported-by: Alin Gabriel Serdean 
Signed-off-by: Mark Michelson 
---
 tests/ovn.at | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/ovn.at b/tests/ovn.at
index 2a00232ed..6c38b973f 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8501,7 +8501,7 @@ ovn-nbctl --wait=hv --timeout=3 sync
 # doesn't have the same effect because "name" is conserved, and the
 # Chassis entry is not replaced.
 
-truncate -s 0 gw1/ovn-controller.log
+> gw1/ovn-controller.log
 
 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 ovn-sbctl destroy Chassis $gw2_chassis
-- 
2.13.5

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Skip OVN test if truncate command not found

2017-09-11 Thread Alin Serdean
> On Mon, Sep 11, 2017 at 2:59 PM Mark Michelson 
> wrote:
> 
> > On Mon, Sep 11, 2017 at 2:37 PM Alin Gabriel Serdean
> > 
> > wrote:
> >
> >> Test: "testing ovn -- ensure one gw controller restart in HA doesn't
> >> bounce the master" uses the command `truncate`.
> >>
> >> Skip the test if the command is not found.
> >>
> >>
> > The use of the truncate command here is to make the file 0 bytes.
> > Rather than skipping the test if the truncate command is not present,
> > the test could be altered to make the file 0 bytes in some other way.
> >
> > # dd if=/dev/null of=gw1/ovn-controller.log # > gw1-controller.log #
> > rm gw1-controller.log && touch gw1/ovn-controller.log
> >
> > Are potential ways of expressing the same thing without needing to use
> > truncate.
> >
> 
> OOps, I typoed the filename in those last two suggestions. They should be
> 
> # > gw1/ovn-controller.log
> # rm gw1/ovn-controller.log && touch gw1/ovn-controller.log
> ___
[Alin Serdean] Thanks for the quick review Mark!
Sorry I didn't look at the actual test.
I just saw it failed and:
"ovs/tests/testsuite.dir/at-groups/2364/test-source: line 82: truncate: command 
not found"
in the log.
Short story msys does not have truncate, msys2 has it that's why I was so eager 
.

I would avoid using dd with /dev/null because that would also a bit specific.

I tested with the following
diff --git a/tests/ovn.at b/tests/ovn.at
index 2a00232..572102b 100644
--- a/tests/ovn.at
+++ b/tests/ovn.at
@@ -8501,7 +8501,8 @@ ovn-nbctl --wait=hv --timeout=3 sync
 # doesn't have the same effect because "name" is conserved, and the
 # Chassis entry is not replaced.

-truncate -s 0 gw1/ovn-controller.log
+#truncate -s 0 gw1/ovn-controller.log
+> gw1/ovn-controller.log

 gw2_chassis=$(ovn-sbctl --bare --columns=_uuid find Chassis name=gw2)
 ovn-sbctl destroy Chassis $gw2_chassis

Do you mind sending out a patch for it?

Alin.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Fix MSVC compiler error on test-ovn.c

2017-09-11 Thread Ben Pfaff
On Mon, Sep 11, 2017 at 10:36:53PM +0300, Alin Gabriel Serdean wrote:
> MSVC doesn't like an empty initializer for a flexible array.
> 
> Add a zero initializer to keep the compiler happy.
> 
> Fixes current build on Windows.
> 
> Signed-off-by: Alin Gabriel Serdean 

Acked-by: Ben Pfaff 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Skip OVN test if truncate command not found

2017-09-11 Thread Mark Michelson
On Mon, Sep 11, 2017 at 2:59 PM Mark Michelson  wrote:

> On Mon, Sep 11, 2017 at 2:37 PM Alin Gabriel Serdean 
> wrote:
>
>> Test: "testing ovn -- ensure one gw controller restart in HA doesn't
>> bounce the master" uses the command `truncate`.
>>
>> Skip the test if the command is not found.
>>
>>
> The use of the truncate command here is to make the file 0 bytes. Rather
> than skipping the test if the truncate command is not present, the test
> could be altered to make the file 0 bytes in some other way.
>
> # dd if=/dev/null of=gw1/ovn-controller.log
> # > gw1-controller.log
> # rm gw1-controller.log && touch gw1/ovn-controller.log
>
> Are potential ways of expressing the same thing without needing to use
> truncate.
>

OOps, I typoed the filename in those last two suggestions. They should be

# > gw1/ovn-controller.log
# rm gw1/ovn-controller.log && touch gw1/ovn-controller.log
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v2] openvswitch: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread Christophe JAILLET
All other error handling paths in this function go through the 'error'
label. This one should do the same.

Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
Signed-off-by: Christophe JAILLET 
---
I think that the comment above the function could be improved. It looks
like the commit log which has introduced this function.

I'm also not sure that commit 9cc9a5cb176c is of any help. It is
supposed to remove a warning, and I guess it does. But 
'ovs_nla_init_match_and_action()'
is called unconditionnaly from 'ovs_flow_cmd_set()'. So even if the stack
used by each function is reduced, the overall stack should be the same, if
not larger.

So this commit sounds like adding a bug where the code was fine and states
to fix an issue but, at the best, only hides it.

Instead of fixing the code with the proposed patch, reverting the initial
commit could also be considered.

V2: update Subject line
---
 net/openvswitch/datapath.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 76cf273a56c7..c3aec6227c91 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1112,7 +1112,8 @@ static int ovs_nla_init_match_and_action(struct net *net,
if (!a[OVS_FLOW_ATTR_KEY]) {
OVS_NLERR(log,
  "Flow key attribute not present in set 
flow.");
-   return -EINVAL;
+   error = -EINVAL;
+   goto error;
}
 
*acts = get_flow_actions(net, a[OVS_FLOW_ATTR_ACTIONS], key,
-- 
2.11.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] tests: Skip OVN test if truncate command not found

2017-09-11 Thread Mark Michelson
On Mon, Sep 11, 2017 at 2:37 PM Alin Gabriel Serdean 
wrote:

> Test: "testing ovn -- ensure one gw controller restart in HA doesn't
> bounce the master" uses the command `truncate`.
>
> Skip the test if the command is not found.
>
>
The use of the truncate command here is to make the file 0 bytes. Rather
than skipping the test if the truncate command is not present, the test
could be altered to make the file 0 bytes in some other way.

# dd if=/dev/null of=gw1/ovn-controller.log
# > gw1-controller.log
# rm gw1-controller.log && touch gw1/ovn-controller.log

Are potential ways of expressing the same thing without needing to use
truncate.


> Signed-off-by: Alin Gabriel Serdean 
> ---
>  tests/ovn.at | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/tests/ovn.at b/tests/ovn.at
> index f203529..4563964 100644
> --- a/tests/ovn.at
> +++ b/tests/ovn.at
> @@ -8432,6 +8432,7 @@ AT_CLEANUP
>
>  AT_SETUP([ovn -- ensure one gw controller restart in HA doesn't bounce
> the master])
>  AT_SKIP_IF([test $HAVE_PYTHON = no])
> +AT_SKIP_IF([!(test -x "$(command -v truncate)")])
>  ovn_start
>
>  net_add n1
> --
> 2.10.2.windows.1
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] tests: Fix MSVC compiler error on test-ovn.c

2017-09-11 Thread Alin Gabriel Serdean
MSVC doesn't like an empty initializer for a flexible array.

Add a zero initializer to keep the compiler happy.

Fixes current build on Windows.

Signed-off-by: Alin Gabriel Serdean 
---
 tests/test-ovn.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/test-ovn.c b/tests/test-ovn.c
index 148ce12..345fd47 100644
--- a/tests/test-ovn.c
+++ b/tests/test-ovn.c
@@ -202,7 +202,7 @@ create_addr_sets(struct shash *addr_sets)
 static const char *const addrs3[] = {
 "00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03",
 };
-static const char *const addrs4[] = {};
+static const char *const addrs4[] = { 0 };
 
 expr_addr_sets_add(addr_sets, "set1", addrs1, 3);
 expr_addr_sets_add(addr_sets, "set2", addrs2, 3);
-- 
2.10.2.windows.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] datapath: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread David Miller

Please use a proper Subject subsystem prefix for openvswitch patches.
"datapath" isn't very specific in the global kernel namespace at all.

The entire networking stack packet processing path is a "datapath"

"openvswitch: " would have been much better.
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] datapath: Fix an error handling path in 'ovs_nla_init_match_and_action()'

2017-09-11 Thread Christophe JAILLET
All other error handling paths in this function go through the 'error'
label. This one should do the same.

Fixes: 9cc9a5cb176c ("datapath: Avoid using stack larger than 1024.")
Signed-off-by: Christophe JAILLET 
---
I think that the comment above the function could be improved. It looks
like the commit log which has introduced this function.

I'm also not sure that commit 9cc9a5cb176c is of any help. It is
supposed to remove a warning, and I guess it does. But 
'ovs_nla_init_match_and_action()'
is called unconditionnaly from 'ovs_flow_cmd_set()'. So even if the stack
used by each function is reduced, the overall stack should be the same, if
not larger.

So this commit sounds like adding a bug where the code was fine and states
to fix an issue but, at the best, only hides it.

Instead of fixing the code with the proposed patch, reverting the initial
commit could also be considered.
---
 net/openvswitch/datapath.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index 76cf273a56c7..c3aec6227c91 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -1112,7 +1112,8 @@ static int ovs_nla_init_match_and_action(struct net *net,
if (!a[OVS_FLOW_ATTR_KEY]) {
OVS_NLERR(log,
  "Flow key attribute not present in set 
flow.");
-   return -EINVAL;
+   error = -EINVAL;
+   goto error;
}
 
*acts = get_flow_actions(net, a[OVS_FLOW_ATTR_ACTIONS], key,
-- 
2.11.0

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v6 0/6] OVS-DPDK rxq to pmd assignment improvements.

2017-09-11 Thread Federico Iezzi
On Fri, Aug 25, 2017 at 10:36 AM, Darrell Ball  wrote:
> I applied the series to https://github.com/darball/ovs/commits/dpdk_merge

I believe this series is worth a mention in the NEWS, reengineering an
important OVS-DPDK limitation.

>
>
>
> On 8/24/17, 4:37 PM, "Kevin Traynor"  wrote:
>
> For the DPDK datapath, by default rxqs are assigned to available pmds
> in round robin order with no weight or priority.
>
> It can happen that some very busy queues are handled by one pmd which
> does not have enough cycles to prevent packets being dropped on them.
> While at the same time another pmd which handles queues with no traffic
> on them is essentially idle.
>
> Rxq to pmd assignment happens as a result of a number of events and
> when it does, the same unweighted round robin approach is applied
> each time.
>
> This patchset proposes to improve the round robin nature of rxq to pmd
> assignment by counting the processing cycles used by the rxqs during
> their operation and incorporating that data into assignment.
>
> Before assigning in a round robin manner, the rxqs will be sorted in
> order of the processing cycles they have been consuming. Assuming
> multiple pmds, this ensures that the rxqs measured to be using the
> most processing cycles will be assigned to different cores.
>
> In some cases the measured cycles for an rxq may be not available as
> the rxq is new or may not be useful for assignment as traffic patterns
> may change.  In those cases the code will essentially fallback to being
> round round similar to what currently exists. However, in the case
> where data is available and a reliable indication of future rxq cycles
> consumption, rxq to pmd distribution will be much improved.
>
> V5 -> V6
> 
> Minor changes to 2/6, 3/6, 4/6, 5/6 from Darrell's review comments.
>
> V4 -> V5
> 
> Changed history of rxq considered during assignment to 1 min. In order
> to have data available quicker than 1 min and not to be using up to
> 1 min old data, introduced storing of data in multiple intervals
> similar to suggestion by Darrell. Some minor name changes to reflect
> this.
>
> 2/6 Added storage for multiple intervals
> 3/6 Store cycles into interval
> 4/6 Sum cycles from intervals and use total for assignment
>
> V3 -> V4
> 
> 4/6
> Rebased to accomodate new cross numa assigment.
>
> V2 -> V3
> 
> Dropped v2 1/7 as not reusing dpcls optimisation interval anymore
>
> 2/6
> Moved unused functions to 3/6 to avoid compiler warning
>
> 3/6
> Made pmd rxq interval independent from dpcls opt interval
>
> 4/6
> Moved docs about rebalance command to when it is available in 6/6
> Added logging info for pmd to rxq assignment
>
> 5/6
> Added an example to docs
>
> 6/6
> Noted in commit msg that Jan requested this for testing purposes
>
> V1 -> V2
> 
> Dropped Ciara's patch to change how pmd cycles are counted as it merged.
>
> 6/7: Rebased unit tests.
>
>
> Kevin Traynor (6):
>   dpif-netdev: Change polled_queue to use dp_netdev_rxq.
>   dpif-netdev: Add rxq processing cycle counters.
>   dpif-netdev: Count the rxq processing cycles for an rxq.
>   dpif-netdev: Change rxq_scheduling to use rxq processing cycles.
>   dpif-netdev: Change pmd selection order.
>   dpif-netdev: Add ovs-appctl dpif-netdev/pmd-rxq-rebalance.
>
>  Documentation/howto/dpdk.rst |  26 
>  lib/dpif-netdev.c| 298 
> ---
>  tests/pmd.at |   2 +-
>  vswitchd/ovs-vswitchd.8.in   |   2 +
>  4 files changed, 284 insertions(+), 44 deletions(-)
>
> --
> 1.8.3.1
>
>
>
>
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH 09/15] acinclude: Check for extended netlink ack presence

2017-09-11 Thread Greg Rose

On 09/11/2017 10:56 AM, Greg Rose wrote:

RTNL ops validate and newlink now include the extended netlink
ack feature.  Check for it and set HAVE_EXT_ACK_IN_RTNL_LINKOPS
if found.

Signed-off-by: Greg Rose 
---
  acinclude.m4 | 6 ++
  1 file changed, 6 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 772ff03..46efa69 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -771,6 +771,12 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
[OVS_DEFINE([HAVE_SKB_GSO_UDP])])
OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
[OVS_DEFINE([HAVE_DST_NOCACHE])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], [rtnl_link_ops],
+[extack],
+  [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])])


This part below for nf_hook_ops got munged into this patch but should should 
have been in
its own separate patch.

I'll follow up with a V2 of the series.

- Greg


+  OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netfilter.h], [nf_hook_ops],
+[list],
+[OVS_DEFINE([HAVE_LIST_IN_NF_HOOK_OPS])])
  
  
if cmp -s datapath/linux/kcompat.h.new \




___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Pour une collaboration

2017-09-11 Thread Mr Laurent Jacquemet via dev
Bonjour

Je suis ravi de faire votre connaissance, mais avant tout toutes mes excuses 
pour l'occasion par laquelle je prends contact avec vous c'est à dire par le 
biais de l'internet. C'est suite à des recherches sur le net que j'ai trouvé 
votre adresse électronique. Avec tout le respect, je vous prie de bien vouloir 
prendre connaissance de mon mail ainsi que son contenu.Je suis directeur du 
département de crédit et de rémittence internationale dans une Banque . J'ai la 
souscription d'un de nos clients Monsieur Mohamed Ahmed étrangers qui a trouvé 
la mort avec toute sa famille entière dans un crash d'avion survenu le 25 
décembre 2003 à l'aéroport International de Cotonou. Sur son compte a été 
découvert d'une forte somme dont je ne dis pas encore la valeur.Je vous 
rappelle que cette transaction est légale, car cet argent n'appartient plus à 
personne et que tous les agencements exigés ont été pris pour mener à bien le 
transfert sans risque aucun pour les deux partis (vous et moi). Je vous 
donnerai plus de détails si vous me donnez votre accord pour un partenariat 
gagnant. Les fonds seront partagés à la proportion équitable et seul 2 % seront 
mis de côté pour couvrir les dépenses et impôts (taxe). Traitez s'il vous plaît 
cette proposition d'affaires avec la confidentialité extrême et envoyez-moi un 
mail pour confirmation.Voici mon mon email personnel jacq...@hotmail.com

Cordialement

Mr Laurent Jacquemet
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 15/15] travis: Update build kernel list

2017-09-11 Thread Greg Rose
Linux kernel 4.13 has been released. Update the kernel build list to
the current list of kernels from kernel.org.

Signed-off-by: Greg Rose 
---
 .travis.yml | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/.travis.yml b/.travis.yml
index 9d0fd44..f217840 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -25,16 +25,16 @@ sudo: false
 
 env:
   - OPTS="--disable-ssl"
-  - TESTSUITE=1 KERNEL=3.16.46
+  - TESTSUITE=1 KERNEL=3.16.47
   - TESTSUITE=1 OPTS="--enable-shared"
   - BUILD_ENV="-m32" OPTS="--disable-ssl"
-  - KERNEL=3.16.46 DPDK=1
-  - KERNEL=3.16.46 DPDK=1 OPTS="--enable-shared"
-  - KERNEL=4.12.2
-  - KERNEL=4.11.11
-  - KERNEL=4.9.38
-  - KERNEL=4.4.77
-  - KERNEL=4.1.42
+  - KERNEL=3.16.47 DPDK=1
+  - KERNEL=3.16.47 DPDK=1 OPTS="--enable-shared"
+  - KERNEL=4.13
+  - KERNEL=4.12.11
+  - KERNEL=4.9.48
+  - KERNEL=4.4.87
+  - KERNEL=4.1.43
   - KERNEL=3.10.107
   - TESTSUITE=1 LIBS=-ljemalloc
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 14/15] Documentation: Update NEWS and FAQ.

2017-09-11 Thread Greg Rose
Document Open vSwitch Linux kernel support for the 4.13 kernel
release.

Signed-off-by: Greg Rose 
---
 Documentation/faq/releases.rst | 2 +-
 NEWS   | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/faq/releases.rst b/Documentation/faq/releases.rst
index c785529..064a496 100644
--- a/Documentation/faq/releases.rst
+++ b/Documentation/faq/releases.rst
@@ -65,7 +65,7 @@ Q: What Linux kernel versions does each Open vSwitch release 
work with?
 2.5.x2.6.32 to 4.3
 2.6.x3.10 to 4.7
 2.7.x3.10 to 4.9
-2.8.x3.10 to 4.12
+2.8.x3.10 to 4.13
  ==
 
 Open vSwitch userspace should also work with the Linux kernel module built
diff --git a/NEWS b/NEWS
index 6a5d2bf..a3c1a02 100644
--- a/NEWS
+++ b/NEWS
@@ -3,6 +3,8 @@ Post-v2.8.0
- OVN:
  * The "requested-chassis" option for a logical switch port now accepts a
chassis "hostname" in addition to a chassis "name".
+   - Linux kernel 4.13
+ * Add support for compiling OVS with the latest Linux 4.13 kernel
 
 v2.8.0 - xx xxx 
 -
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 13/15] acinclude: Add support for Linux 4.13

2017-09-11 Thread Greg Rose
Add configuration support for the just released 4.13 Linux kernel.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/acinclude.m4 b/acinclude.m4
index 46efa69..cb5f3ae 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -143,10 +143,10 @@ AC_DEFUN([OVS_CHECK_LINUX], [
 AC_MSG_RESULT([$kversion])
 
 if test "$version" -ge 4; then
-   if test "$version" = 4 && test "$patchlevel" -le 12; then
+   if test "$version" = 4 && test "$patchlevel" -le 13; then
   : # Linux 4.x
else
-  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.11.x is not supported (please refer to the FAQ for advice)])
+  AC_ERROR([Linux kernel in $KBUILD is version $kversion, but version 
newer than 4.13.x is not supported (please refer to the FAQ for advice)])
fi
 elif test "$version" = 3 && test "$patchlevel" -ge 10; then
: # Linux 3.x
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 12/15] datapath: Fix up vxlan device flags

2017-09-11 Thread Greg Rose
I missed a couple of usages of the flags parameter from vxlan_dev
while adding compatibility code to handle the removal of the flags.
Add the checks so that the module can compile for Linux kernel
release 4.13

Fixes: 143656435c ("datapath: get rid of redundant vxlan_dev.flags")
Signed-off-by: Greg Rose 
---
 datapath/vport-vxlan.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/datapath/vport-vxlan.c b/datapath/vport-vxlan.c
index 2910694..c7139ab 100644
--- a/datapath/vport-vxlan.c
+++ b/datapath/vport-vxlan.c
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2015 Nicira, Inc.
+ * Copyright (c) 2015,2017 Nicira, Inc.
  * Copyright (c) 2013 Cisco Systems, Inc.
  *
  * This program is free software; you can redistribute it and/or
@@ -60,14 +60,22 @@ static int vxlan_get_options(const struct vport *vport, 
struct sk_buff *skb)
return -EMSGSIZE;
 
nla_nest_end(skb, exts);
+#ifdef HAVE_VXLAN_DEV_CFG
+   } else if (vxlan->cfg.flags & VXLAN_F_GPE) {
+#else
} else if (vxlan->flags & VXLAN_F_GPE) {
+#endif
struct nlattr *exts;
 
exts = nla_nest_start(skb, OVS_TUNNEL_ATTR_EXTENSION);
if (!exts)
return -EMSGSIZE;
 
+#ifdef HAVE_VXLAN_DEV_CFG
+   if (vxlan->cfg.flags & VXLAN_F_GPE &&
+#else
if (vxlan->flags & VXLAN_F_GPE &&
+#endif
nla_put_flag(skb, OVS_VXLAN_EXT_GPE))
return -EMSGSIZE;
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 11/15] datapath: Check for existence of nf_hook_ops member list

2017-09-11 Thread Greg Rose
The list member of nf_hook_ops has been removed in Linux kernel
release 4.13. Check for the definition of it in pre-4.13 kernels.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/stt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
index 07f5892..37d5f4b 100644
--- a/datapath/linux/compat/stt.c
+++ b/datapath/linux/compat/stt.c
@@ -2110,7 +2110,9 @@ int stt_init_module(void)
if (rc)
goto out2;
 
+#ifdef HAVE_LIST_IN_NF_HOOK_OPS
INIT_LIST_HEAD(_hook_ops.list);
+#endif
pr_info("STT tunneling driver\n");
return 0;
 out2:
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 10/15] datapath: Fixup RTNL ops for kernel 4.13

2017-09-11 Thread Greg Rose
The RTNL ops validate and newlink functions now take the extended
netlink ack parameter.  Use the new HAVE_EXT_ACK_IN_RTNL_LINKOPS
define to check if the additional parameter is present and add the
parameter if so.

While in the modules remove the checks for Linux kernels < 2.3.39
since they are no longer supported since 2.5.x.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/lisp.c | 21 ++---
 datapath/linux/compat/stt.c  | 11 +++
 2 files changed, 21 insertions(+), 11 deletions(-)

diff --git a/datapath/linux/compat/lisp.c b/datapath/linux/compat/lisp.c
index c505fad..34f8232 100644
--- a/datapath/linux/compat/lisp.c
+++ b/datapath/linux/compat/lisp.c
@@ -608,7 +608,12 @@ static const struct nla_policy lisp_policy[IFLA_LISP_MAX + 
1] = {
[IFLA_LISP_PORT]  = { .type = NLA_U16 },
 };
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int lisp_validate(struct nlattr *tb[], struct nlattr *data[],
+struct netlink_ext_ack __always_unused *extack)
+#else
 static int lisp_validate(struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -660,17 +665,15 @@ static int lisp_configure(struct net *net, struct 
net_device *dev,
return 0;
 }
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39)
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
 static int lisp_newlink(struct net *net, struct net_device *dev,
-   struct nlattr *tb[], struct nlattr *data[])
-{
+   struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
 #else
-static int lisp_newlink(struct net_device *dev,
+static int lisp_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
-
-{
-   struct net *net = _net;
 #endif
+{
__be16 dst_port = htons(LISP_UDP_PORT);
 
if (data[IFLA_LISP_PORT])
@@ -679,11 +682,7 @@ static int lisp_newlink(struct net_device *dev,
return lisp_configure(net, dev, dst_port);
 }
 
-#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,39)
 static void lisp_dellink(struct net_device *dev, struct list_head *head)
-#else
-static void lisp_dellink(struct net_device *dev)
-#endif
 {
struct lisp_dev *lisp = netdev_priv(dev);
 
diff --git a/datapath/linux/compat/stt.c b/datapath/linux/compat/stt.c
index 964d993..07f5892 100644
--- a/datapath/linux/compat/stt.c
+++ b/datapath/linux/compat/stt.c
@@ -1909,7 +1909,12 @@ static const struct nla_policy stt_policy[IFLA_STT_MAX + 
1] = {
[IFLA_STT_PORT]  = { .type = NLA_U16 },
 };
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int stt_validate(struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
+#else
 static int stt_validate(struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
if (tb[IFLA_ADDRESS]) {
if (nla_len(tb[IFLA_ADDRESS]) != ETH_ALEN)
@@ -1961,8 +1966,14 @@ static int stt_configure(struct net *net, struct 
net_device *dev,
return 0;
 }
 
+#ifdef HAVE_EXT_ACK_IN_RTNL_LINKOPS
+static int stt_newlink(struct net *net, struct net_device *dev,
+   struct nlattr *tb[], struct nlattr *data[],
+   struct netlink_ext_ack __always_unused *extack)
+#else
 static int stt_newlink(struct net *net, struct net_device *dev,
struct nlattr *tb[], struct nlattr *data[])
+#endif
 {
__be16 dst_port = htons(STT_DST_PORT);
 
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 09/15] acinclude: Check for extended netlink ack presence

2017-09-11 Thread Greg Rose
RTNL ops validate and newlink now include the extended netlink
ack feature.  Check for it and set HAVE_EXT_ACK_IN_RTNL_LINKOPS
if found.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 772ff03..46efa69 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -771,6 +771,12 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
   OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
   [OVS_DEFINE([HAVE_DST_NOCACHE])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/net/rtnetlink.h], [rtnl_link_ops],
+[extack],
+  [OVS_DEFINE([HAVE_EXT_ACK_IN_RTNL_LINKOPS])])
+  OVS_FIND_FIELD_IFELSE([$KSRC/include/linux/netfilter.h], [nf_hook_ops],
+[list],
+[OVS_DEFINE([HAVE_LIST_IN_NF_HOOK_OPS])])
 
 
   if cmp -s datapath/linux/kcompat.h.new \
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 08/15] compat: Add check for DST_NOCACHE

2017-09-11 Thread Greg Rose
DST_NOCACHE was removed from the Linux 4.13 kernel.  Check if
HAVE_DST_NOCACHE is defined for older kernels.

Signed-off-by: Greg Rose 
---
 datapath/linux/compat/include/net/ip6_fib.h | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/datapath/linux/compat/include/net/ip6_fib.h 
b/datapath/linux/compat/include/net/ip6_fib.h
index 2c8a513..4d58375 100644
--- a/datapath/linux/compat/include/net/ip6_fib.h
+++ b/datapath/linux/compat/include/net/ip6_fib.h
@@ -27,7 +27,11 @@
 static inline u32 rt6_get_cookie(const struct rt6_info *rt)
 {
if (rt->rt6i_flags & RTF_PCPU ||
-   (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from))
+#ifdef HAVE_DST_NOCACHE
+   (unlikely(rt->dst.flags & DST_NOCACHE) && rt->dst.from))
+#else
+   (unlikely(!list_empty(>rt6i_uncached)) && rt->dst.from))
+#endif
rt = (struct rt6_info *)(rt->dst.from);
 
return rt->rt6i_node ? rt->rt6i_node->fn_sernum : 0;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 07/15] acinclude: Add compat define for DST_NOCACHE

2017-09-11 Thread Greg Rose
DST_NOCACHE is removed in the 4.13 Linux kernel - add check for it
and if found set HAVE_DST_NOCACHE.

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index 0e98ade..772ff03 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -769,6 +769,8 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
   [nf_conntrack_helper_put])
   OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],[SKB_GSO_UDP],
   [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
+  OVS_GREP_IFELSE([$KSRC/include/net/dst.h],[DST_NOCACHE],
+  [OVS_DEFINE([HAVE_DST_NOCACHE])])
 
 
   if cmp -s datapath/linux/kcompat.h.new \
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 06/15] datapath: fix skb_panic due to the incorrect actions attrlen

2017-09-11 Thread Greg Rose
Upstream commit:
commit 494bea39f3201776cdfddc232705f54a0bd210c4
Author: Liping Zhang 
Date:   Wed Aug 16 13:30:07 2017 +0800

openvswitch: fix skb_panic due to the incorrect actions attrlen

For sw_flow_actions, the actions_len only represents the kernel part's
size, and when we dump the actions to the userspace, we will do the
convertions, so it's true size may become bigger than the actions_len.

But unfortunately, for OVS_PACKET_ATTR_ACTIONS, we use the actions_len
to alloc the skbuff, so the user_skb's size may become insufficient and
oops will happen like this:
  skbuff: skb_over_panic: text:8148fabf len:1749 put:157 head:
  881300f39000 data:881300f39000 tail:0x6d5 end:0x6c0 dev:
  [ cut here ]
  kernel BUG at net/core/skbuff.c:129!
  [...]
  Call Trace:
   
   [] skb_put+0x43/0x44
   [] skb_zerocopy+0x6c/0x1f4
   [] queue_userspace_packet+0x3a3/0x448 [openvswitch]
   [] ovs_dp_upcall+0x30/0x5c [openvswitch]
   [] output_userspace+0x132/0x158 [openvswitch]
   [] ? ip6_rcv_finish+0x74/0x77 [ipv6]
   [] do_execute_actions+0xcc1/0xdc8 [openvswitch]
   [] ovs_execute_actions+0x74/0x106 [openvswitch]
   [] ovs_dp_process_packet+0xe1/0xfd [openvswitch]
   [] ? key_extract+0x63c/0x8d5 [openvswitch]
   [] ovs_vport_receive+0xa1/0xc3 [openvswitch]
  [...]

Also we can find that the actions_len is much little than the orig_len:
  crash> struct sw_flow_actions 0x8812f539d000
  struct sw_flow_actions {
rcu = {
  next = 0x8812f5398800,
  func = 0xe3b00035db32
},
orig_len = 1384,
actions_len = 592,
actions = 0x8812f539d01c
  }

So as a quick fix, use the orig_len instead of the actions_len to alloc
the user_skb.

Last, this oops happened on our system running a relative old kernel, but
the same risk still exists on the mainline, since we use the wrong
actions_len from the beginning.

Fixes: ccea74457bbd ("openvswitch: include datapath actions with sampled-pac
Cc: Neil McKee 
Signed-off-by: Liping Zhang 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Fixes: 0e469d3b380c ("datapath: Include datapath actions with sampled-packet 
upcall to userspace.")
Signed-off-by: Greg Rose 
---
 datapath/actions.c  | 1 +
 datapath/datapath.c | 7 ---
 datapath/datapath.h | 2 ++
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/datapath/actions.c b/datapath/actions.c
index 59d91b2..ad18c2c 100644
--- a/datapath/actions.c
+++ b/datapath/actions.c
@@ -1348,6 +1348,7 @@ int ovs_execute_actions(struct datapath *dp, struct 
sk_buff *skb,
goto out;
}
 
+   OVS_CB(skb)->acts_origlen = acts->orig_len;
err = do_execute_actions(dp, skb, key,
 acts->actions, acts->actions_len);
 
diff --git a/datapath/datapath.c b/datapath/datapath.c
index b565fc5..1780819 100644
--- a/datapath/datapath.c
+++ b/datapath/datapath.c
@@ -388,7 +388,7 @@ static int queue_gso_packets(struct datapath *dp, struct 
sk_buff *skb,
 }
 
 static size_t upcall_msg_size(const struct dp_upcall_info *upcall_info,
- unsigned int hdrlen)
+ unsigned int hdrlen, int actions_attrlen)
 {
size_t size = NLMSG_ALIGN(sizeof(struct ovs_header))
+ nla_total_size(hdrlen) /* OVS_PACKET_ATTR_PACKET */
@@ -405,7 +405,7 @@ static size_t upcall_msg_size(const struct dp_upcall_info 
*upcall_info,
 
/* OVS_PACKET_ATTR_ACTIONS */
if (upcall_info->actions_len)
-   size += nla_total_size(upcall_info->actions_len);
+   size += nla_total_size(actions_attrlen);
 
/* OVS_PACKET_ATTR_MRU */
if (upcall_info->mru)
@@ -472,7 +472,8 @@ static int queue_userspace_packet(struct datapath *dp, 
struct sk_buff *skb,
else
hlen = skb->len;
 
-   len = upcall_msg_size(upcall_info, hlen - cutlen);
+   len = upcall_msg_size(upcall_info, hlen - cutlen,
+ OVS_CB(skb)->acts_origlen);
user_skb = genlmsg_new(len, GFP_ATOMIC);
if (!user_skb) {
err = -ENOMEM;
diff --git a/datapath/datapath.h b/datapath/datapath.h
index f20deed..70ad0ac 100644
--- a/datapath/datapath.h
+++ b/datapath/datapath.h
@@ -100,11 +100,13 @@ struct datapath {
  * when a packet is received by OVS.
  * @mru: The maximum received fragement size; 0 if the packet is not
  * fragmented.
+ * @acts_origlen: The netlink size of the flow actions applied to this skb.
  * @cutlen: The number of bytes from the packet end to be removed.
  */
 struct ovs_skb_cb {
struct vport*input_vport;
u16 

[ovs-dev] [PATCH 05/15] datapath: Remove unnecessary newlines from OVS_NLERR uses

2017-09-11 Thread Greg Rose
Upstream commit:
commit 0ed80da518a1f27562a013f106505e495e891fe4
Author: Joe Perches 
Date:   Fri Aug 11 04:26:26 2017 -0700

openvswitch: Remove unnecessary newlines from OVS_NLERR uses

OVS_NLERR already adds a newline so these just add blank
lines to the logging.

Signed-off-by: Joe Perches 
Acked-by: Joe Stringer 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/conntrack.c| 14 +-
 datapath/flow_netlink.c |  2 +-
 2 files changed, 6 insertions(+), 10 deletions(-)

diff --git a/datapath/conntrack.c b/datapath/conntrack.c
index b645ab1..d517a87 100644
--- a/datapath/conntrack.c
+++ b/datapath/conntrack.c
@@ -1221,15 +1221,13 @@ static int parse_nat(const struct nlattr *attr,
int type = nla_type(a);
 
if (type > OVS_NAT_ATTR_MAX) {
-   OVS_NLERR(log,
- "Unknown NAT attribute (type=%d, max=%d).\n",
+   OVS_NLERR(log, "Unknown NAT attribute (type=%d, 
max=%d)",
  type, OVS_NAT_ATTR_MAX);
return -EINVAL;
}
 
if (nla_len(a) != ovs_nat_attr_lens[type][ip_vers]) {
-   OVS_NLERR(log,
- "NAT attribute type %d has unexpected length 
(%d != %d).\n",
+   OVS_NLERR(log, "NAT attribute type %d has unexpected 
length (%d != %d)",
  type, nla_len(a),
  ovs_nat_attr_lens[type][ip_vers]);
return -EINVAL;
@@ -1239,9 +1237,7 @@ static int parse_nat(const struct nlattr *attr,
case OVS_NAT_ATTR_SRC:
case OVS_NAT_ATTR_DST:
if (info->nat) {
-   OVS_NLERR(log,
- "Only one type of NAT may be 
specified.\n"
- );
+   OVS_NLERR(log, "Only one type of NAT may be 
specified");
return -ERANGE;
}
info->nat |= OVS_CT_NAT;
@@ -1291,13 +1287,13 @@ static int parse_nat(const struct nlattr *attr,
break;
 
default:
-   OVS_NLERR(log, "Unknown nat attribute (%d).\n", type);
+   OVS_NLERR(log, "Unknown nat attribute (%d)", type);
return -EINVAL;
}
}
 
if (rem > 0) {
-   OVS_NLERR(log, "NAT attribute has %d unknown bytes.\n", rem);
+   OVS_NLERR(log, "NAT attribute has %d unknown bytes", rem);
return -EINVAL;
}
if (!info->nat) {
diff --git a/datapath/flow_netlink.c b/datapath/flow_netlink.c
index 9b48612..df9d88e 100644
--- a/datapath/flow_netlink.c
+++ b/datapath/flow_netlink.c
@@ -1257,7 +1257,7 @@ static int ovs_key_from_nlattrs(struct net *net, struct 
sw_flow_match *match,
}
 
if (!is_mask && ipv6_key->ipv6_label & htonl(0xFFF0)) {
-   OVS_NLERR(log, "IPv6 flow label %x is out of range 
(max=%x).\n",
+   OVS_NLERR(log, "IPv6 flow label %x is out of range 
(max=%x)",
  ntohl(ipv6_key->ipv6_label), (1 << 20) - 1);
return -EINVAL;
}
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 04/15] datapath: Optimize operations for OvS flow_stats.

2017-09-11 Thread Greg Rose
Upstream commit:
commit c4b2bf6b4a35348fe6d1eb06928eb68d7b9d99a9
Author: Tonghao Zhang 
Date:   Mon Jul 17 23:28:06 2017 -0700

openvswitch: Optimize operations for OvS flow_stats.

When calling the flow_free() to free the flow, we call many times
(cpu_possible_mask, eg. 128 as default) cpumask_next(). That will
take up our CPU usage if we call the flow_free() frequently.
When we put all packets to userspace via upcall, and OvS will send
them back via netlink to ovs_packet_cmd_execute(will call flow_free).

The test topo is shown as below. VM01 sends TCP packets to VM02,
and OvS forward packtets. When testing, we use perf to report the
system performance.

VM01 --- OvS-VM --- VM02

Without this patch, perf-top show as below: The flow_free() is
3.02% CPU usage.

4.23%  [kernel][k] _raw_spin_unlock_irqrestore
3.62%  [kernel][k] __do_softirq
3.16%  [kernel][k] __memcpy
3.02%  [kernel][k] flow_free
2.42%  libc-2.17.so[.] __memcpy_ssse3_back
2.18%  [kernel][k] copy_user_generic_unrolled
2.17%  [kernel][k] find_next_bit

When applied this patch, perf-top show as below: Not shown on
the list anymore.

4.11%  [kernel][k] _raw_spin_unlock_irqrestore
3.79%  [kernel][k] __do_softirq
3.46%  [kernel][k] __memcpy
2.73%  libc-2.17.so[.] __memcpy_ssse3_back
2.25%  [kernel][k] copy_user_generic_unrolled
1.89%  libc-2.17.so[.] _int_malloc
1.53%  ovs-vswitchd[.] xlate_actions

With this patch, the TCP throughput(we dont use Megaflow Cache
+ Microflow Cache) between VMs is 1.18Gbs/sec up to 1.30Gbs/sec
(maybe ~10% performance imporve).

This patch adds cpumask struct, the cpu_used_mask stores the cpu_id
that the flow used. And we only check the flow_stats on the cpu we
used, and it is unncessary to check all possible cpu when getting,
cleaning, and updating the flow_stats. Adding the cpu_used_mask to
sw_flow struct does’t increase the cacheline number.

Signed-off-by: Tonghao Zhang 
Acked-by: Pravin B Shelar 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/flow.c   | 7 ---
 datapath/flow.h   | 2 ++
 datapath/flow_table.c | 4 +++-
 3 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 30e4d21..5da7e3e 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -71,7 +71,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int cpu = smp_processor_id();
+   unsigned int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
stats = rcu_dereference(flow->stats[cpu]);
@@ -116,6 +116,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
 
rcu_assign_pointer(flow->stats[cpu],
   new_stats);
+   cpumask_set_cpu(cpu, 
>cpu_used_mask);
goto unlock;
}
}
@@ -143,7 +144,7 @@ void ovs_flow_stats_get(const struct sw_flow *flow,
memset(ovs_stats, 0, sizeof(*ovs_stats));
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
>cpu_used_mask)) {
struct flow_stats *stats = 
rcu_dereference_ovsl(flow->stats[cpu]);
 
if (stats) {
@@ -167,7 +168,7 @@ void ovs_flow_stats_clear(struct sw_flow *flow)
int cpu;
 
/* We open code this to make sure cpu 0 is always considered */
-   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
cpu_possible_mask)) {
+   for (cpu = 0; cpu < nr_cpu_ids; cpu = cpumask_next(cpu, 
>cpu_used_mask)) {
struct flow_stats *stats = ovsl_dereference(flow->stats[cpu]);
 
if (stats) {
diff --git a/datapath/flow.h b/datapath/flow.h
index 07af912..0796b09 100644
--- a/datapath/flow.h
+++ b/datapath/flow.h
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -218,6 +219,7 @@ struct sw_flow {
 */
struct sw_flow_key key;
struct sw_flow_id id;
+   struct cpumask cpu_used_mask;
struct sw_flow_mask *mask;
struct sw_flow_actions __rcu *sf_acts;
struct flow_stats __rcu 

[ovs-dev] [PATCH 03/15] datapath: Optimize updating for OvS flow_stats.

2017-09-11 Thread Greg Rose
Upstream commit:
commit c57c054eb5b1ccf230c49f736f7a018fcbc3e952
Author: Tonghao Zhang 
Date:   Mon Jul 17 23:28:05 2017 -0700

openvswitch: Optimize updating for OvS flow_stats.

In the ovs_flow_stats_update(), we only use the node
var to alloc flow_stats struct. But this is not a
common case, it is unnecessary to call the numa_node_id()
everytime. This patch is not a bugfix, but there maybe
a small increase.

Signed-off-by: Tonghao Zhang 
Signed-off-by: David S. Miller 

Signed-off-by: Greg Rose 
---
 datapath/flow.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/datapath/flow.c b/datapath/flow.c
index 599b4e5..30e4d21 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -71,7 +71,6 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   const struct sk_buff *skb)
 {
struct flow_stats *stats;
-   int node = numa_node_id();
int cpu = smp_processor_id();
int len = skb->len + (skb_vlan_tag_present(skb) ? VLAN_HLEN : 0);
 
@@ -107,7 +106,7 @@ void ovs_flow_stats_update(struct sw_flow *flow, __be16 
tcp_flags,
   __GFP_THISNODE |
   __GFP_NOWARN |
  __GFP_NOMEMALLOC,
- node);
+ numa_node_id());
if (likely(new_stats)) {
new_stats->used = jiffies;
new_stats->packet_count = 1;
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 02/15] datapath: Remove all references to SKB_GSO_UDP.

2017-09-11 Thread Greg Rose
Upstream commit:
commit 880388aa3c07fdea4f9b85e35641753017b1852f
Author: David S. Miller 
Date:   Mon Jul 3 07:29:12 2017 -0700

net: Remove all references to SKB_GSO_UDP.

Such packets are no longer possible.

Signed-off-by: David S. Miller 

SKB_GSO_UDP is removed in the upstream kernel.  Use HAVE_SKB_GSO_UDP
define from acinclude to detect if SKB_GSO_UDP exists and if so apply
openvswitch section of this upstream patch.

Signed-off-by: Greg Rose 
---
 datapath/flow.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/datapath/flow.c b/datapath/flow.c
index c4f63b0..599b4e5 100644
--- a/datapath/flow.c
+++ b/datapath/flow.c
@@ -589,8 +589,12 @@ static int key_extract(struct sk_buff *skb, struct 
sw_flow_key *key)
key->ip.frag = OVS_FRAG_TYPE_LATER;
return 0;
}
+#ifdef HAVE_SKB_GSO_UDP
if (nh->frag_off & htons(IP_MF) ||
skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
+#else
+   if (nh->frag_off & htons(IP_MF))
+#endif
key->ip.frag = OVS_FRAG_TYPE_FIRST;
else
key->ip.frag = OVS_FRAG_TYPE_NONE;
@@ -707,9 +711,11 @@ static int key_extract(struct sk_buff *skb, struct 
sw_flow_key *key)
 
if (key->ip.frag == OVS_FRAG_TYPE_LATER)
return 0;
+#ifdef HAVE_SKB_GSO_UDP
if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP)
key->ip.frag = OVS_FRAG_TYPE_FIRST;
 
+#endif
/* Transport layer. */
if (key->ip.proto == NEXTHDR_TCP) {
if (tcphdr_ok(skb)) {
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH 01/15] acinclude: Check for SKB_GSO_UDP

2017-09-11 Thread Greg Rose
Removed in kernel 4.13

Signed-off-by: Greg Rose 
---
 acinclude.m4 | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/acinclude.m4 b/acinclude.m4
index aeb594a..0e98ade 100644
--- a/acinclude.m4
+++ b/acinclude.m4
@@ -767,6 +767,9 @@ AC_DEFUN([OVS_CHECK_LINUX_COMPAT], [
 [OVS_DEFINE([HAVE_VXLAN_DEV_CFG])])
   OVS_GREP_IFELSE([$KSRC/include/net/netfilter/nf_conntrack_helper.h],
   [nf_conntrack_helper_put])
+  OVS_GREP_IFELSE([$KSRC/include/linux/skbuff.h],[SKB_GSO_UDP],
+  [OVS_DEFINE([HAVE_SKB_GSO_UDP])])
+
 
   if cmp -s datapath/linux/kcompat.h.new \
 datapath/linux/kcompat.h >/dev/null 2>&1; then
-- 
1.8.3.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-nbctl: fix validation logic for acl-del command

2017-09-11 Thread Justin Pettit

> On Sep 8, 2017, at 11:56 AM, Han Zhou  wrote:
> 
> The command error message is misleading, e.g.:
> 
> $ ovn-nbctl acl-del ls1 100 'outport=="lsp1" && ip4'
> ovn-nbctl: cannot specify priority without match
> 
> $ ovn-nbctl acl-del ls1 to-lport 'outport=="lsp1" && ip4'
> ovn-nbctl: cannot specify priority without match
> 
> This patch fixes the problem.
> $
> $ ovn-nbctl acl-del ls1 100 'outport=="lsp1" && ip4'
> 
> Signed-off-by: Han Zhou 

Thanks.  I pushed this to master.

--Justin


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn: fix wrong comment in acl log test case

2017-09-11 Thread Justin Pettit

> On Aug 30, 2017, at 3:42 PM, Han Zhou  wrote:
> 
> Signed-off-by: Han Zhou 

Thanks.  I pushed this to master.

--Justin



___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] Sushi Toppings

2017-09-11 Thread Bonesca - Jona
    [ View in browser ]( http://r.newsletter.bonescamail.nl/nru6ryvyoatrf.html 
)   
 
 
 
Topquality Tilapia Izumidai Slices Taiwan 25 x 160 grs (20 x 8 gr) 
1 box € 14,95 - 10 box € 14,25 per kilo
 
Topquality Barramundi Slices Taiwan 25 x 160 grs (20 x 8 gr)
1 box € 15,95 - 10 box € 15,25 per kilo    







 
This email was sent to d...@openvswitch.org
You received this email because you are registered with Bonesca Import en 
Export BV
 
[ Unsubscribe here ]( http://r.newsletter.bonescamail.nl/nru6ryvyoatrg.html )  

Sent by
[  ]( http://r.newsletter.bonescamail.nl/track/click/2n3cr3vjhaoatrd )     
© 2017 Bonesca Import en Export BV  

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-nbctl: fix validation logic for acl-del command

2017-09-11 Thread Mark Michelson
On Fri, Sep 8, 2017 at 1:57 PM Han Zhou  wrote:

> The command error message is misleading, e.g.:
>
> $ ovn-nbctl acl-del ls1 100 'outport=="lsp1" && ip4'
> ovn-nbctl: cannot specify priority without match
>
> $ ovn-nbctl acl-del ls1 to-lport 'outport=="lsp1" && ip4'
> ovn-nbctl: cannot specify priority without match
>
> This patch fixes the problem.
> $
> $ ovn-nbctl acl-del ls1 100 'outport=="lsp1" && ip4'


> Signed-off-by: Han Zhou 
>
Acked-by: Mark Michelson 

> ---
>  ovn/utilities/ovn-nbctl.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/ovn/utilities/ovn-nbctl.c b/ovn/utilities/ovn-nbctl.c
> index 46ede4e..8e5c1a4 100644
> --- a/ovn/utilities/ovn-nbctl.c
> +++ b/ovn/utilities/ovn-nbctl.c
> @@ -1469,10 +1469,6 @@ nbctl_acl_del(struct ctl_context *ctx)
>  const struct nbrec_logical_switch *ls;
>  ls = ls_by_name_or_uuid(ctx, ctx->argv[1], true);
>
> -if (ctx->argc != 2 && ctx->argc != 3 && ctx->argc != 5) {
> -ctl_fatal("cannot specify priority without match");
> -}
> -
>  if (ctx->argc == 2) {
>  /* If direction, priority, and match are not specified, delete
>   * all ACLs. */
> @@ -1503,6 +1499,10 @@ nbctl_acl_del(struct ctl_context *ctx)
>
>  int64_t priority = parse_priority(ctx->argv[3]);
>
> +if (ctx->argc == 4) {
> +ctl_fatal("cannot specify priority without match");
> +}
> +
>  /* Remove the matching rule. */
>  for (size_t i = 0; i < ls->n_acls; i++) {
>  struct nbrec_acl *acl = ls->acls[i];
> --
> 2.1.0
>
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH] ovn-controller: Fix empty address set parsing problem.

2017-09-11 Thread Ben Pfaff
On Sun, Sep 10, 2017 at 11:10:58AM -0700, Han Zhou wrote:
> On Sun, Sep 10, 2017 at 10:56 AM, Ben Pfaff  wrote:
> >
> > On Sat, Sep 09, 2017 at 10:58:25PM -0700, Han Zhou wrote:
> > > When an address set is empty, current implementation will generate
> > > an ovs flow that matches random things (and in most cases matching
> > > everything) due to a problem in expression parser of constant set.
> > > This patch fixes it by replacing the expression by a boolean false
> > > when the set is empty, and adds tests cases accordingly.
> > >
> > > Reported-by: Guru Shetty 
> > > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338441.html
> > > Signed-off-by: Han Zhou 
> > > ---
> > >  ovn/lib/expr.c   | 4 
> > >  tests/ovn.at | 9 +
> > >  tests/test-ovn.c | 2 ++
> > >  3 files changed, 15 insertions(+)
> > >
> > > diff --git a/ovn/lib/expr.c b/ovn/lib/expr.c
> > > index 060e9ee..6a731de 100644
> > > --- a/ovn/lib/expr.c
> > > +++ b/ovn/lib/expr.c
> > > @@ -612,6 +612,10 @@ make_cmp(struct expr_context *ctx,
> > >  }
> > >  }
> > >
> > > +if (!cs->n_values) {
> > > +e = expr_create_boolean(false);
> > > +goto exit;
> > > +}
> >
> > Thanks for working on this.  I agree with everyone that this is an
> > important fix.
> >
> > I think that it overlooks the behavior of !=.  Presumably, == and !=
> > should yield opposite results.  To get that behavior, we want:
> > e = expr_create_boolean(r == EXPR_R_NE);
> > instead of:
> > e = expr_create_boolean(false);
> >
> > Here's a v2 that includes this change plus tests for !=.  I added myself
> > as coauthor; I hope that's OK.
> >
> > Thanks,
> >
> > Ben.
> >
> > --8<--cut here-->8--
> >
> > From: Han Zhou 
> > Date: Sat, 9 Sep 2017 22:58:25 -0700
> > Subject: [PATCH] ovn-controller: Fix empty address set parsing problem.
> >
> > When an address set is empty, current implementation will generate
> > an ovs flow that matches random things (and in most cases matching
> > everything) due to a problem in expression parser of constant set.
> > This patch fixes it by replacing the expression by a boolean false
> > when the set is empty, and adds tests cases accordingly.
> >
> > Reported-by: Guru Shetty 
> > Reported-at:
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338441.html
> > Signed-off-by: Han Zhou 
> > Co-authored-by: Ben Pfaff 
> > Signed-off-by: Ben Pfaff 
> > ---
> >  ovn/lib/expr.c   |  4 
> >  tests/ovn.at | 32 
> >  tests/test-ovn.c |  2 ++
> >  3 files changed, 38 insertions(+)
> >
> > diff --git a/ovn/lib/expr.c b/ovn/lib/expr.c
> > index 060e9ee3d088..79ff45762f65 100644
> > --- a/ovn/lib/expr.c
> > +++ b/ovn/lib/expr.c
> > @@ -612,6 +612,10 @@ make_cmp(struct expr_context *ctx,
> >  }
> >  }
> >
> > +if (!cs->n_values) {
> > +e = expr_create_boolean(r == EXPR_R_NE);
> > +goto exit;
> > +}
> >  e = make_cmp__(f, r, >values[0]);
> >  for (size_t i = 1; i < cs->n_values; i++) {
> >  e = expr_combine(r == EXPR_R_EQ ? EXPR_T_OR : EXPR_T_AND,
> > diff --git a/tests/ovn.at b/tests/ovn.at
> > index bbce0b1b..f2035290f66e 100644
> > --- a/tests/ovn.at
> > +++ b/tests/ovn.at
> > @@ -623,6 +623,38 @@ dl_src=00:00:00:00:00:02
> >  dl_src=00:00:00:00:00:03
> >  dl_src=ba:be:be:ef:de:ad
> >  ])
> > +AT_CHECK([expr_to_flow 'ip4.src == {$set4}'], [0], [dnl
> > +(no flows)
> > +])
> > +AT_CHECK([expr_to_flow 'ip4.src == {1.2.3.4, $set4}'], [0], [dnl
> > +ip,nw_src=1.2.3.4
> > +])
> > +AT_CHECK([expr_to_flow 'ip4.src == 1.2.3.4 || ip4.src == {$set4}'], [0],
> [dnl
> > +ip,nw_src=1.2.3.4
> > +])
> > +AT_CHECK([expr_to_flow 'ip4.src != {$set4}'], [0], [dnl
> > +
> > +])
> > +AT_CHECK([expr_to_flow 'ip4.src != {1.0.0.0/8, $set4}'], [0], [dnl
> > +ip,nw_src=0.0.0.0/1.0.0.0
> > +ip,nw_src=128.0.0.0/1
> > +ip,nw_src=16.0.0.0/16.0.0.0
> > +ip,nw_src=2.0.0.0/2.0.0.0
> > +ip,nw_src=32.0.0.0/32.0.0.0
> > +ip,nw_src=4.0.0.0/4.0.0.0
> > +ip,nw_src=64.0.0.0/64.0.0.0
> > +ip,nw_src=8.0.0.0/8.0.0.0
> > +])
> > +AT_CHECK([expr_to_flow 'ip4.src != 1.0.0.0/8 && ip4.src != {$set4}'],
> [0], [dnl
> > +ip,nw_src=0.0.0.0/1.0.0.0
> > +ip,nw_src=128.0.0.0/1
> > +ip,nw_src=16.0.0.0/16.0.0.0
> > +ip,nw_src=2.0.0.0/2.0.0.0
> > +ip,nw_src=32.0.0.0/32.0.0.0
> > +ip,nw_src=4.0.0.0/4.0.0.0
> > +ip,nw_src=64.0.0.0/64.0.0.0
> > +ip,nw_src=8.0.0.0/8.0.0.0
> > +])
> >  AT_CLEANUP
> >
> >  AT_SETUP([ovn -- action parsing])
> > diff --git a/tests/test-ovn.c b/tests/test-ovn.c
> > index 694bc794a923..148ce122d385 100644
> > --- a/tests/test-ovn.c
> > +++ b/tests/test-ovn.c
> > @@ -202,10 +202,12 @@ create_addr_sets(struct shash *addr_sets)
> >  static const char *const addrs3[] = {
> >  "00:00:00:00:00:01", "00:00:00:00:00:02", 

Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the upcall

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 11:12 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the
> upcall
> 
> On Mon, Sep 11, 2017 at 09:49:52AM +, Chandran, Sugesh wrote:
> > > > > For the DPDK flow offload, which basically just binds a MARK
> > > > > action to a flow, the MARK is required to be used together with
> > > > > a QUEUE action for the most NICs I'm aware of. The QUEUE action
> > > > > then needs a queue index, which is not given in the flow content.
> > > > [Sugesh] Looks to me this is again another hardware specific req.
> > > > This could have impact on RSS hash distribution and queue
> > > > redistribution across pmds at runtime.
> > >
> > > If you have read my earlier version, you should have seen similar
> > > concerns from me.
> > [Sugesh] I feel this has to be addressed properly to make this feature work
> on all the cases.
> 
> Agreed.
> 
> > > > > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> > > > >   struct dp_packet *packet,
> > > > >   const struct netdev_flow_key *key,
> > > > >   struct ofpbuf *actions, struct ofpbuf 
> > > > > *put_actions,
> > > > > - int *lost_cnt, long long now)
> > > > > + int *lost_cnt, long long now, int rxq)
> > > > [Sugesh] IMHO its not really good practice to change the default
> > > > packet processing path for some specific hardware offload. Rxq
> > > > doesn't
> > > have any meaning for handling the packets in normal path.
> > >
> > > The same: some concerns I have already expressed before.
> > > Unfortunately, we didn't come up with something better.
> > >
> > > > Why cant install flow on all the configured queues for a specific 
> > > > inport?
> > > Flow handling is per port, not per queue.
> > > > This will assure the packets will have mark even after the rss
> > > > hash
> > > distribution.
> > >
> > > Like how? The QUEUE action only accepts one queue index. Setting it
> > > to many times will only let the last one take effect. The another
> > > possiblity I'm aware of is with the RTE_FLOW_ACTION_TYPE_RSS, which,
> > > unfortunately, is only supported by Mellanox in DPDK. Yet again, I
> > > was told it's not functional well.
> > [Sugesh] Hmm. I got it what you meant.   Flow director has an option called
> > passthrough. This will allow to use RSS hash on packet after the filter.
> > If I remember correctly this has been in supported in XL710 all the time.
> > This will allow to program the flow without any queue index.
> 
> Good to know. However, if you git grep it, you will see that only i40e and tap
> have this support.
[Sugesh] Yes, you are right. 
> 
> > If there is a
> > functionality issue to configure the MARK action properly, it has to
> > be fixed in DPDK than doing a workaround in OVS.
> 
> Again, can't agree more on this. But the truth/fact is that it's not an easy 
> task
> to fix DPDK. For Mellanox, I don't know exactly how many parts need to be
> fixed (something like DPDK PMD, libibvers, kernel, etc). For others, it might
> just be a hardware limitation.
> 
> It's even harder (if not impossible) to fix most (if not all) DPDK pmds has 
> rte
> flow support.
> 
> Even if we could do that, it may take years to finish that. At least, I see no
> related tasks from DPDK v17.11.
[Sugesh] Ok. IMO making changes in DPDK is cleaner and avoid lot of extra
work in OVS
> 
> > > Also, even it could work, I think it still would be probematic. I'm
> > > thinking what might happen for following case.
> > >
> > > Assume there is a 5-tuple flow. According to the initial RSS setting
> > > by OVS, all pkts match that flow would be ended up being recieved
> > > from one queue only. If we re-do RSS settings on it, if the RSS
> > > settings are the same, the behaviour might be the same. If not,
> > > those pkts which are supposed to be
> > [Sugesh] When a user changes number of queues, RSS setting might
> change.
> > Also in the current design, when a queue get pinned to different PMD
> > at run time, The mark details may loose as its on the PMD struct.
> > > distributed to one queue only might be distributed to many queues.
> > >
> > > Is it a valid concern?
> > [Sugesh] I feel there will be performance and scalability issues if we
> > wanted to program a flow for a queue ID. Hence I prefer to have flow
> programming without queue specific information.
> 
> True me, I also really want to get rid of the queue action, just if we have 
> good
> options.

> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow

2017-09-11 Thread Yuanhan Liu
On Mon, Sep 11, 2017 at 10:00:06AM +, Chandran, Sugesh wrote:
> > > In our implementation we have a feature discovery at the OVS init. It will
> > also populate the OVSDB to expose the device capability to higher
> > management layers. The new table introduced in OVSDB is like below.
> > 
> > The solution I want to go, however, is different though. I was thinking to
> > introduce few DPDK rte_flow APIs and structs to define the NIC flow
> > capabilities.
> [Sugesh] technically rte_flow is for flow programming and not for device 
> capabilities.

Not really. rte_flow is just a framework, it needs the underlaying NIC
to do the real thing. Each NIC has different limitations (we have seen
quite few of them). Thus, we need something like this.

If you think this way: device capabilities regarding to flow, it may
make more sense to you :)

> Again if DPDK can have such API in rte_flow. I think it should be fine.

Good! I will make a proposal to DPDK for v18.02 then.

> > > 4) AFAIK, these hardware programmability for a NIC/not for a specific 
> > > port.
> > i.e the FDIR/RSS hash configuration are device specific. This will be an 
> > issue if
> > a NIC shared between kernel and DPDK drivers?
> > 
> > That might be NIC specific. What do you mean by sharing between kernel
> > and DPDK? In most NICs I'm aware of, it's requried to unbind the kernel
> > driver first. Thus, it won't be shared. For Mellanox, the control unit is 
> > based
> > on queues, thus it could be shared correctly.
> [Sugesh] What  I meant by that is, consider a case of NIC with 4*10G ports.
> 2 ports bound to DPDK and 2 ports to kernel.

I see.

> If I remember correctly XL710 NIC can support total 8k exact match flow 
> entries in its
> Flow director.  Similarly some other resources are also shared across all the 
> ports in the NIC. 
> Now how these resources are properly managed between
> Kernel and DPDK.  

Honestly, I don't know. We may need testings/investigations.

>  I agree that Mellanox NICs this should be fine, but not sure if it work on 
> all
> the NICs out there. This will make adverse effect on each other when making 
> changes to any global configuration.

For sure I agree we should make it work on as many nics as possible.
And I think that's what this patchset trying to do.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH RFC] conntrack: Share conn info with EMC.

2017-09-11 Thread antonio . fischetti
When OvS-DPDK is set up as a firewall, the overall performance drops
significantly. This performance drop can be attributed to two reasons.

(1) Recirculation.
(2) Bottleneck inside CT module.

This RFC patch is an attempt to mitigate the bottleneck with CT module[2].
When a connection is established some of its relevant info is saved in
EMC and this info is used to track the incoming packets of the same
connection so that the CT module can be skipped for these packets.

The current implementation is done for UDP connections due to its stateless
nature. A UDP packet that is hitting an established connection will be sent
through the CT module only if timeout elapses. Otherwise it will skip the CT
module there by saving precious CPU cycles. Significant performance
improvement is observed with UDP connections.

Note that implementation is conditioned for UDP as skipping the CT
module breaks the TCP sequence checking logic inside 'tcp_conn_update'
for TCP connections.

We would like to receive feedback from the CT experts on the overall
approach before making more code changes. Also it would be nice
to know any corner-cases that would be affected with the proposed RFC.

CC: Darrell Ball 
CC: Joe Stringer 
Signed-off-by: Antonio Fischetti 
Co-authored-by: Bhanuprakash Bodireddy 
Signed-off-by: Bhanuprakash Bodireddy 

---
 - No testing is done with S-NAT, D-NAT cases.
 - The timeout value in CT_CONN_REFRESH_TIMEOUT_MS is an arbitrary value.
 - A further step could be to store some other infos, eg the hash
   values and take advantage of that when processing pkts of the
   same connection.
 - Below the performance jump we saw in our Firewall test-bench.

SETUP
=
table=0, priority=1 actions=drop
table=0, priority=10,arp actions=NORMAL
table=0, priority=100,ct_state=-trk,ip actions=ct(table=1)
table=1, ct_state=+new+trk,ip,in_port=dpdk0 actions=ct(commit),output:dpdk1
table=1, ct_state=+new+trk,ip,in_port=dpdk1 actions=drop
table=1, ct_state=+est+trk,ip,in_port=dpdk0 actions=output:dpdk1
table=1, ct_state=+est+trk,ip,in_port=dpdk1 actions=output:dpdk0

IXIA: UDP Flows, 64B pkts, bidirectional test.
PDM threads:  2

Code built with: CFLAGS="-O2 -march=native -g"

MEASUREMENTS

I measured packet Rx rate in Mpps (regardless of packet loss)
for each Rx side.

Orig means Commit ID:
b9fedfa61f000f49500973d2a51e99a80d9cf9b8

  Flows  | Orig [Mpps] | +patch [Mpps]
 +-+
  1  | 1.77, 1.72  |  5.97, 6.59
  10 | 2.35, 2.35  |  6.04, 6.28
  100| 2.51, 2.54  |  5.29, 5.31
  1000   | 2.11, 2.14  |  3.81, 3.83
  1  | 1.67, 1.70  |  2.01, 1.99
  10 | 1.49, 1.52  |  1.51, 1.54
---
 lib/conntrack.c   | 84 +++
 lib/dpif-netdev.c | 13 +++--
 lib/packets.h |  1 +
 3 files changed, 96 insertions(+), 2 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 419cb1d..b26fb68 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -53,6 +53,18 @@ struct conn_lookup_ctx {
 bool icmp_related;
 };
 
+struct emc_est_conn_info {
+long long conn_refresh; /* Dead-line to refresh connection. */
+uint8_t  ct_state;  /* Connection state. */
+uint32_t ct_mark;   /* Connection mark. */
+ovs_u128 ct_label;  /* Connection label. */
+ovs_be32 ipv4_src;
+ovs_be32 ipv4_dst;
+ovs_be16 src_port;
+ovs_be16 dst_port;
+uint8_t  ipv4_proto;
+};
+
 enum ftp_ctl_pkt {
 /* Control packets with address and/or port specifiers. */
 CT_FTP_CTL_INTEREST,
@@ -1123,6 +1135,46 @@ process_one(struct conntrack *ct, struct dp_packet *pkt,
 }
 }
 
+static inline void
+update_pkt_fields(struct dp_packet *packet,
+  struct emc_est_conn_info *conn_shared_info,
+  bool update_pkt_md)
+{
+if (update_pkt_md) {
+packet->md.ct_state = conn_shared_info->ct_state;
+packet->md.ct_mark = conn_shared_info->ct_mark;
+packet->md.ct_label = conn_shared_info->ct_label;
+packet->md.ct_orig_tuple.ipv4.ipv4_src =
+conn_shared_info->ipv4_src;
+packet->md.ct_orig_tuple.ipv4.ipv4_dst =
+conn_shared_info->ipv4_dst;
+packet->md.ct_orig_tuple.ipv4.src_port =
+conn_shared_info->src_port;
+packet->md.ct_orig_tuple.ipv4.dst_port =
+conn_shared_info->dst_port;
+packet->md.ct_orig_tuple.ipv4.ipv4_proto =
+conn_shared_info->ipv4_proto;
+} else {
+conn_shared_info->ct_state = packet->md.ct_state;
+conn_shared_info->ct_mark = packet->md.ct_mark;
+conn_shared_info->ct_label = packet->md.ct_label;
+conn_shared_info->ipv4_src =
+packet->md.ct_orig_tuple.ipv4.ipv4_src;
+  

Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with rte flow

2017-09-11 Thread Yuanhan Liu
On Mon, Sep 11, 2017 at 10:04:21AM +, Chandran, Sugesh wrote:
> 
> 
> Regards
> _Sugesh
> 
> 
> > -Original Message-
> > From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> > Sent: Monday, September 11, 2017 10:58 AM
> > To: Chandran, Sugesh 
> > Cc: d...@openvswitch.org
> > Subject: Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with
> > rte flow
> > 
> > On Sun, Sep 10, 2017 at 04:32:19PM +, Chandran, Sugesh wrote:
> > > As mentioned earlier in the cover letter we also have a similar
> > implementation to do the flow translate.
> > > I feel its good to make bit more modular for this translation.  The reason
> > being its easy to extend and add more protocols in the future.
> > 
> > Honestly, I don't see a strong need to make it that complex first. Yes, 
> > it's a big
> > function with a chunk of codes. And yes, I'm also a fun of splitting big
> > monsters to many small functions. However, if you look at it closer, you 
> > will
> > see it's nicely organized: the function is split nicely with chunks: 
> > something
> > like one chunk for one protocol.
> > 
> > Extending is also not that hard: just adding another chunk of the code.
> [Sugesh] Sure, I just wanted to share our proposal which works fine for our 
> usecase. :)

Thanks for sharing!

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the upcall

2017-09-11 Thread Yuanhan Liu
On Mon, Sep 11, 2017 at 09:49:52AM +, Chandran, Sugesh wrote:
> > > > For the DPDK flow offload, which basically just binds a MARK action
> > > > to a flow, the MARK is required to be used together with a QUEUE
> > > > action for the most NICs I'm aware of. The QUEUE action then needs a
> > > > queue index, which is not given in the flow content.
> > > [Sugesh] Looks to me this is again another hardware specific req.
> > > This could have impact on RSS hash distribution and queue
> > > redistribution across pmds at runtime.
> > 
> > If you have read my earlier version, you should have seen similar concerns
> > from me.
> [Sugesh] I feel this has to be addressed properly to make this feature work 
> on all the cases.

Agreed.

> > > > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> > > >   struct dp_packet *packet,
> > > >   const struct netdev_flow_key *key,
> > > >   struct ofpbuf *actions, struct ofpbuf 
> > > > *put_actions,
> > > > - int *lost_cnt, long long now)
> > > > + int *lost_cnt, long long now, int rxq)
> > > [Sugesh] IMHO its not really good practice to change the default
> > > packet processing path for some specific hardware offload. Rxq doesn't
> > have any meaning for handling the packets in normal path.
> > 
> > The same: some concerns I have already expressed before. Unfortunately,
> > we didn't come up with something better.
> > 
> > > Why cant install flow on all the configured queues for a specific inport?
> > Flow handling is per port, not per queue.
> > > This will assure the packets will have mark even after the rss hash
> > distribution.
> > 
> > Like how? The QUEUE action only accepts one queue index. Setting it to
> > many times will only let the last one take effect. The another possiblity 
> > I'm
> > aware of is with the RTE_FLOW_ACTION_TYPE_RSS, which, unfortunately, is
> > only supported by Mellanox in DPDK. Yet again, I was told it's not 
> > functional
> > well.
> [Sugesh] Hmm. I got it what you meant.   Flow director has an option called
> passthrough. This will allow to use RSS hash on packet after the filter.
> If I remember correctly this has been in supported in XL710 all the time.
> This will allow to program the flow without any queue index.

Good to know. However, if you git grep it, you will see that only i40e
and tap have this support.

> If there is a
> functionality issue to configure the MARK action properly, it has to be fixed 
> in
> DPDK than doing a workaround in OVS. 

Again, can't agree more on this. But the truth/fact is that it's not an
easy task to fix DPDK. For Mellanox, I don't know exactly how many parts
need to be fixed (something like DPDK PMD, libibvers, kernel, etc). For
others, it might just be a hardware limitation.

It's even harder (if not impossible) to fix most (if not all) DPDK pmds
has rte flow support.

Even if we could do that, it may take years to finish that. At least, 
I see no related tasks from DPDK v17.11.

> > Also, even it could work, I think it still would be probematic. I'm 
> > thinking what
> > might happen for following case.
> > 
> > Assume there is a 5-tuple flow. According to the initial RSS setting by 
> > OVS, all
> > pkts match that flow would be ended up being recieved from one queue
> > only. If we re-do RSS settings on it, if the RSS settings are the same, the
> > behaviour might be the same. If not, those pkts which are supposed to be
> [Sugesh] When a user changes number of queues, RSS setting might change.
> Also in the current design, when a queue get pinned to different PMD at run 
> time, 
> The mark details may loose as its on the PMD struct.
> > distributed to one queue only might be distributed to many queues.
> > 
> > Is it a valid concern?
> [Sugesh] I feel there will be performance and scalability issues if we wanted 
> to program
> a flow for a queue ID. Hence I prefer to have flow programming without queue 
> specific information.

True me, I also really want to get rid of the queue action, just if we have
good options.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with rte flow

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 10:58 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with
> rte flow
> 
> On Sun, Sep 10, 2017 at 04:32:19PM +, Chandran, Sugesh wrote:
> > As mentioned earlier in the cover letter we also have a similar
> implementation to do the flow translate.
> > I feel its good to make bit more modular for this translation.  The reason
> being its easy to extend and add more protocols in the future.
> 
> Honestly, I don't see a strong need to make it that complex first. Yes, it's 
> a big
> function with a chunk of codes. And yes, I'm also a fun of splitting big
> monsters to many small functions. However, if you look at it closer, you will
> see it's nicely organized: the function is split nicely with chunks: something
> like one chunk for one protocol.
> 
> Extending is also not that hard: just adding another chunk of the code.
[Sugesh] Sure, I just wanted to share our proposal which works fine for our 
usecase. :)
> 
> Besides, if you see tc offloads, they do it in a same way.
[Sugesh] yes. 
> 
> [...]
> > > +
> > > +/*
> > > + * Validate for later rte flow offload creation. If any unsupported
> > > + * flow are specified, return -1.
> > > + */
> > [Sugesh] I feel this is very hardware centric. There are chances
> > hardware can support
> > Ipv6 or other fields in a packet. This has to be based on what flow fields a
> hardware can support.
> 
> Yes, you are right. Again, we need capabilities feedback from DPDK.
[Sugesh] Ok.
> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 10:12 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow
> 
> On Sun, Sep 10, 2017 at 04:12:47PM +, Chandran, Sugesh wrote:
> > Hi Yuanhan,
> >
> > Thank you for sending out the patch series.
> 
> Hi Sugesh,
> 
> Thank you for taking your time to review it!
> 
> 
> > We are also looking into something similar to enable full offload in OVS-
> DPDK.
> 
> Good to know!
> 
> > It is based on ' http://dpdk.org/ml/archives/dev/2017-
> September/074746.html' and some other rte_flow extension in DPDK.
> 
> I saw the patches, I will take some time to read it.
[Sugesh]Sure.
> 
> >
> > It is noted that the patch series doesn't work very well for some of our
> requirements.
> > Please find below for the high level comments. I have also provided specific
> comments on the following patches.
> >
> > 1) Looks to me the patch series is to enable/use just one functionality of
> NIC(mark action). In a multiple hardware environment it is necessary to have
> a feature discovery mechanism to define what needs to be installed in the
> hardware based on its capabilities, for eg: MARK+QUEUE, MARK only,
> number of supported flow entries, supported flow fields and etc. This is very
> important to support different hardware NICs and make flow install easy.
> 
> Yes, you are right. I have also observed this issue while coding this patch.
[Sugesh] Ok.
> 
> > In our implementation we have a feature discovery at the OVS init. It will
> also populate the OVSDB to expose the device capability to higher
> management layers. The new table introduced in OVSDB is like below.
> 
> The solution I want to go, however, is different though. I was thinking to
> introduce few DPDK rte_flow APIs and structs to define the NIC flow
> capabilities.
[Sugesh] technically rte_flow is for flow programming and not for device 
capabilities.
Again if DPDK can have such API in rte_flow. I think it should be fine.
> 
> I think this would help long run, as the capabilitiy will be updated as the 
> new
> features are added (when new versions are released). For the solution you
> proposed, it won't allow DPDK work with multiple DPDK versions (assume
> they provides different rte flow capabilities).
> 
> >   
> > 
> >   Hardware switching configuration and capabilities.
> > 
> > 
> >   The name of hardware acceleration device.
> > 
> > 
> >   The integer device id of hardware accelerated NIC.
> > 
> >  
> >   The PCI ID of the hardware acceleration device. The broker id/PF id.
> >  
> >  
> >   The number of supported vhost ports in the hardware switch.
> >  
> >   
> >
> > The features column can be extended with more fields as necessary.
> > IMO the proposed partial offload doesn't need populating the OVSDB,
> however its necessary to have some kind of feature discovery at init.
> >
> > 2) I feel its better to keep the hardware offload functionalities in netdev 
> > as
> much as possible similar to kernel implementation. I see changes in upcall
> and dpif.
> 
> I agree with you. But unfortunately, due to some dirver or hardware
> limitation, that's what I can get best.
[Sugesh] Ok. 
> 
> > 3) The cost of flow install . PMDs are blocked when a hardware flow install
> is happening. Its an issue when there are lot of short lived flows are getting
> installed in the DP.
> 
> I wasn't aware of it. Thank you for letting me know that!
> 
[Sugesh] Ok
> > One option to handle this would be move the flow install into revalidate.
> The advantage of this approach would be hardware offload would happen
> only when a flow is being used at least for sometime. Something like how
> revalidator thread handle the flow modify operation.
> 
> Yes, it sounds workable. However, the MARK and QUEUE workaround won't
> work then: we need record the rxq first. And again, I know the workaround is
> far from being perfect.
> 
> > 4) AFAIK, these hardware programmability for a NIC/not for a specific port.
> i.e the FDIR/RSS hash configuration are device specific. This will be an 
> issue if
> a NIC shared between kernel and DPDK drivers?
> 
> That might be NIC specific. What do you mean by sharing between kernel
> and DPDK? In most NICs I'm aware of, it's requried to unbind the kernel
> driver first. Thus, it won't be shared. For Mellanox, the control unit is 
> based
> on queues, thus it could be shared correctly.
[Sugesh] What  I meant by that is, consider a case of NIC with 4*10G ports.
2 ports bound to DPDK and 2 ports to kernel.
If I remember correctly XL710 NIC can support total 8k exact match flow entries 
in its
Flow director.  Similarly some other resources are also shared across all the 
ports in the NIC. 
Now how these resources are properly managed between
Kernel and DPDK.  
 I 

Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with rte flow

2017-09-11 Thread Yuanhan Liu
On Sun, Sep 10, 2017 at 04:32:19PM +, Chandran, Sugesh wrote:
> As mentioned earlier in the cover letter we also have a similar 
> implementation to do the flow translate.
> I feel its good to make bit more modular for this translation.  The reason 
> being its easy to extend and add more protocols in the future.

Honestly, I don't see a strong need to make it that complex first. Yes,
it's a big function with a chunk of codes. And yes, I'm also a fun of
splitting big monsters to many small functions. However, if you look
at it closer, you will see it's nicely organized: the function is split
nicely with chunks: something like one chunk for one protocol.

Extending is also not that hard: just adding another chunk of the code.

Besides, if you see tc offloads, they do it in a same way.

[...]
> > +
> > +/*
> > + * Validate for later rte flow offload creation. If any unsupported
> > + * flow are specified, return -1.
> > + */
> [Sugesh] I feel this is very hardware centric. There are chances hardware can 
> support
> Ipv6 or other fields in a packet. This has to be based on what flow fields a 
> hardware can support.

Yes, you are right. Again, we need capabilities feedback from DPDK.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH] conntrack: Fix conn_type need be checked when remove rev_conn.

2017-09-11 Thread w00273186
From: Yunjian Wang 

The rev_conn need will be removed, only when conn_type is CT_CONN_TYPE_UN_NAT.
This crash will be triggered when remove conn in ct-clean thread.

Reported-by: Lili Huang 
Signed-off-by: Yunjian Wang 
---
 lib/conntrack.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/lib/conntrack.c b/lib/conntrack.c
index 419cb1d..c1adb56 100644
--- a/lib/conntrack.c
+++ b/lib/conntrack.c
@@ -684,9 +684,10 @@ nat_clean(struct conntrack *ct, struct conn *conn,
 
 /* In the unlikely event, rev conn was recreated, then skip
  * rev_conn cleanup. */
-if (rev_conn && (!nat_conn_key_node ||
- conn_key_cmp(_conn_key_node->value,
-  _conn->rev_key))) {
+if (rev_conn &&
+(rev_conn->conn_type == CT_CONN_TYPE_UN_NAT) &&
+(!nat_conn_key_node || conn_key_cmp(_conn_key_node->value,
+_conn->rev_key))) {
 hmap_remove(>buckets[bucket_rev_conn].connections,
 _conn->node);
 free(rev_conn);
-- 
1.8.3.1


___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the upcall

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 9:33 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the
> upcall
> 
> On Sun, Sep 10, 2017 at 04:40:10PM +, Chandran, Sugesh wrote:
> >
> >
> > Regards
> > _Sugesh
> >
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > > Sent: Tuesday, September 5, 2017 10:23 AM
> > > To: d...@openvswitch.org
> > > Subject: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id
> > > for the upcall
> > >
> > > From: Shachar Beiser 
> > >
> > > For the DPDK flow offload, which basically just binds a MARK action
> > > to a flow, the MARK is required to be used together with a QUEUE
> > > action for the most NICs I'm aware of. The QUEUE action then needs a
> > > queue index, which is not given in the flow content.
> > [Sugesh] Looks to me this is again another hardware specific req.
> > This could have impact on RSS hash distribution and queue
> > redistribution across pmds at runtime.
> 
> If you have read my earlier version, you should have seen similar concerns
> from me.
[Sugesh] I feel this has to be addressed properly to make this feature work on 
all the cases.

> 
> [...]
> > > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> > >   struct dp_packet *packet,
> > >   const struct netdev_flow_key *key,
> > >   struct ofpbuf *actions, struct ofpbuf *put_actions,
> > > - int *lost_cnt, long long now)
> > > + int *lost_cnt, long long now, int rxq)
> > [Sugesh] IMHO its not really good practice to change the default
> > packet processing path for some specific hardware offload. Rxq doesn't
> have any meaning for handling the packets in normal path.
> 
> The same: some concerns I have already expressed before. Unfortunately,
> we didn't come up with something better.
> 
> > Why cant install flow on all the configured queues for a specific inport?
> Flow handling is per port, not per queue.
> > This will assure the packets will have mark even after the rss hash
> distribution.
> 
> Like how? The QUEUE action only accepts one queue index. Setting it to
> many times will only let the last one take effect. The another possiblity I'm
> aware of is with the RTE_FLOW_ACTION_TYPE_RSS, which, unfortunately, is
> only supported by Mellanox in DPDK. Yet again, I was told it's not functional
> well.
[Sugesh] Hmm. I got it what you meant.   Flow director has an option called
passthrough. This will allow to use RSS hash on packet after the filter.
If I remember correctly this has been in supported in XL710 all the time.
This will allow to program the flow without any queue index. If there is a
functionality issue to configure the MARK action properly, it has to be fixed in
DPDK than doing a workaround in OVS. 
> 
> Also, even it could work, I think it still would be probematic. I'm thinking 
> what
> might happen for following case.
> 
> Assume there is a 5-tuple flow. According to the initial RSS setting by OVS, 
> all
> pkts match that flow would be ended up being recieved from one queue
> only. If we re-do RSS settings on it, if the RSS settings are the same, the
> behaviour might be the same. If not, those pkts which are supposed to be
[Sugesh] When a user changes number of queues, RSS setting might change.
Also in the current design, when a queue get pinned to different PMD at run 
time, 
The mark details may loose as its on the PMD struct.
> distributed to one queue only might be distributed to many queues.
> 
> Is it a valid concern?
[Sugesh] I feel there will be performance and scalability issues if we wanted 
to program
a flow for a queue ID. Hence I prefer to have flow programming without queue 
specific information.
> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/8] dpif-netdev: associate flow with a mark id

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 9:04 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 1/8] dpif-netdev: associate flow with a
> mark id
> 
> On Sun, Sep 10, 2017 at 04:14:01PM +, Chandran, Sugesh wrote:
> > > This patch associate a flow with a mark id (a uint32_t number) by CMAP.
> > >
> > > It re-uses the flow API (more precisely, the ->flow_put method) to
> > > setup the hw flow. The flow_put implementation then is supposed to
> > > create a flow with MARK action.
> > >
> > > Co-authored-by: Finn Christensen 
> > > Signed-off-by: Yuanhan Liu 
> > > Signed-off-by: Finn Christensen 
> > > ---
> > >  lib/dpif-netdev.c | 85
> > > +++
> > >  lib/netdev.h  |  6 
> > >  2 files changed, 91 insertions(+)
> >
> > [snip]
> > >   * Any thread trying to keep a rule from being freed should
> > > hold its own @@ -569,6 +575,8 @@ struct dp_netdev_pmd_thread {
> > >  struct ovs_mutex flow_mutex;
> > >  struct cmap flow_table OVS_GUARDED; /* Flow table. */
> > >
> > > +struct cmap mark_to_flow;
> > [Sugesh] I would prefer to have this cmap per port than a pmd. The
> > reason being it offers better Performance due to less lookup. Anyway
> > the flow programming is on netdev, hence its good to keep it netdev
> centric than pmd. What do you think?
> 
> I think it might be workable.
[Sugesh] ok
> 
> > >  /* One classifier per in_port polled by the pmd */
> > >  struct cmap classifiers;
> > >  /* Periodically sort subtable vectors according to hit
> > > frequencies */ @@ -
> > > 1839,6 +1847,8 @@ dp_netdev_pmd_remove_flow(struct
> > > dp_netdev_pmd_thread *pmd,
> > >  OVS_REQUIRES(pmd->flow_mutex)
> > >  {
> > >  struct cmap_node *node = CONST_CAST(struct cmap_node *, 
> > > >node);
> > > +struct cmap_node *mark_node = CONST_CAST(struct cmap_node *,
> > > + >mark_node);
> > >  struct dpcls *cls;
> > >  odp_port_t in_port = flow->flow.in_port.odp_port;
> > >
> > > @@ -1846,6 +1856,10 @@ dp_netdev_pmd_remove_flow(struct
> > > dp_netdev_pmd_thread *pmd,
> > >  ovs_assert(cls != NULL);
> > >  dpcls_remove(cls, >cr);
> > >  cmap_remove(>flow_table, node,
> dp_netdev_flow_hash(
> > > >ufid));
> > > +if (flow->has_mark) {
> > [Sugesh] Any reason these are not mutex protected? As far as I know
> > the flow remove is handled in revalidator thread. Looks like this is a
> > critical section. So the install and delete should be protected.
> 
> I think you are right.
[Sugesh] ok. Thanks.
> 
> 
> [...]
> > > +if (netdev_is_flow_api_enabled() &&
> > > +dp_netdev_alloc_flow_mark(pmd, _mark)) {
> > > +struct dp_netdev_port *port;
> > > +port = dp_netdev_lookup_port(pmd->dp, in_port);
> > > +if (netdev_flow_put(port->netdev, match,
> > > +CONST_CAST(struct nlattr *, actions),
> > > +actions_len, ufid, , NULL) == 0) {
> > [Sugesh] Cost of flow install is a concern here. It is vary for device to 
> > device.
> > Have you done any analysis on the cost of rte_flow translate + hardware
> flow install?
> > It would be nice to mention it in the cover letter.
> 
> I haven't measured it yet: it's not my current focus. It's not in the datapath
> after all. Besides, it might not be a trivial task to optimize it.
[Sugesh] TBH I feel this is very important aspect to decide if it make any sense
to install the flow or not. What do you meant by not in datapath? The PMD is
blocked for all time until the flow install is complete.? For a millions of 
short lived flow
use case, this is very important to know.
Let me know if I have missed anything.


> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 8:56 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action
> 
> On Mon, Sep 11, 2017 at 07:42:19AM +, Chandran, Sugesh wrote:
> >
> >
> > Regards
> > _Sugesh
> >
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > > Sent: Tuesday, September 5, 2017 10:23 AM
> > > To: d...@openvswitch.org
> > > Subject: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue
> > > action
> > >
> > > From: Finn Christensen 
> > >
> > > AFAIK, most (if not all) NICs (including Mellanox and Intel) do not
> > > support a pure MARK action.  It's required to be used together with
> > > some other actions, like QUEUE.
> > >
> > > To workaround it, retry with a queue action when first try failed.
> > [Sugesh] This means a NIC that doesn't support pure MARK, will do
> > every flow install twice no matter how many times the flow install failed
> before for the same reason.
> > The retry can be avoided if we know what a port/NIC can support. It
> > leads to a need of feature discovery of a NIC at the init time.
> 
> I believe the reply I made to Simon answered all questions you asked above.
[Sugesh] Thanks!

> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 2/8] dpif-netdev: retrieve flow directly from the flow mark

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 7:36 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 2/8] dpif-netdev: retrieve flow directly
> from the flow mark
> 
> On Sun, Sep 10, 2017 at 04:15:42PM +, Chandran, Sugesh wrote:
> > >  atomic_read_relaxed(>dp->emc_insert_min, _min);
> > >
> > >  DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, packets_) {
> > >  struct dp_netdev_flow *flow;
> > > +uint32_t flow_mark;
> > >
> > >  if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
> > >  dp_packet_delete(packet); @@ -4972,6 +4974,16 @@
> > > emc_processing(struct dp_netdev_pmd_thread *pmd,
> > >  continue;
> > >  }
> > >
> > > +if (dp_packet_has_flow_mark(packet, _mark)) {
> > > +flow = dp_netdev_pmd_find_flow_by_mark(pmd, flow_mark);
> > > +if (flow) {
> > [Sugesh] If the NIC/hardware can match on tcp_flags then this parsing can
> be avoided?
> > Is that true?
> 
> Maybe. If so, we could get better performance, as I have showed in v1.
[Sugesh] Then another probing feature to see if this parsing needs to be 
done/not.?
> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: Yuanhan Liu [mailto:y...@fridaylinux.org]
> Sent: Monday, September 11, 2017 8:56 AM
> To: Chandran, Sugesh 
> Cc: d...@openvswitch.org
> Subject: Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action
> 
> On Mon, Sep 11, 2017 at 07:42:19AM +, Chandran, Sugesh wrote:
> >
> >
> > Regards
> > _Sugesh
> >
> >
> > > -Original Message-
> > > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > > Sent: Tuesday, September 5, 2017 10:23 AM
> > > To: d...@openvswitch.org
> > > Subject: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue
> > > action
> > >
> > > From: Finn Christensen 
> > >
> > > AFAIK, most (if not all) NICs (including Mellanox and Intel) do not
> > > support a pure MARK action.  It's required to be used together with
> > > some other actions, like QUEUE.
> > >
> > > To workaround it, retry with a queue action when first try failed.
> > [Sugesh] This means a NIC that doesn't support pure MARK, will do
> > every flow install twice no matter how many times the flow install failed
> before for the same reason.
> > The retry can be avoided if we know what a port/NIC can support. It
> > leads to a need of feature discovery of a NIC at the init time.
> 
> I believe the reply I made to Simon answered all questions you asked above.
[Sugesh] Apologies , as I didn't see some of comments and its replies.
Thank you Yuanhan. Hope you will be working on some probing mechanism.
I am interested to know how its going to looks like
> 
>   --yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for reused dp_packets

2017-09-11 Thread Zoltán Balogh
Hi Darrell,

Thank you for the clarification! I removed the line below from the source. I 
sent v5 to the dev list:
https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338567.html
https://patchwork.ozlabs.org/patch/812245/

Best regards,
Zoltan


> -Original Message-
> From: Darrell Ball [mailto:db...@vmware.com]
> Sent: Friday, September 08, 2017 6:24 PM
> To: Zoltán Balogh ; 'd...@openvswitch.org' 
> 
> Subject: Re: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for reused 
> dp_packets
> 
> Hi Zoltan
> 
> The packet_type initialization that I was referring to as becoming redundant 
> now was the one moved from to
> 
> void
> dp_packet_init_dpdk(struct dp_packet *b, size_t allocated)
> {
> dp_packet_set_allocated(b, allocated);
> b->source = DPBUF_DPDK;
> b->packet_type = htonl(PT_ETH);   
> }
> 
> Essentially , it was ‘split out’ from dp_packet_init__ for dpdk.
> 
> Since the new packet_type was not being reset as would be needed if the 
> buffer pool only init is done,
> we are now initing all packets for packet_type with this patch, hence the 
> above line for dpdk buffer
> pool init becomes redundant.
> 
> Thanks Darrell
> 
> 
> On 9/8/17, 3:01 AM, "Zoltán Balogh"  wrote:
> 
> Hi,
> 
> Please ignore this patch. It's faulty. We cannot remove initialization
> of packet_type from dp_packet_init__(). Beacause of non-dpdk netdevs
> still need it. Here is a link to the corrected patch on patchwork:
> https://urldefense.proofpoint.com/v2/url?u=https-
> 3A__patchwork.ozlabs.org_patch_811445_=DwIFAw=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-
> uEs=dGZmbKhBG9tJHY4odedsGA=8TOx50_GxR8BEr45vTRIK07hVU3zdUM74V8mfYZ71Pc=7M6jjf8mR02x3pm7OqQKPVuq0RoVq7odJtobax0
> JYJw=
> 
> Best regards,
> Zoltan
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org 
> [mailto:ovs-dev-boun...@openvswitch.org] On Behalf Of Zoltán Balogh
> > Sent: Friday, September 08, 2017 10:43 AM
> > To: 'd...@openvswitch.org' 
> > Subject: [ovs-dev] [PATCH v2] netdev-dpdk: reset packet_type for reused 
> dp_packets
> >
> > DPDK uses dp-packet pool for storing received packets. The pool is
> > reused by rxq_recv funcions of the DPDK netdevs. The datapath is
> > capable to modify the packet_type property of packets. For instance
> > when encapsulated L3 packets are received on a ptap gre port.
> > In this case the packet_type property of struct dp_packet can be
> > modified and later the same dp_packet with the modified packet_type
> > can be reused in the rxq_rec function, so it can contain corrupted
> > data.
> >
> > The dp_packet_batch_init_cutlen() in the rxq_recv functions iterates
> > over dp_packets and sets their cutlen. So I modified this function
> > to set packet_type to Ethernet for the dp_packets as well. I also
> > renamed this function because of the added functionality.
> >
> > The dp_packet_batch_init_cutlen() iterates over batch->count dp_packet.
> > Therefore setting of batch->count = nb_rx needs to be done before the
> > former function is invoked. This is an additional fix.
> >
> > This commit also removes initialization of packet_type from
> > dp_packet_init__(), since that was moved to dp_packet_init_dpdk() 
> before.
> >
> > Signed-off-by: Zoltan Balogh 
> > Signed-off-by: Laszlo Suru 
> > Co-authored-by: Laszlo Suru 
> > CC: Jan Scheurich 
> > CC: Sugesh Chandran 
> > CC: Darrell Ball 
> > ---
> >  lib/dp-packet.c   | 2 --
> >  lib/dp-packet.h   | 3 ++-
> >  lib/netdev-dpdk.c | 7 ---
> >  3 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/lib/dp-packet.c b/lib/dp-packet.c
> > index e5d16a6..21dfe39 100644
> > --- a/lib/dp-packet.c
> > +++ b/lib/dp-packet.c
> > @@ -33,8 +33,6 @@ dp_packet_init__(struct dp_packet *b, size_t 
> allocated, enum dp_packet_source so
> >  dp_packet_rss_invalidate(b);
> >  dp_packet_mbuf_init(b);
> >  dp_packet_reset_cutlen(b);
> > -/* By default assume the packet type to be Ethernet. */
> > -b->packet_type = htonl(PT_ETH);
> >  }
> >
> >  static void
> > diff --git a/lib/dp-packet.h b/lib/dp-packet.h
> > index 046f3ab..b4b721c 100644
> > --- a/lib/dp-packet.h
> > +++ b/lib/dp-packet.h
> > @@ -805,12 +805,13 @@ dp_packet_delete_batch(struct dp_packet_batch 
> *batch, bool may_steal)
> >  }
> >
> >  static inline void
> > -dp_packet_batch_init_cutlen(struct dp_packet_batch *batch)
> > +dp_packet_batch_init_packet_fields(struct dp_packet_batch *batch)
> >  

Re: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the upcall

2017-09-11 Thread Yuanhan Liu
On Sun, Sep 10, 2017 at 04:40:10PM +, Chandran, Sugesh wrote:
> 
> 
> Regards
> _Sugesh
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > Sent: Tuesday, September 5, 2017 10:23 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v2 5/8] dpif-netdev: record rx queue id for the
> > upcall
> > 
> > From: Shachar Beiser 
> > 
> > For the DPDK flow offload, which basically just binds a MARK action to a 
> > flow,
> > the MARK is required to be used together with a QUEUE action for the most
> > NICs I'm aware of. The QUEUE action then needs a queue index, which is not
> > given in the flow content.
> [Sugesh] Looks to me this is again another hardware specific req.
> This could have impact on RSS hash distribution and queue redistribution 
> across pmds
> at runtime.

If you have read my earlier version, you should have seen similar concerns
from me.

[...]
> > handle_packet_upcall(struct dp_netdev_pmd_thread *pmd,
> >   struct dp_packet *packet,
> >   const struct netdev_flow_key *key,
> >   struct ofpbuf *actions, struct ofpbuf *put_actions,
> > - int *lost_cnt, long long now)
> > + int *lost_cnt, long long now, int rxq)
> [Sugesh] IMHO its not really good practice to change the default packet 
> processing path for
> some specific hardware offload. Rxq doesn't have any meaning for handling the 
> packets in normal path.

The same: some concerns I have already expressed before. Unfortunately,
we didn't come up with something better.

> Why cant install flow on all the configured queues for a specific inport? 
> Flow handling is per port, not per queue.
> This will assure the packets will have mark even after the rss hash 
> distribution. 

Like how? The QUEUE action only accepts one queue index. Setting it to
many times will only let the last one take effect. The another possiblity
I'm aware of is with the RTE_FLOW_ACTION_TYPE_RSS, which, unfortunately,
is only supported by Mellanox in DPDK. Yet again, I was told it's not
functional well.

Also, even it could work, I think it still would be probematic. I'm
thinking what might happen for following case.

Assume there is a 5-tuple flow. According to the initial RSS setting by
OVS, all pkts match that flow would be ended up being recieved from one
queue only. If we re-do RSS settings on it, if the RSS settings are the
same, the behaviour might be the same. If not, those pkts which are
supposed to be distributed to one queue only might be distributed to
many queues.

Is it a valid concern?

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for physical port

2017-09-11 Thread O Mahony, Billy
Hi Wang,

I believe that the PMD stats processing cycles includes EMC processing time. 

This is just in the context of your results being surprising. It could be a 
factor if you are using code where the bug exists. The patch carries a fixes: 
tag (I think) that should help you figure out if your results were potentially 
affected by this issue.

Regards,
/Billy. 

> -Original Message-
> From: 王志克 [mailto:wangzh...@jd.com]
> Sent: Monday, September 11, 2017 3:00 AM
> To: O Mahony, Billy ; ovs-
> d...@openvswitch.org; Jan Scheurich ; Darrell
> Ball ; ovs-disc...@openvswitch.org; Kevin Traynor
> 
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Billy,
> 
> In my test, almost all traffic went trough via EMC. So the fix does not impact
> the result, especially we want to know the difference (not the exact num).
> 
> Can you test to get some data? Thanks.
> 
> Br,
> Wang Zhike
> 
> -Original Message-
> From: O Mahony, Billy [mailto:billy.o.mah...@intel.com]
> Sent: Friday, September 08, 2017 11:18 PM
> To: 王志克; ovs-dev@openvswitch.org; Jan Scheurich; Darrell Ball; ovs-
> disc...@openvswitch.org; Kevin Traynor
> Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> physical port
> 
> Hi Wang,
> 
> https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337309.html
> 
> I see it's been acked and is due to be pushed to master with other changes
> on the dpdk merge branch so you'll have to apply it manually for now.
> 
> /Billy.
> 
> > -Original Message-
> > From: 王志克 [mailto:wangzh...@jd.com]
> > Sent: Friday, September 8, 2017 11:48 AM
> > To: ovs-dev@openvswitch.org; Jan Scheurich
> > ; O Mahony, Billy
> > ; Darrell Ball ; ovs-
> > disc...@openvswitch.org; Kevin Traynor 
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> > Hi Billy,
> >
> > I used ovs2.7.0. I searched the git log, and not sure which commit it
> > is. Do you happen to know?
> >
> > Yes, I cleared the stats after traffic run.
> >
> > Br,
> > Wang Zhike
> >
> >
> > From: "O Mahony, Billy" 
> > To: "wangzh...@jd.com" , Jan Scheurich
> > , Darrell Ball ,
> > "ovs-disc...@openvswitch.org" ,
> > "ovs-dev@openvswitch.org" , Kevin
> Traynor
> > 
> > Subject: Re: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> > Message-ID:
> > <03135aea779d444e90975c2703f148dc58c19...@irsmsx107.ger.c
> > orp.intel.com>
> >
> > Content-Type: text/plain; charset="utf-8"
> >
> > Hi Wang,
> >
> > Thanks for the figures. Unexpected results as you say. Two things come
> > to
> > mind:
> >
> > I?m not sure what code you are using but the cycles per packet
> > statistic was broken for a while recently. Ilya posted a patch to fix
> > it so make sure you have that patch included.
> >
> > Also remember to reset the pmd stats after you start your traffic and
> > then measure after a short duration.
> >
> > Regards,
> > Billy.
> >
> >
> >
> > From: ??? [mailto:wangzh...@jd.com]
> > Sent: Friday, September 8, 2017 8:01 AM
> > To: Jan Scheurich ; O Mahony, Billy
> > ; Darrell Ball ; ovs-
> > disc...@openvswitch.org; ovs-dev@openvswitch.org; Kevin Traynor
> > 
> > Subject: RE: [ovs-dev] OVS DPDK NUMA pmd assignment question for
> > physical port
> >
> >
> > Hi All,
> >
> >
> >
> > I tested below cases, and get some performance data. The data shows
> > there is little impact for cross NUMA communication, which is
> > different from my expectation. (Previously I mentioned that cross NUMA
> > would add 60% cycles, but I can NOT reproduce it any more).
> >
> >
> >
> > @Jan,
> >
> > You mentioned cross NUMA communication would cost lots more cycles.
> > Can you share your data? I am not sure whether I made some mistake or
> not.
> >
> >
> >
> > @All,
> >
> > Welcome your data if you have data for similar cases. Thanks.
> >
> >
> >
> > Case1: VM0->PMD0->NIC0
> >
> > Case2:VM1->PMD1->NIC0
> >
> > Case3:VM1->PMD0->NIC0
> >
> > Case4:NIC0->PMD0->VM0
> >
> > Case5:NIC0->PMD1->VM1
> >
> > Case6:NIC0->PMD0->VM1
> >
> >
> >
> > ? VM Tx Mpps  Host Tx Mpps  avg cycles per packet   avg processing
> > cycles per packet
> >
> > Case1 1.4   1.4 512 
> > 415
> >
> > Case2 1.3   1.3 537 
> > 436
> >
> > Case3 1.351.35   514 390
> >
> >
> >
> > ?  VM Rx MppsHost Rx Mpps  avg cycles per packet   avg processing
> cycles
> > per 

Re: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config

2017-09-11 Thread Finn Christensen
-Original Message-
From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-boun...@openvswitch.org] 
On Behalf Of Yuanhan Liu
Sent: 11. september 2017 09:55
To: Chandran, Sugesh 
Cc: d...@openvswitch.org
Subject: Re: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config

On Mon, Sep 11, 2017 at 07:42:57AM +, Chandran, Sugesh wrote:
> 
> 
> Regards
> _Sugesh
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev- 
> > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > Sent: Tuesday, September 5, 2017 10:23 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config
> > 
> > From: Finn Christensen 
> > 
> > The Intel i40e PMD driver requires the fdir mode set to 
> > RTE_FDIR_MODE_PERFECT, otherwise, the flow creation would be failed.
> [Sugesh] this means it doesn't honor the flow masks which passed onto 
> rte_flow_*?

IIRC, that's what I found after divig the code. It's an issue reported/fixed by 
Finn. I also don't have the nic for testing.

[Finn] Yes, this was needed to make our test setup using an XL710 work, with 
the rte_flow implementation.
It's a while ago so I don't exactly remember how we ended up with this 
solution. However, we are definitely not
Intel XL710 experts, so there might be other ways to achieve the rte_flow 
functionality.
This issue, and problem raised about the overall change in configuration impact 
on NICs using this setting (Napatech 
does not use it), I think should be reviewed/verified by NIC vendors using it.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 1/8] dpif-netdev: associate flow with a mark id

2017-09-11 Thread Yuanhan Liu
On Sun, Sep 10, 2017 at 04:14:01PM +, Chandran, Sugesh wrote:
> > This patch associate a flow with a mark id (a uint32_t number) by CMAP.
> > 
> > It re-uses the flow API (more precisely, the ->flow_put method) to setup the
> > hw flow. The flow_put implementation then is supposed to create a flow
> > with MARK action.
> > 
> > Co-authored-by: Finn Christensen 
> > Signed-off-by: Yuanhan Liu 
> > Signed-off-by: Finn Christensen 
> > ---
> >  lib/dpif-netdev.c | 85
> > +++
> >  lib/netdev.h  |  6 
> >  2 files changed, 91 insertions(+)
> 
> [snip]
> >   * Any thread trying to keep a rule from being freed should hold its 
> > own
> > @@ -569,6 +575,8 @@ struct dp_netdev_pmd_thread {
> >  struct ovs_mutex flow_mutex;
> >  struct cmap flow_table OVS_GUARDED; /* Flow table. */
> > 
> > +struct cmap mark_to_flow;
> [Sugesh] I would prefer to have this cmap per port than a pmd. The reason 
> being it offers better
> Performance due to less lookup. Anyway the flow programming is on netdev, 
> hence its good to
> keep it netdev centric than pmd. What do you think?

I think it might be workable.

> >  /* One classifier per in_port polled by the pmd */
> >  struct cmap classifiers;
> >  /* Periodically sort subtable vectors according to hit frequencies */ 
> > @@ -
> > 1839,6 +1847,8 @@ dp_netdev_pmd_remove_flow(struct
> > dp_netdev_pmd_thread *pmd,
> >  OVS_REQUIRES(pmd->flow_mutex)
> >  {
> >  struct cmap_node *node = CONST_CAST(struct cmap_node *, 
> > >node);
> > +struct cmap_node *mark_node = CONST_CAST(struct cmap_node *,
> > + >mark_node);
> >  struct dpcls *cls;
> >  odp_port_t in_port = flow->flow.in_port.odp_port;
> > 
> > @@ -1846,6 +1856,10 @@ dp_netdev_pmd_remove_flow(struct
> > dp_netdev_pmd_thread *pmd,
> >  ovs_assert(cls != NULL);
> >  dpcls_remove(cls, >cr);
> >  cmap_remove(>flow_table, node, dp_netdev_flow_hash(
> > >ufid));
> > +if (flow->has_mark) {
> [Sugesh] Any reason these are not mutex protected? As far as I know the flow 
> remove
> is handled in revalidator thread. Looks like this is a critical section. So 
> the install and delete
> should be protected.

I think you are right.


[...]
> > +if (netdev_is_flow_api_enabled() &&
> > +dp_netdev_alloc_flow_mark(pmd, _mark)) {
> > +struct dp_netdev_port *port;
> > +port = dp_netdev_lookup_port(pmd->dp, in_port);
> > +if (netdev_flow_put(port->netdev, match,
> > +CONST_CAST(struct nlattr *, actions),
> > +actions_len, ufid, , NULL) == 0) {
> [Sugesh] Cost of flow install is a concern here. It is vary for device to 
> device.
> Have you done any analysis on the cost of rte_flow translate + hardware flow 
> install?
> It would be nice to mention it in the cover letter. 

I haven't measured it yet: it's not my current focus. It's not in the
datapath after all. Besides, it might not be a trivial task to optimize
it.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action

2017-09-11 Thread Yuanhan Liu
On Mon, Sep 11, 2017 at 07:42:19AM +, Chandran, Sugesh wrote:
> 
> 
> Regards
> _Sugesh
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > Sent: Tuesday, September 5, 2017 10:23 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action
> > 
> > From: Finn Christensen 
> > 
> > AFAIK, most (if not all) NICs (including Mellanox and Intel) do not support 
> > a
> > pure MARK action.  It's required to be used together with some other
> > actions, like QUEUE.
> > 
> > To workaround it, retry with a queue action when first try failed.
> [Sugesh] This means a NIC that doesn't support pure MARK, will
> do every flow install twice no matter how many times the flow install failed 
> before for the same reason. 
> The retry can be avoided if we know what a port/NIC can support. It leads to 
> a need of feature discovery of a NIC
> at the init time.

I believe the reply I made to Simon answered all questions you asked above.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


[ovs-dev] [PATCH v5] netdev-dpdk: reset packet_type for reused dp_packets

2017-09-11 Thread Zoltán Balogh
DPDK uses dp-packet pool for storing received packets. The pool is
reused by rxq_recv funcions of the DPDK netdevs. The datapath is
capable to modify the packet_type property of packets. For instance
when encapsulated L3 packets are received on a ptap gre port.
In this case the packet_type property of struct dp_packet can be
modified and later the same dp_packet with the modified packet_type
can be reused in the rxq_rec function, so it can contain corrupted
data.

The dp_packet_batch_init_cutlen() in the rxq_recv functions iterates
over dp_packets and sets their cutlen. So I modified this function
to set packet_type to Ethernet for the dp_packets as well. I also
renamed this function because of the added functionality.

The dp_packet_batch_init_cutlen() iterates over batch->count dp_packet.
Therefore setting of batch->count = nb_rx needs to be done before the
former function is invoked. This is an additional fix.

Signed-off-by: Zoltan Balogh 
Signed-off-by: Laszlo Suru 
Co-authored-by: Laszlo Suru 
CC: Jan Scheurich 
CC: Sugesh Chandran 
CC: Darrell Ball 
---
 lib/dp-packet.c   | 1 -
 lib/dp-packet.h   | 3 ++-
 lib/netdev-dpdk.c | 7 ---
 3 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/lib/dp-packet.c b/lib/dp-packet.c
index e5d16a6..443c225 100644
--- a/lib/dp-packet.c
+++ b/lib/dp-packet.c
@@ -103,7 +103,6 @@ dp_packet_init_dpdk(struct dp_packet *b, size_t allocated)
 {
 dp_packet_set_allocated(b, allocated);
 b->source = DPBUF_DPDK;
-b->packet_type = htonl(PT_ETH);
 }
 
 /* Initializes 'b' as an empty dp_packet with an initial capacity of 'size'
diff --git a/lib/dp-packet.h b/lib/dp-packet.h
index 046f3ab..b4b721c 100644
--- a/lib/dp-packet.h
+++ b/lib/dp-packet.h
@@ -805,12 +805,13 @@ dp_packet_delete_batch(struct dp_packet_batch *batch, 
bool may_steal)
 }
 
 static inline void
-dp_packet_batch_init_cutlen(struct dp_packet_batch *batch)
+dp_packet_batch_init_packet_fields(struct dp_packet_batch *batch)
 {
 struct dp_packet *packet;
 
 DP_PACKET_BATCH_FOR_EACH (packet, batch) {
 dp_packet_reset_cutlen(packet);
+packet->packet_type = htonl(PT_ETH);
 }
 }
 
diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c
index f58e9be..b9a 100644
--- a/lib/netdev-dpdk.c
+++ b/lib/netdev-dpdk.c
@@ -1644,8 +1644,9 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq,
  nb_rx, dropped);
 rte_spinlock_unlock(>stats_lock);
 
-dp_packet_batch_init_cutlen(batch);
-batch->count = (int) nb_rx;
+batch->count = nb_rx;
+dp_packet_batch_init_packet_fields(batch);
+
 return 0;
 }
 
@@ -1684,8 +1685,8 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct 
dp_packet_batch *batch)
 rte_spinlock_unlock(>stats_lock);
 }
 
-dp_packet_batch_init_cutlen(batch);
 batch->count = nb_rx;
+dp_packet_batch_init_packet_fields(batch);
 
 return 0;
 }
-- 
1.9.1

___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config

2017-09-11 Thread Yuanhan Liu
On Mon, Sep 11, 2017 at 07:42:57AM +, Chandran, Sugesh wrote:
> 
> 
> Regards
> _Sugesh
> 
> 
> > -Original Message-
> > From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> > boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> > Sent: Tuesday, September 5, 2017 10:23 AM
> > To: d...@openvswitch.org
> > Subject: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config
> > 
> > From: Finn Christensen 
> > 
> > The Intel i40e PMD driver requires the fdir mode set to
> > RTE_FDIR_MODE_PERFECT, otherwise, the flow creation would be failed.
> [Sugesh] this means it doesn't honor the flow masks which passed onto 
> rte_flow_*?

IIRC, that's what I found after divig the code. It's an issue reported/fixed
by Finn. I also don't have the nic for testing.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> Sent: Tuesday, September 5, 2017 10:23 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config
> 
> From: Finn Christensen 
> 
> The Intel i40e PMD driver requires the fdir mode set to
> RTE_FDIR_MODE_PERFECT, otherwise, the flow creation would be failed.
[Sugesh] this means it doesn't honor the flow masks which passed onto 
rte_flow_*?
> 
> Co-authored-by: Yuanhan Liu 
> Signed-off-by: Finn Christensen 
> Signed-off-by: Yuanhan Liu 
> ---
> ---
>  lib/netdev-dpdk.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index e363c92..beb7b68
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -177,6 +177,9 @@ static const struct rte_eth_conf port_conf = {
>  .txmode = {
>  .mq_mode = ETH_MQ_TX_NONE,
>  },
> +.fdir_conf = {
> +.mode = RTE_FDIR_MODE_PERFECT,
> +},
>  };
> 
>  /*
> --
> 2.7.4
> 
> ___
> dev mailing list
> d...@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action

2017-09-11 Thread Chandran, Sugesh


Regards
_Sugesh


> -Original Message-
> From: ovs-dev-boun...@openvswitch.org [mailto:ovs-dev-
> boun...@openvswitch.org] On Behalf Of Yuanhan Liu
> Sent: Tuesday, September 5, 2017 10:23 AM
> To: d...@openvswitch.org
> Subject: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action
> 
> From: Finn Christensen 
> 
> AFAIK, most (if not all) NICs (including Mellanox and Intel) do not support a
> pure MARK action.  It's required to be used together with some other
> actions, like QUEUE.
> 
> To workaround it, retry with a queue action when first try failed.
[Sugesh] This means a NIC that doesn't support pure MARK, will
do every flow install twice no matter how many times the flow install failed 
before for the same reason. 
The retry can be avoided if we know what a port/NIC can support. It leads to a 
need of feature discovery of a NIC
at the init time.
> 
> Moreover, some Intel's NIC (say XL710) needs the QUEUE action set before
> the MARK action.
> 
> Co-authored-by: Yuanhan Liu 
> Signed-off-by: Finn Christensen 
> Signed-off-by: Yuanhan Liu 
> ---
> 
> v2: - workarounded the queue action issue, which sets the queue index
>   to 0 blindly.
> - added some comments regarding to the retry with queue action
> ---
>  lib/netdev-dpdk.c | 59
> ---
>  1 file changed, 52 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index 37b0f99..320fe80
> 100644
> --- a/lib/netdev-dpdk.c
> +++ b/lib/netdev-dpdk.c
> @@ -3324,6 +3324,7 @@ static struct ovs_mutex ufid_lock =
> OVS_MUTEX_INITIALIZER;  struct ufid_dpdk_flow_data {
>  struct hmap_node node;
>  ovs_u128 ufid;
> +int rxq;
>  struct rte_flow *rte_flow;
>  };
> 
> @@ -3366,7 +3367,8 @@ del_ufid_dpdk_flow_mapping(const ovs_u128
> *ufid)
> 
>  /* Add ufid to dpdk_flow mapping */
>  static inline void
> -add_ufid_dpdk_flow_mapping(const ovs_u128 *ufid, struct rte_flow
> *rte_flow)
> +add_ufid_dpdk_flow_mapping(const ovs_u128 *ufid, struct rte_flow
> *rte_flow,
> +   int rxq)
>  {
>  size_t hash = hash_bytes(ufid, sizeof(*ufid), 0);
>  struct ufid_dpdk_flow_data *data = xzalloc(sizeof(*data)); @@ -3381,6
> +3383,7 @@ add_ufid_dpdk_flow_mapping(const ovs_u128 *ufid, struct
> rte_flow *rte_flow)
> 
>  data->ufid = *ufid;
>  data->rte_flow = rte_flow;
> +data->rxq = rxq;
> 
>  ovs_mutex_lock(_lock);
>  hmap_insert(_dpdk_flow, >node, hash); @@ -3664,15
> +3667,51 @@ netdev_dpdk_add_rte_flow_offload(struct netdev *netdev,
>  }
>  add_flow_action(, RTE_FLOW_ACTION_TYPE_END, NULL);
> 
> +bool tried = 0;
> +struct rte_flow_action_queue queue;
> +again:
>  flow = rte_flow_create(dev->port_id, _attr, patterns.items,
> actions.actions, );
> -if (!flow) {
> +if (!flow && !tried && actions_len) {
> +/*
> + * Seems most (if not all) NICs do not support a pure MARK
> + * action. It's required to be used together with some other
> + * actions, such as QUEUE.
> + *
> + * Thus, if flow creation fails, here we try again with
> + * QUEUE action.
> + *
> + */
> +if (info->rxq < 0 || info->rxq >= netdev->n_rxq) {
> +VLOG_ERR("invalid queue index: %d\n", info->rxq);
> +ret = -1;
> +goto out;
> +}
> +queue.index = info->rxq;
> +
> +/* re-build the action */
> +actions.cnt = 0;
> +/*
> + * NOTE: Intel PMD driver needs the QUEUE action set before
> + * the MARK action.
> + */
> +add_flow_action(, RTE_FLOW_ACTION_TYPE_QUEUE,
> );
> +add_flow_action(, RTE_FLOW_ACTION_TYPE_MARK, );
> +add_flow_action(, RTE_FLOW_ACTION_TYPE_END, NULL);
> +
> +VLOG_INFO("failed to create rte flow, "
> +  "try again with QUEUE action (queue index = %d)\n",
> +  info->rxq);
> +tried = true;
> +
> +goto again;
> +} else if (!flow) {
>  VLOG_ERR("rte flow creat error: %u : message : %s\n",
>   error.type, error.message);
>  ret = -1;
>  goto out;
>  }
> -add_ufid_dpdk_flow_mapping(ufid, flow);
> +add_ufid_dpdk_flow_mapping(ufid, flow, info->rxq);
>  VLOG_INFO("installed flow %p by ufid "UUID_FMT"\n",
>flow, UUID_ARGS((struct uuid *)ufid));
> 
> @@ -3784,17 +3823,23 @@ netdev_dpdk_flow_put(struct netdev *netdev,
> struct match *match,
>   const ovs_u128 *ufid, struct offload_info *info,
>   struct dpif_flow_stats *stats OVS_UNUSED)  {
> -struct rte_flow *rte_flow;
> +struct ufid_dpdk_flow_data *flow_data;
>  int ret;
> 
>  /*
>   * If an old rte_flow exists, it means it's a flow modification.
>   * Here destroy the 

Re: [ovs-dev] [PATCH v2 2/8] dpif-netdev: retrieve flow directly from the flow mark

2017-09-11 Thread Yuanhan Liu
On Sun, Sep 10, 2017 at 04:15:42PM +, Chandran, Sugesh wrote:
> >  atomic_read_relaxed(>dp->emc_insert_min, _min);
> > 
> >  DP_PACKET_BATCH_REFILL_FOR_EACH (i, size, packet, packets_) {
> >  struct dp_netdev_flow *flow;
> > +uint32_t flow_mark;
> > 
> >  if (OVS_UNLIKELY(dp_packet_size(packet) < ETH_HEADER_LEN)) {
> >  dp_packet_delete(packet);
> > @@ -4972,6 +4974,16 @@ emc_processing(struct dp_netdev_pmd_thread
> > *pmd,
> >  continue;
> >  }
> > 
> > +if (dp_packet_has_flow_mark(packet, _mark)) {
> > +flow = dp_netdev_pmd_find_flow_by_mark(pmd, flow_mark);
> > +if (flow) {
> [Sugesh] If the NIC/hardware can match on tcp_flags then this parsing can be 
> avoided?
> Is that true?

Maybe. If so, we could get better performance, as I have showed in v1.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 0/8] OVS-DPDK flow offload with rte_flow

2017-09-11 Thread Yuanhan Liu
I see. Thanks!

--yliu

On Fri, Sep 08, 2017 at 07:29:48PM +, Darrell Ball wrote:
> Hi Yuanhan
> 
> In the dpdk public meeting, we requested Simon to also review this series to 
> check
> that it is sync with the HWOL for kernel with respect to one or two high 
> level aspects.
> I had some comments related to coding standards for V2, but I’ll wait for you 
> to respond
> to Simon’s comments before adding my own other comments, in order to avoid 
> confusion.
> 
> Thanks Darrell
>  
> 
> On 9/5/17, 2:22 AM, "Yuanhan Liu"  wrote:
> 
> Hi,
> 
> Here is a joint work from Mellanox and Napatech, to enable the flow hw
> offload with the DPDK generic flow interface (rte_flow).
> 
> The basic idea is to associate the flow with a mark id (a unit32_t 
> number).
> Later, we then get the flow directly from the mark id, bypassing the heavy
> emc processing, including miniflow_extract.
> 
> The association is done with CMAP in patch 1. It also reuses the flow
> APIs introduced while adding the tc offloads. The emc bypassing is done
> in patch 2. The flow offload is done in patch 4, which mainly does two
> things:
> 
> - translate the ovs match to DPDK rte flow patterns
> - bind those patterns with a MARK action.
> 
> Afterwards, the NIC will set the mark id in every pkt's mbuf when it
> matches the flow. That's basically how we could get the flow directly
> from the received mbuf.
> 
> While testing with PHY-PHY forwarding with one core and one queue, I got
> about 54% performance boost. For PHY-vhost forwarding, I got about 41%
> performance boost. The reason it's lower than v1 is I added the logic
> to get the correct tcp_flags, which examines all packets recieved.
> 
> The major issue mentioned in last version is also workarounded: the
> queue index is never set to 0 blindly anymore, but set to the rxq that
> first receives the upcall pkt.
> 
> Note that it's disabled by default, which can be enabled by:
> 
> $ ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
> 
> 
> v2: - workaround the queue action issue
> - fixed the tcp_flags being skipped issue, which also fixed the
>   build warnings
> - fixed l2 patterns for Intel nic
> - Converted some macros to functions
> - did not hardcode the max number of flow/action
> - rebased on top of the lastest code
> 
> Thanks.
> 
> --yliu
> 
> 
> ---
> Finn Christensen (3):
>   netdev-dpdk: implement flow put with rte flow
>   netdev-dpdk: retry with queue action
>   netdev-dpdk: set FDIR config
> 
> Shachar Beiser (1):
>   dpif-netdev: record rx queue id for the upcall
> 
> Yuanhan Liu (4):
>   dpif-netdev: associate flow with a mark id
>   dpif-netdev: retrieve flow directly from the flow mark
>   netdev-dpdk: convert ufid to dpdk flow
>   netdev-dpdk: remove offloaded flow on deletion
> 
>  lib/dp-packet.h   |  14 ++
>  lib/dpif-netdev.c | 132 +++--
>  lib/flow.c|  78 
>  lib/flow.h|   1 +
>  lib/netdev-dpdk.c | 574 
> +-
>  lib/netdev.c  |   1 +
>  lib/netdev.h  |   7 +
>  7 files changed, 795 insertions(+), 12 deletions(-)
> 
> -- 
> 2.7.4
> 
> 
> 
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 8/8] netdev-dpdk: set FDIR config

2017-09-11 Thread Yuanhan Liu
On Fri, Sep 08, 2017 at 06:48:16PM +0200, Simon Horman wrote:
> On Tue, Sep 05, 2017 at 05:23:01PM +0800, Yuanhan Liu wrote:
> > From: Finn Christensen 
> > 
> > The Intel i40e PMD driver requires the fdir mode set to
> > RTE_FDIR_MODE_PERFECT, otherwise, the flow creation would
> > be failed.
> 
> This appears to update a setting that is used more broadly than
> with the hardware cited above. I'd value some discussion
> of whether that is safe/appropriate/etc... ?

Sure, and comments are welcome. For Mellanox, AFAIK, the setting is
ignored.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 6/8] netdev-dpdk: retry with queue action

2017-09-11 Thread Yuanhan Liu
On Fri, Sep 08, 2017 at 06:48:50PM +0200, Simon Horman wrote:
> On Tue, Sep 05, 2017 at 05:22:59PM +0800, Yuanhan Liu wrote:
> > From: Finn Christensen 
> > 
> > AFAIK, most (if not all) NICs (including Mellanox and Intel) do not
> > support a pure MARK action.  It's required to be used together with
> > some other actions, like QUEUE.
> > 
> > To workaround it, retry with a queue action when first try failed.
> > 
> > Moreover, some Intel's NIC (say XL710) needs the QUEUE action set
> > before the MARK action.
> > 
> > Co-authored-by: Yuanhan Liu 
> > Signed-off-by: Finn Christensen 
> > Signed-off-by: Yuanhan Liu 
> 
> This feels a bit like the tail wagging the dog.
> Is this the lowest level at which it makes sense to implement
> this logic?
> 
> If so then I wonder if some sort of probing would be in order
> to avoid the cost of trying to add the flow twice to hardware
> where the queue is required.

Do you mean something like rte_flow capability query, like whether
a queue action is needed for a mark action? If so, yes, I do think
we miss an interface like this.

Note that even in this solution, the flow won't be created twice
to the hardware, because the first try would be failed.

--yliu
___
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev


Re: [ovs-dev] [PATCH v2 4/8] netdev-dpdk: implement flow put with rte flow

2017-09-11 Thread Yuanhan Liu
On Fri, Sep 08, 2017 at 06:47:55PM +0200, Simon Horman wrote:
> On Tue, Sep 05, 2017 at 05:22:57PM +0800, Yuanhan Liu wrote:
> > From: Finn Christensen 
> > 
> > The basic yet the major part of this patch is to translate the "match"
> > to rte flow patterns. And then, we create a rte flow with a MARK action.
> > Afterwards, all pkts matches the flow will have the mark id in the mbuf.
> > 
> > For any unsupported flows, such as MPLS, -1 is returned, meaning the
> > flow offload is failed and then skipped.
> 
> ...
> 
> > +static int
> > +netdev_dpdk_add_rte_flow_offload(struct netdev *netdev,
> > + const struct match *match,
> > + struct nlattr *nl_actions OVS_UNUSED,
> > + size_t actions_len,
> > + const ovs_u128 *ufid,
> > + struct offload_info *info)
> > +{
> > +struct netdev_dpdk *dev = netdev_dpdk_cast(netdev);
> > +const struct rte_flow_attr flow_attr = {
> > +.group = 0,
> > +.priority = 0,
> > +.ingress = 1,
> > +.egress = 0
> > +};
> > +struct flow_patterns patterns = { .items = NULL, .cnt = 0 };
> > +struct flow_actions actions = { .actions = NULL, .cnt = 0 };
> 
> I believe the following will give the same result as the above,
> less verbosely.
> 
> +const struct rte_flow_attr flow_attr = { .ingress = 1 };
> +struct flow_patterns patterns = { .cnt = 0 };
> +struct flow_actions actions = { .cnt = 0 };

I'm not quite sure on that. If the compiler doesn't do zero assigment
correctly, something deadly wrong could happen. They all are short
structs, that I think it's fine, IMO. If they are big, I'd prefer an
explicit memset.

> > +struct rte_flow *flow;
> > +struct rte_flow_error error;
> > +uint8_t *ipv4_next_proto_mask = NULL;
> > +int ret = 0;
> > +
> > +/* Eth */
> > +struct rte_flow_item_eth eth_spec;
> > +struct rte_flow_item_eth eth_mask;
> 
> Empty line here please.

I actually want to bind the two declartions with the following code piece,
to show that they are tightly related.

> > +memset(_mask, 0, sizeof(eth_mask));
> > +if (match->wc.masks.dl_src.be16[0] ||
> > +match->wc.masks.dl_src.be16[1] ||
> > +match->wc.masks.dl_src.be16[2] ||
> > +match->wc.masks.dl_dst.be16[0] ||
> > +match->wc.masks.dl_dst.be16[1] ||
> > +match->wc.masks.dl_dst.be16[2]) {
> 
> It looks like eth_addr_is_zero() can be used in the above.

Yes, we could. Thanks for the tip.

> > +rte_memcpy(_spec.dst, >flow.dl_dst, 
> > sizeof(eth_spec.dst));
> > +rte_memcpy(_spec.src, >flow.dl_src, 
> > sizeof(eth_spec.src));
> > +eth_spec.type = match->flow.dl_type;
> > +
> > +rte_memcpy(_mask.dst, >wc.masks.dl_dst,
> > +   sizeof(eth_mask.dst));
> > +rte_memcpy(_mask.src, >wc.masks.dl_src,
> > +   sizeof(eth_mask.src));
> > +eth_mask.type = match->wc.masks.dl_type;
> > +
> > +add_flow_pattern(, RTE_FLOW_ITEM_TYPE_ETH,
> > + _spec, _mask);
> > +} else {
> > +/*
> > + * If user specifies a flow (like UDP flow) without L2 patterns,
> > + * OVS will at least set the dl_type. Normally, it's enough to
> > + * create an eth pattern just with it. Unluckily, some Intel's
> > + * NIC (such as XL710) doesn't support that. Below is a workaround,
> > + * which simply matches any L2 pkts.
> > + */
> > +add_flow_pattern(, RTE_FLOW_ITEM_TYPE_ETH, NULL, NULL);
> > +}
> 
> This feels a lot like a layer violation - making the core aware
> of specific implementation details of lower layers.

I agree with you, but unlickily, without it, Intel NIC simply won't work
(according to Finn), unless you have mac addr being provided.

> >From a functional point of view, is the idea that
> a eth_type+5-tuple match is converted to a 5-tuple match?

Unluckily, yes.

> > +/* VLAN */
> > +struct rte_flow_item_vlan vlan_spec;
> > +struct rte_flow_item_vlan vlan_mask;
> 
> Please declare all local variables at the top of the context
> (in this case function).
> 
> ...
> 
> > +struct rte_flow_item_udp udp_spec;
> > +struct rte_flow_item_udp udp_mask;
> > +memset(_mask, 0, sizeof(udp_mask));
> > +if ((proto == IPPROTO_UDP) &&
> > +(match->wc.masks.tp_src || match->wc.masks.tp_dst)) {
> > +udp_spec.hdr.src_port = match->flow.tp_src;
> > +udp_spec.hdr.dst_port = match->flow.tp_dst;
> > +
> > +udp_mask.hdr.src_port = match->wc.masks.tp_src;
> > +udp_mask.hdr.dst_port = match->wc.masks.tp_dst;
> > +
> > +add_flow_pattern(, RTE_FLOW_ITEM_TYPE_UDP,
> > + _spec, _mask);
> > +
> > +/* proto == UDP and ITEM_TYPE_UDP, thus no need for proto match */
> > +if